[
https://issues.apache.org/jira/browse/BOOKKEEPER-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197946#comment-13197946
]
Ivan Kelly commented on BOOKKEEPER-154:
---------------------------------------
Hubs in Hedwig have all the information they need to do this, so ZooKeeper can
be left alone. The way I see this working is that you'd have a client command,
{code}
$ hedwig console gc 10d
{code}
This would connect to all hubs[1] and send a GarbageCollect message to each hub
with 10 days as the parameter. Each hub can then go through its list of topics
and perform garbage collection on each one. The hub must already have its list
of topics in memory, as well as the list of subscribers for each topic; this is
part of the basic design of Hedwig. Additionally, each hub would not have much
to do, as topics should be spread pretty evenly across hubs.
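To make the flow above concrete, here is a minimal Python sketch of what a hub
might do when it receives such a request. Everything in it is an assumption for
illustration, not Hedwig's actual API: the names Hub, Topic, Subscriber,
parse_age and garbage_collect are hypothetical, and so is the policy (ignore
subscribers that have been inactive for longer than the given age, then delete
the messages that every remaining subscriber has consumed; if no subscriber
remains, delete everything stored). It sidesteps the time question by taking
the hub's local clock as an explicit "now" argument.

```python
import re
from dataclasses import dataclass, field

# Seconds per unit for durations such as "10d" (assumed syntax).
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_age(text):
    """Parse a duration like '10d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"bad duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


@dataclass
class Subscriber:
    consumed_seq: int   # last message sequence id this subscriber consumed
    last_active: float  # seconds since epoch, by the hub's own clock


@dataclass
class Topic:
    messages: list = field(default_factory=list)     # sequence ids still stored
    subscribers: dict = field(default_factory=dict)  # subscriber id -> Subscriber


class Hub:
    """Hypothetical in-memory view of the topics a hub owns."""

    def __init__(self):
        self.topics = {}  # topic name -> Topic

    def garbage_collect(self, max_age_seconds, now):
        """Handle a hypothetical GarbageCollect request.

        Returns the number of messages deleted per topic.
        """
        deleted = {}
        for name, topic in self.topics.items():
            # Subscribers inactive for longer than max_age no longer count.
            active = [
                sub for sub in topic.subscribers.values()
                if now - sub.last_active <= max_age_seconds
            ]
            if active:
                # Keep whatever the slowest active subscriber still needs.
                cutoff = min(sub.consumed_seq for sub in active)
            else:
                # Assumed policy: nobody active, so nothing needs keeping.
                cutoff = max(topic.messages, default=0)
            kept = [seq for seq in topic.messages if seq > cutoff]
            deleted[name] = len(topic.messages) - len(kept)
            topic.messages = kept
        return deleted
```

With "10d" as the parameter, a subscriber last seen 30 days ago would no longer
hold back collection in this sketch, while one seen yesterday still would.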
The open questions here are:
* how do we deal with time?
* in the event of a crashed hub, what do we do with the topics which have not
been taken over by another hub (since no one has tried to access them since the
crash)?
[1] I'm not sure how to get the list of all hubs without contacting ZooKeeper,
but that's an auxiliary problem.
> Garbage collect messages for those subscribers inactive/offline for a long
> time.
> ---------------------------------------------------------------------------------
>
> Key: BOOKKEEPER-154
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-154
> Project: Bookkeeper
> Issue Type: New Feature
> Components: hedwig-client, hedwig-server
> Affects Versions: 4.0.0
> Reporter: Sijie Guo
>
> Currently Hedwig tracks subscribers' progress for garbage collecting published
> messages. If a subscriber subscribes and then goes offline without
> unsubscribing for a long time, the messages published in its topic have no
> chance to be garbage collected.
> A time-based garbage collection policy would be suitable for this case.