[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197946#comment-13197946
 ] 

Ivan Kelly commented on BOOKKEEPER-154:
---------------------------------------

Hubs in Hedwig have all the information they need to do this, so ZooKeeper can 
be left alone. The way I see this working is that you'd have a client command:
{code}
$ hedwig console gc 10d
{code}

This would connect to all hubs[1] and send a GarbageCollect message to each hub 
with 10 days as the parameter. Each hub can then go through its list of topics 
and perform garbage collection on each one. The hub must already have its list 
of topics in memory, as well as the list of subscribers for each topic; this is 
part of the basic design of Hedwig. Additionally, each hub would not have much 
to do, as topics should be spread fairly evenly across hubs. 
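
As a rough illustration of the hub-side pass described above, here is a minimal 
sketch. Everything in it is hypothetical and not part of the Hedwig API: the 
class HubGcSketch, the SubscriberState fields, and the parseAge helper are 
invented for this example. It also assumes the hub records a last-active 
timestamp per subscriber, which is exactly the unresolved "time" question. For 
each topic it computes the highest sequence id that could be deleted if 
subscribers inactive for longer than the given age are ignored.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in Hedwig.
class HubGcSketch {
    // Per-subscriber state the hub is assumed to hold in memory.
    static final class SubscriberState {
        final long lastConsumedSeqId; // last message this subscriber consumed
        final long lastActiveMillis;  // assumed bookkeeping, not confirmed by the design

        SubscriberState(long lastConsumedSeqId, long lastActiveMillis) {
            this.lastConsumedSeqId = lastConsumedSeqId;
            this.lastActiveMillis = lastActiveMillis;
        }
    }

    // Parses an age such as "10d" (days) or "12h" (hours), as in `gc 10d`.
    static Duration parseAge(String s) {
        if (s == null || s.length() < 2) {
            throw new IllegalArgumentException("bad age: " + s);
        }
        long n = Long.parseLong(s.substring(0, s.length() - 1));
        switch (s.charAt(s.length() - 1)) {
            case 'd': return Duration.ofDays(n);
            case 'h': return Duration.ofHours(n);
            default: throw new IllegalArgumentException("unknown unit in: " + s);
        }
    }

    // For each topic, returns the highest sequence id that may be deleted:
    // the minimum lastConsumedSeqId over subscribers active within maxAge.
    // Topics with no recently active subscriber are omitted; what to do with
    // them is a policy decision this sketch leaves open.
    static Map<String, Long> collectableUpTo(
            Map<String, Map<String, SubscriberState>> topics,
            Duration maxAge, long nowMillis) {
        long cutoff = nowMillis - maxAge.toMillis();
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, Map<String, SubscriberState>> topic : topics.entrySet()) {
            long min = Long.MAX_VALUE;
            for (SubscriberState s : topic.getValue().values()) {
                if (s.lastActiveMillis >= cutoff) {
                    min = Math.min(min, s.lastConsumedSeqId);
                }
            }
            if (min != Long.MAX_VALUE) {
                result.put(topic.getKey(), min);
            }
        }
        return result;
    }
}
```

Since each hub only walks the topics it currently owns, the work per hub stays 
proportional to its share of topics and subscribers.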

The open questions here are:
  * how do we deal with time?
  * in the event of a crashed hub, what do we do with the topics which have not 
been taken over by another hub (since no one has tried to access them since the 
crash)?

[1] I'm not sure how to get the list of all hubs without contacting ZooKeeper, 
but that's an auxiliary problem.
                
> Garbage collect messages for those subscribers inactive/offline for a long 
> time. 
> ---------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-154
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-154
>             Project: Bookkeeper
>          Issue Type: New Feature
>          Components: hedwig-client, hedwig-server
>    Affects Versions: 4.0.0
>            Reporter: Sijie Guo
>
> Currently Hedwig tracks subscribers' progress for garbage collecting published 
> messages. If a subscriber subscribes and then goes offline for a long time 
> without unsubscribing, the messages published to its topic have no chance to 
> be garbage collected.
> A time-based garbage collection policy would be suitable for this case. 


        
