[
https://issues.apache.org/jira/browse/BOOKKEEPER-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197812#comment-13197812
]
Flavio Junqueira commented on BOOKKEEPER-154:
---------------------------------------------
Ivan and I had an offline discussion about this issue, and here is a summary.
We have not reached agreement on a solution; this is just a bit of
brainstorming.
We find it better to have an application thread responsible for
garbage-collecting messages from subscribers by consuming such messages. It is
better in the sense that it avoids introducing functionality that is
application specific.
Assuming that this feature is implemented at the application level, we need a
way for the application thread to determine:
# the subscribers it needs to watch for;
# the last time each of those subscribers has consumed a message.
This information is in principle available through ZooKeeper, so one way of
implementing this feature is to make the information in ZooKeeper available.
Having the application access ZooKeeper directly sounds messy: letting the
application manipulate the ZooKeeper metadata directly is prone to consistency
problems, and it is operationally more difficult (e.g., ports must be opened
for it). One option is to expose the information through hubs.
Exposing the ZooKeeper metadata via hubs doesn't solve the whole problem.
Assuming millions of subscribers, such an application thread would have to loop
through the subscribers frequently, inducing a high load. If we could use the
watch functionality of ZooKeeper, then perhaps we could have the application
thread build a local table of subscribers and update the table when anything
changes. This way it has to loop through the same subscribers, but locally.
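
To make the local-table idea concrete, here is a minimal sketch in Java. It is
only an illustration of the brainstorm above, not an agreed design or an
existing Hedwig API: the class SubscriberActivityTable, its method names, and
the idle threshold are all hypothetical, and the update callbacks stand in for
whatever notification (a ZooKeeper watch, or an event relayed by a hub) would
report a change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical local table kept by the application thread; not part of Hedwig.
class SubscriberActivityTable {
    // Subscriber id -> time (in ms) at which it last consumed a message.
    private final Map<String, Long> lastConsumedMillis = new ConcurrentHashMap<>();

    // Assumed to be invoked when a change notification arrives for a subscriber.
    // Keeps the most recent timestamp seen, so late notifications are harmless.
    void onSubscriberUpdated(String subscriberId, long consumedAtMillis) {
        lastConsumedMillis.merge(subscriberId, consumedAtMillis, Long::max);
    }

    // Assumed to be invoked when a subscriber unsubscribes.
    void onSubscriberRemoved(String subscriberId) {
        lastConsumedMillis.remove(subscriberId);
    }

    // Local scan: subscribers idle for longer than maxIdleMillis as of nowMillis.
    // The caller would then consume messages on their behalf.
    List<String> findInactive(long nowMillis, long maxIdleMillis) {
        List<String> inactive = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastConsumedMillis.entrySet()) {
            if (nowMillis - entry.getValue() > maxIdleMillis) {
                inactive.add(entry.getKey());
            }
        }
        return inactive;
    }
}
```

The scan still visits every subscriber, but it touches only local memory; the
shared metadata is read only when something actually changes.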
> Garbage collect messages for those subscribers inactive/offline for a long
> time.
> ---------------------------------------------------------------------------------
>
> Key: BOOKKEEPER-154
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-154
> Project: Bookkeeper
> Issue Type: New Feature
> Components: hedwig-client, hedwig-server
> Affects Versions: 4.0.0
> Reporter: Sijie Guo
>
> Currently Hedwig tracks subscribers' progress in order to garbage collect
> published messages. If a subscriber subscribes and then goes offline for a
> long time without unsubscribing, the messages published in its topic have no
> chance to be garbage collected.
> A time-based garbage collection policy would be suitable for this case.