I don't think we should attempt to store "regular" ack holes. What I mean is that all shared (queue) consumers will have ack holes in the normal course of business. So if we store these unconditionally, we will end up doing so for all queues.

Ideally we want the mark delete to keep pace with the dispatch rate (more or less). When that does not happen, we end up with a large set of duplicate deliveries on topic reload. So one way to specify the problem would be to set a size N (configurable, with a sensible default) that would be the operating gap between mark delete and the read cursor. Anything that falls beyond N is an ack hole.

The other comment I have is that the "ack hole" is a concept that is not easy for end users to understand. Holes could vary in size. There could be 10K messages in a single ack hole, or 1000 messages in 1000 ack holes (one in each). So in one case you could have 10,000 unacked messages and have no issues, but in the other case you could have 1,001 messages and have issues. This would be difficult to explain without users grasping what happens under the covers. I would rather we use the concept and terminology of the "number of unacked messages". This is an area where we have faced, and continue to face, significant user education and support cases, hence my emphasis on user friendliness.

Joe

On Wed, Sep 6, 2017 at 5:02 PM, <g...@git.apache.org> wrote:

> merlimat commented on issue #742: Avoid huge backlog on topic reloading:
> due to large gap between markDelete-offset and read-position of cursor.
> URL: https://github.com/apache/incubator-pulsar/issues/742#issuecomment-327644681
>
> > > where large ack-holes can create backlog
> >
> > large *number* of ack-holes
>
> > So, I think having option to restrict distance between markDelete and
> > readPosition is something useful for broker.
>
> It's not related at all with the distance. I think the better metric is
> indeed the number of "holes"
>
> ----------------------------------------------------------------
> This is an automated message from the Apache Git Service.
> To respond to the message, please log on GitHub and use the
> URL above to go to the specific comment.
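
To make the difference between "number of holes" and "number of unacked messages" concrete, here is a minimal sketch. It is not Pulsar code: positions are modeled as plain integers (real positions are ledger/entry pairs), and the function name `ack_stats` and its parameters are hypothetical, chosen only for illustration. A "hole" is taken to mean a contiguous run of unacked messages between the mark-delete position and the read position.

```python
def ack_stats(mark_delete, read_position, acked):
    """Summarize the unacked state between mark delete and the read position.

    Assumptions (illustrative only, not Pulsar's data model):
      mark_delete   -- every position <= mark_delete is acked
      read_position -- next position to be read (exclusive upper bound)
      acked         -- set of individually acked positions in between
    """
    unacked = 0
    holes = 0
    in_hole = False
    for pos in range(mark_delete + 1, read_position):
        if pos in acked:
            in_hole = False          # an ack closes the current hole
        else:
            unacked += 1
            if not in_hole:          # first unacked message opens a new hole
                holes += 1
                in_hole = True
    return {
        "gap": read_position - mark_delete - 1,  # dispatched but not mark-deleted
        "unacked": unacked,
        "holes": holes,
    }


# One large hole: 10,000 unacked messages followed by a single ack.
one_big_hole = ack_stats(0, 10_002, {10_001})

# Many small holes: every other message acked, 1000 holes of one message each.
many_small_holes = ack_stats(0, 2_001, set(range(2, 2_001, 2)))

print(one_big_hole)
print(many_small_holes)
```

The first case has ten times as many unacked messages but only one hole, while the second has 1000 holes. A limit expressed in holes treats them very differently, whereas a limit on the gap or on the number of unacked messages is something a user can reason about directly.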