wmccarley opened a new issue #5792: Extend autoSkipNonRecoverableData to apply 
to missing Schemas in Bookkeeper
URL: https://github.com/apache/pulsar/issues/5792
   Pulsar uses BookKeeper as the storage mechanism for data/cursors. In #1046, 
@rdhabalia added a flag, _autoSkipNonRecoverableData_, that allows an admin 
to tell the broker to disregard missing ledgers and skip ahead. This helps in 
scenarios where the ledger is gone and is unrecoverable (or not worth 
recovering).
   This flag helps prevent users from getting stuck with 
NoSuchEntryException/NoLedgerException when the BookKeeper cluster has suffered 
data loss.
   **Is your feature request related to a problem? Please describe.**
   The _autoSkipNonRecoverableData_ flag does not solve the problem where the 
topic has a schema attached (stored in BookKeeper) and the ledger containing 
that schema goes missing. In this situation, clients will receive 
NoSuchEntryException/NoLedgerException. Even if the admin unloads 
the topic the problem will still continue to occur until the admin deletes the 
   There are two key problems:
   If the admin has never seen this issue before, they will probably spend time 
checking stats-internal to see whether the ledger is a cursor or contains 
entries. However, the referenced ledger will not appear anywhere in 
   The _autoSkipNonRecoverableData_ flag ostensibly exists to prevent users 
from getting stuck with NoSuchEntryException/NoLedgerException when ledgers go 
missing, but it doesn't apply to missing schemas.
   **Describe the solution you'd like**
   Modify BookkeeperSchemaStorage to check _autoSkipNonRecoverableData_ and 
behave accordingly.
   Add output to stats-internal that shows schema information, including the 
ledgers that are used for schema storage (if the default 
BookkeeperSchemaStorage implementation is being used).
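
   As a rough illustration of the flag check requested here, the sketch below 
shows a schema lookup that treats a missing ledger as "no schema" when the flag 
is enabled and rethrows otherwise. Every type in it (FakeLedgerStore, 
SchemaStorage, MissingLedgerException) is a hypothetical stand-in written for 
this example, not Pulsar's or BookKeeper's actual classes; only the flag name 
comes from the broker configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; none of these types are real Pulsar classes. */
public class SchemaSkipSketch {

    /** Stand-in for the "ledger not found" failure raised by the storage layer. */
    static class MissingLedgerException extends RuntimeException {
        MissingLedgerException(long ledgerId) {
            super("ledger " + ledgerId + " not found");
        }
    }

    /** Stand-in for a ledger store: ledger id -> serialized schema. */
    static class FakeLedgerStore {
        private final Map<Long, String> ledgers = new HashMap<>();

        void put(long ledgerId, String schema) {
            ledgers.put(ledgerId, schema);
        }

        String read(long ledgerId) {
            String schema = ledgers.get(ledgerId);
            if (schema == null) {
                throw new MissingLedgerException(ledgerId);
            }
            return schema;
        }
    }

    /** Stand-in for a schema storage that honours the skip flag. */
    static class SchemaStorage {
        private final FakeLedgerStore store;
        private final boolean autoSkipNonRecoverableData;

        SchemaStorage(FakeLedgerStore store, boolean autoSkipNonRecoverableData) {
            this.store = store;
            this.autoSkipNonRecoverableData = autoSkipNonRecoverableData;
        }

        /** Returns the schema, or empty if its ledger is gone and skipping is allowed. */
        Optional<String> getSchema(long ledgerId) {
            try {
                return Optional.of(store.read(ledgerId));
            } catch (MissingLedgerException e) {
                if (autoSkipNonRecoverableData) {
                    // Treat the lost schema as absent instead of failing the client.
                    return Optional.empty();
                }
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        FakeLedgerStore store = new FakeLedgerStore();
        store.put(1L, "schema-v1");

        SchemaStorage lenient = new SchemaStorage(store, true);
        System.out.println("existing ledger: " + lenient.getSchema(1L));
        System.out.println("missing ledger:  " + lenient.getSchema(42L));
    }
}
```

   Whether "absent schema" is the right fallback (as opposed to, say, dropping 
the schema reference from metadata) is a design question this sketch does not 
answer.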
   **Describe alternatives you've considered**
   Rather than modifying BookkeeperSchemaStorage, the broker code could be 
modified to catch the missing-ledger exception and suppress it if this flag is 
set.
   A different boolean flag could also be used to control this behavior if 
there is a scenario where the admin wants to skip missing schemas but not 
missing data (or vice versa).
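
   To make the two alternatives concrete, here is a minimal sketch of a 
broker-side wrapper that catches the missing-ledger failure, controlled by a 
separate flag that falls back to _autoSkipNonRecoverableData_ when unset. The 
name autoSkipNonRecoverableSchemas, the BrokerConfig class, and the loadSchema 
helper are all assumptions made up for illustration; no such setting exists 
today.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical sketch of a broker-level catch with a separate schema flag. */
public class SchemaSkipFlagSketch {

    /** Stand-in for the "ledger not found" failure raised by schema storage. */
    static class MissingLedgerException extends RuntimeException {
        MissingLedgerException(String message) {
            super(message);
        }
    }

    /** Stand-in for broker configuration. */
    static class BrokerConfig {
        final boolean autoSkipNonRecoverableData;
        // Hypothetical flag; null means "not set, follow the data flag".
        final Boolean autoSkipNonRecoverableSchemas;

        BrokerConfig(boolean autoSkipNonRecoverableData,
                     Boolean autoSkipNonRecoverableSchemas) {
            this.autoSkipNonRecoverableData = autoSkipNonRecoverableData;
            this.autoSkipNonRecoverableSchemas = autoSkipNonRecoverableSchemas;
        }

        boolean skipMissingSchemas() {
            return autoSkipNonRecoverableSchemas != null
                    ? autoSkipNonRecoverableSchemas
                    : autoSkipNonRecoverableData;
        }
    }

    /** Runs a schema lookup, hiding the missing-ledger failure if config allows it. */
    static <T> Optional<T> loadSchema(BrokerConfig config, Supplier<T> lookup) {
        try {
            return Optional.ofNullable(lookup.get());
        } catch (MissingLedgerException e) {
            if (config.skipMissingSchemas()) {
                return Optional.empty();
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        // Skip missing data, but keep failing loudly on missing schemas.
        BrokerConfig config = new BrokerConfig(true, false);
        try {
            loadSchema(config, () -> {
                throw new MissingLedgerException("schema ledger not found");
            });
        } catch (MissingLedgerException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```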
