chenhongSZ opened a new issue, #4036:
URL: https://github.com/apache/bookkeeper/issues/4036

   **BUG REPORT**
   
   ***Describe the bug***
   BookKeeper version: 4.14.7
   
   The following exception occurred in our production environment, and the Bookie process shut down.
   I searched the existing issues and did not find any similar errors.
   <img width="1411" alt="WeCom screenshot_01d6ed89-c87d-42dc-b1b9-28c822e14d53" src="https://github.com/apache/bookkeeper/assets/18002867/a3b1d3c3-ad94-44d8-8645-e9362e2e4cf2">
   
   
   We added some logging and found that the error is reported when a certain ledger is replicated.
   
   The ledger metadata is shown below:
   <img width="1217" alt="WeCom screenshot_200ebf21-35ba-40bb-92ce-fcae2d7a9dfe" src="https://github.com/apache/bookkeeper/assets/18002867/0cb19592-cdbe-4b39-aec7-da872e211de5">
   
   
   We dug into the code, but I'm not sure whether there is a problem here:
   
   `
   org.apache.bookkeeper.client.LedgerFragmentReplicator#splitIntoSubFragments
   /**
        * Split the full fragment into batched entry fragments by keeping
        * rereplicationEntryBatchSize of entries in each one and can treat them as
        * sub fragments.
        */
       static Set<LedgerFragment> splitIntoSubFragments(LedgerHandle lh,
               LedgerFragment ledgerFragment, long rereplicationEntryBatchSize) {
           Set<LedgerFragment> fragments = new HashSet<LedgerFragment>();
           if (rereplicationEntryBatchSize <= 0) {
               // rereplicationEntryBatchSize can not be 0 or less than 0,
               // returning with the current fragment
               fragments.add(ledgerFragment);
               return fragments;
           }
   
           long firstEntryId = ledgerFragment.getFirstStoredEntryId();
           long lastEntryId = ledgerFragment.getLastStoredEntryId();
   
           /*
            * if firstEntryId is INVALID_ENTRY_ID then lastEntryId should be
            * INVALID_ENTRY_ID and viceversa.
            */
           if (firstEntryId == INVALID_ENTRY_ID ^ lastEntryId == INVALID_ENTRY_ID) {
               LOG.error("For LedgerFragment: {}, seeing inconsistent firstStoredEntryId: {} and lastStoredEntryId: {}",
                       ledgerFragment, firstEntryId, lastEntryId);
               assert false;
           }
   
           long numberOfEntriesToReplicate = (lastEntryId - firstEntryId) + 1;
           long splitsWithFullEntries = numberOfEntriesToReplicate
                   / rereplicationEntryBatchSize;
   
           if (splitsWithFullEntries == 0) {// only one fragment
               fragments.add(ledgerFragment);
               return fragments;
           }
   
           long fragmentSplitLastEntry = 0;
           for (int i = 0; i < splitsWithFullEntries; i++) {
               fragmentSplitLastEntry = (firstEntryId + rereplicationEntryBatchSize) - 1;
               fragments.add(new LedgerFragment(lh, firstEntryId,
                       fragmentSplitLastEntry, ledgerFragment.getBookiesIndexes()));
               firstEntryId = fragmentSplitLastEntry + 1;
           }
   
           long lastSplitWithPartialEntries = numberOfEntriesToReplicate
                   % rereplicationEntryBatchSize;
           if (lastSplitWithPartialEntries > 0) {
               fragments.add(new LedgerFragment(lh, firstEntryId, firstEntryId
                       + lastSplitWithPartialEntries - 1, ledgerFragment
                       .getBookiesIndexes()));
           }
           return fragments;
       }
   `
   
   The `numberOfEntriesToReplicate` variable can be wrong when `firstEntryId == INVALID_ENTRY_ID && lastEntryId == INVALID_ENTRY_ID`: it is assigned 1, but it should be assigned 0. This then leads to the error above when control reaches the `new LedgerFragment` line.
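
The arithmetic can be checked in isolation. The following is a minimal, standalone sketch, not BookKeeper code: it assumes `INVALID_ENTRY_ID` is `-1`, and the class `EntryCountSketch` and both method names are hypothetical. `countAsQuoted` repeats the expression from the method above; `countWithGuard` shows one possible way to return 0 for a fragment with no stored entries.

```java
final class EntryCountSketch {
    // Assumption: same value as INVALID_ENTRY_ID in the BookKeeper client (-1).
    static final long INVALID_ENTRY_ID = -1L;

    // Same expression as in the quoted splitIntoSubFragments.
    static long countAsQuoted(long firstEntryId, long lastEntryId) {
        return (lastEntryId - firstEntryId) + 1;
    }

    // Hypothetical guard: a fragment whose first or last stored entry id
    // is invalid is treated as having nothing to replicate.
    static long countWithGuard(long firstEntryId, long lastEntryId) {
        if (firstEntryId == INVALID_ENTRY_ID || lastEntryId == INVALID_ENTRY_ID) {
            return 0L;
        }
        return (lastEntryId - firstEntryId) + 1;
    }

    public static void main(String[] args) {
        // (-1 - (-1)) + 1 == 1, although the fragment holds no entries.
        System.out.println(countAsQuoted(INVALID_ENTRY_ID, INVALID_ENTRY_ID));
        // With the guard the empty fragment yields 0.
        System.out.println(countWithGuard(INVALID_ENTRY_ID, INVALID_ENTRY_ID));
        // A normal fragment covering entries 0..9 still yields 10.
        System.out.println(countWithGuard(0L, 9L));
    }
}
```

Note that in the quoted method a count of 1 only enters the loop when `rereplicationEntryBatchSize` is 1, because `1 / rereplicationEntryBatchSize` is 0 for any larger batch size and the method then returns the original fragment.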
   
   Is this a bug? I've reviewed the master branch, and it has not been fixed there yet.
   
   
   
   
   
   ***To Reproduce***
   
   It is hard to reproduce in a test environment.
   
   ***Expected behavior***
   
   Replication of the ledger should complete, and the replicator should move on to the next ledger.
   
   

