sijie opened a new issue #1255: Bookie should not advance the journal marker 
before creating the index file
URL: https://github.com/apache/bookkeeper/issues/1255
 
 
   **Bug**
   
   *Problem*
   
   Currently Bookie journal 'new ledger' entry if a ledger doesn't exist at 
ledger storage. This 'new ledger' entry is journaled before adding the entry to 
ledger storage. so this would cause a problem on checkpointing.
   
   - journal 'new ledger' entry
   
   ```
       /**
        * Retrieve the ledger descriptor for the ledger which entry should be 
added to.
        * The LedgerDescriptor returned from this method should be eventually 
freed with
        * #putHandle().
        *
        * @throws BookieException if masterKey does not match the master key of 
the ledger
        */
       private LedgerDescriptor getLedgerForEntry(ByteBuf entry, final byte[] 
masterKey)
               throws IOException, BookieException {
           final long ledgerId = entry.getLong(entry.readerIndex());
   
           LedgerDescriptor l = handles.getHandle(ledgerId, masterKey);
           if (masterKeyCache.get(ledgerId) == null) {
               // Force the load into masterKey cache
               byte[] oldValue = masterKeyCache.putIfAbsent(ledgerId, 
masterKey);
               if (oldValue == null) {
                   // new handle, we should add the key to journal ensure we 
can rebuild
                   ByteBuffer bb = ByteBuffer.allocate(8 + 8 + 4 + 
masterKey.length);
                   bb.putLong(ledgerId);
                   bb.putLong(METAENTRY_ID_LEDGER_KEY);
                   bb.putInt(masterKey.length);
                   bb.put(masterKey);
                   bb.flip();
   
                   getJournal(ledgerId).logAddEntry(bb, false /* ackBeforeSync 
*/, new NopWriteCallback(), null);
               }
           }
   
           return l;
       }
   ```
   
   - add entry to ledger storage
   ```
       /**
        * Add an entry to a ledger as specified by handle.
        */
       private void addEntryInternal(LedgerDescriptor handle, ByteBuf entry,
                                     boolean ackBeforeSync, WriteCallback cb, 
Object ctx)
               throws IOException, BookieException {
           long ledgerId = handle.getLedgerId();
           long entryId = handle.addEntry(entry);
   
           writeBytes.add(entry.readableBytes());
   
           if (LOG.isTraceEnabled()) {
               LOG.trace("Adding {}@{}", entryId, ledgerId);
           }
           getJournal(ledgerId).logAddEntry(entry, ackBeforeSync, cb, ctx);
       }
   
   ```
   
   The problematic sequence can be described as below:
   
   - thread t1 is adding the first entry of ledger L1
   - thread t2 is adding entries of ledger L2
   - t1 is adding a journal entry of 'new ledger L1'
   - t2 is adding entries after t1 adds the journal entry and before t1 adding 
the entry to ledger storage
   - after t2 added entries, checkpoint happens in the ledger storage. it would 
roll the journal mark, which will claim the 'new ledger L1' entry as persistent.
   - if the bookie restarts, it would fail with no such ledger exception.
   
   The problem can be produced using a unit test: 
https://github.com/sijie/bookkeeper/commit/5053a717cd578aeb88236d373553d7494501b801
   
   *Solution*
   
   The fix is simple - just make sure the 'new ledger' journal entry is added 
after an entry is added to ledger storage. so it make sure when checkpoint 
happen it will flush and create the ledger before moving the journal mark.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to