poorbarcode opened a new pull request, #4701:
URL: https://github.com/apache/bookkeeper/pull/4701
### Motivation
**Background**
- Ledger states management
- In BK server, ledgers are managed by `HandleFactoryImpl`, which
maintains the ledger state.
- When fencing entries, BK change the state of the Ledger Handle to
`fenced`
- When adding entries, BK checks fencing state first, then apply writing.
- Open ledger with recovery
- BK fence ledger first, then read LAC to ensure no entries missed
- Threads design
- Opening ledger with recovery works on the thread
`BookieHighPriorityThread`
- Adding entries works on the thread `BookieWriteThreadPool`
To avoid the following multi-threading competition issue,BK uses a
lock<sup>[1]</sup>
| `(BookieWriteThreadPool)`Add entry | `(BookieHighPriorityThread)`Opening
ledger with recovery |
| --- | --- |
| get `LedgerDescriptor` from `HandleFactoryImpl` |
| check fenced status: `non-fenced` |
| | get `LedgerDescriptor` from `HandleFactoryImpl` |
| | fence ledger |
| | read Lac from write cache |
| add entry into write cache |
| add entry into Journal Disk |
| success |
| Lac is `1` | Lac is `0` |
- **[1]** Both fencing ledger and adding entry acquire the lock
`synchronized(LedgerDescriptor.this)`
- **[1-1]**
https://github.com/apache/bookkeeper/blob/release-4.17.2/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieImpl.java#L1098-L1105
```java
public void addEntry(ByteBuf entry, boolean ackBeforeSync, WriteCallback cb,
Object ctx, byte[] masterKey)
throws IOException, BookieException, InterruptedException {
try {
LedgerDescriptor handle = getLedgerForEntry(entry, masterKey);
synchronized (handle) {
if (handle.isFenced()) {
throw BookieException
.create(BookieException.Code.LedgerFencedException);
}
addEntryInternal(handle, entry, ackBeforeSync, cb, ctx,
masterKey);
}
}
}
```
- **[1-2]**
https://github.com/apache/bookkeeper/blob/release-4.17.2/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/LedgerDescriptorImpl.java#L92-L93
```java
synchronized CompletableFuture<Boolean> fenceAndLogInJournal(Journal
journal) throws IOException {
boolean success = this.setFenced();
return logFenceEntryInJournal(journal);
}
```
---
Issue: there is a race condition breaks the lock above
https://github.com/apache/bookkeeper/blob/release-4.17.2/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/HandleFactoryImpl.java#L54-L68
```java
public LedgerDescriptor getHandle(final long ledgerId, final byte[]
masterKey, boolean journalReplay)
throws IOException, BookieException {
LedgerDescriptor handle = ledgers.get(ledgerId);
if (handle == null) {
handle = LedgerDescriptor.create(masterKey, ledgerId,
ledgerStorage);
ledgers.putIfAbsent(ledgerId, handle);
}
handle.checkAccess(masterKey);
return handle;
}
```
| `(BookieWriteThreadPool)`Add entry | `(BookieHighPriorityThread)`Opening
ledger with recovery |
| --- | --- |
| get `LedgerDescriptor` from `HandleFactoryImpl` |
| check if created: `false` | check if created: `false` |
| creates a new one | creates a new one |
### Changes
Fix the bug
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]