hangc0276 opened a new pull request #2965:
URL: https://github.com/apache/bookkeeper/pull/2965
### Motivation
When the RocksDB-backed entryMetadataMap is used with multiple ledger
directories configured, the bookie fails to start and throws the following
exception.
```
12:24:28.530 [main] ERROR org.apache.pulsar.PulsarStandaloneStarter - Failed to start pulsar service.
java.io.IOException: Error open RocksDB database
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:202) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:89) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.lambda$static$0(KeyValueStorageRocksDB.java:62) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.storage.ldb.PersistentEntryLogMetadataMap.<init>(PersistentEntryLogMetadataMap.java:87) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.GarbageCollectorThread.createEntryLogMetadataMap(GarbageCollectorThread.java:265) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.GarbageCollectorThread.<init>(GarbageCollectorThread.java:154) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.GarbageCollectorThread.<init>(GarbageCollectorThread.java:133) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.<init>(SingleDirectoryDbLedgerStorage.java:182) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.newSingleDirectoryDbLedgerStorage(DbLedgerStorage.java:190) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.initialize(DbLedgerStorage.java:150) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.bookkeeper.bookie.BookieResources.createLedgerStorage(BookieResources.java:110) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    at org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble.buildBookie(LocalBookkeeperEnsemble.java:328) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.8.1.jar:2.8.1]
    at org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble.runBookies(LocalBookkeeperEnsemble.java:391) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.8.1.jar:2.8.1]
    at org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble.startStandalone(LocalBookkeeperEnsemble.java:521) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.8.1.jar:2.8.1]
    at org.apache.pulsar.PulsarStandalone.start(PulsarStandalone.java:264) ~[org.apache.pulsar-pulsar-broker-2.8.1.jar:2.8.1]
    at org.apache.pulsar.PulsarStandaloneStarter.main(PulsarStandaloneStarter.java:121) [org.apache.pulsar-pulsar-broker-2.8.1.jar:2.8.1]
Caused by: org.rocksdb.RocksDBException: lock hold by current process, acquire time 1640492668 acquiring thread 123145515651072: data/standalone/bookkeeper00/entrylogIndexCache/metadata-cache/LOCK: No locks available
    at org.rocksdb.RocksDB.open(Native Method) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
    at org.rocksdb.RocksDB.open(RocksDB.java:239) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
    at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:199) ~[org.apache.bookkeeper-bookkeeper-server-4.15.0-SNAPSHOT.jar:4.15.0-SNAPSHOT]
    ... 15 more
```
The cause is that multiple GarbageCollectorThread instances, one per ledger
directory, try to open the same RocksDB database. The first one acquires the
LOCK file, and every later open fails with the exception above.
### Modification
1. Change the default `GcEntryLogMetadataCachePath` from
`getLedgerDirNames()[0] + "/" + ENTRYLOG_INDEX_CACHE` to `null`. When it is
`null`, each ledger directory stores its own metadata cache.
2. Remove the internal directory `entrylogIndexCache`. The resulting layout
of a ledger directory looks like:
```
└── current
    ├── lastMark
    ├── ledgers
    │   ├── 000003.log
    │   ├── CURRENT
    │   ├── IDENTITY
    │   ├── LOCK
    │   ├── LOG
    │   ├── MANIFEST-000001
    │   └── OPTIONS-000005
    ├── locations
    │   ├── 000003.log
    │   ├── CURRENT
    │   ├── IDENTITY
    │   ├── LOCK
    │   ├── LOG
    │   ├── MANIFEST-000001
    │   └── OPTIONS-000005
    └── metadata-cache
        ├── 000003.log
        ├── CURRENT
        ├── IDENTITY
        ├── LOCK
        ├── LOG
        ├── MANIFEST-000001
        └── OPTIONS-000005
```
3. If the user sets `GcEntryLogMetadataCachePath` in `bk_server.conf`, only
one ledger directory may be configured for `ledgerDirectories`. Otherwise,
the best practice is to leave it at its default.
4. This PR is best released together with #1949.
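
The path selection described in items 1 and 3 could be sketched roughly as follows. This is an illustration only, not the code in this PR: the class `MetadataCachePathSketch`, the method `resolveCachePaths`, and the explicit validation are hypothetical names and behavior chosen for the example, and using the configured path as-is is an assumption. The `current/metadata-cache` suffix follows the directory layout shown in item 2.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and validation are assumptions, not BookKeeper APIs.
public class MetadataCachePathSketch {

    /**
     * Returns one metadata-cache location per ledger directory.
     *
     * @param configuredPath value of GcEntryLogMetadataCachePath, or null for the default
     * @param ledgerDirs     the configured ledger directories
     */
    static List<Path> resolveCachePaths(String configuredPath, List<String> ledgerDirs) {
        List<Path> result = new ArrayList<>();
        if (configuredPath == null) {
            // Default: every ledger directory gets its own RocksDB instance,
            // so no two garbage collector threads share a LOCK file.
            for (String dir : ledgerDirs) {
                result.add(Paths.get(dir, "current", "metadata-cache"));
            }
            return result;
        }
        // An explicit path is a single location, so it can serve only one
        // ledger directory (assumed here to be rejected up front).
        if (ledgerDirs.size() > 1) {
            throw new IllegalArgumentException(
                "GcEntryLogMetadataCachePath supports only one ledger directory, got "
                    + ledgerDirs.size());
        }
        result.add(Paths.get(configuredPath));
        return result;
    }
}
```

With the default (`null`) and two ledger directories, the sketch yields two distinct cache locations. With an explicit path and two ledger directories, it fails fast instead of letting the second RocksDB open hit the lock error shown in the motivation.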