[
https://issues.apache.org/jira/browse/BOOKKEEPER-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151875#comment-13151875
]
Sijie Guo commented on BOOKKEEPER-39:
-------------------------------------
> I propose we have a ZNODE at the toplevel of the namespace /ledgers/LAYOUT
I like this proposal; it makes management more consistent.
> Also, I'm still not convinced about hashing, instead of simply splitting the
> long.
Splitting the long means that we need to get an id before splitting. As
discussed in previous comments, neither a sequential znode nor a test-and-set
counter is good enough for generating unique ids.
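To make "splitting the long" concrete, here is a minimal sketch of how an
already-allocated ledger id could be mapped to a multi-level znode path. The
class name, the /ledgers root, the zero-padding to 10 digits, and the 2-4-4
digit split are all assumptions chosen for illustration; they are not taken
from the patch. Note that the method takes the id as input, which is exactly
the point above: the id has to exist before it can be split.

```java
import java.util.Locale;

// Hypothetical helper: the name, root path, and digit split are assumptions.
final class LedgerPathSplitter {
    private static final String ROOT = "/ledgers";

    /** Maps a non-negative ledger id to a three-level path under ROOT. */
    static String toPath(long ledgerId) {
        if (ledgerId < 0) {
            throw new IllegalArgumentException("ledger id must be non-negative");
        }
        // Zero-pad to at least 10 digits; larger ids simply widen the first level.
        String digits = String.format(Locale.ROOT, "%010d", ledgerId);
        int n = digits.length();
        String level1 = digits.substring(0, n - 8);
        String level2 = digits.substring(n - 8, n - 4);
        String leaf = digits.substring(n - 4);
        return ROOT + "/" + level1 + "/" + level2 + "/L" + leaf;
    }

    public static void main(String[] args) {
        System.out.println(toPath(1234567890L)); // prints /ledgers/12/3456/L7890
    }
}
```

With a split like this, each of the two lower levels holds at most 10,000
children, so no single getChildren call has to return every ledger.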
"Hashing" is not the right word to describe the LedgerManager in the patch,
since it doesn't generate an id first and then hash it. I prefer to call it
Hierarchical, as you suggested: it first picks a sequential znode from a
2-level hierarchy of znodes in a round-robin way, then generates the id using
that sequential znode. This means id generation can be spread across different
sequential znodes and does not depend on a single znode.
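The round-robin idea can be sketched in memory as follows. This is only an
illustration under stated assumptions, not the code in the patch: the class
name is hypothetical, a flat array of buckets stands in for the 2-level
hierarchy, each AtomicLong stands in for the sequence number a sequential
znode would hand out, and the formula that combines bucket index and sequence
number into an id is one possible choice that the comment does not specify.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in; no ZooKeeper calls are made.
final class RoundRobinIdGenerator {
    // One counter per bucket, playing the role of a sequential znode's sequence.
    private final AtomicLong[] buckets;
    // Round-robin cursor used to choose the next bucket.
    private final AtomicInteger cursor = new AtomicInteger();

    RoundRobinIdGenerator(int bucketCount) {
        if (bucketCount <= 0) {
            throw new IllegalArgumentException("bucketCount must be positive");
        }
        buckets = new AtomicLong[bucketCount];
        for (int i = 0; i < bucketCount; i++) {
            buckets[i] = new AtomicLong();
        }
    }

    /** Picks a bucket in round-robin order, then derives an id from its sequence. */
    long nextId() {
        int bucket = Math.floorMod(cursor.getAndIncrement(), buckets.length);
        long seq = buckets[bucket].getAndIncrement();
        // Assumed combination: unique because each (seq, bucket) pair occurs once.
        return seq * buckets.length + bucket;
    }

    public static void main(String[] args) {
        RoundRobinIdGenerator gen = new RoundRobinIdGenerator(4);
        for (int i = 0; i < 8; i++) {
            System.out.println(gen.nextId());
        }
    }
}
```

Because each bucket advances its own counter, concurrent creators contend on
different counters rather than on a single one, which is the property the
paragraph above describes.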
For admins, is it OK to provide a tool that uses the LedgerManager interface
to talk to the different layouts, so that they can interact with bookie
metadata more easily? This would be similar to BOOKKEEPER-77.
> Bookie server failed to restart because of too many ledgers (more than
> ~50,000 ledgers)
> ---------------------------------------------------------------------------------------
>
> Key: BOOKKEEPER-39
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-39
> Project: Bookkeeper
> Issue Type: Bug
> Components: bookkeeper-server
> Affects Versions: 4.0.0
> Reporter: Sijie Guo
> Assignee: Sijie Guo
> Fix For: 4.0.0
>
> Attachments: bookkeeper-39.patch, bookkeeper-39.patch_v2
>
>
> If we have ~500,000 topics in hedwig, we might have more than ~500,000
> ledgers in bookkeeper (a topic has more than 1 ledger). So when the bookie
> server restarts, a logfile GC thread is started, which calls
> zk.getChildren to fetch all ledgers, and this fails because of the packet
> length limitation.
> 2011-08-01 01:18:46,373 - ERROR
> [main-EventThread:EntryLogger$GarbageCollectorThread$1@164] - Error polling
> ZK for the available ledger nodes:
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode
> = ConnectionLoss for /ledgers
> at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1519)
> at
> org.apache.bookkeeper.bookie.EntryLogger$GarbageCollectorThread$1.processResult(EntryLogger.java:162)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:592)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:481)
> 2011-08-01 01:18:46,373 - WARN [main-EventThread:Bookie$1@242] - ZK client
> has been disconnected to the ZK server!
> 2011-08-01 01:18:47,278 - WARN
> [main-SendThread(perf13.platform.mobile.sp2.yahoo.com:2181):ClientCnxn$SendThread@980]
> - Session 0x131833dec850034 for server
> perf13.platform.mobile.sp2.yahoo.com/98.139.43.86:2181, unexpected error,
> closing socket connection and attempting reconnect
> java.io.IOException: Packet len9976413 is out of range!
> at
> org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:112)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:78)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:264)
> at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:958)