Thanks Flavio,

We have been considering reading the BK state out of ZK ourselves. I could see how this data might be available in a roundabout (not advised) way from the BK client. I don't think we would need to manipulate it, because after we have processed the ledgers we delete them. The only additional state I believe we would need is simply a lock around a ledger while it is being processed (moved out of BK).
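
To make the "lock around a ledger" idea concrete, here is a minimal sketch of the side state a reader would need: who currently holds a ledger, and which ledgers are already done. Everything in it is hypothetical (the `LedgerClaims` class, its method names, the reader names). It keeps the state in memory purely for illustration; in a real deployment the claim would more likely be an ephemeral znode in ZooKeeper, so that it disappears if the reader dies.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of BookKeeper or ZooKeeper.
// In-memory stand-in for the per-ledger lock and "already processed" state.
final class LedgerClaims {
    // ledger id -> name of the reader currently moving that ledger out of BK
    private final Map<Long, String> claimed = new HashMap<>();
    // ledger ids that have been fully processed
    private final Set<Long> processed = new HashSet<>();

    /** Tries to take the lock on a ledger; fails if it is held or already processed. */
    synchronized boolean tryClaim(long ledgerId, String reader) {
        if (processed.contains(ledgerId) || claimed.containsKey(ledgerId)) {
            return false;
        }
        claimed.put(ledgerId, reader);
        return true;
    }

    /** Marks the ledger as processed and drops the lock; only the holder may do this. */
    synchronized boolean complete(long ledgerId, String reader) {
        if (!reader.equals(claimed.get(ledgerId))) {
            return false;
        }
        claimed.remove(ledgerId);
        processed.add(ledgerId);
        return true;
    }

    /** Drops the lock without marking the ledger processed, e.g. after a failed copy. */
    synchronized boolean release(long ledgerId, String reader) {
        return claimed.remove(ledgerId, reader);
    }

    synchronized boolean isProcessed(long ledgerId) {
        return processed.contains(ledgerId);
    }
}
```

A reader would call `tryClaim` before copying a closed ledger, `complete` once the copy has landed in long-term storage, and `release` if the copy fails so another reader can retry. If ledgers are deleted from BK right after processing, the processed set may turn out to be unnecessary.
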
On Mon, Feb 4, 2013 at 4:08 PM, Flavio Junqueira <fpjunque...@yahoo.com> wrote:
> Hi Whitney,
>
> In general we leave it up to the application to organize the ledgers it
> creates. It is indifferent to bookkeeper which ledgers have been created by
> a single writer and how the contents of ledgers relate. Managing this
> kind of application state is something that zookeeper does well and since
> we assume a zookeeper deployment, the application can use it to manage its
> metadata. Although we typically don't recommend that applications access
> the zookeeper metadata for bookkeeper ledgers, there is nothing really that
> prevents you from doing it. If it is useful for you to read this metadata,
> I don't see a big problem with doing it, although I'd like to stress that I
> find it a bad idea to try to manipulate the zookeeper state for bookkeeper
> ledgers directly.
>
> On your point about duplication, don't you need to remember which closed
> ledgers have been already processed? Just knowing the list of closed
> ledgers might not be sufficient. If this is the case, then you need to keep
> some additional metadata on the side.
>
> -Flavio
>
> On Feb 4, 2013, at 7:44 PM, Whitney Sorenson <wsoren...@hubspot.com>
> wrote:
>
> Thank you for responding.
>
> Forgive me if I'm missing something, but if I have a writer and separate
> readers, why would I want to have to communicate ledger ids between them?
> More specifically, we have a series of writers writing to a write-ahead log
> and a separate set of readers that are consuming these ledgers to move them
> into long term storage and send them to queues / workflows to be processed.
> This means I have to keep the state about which ledgers are available, and
> which are closed, which seems to be a complete duplication of the state
> that is already in BK.
>
> I'm not sure named ledgers are helpful in this situation, except that we
> could keep less state (perhaps a sequential id.)
>
> On Mon, Feb 4, 2013 at 1:27 PM, Sijie Guo <guosi...@gmail.com> wrote:
>
>>
>> Hello, Whitney:
>>
>> Please check the replies inline.
>>
>> On Mon, Feb 4, 2013 at 8:47 AM, Whitney Sorenson
>> <wsoren...@hubspot.com> wrote:
>>
>>> Hey all,
>>>
>>> A couple questions about running BK stand-alone:
>>>
>>> 1) If I call openLedgerNoRecovery am I blocking writes or not? What are
>>> the guarantees I lose - just ordering? Can I use this to essentially read /
>>> tail an active ledger?
>>>
>>
>> Opening a ledger using openLedgerNoRecovery doesn't block any writes to it,
>> and you don't lose the ordering guarantee. You could use it to read/tail an
>> active ledger, but please keep in mind that you need to call
>> #readLastConfirmed to catch up to the latest confirmed entries added by the
>> writer. The entries you can read from an openLedgerNoRecovery ledger
>> are only those between 0 and last confirmed.
>>
>> You could check:
>> http://zookeeper.apache.org/bookkeeper/docs/r4.2.0/apidocs/org/apache/bookkeeper/client/BookKeeper.html#asyncOpenLedgerNoRecovery(long,
>> org.apache.bookkeeper.client.BookKeeper.DigestType, byte[],
>> org.apache.bookkeeper.client.AsyncCallback.OpenCallback, java.lang.Object)
>>
>>
>>>
>>> 2) How can I access BK's metadata so that I can determine a list of
>>> ledgers, and which ledgers are closed/open? It doesn't appear in the client
>>> documentation (
>>> http://zookeeper.apache.org/bookkeeper/docs/r4.2.0/apidocs/org/apache/bookkeeper/client/)
>>> Is this not an intended operation? Are clients supposed to track ledger ids
>>> on their own (we are currently doing this but it seems suboptimal)?
>>>
>>>
>> Currently we don't expose this API to clients. Is there any special case
>> you are considering? We'd be happy to expose it if necessary.
>>
>> Most use cases work in the following style: a *standby* writer observes
>> the *active* writer's state; if the *active* writer fails, the *standby*
>> writer takes over the responsibility, closes the ledger written by the
>> *active* writer, replays the ledger and creates a new ledger to write new
>> entries. For now, clients need to track ledger ids on their end.
>>
>> There is a proposal to provide *named* ledgers on top of bookkeeper to
>> ease the user's experience of tracking ledger ids. You could check:
>> https://issues.apache.org/jira/browse/BOOKKEEPER-220 . We are also
>> discussing whether to provide ledger names internally in bookkeeper for
>> metadata access concerns. We'd like to hear your feedback on the usage of
>> the API to make it better.
>>
>>
>>
>>> Thank you;
>>>
>>> -Whitney Sorenson
>>> HubSpot
>>>
>>
>
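
The tailing pattern from Sijie's answer to question 1 (open without recovery, then poll #readLastConfirmed and read only up to that point) can be sketched as below. This is a sketch under assumptions, not BookKeeper client code: `TailableLedger` is a hypothetical interface assumed to mirror the two operations a ledger handle opened with openLedgerNoRecovery would provide, `LedgerTailer` is a made-up name, and `InMemoryLedger` is only a test double. A real adapter would also have to deal with the checked exceptions the BookKeeper client throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical view of a ledger opened without recovery (an assumption, not the BK API).
interface TailableLedger {
    /** Highest entry id the writer has confirmed, or -1 if nothing is confirmed yet. */
    long readLastConfirmed();

    /** Entries with ids in [first, last], both inclusive. */
    List<byte[]> readEntries(long first, long last);
}

// Remembers how far it has read and never reads past the last confirmed entry.
final class LedgerTailer {
    private final TailableLedger ledger;
    private long nextEntry = 0; // entry ids start at 0

    LedgerTailer(TailableLedger ledger) {
        this.ledger = ledger;
    }

    /** Returns entries confirmed since the previous call; empty if there are none. */
    List<byte[]> poll() {
        long lastConfirmed = ledger.readLastConfirmed();
        if (lastConfirmed < nextEntry) {
            return new ArrayList<>();
        }
        List<byte[]> batch = ledger.readEntries(nextEntry, lastConfirmed);
        nextEntry = lastConfirmed + 1;
        return batch;
    }
}

// Test double standing in for an active ledger that a writer is still appending to.
final class InMemoryLedger implements TailableLedger {
    private final List<byte[]> entries = new ArrayList<>();
    private long lastConfirmed = -1;

    void add(byte[] data) {
        entries.add(data);
    }

    /** Simulates the writer's confirmation catching up with everything added so far. */
    void confirmAll() {
        lastConfirmed = entries.size() - 1;
    }

    @Override
    public long readLastConfirmed() {
        return lastConfirmed;
    }

    @Override
    public List<byte[]> readEntries(long first, long last) {
        return new ArrayList<>(entries.subList((int) first, (int) last + 1));
    }
}
```

The reader calls `poll()` on whatever schedule suits it; entries the writer has added but not yet confirmed stay invisible until a later poll.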