On Tue, May 16, 2017 at 10:59 PM, Venkateswara Rao Jujjuri < [email protected]> wrote:
> Please note that I started this thread to stir up discussion; these are
> not my final proposals.
>
> On Tue, May 16, 2017 at 1:02 AM, Sijie Guo <[email protected]> wrote:
>
> > On May 16, 2017 12:09 AM, "Enrico Olivelli - Diennea" <
> > [email protected]> wrote:
> >
> > On Tue, 16/05/2017 at 00:06 -0700, Sijie Guo wrote:
> >
> > On Mon, May 15, 2017 at 10:35 PM, Venkateswara Rao Jujjuri <
> > [email protected]> wrote:
> >
> > As we are moving towards a mega store, which can house 10s of millions
> > (even 100s) of ledgers and 1000s of bookies, there can be a huge overhead
> > on some of the operations we are performing right now.
> >
> > 1. Compaction/GC
> > A deletion is just a metadata operation: it deletes the zk node.
> > (Currently it deletes only the leaf node; deleting the entire tree where
> > applicable can be a minor bugfix.)
> >
> > +1 for a fix for improving this.
> >
> > +1 for me too, client could send some hint to the bookie that the ledger
> > has been removed and maybe trigger a special GC
> >
> > Just to clarify, my +1 is on the fix of deleting the tree nodes in the
> > current approach. Not for sending the requests yet.
> >
> > But each bookie needs to get a list of existing ledgers from bookie,
> > compare it to its local storage, and identify deleted ledgers. Then go
> > through compaction logic, which is another process.
> >
> > But 1000s of bookies parsing the entire zk tree in parallel, each making
> > its own list, doesn't appear to me to be an efficient, scalable
> > architecture.
> >
> > Why not introduce an opportunistic delete operation on the client side,
> > which will inform all the bookies in that ledger's metadata? We can still
> > keep our brute-force method, but at a very low frequency (once a week?),
> > to address transient/corner-case scenarios like a bookie being down at
> > that time. Is there any big architectural correctness issue I am missing
> > in this method?
> >
> > I don't think there is a correctness issue with the approach you proposed
> > if the current background gc is still running.
> > The current approach is just for simplifying the client logic.
> >
> > Instead of introducing complexity (more operations) on the client side,
> > why can't the leader (auditor) perform the deletions?
> >
>
> I believe auditor doing it is more complicated, unless I am mistaken.
>

Technically Auditor is a **client**. What I am trying to say here is: it can
be part of the client's job, but it can be kept as an internal client. It is
always good to make the public client as thin as possible.

> - Auditor Leader is on just one node, so it needs to communicate with
> bookies over the client protocol. Right?
> Hence we can't avoid the 'complexity' you mentioned above.
>
> - How does the auditor leader know about the deleted list? This needs more
> work.
>
> >
> > 2. Close
> > Ledger close is also a metadata operation. I believe sending an
> > opportunistic close to bookies of the current ensemble can greatly
> > enhance some of the use-cases where we need open-to-close consistency.
> > Where in the data doesn't need to be persistent until the close. Any
> > thoughts??
> >
> > You mean "close-to-open" consistency?
> >
> > I am trying to understand - why is "where in the data doesn't need to be
> > persistent until the close" related to ledger close? Are you thinking of
> > flushing all entries on the bookies on closing a ledger?
> >
>
> Correct.
>
> > How do you handle ensemble changes?
> >
>
> Thinking aloud... but in this mode, any ensemble change will result in
> write failure.
> Client must be aware of this mode.

+1 on this (although there is some tech debt around close-to-open semantics).
At the least, I think sending a fence request or advancing the LAC on close
would help a lot in reducing the zk dependency.
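To make the delete proposal concrete, here is a minimal sketch of the two paths discussed above: a best-effort delete hint sent to the bookies listed in the ledger's metadata, plus the existing brute-force comparison kept as a rare fallback. It is plain Python for illustration only; `MetadataStore`, `Bookie`, `on_delete_hint`, `brute_force_gc` and `delete_ledger` are hypothetical names, not BookKeeper APIs, and the real protocol details are left out.

```python
# Illustrative sketch only: every class and method name here is hypothetical
# and does not correspond to an actual BookKeeper API.


class MetadataStore:
    """Stands in for the ledger metadata kept in zk."""

    def __init__(self):
        self.ledgers = {}  # ledger_id -> list of bookie ids in its metadata

    def create(self, ledger_id, bookie_ids):
        self.ledgers[ledger_id] = list(bookie_ids)

    def delete(self, ledger_id):
        # Deletion is only a metadata operation; return the bookies to notify.
        return self.ledgers.pop(ledger_id, [])

    def list_ledgers(self):
        return set(self.ledgers)


class Bookie:
    def __init__(self, bookie_id):
        self.bookie_id = bookie_id
        self.local_ledgers = set()
        self.up = True
        self.full_scans = 0  # how often the expensive path ran

    def on_delete_hint(self, ledger_id):
        # Opportunistic path: drop the ledger without scanning metadata.
        self.local_ledgers.discard(ledger_id)

    def brute_force_gc(self, store):
        # Fallback path: compare local storage with the full metadata listing.
        self.full_scans += 1
        deleted = self.local_ledgers - store.list_ledgers()
        self.local_ledgers -= deleted
        return deleted


def delete_ledger(store, bookies, ledger_id):
    """Delete the metadata, then send a best-effort hint to each bookie."""
    for bookie_id in store.delete(ledger_id):
        bookie = bookies[bookie_id]
        if bookie.up:  # a bookie that is down simply misses the hint
            bookie.on_delete_hint(ledger_id)


store = MetadataStore()
bookies = {name: Bookie(name) for name in ("b1", "b2", "b3")}
for ledger_id in (1, 2):
    store.create(ledger_id, ["b1", "b2", "b3"])
    for bookie in bookies.values():
        bookie.local_ledgers.add(ledger_id)

bookies["b3"].up = False           # b3 is down when the delete happens
delete_ledger(store, bookies, 1)
bookies["b3"].up = True

leftover = bookies["b3"].brute_force_gc(store)  # the rare full scan
```

In this toy run b1 and b2 drop ledger 1 as soon as the hint arrives and never list the metadata, while b3, which missed the hint, only reclaims it on the low-frequency scan. That is the reason for keeping the background gc running alongside the hint.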
> JV
>
> > - Sijie
