Please note that I started this thread to stir up discussion; these are not my final proposals.

On Tue, May 16, 2017 at 1:02 AM, Sijie Guo <guosi...@gmail.com> wrote:

> On May 16, 2017 12:09 AM, "Enrico Olivelli - Diennea"
> <enrico.olive...@diennea.com> wrote:
>
> On Tue, 16/05/2017 at 00:06 -0700, Sijie Guo wrote:
>
> On Mon, May 15, 2017 at 10:35 PM, Venkateswara Rao Jujjuri
> <jujj...@gmail.com> wrote:
>
> As we are moving towards a mega store, which can house 10s of millions
> (even 100s) of ledgers and 1000s of bookies, there can be a huge
> overhead on some of the operations we are performing right now.
>
> 1. Compaction/GC
> A deletion is just a metadata operation: it deletes the zk node.
> (Currently it deletes only the leaf node; deleting the entire tree
> where applicable would be a minor bugfix.)
>
> +1 for a fix for improving this.
>
> +1 for me too, client could send some hint to the bookie that the
> ledger has been removed and maybe trigger a special GC
>
> Just to clarify, my +1 is on the fix of deleting the tree nodes in the
> current approach. Not for sending the requests yet.
>
> But each bookie needs to get a list of existing ledgers from zk,
> compare it to its local storage, and identify deleted ledgers. It then
> goes through the compaction logic, which is another process.
>
> But 1000s of bookies parsing the entire zk tree in parallel, each
> building its own list, doesn't appear to me to be an efficient,
> scalable architecture.
>
> Why not introduce an opportunistic delete operation on the client
> side, which will inform all bookies in that ledger's metadata? We can
> still keep our brute-force method, but at a very low frequency (once a
> week?), to address transient/corner-case scenarios such as a bookie
> being down at that time. Is there any big architectural correctness
> issue I am missing in this method?
>
> I don't think there is a correctness issue with the approach you
> proposed if the current background gc is still running.
> The current approach is just for simplifying the client logic.
>
> Instead of introducing complexity (more operations) on the client
> side, why can't the leader (auditor) perform the deletions?

I believe having the auditor do it is more complicated, unless I am mistaken.

- The auditor leader runs on just one node, so it needs to communicate
  with the bookies over the client protocol. Right? Hence we can't avoid
  the 'complexity' you mentioned above.
- How does the auditor leader learn the list of deleted ledgers? This
  needs more work.

> 2. Close
> Ledger close is also a metadata operation. I believe sending an
> opportunistic close to the bookies of the current ensemble can greatly
> enhance some of the use-cases where we need open-to-close consistency,
> wherein the data doesn't need to be persistent until the close. Any
> thoughts?
>
> You mean "close-to-open" consistency?
>
> I am trying to understand: why is "wherein the data doesn't need to be
> persistent until the close" related to ledger close? Are you thinking
> of flushing all entries on the bookies on closing a ledger?

Correct.

> How do you handle ensemble changes?

Thinking aloud... but in this mode, any ensemble change will result in a write failure. The client must be aware of this mode.

JV

> - Sijie

--
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then you win. - Mahatma Gandhi
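
To make the opportunistic delete from item 1 a bit more concrete, here is a rough Python sketch of the idea, not BookKeeper code. All names (`Bookie`, `MetadataStore`, `delete_ledger`, `on_delete_hint`, `full_gc_scan`) are hypothetical, the metadata store is an in-memory stand-in for zk, and each ledger is assumed to have a single ensemble. The client deletes the metadata first, then sends a best-effort hint to every bookie in the ensemble; a bookie that was down catches up later through the low-frequency brute-force scan.

```python
class Bookie:
    """Hypothetical in-memory bookie: tracks which ledgers it stores."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.local_ledgers = set()

    def on_delete_hint(self, ledger_id):
        # Opportunistic path: drop the ledger as soon as the client tells us.
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        self.local_ledgers.discard(ledger_id)

    def full_gc_scan(self, live_ledgers):
        # Brute-force fallback: compare local storage with the metadata store.
        stale = self.local_ledgers - set(live_ledgers)
        self.local_ledgers -= stale
        return stale


class MetadataStore:
    """Hypothetical stand-in for the zk ledger metadata."""

    def __init__(self):
        self.ledgers = {}  # ledger_id -> list of bookies in the ensemble

    def create(self, ledger_id, ensemble):
        self.ledgers[ledger_id] = list(ensemble)
        for bookie in ensemble:
            # Simulation shortcut: pretend entries were written to each bookie.
            bookie.local_ledgers.add(ledger_id)

    def delete(self, ledger_id):
        # The delete itself stays a pure metadata operation.
        return self.ledgers.pop(ledger_id)


def delete_ledger(store, ledger_id):
    """Delete metadata, then send best-effort hints to the ensemble.

    Returns the names of bookies that missed the hint; correctness does not
    depend on them, because the periodic full scan still reconciles.
    """
    ensemble = store.delete(ledger_id)
    missed = []
    for bookie in ensemble:
        try:
            bookie.on_delete_hint(ledger_id)
        except ConnectionError:
            missed.append(bookie.name)
    return missed
```

In this sketch a failed hint is simply recorded and ignored, which is why the background scan has to keep running, just far less often.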