On May 16, 2017 12:09 AM, "Enrico Olivelli - Diennea" <enrico.olive...@diennea.com> wrote:

> On Tue, 16 May 2017 at 00:06 -0700, Sijie Guo wrote:
>
>> On Mon, May 15, 2017 at 10:35 PM, Venkateswara Rao Jujjuri <jujj...@gmail.com> wrote:
>>
>>> As we move towards a mega store, which can house tens of millions (even hundreds of millions) of ledgers and thousands of bookies, some of the operations we perform right now carry a huge overhead.
>>>
>>> 1. Compaction/GC
>>>
>>> A deletion is just a metadata operation: it deletes the zk node. (Currently it deletes only the leaf node; a minor bugfix could delete the entire tree where applicable.)
>>
>> +1 for a fix for improving this.
>
> +1 for me too. The client could send a hint to the bookie that the ledger has been removed, and maybe trigger a special GC.

Just to clarify: my +1 is for the fix that deletes the tree nodes in the current approach, not for sending the requests yet.

>>> But each bookie needs to get the list of existing ledgers from ZooKeeper, compare it to its local storage, and identify deleted ledgers, then go through the compaction logic, which is another process. Thousands of bookies parsing the entire zk tree in parallel, each building its own list, does not look like an efficient, scalable architecture to me. Why not introduce an opportunistic delete operation on the client side, which informs all the bookies in that ledger's metadata? We can still keep our brute-force method, but at a very low frequency (once a week?) to address transient corner cases, such as a bookie being down at the time. Is there any big architectural correctness issue I am missing in this method?
>>
>> I don't think there is a correctness issue with the approach you proposed, as long as the current background GC is still running. The current approach is just about keeping the client logic simple.
>
> Instead of introducing complexity (more operations) on the client side, why can't the leader (auditor) perform the deletions?

>>> 2. Close
>>>
>>> Ledger close is also a metadata operation. I believe sending an opportunistic close to the bookies of the current ensemble could greatly enhance some of the use cases where we need open-to-close consistency, where the data does not need to be persistent until the close. Any thoughts?
>>
>> You mean "close-to-open" consistency? I am trying to understand: why is "the data doesn't need to be persistent until the close" related to ledger close? Are you thinking of flushing all entries on the bookies when closing a ledger? How do you handle ensemble changes?
>>
>> - Sijie
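
For reference, a minimal sketch of the scan-and-compare GC JV describes above: each bookie fetches the full ledger list from metadata (the ZooKeeper tree walk he objects to at scale) and drops any local ledger that no longer exists there. The helper methods (fetchActiveLedgersFromMetadata, localLedgers, deleteLocalLedger) are illustrative stand-ins, not the actual BookKeeper internals:

    import java.util.Set;
    import java.util.SortedSet;

    public abstract class ScanAndCompareGc {

        // Full walk of the zk ledger tree -- the per-bookie cost at scale.
        protected abstract SortedSet<Long> fetchActiveLedgersFromMetadata();

        // Ledger ids this bookie currently stores on disk.
        protected abstract Set<Long> localLedgers();

        // Hand the ledger over to the compaction/deletion machinery.
        protected abstract void deleteLocalLedger(long ledgerId);

        public void runOnce() {
            SortedSet<Long> active = fetchActiveLedgersFromMetadata();
            for (long ledgerId : localLedgers()) {
                if (!active.contains(ledgerId)) {
                    // Deleted in metadata but still present locally: garbage.
                    deleteLocalLedger(ledgerId);
                }
            }
        }
    }

With thousands of bookies, each runOnce() repeats the same full metadata scan, which is the redundancy the thread is debating.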
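The "delete the entire tree" bugfix that drew the +1s could look roughly like this: delete the leaf znode, then prune parents that have become empty. This is a sketch, not the actual fix; it assumes the hierarchical metadata layout (paths like /ledgers/00/0000/L0001) and uses only the plain ZooKeeper client:

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;

    public final class LedgerPathCleaner {

        // Delete the ledger's leaf znode, then prune now-empty parents,
        // stopping at the ledgers root (e.g. "/ledgers").
        public static void deleteLedgerPath(ZooKeeper zk, String leafPath, String rootPath)
                throws KeeperException, InterruptedException {
            zk.delete(leafPath, -1); // -1: any version

            String path = parentOf(leafPath);
            while (path != null && !path.equals(rootPath)) {
                try {
                    zk.delete(path, -1); // fails if a sibling ledger still lives here
                } catch (KeeperException.NotEmptyException e) {
                    break; // parent has children, so nothing above it is empty either
                } catch (KeeperException.NoNodeException e) {
                    // already pruned by a concurrent delete; keep walking up
                }
                path = parentOf(path);
            }
        }

        private static String parentOf(String path) {
            int idx = path.lastIndexOf('/');
            return idx <= 0 ? null : path.substring(0, idx);
        }
    }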
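JV's opportunistic delete could be sketched as a best-effort fan-out after the metadata delete succeeds. sendDeleteHint below is a hypothetical one-way RPC, not an existing BookKeeper API, and the low-frequency brute-force scan stays as the safety net, which is why lost hints are acceptable:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class OpportunisticDelete {

        // Hypothetical one-way hint; a bookie treats it as a trigger for a
        // targeted GC of that ledger, never as the source of truth.
        interface BookieClient {
            void sendDeleteHint(String bookieAddress, long ledgerId);
        }

        // ensembles: every ensemble recorded in the ledger's metadata.
        void deleteLedger(long ledgerId, List<List<String>> ensembles, BookieClient client) {
            // The zk metadata delete (not shown) must succeed first; the
            // hints below are purely best-effort.
            Set<String> bookies = new HashSet<>();
            for (List<String> ensemble : ensembles) {
                bookies.addAll(ensemble);
            }
            for (String bookie : bookies) {
                try {
                    client.sendDeleteHint(bookie, ledgerId);
                } catch (RuntimeException ignored) {
                    // A down bookie is exactly the transient corner case the
                    // weekly brute-force scan still covers.
                }
            }
        }
    }

Enrico's alternative, having the auditor perform the deletions, would move this fan-out to a single elected node instead of every client.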
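Sijie's ensemble-change question is the crux of any flush-on-close scheme: entries can live on bookies from earlier ensembles, so a close-time flush has to cover every fragment recorded in the metadata, not just the current ensemble. A sketch under that assumption, with flushUpTo as a hypothetical RPC:

    import java.util.List;
    import java.util.Map;
    import java.util.NavigableMap;

    public class FlushOnClose {

        // Hypothetical RPC: make everything for ledgerId up to entryId durable.
        interface BookieClient {
            void flushUpTo(String bookie, long ledgerId, long entryId) throws Exception;
        }

        // fragments maps the first entry id of each ensemble to that ensemble,
        // mirroring how ledger metadata records ensemble changes.
        void closeWithFlush(long ledgerId, long lastEntryId,
                            NavigableMap<Long, List<String>> fragments,
                            BookieClient client) throws Exception {
            for (Map.Entry<Long, List<String>> frag : fragments.entrySet()) {
                Long nextStart = fragments.higherKey(frag.getKey());
                long fragEnd = (nextStart != null) ? nextStart - 1 : lastEntryId;
                for (String bookie : frag.getValue()) {
                    client.flushUpTo(bookie, ledgerId, fragEnd);
                }
            }
            // Only after every flush succeeds would the close be written to
            // metadata; a half-flushed close defeats the consistency goal.
        }
    }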