Re: Good VLDB paper on WALs

2011-01-01 Thread M. C. Srivas
My observation (from working on Spinnaker's NFS server, and then on MapR's server) is that ELR + group-commit is essential. ELR is trivial, and I am a bit surprised that the paper claims no one does it. Once ELR is implemented, the bottleneck immediately shifts to forcing the log on a commit. But if

Re: Good VLDB paper on WALs

2010-12-29 Thread Stack
Nice list of things we need to do to make logging faster (with useful citations on the current state of the art). This notion of early lock release (ELR) is worth looking into (Jon, for high rates of counter transactions, you've been talking about aggregating counts in front of the WAL lock... maybe an

Re: Good VLDB paper on WALs

2010-12-29 Thread Ryan Rawson
Oh no, let's be wary of those server rewrites. My micro profiling is showing about 30 usec for a lock handoff in the HBase client... I think we should be able to get big wins with minimal things. A big rewrite has its major costs, not to mention that to effectively be async we'd have to rewrite

Re: Good VLDB paper on WALs

2010-12-29 Thread Nicolas Spiegelberg
+1 for ELR. I think having some data structure where we prepare the next stage of sync() operations instead of holding the row lock over the sync would be a big win for hot regions without a huge refactor. I think the other two optimizations are useful to think about, but wouldn't have the same

Re: Good VLDB paper on WALs

2010-12-28 Thread Stack
On Mon, Dec 27, 2010 at 11:48 AM, Dhruba Borthakur dhr...@gmail.com wrote:
> Does anybody have any idea on how to figure out what percentage of the above sys-time is spent in thread scheduling vs the time spent in other system calls (especially in the Namenode context)?

Dhruba: Our Benoit

Good VLDB paper on WALs

2010-12-24 Thread Todd Lipcon
Via Hammer - I thought this was a pretty good read, some good ideas for optimizations for our WAL. http://infoscience.epfl.ch/record/149436/files/vldb10aether.pdf -Todd -- Todd Lipcon Software Engineer, Cloudera