Thanks for collecting this data. I think the expectation is that HBase is both fast and reliable, so picking an option that delivers both is tricky.
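
For reference, here is a small sketch that recomputes the relative differences from the timings quoted below. The run times are copied from J-D's mail; the helper names are mine and purely illustrative. By this arithmetic, the quoted "~7%" and "14%" figures look closer to how much longer the flushlogentries=1 baseline takes than to the time saved relative to it.

```python
# Wall-clock times (ms) copied from the PE sequentialWrite runs quoted below.
BASELINE_MS = 354765             # flushlogentries=1
FLUSH_100_MS = [333972, 331943]  # flushlogentries=100
DEFERRED_MS = [309857, 311440]   # flushlogentries=1 plus deferred flush


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def percent_less_time(baseline_ms, other_ms):
    """Time saved by `other`, as a percentage of the baseline run."""
    return 100.0 * (baseline_ms - other_ms) / baseline_ms


def percent_more_time(baseline_ms, other_ms):
    """Extra time taken by the baseline, as a percentage of `other`."""
    return 100.0 * (baseline_ms - other_ms) / other_ms


if __name__ == "__main__":
    configs = (("flushlogentries=100", FLUSH_100_MS),
               ("deferred flush", DEFERRED_MS))
    for label, runs in configs:
        avg = mean(runs)
        print(f"{label}: {percent_less_time(BASELINE_MS, avg):.1f}% less time; "
              f"baseline takes {percent_more_time(BASELINE_MS, avg):.1f}% more")
```

These are single-host numbers from two runs each, so the comparison says nothing yet about a clustered setup.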
I generally support flushlogentries=1, but I think we need a clustered test before we can say. The performance is substantially different on HDFS across multiple hosts.

On Thu, Dec 3, 2009 at 3:24 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> I have the feeling that this discussion isn't over, there's no
> consensus yet, so I did some tests to get some numbers.
>
> PE sequentialWrite 1 with the write buffer disabled (I get the same
> numbers on every different config with it) on a standalone setup. I
> stopped HBase and deleted the data dir between each run.
>
> - hbase.regionserver.flushlogentries=1 and
>   hbase.regionserver.optionallogflushinterval=1000
>   ran in 354765ms
>
> - hbase.regionserver.flushlogentries=100 and
>   hbase.regionserver.optionallogflushinterval=1000
>   run #1 in 333972ms
>   run #2 in 331943ms
>
> - hbase.regionserver.flushlogentries=1,
>   hbase.regionserver.optionallogflushinterval=1000 and deferred flush
>   enabled on TestTable
>   run #1 in 309857ms
>   run #2 in 311440ms
>
> So 100 entries per flush takes ~7% less time, deferred flush takes 14% less.
>
> I thereby think that not only should we set flushlogentries=1 in 0.21,
> but also we should enable deferred log flush by default with a lower
> optional log flush interval. It will be a nearly as safe but much
> faster alternative to the previous option. I would even get rid of the
> hbase.regionserver.flushlogentries config.
>
> J-D
>
> On Tue, Nov 17, 2009 at 7:10 PM, Jean-Daniel Cryans <jdcry...@apache.org>
> wrote:
>> Well it's even better than that ;) We have optional log flushing which
>> by default is 10 secs. Make that 100 milliseconds and that's as much
>> data you can lose. If any other table syncs then this table's edits
>> are also synced.
>>
>> J-D
>>
>> On Tue, Nov 17, 2009 at 4:36 PM, Jonathan Gray <jl...@streamy.com> wrote:
>>> Thoughts on a client-facing call to explicit call a WAL sync?
>>> So I could turn on DEFERRED_LOG_FLUSH (possibly leave it on always),
>>> run a batch of my inserts, and then run an explicit flush/sync. The
>>> returning of that call would guarantee to the client that the data up
>>> to that point is safe.
>>>
>>> JG
>>>
>>> On Mon, November 16, 2009 11:00 am, Jean-Daniel Cryans wrote:
>>>> I added a new feature for tables called "deferred flush", see
>>>> https://issues.apache.org/jira/browse/HBASE-1944
>>>>
>>>> My opinion is that the default should be paranoid enough to not lose
>>>> any user data. If we can change a table's attribute without taking it down
>>>> (there's a jira on that), wouldn't that solve the import problem?
>>>>
>>>> For example: have some table that needs to have fast insertion via MR.
>>>> During the creation of the job, you change the table's
>>>> DEFERRED_LOG_FLUSH to "true", then run the job and finally set the
>>>> value to false when the job is done.
>>>>
>>>> This way you still pass the responsibility to the user but for
>>>> performance reasons.
>>>>
>>>> J-D
>>>>
>>>> On Mon, Nov 16, 2009 at 2:05 AM, Cosmin Lehene <cleh...@adobe.com> wrote:
>>>>
>>>>> We could have a speedy default and an extra parameter for puts that
>>>>> would specify a flush is needed. This way you pass the responsibility to
>>>>> the user and he can decide if he needs to be paranoid or not. This could
>>>>> be part of Put and even specify granularity of the flush if needed.
>>>>>
>>>>> Cosmin
>>>>>
>>>>> On 11/15/09 6:59 PM, "Andrew Purtell" <apurt...@apache.org> wrote:
>>>>>
>>>>>> I agree with this.
>>>>>>
>>>>>> I also think we should leave the default as is with the caveat that
>>>>>> we call out the durability versus write performance tradeoff in the
>>>>>> flushlogentries description and up on the wiki somewhere, maybe on
>>>>>> http://wiki.apache.org/hadoop/PerformanceTuning .
>>>>>> We could also provide two example configurations, one for performance
>>>>>> (reasonable tradeoffs), one for paranoia. I put up an issue:
>>>>>> https://issues.apache.org/jira/browse/HBASE-1984
>>>>>>
>>>>>> - Andy
>>>>>>
>>>>>> ________________________________
>>>>>> From: Ryan Rawson <ryano...@gmail.com>
>>>>>> To: hbase-dev@hadoop.apache.org
>>>>>> Sent: Sat, November 14, 2009 11:22:13 PM
>>>>>> Subject: Re: Should we change the default value of
>>>>>> hbase.regionserver.flushlogentries for 0.21?
>>>>>>
>>>>>> That sync at the end of a RPC is my doing. You dont want to sync
>>>>>> every _EDIT_, after all, the previous definition of the word "edit"
>>>>>> was each KeyValue. So we could be calling sync for every single
>>>>>> column in a row. Bad stuff.
>>>>>>
>>>>>> In the end, if the regionserver crashes during a batch put, we will
>>>>>> never know how much of the batch was flushed to the WAL. Thus it makes
>>>>>> sense to only do it once and get a massive, massive, speedup.
>>>>>>
>>>>>> On Sat, Nov 14, 2009 at 9:45 PM, stack <st...@duboce.net> wrote:
>>>>>>
>>>>>>> I'm for leaving it as it is, at every 100 edits -- maybe every 10
>>>>>>> edits? Speed stays as it was. We used to lose MBs. By default,
>>>>>>> we'll now lose 99 or 9 edits max.
>>>>>>>
>>>>>>> We need to do some work bringing folks along regardless of what we
>>>>>>> decide. Flush happens at the end of the put up in the regionserver.
>>>>>>> If you are doing a batch of commits -- e.g. using a big write buffer
>>>>>>> over on your client -- the puts will only be flushed on the way out
>>>>>>> after the batch put completes EVEN if you have configured hbase to
>>>>>>> sync every edit (I ran into this this evening. J-D sorted me out).
>>>>>>> We need to make sure folks are up on this.
>>>>>>>
>>>>>>> St.Ack
>>>>>>>
>>>>>>> On Sat, Nov 14, 2009 at 4:37 PM, Jean-Daniel Cryans
>>>>>>> <jdcry...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi dev!
>>>>>>>>
>>>>>>>> Hadoop 0.21 now has a reliable append and flush feature and this
>>>>>>>> gives us the opportunity to review some assumptions. The current
>>>>>>>> situation:
>>>>>>>>
>>>>>>>> - Every edit going to a catalog table is flushed so there's no
>>>>>>>>   data loss.
>>>>>>>> - The user tables edits are flushed every
>>>>>>>>   hbase.regionserver.flushlogentries which by default is 100.
>>>>>>>>
>>>>>>>> Should we now set this value to 1 in order to have more durable
>>>>>>>> but slower inserts by default? Please speak up.
>>>>>>>>
>>>>>>>> Thx,
>>>>>>>>
>>>>>>>> J-D
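
To make the durability arithmetic in the thread concrete, here is a toy model of a log that syncs every N entries. It is only a sketch and not HBase code: `ToyWal`, `max_edits_lost`, and their parameters are hypothetical names, and the model assumes (as described above for batch puts) that the sync check runs once per call, after all edits of that call have been appended.

```python
class ToyWal:
    """Toy write-ahead log that syncs once `flush_entries` edits are pending."""

    def __init__(self, flush_entries):
        self.flush_entries = flush_entries
        self.unsynced = 0    # edits appended since the last sync
        self.synced = 0      # edits that would survive a crash
        self.sync_calls = 0  # number of syncs performed

    def append(self, edits=1):
        # One call models one RPC: the threshold is checked once, after
        # every edit in the call has been appended.
        self.unsynced += edits
        if self.unsynced >= self.flush_entries:
            self.sync()

    def sync(self):
        self.synced += self.unsynced
        self.unsynced = 0
        self.sync_calls += 1


def max_edits_lost(flush_entries, total_edits=1000):
    """Largest number of unsynced single-edit appends seen at any point."""
    wal = ToyWal(flush_entries)
    worst = 0
    for _ in range(total_edits):
        wal.append()
        worst = max(worst, wal.unsynced)
    return worst


if __name__ == "__main__":
    for n in (100, 10, 1):
        print(f"flush_entries={n}: at most {max_edits_lost(n)} edits unsynced")
    batch = ToyWal(flush_entries=1)
    batch.append(edits=500)  # one batch put, one sync
    print(f"batch of 500 edits -> {batch.sync_calls} sync call(s)")
```

In this model the worst case is N-1 pending edits, which matches the "99 or 9 edits max" estimate, and a batch handed over in one call is synced once regardless of the threshold. The model ignores the optional time-based flush and deferred flush entirely.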