Thanks for picking up this discussion again, J-D. See below.
On Thu, Dec 3, 2009 at 3:24 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:

> I have the feeling that this discussion isn't over, there's no
> consensus yet, so I did some tests to get some numbers.
>
> PE sequentialWrite 1 with the write buffer disabled (I get the same
> numbers on every different config with it) on a standalone setup. I
> stopped HBase and deleted the data dir between each run.

The write buffer is disabled because otherwise it would get in the way of
hbase.regionserver.flushlogentries=1?

It would be interesting to get a baseline for 0.20, which IMO would be the
settings we had in 0.19 w/ the write buffer. Would be good for comparison.
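Something like the below is what I'm thinking of. It is only a sketch; the
property values are just the current defaults mentioned elsewhere in this
thread, and you'd turn the client-side write buffer back on via
HTable.setAutoFlush(false):

  <!-- hbase-site.xml: sync the WAL every N edits / every interval,
       rather than on every single edit -->
  <property>
    <name>hbase.regionserver.flushlogentries</name>
    <value>100</value>
  </property>
  <property>
    <name>hbase.regionserver.optionallogflushinterval</name>
    <value>10000</value>  <!-- 10 secs, assuming the interval is in ms -->
  </property>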
You like the idea of the sync being time-based rather than based on some
number of edits? I can see fellas wanting both.

stack

> - hbase.regionserver.flushlogentries=1 and
> hbase.regionserver.optionallogflushinterval=1000
> ran in 354765ms
>
> - hbase.regionserver.flushlogentries=100 and
> hbase.regionserver.optionallogflushinterval=1000
> run #1 in 333972ms
> run #2 in 331943ms
>
> - hbase.regionserver.flushlogentries=1,
> hbase.regionserver.optionallogflushinterval=1000 and deferred flush
> enabled on TestTable
> run #1 in 309857ms
> run #2 in 311440ms
>
> So 100 entries per flush takes ~7% less time, deferred flush takes 14%
> less.
>
> I therefore think that not only should we set flushlogentries=1 in 0.21,
> but we should also enable deferred log flush by default with a lower
> optional log flush interval. It will be nearly as safe but a much faster
> alternative to the previous option. I would even get rid of the
> hbase.regionserver.flushlogentries config.
>
> J-D
>
> On Tue, Nov 17, 2009 at 7:10 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> > Well it's even better than that ;) We have optional log flushing which
> > by default is 10 secs. Make that 100 milliseconds and that's as much
> > data as you can lose. If any other table syncs then this table's edits
> > are also synced.
> >
> > J-D
> >
> > On Tue, Nov 17, 2009 at 4:36 PM, Jonathan Gray <jl...@streamy.com> wrote:
> >> Thoughts on a client-facing call to explicitly call a WAL sync? So I
> >> could turn on DEFERRED_LOG_FLUSH (possibly leave it on always), run a
> >> batch of my inserts, and then run an explicit flush/sync. The return
> >> of that call would guarantee to the client that the data up to that
> >> point is safe.
> >>
> >> JG
> >>
> >> On Mon, November 16, 2009 11:00 am, Jean-Daniel Cryans wrote:
> >>> I added a new feature for tables called "deferred flush", see
> >>> https://issues.apache.org/jira/browse/HBASE-1944
> >>>
> >>> My opinion is that the default should be paranoid enough to not lose
> >>> any user data. If we can change a table's attribute without taking it
> >>> down (there's a jira on that), wouldn't that solve the import problem?
> >>>
> >>> For example: have some table that needs fast insertion via MR. During
> >>> the creation of the job, you change the table's DEFERRED_LOG_FLUSH to
> >>> "true", then run the job and finally set the value back to false when
> >>> the job is done.
> >>>
> >>> This way you still pass the responsibility to the user, but for
> >>> performance reasons.
> >>>
> >>> J-D
> >>>
> >>> On Mon, Nov 16, 2009 at 2:05 AM, Cosmin Lehene <cleh...@adobe.com> wrote:
> >>>
> >>>> We could have a speedy default and an extra parameter for puts that
> >>>> would specify a flush is needed. This way you pass the responsibility
> >>>> to the user and he can decide if he needs to be paranoid or not. This
> >>>> could be part of Put and even specify the granularity of the flush if
> >>>> needed.
> >>>>
> >>>> Cosmin
> >>>>
> >>>> On 11/15/09 6:59 PM, "Andrew Purtell" <apurt...@apache.org> wrote:
> >>>>
> >>>>> I agree with this.
> >>>>>
> >>>>> I also think we should leave the default as is, with the caveat that
> >>>>> we call out the durability versus write performance tradeoff in the
> >>>>> flushlogentries description and up on the wiki somewhere, maybe on
> >>>>> http://wiki.apache.org/hadoop/PerformanceTuning . We could also
> >>>>> provide two example configurations, one for performance (reasonable
> >>>>> tradeoffs), one for paranoia. I put up an issue:
> >>>>> https://issues.apache.org/jira/browse/HBASE-1984
> >>>>>
> >>>>> - Andy
> >>>>>
> >>>>> ________________________________
> >>>>> From: Ryan Rawson <ryano...@gmail.com>
> >>>>> To: hbase-dev@hadoop.apache.org
> >>>>> Sent: Sat, November 14, 2009 11:22:13 PM
> >>>>> Subject: Re: Should we change the default value of
> >>>>> hbase.regionserver.flushlogentries for 0.21?
> >>>>>
> >>>>> That sync at the end of an RPC is my doing. You don't want to sync
> >>>>> every _EDIT_; after all, the previous definition of the word "edit"
> >>>>> was each KeyValue. So we could be calling sync for every single
> >>>>> column in a row. Bad stuff.
> >>>>>
> >>>>> In the end, if the regionserver crashes during a batch put, we will
> >>>>> never know how much of the batch was flushed to the WAL. Thus it
> >>>>> makes sense to only do it once and get a massive, massive speedup.
> >>>>>
> >>>>> On Sat, Nov 14, 2009 at 9:45 PM, stack <st...@duboce.net> wrote:
> >>>>>
> >>>>>> I'm for leaving it as it is, at every 100 edits -- maybe every 10
> >>>>>> edits? Speed stays as it was. We used to lose MBs. By default,
> >>>>>> we'll now lose 99 or 9 edits max.
> >>>>>>
> >>>>>> We need to do some work bringing folks along regardless of what we
> >>>>>> decide. Flush happens at the end of the put up in the regionserver.
> >>>>>> If you are doing a batch of commits -- e.g. using a big write buffer
> >>>>>> over on your client -- the puts will only be flushed on the way out
> >>>>>> after the batch put completes EVEN if you have configured hbase to
> >>>>>> sync every edit (I ran into this this evening. J-D sorted me out).
> >>>>>> We need to make sure folks are up on this.
> >>>>>>
> >>>>>> St.Ack
> >>>>>>
> >>>>>> On Sat, Nov 14, 2009 at 4:37 PM, Jean-Daniel Cryans
> >>>>>> <jdcry...@apache.org> wrote:
> >>>>>>
> >>>>>>> Hi dev!
> >>>>>>>
> >>>>>>> Hadoop 0.21 now has a reliable append and flush feature, and this
> >>>>>>> gives us the opportunity to review some assumptions. The current
> >>>>>>> situation:
> >>>>>>>
> >>>>>>> - Every edit going to a catalog table is flushed, so there's no
> >>>>>>> data loss.
> >>>>>>> - User table edits are flushed every
> >>>>>>> hbase.regionserver.flushlogentries, which by default is 100.
> >>>>>>>
> >>>>>>> Should we now set this value to 1 in order to have more durable
> >>>>>>> but slower inserts by default? Please speak up.
> >>>>>>>
> >>>>>>> Thx,
> >>>>>>>
> >>>>>>> J-D
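P.S. For anyone who wants to try the deferred-flush setup J-D describes
above, the per-table toggle would look roughly like the sketch below. It is
untested and only illustrative: the DEFERRED_LOG_FLUSH attribute is the one
from HBASE-1944, and the disable/modify/enable dance is there because we
cannot yet alter a table without taking it offline (the jira J-D mentions).

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.util.Bytes;

  public class DeferredFlushToggle {
    // Flip the table's DEFERRED_LOG_FLUSH attribute; call with true before
    // the MR job starts and with false once it is done.
    public static void setDeferredFlush(String table, boolean on)
        throws Exception {
      byte[] name = Bytes.toBytes(table);
      HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
      HTableDescriptor desc = admin.getTableDescriptor(name);
      desc.setValue("DEFERRED_LOG_FLUSH", Boolean.toString(on));
      admin.disableTable(name);       // no online alter yet
      admin.modifyTable(name, desc);
      admin.enableTable(name);
    }
  }

The job setup would call setDeferredFlush("TestTable", true) and the job
cleanup setDeferredFlush("TestTable", false).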