Re: Should we change the default value of hbase.regionserver.flushlogentries for 0.21?

Ryan Rawson Thu, 03 Dec 2009 15:37:17 -0800

Thanks for collecting this data. I think the expectation is that HBase
is both fast and reliable, so picking an option that ensures that is
tricky.


I generally support flushlogentries=1, but I think we need a clustered
test before we can say.  The performance is substantially different on
HDFS across multiple hosts.
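
For reference, the relative gaps in the standalone timings quoted below can be recomputed with a few lines of Python. This is only a sketch: the variable and function names are made up for illustration, and the figures are copied from the quoted runs, not re-measured.

```python
# Timings in milliseconds, copied from the quoted standalone runs.
baseline_ms = 354765            # flushlogentries=1
batched_ms = [333972, 331943]   # flushlogentries=100
deferred_ms = [309857, 311440]  # flushlogentries=1 plus deferred flush


def mean(values):
    """Arithmetic mean of a list of timings."""
    return sum(values) / len(values)


def percent_less(baseline, other):
    """How much less time `other` takes, as a percentage of `baseline`."""
    return (baseline - other) / baseline * 100


def percent_more(baseline, other):
    """How much more time `baseline` takes, as a percentage of `other`."""
    return (baseline - other) / other * 100


batched_less = percent_less(baseline_ms, mean(batched_ms))
deferred_less = percent_less(baseline_ms, mean(deferred_ms))
batched_more = percent_more(baseline_ms, mean(batched_ms))
deferred_more = percent_more(baseline_ms, mean(deferred_ms))

print(f"flushlogentries=100: {batched_less:.1f}% less time "
      f"(baseline is {batched_more:.1f}% slower)")
print(f"deferred flush:      {deferred_less:.1f}% less time "
      f"(baseline is {deferred_more:.1f}% slower)")
```

Depending on which run is taken as the base, the gaps come out at roughly 6-7% for flushlogentries=100 and 12-14% for deferred flush.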

On Thu, Dec 3, 2009 at 3:24 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> I have the feeling that this discussion isn't over and that there's no
> consensus yet, so I did some tests to get some numbers.
>
> PE sequentialWrite 1 on a standalone setup with the write buffer disabled
> (with it enabled I get the same numbers on every config). I stopped HBase
> and deleted the data dir between each run.
>
> - hbase.regionserver.flushlogentries=1 and
> hbase.regionserver.optionallogflushinterval=1000
>  ran in 354765ms
>
> - hbase.regionserver.flushlogentries=100 and
> hbase.regionserver.optionallogflushinterval=1000
>  run #1 in 333972ms
>  run #2 in 331943ms
>
> - hbase.regionserver.flushlogentries=1,
> hbase.regionserver.optionallogflushinterval=1000 and deferred flush
> enabled on TestTable
>  run #1 in 309857ms
>  run #2 in 311440ms
>
> So 100 entries per flush takes ~7% less time, deferred flush takes 14% less.
>
> I therefore think that not only should we set flushlogentries=1 in 0.21,
> but we should also enable deferred log flush by default with a lower
> optional log flush interval. It will be a nearly-as-safe but much
> faster alternative to the previous option. I would even get rid of the
> hbase.regionserver.flushlogentries config.
>
> J-D
>
> On Tue, Nov 17, 2009 at 7:10 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>> Well it's even better than that ;) We have optional log flushing which
>> by default is 10 secs. Make that 100 milliseconds and that's as much
>> data as you can lose. If any other table syncs then this table's edits
>> are also synced.
>>
>> J-D
>>
>>
>> On Tue, Nov 17, 2009 at 4:36 PM, Jonathan Gray <jl...@streamy.com> wrote:
>>> Thoughts on a client-facing call to explicitly trigger a WAL sync?  So I could
>>> turn on DEFERRED_LOG_FLUSH (possibly leave it on always), run a batch of
>>> my inserts, and then run an explicit flush/sync.  The returning of that
>>> call would guarantee to the client that the data up to that point is safe.
>>>
>>> JG
>>>
>>> On Mon, November 16, 2009 11:00 am, Jean-Daniel Cryans wrote:
>>>> I added a new feature for tables called "deferred flush", see
>>>> https://issues.apache.org/jira/browse/HBASE-1944
>>>>
>>>>
>>>> My opinion is that the default should be paranoid enough to not lose
>>>> any user data. If we can change a table's attribute without taking it down
>>>> (there's a jira on that), wouldn't that solve the import problem?
>>>>
>>>>
>>>> For example: have some table that needs to have fast insertion via MR.
>>>> During the creation of the job, you change the table's
>>>> DEFERRED_LOG_FLUSH to "true", then run the job and finally set the
>>>> value to false when the job is done.
>>>>
>>>> This way you still pass the responsibility to the user but for
>>>> performance reasons.
>>>>
>>>> J-D
>>>>
>>>>
>>>> On Mon, Nov 16, 2009 at 2:05 AM, Cosmin Lehene <cleh...@adobe.com> wrote:
>>>>
>>>>> We could have a speedy default and an extra parameter for puts that
>>>>> would specify a flush is needed. This way you pass the responsibility to
>>>>> the user and he can decide if he needs to be paranoid or not. This could
>>>>> be part of Put and even specify granularity of the flush if needed.
>>>>>
>>>>>
>>>>> Cosmin
>>>>>
>>>>>
>>>>>
>>>>> On 11/15/09 6:59 PM, "Andrew Purtell" <apurt...@apache.org> wrote:
>>>>>
>>>>>
>>>>>> I agree with this.
>>>>>>
>>>>>>
>>>>>> I also think we should leave the default as is with the caveat that
>>>>>> we call out the durability versus write performance tradeoff in the
>>>>>> flushlogentries description and up on the wiki somewhere, maybe on
>>>>>> http://wiki.apache.org/hadoop/PerformanceTuning . We could also
>>>>>> provide two example configurations, one for performance (reasonable
>>>>>> tradeoffs), one for paranoia. I put up an issue:
>>>>>> https://issues.apache.org/jira/browse/HBASE-1984
>>>>>>
>>>>>>
>>>>>>     - Andy
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ________________________________
>>>>>> From: Ryan Rawson <ryano...@gmail.com>
>>>>>> To: hbase-dev@hadoop.apache.org
>>>>>> Sent: Sat, November 14, 2009 11:22:13 PM
>>>>>> Subject: Re: Should we change the default value of
>>>>>> hbase.regionserver.flushlogentries  for 0.21?
>>>>>>
>>>>>> That sync at the end of an RPC is my doing. You don't want to sync
>>>>>> every _EDIT_; after all, the previous definition of the word "edit"
>>>>>> was each KeyValue.  So we could be calling sync for every single
>>>>>> column in a row. Bad stuff.
>>>>>>
>>>>>> In the end, if the regionserver crashes during a batch put, we will
>>>>>> never know how much of the batch was flushed to the WAL. Thus it makes
>>>>>> sense to only do it once and get a massive, massive speedup.
>>>>>>
>>>>>> On Sat, Nov 14, 2009 at 9:45 PM, stack <st...@duboce.net> wrote:
>>>>>>
>>>>>>> I'm for leaving it as it is, at every 100 edits -- maybe every 10
>>>>>>> edits? Speed stays as it was.  We used to lose MBs.  By default,
>>>>>>> we'll now lose 99 or 9 edits max.
>>>>>>>
>>>>>>> We need to do some work bringing folks along regardless of what we
>>>>>>> decide. Flush happens at the end of the put up in the regionserver.
>>>>>>> If you are doing a batch of commits -- e.g. using a big write buffer
>>>>>>> over on your client -- the puts will only be flushed on the way out
>>>>>>> after the batch put completes EVEN if you have configured hbase to
>>>>>>> sync every edit (I ran into this this evening.  J-D sorted me out).
>>>>>>> We need to make sure folks are up on this.
>>>>>>>
>>>>>>> St.Ack
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Nov 14, 2009 at 4:37 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>>> Hi dev!
>>>>>>>>
>>>>>>>>
>>>>>>>> Hadoop 0.21 now has a reliable append and flush feature and this
>>>>>>>> gives us the opportunity to review some assumptions. The current
>>>>>>>> situation:
>>>>>>>>
>>>>>>>>
>>>>>>>> - Every edit going to a catalog table is flushed so there's no
>>>>>>>>   data loss.
>>>>>>>> - The user tables' edits are flushed every
>>>>>>>>   hbase.regionserver.flushlogentries entries, which by default is 100.
>>>>>>>>
>>>>>>>> Should we now set this value to 1 in order to have more durable
>>>>>>>> but slower inserts by default? Please speak up.
>>>>>>>>
>>>>>>>> Thx,
>>>>>>>>
>>>>>>>>
>>>>>>>> J-D
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
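
As a rough illustration of stack's point in the quoted thread that flushing every 100 (or 10) edits bounds the loss at 99 (or 9) edits, here is a toy model in Python. The ToyLog class and worst_case_loss helper are hypothetical names invented for this sketch, not HBase code, and the model deliberately ignores the optional time-based flush and the sync at the end of each RPC.

```python
class ToyLog:
    """Toy write-ahead log that syncs once every `flush_entries` edits."""

    def __init__(self, flush_entries):
        self.flush_entries = flush_entries
        self.unsynced = 0  # edits appended since the last sync
        self.synced = 0    # edits that would survive a crash

    def append(self):
        """Append one edit and sync if the threshold is reached."""
        self.unsynced += 1
        if self.unsynced >= self.flush_entries:
            self.sync()

    def sync(self):
        """Mark every pending edit as durable."""
        self.synced += self.unsynced
        self.unsynced = 0


def worst_case_loss(flush_entries, total_edits=1000):
    """Largest number of unsynced edits observed after any append."""
    log = ToyLog(flush_entries)
    worst = 0
    for _ in range(total_edits):
        log.append()
        worst = max(worst, log.unsynced)
    return worst


for n in (100, 10, 1):
    print(f"flush every {n} edits: up to {worst_case_loss(n)} edits at risk")
```

In this simplified model a crash can cost at most flush_entries - 1 edits, which is why a setting of 1 leaves nothing at risk between appends.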
