Even with 100 regions times 1000 region servers, we're talking about potentially
having 100,000 open files instead of 1,000 (and we also have to count every
replica).
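To make that arithmetic concrete, here's a rough back-of-the-envelope sketch
(the counts and the replication factor are illustrative assumptions, not
measured values):

    // Rough estimate of WAL files with one log per server vs. one per region.
    // All inputs are illustrative assumptions from this thread.
    public class WalFileCount {
        public static void main(String[] args) {
            int servers = 1000;
            int regionsPerServer = 100;
            int replication = 3; // assumed HDFS replication factor

            long perServer = servers;                           // one HLog per RS
            long perRegion = (long) servers * regionsPerServer; // one HLog per region

            System.out.printf("per-server WALs: %,d (%,d block replicas)%n",
                perServer, perServer * replication);
            System.out.printf("per-region WALs: %,d (%,d block replicas)%n",
                perRegion, perRegion * replication);
        }
    }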
I guess that an OS that was configured for such usage would be able to
sustain it... You would have to watch that metric cluster-wide, get new
nodes when needed, etc. Then you need to make sure that GC pauses won't
block for too long, so that the unavailability window stays very short.

J-D

On Tue, Jan 12, 2010 at 1:07 PM, Kannan Muthukkaruppan <kan...@facebook.com> wrote:
>> I presume you intend to run HBase region servers
>> colocated with HDFS DataNodes.
>
> Yes.
>
> ---
>
> Seems like we all generally agree that a large number of regions per
> region server may not be the way to go.
>
> So coming back to Dhruba's question on having one commit log per region
> instead of one commit log per region server: is the number of open HDFS
> files still a major concern?
>
> Is my understanding correct that the unavailability window during region
> server failover is large due to the time it takes to split the shared
> commit log into per-region logs? Instead, if we always had per-region
> commit logs, even in the normal mode of operation, then the
> unavailability window would be minimized? It does limit the extent of
> batch/group commit you can do, though -- since you can only batch updates
> going to the same region. Any other gotchas/issues?
>
> regards,
> Kannan
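Right, the failover window with a shared log is dominated by the split
step: on recovery, the single commit log has to be demultiplexed into
per-region edit files before the regions can be reopened elsewhere. A
schematic sketch of that step (the entry format and names are made up for
illustration; this is not the actual HBase code):

    // Schematic WAL split: demultiplex one shared commit log into
    // per-region edit files. Illustrative only.
    import java.io.*;
    import java.util.*;

    public class LogSplitSketch {
        // Each shared-log entry carries the region it belongs to.
        record Entry(String region, byte[] edit) {}

        static void split(List<Entry> sharedLog, File outDir) throws IOException {
            Map<String, DataOutputStream> perRegion = new HashMap<>();
            try {
                for (Entry e : sharedLog) {
                    DataOutputStream out = perRegion.computeIfAbsent(e.region(), r -> {
                        try {
                            return new DataOutputStream(new BufferedOutputStream(
                                new FileOutputStream(new File(outDir, r + ".edits"))));
                        } catch (IOException ex) {
                            throw new UncheckedIOException(ex);
                        }
                    });
                    out.writeInt(e.edit().length); // length-prefixed edit
                    out.write(e.edit());
                }
            } finally {
                for (DataOutputStream out : perRegion.values()) {
                    out.close();
                }
            }
        }
    }

Every affected region stays unavailable until its slice of the shared log
has been written out and replayed, which is why per-region logs would
shrink that window (at the cost of the group-commit batching you mention).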
> -----Original Message-----
> From: Andrew Purtell [mailto:apurt...@apache.org]
> Sent: Tuesday, January 12, 2010 12:50 PM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: commit semantics
>
>> But would having a
>> smaller number of regions per region server (say ~50) be really bad?
>
> Not at all.
>
> There are some (test) HBase deployments I know of that go pretty
> vertical, with multiple TBs of disk on each node, therefore wanting a
> high number of regions per region server to match that density. That may
> meet with operational success, but it is architecturally suspect. I ran
> a test cluster once with more than 1,000 regions per server on 25
> servers, in the 0.19 timeframe. 0.20 is much better in terms of resource
> demand (less) and liveness (enormously improved), but I still wouldn't
> recommend it, unless your clients can wait for up to several minutes on
> blocked reads and writes to affected regions should a node go down. With
> that many regions per server, it stands to reason just about every
> client would be affected.
>
> The numbers I have for Google's canonical BigTable deployment are
> several years out of date, but they go pretty far in the other direction
> -- about 100 regions per server is the target.
>
> I think it also depends on whether you intend to colocate TaskTrackers
> with the region servers. I presume you intend to run HBase region
> servers colocated with HDFS DataNodes. After you have had an HBase
> cluster up for some number of hours, certainly ~24, background
> compaction will generally bring the HDFS blocks backing region data
> local to the server. MapReduce tasks backed by HBase tables will then
> see data-locality advantages similar to those you are probably
> accustomed to when working with files in HDFS. If you mix storage and
> computation this way, it makes sense to seek a balance between the
> amount of data stored on each node (the number of regions being served)
> and the available computational resources (available CPU cores, time
> constraints (if any) on task execution).
>
> Even if you don't intend to do the above, it's possible that an overly
> high region density can negatively impact performance if too much I/O
> load is placed on average on each region server. Adding more servers to
> spread the load would then likely help**.
>
> These considerations bias against hosting a very large number of regions
> per region server.
>
> - Andy
>
> **: I say "likely" because this presumes query and edit patterns have
> been guided as necessary through engineering to be widely distributed in
> the key space. You have to take some care to avoid hot regions.
>
>
> ----- Original Message ----
>> From: Kannan Muthukkaruppan <kan...@facebook.com>
>> To: "hbase-dev@hadoop.apache.org" <hbase-dev@hadoop.apache.org>
>> Sent: Tue, January 12, 2010 11:40:00 AM
>> Subject: RE: commit semantics
>>
>> Btw, is there much gain in having a large number of regions -- i.e., to
>> the tune of 500 -- per region server?
>>
>> I understand that having multiple regions per region server allows
>> finer-grained rebalancing when new nodes are added or a node goes down.
>> But would having a smaller number of regions per region server (say
>> ~50) be really bad? If a region server goes down, 50 other nodes would
>> pick up ~1/50 of its work. Not as good as 500 other nodes picking up
>> 1/500 of its work each -- but it seems acceptable still. Are there
>> other advantages of having a large number of regions per region server?
>>
>> regards,
>> Kannan
>> -----Original Message-----
>> From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of
>> Jean-Daniel Cryans
>> Sent: Tuesday, January 12, 2010 9:42 AM
>> To: hbase-dev@hadoop.apache.org
>> Subject: Re: commit semantics
>>
>> wrt 1 HLog per region server, this is from the Bigtable paper. Their
>> main concern is the number of open files: if you have 1000 region
>> servers * 100 regions, then you may have 100,000 HLogs to manage. Also,
>> you can have more than one file per HLog, so if you have on average 5
>> log files per HLog, that's 500,000 files on HDFS.
>>
>> J-D
>>
>> On Tue, Jan 12, 2010 at 12:24 AM, Dhruba Borthakur wrote:
>> > Hi Ryan,
>> >
>> > thanks for your response.
>> >
>> >> Right now each regionserver has 1 log, so if 2 puts on different
>> >> tables hit the same RS, they hit the same HLog.
>> >
>> > I understand. My point was that the application could insert the same
>> > record into two different tables on two different HBase instances on
>> > two different pieces of hardware.
>> >
>> > On a related note, can somebody explain what the tradeoff is if each
>> > region has its own hlog? Are you worried about the number of files in
>> > HDFS? Or maybe the number of sync threads in the region server? Can
>> > multiple hlog files provide faster region splits?
>> >
>> >
>> >> I've thought about this issue quite a bit, and I think the sync every
>> >> N rows combined with optional no-sync and low-time sync() is the way
>> >> to go. If you want to discuss this more in person, maybe we can meet
>> >> up for brews or something.
>> >>
>> >
>> > The group-commit thing I can understand. HDFS does a very similar
>> > thing. But can you explain your alternative "sync every N rows
>> > combined with optional no-sync and low-time sync"? For those
>> > applications that have the natural characteristic of updating only
>> > one row per logical operation, how can they be sure that their data
>> > has reached some-sort-of-stable-storage unless they sync after every
>> > row update?
>> >
>> > thanks,
>> > dhruba
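On the "sync every N rows combined with optional no-sync and low-time
sync" idea: a minimal sketch of that policy might look like the following
(class and parameter names are invented; this is not HBase's HLog
implementation). Callers append edits, and the log is synced after every N
appends or after a short timeout, whichever comes first; an application
that needs per-row durability would still call sync() itself after each
row, which is exactly Dhruba's concern.

    // Minimal group-commit sketch: sync the log every N appends or every
    // maxDelayMs milliseconds, whichever comes first. Illustrative only.
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class GroupCommitLog {
        private final int syncEveryN;   // e.g. 100 rows per forced sync
        private int pendingEdits = 0;
        private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

        public GroupCommitLog(int syncEveryN, long maxDelayMs) {
            this.syncEveryN = syncEveryN;
            // The time-based sync bounds how long an unsynced edit can sit
            // in the buffer (the "low-time sync").
            timer.scheduleAtFixedRate(this::syncIfPending,
                maxDelayMs, maxDelayMs, TimeUnit.MILLISECONDS);
        }

        // Append an edit; it is durable only after the next sync().
        public synchronized void append(byte[] edit) {
            writeBuffered(edit);
            if (++pendingEdits >= syncEveryN) {
                sync();
            }
        }

        private synchronized void syncIfPending() {
            if (pendingEdits > 0) {
                sync();
            }
        }

        // Force pending edits to stable storage; callers needing per-row
        // durability invoke this after every append.
        public synchronized void sync() {
            // In real code: flush the buffer and hsync() the HDFS stream.
            pendingEdits = 0;
        }

        private void writeBuffered(byte[] edit) {
            // Buffered append to the log stream (omitted).
        }
    }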