Re: commit semantics

Dhruba Borthakur Tue, 12 Jan 2010 00:25:06 -0800

Hi Ryan,

Thanks for your response.


>Right now each regionserver has 1 log, so if 2 puts on different
>tables hit the same RS, they hit the same HLog.

I understand. My point was that the application could insert the same record
into two different tables on two different HBase instances on two different
pieces of hardware.

On a related note, can somebody explain what the tradeoff is if each region
has its own HLog? Are you worried about the number of files in HDFS, or
maybe the number of sync threads in the region server? Can multiple HLog
files provide faster region splits?


> I've thought about this issue quite a bit, and I think the sync every
> 1 rows combined with optional no-sync and low time sync() is the way
> to go. If you want to discuss this more in person, maybe we can meet
> up for brews or something.
>

The group-commit thing I can understand. HDFS does a very similar thing. But
can you explain your alternative "sync every 1 rows combined with optional
no-sync and low time sync"? For those applications that have the natural
characteristics of updating only one row per logical operation, how can they
be sure that their data has reached some-sort-of-stable-storage unless they
sync after every row update?
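
To make the question concrete, here is a minimal sketch of the group-commit
idea as I understand it. The class and method names (GroupCommitSketch,
append, sync, isDurable) are hypothetical and are not the HBase HLog or HDFS
API, and the in-memory list only stands in for stable storage. It shows that
one sync can cover several buffered edits, and that an edit is not durable
until some sync has covered it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy write-ahead log; a hypothetical illustration, not the HBase HLog API. */
public class GroupCommitSketch {
    // Edits that were appended but are not yet durable.
    private final List<String> pending = new ArrayList<>();
    // Stands in for stable storage (assumption: a real log would sync to HDFS).
    private final List<String> durable = new ArrayList<>();
    private int syncCalls = 0;

    /** Buffers an edit and returns its sequence number; not durable yet. */
    public synchronized long append(String edit) {
        pending.add(edit);
        return durable.size() + pending.size();
    }

    /** One sync makes every pending edit durable, however many were buffered. */
    public synchronized void sync() {
        if (pending.isEmpty()) {
            return; // nothing new to persist, so no sync is counted
        }
        durable.addAll(pending);
        pending.clear();
        syncCalls++;
    }

    /** True once a sync has covered the edit with this sequence number. */
    public synchronized boolean isDurable(long seq) {
        return seq <= durable.size();
    }

    public synchronized int syncCalls() {
        return syncCalls;
    }

    public static void main(String[] args) {
        // Sync after every row: three edits cost three syncs.
        GroupCommitSketch perRow = new GroupCommitSketch();
        for (int i = 0; i < 3; i++) {
            perRow.append("row" + i);
            perRow.sync();
        }

        // Group commit: three edits share a single sync.
        GroupCommitSketch grouped = new GroupCommitSketch();
        long last = 0;
        for (int i = 0; i < 3; i++) {
            last = grouped.append("row" + i);
        }
        boolean durableBeforeSync = grouped.isDurable(last);
        grouped.sync();

        System.out.println("per-row syncs: " + perRow.syncCalls());
        System.out.println("grouped syncs: " + grouped.syncCalls());
        System.out.println("durable before sync: " + durableBeforeSync);
    }
}
```

In this sketch a single-row writer only knows its data is safe once
isDurable returns true for its edit, which is the crux of my question above.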

thanks,
dhruba
