Hi,
Came across a problem that I need to walk through.
On the client side, once you have instantiated an HTable object, you can call
HTable.setAutoFlush(true/false). Setting the boolean value to true means that
when you execute a put(), the write is not buffered on the client and will be
sent directly to HBase. This overrides the client-side write buffering that you
can set in your configuration files.
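
To make sure I'm describing the client-side behavior correctly, here is a toy
sketch of how I picture it. This is NOT the HBase client code: ToyServer and
ToyTable are hypothetical stand-ins, and the buffer limit here is a count of
puts rather than a size in bytes, purely to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the server: anything written here is visible to readers.
class ToyServer {
    private final Map<String, String> rows = new HashMap<>();

    void write(String row, String value) { rows.put(row, value); }

    String get(String row) { return rows.get(row); }
}

// Hypothetical stand-in for a client-side table handle with an auto-flush switch.
class ToyTable {
    private final ToyServer server;
    private final List<String[]> buffer = new ArrayList<>();
    private final int bufferLimit;     // assumption: limit counted in puts, not bytes
    private boolean autoFlush = true;  // assumption: auto-flush is on unless disabled

    ToyTable(ToyServer server, int bufferLimit) {
        this.server = server;
        this.bufferLimit = bufferLimit;
    }

    void setAutoFlush(boolean autoFlush) { this.autoFlush = autoFlush; }

    // With auto-flush on, every put goes straight to the server.
    // With it off, puts sit in the client buffer until it fills or is flushed.
    void put(String row, String value) {
        buffer.add(new String[] {row, value});
        if (autoFlush || buffer.size() >= bufferLimit) {
            flushCommits();
        }
    }

    // Push everything buffered on the client to the server.
    void flushCommits() {
        for (String[] kv : buffer) {
            server.write(kv[0], kv[1]);
        }
        buffer.clear();
    }
}
```

In this model, with auto-flush off, a get() against the server cannot see a
put until the buffer fills or flushCommits() is called; with auto-flush on,
the put is visible to the server as soon as put() returns.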
For many applications it's OK for the app to buffer up its writes, but
there's a set of apps where you don't want to do this. That is, when your app
writes a record to HBase, you want it exposed ASAP.
On the server side, you have the Write Ahead Log.
If I understand the WAL, it abstracts the actual process of writing to disk so
that as far as your application is concerned, when you write to the WAL, it's in
HBase.
So, my question is: how long does it take for a record in the WAL to be written
to disk?
Also, if a record is in the WAL and I do a get(), will the record be found?
It's possible that in an m/r job, client-side buffering could mean that it
takes a relatively 'long' time to actually have a record written to HBase,
whereas once the record is written to the WAL, the time it takes to be written
to disk for access by other HBase apps should be consistent.
Or what am I missing?
Thx
-Mike