OK, but our app has online/real-time processing requirements. As I understand it, bulk importing requires an M/R job and is only suited to batch processing, correct?
The Javadoc says HBaseAdmin flush is an async operation. How do I get confirmation of whether it succeeded or not?

On 5/29/11, Todd Lipcon <[email protected]> wrote:
> Or actually flush the table rather than just flushing commits:
> http://archive.cloudera.com/cdh/3/hbase-0.90.1-cdh3u0/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#flush(byte[])
>
> -Todd
>
> On Sat, May 28, 2011 at 12:29 PM, Joey Echeverria <[email protected]> wrote:
>> You might want to look into bulk loading.
>>
>> -Joey
>>
>> On May 28, 2011 9:47 AM, "Qing Yan" <[email protected]> wrote:
>>> Well, I realized myself that RS flush to HDFS is not designed to do
>>> incremental changes. So there is no way around the WAL? Man, I just
>>> wish it could run a bit faster :-P
>>>
>>> On Sat, May 28, 2011 at 9:36 PM, Qing Yan <[email protected]> wrote:
>>>
>>>> OK, thanks for the explanation. So data loss is normal in this case.
>>>> Yeah, I did a "kill -9". I did wait till the RS got reassigned and
>>>> actually let process B keep retrying overnight.
>>>>
>>>> Is WAL the only way to guarantee data safety in HBase? We want a high
>>>> insert rate though.
>>>> Is there a middle ground? E.g. a sync operation to flush RS to HDFS
>>>> would be perfect!
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
