Could you try completebulkload and see if that works? It should be faster
than HBaseStorage. You could pre-split the table first using:

export HADOOP_CLASSPATH=`hbase classpath`; hbase org.apache.hadoop.hbase.util.RegionSplitter -c 10 '<table_name>' -f <cf name>

On Sat, Apr 28, 2012 at 8:46 PM, M. C. Srivas <[email protected]> wrote:

> On Thu, Apr 26, 2012 at 4:38 AM, Rajgopal Vaithiyanathan <
> [email protected]> wrote:
>
> > Hey all,
> >
> > The default - HBaseStorage() takes hell lot of time for puts.
> >
> > In a cluster of 5 machines, insertion of 175 Million records took 4Hours 45
> > minutes
> > Question - Is this good enough ?
> > each machine has 32 cores and 32GB ram with 7*600GB harddisks. HBASE's heap
> > has been configured to 8GB.
> > If the put speed is low, how can i improve them..?
> >
>
> Raj, how big is each record?
>
> >
> > I tried tweaking the TableOutputFormat by increasing the WriteBufferSize to
> > 24MB, and adding the multi put feature (by adding 10,000 puts in ArrayList
> > and putting it as a batch). After doing this, it started throwing
> >
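
For scale, the figures quoted above can be turned into a rough put rate. This is only a quick sketch in Python: the three inputs (175 million records, 5 machines, 4 hours 45 minutes) come from Raj's mail, the variable names are mine, and it assumes the load was spread evenly across the machines, which the thread does not confirm.

```python
# Figures quoted in the thread: 175 million records, 5 machines, 4 h 45 min.
records = 175_000_000
machines = 5
elapsed_seconds = 4 * 3600 + 45 * 60  # total wall-clock time in seconds

# Average put rate for the whole cluster.
cluster_rate = records / elapsed_seconds

# Per-machine rate, ASSUMING an even spread of writes (not stated in the thread).
per_machine_rate = cluster_rate / machines

print(f"elapsed:     {elapsed_seconds} s")
print(f"cluster:     {cluster_rate:,.0f} puts/s")
print(f"per machine: {per_machine_rate:,.0f} puts/s")
```

That comes to roughly 10,200 puts/s for the cluster, or about 2,000 puts/s per machine. Whether that is reasonable still depends on the record size asked about above.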
