Thanks, Harsh.

Viv
On Sat, Mar 19, 2011 at 11:52 AM, Harsh J <[email protected]> wrote:
> Have you tried out the mix of importtsv + completebulkload? Would that
> work for you?
>
> On Sat, Mar 19, 2011 at 9:18 PM, Vivek Krishna <[email protected]> wrote:
> > I have around 20 GB of data to be dumped into an HBase table.
> >
> > Initially, I had a simple Java program to put the values in batches of
> > 5000-10000 records. I tried concurrent inserts, and each insert took
> > about 15 seconds to write, which is very slow and was taking ages.
> >
> > The next approach was to use importtsv. This started off with a set of
> > maps, and after a few minutes I started getting RetriesException; the
> > job errors out after a while.
> >
> > From these experiments, I noticed that the master node was handling all
> > the traffic. I understand that initially it dumps data in one node and
> > then splits across multiple nodes as data comes in. Is there a way to
> > split this across regions in the beginning?
> >
> > Or any other thoughts on how to handle inserts of large amounts of data?
> >
> > Viv
>
> --
> Harsh J
> http://harshj.com
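
On the question of splitting across regions from the beginning: HBase allows a table to be created with an initial set of split keys, so that writes are spread over several regions from the start. Below is a minimal sketch of how such split keys might be computed. It assumes row keys begin with a fixed-width, zero-padded, lowercase hex prefix that is roughly uniformly distributed; that key design, the function name `hex_split_keys`, and its parameters are hypothetical and not taken from the thread. The sketch only computes the keys and does not talk to HBase.

```python
from typing import List


def hex_split_keys(num_regions: int, width: int = 8) -> List[str]:
    """Return num_regions - 1 evenly spaced split keys over a hex key space.

    Assumption (not from the thread): row keys start with a fixed-width,
    zero-padded, lowercase hex prefix that is roughly uniformly distributed.
    """
    if num_regions < 2:
        # A single region needs no split points.
        return []
    key_space = 16 ** width  # number of distinct prefixes of this width
    return [
        format(i * key_space // num_regions, "0{}x".format(width))
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Split points for 4 regions over a 4-character hex prefix.
    for key in hex_split_keys(4, width=4):
        print(key)
```

The resulting keys would be supplied as the split points when the table is created. If the real row keys are not uniformly distributed, evenly spaced split points will not balance the load, and the keys would need to be sampled from the data instead.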
