Hi Ram,

I'm using HTablePool to get a separate HTable instance for each thread. So, no, I'm not sharing the same instance across multiple threads.
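Each worker checks its own table out of the pool along these lines (a minimal sketch rather than the actual code; the table name "dump" comes from the log below, while the pool size, row key, and column/value names are placeholders):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.HTablePool;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class Dumper implements Runnable {
        // One pool shared by all threads; each thread checks out its own table.
        private static final Configuration CONF = HBaseConfiguration.create();
        private static final HTablePool POOL = new HTablePool(CONF, 25);

        public void run() {
            HTableInterface table = POOL.getTable("dump");
            try {
                Put put = new Put(Bytes.toBytes("some-row-key"));   // placeholder key
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"),
                        Bytes.toBytes("value"));                    // placeholder cell
                table.put(put);
            } catch (IOException e) {
                e.printStackTrace();  // real code should log and retry/abort
            } finally {
                try {
                    table.close();    // hands the table back to the pool in 0.92
                } catch (IOException ignored) {
                }
            }
        }
    }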
Regards,
Naveen

-----Original Message-----
From: Ramkrishna.S.Vasudevan [mailto:[email protected]]
Sent: Wednesday, September 26, 2012 12:15 PM
To: [email protected]
Subject: RE: Mass dumping of data has issues

For the NPE that you got, is the same HTable instance shared by different
threads? This is a common problem users encounter while using HTable across
multiple threads. Please check and ensure that the HTable is not shared.

Regards
Ram

> -----Original Message-----
> From: Naveen [mailto:[email protected]]
> Sent: Wednesday, September 26, 2012 11:52 AM
> To: [email protected]
> Subject: RE: Mass dumping of data has issues
>
> Hi Dan,
>
> I'm actually trying to simulate the kind of load we're expecting on
> production servers (the intention is not to migrate data), hence a
> self-written program over Sqoop. (PS: I've actually tried Sqoop.)
>
> Warm regards,
> Naveen
>
> -----Original Message-----
> From: Dan Han [mailto:[email protected]]
> Sent: Wednesday, September 26, 2012 7:20 AM
> To: [email protected]
> Subject: Re: Mass dumping of data has issues
>
> Hi Naveen,
>
> There is a tool called Sqoop which supports importing data from a
> relational database into HBase:
> https://blogs.apache.org/sqoop/entry/apache_sqoop_graduates_from_incubator
>
> Maybe it can help you migrate the data easily.
>
> Best Wishes
> Dan Han
>
> On Mon, Sep 24, 2012 at 9:20 AM, Paul Mackles <[email protected]> wrote:
>
> > Did you adjust the write buffer to a larger size and/or turn off
> > autoFlush for the HTable? I've found that both of those settings can
> > have a profound impact on write performance. You might also look at
> > adjusting the handler count for the regionservers, which by default
> > is pretty low. You should also confirm that your splits are effective
> > in distributing the writes.
> >
> > On 9/24/12 11:01 AM, "Naveen" <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I've come across the following issue for which I'm unable to deduce
> > > what the root cause could be.
> > >
> > > Scenario:
> > > I'm trying to dump data (8.3M+ records) from MySQL into an HBase
> > > table using multi-threading (25 threads, each dumping 10 puts/tuples
> > > at a time).
> > >
> > > Config:
> > > HBase v0.92.0
> > > Hadoop v1.0
> > > 1 master + 4 slaves
> > > table is pre-split
> > >
> > > Issue:
> > > Getting an NPE because the RPC call takes longer than the timeout
> > > (default 60 sec). I'm not worried about the NPE (it's been fixed in
> > > 0.92.1+ releases) but about what could be causing the RPC call to
> > > time out at arbitrary intervals.
> > >
> > > Custom printed log: pastebin.com/r85wv8Yt
> > >
> > > WARN [Thread-99255] (HConnectionManager.java:1587) - Failed all from
> > > region=dump,a405cdd9-b5b7-4ec2-9f91-fea98d5cb656,1348331511473.77f13d455fd63c601816759b6ed575e8.,
> > > hostname=hdslave1.company.com, port=60020
> > > java.util.concurrent.ExecutionException: java.lang.RuntimeException:
> > > java.lang.NullPointerException
> > >   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> > >   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> > >   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1557)
> > >   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
> > >   at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:900)
> > >   at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:777)
> > >   at org.apache.hadoop.hbase.client.HTable.put(HTable.java:760)
> > >   at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:402)
> > >   at coprocessor.dump.Dumper.run(Dumper.java:41)
> > >   at java.lang.Thread.run(Thread.java:662)
> > >
> > > Any help or insights are welcome.
> > >
> > > Warm Regards,
> > > Naveen
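PS (for anyone hitting this thread later): Paul's write-buffer/autoFlush suggestion maps onto the 0.92 client API roughly as below. This is a sketch, not the dumper code from this thread; the class name and the 8 MB buffer size are made-up examples. The regionserver handler count is a server-side setting (hbase.regionserver.handler.count in hbase-site.xml), so it is not shown here.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class BufferedDump {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "dump");
            table.setAutoFlush(false);                  // buffer puts client-side
            table.setWriteBufferSize(8 * 1024 * 1024);  // example: flush in ~8 MB batches
            // ... build and table.put(...) the Puts here ...
            table.flushCommits();                       // push anything still buffered
            table.close();
        }
    }

With autoFlush off, each put() only hits the wire when the write buffer fills, so the client makes far fewer (but larger) RPCs. The 60-second timeout mentioned above is, I believe, hbase.rpc.timeout on the client side.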
