Hi Ram,

I'm using HTablePool to get a separate HTable instance for each thread. So, no, I'm not sharing the same instance across multiple threads.
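Each worker checks its own table out of the pool along these lines (a minimal sketch rather than the actual code; the table name "dump" comes from the log below, while the pool size, row key, and column/value names are placeholders):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.HTablePool;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class Dumper implements Runnable {
        // One pool shared by all threads; each thread checks out its own table.
        private static final Configuration CONF = HBaseConfiguration.create();
        private static final HTablePool POOL = new HTablePool(CONF, 25);

        public void run() {
            HTableInterface table = POOL.getTable("dump");
            try {
                Put put = new Put(Bytes.toBytes("some-row-key"));   // placeholder key
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"),
                        Bytes.toBytes("value"));                    // placeholder cell
                table.put(put);
            } catch (IOException e) {
                e.printStackTrace();  // real code should log and retry/abort
            } finally {
                try {
                    table.close();    // hands the table back to the pool in 0.92
                } catch (IOException ignored) {
                }
            }
        }
    }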
Regards,
Naveen

-----Original Message-----
From: Ramkrishna.S.Vasudevan [mailto:[email protected]]
Sent: Wednesday, September 26, 2012 12:15 PM
To: [email protected]
Subject: RE: Mass dumping of data has issues

For the NPE that you got, is the same HTable instance shared by different
threads? This is a common problem users encounter while using HTable across
multiple threads. Please check and ensure that the HTable is not shared.

Regards
Ram

> -----Original Message-----
> From: Naveen [mailto:[email protected]]
> Sent: Wednesday, September 26, 2012 11:52 AM
> To: [email protected]
> Subject: RE: Mass dumping of data has issues
>
> Hi Dan,
>
> I'm actually trying to simulate the kind of load we're expecting on
> production servers (the intention is not to migrate data), hence a
> self-written program over Sqoop. (PS: I've actually tried Sqoop.)
>
> Warm regards,
> Naveen
>
> -----Original Message-----
> From: Dan Han [mailto:[email protected]]
> Sent: Wednesday, September 26, 2012 7:20 AM
> To: [email protected]
> Subject: Re: Mass dumping of data has issues
>
> Hi Naveen,
>
> There is a tool called Sqoop which supports importing data from a
> relational database into HBase:
> https://blogs.apache.org/sqoop/entry/apache_sqoop_graduates_from_incubator
>
> Maybe it can help you migrate the data easily.
>
> Best Wishes
> Dan Han
>
> On Mon, Sep 24, 2012 at 9:20 AM, Paul Mackles <[email protected]> wrote:
>
> > Did you adjust the write buffer to a larger size and/or turn off
> > autoFlush for the HTable? I've found that both of those settings can
> > have a profound impact on write performance. You might also look at
> > adjusting the handler count for the regionservers, which by default
> > is pretty low. You should also confirm that your splits are effective
> > in distributing the writes.
> >
> > On 9/24/12 11:01 AM, "Naveen" <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I've come across the following issue for which I'm unable to deduce
> > > what the root cause could be.
> > >
> > > Scenario:
> > > I'm trying to dump data (8.3M+ records) from MySQL into an HBase
> > > table using multi-threading (25 threads, each dumping 10 puts/tuples
> > > at a time).
> > >
> > > Config:
> > > HBase v0.92.0
> > > Hadoop v1.0
> > > 1 master + 4 slaves
> > > table is pre-split
> > >
> > > Issue:
> > > Getting an NPE because the RPC call takes longer than the timeout
> > > (default 60 sec). I'm not worried about the NPE (it's been fixed in
> > > 0.92.1+ releases) but about what could be causing the RPC call to
> > > time out at arbitrary intervals.
> > >
> > > Custom printed log: pastebin.com/r85wv8Yt
> > >
> > > WARN [Thread-99255] (HConnectionManager.java:1587) - Failed all from
> > > region=dump,a405cdd9-b5b7-4ec2-9f91-fea98d5cb656,1348331511473.77f13d455fd63c601816759b6ed575e8.,
> > > hostname=hdslave1.company.com, port=60020
> > > java.util.concurrent.ExecutionException: java.lang.RuntimeException:
> > > java.lang.NullPointerException
> > >   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> > >   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> > >   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1557)
> > >   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
> > >   at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:900)
> > >   at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:777)
> > >   at org.apache.hadoop.hbase.client.HTable.put(HTable.java:760)
> > >   at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:402)
> > >   at coprocessor.dump.Dumper.run(Dumper.java:41)
> > >   at java.lang.Thread.run(Thread.java:662)
> > >
> > > Any help or insights are welcome.
> > >
> > > Warm Regards,
> > > Naveen
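PS (for anyone hitting this thread later): Paul's write-buffer/autoFlush suggestion maps onto the 0.92 client API roughly as below. This is a sketch, not the dumper code from this thread; the class name and the 8 MB buffer size are made-up examples. The regionserver handler count is a server-side setting (hbase.regionserver.handler.count in hbase-site.xml), so it is not shown here.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class BufferedDump {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "dump");
            table.setAutoFlush(false);                  // buffer puts client-side
            table.setWriteBufferSize(8 * 1024 * 1024);  // example: flush in ~8 MB batches
            // ... build and table.put(...) the Puts here ...
            table.flushCommits();                       // push anything still buffered
            table.close();
        }
    }

With autoFlush off, each put() only hits the wire when the write buffer fills, so the client makes far fewer (but larger) RPCs. The 60-second timeout mentioned above is, I believe, hbase.rpc.timeout on the client side.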
