Well, the XMLStreamingInputFormat lets you map XML files, which is neat, but it has a problem and always needs to be patched. I wondered whether that patch was missing, but in your case that's not the problem.

Did you check the logs of the master and region servers? Also, I'd like to know:

- Version of Hadoop and HBase
- Nodes' hardware
- How many map slots per TT
- HBASE_HEAPSIZE from conf/hbase-env.sh
- Special configuration you use

Thx,

J-D

On Wed, Oct 21, 2009 at 7:57 AM, Mark Vigeant <[email protected]> wrote:
> No. Should I?
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Jean-Daniel Cryans
> Sent: Wednesday, October 21, 2009 10:55 AM
> To: [email protected]
> Subject: Re: Table Upload Optimization
>
> Are you using the Hadoop Streaming API?
>
> J-D
>
> On Wed, Oct 21, 2009 at 7:52 AM, Mark Vigeant <[email protected]> wrote:
>> Hey
>>
>> So I want to upload a lot of XML data into an HTable. I have a class that
>> successfully maps up to about 500 MB of data or so (on one regionserver)
>> into a table, but if I go for much bigger than that it takes forever and
>> eventually just stops. I tried uploading a big XML file into my 4
>> regionserver cluster (about 7 GB) and it's been a day and it's still going
>> at it.
>>
>> What I get when I run the job on the 4 node cluster is:
>> 10/21/09 10:22:35 INFO mapred.LocalJobRunner:
>> 10/21/09 10:22:38 INFO mapred.LocalJobRunner:
>> (then it does that for a while until...)
>> 10/21/09 10:22:52 INFO mapred.TaskRunner: Task attempt_local_0001_m_000117_0 is done. And is in the process of committing
>> 10/21/09 10:22:52 INFO mapred.LocalJobRunner:
>> 10/21/09 10:22:52 mapred.TaskRunner: Task 'attempt_local_0001_m_000117_0' is done.
>> 10/21/09 10:22:52 INFO mapred.JobClient: map 100% reduce 0%
>> 10/21/09 10:22:58 INFO mapred.LocalJobRunner:
>> 10/21/09 10:22:59 INFO mapred.JobClient: map 99% reduce 0%
>>
>> I'm convinced I'm not configuring hbase or hadoop correctly. Any suggestions?
>>
>> Mark Vigeant
>> RiskMetrics Group, Inc.
