Return code 2 essentially means a Hadoop error: Hive is only reporting that the underlying MapReduce job failed, so the real cause has to be dug out of the Hadoop task logs, as happened here. Congrats on locating and fixing your issue.

However, can somebody still throw some light on this particular error code?
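Since the culprit turned out to be a NullPointerException inside a custom SerDe, the shape of the failure is worth spelling out: deserialize() runs inside the map task, so an unchecked exception lands in the Hadoop task logs while the Hive client only ever sees "return code 2". Below is a minimal, hypothetical sketch of the kind of null-guarding that avoids it; the thread never shows the actual SerDe, so the record format, class, and method names are invented for illustration:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch: parses an arbitrary "k1=v1 k2=v2" region of a
    // log record, the way Pat's SerDe reportedly did, without throwing on
    // malformed or missing input. An unguarded version that assumed the
    // region was always present would NPE inside the map task.
    public class KeyValueParser {

        // Tolerates a null region, empty tokens, and bare keys with no value.
        public static Map<String, String> parsePairs(String region) {
            Map<String, String> pairs = new HashMap<String, String>();
            if (region == null) {       // short row: region never present
                return pairs;           // empty map instead of a later NPE
            }
            for (String token : region.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                int eq = token.indexOf('=');
                if (eq < 0) {
                    pairs.put(token, null);  // bare key, no '=' separator
                } else {
                    pairs.put(token.substring(0, eq), token.substring(eq + 1));
                }
            }
            return pairs;
        }

        public static void main(String[] args) {
            System.out.println(parsePairs("user=pat bytes=250 flag"));
            System.out.println(parsePairs(null)); // would have NPE'd unguarded
        }
    }

In a real SerDe the same guards would sit in deserialize() before any field access; the essential point is returning an empty or partial row instead of letting the exception kill the task.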
On Fri, Jan 28, 2011 at 6:16 AM, Christopher, Pat <patrick.christop...@hp.com> wrote:

> It was the SerDe. There was a null pointer error. It was getting
> reported to a Hadoop logfile and not anywhere in Hive. I found the
> Hadoop log and fixed the problem.
>
> Thanks for the help!
>
> Pat
>
> From: Christopher, Pat
> Sent: Thursday, January 27, 2011 11:21 AM
> To: user@hive.apache.org
> Subject: RE: Hive Error on medium sized dataset
>
> I removed the part of the SerDe that handled the arbitrary key/value pairs
> and I was able to process my entire data set. Sadly, the part I removed has
> all the interesting data.
>
> I'll play more with the heap settings and see if that lets me process the
> key/value pairs. Is the below the correct way to set the child heap value?
>
> Thanks,
> Pat
>
> From: Christopher, Pat
> Sent: Thursday, January 27, 2011 10:27 AM
> To: user@hive.apache.org
> Subject: RE: Hive Error on medium sized dataset
>
> It will be tricky to clean up the data format, as I'm operating on somewhat
> arbitrary key-value pairs in part of the record. I will try to create
> something similar, though; it might take a bit. Thanks.
>
> I've tried resetting the heap size, I think. I added the following block
> to my mapred-site.xml:
>
> <property>
>   <name>mapred.child.java.opts</name>
>   <value>-Xm512M</value>
> </property>
>
> Is that how I'm supposed to do that?
>
> Thanks,
> Pat
>
> From: hadoop n00b [mailto:new2h...@gmail.com]
> Sent: Wednesday, January 26, 2011 9:09 PM
> To: user@hive.apache.org
> Subject: Re: Hive Error on medium sized dataset
>
> We typically get this error while running complex queries on our 4-node
> setup, when the child JVM runs out of heap. Would be interested in what
> the experts have to say about this error.
>
> On Thu, Jan 27, 2011 at 7:27 AM, Ajo Fod <ajo....@gmail.com> wrote:
>
> Any chance you can convert the data to a tab-separated text file and try
> the same query?
>
> It may not be the SerDe, but it would be good to isolate it as a
> potential source of the problem.
>
> -Ajo
>
> On Wed, Jan 26, 2011 at 5:47 PM, Christopher, Pat <patrick.christop...@hp.com> wrote:
>
> Hi,
>
> I'm attempting to load a small-to-medium-sized log file, ~250MB, and
> produce some basic reports from it, counts etc. Nothing fancy. However,
> whenever I try to read the entire dataset, ~330k rows, I get the following
> error:
>
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.MapRedTask
>
> This result gets produced by basic queries like:
>
> SELECT count(1) FROM medium_table;
>
> However, if I do the following:
>
> SELECT count(1) FROM ( SELECT col1 FROM medium_table LIMIT 70000 ) tbl;
>
> it works okay until I get to around 70,800 rows, and then I get the first
> error message again. I'm running my HDFS system in single-node,
> pseudo-distributed mode, as a virtual machine with 1.5GB of memory and
> 20GB of disk, and I am using a custom SerDe. I don't think it's the SerDe,
> but I'm open to suggestions for how I can check whether it is causing the
> problem. I can't see anything in the data that would be causing it, though.
>
> Does anyone have any ideas about what might be causing this, or something
> I can check?
>
> Thanks,
> Pat
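A note for anyone copying the mapred-site.xml snippet quoted above: mapred.child.java.opts is the right property for a pre-YARN setup like this one, but -Xm512M is not a valid JVM flag (the maximum-heap option is -Xmx), so the child JVMs would most likely fail to start rather than run with a larger heap. The corrected property:

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx512M</value>
    </property>

Bear in mind that on a 1.5GB virtual machine, several concurrent child JVMs at 512MB each can exhaust physical memory; capping mapred.tasktracker.map.tasks.maximum is one way to hedge against that.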