Hello Daniel,

> Have you checked mapreduce UI? Most probably it is caused by OOM. If you see
> that in mapreduce log,

Do you mean the jobtracker's log?

I applied the settings you sent me, and that solved the exception, but now the
job is stuck partway through: the map phase reached 22% in less than a minute
and has made no progress in the 45 minutes since. I am running on a 10-node
cluster with 4 GB of memory per node and the default number of mappers and
reducers. I am looking at the jobtracker's web interface, but nothing looks
abnormal.

Thanks for your help!
Keren

Generating mapping file for column d:1:100000:z:5 into hdfs://node020062.boca.local:54310/user/kouaknine/tmp/tmp-1343473685/tmp-467941747
processed 99%.
Generating input files into hdfs://node020062.boca.local:54310/user/kouaknine/tmp/tmp-1343473685/tmp1142302988
Submit hadoop job...
11/09/11 11:50:00 INFO mapred.FileInputFormat: Total input paths to process : 90
11/09/11 11:50:01 INFO mapred.JobClient: Running job: job_201109111147_0001
11/09/11 11:50:02 INFO mapred.JobClient: map 0% reduce 0%
11/09/11 11:50:42 INFO mapred.JobClient: map 1% reduce 0%
11/09/11 11:50:43 INFO mapred.JobClient: map 3% reduce 0%
11/09/11 11:50:44 INFO mapred.JobClient: map 7% reduce 0%
11/09/11 11:50:45 INFO mapred.JobClient: map 13% reduce 0%
11/09/11 11:50:46 INFO mapred.JobClient: map 17% reduce 0%
11/09/11 11:50:47 INFO mapred.JobClient: map 20% reduce 0%
11/09/11 11:50:48 INFO mapred.JobClient: map 21% reduce 0%
*11/09/11 11:50:49 INFO mapred.JobClient: map 22% reduce 0%*

On Sun, Sep 11, 2011 at 3:07 AM, Daniel Dai <[email protected]> wrote:

> Hi, Keren,
> Have you checked mapreduce UI? Most probably it is caused by OOM. If you see
> that in mapreduce log, try to put this entry to mapred-site.xml:
> <property>
> <name>mapred.child.java.opts</name>
> <value>-Xmx2048m</value>
> </property>
>
> Also change hadoop-env.sh:
> export HADOOP_HEAPSIZE=2000
>
> I tried 0.20.204 with pig 0.8.1, I didn't finish the run but I didn't see
> any error for the first 15m (still running the first hadoop job to generate
> page_view).
>
> Daniel
>
> On Sat, Sep 10, 2011 at 10:04 PM, Keren Ouaknine <[email protected]> wrote:
>
> > Hello,
> >
> > I tried several versions to generate data for pigmix queries:
> > *- Hadoop apache 0.20.204 with pig 0.7*
> > *==>* java.lang.RuntimeException: Error in configuring object at
> > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils...
> > *[Full error at the very bottom]*
> >
> > *- Hadoop apache 0.20.204 with pig 0.9*
> > ==> I get an error while applying pigmix2.patch
> > (on build.xml: Reversed (or previously applied) patch detected! )
> > I applied the patch up to that error, and when generating the data I got:
> > Exception in thread "main" org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol..
> > org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch.
> >
> > *- CDH3 with pig 0.7*
> > *==>* java.lang.RuntimeException: Error in configuring object at
> > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils...
> >
> > *- CDH3 with CDH3-pig (which I downloaded from
> > http://nightly.cloudera.com/cdh/3/ )*
> > I applied pigmix2.patch, and used pig.jar and pigperf.jar (which I couldn't
> > recompile locally for an internal reason), and got the same error:
> > *==>* java.lang.RuntimeException: Error in configuring object at
> > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils...
> >
> > Bottom line, none of these configurations worked.
> > Keren
> >
> > [kouaknine@dataland_oss:node020062 scripts]$ ./generate_data.sh
> > Generating pages50m
> > Using seed 1315606991650
> > Generate data in hadoop mode.
> > Generating column config file in hdfs://node020062.boca.local:54310/user/kouak
> > ne/tmp/tmp1129210882/tmp-1545723655
> > Generating mapping file for column s:20:1600000:z:7 into hdfs://node020062.boc
> > local:54310/user/kouaknine/tmp/tmp1129210882/tmp-163757285
> > processed 18%.
> > processed 37%.
> > processed 56%.
> > processed 75%.
> > processed 93%.
> > processed 99%.
> > Generating mapping file for column s:10:1800000:z:20 into hdfs://node020062.bo
> > .local:54310/user/kouaknine/tmp/tmp1129210882/tmp-1525412142
> > processed 16%.
> > processed 33%.
> > processed 50%.
> > processed 66%.
> > processed 83%.
> > processed 99%.
> > Generating mapping file for column d:1:100000:z:5 into hdfs://node020062.boca.
> > cal:54310/user/kouaknine/tmp/tmp1129210882/tmp-738880094
> > processed 99%.
> > Generating input files into hdfs://node020062.boca.local:54310/user/kouaknine/
> > p/tmp1129210882/tmp-1696754417
> > Submit hadoop job...
> > 11/09/09 18:23:38 INFO mapred.FileInputFormat: Total input paths to process :
> > 11/09/09 18:23:39 INFO mapred.JobClient: Running job: job_201109091527_0005
> > 11/09/09 18:23:40 INFO mapred.JobClient: map 0% reduce 0%
> > *11/09/09 18:24:45 INFO mapred.JobClient: Task Id : attempt_201109091527_0005_m*
> > *00000_0, Status : FAILED*
> > *java.lang.RuntimeException: Error in configuring object*
> > * at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.j*
> > *a:93)*
> >     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java
> > 4)
> >     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.
> > va:117)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:386)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInfor
> > tion.java:1115)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> > Caused by: java.lang.reflect.InvocationTargetException
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImp
> > java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcc
> > sorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.j
> > a:88)
> >     ... 9 more
> >
> >
> > On Fri, Sep 9, 2011 at 1:12 PM, Alan Gates <[email protected]> wrote:
> >
> > > If you're going to run with Apache Pig 0.8.1 or 0.9, you should use Apache
> > > Hadoop 0.20.2. If you want to use CDH, you should stick with their versions
> > > of Hadoop and Pig.
> > >
> > > Alan.
> > >
> > > On Sep 9, 2011, at 6:59 AM, Keren Ouaknine wrote:
> > >
> > > > Hello,
> > > >
> > > > What is the latest version of pig supporting the pigmix queries? The
> > > > latest jira update mentions pig 0.7 only:
> > > > Assuming it's 0.8 or 0.9, can I use hadoop cdh3 or should I switch to
> > > > apache's version, and which one?
> > > >
> > > > ==
> > > > 1. Download pig 0.7 release
> > > > 2. Apply the patch
> > > > 3. copy http://www.eli.sdsu.edu/java-SDSU/sdsuLibJKD12.jar to lib
> > > > 4. ant jar pigperf
> > > > 5. You will use pig.jar, pigperf.jar. Scripts are in
> > > > test/utils/pigmix/scripts. To generate data, use generate_data.sh. To run
> > > > PigMix2, use runpigmix-adhoc.pl.
> > > >
> > > > Thanks,
> > > > Keren
> > > >
> > > > --
> > > > Keren Ouaknine
> > > > Cell: +972 54 2565404
> > > > Web: www.kereno.com
> > >
> >
> > --
> > Keren Ouaknine
> > Cell: +972 54 2565404
> > Web: www.kereno.com
>

--
Keren Ouaknine
Cell: +972 54 2565404
Web: www.kereno.com
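
For reference, here is a rough sketch of the memory arithmetic behind the heap settings discussed in this thread. It is not a diagnosis of the stalled job; it only illustrates how the suggested heap ceilings compare with 4 GB of RAM per node. The slot counts (2 map and 2 reduce slots per TaskTracker, the stock defaults in Hadoop 0.20), the two daemons per worker (DataNode and TaskTracker), and the 512 MB operating-system reserve are assumptions, not values reported in the thread. Note also that -Xmx is a ceiling, so a JVM only approaches it if the task actually needs that much heap.

```python
# All figures in MB. Values marked "assumed" are not stated in the thread.
NODE_RAM_MB = 4096      # thread: 4 GB of memory per node
CHILD_HEAP_MB = 2048    # thread: -Xmx2048m in mapred.child.java.opts
DAEMON_HEAP_MB = 2000   # thread: HADOOP_HEAPSIZE=2000
MAP_SLOTS = 2           # assumed: stock default map slots per TaskTracker
REDUCE_SLOTS = 2        # assumed: stock default reduce slots per TaskTracker
DAEMONS = 2             # assumed: DataNode + TaskTracker on each worker
OS_RESERVE_MB = 512     # hypothetical allowance for the operating system


def worst_case_heap_mb(child_heap, map_slots, reduce_slots, daemon_heap, daemons):
    """Sum of heap ceilings if every slot runs a task and every JVM fills its heap."""
    return child_heap * (map_slots + reduce_slots) + daemon_heap * daemons


def max_child_heap_mb(node_ram, map_slots, reduce_slots, daemon_heap, daemons, os_reserve):
    """Largest per-task -Xmx (MB) that keeps the worst case within physical RAM."""
    available = node_ram - daemon_heap * daemons - os_reserve
    return max(available, 0) // (map_slots + reduce_slots)


demand = worst_case_heap_mb(CHILD_HEAP_MB, MAP_SLOTS, REDUCE_SLOTS,
                            DAEMON_HEAP_MB, DAEMONS)
print(f"worst-case heap demand: {demand} MB on a {NODE_RAM_MB} MB node")
print("fits in RAM:", demand <= NODE_RAM_MB)
```

Under these assumptions the worst-case ceilings add up to roughly three times the physical memory of a node, so if the tasks really do grow toward 2 GB each, swapping on the workers is one thing worth ruling out (for example with `free` or `vmstat` on a tasktracker node) alongside the task logs.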
