It takes a while, but did you check the jobtracker UI? You will get the notice "xxxx tuples generated" there as the job progresses. Or you can try generating less data first; just change "generate_data.sh".
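If the web UI is hard to reach, the same progress information can be checked from the command line. This is only a minimal sketch, assuming the `hadoop` client is on the PATH and that the console output of the run was saved to a file; the function names `job_status` and `last_progress` and the file name `job.log` are made up for illustration.

```shell
# Ask the JobTracker for one job's completion percentages and counters.
# Uses the stock "hadoop job -status <job-id>" command.
job_status() {
    hadoop job -status "$1"
}

# Print the most recent "map N% reduce M%" progress entry found in a
# saved copy of the JobClient console output (file path passed as $1).
last_progress() {
    grep -o 'map  *[0-9][0-9]*%  *reduce  *[0-9][0-9]*%' "$1" | tail -n 1
}
```

For example, `job_status job_201109111147_0001` queries the job from the log above, and `last_progress job.log` shows where the saved output stopped advancing.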
Daniel

On Sun, Sep 11, 2011 at 9:33 AM, Keren Ouaknine <[email protected]> wrote:

> Hello Daniel,
>
> > Have you checked mapreduce UI? Most probably it is caused by OOM. If you see
> > that in mapreduce log,
> You mean the jobtracker's log?
>
> I applied the settings you sent me, and it solved that exception but I am
> stuck in the middle of the job: I reached 22 percents withing less than a
> minute and reached 22%, then no progress in the last 45 minutes...
>
> I am running on a 10 nodes cluster, with 4GB of memory each and defaults
> number of mappers and reducers.
> I am looking at the web interface of the jobtracker, but nothing looks
> abnormal.
>
> Thanks for your help!
> Keren
>
> Generating mapping file for column d:1:100000:z:5 into
> hdfs://node020062.boca.lo
> cal:54310/user/kouaknine/tmp/tmp-1343473685/tmp-467941747
> processed 99%.
> Generating input files into
> hdfs://node020062.boca.local:54310/user/kouaknine/tm
> p/tmp-1343473685/tmp1142302988
> Submit hadoop job...
> 11/09/11 11:50:00 INFO mapred.FileInputFormat: Total input paths to process
> : 90
>
> 11/09/11 11:50:01 INFO mapred.JobClient: Running job: job_201109111147_0001
> 11/09/11 11:50:02 INFO mapred.JobClient: map 0% reduce 0%
> 11/09/11 11:50:42 INFO mapred.JobClient: map 1% reduce 0%
> 11/09/11 11:50:43 INFO mapred.JobClient: map 3% reduce 0%
> 11/09/11 11:50:44 INFO mapred.JobClient: map 7% reduce 0%
> 11/09/11 11:50:45 INFO mapred.JobClient: map 13% reduce 0%
> 11/09/11 11:50:46 INFO mapred.JobClient: map 17% reduce 0%
> 11/09/11 11:50:47 INFO mapred.JobClient: map 20% reduce 0%
> 11/09/11 11:50:48 INFO mapred.JobClient: map 21% reduce 0%
> *11/09/11 11:50:49 INFO mapred.JobClient: map 22% reduce 0%*
>
> On Sun, Sep 11, 2011 at 3:07 AM, Daniel Dai <[email protected]> wrote:
>
> > Hi, Keren,
> > Have you checked mapreduce UI? Most probably it is caused by OOM. If you see
> > that in mapreduce log, try to put this entry to mapred-site.xml:
> > <property>
> > <name>mapred.child.java.opts</name>
> > <value>-Xmx2048m</value>
> > </property>
> >
> > Also change hadoop-env.sh:
> > export HADOOP_HEAPSIZE=2000
> >
> > I tried 0.20.204 with pig 0.8.1, I didn't finish the run but I didn't see
> > any error for the first 15m (still running the first hadoop job to generate
> > page_view).
> >
> > Daniel
> >
> > On Sat, Sep 10, 2011 at 10:04 PM, Keren Ouaknine <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > I tried several versions to generate data for pigmix queries:
> > > *- Hadoop apache 0.20.204 with pig 0.7*
> > > *==>* java.lang.RuntimeException: Error in configuring object at
> > > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils...
> > > *[Full error at the very bottom]*
> > >
> > > *- Hadoop apache 0.20.204 with pig 0.9*
> > > ==> I get an error while patching the pixmix2.patch
> > > (on build.xml: Reversed (or previously applied) patch detected! )
> > > I didnapplied the patch up to that error and when generating the data:
> > > Exception in thread "main" org.apache.hadoop.ipc.RPC$VersionMismatch:
> > > Protocol..
> > > org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch.
> > >
> > > *- CDH3 with pig 0.7*
> > > *==>* java.lang.RuntimeException: Error in configuring object at
> > > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils...
> > >
> > > *- CDH3 with CDH3-pig (which I downloaded from
> > > http://nightly.cloudera.com/cdh/3/ )*
> > > I applied pigmix2.patch, and used pig.jar and pigperf.jar (which I couldnt
> > > recompile locally for an internal reason), and got the same error:
> > > *==>* java.lang.RuntimeException: Error in configuring object at
> > > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils...
> > >
> > > Bottom line, none of these configurations worked.
> > > Keren
> > >
> > > [kouaknine@dataland_oss:node020062 scripts]$ ./generate_data.sh
> > > Generating pages50m
> > > Using seed 1315606991650
> > > Generate data in hadoop mode.
> > > Generating column config file in
> > > hdfs://node020062.boca.local:54310/user/kouak
> > > ne/tmp/tmp1129210882/tmp-1545723655
> > > Generating mapping file for column s:20:1600000:z:7 into
> > > hdfs://node020062.boc
> > > local:54310/user/kouaknine/tmp/tmp1129210882/tmp-163757285
> > > processed 18%.
> > > processed 37%.
> > > processed 56%.
> > > processed 75%.
> > > processed 93%.
> > > processed 99%.
> > > Generating mapping file for column s:10:1800000:z:20 into hdfs://
> > > node020062.bo
> > > .local:54310/user/kouaknine/tmp/tmp1129210882/tmp-1525412142
> > > processed 16%.
> > > processed 33%.
> > > processed 50%.
> > > processed 66%.
> > > processed 83%.
> > > processed 99%.
> > > Generating mapping file for column d:1:100000:z:5 into
> > > hdfs://node020062.boca.
> > > cal:54310/user/kouaknine/tmp/tmp1129210882/tmp-738880094
> > > processed 99%.
> > > Generating input files into
> > > hdfs://node020062.boca.local:54310/user/kouaknine/
> > > p/tmp1129210882/tmp-1696754417
> > > Submit hadoop job...
> > > 11/09/09 18:23:38 INFO mapred.FileInputFormat: Total input paths to process
> > > :
> > >
> > > 11/09/09 18:23:39 INFO mapred.JobClient: Running job: job_201109091527_0005
> > > 11/09/09 18:23:40 INFO mapred.JobClient: map 0% reduce 0%
> > > *11/09/09 18:24:45 INFO mapred.JobClient: Task Id :
> > > attempt_201109091527_0005_m*
> > > *00000_0, Status : FAILED*
> > > *java.lang.RuntimeException: Error in configuring object*
> > > * at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.j*
> > > *a:93)*
> > > at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java
> > > 4)
> > > at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.
> > > va:117)
> > > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:386)
> > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
> > > at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> > > at java.security.AccessController.doPrivileged(Native Method)
> > > at javax.security.auth.Subject.doAs(Subject.java:396)
> > > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInfor
> > > tion.java:1115)
> > > at org.apache.hadoop.mapred.Child.main(Child.java:262)
> > > Caused by: java.lang.reflect.InvocationTargetException
> > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImp
> > > java:39)
> > > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcc
> > > sorImpl.java:25)
> > > at java.lang.reflect.Method.invoke(Method.java:597)
> > > at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.j
> > > a:88)
> > > ... 9 more
> > >
> > > On Fri, Sep 9, 2011 at 1:12 PM, Alan Gates <[email protected]> wrote:
> > >
> > > > If you're going to run with Apache Pig 0.8.1 or 0.9, you should use Apache
> > > > Hadoop 0.20.2. If you want to use CDH, you should stick with their versions
> > > > of Hadoop and Pig.
> > > >
> > > > Alan.
> > > >
> > > > On Sep 9, 2011, at 6:59 AM, Keren Ouaknine wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > What is the latest version of pig supporting the pigmix queries? The jira
> > > > > latest update mentions pig 0.7 only:
> > > > > Assuming its 0.8 or 0.9, can I use hadoop cdh3 or should I switch to
> > > > > apache's version and which one?
> > > > >
> > > > > ==
> > > > > *1. Download pig 0.7 release
> > > > > 2. Apply the patch
> > > > > 3. copy http://www.eli.sdsu.edu/java-SDSU/sdsuLibJKD12.jar to lib
> > > > > 4. ant jar pigperf
> > > > > 5. You will use pig.jar, pigperf.jar. Scripts is in
> > > > > test/utils/pigmix/scripts. To generate data, use generate_data.sh. To run
> > > > > PigMix2, use runpigmix-adhoc.pl.*
> > > > >
> > > > > Thanks,
> > > > > Keren
> > > > >
> > > > > --
> > > > > Keren Ouaknine
> > > > > Cell: +972 54 2565404
> > > > > Web: www.kereno.com
> > >
> > > --
> > > Keren Ouaknine
> > > Cell: +972 54 2565404
> > > Web: www.kereno.com
>
> --
> Keren Ouaknine
> Cell: +972 54 2565404
> Web: www.kereno.com
