Probably unrelated, but I always run from the root directory, not from bin/, when working with an svn version. Also, I've never had dev-vs-core issues. Can you try cd'ing to pig-branch-0.6-take2 and running ./bin/pig, with pig.jar (not core, not dev) on $PIG_CLASSPATH?
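For concreteness, the suggested setup might look like the sketch below. The checkout path comes from the transcript in this thread and is an assumption; substitute your own, and note the actual pig launch is commented out since it needs a real build.

```shell
# Hypothetical checkout location taken from this thread; adjust as needed.
PIG_HOME=/home/alex/hadoop/pig-branch-0.6-take2

# Put the full pig.jar (not the -core or -dev jar) on PIG_CLASSPATH.
export PIG_CLASSPATH="$PIG_HOME/pig.jar"

# Launch from the checkout root rather than from bin/:
# cd "$PIG_HOME" && ./bin/pig     # commented out: requires a real checkout

echo "$PIG_CLASSPATH"
```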
How about avoiding the pig script and just running java -cp $PIGDIR/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main <<yourscript>> ? I grepped through all the code, including the codegenned stuff, and none of it contains the string PIGDIR (except bin/pig, of course). At this point, assuming none of the above helps (and I don't have much hope it will), the best I can suggest is to set up your Eclipse environment and step through Pig's main with the debugger.

-D

On Wed, Feb 17, 2010 at 2:53 AM, Alex Parvulescu <[email protected]> wrote:

> Hello
>
> This is getting tiresome :)
>
> Today, I tried with a fresh download of pig 0.5 and I get the exact same
> error. Something somewhere just broke down, and it took my machine with it.
> At this point I have no way to run any kind of pig version.
> Needless to say, this is a big problem for me :)
>
> Another thing I tried was a fresh build of the 0.6 branch. Same error.
> I even dropped the symlinks.
>
> Back to our debugging session:
>
> > svn co http://svn.apache.org/repos/asf/hadoop/pig/branches/branch-0.6 pig-branch-0.6-take2
> A pig-branch-0.6-take2/test/bin/test-patch.sh
> Checked out external at revision 910901.
>
> Checked out revision 910900.
> > cd pig-branch-0.6-take2/
> > ant
> BUILD SUCCESSFUL
> Total time: 26 seconds
>
> > cd bin/
>
> next is to check if all vars are there
>
> > printenv | grep PIG
> PIG_HOME=/home/alex/hadoop/pig-branch-0.6-take2
> PIGDIR=/home/alex/hadoop/pig-branch-0.6-take2
>
> > printenv | grep HADOOP
> HADOOP_HOME=/home/alex/hadoop/hadoop
> HADOOP_CONF_DIR=/home/alex/hadoop/hadoop/conf
> HADOOPDIR=/home/alex/hadoop/hadoop/conf
>
> > pig
> Exception in thread "main" java.lang.NoClassDefFoundError: jline/ConsoleReaderInputStream
> Caused by: java.lang.ClassNotFoundException: jline.ConsoleReaderInputStream
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
>         at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
> Could not find the main class: org.apache.pig.Main. Program will exit.
>
> I've sent some emails to the list about this problem before. A quick fix is
> to copy and rename build/pig-0.6.1-dev.jar. Here goes:
>
> > cp ../build/pig-0.6.1-dev.jar ../pig-0.6.1-dev-core.jar
>
> > pig
> 2010-02-17 11:48:52,843 [main] INFO org.apache.pig.Main - Logging error messages to: /home/alex/hadoop/pig-branch-0.6-take2/bin/pig_1266403732842.log
> 2010-02-17 11:48:53,175 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
> grunt>
>
> Now at least it will start.
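The copy-and-rename workaround shown above can be reproduced in isolation. This sketch uses empty stand-in files (created in a throwaway directory) so it runs anywhere; the jar and directory names are the ones from the transcript, and the motivation is that the -secretDebugCmd classpath later in the thread includes ../pig-0.6.1-dev-core.jar, a name ant does not produce.

```shell
# Recreate the checkout shape from the transcript with stand-in files.
workdir=$(mktemp -d)
mkdir -p "$workdir/build" "$workdir/bin"
: > "$workdir/build/pig-0.6.1-dev.jar"   # stand-in for the jar ant built

# From bin/, copy the built jar up one level under the -core name that
# bin/pig's classpath (per the -secretDebugCmd output) expects to find.
cd "$workdir/bin"
cp ../build/pig-0.6.1-dev.jar ../pig-0.6.1-dev-core.jar

ls ..
```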
>
> Getting to the initial issue:
>
> the '-f' option:
>
> > pig -f /home/alex/hadoop/test/test_lite.pig
> 2010-02-17 11:49:28,988 [main] INFO org.apache.pig.Main - Logging error messages to: /home/alex/hadoop/pig-branch-0.6-take2/bin/pig_1266403768987.log
> 2010-02-17 11:49:29,012 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Undefined parameter : PIGDIR
> Details at logfile: /home/alex/hadoop/pig-branch-0.6-take2/bin/pig_1266403768987.log
>
> the '-secretDebugCmd' option:
>
> > pig -secretDebugCmd -f /home/alex/hadoop/test/test_lite.pig
> dry run:
> /usr/lib/jvm/java-6-sun/bin/java -Xmx1000m -Dpig.log.dir=/home/alex/hadoop/pig-branch-0.6-take2/bin/../logs -Dpig.log.file=pig.log -Dpig.home.dir=/home/alex/hadoop/pig-branch-0.6-take2/bin/.. -Dpig.root.logger=INFO,console,DRFA -classpath /home/alex/hadoop/pig-branch-0.6-take2/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/alex/hadoop/pig-branch-0.6-take2/bin/../build/classes:/home/alex/hadoop/pig-branch-0.6-take2/bin/../build/test/classes:/home/alex/hadoop/pig-branch-0.6-take2/bin/../pig-0.6.1-dev-core.jar:/home/alex/hadoop/pig-branch-0.6-take2/bin/../build/pig-0.6.1-dev-core.jar:/home/alex/hadoop/pig-branch-0.6-take2/bin/../lib/hadoop20.jar org.apache.pig.Main -f /home/alex/hadoop/test/test_lite.pig
>
> I did not change the scripts, and I get the same error.
>
> This is really killing me :)
>
> thanks,
> alex
>
> On Tue, Feb 16, 2010 at 6:08 PM, Dmitriy Ryaboy <[email protected]> wrote:
>
> > Hm, nothing jumps out. Definitely the error you are getting indicates that
> > somehow the preprocessor is trying to substitute a variable called PIGDIR
> > in your script. Which is odd. Does the same thing happen if you try
> > bin/pig -f test_lite.pig? If yes, try running with -secretDebugCmd (shh,
> > it's secret), and sending along the output. Why are you using pig-core
> > instead of the pig.jar that ant should be generating for you when you build?
> > Where did you get it, how did you build it, and what's the output of
> > cksum on it?
> > The fact that now your 0.5 is broken too makes me think that maybe your
> > symlinks are messed up. Surely a change to one jar shouldn't be affecting
> > a totally unrelated jar in a directory the first jar doesn't know about.
> >
> > Sorry about the barrage of questions, I'm just a bit dumbfounded about
> > how this could even begin to happen. Any preprocessor experts around?
> >
> > -D
> >
> > On Tue, Feb 16, 2010 at 3:37 AM, Alex Parvulescu <[email protected]> wrote:
> >
> > > Hello
> > >
> > > And thanks again for all your help.
> > >
> > > I have a symbolic link /home/alex/hadoop/pig which points to
> > > /home/alex/hadoop/pig-branch-0.6-ro - this is a checkout and build of
> > > the 0.6 branch.
> > > I'm running the script from /home/alex/hadoop/test.
> > >
> > > I didn't touch pig.properties, and I'm running with the default pig
> > > script.
> > >
> > > The only thing I did was copy pig/build/pig-0.6.0-dev.jar to
> > > pig/pig-0.6.0-dev-core.jar, because otherwise it would not work
> > > (Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/pig/Main).
> > >
> > > I'm running with a fresh 0.6 build and still no luck.
> > >
> > > The info you requested:
> > >
> > > a...@alex-desktop:~/hadoop/test$ pwd
> > > /home/alex/hadoop/test
> > > a...@alex-desktop:~/hadoop/test$ printenv | grep JAVA
> > > JAVA_HOME=/usr/lib/jvm/java-6-sun
> > > a...@alex-desktop:~/hadoop/test$ printenv | grep PIG
> > > PIGDIR=/home/alex/hadoop/pig
> > > a...@alex-desktop:~/hadoop/test$ printenv | grep HADOOP
> > > HADOOP_HOME=/home/alex/hadoop/hadoop
> > > HADOOP_CONF_DIR=/home/alex/hadoop/hadoop/conf
> > > HADOOPDIR=/home/alex/hadoop/hadoop/conf
> > >
> > > a...@alex-desktop:~/hadoop/test$ cat /home/alex/hadoop/pig/conf/pig.properties
> > > # Pig configuration file. All values can be overwritten by command line arguments.
> > > # see bin/pig -help
> > >
> > > # log4jconf log4j configuration file
> > > # log4jconf=./conf/log4j.properties
> > >
> > > # brief logging (no timestamps)
> > > brief=false
> > >
> > > # clustername, name of the hadoop jobtracker. If no port is defined port 50020 will be used.
> > > #cluster
> > >
> > > #debug level, INFO is default
> > > debug=INFO
> > >
> > > # a file that contains pig script
> > > #file=
> > >
> > > # load jarfile, colon separated
> > > #jar=
> > >
> > > #verbose print all log messages to screen (default to print only INFO and above to screen)
> > > verbose=false
> > >
> > > #exectype local|mapreduce, mapreduce is default
> > > #exectype=mapreduce
> > >
> > > # hod realted properties
> > > #ssh.gateway
> > > #hod.expect.root
> > > #hod.expect.uselatest
> > > #hod.command
> > > #hod.config.dir
> > > #hod.param
> > >
> > > #Do not spill temp files smaller than this size (bytes)
> > > pig.spill.size.threshold=5000000
> > > #EXPERIMENT: Activate garbage collection when spilling a file bigger than this size (bytes)
> > > #This should help reduce the number of files being spilled.
> > > pig.spill.gc.activation.size=40000000
> > >
> > > ######################
> > > # Everything below this line is Yahoo specific. Note that I've made
> > > # (almost) no changes to the lines above to make merging in from Apache
> > > # easier. Any values I don't want from above I override below.
> > > #
> > > # This file is configured for use with HOD on the production clusters. If you
> > > # want to run pig with a static cluster you will need to remove everything
> > > # below this line and set the cluster value (above) to the
> > > # hostname and port of your job tracker.
> > >
> > > exectype=mapreduce
> > >
> > > hod.config.dir=/export/crawlspace/kryptonite/hod/current/conf
> > > hod.server=local
> > >
> > > cluster.domain=inktomisearch.com
> > >
> > > log.file=
> > >
> > > yinst.cluster=kryptonite
> > >
> > > And now boom!
> > >
> > > java -cp $PIGDIR/pig-0.6.0-dev-core.jar:$HADOOPDIR org.apache.pig.Main test_lite.pig
> > >
> > > 2010-02-16 12:27:17,283 [main] INFO org.apache.pig.Main - Logging error messages to: /home/alex/hadoop/test/pig_1266319637282.log
> > > 2010-02-16 12:27:17,303 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Undefined parameter : PIGDIR
> > > Details at logfile: /home/alex/hadoop/test/pig_1266319637282.log
> > >
> > > The log file:
> > > a...@alex-desktop:~/hadoop/test$ cat pig_1266319637282.log
> > > Error before Pig is launched
> > > ----------------------------
> > > ERROR 2999: Unexpected internal error. Undefined parameter : PIGDIR
> > >
> > > java.lang.RuntimeException: Undefined parameter : PIGDIR
> > >         at org.apache.pig.tools.parameters.PreprocessorContext.substitute(PreprocessorContext.java:232)
> > >         at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.parsePigFile(ParameterSubstitutionPreprocessor.java:106)
> > >         at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.genSubstitutedFile(ParameterSubstitutionPreprocessor.java:86)
> > >         at org.apache.pig.Main.runParamPreprocessor(Main.java:515)
> > >         at org.apache.pig.Main.main(Main.java:366)
> > >
> > > ================================================================================
> > >
> > > The script: 'A = load '/home/alex/hadoop/test/t.csv' using PigStorage('\t') as (id: long); '
> > > The file: '1'.
> > > Simple enough :)
> > >
> > > I'm all out of ideas. It seems that even 0.5 is broken now. I can't
> > > start anything as scripts.
> > > If I go with manual processing in grunt (line by line), it's all good
> > > on any version.
> > >
> > > thanks,
> > > alex
> > >
> > > On Tue, Feb 16, 2010 at 9:58 AM, Dmitriy Ryaboy <[email protected]> wrote:
> > >
> > > > Pig starts; the error you are getting is from inside pig.Main.
> > > > I think something is getting messed up in your environment because you
> > > > are juggling too many different versions of pig (granted, I have 3 or
> > > > 4 in various stages of development on my laptop most of the time, and
> > > > haven't had your problems. But then none of them has a hacked
> > > > bin/pig, with the exception of the cloudera one...). I've never tried
> > > > running multiple ant tests, either. There's a short "sanity" version
> > > > of the tests, ant test-commit, that runs in under 10 minutes. You might
> > > > want to try that if you are not doing things like changing join
> > > > implementations or messing with the optimizer.
> > > >
> > > > Let's do this: send a full trace of what you are doing and what your
> > > > environment looks like. Something like
> > > >
> > > > pwd
> > > > printenv | grep JAVA
> > > > printenv | grep PIG
> > > > printenv | grep HADOOP
> > > > cat conf/pig.properties
> > > > java -cp ........
> > > > <boom!>
> > > >
> > > > -D
> > > >
> > > > On Tue, Feb 16, 2010 at 12:39 AM, Alex Parvulescu <[email protected]> wrote:
> > > > > Hello,
> > > > >
> > > > > sorry for the delay, but I wanted to build from the source again just
> > > > > to make sure.
> > > > >
> > > > > The script is like this:
> > > > >
> > > > > 'A = load '/home/alex/hadoop/reviews/r.csv' using PigStorage('\t') as
> > > > > (id: long, hid: long, locale: chararray, r1: int, r2: int, r3: int, r4: int); '
> > > > >
> > > > > That's it. My guess is that Pig doesn't even start. Do you think I
> > > > > need to change something in the properties file?
> > > > > I'm not sure anymore, and I don't have a lot of luck going through
> > > > > the sources.
> > > > >
> > > > > And another thing: I tried running 'ant test' for both the 0.6 branch
> > > > > and trunk at the same time (because they take a very long time) and
> > > > > both test runs failed. I've switched to running them one after the
> > > > > other and they are fine. Do you think that is ok?
> > > > >
> > > > > thanks for your time,
> > > > > alex
> > > > >
> > > > > On Fri, Feb 12, 2010 at 6:37 PM, Dmitriy Ryaboy <[email protected]> wrote:
> > > > >
> > > > >> what does your script1-hadoop.pig look like?
> > > > >>
> > > > >> The error you are getting happens when the pig preprocessor can't
> > > > >> substitute Pig variables (the stuff you specify with -param and
> > > > >> %default, etc). Do you have $PIGDIR in your script somewhere?
> > > > >>
> > > > >> -D
> > > > >>
> > > > >> On Fri, Feb 12, 2010 at 6:51 AM, Alex Parvulescu <[email protected]> wrote:
> > > > >> > Hello,
> > > > >> >
> > > > >> > I seem to have broken my Pig install, and I don't know where to look.
> > > > >> >
> > > > >> > If I use the grunt shell directly, everything works ok, but every
> > > > >> > time I try to run a pig script: 'java -cp $PIGDIR/pig.jar:$HADOOPSITEPATH
> > > > >> > org.apache.pig.Main script1-hadoop.pig'
> > > > >> >
> > > > >> > I get this nice error: [main] ERROR org.apache.pig.Main - ERROR 2999:
> > > > >> > Unexpected internal error. Undefined parameter : PIGDIR
> > > > >> >
> > > > >> > Obviously I have the PIGDIR var set:
> > > > >> > > echo $PIGDIR
> > > > >> > > /home/alex/hadoop/pig
> > > > >> >
> > > > >> > This is something that I did, as I have used 0.5 and 0.6 and a
> > > > >> > patched version of 0.6 in parallel, but I can't figure out where
> > > > >> > to look. Any version I try to start now gives the same error.
> > > > >> >
> > > > >> > any help would be greatly appreciated!
> > > > >> >
> > > > >> > alex
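For reference, the "Undefined parameter" failure discussed throughout this thread comes from Pig's parameter substitution preprocessor, which resolves $NAME tokens in a script from definitions supplied via -param, -param_file, or %default; as the thread itself shows (PIGDIR was set in the shell yet still reported undefined), it does not appear to fall back to environment variables here. The following is a rough stand-in sketch of that behavior, not Pig's actual code; the script text and the empty parameter list are made up for illustration.

```shell
# Rough approximation of parameter substitution: every $NAME token in
# the script must have a definition, or processing aborts.
script='A = load "$PIGDIR/t.csv" using PigStorage();'
defined_params=""   # nothing passed via -param / %default in this sketch

status=0
for name in $(printf '%s\n' "$script" | grep -o '\$[A-Za-z_][A-Za-z0-9_]*' | tr -d '$' | sort -u); do
  case " $defined_params " in
    *" $name "*) ;;                          # defined: would be substituted
    *) echo "Undefined parameter : $name"    # mirrors the ERROR 2999 detail
       status=1 ;;
  esac
done
```

Under this model, a literal $PIGDIR anywhere in the script file (even inside a comment or a quoted path) would trigger the error regardless of what the shell environment contains.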
