Hello,
and thanks again for all your help.
I have a symbolic link /home/alex/hadoop/pig which points to
/home/alex/hadoop/pig-branch-0.6-ro - this is a checkout and build of the
0.6 branch.
I'm running the script from /home/alex/hadoop/test.
I didn't touch pig.properties, and I'm running with the default pig script.
The only thing I did was copy pig/build/pig-0.6.0-dev.jar
to pig/pig-0.6.0-dev-core.jar, because otherwise it would not work (Exception
in thread "main" java.lang.NoClassDefFoundError: org/apache/pig/Main).
I'm running with a fresh 0.6 build and still no luck.
The info you requested:
a...@alex-desktop:~/hadoop/test$ pwd
/home/alex/hadoop/test
a...@alex-desktop:~/hadoop/test$ printenv | grep JAVA
JAVA_HOME=/usr/lib/jvm/java-6-sun
a...@alex-desktop:~/hadoop/test$ printenv | grep PIG
PIGDIR=/home/alex/hadoop/pig
a...@alex-desktop:~/hadoop/test$ printenv | grep HADOOP
HADOOP_HOME=/home/alex/hadoop/hadoop
HADOOP_CONF_DIR=/home/alex/hadoop/hadoop/conf
HADOOPDIR=/home/alex/hadoop/hadoop/conf
a...@alex-desktop:~/hadoop/test$ cat /home/alex/hadoop/pig/conf/pig.properties
# Pig configuration file. All values can be overwritten by command line arguments.
# see bin/pig -help
# log4jconf log4j configuration file
# log4jconf=./conf/log4j.properties
# brief logging (no timestamps)
brief=false
# clustername, name of the hadoop jobtracker. If no port is defined port 50020 will be used.
#cluster
#debug level, INFO is default
debug=INFO
# a file that contains pig script
#file=
# load jarfile, colon separated
#jar=
#verbose print all log messages to screen (default to print only INFO and above to screen)
verbose=false
#exectype local|mapreduce, mapreduce is default
#exectype=mapreduce
# hod realted properties
#ssh.gateway
#hod.expect.root
#hod.expect.uselatest
#hod.command
#hod.config.dir
#hod.param
#Do not spill temp files smaller than this size (bytes)
pig.spill.size.threshold=5000000
#EXPERIMENT: Activate garbage collection when spilling a file bigger than this size (bytes)
#This should help reduce the number of files being spilled.
pig.spill.gc.activation.size=40000000
######################
# Everything below this line is Yahoo specific. Note that I've made
# (almost) no changes to the lines above to make merging in from Apache
# easier. Any values I don't want from above I override below.
#
# This file is configured for use with HOD on the production clusters. If you
# want to run pig with a static cluster you will need to remove everything
# below this line and set the cluster value (above) to the
# hostname and port of your job tracker.
exectype=mapreduce
hod.config.dir=/export/crawlspace/kryptonite/hod/current/conf
hod.server=local
cluster.domain=inktomisearch.com
log.file=
yinst.cluster=kryptonite
And now boom!
java -cp $PIGDIR/pig-0.6.0-dev-core.jar:$HADOOPDIR org.apache.pig.Main test_lite.pig
2010-02-16 12:27:17,283 [main] INFO org.apache.pig.Main - Logging error messages to: /home/alex/hadoop/test/pig_1266319637282.log
2010-02-16 12:27:17,303 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Undefined parameter : PIGDIR
Details at logfile: /home/alex/hadoop/test/pig_1266319637282.log
The log file:
a...@alex-desktop:~/hadoop/test$ cat pig_1266319637282.log
Error before Pig is launched
----------------------------
ERROR 2999: Unexpected internal error. Undefined parameter : PIGDIR
java.lang.RuntimeException: Undefined parameter : PIGDIR
    at org.apache.pig.tools.parameters.PreprocessorContext.substitute(PreprocessorContext.java:232)
    at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.parsePigFile(ParameterSubstitutionPreprocessor.java:106)
    at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.genSubstitutedFile(ParameterSubstitutionPreprocessor.java:86)
    at org.apache.pig.Main.runParamPreprocessor(Main.java:515)
    at org.apache.pig.Main.main(Main.java:366)
================================================================================
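For reference, the trace above ends in the parameter substitution step, which matches the earlier explanation that this error appears when a `$NAME` reference cannot be resolved. Below is a rough sketch of that kind of substitution, written in Python for brevity. It is not Pig's actual Java code: the function name `substitute`, the regular expression, and the rule that only explicitly passed parameters are visible are all assumptions made for illustration.

```python
import re

# Assumed parameter syntax: a dollar sign followed by an identifier.
# This is a stand-in for illustration, not Pig's real grammar.
PARAM_PATTERN = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def substitute(script_text, params):
    """Replace every $NAME in script_text with params[NAME].

    Raises RuntimeError for a name that was never defined, mimicking
    the wording of the message seen in the log.
    """
    def lookup(match):
        name = match.group(1)
        if name not in params:
            raise RuntimeError("Undefined parameter : " + name)
        return params[name]

    return PARAM_PATTERN.sub(lookup, script_text)


# A script without any $ reference passes through unchanged.
plain = substitute("A = load '/home/alex/hadoop/test/t.csv';", {})

# A script with a $ reference only works when the name is supplied.
filled = substitute("A = load '$PIGDIR/t.csv';",
                    {"PIGDIR": "/home/alex/hadoop/pig"})
```

In this sketch a shell variable such as PIGDIR is not visible unless it is passed in through `params`, so having it exported in the environment would not be enough on its own.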
The script: 'A = load '/home/alex/hadoop/test/t.csv' using PigStorage('\t') as (id: long);'
The file: '1' .
Simple enough :)
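To double-check that the script really contains no `$` reference (the question raised earlier in the thread), a small scan like the one below could be run over the script text. It is only a sketch in Python: the helper name `find_params` and the `$NAME` pattern are assumptions for illustration, not Pig's actual preprocessor rules.

```python
import re

# Assumed parameter syntax: a dollar sign followed by an identifier.
PARAM_PATTERN = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def find_params(script_text):
    """Return (line_number, name) pairs for every $NAME token found."""
    hits = []
    for line_number, line in enumerate(script_text.splitlines(), start=1):
        for match in PARAM_PATTERN.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


# The one-line script quoted above, as a Python string.
script = "A = load '/home/alex/hadoop/test/t.csv' using PigStorage('\\t') as (id: long);"
print(find_params(script))
```

For the one-line script above this prints an empty list, so if the same scan on the real file reports a hit, the file on disk differs from what was pasted here.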
I'm all out of ideas. It seems that even 0.5 is broken now. I can't run
anything as a script. If I enter the statements manually in grunt (line by
line), everything works on any version.
thanks,
alex
On Tue, Feb 16, 2010 at 9:58 AM, Dmitriy Ryaboy <[email protected]> wrote:
> Pig starts; the error you are getting is from inside pig.Main.
> I think something is getting messed up in your environment because you
> are juggling too many different versions of pig. (Granted, I have 3 or
> 4 in various stages of development on my laptop most of the time, and
> haven't had your problems. But then none of them has a hacked
> bin/pig, with the exception of the cloudera one...) I've never tried
> running multiple ant tests at once, either. There's a short "sanity"
> version of tests, ant test-commit, that runs in under 10 minutes. You might
> want to try that if you are not doing things like changing join
> implementations or messing with the optimizer.
>
> Let's do this: send a full trace of what you are doing and what your
> environment looks like. Something like
>
> pwd
> printenv | grep JAVA
> printenv | grep PIG
> printenv | grep HADOOP
> cat conf/pig.properties
> java -cp ........
> <boom!>
>
> -D
>
>
> On Tue, Feb 16, 2010 at 12:39 AM, Alex Parvulescu
> <[email protected]> wrote:
> > Hello,
> >
> > sorry for the delay, but I wanted to build from the source again just to
> > make sure.
> >
> > The script is like this:
> >
> > 'A = load '/home/alex/hadoop/reviews/r.csv' using PigStorage('\t') as (id:
> > long, hid: long, locale: chararray, r1: int, r2: int, r3: int, r4: int);'
> >
> > That's it. My guess is that Pig doesn't even start. Do you think I need to
> > change something in the properties file? I'm not sure anymore and I don't
> > have a lot of luck going through the sources.
> >
> > And another thing, I tried running 'ant test' for both 0.6-branch and trunk
> > at the same time (because they take a very long time) and both test scripts
> > failed. I've switched to running them one after the other and they are fine.
> > Do you think that is ok?
> >
> > thanks for your time,
> > alex
> >
> > On Fri, Feb 12, 2010 at 6:37 PM, Dmitriy Ryaboy <[email protected]> wrote:
> >
> >> what does your script1-hadoop.pig look like?
> >>
> >> The error you are getting happens when the pig preprocessor can't
> >> substitute Pig variables (the stuff you specify with -param and
> >> %default, etc). Do you have $PIGDIR in your script somewhere?
> >>
> >> -D
> >>
> >> On Fri, Feb 12, 2010 at 6:51 AM, Alex Parvulescu
> >> <[email protected]> wrote:
> >> > Hello,
> >> >
> >> > I seem to have broken my Pig install, and I don't know where to look.
> >> >
> >> > If I use grunt directly everything works ok, but every time I
> >> > try to run a pig script: 'java -cp $PIGDIR/pig.jar:$HADOOPSITEPATH
> >> > org.apache.pig.Main script1-hadoop.pig'
> >> >
> >> > I get this nice error: [main] ERROR org.apache.pig.Main - ERROR 2999:
> >> > Unexpected internal error. Undefined parameter : PIGDIR
> >> >
> >> > Obviously I have the PIGDIR var set:
> >> >> echo $PIGDIR
> >> >> /home/alex/hadoop/pig
> >> >
> >> > This is something that I did, as I have used 0.5 and 0.6 and a patched
> >> > version of 0.6 in parallel, but I can't figure out where to look. Any
> >> > version I try to start now, gives the same error.
> >> >
> >> > any help would be greatly appreciated!
> >> >
> >> > alex
> >> >
> >>
> >
>