Alright, I was eventually able to get the issue resolved. For everyone's
benefit, the root issue was that the Pig libraries needed to be present in
HADOOP_CLASSPATH, and it has to be set specifically in hadoop-env.sh. I tried
setting PIG_CLASSPATH and HADOOP_CLASSPATH to include those libraries in the
shell and then starting pig, but that did not help. Setting it explicitly in
hadoop-env.sh did the trick.
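
A minimal sketch of what such an entry in the Hadoop environment script might look like. The PIG_HOME variable, its default path, and the append_jars helper are assumptions for illustration, not part of any CDH package; adjust them to wherever the Pig jars actually live. The sketch lists each jar explicitly rather than using a wildcard entry.

```shell
# Illustrative fragment for the Hadoop environment script.
# PIG_HOME and its default are assumptions; point it at your Pig install.
PIG_HOME="${PIG_HOME:-/usr/lib/pig}"

# Hypothetical helper: append every jar found directly under a directory
# to HADOOP_CLASSPATH, one explicit entry per jar.
append_jars() {
  for jar in "$1"/*.jar; do
    # Skip the unexpanded pattern when the directory has no jars.
    [ -e "$jar" ] || continue
    HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$jar"
  done
}

append_jars "$PIG_HOME"
append_jars "$PIG_HOME/lib"
export HADOOP_CLASSPATH
```

The `${HADOOP_CLASSPATH:+...}` expansion only adds a separating colon when the variable already has a value, so existing entries are appended to rather than overridden.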

As I see it, there are 2 issues here:
1. The pig executable seems broken. Setting HADOOP_CLASSPATH in the shell and
then running pig should have the same effect as setting it in hadoop-env.sh.
Also, having a * in PIG_CLASSPATH/HADOOP_CLASSPATH (when set in the shell)
makes the pig executable set HADOOP_CLASSPATH to empty before it actually runs
the hadoop jar command inside the pig script. In CDH3, I never needed to set
the classpath to include the Pig libraries; the pig executable did that for
me. (And I never override HADOOP_CLASSPATH, I only append to it.)
2. The error message is not very helpful. If libraries are missing, the pig
shell/grunt should not open at all. However, in my case it did start up, and
the error message was in no way intuitive and did not point to the root issue.
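
Since grunt starts even when the libraries are missing, a quick check before launching it can save some guessing. The following is only a sketch: classpath_has is a hypothetical helper, and matching on the substring "pig" is an assumption about how the jar paths are named.

```shell
# Hypothetical helper: succeed if the colon-separated classpath in $1
# contains the substring given in $2.
classpath_has() {
  case ":$1:" in
    *"$2"*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Warn when nothing that looks like a Pig entry is on HADOOP_CLASSPATH.
if ! classpath_has "${HADOOP_CLASSPATH:-}" "pig"; then
  echo "warning: no Pig entry found in HADOOP_CLASSPATH" >&2
fi
```

Passing the output of `hadoop classpath` as the first argument instead of the shell variable checks the classpath as the hadoop launcher reports it.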

Also, I don't think this is a legacy Pig issue, as we tried the newer version
0.10 and hit the same problem. A second issue that popped up after putting the
Pig libraries in the classpath was with the mapreduce framework being selected
as classic; it turned out that the MRv1 libraries needed to be explicitly
specified in HADOOP_CLASSPATH (without that entry Sqoop and Java MR jobs work
fine, but Pig did not, for whatever reason).
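
A sketch of adding the MRv1 jars in the same append-only fashion. MR1_HOME, its default path, and the append_once helper are assumptions for illustration; the actual MRv1 library location depends on how CDH was installed.

```shell
# MR1_HOME and its default are assumptions; adjust to the MRv1 install location.
MR1_HOME="${MR1_HOME:-/usr/lib/hadoop-0.20-mapreduce}"

# Hypothetical helper: append an entry to HADOOP_CLASSPATH only if it is not
# already present, so sourcing the file twice does not duplicate entries.
append_once() {
  case ":${HADOOP_CLASSPATH:-}:" in
    *":$1:"*) ;;  # already present, nothing to do
    *) HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$1" ;;
  esac
}

for jar in "$MR1_HOME"/*.jar; do
  # Skip the unexpanded pattern when the directory has no jars.
  if [ -e "$jar" ]; then
    append_once "$jar"
  fi
done
export HADOOP_CLASSPATH
```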

Anyway, thanks everyone for the help.
 
Regards,
Dhaval


----- Original Message -----
From: Russell Jurney <[email protected]>
To: "[email protected]" <[email protected]>
Cc: 
Sent: Tuesday, 9 October 2012 7:34 PM
Subject: Re: Error with Pig (CDH4.0.0)

I've often wondered - as we see a lot of these legacy pig issues in
CDH on this list - is it hard to upgrade Pig on CDH by downloading the
latest stable release of pig, unpacking the .tgz and running it? Is
upgrading Pig on CDH as simple as wget, like on Apache Hadoop and
others, or is it somehow more complex? Pig is after all a client-side
tool.

Russell Jurney http://datasyndrome.com

On Oct 9, 2012, at 4:15 PM, Cheolsoo Park <[email protected]> wrote:

> Hi Dhaval,
>
> That certainly works for me.
>
> How did you upgrade CDH? Did you install it via RPMs? Did you completely
> uninstall CDH3u3 before installing CDH4?
>
> It sounds to me like a CDH-upgrade issue rather than Pig issue. Can you
> please provide steps that you took to upgrade CDH?
>
> Thanks,
> Cheolsoo
>
> On Tue, Oct 9, 2012 at 4:02 PM, Dhaval Shah <[email protected]> wrote:
>
>> Thanks for getting back, Cheolsoo.
>>
>> All I am trying to do is run the pig shell/grunt and do this:
>> p = LOAD 'file_name';
>>
>> It comes back with the exception mentioned below.
>>
>> Regards,
>> Dhaval
>>
>>
>> ----- Original Message -----
>> From: Cheolsoo Park <[email protected]>
>> To: [email protected]; Dhaval Shah <[email protected]>
>> Cc:
>> Sent: Tuesday, 9 October 2012 6:57 PM
>> Subject: Re: Error with Pig (CDH4.0.0)
>>
>> Hi Dhaval,
>>
>> CDH3u3 includes Pig 0.8, and CDH4.0.0 includes Pig 0.9. There were
>> some incompatibilities introduced between the two versions.
>>
>> https://cwiki.apache.org/confluence/display/PIG/Pig+0.9+Backward+Compatibility
>>
>> To pin down the exact cause, I'd like to reproduce your error. Would
>> you mind providing an example script that generates the exception?
>>
>> Thanks,
>> Cheolsoo
>>
>> On Tue, Oct 9, 2012 at 3:29 PM, Dhaval Shah <[email protected]> wrote:
>>
>>> Hi everyone. We just upgraded to CDH4.0.0 and are seeing a very weird
>>> issue with Pig. Every time I try to run a LOAD command, it dies with the
>>> following exception:
>>>
>>> ERROR 2998: Unhandled internal error. name
>>>
>>> java.lang.NoSuchFieldError: name
>>>        at org.apache.pig.parser.QueryParserStringStream.<init>(QueryParserStringStream.java:32)
>>>        at org.apache.pig.parser.QueryParserDriver.tokenize(QueryParserDriver.java:194)
>>>        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:162)
>>>        at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1609)
>>>        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
>>>        at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>>>        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:967)
>>>        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>>>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
>>>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>>>        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>>>        at org.apache.pig.Main.run(Main.java:495)
>>>        at org.apache.pig.Main.main(Main.java:111)
>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>        at java.lang.reflect.Method.invoke(Method.java:597)
>>>        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
>>>
>>> The same script/command worked fine with CDH3u3. Did any APIs change? Or
>>> is this a bug?
>>> Regards,
>>> Dhaval
>>>
>>
>>
