Ah, thanks William. I am trying it with an upgraded piggybank, 0.10.0-cdh4.1.1. It seems to be running; otherwise Pig 0.10 would be the way to go.
On Fri, Nov 30, 2012 at 1:11 PM, William Oberman <[email protected]> wrote:

> I should have read more closely, you're not using EMR.
>
> I'm guessing if you upgrade to pig 0.10 the issue will go away...
>
>
> On Fri, Nov 30, 2012 at 4:09 PM, William Oberman <[email protected]> wrote:
>
> > A couple of weeks ago I spent a bunch of time trying to get EMR + S3 +
> > Avro working:
> > https://forums.aws.amazon.com/thread.jspa?messageID=398194
> >
> > Short story, yes I think PIG-2540 is the issue. I'm currently trying to
> > get pig 0.10 running in EMR with help from AWS support. You have to do:
> >
> > --bootstrap-action s3://elasticmapreduce/bootstrap-actions/run-if --args
> > "instance.isMaster=true,s3://yourbucket/path/install_pig_0.10.0.sh"
> >
> > install_pig_0.10.0.sh contents:
> > ---------------------
> > #!/usr/bin/env bash
> > cd /home/hadoop
> > wget http://apache.mirrors.hoobly.com/pig/pig-0.10.0/pig-0.10.0.tar.gz
> > tar zxf pig-0.10.0.tar.gz
> > mv pig-0.10.0 pig
> > echo "export HADOOP_HOME=/home/hadoop" >> ~/.bashrc
> > echo "export PATH=/home/hadoop/pig/bin/:\$PATH" >> ~/.bashrc
> > cd pig
> > ant
> > cd contrib/piggybank/java
> > ant
> > cp piggybank.jar /home/hadoop/lib/.
> > cd /home/hadoop/lib
> > wget "http://json-simple.googlecode.com/files/json_simple-1.1.jar"
> > ------------------
> >
> > But note, I have NOT got around to testing this yet! If you do, and it
> > works, let me know :-)
> >
> > will
> >
> > On Fri, Nov 30, 2012 at 4:05 PM, meghana narasimhan <[email protected]> wrote:
> >
> >> Oh, I should also mention piggybank: 0.9.2-cdh4.0.1
> >>
> >>
> >> On Fri, Nov 30, 2012 at 12:59 PM, meghana narasimhan <[email protected]> wrote:
> >>
> >> > Hi all,
> >> >
> >> > Is this bug https://issues.apache.org/jira/browse/PIG-2540 applicable
> >> > to plain ec2 instances as well? I seem to have hit a snag with Apache
> >> > Pig version 0.9.2-cdh4.0.1 (rexported) and avro files on S3. My hadoop
> >> > cluster is made of Amazon ec2 instances.
> >> >
> >> > Here is my load statement:
> >> >
> >> > dimRad = LOAD 's3n://credentials@bucket/dimensions/2012/11/29/20121129-000159123456/dim'
> >> >     USING AVRO_STORAGE AS
> >> >     (a:int
> >> >     , b:chararray
> >> >     );
> >> >
> >> > and it gives me a:
> >> >
> >> > 2012-11-30 20:42:44,205 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> >> > ERROR 1200: Wrong FS: s3n://credentials@bucket/dimensions/2012/11/29/20121129-000159123456/dim,
> >> > expected: hdfs://ec2-1xxxx.compute-1.amazonaws.com:8020
> >> >
> >> > Thanks,
> >> > Meg
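For reference, once Pig 0.10 and a matching piggybank build are on the cluster, the load above would typically be wired up along these lines. This is only a sketch: it assumes AVRO_STORAGE is meant as an alias for piggybank's AvroStorage, and the jar paths follow the install script quoted earlier.

```pig
-- Sketch, not the poster's exact script: register the jars produced by the
-- install script (paths assumed), alias piggybank's AvroStorage, then load.
REGISTER /home/hadoop/lib/piggybank.jar;
REGISTER /home/hadoop/lib/json_simple-1.1.jar;

DEFINE AVRO_STORAGE org.apache.pig.piggybank.storage.avro.AvroStorage();

dimRad = LOAD 's3n://credentials@bucket/dimensions/2012/11/29/20121129-000159123456/dim'
    USING AVRO_STORAGE AS (a:int, b:chararray);
```

Depending on how piggybank was built, the avro and jackson jars may also need to be REGISTERed alongside json-simple.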
