Re: Pig & Cassandra integration

Shai Harel Tue, 02 Aug 2011 04:41:34 -0700

Jeremy, were you able to make it run on Amazon Elastic MapReduce
machines?


I've tried copying the jars (both Pig's and Cassandra's) to the new machine,
set the PIG_HOME environment variable,
and even added the Hadoop config files to the classpath,
and I'm getting this error:

Error before Pig is launched
----------------------------
ERROR 2999: Unexpected internal error. Failed to create DataStorage

java.lang.RuntimeException: Failed to create DataStorage
        at
org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
        at
org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
        at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:213)
        at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:133)
        at org.apache.pig.impl.PigContext.connect(PigContext.java:183)
        at org.apache.pig.PigServer.<init>(PigServer.java:225)
        at org.apache.pig.PigServer.<init>(PigServer.java:214)
        at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:55)
        at org.apache.pig.Main.run(Main.java:462)
        at org.apache.pig.Main.main(Main.java:107)
Caused by: java.io.IOException: Call to
ip-10-56-51-167.eu-west-1.compute.internal/10.56.51.167:9000 failed on local
exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1139)
        at org.apache.hadoop.ipc.Client.call(Client.java:1107)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
        at $Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
        at
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:111)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:213)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:180)
        at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
        at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1514)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
        at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1548)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1530)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:111)
        at
org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
        ... 9 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:812)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:720)
================================================================================

Amazon claims to run Hadoop 0.20. What am I doing wrong?
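
One possible line of diagnosis (an assumption, not something confirmed in this thread): an EOFException on the very first RPC call to the NameNode can mean that the Hadoop client jar on Pig's classpath does not match the Hadoop build running on the cluster. A minimal sketch for checking which jar each class actually comes from is below. JarLocator and locate are hypothetical names made up for illustration; the class names in main are the ones from the stack traces in this thread. Run it with the same classpath that Pig uses.

```java
import java.security.CodeSource;

public class JarLocator {

    /** Returns the location a class was loaded from, or a short status string. */
    static String locate(String className) {
        try {
            // initialize=false: look the class up without running its static initializers
            Class<?> c = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // JDK classes loaded by the bootstrap loader have no code source
                return "bootstrap (no code source)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Defaults are the classes that appear in the stack traces in this thread
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.hadoop.ipc.Client",
            "org.apache.thrift.meta_data.FieldValueMetaData",
            "org.apache.cassandra.thrift.SliceRange"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the Hadoop or Thrift class resolves to a jar bundled with Pig or Cassandra rather than the one installed on the cluster, that would be a candidate for the mismatch.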



On Mon, Aug 1, 2011 at 5:55 PM, Jeremy Hanna <[email protected]> wrote:

> Ah - just saw this, glad you got it working - cheers.
>
> On Aug 1, 2011, at 5:43 AM, Shai Harel wrote:
>
> > Hey all, I've successfully fixed this problem.
> > I was missing the Cassandra jars,
> > so you actually need to build Cassandra (ant) and then jar it
> > (ant jar),
> > and only then will it work.
> >
> > BTW, if you have Hue installed, remove it first!
> >
> >
> >
> > On Mon, Aug 1, 2011 at 12:41 PM, Shai Harel <[email protected]>
> wrote:
> >
> >> Thanks for the help. I've tried to be conservative and I'm using Pig 0.8 &
> >> Cassandra 0.8,
> >> and I'm still getting this error:
> >>
> >> Pig Stack Trace
> >> ---------------
> >> ERROR 2998: Unhandled internal error. Could not initialize class
> >> org.apache.cassandra.thrift.SliceRange
> >>
> >> java.lang.NoClassDefFoundError: Could not initialize class
> >> org.apache.cassandra.thrift.SliceRange
> >>
> >>    at
> org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(Unknown
> >> Source)
> >>    at
> >>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:369)
> >>    at
> >>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:256)
> >>    at
> >>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:147)
> >>    at
> >>
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
> >>    at
> >> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1198)
> >>    at org.apache.pig.PigServer.storeEx(PigServer.java:874)
> >>    at org.apache.pig.PigServer.store(PigServer.java:816)
> >>    at org.apache.pig.PigServer.openIterator(PigServer.java:728)
> >>    at
> >> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
> >>    at
> >>
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
> >>    at
> >>
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> >>    at
> >>
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
> >>    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
> >>    at org.apache.pig.Main.run(Main.java:465)
> >>    at org.apache.pig.Main.main(Main.java:107)
> >>
> >> does anyone else have this problem?
> >>
> >>
> >>
> >> On Sun, Jul 31, 2011 at 2:04 PM, Jeremy Hanna <
> [email protected]>wrote:
> >>
> >>> Try following this and see if it helps getting started:
> >>> https://github.com/jeromatron/pygmalion/wiki/Getting-Started
> >>>
> >>> I haven't tried it with 0.9 yet but I plan to this week.  We use the
> >>> CassandraStorage jar in production.  If you can, validate your data with
> >>> Cassandra's schema validators.  CassandraStorage gets the schema from
> >>> Cassandra and tries to unmarshal the data into Pig data types with the
> >>> schema information.
> >>>
> >>> See if that helps.
> >>>
> >>> On Jul 31, 2011, at 9:48 AM, Shai Harel wrote:
> >>>
> >>>> Hey all, I've been trying to query Cassandra from my Pig script,
> >>>> so I used the contrib jar from Cassandra, and I'm getting the following
> >>>> error...
> >>>> some Thrift failure err.... :|
> >>>>
> >>>> ERROR 2998: Unhandled internal error.
> >>>> org.apache.thrift.meta_data.FieldValueMetaData.<init>(BZ)V
> >>>>
> >>>> java.lang.NoSuchMethodError:
> >>>> org.apache.thrift.meta_data.FieldValueMetaData.<init>(BZ)V
> >>>>   at
> >>> org.apache.cassandra.thrift.SliceRange.<clinit>(SliceRange.java:149)
> >>>>   at
> >>> org.apache.cassandra.hadoop.pig.CassandraStorage.setLocation(Unknown
> >>>> Source)
> >>>>   at
> >>>>
> >>>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:369)
> >>>>   at
> >>>>
> >>>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:256)
> >>>>   at
> >>>>
> >>>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:147)
> >>>>   at
> >>>>
> >>>
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
> >>>>   at
> >>>>
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1198)
> >>>>   at org.apache.pig.PigServer.storeEx(PigServer.java:874)
> >>>>   at org.apache.pig.PigServer.store(PigServer.java:816)
> >>>>   at org.apache.pig.PigServer.openIterator(PigServer.java:728)
> >>>>   at
> >>>>
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
> >>>>   at
> >>>>
> >>>
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
> >>>>   at
> >>>>
> >>>
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> >>>>   at
> >>>>
> >>>
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
> >>>>   at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
> >>>>   at org.apache.pig.Main.run(Main.java:465)
> >>>>   at org.apache.pig.Main.main(Main.java:107)
> >>>>
> >>>>
> >>>> Has anyone managed to get this up and running?
> >>>> I'm considering rewriting the CassandraStorage.jar using Hector.
> >>>> Any thoughts about that?
> >>>
> >>>
> >>
>
>
