That changes things entirely. There's some weirdness in the way data is read from Cassandra. Have you applied the latest patches (e.g. https://issues.apache.org/jira/browse/CASSANDRA-2387)?
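One possible explanation, offered as an assumption rather than a confirmed diagnosis: the @outputSchema decorator only declares the field types and does not convert the values the UDF returns, so raw column values could still reach SUM as bytearrays. Below is a minimal sketch of a Python UDF that converts each field explicitly. It assumes the raw values arrive as byte strings holding decimal text (binary-encoded columns would need different decoding), the helper `_text` is hypothetical, and the fallback `outputSchema` exists only so the file also runs outside Pig.

```python
# Under Pig's Jython engine, outputSchema is supplied by Pig. The fallback
# below is a hypothetical stand-in so this sketch also runs as plain Python.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


def _text(value):
    # Assumption: raw values are byte strings (or text) holding readable data.
    if hasattr(value, "decode"):
        return value.decode("utf-8")
    return str(value)


@outputSchema("y:bag{t:tuple(domain:chararray, spam:int, size:long, time:float)}")
def toTuple(columns):
    # columns: a bag of (name, value) pairs, as loaded by CassandraStorage.
    row = dict((_text(name), _text(value)) for name, value in columns)
    # Convert each field explicitly; the schema string alone does not do this.
    return [(row["domain"], int(row["spam"]), int(row["size"]), float(row["time"]))]
```

If the returned values really are numbers, SUM(data.size) should no longer hit the DataByteArray cast.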
See also some UDFs for working with Cassandra data that Jeremy Hanna
(@jeromatron) wrote: https://github.com/jeromatron/pygmalion

Best of luck!

--jacob
@thedatachef

On Sun, 2011-04-24 at 18:31 +0200, pob wrote:
> Maybe I forgot one more thing: the rows are taken from Cassandra.
>
> rows = LOAD 'cassandra://emailArchive/messagesMetaData' USING
> CassandraStorage() AS (key, columns: bag {T: tuple(name, value)});
>
> I have no idea how to format AS for a bag in a foreach.
>
>
> P.
>
> 2011/4/24 Jacob Perkins
>
> > Strange, that looks right to me. What happens if you try the 'AS'
> > statement anyhow?
> >
> > --jacob
> > @thedatachef
> >
> > On Sun, 2011-04-24 at 18:22 +0200, pob wrote:
> > > Hello,
> > >
> > > pom = foreach rows generate myUDF.toTuple($1); -- reading data
> > > describe pom
> > > pom: {y: {t: (domain: chararray,spam: int,size: long,time: float)}}
> > >
> > > data = foreach pom generate flatten($0);
> > > grunt> describe data;
> > > data: {y::domain: chararray,y::spam: int,y::size: long,y::time: float}
> > >
> > > I think they are cast fine, right?
> > >
> > > The UDF is a Python one with the decorator
> > > @outputSchema("y:bag{t:tuple(domain:chararray, spam:int, size:long,
> > > time:float)}")
> > >
> > > Thanks
> > >
> > > 2011/4/24 Jacob Perkins
> > >
> > > > You're getting a 'ClassCastException' because the contents of the bags
> > > > are DataByteArray and not long (or cannot be cast to long). I suspect
> > > > that you're generating the contents of the bag in some way from a UDF,
> > > > no?
> > > >
> > > > You need to either declare the output schema explicitly in the UDF or
> > > > just use the 'AS' statement.
> > > > For example, say you have a UDF that sums two numbers:
> > > >
> > > > data = LOAD 'foobar' AS (a:int, b:int);
> > > > summed = FOREACH data GENERATE MyFancySummingUDF(a,b) AS (sum:int);
> > > > DUMP summed;
> > > >
> > > > --jacob
> > > > @thedatachef
> > > >
> > > > On Sun, 2011-04-24 at 18:02 +0200, pob wrote:
> > > > > x = foreach g2 generate group, data.(size);
> > > > > dump x;
> > > > >
> > > > > ((drm,0),{(464868)})
> > > > > ((drm,1),{(464868)})
> > > > > ((snezz,0),{(8073),(8073)})
> > > > >
> > > > > but:
> > > > > x = foreach g2 generate group, SUM(data.size);
> > > > >
> > > > > 2011-04-24 18:02:18,910 [Thread-793] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0038
> > > > > org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing sum in Initial
> > > > >     at org.apache.pig.builtin.LongSum$Initial.exec(LongSum.java:87)
> > > > >     at org.apache.pig.builtin.LongSum$Initial.exec(LongSum.java:65)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:273)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:343)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
> > > > >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> > > > >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > > >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > > > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > > > >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > > > > Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.Long
> > > > >     at org.apache.pig.builtin.LongSum$Initial.exec(LongSum.java:79)
> > > > >     ... 14 more
> > > > > 2011-04-24 18:02:19,213 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0038
> > > > > 2011-04-24 18:02:19,213 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> > > > > 2011-04-24 18:02:24,215 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0038 has failed! Stop running all dependent jobs
> > > > > 2011-04-24 18:02:24,216 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> > > > > 2011-04-24 18:02:24,216 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> > > > > 2011-04-24 18:02:24,216 [main] INFO org.apache.pig.tools.pigstats.PigStats - Detected Local mode. Stats reported below may be incomplete
> > > > > 2011-04-24 18:02:24,216 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> > > > >
> > > > > Pig Stack Trace
> > > > > ---------------
> > > > > ERROR 1066: Unable to open iterator for alias x
> > > > >
> > > > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias x
> > > > >     at org.apache.pig.PigServer.openIterator(PigServer.java:754)
> > > > >     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
> > > > >     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
> > > > >     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> > > > >     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
> > > > >     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
> > > > >     at org.apache.pig.Main.run(Main.java:465)
> > > > >     at org.apache.pig.Main.main(Main.java:107)
> > > > > Caused by: java.io.IOException: Job terminated with anomalous status FAILED
> > > > >     at org.apache.pig.PigServer.openIterator(PigServer.java:744)
> > > > >     ... 7 more
