Sigh. @jeromatron , @thedatachef -- this one's on you :). Toldya you need
the LoadCaster...


D

On Sun, Apr 24, 2011 at 1:17 PM, pob <[email protected]> wrote:

> hello,
>
> thanks, but without success ;/
>
>
> grunt> pom = foreach rows generate myUDF.toTuple($1);
> grunt> describe pom
> pom: {y: {t: (domain: bytearray,spam: bytearray,size: bytearray,time: bytearray)}}
> grunt> data = foreach pom generate flatten($0) as (domain, spam, size, time);
> grunt> data = foreach data generate (chararray) domain, (int) spam, (long) size,
> >> (float) time;
> grunt> describe data;
> data: {domain: chararray,spam: int,size: long,time: float}
>
> z = foreach data generate time+size;
>
>
> org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF. Cannot determine how to convert the bytearray to float.
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:529)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Add.getNext(Add.java:92)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:364)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 2011-04-24 22:16:06,129 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0001 has failed! Stop running all dependent jobs
>
>
>
>
> z = foreach data generate time
>
>
> org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF. Cannot determine how to convert the bytearray to float.
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:529)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:364)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>
>
> 2011/4/24 Dmitriy Ryaboy <[email protected]>
>
> > Try this:
> >
> > data = foreach pom generate flatten($0) as (domain, spam, size, time);
> > data = foreach data generate (chararray) domain, (int) spam, (long) size,
> > (float) time;
> >
> > Pig is inconsistent in what "as foo:type" does vs " (type) foo"
> >
> > D
> >
> > On Sun, Apr 24, 2011 at 10:44 AM, pob <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > but why can't I re-cast it during flatten?
> > >
> > >
> > > data = foreach pom generate flatten($0) AS (domain:chararray, spam:int,
> > > size:long, time:float);
> > >
> > > grunt> z = foreach data generate time+size;
> > >
> > >
> > > java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.Float
> > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Add.getNext(Add.java:97)
> > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:364)
> > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> > > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
> > > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
> > > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> > > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > >
> > >
> > >
> > > 2011/4/24 Dmitriy Ryaboy <[email protected]>
> > >
> > > > I think it's the deep-casting issue from
> > > > https://issues.apache.org/jira/browse/PIG-1758 .
> > > > Should work in 0.9 but didn't get into 0.8 or 0.8.1
> > > >
> > > > D
> > > >
> > > > On Sun, Apr 24, 2011 at 9:52 AM, pob <[email protected]> wrote:
> > > >
> > > > > That's strange, pygmalion works fine (but there aren't any numerical
> > > > > operations).
> > > > >
> > > > > I think I'm using C* 0.7.5, where it's supposed to be patched ;/ so idk :(
> > > > >
> > > > >
> > > > > 2011/4/24 Jacob Perkins <[email protected]>
> > > > >
> > > > > > That changes things entirely. There's some weirdness in the way data is
> > > > > > read from Cassandra. Have you applied the latest patches (e.g.
> > > > > > https://issues.apache.org/jira/browse/CASSANDRA-2387)?
> > > > > >
> > > > > > See also some UDFs for working with Cassandra data that Jeremy Hanna
> > > > > > (@jeromatron) wrote:
> > > > > >
> > > > > > https://github.com/jeromatron/pygmalion
> > > > > >
> > > > > >
> > > > > > Best of luck!
> > > > > >
> > > > > > --jacob
> > > > > > @thedatachef
> > > > > >
> > > > > > On Sun, 2011-04-24 at 18:31 +0200, pob wrote:
> > > > > > > Maybe I forgot one more thing: the rows are taken from Cassandra.
> > > > > > >
> > > > > > > rows = LOAD 'cassandra://emailArchive/messagesMetaData' USING
> > > > > > > CassandraStorage() AS (key, columns: bag {T: tuple(name, value)});
> > > > > > >
> > > > > > > I have no idea how to format AS for a bag in foreach.
> > > > > > >
> > > > > > >
> > > > > > > P.
> > > > > > >
> > > > > > > 2011/4/24 Jacob Perkins <[email protected]>
> > > > > > >
> > > > > > > > Strange, that looks right to me. What happens if you try the 'AS'
> > > > > > > > statement anyhow?
> > > > > > > >
> > > > > > > > --jacob
> > > > > > > > @thedatachef
> > > > > > > >
> > > > > > > > On Sun, 2011-04-24 at 18:22 +0200, pob wrote:
> > > > > > > > > Hello,
> > > > > > > > >
> > > > > > > > > pom = foreach rows generate myUDF.toTuple($1); -- reading data
> > > > > > > > > describe pom
> > > > > > > > > pom: {y: {t: (domain: chararray,spam: int,size: long,time: float)}}
> > > > > > > > >
> > > > > > > > > data = foreach pom generate flatten($0);
> > > > > > > > > grunt> describe data;
> > > > > > > > > data: {y::domain: chararray,y::spam: int,y::size: long,y::time: float}
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I think they are cast fine, right?
> > > > > > > > >
> > > > > > > > > The UDF is a Python one with the decorator
> > > > > > > > > @outputSchema("y:bag{t:tuple(domain:chararray, spam:int, size:long, time:float)}")
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2011/4/24 Jacob Perkins <[email protected]>
> > > > > > > > >
> > > > > > > > > > You're getting a 'ClassCastException' because the contents of the bags
> > > > > > > > > > are DataByteArray and not long (or cannot be cast to long). I suspect
> > > > > > > > > > that you're generating the contents of the bag in some way from a UDF,
> > > > > > > > > > no?
> > > > > > > > > >
> > > > > > > > > > You need to either declare the output schema explicitly in the UDF or
> > > > > > > > > > just use the 'AS' statement. For example, say you have a UDF that sums
> > > > > > > > > > two numbers:
> > > > > > > > > >
> > > > > > > > > > data   = LOAD 'foobar' AS (a:int, b:int);
> > > > > > > > > > summed = FOREACH data GENERATE MyFancySummingUDF(a,b) AS (sum:int);
> > > > > > > > > > DUMP summed;
> > > > > > > > > >
> > > > > > > > > > --jacob
> > > > > > > > > > @thedatachef
> > > > > > > > > >
> > > > > > > > > > On Sun, 2011-04-24 at 18:02 +0200, pob wrote:
> > > > > > > > > > > x = foreach g2 generate group, data.(size);
> > > > > > > > > > > dump x;
> > > > > > > > > > >
> > > > > > > > > > > ((drm,0),{(464868)})
> > > > > > > > > > > ((drm,1),{(464868)})
> > > > > > > > > > > ((snezz,0),{(8073),(8073)})
> > > > > > > > > > >
> > > > > > > > > > > but:
> > > > > > > > > > > x = foreach g2 generate group, SUM(data.size);
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2011-04-24 18:02:18,910 [Thread-793] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0038
> > > > > > > > > > > org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing sum in Initial
> > > > > > > > > > > at org.apache.pig.builtin.LongSum$Initial.exec(LongSum.java:87)
> > > > > > > > > > > at org.apache.pig.builtin.LongSum$Initial.exec(LongSum.java:65)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:273)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:343)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:276)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
> > > > > > > > > > > at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> > > > > > > > > > > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> > > > > > > > > > > at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > > > > > > > > > > Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.Long
> > > > > > > > > > > at org.apache.pig.builtin.LongSum$Initial.exec(LongSum.java:79)
> > > > > > > > > > > ... 14 more
> > > > > > > > > > > 2011-04-24 18:02:19,213 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0038
> > > > > > > > > > > 2011-04-24 18:02:19,213 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> > > > > > > > > > > 2011-04-24 18:02:24,215 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local_0038 has failed! Stop running all dependent jobs
> > > > > > > > > > > 2011-04-24 18:02:24,216 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> > > > > > > > > > > 2011-04-24 18:02:24,216 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> > > > > > > > > > > 2011-04-24 18:02:24,216 [main] INFO org.apache.pig.tools.pigstats.PigStats - Detected Local mode. Stats reported below may be incomplete
> > > > > > > > > > > 2011-04-24 18:02:24,216 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> > > > > > > > > > >
> > > > > > > > > > > Pig Stack Trace
> > > > > > > > > > > ---------------
> > > > > > > > > > > ERROR 1066: Unable to open iterator for alias x
> > > > > > > > > > >
> > > > > > > > > > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias x
> > > > > > > > > > >         at org.apache.pig.PigServer.openIterator(PigServer.java:754)
> > > > > > > > > > >         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
> > > > > > > > > > >         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
> > > > > > > > > > >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> > > > > > > > > > >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
> > > > > > > > > > >         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
> > > > > > > > > > >         at org.apache.pig.Main.run(Main.java:465)
> > > > > > > > > > >         at org.apache.pig.Main.main(Main.java:107)
> > > > > > > > > > > Caused by: java.io.IOException: Job terminated with anomalous status FAILED
> > > > > > > > > > >         at org.apache.pig.PigServer.openIterator(PigServer.java:744)
> > > > > > > > > > >         ... 7 more
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
