pig-user  

Re: Unable to convert non-flat tuple to string

Alan Gates
Thu, 10 Apr 2008 12:23:29 -0700

The output of your foreach line will have two fields: the first is group, a tuple containing logs.$1 and logs.$2, and the second is a number generated by the COUNT function. PigStorage (the default storage function) cannot store nested types (tuples, bags, maps), so it fails when asked to store cs. The easiest fix is to modify your script to:

cs = foreach gs generate flatten(group), COUNT ( logs.$1);

This will produce output with three fields: logs.$1, logs.$2, and the count (that is, the inner tuple will be 'flattened' into the outer tuple).
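
Put together, the whole script with that one change might look like the sketch below. The paths and aliases are taken from the original script; only the generate clause differs, and the comments describe the expected record shape rather than actual output.

```pig
-- Load the comma-delimited log; fields are referenced by position ($0, $1, ...).
logs = load '/test/action_log_2008-03-30.log' using PigStorage(',');

-- Group on the pair ($1, $2); 'group' is then a two-field tuple.
gs = group logs by ($1, $2);

-- flatten(group) expands the tuple into two top-level fields,
-- so each record is flat: ($1 value, $2 value, count).
cs = foreach gs generate flatten(group), COUNT(logs.$1);

-- The default PigStorage can write this because no field is nested.
store cs into '/test/case1';
```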

If you truly want to keep the group tuple intact in storage, you will need to store your output with BinStorage or some other storage function that can handle nested data.
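
A minimal sketch of that alternative, assuming BinStorage is available in your Pig build and takes no arguments. The output is then in Pig's binary format rather than delimited text, so it would need to be read back with the same function; the alias nested is only illustrative.

```pig
logs = load '/test/action_log_2008-03-30.log' using PigStorage(',');
gs = group logs by ($1, $2);

-- Keep 'group' as a nested tuple: each record is ((v1, v2), count).
cs = foreach gs generate group, COUNT(logs.$1);

-- BinStorage can serialize nested tuples, bags, and maps.
store cs into '/test/case1' using BinStorage();

-- Later, load the nested records back with the same storage function.
nested = load '/test/case1' using BinStorage();
```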

Alan.

mickey hsieh wrote:
Any idea?

here is my script

logs = load '/test/action_log_2008-03-30.log' using PigStorage(',') ;
gs = group logs by ( $1, $2) ;
cs = foreach gs generate group, COUNT ( logs.$1);
store cs into '/test/case1' ;


java.lang.RuntimeException: java.io.IOException: Unable to convert non-flat tuple to string.
at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce$ReduceDataOutputCollector.add(PigMapReduce.java:358)
at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:232)
at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:98)
at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:38)
at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:263)
at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:88)
at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.reduce(PigMapReduce.java:159)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:333)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)