Alan Gates
Thu, 10 Apr 2008 12:23:29 -0700
cs = foreach gs generate flatten(group), COUNT ( logs.$1);

This will produce output with three fields: logs.$1, logs.$2, and the count (that is, the inner tuple will be 'flattened' into the outer tuple).
If you truly want to maintain the group tuple in the storage you will need to store your output with BinStorage or some other storage function that can handle nested data.
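As a rough sketch, here are the two options applied to the script quoted below. The alias cs_nested and the output path '/test/case1_nested' are made up for illustration, and the second variant assumes the built-in BinStorage function is available in your Pig version; I have not run either against your data.

```pig
-- Option 1: flatten the group tuple so PigStorage can write flat rows.
logs = load '/test/action_log_2008-03-30.log' using PigStorage(',');
gs = group logs by ($1, $2);
cs = foreach gs generate flatten(group), COUNT(logs.$1);
store cs into '/test/case1';

-- Option 2: keep the nested group tuple and store it with BinStorage.
-- (cs_nested and '/test/case1_nested' are hypothetical names.)
cs_nested = foreach gs generate group, COUNT(logs.$1);
store cs_nested into '/test/case1_nested' using BinStorage();
```

With option 1 each stored row should hold the two group keys followed by the count, separated by the storage function's delimiter. With option 2 the group stays a nested tuple, so the output would need to be read back with a loader that understands that format.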
Alan.

mickey hsieh wrote:
Any idea?
Here is my script:
logs = load '/test/action_log_2008-03-30.log' using PigStorage(',') ;
gs = group logs by ( $1, $2) ;
cs = foreach gs generate group, COUNT ( logs.$1);
store cs into '/test/case1' ;
java.lang.RuntimeException: java.io.IOException: Unable to convert non-flat tuple to string.
    at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce$ReduceDataOutputCollector.add(PigMapReduce.java:358)
    at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
    at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:232)
    at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
    at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:98)
    at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:38)
    at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:263)
    at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:88)
    at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.reduce(PigMapReduce.java:159)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:333)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)