Update on my problem:
I have tuples of varying length. I am trying to convert them into tuples with
only one field (each field is a map).
Original data:
dump entryArray;
([symbol#HIG,security_type#EQUITY,foreign_entry_id#743094])
([symbol#PEW,security_type#EQUITY,foreign_entry_id#743084])
([symbol#AFFY,security_type#EQUITY,foreign_entry_id#5585],[symbol#RFG,security_type#ETF,foreign_entry_id#5586],[symbol#SCHW,security_type#EQUITY,foreign_entry_id#5587],[symbol#VWO,security_type#ETF,foreign_entry_id#5588])
I would like the output to be (each field should still be a map for further use):
([symbol#HIG,security_type#EQUITY,foreign_entry_id#743094])
([symbol#PEW,security_type#EQUITY,foreign_entry_id#743084])
([symbol#AFFY,security_type#EQUITY,foreign_entry_id#5585])
([symbol#RFG,security_type#ETF,foreign_entry_id#5586])
([symbol#SCHW,security_type#EQUITY,foreign_entry_id#5587])
([symbol#VWO,security_type#ETF,foreign_entry_id#5588])
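To make the intended transformation concrete, here is a minimal sketch in Python rather than Pig. It is only a model of the semantics I am after, not a Pig solution: Pig tuples are modelled as Python tuples and Pig maps as dicts, and the names `split_tuples`, `entry_array`, `entry` and `security_types` are made up for this illustration.

```python
# Illustrative model only (an assumption, not Pig code):
# a Pig tuple is modelled as a Python tuple, a Pig map as a dict.

def split_tuples(rows):
    """Return one single-field tuple per map found in the input tuples."""
    out = []
    for row in rows:
        for field in row:
            # Each field stays a dict (map), wrapped in its own 1-tuple.
            out.append((field,))
    return out


# Sample data copied from the dump of entryArray above.
entry_array = [
    ({"symbol": "HIG", "security_type": "EQUITY", "foreign_entry_id": "743094"},),
    ({"symbol": "PEW", "security_type": "EQUITY", "foreign_entry_id": "743084"},),
    (
        {"symbol": "AFFY", "security_type": "EQUITY", "foreign_entry_id": "5585"},
        {"symbol": "RFG", "security_type": "ETF", "foreign_entry_id": "5586"},
        {"symbol": "SCHW", "security_type": "EQUITY", "foreign_entry_id": "5587"},
        {"symbol": "VWO", "security_type": "ETF", "foreign_entry_id": "5588"},
    ),
]

entry = split_tuples(entry_array)

# Because every field is still a map, a key lookup keeps working afterwards.
security_types = [t[0]["security_type"] for t in entry]
```

The point of the sketch is the last line: after the split, each single field must still behave as a map so that a lookup by key such as `security_type` succeeds.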
I have tried `entry = FOREACH entryArray GENERATE FLATTEN(TOBAG(*));`. The output
has the same format, but it seems that the field is no longer a map:
entry = FOREACH entryArray GENERATE FLATTEN(TOBAG(*));
dump entry;
([symbol#HIG,security_type#EQUITY,foreign_entry_id#743094])
([symbol#PEW,security_type#EQUITY,foreign_entry_id#743084])
([symbol#AFFY,security_type#EQUITY,foreign_entry_id#5585])
([symbol#RFG,security_type#ETF,foreign_entry_id#5586])
([symbol#SCHW,security_type#EQUITY,foreign_entry_id#5587])
([symbol#VWO,security_type#ETF,foreign_entry_id#5588])
security_type = FOREACH entry GENERATE FLATTEN($0#'security_type');
It throws:
ERROR 1052: Cannot cast bytearray to map with schema :map
org.apache.pig.impl.logicalLayer.validators.TypeCheckerException: ERROR 1059: <line 18, column 16> Problem while reconciling output schema of ForEach
at org.apache.pig.newplan.logical.visitor.TypeCheckingRelVisitor.throwTypeCheckerException(TypeCheckingRelVisitor.java:141)
at org.apache.pig.newplan.logical.visitor.TypeCheckingRelVisitor.visit(TypeCheckingRelVisitor.java:181)
at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:75)
......
Any suggestions would be much appreciated. Thanks!
From: "Yahoo! Inc." <[email protected]>
Date: Wednesday, July 24, 2013 10:54 PM
To: "[email protected]" <[email protected]>
Subject: Split tuple with multiple fields to tuples with single field in Pig
Hi pig-users,
I have tuples of varying length. I am trying to convert them into tuples with
only one field (each field is a map).
Original data:
([symbol#HIG,security_type#EQUITY,foreign_entry_id#743094])
([symbol#PEW,security_type#EQUITY,foreign_entry_id#743084])
([symbol#AFFY,security_type#EQUITY,foreign_entry_id#5585],[symbol#RFG,security_type#ETF,foreign_entry_id#5586],[symbol#SCHW,security_type#EQUITY,foreign_entry_id#5587],[symbol#VWO,security_type#ETF,foreign_entry_id#5588])
I would like the output to be:
([symbol#HIG,security_type#EQUITY,foreign_entry_id#743094])
([symbol#PEW,security_type#EQUITY,foreign_entry_id#743084])
([symbol#AFFY,security_type#EQUITY,foreign_entry_id#5585])
([symbol#RFG,security_type#ETF,foreign_entry_id#5586])
([symbol#SCHW,security_type#EQUITY,foreign_entry_id#5587])
([symbol#VWO,security_type#ETF,foreign_entry_id#5588])
I have tried: FOREACH entryArray GENERATE FLATTEN(TOBAG(*));
It returns (only the first field of each tuple):
([symbol#HIG,security_type#EQUITY,foreign_entry_id#743094])
([symbol#PEW,security_type#EQUITY,foreign_entry_id#743084])
([symbol#AFFY,security_type#EQUITY,foreign_entry_id#5585])
Any suggestions would be much appreciated. Thanks!
Regards,
Dan