Hi,

Did you get a chance to look into the PiggyBank String functions?

http://pig.apache.org/docs/r0.7.0/api/org/apache/pig/piggybank/evaluation/string/package-summary.html

I guess you need to use the SUBSTRING function.

    REGISTER <path-to-piggybank>/piggybank.jar;
    DEFINE StrSub org.apache.pig.piggybank.evaluation.string.SUBSTRING();
    ...

Now you can use the SUBSTRING function as StrSub:

    B = FOREACH A GENERATE StrSub(sid,1,64);

Hope it helps.

Sumit

________________________________
From: Vincent Barat <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Wed, 20 April, 2011 7:37:03 PM
Subject: How to remove the field key from bags tuples after a GROUP ?

Hi,

First, I group 2 tables using a key (named sid):

    rich_sessions = GROUP sessions BY sid, activities BY sid;

After this operation, all the tuples in the bag "activities" start with the same "sid" field. This field is long (64 bytes), and I would like to remove it from all activity tuples in order to save space before storing rich_sessions in a file.

Is there any way to do this?

Thanks for your help,
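Regarding the question of dropping "sid" from the tuples inside the "activities" bag: one possible approach is to project only the non-key fields of the bag after the GROUP. The following is only a rough sketch, not a tested solution. The load paths, the output path, and every field name other than sid (start_time, ts, action) are hypothetical, since the thread does not give the real schemas.

```pig
-- Hypothetical inputs: paths and all fields except sid are assumptions.
sessions   = LOAD 'sessions'   AS (sid:chararray, start_time:long);
activities = LOAD 'activities' AS (sid:chararray, ts:long, action:chararray);

-- Same grouping as in the original message.
rich_sessions = GROUP sessions BY sid, activities BY sid;

-- The key is already available once per row as "group", so the bag
-- projection below keeps only the remaining activity fields.
slim_sessions = FOREACH rich_sessions GENERATE
    group AS sid,
    sessions,
    activities.(ts, action) AS activities_no_sid;

-- Hypothetical output path.
STORE slim_sessions INTO 'rich_sessions_slim';
```

If the field names are not known, positional references such as activities.($1, $2) should work in the same place.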
