[ https://issues.apache.org/jira/browse/PIG-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16340555#comment-16340555 ]
Koji Noguchi commented on PIG-4608:
-----------------------------------

{quote}I didn't see UPDATE/DROP in a single statement in the example, are we not going to support both in the same statement? I actually prefer those in the same statement, as I feel users usually think about adjusting all columns at the same time.
{quote}
This may be because of what I requested in one of my previous comments:

"For now, can we just require separate statements for update and delete ?"

I just wanted to keep it simple and leave the combining part for later, when we have more use cases. Also, I'm afraid of confusion from overlapping indexes/fields.

Say, {{A:(f0:int, f1:int, f2:int, f3:int)}}
{code:java}
B = FOREACH A drop f1 , update 2 with $1 ;
{code}
Is the code updating {{f2}} with the value of {{f1}}? Or updating {{f3}} with the value of {{f2}}? Or something else?

> FOREACH ... UPDATE
> ------------------
>
>                 Key: PIG-4608
>                 URL: https://issues.apache.org/jira/browse/PIG-4608
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Haley Thrapp
>            Priority: Major
>
> I would like to propose a new command in Pig, FOREACH...UPDATE.
> Syntactically, it would look much like FOREACH … GENERATE.
> Example:
> Input data:
> (1,2,3)
> (2,3,4)
> (3,4,5)
> -- Load the data
> three_numbers = LOAD 'input_data'
> USING PigStorage()
> AS (f1:int, f2:int, f3:int);
> -- Sum up the row
> updated = FOREACH three_numbers UPDATE
> 5 as f1,
> f1+f2 as new_sum
> ;
> Dump updated;
> (5,2,3,3)
> (5,3,4,5)
> (5,4,5,7)
> Fields to update must be specified by alias. Any fields in the UPDATE that do
> not match an existing field will be appended to the end of the tuple.
> This command is particularly desirable in scripts that deal with a large
> number of fields (in the 20-200 range). Often, we need to only make
> modifications to a few fields. The FOREACH ... UPDATE statement allows the
> developer to focus on the actual logical changes instead of having to list
> all of the fields that are also being passed through.
> My team has prototyped this with changes to FOREACH ... GENERATE. We believe
> this can be done with changes to the parser and the creation of a new
> LOUpdate. No physical plan changes should be needed because we will leverage
> what LOGenerate does.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
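
For illustration only, here is a minimal Python sketch of the two points discussed in this message. It is not Pig code and not the prototype mentioned in the description. The function names (`foreach_update`, `drop_update_input_positions`, `drop_update_output_positions`) and the sample values are hypothetical. The first function models the UPDATE semantics as the description's example suggests them (replace by alias, append unknown aliases, evaluate every expression against the original input tuple). The other two model the two readings of the combined drop/update statement questioned in the comment.

```python
def foreach_update(schema, row, updates):
    """Apply (alias, expression) updates to one tuple.

    Assumption: every expression sees the ORIGINAL input row, which is what
    the example output in the issue description implies.
    """
    fields = dict(zip(schema, row))
    out_schema = list(schema)
    out = list(row)
    for alias, expr in updates:
        value = expr(fields)
        if alias in out_schema:
            # Existing alias: overwrite the field in place.
            out[out_schema.index(alias)] = value
        else:
            # Unknown alias: append a new field to the end of the tuple.
            out_schema.append(alias)
            out.append(value)
    return out_schema, tuple(out)


def drop_update_input_positions(row, drop_idx, target_idx, source_idx):
    """Reading 1: positions refer to the input schema; the drop happens last."""
    out = list(row)
    out[target_idx] = row[source_idx]
    del out[drop_idx]
    return tuple(out)


def drop_update_output_positions(row, drop_idx, target_idx, source_idx):
    """Reading 2: the drop happens first; positions refer to what remains."""
    out = list(row)
    del out[drop_idx]
    out[target_idx] = out[source_idx]
    return tuple(out)


# The example from the issue description.
schema = ["f1", "f2", "f3"]
updates = [("f1", lambda r: 5), ("new_sum", lambda r: r["f1"] + r["f2"])]
rows = [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
updated = [foreach_update(schema, row, updates)[1] for row in rows]
print(updated)  # [(5, 2, 3, 3), (5, 3, 4, 5), (5, 4, 5, 7)]

# A:(f0, f1, f2, f3) with "drop f1, update 2 with $1" under both readings.
a = (10, 11, 12, 13)
reading_1 = drop_update_input_positions(a, 1, 2, 1)   # f2 gets the value of f1
reading_2 = drop_update_output_positions(a, 1, 2, 1)  # f3 gets the value of f2
print(reading_1, reading_2)  # (10, 11, 13) (10, 12, 12)
```

The two readings give different tuples for the same statement, which is the ambiguity that keeping update and drop in separate statements would avoid.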