[jira] Updated: (PIG-994) Provide 'append' keyword to allow appending to diferent dataset once the feature is available in Hadoop
[ https://issues.apache.org/jira/browse/PIG-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rekha updated PIG-994:
--

Tags: append, update, hadoop 0.20 (was: append, hadoop 0.20)

Provide 'append' keyword to allow appending to diferent dataset once the feature is available in Hadoop
---

Key: PIG-994
URL: https://issues.apache.org/jira/browse/PIG-994
Project: Pig
Issue Type: New Feature
Components: impl
Affects Versions: 0.4.0
Environment: Grid clusters
Reporter: Rekha
Priority: Minor

Provide 'append' keyword to allow appending to a different dataset on Pig 0.5.0, as it is now on Hadoop 0.20 (which has the append feature).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-994) Provide 'append' keyword to allow appending to diferent dataset once the feature is available in Hadoop
[ https://issues.apache.org/jira/browse/PIG-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rekha updated PIG-994:
--

Thanks, Alan. I mostly favor the 'option on store' approach, and definitely so if the two approaches are mutually exclusive. For argument's sake, though, a keyword approach could be considered in addition. I am hoping that append will open the door to patching an update feature into the Pig API along similar lines (and hopefully as part of the same JIRA ticket).

My idea of update is a syntax like

update DS1 by (join_keys) from DS2 by (join_keys) parallel $PARALLEL

This will update dataset 1 (DS1) with data from dataset 2 (DS2), matching rows on the join keys.

{code}
update b by (join_key1, join_key2) from c by (join_key1, join_key2); -- this updates the dataset b directly
-- or alternatively, making it two-step:
-- x = update b by (join_key1, join_key2) from c by (join_key1, join_key2);
z = foreach b generate $0, $32, $50; -- in case you are taking only a few columns from main (b), new (c)
store z into 'bla' append; -- appends the output data into 'bla' directly
{code}

For the append case, the construct below would be another way of doing it.

{code}
append b, c; -- appends directly into b
z = foreach b generate $0, $32, $50; -- in case you are taking only a few columns from main (b), new (c)
store z into 'bla';
{code}
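
To make the intent of the proposed `update ... by ... from ... by ...` and `append` constructs in the comment above concrete, here is a minimal Python sketch over in-memory rows. It is not Pig code and not an implementation from the ticket. The helper names `update_by_key` and `append` are hypothetical, and the semantics are assumptions, since the comment does not spell them out: a row of DS1 is replaced by the DS2 row with the same join key, unmatched DS1 rows are kept, unmatched DS2 rows are ignored, and if DS2 repeats a key the last row wins.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Row = Tuple
KeyFn = Callable[[Row], Hashable]


def update_by_key(ds1: Sequence[Row], key1: KeyFn,
                  ds2: Sequence[Row], key2: KeyFn) -> List[Row]:
    """Assumed 'update' semantics: replace DS1 rows that have a key match in DS2."""
    # Index DS2 by join key; on duplicate keys the last row wins (assumption).
    replacements = {key2(row): row for row in ds2}
    # Preserve DS1 order; swap in the DS2 row only where the key matches.
    return [replacements.get(key1(row), row) for row in ds1]


def append(ds1: Sequence[Row], ds2: Sequence[Row]) -> List[Row]:
    """Assumed 'append' semantics: DS2 rows are added after DS1 rows, no matching."""
    return list(ds1) + list(ds2)


# Hypothetical sample data: the first two fields act as the join keys.
b = [(1, "x", "old"), (2, "y", "old"), (3, "z", "old")]
c = [(2, "y", "new"), (4, "w", "new")]


def join_key(row: Row) -> Hashable:
    return (row[0], row[1])


updated = update_by_key(b, join_key, c, join_key)
appended = append(b, c)
```

Under these assumptions, `updated` keeps three rows with only the `(2, "y")` row changed, while `appended` holds all five rows, which is the distinction the comment draws between the two keywords.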
[jira] Updated: (PIG-994) Provide 'append' keyword to allow appending to diferent dataset once the feature is available in Hadoop
[ https://issues.apache.org/jira/browse/PIG-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-994:
---

Summary: Provide 'append' keyword to allow appending to diferent dataset once the feature is available in Hadoop (was: Provide 'append' keyword to allow appending to diferent dataset on pig 0.5.0 as it is now on hadoop 0.20(which has append feature))