[jira] Commented: (PIG-1167) [zebra] Zebra does not support Hadoop Globs
[ https://issues.apache.org/jira/browse/PIG-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796259#action_12796259 ]

Chao Wang commented on PIG-1167:

Patch looks good, +1.

[zebra] Zebra does not support Hadoop Globs
---
Key: PIG-1167
URL: https://issues.apache.org/jira/browse/PIG-1167
Project: Pig
Issue Type: Improvement
Reporter: Olga Natkovich
Assignee: Yan Zhou
Fix For: 0.6.0, 0.7.0
Attachments: PIG-1167.patch

Passing the following path to Zebra causes an error, but the same path works with Hadoop directly:
/projects/FETL/sample/ABF1/{2009120204}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1167) [zebra] Zebra does not support Hadoop Globs
[ https://issues.apache.org/jira/browse/PIG-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1167:

Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed to both Apache trunk and the 0.6 branch.
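For readers unfamiliar with the glob form in the PIG-1167 report, here is a minimal sketch of brace alternation, the feature the path `/projects/FETL/sample/ABF1/{2009120204}` relies on. It is not Zebra's or Hadoop's implementation: the function name `expand_braces` and the second example path are made up for illustration, and the sketch makes no attempt to handle escapes, wildcards, or nested groups properly.

```python
import re


def expand_braces(pattern):
    """Expand {a,b,...} groups into a list of concrete paths (illustrative only)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        # No brace group left: the pattern is already a concrete path.
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        # Substitute one alternative, then expand any remaining groups.
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


# A group with a single alternative, as in the report, names exactly one path.
print(expand_braces("/projects/FETL/sample/ABF1/{2009120204}"))
# Hypothetical path with two groups, expanded as a cross product.
print(expand_braces("/data/{2009,2010}/part-{a,b}"))
```

Under this reading, the reported path is equivalent to `/projects/FETL/sample/ABF1/2009120204`, which is consistent with it working when handed to Hadoop directly.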
[jira] Commented: (PIG-1094) Fix unit tests corresponding to source changes so far
[ https://issues.apache.org/jira/browse/PIG-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796316#action_12796316 ]

Pradeep Kamath commented on PIG-1094:

+1 to PIG-1094_6.patch, patch committed - thanks Thejas! Here is the output of test-patch for the same:

[exec]
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 6 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]
[exec]
[exec]

Fix unit tests corresponding to source changes so far
-
Key: PIG-1094
URL: https://issues.apache.org/jira/browse/PIG-1094
Project: Pig
Issue Type: Sub-task
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
Attachments: PIG-1094.patch, PIG-1094_2.patch, PIG-1094_3.patch, PIG-1094_4.patch, PIG-1094_5.patch, PIG-1094_6.patch

The check-ins so far on the load-store-redesign branch have not addressed unit test failures due to interface changes. This jira is to track the task of making the common case unit tests work with the new interfaces. Some aspects of the new proposal, like using the LoadCaster interface for casting and making local mode work, have not been completed yet. Tests which are failing for those reasons will not be fixed in this jira and will be addressed in the jiras corresponding to those tasks.
[jira] Updated: (PIG-1090) Update sources to reflect recent changes in load-store interfaces
[ https://issues.apache.org/jira/browse/PIG-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1090:

Attachment: PIG-1090-9.patch

This patch replaced msStorage with a Configuration object in LOLoad and fixed the corresponding test cases. The results of the test-patch run:

{code}
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 15 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
{code}

Update sources to reflect recent changes in load-store interfaces
-
Key: PIG-1090
URL: https://issues.apache.org/jira/browse/PIG-1090
Project: Pig
Issue Type: Sub-task
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
Attachments: PIG-1090-2.patch, PIG-1090-3.patch, PIG-1090-4.patch, PIG-1090-6.patch, PIG-1090-7.patch, PIG-1090-8.patch, PIG-1090-9.patch, PIG-1090.patch, PIG-1190-5.patch

There have been some changes (as recorded in the Changes Section, Nov 2 2009 sub section of http://wiki.apache.org/pig/LoadStoreRedesignProposal) in the load/store interfaces. This jira is to track the task of making those changes under src. Changes under test will be addressed in a different jira.
[jira] Commented: (PIG-1090) Update sources to reflect recent changes in load-store interfaces
[ https://issues.apache.org/jira/browse/PIG-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796386#action_12796386 ]

Daniel Dai commented on PIG-1090:

+1 for PIG-1090-8.patch
[jira] Commented: (PIG-1172) PushDownForeachFlatten shall not push ForEach below Join if the flattened fields is used in Join
[ https://issues.apache.org/jira/browse/PIG-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796394#action_12796394 ]

Alan Gates commented on PIG-1172:

Changes look good, +1. The patch lists a new hadoop20.jar. Is this intentional?

PushDownForeachFlatten shall not push ForEach below Join if the flattened fields is used in Join
Key: PIG-1172
URL: https://issues.apache.org/jira/browse/PIG-1172
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.6.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.6.0
Attachments: PIG-1172-1.patch

Currently the following script will push B below D. But because the flattened column is used in the join, we cannot push it down.

A = load '1.txt' as (bg:bag{t:tuple(a0,a1)});
B = FOREACH A generate flatten($0);
C = load '3.txt' AS (c0, c1);
D = JOIN B by a1, C by c1;
E = limit D 10;
explain E;
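The condition named in the PIG-1172 title can be stated as a small predicate. The following is only a sketch of that rule, not the code in PIG-1172-1.patch: the function name `can_push_foreach_below_join` and the representation of fields as plain strings are assumptions made for illustration.

```python
def can_push_foreach_below_join(flattened_fields, join_keys):
    """Return True only when no join key is produced by the flatten.

    flattened_fields: names of the columns the FOREACH ... flatten() generates.
    join_keys: names of the columns the JOIN uses from that input.
    Both are hypothetical plain-string representations.
    """
    return not (set(flattened_fields) & set(join_keys))


# In the script from the report, flatten($0) yields a0 and a1, and the
# join uses "B by a1", so the foreach must stay above the join.
print(can_push_foreach_below_join(["a0", "a1"], ["a1"]))

# Hypothetical variant: the join key does not come from the flatten.
print(can_push_foreach_below_join(["a0", "a1"], ["k"]))
```

The first call models the reported script and returns False, so the optimizer should leave B where it is.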
[jira] Updated: (PIG-1172) PushDownForeachFlatten shall not push ForEach below Join if the flattened fields is used in Join
[ https://issues.apache.org/jira/browse/PIG-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1172:

Attachment: PIG-1172-2.patch

hadoop20.jar should not be in the patch. I reattached the patch. Thanks.
[jira] Commented: (PIG-1174) Creation of output path should be done by storage function
[ https://issues.apache.org/jira/browse/PIG-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796449#action_12796449 ]

Alan Gates commented on PIG-1174:

Delegating creation of the output path to the storage function is not trivial. The storage function is invoked on every reducer (or every mapper for map-only jobs). So delaying creation until the storage function runs will create a race condition that the storage functions will have to handle. And if the solution is just to let the first one win and have all the rest error out and ignore the error, for a large job this will still bombard the namenode with hundreds or thousands of bogus mkdir requests. It also has the problem that the storage functions that get an error can't tell whether it's really an error (there's old data there that they are overwriting) or they just lost the race and another function has already created the path.

We are reworking the way load and store functions interact with InputFormats and OutputFormats (see PIG-966 for full details). This will push the responsibility for file creation onto the OutputFormat. This may partially address your concerns.

Creation of output path should be done by storage function
--
Key: PIG-1174
URL: https://issues.apache.org/jira/browse/PIG-1174
Project: Pig
Issue Type: Bug
Reporter: Bill Graham

When executing a STORE command, Pig creates the output location before the storage function gets called. This causes problems with storage functions that have logic to determine the output location. See this thread:
http://www.mail-archive.com/pig-user%40hadoop.apache.org/msg01538.html

For example, when making a request like this:

STORE A INTO '/my/home/output' USING MultiStorage('/my/home/output','0', 'none', '\t');

Pig creates a file '/my/home/output' and then an exception is thrown when MultiStorage tries to make a directory under '/my/home/output'.

The workaround is to instead specify a dummy location as the first path, like so:

STORE A INTO '/my/home/output/temp' USING MultiStorage('/my/home/output','0', 'none', '\t');

Two changes should be made:
1. The path specified in the INTO clause should be available to the storage function so it doesn't need to be duplicated.
2. The creation of the output paths should be delegated to the storage function.
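The race described in the PIG-1174 comment can be modelled on a local filesystem. This is a rough sketch only: threads stand in for reducers, `os.mkdir` stands in for the namenode mkdir request, and the names `create_output_dir` and `simulate_tasks` are made up for illustration. It does not use any Pig or Hadoop API.

```python
import os
import tempfile
import threading


def create_output_dir(path, results, index):
    """One simulated task trying to create the shared output directory."""
    try:
        os.mkdir(path)
        results[index] = "created"
    except FileExistsError:
        # The task cannot tell whether it lost the race to a sibling task
        # or whether old output from an earlier job was already there.
        results[index] = "exists"


def simulate_tasks(task_count):
    """Run task_count simulated tasks against one fresh output path."""
    with tempfile.TemporaryDirectory() as base:
        output = os.path.join(base, "output")
        results = [None] * task_count
        tasks = [
            threading.Thread(target=create_output_dir, args=(output, results, i))
            for i in range(task_count)
        ]
        for task in tasks:
            task.start()
        for task in tasks:
            task.join()
        return results


results = simulate_tasks(8)
print(results.count("created"), "created,", results.count("exists"), "got 'exists'")
```

Exactly one task creates the directory and every other task issues a request that fails, which is the "bogus mkdir" traffic the comment warns about; the losing tasks see the same error they would see if stale output had been present before the job started.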
Hudson build is back to normal: Pig-trunk #658
See http://hudson.zones.apache.org/hudson/job/Pig-trunk/658/changes
[jira] Updated: (PIG-1175) Pig 0.6 Docs - Store v. Dump
[ https://issues.apache.org/jira/browse/PIG-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Corinne Chandel updated PIG-1175:

Attachment: PIG-1175.patch

Patch file.

Pig 0.6 Docs - Store v. Dump
Key: PIG-1175
URL: https://issues.apache.org/jira/browse/PIG-1175
Project: Pig
Issue Type: Task
Components: documentation
Affects Versions: 0.6.0
Reporter: Corinne Chandel
Fix For: 0.6.0
Attachments: PIG-1175.patch

Pig 0.6 Docs

(1) Pig Latin Ref Manual
    Update STORE
    Update DUMP (and move under Diagnostic Operators)

(2) Pig Latin User Guide
    Under Multi-Query Execution, add new section: Store v. Dump

Updates clarify how STORE and DUMP work with multi-query execution (optimization).
[jira] Created: (PIG-1176) Column Pruner issues in union of loader with and without schema
Column Pruner issues in union of loader with and without schema
---
Key: PIG-1176
URL: https://issues.apache.org/jira/browse/PIG-1176
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.6.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.7.0

The column pruner for union could fail if one source of the union has a schema and the other does not. For example, the following script fails:

{code}
a = load '1.txt' as (a0, a1, a2);
b = foreach a generate a0;
c = load '2.txt';
d = foreach c generate $0;
e = union b, d;
dump e;
{code}

However, this issue is in trunk only and is not applicable to the 0.6 branch.
[jira] Updated: (PIG-1176) Column Pruner issues in union of loader with and without schema
[ https://issues.apache.org/jira/browse/PIG-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1176:

Status: Patch Available (was: Open)
[jira] Updated: (PIG-1176) Column Pruner issues in union of loader with and without schema
[ https://issues.apache.org/jira/browse/PIG-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1176:

Attachment: PIG-1176-1.patch
[jira] Updated: (PIG-1172) PushDownForeachFlatten shall not push ForEach below Join if the flattened fields is used in Join
[ https://issues.apache.org/jira/browse/PIG-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1172:

Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Patch committed to both trunk and 0.6 branch.
[jira] Commented: (PIG-1176) Column Pruner issues in union of loader with and without schema
[ https://issues.apache.org/jira/browse/PIG-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796562#action_12796562 ]

Hadoop QA commented on PIG-1176:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12429411/PIG-1176-1.patch
against trunk revision 895753.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/165/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/165/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/165/console

This message is automatically generated.
[jira] Updated: (PIG-1176) Column Pruner issues in union of loader with and without schema
[ https://issues.apache.org/jira/browse/PIG-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1176:

Status: Patch Available (was: Open)
[jira] Updated: (PIG-1176) Column Pruner issues in union of loader with and without schema
[ https://issues.apache.org/jira/browse/PIG-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1176:

Status: Open (was: Patch Available)
[jira] Commented: (PIG-1173) pig cannot be built without an internet connection
[ https://issues.apache.org/jira/browse/PIG-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796564#action_12796564 ]

Daniel Dai commented on PIG-1173:

+1, will commit patch shortly.

pig cannot be built without an internet connection
--
Key: PIG-1173
URL: https://issues.apache.org/jira/browse/PIG-1173
Project: Pig
Issue Type: Bug
Reporter: Jeff Hodges
Priority: Minor
Attachments: offlinebuild-v2.patch, offlinebuild.patch

Pig's build.xml does not allow for offline building even when it's been built before. This is because the ivy-download target has no conditional associated with it to turn it off. Hadoop seems to be adding an unless=offline to the ivy-download target.