[jira] Updated: (PIG-978) ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) and ERROR 2999: (Unexpected internal error. null) when using Multi-Query optimization

2009-12-01 Thread Corinne Chandel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corinne Chandel updated PIG-978:


Attachment: pig-latin-users-guide-2.patch

Patch #2 (moved EXEC statement).

Oops. I actually walked over and asked Richard where the EXEC should go, but I 
guess things got mixed up.

 ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) 
 and ERROR 2999: (Unexpected internal error. null) when using Multi-Query 
 optimization
 ---

 Key: PIG-978
 URL: https://issues.apache.org/jira/browse/PIG-978
 Project: Pig
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Assignee: Corinne Chandel
 Fix For: 0.6.0

 Attachments: pig-latin-users-guide-2.patch, 
 pig-latin-users-guide.patch


 I have a Pig script of this form, which I execute using multi-query 
 optimization.
 {code}
 A = load '/user/viraj/firstinput' using PigStorage();
 B = group 
 C = ..aggregation function
 store C into '/user/viraj/firstinputtempresult/days1';
 ..
 Atab = load '/user/viraj/secondinput' using PigStorage();
 Btab = group 
 Ctab = ..aggregation function
 store Ctab into '/user/viraj/secondinputtempresult/days1';
 ..
 E = load '/user/viraj/firstinputtempresult/' using PigStorage();
 F = group 
 G = aggregation function
 store G into '/user/viraj/finalresult1';
 Etab = load '/user/viraj/secondinputtempresult/' using PigStorage();
 Ftab = group 
 Gtab = aggregation function
 store Gtab into '/user/viraj/finalresult2';
 {code}
 The script fails with the following error:
 2009-07-20 22:05:44,507 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
 ERROR 2100: hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist. 
 Details at logfile: /homes/viraj/pigscripts/pig_1248127173601.log
 This error is due to a mismatch between the store and load commands. The 
 script first stores files into the 'days1' directory (store C into 
 '/user/viraj/firstinputtempresult/days1' using PigStorage();), but it later 
 loads from the top-level directory (E = load 
 '/user/viraj/firstinputtempresult/' using PigStorage()) instead of the 
 directory it wrote to (/user/viraj/firstinputtempresult/days1).
 The current multi-query optimizer cannot detect the dependency between these 
 two commands because their file paths differ, so the jobs run concurrently 
 and produce the errors.
 The workaround is to add an 'exec' or 'run' command after the first two 
 stores. This forces the first two store commands to run before the remaining 
 commands.
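 A minimal sketch of that workaround follows. The (key, val) schema and the 
 SUM aggregation are hypothetical placeholders, because the script above 
 elides the group keys and aggregation functions; only the paths and the 
 position of 'exec' are taken from this report.
 {code}
 -- Hypothetical schema and aggregation; the original script elides both.
 A = load '/user/viraj/firstinput' using PigStorage() as (key:chararray, val:long);
 B = group A by key;
 C = foreach B generate group as key, SUM(A.val) as total;
 store C into '/user/viraj/firstinputtempresult/days1';

 Atab = load '/user/viraj/secondinput' using PigStorage() as (key:chararray, val:long);
 Btab = group Atab by key;
 Ctab = foreach Btab generate group as key, SUM(Atab.val) as total;
 store Ctab into '/user/viraj/secondinputtempresult/days1';

 -- Force the two stores above to execute before anything below is run.
 exec;

 E = load '/user/viraj/firstinputtempresult/' using PigStorage() as (key:chararray, total:long);
 F = group E by key;
 G = foreach F generate group as key, SUM(E.total) as total;
 store G into '/user/viraj/finalresult1';

 Etab = load '/user/viraj/secondinputtempresult/' using PigStorage() as (key:chararray, total:long);
 Ftab = group Etab by key;
 Gtab = foreach Ftab generate group as key, SUM(Etab.total) as total;
 store Gtab into '/user/viraj/finalresult2';
 {code}
 Making the load paths match the store paths exactly (loading from 
 '.../days1') would be the other way to let the optimizer see the dependency.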
 It would be nice to see this fixed as part of an enhancement to multi-query. 
 We could either disable multi-query in this case or throw a warning/error 
 message, so that the user can correct the load/store statements.
 Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-978) ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) and ERROR 2999: (Unexpected internal error. null) when using Multi-Query optimization

2009-12-01 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-978:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to both the trunk and 0.6 branch. Thanks, Corinne!




[jira] Updated: (PIG-978) ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) and ERROR 2999: (Unexpected internal error. null) when using Multi-Query optimization

2009-11-30 Thread Corinne Chandel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corinne Chandel updated PIG-978:


Attachment: pig-latin-users-guide.patch

Patch file.

Updated Pig Latin Users Guide: Implicit Dependencies




[jira] Updated: (PIG-978) ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) and ERROR 2999: (Unexpected internal error. null) when using Multi-Query optimization

2009-11-30 Thread Corinne Chandel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corinne Chandel updated PIG-978:


Status: Patch Available  (was: Open)

Apply patch to trunk: http://svn.apache.org/repos/asf/hadoop/pig/trunk

Note: No new test code required; changes to documentation only.
