[jira] Assigned: (PIG-19) A=load causes parse error
[ https://issues.apache.org/jira/browse/PIG-19?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-19: - Assignee: Xuefu Zhang A=load causes parse error - Key: PIG-19 URL: https://issues.apache.org/jira/browse/PIG-19 Project: Pig Issue Type: Bug Components: grunt Reporter: Olga Natkovich Assignee: Xuefu Zhang Priority: Minor Fix For: 0.9.0 Parser expects spaces around =. This should be a minor change in src/org/apache/pig/tools/grunt/GruntParser.jj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
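A minimal sketch of the requested behaviour, written in Python rather than the JavaCC grammar in GruntParser.jj. The tokenize function and its token pattern are hypothetical and only illustrate treating = as its own token whether or not it is surrounded by spaces.

```python
import re

# Hypothetical token pattern: quoted strings, identifiers, '=' and ';'.
# Leading whitespace is skipped, but it is never required around '='.
TOKEN = re.compile(r"\s*('(?:[^'\\]|\\.)*'|[A-Za-z_][A-Za-z0-9_]*|=|;)")

def tokenize(statement):
    """Split a statement into tokens, with or without spaces around '='."""
    tokens, pos = [], 0
    while pos < len(statement):
        match = TOKEN.match(statement, pos)
        if not match:
            if statement[pos:].strip() == "":
                break  # only trailing whitespace is left
            raise ValueError(f"unexpected input at column {pos + 1}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens
```

With this pattern, A=load 'x'; and A = load 'x'; produce the same token list.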
[jira] Assigned: (PIG-313) Error handling aggregate of a computation
[ https://issues.apache.org/jira/browse/PIG-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-313: -- Assignee: Alan Gates Error handling aggregate of a computation - Key: PIG-313 URL: https://issues.apache.org/jira/browse/PIG-313 Project: Pig Issue Type: Bug Affects Versions: 0.2.0 Reporter: Pradeep Kamath Assignee: Alan Gates Priority: Minor Fix For: 0.9.0
Query which fails:
{code}
a = load ':INPATH:/singlefile/studenttab10k' as (name:chararray, age:int, gpa:double);
b = group a by name;
c = foreach b generate group, SUM(a.age*a.gpa);
store c into ':OUTPATH:';\,
{code}
Error output:
{quote}
2008-07-14 16:34:08,684 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: testhost.com:8020
2008-07-14 16:34:08,741 [main] WARN org.apache.hadoop.fs.FileSystem - testhost.com:8020 is a deprecated filesystem name. Use hdfs://testhost:8020/ instead.
2008-07-14 16:34:08,995 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: testhost.com:50020
2008-07-14 16:34:09,251 [main] WARN org.apache.hadoop.fs.FileSystem - testhost.com:8020 is a deprecated filesystem name. Use hdfs://testhost:8020/ instead.
2008-07-14 16:34:09,559 [main] ERROR org.apache.pig.PigServer - Cannot evaluate output type of Mul/Div Operator
2008-07-14 16:34:09,559 [main] ERROR org.apache.pig.PigServer - Problem resolving LOForEach schema
2008-07-14 16:34:09,559 [main] ERROR org.apache.pig.PigServer - Severe problem found during validation org.apache.pig.impl.plan.PlanValidationException: An unexpected exception caused the validation to stop
2008-07-14 16:34:09,560 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.io.IOException: Unable to store for alias: c
2008-07-14 16:34:09,560 [main] ERROR org.apache.pig.Main - java.io.IOException: Unable to store for alias: c
{quote}
[jira] Assigned: (PIG-333) MIN on strings (undeclared) gives strange error in store
[ https://issues.apache.org/jira/browse/PIG-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-333: -- Assignee: Alan Gates (was: Santhosh Srinivasan) MIN on strings (undeclared) gives strange error in store Key: PIG-333 URL: https://issues.apache.org/jira/browse/PIG-333 Project: Pig Issue Type: Bug Affects Versions: 0.2.0 Reporter: Pradeep Kamath Assignee: Alan Gates Priority: Minor Fix For: 0.9.0 Script which causes error: {code} a = load '/user/pig/tests/data/singlefile/votertab10k' as (name, age, registration, contribution); b = group a all; c = foreach b generate MIN(a.name), MAX(a.name); store c into '/tmp'; {code} Error: {noformat} 2008-07-23 11:31:15,415 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 0.0% complete 2008-07-23 11:31:19,167 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 50.0% complete 2008-07-23 11:31:43,431 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 100.0% complete 2008-07-23 11:31:45,956 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Unsuccessful attempt. 
Completed 0.0% of the job 2008-07-23 11:31:45,969 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) tip_20080723_0002_m_00 2008-07-23 11:31:45,974 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (reduce) tip_20080723_0002_r_00 java.io.IOException: Cannot store a non-flat tuple using PigStorage at org.apache.pig.builtin.PigStorage.putNext(PigStorage.java:163) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:117) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:90) at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:373) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:170) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:85) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124) java.io.IOException: Cannot store a non-flat tuple using PigStorage at org.apache.pig.builtin.PigStorage.putNext(PigStorage.java:163) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:117) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:90) at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:373) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:170) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:85) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391) at 
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124) java.io.IOException: Cannot store a non-flat tuple using PigStorage at org.apache.pig.builtin.PigStorage.putNext(PigStorage.java:163) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:117) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:90) at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:373) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:170) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:85) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124) java.io.IOException: Cannot store a non-flat tuple using PigStorage at org.apache.pig.builtin.PigStorage.putNext(PigStorage.java:163) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:117) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:90) at
[jira] Assigned: (PIG-144) The error message should be more meaningful when there is a typo in PIg script
[ https://issues.apache.org/jira/browse/PIG-144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-144: -- Assignee: Xuefu Zhang The error message should be more meaningful when there is a typo in PIg script -- Key: PIG-144 URL: https://issues.apache.org/jira/browse/PIG-144 Project: Pig Issue Type: Bug Reporter: Xu Zhang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.9.0
When I ran the following Pig script on the command line {{pig -c mycluster myscript.pig}}, I got the error: 2008-03-07 16:31:45,992 [main] ERROR org.apache.pig.tools.grunt.Grunt -
{code}
A = load '/user/pig/tests/data/singlefile/fileexists';
B = foreach A generate $2, $1, $0;
C = strean B through `awk '{print $3 $4 \t $2 \t $1}'`;
store C into '/user/pig/tests/data/singlefile/results1';
{code}
The error message is not very meaningful, and it took me a while to find out what was wrong: the word strean should have been stream.
[jira] Assigned: (PIG-356) map lookup on empty key should be disallowed at parse time
[ https://issues.apache.org/jira/browse/PIG-356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-356: -- Assignee: Xuefu Zhang map lookup on empty key should be disallowed at parse time -- Key: PIG-356 URL: https://issues.apache.org/jira/browse/PIG-356 Project: Pig Issue Type: Bug Affects Versions: 0.2.0 Reporter: Pradeep Kamath Assignee: Xuefu Zhang Priority: Minor Fix For: 0.9.0
Currently the following is allowed:
{code}
a = load 'testfile';
b = foreach a generate $0#'apple', $0#'mango', $0#'', flatten($1#'orange');
{code}
Looking up an empty key ($0#'') should be rejected at parse time.
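A rough sketch of the kind of parse-time check being asked for, in Python rather than in the Pig grammar. The check_map_keys function and its error text are hypothetical; a real fix would live in the parser, not in a regular expression over the script text.

```python
import re

# Hypothetical pattern for a map lookup with an empty key: <expr>#''
EMPTY_KEY = re.compile(r"#''")

def check_map_keys(statement):
    """Raise ValueError if the statement looks up an empty map key."""
    match = EMPTY_KEY.search(statement)
    if match:
        raise ValueError(f"empty map key at column {match.start() + 1}")
    return statement
```

A statement that only uses $0#'apple' passes through unchanged, while one containing $0#'' is rejected before any job is launched.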
[jira] Assigned: (PIG-548) ParseException involving as keyword
[ https://issues.apache.org/jira/browse/PIG-548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-548: -- Assignee: Xuefu Zhang ParseException involving as keyword -- Key: PIG-548 URL: https://issues.apache.org/jira/browse/PIG-548 Project: Pig Issue Type: Bug Affects Versions: 0.2.0 Reporter: Viraj Bhat Assignee: Xuefu Zhang Priority: Minor Fix For: 0.9.0 Attachments: assyntax.pig The enclosed Pig script, throws the following error: = org.apache.pig.tools.pigscript.parser.ParseException: Encountered as at line 13, column 11. Was expecting one of: EOF cat ... cd ... cp ... copyFromLocal ... copyToLocal ... dump ... describe ... explain ... help ... kill ... ls ... mv ... mkdir ... pwd ... quit ... register ... rm ... rmf ... set ... illustrate ... scriptDone ... ... EOL ... ; ... at org.apache.pig.tools.pigscript.parser.PigScriptParser.generateParseException(PigScriptParser.java:688) at org.apache.pig.tools.pigscript.parser.PigScriptParser.handle_invalid_command(PigScriptParser.java:515) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:356) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64) at org.apache.pig.Main.main(Main.java:306) = But the error seems to disappear if a few lines are moved around the foreach and as keywords. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (PIG-435) wrong columns produced if incomplete definition provided during load
[ https://issues.apache.org/jira/browse/PIG-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-435: -- Assignee: Alan Gates (was: Pradeep Kamath) wrong columns produced if incomplete definition provided during load Key: PIG-435 URL: https://issues.apache.org/jira/browse/PIG-435 Project: Pig Issue Type: Bug Affects Versions: 0.2.0 Reporter: Olga Natkovich Assignee: Alan Gates Priority: Minor Fix For: 0.9.0
Script:
A = load 'studenttab10k' as (name); -- note that data has more than 1 column
B = load 'votertab10k' as (name, age, reg, contrib);
D = COGROUP A by name, B by name;
E = foreach D generate flatten(A), flatten(B);
F = foreach E generate registration, contr;
dump F;
The dump produces the wrong columns. This is because even though we declared only one column, we actually load all columns of A. So any place where we explicitly or implicitly use A.*, as is the case with flatten, we would produce the wrong results. The long-term solution is to push projections into the load. In the shorter term, the proposal is to notice if the script uses A.* and stick a project after the load. Note that we don't need to do that if types are declared, because there will already be a casting foreach there.
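The short-term proposal (a project placed right after the load) can be pictured with a small Python sketch. This is not Pig code: project_to_schema and the sample row are made up for illustration, and the sketch assumes the loader hands back every column found in the file.

```python
def project_to_schema(rows, declared):
    """Keep only the first len(declared) fields of each loaded row.

    rows:     tuples as returned by a loader that reads every column.
    declared: column names from the 'as (...)' clause of the load.
    """
    width = len(declared)
    return [tuple(row[:width]) for row in rows]
```

With declared = ["name"], a loaded row such as ("alice", 20, 3.5) is cut down to ("alice",), so a later use of A.* only sees the declared column.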
[jira] Assigned: (PIG-673) several aggregate functions do not check the number of arguments and do not correctly check for a type bag
[ https://issues.apache.org/jira/browse/PIG-673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-673: -- Assignee: Daniel Dai several aggregate functions do not check the number of arguments and do not correctly check for a type bag Key: PIG-673 URL: https://issues.apache.org/jira/browse/PIG-673 Project: Pig Issue Type: Bug Environment: i686 i386 GNU/Linux Reporter: Araceli Henley Assignee: Daniel Dai Fix For: 0.9.0 DIFF expects two bags as the argument. But in this negative test case we pass: 1) there is a single argument to diff instead of two, 2) The argument should be a bag but is an int. TEST: AggregateFunc_190 A =LOAD '/user/pig/tests/data/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) ); B =GROUP A ALL; X =FOREACH B GENERATE DIFF( A.Fint) + DIFF( A.Fint); STORE X INTO '/user/pig/tests/results/araceli.1234381533/AggregateFunc_190.out' USING PigStorage(); ERROR 1000: Error during parsing. Atomic field expected but found non-atomic field TEST AggregateFunc_1901 A =LOAD '/user/pig/tests/data/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) ); B =GROUP A ALL; X =FOREACH B GENERATE DIFF( A.Fint, A.Fint) + DIFF( A.Fint, A.Fint); STORE X INTO '/user/pig/tests/results/araceli.1234467894/AggregateFunc_1901.out' USING PigStorage(); ERROR 1000: Error during parsing. 
Atomic field expected but found non-atomic field TEST AggregateFunc_1902 A =LOAD '/user/pig/tests/data/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) ); B =GROUP A ALL; X =FOREACH B GENERATE DIFF( A.Fint, A.Fint + A.Fint); STORE X INTO '/user/pig/tests/results/araceli.1234467894/AggregateFunc_1902.out' USING PigStorage(); throws error: ERROR 1039: Incompatible types in Add Operator left hand side:bag right hand side:bag -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (PIG-674) Improve errors in Pig parser
[ https://issues.apache.org/jira/browse/PIG-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-674: -- Assignee: Xuefu Zhang Improve errors in Pig parser Key: PIG-674 URL: https://issues.apache.org/jira/browse/PIG-674 Project: Pig Issue Type: Bug Reporter: Araceli Henley Assignee: Xuefu Zhang Priority: Minor Fix For: 0.9.0
These tests are for Aggregate Functions.
Recommended msg - Should indicate that this is an invalid cast. ERROR - MAX with int with invalid cast TEST: 106, PIG SCRIPT: A =LOAD ':INPATH:/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) );B =GROUP A ALL; X =FOREACH B GENERATE A.Fint, MAX( (invalid) A.Fint ); STORE X INTO ':OUTPATH:' USING PigStorage();\, CURRENT ERROR MESSAGE: ERROR 1000:.*Invalid alias: MAX,
Recommended msg - ERROR: invalid use of foreach with multiple functions and positional parameters TEST: 107, PIG SCRIPT: A =LOAD ':INPATH:/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) );B =GROUP A ALL; X =FOREACH A GENERATE SUM( A.$0), AVG( A.$0), COUNT( A.$0), MAX(A.$0), MIN( A.$0); STORE X INTO ':OUTPATH:' USING PigStorage();\, CURRENT ERROR MESSAGE: FIX: improve msg,
Recommended msg - ERROR 1052: Cannot cast bag with schema.*: bag ERROR: invalid use of MIN with int with valid cast TEST: 108, PIG SCRIPT: A =LOAD ':INPATH:/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) );B =GROUP A ALL; X =FOREACH B GENERATE A.Fint, MIN( (double) A.Fint ); STORE X INTO ':OUTPATH:' USING PigStorage();\, CURRENT ERROR MESSAGE: ERROR 1052: Cannot cast.*,
Recommended msg - ERROR - AVG needs bag TEST: 113, PIG SCRIPT: A =LOAD ':INPATH:/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) ); B = GROUP A ALL; X =FOREACH B GENERATE AVG( A.Fint); STORE X INTO ':OUTPATH:' USING PigStorage();\, CURRENT ERROR MESSAGE: ERROR 1052: Cannot cast bag with schema.*bag,
Recommended msg - this should indicate there was an invalid cast ERROR - AVG with int with invalid cast TEST: 115, PIG SCRIPT: A =LOAD ':INPATH:/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) );B =GROUP A ALL; X =FOREACH B GENERATE A.Fint, AVG( (invalid) A.Fint ); STORE X INTO ':OUTPATH:' USING PigStorage();\, CURRENT ERROR MESSAGE: ERROR 1000:.*Invalid alias: AVG,
Recommended msg - this should indicate that COUNT expects a bag for an argument ERROR - COUNT needs bag TEST: 118, PIG SCRIPT: A =LOAD ':INPATH:/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int,
[jira] Assigned: (PIG-671) typechecker does not throw an error when multiple arguments are passed to COUNT
[ https://issues.apache.org/jira/browse/PIG-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-671: -- Assignee: Daniel Dai typechecker does not throw an error when multiple arguments are passed to COUNT --- Key: PIG-671 URL: https://issues.apache.org/jira/browse/PIG-671 Project: Pig Issue Type: Bug Environment: i686 i386 GNU/Linux Reporter: Araceli Henley Assignee: Daniel Dai Priority: Trivial Fix For: 0.9.0
In this example, the aggregate function COUNT is passed multiple arguments and does not throw an error.
TEST: Aggregate_184
A =LOAD '/user/pig/tests/data/types/DataAll' USING PigStorage() AS ( Fint:int, Flong:long, Fdouble:double, Ffloat:float, Fchar:chararray, Fchararray:chararray, Fbytearray:bytearray, Fmap:map[], Fbag:BAG{ t:tuple( name, age, avg ) }, Ftuple:( name:chararray, age:int, avg:float) );
B =GROUP A ALL;
X =FOREACH B GENERATE COUNT ( A.$0, A.$0 );
STORE X INTO '/user/pig/tests/results/araceli.1234381533/AggregateFunc_184.out' USING PigStorage();
[jira] Assigned: (PIG-709) Handling of NULL in Pig builtin functions needs to be reviewed
[ https://issues.apache.org/jira/browse/PIG-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-709: -- Assignee: Daniel Dai (was: Alan Gates) Handling of NULL in Pig builtin functions needs to be reviewed -- Key: PIG-709 URL: https://issues.apache.org/jira/browse/PIG-709 Project: Pig Issue Type: Bug Components: impl Reporter: Santhosh Srinivasan Assignee: Daniel Dai Fix For: 0.9.0 Pig builtin functions do not handle NULL consistently. Some examples are the combiner versus non-combiner for AVG. All the builtins need a review of cases where NULL is handled. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
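To make the combiner versus non-combiner concern concrete, here is a small Python sketch, not Pig's AVG implementation. It assumes that NULL values are skipped, which is an assumption of this example and not a decision recorded in the issue; the function names are hypothetical. The point is only that both paths share one null rule, so they cannot disagree.

```python
def partial(values):
    """Combiner step: (sum, count) over the non-null values of one chunk."""
    present = [v for v in values if v is not None]
    return sum(present), len(present)

def merge(partials):
    """Final step: combine partial (sum, count) pairs into an average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    # An input with no non-null values has no average.
    return total / count if count else None

def avg_direct(values):
    """Non-combiner path: apply the same null rule to the whole input."""
    return merge([partial(values)])
```

Splitting the input into chunks and merging the partial results gives the same answer as the direct path, including for an all-NULL input.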
[jira] Assigned: (PIG-747) Logical to Physical Plan Translation fails when temporary alias are created within foreach
[ https://issues.apache.org/jira/browse/PIG-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-747: -- Assignee: Alan Gates (was: Daniel Dai) Logical to Physical Plan Translation fails when temporary alias are created within foreach -- Key: PIG-747 URL: https://issues.apache.org/jira/browse/PIG-747 Project: Pig Issue Type: Bug Affects Versions: 0.4.0 Reporter: Viraj Bhat Assignee: Alan Gates Fix For: 0.9.0 Attachments: physicalplan.txt, physicalplanprob.pig, PIG-747-1.patch
Consider the Pig script which calculates a new column F inside the foreach as:
{code}
A = load 'physicalplan.txt' as (col1,col2,col3);
B = foreach A {
    D = col1/col2;
    E = col3/col2;
    F = E - (D*D);
    generate F as newcol;
};
dump B;
{code}
This gives the following error:
===
Caused by: org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException: ERROR 2015: Invalid physical operators in the physical plan
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:377)
at org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:63)
at org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:29)
at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:68)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:908)
at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:122)
at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:41)
at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:246)
... 10 more
Caused by: org.apache.pig.impl.plan.PlanException: ERROR 0: Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide multiple outputs. This operator does not support multiple outputs.
at org.apache.pig.impl.plan.OperatorPlan.connect(OperatorPlan.java:158)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:89)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:373)
... 19 more
===
[jira] Commented: (PIG-1017) Converts strings to text in Pig
[ https://issues.apache.org/jira/browse/PIG-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909251#action_12909251 ] Alan Gates commented on PIG-1017: - Are we really going to do this? I doubt it now, as the backward incompatibility cost would be so high. At the very least I don't think we'll do it for 0.9. Converts strings to text in Pig --- Key: PIG-1017 URL: https://issues.apache.org/jira/browse/PIG-1017 Project: Pig Issue Type: Improvement Reporter: Sriranjan Manjunath Assignee: Sriranjan Manjunath Fix For: 0.9.0 Attachments: stotext.patch Strings in Java are UTF-16 and take at least 2 bytes per character. Text (org.apache.hadoop.io.Text) stores the data in UTF-8 and could show significant reductions in memory.
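The size difference behind this proposal can be checked with a few lines of Python. This only compares encoded payload sizes; it says nothing about Java object overhead or about how Text is implemented, and encoded_sizes is a name made up for this example.

```python
def encoded_sizes(text):
    """Return (utf16_bytes, utf8_bytes) for the given string."""
    # "utf-16-le" avoids the 2-byte byte-order mark that "utf-16" would add.
    return len(text.encode("utf-16-le")), len(text.encode("utf-8"))
```

For ASCII-only data such as 'studenttab10k' the UTF-8 form is half the size of the UTF-16 form. The saving shrinks for accented Latin characters (2 bytes either way) and reverses for most CJK characters (3 bytes in UTF-8 against 2 in UTF-16), so the benefit depends on the data being mostly ASCII.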
[jira] Assigned: (PIG-1092) Pig Latin Parser fails to recognize \n as a whitespace
[ https://issues.apache.org/jira/browse/PIG-1092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1092: --- Assignee: Xuefu Zhang Pig Latin Parser fails to recognize \n as a whitespace Key: PIG-1092 URL: https://issues.apache.org/jira/browse/PIG-1092 Project: Pig Issue Type: Bug Components: grunt Environment: RHEL linux Reporter: Yang Yang Assignee: Xuefu Zhang Priority: Minor Fix For: 0.9.0
The following Pig Latin script fails to parse:
a = load 'input_file' as
( field1 : int );
Note that there is no character after the as, so there is only one \n character between the as and the ( on the next line. Adding a whitespace character after as solves it.
[jira] Assigned: (PIG-1341) BinStorage cannot convert DataByteArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED
[ https://issues.apache.org/jira/browse/PIG-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1341: --- Assignee: Alan Gates (was: Richard Ding) BinStorage cannot convert DataByteArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED -- Key: PIG-1341 URL: https://issues.apache.org/jira/browse/PIG-1341 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Assignee: Alan Gates Fix For: 0.9.0 Attachments: PIG-1341.patch
The script reads in BinStorage data and tries to convert a column which is a DataByteArray to a Chararray.
{code}
raw = load 'sampledata' using BinStorage() as (col1,col2, col3);
--filter out null columns
A = filter raw by col1#'bcookie' is not null;
B = foreach A generate col1#'bcookie' as reqcolumn;
describe B;
--B: {reqcolumn: bytearray}
X = limit B 5;
dump X;
B = foreach A generate (chararray)col1#'bcookie' as convertedcol;
describe B;
--B: {convertedcol: chararray}
X = limit B 5;
dump X;
{code}
The first dump produces:
(36co9b55onr8s)
(36co9b55onr8s)
(36hilul5oo1q1)
(36hilul5oo1q1)
(36l4cj15ooa8a)
The second dump produces:
()
()
()
()
()
It also throws an error message: FIELD_DISCARDED_TYPE_CONVERSION_FAILED 5 time(s).
Viraj
[jira] Assigned: (PIG-1358) [piggybank] String functions should handle exceptions in a consistent manner
[ https://issues.apache.org/jira/browse/PIG-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1358: --- Assignee: Daniel Dai [piggybank] String functions should handle exceptions in a consistent manner - Key: PIG-1358 URL: https://issues.apache.org/jira/browse/PIG-1358 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Richard Ding Assignee: Daniel Dai Fix For: 0.9.0 The String functions in piggybank handle exceptions differently. Some catch all exceptions, some catch only ClassCastException, while others catch only ExecException. The exception handling code in these functions should be consistent.
[jira] Assigned: (PIG-1538) isTwoLevelAccessRequired() returns false for nested relation
[ https://issues.apache.org/jira/browse/PIG-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1538: --- Assignee: Alan Gates isTwoLevelAccessRequired() returns false for nested relation Key: PIG-1538 URL: https://issues.apache.org/jira/browse/PIG-1538 Project: Pig Issue Type: Wish Components: impl Affects Versions: 0.7.0 Reporter: Justin Hu Assignee: Alan Gates Priority: Minor Fix For: 0.9.0 Attachments: testcase.tgz A user depends on the isTwoLevelAccessRequired() method in a UDF and wishes the method returned TRUE for a nested schema (for example, a relation with a nested tuple).
[jira] Assigned: (PIG-1499) Type error message does not include complex type
[ https://issues.apache.org/jira/browse/PIG-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1499: --- Assignee: Xuefu Zhang Type error message does not include complex type Key: PIG-1499 URL: https://issues.apache.org/jira/browse/PIG-1499 Project: Pig Issue Type: Improvement Components: impl Affects Versions: 0.7.0 Environment: Hadoop 0.20.104.3.1007030707 Apache Pig version 0.7.0.20.100.1.1006041903 (r951530) Reporter: Sherry Chen Assignee: Xuefu Zhang Priority: Minor Fix For: 0.9.0
When loading data as a bag, if the schema specification is not correct, the error message does not include useful information about the bag type. For example, take the input file input.txt, the working script working.pig, and the non-working script not_working.pig as follows:
input.txt
{(2, 3)}
{(4, 6)}
{(5, 7)}
not_working.pig
A = LOAD 'input.txt' AS (f1:bag[T:tuple(t1, t2)]);
describe A;
dump A;
working.pig
A = LOAD 'input.txt' AS (f1:bag{T:tuple(t1, t2)});
describe A;
dump A;
If we run pig -latest -x local working.pig, we get the result:
({(2, 3)})
({(4, 6)})
({(5, 7)})
If we run pig -latest -x local not_working.pig, we get:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered bag bag at line 1, column 29. Was expecting one of: int ... long ... float ... double ... chararray ... bytearray ... int ... long ... float ... double ... chararray ... bytearray ...
Please include bag{}, map[], and tuple() in the error message so that the error is easier to address.
[jira] Assigned: (PIG-1573) PIG shouldn't pass all input to a UDF if the UDF specify no argument
[ https://issues.apache.org/jira/browse/PIG-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1573: --- Assignee: Daniel Dai (was: Xuefu Zhang) PIG shouldn't pass all input to a UDF if the UDF specify no argument Key: PIG-1573 URL: https://issues.apache.org/jira/browse/PIG-1573 Project: Pig Issue Type: Bug Affects Versions: 0.7.0 Reporter: Xuefu Zhang Assignee: Daniel Dai Fix For: 0.9.0 Currently, if a Pig script uses a UDF with no argument, the Pig backend assumes that the UDF takes all input, so at run time it passes all input as a tuple to the UDF. This assumption is incorrect and causes conceptual confusion. If a UDF takes all input, it can specify a star (*) as its argument. If it specifies no argument at all, then we assume that it requires no input data. We need to differentiate between no input and all input for a UDF. Thus, when a UDF specifies no argument, the backend should pass the UDF an empty tuple. See notes in PIG-1586 for more information.
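The three cases described in this issue (no argument, star, explicit columns) can be sketched in Python as follows. This is an illustration of the proposed semantics only; udf_input, ALL_INPUT, and the list-of-indexes representation are hypothetical and do not correspond to Pig backend classes.

```python
ALL_INPUT = "*"  # hypothetical marker for a UDF invoked as f(*)

def udf_input(args, row):
    """Build the tuple handed to a UDF from its declared arguments.

    args: list of column indexes, [ALL_INPUT] for f(*), or [] for f().
    row:  the current input tuple.
    """
    if not args:
        return ()                       # f(): no input requested, empty tuple
    if args == [ALL_INPUT]:
        return tuple(row)               # f(*): the whole input tuple
    return tuple(row[i] for i in args)  # f($1, ...): only the named columns
```

Under this rule f() and f(*) are no longer treated the same way: the first receives an empty tuple, the second receives every field of the row.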
[jira] Assigned: (PIG-1584) deal with inner cogroup
[ https://issues.apache.org/jira/browse/PIG-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1584: --- Assignee: Alan Gates deal with inner cogroup --- Key: PIG-1584 URL: https://issues.apache.org/jira/browse/PIG-1584 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Alan Gates Fix For: 0.9.0 The current implementation of inner in the case of cogroup is in conflict with join. We need to decide whether to fix inner cogroup or just remove the functionality if it is not widely used.
[jira] Assigned: (PIG-1577) support to variable number of arguments in UDF
[ https://issues.apache.org/jira/browse/PIG-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned PIG-1577: --- Assignee: Daniel Dai support to variable number of arguments in UDF -- Key: PIG-1577 URL: https://issues.apache.org/jira/browse/PIG-1577 Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Olga Natkovich Assignee: Daniel Dai Fix For: 0.9.0 In the current implementation, the functionality that maps arguments to classes does not support functions with a variable number of arguments. It also does not support functions that accept one of a fixed set of argument counts. This causes problems for string UDFs such as CONCAT, which can take an arbitrary number of arguments, or TRIM, which can take 1, 2, or 3 arguments. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909330#action_12909330 ] Yan Zhou commented on PIG-366: -- Robert, Could you put down a step-by-step instruction on how to use this jar as an eclipse plug-in? Thanks. PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1479) Embed Pig in scripting languages
[ https://issues.apache.org/jira/browse/PIG-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909345#action_12909345 ] Julien Le Dem commented on PIG-1479: Thanks Richard! Embed Pig in scripting languages Key: PIG-1479 URL: https://issues.apache.org/jira/browse/PIG-1479 Project: Pig Issue Type: New Feature Reporter: Julien Le Dem Attachments: PIG-1479.patch, pig-greek.tgz It should be possible to embed Pig calls in a scripting language and let functions defined in the same script available as UDFs. This is a spin off of https://issues.apache.org/jira/browse/PIG-928 which lets users define UDFs in scripting languages. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1589) add test cases for mapreduce operator which use distributed cache
[ https://issues.apache.org/jira/browse/PIG-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-1589: --- Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Resolution: Fixed Patch committed to 0.8 branch and trunk. add test cases for mapreduce operator which use distributed cache - Key: PIG-1589 URL: https://issues.apache.org/jira/browse/PIG-1589 Project: Pig Issue Type: Task Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.8.0 Attachments: PIG-1589.1.patch, TestWordCount.jar '-files filename' can be specified in the parameters for mapreduce operator to send files to distributed cache. Need to add test cases for that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1609) 'union onschema' should give a more useful error message when schema of one of the relations has null column name
[ https://issues.apache.org/jira/browse/PIG-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909377#action_12909377 ] Thejas M Nair commented on PIG-1609: All unit tests passed in my run. Patch is ready for review. 'union onschema' should give a more useful error message when schema of one of the relations has null column name - Key: PIG-1609 URL: https://issues.apache.org/jira/browse/PIG-1609 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.8.0 Attachments: PIG-1609.1.patch A better error message needs to be given in this case - {code} grunt> l = load '/tmp/empty.bag' as (i : int); grunt> f = foreach l generate i+1; grunt> describe f; f: {int} grunt> u = union onschema l , f; 2010-09-10 18:08:13,000 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Error merging schemas for union operator Details at logfile: /Users/tejas/pig_nmr_syn/trunk/pig_1284167020897.log {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1611) use enums for error code
use enums for error code Key: PIG-1611 URL: https://issues.apache.org/jira/browse/PIG-1611 Project: Pig Issue Type: Sub-task Reporter: Thejas M Nair Fix For: 0.9.0 Pig code is using integer constants for error codes, and the value of each error code is reserved using http://wiki.apache.org/pig/PigErrorHandlingFunctionalSpecification . This process is cumbersome and error prone. It would be better to use enum values instead. The enum value can contain the error message and encapsulate the error code. For example - {code} Replace throw new SchemaMergeException("Error in merging schema", 2124, PigException.BUG); with throw new SchemaMergeException(SCHEMA_MERGE_EX, PigException.BUG); {code} Where SCHEMA_MERGE_EX belongs to an error codes enum. We can use the ordinal value of the enum and an offset to determine the error code. The error message will be passed through the constructor of the enum. {code} SCHEMA_MERGE_EX("Error in merging schema"); {code} For documentation, the error codes and error messages can be dumped using code that uses the enum error code class. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
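The proposal above can be sketched roughly as follows. This is a minimal illustration, not Pig's actual code: the names PigErrorCode, UNION_SCHEMA_EX, ErrorCodeDump, and the OFFSET value of 2000 are all assumptions made for the example; only SCHEMA_MERGE_EX and its message come from the issue text.

```java
// Hypothetical sketch: the enum name, OFFSET, and UNION_SCHEMA_EX are illustrative only.
enum PigErrorCode {
    // Each constant carries its user-facing message.
    SCHEMA_MERGE_EX("Error in merging schema"),
    UNION_SCHEMA_EX("Error merging schemas for union operator");

    // Assumed base value; the numeric error code is OFFSET + ordinal().
    private static final int OFFSET = 2000;

    private final String message;

    PigErrorCode(String message) {
        this.message = message;
    }

    int code() {
        return OFFSET + ordinal();
    }

    String message() {
        return message;
    }
}

class ErrorCodeDump {
    // Builds one line per constant, e.g. as input for generated documentation.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (PigErrorCode e : PigErrorCode.values()) {
            sb.append("ERROR ").append(e.code()).append(": ")
              .append(e.message()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

One consequence of deriving codes from ordinal() is that inserting or reordering constants renumbers every later code, so a scheme like this would probably need new constants to be appended only.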
[jira] Commented: (PIG-1609) 'union onschema' should give a more useful error message when schema of one of the relations has null column name
[ https://issues.apache.org/jira/browse/PIG-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909412#action_12909412 ] Richard Ding commented on PIG-1609: --- +1 'union onschema' should give a more useful error message when schema of one of the relations has null column name - Key: PIG-1609 URL: https://issues.apache.org/jira/browse/PIG-1609 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.8.0 Attachments: PIG-1609.1.patch A better error message needs to be given in this case - {code} grunt> l = load '/tmp/empty.bag' as (i : int); grunt> f = foreach l generate i+1; grunt> describe f; f: {int} grunt> u = union onschema l , f; 2010-09-10 18:08:13,000 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Error merging schemas for union operator Details at logfile: /Users/tejas/pig_nmr_syn/trunk/pig_1284167020897.log {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1612) error reporting: PigException needs to have a way to indicate that its message is appropriate for user
error reporting: PigException needs to have a way to indicate that its message is appropriate for user -- Key: PIG-1612 URL: https://issues.apache.org/jira/browse/PIG-1612 Project: Pig Issue Type: Improvement Reporter: Thejas M Nair Fix For: 0.9.0 The error message printed to the user by pig is the message from the exception that is the 'root cause' from the chain of getCause() of exception that has been thrown. But often the 'root cause' exception does not have enough context that would make for a better error message. It should be possible for a PigException to indicate to the code that determines the error message that its getMessage() string should be used instead of that of the 'cause' exception. The following code in LogUtils.java is used to determine the exception that is the 'root cause' - {code} public static PigException getPigException(Throwable top) { Throwable current = top; Throwable pigException = top; while (current != null && current.getCause() != null){ current = current.getCause(); if((current instanceof PigException) && (((PigException)current).getErrorCode() != 0)) { pigException = current; } } return (pigException instanceof PigException? (PigException)pigException : null); } {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
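One way the requested behaviour could look is sketched below. It is an illustration under stated assumptions, not a patch: SketchPigException is a hypothetical stand-in for PigException, the showToUser flag and the UserMessageSelector class are invented for the example, and the fallback rule is simplified relative to the LogUtils code quoted above.

```java
// Hypothetical stand-in for PigException; the showToUser flag is the assumed addition.
class SketchPigException extends Exception {
    private final int errorCode;
    private final boolean showToUser;

    SketchPigException(String message, int errorCode, boolean showToUser, Throwable cause) {
        super(message, cause);
        this.errorCode = errorCode;
        this.showToUser = showToUser;
    }

    int getErrorCode() { return errorCode; }

    boolean isShowToUser() { return showToUser; }
}

class UserMessageSelector {
    // Walks the cause chain starting at the top-level exception.
    // The outermost exception flagged as user-appropriate wins; otherwise the
    // deepest exception with a non-zero error code is used, or null if none exists.
    static SketchPigException select(Throwable top) {
        SketchPigException deepestCoded = null;
        for (Throwable t = top; t != null; t = t.getCause()) {
            if (t instanceof SketchPigException) {
                SketchPigException pe = (SketchPigException) t;
                if (pe.isShowToUser()) {
                    return pe;
                }
                if (pe.getErrorCode() != 0) {
                    deepestCoded = pe;
                }
            }
        }
        return deepestCoded;
    }

    public static void main(String[] args) {
        SketchPigException root = new SketchPigException(
                "Found more than one match: l1::i, l2::i", 1025, false, null);
        SketchPigException merge = new SketchPigException(
                "Error merging schema", 0, true, root);
        Exception top = new RuntimeException("parse failed", merge);
        // Prints the message of the flagged exception rather than the root cause.
        System.out.println(select(top).getMessage());
    }
}
```

With a flag like this, an exception that carries more context can opt in to being shown, while unflagged chains keep the existing root-cause behaviour.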
[jira] Commented: (PIG-1612) error reporting: PigException needs to have a way to indicate that its message is appropriate for user
[ https://issues.apache.org/jira/browse/PIG-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909418#action_12909418 ] Thejas M Nair commented on PIG-1612: For example, in this exception stack trace - {code} Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Error merging schemas for union operator : Error merging schema: ({i: int,j: long}) with merged schema: ({l1::i: int,l1::j: long,l2::i: int,l2::j: long}) of schemas : [{l1::i: int,l1::j: long,l2::i: int, l2::j: long}] at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnionClause(QueryParser.java:3409) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1457) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:1010) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:797) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1593) ... 13 more Caused by: org.apache.pig.impl.logicalLayer.schema.SchemaMergeException: ERROR 0: Error merging schema: ({i: int,j: long}) with merged schema: ({l1::i: int,l1::j: long,l2::i: int,l2::j: long}) of schemas : [{l1::i: int,l1::j: long,l2::i: int,l2::j: long}] at org.apache.pig.impl.logicalLayer.schema.Schema.mergeSchemasByAlias(Schema.java:1652) at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnionClause(QueryParser.java:3405) ... 18 more Caused by: org.apache.pig.impl.logicalLayer.schema.SchemaMergeException: ERROR 0: Caught exception finding FieldSchema for aliasi at org.apache.pig.impl.logicalLayer.schema.Schema.getFieldSubNameMatchThrowSchemaMergeException(Schema.java:1787) at org.apache.pig.impl.logicalLayer.schema.Schema.mergeSchemaByAlias(Schema.java:1686) at org.apache.pig.impl.logicalLayer.schema.Schema.mergeSchemasByAlias(Schema.java:1646) ... 
19 more Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1025: Found more than one match: l1::i, l2::i at org.apache.pig.impl.logicalLayer.schema.Schema.getField(Schema.java:819) at org.apache.pig.impl.logicalLayer.schema.Schema.getFieldSubNameMatch(Schema.java:836) at org.apache.pig.impl.logicalLayer.schema.Schema.getFieldSubNameMatchThrowSchemaMergeException(Schema.java:1783) ... 21 more {code} The pig statement that results in this error is a union command - u = union onschema f, l3; The error message that is printed only says - 'Found more than one match: l1::i, l2::i'. It would be more useful for the user if we were able to say something along the lines of - Error merging schema: ({i: int,j: long}) with merged schema: ({l1::i: int,l1::j: long,l2::i: int,l2::j: long}) of schemas : [{l1::i: int,l1::j: long,l2::i: int,l2::j: long}]. Found more than one match: l1::i, l2::i (assuming this was the message generated by the exception from Schema.java:1652) error reporting: PigException needs to have a way to indicate that its message is appropriate for user -- Key: PIG-1612 URL: https://issues.apache.org/jira/browse/PIG-1612 Project: Pig Issue Type: Improvement Reporter: Thejas M Nair Fix For: 0.9.0 The error message printed to the user by pig is the message from the exception that is the 'root cause' from the chain of getCause() of exception that has been thrown. But often the 'root cause' exception does not have enough context that would make for a better error message. It should be possible for a PigException to indicate to the code that determines the error message that its getMessage() string should be used instead of that of the 'cause' exception. 
The following code in LogUtils.java is used to determine the exception that is the 'root cause' - {code} public static PigException getPigException(Throwable top) { Throwable current = top; Throwable pigException = top; while (current != null && current.getCause() != null){ current = current.getCause(); if((current instanceof PigException) && (((PigException)current).getErrorCode() != 0)) { pigException = current; } } return (pigException instanceof PigException? (PigException)pigException : null); } {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1609) 'union onschema' should give a more useful error message when schema of one of the relations has null column name
[ https://issues.apache.org/jira/browse/PIG-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-1609: --- Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Resolution: Fixed Patch committed to 0.8 branch and trunk. 'union onschema' should give a more useful error message when schema of one of the relations has null column name - Key: PIG-1609 URL: https://issues.apache.org/jira/browse/PIG-1609 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.8.0 Attachments: PIG-1609.1.patch A better error message needs to be given in this case - {code} grunt> l = load '/tmp/empty.bag' as (i : int); grunt> f = foreach l generate i+1; grunt> describe f; f: {int} grunt> u = union onschema l , f; 2010-09-10 18:08:13,000 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Error merging schemas for union operator Details at logfile: /Users/tejas/pig_nmr_syn/trunk/pig_1284167020897.log {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1608: --- Attachment: PIG-1608_0.patch This patch will include pig-default.properties with each pig jar file, by default. pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1611) use enums for error code
[ https://issues.apache.org/jira/browse/PIG-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12909493#action_12909493 ] Dmitriy V. Ryaboy commented on PIG-1611: +140 use enums for error code Key: PIG-1611 URL: https://issues.apache.org/jira/browse/PIG-1611 Project: Pig Issue Type: Sub-task Reporter: Thejas M Nair Fix For: 0.9.0 Pig code is using integer constants for error codes, and the value of each error code is reserved using http://wiki.apache.org/pig/PigErrorHandlingFunctionalSpecification . This process is cumbersome and error prone. It would be better to use enum values instead. The enum value can contain the error message and encapsulate the error code. For example - {code} Replace throw new SchemaMergeException("Error in merging schema", 2124, PigException.BUG); with throw new SchemaMergeException(SCHEMA_MERGE_EX, PigException.BUG); {code} Where SCHEMA_MERGE_EX belongs to an error codes enum. We can use the ordinal value of the enum and an offset to determine the error code. The error message will be passed through the constructor of the enum. {code} SCHEMA_MERGE_EX("Error in merging schema"); {code} For documentation, the error codes and error messages can be dumped using code that uses the enum error code class. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1542) log level not propogated to MR task loggers
[ https://issues.apache.org/jira/browse/PIG-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1542: --- Status: Patch Available (was: Open) log level not propogated to MR task loggers --- Key: PIG-1542 URL: https://issues.apache.org/jira/browse/PIG-1542 Project: Pig Issue Type: Bug Reporter: Thejas M Nair Assignee: niraj rai Fix For: 0.8.0 Attachments: PIG-1542.patch, PIG-1542_1.patch Specifying -d DEBUG does not affect the logging of the MR tasks . This was fixed earlier in PIG-882 . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1608: --- Status: Patch Available (was: Open) pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1479) Embed Pig in scripting languages
[ https://issues.apache.org/jira/browse/PIG-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Ding updated PIG-1479: -- Attachment: PIG-1479_2.patch In the previous patch, the executeScript method on ScriptPigServer returns a list of ExecJobs (one for each store statement in the script). Unfortunately, the order of ExecJobs in the list is indeterminate. This patch fixes this problem by making the executeScript method return a PigStats object. One can then retrieve the output result by the alias corresponding to the store statement. Here is an example: {code} P = pig.executeScript(""" A = load '${input}'; ... ... store G into '${output}'; """) output = P.result("G") # an OutputStats object iter = output.iterator() if iter.hasNext(): # do something else: # do something else {code} Embed Pig in scripting languages Key: PIG-1479 URL: https://issues.apache.org/jira/browse/PIG-1479 Project: Pig Issue Type: New Feature Reporter: Julien Le Dem Attachments: PIG-1479.patch, PIG-1479_2.patch, pig-greek.tgz It should be possible to embed Pig calls in a scripting language and let functions defined in the same script available as UDFs. This is a spin off of https://issues.apache.org/jira/browse/PIG-928 which lets users define UDFs in scripting languages. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1479) Embed Pig in scripting languages
[ https://issues.apache.org/jira/browse/PIG-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Ding updated PIG-1479: -- Attachment: pig-greek-test.tar Attached the updated test program from Julien. To run the example: * tar -xvf pig-greek-test.tar * java -cp pig.jar:<jython jar> org.apache.pig.Main -x local -g script/tc.py Embed Pig in scripting languages Key: PIG-1479 URL: https://issues.apache.org/jira/browse/PIG-1479 Project: Pig Issue Type: New Feature Reporter: Julien Le Dem Attachments: PIG-1479.patch, PIG-1479_2.patch, pig-greek-test.tar, pig-greek.tgz It should be possible to embed Pig calls in a scripting language and let functions defined in the same script available as UDFs. This is a spin off of https://issues.apache.org/jira/browse/PIG-928 which lets users define UDFs in scripting languages. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1578) PigServer.executeBatch does not return status of failed job for native mapreduce statement
[ https://issues.apache.org/jira/browse/PIG-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1578: Fix Version/s: (was: 0.8.0) PigServer.executeBatch does not return status of failed job for native mapreduce statement -- Key: PIG-1578 URL: https://issues.apache.org/jira/browse/PIG-1578 Project: Pig Issue Type: Bug Reporter: Thejas M Nair Assignee: Richard Ding For a failed job, PigServer.executeBatch does not return an ExecJob. ExecJobs are created using output statistics, and the output statistics for jobs that failed do not seem to exist. The query I tried was a native mapreduce job, where the output file of the native MR job already exists, causing that job to fail. {code} A = load ' + INPUT_FILE + '; B = mapreduce ' + jarFileName + ' + Store A into 'table_testNativeMRJobSimple_input' + Load 'table_testNativeMRJobSimple_output' + `WordCount table_testNativeMRJobSimple_input + INPUT_FILE + `;); Store B into 'table_testNativeMRJobSimpleDir';); {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-815) misleading error message when streaming fails
[ https://issues.apache.org/jira/browse/PIG-815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich resolved PIG-815. Resolution: Won't Fix I don't think we have sufficient information to act on this misleading error message when streaming fails - Key: PIG-815 URL: https://issues.apache.org/jira/browse/PIG-815 Project: Pig Issue Type: Bug Affects Versions: 0.2.0 Reporter: Olga Natkovich Assignee: Gunther Hagleitner Fix For: 0.9.0 One of the users reported seeing a confusing message: Jobs not found in the JobClient. Please try to use Local, Hadoop Distributed or Hadoop MiniCluster modes instead of Hadoop LocalExecution ERROR 2055: Received Error while processing the map plan: 'process.pl ' failed with exit status: 255 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-638) error handling - enforce error codes
[ https://issues.apache.org/jira/browse/PIG-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-638: --- Fix Version/s: (was: 0.9.0) error handling - enforce error codes Key: PIG-638 URL: https://issues.apache.org/jira/browse/PIG-638 Project: Pig Issue Type: Bug Reporter: Olga Natkovich Assignee: Santhosh Srinivasan We should not allow exceptions that don't set error code as that kind of information is not helpful for users. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1017) Converts strings to text in Pig
[ https://issues.apache.org/jira/browse/PIG-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1017: Assignee: Thejas M Nair (was: Sriranjan Manjunath) We need to decide if this is something we should do for 0.9 Converts strings to text in Pig --- Key: PIG-1017 URL: https://issues.apache.org/jira/browse/PIG-1017 Project: Pig Issue Type: Improvement Reporter: Sriranjan Manjunath Assignee: Thejas M Nair Fix For: 0.9.0 Attachments: stotext.patch Strings in Java are UTF-16 and take 2 bytes per character. Text (org.apache.hadoop.io.Text) stores the data in UTF-8 and could show significant reductions in memory. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
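The size difference described above can be checked with a small stand-alone sketch. It uses only the JDK's own charset support rather than org.apache.hadoop.io.Text, so it compares encoded byte counts only and ignores per-object overhead; the class name EncodingSize and the sample strings are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Pig or Hadoop: compares encoded sizes only.
class EncodingSize {
    // Bytes of character data in UTF-16 (2 bytes per char code unit, no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    // Bytes of character data in UTF-8 (1 to 4 bytes per code point).
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "studenttab10k";   // ASCII-only sample
        String cjk = "\u732a";            // a single CJK character
        System.out.println(ascii + ": UTF-16=" + utf16Bytes(ascii)
                + " UTF-8=" + utf8Bytes(ascii));
        System.out.println("U+732A: UTF-16=" + utf16Bytes(cjk)
                + " UTF-8=" + utf8Bytes(cjk));
    }
}
```

For ASCII-heavy data UTF-8 needs half the bytes, but characters outside the Latin range can take 3 bytes in UTF-8 against 2 in UTF-16, so the saving depends on the data being processed.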