[jira] Commented: (PIG-1661) Add alternative search-provider to Pig site
[ https://issues.apache.org/jira/browse/PIG-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917246#action_12917246 ] Santhosh Srinivasan commented on PIG-1661: -- Sure, worth a try. > Add alternative search-provider to Pig site > --- > > Key: PIG-1661 > URL: https://issues.apache.org/jira/browse/PIG-1661 > Project: Pig > Issue Type: Improvement > Components: documentation >Reporter: Alex Baranau >Priority: Minor > Attachments: PIG-1661.patch > > > Use search-hadoop.com service to make available search in Pig sources, MLs, > wiki, etc. > This was initially proposed on user mailing list. The search service was > already added in site's skin (common for all Hadoop related projects) via > AVRO-626 so this issue is about enabling it for Pig. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771287#action_12771287 ] Santhosh Srinivasan commented on PIG-1016: -- I am summarizing my understanding of the patch that has been submitted by hc busy. Root cause: PIG-880 changed the value type of maps in PigStorage from native Java types to DataByteArray. As a result of this change, parsing of complex types as map values was disabled. Proposed fix: Revert the changes made as part of PIG-880 to interpret map values as Java types. In addition, change the comparison method to check for the object type and call the appropriate compareTo method. The latter is required to work around the fact that the front-end assigns the value type to be DataByteArray whereas the backend sees the actual type (Integer, Long, Tuple, DataBag, etc.). Based on this understanding, I have the following review comment(s). Index: src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBytesRawComparator.java === Can you explain the checks in the if and the else? Specifically, NullableBytesWritable is a subclass of PigNullableWritable. As a result, in the if part, the check for both o1 and o2 not being PigNullableWritable is confusing as nbw1 and nbw2 are cast to NullableBytesWritable if o1 and o2 are not PigNullableWritable. {code} +// find bug is complaining about nulls. This check sequence will prevent nulls from being dereferenced. +if(o1!=null && o2!=null){ + +// In case the objects are comparable +if((o1 instanceof NullableBytesWritable && o2 instanceof NullableBytesWritable)|| + !(o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable) +){ + + NullableBytesWritable nbw1 = (NullableBytesWritable)o1; + NullableBytesWritable nbw2 = (NullableBytesWritable)o2; + + // If either are null, handle differently. 
+ if (!nbw1.isNull() && !nbw2.isNull()) { + rc = ((DataByteArray)nbw1.getValueAsPigType()).compareTo((DataByteArray)nbw2.getValueAsPigType()); + } else { + // For sorting purposes two nulls are equal. + if (nbw1.isNull() && nbw2.isNull()) rc = 0; + else if (nbw1.isNull()) rc = -1; + else rc = 1; + } +}else{ + // enter here only if both o1 and o2 are non-NullableByteWritable PigNullableWritable's + PigNullableWritable nbw1 = (PigNullableWritable)o1; + PigNullableWritable nbw2 = (PigNullableWritable)o2; + // If either are null, handle differently. + if (!nbw1.isNull() && !nbw2.isNull()) { + rc = nbw1.compareTo(nbw2); + } else { + // For sorting purposes two nulls are equal. + if (nbw1.isNull() && nbw2.isNull()) rc = 0; + else if (nbw1.isNull()) rc = -1; + else rc = 1; + } +} +}else{ + if(o1==null && o2==null){rc=0;} + else if(o1==null) {rc=-1;} + else{ rc=1; } {code} > Reading in map data seems broken > > > Key: PIG-1016 > URL: https://issues.apache.org/jira/browse/PIG-1016 > Project: Pig > Issue Type: Improvement > Components: data >Affects Versions: 0.4.0 >Reporter: hc busy > Fix For: 0.5.0 > > Attachments: PIG-1016.patch > > > Hi, I'm trying to load a map that has a tuple for value. The read fails in > 0.4.0 because of a misconfiguration in the parser. Whereas in almost all > documentation it is stated that the value of the map can be any type. > I've attached a patch that allows us to read in complex objects as values, as > documented. I've done simple verification of loading in maps with tuple/map > values and writing them back out using LOAD and STORE. All seems to work fine. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
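The concern about the guard can be made concrete with stand-in classes (the class names below are hypothetical, not Pig's real types): two objects that are not PigNullableWritable at all satisfy the condition, yet the cast that follows would throw.

```java
// Stand-ins for PigNullableWritable / NullableBytesWritable. These are
// hypothetical classes used only to illustrate the review comment above;
// they are not Pig's real types.
class PigNullableWritableStub {}

class NullableBytesWritableStub extends PigNullableWritableStub {}

public class CastCheckDemo {
    // Mirrors the guard in the patch: true when both operands are
    // NullableBytesWritable, OR when both are NOT PigNullableWritable at all.
    static boolean guard(Object o1, Object o2) {
        return (o1 instanceof NullableBytesWritableStub && o2 instanceof NullableBytesWritableStub)
                || !(o1 instanceof PigNullableWritableStub && o2 instanceof PigNullableWritableStub);
    }

    // The cast that follows the guard in the patch, applied to an arbitrary object.
    static boolean castFails(Object o) {
        try {
            NullableBytesWritableStub n = (NullableBytesWritableStub) o;
            return n == null; // unreachable for non-null o; keeps the cast live
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Two objects that are not PigNullableWritable at all pass the guard ...
        System.out.println(guard("x", "y")); // true
        // ... yet the cast to NullableBytesWritable then blows up.
        System.out.println(castFails("x")); // true
    }
}
```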
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771442#action_12771442 ] Santhosh Srinivasan commented on PIG-1016: -- Hc Busy, thanks for taking the time to contribute the patch, explaining the details, and especially for being patient. A few more questions and details have to be cleared up before we commit this patch. IMHO, the right comparison should be along the lines of checking if o1 and o2 are NullableBytesWritable, followed by a check for PigNullableWritable, and then followed by error handling code. Alan, can you comment on this approach? There is a more important semantic issue. If the map values are strings and some of those strings are numeric, then the value types for the maps will differ. In that case, the load function will break. In addition, conversion routines might fail when the compareTo method is invoked. An example illustrates this issue. Suppose the record is ['key'#1234567890124567]. PIG-880 would treat the value as a string and there would be no problem. Now, with the changes reverted, the type is inferred as integer and the parsing will fail as the value is too big to fit into an integer. Secondly, assuming that the integer was small enough to be converted, the comparison method in DataType.java will return the wrong results when an integer and a string are compared. For example, if the records are: [key#*$] [key#123] The first value is treated as a string and the second value is treated as an integer. The compareTo method will return 1 to indicate that string > integer, while in reality 123 > *$. Please correct me if the last statement is incorrect or let me know if it needs more explanation. Thoughts/comments from other committers? 
> Reading in map data seems broken > > > Key: PIG-1016 > URL: https://issues.apache.org/jira/browse/PIG-1016 > Project: Pig > Issue Type: Improvement > Components: data >Affects Versions: 0.4.0 >Reporter: hc busy > Fix For: 0.5.0 > > Attachments: PIG-1016.patch > > > Hi, I'm trying to load a map that has a tuple for value. The read fails in > 0.4.0 because of a misconfiguration in the parser. Whereas in almost all > documentation it is stated that the value of the map can be any type. > I've attached a patch that allows us to read in complex objects as values, as > documented. I've done simple verification of loading in maps with tuple/map > values and writing them back out using LOAD and STORE. All seems to work fine. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
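The ordering problem described in the comment can be sketched in plain Java. The type codes and the type-first rule below are assumptions for illustration, loosely modeled on the comment's description of DataType.java, not Pig's actual constants:

```java
// A minimal sketch of type-first comparison. The assumption (consistent with
// the comment above about DataType.java) is that values of different types
// are ordered by a numeric type code rather than by value; the codes below
// are invented for illustration.
public class TypeOrderDemo {
    static final int TYPE_INTEGER = 10;
    static final int TYPE_CHARARRAY = 55;

    static int typeOf(Object o) {
        return (o instanceof Integer) ? TYPE_INTEGER : TYPE_CHARARRAY;
    }

    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compare(Object a, Object b) {
        int ta = typeOf(a);
        int tb = typeOf(b);
        if (ta != tb) {
            return Integer.compare(ta, tb); // the type code decides, not the value
        }
        return ((Comparable) a).compareTo(b);
    }

    public static void main(String[] args) {
        // "*$" is a chararray and 123 an integer, so the string compares as
        // greater purely on type code - the surprising ordering described above.
        System.out.println(compare("*$", 123) > 0); // true
        System.out.println(compare(123, 456) < 0);  // true: same type, by value
    }
}
```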
[jira] Commented: (PIG-1073) LogicalPlanCloner can't clone plan containing LOJoin
[ https://issues.apache.org/jira/browse/PIG-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774147#action_12774147 ] Santhosh Srinivasan commented on PIG-1073: -- If my memory serves me correctly, the logical plan cloning was implemented (by me) for cloning inner plans for foreach. As such, the top level plan cloning was never tested and some items are marked as TODO (see visit methods for LOLoad, LOStore and LOStream). If you want to use it as you mention in your test cases, then you need to add code for cloning the LOLoad, LOStore, LOStream and LOJoin operators. > LogicalPlanCloner can't clone plan containing LOJoin > > > Key: PIG-1073 > URL: https://issues.apache.org/jira/browse/PIG-1073 > Project: Pig > Issue Type: Bug > Components: impl >Reporter: Ashutosh Chauhan > > Add following testcase in LogicalPlanBuilder.java > public void testLogicalPlanCloner() throws CloneNotSupportedException{ > LogicalPlan lp = buildPlan("C = join ( load 'A') by $0, (load 'B') by > $0;"); > LogicalPlanCloner cloner = new LogicalPlanCloner(lp); > cloner.getClonedPlan(); > } > and this fails with the following stacktrace: > java.lang.NullPointerException > at > org.apache.pig.impl.logicalLayer.LOVisitor.visit(LOVisitor.java:171) > at > org.apache.pig.impl.logicalLayer.PlanSetter.visit(PlanSetter.java:63) > at org.apache.pig.impl.logicalLayer.LOJoin.visit(LOJoin.java:213) > at org.apache.pig.impl.logicalLayer.LOJoin.visit(LOJoin.java:45) > at > org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:67) > at > org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69) > at > org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) > at > org.apache.pig.impl.logicalLayer.LogicalPlanCloneHelper.getClonedPlan(LogicalPlanCloneHelper.java:73) > at > 
org.apache.pig.impl.logicalLayer.LogicalPlanCloner.getClonedPlan(LogicalPlanCloner.java:46) > at > org.apache.pig.test.TestLogicalPlanBuilder.testLogicalPlanCloneHelper(TestLogicalPlanBuilder.java:2110) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
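What "adding code for cloning" an operator amounts to can be sketched generically. All names below are illustrative, not Pig's API; the point is only that any operator type without its own clone support makes top-level plan cloning fail, the way the TODO-stubbed visit methods would:

```java
import java.util.ArrayList;
import java.util.List;

// A toy plan of operators and a cloning pass. Operator and method names are
// illustrative, not Pig's API.
abstract class Op {
    final String name;
    Op(String name) { this.name = name; }
    abstract Op copy();
}

class ForEachOp extends Op { // inner-plan operators already clone fine
    ForEachOp(String n) { super(n); }
    Op copy() { return new ForEachOp(name); }
}

class LoadOp extends Op { // stands in for LOLoad/LOStore/LOStream/LOJoin
    LoadOp(String n) { super(n); }
    Op copy() { throw new UnsupportedOperationException("clone not implemented for " + name); }
}

public class PlanClonerDemo {
    static List<Op> clonePlan(List<Op> plan) {
        List<Op> cloned = new ArrayList<>();
        for (Op op : plan) {
            cloned.add(op.copy());
        }
        return cloned;
    }

    public static void main(String[] args) {
        // An inner plan (foreach only) clones without trouble.
        System.out.println(clonePlan(List.of(new ForEachOp("fe"))).size()); // 1
        // A top-level plan containing a load hits the unimplemented stub.
        try {
            clonePlan(List.of(new LoadOp("load"), new ForEachOp("fe")));
        } catch (UnsupportedOperationException e) {
            System.out.println("top-level clone fails: " + e.getMessage());
        }
    }
}
```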
[jira] Commented: (PIG-1065) In-determinate behaviour of Union when there are 2 non-matching schema's
[ https://issues.apache.org/jira/browse/PIG-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774153#action_12774153 ] Santhosh Srinivasan commented on PIG-1065: -- Answer to Question 1: Pig 1.0 had that syntax and it was retained for backward compatibility. Paolo suggested that for uniformity, the 'AS' clause for the load statements should be extended to all relational operators. Gradually, the column aliasing in the foreach should be removed from the documentation and eventually removed from the language. > In-determinate behaviour of Union when there are 2 non-matching schema's > > > Key: PIG-1065 > URL: https://issues.apache.org/jira/browse/PIG-1065 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Viraj Bhat > Fix For: 0.6.0 > > > I have a script which first does a union of these schemas and then does a > ORDER BY of this result. > {code} > f1 = LOAD '1.txt' as (key:chararray, v:chararray); > f2 = LOAD '2.txt' as (key:chararray); > u0 = UNION f1, f2; > describe u0; > dump u0; > u1 = ORDER u0 BY $0; > dump u1; > {code} > When I run in Map Reduce mode I get the following result: > $java -cp pig.jar:$HADOOP_HOME/conf org.apache.pig.Main broken.pig > > Schema for u0 unknown. 
> > (1,2) > (2,3) > (1) > (2) > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to > open iterator for alias u1 > at org.apache.pig.PigServer.openIterator(PigServer.java:475) > at > org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142) > at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89) > at org.apache.pig.Main.main(Main.java:397) > > Caused by: java.io.IOException: Type mismatch in key from map: expected > org.apache.pig.impl.io.NullableBytesWritable, recieved > org.apache.pig.impl.io.NullableText > at > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:415) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:251) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > > When I run the same script in local mode I get a different result, as we know > that local mode does not use any Hadoop Classes. > $java -cp pig.jar org.apache.pig.Main -x local broken.pig > > Schema for u0 unknown > > (1,2) > (1) > (2,3) > (2) > > (1,2) > (1) > (2,3) > (2) > > Here are some questions > 1) Why do we allow union if the schemas do not match > 2) Should we not print an error message/warning so that the user knows that > this is not allowed or he can get unexpected results? 
> Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
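The type-mismatch failure in the map-reduce run has a plain-Java parallel (this is not Pig code): sorting keys of two unrelated classes is inherently ill-defined, which is what a union of mismatched schemas asks the shuffle to do when one branch emits NullableText keys and the other NullableBytesWritable keys.

```java
import java.util.TreeMap;

// Not Pig code: a plain-Java parallel to the MapReduce failure above.
// A TreeMap fed both a String key and an Integer key fails for the same
// underlying reason the shuffle does: the two key classes cannot be compared.
public class MixedKeySortDemo {
    static boolean sortFails() {
        TreeMap<Object, String> sorted = new TreeMap<>();
        sorted.put("1", "from f1"); // chararray-like key from one union branch
        try {
            sorted.put(2, "from f2"); // differently-typed key from the other branch
            return false;
        } catch (ClassCastException e) {
            return true; // the sorter cannot compare the two key classes
        }
    }

    public static void main(String[] args) {
        System.out.println(sortFails()); // true
    }
}
```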
[jira] Commented: (PIG-1065) In-determinate behaviour of Union when there are 2 non-matching schema's
[ https://issues.apache.org/jira/browse/PIG-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775968#action_12775968 ] Santhosh Srinivasan commented on PIG-1065: -- The schema will then correspond to the prefix, as implemented today. For example, if the AS clause is defined for flatten($1), and $1 flattens to 10 columns while the AS clause has 3 columns, then the prefix is used and the remaining columns are left undefined. > In-determinate behaviour of Union when there are 2 non-matching schema's > > > Key: PIG-1065 > URL: https://issues.apache.org/jira/browse/PIG-1065 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Viraj Bhat > Fix For: 0.6.0 > > > I have a script which first does a union of these schemas and then does a > ORDER BY of this result. > {code} > f1 = LOAD '1.txt' as (key:chararray, v:chararray); > f2 = LOAD '2.txt' as (key:chararray); > u0 = UNION f1, f2; > describe u0; > dump u0; > u1 = ORDER u0 BY $0; > dump u1; > {code} > When I run in Map Reduce mode I get the following result: > $java -cp pig.jar:$HADOOP_HOME/conf org.apache.pig.Main broken.pig > > Schema for u0 unknown. 
> > (1,2) > (2,3) > (1) > (2) > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to > open iterator for alias u1 > at org.apache.pig.PigServer.openIterator(PigServer.java:475) > at > org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142) > at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89) > at org.apache.pig.Main.main(Main.java:397) > > Caused by: java.io.IOException: Type mismatch in key from map: expected > org.apache.pig.impl.io.NullableBytesWritable, recieved > org.apache.pig.impl.io.NullableText > at > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:415) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:251) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > > When I run the same script in local mode I get a different result, as we know > that local mode does not use any Hadoop Classes. > $java -cp pig.jar org.apache.pig.Main -x local broken.pig > > Schema for u0 unknown > > (1,2) > (1) > (2,3) > (2) > > (1,2) > (1) > (2,3) > (2) > > Here are some questions > 1) Why do we allow union if the schemas do not match > 2) Should we not print an error message/warning so that the user knows that > this is not allowed or he can get unexpected results? 
> Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1065) In-determinate behaviour of Union when there are 2 non-matching schema's
[ https://issues.apache.org/jira/browse/PIG-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776098#action_12776098 ] Santhosh Srinivasan commented on PIG-1065: -- bq. Aliasing inside foreach is hugely useful for readability. Are you suggesting removing the ability to assign aliases inside a foreach, or just to change/assign schemas? For consistency, all relational operators should support the AS clause. Gradually, the aliasing on a per-column basis in foreach should be removed from the documentation, deprecated, and eventually removed. This is a long-term recommendation. > In-determinate behaviour of Union when there are 2 non-matching schema's > > > Key: PIG-1065 > URL: https://issues.apache.org/jira/browse/PIG-1065 > Project: Pig > Issue Type: Bug >Affects Versions: 0.6.0 >Reporter: Viraj Bhat > Fix For: 0.6.0 > > > I have a script which first does a union of these schemas and then does a > ORDER BY of this result. > {code} > f1 = LOAD '1.txt' as (key:chararray, v:chararray); > f2 = LOAD '2.txt' as (key:chararray); > u0 = UNION f1, f2; > describe u0; > dump u0; > u1 = ORDER u0 BY $0; > dump u1; > {code} > When I run in Map Reduce mode I get the following result: > $java -cp pig.jar:$HADOOP_HOME/conf org.apache.pig.Main broken.pig > > Schema for u0 unknown. 
> > (1,2) > (2,3) > (1) > (2) > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to > open iterator for alias u1 > at org.apache.pig.PigServer.openIterator(PigServer.java:475) > at > org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142) > at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89) > at org.apache.pig.Main.main(Main.java:397) > > Caused by: java.io.IOException: Type mismatch in key from map: expected > org.apache.pig.impl.io.NullableBytesWritable, recieved > org.apache.pig.impl.io.NullableText > at > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:415) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:251) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > > When I run the same script in local mode I get a different result, as we know > that local mode does not use any Hadoop Classes. > $java -cp pig.jar org.apache.pig.Main -x local broken.pig > > Schema for u0 unknown > > (1,2) > (1) > (2,3) > (2) > > (1,2) > (1) > (2,3) > (2) > > Here are some questions > 1) Why do we allow union if the schemas do not match > 2) Should we not print an error message/warning so that the user knows that > this is not allowed or he can get unexpected results? 
> Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1117) Pig reading hive columnar rc tables
[ https://issues.apache.org/jira/browse/PIG-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798917#action_12798917 ] Santhosh Srinivasan commented on PIG-1117: -- +1 on making it part of main piggybank. We should not be creating a separate directory just to handle Hive. > Pig reading hive columnar rc tables > --- > > Key: PIG-1117 > URL: https://issues.apache.org/jira/browse/PIG-1117 > Project: Pig > Issue Type: New Feature >Affects Versions: 0.7.0 >Reporter: Gerrit Jansen van Vuuren >Assignee: Gerrit Jansen van Vuuren > Fix For: 0.7.0 > > Attachments: HiveColumnarLoader.patch, HiveColumnarLoaderTest.patch, > PIG-1117.patch, PIG-117-v.0.6.0.patch, PIG-117-v.0.7.0.patch > > > I've coded a LoadFunc implementation that can read from Hive Columnar RC > tables; this is needed for a project that I'm working on because all our data > is stored using the Hive thrift serialized Columnar RC format. I have looked > at the piggybank but did not find any implementation that could do this. > We've been running it on our cluster for the last week and have worked out > most bugs. > > There are still some improvements I would like to make, such as setting > the number of mappers based on date partitioning. It's been optimized to > read only specific columns and can churn through a data set almost 8 times > faster with this improvement because not all column data is read. > I would like to contribute the class to the piggybank; can you guide me on > what I need to do? > I've used Hive-specific classes to implement this; is it possible to add this > to the piggybank build ivy for automatic download of the dependencies? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850342#action_12850342 ] Santhosh Srinivasan commented on PIG-1331: -- Jay, In PIG-823 there was a discussion around how Owl is different from Hive's metastore. Is that still true today? If not, can you elaborate on the key differences between the two systems? Thanks, Santhosh > Owl Hadoop Table Management Service > --- > > Key: PIG-1331 > URL: https://issues.apache.org/jira/browse/PIG-1331 > Project: Pig > Issue Type: New Feature >Reporter: Jay Tang > > This JIRA is a proposal to create a Hadoop table management service: Owl. > Today, MapReduce and Pig applications interact directly with HDFS > directories and files and must deal with low-level data management issues > such as storage format, serialization/compression schemes, data layout, and > efficient data access, often with different solutions. Owl aims to > provide a standard way to address this issue and abstracts away the > complexities of reading/writing huge amounts of data from/to HDFS. > Owl has a data access API that is modeled after the traditional Hadoop > !InputFormat and a management API to manipulate Owl objects. This JIRA is > related to Pig-823 (Hadoop Metadata Service) as Owl has an internal metadata > store. Owl integrates with different storage modules like Zebra with a > pluggable architecture. > Initially, the proposal is to submit Owl as a Pig contrib project. Over > time, it makes sense to move it to a Hadoop subproject. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850355#action_12850355 ] Santhosh Srinivasan commented on PIG-1331: -- Thanks for the information. Looking at the Hive design at http://wiki.apache.org/hadoop/Hive/Design , it looks like there is no significant difference between Owl and Hive. As you indicate, I hope we converge to a common metastore for Hadoop. > Owl Hadoop Table Management Service > --- > > Key: PIG-1331 > URL: https://issues.apache.org/jira/browse/PIG-1331 > Project: Pig > Issue Type: New Feature >Reporter: Jay Tang > > This JIRA is a proposal to create a Hadoop table management service: Owl. > Today, MapReduce and Pig applications interact directly with HDFS > directories and files and must deal with low-level data management issues > such as storage format, serialization/compression schemes, data layout, and > efficient data access, often with different solutions. Owl aims to > provide a standard way to address this issue and abstracts away the > complexities of reading/writing huge amounts of data from/to HDFS. > Owl has a data access API that is modeled after the traditional Hadoop > !InputFormat and a management API to manipulate Owl objects. This JIRA is > related to Pig-823 (Hadoop Metadata Service) as Owl has an internal metadata > store. Owl integrates with different storage modules like Zebra with a > pluggable architecture. > Initially, the proposal is to submit Owl as a Pig contrib project. Over > time, it makes sense to move it to a Hadoop subproject. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1344) PigStorage should be able to read back complex data containing delimiters created by PigStorage
PigStorage should be able to read back complex data containing delimiters created by PigStorage --- Key: PIG-1344 URL: https://issues.apache.org/jira/browse/PIG-1344 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0 Reporter: Santhosh Srinivasan Assignee: Daniel Dai Fix For: 0.8.0 With Pig 0.7, the TextDataParser has been removed and the logic to parse complex data types has moved to Utf8StorageConverter. However, this does not handle the case where the complex data types themselves contain delimiters ('{', '}', ',', '(', ')', '[', ']', '#'). Fixing this issue will make PigStorage self-contained and more usable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
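The delimiter problem can be illustrated with a naive parser sketch (hypothetical code, not Utf8StorageConverter's actual logic): splitting a tuple's text on ',' without tracking quoting or nesting misreads any value that itself contains the delimiter.

```java
import java.util.Arrays;

// A sketch of the ambiguity, not PigStorage's actual parsing code: splitting
// a tuple's text on ',' without tracking quoting or nesting misreads any
// chararray value that itself contains the delimiter.
public class DelimiterDemo {
    static String[] naiveFields(String tupleText) {
        // strip the surrounding '(' and ')' and split on the field delimiter
        String inner = tupleText.substring(1, tupleText.length() - 1);
        return inner.split(",");
    }

    public static void main(String[] args) {
        // Intended as two fields: the chararray "hello, world" and the int 42.
        String[] fields = naiveFields("(hello, world,42)");
        System.out.println(fields.length);           // 3, not 2: the value was split
        System.out.println(Arrays.toString(fields)); // shows the mis-split fields
    }
}
```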
[jira] Created: (PIG-520) Physical plan cloning could lead to out of order connections
Physical plan cloning could lead to out of order connections Key: PIG-520 URL: https://issues.apache.org/jira/browse/PIG-520 Project: Pig Issue Type: Bug Affects Versions: types_branch Reporter: Santhosh Srinivasan Fix For: types_branch In the PhysicalPlan clone method, the algorithm used is as follows: 1. Create an empty plan 2. For all the operators in the plan, a. clone the operator b. add it to the plan 3. For all the keys (from_node) in the map mFromEdges a. For all the values (to_node) for this key i. Connect the from_node to the to_node in the plan Since there are no guarantees on the order in which the from_nodes in mFromEdges are processed, we could get out-of-order connections in the graph. Example: If we have a UDF with two arguments, like myUDF(a, b), in a plan, the order in which the nodes are processed will determine the cloned plan. We could end up with myUDF(a, b) OR myUDF(b, a) depending on the order in which a and b appear in the mFromEdges lookup table. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
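The root cause is iteration order over an unordered map. A stand-in sketch (not Pig's classes) shows how an insertion-ordered edge table makes the reconnection step deterministic:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for the mFromEdges lookup table, not Pig's classes.
// A HashMap iterates its keys in an order unrelated to insertion, so a clone
// built by walking it may reconnect myUDF(a, b) as myUDF(b, a). Walking an
// insertion-ordered map keeps the reconnection order deterministic.
public class EdgeOrderDemo {
    static List<String> reconnectionOrder(Map<String, String> fromEdges) {
        List<String> order = new ArrayList<>();
        for (Map.Entry<String, String> e : fromEdges.entrySet()) {
            order.add(e.getKey()); // connect from_node -> to_node in this order
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, String> edges = new LinkedHashMap<>(); // preserves insertion order
        edges.put("a", "myUDF");
        edges.put("b", "myUDF");
        // With an ordered map the clone always sees myUDF(a, b).
        System.out.println(reconnectionOrder(edges)); // [a, b]
    }
}
```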
[jira] Created: (PIG-522) Problem in using negative (-a)
Problem in using negative (-a) -- Key: PIG-522 URL: https://issues.apache.org/jira/browse/PIG-522 Project: Pig Issue Type: Bug Affects Versions: types_branch Reporter: Santhosh Srinivasan Fix For: types_branch Using negative, i.e., -a leads to exceptions. {code} grunt> a = load 'myfile' as (name:chararray, age:int, gpa:double); grunt> b = foreach a generate -gpa; grunt> dump b; 2008-11-10 16:38:12,517 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete 2008-11-10 16:38:37,539 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed 2008-11-10 16:38:37,540 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job failed! 2008-11-10 16:38:37,542 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200809241441_19426_m_00java.io.IOException: Received Error while processing the map plan. at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:197) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-449) Schemas for bags should contain tuples all the time
[ https://issues.apache.org/jira/browse/PIG-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646712#action_12646712 ] Santhosh Srinivasan commented on PIG-449: - Currently, bags in Pig are containers of tuples. Accessing elements inside a bag should translate to accessing elements inside the tuple contained in the bag. In addition, accessing tuples inside a bag should be restricted to the FLATTEN keyword in a FOREACH statement. A few examples shown below will demonstrate the point. {code} a = load '/user/pig/data/student.data' using PigStorage(' ') as (name, age, gpa); b = foreach a generate {(16, 4.0e-2, 'hello')} as b:{t:(i: int, d: double, c: chararray)}; c = foreach b generate b.i; -- Here b.i should generate a bag of integers by accessing the column called 'i' inside each tuple d = foreach b generate b.t; -- This should be outlawed as the tuple inside the bag does not have a column called 't', although the tuple inside the bag is named 't' {code} Summary: 1. The frontend should translate access to columns in a bag to columns inside the tuple in the bag 2. The frontend should prevent access to tuples inside the bag via projections and allow access only via the FLATTEN keyword Thoughts/suggestions/comments are welcome. > Schemas for bags should contain tuples all the time > --- > > Key: PIG-449 > URL: https://issues.apache.org/jira/browse/PIG-449 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > > The front end treats relations as operators that return bags. When the > schema of a load statement is specified, the bag is associated with the > schema specified by the user. Ideally, the schema corresponds to the tuple > contained in the bag. > With PIG-380, the schema for bag constants is computed by the front end. The > schema for the bag contains the tuple which in turn contains the schema of > the columns. 
This results in errors when columns are accessed directly just > like the load statements. > The front end should then treat access to the columns as a double > dereference, i.e., access the tuple inside the bag and then the column inside > the tuple. > {code} > grunt> a = load '/user/sms/data/student.data' using PigStorage(' ') as (name, > age, gpa); > grunt> b = foreach a generate {(16, 4.0e-2, 'hello')} as b:{t:(i: int, d: > double, c: chararray)}; > grunt> describe b; > b: {b: {t: (i: integer,d: double,c: chararray)}} > grunt> c = foreach b generate b.i; > 111064 [main] ERROR org.apache.pig.tools.grunt.GruntParser - > java.io.IOException: Invalid alias: i in {t: (i: integer,d: double,c: > chararray)} > at org.apache.pig.PigServer.parseQuery(PigServer.java:293) > at org.apache.pig.PigServer.registerQuery(PigServer.java:258) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:432) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:242) > at > org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:93) > at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58) > at org.apache.pig.Main.main(Main.java:282) > Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid > alias: i in {t: (i: integer,d: double,c: chararray)} > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5851) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5709) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.BracketedSimpleProj(QueryParser.java:5242) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:4040) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3909) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3863) > at > 
org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3772) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3698) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3664) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3590) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3500) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3457) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2933) > at >
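The access rule proposed in the comment above (projecting a column on a bag dereferences into each contained tuple) can be sketched with a toy model. The class and method names below are illustrative stand-ins, not Pig's actual Schema or projection API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Toy model of the PIG-449 proposal: a "tuple" is a column-name -> value map
// and a "bag" is a list of tuples. Projecting a column on the bag means
// projecting it inside every contained tuple.
public class BagProjection {

    public static List<Object> projectColumn(List<Map<String, Object>> bag, String column) {
        List<Object> projected = new ArrayList<>();
        for (Map<String, Object> tuple : bag) {
            if (!tuple.containsKey(column)) {
                // Mirrors the "Invalid alias" error: the name must be a column
                // of the inner tuple, not the tuple's own alias (e.g. 't').
                throw new IllegalArgumentException("Invalid alias: " + column);
            }
            projected.add(tuple.get(column));
        }
        return projected;
    }

    public static void main(String[] args) {
        // Analogue of b:{t:(i: int, d: double, c: chararray)} with one tuple.
        List<Map<String, Object>> bag =
            Arrays.asList(Map.of("i", 16, "d", 4.0e-2, "c", "hello"));
        System.out.println(projectColumn(bag, "i")); // b.i: a bag of the 'i' values
    }
}
```

Under this rule, `b.i` succeeds while `b.t` raises the error, matching the proposed semantics.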
[jira] Updated: (PIG-512) Expressions in foreach lead to errors
[ https://issues.apache.org/jira/browse/PIG-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-512: Patch Info: [Patch Available] > Expressions in foreach lead to errors > - > > Key: PIG-512 > URL: https://issues.apache.org/jira/browse/PIG-512 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-512.patch > > > Use of expressions that use the same sub-expressions in foreach lead to > translation errors. This issue is caused due to sharing operators across > nested plans. To remedy this issue, logical operators should be cloned and > not shared across plans. > {code} > grunt> a = load 'a' as (x, y, z); > grunt> b = foreach a { > >> exp1 = x + y; > >> exp2 = exp1 + x; > >> generate exp1, exp2; > >> } > grunt> explain b; > 2008-10-30 15:38:40,257 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > 2008-10-30 15:38:40,258 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > 2008-10-30 15:38:40,258 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > Logical Plan: > Store sms-Thu Oct 30 11:27:27 PDT 2008-2609 Schema: {double,double} Type: > Unknown > | > |---ForEach sms-Thu Oct 30 11:27:27 PDT 2008-2605 Schema: {double,double} > Type: bag > | | > | Add sms-Thu Oct 30 11:27:27 PDT 2008-2600 FieldSchema: double Type: > double > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2606 FieldSchema: double > Type: double > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2598 Projections: > [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2607 FieldSchema: double > Type: double > | | > | |---Project sms-Thu Oct 30 11:27:27 
PDT 2008-2599 Projections: > [1] Overloaded: false FieldSchema: y: bytearray Type: bytearray > | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | Add sms-Thu Oct 30 11:27:27 PDT 2008-2603 FieldSchema: double Type: > double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2601 Projections: [*] > Overloaded: false FieldSchema: double Type: double > | | Input: Add sms-Thu Oct 30 11:27:27 PDT 2008-2600| > | | |---Add sms-Thu Oct 30 11:27:27 PDT 2008-2600 FieldSchema: double > Type: double > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2598 > Projections: [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2599 > Projections: [1] Overloaded: false FieldSchema: y: bytearray Type: bytearray > | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2608 FieldSchema: double > Type: double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2602 Projections: > [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | > |---Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 Schema: {x: bytearray,y: > bytearray,z: bytearray} Type: bag > 2008-10-30 15:38:40,272 [main] ERROR org.apache.pig.impl.plan.OperatorPlan - > Attempt to give operator of type > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject > multiple outputs. This operator does not support multiple outputs. > 2008-10-30 15:38:40,272 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor > - Invalid physical operators in the physical planAttempt to give operator of > type > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject > multiple outputs. This operator does not support multiple outputs. 
> 2008-10-30 15:38:40,273 [main] ERROR org.apache.pig.tools.grunt.GruntParser - > java.io.IOException: Unable to explain alias b > [org.apache.pig.impl.plan.VisitorException] > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:235) > at org.apache.pig.PigServer.compilePp(PigServer.java:731) > at org.apache.pig.PigServer.explain(PigServer.java:495) > at > org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:155) > at > org.apache.pig.too
[jira] Updated: (PIG-512) Expressions in foreach lead to errors
[ https://issues.apache.org/jira/browse/PIG-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-512: Attachment: PIG-512.patch Attached patch (PIG-512.patch) includes the following: 1. Logical Plan cloning which in turn includes logical operator cloning. Caveat: Only logical plan cloning is allowed and LogicalPlanCloner is the supported mechanism for cloning logical plans. The following operators do not support cloning: i. LOLoad ii. LOStore iii. LOStream 2. A visitor to remove redundant project( * ) operators that occur between two relational operators or between two expression operators. 3. Unit tests for 1 and 2 All unit tests pass. > Expressions in foreach lead to errors > - > > Key: PIG-512 > URL: https://issues.apache.org/jira/browse/PIG-512 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-512.patch > > > Use of expressions that use the same sub-expressions in foreach lead to > translation errors. This issue is caused due to sharing operators across > nested plans. To remedy this issue, logical operators should be cloned and > not shared across plans. 
> {code} > grunt> a = load 'a' as (x, y, z); > grunt> b = foreach a { > >> exp1 = x + y; > >> exp2 = exp1 + x; > >> generate exp1, exp2; > >> } > grunt> explain b; > 2008-10-30 15:38:40,257 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > 2008-10-30 15:38:40,258 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > 2008-10-30 15:38:40,258 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > Logical Plan: > Store sms-Thu Oct 30 11:27:27 PDT 2008-2609 Schema: {double,double} Type: > Unknown > | > |---ForEach sms-Thu Oct 30 11:27:27 PDT 2008-2605 Schema: {double,double} > Type: bag > | | > | Add sms-Thu Oct 30 11:27:27 PDT 2008-2600 FieldSchema: double Type: > double > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2606 FieldSchema: double > Type: double > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2598 Projections: > [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2607 FieldSchema: double > Type: double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2599 Projections: > [1] Overloaded: false FieldSchema: y: bytearray Type: bytearray > | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | Add sms-Thu Oct 30 11:27:27 PDT 2008-2603 FieldSchema: double Type: > double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2601 Projections: [*] > Overloaded: false FieldSchema: double Type: double > | | Input: Add sms-Thu Oct 30 11:27:27 PDT 2008-2600| > | | |---Add sms-Thu Oct 30 11:27:27 PDT 2008-2600 FieldSchema: double > Type: double > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2598 > Projections: [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | | > | | |---Project 
sms-Thu Oct 30 11:27:27 PDT 2008-2599 > Projections: [1] Overloaded: false FieldSchema: y: bytearray Type: bytearray > | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2608 FieldSchema: double > Type: double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2602 Projections: > [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | > |---Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 Schema: {x: bytearray,y: > bytearray,z: bytearray} Type: bag > 2008-10-30 15:38:40,272 [main] ERROR org.apache.pig.impl.plan.OperatorPlan - > Attempt to give operator of type > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject > multiple outputs. This operator does not support multiple outputs. > 2008-10-30 15:38:40,272 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor > - Invalid physical operators in the physical planAttempt to give operator of > type > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject > multiple outputs. This operator does not support multiple outputs. > 2008-10-30 15:38:40
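The plan-cloning approach in item 1 of the patch can be illustrated with a small sketch: clone an operator DAG so that no operator instance is shared between the original and the copy. The `Op` class and `deepClone` method below are hypothetical stand-ins, not Pig's LogicalPlanCloner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy operator DAG cloner. A memo map guarantees each original operator is
// cloned exactly once per copy, so shared sub-expressions stay shared inside
// the clone but are never shared with the original plan.
public class PlanClone {

    public static class Op {
        public final String name;
        public final List<Op> inputs = new ArrayList<>();
        public Op(String name) { this.name = name; }
    }

    public static Op deepClone(Op op, Map<Op, Op> memo) {
        Op seen = memo.get(op);
        if (seen != null) return seen;        // already cloned: reuse the copy
        Op copy = new Op(op.name);
        memo.put(op, copy);
        for (Op input : op.inputs) copy.inputs.add(deepClone(input, memo));
        return copy;
    }

    public static void main(String[] args) {
        // exp1 = x + y; exp2 = exp1 + x  (x feeds both expressions)
        Op x = new Op("project:x"), y = new Op("project:y");
        Op exp1 = new Op("add");
        exp1.inputs.add(x); exp1.inputs.add(y);
        Op exp2 = new Op("add");
        exp2.inputs.add(exp1); exp2.inputs.add(x);
        Op cloned = deepClone(exp2, new HashMap<>());
        // The clone reuses no operator instance from the original plan.
        System.out.println(cloned != exp2 && cloned.inputs.get(1) != x); // true
    }
}
```

The memoization is what prevents the "multiple outputs" failure mode: a shared `project:x` is duplicated per plan copy rather than wired into two plans at once.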
[jira] Updated: (PIG-512) Expressions in foreach lead to errors
[ https://issues.apache.org/jira/browse/PIG-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-512: Attachment: PIG-512_1.patch Updated patch with SVN changes since last patch. All unit tests pass. > Expressions in foreach lead to errors > - > > Key: PIG-512 > URL: https://issues.apache.org/jira/browse/PIG-512 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-512.patch, PIG-512_1.patch > > > Use of expressions that use the same sub-expressions in foreach lead to > translation errors. This issue is caused due to sharing operators across > nested plans. To remedy this issue, logical operators should be cloned and > not shared across plans. > {code} > grunt> a = load 'a' as (x, y, z); > grunt> b = foreach a { > >> exp1 = x + y; > >> exp2 = exp1 + x; > >> generate exp1, exp2; > >> } > grunt> explain b; > 2008-10-30 15:38:40,257 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > 2008-10-30 15:38:40,258 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > 2008-10-30 15:38:40,258 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > Logical Plan: > Store sms-Thu Oct 30 11:27:27 PDT 2008-2609 Schema: {double,double} Type: > Unknown > | > |---ForEach sms-Thu Oct 30 11:27:27 PDT 2008-2605 Schema: {double,double} > Type: bag > | | > | Add sms-Thu Oct 30 11:27:27 PDT 2008-2600 FieldSchema: double Type: > double > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2606 FieldSchema: double > Type: double > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2598 Projections: > [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 
2008-2607 FieldSchema: double > Type: double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2599 Projections: > [1] Overloaded: false FieldSchema: y: bytearray Type: bytearray > | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | Add sms-Thu Oct 30 11:27:27 PDT 2008-2603 FieldSchema: double Type: > double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2601 Projections: [*] > Overloaded: false FieldSchema: double Type: double > | | Input: Add sms-Thu Oct 30 11:27:27 PDT 2008-2600| > | | |---Add sms-Thu Oct 30 11:27:27 PDT 2008-2600 FieldSchema: double > Type: double > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2598 > Projections: [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2599 > Projections: [1] Overloaded: false FieldSchema: y: bytearray Type: bytearray > | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2608 FieldSchema: double > Type: double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2602 Projections: > [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | > |---Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 Schema: {x: bytearray,y: > bytearray,z: bytearray} Type: bag > 2008-10-30 15:38:40,272 [main] ERROR org.apache.pig.impl.plan.OperatorPlan - > Attempt to give operator of type > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject > multiple outputs. This operator does not support multiple outputs. > 2008-10-30 15:38:40,272 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor > - Invalid physical operators in the physical planAttempt to give operator of > type > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject > multiple outputs. 
This operator does not support multiple outputs. > 2008-10-30 15:38:40,273 [main] ERROR org.apache.pig.tools.grunt.GruntParser - > java.io.IOException: Unable to explain alias b > [org.apache.pig.impl.plan.VisitorException] > at > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:235) > at org.apache.pig.PigServer.compilePp(PigServer.java:731) > at org.apache.pig.PigServer.explain(PigServer.java:495) > at > org.apache.pig.tools.gru
[jira] Created: (PIG-527) Pig does not support storing nested data using default storage
Pig does not support storing nested data using default storage -- Key: PIG-527 URL: https://issues.apache.org/jira/browse/PIG-527 Project: Pig Issue Type: Bug Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch Pig does not allow storing nested data using the default storage function (PigStorage) {code} grunt> a = load 'student_tab.data' as (name, age, gpa); grunt> b = group a by age; grunt> store b into '/user/sms/data/complex.data'; 2008-11-13 16:21:17,711 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete 2008-11-13 16:21:52,747 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed 2008-11-13 16:21:52,747 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job failed! 2008-11-13 16:21:52,764 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (reduce) task_200809241441_21188_r_00java.io.IOException: Cannot store a non-flat tuple using PigStorage at org.apache.pig.builtin.PigStorage.putNext(PigStorage.java:196) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:116) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:90) at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:300) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:238) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:224) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:136) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318) at 
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) 2008-11-13 16:21:52,764 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (reduce) task_200809241441_21188_r_00java.io.IOException: Cannot store a non-flat tuple using PigStorage {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
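A fix along the lines this issue calls for would recurse into nested fields instead of throwing. The sketch below is a simplified stand-in using plain Java lists in place of Pig's Tuple and DataBag classes; it is not the actual PIG-527 patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Simplified model of storing nested data: a tuple is a List<Object>, and a
// nested tuple or bag is just another List inside it. Instead of rejecting
// non-flat tuples, the writer recurses and wraps nested values in parentheses.
public class NestedStore {

    public static String serializeTuple(List<?> tuple, char delimiter) {
        StringJoiner joined = new StringJoiner(String.valueOf(delimiter), "(", ")");
        for (Object field : tuple) joined.add(serializeField(field, delimiter));
        return joined.toString();
    }

    private static String serializeField(Object field, char delimiter) {
        if (field instanceof List) {
            return serializeTuple((List<?>) field, delimiter);   // recurse
        }
        return String.valueOf(field);                            // atom
    }

    public static void main(String[] args) {
        // Analogue of b = group a by age: (age, {(name, age, gpa)})
        List<Object> grouped =
            Arrays.asList(20, Arrays.asList(Arrays.asList("alice", 20, 3.9)));
        System.out.println(serializeTuple(grouped, ',')); // (20,((alice,20,3.9)))
    }
}
```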
[jira] Created: (PIG-528) Schema returned in UDF is not used by Pig
Schema returned in UDF is not used by Pig - Key: PIG-528 URL: https://issues.apache.org/jira/browse/PIG-528 Project: Pig Issue Type: Bug Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch Using an identity UDF that returns the input schema as the output schema leads to schema truncation in Pig. {code} grunt> a = load '/tudent_tab.data' as (name, age, gpa); grunt> b = foreach a generate IdentityFunc(name, age); grunt> describe b; b: {name: bytearray} --It should have been b:{(name: bytearray, age: bytearray)} {code} The outputSchema method in IdentityFunc is given below: {code} @Override public Schema outputSchema(Schema input) { return input; } {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
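As a point of reference for the expected behavior, the toy below models a schema as a list of field descriptions: an identity outputSchema must return every input field, whereas the reported bug truncates to the first one. The class is illustrative only, not Pig's Schema or EvalFunc API.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the expected contract: an identity outputSchema preserves
// the whole input schema (here, a list of "name: type" field strings).
public class IdentitySchema {

    public static List<String> outputSchema(List<String> input) {
        return input; // identity: all fields in, all fields out
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("name: bytearray", "age: bytearray");
        // Expected describe output keeps both fields, unlike the truncated
        // "b: {name: bytearray}" reported above.
        System.out.println(outputSchema(input));
    }
}
```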
[jira] Commented: (PIG-512) Expressions in foreach lead to errors
[ https://issues.apache.org/jira/browse/PIG-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647728#action_12647728 ] Santhosh Srinivasan commented on PIG-512: - Yes, the visit(LOCross cs) can be removed from LogicalPlanCloneHelper.java. It's a placeholder in case we change LOCross to have additional member variables. For now, it's redundant. The change in the type checker is not related to the cloning. It's a bug that I uncovered while I was testing unary expressions as part of cloning. The insertCastForUniOp method in the typeChecker had a bug where the newly created cast operator was not added to the plan before inserting the cast between the unary expression and the unary expression's input. I fixed it by adding the cast operator to the plan and patching the reference in the unary expression to point to the cast. I would like to thank Pradeep Kamath who had done the ground work in an earlier attempt at cloning logical plans. > Expressions in foreach lead to errors > - > > Key: PIG-512 > URL: https://issues.apache.org/jira/browse/PIG-512 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-512.patch, PIG-512_1.patch > > > Use of expressions that use the same sub-expressions in foreach lead to > translation errors. This issue is caused due to sharing operators across > nested plans. To remedy this issue, logical operators should be cloned and > not shared across plans. 
> {code} > grunt> a = load 'a' as (x, y, z); > grunt> b = foreach a { > >> exp1 = x + y; > >> exp2 = exp1 + x; > >> generate exp1, exp2; > >> } > grunt> explain b; > 2008-10-30 15:38:40,257 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > 2008-10-30 15:38:40,258 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > 2008-10-30 15:38:40,258 [main] WARN org.apache.pig.PigServer - bytearray is > implicitly casted to double under LOAdd Operator > Logical Plan: > Store sms-Thu Oct 30 11:27:27 PDT 2008-2609 Schema: {double,double} Type: > Unknown > | > |---ForEach sms-Thu Oct 30 11:27:27 PDT 2008-2605 Schema: {double,double} > Type: bag > | | > | Add sms-Thu Oct 30 11:27:27 PDT 2008-2600 FieldSchema: double Type: > double > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2606 FieldSchema: double > Type: double > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2598 Projections: > [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2607 FieldSchema: double > Type: double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2599 Projections: > [1] Overloaded: false FieldSchema: y: bytearray Type: bytearray > | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | Add sms-Thu Oct 30 11:27:27 PDT 2008-2603 FieldSchema: double Type: > double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2601 Projections: [*] > Overloaded: false FieldSchema: double Type: double > | | Input: Add sms-Thu Oct 30 11:27:27 PDT 2008-2600| > | | |---Add sms-Thu Oct 30 11:27:27 PDT 2008-2600 FieldSchema: double > Type: double > | | | > | | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2598 > Projections: [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | | > | | |---Project 
sms-Thu Oct 30 11:27:27 PDT 2008-2599 > Projections: [1] Overloaded: false FieldSchema: y: bytearray Type: bytearray > | | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | | > | |---Cast sms-Thu Oct 30 11:27:27 PDT 2008-2608 FieldSchema: double > Type: double > | | > | |---Project sms-Thu Oct 30 11:27:27 PDT 2008-2602 Projections: > [0] Overloaded: false FieldSchema: x: bytearray Type: bytearray > | Input: Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 > | > |---Load sms-Thu Oct 30 11:27:27 PDT 2008-2597 Schema: {x: bytearray,y: > bytearray,z: bytearray} Type: bag > 2008-10-30 15:38:40,272 [main] ERROR org.apache.pig.impl.plan.OperatorPlan - > Attempt to give operator of type > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject > multiple outputs. This operator does not support multiple outputs. > 2008-10-30 15:38:40,272 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToP
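The insertCastForUniOp fix described in the comment above boils down to an add-before-connect invariant. The toy plan below (hypothetical API, not Pig's OperatorPlan) rejects edges to operators that were never added, which is exactly the failure mode the missing step produced.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy operator plan enforcing the invariant behind the fix: an operator must
// be registered with add() before connect() may create an edge to it.
public class CastInsertion {

    public static class Plan {
        public final Set<String> nodes = new HashSet<>();
        public final List<String[]> edges = new ArrayList<>();

        public void add(String op) { nodes.add(op); }

        public void connect(String from, String to) {
            if (!nodes.contains(from) || !nodes.contains(to)) {
                throw new IllegalStateException("operator not in plan: "
                        + (nodes.contains(from) ? to : from));
            }
            edges.add(new String[]{from, to});
        }
    }

    public static void main(String[] args) {
        Plan plan = new Plan();
        plan.add("project:x");
        plan.add("negate");
        // The fix: add the new cast to the plan first, then rewire the edges.
        plan.add("cast:double");
        plan.connect("project:x", "cast:double");
        plan.connect("cast:double", "negate");
        System.out.println(plan.nodes.size() + " operators, "
                + plan.edges.size() + " edges");
    }
}
```

Skipping the `plan.add("cast:double")` step before the connect calls reproduces the buggy path: the edge creation fails because the plan does not own the cast operator.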
[jira] Updated: (PIG-528) Schema returned in UDF is not used by Pig
[ https://issues.apache.org/jira/browse/PIG-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-528: Attachment: PIG-528.patch Attached patch (PIG-528.patch) contains the following: 1. Fix for handling schemas returned by UDFs 2. Unit test cases for the fix All unit test cases passed. > Schema returned in UDF is not used by Pig > - > > Key: PIG-528 > URL: https://issues.apache.org/jira/browse/PIG-528 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-528.patch > > > Using an identity UDF that returns the input schema as the output schema > leads to schema truncation in Pig. > {code} > grunt> a = load '/tudent_tab.data' as (name, age, gpa); > grunt> b = foreach a generate IdentityFunc(name, age); > grunt> describe b; > b: {name: bytearray} > --It should have been b:{(name: bytearray, age: bytearray)} > {code} > The outputSchema method in IdentityFunc is given below: > {code} > @Override > public Schema outputSchema(Schema input) { > return input; > } > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-528) Schema returned in UDF is not used by Pig
[ https://issues.apache.org/jira/browse/PIG-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-528: Patch Info: [Patch Available] > Schema returned in UDF is not used by Pig > - > > Key: PIG-528 > URL: https://issues.apache.org/jira/browse/PIG-528 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-528.patch > > > Using an identity UDF that returns the input schema as the output schema > leads to schema truncation in Pig. > {code} > grunt> a = load '/tudent_tab.data' as (name, age, gpa); > grunt> b = foreach a generate IdentityFunc(name, age); > grunt> describe b; > b: {name: bytearray} > --It should have been b:{(name: bytearray, age: bytearray)} > {code} > The outputSchema method in IdentityFunc is given below: > {code} > @Override > public Schema outputSchema(Schema input) { > return input; > } > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-528) Schema returned in UDF is not used by Pig
[ https://issues.apache.org/jira/browse/PIG-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-528: Attachment: PIG-528_1.patch Updated patch removing merge conflicts due to the earlier patch. > Schema returned in UDF is not used by Pig > - > > Key: PIG-528 > URL: https://issues.apache.org/jira/browse/PIG-528 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-528.patch, PIG-528_1.patch > > > Using an identity UDF that returns the input schema as the output schema > leads to schema truncation in Pig. > {code} > grunt> a = load '/tudent_tab.data' as (name, age, gpa); > grunt> b = foreach a generate IdentityFunc(name, age); > grunt> describe b; > b: {name: bytearray} > --It should have been b:{(name: bytearray, age: bytearray)} > {code} > The outputSchema method in IdentityFunc is given below: > {code} > @Override > public Schema outputSchema(Schema input) { > return input; > } > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-527) Pig does not support storing nested data using default storage
[ https://issues.apache.org/jira/browse/PIG-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-527: Attachment: PIG-527.patch Attached patch (PIG-527.patch) addresses the following: 1. PigStorage allows storage of nested data 2. Unit tests to test the same All unit tests pass > Pig does not support storing nested data using default storage > -- > > Key: PIG-527 > URL: https://issues.apache.org/jira/browse/PIG-527 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-527.patch > > > Pig does not allow storing nested data using the default storage function > (PigStorage) > {code} > grunt> a = load 'student_tab.data' as (name, age, gpa); > grunt> b = group a by age; > grunt> store b into '/user/sms/data/complex.data'; > 2008-11-13 16:21:17,711 [main] INFO > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher > - 50% complete > 2008-11-13 16:21:52,747 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher > - Map reduce job failed > 2008-11-13 16:21:52,747 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher > - Job failed! 
> 2008-11-13 16:21:52,764 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error > message from task (reduce) > task_200809241441_21188_r_00java.io.IOException: Cannot store a non-flat > tuple using PigStorage > at org.apache.pig.builtin.PigStorage.putNext(PigStorage.java:196) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:116) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:90) > at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:300) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:238) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:224) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:136) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) > 2008-11-13 16:21:52,764 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error > message from task (reduce) > task_200809241441_21188_r_00java.io.IOException: Cannot store a non-flat > tuple using PigStorage > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-527) Pig does not support storing nested data using default storage
[ https://issues.apache.org/jira/browse/PIG-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-527: Patch Info: [Patch Available] > Pig does not support storing nested data using default storage > -- > > Key: PIG-527 > URL: https://issues.apache.org/jira/browse/PIG-527 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-527.patch > > > Pig does not allow storing nested data using the default storage function > (PigStorage) > {code} > grunt> a = load 'student_tab.data' as (name, age, gpa); > grunt> b = group a by age; > grunt> store b into '/user/sms/data/complex.data'; > 2008-11-13 16:21:17,711 [main] INFO > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher > - 50% complete > 2008-11-13 16:21:52,747 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher > - Map reduce job failed > 2008-11-13 16:21:52,747 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher > - Job failed! 
> 2008-11-13 16:21:52,764 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error > message from task (reduce) > task_200809241441_21188_r_00java.io.IOException: Cannot store a non-flat > tuple using PigStorage > at org.apache.pig.builtin.PigStorage.putNext(PigStorage.java:196) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:116) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:90) > at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:300) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:238) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:224) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:136) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) > 2008-11-13 16:21:52,764 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error > message from task (reduce) > task_200809241441_21188_r_00java.io.IOException: Cannot store a non-flat > tuple using PigStorage > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-385) Should support 'null' as a constant
[ https://issues.apache.org/jira/browse/PIG-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648872#action_12648872 ] Santhosh Srinivasan commented on PIG-385: - The NULL constant can be used in any context where other constants or expressions are used. The difference between constants and NULL constants will be the type inference. The inferred type for NULL will be based on the context. For example, in the statement used in the bug report (shown below for reference), the type of null will be the same as the type of $0. By default, the type of null will be a bytearray. {code} B = foreach A generate $0 > 0 ? $0 : null; {code} Casting null - If the user chooses to, he/she can cast the null to the appropriate type. For example: {code} B = foreach A generate $0 > 0 ? $0 : (int)null; {code} Use of null with complex types --- Since complex types are made of simple types, the same rules (as stated above) apply. Null constants as map keys will be disallowed. Examples follow: {code} B = foreach A generate $0 > 0 ? $0 : {(null)}; -- here we have a bag with a tuple with a bytearray null constant C = foreach A generate [2#null]; -- a map constant with key 2 and value bytearray null D = foreach A generate [null#10]; -- error: maps cannot have null keys {code} Open questions -- 1. When nulls are stored using PigStorage and then read back using PigStorage, a distinction between the various types of null cannot be made. Thoughts/suggestions/comments welcome. > Should support 'null' as a constant > --- > > Key: PIG-385 > URL: https://issues.apache.org/jira/browse/PIG-385 > Project: Pig > Issue Type: New Feature > Components: impl >Affects Versions: types_branch >Reporter: Alan Gates >Priority: Minor > Fix For: types_branch > > > It would be nice to be able to do things like: > B = foreach A generate $0 > 0 ? $0 : null; > but right now null is not allowed as a constant. 
This null constant should > be allowed anywhere an expression would be, and should be castable (that is > (int)null). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-544) Utf8StorageConverter.java does not always produce NULLs when data is malformed
[ https://issues.apache.org/jira/browse/PIG-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650810#action_12650810 ] Santhosh Srinivasan commented on PIG-544: - Another use case where scalars also generate errors: {code} grunt> a = load 'student_tab.data'; grunt> store a into 'student_tab.bin' using BinStorage(); grunt> a = load 'student_tab.bin' using BinStorage() as (name: int, age: int, gpa: float); grunt> dump a; 2008-11-25 16:02:40,986 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200809241441_24635_m_00java.lang.RuntimeException : Unexpected data type 74 found in stream. at org.apache.pig.data.DataReaderWriter.readDatum(DataReaderWriter.java:115) at org.apache.pig.builtin.BinStorage.bytesToInteger(BinStorage.java:169) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:143) {code} > Utf8StorageConverter.java does not always produce NULLs when data is malformed > -- > > Key: PIG-544 > URL: https://issues.apache.org/jira/browse/PIG-544 > Project: Pig > Issue Type: Bug >Reporter: Olga Natkovich > > It does so for scalar types but not for complex types and not for the fields > inside of the complex types. > This is because it uses different code to parse scalar types by themselves > and scalar types inside of a complex type. It should really use the same (its > own) code to do so. > The code it currently uses is inside of TextDataParser.jjt and is also > used to parse constants, so we need to be careful if we want to make changes > to it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
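The "Unexpected data type 74" failure above is the classic symptom of a tagged binary format reading bytes that were not written in that format: the reader treats the first byte as a type tag and dispatches on it (74 is the ASCII code of 'J', i.e. plausibly a plain-text byte misread as a tag). A minimal sketch of that failure mode, and of the lenient return-null behavior this issue asks converters to adopt; the tag value and all names here are made up for the demo, not Pig's actual wire format:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// A toy tagged binary format: each datum is written with a one-byte type
// tag, and the reader dispatches on that tag. Feeding the reader bytes that
// were never written in this format makes the first byte look like a bogus
// type tag. Tag values and method names are illustrative only.
public class TaggedDatumDemo {
    static final byte INT_TAG = 1;

    public static byte[] writeInt(int value) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeByte(INT_TAG);
            out.writeInt(value);
            return bos.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Lenient reader: instead of failing the whole task on an unknown tag
    // (the strict behavior seen in the stack trace), return null for the
    // malformed datum -- the behavior PIG-544 asks for.
    public static Integer readIntOrNull(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte tag = in.readByte();
            if (tag != INT_TAG) {
                return null; // strict code would throw "Unexpected data type <tag>"
            }
            return in.readInt();
        } catch (IOException e) {
            return null; // truncated or empty input is also malformed
        }
    }

    public static void main(String[] args) {
        System.out.println(readIntOrNull(writeInt(42)));         // 42
        System.out.println(readIntOrNull("Joe\t18".getBytes())); // null ('J' == 74)
    }
}
```

Round-tripping through the writer succeeds, while tab-separated text fed to the binary reader yields null rather than a task-killing exception.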
[jira] Commented: (PIG-545) PERFORMANCE: Sampler for order bys does not produce a good distribution
[ https://issues.apache.org/jira/browse/PIG-545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650824#action_12650824 ] Santhosh Srinivasan commented on PIG-545: - The current sampler uses random sampling, assuming a uniform distribution of sort keys. Using a Poisson distribution will enable the sampler to estimate the expected value of the distribution without knowing the actual distribution. This should ensure a (more) even distribution of data across the reducers. > PERFORMANCE: Sampler for order bys does not produce a good distribution > --- > > Key: PIG-545 > URL: https://issues.apache.org/jira/browse/PIG-545 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch >Reporter: Alan Gates > Fix For: types_branch > > > In running tests on actual data, I've noticed that the final reduce of an > order by has skewed partitions. Some reduces finish in a few seconds while > some run for 20 minutes. Getting a better distribution should lead to much > better performance for order by. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
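The skew described above comes from splitting the key *range* evenly instead of the key *distribution*. The core idea of distribution-aware partitioning can be sketched as quantile-based boundary selection over a sample of the sort keys; this is an illustration only (class and method names are invented, and Pig's actual sampler is more involved):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical quantile-based partitioner: given a sample of the sort keys,
// choose (numReducers - 1) boundary keys so that each reducer receives
// roughly the same number of records, however skewed the key values are.
public class SampleRangePartitioner {

    // Boundary keys that split the sorted sample into numReducers
    // equal-sized ranges.
    public static List<Integer> boundaries(List<Integer> sample, int numReducers) {
        List<Integer> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        List<Integer> cuts = new ArrayList<>();
        for (int r = 1; r < numReducers; r++) {
            cuts.add(sorted.get(r * sorted.size() / numReducers));
        }
        return cuts;
    }

    // A skewed key sample for the demo: 1, 4, 9, ..., n*n.
    public static List<Integer> squares(int n) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 1; i <= n; i++) keys.add(i * i);
        return keys;
    }

    public static void main(String[] args) {
        // Splitting the key range [1, 10000] into four equal-width buckets
        // would put half of these keys in the first bucket; quantile cuts
        // give each reducer 25 of the 100 keys.
        System.out.println(boundaries(squares(100), 4)); // [676, 2601, 5776]
    }
}
```

The same shape holds for any comparable key type; only the sampling policy (e.g. the Poisson-based approach suggested above) changes how the sample is drawn.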
[jira] Commented: (PIG-549) type checking with order-by following user-defined function
[ https://issues.apache.org/jira/browse/PIG-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652153#action_12652153 ] Santhosh Srinivasan commented on PIG-549: - AFAIK, Pig does not support zero argument UDFs. In your script, UDF2() is the reason for the type checking error. > type checking with order-by following user-defined function > --- > > Key: PIG-549 > URL: https://issues.apache.org/jira/browse/PIG-549 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch > Environment: type checker fails here: > A = load ...; > B = foreach A generate UDF1(*), UDF2(); > C = order B by $1; > where UDF2() is of type EvalFunc. > I tried all sorts of things, including overriding outputSchema() of the UDF > to specify Integer, and also adding "as x : int" to the foreach command -- in > all cases I get the same error. >Reporter: Christopher Olston > Fix For: types_branch > > > Exception in thread "main" java.lang.AssertionError: Unsupported root type in > LOForEach:LOUserFunc > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2267) > at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:121) > at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:40) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) > at > org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101) > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40) > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30) > at > org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79) > at org.apache.pig.PigServer.compileLp(PigServer.java:684) > at 
org.apache.pig.PigServer.compileLp(PigServer.java:655) > at org.apache.pig.PigServer.store(PigServer.java:433) > at org.apache.pig.PigServer.store(PigServer.java:421) > at org.apache.pig.PigServer.openIterator(PigServer.java:384) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-552) UDF defined with argument causes class instantiation exception
[ https://issues.apache.org/jira/browse/PIG-552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652915#action_12652915 ] Santhosh Srinivasan commented on PIG-552: - A question related to the test case reported in the bug report: can you post the UDF? If not, can you confirm whether the UDF is missing a default constructor? Review comments: The patch ignores the problem and tries to proceed. This will lead to runtime issues as the class will not be instantiated in the backend. This is not what the user wants. It's probably a bug in the parser where the user-defined alias is not getting picked up. > UDF defined with argument causes class instantiation exception > -- > > Key: PIG-552 > URL: https://issues.apache.org/jira/browse/PIG-552 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Christopher Olston > Attachments: pig.patch > > > I'm doing: > define myFunc myFunc('blah'); > b = foreach a generate myFunc(*); > Pig parses it, but fails when it tries to run it on hadoop (I'm using "local" > mode). It tries to invoke the class loader on "myFunc('blah')" instead of on > "myFunc", which causes an exception. > The bug seems to stem from this part of JobControlCompiler.getJobConf(): > if(mro.UDFs.size()==1){ > String compFuncSpec = mro.UDFs.get(0); > Class comparator = > PigContext.resolveClassName(compFuncSpec); > if(ComparisonFunc.class.isAssignableFrom(comparator)) { > > jobConf.setMapperClass(PigMapReduce.MapWithComparator.class); > > jobConf.setReducerClass(PigMapReduce.ReduceWithComparator.class); > jobConf.set("pig.reduce.package", > ObjectSerializer.serialize(pack)); > jobConf.set("pig.usercomparator", "true"); > jobConf.setOutputKeyClass(NullableTuple.class); > jobConf.setOutputKeyComparatorClass(comparator); > } > } else { > jobConf.set("pig.sortOrder", > ObjectSerializer.serialize(mro.getSortOrder())); > } -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
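The default-constructor question above is easy to demonstrate with plain reflection: a backend that resolves a class by name and then calls its no-argument constructor cannot create a class that only defines an argument-taking constructor. The classes below are stand-ins invented for the demo, not Pig's actual types:

```java
// Shows why reflective instantiation by class name requires a no-argument
// constructor: newInstance() on the no-arg Constructor fails if the class
// only defines, say, ArgOnlyFunc(String). Names are illustrative only.
public class ReflectionDemo {

    // Stand-in for a UDF that only has a one-argument constructor.
    public static class ArgOnlyFunc {
        private final String arg;
        public ArgOnlyFunc(String arg) { this.arg = arg; }
        public String getArg() { return arg; }
    }

    // Mimics a backend's instantiation path: given a resolved Class object,
    // invoke its no-arg constructor.
    public static boolean canInstantiate(Class<?> clazz) {
        try {
            clazz.getDeclaredConstructor().newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException when no default constructor exists.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(String.class));      // true
        System.out.println(canInstantiate(ArgOnlyFunc.class)); // false
    }
}
```

This is why a UDF that accepts constructor arguments typically also needs a no-argument constructor (or a factory mechanism that records the arguments), so the backend can recreate it from its class name.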
[jira] Commented: (PIG-549) type checking with order-by following user-defined function
[ https://issues.apache.org/jira/browse/PIG-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652916#action_12652916 ] Santhosh Srinivasan commented on PIG-549: - Wrt my previous comment, Pig does not support zero-argument UDFs in foreach, but they are allowed in other places like Filter, Order by, etc. > type checking with order-by following user-defined function > --- > > Key: PIG-549 > URL: https://issues.apache.org/jira/browse/PIG-549 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch > Environment: type checker fails here: > A = load ...; > B = foreach A generate UDF1(*), UDF2(); > C = order B by $1; > where UDF2() is of type EvalFunc. > I tried all sorts of things, including overriding outputSchema() of the UDF > to specify Integer, and also adding "as x : int" to the foreach command -- in > all cases I get the same error. >Reporter: Christopher Olston > Fix For: types_branch > > > Exception in thread "main" java.lang.AssertionError: Unsupported root type in > LOForEach:LOUserFunc > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2267) > at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:121) > at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:40) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) > at > org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101) > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40) > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30) > at > org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79) > at org.apache.pig.PigServer.compileLp(PigServer.java:684) > at 
org.apache.pig.PigServer.compileLp(PigServer.java:655) > at org.apache.pig.PigServer.store(PigServer.java:433) > at org.apache.pig.PigServer.store(PigServer.java:421) > at org.apache.pig.PigServer.openIterator(PigServer.java:384) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (PIG-538) bincond can't work with flatten bags
[ https://issues.apache.org/jira/browse/PIG-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan reassigned PIG-538: --- Assignee: Pradeep Kamath (was: Santhosh Srinivasan) > bincond can't work with flatten bags > > > Key: PIG-538 > URL: https://issues.apache.org/jira/browse/PIG-538 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Olga Natkovich >Assignee: Pradeep Kamath > Fix For: types_branch > > > The following script is used with trunk code to simulate an outer join, which is not > directly supported by pig: > A = load '/studenttab10k' as (name: chararray, age: int, gpa: float); > B = load 'votertab10k' as (name: chararray, age: int, registration: > chararray, donation: float); > C = cogroup A by name, B by name; > D = foreach C generate group, (IsEmpty(A) ? '' : flatten(A)), (IsEmpty(B) ? > 'null' : flatten(B)); > On the types branch this gives a syntax error, and even beyond that it is not supported > since bincond requires that both expressions be of the same type. Santhosh > suggested having a special NULL expression that matches any type. This seems > to make sense. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-294) Parse errors for boolean conditions
[ https://issues.apache.org/jira/browse/PIG-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-294: Patch Info: [Patch Available] > Parse errors for boolean conditions > --- > > Key: PIG-294 > URL: https://issues.apache.org/jira/browse/PIG-294 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Attachments: boolean_test.patch > > > The parser throws exceptions for pig statements that contain boolean > conditions with operands that use string comparators. A sample statement to > reproduce the test is given below: > split a into b if name lt 'f', c if (name ge 'f' and name le 'h'), d if name > gt 'h'; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-290) LOCross output schema is not right
[ https://issues.apache.org/jira/browse/PIG-290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-290: Patch Info: [Patch Available] > LOCross output schema is not right > -- > > Key: PIG-290 > URL: https://issues.apache.org/jira/browse/PIG-290 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch >Reporter: Pi Song >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: insert_between.patch > > > From the schema generation code:- > {noformat} > List inputs = mPlan.getPredecessors(this); > for (LogicalOperator op : inputs) { > // Create schema here > } > {noformat} > The output schema is generated based on inputs determined in the logical > plan. However, mPlan.getPredecessors() doesn't always preserve the right > order (A x B and B x A result in different schemas). I suggest maintaining > mInputs variable in LOCross (as it used to be) to resolve this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
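The PIG-290 report above boils down to a simple property: a cross's output schema is the concatenation of its input schemas, so A x B and B x A produce different schemas, and an operator that rebuilds its input list from an unordered predecessor lookup can get them swapped. A minimal illustration of why the input order must be maintained explicitly (as the suggested mInputs variable would); all names here are invented for the demo:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Models a cross's schema as the concatenation of its inputs' field lists.
// Because concatenation is order-sensitive, the operator must remember its
// inputs in the order the user wrote them, not in whatever order a graph
// lookup happens to return them.
public class CrossSchemaDemo {

    public static List<String> crossSchema(List<List<String>> inputsInOrder) {
        List<String> out = new ArrayList<>();
        for (List<String> schema : inputsInOrder) {
            out.addAll(schema);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("name", "age");
        List<String> b = Arrays.asList("id");
        System.out.println(crossSchema(Arrays.asList(a, b))); // [name, age, id]
        System.out.println(crossSchema(Arrays.asList(b, a))); // [id, name, age]
    }
}
```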
[jira] Updated: (PIG-299) Filter operator not included in the main predecessor plan structure
[ https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-299: Patch Info: [Patch Available] > Filter operator not included in the main predecessor plan structure > --- > > Key: PIG-299 > URL: https://issues.apache.org/jira/browse/PIG-299 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch > Environment: N/A >Reporter: Tyson Condie >Assignee: Santhosh Srinivasan >Priority: Blocker > Fix For: types_branch > > Attachments: nested_project_as_foreach.patch > > > Take the following query, which can be found in TestLogicalPlanBuilder.java > method testQuery80(); > a = load 'input1' as (name, age, gpa); > b = filter a by age < '20';"); > c = group b by (name,age); > d = foreach c { > cf = filter b by gpa < '3.0'; > cp = cf.gpa; > cd = distinct cp; > co = order cd by gpa; > generate group, flatten(co); > }; > The filter statement 'cf = filter b by gpa < '3.0'' is not accessible via the > LogicalPlan::getPredecessor method. Here is the explan plan print out of the > inner foreach plan: > |---SORT Test-Plan-Builder-17 Schema: {gpa: bytearray} Type: bag > | | > | Project Test-Plan-Builder-16 Projections: [0] Overloaded: false > FieldSchema: gpa: bytearray cn: 2 Type: bytearray > | Input: Distinct Test-Plan-Builder-1 > | > |---Distinct Test-Plan-Builder-15 Schema: {gpa: bytearray} Type: bag > | > |---Project Test-Plan-Builder-14 Projections: [2] Overloaded: false > FieldSchema: gpa: bytearray cn: 2 Type: bytearray > Input: Project Test-Plan-Builder-13 Projections: [*] > Overloaded: false| > |---Project Test-Plan-Builder-13 Projections: [*] Overloaded: > false FieldSchema: cf: tuple({name: bytearray,age: bytearray,gpa: bytearray}) > Type: tuple > Input: Filter Test-Plan-Builder-12OPERATOR PROJECT SCHEMA > {name: bytearray,age: bytearray,gpa: bytearray} > As you can see the filter is only accessible via the > LOProject::getExpression() method. 
It is not showing up as an input operator. > Focus on the projection immediately following the filter. If I remove this > projection then I get a correct plan. For example, let the inner foreach plan > be as follows: > d = foreach c { > cf = filter b by gpa < '3.0'; > cd = distinct cf; > co = order cd by gpa; > generate group, flatten(co); > }; > Then I get the following (correct) explan plan output. > |---SORT Test-Plan-Builder-15 Schema: {name: bytearray,age: bytearray,gpa: > bytearray} Type: bag > | | > | Project Test-Plan-Builder-14 Projections: [2] Overloaded: false > FieldSchema: gpa: bytearray cn: 2 Type: bytearray > | Input: Distinct Test-Plan-Builder-1 > | > |---Distinct Test-Plan-Builder-13 Schema: {name: bytearray,age: > bytearray,gpa: bytearray} Type: bag > | > |---Filter Test-Plan-Builder-12 Schema: {name: bytearray,age: > bytearray,gpa: bytearray} Type: bag > | | > | LesserThan Test-Plan-Builder-11 FieldSchema: null Type: > Unknown > | | > | |---Project Test-Plan-Builder-9 Projections: [2] Overloaded: > false FieldSchema: Type: Unknown > | | Input: CoGroup Test-Plan-Builder-7 > | | > | |---Const Test-Plan-Builder-10 FieldSchema: chararray Type: > chararray > | > |---Project Test-Plan-Builder-8 Projections: [1] Overloaded: > false FieldSchema: b: bag({name: bytearray,age: bytearray,gpa: bytearray}) > Type: bag > Input: CoGroup Test-Plan-Builder-7OPERATOR PROJECT SCHEMA > {name: bytearray,age: bytearray,gpa: bytearray} > Alan said that the problem is we don't generate a foreach operator for the > 'cp = cf.gpa' statement. Please let me know if this can be resolved. > Thanks, > Tyson -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-323) Remove DEFINE from QueryParser
[ https://issues.apache.org/jira/browse/PIG-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-323: Patch Info: [Patch Available] > Remove DEFINE from QueryParser > -- > > Key: PIG-323 > URL: https://issues.apache.org/jira/browse/PIG-323 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan >Priority: Minor > Fix For: types_branch > > Attachments: remove_define_from_query_parser.patch > > > Remove the keyword DEFINE and the associated methods from QueryParser. The > syntax and semantics of define as proposed in the functional specification > breaks backward compatibility. The UDFs will now provide the list of function > arguments that are expected. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-320) The parser/type checker should use the getSchema method of UDFs to deduce return type/schema
[ https://issues.apache.org/jira/browse/PIG-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-320: Patch Info: [Patch Available] > The parser/type checker should use the getSchema method of UDFs to deduce > return type/schema > > > Key: PIG-320 > URL: https://issues.apache.org/jira/browse/PIG-320 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: udf_outputSchema.patch > > > Currently, the parser/type checker uses the getReturnType to deduce the > return type of the user defined function (UDF). This mechanism is > satisfactory only for basic types (int, long, ...); for composite types > (tuple, bag), the schema is also required.The abstract class EvalFunc > interface exposes the outputSchema to deduce the return type/schema of the > UDF. The parser/type checker should use this method to figure out the return > type/schema of the UDF and use it appropriately. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-421) error with complex nested plan
[ https://issues.apache.org/jira/browse/PIG-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-421: Patch Info: [Patch Available] > error with complex nested plan > -- > > Key: PIG-421 > URL: https://issues.apache.org/jira/browse/PIG-421 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Olga Natkovich >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG-421.patch, PIG-421_1.patch > > > Even after applying patch for PIG-398, the following query still fails: > a = load 'studenttab10k' as (name, age, gpa); > b = filter a by age < 20; > c = group b by age; > d = foreach c { > cf = filter b by gpa < 3.0; > cp = cf.gpa; > cd = distinct cp; > co = order cd by $0; > generate group, flatten(co); > } > store d into 'output'; -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-400) flatten causes schema naming problems
[ https://issues.apache.org/jira/browse/PIG-400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-400: Patch Info: [Patch Available] > flatten causes schema naming problems > - > > Key: PIG-400 > URL: https://issues.apache.org/jira/browse/PIG-400 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Olga Natkovich >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: PIG_400.patch > > > Script: > A = load 'data' as (name: chararray, age: chararray, gpa: float); > B = group A by (name, age); > C = foreach B generate flatten(group) as res, COUNT(A); > D = foreach C generate res; > dump D; > Error: > java.io.IOException: Invalid alias: res in {res::name: chararray,res::age: > chararray,long} > at org.apache.pig.PigServer.registerQuery(PigServer.java:255) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:422) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:82) > at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64) > at org.apache.pig.Main.main(Main.java:302) > Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid > alias: res in {res::name: chararray,res::age: chararray,long} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-158) Rework logical plan
[ https://issues.apache.org/jira/browse/PIG-158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-158: Patch Info: [Patch Available] Assignee: Santhosh Srinivasan (was: Alan Gates) > Rework logical plan > --- > > Key: PIG-158 > URL: https://issues.apache.org/jira/browse/PIG-158 > Project: Pig > Issue Type: Sub-task > Components: impl >Reporter: Alan Gates >Assignee: Santhosh Srinivasan > Attachments: cast_fix.patch, fully_qualified_typecast_fix.patch, > is_null.patch, logical_operators.patch, logical_operators_rev_1.patch, > logical_operators_rev_2.patch, logical_operators_rev_3.patch, > multiple_column_project.patch, overloaded_project_distinct.patch, > parser_changes.patch, parser_changes_v1.patch, parser_changes_v2.patch, > parser_changes_v3.patch, parser_changes_v4.patch, ParserErrors.txt, > udf_fix.patch, udf_funcSpec.patch, udf_return_type.patch, > user_func_and_store.patch, visitorWalker.patch > > > Rework the logical plan in line with > http://wiki.apache.org/pig/PigExecutionModel -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-158) Rework logical plan
[ https://issues.apache.org/jira/browse/PIG-158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan resolved PIG-158. - Resolution: Fixed All patches have been reviewed and checked in as part of the types branch work. > Rework logical plan > --- > > Key: PIG-158 > URL: https://issues.apache.org/jira/browse/PIG-158 > Project: Pig > Issue Type: Sub-task > Components: impl >Reporter: Alan Gates >Assignee: Santhosh Srinivasan > Attachments: cast_fix.patch, fully_qualified_typecast_fix.patch, > is_null.patch, logical_operators.patch, logical_operators_rev_1.patch, > logical_operators_rev_2.patch, logical_operators_rev_3.patch, > multiple_column_project.patch, overloaded_project_distinct.patch, > parser_changes.patch, parser_changes_v1.patch, parser_changes_v2.patch, > parser_changes_v3.patch, parser_changes_v4.patch, ParserErrors.txt, > udf_fix.patch, udf_funcSpec.patch, udf_return_type.patch, > user_func_and_store.patch, visitorWalker.patch > > > Rework the logical plan in line with > http://wiki.apache.org/pig/PigExecutionModel -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-159) Make changes to the parser to support new types functionality
[ https://issues.apache.org/jira/browse/PIG-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-159: Patch Info: [Patch Available] Assignee: Santhosh Srinivasan (was: Alan Gates) > Make changes to the parser to support new types functionality > - > > Key: PIG-159 > URL: https://issues.apache.org/jira/browse/PIG-159 > Project: Pig > Issue Type: Sub-task > Components: impl >Reporter: Alan Gates >Assignee: Santhosh Srinivasan > Attachments: parser_chages_v10.patch, parser_chages_v11.patch, > parser_chages_v12.patch, parser_chages_v13.patch, parser_chages_v5.patch, > parser_chages_v6.patch, parser_chages_v7.patch, parser_chages_v8.patch, > parser_chages_v9.patch > > > In order to support the new types functionality described in > http://wiki.apache.org/pig/PigTypesFunctionalSpec, the parser needs to change > in the following ways: > 1) AS needs to support types in addition to aliases. So where previously it > was legal to say: > a = load 'myfile' as a, b, c; > it will now also be legal to say > a = load 'myfile' as a integer, b float, c chararray; > 2) Non-string constants need to be supported. This includes non-string > atomic types (integer, long, float, double) and the non-atomic types bags, > tuples, and maps. > 3) A cast operator needs to be added so that fields can be explicitly cast. > 4) Changes to DEFINE, to allow users to declare arguments and return types > for UDFs -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-159) Make changes to the parser to support new types functionality
[ https://issues.apache.org/jira/browse/PIG-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan resolved PIG-159. - Resolution: Fixed All patches have been reviewed and checked in as part of the types branch rework. > Make changes to the parser to support new types functionality > - > > Key: PIG-159 > URL: https://issues.apache.org/jira/browse/PIG-159 > Project: Pig > Issue Type: Sub-task > Components: impl >Reporter: Alan Gates >Assignee: Santhosh Srinivasan > Attachments: parser_chages_v10.patch, parser_chages_v11.patch, > parser_chages_v12.patch, parser_chages_v13.patch, parser_chages_v5.patch, > parser_chages_v6.patch, parser_chages_v7.patch, parser_chages_v8.patch, > parser_chages_v9.patch > > > In order to support the new types functionality described in > http://wiki.apache.org/pig/PigTypesFunctionalSpec, the parser needs to change > in the following ways: > 1) AS needs to support types in addition to aliases. So where previously it > was legal to say: > a = load 'myfile' as a, b, c; > it will now also be legal to say > a = load 'myfile' as a integer, b float, c chararray; > 2) Non-string constants need to be supported. This includes non-string > atomic types (integer, long, float, double) and the non-atomic types bags, > tuples, and maps. > 3) A cast operator needs to be added so that fields can be explicitly cast. > 4) Changes to DEFINE, to allow users to declare arguments and return types > for UDFs -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-552) UDF defined with argument causes class instantiation exception
[ https://issues.apache.org/jira/browse/PIG-552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653027#action_12653027 ] Santhosh Srinivasan commented on PIG-552: - Sort UDFs have to be CompareFunc and not EvalFunc. > UDF defined with argument causes class instantiation exception > -- > > Key: PIG-552 > URL: https://issues.apache.org/jira/browse/PIG-552 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Christopher Olston > Attachments: pig.patch > > > I'm doing: > define myFunc myFunc('blah'); > b = foreach a generate myFunc(*); > Pig parses it, but fails when it tries to run it on hadoop (I'm using "local" > mode). It tries to invoke the class loader on "myFunc('blah')" instead of on > "myFunc", which causes an exception. > The bug seems to stem from this part of JobControlCompiler.getJobConf(): > if(mro.UDFs.size()==1){ > String compFuncSpec = mro.UDFs.get(0); > Class comparator = > PigContext.resolveClassName(compFuncSpec); > if(ComparisonFunc.class.isAssignableFrom(comparator)) { > > jobConf.setMapperClass(PigMapReduce.MapWithComparator.class); > > jobConf.setReducerClass(PigMapReduce.ReduceWithComparator.class); > jobConf.set("pig.reduce.package", > ObjectSerializer.serialize(pack)); > jobConf.set("pig.usercomparator", "true"); > jobConf.setOutputKeyClass(NullableTuple.class); > jobConf.setOutputKeyComparatorClass(comparator); > } > } else { > jobConf.set("pig.sortOrder", > ObjectSerializer.serialize(mro.getSortOrder())); > } -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-549) type checking with order-by following user-defined function
[ https://issues.apache.org/jira/browse/PIG-549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-549: Description: Exception in thread "main" java.lang.AssertionError: Unsupported root type in LOForEach:LOUserFunc at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2267) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:121) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:40) at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) at org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101) at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40) at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30) at org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79) at org.apache.pig.PigServer.compileLp(PigServer.java:684) at org.apache.pig.PigServer.compileLp(PigServer.java:655) at org.apache.pig.PigServer.store(PigServer.java:433) at org.apache.pig.PigServer.store(PigServer.java:421) at org.apache.pig.PigServer.openIterator(PigServer.java:384) was: Exception in thread "main" java.lang.AssertionError: Unsupported root type in LOForEach:LOUserFunc at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2267) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:121) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:40) at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) at org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101) at 
org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40) at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30) at org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79) at org.apache.pig.PigServer.compileLp(PigServer.java:684) at org.apache.pig.PigServer.compileLp(PigServer.java:655) at org.apache.pig.PigServer.store(PigServer.java:433) at org.apache.pig.PigServer.store(PigServer.java:421) at org.apache.pig.PigServer.openIterator(PigServer.java:384) Issue Type: Improvement (was: Bug) > type checking with order-by following user-defined function > --- > > Key: PIG-549 > URL: https://issues.apache.org/jira/browse/PIG-549 > Project: Pig > Issue Type: Improvement >Affects Versions: types_branch > Environment: type checker fails here: > A = load ...; > B = foreach A generate UDF1(*), UDF2(); > C = order B by $1; > where UDF2() is of type EvalFunc. > I tried all sorts of things, including overriding outputSchema() of the UDF > to specify Integer, and also adding "as x : int" to the foreach command -- in > all cases I get the same error. 
>Reporter: Christopher Olston > Fix For: types_branch > > > Exception in thread "main" java.lang.AssertionError: Unsupported root type in > LOForEach:LOUserFunc > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2267) > at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:121) > at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:40) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) > at > org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101) > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40) > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30) > at > org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79) > at org.apache.pig.PigServer.compileLp(PigServer.java:684) > at org.apache.pig.PigServer.compileLp(PigServer.java:655) > at org.apache.pig.PigServer.store(PigServer.java:433) > at org.apache.pig.PigServer.store(PigServer.java
[jira] Commented: (PIG-549) type checking with order-by following user-defined function
[ https://issues.apache.org/jira/browse/PIG-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653028#action_12653028 ] Santhosh Srinivasan commented on PIG-549: - Sure, we should allow that. I will mark this an enhancement request. > type checking with order-by following user-defined function > --- > > Key: PIG-549 > URL: https://issues.apache.org/jira/browse/PIG-549 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch > Environment: type checker fails here: > A = load ...; > B = foreach A generate UDF1(*), UDF2(); > C = order B by $1; > where UDF2() is of type EvalFunc. > I tried all sorts of things, including overriding outputSchema() of the UDF > to specify Integer, and also adding "as x : int" to the foreach command -- in > all cases I get the same error. >Reporter: Christopher Olston > Fix For: types_branch > > > Exception in thread "main" java.lang.AssertionError: Unsupported root type in > LOForEach:LOUserFunc > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2267) > at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:121) > at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:40) > at > org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) > at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) > at > org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101) > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40) > at > org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30) > at > org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79) > at org.apache.pig.PigServer.compileLp(PigServer.java:684) > at org.apache.pig.PigServer.compileLp(PigServer.java:655) > at 
org.apache.pig.PigServer.store(PigServer.java:433) > at org.apache.pig.PigServer.store(PigServer.java:421) > at org.apache.pig.PigServer.openIterator(PigServer.java:384) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-550) java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
[ https://issues.apache.org/jira/browse/PIG-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653424#action_12653424 ] Santhosh Srinivasan commented on PIG-550: - The issue is similar to the bug reported in PIG-449. > java.lang.ClassCastException: java.lang.String cannot be cast to > org.apache.pig.data.Tuple > -- > > Key: PIG-550 > URL: https://issues.apache.org/jira/browse/PIG-550 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Viraj Bhat > Fix For: types_branch > > > == > Map tasks resulting from the below Pig Script throws the following exception. > Note 'one' is a dummy input containing, number 1. > == > {code} > A = load 'one' using PigStorage() as ( one ); > B = foreach A generate > { > ( > ('p1-t1-e1', 'p1-t1-e2'), > ('p1-t2-e1', 'p1-t2-e2') > ), > ( > ('p2-t1-e1', 'p2-t1-e2'), > ('p2-t2-e1', 'p2-t2-e2') > ) > }; > describe B; > C = foreach B generate > $0 as pairbag { pair: ( t1: (e1, e2), t2: (e1, e2) ) }; describe C; > D = foreach C generate FLATTEN(pairbag); > describe D; > E = foreach D generate > pair.t1.e2 as t1e2, > pair.t2.e1 as t2e1; > describe E; > dump E; > {code} > == > 2008-12-01 20:07:53,974 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error > message from task (map) > task_200810152105_0207_m_00java.lang.ClassCastException: java.lang.String > cannot be cast to org.apache.pig.data.Tuple > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:279) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:133) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:233) > at > 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:180) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209) > == -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-552) UDF defined with argument causes class instantiation exception
[ https://issues.apache.org/jira/browse/PIG-552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653429#action_12653429 ] Santhosh Srinivasan commented on PIG-552: - Sorry about the reference to ComparisonFunc, I was looking at your issue - PIG-549 and had that in mind. I was not able to reproduce your scenario. I tried the following script and it worked. {code} grunt> define mapUdf MapUDF('world'); grunt> RAW_LOGS = load '/user/sms/data/mydata.txt' as (url:chararray, numvisits:int); grunt> b = foreach RAW_LOGS generate mapUdf(*); {code} > UDF defined with argument causes class instantiation exception > -- > > Key: PIG-552 > URL: https://issues.apache.org/jira/browse/PIG-552 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Christopher Olston > Attachments: pig.patch > > > I'm doing: > define myFunc myFunc('blah'); > b = foreach a generate myFunc(*); > Pig parses it, but fails when it tries to run it on hadoop (I'm using "local" > mode). It tries to invoke the class loader on "myFunc('blah')" instead of on > "myFunc", which causes an exception. > The bug seems to stem from this part of JobControlCompiler.getJobConf(): > if(mro.UDFs.size()==1){ > String compFuncSpec = mro.UDFs.get(0); > Class comparator = > PigContext.resolveClassName(compFuncSpec); > if(ComparisonFunc.class.isAssignableFrom(comparator)) { > > jobConf.setMapperClass(PigMapReduce.MapWithComparator.class); > > jobConf.setReducerClass(PigMapReduce.ReduceWithComparator.class); > jobConf.set("pig.reduce.package", > ObjectSerializer.serialize(pack)); > jobConf.set("pig.usercomparator", "true"); > jobConf.setOutputKeyClass(NullableTuple.class); > jobConf.setOutputKeyComparatorClass(comparator); > } > } else { > jobConf.set("pig.sortOrder", > ObjectSerializer.serialize(mro.getSortOrder())); > } -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-550) java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
[ https://issues.apache.org/jira/browse/PIG-550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan resolved PIG-550. - Resolution: Duplicate Marking it a duplicate of Pig-449. The resolution in Pig-449 will fix the issue reported in this bug. > java.lang.ClassCastException: java.lang.String cannot be cast to > org.apache.pig.data.Tuple > -- > > Key: PIG-550 > URL: https://issues.apache.org/jira/browse/PIG-550 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Viraj Bhat > Fix For: types_branch > > > == > Map tasks resulting from the below Pig Script throws the following exception. > Note 'one' is a dummy input containing, number 1. > == > {code} > A = load 'one' using PigStorage() as ( one ); > B = foreach A generate > { > ( > ('p1-t1-e1', 'p1-t1-e2'), > ('p1-t2-e1', 'p1-t2-e2') > ), > ( > ('p2-t1-e1', 'p2-t1-e2'), > ('p2-t2-e1', 'p2-t2-e2') > ) > }; > describe B; > C = foreach B generate > $0 as pairbag { pair: ( t1: (e1, e2), t2: (e1, e2) ) }; describe C; > D = foreach C generate FLATTEN(pairbag); > describe D; > E = foreach D generate > pair.t1.e2 as t1e2, > pair.t2.e1 as t2e1; > describe E; > dump E; > {code} > == > 2008-12-01 20:07:53,974 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error > message from task (map) > task_200810152105_0207_m_00java.lang.ClassCastException: java.lang.String > cannot be cast to org.apache.pig.data.Tuple > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:279) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:133) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:233) > at > 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:180) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209) > == -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (PIG-550) java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
[ https://issues.apache.org/jira/browse/PIG-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653445#action_12653445 ] sms edited comment on PIG-550 at 12/4/08 12:02 PM: --- Marking it a duplicate of PIG-449. The resolution in PIG-449 will fix the issue reported in this bug. was (Author: sms): Marking it a duplicate of Pig-449. The resolution in Pig-449 will fix the issue reported in this bug. > java.lang.ClassCastException: java.lang.String cannot be cast to > org.apache.pig.data.Tuple > -- > > Key: PIG-550 > URL: https://issues.apache.org/jira/browse/PIG-550 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Viraj Bhat > Fix For: types_branch > > > == > Map tasks resulting from the below Pig Script throws the following exception. > Note 'one' is a dummy input containing, number 1. > == > {code} > A = load 'one' using PigStorage() as ( one ); > B = foreach A generate > { > ( > ('p1-t1-e1', 'p1-t1-e2'), > ('p1-t2-e1', 'p1-t2-e2') > ), > ( > ('p2-t1-e1', 'p2-t1-e2'), > ('p2-t2-e1', 'p2-t2-e2') > ) > }; > describe B; > C = foreach B generate > $0 as pairbag { pair: ( t1: (e1, e2), t2: (e1, e2) ) }; describe C; > D = foreach C generate FLATTEN(pairbag); > describe D; > E = foreach D generate > pair.t1.e2 as t1e2, > pair.t2.e1 as t2e1; > describe E; > dump E; > {code} > == > 2008-12-01 20:07:53,974 [main] ERROR > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error > message from task (map) > task_200810152105_0207_m_00java.lang.ClassCastException: java.lang.String > cannot be cast to org.apache.pig.data.Tuple > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:279) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:133) 
> at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:233) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:180) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209) > == -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-546) FilterFunc calls empty constructor when it should be calling parameterized constructor
[ https://issues.apache.org/jira/browse/PIG-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-546: Patch Info: [Patch Available] > FilterFunc calls empty constructor when it should be calling parameterized > constructor > -- > > Key: PIG-546 > URL: https://issues.apache.org/jira/browse/PIG-546 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch >Reporter: Viraj Bhat > Fix For: types_branch > > Attachments: FILTERFROMFILE.java, insetfilterfile, mydata.txt, > PIG-546.patch > > > The following piece of Pig Script uses a custom UDF known as FILTERFROMFILE > which extends the FilterFunc. It contains two constructors, an empty > constructor which is mandatory and the parameterized constructor. The > parameterized constructor passes the HDFS filename, which the exec function > uses to construct a HashMap. The HashMap is later used for filtering records > based on the match criteria in the HDFS file. > {code} > register util.jar; > --util.jar contains the FILTERFROMFILE class > define FILTER_CRITERION util.FILTERFROMFILE('/user/viraj/insetfilterfile'); > RAW_LOGS = load 'mydata.txt' as (url:chararray, numvisits:int); > FILTERED_LOGS = filter RAW_LOGS by FILTER_CRITERION(numvisits); > dump FILTERED_LOGS; > {code} > When you execute the above script, it results in a single Map only job with > 1 Map. It seems that the empty constructor is called 5 times, and ultimately > results in failure of the job. 
> === > parameterized constructor: /user/viraj/insetfilterfile > parameterized constructor: /user/viraj/insetfilterfile > empty constructor > empty constructor > empty constructor > empty constructor > empty constructor > === > Error in the Hadoop backend > === > java.lang.IllegalArgumentException: Can not create a Path from an empty string > at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82) > at org.apache.hadoop.fs.Path.(Path.java:90) > at > org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:199) > at > org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:130) > at > org.apache.pig.impl.io.FileLocalizer.openDFSFile(FileLocalizer.java:164) > at util.FILTERFROMFILE.init(FILTERFROMFILE.java:70) > at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:89) > at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:52) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:179) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:217) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209) > === > Attaching the sample data and the filter function UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-546) FilterFunc calls empty constructor when it should be calling parameterized constructor
[ https://issues.apache.org/jira/browse/PIG-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-546: Attachment: PIG-546.patch The patch (PIG-546.patch) addresses the following issue(s): 1. Fixes the use of an alias declared via the define statement and the subsequent use in i. Filter functions ii. Load functions iii. Store functions iv. Order by functions v. Streaming specifications (input and output) 2. New unit test cases for the parser, end-to-end test cases for streaming and filter udf have been added. Note: There are no end-to-end test cases for order by using a UDF. All unit test cases pass. > FilterFunc calls empty constructor when it should be calling parameterized > constructor > -- > > Key: PIG-546 > URL: https://issues.apache.org/jira/browse/PIG-546 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch >Reporter: Viraj Bhat > Fix For: types_branch > > Attachments: FILTERFROMFILE.java, insetfilterfile, mydata.txt, > PIG-546.patch > > > The following piece of Pig Script uses a custom UDF known as FILTERFROMFILE > which extends the FilterFunc. It contains two constructors, an empty > constructor which is mandatory and the parameterized constructor. The > parameterized constructor passes the HDFS filename, which the exec function > uses to construct a HashMap. The HashMap is later used for filtering records > based on the match criteria in the HDFS file. > {code} > register util.jar; > --util.jar contains the FILTERFROMFILE class > define FILTER_CRITERION util.FILTERFROMFILE('/user/viraj/insetfilterfile'); > RAW_LOGS = load 'mydata.txt' as (url:chararray, numvisits:int); > FILTERED_LOGS = filter RAW_LOGS by FILTER_CRITERION(numvisits); > dump FILTERED_LOGS; > {code} > When you execute the above script, it results in a single Map only job with > 1 Map. It seems that the empty constructor is called 5 times, and ultimately > results in failure of the job. 
> === > parameterized constructor: /user/viraj/insetfilterfile > parameterized constructor: /user/viraj/insetfilterfile > empty constructor > empty constructor > empty constructor > empty constructor > empty constructor > === > Error in the Hadoop backend > === > java.lang.IllegalArgumentException: Can not create a Path from an empty string > at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82) > at org.apache.hadoop.fs.Path.(Path.java:90) > at > org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:199) > at > org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:130) > at > org.apache.pig.impl.io.FileLocalizer.openDFSFile(FileLocalizer.java:164) > at util.FILTERFROMFILE.init(FILTERFROMFILE.java:70) > at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:89) > at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:52) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:179) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:217) > at > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209) > === > Attaching the sample data and the filter function UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-562) Command line parameters should have higher precedence than parameter files during pre-processing of parameters
Command line parameters should have higher precedence than parameter files during pre-processing of parameters -- Key: PIG-562 URL: https://issues.apache.org/jira/browse/PIG-562 Project: Pig Issue Type: Improvement Components: tools Affects Versions: types_branch Reporter: Santhosh Srinivasan Priority: Minor Fix For: types_branch In parameter substitution, the processing order is stated as follows: Processing Order 1. Configuration files are scanned in the order they are specified on the command line. Within each file, the parameters are processed in the order they are specified. 2. Command line parameters are scanned in the order they are specified on the command line. The order needs to be flipped, allowing command line parameters to override values for variables declared in parameter files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
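The requested behavior amounts to "last writer wins" with the command line applied last. As a rough, hypothetical sketch of that precedence rule (illustrative only, not Pig's actual parameter-substitution code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParamPrecedenceSketch {
    // Parameter files are applied first, command-line parameters last,
    // so a command-line value overrides a value declared in a file.
    static Map<String, String> resolve(Map<String, String> fromFiles,
                                       Map<String, String> fromCommandLine) {
        Map<String, String> resolved = new LinkedHashMap<>(fromFiles);
        resolved.putAll(fromCommandLine); // the later source wins on key collisions
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("date", "20081201");
        files.put("dir", "/data");
        Map<String, String> cli = new LinkedHashMap<>();
        cli.put("date", "20081215"); // should override the parameter file's value
        System.out.println(resolve(files, cli));
    }
}
```

With the flipped order, the command-line `date` survives and the file-only `dir` is kept.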
[jira] Created: (PIG-566) Dump and store outputs do not match for PigStorage
Dump and store outputs do not match for PigStorage -- Key: PIG-566 URL: https://issues.apache.org/jira/browse/PIG-566 Project: Pig Issue Type: Bug Affects Versions: types_branch Reporter: Santhosh Srinivasan Priority: Minor Fix For: types_branch The dump and store formats for PigStorage do not match for longs and floats. {code} grunt> y = foreach x generate {(2985671202194220139L)}; grunt> describe y; y: {{(long)}} grunt> dump y; ({(2985671202194220139L)}) grunt> store y into 'y'; grunt> cat y {(2985671202194220139)} {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
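The mismatch is purely one of formatting: dump appends the long suffix while the stored file carries the bare value. A hypothetical illustration of the two styles (not the actual Pig formatting code):

```java
public class DumpVsStoreSketch {
    // Interactive dump-style rendering appends the 'L' suffix to longs.
    static String dumpFormat(long v) { return v + "L"; }
    // Stored-file rendering writes the raw decimal value, with no suffix.
    static String storeFormat(long v) { return Long.toString(v); }

    public static void main(String[] args) {
        long v = 2985671202194220139L;
        System.out.println(dumpFormat(v));
        System.out.println(storeFormat(v));
    }
}
```

Reading the stored value back therefore no longer matches the dumped form character for character.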
[jira] Created: (PIG-567) Handling strings and exceptions in text data parser
Handling strings and exceptions in text data parser --- Key: PIG-567 URL: https://issues.apache.org/jira/browse/PIG-567 Project: Pig Issue Type: Bug Affects Versions: types_branch Reporter: Santhosh Srinivasan Priority: Minor Fix For: types_branch The text data parser treats a sequence of numerals as an integer. If the data is too long to fit into an integer, then a number format exception is thrown and no attempts are made to convert the data to a higher type. A couple of questions arise: 1. Should strings be annotated with delimiters like quotes to distinguish them from numbers? 2. Should conversions to higher types or strings be attempted? The conversions have performance implications. {noformat} Data file: {(2985671202194220139L)} Pig script: a = load 'data' as (list: bag{t: tuple(value: chararray)}); dump a Output: 2008-12-13 09:08:24,831 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Unable to open iterator for alias: a [Unable to store for alias: a [For input string: "2985671202194220139"]] at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:178) at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:647) at org.apache.pig.PigServer.store(PigServer.java:452) at org.apache.pig.PigServer.store(PigServer.java:421) at org.apache.pig.PigServer.openIterator(PigServer.java:384) at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178) at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:94) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58) at org.apache.pig.Main.main(Main.java:282) Caused by: java.io.IOException: Unable to store for alias: a [For input string: "2985671202194220139"] ... 10 more Caused by: org.apache.pig.backend.executionengine.ExecException: For input string: "2985671202194220139" ... 
10 more Caused by: java.lang.NumberFormatException: For input string: "2985671202194220139" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Integer.parseInt(Integer.java:459) at java.lang.Integer.parseInt(Integer.java:497) at org.apache.pig.data.parser.TextDataParser.AtomDatum(TextDataParser.java:291) at org.apache.pig.data.parser.TextDataParser.Datum(TextDataParser.java:359) at org.apache.pig.data.parser.TextDataParser.Tuple(TextDataParser.java:149) at org.apache.pig.data.parser.TextDataParser.Bag(TextDataParser.java:85) at org.apache.pig.data.parser.TextDataParser.Datum(TextDataParser.java:345) at org.apache.pig.data.parser.TextDataParser.Parse(TextDataParser.java:42) at org.apache.pig.builtin.Utf8StorageConverter.parseFromBytes(Utf8StorageConverter.java:70) at org.apache.pig.builtin.Utf8StorageConverter.bytesToBag(Utf8StorageConverter.java:78) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:861) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:243) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:197) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.store(POStore.java:137) at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:62) at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:166) ... 
9 more 2008-12-13 09:08:24,833 [main] ERROR org.apache.pig.tools.grunt.GruntParser - Unable to open iterator for alias: a [Unable to store for alias: a [For input string: "2985671202194220139"]] 2008-12-13 09:08:24,834 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Unable to open iterator for alias: a [Unable to store for alias: a [For input string: "2985671202194220139"]] {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
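One possible answer to question 2 above is a widening fallback: try the narrow type first and widen only on failure. A hypothetical helper (not the actual TextDataParser code) sketching that idea:

```java
public class AtomParseSketch {
    // Try int first; widen to long if the value overflows int;
    // otherwise keep the raw string as-is.
    static Object parseAtom(String s) {
        try {
            return Integer.valueOf(s);
        } catch (NumberFormatException tooBigForInt) {
            try {
                return Long.valueOf(s);
            } catch (NumberFormatException notNumeric) {
                return s;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAtom("42").getClass().getSimpleName());
        System.out.println(parseAtom("2985671202194220139").getClass().getSimpleName());
        System.out.println(parseAtom("hello").getClass().getSimpleName());
    }
}
```

The cost is an exception-driven retry on every overflowing value, which is the performance concern the issue raises.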
[jira] Commented: (PIG-575) Please extend FieldSchema class with getSchema() member function for iterating over complex Schemas in Pig UDF outputSchema
[ https://issues.apache.org/jira/browse/PIG-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658625#action_12658625 ] Santhosh Srinivasan commented on PIG-575: - The FieldSchema member variable schema is public. It can be accessed directly without a getSchema() method, although having the method could make the code cleaner. > Please extend FieldSchema class with getSchema() member function for > iterating over complex Schemas in Pig UDF outputSchema > --- > > Key: PIG-575 > URL: https://issues.apache.org/jira/browse/PIG-575 > Project: Pig > Issue Type: Improvement > Components: impl >Reporter: David Ciemiewicz >Priority: Minor > > I have discovered that it is not possible to recurse through parts of the > input Schema in the UDF outputSchema function. > I have a function that operates on an input bag of tuples and then creates > sequential pairings of the rows. > A = foreach One generate { > ( 1, a ), > ( 2, b ) > } as bag { tuple ( seq: int, value: chararray ) }; > The output of the PAIRS(A) should be: > { > ( ( 1, a ), ( 2, b ) ), > ( ( 2, b ), ( null, null ) ) > } > The default output schema for the function should be: > bag { tuple ( tuple ( order: int, value: chararray ), tuple ( order: int, > value: chararray ) ) ) } > The problem I have is that I'm not able to recurse into the internal Schema > of the FieldSchema in my outputSchema function to get at the tuple within the > input bag. 
> Here's my sample outputSchema for PAIRS: > public Schema outputSchema(Schema input) { > try { > System.out.println("input: " + input.toString()); > Schema databagSchema = new Schema(); > Schema tupleSchema = new Schema(); > Schema inputDataBag = new Schema(input.getFields().get(0)); > System.out.println("inputDataBag: " + > input.getFields().get(0).toString()); > // > // RIGHT HERE IS WHERE I WANT TO DO inputDataBag.getFields.get(0).getSchema > // > Schema.FieldSchema inputTuple = inputDataBag.getFields().get(0); // > Here's where I want to say > System.out.println("inputTuple: " + inputTuple.toString()); > databagSchema.add(new Schema.FieldSchema(null, DataType.TUPLE)); > System.out.println("databagSchema: " + databagSchema.toString()); > return new Schema( > new Schema.FieldSchema( > getSchemaName( this.getClass().getName().toLowerCase(), > input), > databagSchema, > DataType.BAG > ) > ); > } catch (Exception e) { > return null; > } > } > Here's the execution output from outputSchema: > input: {A: {seq: int,value: chararray},int,int} > inputDataBag: A: bag({seq: int,value: chararray}) > inputTuple: A: bag({seq: int,value: chararray})<= what I want to see is ( > seq: int, value: chararray ) > rowSchema: A: bag({seq: int,value: chararray}) > rowSchema: A: bag({seq: int,value: chararray}) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-458) Type branch integration with hadoop 18
[ https://issues.apache.org/jira/browse/PIG-458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-458: Affects Version/s: types_branch Fix Version/s: types_branch > Type branch integration with hadoop 18 > -- > > Key: PIG-458 > URL: https://issues.apache.org/jira/browse/PIG-458 > Project: Pig > Issue Type: Improvement >Affects Versions: types_branch >Reporter: Olga Natkovich >Assignee: Olga Natkovich > Fix For: types_branch > > Attachments: hadoop18.jar, PIG-458.patch, un18.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-577) outer join query looses name information
[ https://issues.apache.org/jira/browse/PIG-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659016#action_12659016 ] Santhosh Srinivasan commented on PIG-577: - The workaround is to invert the condition of the bincond and swap the two elements of the bincond. {code} D = FOREACH C GENERATE group, flatten((not IsEmpty(A) ? A : null)), flatten((not IsEmpty(B) ? B : null)); {code} The root cause for this issue is the schema computation in LOBinCond. The assumption is that the schemas of the LHS and RHS of the bincond match all the time. The type checker ensures that this assumption is true. However, after each statement is parsed we do not run the type checker. The type checker is run only when describe, explain, dump or store is encountered. As a result, for the script reported in the bug, the type of the null constant is seen as bytearray and not as the schema of the RHS, which is a bag. Either the type checker should be invoked earlier, after each statement, or the type checking logic for bincond should be invoked by the getFieldSchema method to ensure the equivalence of the LHS and RHS schemas. > outer join query looses name information > > > Key: PIG-577 > URL: https://issues.apache.org/jira/browse/PIG-577 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Olga Natkovich > Fix For: types_branch > > > The following query: > A = LOAD 'student_data' AS (name: chararray, age: int, gpa: float); > B = LOAD 'voter_data' AS (name: chararray, age: int, registration: chararray, > contributions: float); > C = COGROUP A BY name, B BY name; > D = FOREACH C GENERATE group, flatten((IsEmpty(A) ? null : A)), > flatten((IsEmpty(B) ? 
null : B)); > describe D; > E = FOREACH D GENERATE A::gpa, B::contributions; > Give the following error: (Even though describe shows correct information: D: > {group: chararray,A::name: chararray,A::age: int,A::gpa: float,B::name: > chararray,B::age: int,B::registration: chararray,B::contributions: float} > java.io.IOException: Invalid alias: A::gpa in {group: > chararray,bytearray,bytearray} > at org.apache.pig.PigServer.parseQuery(PigServer.java:298) > at org.apache.pig.PigServer.registerQuery(PigServer.java:263) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:439) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:249) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84) > at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64) > at org.apache.pig.Main.main(Main.java:306) > Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid > alias: A::gpa in {group: chararray,bytearray,bytearray} > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5930) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5788) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:3974) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3871) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3825) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3734) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3660) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3626) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3552) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3462) > at > 
org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3419) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2894) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2309) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:966) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:742) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:537) > at > org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60) > at org.apache.pig.PigServer.parseQuery(PigServer.java:295) > ... 6 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-577) outer join query looses name information
[ https://issues.apache.org/jira/browse/PIG-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659144#action_12659144 ] Santhosh Srinivasan commented on PIG-577: - The correct statement to mimic the outer join semantics is: {code} D = FOREACH C GENERATE group, flatten((not IsEmpty(A) ? A : (bag{tuple(chararray, int, float)}){(null, null, null)})), flatten((not IsEmpty(B) ? B : (bag{tuple(chararray, int, chararray, float)}){(null, null, null, null)})); {code} However, this exposed a bug in the type checker where the schemas of the LHS and RHS do not match. The bag with the null constants has a tuple, while relation A (or B) has a schema without the tuple. This issue was resolved in PIG-449. The solution proposed in PIG-449 has to be extended to schema comparisons that involve bags. {code} 2008-12-24 10:31:58,529 [main] ERROR org.apache.pig.tools.grunt.Grunt - Two inputs of BinCond must have compatible schemas 2008-12-24 10:31:58,529 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: Unable to describe schema for alias D at org.apache.pig.PigServer.dumpSchema(PigServer.java:367) at org.apache.pig.tools.grunt.GruntParser.processDescribe(GruntParser.java:153) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:188) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:71) at org.apache.pig.Main.main(Main.java:302) Caused by: org.apache.pig.impl.plan.PlanValidationException: An unexpected exception caused the validation to stop at org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:104) at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:40) at org.apache.pig.impl.logicalLayer.validators.TypeCheckingValidator.validate(TypeCheckingValidator.java:30) at 
org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:79) at org.apache.pig.PigServer.compileLp(PigServer.java:687) at org.apache.pig.PigServer.dumpSchema(PigServer.java:360) ... 5 more Caused by: org.apache.pig.impl.logicalLayer.validators.TypeCheckerException: Cannot resolve ForEach output schema. at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2731) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:122) at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:41) at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) at org.apache.pig.impl.plan.PlanValidator.validateSkipCollectException(PlanValidator.java:101) ... 10 more Caused by: org.apache.pig.impl.logicalLayer.validators.TypeCheckerException: Problem during evaluaton of BinCond output type at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:1913) at org.apache.pig.impl.logicalLayer.LOBinCond.visit(LOBinCond.java:88) at org.apache.pig.impl.logicalLayer.LOBinCond.visit(LOBinCond.java:27) at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68) at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51) at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.checkInnerPlan(TypeCheckingVisitor.java:2812) at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:2720) ... 15 more Caused by: org.apache.pig.impl.logicalLayer.validators.TypeCheckerException: Two inputs of BinCond must have compatible schemas at org.apache.pig.impl.logicalLayer.validators.TypeCheckingVisitor.visit(TypeCheckingVisitor.java:1903) ... 
21 more {code} > outer join query looses name information > > > Key: PIG-577 > URL: https://issues.apache.org/jira/browse/PIG-577 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Olga Natkovich > Fix For: types_branch > > > The following query: > A = LOAD 'student_data' AS (name: chararray, age: int, gpa: float); > B = LOAD 'voter_data' AS (name: chararray, age: int, registration: chararray, > contributions: float); > C = COGROUP A BY name, B BY name; > D = FOREACH C GENERATE group, flatten((IsEmpty(A) ? null : A)), > flatten((IsEmpty(B) ? null : B)); > describe D; > E = FOREACH D GENERATE A::gpa, B::contributions; > Give the following error: (Even though describe shows correct informat
[jira] Updated: (PIG-578) join ... outer, ... outer semantics are a no-ops, should produce corresponding null values
[ https://issues.apache.org/jira/browse/PIG-578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-578: Issue Type: Improvement (was: Bug) Marking this as an improvement as Pig does not support outer joins as a language construct. The keyword outer is currently ignored in the join statement. This should be fixed to allow outer joins (left, right and full). > join ... outer, ... outer semantics are a no-ops, should produce > corresponding null values > -- > > Key: PIG-578 > URL: https://issues.apache.org/jira/browse/PIG-578 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: types_branch >Reporter: David Ciemiewicz > > Currently using the "OUTER" modifier in the JOIN statement is a no-op. The > results of JOIN are always an INNER join. Now that the Pig types branch > supports proper null values, the semantics of JOIN ... OUTER, ... OUTER > should be corrected to do proper outer joins, populating the corresponding > empty values with nulls. > Here's the example: > A = load 'a.txt' using PigStorage() as ( comment, value ); > B = load 'b.txt' using PigStorage() as ( comment, value ); > -- > -- OUTER clause is ignored in JOIN statement and does not populate the tuple with > -- null values as it should. Otherwise OUTER is a meaningless no-op modifier. > -- > ABOuterJoin = join A by ( comment ) outer, B by ( comment ) outer; > describe ABOuterJoin; > dump ABOuterJoin; > The file a contains: > a-only 1 > ab-both 2 > The file b contains: > ab-both 2 > b-only 3 > When you execute the script today, the dump results are: > (ab-both,2,ab-both,2) > The expected dump results should be: > (a-only,1,,) > (ab-both,2,ab-both,2) > (,,b-only,3)
[jira] Commented: (PIG-577) outer join query looses name information
[ https://issues.apache.org/jira/browse/PIG-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659152#action_12659152 ] Santhosh Srinivasan commented on PIG-577: - The use of the null constant in the bincond in the context of a flatten should handle the following cases: Assumption: one of the columns in the bincond is a null constant. 1. If the other column is a simple type or a map, then cast the null to the other type. 2. If the other column is a complex type other than a map, then remove the null constant and supplant it with a bag, tuple or map constant with the appropriate elements, i.e., if the other column is a bag with a tuple that contains three columns (say int, float, chararray), then replace the null constant with a bag that contains a tuple with three null constants. The same reasoning applies to a tuple column. Upon flattening, the complex types will give out the appropriate number of columns. Handling null constants for complex types has implications when the constant is materialized either via dump or store. If the null constant is replaced with an appropriate bag/tuple/map, then the materialized constant will look like {(,,)} or (,,) or []. This conflicts with our existing view of nulls being empty when materialized. > outer join query looses name information > > > Key: PIG-577 > URL: https://issues.apache.org/jira/browse/PIG-577 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Olga Natkovich > Fix For: types_branch > > > The following query: > A = LOAD 'student_data' AS (name: chararray, age: int, gpa: float); > B = LOAD 'voter_data' AS (name: chararray, age: int, registration: chararray, > contributions: float); > C = COGROUP A BY name, B BY name; > D = FOREACH C GENERATE group, flatten((IsEmpty(A) ? null : A)), > flatten((IsEmpty(B) ? 
null : B)); > describe D; > E = FOREACH D GENERATE A::gpa, B::contributions; > Give the following error: (Even though describe shows correct information: D: > {group: chararray,A::name: chararray,A::age: int,A::gpa: float,B::name: > chararray,B::age: int,B::registration: chararray,B::contributions: float} > java.io.IOException: Invalid alias: A::gpa in {group: > chararray,bytearray,bytearray} > at org.apache.pig.PigServer.parseQuery(PigServer.java:298) > at org.apache.pig.PigServer.registerQuery(PigServer.java:263) > at > org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:439) > at > org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:249) > at > org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84) > at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64) > at org.apache.pig.Main.main(Main.java:306) > Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid > alias: A::gpa in {group: chararray,bytearray,bytearray} > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5930) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5788) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:3974) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3871) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3825) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3734) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3660) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3626) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3552) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3462) > at > 
org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3419) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2894) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2309) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:966) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:742) > at > org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:537) > at > org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60) > at org.apache.pig.PigServer.parseQuery(PigServer.java:295) > ... 6 more -- This message i
[jira] Created: (PIG-583) Bag constants used in non foreach statements cause lexical errors
Bag constants used in non foreach statements cause lexical errors - Key: PIG-583 URL: https://issues.apache.org/jira/browse/PIG-583 Project: Pig Issue Type: Bug Components: grunt Affects Versions: types_branch Reporter: Santhosh Srinivasan Priority: Minor Fix For: types_branch Use of bag constants in non-foreach statements causes lexical errors in Pig. The root cause is the inability of grunt to distinguish between a nested block and a bag constant in non-foreach statements. {code} grunt> a = load 'input'; grunt> b = filter a by ($0 eq {(1)}); 2008-12-29 14:12:15,306 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Encountered " "eq "" at line 1, column 21. Was expecting one of: "*" ... ")" ... "." ... "+" ... "-" ... "/" ... "%" ... "#" ... ... org.apache.pig.tools.pigscript.parser.TokenMgrError: Lexical error at line 2, column 29. Encountered: ")" (41), after : "" at org.apache.pig.tools.pigscript.parser.PigScriptParserTokenManager.getNextToken(PigScriptParserTokenManager.java:2608) at org.apache.pig.tools.pigscript.parser.PigScriptParser.jj_ntk(PigScriptParser.java:658) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:84) at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:94) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58) at org.apache.pig.Main.main(Main.java:282) {code}
[jira] Created: (PIG-584) Error handling in Pig
Error handling in Pig - Key: PIG-584 URL: https://issues.apache.org/jira/browse/PIG-584 Project: Pig Issue Type: New Feature Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch This JIRA tracks the error handling feature in Pig. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-585) Error handling requirements
Error handling requirements --- Key: PIG-585 URL: https://issues.apache.org/jira/browse/PIG-585 Project: Pig Issue Type: Sub-task Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch The error handling feature requirements are documented at: http://wiki.apache.org/pig/PigErrorHandling -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-586) Error handling functional specification
Error handling functional specification --- Key: PIG-586 URL: https://issues.apache.org/jira/browse/PIG-586 Project: Pig Issue Type: Sub-task Components: documentation Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch The error handling functional specification will be at: http://wiki.apache.org/pig/PigErrorHandlingFunctionalSpecification -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-587) Error handling design
Error handling design - Key: PIG-587 URL: https://issues.apache.org/jira/browse/PIG-587 Project: Pig Issue Type: Sub-task Components: documentation Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch The error handling design will be at: http://wiki.apache.org/pig/PigErrorHandlingDesign -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-585) Error handling requirements
[ https://issues.apache.org/jira/browse/PIG-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-585: Component/s: documentation > Error handling requirements > --- > > Key: PIG-585 > URL: https://issues.apache.org/jira/browse/PIG-585 > Project: Pig > Issue Type: Sub-task > Components: documentation >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > > The error handling feature requirements are documented at: > http://wiki.apache.org/pig/PigErrorHandling -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-588) Error handling phase one
Error handling phase one Key: PIG-588 URL: https://issues.apache.org/jira/browse/PIG-588 Project: Pig Issue Type: Sub-task Components: grunt, impl, tools Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch Phase one of error handling implementation will build the infrastructure for handling errors and handle errors in the parser and the type checker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-588) Error handling phase one
[ https://issues.apache.org/jira/browse/PIG-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-588: Attachment: Error_handling_phase_1.patch The attached patch (Error_handling_phase_1.patch) includes the following: 1. Infrastructure for handling errors, i.e., logging detailed error messages to the client side log file, switches to control throwing detailed messages on user's screen and base classes for the exception hierarchy 2. Error codes and error messages for the parser and type checker Unit tests have been modified to accommodate the new structure. No new unit test cases have been added. > Error handling phase one > > > Key: PIG-588 > URL: https://issues.apache.org/jira/browse/PIG-588 > Project: Pig > Issue Type: Sub-task > Components: grunt, impl, tools >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase_1.patch > > > Phase one of error handling implementation will build the infrastructure for > handling errors and handle errors in the parser and the type checker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-589) Error handling phase two
Error handling phase two Key: PIG-589 URL: https://issues.apache.org/jira/browse/PIG-589 Project: Pig Issue Type: Sub-task Components: impl Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch Phase two of the implementation will cover the remainder of the logical layer and the front-end, i.e., the optimizer, the translators, etc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-590) Error handling phase three
Error handling phase three -- Key: PIG-590 URL: https://issues.apache.org/jira/browse/PIG-590 Project: Pig Issue Type: Sub-task Components: impl Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch Phase three of the error handling feature will cover the backend, including Hadoop-specific error messages.
[jira] Created: (PIG-591) Error handling phase four
Error handling phase four - Key: PIG-591 URL: https://issues.apache.org/jira/browse/PIG-591 Project: Pig Issue Type: Sub-task Components: grunt, impl, tools Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Fix For: types_branch Phase four of the error handling feature will address the warning message cleanup and warning message aggregation. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-588) Error handling phase one
[ https://issues.apache.org/jira/browse/PIG-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-588: Attachment: Error_handling_phase_1_1.patch Attached patch adds a few more error codes to exceptions inside PigContext.java. > Error handling phase one > > > Key: PIG-588 > URL: https://issues.apache.org/jira/browse/PIG-588 > Project: Pig > Issue Type: Sub-task > Components: grunt, impl, tools >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase_1.patch, > Error_handling_phase_1_1.patch > > > Phase one of error handling implementation will build the infrastructure for > handling errors and handle errors in the parser and the type checker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-588) Error handling phase one
[ https://issues.apache.org/jira/browse/PIG-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-588: Attachment: Error_handling_phase_1_2.patch Attaching new patch as the previous one did not have newly added files (PigException.java and TypeCheckerException.java) > Error handling phase one > > > Key: PIG-588 > URL: https://issues.apache.org/jira/browse/PIG-588 > Project: Pig > Issue Type: Sub-task > Components: grunt, impl, tools >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase_1.patch, > Error_handling_phase_1_1.patch, Error_handling_phase_1_2.patch > > > Phase one of error handling implementation will build the infrastructure for > handling errors and handle errors in the parser and the type checker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-588) Error handling phase one
[ https://issues.apache.org/jira/browse/PIG-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-588: Attachment: Error_handling_phase_1_3.patch The attached patch ensures that error messages are sent to STDERR instead of STDOUT > Error handling phase one > > > Key: PIG-588 > URL: https://issues.apache.org/jira/browse/PIG-588 > Project: Pig > Issue Type: Sub-task > Components: grunt, impl, tools >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase_1.patch, > Error_handling_phase_1_1.patch, Error_handling_phase_1_2.patch, > Error_handling_phase_1_3.patch > > > Phase one of error handling implementation will build the infrastructure for > handling errors and handle errors in the parser and the type checker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-588) Error handling phase one
[ https://issues.apache.org/jira/browse/PIG-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-588: Attachment: Error_handling_phase_1_4.patch Another patch in synchrony with the types branch after PIG-599 > Error handling phase one > > > Key: PIG-588 > URL: https://issues.apache.org/jira/browse/PIG-588 > Project: Pig > Issue Type: Sub-task > Components: grunt, impl, tools >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase_1.patch, > Error_handling_phase_1_1.patch, Error_handling_phase_1_2.patch, > Error_handling_phase_1_3.patch, Error_handling_phase_1_4.patch > > > Phase one of error handling implementation will build the infrastructure for > handling errors and handle errors in the parser and the type checker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-616) Casts to complex types do not work as expected
Casts to complex types do not work as expected -- Key: PIG-616 URL: https://issues.apache.org/jira/browse/PIG-616 Project: Pig Issue Type: Bug Components: impl Affects Versions: types_branch Reporter: Santhosh Srinivasan Fix For: types_branch When we specify a (complex) type as a column in Pig, the TypeCastInserter inserts the appropriate cast for the (complex) type. However, in the implementation of POCast.java, when DataByteArrays are converted to the (complex) types, we invoke the bytesToXXX method. For complex types, especially tuples and bags, we do not enforce the typing information specified by the user in the AS clause or with the explicit cast statement. The implementation relies solely on bytesToXXX to figure out the right type. An example of a query that fails is given below. With respect to the query, the data is a single column that is a bag of integers. The user specifies this bag to be a bag of chararray. This conversion is allowed in Pig but the implementation does not perform the actual cast. Instead, bytesToBag is called on the stream. The resulting type is a bag of integers and not a bag of chararray. In the subsequent statement the user (correctly) assumes that the conversion has been performed but in reality it has not been done. At run time, when a chararray-based operation is performed, we see a ClassCastException. The notion of a schema is absent in the physical operators. The schema/fieldSchema in the logical layer has to be passed on to the physical layer. The schema can be used to perform additional operations like casting, etc. 
{code} grunt> cat bag.data {(1)} grunt> a = load 'bag.data' as (b:{t:(c:chararray)}); grunt> b = foreach a generate flatten(b); grunt> c = foreach b generate CONCAT('Hello ', $0); grunt> dump c; 2009-01-12 10:44:44,417 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete 2009-01-12 10:45:09,439 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed 2009-01-12 10:45:09,440 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job failed! 2009-01-12 10:45:09,443 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200812151518_9681_m_00java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String at org.apache.pig.builtin.StringConcat.exec(StringConcat.java:37) at org.apache.pig.builtin.StringConcat.exec(StringConcat.java:31) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:185) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:259) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:271) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:197) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:187) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:175) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) ... 
2009-01-12 10:45:09,448 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias c 2009-01-12 10:45:09,448 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: Unable to open iterator for alias c at org.apache.pig.PigServer.openIterator(PigServer.java:426) at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:271) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:72) at org.apache.pig.Main.main(Main.java:302) Caused by: java.io.IOException: Job terminated with anomalous status FAILED at org.apache.pig.PigServer.openIterator(PigServer.java:420) ... 5 more {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
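The last paragraph of the report suggests passing the logical schema down to the physical layer; a hypothetical sketch of what a schema-aware bag cast in POCast could look like (convertToDeclaredType is an invented helper, not Pig code):

{code}
// Hypothetical sketch only: walk a bag with the declared FieldSchema so each
// inner value is converted to its declared type instead of the inferred one.
private DataBag castBag(DataBag bag, Schema.FieldSchema declared) throws ExecException {
    DataBag result = BagFactory.getInstance().newDefaultBag();
    for (Tuple t : bag) {
        Tuple converted = TupleFactory.getInstance().newTuple(t.size());
        for (int i = 0; i < t.size(); i++) {
            // convertToDeclaredType: invented helper that applies the declared
            // type (e.g. chararray) to the i-th field of the tuple
            converted.set(i, convertToDeclaredType(t.get(i), declared, i));
        }
        result.add(converted);
    }
    return result;
}
{code}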
[jira] Created: (PIG-617) Using SUM with basic type fails
Using SUM with basic type fails --- Key: PIG-617 URL: https://issues.apache.org/jira/browse/PIG-617 Project: Pig Issue Type: Bug Components: impl Affects Versions: types_branch Reporter: Santhosh Srinivasan Fix For: types_branch SUM is an aggregate function that expects a bag as an argument. When basic types are used as arguments to SUM, Pig fails during run time. The typechecker should catch this error and fail earlier. An example is given below: {code} grunt> a = load 'one' as (i: int); grunt> b = foreach a generate SUM(i); grunt> dump b; 2009-01-12 14:11:47,595 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete 2009-01-12 14:12:12,617 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed 2009-01-12 14:12:12,618 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job failed! 2009-01-12 14:12:12,623 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200812151518_9683_m_00java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.DataBag 2009-01-12 14:12:12,623 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200812151518_9683_m_00java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.DataBag at org.apache.pig.builtin.IntSum.sum(IntSum.java:141) at org.apache.pig.builtin.IntSum.exec(IntSum.java:41) at org.apache.pig.builtin.IntSum.exec(IntSum.java:36) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:185) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:247) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:265) 
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:197) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:187) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:175) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) ... 2009-01-12 14:12:12,629 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias b 2009-01-12 14:12:12,629 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: Unable to open iterator for alias b at org.apache.pig.PigServer.openIterator(PigServer.java:425) at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:271) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:72) at org.apache.pig.Main.main(Main.java:302) Caused by: java.io.IOException: Job terminated with anomalous status FAILED at org.apache.pig.PigServer.openIterator(PigServer.java:419) ... 5 more {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
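The failure mode is visible in the trace: IntSum.sum casts its argument to a DataBag unconditionally, so a scalar argument dies with a ClassCastException at run time instead of being rejected by the typechecker. A self-contained sketch of the same pattern (the Bag interface below is a hypothetical stand-in, not org.apache.pig.data.DataBag):

```java
import java.util.List;

// Sketch of why SUM on a basic type fails at run time rather than at
// type-checking time: the aggregate casts its input to a bag type
// before the front-end has verified the argument is actually a bag.
public class SumSketch {
    // Hypothetical stand-in for Pig's DataBag.
    interface Bag { Iterable<Integer> values(); }

    static long sum(Object input) {
        Bag bag = (Bag) input; // throws ClassCastException for an Integer
        long total = 0;
        for (int v : bag.values()) total += v;
        return total;
    }

    public static void main(String[] args) {
        Bag bag = () -> List.of(1, 2, 3);
        System.out.println(sum(bag)); // 6
        try {
            sum(Integer.valueOf(1));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the trace above");
        }
    }
}
```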
[jira] Updated: (PIG-589) Error handling phase two
[ https://issues.apache.org/jira/browse/PIG-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-589: Attachment: Error_handling_phase2.patch The attached patch fulfills the requirements for phase two. > Error handling phase two > > > Key: PIG-589 > URL: https://issues.apache.org/jira/browse/PIG-589 > Project: Pig > Issue Type: Sub-task > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase2.patch > > > Phase two of the implementation will cover the remainder of the logical layer > and the front-end, i.e., the optimizer, the translators, etc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-621) Casts swallow exceptions when there are issues with conversion of bytes to Pig types
Casts swallow exceptions when there are issues with conversion of bytes to Pig types Key: PIG-621 URL: https://issues.apache.org/jira/browse/PIG-621 Project: Pig Issue Type: Bug Affects Versions: types_branch Reporter: Santhosh Srinivasan Fix For: types_branch In the current implementation of casts, exceptions thrown while converting bytes to Pig types are swallowed. Pig needs to either return NULL or rethrow the exception. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
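The two options named in the report (return NULL or rethrow) might look like the following; the method name and conversion logic are illustrative assumptions, not Pig's actual bytes-to-type converter:

```java
import java.nio.charset.StandardCharsets;

// Hedged sketch of a bytes-to-int cast that does NOT swallow failures:
// a bad conversion is mapped to null (Pig's null semantics), with the
// alternative of rethrowing shown in the comment. Hypothetical method,
// not Pig's real converter.
public class BytesToInt {
    static Integer bytesToInteger(byte[] b) {
        try {
            return Integer.valueOf(new String(b, StandardCharsets.UTF_8).trim());
        } catch (NumberFormatException e) {
            return null; // option 1: surface bad data as null
            // option 2: throw new RuntimeException("cannot cast bytes to int", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bytesToInteger("42".getBytes(StandardCharsets.UTF_8)));  // 42
        System.out.println(bytesToInteger("abc".getBytes(StandardCharsets.UTF_8))); // null
    }
}
```

Either way the caller can now distinguish a failed conversion from a legitimately absent value, which a silently swallowed exception cannot.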
[jira] Updated: (PIG-589) Error handling phase two
[ https://issues.apache.org/jira/browse/PIG-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-589: Attachment: Error_handling_phase2_4.patch Updated patch with a minor fix to rethrow an exception. See related bug PIG-621. > Error handling phase two > > > Key: PIG-589 > URL: https://issues.apache.org/jira/browse/PIG-589 > Project: Pig > Issue Type: Sub-task > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase2.patch, > Error_handling_phase2_4.patch > > > Phase two of the implementation will cover the remainder of the logical layer > and the front-end, i.e., the optimizer, the translators, etc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (PIG-571) pigserver methods do not throw error or return error code when an error occurs
[ https://issues.apache.org/jira/browse/PIG-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664657#action_12664657 ] sms edited comment on PIG-571 at 1/16/09 12:06 PM: --- In the current implementation, Pig displays the errors including the stack trace but does not throw an exception. There are two problems in the existing code: 1. Hadoop returns status as String instead of serialized objects 2. Pig does not throw an exception on failures with the appropriate details. As part of the error handling feature, Pig will handle point 2 in the third milestone(PIG-590) and request Hadoop to support status reporting via objects and not just Strings. was (Author: sms): In the current implementation, Pig displays the errors including the stack trace but do not throw an exception. There are two problems in the existing code: 1. Hadoop returns status as String instead of serialized objects 2. Pig does not throw an exception on failures with the appropriate details. As part of the error handling feature, Pig will handle point 2 in the third milestone(PIG-590) and request Hadoop to support status reporting via objects and not just Strings. > pigserver methods do not throw error or return error code when an error occurs > -- > > Key: PIG-571 > URL: https://issues.apache.org/jira/browse/PIG-571 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Christopher Olston >Assignee: Santhosh Srinivasan > > I do PigServer.registerQuery("store ..."), and the query fails. Pig prints a > bunch of stack traces but does not throw an error back to the caller. This is > a major problem because my client needs to know whether the Pig command > succeeded or failed. > I saw this problem with registerQuery() ... the same problem may arise with > other PigServer methods as well, such as store(), copy(), etc. -- not sure. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-571) pigserver methods do not throw error or return error code when an error occurs
[ https://issues.apache.org/jira/browse/PIG-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664657#action_12664657 ] Santhosh Srinivasan commented on PIG-571: - In the current implementation, Pig displays the errors including the stack trace but does not throw an exception. There are two problems in the existing code: 1. Hadoop returns status as String instead of serialized objects 2. Pig does not throw an exception on failures with the appropriate details. As part of the error handling feature, Pig will handle point 2 in the third milestone (PIG-590) and request Hadoop to support status reporting via objects and not just Strings. > pigserver methods do not throw error or return error code when an error occurs > -- > > Key: PIG-571 > URL: https://issues.apache.org/jira/browse/PIG-571 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Christopher Olston >Assignee: Santhosh Srinivasan > > I do PigServer.registerQuery("store ..."), and the query fails. Pig prints a > bunch of stack traces but does not throw an error back to the caller. This is > a major problem because my client needs to know whether the Pig command > succeeded or failed. > I saw this problem with registerQuery() ... the same problem may arise with > other PigServer methods as well, such as store(), copy(), etc. -- not sure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-571) pigserver methods do not throw error or return error code when an error occurs
[ https://issues.apache.org/jira/browse/PIG-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664680#action_12664680 ] Santhosh Srinivasan commented on PIG-571: - As an intermediate step, Pig will parse the Hadoop status message and create an exception with relevant details. > pigserver methods do not throw error or return error code when an error occurs > -- > > Key: PIG-571 > URL: https://issues.apache.org/jira/browse/PIG-571 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Christopher Olston >Assignee: Santhosh Srinivasan > > I do PigServer.registerQuery("store ..."), and the query fails. Pig prints a > bunch of stack traces but does not throw an error back to the caller. This is > a major problem because my client needs to know whether the Pig command > succeeded or failed. > I saw this problem with registerQuery() ... the same problem may arise with > other PigServer methods as well, such as store(), copy(), etc. -- not sure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
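That intermediate step could be sketched as follows; the method name and the matched substring are assumptions for illustration, not Pig's actual job-status handling:

```java
import java.io.IOException;

// Hypothetical sketch of the intermediate step described above:
// Hadoop reports job status only as a String, so Pig must inspect
// that String and convert a failure into a real exception that
// PigServer callers can catch, instead of only logging it.
public class StatusCheck {
    static void checkStatus(String status) throws IOException {
        if (status != null && status.toUpperCase().contains("FAILED")) {
            throw new IOException("Job terminated with anomalous status " + status);
        }
    }

    public static void main(String[] args) {
        try {
            checkStatus("FAILED");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

String matching is brittle, which is why the longer-term request is for Hadoop to report status via serialized objects rather than Strings.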
[jira] Issue Comment Edited: (PIG-571) pigserver methods do not throw error or return error code when an error occurs
[ https://issues.apache.org/jira/browse/PIG-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664680#action_12664680 ] sms edited comment on PIG-571 at 1/16/09 1:18 PM: -- As an intermediate step, Pig will parse the Hadoop status message and create an exception with relevant details. was (Author: sms): As an intermediate step, Pig will parser the Hadoop status message and create an exception with relevant details. > pigserver methods do not throw error or return error code when an error occurs > -- > > Key: PIG-571 > URL: https://issues.apache.org/jira/browse/PIG-571 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Christopher Olston >Assignee: Santhosh Srinivasan > > I do PigServer.registerQuery("store ..."), and the query fails. Pig prints a > bunch of stack traces but does not throw an error back to the caller. This is > a major problem because my client needs to know whether the Pig command > succeeded or failed. > I saw this problem with registerQuery() ... the same problem may arise with > other PigServer methods as well, such as store(), copy(), etc. -- not sure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-623) Fix spelling errors in output messages
[ https://issues.apache.org/jira/browse/PIG-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-623: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Patch has been committed. Thanks for your contribution Tom. > Fix spelling errors in output messages > -- > > Key: PIG-623 > URL: https://issues.apache.org/jira/browse/PIG-623 > Project: Pig > Issue Type: Improvement >Reporter: Tom White >Priority: Trivial > Attachments: pig-623.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-622) Include pig executable in distribution
[ https://issues.apache.org/jira/browse/PIG-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-622: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Patch committed. Thanks for your contribution Tom. > Include pig executable in distribution > -- > > Key: PIG-622 > URL: https://issues.apache.org/jira/browse/PIG-622 > Project: Pig > Issue Type: Bug >Reporter: Tom White > Attachments: pig-622.patch > > > Running "ant tar" does not generate the bin directory with the pig executable > in it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-571) pigserver methods do not throw error or return error code when an error occurs
[ https://issues.apache.org/jira/browse/PIG-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664787#action_12664787 ] Santhosh Srinivasan commented on PIG-571: - Thanks for the patch, Laukik. A similar fix has been made as part of PIG-588. It's pending review. This bug addresses the fact that Pig does not throw an exception when the registerQuery() method reports a failure. This affects Java programs that use this API. > pigserver methods do not throw error or return error code when an error occurs > -- > > Key: PIG-571 > URL: https://issues.apache.org/jira/browse/PIG-571 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch >Reporter: Christopher Olston >Assignee: Santhosh Srinivasan > Attachments: ret_code.diff > > > I do PigServer.registerQuery("store ..."), and the query fails. Pig prints a > bunch of stack traces but does not throw an error back to the caller. This is > a major problem because my client needs to know whether the Pig command > succeeded or failed. > I saw this problem with registerQuery() ... the same problem may arise with > other PigServer methods as well, such as store(), copy(), etc. -- not sure. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-545) PERFORMANCE: Sampler for order bys does not produce a good distribution
[ https://issues.apache.org/jira/browse/PIG-545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666234#action_12666234 ] Santhosh Srinivasan commented on PIG-545: - Two, just getting better sampling won't resolve the issue for order by queries that have one or a few keys with a very high number of values, such as in a Zipf distribution. Unfortunately for us, Zipf is a very common data distribution. In this case our partitioner may need to be able to detect and split large keys by round robining them to a group of reducers. Better sampling will not resolve the issue for order by. It will help in having more evenly sized partitions for the reducers. Since it is sampling and not a brute-force approach of checking the cardinality of each key, there will always be a non-zero probability of one reducer getting more data than the other reducers. The better sampling approach will minimize such occurrences. Secondly, post sampling, we can ensure that reducers get the right partitions by using Hadoop's ability to pick reducers based on partition functions. I am not quite sure how Pig can propose a generic partition function to achieve this. > PERFORMANCE: Sampler for order bys does not produce a good distribution > --- > > Key: PIG-545 > URL: https://issues.apache.org/jira/browse/PIG-545 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: types_branch >Reporter: Alan Gates >Assignee: Amir Youssefi > Fix For: types_branch > > > In running tests on actual data, I've noticed that the final reduce of an > order by has skewed partitions. Some reduces finish in a few seconds while > some run for 20 minutes. Getting a better distribution should lead to much > better performance for order by. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
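The sample-then-partition scheme being discussed can be sketched as follows (an illustrative stand-in, not Pig's actual order-by partitioner): pick quantile boundaries from a sorted sample, then route each key to a reducer by binary search. Note that, as the comment says, a single heavily skewed key still lands entirely in one partition under this scheme.

```java
import java.util.Arrays;

// Sketch of range partitioning from a sample (hypothetical, simplified
// to int keys). boundaries() picks reducers-1 quantile cut points so
// each reducer gets a roughly equal share; partition() routes a key.
// Under Zipf-like skew one key can still overflow its partition, which
// is why splitting large keys across reducers is raised above.
public class RangePartitionSketch {
    static int[] boundaries(int[] sortedSample, int reducers) {
        int[] b = new int[reducers - 1];
        for (int i = 1; i < reducers; i++)
            b[i - 1] = sortedSample[i * sortedSample.length / reducers];
        return b;
    }

    static int partition(int key, int[] b) {
        int p = Arrays.binarySearch(b, key);
        return p >= 0 ? p : -(p + 1); // insertion point = partition index
    }

    public static void main(String[] args) {
        int[] b = boundaries(new int[]{1, 2, 3, 4, 5, 6, 7, 8}, 4);
        System.out.println(Arrays.toString(b)); // [3, 5, 7]
        System.out.println(partition(4, b));    // 1
    }
}
```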
[jira] Updated: (PIG-589) Error handling phase two
[ https://issues.apache.org/jira/browse/PIG-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-589: Attachment: Error_handling_phase2_5.patch The attached patch is in sync with the latest sources. > Error handling phase two > > > Key: PIG-589 > URL: https://issues.apache.org/jira/browse/PIG-589 > Project: Pig > Issue Type: Sub-task > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase2.patch, > Error_handling_phase2_4.patch, Error_handling_phase2_5.patch > > > Phase two of the implementation will cover the remainder of the logical layer > and the front-end, i.e., the optimizer, the translators, etc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-589) Error handling phase two
[ https://issues.apache.org/jira/browse/PIG-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666428#action_12666428 ] Santhosh Srinivasan commented on PIG-589: - Will fix issue (1). For issue (2), the list of matching functions is internal to Pig and is probably something that users should not be made aware of. It should probably be part of the detailed message that is logged to the file. > Error handling phase two > > > Key: PIG-589 > URL: https://issues.apache.org/jira/browse/PIG-589 > Project: Pig > Issue Type: Sub-task > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan > Fix For: types_branch > > Attachments: Error_handling_phase2.patch, > Error_handling_phase2_4.patch, Error_handling_phase2_5.patch > > > Phase two of the implementation will cover the remainder of the logical layer > and the front-end, i.e., the optimizer, the translators, etc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-615) Wrong number of jobs with limit
[ https://issues.apache.org/jira/browse/PIG-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1254#action_1254 ] Santhosh Srinivasan commented on PIG-615: - I will be reviewing this patch. > Wrong number of jobs with limit > --- > > Key: PIG-615 > URL: https://issues.apache.org/jira/browse/PIG-615 > Project: Pig > Issue Type: Bug >Reporter: Olga Natkovich >Assignee: Shravan Matthur Narayanamurthy > Attachments: 615.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-553) EvalFunc.finish() not getting called
[ https://issues.apache.org/jira/browse/PIG-553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1252#action_1252 ] Santhosh Srinivasan commented on PIG-553: - I will be reviewing this patch. > EvalFunc.finish() not getting called > > > Key: PIG-553 > URL: https://issues.apache.org/jira/browse/PIG-553 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch > Environment: "local" mode >Reporter: Christopher Olston >Assignee: Shravan Matthur Narayanamurthy > Attachments: 553.patch > > > My EvalFunc's finish() method doesn't seem to get invoked. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-553) EvalFunc.finish() not getting called
[ https://issues.apache.org/jira/browse/PIG-553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1279#action_1279 ] Santhosh Srinivasan commented on PIG-553: - Review comments: 1. The code looks fine. 2. There are no unit test cases. We need unit test cases to ensure that the code path is exercised in all cases (preferably at least the map-reduce case). 3. In algebraic functions, since intermediate is called only in the PigCombiner and the UDF visitor is never called in the PigCombiner, users should be aware that finish() is never called for intermediate methods. The UDF documentation has to be updated to reflect this caveat. > EvalFunc.finish() not getting called > > > Key: PIG-553 > URL: https://issues.apache.org/jira/browse/PIG-553 > Project: Pig > Issue Type: Bug >Affects Versions: types_branch > Environment: "local" mode >Reporter: Christopher Olston >Assignee: Shravan Matthur Narayanamurthy > Attachments: 553.patch > > > My EvalFunc's finish() method doesn't seem to get invoked. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-615) Wrong number of jobs with limit
[ https://issues.apache.org/jira/browse/PIG-615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-615: Attachment: 615_1.patch Attached patch includes Shravan's fix along with test cases that I added. Running unit test cases now. > Wrong number of jobs with limit > --- > > Key: PIG-615 > URL: https://issues.apache.org/jira/browse/PIG-615 > Project: Pig > Issue Type: Bug >Reporter: Olga Natkovich >Assignee: Shravan Matthur Narayanamurthy > Attachments: 615.patch, 615_1.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-615) Wrong number of jobs with limit
[ https://issues.apache.org/jira/browse/PIG-615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-615: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) All unit test cases passed. Patch has been committed. Thanks for the fix Shravan. > Wrong number of jobs with limit > --- > > Key: PIG-615 > URL: https://issues.apache.org/jira/browse/PIG-615 > Project: Pig > Issue Type: Bug >Reporter: Olga Natkovich >Assignee: Shravan Matthur Narayanamurthy > Attachments: 615.patch, 615_1.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-635) POCast.java has incorrect formatting
POCast.java has incorrect formatting Key: PIG-635 URL: https://issues.apache.org/jira/browse/PIG-635 Project: Pig Issue Type: Improvement Components: impl Affects Versions: types_branch Reporter: Santhosh Srinivasan Assignee: Santhosh Srinivasan Priority: Trivial Fix For: types_branch POCast.java has incorrect formatting. This crept in as part of PIG-589. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-635) POCast.java has incorrect formatting
[ https://issues.apache.org/jira/browse/PIG-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-635: Attachment: POCast.patch Patch to correct formatting in POCast.java > POCast.java has incorrect formatting > > > Key: PIG-635 > URL: https://issues.apache.org/jira/browse/PIG-635 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan >Priority: Trivial > Fix For: types_branch > > Attachments: POCast.patch > > > POCast.java has incorrect formatting. This crept in as part of PIG-589. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (PIG-635) POCast.java has incorrect formatting
[ https://issues.apache.org/jira/browse/PIG-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan resolved PIG-635. - Resolution: Fixed Hadoop Flags: [Reviewed] Resolved. Fix has been committed. > POCast.java has incorrect formatting > > > Key: PIG-635 > URL: https://issues.apache.org/jira/browse/PIG-635 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan >Priority: Trivial > Fix For: types_branch > > Attachments: POCast.patch > > > POCast.java has incorrect formatting. This crept in as part of PIG-589. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-635) POCast.java has incorrect formatting
[ https://issues.apache.org/jira/browse/PIG-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-635: Patch Info: [Patch Available] > POCast.java has incorrect formatting > > > Key: PIG-635 > URL: https://issues.apache.org/jira/browse/PIG-635 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: types_branch >Reporter: Santhosh Srinivasan >Assignee: Santhosh Srinivasan >Priority: Trivial > Fix For: types_branch > > Attachments: POCast.patch > > > POCast.java has incorrect formatting. This crept in as part of PIG-589. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.