[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated PIG-1608: Status: Patch Available (was: Open) pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated PIG-1608: Status: Open (was: Patch Available) Re-submitting patch to Hudson. pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated PIG-1608: Status: Open (was: Patch Available) pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated PIG-1608: Status: Patch Available (was: Open) pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1611) use enums for error code
[ https://issues.apache.org/jira/browse/PIG-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909695#action_12909695 ] Gianmarco De Francisci Morales commented on PIG-1611: - Would be nice to see it! use enums for error code Key: PIG-1611 URL: https://issues.apache.org/jira/browse/PIG-1611 Project: Pig Issue Type: Sub-task Reporter: Thejas M Nair Fix For: 0.9.0 Pig code uses integer constants for error codes, and the value of each error code is reserved using http://wiki.apache.org/pig/PigErrorHandlingFunctionalSpecification . This process is cumbersome and error prone. It would be better to use enum values instead. The enum value can contain the error message and encapsulate the error code. For example - {code} Replace throw new SchemaMergeException("Error in merging schema", 2124, PigException.BUG); with throw new SchemaMergeException(SCHEMA_MERGE_EX, PigException.BUG); {code} Where SCHEMA_MERGE_EX belongs to an error-code enum. We can use the ordinal value of the enum and an offset to determine the error code. The error message will be passed through the constructor of the enum. {code} SCHEMA_MERGE_EX("Error in merging schema"); {code} For documentation, the error codes and error messages can be dumped using code that uses the enum error code class. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
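The enum approach proposed in PIG-1611 can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the enum name `PigErrorCode`, the second constant, the helper class `ErrorCodeDump`, and the offset value 2124 are hypothetical and are not part of Pig.

```java
// Hypothetical sketch of an error-code enum; none of these names exist in Pig.
enum PigErrorCode {
    SCHEMA_MERGE_EX("Error in merging schema"),
    DUPLICATE_ALIAS_EX("Duplicate schema alias");

    // Assumed base value, chosen only so the first constant maps to 2124.
    private static final int OFFSET = 2124;

    private final String message;

    // The message is passed through the constructor, as the issue suggests.
    PigErrorCode(String message) {
        this.message = message;
    }

    // The numeric code is derived from the ordinal plus the offset.
    int code() {
        return OFFSET + ordinal();
    }

    String message() {
        return message;
    }
}

class ErrorCodeDump {
    // Dumps every code and message, e.g. for generating documentation.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (PigErrorCode e : PigErrorCode.values()) {
            sb.append(e.code()).append(": ").append(e.message()).append('\n');
        }
        return sb.toString();
    }
}
```

One consequence of deriving codes from `ordinal()` is that reordering or removing constants changes existing codes, so in a scheme like this new constants would have to be appended at the end.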
[jira] Updated: (PIG-1610) 'union onschema' does handle some cases involving 'namespaced' column names in schema
[ https://issues.apache.org/jira/browse/PIG-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-1610: --- Attachment: PIG-1610.1.patch 'union onschema' does handle some cases involving 'namespaced' column names in schema - Key: PIG-1610 URL: https://issues.apache.org/jira/browse/PIG-1610 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.8.0 Attachments: PIG-1610.1.patch case 1: grunt> describe f; f: {l1::a: bytearray,l1::b: bytearray} grunt> describe l1; l1: {a: bytearray,b: bytearray} grunt> dump f; (1,11) (2,22) (3,33) grunt> dump l1; (1,11) (2,22) (3,33) grunt> u = union onschema f, l1; grunt> describe u; u: {l1::a: bytearray,l1::b: bytearray} -- the dump u gives incorrect results grunt> dump u; (,) (,) (,) (1,11) (2,22) (3,33) case 2: grunt> u = union onschema l1, f; grunt> describe u; 2010-09-13 15:11:13,877 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1108: Duplicate schema alias: l1::a Details at logfile: /Users/tejas/pig_unions_err2/trunk/pig_1284410413970.log -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Gibbon updated PIG-366: -- Attachment: org.apache.pig.pigpen-0.7.3.tar.gz PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.3.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1610) 'union onschema' does handle some cases involving 'namespaced' column names in schema
[ https://issues.apache.org/jira/browse/PIG-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-1610: --- Status: Patch Available (was: Open) Release Note: This fixes the behavior for merging of column alias names that have a 'namespace' portion in them. - Aliases such as 'nm::c1' and 'c1' in two separate relations specified in 'union onschema' are considered mergeable, and in the schema of the union the merged column alias will be 'c1'. - Aliases such as 'nm1::c1' and 'nm2::c1' in two separate relations specified in 'union onschema' will not be merged together; in the schema of the union there will be two columns with these names. Example - describe f; f: {l1::a: int, l1::b: int, l1::c: int} describe l1; l1: {a: int, b: int} u = union onschema f,l1; desc u; u: {a: int, b: int, l1::c: int} Test-patch and unit test cases have succeeded. 'union onschema' does handle some cases involving 'namespaced' column names in schema - Key: PIG-1610 URL: https://issues.apache.org/jira/browse/PIG-1610 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.8.0 Attachments: PIG-1610.1.patch case 1: grunt> describe f; f: {l1::a: bytearray,l1::b: bytearray} grunt> describe l1; l1: {a: bytearray,b: bytearray} grunt> dump f; (1,11) (2,22) (3,33) grunt> dump l1; (1,11) (2,22) (3,33) grunt> u = union onschema f, l1; grunt> describe u; u: {l1::a: bytearray,l1::b: bytearray} -- the dump u gives incorrect results grunt> dump u; (,) (,) (,) (1,11) (2,22) (3,33) case 2: grunt> u = union onschema l1, f; grunt> describe u; 2010-09-13 15:11:13,877 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1108: Duplicate schema alias: l1::a Details at logfile: /Users/tejas/pig_unions_err2/trunk/pig_1284410413970.log -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
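The alias-merging rule in the PIG-1610 release note can be sketched in plain Java. This is a simplified illustration, not the code in the patch: the class `UnionOnSchemaAliases` and its methods are hypothetical, column types are ignored, and each column of the second relation is simply matched against the first mergeable column found.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the release-note rule; not Pig's implementation.
class UnionOnSchemaAliases {

    // Returns the merged alias, or null when the two aliases are not mergeable.
    static String mergeAlias(String x, String y) {
        if (x.equals(y)) {
            return x;                      // identical aliases merge trivially
        }
        if (x.endsWith("::" + y)) {
            return y;                      // 'nm::c1' and 'c1' merge to 'c1'
        }
        if (y.endsWith("::" + x)) {
            return x;                      // same rule, other order
        }
        return null;                       // e.g. 'nm1::c1' vs 'nm2::c1'
    }

    // Builds the alias list of the union of two relations.
    static List<String> union(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>(left);
        for (String r : right) {
            boolean merged = false;
            for (int i = 0; i < out.size(); i++) {
                String m = mergeAlias(out.get(i), r);
                if (m != null) {
                    out.set(i, m);
                    merged = true;
                    break;
                }
            }
            if (!merged) {
                out.add(r);                // unmatched columns are appended
            }
        }
        return out;
    }
}
```

With the release note's example, `union` applied to `[l1::a, l1::b, l1::c]` and `[a, b]` yields `[a, b, l1::c]`, matching the documented schema of `u`.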
[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Gibbon updated PIG-366: -- Attachment: org.apache.pig.pigpen_0.7.3.jar PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.3.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, org.apache.pig.pigpen_0.7.3.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1607) pig should have separate javadoc.jar in the maven repository
[ https://issues.apache.org/jira/browse/PIG-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1607: --- Attachment: PIG-1607_0.patch Things fixed with this patch: 1. Created javadoc.jar. 2. Cleaned sources.jar: removed the generated source and the Zebra-related files. 3. Changed build.xml to upload the javadoc.jar to Maven. pig should have separate javadoc.jar in the maven repository Key: PIG-1607 URL: https://issues.apache.org/jira/browse/PIG-1607 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1607_0.patch At this moment, javadoc is part of the source.jar but pig should have separate javadoc.jar in the maven repository. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1607) pig should have separate javadoc.jar in the maven repository
[ https://issues.apache.org/jira/browse/PIG-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1607: --- Attachment: PIG-1607_1.patch fixed the javadoc-jar dependency pig should have separate javadoc.jar in the maven repository Key: PIG-1607 URL: https://issues.apache.org/jira/browse/PIG-1607 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1607_0.patch, PIG-1607_1.patch At this moment, javadoc is part of the source.jar but pig should have separate javadoc.jar in the maven repository. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (PIG-1247) Error Number makes it hard to debug: ERROR 2999: Unexpected internal error. org.apache.pig.backend.datastorage.DataStorageException cannot be cast to java.lang.Error
[ https://issues.apache.org/jira/browse/PIG-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich reassigned PIG-1247: --- Assignee: Xuefu Zhang Error Number makes it hard to debug: ERROR 2999: Unexpected internal error. org.apache.pig.backend.datastorage.DataStorageException cannot be cast to java.lang.Error - Key: PIG-1247 URL: https://issues.apache.org/jira/browse/PIG-1247 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Assignee: Xuefu Zhang Fix For: 0.9.0 I have a large script in which there are intermediate store statements; one of them writes to a directory I do not have permission to write to. The stack trace I get from Pig is this: 2010-02-20 02:16:32,055 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. org.apache.pig.backend.datastorage.DataStorageException cannot be cast to java.lang.Error Details at logfile: /home/viraj/pig_1266632145355.log Pig Stack Trace --- ERROR 2999: Unexpected internal error. 
org.apache.pig.backend.datastorage.DataStorageException cannot be cast to java.lang.Error java.lang.ClassCastException: org.apache.pig.backend.datastorage.DataStorageException cannot be cast to java.lang.Error at org.apache.pig.impl.logicalLayer.parser.QueryParser.StoreClause(QueryParser.java:3583) at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1407) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:949) at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:762) at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1036) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:986) at org.apache.pig.PigServer.registerQuery(PigServer.java:386) at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:720) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89) at org.apache.pig.Main.main(Main.java:386) The only way to find the error was to look at the javacc generated QueryParser.java code and do a System.out.println() Here is a script to reproduce the problem: {code} A = load '/user/viraj/three.txt' using PigStorage(); B = foreach A generate ['a'#'12'] as b:map[] ; store B into '/user/secure/pigtest' using PigStorage(); {code} three.txt has 3 lines which contain nothing but the number 1. {code} $ hadoop fs -ls /user/secure/ ls: could not get get listing for 'hdfs://mynamenode/user/secure' : org.apache.hadoop.security.AccessControlException: Permission denied: user=viraj, access=READ_EXECUTE, inode=secure:secure:users:rwx-- {code} Viraj -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
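The failure mode reported in PIG-1247 can be reproduced in isolation with a short Java sketch. The generated parser code is not shown in the report, so the unconditional cast below is an assumption about the pattern involved; the class `StoreFailureDemo` and the nested stand-in `DataStorageException` are hypothetical and only mimic the names in the stack trace.

```java
// Hypothetical sketch; it does not reproduce Pig's QueryParser code.
class StoreFailureDemo {

    // Stand-in for org.apache.pig.backend.datastorage.DataStorageException.
    static class DataStorageException extends Exception {
        DataStorageException(String message) {
            super(message);
        }
    }

    // Assumed problematic pattern: casting any Throwable to Error.
    // The cast itself fails, so the original cause is hidden.
    static String blindCast(Throwable t) {
        try {
            throw (Error) t;
        } catch (ClassCastException e) {
            return "masked: " + e.getClass().getSimpleName();
        }
    }

    // Safer pattern: rethrow only real Errors, report everything else.
    static String checked(Throwable t) {
        if (t instanceof Error) {
            throw (Error) t;
        }
        return "reported: " + t.getMessage();
    }
}
```

Passing a `DataStorageException` carrying the message "Permission denied" to `blindCast` surfaces only the `ClassCastException`, while `checked` keeps the original message available to the user.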
[jira] Assigned: (PIG-1592) ORDER BY distribution is uneven when record size is correlated with order key
[ https://issues.apache.org/jira/browse/PIG-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich reassigned PIG-1592: --- Assignee: Thejas M Nair ORDER BY distribution is uneven when record size is correlated with order key - Key: PIG-1592 URL: https://issues.apache.org/jira/browse/PIG-1592 Project: Pig Issue Type: Improvement Reporter: Dmitriy V. Ryaboy Assignee: Thejas M Nair Fix For: 0.9.0 The partitioner contributed in PIG-545 distributes the order key space between partitions so that each partition gets approximately the same number of keys, even when the keys have a non-uniform distribution over the key space. Unfortunately this still allows for severe partition imbalance when record size is correlated with the order key. By way of motivating example, consider this script which attempts to produce a list of genuses based on how many species each genus contains: {code} set default_parallel 60; critters = load 'biodata' as (genus, species); genus_counts = foreach (group critters by genus) generate group as genus, COUNT(critters) as num_species, critters; ordered_genuses = order genus_counts by num_species desc; store ordered_genuses {code} The higher the value of num_species, the more species tuples will be contained in the critters bag, and the wider the row. This can cause a severe processing imbalance, as the partition holding the records with the highest values of num_species will have the same number of *records* as the partition holding the lowest values, but it will have far more actual *bytes* to work on. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
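The imbalance described in PIG-1592 can be illustrated with a small Java simulation. It is only a sketch of the arithmetic, not Pig's partitioner: the class `PartitionSkewDemo` and its method are hypothetical, and record sizes are supplied directly in sort order.

```java
// Hypothetical sketch; it only simulates equal-count range partitioning.
class PartitionSkewDemo {

    // sizes[i] is the byte size of the i-th record in sort order.
    // Records are split into `parts` contiguous ranges of (nearly) equal count,
    // and the total bytes that land in each range are returned.
    static long[] bytesPerPartitionByCount(long[] sizes, int parts) {
        long[] out = new long[parts];
        for (int i = 0; i < sizes.length; i++) {
            int p = (int) ((long) i * parts / sizes.length);
            out[p] += sizes[i];
        }
        return out;
    }
}
```

For six records whose sizes grow with the sort key, `{1, 2, 3, 4, 5, 6}`, split into three partitions, every partition receives two records but the byte totals are `{3, 7, 11}`: the record counts are balanced while the work is not.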
[jira] Updated: (PIG-1607) pig should have separate javadoc.jar in the maven repository
[ https://issues.apache.org/jira/browse/PIG-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1607: --- Attachment: PIG-1607_2.patch pig should have separate javadoc.jar in the maven repository Key: PIG-1607 URL: https://issues.apache.org/jira/browse/PIG-1607 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1607_0.patch, PIG-1607_1.patch, PIG-1607_2.patch At this moment, javadoc is part of the source.jar but pig should have separate javadoc.jar in the maven repository. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1606) flatten documentation does not discuss flatten of empty bag
[ https://issues.apache.org/jira/browse/PIG-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909810#action_12909810 ] Olga Natkovich commented on PIG-1606: - If we are not planning to change the semantics, I will ask Corinne to document this for 0.8. flatten documentation does not discuss flatten of empty bag --- Key: PIG-1606 URL: https://issues.apache.org/jira/browse/PIG-1606 Project: Pig Issue Type: Bug Components: documentation Reporter: Thejas M Nair Fix For: 0.9.0 From the existing flatten documentation, it is not clear that flatten of an empty bag results in that row being discarded. For example the following query gives no output - {code} grunt> cat /tmp/empty.bag {} 1 grunt> l = load '/tmp/empty.bag' as (b : bag{}, i : int); grunt> f = foreach l generate flatten(b), i; grunt> dump f; grunt> {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1606) flatten documentation does not discuss flatten of empty bag
[ https://issues.apache.org/jira/browse/PIG-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1606: Assignee: Corinne Chandel Fix Version/s: 0.8.0 (was: 0.9.0) flatten documentation does not discuss flatten of empty bag --- Key: PIG-1606 URL: https://issues.apache.org/jira/browse/PIG-1606 Project: Pig Issue Type: Bug Components: documentation Reporter: Thejas M Nair Assignee: Corinne Chandel Fix For: 0.8.0 From the existing flatten documentation, it is not clear that flatten of an empty bag results in that row being discarded. For example the following query gives no output - {code} grunt> cat /tmp/empty.bag {} 1 grunt> l = load '/tmp/empty.bag' as (b : bag{}, i : int); grunt> f = foreach l generate flatten(b), i; grunt> dump f; grunt> {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
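The behaviour reported in PIG-1606 follows from treating flatten of a bag as producing one output row per bag element. A minimal Java sketch of that reading is below; the class `FlattenDemo` and its method are hypothetical and model only a row of the form (bag, i), not Pig's operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "foreach l generate flatten(b), i" for a single row.
class FlattenDemo {

    // Emits one (element, i) row per element of the bag.
    // An empty bag therefore emits nothing, and the input row disappears.
    static List<List<Object>> flattenWith(List<Object> bag, Object i) {
        List<List<Object>> rows = new ArrayList<>();
        for (Object elem : bag) {
            rows.add(Arrays.asList(elem, i));
        }
        return rows;
    }
}
```

Calling `flattenWith` with an empty list and the value 1 returns an empty list of rows, which corresponds to the empty `dump f` output in the report.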
[jira] Commented: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909821#action_12909821 ] Daniel Dai commented on PIG-1608: - Two comments: 1. target buildJar-withouthadoop should also include this change 2. format comment: use space instead of tab Target jar, package looks good. pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-282) Custom Partitioner
[ https://issues.apache.org/jira/browse/PIG-282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-282: --- Release Note: This feature allows to specify Hadoop Partitioner for the following operations: GROUP/COGROUP, CROSS, DISTINCT, JOIN (except 'skewed' join). Partitioner controls the partitioning of the keys of the intermediate map-outputs. See http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/Partitioner.html for more details. To use this feature you can add PARTITION BY clause to the appropriate operator: A = load 'input_data'; B = group A by $0 PARTITION BY org.apache.pig.test.utils.SimpleCustomPartitioner parallel 2; . Here is the code for SimpleCustomPartitioner public class SimpleCustomPartitioner extends Partitioner<PigNullableWritable, Writable> { //@Override public int getPartition(PigNullableWritable key, Writable value, int numPartitions) { if(key.getValueAsPigType() instanceof Integer) { int ret = (((Integer)key.getValueAsPigType()).intValue() % numPartitions); return ret; } else { return (key.hashCode()) % numPartitions; } } } was: This feature allows to specify Hadoop Partitioner for the following operations: GROUP/COGROUP, CROSS, DISTINCT, JOIN (except 'skewed' join). Partitioner controls the partitioning of the keys of the intermediate map-outputs. See http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Partitioner.html for more details. To use this feature you can add PARTITION BY clause to the appropriate operator: A = load 'input_data'; B = group A by $0 PARTITION BY org.apache.pig.test.utils.SimpleCustomPartitioner parallel 2; . 
Here is the code for SimpleCustomPartitioner public class SimpleCustomPartitioner extends Partitioner<PigNullableWritable, Writable> { //@Override public int getPartition(PigNullableWritable key, Writable value, int numPartitions) { if(key.getValueAsPigType() instanceof Integer) { int ret = (((Integer)key.getValueAsPigType()).intValue() % numPartitions); return ret; } else { return (key.hashCode()) % numPartitions; } } } Custom Partitioner -- Key: PIG-282 URL: https://issues.apache.org/jira/browse/PIG-282 Project: Pig Issue Type: New Feature Affects Versions: 0.7.0 Reporter: Amir Youssefi Assignee: Aniket Mokashi Priority: Minor Fix For: 0.8.0 Attachments: CustomPartitioner.patch, CustomPartitionerFinale.patch, CustomPartitionerTest.patch By adding custom partitioner we can give control over which output partition a key (/value) goes to. We can add keywords to language e.g. PARTITION BY UDF(...) or a similar syntax. UDF returns a number between 0 and n-1 where n is number of output partitions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
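The issue text states that a partitioner returns a number between 0 and n-1. In Java the `%` operator can return a negative result for a negative left operand, so a hash-based partitioner may need a non-negative modulo to stay in that range. The sketch below shows the idea in plain Java without Hadoop or Pig types; the class `PartitionIndex` and its method are hypothetical and are not a change to `SimpleCustomPartitioner`.

```java
// Hypothetical sketch; it uses plain objects instead of PigNullableWritable.
class PartitionIndex {

    // Maps a key to a partition index in the range 0..numPartitions-1.
    // Integer keys use their value, other keys use hashCode().
    static int partitionFor(Object key, int numPartitions) {
        int raw = (key instanceof Integer) ? (Integer) key : key.hashCode();
        // Math.floorMod never returns a negative value for a positive divisor,
        // unlike the % operator, where -3 % 2 evaluates to -1.
        return Math.floorMod(raw, numPartitions);
    }
}
```

For example, `partitionFor(-3, 2)` returns 1, whereas `-3 % 2` would give -1, which is outside the 0 to n-1 range.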
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1608: --- Status: Patch Available (was: Open) pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch, PIG-1608_1.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1608: --- Attachment: PIG-1608_1.patch updated patch to accommodate the review comments. pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch, PIG-1608_1.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1608: --- Status: Open (was: Patch Available) pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1608_0.patch, PIG-1608_1.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Gibbon updated PIG-366: -- Attachment: (was: org.apache.pig.pigpen_0.7.3.jar) PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.3.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1610) 'union onschema' does handle some cases involving 'namespaced' column names in schema
[ https://issues.apache.org/jira/browse/PIG-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909859#action_12909859 ] Thejas M Nair commented on PIG-1610: Richard pointed out an issue with the patch where the schema of 'union onschema' differs with different order of relation in the statement. The case is like - {code} l = load 'x' as (c, nm::c); f = load 'y' as (i,j); u = union onschema f,l; describe u; u: {i: bytearray,j: bytearray,c: bytearray} u = union onschema l,f; describe u; u: {c: bytearray,nm::c: bytearray,i: bytearray,j: bytearray} {code} Another issue found with the feature is that the schema of union is null when a column in one of the relations has a complex type with null inner schema. I will submit another patch with fix for these issues. 'union onschema' does handle some cases involving 'namespaced' column names in schema - Key: PIG-1610 URL: https://issues.apache.org/jira/browse/PIG-1610 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.8.0 Attachments: PIG-1610.1.patch case 1: grunt> describe f; f: {l1::a: bytearray,l1::b: bytearray} grunt> describe l1; l1: {a: bytearray,b: bytearray} grunt> dump f; (1,11) (2,22) (3,33) grunt> dump l1; (1,11) (2,22) (3,33) grunt> u = union onschema f, l1; grunt> describe u; u: {l1::a: bytearray,l1::b: bytearray} -- the dump u gives incorrect results grunt> dump u; (,) (,) (,) (1,11) (2,22) (3,33) case 2: grunt> u = union onschema l1, f; grunt> describe u; 2010-09-13 15:11:13,877 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1108: Duplicate schema alias: l1::a Details at logfile: /Users/tejas/pig_unions_err2/trunk/pig_1284410413970.log -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Gibbon updated PIG-366: -- Attachment: org.apache.pig.pigpen_0.7.4.jar PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.3.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Gibbon updated PIG-366: -- Attachment: org.apache.pig.pigpen-0.7.4.tar.gz PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Gibbon updated PIG-366: -- Attachment: (was: org.apache.pig.pigpen-0.7.3.tar.gz) PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Gibbon updated PIG-366: -- Attachment: org.apache.pig.pigpen_0.7.4.jar PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, org.apache.pig.pigpen_0.7.4.jar, org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Gibbon updated PIG-366: -- Attachment: (was: org.apache.pig.pigpen_0.7.4.jar) PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor
[ https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909897#action_12909897 ] Robert Gibbon commented on PIG-366: --- Here's the README:
* Download the latest version of the jar binary. Right now that's org.apache.pig.pigpen_0.7.4.jar.
* Put the jar in the $ECLIPSE_HOME/plugins/ directory, where $ECLIPSE_HOME is your Eclipse installation directory.
* Optionally edit the file $ECLIPSE_HOME/eclipse.ini and add the parameter -clean (no quotes) to make sure the deployment is fresh.
* Start Eclipse.
* Open a .pig script or make a new one. Syntax colouring etc. should appear when you start keying in your script. If it doesn't, post your report here.
* When you save a script for the first time there might be a slight delay. This is normal. Any errors in your script should be marked up so you can see them.
* To run a script on the cluster, first open the Eclipse preferences dialog, select PigPen, and change the settings to your liking. You must have configuration.path pointing to where your Pig config files are located (typically $PIG_HOME/conf/). You should specify your preferred Pig runtime jar (typically $PIG_HOME/pig-x.x.x-core.jar). log.path defaults to your temp directory, but you can set it to whatever you like. ssh.gateway is untested, so if you don't use it, delete that key. Go ahead and click OK.
* Select the script you want to run from the package explorer and click the little pig icon on the toolbar. This kicks off a new JVM and submits your job. You can track it in the console. If you want to cancel it for some reason, click the little pig icon with the red cross next to the console tab. This simply kills the JVM process.
That's it for now. Have fun...
PigPen - Eclipse plugin for a graphical PigLatin editor --- Key: PIG-366 URL: https://issues.apache.org/jira/browse/PIG-366 Project: Pig Issue Type: New Feature Reporter: Shubham Chopra Assignee: Robert Gibbon Priority: Minor Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz This is an Eclipse plugin that provides a GUI that can help users create PigLatin scripts and see the example generator outputs on the fly and submit the jobs to hadoop clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1479) Embed Pig in scripting languages
[ https://issues.apache.org/jira/browse/PIG-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909898#action_12909898 ] Julien Le Dem commented on PIG-1479: The -g option on the command line should take two arguments: the scripting implementation instance name and the script itself. That way we can have several scripting implementations.
{noformat}
java -cp pig.jar:jython.jar org.apache.pig.Main -x local -g jython script/tc.py
{noformat}
{code}
case GREEK: {
    // Look up the scripting implementation registered under instanceName
    ScriptEngine scriptEngine = ScriptEngine.getInstance(instanceName);
    // Run the script file against a PigServer built from the current context
    scriptEngine.run(new PigServer(pigContext), file);
    return ReturnCode.SUCCESS;
}
{code}
Embed Pig in scripting languages Key: PIG-1479 URL: https://issues.apache.org/jira/browse/PIG-1479 Project: Pig Issue Type: New Feature Reporter: Julien Le Dem Attachments: PIG-1479.patch, PIG-1479_2.patch, pig-greek-test.tar, pig-greek.tgz It should be possible to embed Pig calls in a scripting language and make functions defined in the same script available as UDFs. This is a spin-off of https://issues.apache.org/jira/browse/PIG-928 which lets users define UDFs in scripting languages. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
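The lookup-by-name idea proposed in the comment above can be sketched roughly as follows. This is only an illustration, not Pig's actual code: the Python ScriptEngine class, its register/get_instance methods, the JythonEngine stub, and parse_g_option are all hypothetical names made up for this sketch. It shows how a two-argument -g option (engine name, then script path) could select one of several registered scripting implementations.

```python
from typing import Callable, Dict, List, Tuple


class ScriptEngine:
    """Hypothetical base class for this sketch; not Pig's real ScriptEngine API."""

    # Maps an instance name (e.g. "jython") to a factory for that engine.
    _registry: Dict[str, Callable[[], "ScriptEngine"]] = {}

    @classmethod
    def register(cls, name: str, factory: Callable[[], "ScriptEngine"]) -> None:
        cls._registry[name] = factory

    @classmethod
    def get_instance(cls, name: str) -> "ScriptEngine":
        if name not in cls._registry:
            raise ValueError(f"unknown scripting implementation: {name}")
        return cls._registry[name]()

    def run(self, script_path: str) -> str:
        raise NotImplementedError


class JythonEngine(ScriptEngine):
    """Stub engine: reports what it would run instead of executing anything."""

    def run(self, script_path: str) -> str:
        return f"jython ran {script_path}"


ScriptEngine.register("jython", JythonEngine)


def parse_g_option(argv: List[str]) -> Tuple[str, str]:
    """Return (engine name, script path) from the two arguments after -g."""
    i = argv.index("-g")
    if len(argv) < i + 3:
        raise ValueError("-g expects two arguments: <engine name> <script>")
    return argv[i + 1], argv[i + 2]


if __name__ == "__main__":
    name, script = parse_g_option(["-x", "local", "-g", "jython", "script/tc.py"])
    print(ScriptEngine.get_instance(name).run(script))
```

Keeping the engines behind a name-keyed registry means a second scripting implementation only needs one more register call; the command-line handling does not change.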
[jira] Updated: (PIG-1607) pig should have separate javadoc.jar in the maven repository
[ https://issues.apache.org/jira/browse/PIG-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niraj rai updated PIG-1607: --- Attachment: PIG-1607_4.patch fixed the package structure of the javadoc. pig should have separate javadoc.jar in the maven repository Key: PIG-1607 URL: https://issues.apache.org/jira/browse/PIG-1607 Project: Pig Issue Type: Bug Reporter: niraj rai Assignee: niraj rai Attachments: PIG-1607_0.patch, PIG-1607_1.patch, PIG-1607_2.patch, PIG-1607_3.patch, PIG-1607_4.patch At this moment, javadoc is part of the source.jar but pig should have separate javadoc.jar in the maven repository. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1479) Embed Pig in scripting languages
[ https://issues.apache.org/jira/browse/PIG-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909901#action_12909901 ] Julien Le Dem commented on PIG-1479: The end-of-loop condition in the script can just test whether to_join_n is empty. It was testing both because it did not know which one was to_join_n.
{code}
if (not P.result(to_join_n).iterator().hasNext()):
{code}
Embed Pig in scripting languages Key: PIG-1479 URL: https://issues.apache.org/jira/browse/PIG-1479 Project: Pig Issue Type: New Feature Reporter: Julien Le Dem Attachments: PIG-1479.patch, PIG-1479_2.patch, pig-greek-test.tar, pig-greek.tgz It should be possible to embed Pig calls in a scripting language and make functions defined in the same script available as UDFs. This is a spin-off of https://issues.apache.org/jira/browse/PIG-928 which lets users define UDFs in scripting languages. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (PIG-1613) Explain how different UDF interfaces are used
Explain how different UDF interfaces are used - Key: PIG-1613 URL: https://issues.apache.org/jira/browse/PIG-1613 Project: Pig Issue Type: Improvement Components: documentation Affects Versions: 0.7.0 Reporter: Olga Natkovich Assignee: Corinne Chandel Fix For: 0.8.0 The current documentation describes individual UDF interfaces such as Algebraic and Accumulator but not their precedence, how they interact with each other, or why you might want to implement several of them. Corinne, I will add release notes to this JIRA shortly. Don't worry about it till then. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1608: Fix Version/s: 0.9.0 Affects Version/s: 0.8.0 pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: niraj rai Assignee: niraj rai Fix For: 0.9.0 Attachments: PIG-1608_0.patch, PIG-1608_1.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar
[ https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-1608: Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Resolution: Fixed Patch committed to trunk. Thanks Niraj! pig should always include pig-default.properties and pig.properties in the pig.jar -- Key: PIG-1608 URL: https://issues.apache.org/jira/browse/PIG-1608 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: niraj rai Assignee: niraj rai Fix For: 0.9.0 Attachments: PIG-1608_0.patch, PIG-1608_1.patch pig should always include pig-default.properties and pig.properties as a part of the pig.jar file -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1613) Explain how different UDF interfaces are used
[ https://issues.apache.org/jira/browse/PIG-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-1613: Release Note: I think this should go into Advanced Topics in the UDF manual.
There are multiple ways for a UDF to be invoked. The simplest UDF can just extend EvalFunc, which requires only the exec function to be implemented, as described in the How to Write a Simple Eval Function section. Every eval UDF must implement this. Additionally, if a function is algebraic, it can implement the Algebraic interface to significantly improve query performance in cases where the combiner can be used. The Aggregate Functions section covers this topic in detail. Finally, a function that can process tuples incrementally can also implement the Accumulator interface to reduce query memory consumption. The Accumulator Interface section explains this interface.
The exact method by which a UDF is invoked is selected by the optimizer based on the UDF type and the query. Note that only a single interface is used at any given time. The optimizer tries to find the most efficient way to execute the function. If the combiner is used and the function implements the Algebraic interface, then that interface is used to invoke the function. If the combiner is not invoked but the accumulator can be used and the function implements the Accumulator interface, then that interface is used. If neither condition is satisfied, then the exec function is used to invoke the UDF.
Can one of the developers review the release notes to make sure they are accurate? Thanks.
Explain how different UDF interfaces are used - Key: PIG-1613 URL: https://issues.apache.org/jira/browse/PIG-1613 Project: Pig Issue Type: Improvement Components: documentation Affects Versions: 0.7.0 Reporter: Olga Natkovich Assignee: Corinne Chandel Fix For: 0.8.0 The current documentation describes individual UDF interfaces such as Algebraic and Accumulator but not their precedence, how they interact with each other, or why you might want to implement several of them. Corinne, I will add release notes to this JIRA shortly. Don't worry about it till then. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
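The precedence described in the release note above can be summarised as a small decision function. This is a toy model under stated assumptions, not Pig's optimizer: the Udf record, its two flags, and choose_invocation are hypothetical names, and the two boolean inputs simply stand in for "the combiner is used" and "the accumulator can be used" as worded in the note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Udf:
    """Hypothetical description of which optional interfaces a UDF implements."""

    implements_algebraic: bool = False
    implements_accumulator: bool = False


def choose_invocation(udf: Udf, combiner_used: bool, accumulator_usable: bool) -> str:
    """Pick the single interface used to invoke the UDF, following the note's order."""
    # Rule 1: combiner is used and the function implements Algebraic.
    if combiner_used and udf.implements_algebraic:
        return "Algebraic"
    # Rule 2: combiner is not invoked, accumulator can be used,
    # and the function implements Accumulator.
    if not combiner_used and accumulator_usable and udf.implements_accumulator:
        return "Accumulator"
    # Rule 3: neither condition holds, so fall back to exec,
    # which every eval UDF must implement.
    return "exec"


if __name__ == "__main__":
    both = Udf(implements_algebraic=True, implements_accumulator=True)
    print(choose_invocation(both, combiner_used=True, accumulator_usable=True))
    print(choose_invocation(both, combiner_used=False, accumulator_usable=True))
    print(choose_invocation(Udf(), combiner_used=False, accumulator_usable=False))
```

This also illustrates why a function might implement several interfaces: the same UDF can be invoked through Algebraic in one query and through Accumulator or exec in another, depending on what the query allows.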
[jira] Created: (PIG-1614) javacc.jar pulled twice from maven repository
javacc.jar pulled twice from maven repository - Key: PIG-1614 URL: https://issues.apache.org/jira/browse/PIG-1614 Project: Pig Issue Type: Bug Components: build Reporter: Daniel Dai Priority: Trivial Ant pulls javacc.jar twice from Maven: once as javacc.jar and once as javacc-4.2.jar. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.