Re: Pig Error Java_home not set
export JAVA_HOME=path/to/java on your local machine. Even better, add it to your .bashrc or .profile or their equivalents. On Mar 15, 2013, at 7:59 AM, oualid ait wafli oualid.aitwa...@gmail.com wrote: Hi, I am a beginner at Pig and want to start using it, but I don't know how to fix the JAVA_HOME error. How do I set the JAVA_HOME variable? Thanks
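A minimal sketch of the suggestion above; the JDK path is an example and should be replaced with wherever your JDK actually lives:

```shell
# Point JAVA_HOME at your JDK (example path) and put its bin/ on PATH.
export JAVA_HOME=/usr/lib/jvm/default-java
export PATH="$JAVA_HOME/bin:$PATH"
# To make this persistent, append the same two lines to ~/.bashrc or ~/.profile.
```

After re-sourcing the shell profile, pig should pick up the JDK without further configuration.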
Re: Error 4010 Cannot find hadoop configurations in classpath
Take a look at http://pig.apache.org/docs/r0.11.0/start.html#Running+the+Pig+Scripts+in+Mapreduce+Mode Additionally, you can try pig -x local if you wish to run it in local mode: http://pig.apache.org/docs/r0.11.0/start.html#Running+the+Pig+Scripts+in+Local+Mode Please follow the instructions on the Getting Started page; a lot of questions regarding setup are answered there. On Fri, Mar 15, 2013 at 8:54 AM, oualid ait wafli oualid.aitwa...@gmail.com wrote: Hi, when I start Pig this error appears: ERROR org.apache.pig.Main - ERROR 4010: Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath). If you plan to use local mode, please put -x local option in command line. I think the problem is in connecting Pig with Hadoop. Can somebody help me? Thank you
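A rough illustration of the two options above; the conf directory is an assumption, so point it at wherever your cluster's core-site.xml actually lives:

```shell
# Make the Hadoop configuration visible to Pig for mapreduce mode:
export PIG_CLASSPATH=/etc/hadoop/conf
# pig script.pig            # mapreduce mode (needs the cluster config above)
# pig -x local script.pig   # local mode, no Hadoop configuration required
```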
[jira] [Commented] (PIG-3249) Pig startup script prints out a wrong version of hadoop when using fat jar
[ https://issues.apache.org/jira/browse/PIG-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603540#comment-13603540 ] Daniel Dai commented on PIG-3249: - We can change to use java -cp pig.jar org.apache.hadoop.util.VersionInfo to get the Hadoop version. Pig startup script prints out a wrong version of hadoop when using fat jar -- Key: PIG-3249 URL: https://issues.apache.org/jira/browse/PIG-3249 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Prashant Kommireddi Labels: newbie Fix For: 0.12 Script suggests 0.20.2 is used with the bundled jar but we are using 1.0 at the moment.
{code}
# fall back to use fat pig.jar
if [ "$debug" == "true" ]; then
    echo "Cannot find local hadoop installation, using bundled hadoop 20.2"
fi
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3205) Passing arguments to python script does not work with -f option
[ https://issues.apache.org/jira/browse/PIG-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603560#comment-13603560 ] Cheolsoo Park commented on PIG-3205: +1. Do you mind deleting the following line, if it's not necessary, when you commit? {code} +System.out.println(---); {code} Passing arguments to python script does not work with -f option --- Key: PIG-3205 URL: https://issues.apache.org/jira/browse/PIG-3205 Project: Pig Issue Type: Bug Affects Versions: 0.10.1 Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.12 Attachments: PIG-3205.patch With pig sample.py arg1 arg2, arguments can be accessed in the embedded python script using sys.argv[]. But not in the case of pig -f sample.py arg1 arg2. In the case of ExecMode.FILE, we don't set PigContext.PIG_CMD_ARGS_REMAINDERS, so the arguments are not passed to JythonScriptEngine or GroovyScriptEngine. This is especially a problem with Oozie as it always uses the -f option to specify the pig script. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3205) Passing arguments to python script does not work with -f option
[ https://issues.apache.org/jira/browse/PIG-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603563#comment-13603563 ] Rohini Palaniswamy commented on PIG-3205: - Ah, sure. Thanks for catching it. Left over from some debug statements. Passing arguments to python script does not work with -f option --- Key: PIG-3205 URL: https://issues.apache.org/jira/browse/PIG-3205 Project: Pig Issue Type: Bug Affects Versions: 0.10.1 Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.12 Attachments: PIG-3205.patch With pig sample.py arg1 arg2, arguments can be accessed in the embedded python script using sys.argv[]. But not in the case of pig -f sample.py arg1 arg2. In the case of ExecMode.FILE, we don't set PigContext.PIG_CMD_ARGS_REMAINDERS, so the arguments are not passed to JythonScriptEngine or GroovyScriptEngine. This is especially a problem with Oozie as it always uses the -f option to specify the pig script. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2597) Move grunt from javacc to ANTLR
[ https://issues.apache.org/jira/browse/PIG-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603583#comment-13603583 ] Koji Noguchi commented on PIG-2597: --- bq. Jonathan, any update on this? I'm interested in the status as well. Does Boski have a plan to continue working on this? Move grunt from javacc to ANTLR --- Key: PIG-2597 URL: https://issues.apache.org/jira/browse/PIG-2597 Project: Pig Issue Type: Improvement Reporter: Jonathan Coveney Labels: GSoC2012 Attachments: pig02.diff Currently, the parser for queries is in ANTLR, but Grunt is still javacc. The parser is very difficult to work with, and next to impossible to understand or modify. ANTLR provides a much cleaner, more standard way to generate parsers/lexers/ASTs/etc, and moving Grunt from javacc to ANTLR would be huge as we continue to add features to Pig. This is a candidate project for Google Summer of Code 2012. More information about the program can be found at https://cwiki.apache.org/confluence/display/PIG/GSoc2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: LoadFunc and LoadMetadata
getPartitionKeys should be called by default. Did you use an AS clause in the load statement? That could add a foreach between Load and Filter, and getPartitionKeys will only be invoked if the filter is right after the load. Do an explain to check for it. Thanks, Daniel On Thu, Mar 14, 2013 at 8:37 PM, Jeff Yuan quaintena...@gmail.com wrote: Hi all, for CustomLoader (a class I'm implementing) which extends LoadFunc and implements LoadMetadata, the getPartitionKeys function is supposed to be called by PartitionFilterOptimizer, right? I put some debug statements in getPartitionKeys, but this function doesn't seem like it's ever called. I've read through some Pig source; optimization rules can be disabled by properties, but by default PartitionFilterOptimizer should be enabled. Also, in PartitionFilterOptimizer, I saw some other checks, like that the Filter operator cannot have any dependency other than Load, which is true in my case. Anyway, can someone shed some light on this? Am I understanding this interface incorrectly? My script is very simple (line 1 is load, line 2 is filter, and line 3 is store), so the Logical Plan should be very simple. Also, I'm testing this in Pig local mode, not sure if that matters. Greatly appreciate any hints!
[jira] [Commented] (PIG-3223) AvroStorage does not handle comma separated input paths
[ https://issues.apache.org/jira/browse/PIG-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603652#comment-13603652 ] Michael Kramer commented on PIG-3223: - [~cheolsoo], thanks for getting back to me so quickly! We're using variable substitution and input path generation via Oozie Coordinator. We include the hdfs://namenode:8020 at the beginning of our path templates, which I think is pretty standard (e.g. something like <uri-template>${nameNode}/data/</uri-template>). When Oozie constructs input paths to be passed to the pig script or map reduce job, it enumerates the paths via a comma separated list, something like hdfs://namenode:8020/data/1,hdfs://namenode:8020/data/2. This is how we figured out AvroStorage was breaking in the first place. A good coordinator/workflow example that is indicative of the types of workflows we're running can be found in the Oozie source examples: https://github.com/apache/oozie/blob/trunk/examples/src/main/apps/aggregator/coordinator.xml AvroStorage does not handle comma separated input paths --- Key: PIG-3223 URL: https://issues.apache.org/jira/browse/PIG-3223 Project: Pig Issue Type: Bug Components: piggybank Affects Versions: 0.10.0, 0.11 Reporter: Michael Kramer Assignee: Johnny Zhang Attachments: AvroStorage.patch, AvroStorage.patch-2, AvroStorageUtils.patch, AvroStorageUtils.patch-2, PIG-3223.patch.txt In pig 0.11, a patch was issued to AvroStorage to support globs and comma separated input paths (PIG-2492). While this works fine for glob-formatted input paths, it fails when issued a standard comma separated list of paths. fs.globStatus does not seem to be able to parse out such a list, and a java.net.URISyntaxException is thrown when toURI is called on the path. I have a working fix for this, but it's extremely ugly (basically checking if the string of input paths is globbed, and otherwise splitting on ','). I'm sure there's a more elegant solution.
I'd be happy to post the relevant methods and fixes if necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
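For illustration only (this is a shell analogy, not the actual Java fix in the attached AvroStorage patches): splitting the comma separated list first, and only then treating each entry as a path, avoids handing the whole list to a glob parser. The paths are the examples from the comment above.

```shell
INPUT="hdfs://namenode:8020/data/1,hdfs://namenode:8020/data/2"
# Split the comma separated list into one path per line, then handle
# each path individually, as the proposed fix does on the Java side.
for p in $(printf '%s\n' "$INPUT" | tr ',' '\n'); do
    echo "$p"
done
```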
Re: LoadFunc and LoadMetadata
Yes, I do use AS in the load statement. I thought filters are always pushed as close to the Load operators as possible? What kind of foreach is added? Thanks, Jeff On Fri, Mar 15, 2013 at 10:57 AM, Daniel Dai da...@hortonworks.com wrote: getPartitionKeys should be called by default. Did you use an AS clause in the load statement? That could add a foreach between Load and Filter, and getPartitionKeys will only be invoked if the filter is right after the load. Do an explain to check for it. Thanks, Daniel On Thu, Mar 14, 2013 at 8:37 PM, Jeff Yuan quaintena...@gmail.com wrote: Hi all, for CustomLoader (a class I'm implementing) which extends LoadFunc and implements LoadMetadata, the getPartitionKeys function is supposed to be called by PartitionFilterOptimizer, right? I put some debug statements in getPartitionKeys, but this function doesn't seem like it's ever called. I've read through some Pig source; optimization rules can be disabled by properties, but by default PartitionFilterOptimizer should be enabled. Also, in PartitionFilterOptimizer, I saw some other checks, like that the Filter operator cannot have any dependency other than Load, which is true in my case. Anyway, can someone shed some light on this? Am I understanding this interface incorrectly? My script is very simple (line 1 is load, line 2 is filter, and line 3 is store), so the Logical Plan should be very simple. Also, I'm testing this in Pig local mode, not sure if that matters. Greatly appreciate any hints!
[jira] [Assigned] (PIG-2630) Issue with setting b = a;
[ https://issues.apache.org/jira/browse/PIG-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Johnny Zhang reassigned PIG-2630: - Assignee: (was: Johnny Zhang) Issue with setting b = a; --- Key: PIG-2630 URL: https://issues.apache.org/jira/browse/PIG-2630 Project: Pig Issue Type: Bug Affects Versions: 0.10.0, 0.11 Reporter: Jonathan Coveney Fix For: 0.12 The following gives an error:
{code}
a = load 'thing' as (x:int);
b = a;
c = join a by x, b by x;
{code}
Error:
{code}
2012-04-03 14:02:47,434 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Pig script failed to parse: line 14, column 4 pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection with nothing to reference!
{code}
No issue with the following, however:
{code}
a = load 'thing' as (x:int);
b = foreach a generate *;
c = join a by x, b by x;
{code}
oh and here is the log:
{code}
$ cat pig_1333487146863.log
Pig Stack Trace
---
ERROR 1200: Pig script failed to parse: line 3, column 4 pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection with nothing to reference!
Failed to parse: Pig script failed to parse: line 3, column 4 pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection with nothing to reference!
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:182)
at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1566)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1539)
at org.apache.pig.PigServer.registerQuery(PigServer.java:541)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:945)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:392)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:190)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:535)
at org.apache.pig.Main.main(Main.java:153)
Caused by: line 3, column 4 pig script failed to validate: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection with nothing to reference!
at org.apache.pig.parser.LogicalPlanBuilder.buildJoinOp(LogicalPlanBuilder.java:363)
at org.apache.pig.parser.LogicalPlanGenerator.join_clause(LogicalPlanGenerator.java:11441)
at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1491)
at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:791)
at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:509)
at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:384)
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:175)
... 10 more
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Contribute to PIG-3225
Can we have a GSoC entry to ANTLR-ize grunt? Who can mentor it? Russell Jurney http://datasyndrome.com On Mar 15, 2013, at 11:34 AM, Daniel Dai da...@hortonworks.com wrote: GSoC 2013 wiki is not up yet. You can find some information from last year's wiki: https://cwiki.apache.org/confluence/display/PIG/GSoc2012. Thanks, Daniel On Mon, Mar 11, 2013 at 6:38 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: + Gianmarco On Mon, Mar 11, 2013 at 11:20 AM, Sadari Jayawardena sjayawardena...@gmail.com wrote: I am a final-year undergraduate in Computer Science Engineering. I have good experience in Java programming and am interested in mathematics and statistics. I would like to contribute to this project through GSoC 2013 ( https://issues.apache.org/jira/browse/PIG-3225). I went through the Wikipedia link provided. Could I be provided with additional references and study materials? Thanks in advance -- Sadari Jayawardena Undergraduate Department of Computer Science Engineering University of Moratuwa
[jira] [Updated] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
[ https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi updated PIG-3194: - Attachment: PIG-3194_2.patch Uploading a new patch with Dmitriy's feedback incorporated. Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 --- Key: PIG-3194 URL: https://issues.apache.org/jira/browse/PIG-3194 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Kai Londenberg Assignee: Prashant Kommireddi Fix For: 0.11.1 Attachments: PIG-3194_2.patch, PIG-3194.patch The changes to ObjectSerializer.java in the following commit http://svn.apache.org/viewvc?view=revisionrevision=1403934 break compatibility with Hadoop 0.20.2 Clusters. The reason is, that the code uses methods from Apache Commons Codec 1.4 - which are not available in Apache Commons Codec 1.3 which is shipping with Hadoop 0.20.2. The offending methods are Base64.decodeBase64(String) and Base64.encodeBase64URLSafeString(byte[]) If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 0.20.2 Clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
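Not from the patch itself; just an illustration with the standard base64 tool of why the URL-safe variant matters: the standard and URL-safe alphabets differ in the characters used for values 62 and 63 ('+' '/' versus '-' '_'), and Commons Codec 1.3 predates the URL-safe helper methods the offending code relies on.

```shell
# Bytes 0xfb 0xff encode to "+/8=" in the standard alphabet; a URL-safe
# encoder such as Base64.encodeBase64URLSafeString would emit "-_8"
# (unpadded) for the same input.
printf '\373\377' | base64
```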
[jira] [Updated] (PIG-3244) Make PIG_HOME configurable
[ https://issues.apache.org/jira/browse/PIG-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3244: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Looks good. Committed to trunk. Make PIG_HOME configurable -- Key: PIG-3244 URL: https://issues.apache.org/jira/browse/PIG-3244 Project: Pig Issue Type: Improvement Reporter: Robert Schooley Priority: Minor Attachments: make-pig-home-configurable.patch It looks like the pig shell script in v0.11 exports PIG_HOME without first checking to see if it already exists. From line 78 in path/bin/pig:
{code}
# the root of the Pig installation
export PIG_HOME=`dirname $this`/..
{code}
The supplied patch checks to see if the env has already been set prior to setting it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
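A sketch of the behavior the patch introduces (the fallback location is hypothetical): only derive PIG_HOME when the environment has not already set it.

```shell
# Respect a pre-set PIG_HOME; otherwise fall back to deriving it from
# the script location as bin/pig does. /opt/pig stands in for the real
# `dirname $this`/.. expression here.
if [ -z "$PIG_HOME" ]; then
    export PIG_HOME=/opt/pig
fi
echo "$PIG_HOME"
```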
[jira] [Updated] (PIG-3244) Make PIG_HOME configurable
[ https://issues.apache.org/jira/browse/PIG-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3244: Assignee: Robert Schooley Make PIG_HOME configurable -- Key: PIG-3244 URL: https://issues.apache.org/jira/browse/PIG-3244 Project: Pig Issue Type: Improvement Reporter: Robert Schooley Assignee: Robert Schooley Priority: Minor Attachments: make-pig-home-configurable.patch It looks like the pig shell script in v0.11 exports PIG_HOME without first checking to see if it already exists. from line 78 in path/bin/pig: \# the root of the Pig installation export PIG_HOME=`dirname $this`/.. The supplied patch checks to see if the env has already been set prior to setting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: LoadFunc and LoadMetadata
Yes, in theory the filter should be pushed above the foreach. I don't know what happened; the easiest way to find out is to do an explain and check the plan. Daniel On Fri, Mar 15, 2013 at 11:32 AM, Jeff Yuan quaintena...@gmail.com wrote: Yes, I do use AS in the load statement. I thought filters are always pushed as close to the Load operators as possible? What kind of foreach is added? Thanks, Jeff On Fri, Mar 15, 2013 at 10:57 AM, Daniel Dai da...@hortonworks.com wrote: getPartitionKeys should be called by default. Did you use an AS clause in the load statement? That could add a foreach between Load and Filter, and getPartitionKeys will only be invoked if the filter is right after the load. Do an explain to check for it. Thanks, Daniel On Thu, Mar 14, 2013 at 8:37 PM, Jeff Yuan quaintena...@gmail.com wrote: Hi all, for CustomLoader (a class I'm implementing) which extends LoadFunc and implements LoadMetadata, the getPartitionKeys function is supposed to be called by PartitionFilterOptimizer, right? I put some debug statements in getPartitionKeys, but this function doesn't seem like it's ever called. I've read through some Pig source; optimization rules can be disabled by properties, but by default PartitionFilterOptimizer should be enabled. Also, in PartitionFilterOptimizer, I saw some other checks, like that the Filter operator cannot have any dependency other than Load, which is true in my case. Anyway, can someone shed some light on this? Am I understanding this interface incorrectly? My script is very simple (line 1 is load, line 2 is filter, and line 3 is store), so the Logical Plan should be very simple. Also, I'm testing this in Pig local mode, not sure if that matters. Greatly appreciate any hints!
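One way to follow the "do an explain" suggestion above (the loader, schema, and file names here are made up): write out the three-line script and run explain on it, then look for a FOREACH inserted between the LOAD and the FILTER.

```shell
# Write a minimal load/filter/store script to inspect (names are examples).
cat > check_plan.pig <<'EOF'
a = LOAD 'input' USING CustomLoader() AS (dt:chararray, val:int);
b = FILTER a BY dt == '2013-03-15';
STORE b INTO 'output';
EOF
# Then, with Pig installed, inspect the plan in local mode:
#   pig -x local -e 'explain -script check_plan.pig'
```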
[jira] [Commented] (PIG-3249) Pig startup script prints out a wrong version of hadoop when using fat jar
[ https://issues.apache.org/jira/browse/PIG-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603946#comment-13603946 ] Prashant Kommireddi commented on PIG-3249: -- Thanks Daniel, that's a nice approach.
{code}
# fall back to use fat pig.jar
if [ -f $PIG_HOME/pig.jar ]; then
    PIG_JAR=$PIG_HOME/pig.jar
else
    PIG_JAR=`echo $PIG_HOME/pig-?.!(*withouthadoop).jar`
fi

if [ -n "$PIG_JAR" ]; then
    CLASSPATH=${CLASSPATH}:$PIG_JAR
else
    echo "Cannot locate pig.jar. do 'ant jar', and try again"
    exit 1
fi

if [ "$debug" == "true" ]; then
    echo "Cannot find local hadoop installation, using bundled `java -cp $PIG_JAR org.apache.hadoop.util.VersionInfo | head -1`"
fi
{code}
Please note I have placed the debug statement below the code that looks for the pig jar. It makes sense that the debug statement executes only after the pig jar is found. Do you agree? I will upload the patch shortly. Pig startup script prints out a wrong version of hadoop when using fat jar -- Key: PIG-3249 URL: https://issues.apache.org/jira/browse/PIG-3249 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Prashant Kommireddi Labels: newbie Fix For: 0.12 Script suggests 0.20.2 is used with the bundled jar but we are using 1.0 at the moment.
{code}
# fall back to use fat pig.jar
if [ "$debug" == "true" ]; then
    echo "Cannot find local hadoop installation, using bundled hadoop 20.2"
fi
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
[ https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603984#comment-13603984 ] Dmitriy V. Ryaboy commented on PIG-3194: +1 Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 --- Key: PIG-3194 URL: https://issues.apache.org/jira/browse/PIG-3194 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Kai Londenberg Assignee: Prashant Kommireddi Fix For: 0.11.1 Attachments: PIG-3194_2.patch, PIG-3194.patch The changes to ObjectSerializer.java in the following commit http://svn.apache.org/viewvc?view=revisionrevision=1403934 break compatibility with Hadoop 0.20.2 Clusters. The reason is, that the code uses methods from Apache Commons Codec 1.4 - which are not available in Apache Commons Codec 1.3 which is shipping with Hadoop 0.20.2. The offending methods are Base64.decodeBase64(String) and Base64.encodeBase64URLSafeString(byte[]) If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 0.20.2 Clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (PIG-3249) Pig startup script prints out a wrong version of hadoop when using fat jar
[ https://issues.apache.org/jira/browse/PIG-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi reassigned PIG-3249: Assignee: Prashant Kommireddi Pig startup script prints out a wrong version of hadoop when using fat jar -- Key: PIG-3249 URL: https://issues.apache.org/jira/browse/PIG-3249 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Prashant Kommireddi Assignee: Prashant Kommireddi Labels: newbie Fix For: 0.12 Attachments: PIG-3249.patch Script suggests 0.20.2 is used with the bundled jar but we are using 1.0 at the moment. {code} # fall back to use fat pig.jar if [ $debug == true ]; then echo Cannot find local hadoop installation, using bundled hadoop 20.2 fi {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3249) Pig startup script prints out a wrong version of hadoop when using fat jar
[ https://issues.apache.org/jira/browse/PIG-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi updated PIG-3249: - Attachment: PIG-3249.patch Pig startup script prints out a wrong version of hadoop when using fat jar -- Key: PIG-3249 URL: https://issues.apache.org/jira/browse/PIG-3249 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Prashant Kommireddi Labels: newbie Fix For: 0.12 Attachments: PIG-3249.patch Script suggests 0.20.2 is used with the bundled jar but we are using 1.0 at the moment. {code} # fall back to use fat pig.jar if [ $debug == true ]; then echo Cannot find local hadoop installation, using bundled hadoop 20.2 fi {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Are we ready for 0.11.1 release?
I think all the critical patches we discussed as required for 0.11.1 have gone in -- is there anything else people want to finish up, or can we roll this? Current change log:

Release 0.11.1 (unreleased)

INCOMPATIBLE CHANGES

IMPROVEMENTS

PIG-2988: start deploying pigunit maven artifact part of Pig release process (njw45 via rohini)

PIG-3148: OutOfMemory exception while spilling stale DefaultDataBag. Extra option to gc() before spilling large bag. (knoguchi via rohini)

PIG-3216: Groovy UDFs documentation has minor typos (herberts via rohini)

PIG-3202: CUBE operator not documented in user docs (prasanth_j via billgraham)

OPTIMIZATIONS

BUG FIXES

PIG-3194: Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 (prkommireddi via dvryaboy)

PIG-3241: ConcurrentModificationException in POPartialAgg (dvryaboy)

PIG-3144: Erroneous map entry alias resolution leading to Duplicate schema alias errors (jcoveney via cheolsoo)

PIG-3212: Race Conditions in POSort and (Internal)SortedBag during Proactive Spill (kadeng via dvryaboy)

PIG-3206: HBaseStorage does not work with Oozie pig action and secure HBase (rohini)
Re: pig 0.11 candidate 2 feedback: Several problems
Looks like all outstanding 0.11.1 critical bugs are fixed. Time for an RC? Please let me know if I can help. On Fri, Mar 8, 2013 at 3:51 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Looks like Lohit found a critical bug we should fix for 11.1: https://issues.apache.org/jira/browse/PIG-3241 (only observed in hadoop 2.0) D On Wed, Mar 6, 2013 at 12:57 PM, Prashant Kommireddi prash1...@gmail.com wrote: Dmitriy, are the gc fixes all in for 0.11.1? PIG-3148 and PIG-3212 are the 2 JIRAs I know were fixed, any others? I have a patch up for 3194; I think we should be good for a release once that makes it in. -Prashant On Sat, Mar 2, 2013 at 11:16 AM, Prashant Kommireddi prash1...@gmail.com wrote: Great. I have commented regarding a possible approach for PIG-3194 http://goo.gl/UQ3zs. Please take a look when you folks have a chance. On Fri, Mar 1, 2013 at 7:00 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: I'd like to get the gc fix in as well, but it looks like Rohini is about to commit it, so we are good there. On Mar 1, 2013, at 11:33 AM, Bill Graham billgra...@gmail.com wrote: +1 to releasing Pig 0.11.1 when this is addressed. I should be able to help with the release again. On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi prash1...@gmail.com wrote: Hey guys, I wanted to start a conversation on this again. If Kai is not looking at PIG-3194, I can start working on it to get 0.11 compatible with 20.2. If everyone agrees, we should roll out 0.11.1 sooner than usual, and I volunteer to help with it in any way possible. Any objections to getting 0.11.1 out soon after 3194 is fixed? -Prashant On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney russell.jur...@gmail.com wrote: I stand corrected. Cool, 0.11 is good! On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20. Jarcec On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote: I agree -- this is a good release.
The bugs Kai pointed out should be fixed, but as they are not critical regressions, we can fix them in 0.11.1 (if someone wants to roll 0.11.1 the minute these fixes are committed, I won't mind and will dutifully vote for the release). I think the Hadoop 20.2 incompatibility is unfortunate, but IIRC this is fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?). FWIW Twitter's running CDH3 and this release works in our environment. At this point things that block a release are critical regressions in performance or correctness. D On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates ga...@hortonworks.com wrote: No. Bugs like these are supposed to be found and fixed after we branch from trunk (which happened several months ago in the case of 0.11). The point of RCs is to check that it's a good build, licenses are right, etc. Any bugs found this late in the game have to be seen as failures of earlier testing. Alan. On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote: Isn't the point of an RC to find and fix bugs like these? On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham billgra...@gmail.com wrote: Regarding Pig 11 rc2, I propose we continue with the current vote as is (which closes today EOD). Patches for 0.20.2 issues can be rolled into a Pig 0.11.1 release whenever they're available and tested. On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich onatkov...@yahoo.com wrote: I agree that supporting as much as we can is a good goal. The issue is who is going to be testing against all these versions? We found the issues under discussion because of a customer report, not because we consistently test against all versions. Perhaps when we decide which versions to support for the next release we also need to agree on who is going to be testing and maintaining compatibility with a particular version.
For instance, since Hadoop 23 compatibility is important for us at Yahoo, we have been maintaining compatibility with this version for 0.9 and 0.10 and will do the same for 0.11 and going forward. I think we would need others to step in and claim the versions of their interest. Olga From: Kai Londenberg kai.londenb...@googlemail.com To: dev@pig.apache.org Sent: Wednesday, February 20, 2013 1:51 AM Subject: Re: pig 0.11 candidate 2 feedback: Several problems Hi, I strongly agree with Jonathan here. If
[jira] [Commented] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
[ https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604023#comment-13604023 ] Prashant Kommireddi commented on PIG-3194: -- Thanks for review/commit, Dmitriy. Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 --- Key: PIG-3194 URL: https://issues.apache.org/jira/browse/PIG-3194 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Kai Londenberg Assignee: Prashant Kommireddi Fix For: 0.12, 0.11.1 Attachments: PIG-3194_2.patch, PIG-3194.patch The changes to ObjectSerializer.java in the following commit http://svn.apache.org/viewvc?view=revisionrevision=1403934 break compatibility with Hadoop 0.20.2 Clusters. The reason is, that the code uses methods from Apache Commons Codec 1.4 - which are not available in Apache Commons Codec 1.3 which is shipping with Hadoop 0.20.2. The offending methods are Base64.decodeBase64(String) and Base64.encodeBase64URLSafeString(byte[]) If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 0.20.2 Clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
[ https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604037#comment-13604037 ] Prashant Kommireddi commented on PIG-3194: -- Kai, can you confirm 11.1 works for you?
Re: Contribute to PIG-3225
That's PIG-2597. I am not sure about its status; if it is not done, we can continue it this year. Daniel

On Fri, Mar 15, 2013 at 11:52 AM, Russell Jurney russell.jur...@gmail.com wrote: Can we have a GSoC entry to antlrize grunt? Who can mentor it? Russell Jurney http://datasyndrome.com

On Mar 15, 2013, at 11:34 AM, Daniel Dai da...@hortonworks.com wrote: GSoC 2013 wiki is not up yet. You can find some information from last year's wiki: https://cwiki.apache.org/confluence/display/PIG/GSoc2012. Thanks, Daniel

On Mon, Mar 11, 2013 at 6:38 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote: + Gianmarco

On Mon, Mar 11, 2013 at 11:20 AM, Sadari Jayawardena sjayawardena...@gmail.com wrote: I am a final-year undergraduate in Computer Science Engineering. I have good experience in Java programming and am interested in mathematics and statistics. I would like to contribute to this project through GSoC 2013 (https://issues.apache.org/jira/browse/PIG-3225). I went through the Wikipedia link provided. Could I be provided with additional references and study materials? Thanks in advance -- Sadari Jayawardena, Undergraduate, Department of Computer Science Engineering, University of Moratuwa
Re: Are we ready for 0.11.1 release?
Can I put PIG-3132 in? Thanks, Daniel

On Fri, Mar 15, 2013 at 5:55 PM, Julien Le Dem jul...@twitter.com wrote: +1 for a new release

On Friday, March 15, 2013, Dmitriy Ryaboy wrote: I think all the critical patches we discussed as required for 0.11.1 have gone in -- is there anything else people want to finish up, or can we roll this? Current change log:

Release 0.11.1 (unreleased)

INCOMPATIBLE CHANGES

IMPROVEMENTS
PIG-2988: start deploying pigunit maven artifact part of Pig release process (njw45 via rohini)
PIG-3148: OutOfMemory exception while spilling stale DefaultDataBag. Extra option to gc() before spilling large bag. (knoguchi via rohini)
PIG-3216: Groovy UDFs documentation has minor typos (herberts via rohini)
PIG-3202: CUBE operator not documented in user docs (prasanth_j via billgraham)

OPTIMIZATIONS

BUG FIXES
PIG-3194: Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 (prkommireddi via dvryaboy)
PIG-3241: ConcurrentModificationException in POPartialAgg (dvryaboy)
PIG-3144: Erroneous map entry alias resolution leading to Duplicate schema alias errors (jcoveney via cheolsoo)
PIG-3212: Race Conditions in POSort and (Internal)SortedBag during Proactive Spill (kadeng via dvryaboy)
PIG-3206: HBaseStorage does not work with Oozie pig action and secure HBase (rohini)
[jira] [Commented] (PIG-3181) MultiStorage - java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/PIG-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604062#comment-13604062 ] Johnny Zhang commented on PIG-3181: --- Hi, do you mean the below, right?
{noformat}
a = load '/input' as (f1, f2);
a = group a by f1;
logs = foreach a { generate group, a.f2; }
store logs into '/output/' using org.apache.pig.piggybank.storage.MultiStorage('/output/', '0');
{noformat}
Can you share how large your /input file is, so that I can try to reproduce it?

MultiStorage - java.lang.OutOfMemoryError: Java heap space
--
Key: PIG-3181
URL: https://issues.apache.org/jira/browse/PIG-3181
Project: Pig
Issue Type: Bug
Components: piggybank
Affects Versions: 0.10.0
Reporter: Fabian Alenius

Hi, I have a script that looks like this:
{noformat}
a = load '/input' as (f1, f2);
a = group a by f1;
a = foreach logs { generate group, a.f2; }
store logs into '/output/' using org.apache.pig.piggybank.storage.MultiStorage('/output/', '0');
{noformat}
But for some reason it fails with:
{noformat}
FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
at java.io.OutputStream.write(OutputStream.java:58)
at org.apache.pig.impl.util.StorageUtil.putField(StorageUtil.java:145)
at org.apache.pig.impl.util.StorageUtil.putField(StorageUtil.java:176)
at org.apache.pig.impl.util.StorageUtil.putField(StorageUtil.java:194)
at org.apache.pig.piggybank.storage.MultiStorage$MultiStorageOutputFormat$1.write(MultiStorage.java:208)
at org.apache.pig.piggybank.storage.MultiStorage$MultiStorageOutputFormat$1.write(MultiStorage.java:187)
at org.apache.pig.piggybank.storage.MultiStorage.putNext(MultiStorage.java:138)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:537)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:88)
at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:99)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:463)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:428)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:408)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:262)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:164)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:595)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:433)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
{noformat}
*stderr logs*
{noformat}
java.lang.RuntimeException: InternalCachedBag.spill() should not be called
at org.apache.pig.data.InternalCachedBag.spill(InternalCachedBag.java:159)
at org.apache.pig.impl.util.SpillableMemoryManager.handleNotification(SpillableMemoryManager.java:243)
at sun.management.NotificationEmitterSupport.sendNotification(NotificationEmitterSupport.java:138)
at sun.management.MemoryImpl.createNotification(MemoryImpl.java:171)
at
{noformat}
Re: Are we ready for 0.11.1 release?
+1 for a new release Julien

On Mar 15, 2013, at 17:08, Dmitriy Ryaboy dvrya...@gmail.com wrote: I think all the critical patches we discussed as required for 0.11.1 have gone in -- is there anything else people want to finish up, or can we roll this?
Re: Welcome our new PMC chair, Julien Le Dem
Thank you all ! Julien On Mar 10, 2013, at 21:31, Xuefu Zhang xzh...@inadco.com wrote: Congrats!!! --Xuefu On Sun, Mar 10, 2013 at 9:00 PM, Jarek Jarcec Cecho jar...@apache.orgwrote: Congratulations sir! Jarcec On Sun, Mar 10, 2013 at 08:55:55PM -0700, Aniket Mokashi wrote: Congrats Julien! On Sun, Mar 10, 2013 at 8:54 PM, Russell Jurney russell.jur...@gmail.comwrote: Congrats! Russell Jurney http://datasyndrome.com On Mar 10, 2013, at 8:53 PM, Daniel Dai da...@hortonworks.com wrote: It is a bit late, Apache board approved the nomination of Julien Le Dem as our Pig PMC Chair last month. Welcome Julien! Thanks, Daniel -- ...:::Aniket:::... Quetzalco@tl
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (33 issues)
Subscriber: pigdaily

Key Summary
PIG-3247 Piggybank functions to mimic OVER clause in SQL https://issues.apache.org/jira/browse/PIG-3247
PIG-3238 Pig current releases lack a UDF Stuff(). This UDF deletes a specified length of characters and inserts another set of characters at a specified starting point. https://issues.apache.org/jira/browse/PIG-3238
PIG-3237 Pig current releases lack a UDF MakeSet(). This UDF returns a set value (a string containing substrings separated by , characters) consisting of the strings that have the corresponding bit in the first argument https://issues.apache.org/jira/browse/PIG-3237
PIG-3235 Enable DEBUG log messages in unit tests by default https://issues.apache.org/jira/browse/PIG-3235
PIG-3215 [piggybank] Add LTSVLoader to load LTSV (Labeled Tab-separated Values) files https://issues.apache.org/jira/browse/PIG-3215
PIG-3210 Pig fails to start when it cannot write log to log files https://issues.apache.org/jira/browse/PIG-3210
PIG-3208 [zebra] TFile should not set io.compression.codec.lzo.buffersize https://issues.apache.org/jira/browse/PIG-3208
PIG-3205 Passing arguments to python script does not work with -f option https://issues.apache.org/jira/browse/PIG-3205
PIG-3198 Let users use any function from PigType -> PigType as if it were builtlin https://issues.apache.org/jira/browse/PIG-3198
PIG-3190 Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization https://issues.apache.org/jira/browse/PIG-3190
PIG-3183 rm or rmf commands should respect globbing/regex of path https://issues.apache.org/jira/browse/PIG-3183
PIG-3172 Partition filter push down does not happen when there is a non partition key map column filter https://issues.apache.org/jira/browse/PIG-3172
PIG-3166 Update eclipse .classpath according to ivy library.properties https://issues.apache.org/jira/browse/PIG-3166
PIG-3164 Pig current releases lack a UDF endsWith. This UDF tests if a given string ends with the specified suffix. https://issues.apache.org/jira/browse/PIG-3164
PIG-3141 Giving CSVExcelStorage an option to handle header rows https://issues.apache.org/jira/browse/PIG-3141
PIG-3123 Simplify Logical Plans By Removing Unneccessary Identity Projections https://issues.apache.org/jira/browse/PIG-3123
PIG-3122 Operators should not implicitly become reserved keywords https://issues.apache.org/jira/browse/PIG-3122
PIG-3114 Duplicated macro name error when using pigunit https://issues.apache.org/jira/browse/PIG-3114
PIG-3105 Fix TestJobSubmission unit test failure. https://issues.apache.org/jira/browse/PIG-3105
PIG-3088 Add a builtin udf which removes prefixes https://issues.apache.org/jira/browse/PIG-3088
PIG-3069 Native Windows Compatibility for Pig E2E Tests and Harness https://issues.apache.org/jira/browse/PIG-3069
PIG-3028 testGrunt dev test needs some command filters to run correctly without cygwin https://issues.apache.org/jira/browse/PIG-3028
PIG-3027 pigTest unit test needs a newline filter for comparisons of golden multi-line https://issues.apache.org/jira/browse/PIG-3027
PIG-3026 Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences https://issues.apache.org/jira/browse/PIG-3026
PIG-3024 TestEmptyInputDir unit test - hadoop version detection logic is brittle https://issues.apache.org/jira/browse/PIG-3024
PIG-3015 Rewrite of AvroStorage https://issues.apache.org/jira/browse/PIG-3015
PIG-3010 Allow UDF's to flatten themselves https://issues.apache.org/jira/browse/PIG-3010
PIG-2959 Add a pig.cmd for Pig to run under Windows https://issues.apache.org/jira/browse/PIG-2959
PIG-2955 Fix bunch of Pig e2e tests on Windows https://issues.apache.org/jira/browse/PIG-2955
PIG-2643 Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc https://issues.apache.org/jira/browse/PIG-2643
PIG-2641 Create toJSON function for all complex types: tuples, bags and maps https://issues.apache.org/jira/browse/PIG-2641
PIG-2591 Unit tests should not write to /tmp but respect java.io.tmpdir https://issues.apache.org/jira/browse/PIG-2591
PIG-1914 Support load/store JSON data in Pig https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at: https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-3249) Pig startup script prints out a wrong version of hadoop when using fat jar
[ https://issues.apache.org/jira/browse/PIG-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604064#comment-13604064 ] Daniel Dai commented on PIG-3249: - Sure, can't agree more.

Pig startup script prints out a wrong version of hadoop when using fat jar
--
Key: PIG-3249
URL: https://issues.apache.org/jira/browse/PIG-3249
Project: Pig
Issue Type: Bug
Affects Versions: 0.11
Reporter: Prashant Kommireddi
Assignee: Prashant Kommireddi
Labels: newbie
Fix For: 0.12
Attachments: PIG-3249.patch

Script suggests 0.20.2 is used with the bundled jar but we are using 1.0 at the moment.
{code}
# fall back to use fat pig.jar
if [ $debug == true ]; then
  echo Cannot find local hadoop installation, using bundled hadoop 20.2
fi
{code}
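Daniel's earlier suggestion on this ticket is to stop hard-coding the version string and ask the bundled jar itself, via java -cp pig.jar org.apache.hadoop.util.VersionInfo. A rough Java sketch of that idea follows; the class name and fallback string are illustrative, and reflection is used only so the sketch compiles and runs even without a Hadoop jar on the classpath:

```java
// Sketch: report the Hadoop version actually on the classpath instead of the
// hard-coded "hadoop 20.2" echoed by the startup script (PIG-3249).
// Only VersionInfo.getVersion() is real Hadoop API; everything else here
// (class name, "unknown" fallback) is illustrative.
public class BundledHadoopVersion {
    static String bundledHadoopVersion() {
        try {
            Class<?> vi = Class.forName("org.apache.hadoop.util.VersionInfo");
            return (String) vi.getMethod("getVersion").invoke(null);
        } catch (ReflectiveOperationException e) {
            return "unknown"; // no Hadoop jar on the classpath
        }
    }

    public static void main(String[] args) {
        System.out.println("using bundled hadoop " + bundledHadoopVersion());
    }
}
```

The startup script could print this output in its fallback branch rather than a fixed "20.2", so the message stays correct as the bundled Hadoop is upgraded.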
[jira] [Commented] (PIG-3243) Documentation error
[ https://issues.apache.org/jira/browse/PIG-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604066#comment-13604066 ] Daniel Dai commented on PIG-3243: - Would you like to create a patch? Changing src/docs/src/documentation/content/xdocs/udf.xml will do it. Documentation error --- Key: PIG-3243 URL: https://issues.apache.org/jira/browse/PIG-3243 Project: Pig Issue Type: Bug Reporter: Tolga Konik Priority: Trivial Original Estimate: 1h Remaining Estimate: 1h Error in the documentation on the web related to Python UDF usage: the document mentions JYTHON_PATH but it should be JYTHONPATH. Seasoned Jython users will easily figure this out, but for CPython users who are new to Jython, this error can easily be a show stopper. I observed this in 11.0 but it may be occurring in earlier versions. REFERENCE: http://pig.apache.org/docs/r0.11.0/udf.html#python-advanced Advanced Topics Importing Modules You can import Python modules in your Python script. Pig resolves Python dependencies recursively, which means Pig will automatically ship all dependent Python modules to the backend. Python modules should be found in the jython search path: JYTHON_HOME, JYTHON_PATH, or current directory.
[jira] [Commented] (PIG-3238) Pig current releases lack a UDF Stuff(). This UDF deletes a specified length of characters and inserts another set of characters at a specified starting point.
[ https://issues.apache.org/jira/browse/PIG-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604071#comment-13604071 ] Daniel Dai commented on PIG-3238: - Thanks for the patch. However, we need an example in the javadoc and a test case before we can commit it. Pig current releases lack a UDF Stuff(). This UDF deletes a specified length of characters and inserts another set of characters at a specified starting point. --- Key: PIG-3238 URL: https://issues.apache.org/jira/browse/PIG-3238 Project: Pig Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Sonu Prathap Fix For: 0.10.0 Attachments: Stuff.java.patch Pig current releases lack a UDF Stuff(). This UDF deletes a specified length of characters and inserts another set of characters at a specified starting point.
[jira] [Commented] (PIG-3235) Enable DEBUG log messages in unit tests by default
[ https://issues.apache.org/jira/browse/PIG-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604076#comment-13604076 ] Daniel Dai commented on PIG-3235: - Makes sense, but I would like to set the default to warning. We don't need that much logging in most cases. Enable DEBUG log messages in unit tests by default -- Key: PIG-3235 URL: https://issues.apache.org/jira/browse/PIG-3235 Project: Pig Issue Type: Improvement Components: tools Reporter: Cheolsoo Park Assignee: Cheolsoo Park Priority: Minor Attachments: PIG-3235.patch Currently, debug level messages are not logged for unit tests. It is helpful to enable them to debug unit tests.
[jira] [Updated] (PIG-3236) parametrize snapshot and staging repo id
[ https://issues.apache.org/jira/browse/PIG-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-3236: Attachment: PIG-3236-0.12.patch Attaching patch for trunk. parametrize snapshot and staging repo id - Key: PIG-3236 URL: https://issues.apache.org/jira/browse/PIG-3236 Project: Pig Issue Type: Improvement Affects Versions: 0.10.0, 0.11 Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan Attachments: PIG-3236-0.12.patch, PIG-3236.patch This would allow users to override the repo IDs to publish artifacts to different repos with different repo IDs.
[jira] [Resolved] (PIG-3236) parametrize snapshot and staging repo id
[ https://issues.apache.org/jira/browse/PIG-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved PIG-3236. - Resolution: Fixed Fix Version/s: 0.12 Hadoop Flags: Reviewed Committed to trunk.
[jira] [Updated] (PIG-3235) Add log4j.properties for unit tests
[ https://issues.apache.org/jira/browse/PIG-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3235: --- Summary: Add log4j.properties for unit tests (was: Enable DEBUG log messages in unit tests by default)
[jira] [Updated] (PIG-3235) Enable DEBUG log messages in unit tests by default
[ https://issues.apache.org/jira/browse/PIG-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3235: --- Attachment: PIG-3235-2.patch Sure. I lowered the logging level to INFO, which is the current default value.
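For reference, a test log4j.properties along the lines discussed here might look like the following. This is a sketch, not the actual committed file; the keys are standard log4j 1.2 properties, with the root level at INFO as settled on in the review:

```properties
# Hypothetical test/log4j.properties in the spirit of PIG-3235.
# Root level INFO by default; console output for test runs.
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}: %m%n
# Bump a specific package to DEBUG when chasing a particular test failure:
# log4j.logger.org.apache.pig=DEBUG
```

Keeping the default at INFO and opting individual loggers into DEBUG gives the debuggability the ticket asked for without flooding every test run with logs.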
[jira] [Commented] (PIG-3235) Add log4j.properties for unit tests
[ https://issues.apache.org/jira/browse/PIG-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604127#comment-13604127 ] Daniel Dai commented on PIG-3235: - +1
[jira] [Updated] (PIG-3235) Add log4j.properties for unit tests
[ https://issues.apache.org/jira/browse/PIG-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3235: --- Resolution: Fixed Fix Version/s: 0.12 Status: Resolved (was: Patch Available) Thank you Daniel for the review. I committed to trunk.