Re: Welcome to the new Pig PMC member Aniket Mokashi
Woo! Congrats Aniket!

On Tue, Jan 14, 2014 at 8:47 PM, Olga Natkovich onatkov...@yahoo.com wrote:
> Congrats, Aniket!

On Tuesday, January 14, 2014 8:32 PM, Tongjie Chen tongjie.c...@gmail.com wrote:
> Congrats Aniket!

On Tue, Jan 14, 2014 at 8:12 PM, Cheolsoo Park piaozhe...@gmail.com wrote:
> Congrats Aniket!

On Tue, Jan 14, 2014 at 7:01 PM, Jarek Jarcec Cecho jar...@apache.org wrote:
> Congratulations Aniket, good work!
> Jarcec

On Tue, Jan 14, 2014 at 06:52:10PM -0800, Julien Le Dem wrote:
> It's my pleasure to announce that Aniket Mokashi became the newest addition
> to the Pig PMC. Aniket has been actively contributing to Pig for years.
> Please join me in congratulating Aniket!
> Julien

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*
[jira] [Commented] (PIG-3623) Documentation for loadKey in HBaseStorage is incorrect
[ https://issues.apache.org/jira/browse/PIG-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857303#comment-13857303 ]

Bill Graham commented on PIG-3623:
----------------------------------

+1 for fixing the behavior to match the docs here. Great find [~mstefaniak].

> Documentation for loadKey in HBaseStorage is incorrect
> ------------------------------------------------------
> Key: PIG-3623
> URL: https://issues.apache.org/jira/browse/PIG-3623
> Project: Pig
> Issue Type: Bug
> Reporter: Michael Stefaniak
>
> The documentation for HBaseStorage (http://pig.apache.org/docs/r0.12.0/func.html#HBaseStorage) says:
>   -loadKey=(true|false) Load the row key as the first value in every tuple returned from HBase (default=false)
> However, looking at the source (http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/hbase/HBaseStorage.java), it is just doing a check for the existence of this option:
>   loadRowKey_ = configuredOptions_.hasOption(loadKey);
> So setting -loadKey=false in the options string still results in a true value.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
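The bug reported above comes down to checking only for an option's *presence*. The sketch below is a simplified, hypothetical stand-in (plain maps instead of the commons-cli `CommandLine` that HBaseStorage actually uses) contrasting the presence-only check with value-aware parsing:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not the actual HBaseStorage code: it mimics the bug
// where checking only for the option's presence makes "-loadKey=false"
// behave exactly like "-loadKey=true".
public class LoadKeyParsing {

    // Buggy variant: mirrors configuredOptions_.hasOption("loadKey").
    static boolean loadKeyByPresence(Map<String, String> opts) {
        return opts.containsKey("loadKey");
    }

    // Fixed variant: presence alone is not enough; the value must be parsed.
    static boolean loadKeyByValue(Map<String, String> opts) {
        String v = opts.get("loadKey");
        if (v == null) {
            return false;               // option absent -> default=false
        }
        return Boolean.parseBoolean(v); // "-loadKey=false" -> false
    }

    public static void main(String[] args) {
        Map<String, String> opts = new HashMap<>();
        opts.put("loadKey", "false");
        System.out.println(loadKeyByPresence(opts)); // true  (the bug)
        System.out.println(loadKeyByValue(opts));    // false (matches the docs)
    }
}
```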
[jira] [Commented] (PIG-3549) Print hadoop jobids for failed, killed job
[ https://issues.apache.org/jira/browse/PIG-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808038#comment-13808038 ]

Bill Graham commented on PIG-3549:
----------------------------------

Wow, fix of the year. +1

> Print hadoop jobids for failed, killed job
> ------------------------------------------
> Key: PIG-3549
> URL: https://issues.apache.org/jira/browse/PIG-3549
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.12.0
> Reporter: Aniket Mokashi
> Assignee: Aniket Mokashi
> Fix For: 0.12.1
> Attachments: PIG-3549.patch
>
> It would be better if we dumped the hadoop job ids for failed and killed jobs in the pig log. Right now, the log looks like the following:
> {noformat}
> ERROR org.apache.pig.tools.grunt.Grunt: ERROR 6017: Job failed! Error - NA
> INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher: Job job_pigexec_1 killed
> {noformat}
> From that it's hard to tell which hadoop job failed if there are multiple jobs running in parallel.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Updated] (PIG-3497) JobControlCompiler should only do reducer estimation when the job has a reduce phase
[ https://issues.apache.org/jira/browse/PIG-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated PIG-3497:
-----------------------------
    Assignee: Akihiro Matsukawa

> JobControlCompiler should only do reducer estimation when the job has a reduce phase
> ------------------------------------------------------------------------------------
> Key: PIG-3497
> URL: https://issues.apache.org/jira/browse/PIG-3497
> Project: Pig
> Issue Type: Bug
> Reporter: Akihiro Matsukawa
> Assignee: Akihiro Matsukawa
> Priority: Minor
> Attachments: reducer_estimation.patch
>
> Currently, JobControlCompiler estimates the number of reducers required (by default based on the input size into the mappers) regardless of whether the job has a reduce phase. This is unnecessary, especially when running more complex custom reducer estimators. Change it to only estimate reducers when necessary.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
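The guard PIG-3497 proposes can be sketched as below. All names here are invented for illustration; the real logic lives inside Pig's JobControlCompiler, and the one-reducer-per-GB default is only an assumed stand-in for Pig's input-size heuristic:

```java
// Hypothetical sketch of "only estimate reducers when the job has a reduce
// phase": map-only jobs skip estimation entirely instead of paying for it.
public class ReducerEstimationGuard {

    interface ReducerEstimator {
        int estimateNumberOfReducers(long inputSizeBytes);
    }

    // Assumed default estimator: roughly one reducer per GiB of map input.
    static final ReducerEstimator BY_INPUT_SIZE =
        bytes -> (int) Math.max(1, bytes / (1L << 30));

    static int reducersFor(boolean hasReducePhase, long inputSizeBytes) {
        if (!hasReducePhase) {
            return 0;   // map-only job: no estimation at all
        }
        return BY_INPUT_SIZE.estimateNumberOfReducers(inputSizeBytes);
    }

    public static void main(String[] args) {
        System.out.println(reducersFor(false, 5L << 30)); // 0: estimation skipped
        System.out.println(reducersFor(true, 5L << 30));  // 5
    }
}
```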
[jira] [Commented] (PIG-3455) Pig 0.11.1 OutOfMemory error
[ https://issues.apache.org/jira/browse/PIG-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770256#comment-13770256 ]

Bill Graham commented on PIG-3455:
----------------------------------

+1, much better.

> Pig 0.11.1 OutOfMemory error
> ----------------------------
> Key: PIG-3455
> URL: https://issues.apache.org/jira/browse/PIG-3455
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Shubham Chopra
> Priority: Critical
> Fix For: 0.12, 0.11.2
> Attachments: PIG-3455-1.patch
>
> When running Pig on a relatively large script (around 1.5k lines, 85 assignments), Pig fails with the following error even before any jobs are fired:
> Pig Stack Trace
> ---------------
> ERROR 2998: Unhandled internal error. Java heap space
> java.lang.OutOfMemoryError: Java heap space
>   at java.util.Arrays.copyOf(Arrays.java:2882)
>   at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>   at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
>   at java.lang.StringBuilder.append(StringBuilder.java:119)
>   at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.depthFirstLP(LogicalPlanPrinter.java:83)
>   at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.visit(LogicalPlanPrinter.java:69)
>   at org.apache.pig.newplan.logical.relational.LogicalPlan.getSignature(LogicalPlan.java:122)
>   at org.apache.pig.PigServer.execute(PigServer.java:1237)
>   at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
>   at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
>   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>   at org.apache.pig.Main.run(Main.java:604)
>   at org.apache.pig.Main.main(Main.java:157)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> The same script works fine with Pig-0.10.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3419) Pluggable Execution Engine
[ https://issues.apache.org/jira/browse/PIG-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767227#comment-13767227 ]

Bill Graham commented on PIG-3419:
----------------------------------

We should at least annotate the new interfaces as evolving so we don't need to evolve them in a backward-compatible way just yet.

> Pluggable Execution Engine
> --------------------------
> Key: PIG-3419
> URL: https://issues.apache.org/jira/browse/PIG-3419
> Project: Pig
> Issue Type: New Feature
> Affects Versions: 0.12
> Reporter: Achal Soni
> Assignee: Achal Soni
> Priority: Minor
> Fix For: 0.12
> Attachments: execengine.patch, mapreduce_execengine.patch, stats_scriptstate.patch, test_failures.txt, test_suite.patch, updated-8-22-2013-exec-engine.patch, updated-8-23-2013-exec-engine.patch, updated-8-27-2013-exec-engine.patch, updated-8-28-2013-exec-engine.patch, updated-8-29-2013-exec-engine.patch
>
> In an effort to adapt Pig to work using Apache Tez (https://issues.apache.org/jira/browse/TEZ), I made some changes to allow for a cleaner ExecutionEngine abstraction than existed before. The changes are not that major, as Pig was already relatively abstracted out between the frontend and backend. The changes in the attached commit are essentially the barebones changes -- I tried not to change the structure of Pig's different components too much. I think it will be interesting to see in the future how we can refactor more areas of Pig to really honor this abstraction between the frontend and backend.
> Some of the changes were to reinstate an ExecutionEngine interface to tie together the frontend and backend, make the changes in Pig to delegate to the EE when necessary, and create an MRExecutionEngine that implements this interface. Other work included changing ExecType to cycle through the ExecutionEngines on the classpath and select the appropriate one (this is done using Java's ServiceLoader, exactly how MapReduce chooses the framework to use between local and distributed mode).
> I also tried to make ScriptState, JobStats, and PigStats as abstract as possible in their current state. I think in the future some work will need to be done here to re-evaluate the usage of ScriptState and the responsibilities of the different statistics classes. I haven't touched the PPNL, but I think more abstraction is needed there, perhaps in a separate patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
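The ServiceLoader-based discovery mentioned above can be sketched as follows. The interface and names here are illustrative, not Pig's actual API: real implementations would be registered in a `META-INF/services/` file on the classpath; in this self-contained sketch none are registered, so the lookup falls back to a default:

```java
import java.util.ServiceLoader;

// Sketch of classpath-based engine discovery via java.util.ServiceLoader
// (names invented; Pig's real ExecutionEngine interface differs).
public class EngineDiscovery {

    public interface ExecutionEngine {
        String name();
    }

    static String selectEngine() {
        // ServiceLoader scans META-INF/services/ entries on the classpath.
        for (ExecutionEngine engine : ServiceLoader.load(ExecutionEngine.class)) {
            return engine.name();   // first provider found wins
        }
        return "mapreduce";         // fallback when nothing is registered
    }

    public static void main(String[] args) {
        System.out.println(selectEngine());
    }
}
```

MapReduce itself uses this same mechanism to choose between local and distributed framework implementations, which is why it is a natural fit here.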
Re: [ANNOUNCE] Congratulations to our new PMC members Rohini Palaniswamy and Cheolsoo Park
Congrats guys! Well deserved indeed.

On Wed, Sep 11, 2013 at 10:58 PM, Jarek Jarcec Cecho jar...@apache.org wrote:
> Congratulations Rohini and Cheolsoo, awesome work!
> Jarcec

On Wed, Sep 11, 2013 at 04:24:21PM -0700, Julien Le Dem wrote:
> Please welcome Rohini Palaniswamy and Cheolsoo Park as our latest Pig PMC
> members. Congrats Rohini and Cheolsoo!

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*
Re: Welcome new Pig Committer - Koji Noguchi
Congrats Koji!

On Tue, Sep 10, 2013 at 10:29 PM, Cheolsoo Park piaozhe...@gmail.com wrote:
> Congratulations Koji!

On Wed, Sep 11, 2013 at 7:32 AM, Prashant Kommireddi prash1...@gmail.com wrote:
> Congrats Koji!

On Tue, Sep 10, 2013 at 10:01 AM, Xuefu Zhang xzh...@cloudera.com wrote:
> Congratulations, Koji. Looking forward to more of your contributions.
> --Xuefu

On Tue, Sep 10, 2013 at 8:58 AM, Olga Natkovich onatkov...@yahoo.com wrote:
> It is my pleasure to announce that Koji Noguchi became the newest addition
> to the Pig Committers! Koji has been actively contributing to Pig for over
> a year now and has been a part of the larger Hadoop community (including as
> a Hadoop Committer) for many years. Please join me in congratulating Koji!
> Olga

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*
[jira] [Commented] (PIG-3455) Pig 0.11.1 OutOfMemory error
[ https://issues.apache.org/jira/browse/PIG-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763912#comment-13763912 ]

Bill Graham commented on PIG-3455:
----------------------------------

Thanks [~rohini.u] for kicking this off. Yes, a streaming-based hash function would be a much better approach. No need for backward compatibility. The signature contract is that it could change between Pig releases.

> Pig 0.11.1 OutOfMemory error
> ----------------------------
> Key: PIG-3455
> URL: https://issues.apache.org/jira/browse/PIG-3455
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Shubham Chopra
> Priority: Critical
>
> When running Pig on a relatively large script (around 1.5k lines, 85 assignments), Pig fails with the following error even before any jobs are fired:
> Pig Stack Trace
> ---------------
> ERROR 2998: Unhandled internal error. Java heap space
> java.lang.OutOfMemoryError: Java heap space
>   at java.util.Arrays.copyOf(Arrays.java:2882)
>   at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>   at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
>   at java.lang.StringBuilder.append(StringBuilder.java:119)
>   at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.depthFirstLP(LogicalPlanPrinter.java:83)
>   at org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.visit(LogicalPlanPrinter.java:69)
>   at org.apache.pig.newplan.logical.relational.LogicalPlan.getSignature(LogicalPlan.java:122)
>   at org.apache.pig.PigServer.execute(PigServer.java:1237)
>   at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
>   at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
>   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>   at org.apache.pig.Main.run(Main.java:604)
>   at org.apache.pig.Main.main(Main.java:157)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> The same script works fine with Pig-0.10.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
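The streaming-hash idea from the comment above can be sketched like this. The stack trace shows the heap blowing up while `LogicalPlanPrinter` appends the entire plan into one `StringBuilder` just to compute a signature; feeding each plan fragment into a `MessageDigest` incrementally bounds memory by the largest fragment instead of the whole plan. Names here are illustrative, not Pig's actual code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: hash plan fragments as they are visited instead of materializing
// the full plan string first. MD5 is an assumption; any digest works.
public class StreamingSignature {

    static String signature(Iterable<String> planFragments)
            throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        for (String fragment : planFragments) {
            // O(fragment) memory per step, never the whole plan at once.
            md.update(fragment.getBytes(StandardCharsets.UTF_8));
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Same bytes in, same signature out, regardless of chunking.
        System.out.println(signature(java.util.List.of("LOLoad", "LOForEach")));
        System.out.println(signature(java.util.List.of("LOLoadLOForEach")));
    }
}
```

Because the digest depends only on the byte stream, the signature is stable across chunkings, which preserves the "same plan, same signature" contract without holding the printed plan in memory.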
[jira] [Commented] (PIG-3048) Add mapreduce workflow information to job configuration
[ https://issues.apache.org/jira/browse/PIG-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753997#comment-13753997 ]

Bill Graham commented on PIG-3048:
----------------------------------

+1 to commit. Just one style nit re spaces:
{noformat}
(getFileName() != null)?getFileName():default
{noformat}
should instead be:
{noformat}
(getFileName() != null) ? getFileName() : default
{noformat}

> Add mapreduce workflow information to job configuration
> -------------------------------------------------------
> Key: PIG-3048
> URL: https://issues.apache.org/jira/browse/PIG-3048
> Project: Pig
> Issue Type: Improvement
> Reporter: Billie Rinaldi
> Assignee: Billie Rinaldi
> Fix For: 0.12
> Attachments: PIG-3048.patch, PIG-3048.patch, PIG-3048.patch
>
> Adding workflow properties to the job configuration would enable logging and analysis of workflows in addition to individual MapReduce jobs. Suggested properties include a workflow ID, workflow name, an adjacency list connecting nodes in the workflow, and the name of the current node in the workflow.
> mapreduce.workflow.id - a unique ID for the workflow, ideally prepended with the application name, e.g. pig_pigScriptId
> mapreduce.workflow.name - a name for the workflow, to distinguish this workflow from other workflows and to group different runs of the same workflow, e.g. the pig command line
> mapreduce.workflow.adjacency - an adjacency list for the workflow graph, encoded as mapreduce.workflow.adjacency.source node = comma-separated list of target nodes
> mapreduce.workflow.node.name - the name of the node corresponding to this MapReduce job in the workflow adjacency list

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3048) Add mapreduce workflow information to job configuration
[ https://issues.apache.org/jira/browse/PIG-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753998#comment-13753998 ]

Bill Graham commented on PIG-3048:
----------------------------------

Whoops, I was a minute too late. :)

> Add mapreduce workflow information to job configuration
> -------------------------------------------------------
> Key: PIG-3048
> URL: https://issues.apache.org/jira/browse/PIG-3048
> Project: Pig
> Issue Type: Improvement
> Reporter: Billie Rinaldi
> Assignee: Billie Rinaldi
> Fix For: 0.12
> Attachments: PIG-3048.patch, PIG-3048.patch, PIG-3048.patch
>
> Adding workflow properties to the job configuration would enable logging and analysis of workflows in addition to individual MapReduce jobs. Suggested properties include a workflow ID, workflow name, an adjacency list connecting nodes in the workflow, and the name of the current node in the workflow.
> mapreduce.workflow.id - a unique ID for the workflow, ideally prepended with the application name, e.g. pig_pigScriptId
> mapreduce.workflow.name - a name for the workflow, to distinguish this workflow from other workflows and to group different runs of the same workflow, e.g. the pig command line
> mapreduce.workflow.adjacency - an adjacency list for the workflow graph, encoded as mapreduce.workflow.adjacency.source node = comma-separated list of target nodes
> mapreduce.workflow.node.name - the name of the node corresponding to this MapReduce job in the workflow adjacency list

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
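The four properties proposed in PIG-3048 above can be assembled as in this sketch. Plain `java.util.Properties` stands in for the Hadoop job configuration, and the script ID, command line, and node names are made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Sketch of building the mapreduce.workflow.* properties described in
// PIG-3048 (stand-in types; the real code would write into the Hadoop
// job Configuration).
public class WorkflowProps {

    static Properties workflowConf(String scriptId, String commandLine,
                                   Map<String, String> adjacency, String node) {
        Properties conf = new Properties();
        conf.setProperty("mapreduce.workflow.id", "pig_" + scriptId);
        conf.setProperty("mapreduce.workflow.name", commandLine);
        // Adjacency list: one property per source node, value is the
        // comma-separated list of its target nodes.
        for (Map.Entry<String, String> e : adjacency.entrySet()) {
            conf.setProperty("mapreduce.workflow.adjacency." + e.getKey(),
                             e.getValue());
        }
        conf.setProperty("mapreduce.workflow.node.name", node);
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> adj = new LinkedHashMap<>();
        adj.put("scope-1", "scope-2,scope-3");
        Properties conf = workflowConf("abc123", "pig -f wordcount.pig",
                                       adj, "scope-1");
        System.out.println(conf.getProperty("mapreduce.workflow.id"));
        System.out.println(conf.getProperty("mapreduce.workflow.adjacency.scope-1"));
    }
}
```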
[jira] [Commented] (PIG-3382) Store data in hbase with more than 2 column family
[ https://issues.apache.org/jira/browse/PIG-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711315#comment-13711315 ]

Bill Graham commented on PIG-3382:
----------------------------------

Would you please attach your script and ideally some sample data, along with any errors or other relevant info that might help us troubleshoot and reproduce?

> Store data in hbase with more than 2 column family
> --------------------------------------------------
> Key: PIG-3382
> URL: https://issues.apache.org/jira/browse/PIG-3382
> Project: Pig
> Issue Type: Improvement
> Components: build, internal-udfs, parser
> Reporter: vikram s
>
> I am not able to store data in HBase with more than 2 column families. I used the STORE api from pig with the internal udf org.apache.pig.backend.hadoop.hbase.HBaseStorage.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-3382) Store data in hbase with more than 2 column family
[ https://issues.apache.org/jira/browse/PIG-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham resolved PIG-3382.
------------------------------
    Resolution: Not A Problem

> Store data in hbase with more than 2 column family
> --------------------------------------------------
> Key: PIG-3382
> URL: https://issues.apache.org/jira/browse/PIG-3382
> Project: Pig
> Issue Type: Improvement
> Components: build, internal-udfs, parser
> Reporter: vikram s
>
> I am not able to store data in HBase with more than 2 column families. I used the STORE api from pig with the internal udf org.apache.pig.backend.hadoop.hbase.HBaseStorage.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3330) please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants
[ https://issues.apache.org/jira/browse/PIG-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated PIG-3330:
-----------------------------
    Assignee: Bill Graham

> please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants
> ---------------------------------------------------------------------------------------
> Key: PIG-3330
> URL: https://issues.apache.org/jira/browse/PIG-3330
> Project: Pig
> Issue Type: Bug
> Reporter: Joseph Adler
> Assignee: Bill Graham
> Priority: Blocker
>
> I can't build Pig from trunk because several source files (including org.apache.pig.Main.java) require org.apache.pig.impl.PigImplConstants, but that class isn't available. I'm assuming someone left out a file on a recent commit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-3330) please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants
[ https://issues.apache.org/jira/browse/PIG-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham resolved PIG-3330.
------------------------------
    Resolution: Fixed

My bad, I made the commit last night and forgot 'svn add'. Just made the fix by adding the missing file.

> please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants
> ---------------------------------------------------------------------------------------
> Key: PIG-3330
> URL: https://issues.apache.org/jira/browse/PIG-3330
> Project: Pig
> Issue Type: Bug
> Reporter: Joseph Adler
> Assignee: Bill Graham
> Priority: Blocker
>
> I can't build Pig from trunk because several source files (including org.apache.pig.Main.java) require org.apache.pig.impl.PigImplConstants, but that class isn't available. I'm assuming someone left out a file on a recent commit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3317) disable optimizations via pig properties
[ https://issues.apache.org/jira/browse/PIG-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated PIG-3317:
-----------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed, thanks Travis!

> disable optimizations via pig properties
> ----------------------------------------
> Key: PIG-3317
> URL: https://issues.apache.org/jira/browse/PIG-3317
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.12
> Reporter: Travis Crawford
> Assignee: Travis Crawford
> Attachments: PIG-3317_disable_opts.1.patch, PIG-3317_disable_opts.2.patch, PIG-3317_disable_opts.3.patch, PIG-3317_disable_opts.4.patch
>
> Pig provides a number of optimizations which are described at [http://pig.apache.org/docs/r0.11.1/perf.html#optimization-rules]. As described in the docs, all or specific optimizations can be disabled via the command line. Currently the caller of a pig script must know which optimizations to disable when running, because that information cannot be set in the script itself. Nor can optimizations be disabled site-wide through pig.properties.
> Pig should allow disabling optimizations via properties so that pig scripts themselves can disable optimizations as needed, rather than the caller needing to know which optimizations to disable on the command line.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3326) Add PiggyBank to Maven Repository
[ https://issues.apache.org/jira/browse/PIG-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated PIG-3326:
-----------------------------
    Fix Version/s: 0.12

> Add PiggyBank to Maven Repository
> ---------------------------------
> Key: PIG-3326
> URL: https://issues.apache.org/jira/browse/PIG-3326
> Project: Pig
> Issue Type: New Feature
> Components: piggybank
> Reporter: Aaron Mitchell
> Priority: Minor
> Fix For: 0.12
>
> PiggyBank should be uploaded to the apache maven repository.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3326) Add PiggyBank to Maven Repository
[ https://issues.apache.org/jira/browse/PIG-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658831#comment-13658831 ]

Bill Graham commented on PIG-3326:
----------------------------------

Support for publishing piggybank to maven will be in Pig 0.12 thanks to PIG-3233.

> Add PiggyBank to Maven Repository
> ---------------------------------
> Key: PIG-3326
> URL: https://issues.apache.org/jira/browse/PIG-3326
> Project: Pig
> Issue Type: New Feature
> Components: piggybank
> Reporter: Aaron Mitchell
> Priority: Minor
> Fix For: 0.12
>
> PiggyBank should be uploaded to the apache maven repository.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: disable optimizations via pig properties
On May 13, 2013, 11:35 p.m., Bill Graham wrote:
> src/docs/src/documentation/content/xdocs/perf.xml, line 493
> https://reviews.apache.org/r/11032/diff/2/?file=290925#file290925line493
>
> Would you please specify that setting this value in both the pig properties file and the command line (or script) will be additive.

Travis Crawford wrote:
    Currently it works like this:
    (a) -optimizer_off command-line rules are always disabled.
    (b) The pig.optimizer.rules.disabled property works like other properties, where setting it in the script itself overwrites previously set values (from either the command line or pig.properties).
    Disabled rules are additive in that (a) + (b) will be disabled. However, within (b) only the last specified value of pig.optimizer.rules.disabled takes effect.
    I think this makes sense for how people will want to use the feature (and I think it is consistent with how other properties work).
    * Site administrators can specify default rules to disable via pig.properties.
    * Individual scripts can override the site defaults if needed.
    * Invokers of pig can supplement the rules to disable.
    Thoughts? If we want to be additive within (b) we'd also need a way to remove defaults set by site administrators, since the default should be a suggestion, not a requirement. That could easily be achieved with a "-" prefix that removes disabled rules, but I think we've covered the common use cases without introducing extra complexity.

Bill Graham wrote:
    "(b) The pig.optimizer.rules.disabled property works like other properties, where setting it in the script itself overwrites previously set values (from either the command line or pig.properties)."
    This implies SET in a script will override the command line (or properties).
    "Disabled rules are additive in that (a) + (b) will be disabled. However, within (b) only the last specified value of pig.optimizer.rules.disabled takes effect."
    This implies SET in a script (or properties) would be additive with the command line.
    Can you help clarify what I think sounds like a contradiction? Just trying to understand the implemented behavior more than propose a change to it.

Travis Crawford wrote:
    RE: "This implies SET in a script will override the command line (or properties)."
    A: Rules disabled via the -optimizer_off command-line flag are treated separately from rules disabled via the pig property.
    RE: "This implies SET in a script (or properties) would be additive with the command line."
    A: Correct - SET in a script is additive with rules disabled via the command-line flag. The rules to disable are the set of rules disabled on the command line + the rules disabled via the pig.optimizer.rules.disabled property.
    Pig's code currently uses a command-line flag to disable optimization rules rather than standard pig properties. I think the ideal state would be using a single property to disable rules, because properties are how pig configuration works in general. However, since there's currently a command-line flag to disable rules, it seems like we should keep it (perhaps deprecating it to allow removal in a future release).
    The proposal in this change is to:
    * preserve existing behavior by making sure the command-line flag continues to disable rules as it does today
    * add a new property that also lets you disable optimization rules; this is a standard pig property that can be set in all the ways one can currently set properties.
    Then we add the rules disabled via the command-line flag to the rules disabled via the property, and that's the full list of rules to disable.

Bill Graham wrote:
    Got it, thanks. In your original text I was overlooking the fact that you were discussing a flag, which is different than the property. All's clear now. The behavior makes sense to me.
    We should add a line to the docs to mention that when both -optimizer_off and pig.optimizer.rules.disabled are set, the union of the two rule sets is disabled.

Travis Crawford wrote:
    Sounds good, I'll clarify that section.
    Digging around I found this section which defines how pig properties are set (http://pig.apache.org/docs/r0.11.1/start.html#properties):
    * pig.properties
    * -D Pig property
    * -P properties file
    * set command
    Which is interesting, because until poking around in this change I didn't know the exact order they were applied in (and I'm just now learning about -P).
    What are your thoughts on deprecating the command-line argument? To minimize impact I'm fine leaving it as-is, but long-term I think it makes sense for pig to simply be configured through properties absent a good reason to do things differently.

Bill Graham wrote:
    I agree deprecating would be nice, but I'm ok leaving it as-is.
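The union semantics settled on in the review above can be sketched in a few lines. The helper names are invented for illustration; only the flag name (`-optimizer_off`) and property name (`pig.optimizer.rules.disabled`) come from the discussion:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch: the rules to disable are the union of the -optimizer_off
// command-line set and the last-set value of the
// pig.optimizer.rules.disabled property (a comma-separated list).
public class DisabledRules {

    static Set<String> rulesToDisable(Set<String> fromCommandLineFlag,
                                      String disabledRulesProperty) {
        Set<String> disabled = new LinkedHashSet<>(fromCommandLineFlag);
        if (disabledRulesProperty != null && !disabledRulesProperty.isEmpty()) {
            // Only the final value of the property applies; earlier SETs
            // or pig.properties values it replaced are not unioned in.
            disabled.addAll(Arrays.asList(disabledRulesProperty.split(",")));
        }
        return disabled;
    }

    public static void main(String[] args) {
        Set<String> flag = new LinkedHashSet<>(Arrays.asList("SplitFilter"));
        System.out.println(rulesToDisable(flag, "ColumnMapKeyPrune,MergeFilter"));
    }
}
```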
Re: Review Request: disable optimizations via pig properties
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11032/#review20545
-----------------------------------------------------------

Ship it!

My concerns have been addressed. Thanks!

- Bill Graham

On May 14, 2013, 5:23 p.m., Travis Crawford wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11032/
> -----------------------------------------------------------
>
> (Updated May 14, 2013, 5:23 p.m.)
>
> Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng.
>
> Description
> -----------
>
> Update pig to allow disabling optimizations via pig properties. Currently optimizations must be disabled via command-line options. Pig properties can be set in pig.properties, via set commands in scripts themselves, and via command-line -D options.
>
> The use case is, for scripts that require certain optimizations to be disabled, allowing the script itself to disable the optimization. Currently whatever runs the script needs to specially handle disabling the optimization for that specific query.
>
> This addresses bug PIG-3317.
>     https://issues.apache.org/jira/browse/PIG-3317
>
> Diffs
> -----
>
>   src/docs/src/documentation/content/xdocs/perf.xml 108ae7e
>   src/org/apache/pig/Main.java f97ed9f
>   src/org/apache/pig/PigConstants.java ea77e97
>   src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 4dab4e8
>   src/org/apache/pig/impl/PigImplConstants.java PRE-CREATION
>   src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java d26f381
>   test/org/apache/pig/test/TestEvalPipeline2.java 39cf807
>
> Diff: https://reviews.apache.org/r/11032/diff/
>
> Testing
> -------
>
> Manually tested on a fully-distributed cluster.
>
> THIS FAILS: PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig
> THIS WORKS: PIG_CONF_DIR=/etc/pig/conf ./bin/pig -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig
>
> Notice how -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune specifies a pig property, which could be in pig.properties, or the script itself.
>
> Failure message:
> Pig Stack Trace
> ---------------
> ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 97550 Input: 0 Column: 1)
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias null
>   at org.apache.pig.PigServer.explain(PigServer.java:1057)
>   at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:419)
>   at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:351)
>   at org.apache.pig.tools.grunt.Grunt.checkScript(Grunt.java:98)
>   at org.apache.pig.Main.run(Main.java:607)
>   at org.apache.pig.Main.main(Main.java:152)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: Error processing rule ColumnMapKeyPrune. Try -t ColumnMapKeyPrune
>   at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:122)
>   at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:281)
>   at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
>   at org.apache.pig.PigServer.explain(PigServer.java:1042)
>   ... 10 more
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 97550 Input: 0 Column: 1)
>   at org.apache.pig.newplan.logical.optimizer.ProjectionPatcher$ProjectionRewriter.visit(ProjectionPatcher.java:91)
>   at org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:207)
>   at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
>   at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
>   at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
>   at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:142)
>   at org.apache.pig.newplan.logical.relational.LOInnerLoad.accept(LOInnerLoad.java:128)
>   at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>   at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:124)
>   at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:76
[jira] [Updated] (PIG-3317) disable optimizations via pig properties
[ https://issues.apache.org/jira/browse/PIG-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3317: - Status: Open (was: Patch Available) Canceling patch since Travis and Julien identified issues with SET in scripts in https://reviews.apache.org/r/11032/. disable optimizations via pig properties Key: PIG-3317 URL: https://issues.apache.org/jira/browse/PIG-3317 Project: Pig Issue Type: Improvement Affects Versions: 0.12 Reporter: Travis Crawford Assignee: Travis Crawford Attachments: PIG-3317_disable_opts.1.patch Pig provides a number of optimizations which are described at [http://pig.apache.org/docs/r0.11.1/perf.html#optimization-rules]. As is described in the docs, all or specific optimizations can be disabled via the command-line. Currently the caller of a pig script must know which optimizations to disable when running because that information cannot be set in the script itself. Nor can optimizations be disabled site-wide through pig.properties. Pig should allow disabling optimizations via properties so that pig scripts themselves can disable optimizations as needed, rather than the caller needing to know what optimizations to disable on the command-line. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: disable optimizations via pig properties
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11032/#review20516 --- src/docs/src/documentation/content/xdocs/perf.xml https://reviews.apache.org/r/11032/#comment42293 Would you please specify that setting this value in both the pig properties file and the command line (or script) will be additive. - Bill Graham On May 13, 2013, 8:35 p.m., Travis Crawford wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11032/ --- (Updated May 13, 2013, 8:35 p.m.) Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng. Description --- Update pig to allow disabling optimizations via pig properties. Currently optimizations must be disabled via command-line options. Pig properties can be set in pig.properties, set commands in scripts themselves, and command-line -D options. The use-case is, for scripts that require certain optimizations to be disabled, allowing the script itself to disable the optimization. Currently whatever runs the script needs to specially handle disabling the optimization for that specific query. This addresses bug PIG-3317. https://issues.apache.org/jira/browse/PIG-3317 Diffs - src/docs/src/documentation/content/xdocs/perf.xml 108ae7e src/org/apache/pig/Main.java f97ed9f src/org/apache/pig/PigConstants.java ea77e97 src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 4dab4e8 src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java d26f381 test/org/apache/pig/test/TestEvalPipeline2.java 39cf807 Diff: https://reviews.apache.org/r/11032/diff/ Testing --- Manually tested on a fully-distributed cluster. THIS FAILS: PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig THIS WORKS: PIG_CONF_DIR=/etc/pig/conf ./bin/pig -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig Notice how -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune specifies a pig property, which could be in pig.properties, or the script itself. 
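The pig.optimizer.rules.disabled value shown in the testing commands above is a comma-separated list of rule names. A rough sketch of how such a value maps to a set of disabled rules (Python used purely for illustration; parse_disabled_rules is a hypothetical helper, not Pig's actual implementation):

```python
def parse_disabled_rules(value):
    """Split a comma-separated rule list into a set, ignoring blanks."""
    return {rule.strip() for rule in value.split(",") if rule.strip()}

# The property value from the review's testing section:
disabled = parse_disabled_rules("ColumnMapKeyPrune")

# An optimizer would then skip any rule whose name is in the set:
all_rules = ["ColumnMapKeyPrune", "SplitFilter", "MergeFilter"]
active_rules = [r for r in all_rules if r not in disabled]
print(active_rules)  # ['SplitFilter', 'MergeFilter']
```

The rule names themselves (e.g. ColumnMapKeyPrune) are real Pig optimizer rules; the parsing code is only a model of the comma-separated format.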
Re: Review Request: disable optimizations via pig properties
On May 13, 2013, 11:35 p.m., Bill Graham wrote:

  src/docs/src/documentation/content/xdocs/perf.xml, line 493
  https://reviews.apache.org/r/11032/diff/2/?file=290925#file290925line493

  Would you please specify that setting this value in both the pig properties file and the command line (or script) will be additive.

Travis Crawford wrote:

  Currently it works like this:
  (a) -optimizer_off command-line rules are always disabled.
  (b) The pig.optimizer.rules.disabled property works like other properties, where setting it in the script itself overwrites previously set values (from either the command line or pig.properties).

  Disabled rules are additive in that (a) + (b) will be disabled. However, within (b) only the last specified value of pig.optimizer.rules.disabled takes effect. I think this makes sense for how people will want to use the feature (and I think it is consistent with how other properties work).

  * Site administrators can specify default rules to disable via pig.properties.
  * Individual scripts can override the site defaults if needed.
  * Invokers of pig can supplement the rules to disable.

  Thoughts? If we want to be additive within (b) we'd also need a way to remove defaults set by site administrators, since the default should be a suggestion, not a requirement. That would easily be achieved with a - prefix that would remove disabled rules, but I think we've covered the common use-cases without introducing extra complexity.

(b) The pig.optimizer.rules.disabled property works like other properties, where setting in the script itself overwrites previously set values (from either the command-line or pig.properties).

This implies SET in a script will override the command line (or properties).

Disabled rules are additive in that (a) + (b) will be disabled. However, within (b) only the last specified value of pig.optimizer.rules.disabled takes effect.

This implies SET in a script (or properties) would be additive with the command line.
Can you help clarify what I think sounds like a contradiction? Just trying to understand the implemented behavior more than propose a change to it. - Bill --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11032/#review20516 ---
[jira] [Created] (PIG-3324) STARTSWITH documentation
Bill Graham created PIG-3324: Summary: STARTSWITH documentation Key: PIG-3324 URL: https://issues.apache.org/jira/browse/PIG-3324 Project: Pig Issue Type: Bug Reporter: Bill Graham PIG-2879 added support for STARTSWITH udf, which should be documented here: http://pig.apache.org/docs/r0.11.1/func.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3324) STARTSWITH documentation
[ https://issues.apache.org/jira/browse/PIG-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3324: - Fix Version/s: 0.12 STARTSWITH documentation Key: PIG-3324 URL: https://issues.apache.org/jira/browse/PIG-3324 Project: Pig Issue Type: Bug Reporter: Bill Graham Labels: documentation, newbie, simple Fix For: 0.12 PIG-2879 added support for STARTSWITH udf, which should be documented here: http://pig.apache.org/docs/r0.11.1/func.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: disable optimizations via pig properties
On May 13, 2013, 11:35 p.m., Bill Graham wrote: src/docs/src/documentation/content/xdocs/perf.xml, line 493 https://reviews.apache.org/r/11032/diff/2/?file=290925#file290925line493 Would you please specify that setting this value in both the pig properties file and the command line (or script) will be additive. Travis Crawford wrote: Currently it works like this: (a) -optimizer_off command-line rules are always disabled. (b) The pig.optimizer.rules.disabled property works like other properties, where setting in the script itself overwrites previously set values (from either the command-line or pig.properties). Disabled rules are additive in that (a) + (b) will be disabled. However, within (b) only the last specified value of pig.optimizer.rules.disabled takes effect. I think this makes sense for how people will want to use the feature (and I think is consistent with how other properties work). * Site administrators can specify default rules to disable via pig.properties. * Individual scripts can override the site defaults if needed. * Invokers of pig can supplement the rules to disable. Thoughts? If we want to be additive within (b) we'd also need a way to remove defaults set by site administrators, since the default should be a suggestion not requirement. That would easily be achieved with a - prefix that would remove disabled rules, but I think we've covered the common use-cases without introducing extra complexity. Bill Graham wrote: (b) The pig.optimizer.rules.disabled property works like other properties, where setting in the script itself overwrites previously set values (from either the command-line or pig.properties). This implies SET in a script will override the command line (or properties). Disabled rules are additive in that (a) + (b) will be disabled. However, within (b) only the last specified value of pig.optimizer.rules.disabled takes effect. This implies SET in a script (or properties) would be additive with the command line. 
Can you help clarify what I think sounds like a contradiction? Just trying to understand the implemented behavior more than propose a change to it.

Travis Crawford wrote:

  RE: This implies SET in a script will override the command line (or properties).
  A: Rules disabled via the -optimizer_off command-line flag are treated separately from values disabled via the pig property.

  RE: This implies SET in a script (or properties) would be additive with the command line.
  A: Correct - SET in a script is additive with rules disabled via the command-line flag. Rules to disable are the set of rules disabled on the command line + rules disabled via the pig.optimizer.rules.disabled property.

  Pig's code currently uses a command-line flag to disable optimization rules rather than standard pig properties. I think the ideal state would be using a single property to disable rules, because properties are how pig configuration works in general. However, since there's currently a command-line flag to disable rules it seems like we should keep it (perhaps deprecating it to allow removal in a future release). The proposal in this change is to:

  * Preserve existing behavior by making sure the command-line flag continues to disable rules as it does today.
  * Add a new property that also lets you disable optimization rules. This is a standard pig property that can be set in all the ways one can currently set properties.

  Then we add rules disabled via the command-line flag to rules disabled via the property, and that's the full list of rules to disable.

Got it, thanks. In your original text I was overlooking the fact that you were discussing a flag, which is different from the property. All's clear now. The behavior makes sense to me. We should add a line to the docs to mention that when both -optimizer_off and pig.optimizer.rules.disabled are set, the union of the two rule sets is disabled.

- Bill

--- This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/11032/#review20516 ---
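The behavior Bill and Travis converge on above — values of the property overwrite one another, while the flag set and the property set are unioned — can be sketched like this (Python used only to illustrate the semantics; effective_disabled is a hypothetical model, not Pig's code):

```python
def effective_disabled(flag_rules, property_values):
    """Model of the agreed semantics: rules from -optimizer_off are
    always disabled; for pig.optimizer.rules.disabled only the last
    value set (pig.properties, then -D, then SET in the script) takes
    effect; the final disabled set is the union of the two."""
    last_value = property_values[-1] if property_values else ""
    prop_rules = {r.strip() for r in last_value.split(",") if r.strip()}
    return set(flag_rules) | prop_rules

# pig.properties disables SplitFilter, but the script's SET overwrites
# that with MergeFilter; -optimizer_off contributes ColumnMapKeyPrune
# independently, so the union is what ends up disabled.
result = effective_disabled(
    flag_rules=["ColumnMapKeyPrune"],
    property_values=["SplitFilter", "MergeFilter"],  # in the order set
)
print(sorted(result))  # ['ColumnMapKeyPrune', 'MergeFilter']
```

Note how SplitFilter drops out: within the property, later values replace earlier ones rather than accumulating, exactly the point Travis makes about (b).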
Re: Review Request: disable optimizations via pig properties
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11032/#review20406 --- Ship it! Looks good to me. Way to be #prostyle with including the docs edits in the patch. The way we still use optimizerRules and pig.optimizer.rules in places for rules that are disabled and not enabled is way confusing, but we can fix that separately. - Bill Graham On May 9, 2013, 9:03 p.m., Travis Crawford wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11032/ --- (Updated May 9, 2013, 9:03 p.m.) Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng. Description --- Update pig to allow disabling optimizations via pig properties. Currently optimizations must be disabled via command-line options. Pig properties can be set in pig.properties, set commands in scripts themselves, and command-line -D options. The use-case is, for scripts that require certain optimizations to be disabled, allowing the script itself to disable the optimization. Currently whatever runs the script needs to specially handle disabling the optimization for that specific query. This addresses bug PIG-3317. https://issues.apache.org/jira/browse/PIG-3317 Diffs - src/docs/src/documentation/content/xdocs/perf.xml 108ae7e src/org/apache/pig/Main.java f97ed9f src/org/apache/pig/PigConstants.java ea77e97 src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java d26f381 Diff: https://reviews.apache.org/r/11032/diff/ Testing --- Manually tested on a fully-distributed cluster. THIS FAILS: PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig THIS WORKS: PIG_CONF_DIR=/etc/pig/conf ./bin/pig -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig Notice how -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune specifies a pig property, which could be in pig.properties, or the script itself. 
[jira] [Commented] (PIG-3317) disable optimizations via pig properties
[ https://issues.apache.org/jira/browse/PIG-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13653332#comment-13653332 ] Bill Graham commented on PIG-3317: -- Commented as such in the rb, but this patch looks good to me. disable optimizations via pig properties Key: PIG-3317 URL: https://issues.apache.org/jira/browse/PIG-3317 Project: Pig Issue Type: Improvement Affects Versions: 0.12 Reporter: Travis Crawford Assignee: Travis Crawford Attachments: PIG-3317_disable_opts.1.patch Pig provides a number of optimizations which are described at [http://pig.apache.org/docs/r0.11.1/perf.html#optimization-rules]. As is described in the docs, all or specific optimizations can be disabled via the command-line. Currently the caller of a pig script must know which optimizations to disable when running because that information cannot be set in the script itself. Nor can optimizations be disabled site-wide through pig.properties. Pig should allow disabling optimizations via properties so that pig scripts themselves can disable optimizations as needed, rather than the caller needing to know what optimizations to disable on the command-line. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3311) add pig-withouthadoop-h2 to mvn-jar
[ https://issues.apache.org/jira/browse/PIG-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13648931#comment-13648931 ] Bill Graham commented on PIG-3311: -- +1 add pig-withouthadoop-h2 to mvn-jar --- Key: PIG-3311 URL: https://issues.apache.org/jira/browse/PIG-3311 Project: Pig Issue Type: Improvement Components: build Reporter: Julien Le Dem Assignee: Julien Le Dem Attachments: PIG-3311.patch mvn-jar currently creates pig-version.jar and pig-version-h2.jar I'm adding pig-version-withouthadoop.jar and pig-version-withouthadoop-h2.jar that are needed to run pig from the command line. This will allow a dual-version package. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Welcome our newest committer Prashant Kommireddi
Congrats Prashant! On Thu, May 2, 2013 at 1:11 PM, Daniel Dai da...@hortonworks.com wrote: Congratulation! On Thu, May 2, 2013 at 1:06 PM, Cheolsoo Park piaozhe...@gmail.com wrote: Congrats Prashant! On Thu, May 2, 2013 at 12:56 PM, Julien Le Dem jul...@ledem.net wrote: All, Please join me in welcoming Prashant Kommireddi as our newest Pig committer. He's been contributing to Pig for a while now. We look forward to him being a part of the project. Julien -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
[jira] [Updated] (PIG-3303) add hadoop h2 artifact to publications in ivy.xml
[ https://issues.apache.org/jira/browse/PIG-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3303: - Assignee: Julien Le Dem add hadoop h2 artifact to publications in ivy.xml - Key: PIG-3303 URL: https://issues.apache.org/jira/browse/PIG-3303 Project: Pig Issue Type: Bug Reporter: Julien Le Dem Assignee: Julien Le Dem Attachments: PIG-3303.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (PIG-3306) Publish h2 artifact to maven
Bill Graham created PIG-3306: Summary: Publish h2 artifact to maven Key: PIG-3306 URL: https://issues.apache.org/jira/browse/PIG-3306 Project: Pig Issue Type: Bug Reporter: Bill Graham The Pig artifact built with hadoopversion=23 should be published to maven. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3303) add hadoop h2 artifact to publications in ivy.xml
[ https://issues.apache.org/jira/browse/PIG-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646720#comment-13646720 ] Bill Graham commented on PIG-3303: -- +1 Created PIG-3306 for publishing the h2 artifact to maven. add hadoop h2 artifact to publications in ivy.xml - Key: PIG-3303 URL: https://issues.apache.org/jira/browse/PIG-3303 Project: Pig Issue Type: Bug Reporter: Julien Le Dem Assignee: Julien Le Dem Attachments: PIG-3303.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-3306) Publish h2 artifact to maven
[ https://issues.apache.org/jira/browse/PIG-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham resolved PIG-3306. -- Resolution: Not A Problem Yup [~rohini] you're right we already do that. I should have known, I've published the last two releases. :) Publish h2 artifact to maven Key: PIG-3306 URL: https://issues.apache.org/jira/browse/PIG-3306 Project: Pig Issue Type: Bug Reporter: Bill Graham The Pig artifact built with hadoopversion=23 should be published to maven. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Pig 0.10.1 to Pig 0.11.1 API compatibility break
Hi Gerrit,

Sorry to hear these changes caused you problems. The PPNL interface is marked as Evolving, so it should be expected that future releases of that interface will change (i.e., break). I'm open to ways to better communicate these changes when they occur besides the current release notes process.

thanks, Bill

On Fri, Apr 19, 2013 at 12:11 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Hi Gerrit, we do try to keep backwards incompatible changes to a minimum, but sometimes they are needed to make progress. How about we make a practice of tagging notifications about new pig release candidates with [RC] so you can set up your filters and get a heads up to try your software with the latest release candidate? That will at least let you prepare for changes before a release is made, or perhaps argue that we should revert something that is backwards incompatible.

On Apr 18, 2013, at 2:23 AM, Gerrit Jansen van Vuuren gerrit...@gmail.com wrote: Hi, I'm the developer of http://gerritjvv.github.io/glue/ that uses the Pig API directly to launch pig jobs in separate JVM instances. Recently I've updated to use pig-0.11.1 and found two API compatibility breaks:

1. PigServer.parseExecType does not exist anymore (it was a static method up to pig-0.10.1).
2. A new method was added to PigProgressNotificationListener: public void initialPlanNotification(String scriptId, MROperPlan plan)

It would be nice if you guys (when possible) could look out for these kinds of breaks in the future. Thanks, Gerrit

-- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
[jira] [Updated] (PIG-3159) TestAvroStorage.testArrayWithSnappyCompression fails on mac with Java 7
[ https://issues.apache.org/jira/browse/PIG-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3159: - Description: Seems like snappy isn't being properly loaded when run on mac. This is the exception from the {{TestAvroStorage.testArrayWithSnappyCompression}} test. {noformat} 13/02/03 13:20:49 INFO mapReduceLayer.PigMapOnly$Map: Aliases being processed per job phase (AliasName[line,offset]): M: in[1,6] C: R: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:315) at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:218) at org.xerial.snappy.Snappy.clinit(Snappy.java:42) at org.apache.avro.file.SnappyCodec.compress(SnappyCodec.java:43) at org.apache.avro.file.DataFileStream$DataBlock.compressUsing(DataFileStream.java:349) at org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:347) at org.apache.avro.file.DataFileWriter.sync(DataFileWriter.java:359) at org.apache.avro.file.DataFileWriter.flush(DataFileWriter.java:366) at org.apache.avro.file.DataFileWriter.close(DataFileWriter.java:373) at org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.close(PigAvroRecordWriter.java:44) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.close(PigOutputFormat.java:149) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212) Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in 
java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860) at java.lang.Runtime.loadLibrary0(Runtime.java:845) at java.lang.System.loadLibrary(System.java:1084) at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52) ... 19 more 13/02/03 13:20:49 WARN mapred.LocalJobRunner: job_local_0001 org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:227) at org.xerial.snappy.Snappy.clinit(Snappy.java:42) at org.apache.avro.file.SnappyCodec.compress(SnappyCodec.java:43) at org.apache.avro.file.DataFileStream$DataBlock.compressUsing(DataFileStream.java:349) at org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:347) at org.apache.avro.file.DataFileWriter.sync(DataFileWriter.java:359) at org.apache.avro.file.DataFileWriter.flush(DataFileWriter.java:366) at org.apache.avro.file.DataFileWriter.close(DataFileWriter.java:373) at org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.close(PigAvroRecordWriter.java:44) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.close(PigOutputFormat.java:149) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212) {noformat} was: Seems like snappy isn't being properly loaded when run on mac. This is the exception from the {{testArrayWithSnappyCompression}} test. 
{noformat} 13/02/03 13:20:49 INFO mapReduceLayer.PigMapOnly$Map: Aliases being processed per job phase (AliasName[line,offset]): M: in[1,6] C: R: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:315) at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:218) at org.xerial.snappy.Snappy.clinit(Snappy.java:42) at org.apache.avro.file.SnappyCodec.compress(SnappyCodec.java:43) at org.apache.avro.file.DataFileStream$DataBlock.compressUsing(DataFileStream.java:349) at org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:347
[jira] [Created] (PIG-3273) bad %default directives can cause pig dry run to silently fail
Bill Graham created PIG-3273: Summary: bad %default directives can cause pig dry run to silently fail Key: PIG-3273 URL: https://issues.apache.org/jira/browse/PIG-3273 Project: Pig Issue Type: Bug Reporter: Bill Graham {{pig -r myscript.pig}} will silently fail without producing output or error messaging for the following script: {noformat} %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema') A = LOAD 'foo' using $STORAGE_WITH_SCHEMA; dump A; {noformat} Changing the first line to any of these will cause dry run to parse without problems: {noformat} %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t') %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema') %default STORAGE_WITH_SCHEMA 'org.apache.pig.builtin.PigStorage(\'\t\', \'-schema\')' {noformat} The issue seems to involve more than one set of single quotes that are not the outermost pair. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3273) bad %default directives can cause pig dry run to silently fail
[ https://issues.apache.org/jira/browse/PIG-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3273: - Description: {{pig -r myscript.pig}} will silently fail without producing output or error messaging for the following script: {noformat} %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema') A = LOAD 'foo' using $STORAGE_WITH_SCHEMA; dump A; {noformat} Changing the first line to any of these will cause dry run to parse without problems: {noformat} %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\\t') %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\\t', '-schema') %default STORAGE_WITH_SCHEMA 'org.apache.pig.builtin.PigStorage(\'\\t\', \'-schema\')' {noformat} The issue seems to involve more than one set of single quotes that are not the outermost pair. was: {{pig -r myscript.pig}} will silently fail without producing output or error messaging for the following script: {noformat} %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema') A = LOAD 'foo' using $STORAGE_WITH_SCHEMA; dump A; {noformat} Changing the first line to any of these will cause dry run to parse without problems: {noformat} %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t') %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema') %default STORAGE_WITH_SCHEMA 'org.apache.pig.builtin.PigStorage(\'\t\', \'-schema\')' {noformat} The issue seems to involve more than one set of single quotes that are not the outermost pair. 
bad %default directives can cause pig dry run to silently fail -- Key: PIG-3273 URL: https://issues.apache.org/jira/browse/PIG-3273 Project: Pig Issue Type: Bug Reporter: Bill Graham {{pig -r myscript.pig}} will silently fail without producing output or error messaging for the following script: {noformat} %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema') A = LOAD 'foo' using $STORAGE_WITH_SCHEMA; dump A; {noformat} Changing the first line to any of these will cause dry run to parse without problems: {noformat} %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\\t') %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\\t', '-schema') %default STORAGE_WITH_SCHEMA 'org.apache.pig.builtin.PigStorage(\'\\t\', \'-schema\')' {noformat} The issue seems to involve more than one set of single quotes that are not the outermost pair. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit, pigsmoke and piggybank
[ https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3264: - Fix Version/s: 0.12 mvn signanddeploy target broken for pigunit, pigsmoke and piggybank --- Key: PIG-3264 URL: https://issues.apache.org/jira/browse/PIG-3264 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12, 0.11.2 Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch Build fails with: {noformat} [artifact:deploy] Invalid reference: 'pigunit' {noformat} Patch on the way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[ANNOUNCE] Pig 0.11.1 has been released!
The Pig team is happy to announce the Pig 0.11.1 release. Apache Pig provides a high-level data-flow language and execution framework for parallel computation on Hadoop clusters. More details about Pig can be found at http://pig.apache.org/. This is a maintenance release of Pig 0.11 and contains several critical bug fixes. The details of the release can be found at http://pig.apache.org/releases.html.
Re: Apache Pig 0.11.1 release candidate
Hi Mark, Thanks for the work you're doing to support Pig in BigTop. Starting with Pig 0.12, our release process will be simplified to not include rpm/deb packages, thanks to BigTop. I've built Pig on multiple RHEL versions, so this issue might not be as widespread as you describe. The RPMs for 0.11.0 and 0.11.1 were both built on rhel5 instances from ec2 (ami-2d8e4c44). While I don't mind putting together another release, I think we should proceed to release 0.11.1rc0 for the following reasons: - since the vote passed and to respect the time people put in testing/validating this release - 0.11.1 contains support for Hadoop 0.20.2 and other critical bug fixes, which people are anxious for. In fairness to those stakeholders, these fixes were not put into a 0.11.0 RC when discovered late in that release process. - Pig 0.11.1 will contain an RPM as part of its release artifacts. That said, if the Pig community feels strongly that we should cancel the release and re-issue a new one, I'm fine with shepherding that process. As an alternative, is it possible for you to build by setting the default encoding externally? Or could you apply this patch to the pig 0.11.1 distro? thanks, Bill On Fri, Mar 29, 2013 at 5:41 PM, Mark Grover grover.markgro...@gmail.com wrote: Hi all, I am a contributor to Apache Bigtop http://bigtop.apache.org and have a question for you. Bigtop is a TLP responsible for performing packaging and interoperability testing of various projects in the Hadoop ecosystem, including Apache Pig. We are planning to include Pig 0.11 in our soon-to-be-released Bigtop 0.6 distribution. However, while upgrading Pig from 0.10 to 0.11, I wasn't able to compile Pig 0.11.1 on RPM-based systems http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Pig/313/label=centos6/console . There doesn't seem to be anything Bigtop-specific here; I would expect this issue to impact all Pig users. 
It seems like Pig's contrib sub-project uses the system's default encoding for compiling code; however on RPM based systems, the default encoding is not suitable and breaks the build. I created PIG-3262 https://issues.apache.org/jira/browse/PIG-3262 to track this and Cheolsoo graciously committed this to Pig trunk. The essence of Bigtop is exactly to find integration issues like this. Now, I do realize that Bill and the community has done some excellent work in putting together 0.11.1. Perhaps, I am a little too late to ask this question but I thought I'd ask it anyway. Is there a possibility that the Pig community can release a new release candidate for 0.11.1 with the fix in PIG-3262? The pros: 1. It would allow Pig users to compile Pig contrib on RPM machines (RHEL/CentOS 5, 6, SLES 11, Fedora, etc.) which doesn't seem to be possible as of now. 2. It would enable Apache Bigtop 0.6 to include a Pig version that builds on all OS variants. The cons: 1. There is a cost of cutting out another release candidate to the Pig community. I completely understand and appreciate the cost involved; however, I would anticipate the cost to be minimal since a) the change https://issues.apache.org/jira/secure/attachment/12575962/PIG-3262.2.patch is quite trivial; b) the change only affects the contrib functionality and not the core functionality, per se. If we do decide to release another release candidate, I would be more than happy to perform integration testing on it by means of Apache Bigtop. I do realize the unfortunate timing of this email, it would have been ideal if we were having this conversation a week ago while the vote was still going on. I will try to change that in future so please do accept my apologies in advance. Regards, Mark -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
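The build break described above comes down to the compiler using the JVM's platform-default charset when no explicit encoding is configured, so the same source tree compiles on one host and fails on another. The sketch below is only an illustration of that underlying charset behavior, not the actual PIG-3262 patch (which, per the thread, fixed the contrib build itself); it shows non-ASCII text round-tripping cleanly through UTF-8 but getting mangled under an ASCII default:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    public static void main(String[] args) {
        // Source files are just bytes; the compiler decodes them with an
        // explicit -encoding, or with the platform default when none is set.
        String text = "caf\u00e9"; // non-ASCII content in a source file

        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);

        // Decoding UTF-8 bytes under an ASCII default mangles the text
        // (the bytes above 127 become replacement characters) ...
        String asciiView = new String(utf8Bytes, StandardCharsets.US_ASCII);
        System.out.println(text.equals(asciiView)); // false

        // ... while a matching charset round-trips losslessly.
        String utf8View = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println(text.equals(utf8View)); // true

        // This is the value that varies between build hosts.
        System.out.println(Charset.defaultCharset().name());
    }
}
```

Pinning the encoding explicitly in the build (or externally via the JVM's file.encoding, as Bill suggests) removes the dependence on the host's default.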
Re: [VOTE] Release Pig 0.11.1 (candidate 0)
With 3 binding +1s (Daniel, Julien, BillG) this vote passes. I'll start the release process. On Wed, Mar 27, 2013 at 6:22 PM, Bill Graham billgra...@gmail.com wrote: +1 On Mon, Mar 25, 2013 at 3:42 PM, Daniel Dai da...@hortonworks.com wrote: Yes, it is Ok with me. Daniel On Mon, Mar 25, 2013 at 2:44 PM, Julien Le Dem jul...@twitter.com wrote: +1 The full test suite is passing. I don't think we need to make a new RC just for one missing license header. Daniel, is it OK for you? Thanks, Julien On Mon, Mar 25, 2013 at 11:02 AM, Daniel Dai da...@hortonworks.com wrote: My fault for the missing license header for UDFContextTestLoaderWithSignature. Added it to both files. Thanks, Prashant! I ran unit tests/e2e tests; both passed. +1 for the rc except for the license header issue. Daniel On Sun, Mar 24, 2013 at 11:18 PM, Prashant Kommireddi prash1...@gmail.com wrote: Downloaded tarball and performed the following: 1. ant releaseaudit - UDFContextTestLoaderWithSignature ( http://svn.apache.org/viewvc?view=revisionrevision=r1458036) and DOTParser.jjt do not have Apache License header. 2. Verified RELEASE_NOTES.txt for correct version numbers 3. Verified build.xml points to next version (0.11.2) SNAPSHOT 4. Built and tested Piggybank, Built tutorial - looks good. 5. Tested jar by running scripts against 0.20.2 hadoop cluster (would be great to have someone else test the same) 6. ant test-commit - all tests pass Except for #1, RC looks good to me. Thanks, -Prashant On Fri, Mar 22, 2013 at 7:58 AM, Bill Graham billgra...@gmail.com wrote: Hi, I have created a candidate build for Pig 0.11.1. This is a maintenance release of Pig 0.11. Keys used to sign the release are available at: http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup Please download, test, and try it out: http://people.apache.org/~billgraham/pig-0.11.1-candidate-0/ Should we release this? Vote closes on next Thursday EOD, Mar 28th. Thanks, Bill -- *Note that I'm no longer using my Yahoo! email address. 
Please email me at billgra...@gmail.com going forward.* -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
[jira] [Created] (PIG-3264) mvn signanddeploy target broken for pigunit and pigsmoke
Bill Graham created PIG-3264: Summary: mvn signanddeploy target broken for pigunit and pigsmoke Key: PIG-3264 URL: https://issues.apache.org/jira/browse/PIG-3264 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Build fails with: {noformat} [artifact:deploy] Invalid reference: 'pigunit' {noformat} Patch on the way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit and pigsmoke
[ https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3264: - Attachment: PIG_3264.1.patch PIG_3264_branch11.1.patch Attaching trunk and branch 11 patches. mvn signanddeploy target broken for pigunit and pigsmoke Key: PIG-3264 URL: https://issues.apache.org/jira/browse/PIG-3264 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch Build fails with: {noformat} [artifact:deploy] Invalid reference: 'pigunit' {noformat} Patch on the way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit and pigsmoke
[ https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3264: - Status: Patch Available (was: Open) mvn signanddeploy target broken for pigunit and pigsmoke Key: PIG-3264 URL: https://issues.apache.org/jira/browse/PIG-3264 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch Build fails with: {noformat} [artifact:deploy] Invalid reference: 'pigunit' {noformat} Patch on the way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit, pigsmoke and piggybank
[ https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3264: - Summary: mvn signanddeploy target broken for pigunit, pigsmoke and piggybank (was: mvn signanddeploy target broken for pigunit and pigsmoke) mvn signanddeploy target broken for pigunit, pigsmoke and piggybank --- Key: PIG-3264 URL: https://issues.apache.org/jira/browse/PIG-3264 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch Build fails with: {noformat} [artifact:deploy] Invalid reference: 'pigunit' {noformat} Patch on the way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit, pigsmoke and piggybank
[ https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3264: - Resolution: Fixed Status: Resolved (was: Patch Available) mvn signanddeploy target broken for pigunit, pigsmoke and piggybank --- Key: PIG-3264 URL: https://issues.apache.org/jira/browse/PIG-3264 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch Build fails with: {noformat} [artifact:deploy] Invalid reference: 'pigunit' {noformat} Patch on the way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [VOTE] Release Pig 0.11.1 (candidate 0)
+1 On Mon, Mar 25, 2013 at 3:42 PM, Daniel Dai da...@hortonworks.com wrote: Yes, it is Ok with me. Daniel On Mon, Mar 25, 2013 at 2:44 PM, Julien Le Dem jul...@twitter.com wrote: +1 The full test suite is passing. I don't think we need to make a new RC just for one missing license header. Daniel, is it OK for you? Thanks, Julien On Mon, Mar 25, 2013 at 11:02 AM, Daniel Dai da...@hortonworks.com wrote: My fault for the missing license header for UDFContextTestLoaderWithSignature. Added it to both files. Thanks, Prashant! I ran unit tests/e2e tests; both passed. +1 for the rc except for the license header issue. Daniel On Sun, Mar 24, 2013 at 11:18 PM, Prashant Kommireddi prash1...@gmail.com wrote: Downloaded tarball and performed the following: 1. ant releaseaudit - UDFContextTestLoaderWithSignature ( http://svn.apache.org/viewvc?view=revisionrevision=r1458036) and DOTParser.jjt do not have Apache License header. 2. Verified RELEASE_NOTES.txt for correct version numbers 3. Verified build.xml points to next version (0.11.2) SNAPSHOT 4. Built and tested Piggybank, Built tutorial - looks good. 5. Tested jar by running scripts against 0.20.2 hadoop cluster (would be great to have someone else test the same) 6. ant test-commit - all tests pass Except for #1, RC looks good to me. Thanks, -Prashant On Fri, Mar 22, 2013 at 7:58 AM, Bill Graham billgra...@gmail.com wrote: Hi, I have created a candidate build for Pig 0.11.1. This is a maintenance release of Pig 0.11. Keys used to sign the release are available at: http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup Please download, test, and try it out: http://people.apache.org/~billgraham/pig-0.11.1-candidate-0/ Should we release this? Vote closes on next Thursday EOD, Mar 28th. Thanks, Bill -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
[VOTE] Release Pig 0.11.1 (candidate 0)
Hi, I have created a candidate build for Pig 0.11.1. This is a maintenance release of Pig 0.11. Keys used to sign the release are available at: http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup Please download, test, and try it out: http://people.apache.org/~billgraham/pig-0.11.1-candidate-0/ Should we release this? Vote closes on next Thursday EOD, Mar 28th. Thanks, Bill
Re: Are we ready for 0.11.1 release?
Sure, I can get an RC out this week. On Mon, Mar 18, 2013 at 10:51 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Yeah adding new types seems like a big thing, would prefer for it to be 0.12 only. Sounds like we are ready to roll 0.11.1. Bill, want to do the honors again? On Mon, Mar 18, 2013 at 10:40 AM, Julien Le Dem jul...@ledem.net wrote: Agreed with Daniel, PIG-2764 will go in Pig 0.12 Julien On Mar 18, 2013, at 10:32 AM, Daniel Dai wrote: Dmitriy: Just committed PIG-3132. Richard: PIG-2764 is a new feature, and we usually don't include new features in a minor release. Daniel On Mon, Mar 18, 2013 at 10:21 AM, Richard Ding pigu...@gmail.com wrote: How about PIG-2764? It would be nice to include this feature. On Mon, Mar 18, 2013 at 1:04 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote: Just +1'd it. I think after this one we are good to go? On Sun, Mar 17, 2013 at 9:09 PM, Daniel Dai da...@hortonworks.com wrote: Can I include PIG-3132? Thanks, Daniel On Fri, Mar 15, 2013 at 5:57 PM, Julien Le Dem jul...@ledem.net wrote: +1 for a new release Julien On Mar 15, 2013, at 17:08, Dmitriy Ryaboy dvrya...@gmail.com wrote: I think all the critical patches we discussed as required for 0.11.1 have gone in -- is there anything else people want to finish up, or can we roll this? Current change log: Release 0.11.1 (unreleased) INCOMPATIBLE CHANGES IMPROVEMENTS PIG-2988: start deploying pigunit maven artifact part of Pig release process (njw45 via rohini) PIG-3148: OutOfMemory exception while spilling stale DefaultDataBag. Extra option to gc() before spilling large bag. 
(knoguchi via rohini) PIG-3216: Groovy UDFs documentation has minor typos (herberts via rohini) PIG-3202: CUBE operator not documented in user docs (prasanth_j via billgraham) OPTIMIZATIONS BUG FIXES PIG-3194: Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 (prkommireddi via dvryaboy) PIG-3241: ConcurrentModificationException in POPartialAgg (dvryaboy) PIG-3144: Erroneous map entry alias resolution leading to Duplicate schema alias errors (jcoveney via cheolsoo) PIG-3212: Race Conditions in POSort and (Internal)SortedBag during Proactive Spill (kadeng via dvryaboy) PIG-3206: HBaseStorage does not work with Oozie pig action and secure HBase (rohini) -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
[jira] [Updated] (PIG-3241) ConcurrentModificationException in POPartialAgg
[ https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3241: - Fix Version/s: 0.12 ConcurrentModificationException in POPartialAgg --- Key: PIG-3241 URL: https://issues.apache.org/jira/browse/PIG-3241 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Lohit Vijayarenu Fix For: 0.12, 0.11.1 While running a few Pig scripts against Hadoop 2.0, I consistently see a ConcurrentModificationException {noformat} at java.util.HashMap$HashIterator.remove(HashMap.java:811) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregate(POPartialAgg.java:365) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregateSecondLevel(POPartialAgg.java:379) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.getNext(POPartialAgg.java:203) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:729) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153) {noformat} It looks like rawInputMap is being modified while elements are being removed from it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
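The trace above, failing inside HashMap's iterator, is the classic fail-fast symptom of structurally modifying a HashMap while it is being iterated. The standalone sketch below is illustrative only (it is not the POPartialAgg code); it reproduces the exception and shows the iterator-based removal that avoids it:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1); map.put("b", 2); map.put("c", 3); map.put("d", 4);

        // Removing through the map while a for-each iterates it is a
        // structural modification behind the iterator's back; the next
        // iteration step sees a changed modCount and fails fast.
        boolean failedFast = false;
        try {
            for (String key : map.keySet()) {
                map.remove(key); // wrong: bypasses the live iterator
            }
        } catch (ConcurrentModificationException e) {
            failedFast = true;
        }
        System.out.println(failedFast); // true

        // The usual fix: remove through the iterator itself.
        map.clear();
        map.put("a", 1); map.put("b", 2);
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove(); // safe: the iterator updates its own bookkeeping
        }
        System.out.println(map.isEmpty()); // true
    }
}
```

The same hazard applies whether the modification happens in the same method or, as the JIRA description suggests for rawInputMap, in another code path that runs while the iteration is in flight.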
[jira] [Updated] (PIG-3241) ConcurrentModificationException in POPartialAgg
[ https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3241: - Fix Version/s: 0.11.1 ConcurrentModificationException in POPartialAgg --- Key: PIG-3241 URL: https://issues.apache.org/jira/browse/PIG-3241 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Lohit Vijayarenu Fix For: 0.11.1 While running a few Pig scripts against Hadoop 2.0, I consistently see a ConcurrentModificationException {noformat} at java.util.HashMap$HashIterator.remove(HashMap.java:811) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregate(POPartialAgg.java:365) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregateSecondLevel(POPartialAgg.java:379) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.getNext(POPartialAgg.java:203) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308) at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:729) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153) {noformat} It looks like rawInputMap is being modified while elements are being removed from it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13593587#comment-13593587 ] Bill Graham commented on PIG-3214: -- I like where #4 is going. I think the fonts look current and it's simple. The shape of the dots in the P could use some tweaking though to be a bit more rounded, but I think it's close. I'm not a big fan of #2 because I think it looks like it says Pij and the details of the Pig head won't scale well when shown as a thumbnail. I also tend towards the more modern fonts than what's used for the Pig part. Just my $.02 New/improved mascot --- Key: PIG-3214 URL: https://issues.apache.org/jira/browse/PIG-3214 Project: Pig Issue Type: Wish Components: site Affects Versions: 0.11 Reporter: Andrew Musselman Priority: Minor Fix For: 0.12 Attachments: newlogo1.png, newlogo2.png, newlogo3.png, newlogo4.png, newlogo5.png Request to change pig mascot to something more graphically appealing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3233) Deploy a Piggybank Jar
[ https://issues.apache.org/jira/browse/PIG-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13594245#comment-13594245 ] Bill Graham commented on PIG-3233: -- Using the imports works. It looks like you used the versions from {{pigunit-template.xml}} but we should instead use those in {{ivy/libraries.properties}} since that's what piggybank is built against. Could you please update the versions to be in sync there? We can tackle keeping them synced with ivy in another JIRA, but one thought is variable substitution like we do with @version. Deploy a Piggybank Jar -- Key: PIG-3233 URL: https://issues.apache.org/jira/browse/PIG-3233 Project: Pig Issue Type: New Feature Components: piggybank Affects Versions: 0.10.0, 0.11 Reporter: Nick White Assignee: Nick White Fix For: 0.10.1, 0.11.1 Attachments: PIG-3233.0.patch The attached patch adds the piggybank contrib jar to the mvn-install and mvn-deploy ant targets in the same way as the pigunit pigsmoke artifacts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3233) Deploy a Piggybank Jar
[ https://issues.apache.org/jira/browse/PIG-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592712#comment-13592712 ] Bill Graham commented on PIG-3233: -- Thanks for tackling this one, Nick! The mechanics of the patch look good, but from where did you get the deps that you included in {{ivy/piggybank-template.xml}}? Deploy a Piggybank Jar -- Key: PIG-3233 URL: https://issues.apache.org/jira/browse/PIG-3233 Project: Pig Issue Type: New Feature Components: piggybank Affects Versions: 0.10.0, 0.11 Reporter: Nick White Assignee: Nick White Fix For: 0.10.1, 0.11.1 Attachments: PIG-3233.0.patch The attached patch adds the piggybank contrib jar to the mvn-install and mvn-deploy ant targets in the same way as the pigunit pigsmoke artifacts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: pig 0.11 candidate 2 feedback: Several problems
+1 to releasing Pig 0.11.1 when this is addressed. I should be able to help with the release again. On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi prash1...@gmail.com wrote: Hey Guys, I wanted to start a conversation on this again. If Kai is not looking at PIG-3194 I can start working on it to get 0.11 compatible with 20.2. If everyone agrees, we should roll out 0.11.1 sooner than usual and I volunteer to help with it in any way possible. Any objections to getting 0.11.1 out soon after 3194 is fixed? -Prashant On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney russell.jur...@gmail.com wrote: I stand corrected. Cool, 0.11 is good! On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Just an unrelated note: CDH3 is closer to Hadoop 1.x than to 0.20. Jarcec On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote: I agree -- this is a good release. The bugs Kai pointed out should be fixed, but as they are not critical regressions, we can fix them in 0.11.1 (if someone wants to roll 0.11.1 the minute these fixes are committed, I won't mind and will dutifully vote for the release). I think the Hadoop 20.2 incompatibility is unfortunate but iirc this is fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in 20.2?) FWIW Twitter's running CDH3 and this release works in our environment. At this point things that block a release are critical regressions in performance or correctness. D On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates ga...@hortonworks.com wrote: No. Bugs like these are supposed to be found and fixed after we branch from trunk (which happened several months ago in the case of 0.11). The point of RCs is to check that it's a good build, licenses are right, etc. Any bugs found this late in the game have to be seen as failures of earlier testing. Alan. 
On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote: Isn't the point of an RC to find and fix bugs like these? On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham billgra...@gmail.com wrote: Regarding Pig 11 rc2, I propose we continue with the current vote as is (which closes today EOD). Patches for 0.20.2 issues can be rolled into a Pig 0.11.1 release whenever they're available and tested. On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich onatkov...@yahoo.com wrote: I agree that supporting as much as we can is a good goal. The issue is who is going to be testing against all these versions? We found the issues under discussion because of a customer report, not because we consistently test against all versions. Perhaps when we decide which versions to support for the next release we also need to agree who is going to be testing and maintaining compatibility with a particular version. For instance since Hadoop 23 compatibility is important for us at Yahoo we have been maintaining compatibility with this version for 0.9, 0.10 and will do the same for 0.11 and going forward. I think we would need others to step in and claim the versions of their interest. Olga From: Kai Londenberg kai.londenb...@googlemail.com To: dev@pig.apache.org Sent: Wednesday, February 20, 2013 1:51 AM Subject: Re: pig 0.11 candidate 2 feedback: Several problems Hi, I strongly agree with Jonathan here. If there are good reasons why you can't support an older version of Hadoop any more, that's one thing. But having to change 2 lines of code doesn't really qualify as such in my point of view ;) At least for me, pig support for 0.20.2 is essential - without it, I can't use it. If it doesn't support it, I'll have to branch pig and hack it myself, or stop using it. I guess, there are a lot of people still running 0.20.2 clusters. If you really have lots of data stored on HDFS and a continuously busy cluster, an upgrade is nothing you do just because. 
2013/2/20 Jonathan Coveney jcove...@gmail.com: I agree that we shouldn't have to support old versions forever. That said, I also don't think we should be too blasé about supporting older versions where it is not odious to do so. We have a lot of competition in the language space and the broader the versions we can support, the better (assuming it isn't too odious to do so). In this case, I don't think it should be too hard to change ObjectSerializer so that the commons-codec code used is compatible with both
[jira] [Resolved] (PIG-3002) Pig client should handle CountersExceededException
[ https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham resolved PIG-3002. -- Resolution: Fixed Fix Version/s: 0.12 Committed. Thanks [~jarcec] for digging into this one and sorry for the delay. Pig client should handle CountersExceededException -- Key: PIG-3002 URL: https://issues.apache.org/jira/browse/PIG-3002 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Jarek Jarcec Cecho Labels: newbie, simple Fix For: 0.12 Attachments: PIG-3002.2.patch, PIG-3002.patch Running a pig job that uses more than 120 counters will succeed, but a grunt exception will occur when trying to output counter info to the console. This exception should be caught and handled with friendly messaging: {noformat} org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution. at org.apache.pig.PigServer.launchPlan(PigServer.java:1275) at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249) at org.apache.pig.PigServer.execute(PigServer.java:1239) at org.apache.pig.PigServer.executeBatch(PigServer.java:333) at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) at org.apache.pig.Main.run(Main.java:604) at org.apache.pig.Main.main(Main.java:154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: Error: Exceeded limits on number of counters - Counters=120 Limit=120 at 
org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312) at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431) at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442) at org.apache.pig.PigServer.launchPlan(PigServer.java:1264) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3002) Pig client should handle CountersExceededException
[ https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589284#comment-13589284 ] Bill Graham commented on PIG-3002: -- I don't think we should be modifying the shims code in this way for the contrived case. Swallowing exceptions and returning 0 doesn't seem like the right thing to do for the reasons I've described above. If Hadoop is throwing exceptions because we've used too many counters, let's catch it and log it and move on. Surfacing the exception to a user in the console is better than trying to print some of them. Any counters captured by Hadoop will still be reported in the JT and the job history. Pig client should handle CountersExceededException -- Key: PIG-3002 URL: https://issues.apache.org/jira/browse/PIG-3002 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Jarek Jarcec Cecho Labels: newbie, simple Attachments: PIG-3002.2.patch, PIG-3002.patch Running a pig job that uses more than 120 counters will succeed, but a grunt exception will occur when trying to output counter info to the console. This exception should be caught and handled with friendly messaging: {noformat} org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution. 
at org.apache.pig.PigServer.launchPlan(PigServer.java:1275) at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249) at org.apache.pig.PigServer.execute(PigServer.java:1239) at org.apache.pig.PigServer.executeBatch(PigServer.java:333) at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) at org.apache.pig.Main.run(Main.java:604) at org.apache.pig.Main.main(Main.java:154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: Error: Exceeded limits on number of counters - Counters=120 Limit=120 at org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312) at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431) at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442) at org.apache.pig.PigServer.launchPlan(PigServer.java:1264) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587239#comment-13587239 ] Bill Graham commented on PIG-1832: -- Yes, read via time ranges is done. Work on PIG-2114 seems stalled though and there's a lot going on in that patch. I propose this JIRA just add write support for -timestamp=millis_since_the_epoch_utc for consistency with the current read API. That's a quick change that would be useful and would give full read/write support for timestamps. That would also help reduce the somewhat broad scope of PIG-2114. Support timestamp in HBaseStorage - Key: PIG-1832 URL: https://issues.apache.org/jira/browse/PIG-1832 Project: Pig Issue Type: Improvement Environment: Java 6, Mac OS X 10.6 Reporter: Eric Yang When storing data into HBase using org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is stored with insertion time of the mapreduce job. It would be nice to have a way to populate timestamp from user data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
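The write-side option proposed above could look like the following Pig Latin sketch. This is a hypothetical usage example based on the proposal in the comment (the -timestamp option and its millisecond semantics are proposed here, not a released API), and the table, column family, and timestamp value are made up:

```pig
-- Hypothetical sketch of the proposed -timestamp write option for HBaseStorage.
-- Table name, columns, and the timestamp value are illustrative only.
events = LOAD 'events.tsv' AS (rowkey:chararray, name:chararray, age:chararray);
STORE events INTO 'hbase://events'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
          'info:name info:age', '-timestamp=1361000000000');
```

Mirroring the read-side option keeps the API symmetric: a single millis-since-epoch value applied to every cell written by the job, without waiting on the broader per-cell work in PIG-2114.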
[jira] [Updated] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-1832: - Summary: Support timestamp in HBaseStorage when storing (was: Support timestamp in HBaseStorage) Support timestamp in HBaseStorage when storing -- Key: PIG-1832 URL: https://issues.apache.org/jira/browse/PIG-1832 Project: Pig Issue Type: Improvement Environment: Java 6, Mac OS X 10.6 Reporter: Eric Yang When storing data into HBase using org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is stored with insertion time of the mapreduce job. It would be nice to have a way to populate timestamp from user data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-1832: - Environment: (was: Java 6, Mac OS X 10.6) Support timestamp in HBaseStorage when storing -- Key: PIG-1832 URL: https://issues.apache.org/jira/browse/PIG-1832 Project: Pig Issue Type: Improvement Reporter: Eric Yang When storing data into HBase using org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is stored with insertion time of the mapreduce job. It would be nice to have a way to populate timestamp from user data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage when storing
[ https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587328#comment-13587328 ] Bill Graham commented on PIG-1832: -- I don't think there is a ticket to support returning multiple cell versions with timestamps, but we did discuss ideas for an approach here: https://issues.apache.org/jira/browse/PIG-1782?focusedCommentId=12988192page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12988192 Basically the idea is to create a new class to support this, since it would be fundamentally very different than what we currently support with {{HBaseStorage}}. That work might be better handled after we tackle PIG-3067 (HBaseStorage should be split up to become more manageable). Support timestamp in HBaseStorage when storing -- Key: PIG-1832 URL: https://issues.apache.org/jira/browse/PIG-1832 Project: Pig Issue Type: Improvement Reporter: Eric Yang When storing data into HBase using org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is stored with insertion time of the mapreduce job. It would be nice to have a way to populate timestamp from user data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3067) HBaseStorage should be split up to become more manageable
[ https://issues.apache.org/jira/browse/PIG-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3067: - Summary: HBaseStorage should be split up to become more manageable (was: HBaseStorage should be split up to become more managable) HBaseStorage should be split up to become more manageable - Key: PIG-3067 URL: https://issues.apache.org/jira/browse/PIG-3067 Project: Pig Issue Type: Improvement Reporter: Christoph Bauer Assignee: Christoph Bauer Attachments: hbasestorage-split.patch HBaseStorage has become quite big (1100 lines). I propose to split it up into more manageable parts. I believe it will become a lot easier to maintain. I split it up like this: HBaseStorage * settings:LoadStoreFuncSettings ** options ** caster ** udfProperties ** contextSignature ** columns:ColumnInfo - moved to its own class-file * loadFuncDelegate:HBaseLoadFunc - LoadFunc implementation ** settings:LoadStoreFuncSettings (s.a.) ** scanner:HBaseLoadFuncScanner - everything scan-specific ** tupleIterator:HBaseTupleIterator - interface for _public Tuple getNext()_ * storeFuncDelegate:HBaseStorFunc - StorFunc implementation ** settings:LoadStoreFuncSettings (s.a.) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-3220) No docs for CUBE in Pig 0.11 :(
[ https://issues.apache.org/jira/browse/PIG-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham resolved PIG-3220. -- Resolution: Duplicate Duplicate of PIG-3202. Until the next release, docs can be found here: https://issues.apache.org/jira/browse/PIG-2765?focusedCommentId=13427021page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13427021 No docs for CUBE in Pig 0.11 :( --- Key: PIG-3220 URL: https://issues.apache.org/jira/browse/PIG-3220 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Russell Jurney Priority: Blocker Fix For: 0.11.1 There are no docs for CUBE in this release. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3202) CUBE operator not documented in user docs
[ https://issues.apache.org/jira/browse/PIG-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3202: - Attachment: PIG-3202.2.patch Looks great, thanks Prasanth! I just made a few minor formatting tweaks and committed. CUBE operator not documented in user docs - Key: PIG-3202 URL: https://issues.apache.org/jira/browse/PIG-3202 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Bill Graham Assignee: Prasanth J Fix For: 0.12 Attachments: PIG-3202.1.git.patch, PIG-3202.2.patch This is not documented in the user docs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3202) CUBE operator not documented in user docs
[ https://issues.apache.org/jira/browse/PIG-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3202: - Fix Version/s: 0.11.1 CUBE operator not documented in user docs - Key: PIG-3202 URL: https://issues.apache.org/jira/browse/PIG-3202 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Bill Graham Assignee: Prasanth J Fix For: 0.12, 0.11.1 Attachments: PIG-3202.1.git.patch, PIG-3202.2.patch This is not documented in the user docs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-3202) CUBE operator not documented in user docs
[ https://issues.apache.org/jira/browse/PIG-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham resolved PIG-3202. -- Resolution: Fixed Committed to trunk and pig 11 branch. CUBE operator not documented in user docs - Key: PIG-3202 URL: https://issues.apache.org/jira/browse/PIG-3202 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Bill Graham Assignee: Prasanth J Fix For: 0.12, 0.11.1 Attachments: PIG-3202.1.git.patch, PIG-3202.2.patch This is not documented in the user docs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3214: - Fix Version/s: (was: 0.11.1) 0.12 New/improved mascot --- Key: PIG-3214 URL: https://issues.apache.org/jira/browse/PIG-3214 Project: Pig Issue Type: Wish Components: site Affects Versions: 0.11 Reporter: Andrew Musselman Priority: Minor Fix For: 0.12 Request to change pig mascot to something more graphically appealing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585623#comment-13585623 ] Bill Graham commented on PIG-3214: -- +1 to a new mascot. New/improved mascot --- Key: PIG-3214 URL: https://issues.apache.org/jira/browse/PIG-3214 Project: Pig Issue Type: Wish Components: site Affects Versions: 0.11 Reporter: Andrew Musselman Priority: Minor Fix For: 0.12 Request to change pig mascot to something more graphically appealing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3002) Pig client should handle CountersExceededException
[ https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3002: - Attachment: PIG-3002.2.patch I was able to run some tests using both the patched and unpatched version and both behave the same w.r.t. what's output to the console. The behavior of {{MapReduceLauncher.computeWarningAggregate()}} is already to catch {{IOException}} and log a warning, so it would be acceptable to catch {{Exception}} around the iterator on the {{PigWarning}} enum and just {{log.warn}} there as well. No need to modify the shim code. Just log the error and move on. This will cause the console output to show the success state of the jobs, along with the exception with the counters. Pig client should handle CountersExceededException -- Key: PIG-3002 URL: https://issues.apache.org/jira/browse/PIG-3002 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Jarek Jarcec Cecho Labels: newbie, simple Attachments: PIG-3002.2.patch, PIG-3002.patch Running a pig job that uses more than 120 counters will succeed, but a grunt exception will occur when trying to output counter info to the console. This exception should be caught and handled with friendly messaging: {noformat} org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution. 
at org.apache.pig.PigServer.launchPlan(PigServer.java:1275) at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249) at org.apache.pig.PigServer.execute(PigServer.java:1239) at org.apache.pig.PigServer.executeBatch(PigServer.java:333) at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) at org.apache.pig.Main.run(Main.java:604) at org.apache.pig.Main.main(Main.java:154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: Error: Exceeded limits on number of counters - Counters=120 Limit=120 at org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312) at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431) at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442) at org.apache.pig.PigServer.launchPlan(PigServer.java:1264) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
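The fix described in the comment above — catch the counter-limit exception around the warning-counter loop, log it, and keep going so the console still reports job success — can be sketched roughly as follows. This is a stand-alone illustration, not the actual Pig patch: the nested CountersExceededException class and the lookupCounter() helper are stand-ins for Hadoop's Counters API, which is not reproduced here.

```java
import java.util.logging.Logger;

public class CounterWarningSketch {

    // Hypothetical stand-in for Hadoop's Counters$CountersExceededException,
    // which in the real API is a RuntimeException thrown on counter lookup.
    static class CountersExceededException extends RuntimeException {
        CountersExceededException(String msg) { super(msg); }
    }

    private static final Logger LOG = Logger.getLogger("MapReduceLauncher");

    // Sketch of the suggested approach: iterate the warning counters,
    // catch the limit exception per counter, log a warning, and move on,
    // so the console output still shows the success state of the jobs.
    static long computeWarningAggregate(String[] warningNames) {
        long total = 0;
        for (String name : warningNames) {
            try {
                total += lookupCounter(name);
            } catch (CountersExceededException e) {
                LOG.warning("Unable to read counter " + name + ": " + e.getMessage());
                // Counters Hadoop did record remain visible in the JT / job history.
            }
        }
        return total;
    }

    // Simulated counter lookup: throws for names past the (default 120) limit.
    static long lookupCounter(String name) {
        if (name.startsWith("over_limit")) {
            throw new CountersExceededException(
                "Error: Exceeded limits on number of counters - Counters=120 Limit=120");
        }
        return 1;
    }

    public static void main(String[] args) {
        long total = computeWarningAggregate(
            new String[] {"NULL_FIELD_DISCARDED", "over_limit_1", "DIVIDE_BY_ZERO"});
        System.out.println(total); // prints 2: the readable counters still aggregate
    }
}
```

The design point is that a counter that cannot be read should degrade to a logged warning, never fail the whole job summary.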
[jira] [Created] (PIG-3219) Script to run Pig ant targets on AWS
Bill Graham created PIG-3219: Summary: Script to run Pig ant targets on AWS Key: PIG-3219 URL: https://issues.apache.org/jira/browse/PIG-3219 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham During the Pig 11 release I wrote a script to install software required to build Pig on ec2 instances before running the build. This script could be helpful for future releases or for running unit tests remotely. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3219) Script to run Pig ant targets on AWS
[ https://issues.apache.org/jira/browse/PIG-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3219: - Attachment: pig-ec2-release-build.sh Not sure where the best place to put this would be. Suggestions welcome. Script to run Pig ant targets on AWS Key: PIG-3219 URL: https://issues.apache.org/jira/browse/PIG-3219 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Attachments: pig-ec2-release-build.sh During the Pig 11 release I wrote a script to install software required to build Pig on ec2 instances before running the build. This script could be helpful for future releases or for running unit tests remotely. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: How do we post to the apache pig blog?
Looks like an infra jira is needed to grant a PMC member admin rights: Creating new Project Blog users: The blogs.apache.org backend is not currently connected to Apache LDAP services, so blog users need to be created by our infrastructure team. To get a username, create an INFRA issue (https://issues.apache.org/jira/browse/INFRA) using *Blogs* as its component name. Indicate your Apache user ID and to which blog you're requesting access. Granting Project blog rights to other committers: PMC members with blog admin rights can grant access rights to blog users via https://blogs.apache.org/admin On Fri, Feb 22, 2013 at 12:37 AM, Aniket Mokashi aniket...@gmail.com wrote: Just a guess- http://www.apache.org/dev/project-blogs On Fri, Feb 22, 2013 at 12:07 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote: I prepared a detailed post going over the pig 0.11 release, and realized I don't know how to post to the apache pig blog. Does anyone have a pointer? -- ...:::Aniket:::... Quetzalco@tl -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
[jira] [Commented] (PIG-3174) Remove rpm and deb artifacts from build.xml
[ https://issues.apache.org/jira/browse/PIG-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13583301#comment-13583301 ] Bill Graham commented on PIG-3174: -- +1 to patch and doc approach. A link from the releases page makes sense. Remove rpm and deb artifacts from build.xml --- Key: PIG-3174 URL: https://issues.apache.org/jira/browse/PIG-3174 Project: Pig Issue Type: Task Components: build Affects Versions: 0.12 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12 Attachments: PIG-3174.2.patch, PIG-3174.patch I propose that we remove the targets to build rpms and debs from build.xml and consequently quit publishing them as part of our releases. Bigtop publishes these packages now. And building them takes infrastructure that not every committer/PMC member has. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (PIG-3202) CUBE operator not documented in user docs
Bill Graham created PIG-3202: Summary: CUBE operator not documented in user docs Key: PIG-3202 URL: https://issues.apache.org/jira/browse/PIG-3202 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Bill Graham Fix For: 0.12 This is not documented in the user docs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (PIG-3203) ROLLUP not documented in user docs
Bill Graham created PIG-3203: Summary: ROLLUP not documented in user docs Key: PIG-3203 URL: https://issues.apache.org/jira/browse/PIG-3203 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Bill Graham Fix For: 0.12 This is not documented in the user docs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3202) CUBE operator not documented in user docs
[ https://issues.apache.org/jira/browse/PIG-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13583490#comment-13583490 ] Bill Graham commented on PIG-3202: -- Sure, that would be great. See https://cwiki.apache.org/PIG/howtodocument.html for a description of how to modify user docs. CUBE operator not documented in user docs - Key: PIG-3202 URL: https://issues.apache.org/jira/browse/PIG-3202 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Bill Graham Fix For: 0.12 This is not documented in the user docs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3144) Erroneous map entry alias resolution leading to Duplicate schema alias errors
[ https://issues.apache.org/jira/browse/PIG-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3144: - Fix Version/s: (was: 0.11) Erroneous map entry alias resolution leading to Duplicate schema alias errors --- Key: PIG-3144 URL: https://issues.apache.org/jira/browse/PIG-3144 Project: Pig Issue Type: Bug Affects Versions: 0.11, 0.10.1 Reporter: Kai Londenberg Assignee: Jonathan Coveney Fix For: 0.12 Attachments: PIG-3144-0.patch The following code illustrates a problem concerning alias resolution in pig The schema of D2 will incorrectly be described as containing two age fields. And the last step in the following script will lead to a Duplicate schema alias error message. I only encountered this bug when using aliases for map fields. {code} DATA = LOAD 'file:///whatever' as (a:map[chararray], b:chararray); D1 = FOREACH DATA GENERATE a#'name' as name, a#'age' as age, b; D2 = FOREACH D1 GENERATE name, age, b; DESCRIBE D2; {code} Output: {code} D2: { age: chararray, age: chararray, b: chararray } {code} {code} D3 = FOREACH D2 GENERATE *; DESCRIBE D3; {code} Output: {code} file file:///.../pig-bug-example.pig, line 20, column 16 Duplicate schema alias: age {code} This error occurs in this form in Apache Pig version 0.11.0-SNAPSHOT (r6408). A less severe variant of this bug is also present in pig 0.10.1. In 0.10.1, the Duplicate schema alias error message won't occur, but the schema of D2 (see above) will still have wrong duplicate alias entries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3177) Fix Pig project SEO so latest, 0.11 docs show when you google things
[ https://issues.apache.org/jira/browse/PIG-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3177: - Fix Version/s: (was: 0.11) 0.12 Fix Pig project SEO so latest, 0.11 docs show when you google things Key: PIG-3177 URL: https://issues.apache.org/jira/browse/PIG-3177 Project: Pig Issue Type: Bug Components: site Affects Versions: 0.11 Reporter: Russell Jurney Assignee: Russell Jurney Priority: Critical Fix For: 0.12 http://pig.apache.org/docs/r0.7.0/api/org/apache/pig/piggybank/storage/SequenceFileLoader.html The 0.7.0 docs are what everyone references. FOR POOPS SAKES. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
[ https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3194: - Summary: Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 (was: Pig 0.11 candidate 2: Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 --- Key: PIG-3194 URL: https://issues.apache.org/jira/browse/PIG-3194 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Kai Londenberg The changes to ObjectSerializer.java in the following commit http://svn.apache.org/viewvc?view=revisionrevision=1403934 break compatibility with Hadoop 0.20.2 Clusters. The reason is, that the code uses methods from Apache Commons Codec 1.4 - which are not available in Apache Commons Codec 1.3 which is shipping with Hadoop 0.20.2. The offending methods are Base64.decodeBase64(String) and Base64.encodeBase64URLSafeString(byte[]) If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 0.20.2 Clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3194) Pig 0.11 candidate 2: Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
[ https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3194: - Affects Version/s: 0.11 Pig 0.11 candidate 2: Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 - Key: PIG-3194 URL: https://issues.apache.org/jira/browse/PIG-3194 Project: Pig Issue Type: Bug Affects Versions: 0.11 Reporter: Kai Londenberg The changes to ObjectSerializer.java in the following commit http://svn.apache.org/viewvc?view=revisionrevision=1403934 break compatibility with Hadoop 0.20.2 Clusters. The reason is, that the code uses methods from Apache Commons Codec 1.4 - which are not available in Apache Commons Codec 1.3 which is shipping with Hadoop 0.20.2. The offending methods are Base64.decodeBase64(String) and Base64.encodeBase64URLSafeString(byte[]) If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 0.20.2 Clusters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
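The incompatibility above comes down to two convenience methods that exist only in commons-codec 1.4. Since codec 1.3 already has Base64.encodeBase64(byte[]) and Base64.decodeBase64(byte[]), a version-agnostic ObjectSerializer can reproduce the 1.4 wire format (URL-safe alphabet, no padding) on top of the 1.3 primitives. The sketch below illustrates that transformation using java.util.Base64 (JDK 8+) as a stand-in, since commons-codec itself is not assumed on the classpath here:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Codec13CompatSketch {

    // Codec 1.4's encodeBase64URLSafeString() emits URL-safe, unpadded base64.
    // Codec 1.3 only offers the standard alphabet with padding, so a
    // 1.3-compatible serializer can encode with the basic codec and then
    // rewrite the output to the same wire format.
    static String encodeUrlSafe(byte[] data) {
        String standard = Base64.getEncoder().encodeToString(data);
        return standard.replace('+', '-').replace('/', '_').replace("=", "");
    }

    // Map back to the standard alphabet and restore padding before decoding,
    // so a codec-1.3-style (or any strict) decoder accepts it.
    static byte[] decodeUrlSafe(String encoded) {
        String standard = encoded.replace('-', '+').replace('_', '/');
        return Base64.getDecoder().decode(pad(standard));
    }

    private static String pad(String s) {
        int rem = s.length() % 4;
        return rem == 0 ? s : s + "====".substring(rem);
    }

    public static void main(String[] args) {
        byte[] input = "pig on 0.20.2".getBytes(StandardCharsets.UTF_8);
        String enc = encodeUrlSafe(input);
        System.out.println(enc);
        // Round-trips back to the original string.
        System.out.println(new String(decodeUrlSafe(enc), StandardCharsets.UTF_8));
    }
}
```

With this shape, the serialized form stays byte-for-byte what the 1.4 methods produce, while the code only calls APIs present in both codec versions.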
[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets
[ https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3189: - Attachment: PIG-3189.4.patch Patch #3 had my *.iws changes in it. Uploading cleaned up patch #4. Remove ivy/pig.pom and improve build mvn targets Key: PIG-3189 URL: https://issues.apache.org/jira/browse/PIG-3189 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12 Attachments: PIG-3189.1.patch, PIG-3189.2.patch, PIG-3189.3.patch, PIG-3189.4.patch {{ivy/pig.pom}} in SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets
[ https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3189: - Resolution: Fixed Status: Resolved (was: Patch Available) Remove ivy/pig.pom and improve build mvn targets Key: PIG-3189 URL: https://issues.apache.org/jira/browse/PIG-3189 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12 Attachments: PIG-3189.1.patch, PIG-3189.2.patch, PIG-3189.3.patch, PIG-3189.4.patch {{ivy/pig.pom}} in SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: When making a change to pig.apache.org, do we attach just the patch for the changes to author, or to the post-forrest changes to publish as well?
Just the xdoc XML gets submitted IIRC, not the generated files. On Wed, Feb 20, 2013 at 1:35 AM, Jonathan Coveney jcove...@gmail.com wrote: I believe that this is how it works, but it has been a while and I want to make sure...
Re: When making a change to pig.apache.org, do we attach just the patch for the changes to author, or to the post-forrest changes to publish as well?
My bad, I was talking about the user docs, not the site docs. On Wed, Feb 20, 2013 at 8:17 AM, Alan Gates ga...@hortonworks.com wrote: You need to check in both author and publish. The site is now directly loaded from SVN using what's under publish. Alan. On Feb 20, 2013, at 1:35 AM, Jonathan Coveney wrote: I believe that this is how it works, but it has been a while and I want to make sure... -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
Re: pig 0.11 candidate 2 feedback: Several problems
Regarding Pig 11 rc2, I propose we continue with the current vote as is (which closes today EOD). Patches for 0.20.2 issues can be rolled into a Pig 0.11.1 release whenever they're available and tested. On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich onatkov...@yahoo.com wrote: I agree that supporting as much as we can is a good goal. The issue is who is going to be testing against all these versions? We found the issues under discussion because of a customer report, not because we consistently test against all versions. Perhaps when we decide which versions to support for the next release, we also need to agree who is going to be testing and maintaining compatibility with a particular version. For instance, since Hadoop 23 compatibility is important for us at Yahoo, we have been maintaining compatibility with this version for 0.9, 0.10 and will do the same for 0.11 and going forward. I think we would need others to step in and claim the versions of their interest. Olga From: Kai Londenberg kai.londenb...@googlemail.com To: dev@pig.apache.org Sent: Wednesday, February 20, 2013 1:51 AM Subject: Re: pig 0.11 candidate 2 feedback: Several problems Hi, I strongly agree with Jonathan here. If there are good reasons why you can't support an older version of Hadoop any more, that's one thing. But having to change 2 lines of code doesn't really qualify as such in my point of view ;) At least for me, pig support for 0.20.2 is essential - without it, I can't use it. If it doesn't support it, I'll have to branch pig and hack it myself, or stop using it. I guess there are a lot of people still running 0.20.2 clusters. If you really have lots of data stored on HDFS and a continuously busy cluster, an upgrade is nothing you do just because. 2013/2/20 Jonathan Coveney jcove...@gmail.com: I agree that we shouldn't have to support old versions forever. That said, I also don't think we should be too blasé about supporting older versions where it is not odious to do so. 
We have a lot of competition in the language space and the broader the versions we can support, the better (assuming it isn't too odious to do so). In this case, I don't think it should be too hard to change ObjectSerializer so that the commons-codec code used is compatible with both versions...we could just in-line some of the Base64 code, and comment accordingly. That said, we also should be clear about what versions we support, but 6-12 months seems short. The upgrade cycles on Hadoop are really, really long. 2013/2/20 Prashant Kommireddi prash1...@gmail.com Agreed, that makes sense. Probably supporting older hadoop version for a 1 or 2 pig releases before moving to a newer/stable version? Having said that, should we use 0.11 period to communicate the same to the community and start moving on 0.12 onwards? I know we are way past 6-12 months (1-2 release) time frame with 0.20.2, but we also need to make sure users are aware and plan accordingly. I'd also be interested to hear how other projects (Hive, Oozie) are handling this. -Prashant On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich onatkov...@yahoo.com wrote: It seems that for each Pig release we need to agree and clearly state which Hadoop versions it will support. I guess the main question is how we decide on this. Perhaps we should say that Pig no longer supports older Hadoop versions once the newer one is out for at least 6-12 month to make sure it is stable. I don't think we can support old versions indefinitely. It is in everybody's interest to keep moving forward. Olga From: Prashant Kommireddi prash1...@gmail.com To: dev@pig.apache.org Sent: Tuesday, February 19, 2013 10:57 AM Subject: Re: pig 0.11 candidate 2 feedback: Several problems What do you guys feel about the JIRA to do with 0.20.2 compatibility (PIG-3194)? I am interested in discussing the strategy around backward compatibility as this is something that would haunt us each time we move to the next hadoop version. 
For eg, we might be in a similar situation while moving to Hadoop 2.0, when some of the stuff might break for 1.0. I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2 users might be caught unaware. Of course, I must admit there is selfish interest here and it's probably easier for us to have a workaround on Pig rather than upgrade hadoop in all our production DCs. -Prashant On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney russell.jur...@gmail.com wrote: I think someone should step up and fix the easy ones, if possible. On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham billgra...@gmail.com wrote: Thanks Kai for reporting these. What do people think about the severity of these issues w.r.t. Pig 11
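The in-lining idea raised in this thread is workable because the two codec-1.4 methods at issue are thin wrappers over primitives that older environments already have. The sketch below is not Pig's actual fix; it uses java.util.Base64 purely as a stand-in to illustrate the transformation that Base64.encodeBase64URLSafeString(byte[]) performs (standard alphabet with '+' mapped to '-', '/' mapped to '_', padding dropped) and that the lenient codec-1.4 Base64.decodeBase64(String) reverses:

```java
import java.util.Arrays;
import java.util.Base64;

// Sketch only: demonstrates the URL-safe base64 variant behind the
// codec-1.4 methods missing from codec 1.3. java.util.Base64 is a
// stand-in here; Pig's real code uses Apache Commons Codec.
public class UrlSafeBase64Sketch {

    // URL-safe encoding = standard base64 with '+' -> '-', '/' -> '_',
    // and the trailing '=' padding removed.
    static String encodeUrlSafe(byte[] data) {
        String std = Base64.getEncoder().encodeToString(data);
        return std.replace('+', '-').replace('/', '_').replace("=", "");
    }

    // Lenient decode: map the URL-safe alphabet back to the standard
    // one and restore padding before decoding.
    static byte[] decodeUrlSafe(String s) {
        StringBuilder std = new StringBuilder(s.replace('-', '+').replace('_', '/'));
        while (std.length() % 4 != 0) {
            std.append('=');
        }
        return Base64.getDecoder().decode(std.toString());
    }

    public static void main(String[] args) {
        byte[] in = {(byte) 0xfb, (byte) 0xff, 0x00};
        String enc = encodeUrlSafe(in);
        System.out.println(enc);                                   // -_8A
        System.out.println(Arrays.equals(decodeUrlSafe(enc), in)); // true
    }
}
```

The same two helpers could be written against the codec-1.3 byte[] overloads (Base64.encodeBase64(byte[]) and Base64.decodeBase64(byte[])), which is the spirit of the "in-line some of the Base64 code" suggestion above.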
Re: [VOTE] Release Pig 0.11.0 (candidate 2)
With 3 binding +1s (Daniel, Dmitriy and Julien) and 1 non-binding +1 (Cheolsoo), the vote passes. I will start the release process. thanks, Bill On Wed, Feb 20, 2013 at 7:46 PM, Cheolsoo Park cheol...@cloudera.com wrote: +1 (non-binding) I downloaded and compiled the source tarball. I tested jars against Hadoop 1.x and 2.x based clusters. On Wed, Feb 20, 2013 at 5:10 PM, Julien Le Dem jul...@twitter.com wrote: +1 I've run a subset of the tests on the src tar, run some jobs in local mode on the binary tar, checked the release note. Looks good to me. Julien On Feb 14, 2013, at 3:59 PM, Bill Graham wrote: Hi, I have created a candidate build for Pig 0.11.0. Keys used to sign the release are available at: http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup Please download, test, and try it out: http://people.apache.org/~billgraham/pig-0.11.0-candidate-2/ Should we release this? Vote closes on next Wednesday EOD, Feb 20th. Thanks, Bill
Re: What do we need to change site documentation?
It's the '~'. Swap that out for $HOME. On Tue, Feb 19, 2013 at 7:15 AM, Jonathan Coveney jcove...@gmail.comwrote: Hm, that's what I thought. Not sure why I'm having an issue then...I tried to build it and it failed. I was able to successfully build the example that came with forrest. Anyone seen something like this before? $ ls ~/workspace/apache-forrest-0.9/ KEYSLICENSE.txtNOTICE.txtREADME.txtbinbuild etcindex.htmllibmainplugins site-authortoolswhiteboard [jonathancoveney@Jonathans-MacBook-Pro site]$ ant -Dforrest.home=~/workspace/apache-forrest-0.9 Buildfile: /Users/jonathancoveney/workspace/pig_full/site/build.xml clean: forrest.check: update: BUILD FAILED /Users/jonathancoveney/workspace/pig_full/site/build.xml:11: Execute failed: java.io.IOException: Cannot run program ~/workspace/apache-forrest-0.9/bin/forrest (in directory /Users/jonathancoveney/workspace/pig_full/site/author): error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:460) at java.lang.Runtime.exec(Runtime.java:593) at org.apache.tools.ant.taskdefs.Execute$Java13CommandLauncher.exec(Execute.java:862) at org.apache.tools.ant.taskdefs.Execute.launch(Execute.java:481) at org.apache.tools.ant.taskdefs.Execute.execute(Execute.java:495) at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:631) at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:672) at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:498) at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106) at org.apache.tools.ant.Task.perform(Task.java:348) at 
org.apache.tools.ant.Target.execute(Target.java:390) at org.apache.tools.ant.Target.performTasks(Target.java:411) at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399) at org.apache.tools.ant.Project.executeTarget(Project.java:1368) at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41) at org.apache.tools.ant.Project.executeTargets(Project.java:1251) at org.apache.tools.ant.Main.runBuild(Main.java:809) at org.apache.tools.ant.Main.startAnt(Main.java:217) at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280) at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109) Caused by: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.init(UNIXProcess.java:53) at java.lang.ProcessImpl.start(ProcessImpl.java:91) at java.lang.ProcessBuilder.start(ProcessBuilder.java:453) ... 24 more Total time: 0 seconds 2013/2/19 Alan Gates ga...@hortonworks.com No, somebody fixed it a while ago so it works with java 6. Just checkout pig/site, make your changes, build with ant -Dforrest.home=whatever, view the changes locally under the publish directory, add any new files, and check in. The publication from SVN to web is now automatic. It all works fine with the default Java on my mac. Alan. On Feb 19, 2013, at 4:39 AM, Jonathan Coveney wrote: I know we need forrest, but do we still need java 5? -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
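For context on the failure above: tilde expansion is a shell feature, and Java's Runtime.exec/ProcessBuilder (which ant uses under the hood) does none, so a '~' that survives into the forrest.home value reaches the OS as a literal directory name. A minimal illustration, with a hypothetical path:

```java
import java.io.File;

// Demonstrates why "-Dforrest.home=~/..." fails: the JVM treats "~" as
// an ordinary relative path component, not the user's home directory.
public class TildeDemo {
    public static void main(String[] args) {
        File literal = new File("~/workspace/apache-forrest-0.9/bin/forrest");
        // Not absolute: "~" was never expanded, so this resolves relative
        // to the current working directory and won't exist.
        System.out.println(literal.isAbsolute());   // false

        // The portable fix: expand the home directory yourself
        // (equivalent to passing $HOME on the command line).
        File expanded = new File(System.getProperty("user.home"),
                "workspace/apache-forrest-0.9/bin/forrest");
        System.out.println(expanded.isAbsolute());  // true
    }
}
```

Hence the advice in this thread: write `ant -Dforrest.home=$HOME/workspace/apache-forrest-0.9` so the shell substitutes the value before Java ever sees it.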
[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets
[ https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3189: - Attachment: PIG-3189.2.patch {{git rm}} did not, but just {{rm}} did the trick. Here's patch 2 which reflects the delete. Remove ivy/pig.pom and improve build mvn targets Key: PIG-3189 URL: https://issues.apache.org/jira/browse/PIG-3189 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12 Attachments: PIG-3189.1.patch, PIG-3189.2.patch {{ivy/pig.pom}} SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets
[ https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3189: - Description: {{ivy/pig.pom}} in SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. was: {{ivy/pig.pom}} SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. Remove ivy/pig.pom and improve build mvn targets Key: PIG-3189 URL: https://issues.apache.org/jira/browse/PIG-3189 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12 Attachments: PIG-3189.1.patch, PIG-3189.2.patch {{ivy/pig.pom}} in SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: What do we need to change site documentation?
You and me both. I've been burned by that one about 342 times now... On Tue, Feb 19, 2013 at 8:11 AM, Jonathan Coveney jcove...@gmail.com wrote: I am such a scrub :) thanks Bill! 2013/2/19 Bill Graham billgra...@gmail.com It's the '~'. Swap that out for $HOME. On Tue, Feb 19, 2013 at 7:15 AM, Jonathan Coveney jcove...@gmail.com wrote: Hm, that's what I thought. Not sure why I'm having an issue then...I tried to build it and it failed. I was able to successfully build the example that came with forrest. Anyone seen something like this before? -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
Re: pig 0.11 candidate 2 feedback: Several problems
Thanks Kai for reporting these. What do people think about the severity of these issues w.r.t. Pig 11? I see a few possible options: 1. We include some or all of these patches in a new Pig 11 rc. We'd want to make sure that they don't destabilize the current branch. This approach makes sense if we think Pig 11 wouldn't be a good release without one or more of these included. 2. We continue with the Pig 11 release without these, but then include one or more in a 0.11.1 release. 3. We continue with the Pig 11 release without these, but then include them in a 0.12 release. Jon has a patch for the MAP issue (PIG-3144, https://issues.apache.org/jira/browse/PIG-3144) ready, which seems like the most pressing of the three to me. thanks, Bill On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg kai.londenb...@googlemail.com wrote: Hi, I just subscribed to the dev mailing list in order to give you some feedback on pig 0.11 candidate 2. The following three issues are currently present in 0.11 candidate 2: https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry alias resolution leading to Duplicate schema alias errors' https://issues.apache.org/jira/browse/PIG-3194 - Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2 https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in PhysicalOperator leads to ExecException Error while trying to get next result in POStream The last two of these are easily solvable (see the tickets for details on that). The first one is a bit trickier I think, but at least there is a workaround for it (pass Map fields through a UDF). In my personal opinion, each of these problems is pretty severe, but opinions about the importance of the MAP datatype and STREAM operator, as well as Hadoop 0.20.2 compatibility, might differ. so far .. Kai Londenberg -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
[jira] [Commented] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets
[ https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581539#comment-13581539 ] Bill Graham commented on PIG-3189: -- There have been changes to {{ivy/pig.pom}} that are not reflected in {{ivy/pig-template.xml}}. In particular, antlr is not being included in the published pom because it's not in the template. Will submit a new patch.
{noformat}
$ diff ivy/pig-template.xml ivy/pig.pom
24c24
<     <version>@version</version>
---
>     <version>0.9.0-SNAPSHOT</version>
85,86c85,86
<   </dependency>
<   <dependency>
---
>   </dependency>
>   <dependency>
122c122,132
<     <groupId>org.apache.avro</groupId>
---
>     <groupId>org.antlr</groupId>
>     <artifactId>antlr-runtime</artifactId>
>     <version>3.4</version>
>   </dependency>
>   <dependency>
>     <groupId>org.antlr</groupId>
>     <artifactId>ST4</artifactId>
>     <version>4.0.4</version>
>   </dependency>
>   <dependency>
>     <groupId>org.apache.hadoop</groupId>
124c134
<     <version>1.5.3</version>
---
>     <version>1.3.2</version>
{noformat}
Remove ivy/pig.pom and improve build mvn targets Key: PIG-3189 URL: https://issues.apache.org/jira/browse/PIG-3189 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12 Attachments: PIG-3189.1.patch, PIG-3189.2.patch {{ivy/pig.pom}} in SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Missing ANTLR dependency in Pig 0.10.1
No need to file a jira. I can roll this fix into this bug, which is the source of the problem: https://issues.apache.org/jira/browse/PIG-3189#comment-13581539 On Tue, Feb 19, 2013 at 11:07 AM, Rohini Palaniswamy rohini.adi...@gmail.com wrote: You mean it is not pulled as a transitive dependency? Currently you have to manually specify that as a dependency in your pom. Can you file a jira to make that part of the pig pom? Regards, Rohini On Tue, Feb 19, 2013 at 10:42 AM, Minh Lê ngocminh@gmail.com wrote: I tried to run PigServer in my Java code and got a NoClassDefFoundError: org/antlr/runtime/RecognitionException. The Pig maven repo appears to lack a reference to the ANTLR jars, although Pig uses ANTLR in the QueryParserDriver class. http://mvnrepository.com/artifact/org.apache.pig/pig/0.10.1 Bests, -- Minh, Lê Ngọc Trento University, Master in Cognitive Science - Class of 2014 Skype: ngocminh_oss | Yahoo: ngocminh_oss | Tel: +39 389 603 7251 -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
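Until a corrected Pig pom is published, the usual workaround for the NoClassDefFoundError described above is to declare the dependency explicitly in your own project's pom.xml. This is a sketch, not an official recommendation; the version shown matches the antlr-runtime 3.4 coordinates that appear in the PIG-3189 ivy diff, so verify it against the Pig release you actually use:

```xml
<!-- Workaround sketch: declare ANTLR explicitly until the published
     Pig pom carries it as a transitive dependency. Version taken from
     the PIG-3189 ivy/pig.pom diff; confirm for your Pig version. -->
<dependency>
  <groupId>org.antlr</groupId>
  <artifactId>antlr-runtime</artifactId>
  <version>3.4</version>
</dependency>
```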
[jira] [Commented] (PIG-3131) Document PluckTuple UDF
[ https://issues.apache.org/jira/browse/PIG-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581748#comment-13581748 ] Bill Graham commented on PIG-3131: -- +1 to the hotfix. Document PluckTuple UDF --- Key: PIG-3131 URL: https://issues.apache.org/jira/browse/PIG-3131 Project: Pig Issue Type: Task Affects Versions: 0.12 Reporter: Jonathan Coveney Assignee: Russell Jurney Priority: Blocker Fix For: 0.12 Attachments: PIG-3131-hotfix.patch, PIG-3131.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [ANNOUNCE] Welcome Bill Graham to join Pig PMC
Thanks guys! On Tue, Feb 19, 2013 at 3:20 PM, Cheolsoo Park cheol...@cloudera.com wrote: Congratulations! On Tue, Feb 19, 2013 at 2:35 PM, Prasanth J buckeye.prasa...@gmail.com wrote: Congrats Bill! Thanks -- Prasanth On Feb 19, 2013, at 4:52 PM, Prashant Kommireddi prash1...@gmail.com wrote: Congrats Bill! On Tue, Feb 19, 2013 at 1:48 PM, Daniel Dai da...@hortonworks.com wrote: Please welcome Bill Graham as our latest Pig PMC member. Congrats Bill! -- *Note that I'm no longer using my Yahoo! email address. Please email me at billgra...@gmail.com going forward.*
[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets
[ https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Graham updated PIG-3189: - Attachment: PIG-3189.3.patch Adding patch #3 which includes antlr in the template. Remove ivy/pig.pom and improve build mvn targets Key: PIG-3189 URL: https://issues.apache.org/jira/browse/PIG-3189 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12 Attachments: PIG-3189.1.patch, PIG-3189.2.patch, PIG-3189.3.patch {{ivy/pig.pom}} in SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (PIG-3188) pig.script.submitted.timestamp not always consistent for jobs launched in a given script
Bill Graham created PIG-3188: Summary: pig.script.submitted.timestamp not always consistent for jobs launched in a given script Key: PIG-3188 URL: https://issues.apache.org/jira/browse/PIG-3188 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12 {{pig.script.submitted.timestamp}} is set in {{MapReduceLauncher.launchPig()}} when an MR plan is launched. Some scripts (e.g., those with an exec in the middle) will cause multiple plans to be launched. In these cases, jobs launched from the same script can have different {{pig.script.submitted.timestamp}} values, which is a bug. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-3174) Remove rpm and deb artifacts from build.xml
[ https://issues.apache.org/jira/browse/PIG-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13579643#comment-13579643 ] Bill Graham commented on PIG-3174: -- I'll take a look at the patch, but what are your thoughts on documentation or, more generally, how we should point people to Bigtop for distros? Remove rpm and deb artifacts from build.xml --- Key: PIG-3174 URL: https://issues.apache.org/jira/browse/PIG-3174 Project: Pig Issue Type: Task Components: build Affects Versions: 0.12 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12 Attachments: PIG-3174.2.patch, PIG-3174.patch I propose that we remove the targets to build rpms and debs from build.xml and consequently quit publishing them as part of our releases. Bigtop publishes these packages now. And building them takes infrastructure that not every committer/PMC member has. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets
Bill Graham created PIG-3189: Summary: Remove ivy/pig.pom and improve build mvn targets Key: PIG-3189 URL: https://issues.apache.org/jira/browse/PIG-3189 Project: Pig Issue Type: Bug Reporter: Bill Graham Assignee: Bill Graham Fix For: 0.12 {{ivy/pig.pom}} SVN seems to no longer be used. At build time ({{ant set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN. It would also be good to decouple building the maven artifacts from publishing them, since those two tasks might be done on different hosts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira