Re: Welcome to the new Pig PMC member Aniket Mokashi

2014-01-14 Thread Bill Graham
Woo! Congrats Aniket!


On Tue, Jan 14, 2014 at 8:47 PM, Olga Natkovich onatkov...@yahoo.com wrote:

 Congrats, Aniket!



 On Tuesday, January 14, 2014 8:32 PM, Tongjie Chen tongjie.c...@gmail.com
 wrote:

 Congrats Aniket!



 On Tue, Jan 14, 2014 at 8:12 PM, Cheolsoo Park piaozhe...@gmail.com
 wrote:

  Congrats Aniket!
 
 
  On Tue, Jan 14, 2014 at 7:01 PM, Jarek Jarcec Cecho jar...@apache.org
  wrote:
 
   Congratulations Aniket, good work!
  
   Jarcec
  
   On Tue, Jan 14, 2014 at 06:52:10PM -0800, JULIEN LE DEM wrote:
It's my pleasure to announce that Aniket Mokashi became the newest
   addition to the Pig PMC.
Aniket has been actively contributing to Pig for years.
Please join me in congratulating Aniket!
   
Julien
   
  
 




-- 
*Note that I'm no longer using my Yahoo! email address. Please email me
at billgra...@gmail.com going forward.*


[jira] [Commented] (PIG-3623) Documentation for loadKey in HBaseStorage is incorrect

2013-12-26 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857303#comment-13857303
 ] 

Bill Graham commented on PIG-3623:
--

+1 for fixing behavior to match the docs here. Great find [~mstefaniak].

 Documentation for loadKey in HBaseStorage is incorrect
 --

 Key: PIG-3623
 URL: https://issues.apache.org/jira/browse/PIG-3623
 Project: Pig
  Issue Type: Bug
Reporter: Michael Stefaniak

 The documentation for HBaseStorage 
 (http://pig.apache.org/docs/r0.12.0/func.html#HBaseStorage)
 says -loadKey=(true|false) Load the row key as the first value in every tuple 
 returned from HBase (default=false)
 However, looking at the source 
 (http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/hbase/HBaseStorage.java)
 it is just doing a check for the existence of this option:
 loadRowKey_ = configuredOptions_.hasOption("loadKey");
 So setting -loadKey=false in the options string still results in a true value.
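The fix direction endorsed above (make behavior match the docs) can be sketched as follows. This is an illustrative reconstruction in plain Java, not the actual HBaseStorage patch; the map-based option handling stands in for the real Commons CLI parsing.

```java
import java.util.Map;

// Illustrative sketch only: honor the value passed with -loadKey instead of
// treating the mere presence of the option as "true". The field name echoes
// the HBaseStorage snippet above; this is not the actual project patch.
public class LoadKeyOption {

    // opts maps option name -> value ("" when the flag is given without a value).
    static boolean loadRowKey(Map<String, String> opts) {
        if (!opts.containsKey("loadKey")) {
            return false; // option absent: keep the documented default of false
        }
        String value = opts.get("loadKey");
        // A bare -loadKey still means true; an explicit value is parsed as a boolean,
        // so -loadKey=false now actually disables loading the row key.
        return value == null || value.isEmpty() || Boolean.parseBoolean(value);
    }
}
```

With this shape, `-loadKey=false` yields `false` while a bare `-loadKey` keeps its historical meaning of `true`.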



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (PIG-3549) Print hadoop jobids for failed, killed job

2013-10-29 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808038#comment-13808038
 ] 

Bill Graham commented on PIG-3549:
--

Wow, fix of the year.

+1

 Print hadoop jobids for failed, killed job
 --

 Key: PIG-3549
 URL: https://issues.apache.org/jira/browse/PIG-3549
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Aniket Mokashi
Assignee: Aniket Mokashi
 Fix For: 0.12.1

 Attachments: PIG-3549.patch


 It would be better if we dumped the hadoop job ids for failed or killed jobs 
 in the pig log. Right now, the log looks like the following:
 {noformat}
 ERROR org.apache.pig.tools.grunt.Grunt: ERROR 6017: Job failed! Error - NA
 INFO 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher:
  Job job_pigexec_1 killed
 {noformat}
 From that it's hard to say which hadoop job failed if there are multiple jobs 
 running in parallel.





[jira] [Updated] (PIG-3497) JobControlCompiler should only do reducer estimation when the job has a reduce phase

2013-10-03 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3497:
-

Assignee: Akihiro Matsukawa

 JobControlCompiler should only do reducer estimation when the job has a 
 reduce phase
 

 Key: PIG-3497
 URL: https://issues.apache.org/jira/browse/PIG-3497
 Project: Pig
  Issue Type: Bug
Reporter: Akihiro Matsukawa
Assignee: Akihiro Matsukawa
Priority: Minor
 Attachments: reducer_estimation.patch


 Currently, JobControlCompiler makes an estimation for the number of reducers 
 required (by default based on input size into mappers) regardless of whether 
 there is a reduce phase in the job. This is unnecessary, especially when 
 running more complex custom reducer estimators. 
 Change to only estimate reducers when necessary.
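The proposed guard can be sketched as follows; the method signature and the bytes-per-reducer constant are illustrative stand-ins, not Pig's actual JobControlCompiler code.

```java
// Minimal sketch of the guard described above: skip reducer estimation
// entirely for map-only jobs, and only apply the input-size heuristic when
// the job actually has a reduce phase. Hypothetical names throughout.
public class ReducerEstimation {

    static final long BYTES_PER_REDUCER = 1_000_000_000L; // heuristic threshold

    // Returns the number of reducers to request, or 0 for a map-only job,
    // in which case no estimator (default or custom) is ever invoked.
    static int estimateReducers(boolean hasReducePhase, long totalInputBytes) {
        if (!hasReducePhase) {
            return 0; // map-only: nothing to estimate
        }
        // Ceiling division: one reducer per BYTES_PER_REDUCER of input, at least one.
        return (int) Math.max(1,
                (totalInputBytes + BYTES_PER_REDUCER - 1) / BYTES_PER_REDUCER);
    }
}
```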





[jira] [Commented] (PIG-3455) Pig 0.11.1 OutOfMemory error

2013-09-17 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770256#comment-13770256
 ] 

Bill Graham commented on PIG-3455:
--

+1, much better.

 Pig 0.11.1 OutOfMemory error
 

 Key: PIG-3455
 URL: https://issues.apache.org/jira/browse/PIG-3455
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11.1
Reporter: Shubham Chopra
Priority: Critical
 Fix For: 0.12, 0.11.2

 Attachments: PIG-3455-1.patch


 When running Pig on a relatively large script (around 1.5k lines, 85 
 assignments), Pig fails with the following error even before any jobs are 
 fired:
 Pig Stack Trace
 ---
 ERROR 2998: Unhandled internal error. Java heap space
 java.lang.OutOfMemoryError: Java heap space
 at java.util.Arrays.copyOf(Arrays.java:2882)
 at 
 java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
 at 
 java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
 at java.lang.StringBuilder.append(StringBuilder.java:119)
 at 
 org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.depthFirstLP(LogicalPlanPrinter.java:83)
 at 
 org.apache.pig.newplan.logical.optimizer.LogicalPlanPrinter.visit(LogicalPlanPrinter.java:69)
 at 
 org.apache.pig.newplan.logical.relational.LogicalPlan.getSignature(LogicalPlan.java:122)
 at org.apache.pig.PigServer.execute(PigServer.java:1237)
 at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
 at 
 org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
 at org.apache.pig.Main.run(Main.java:604)
 at org.apache.pig.Main.main(Main.java:157)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 The same script works fine with Pig-0.10.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-3419) Pluggable Execution Engine

2013-09-13 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767227#comment-13767227
 ] 

Bill Graham commented on PIG-3419:
--

We should at least annotate the new interfaces as evolving so we don't need 
to evolve them in a backward-compatible way just yet.

 Pluggable Execution Engine 
 ---

 Key: PIG-3419
 URL: https://issues.apache.org/jira/browse/PIG-3419
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.12
Reporter: Achal Soni
Assignee: Achal Soni
Priority: Minor
 Fix For: 0.12

 Attachments: execengine.patch, mapreduce_execengine.patch, 
 stats_scriptstate.patch, test_failures.txt, test_suite.patch, 
 updated-8-22-2013-exec-engine.patch, updated-8-23-2013-exec-engine.patch, 
 updated-8-27-2013-exec-engine.patch, updated-8-28-2013-exec-engine.patch, 
 updated-8-29-2013-exec-engine.patch


 In an effort to adapt Pig to work using Apache Tez 
 (https://issues.apache.org/jira/browse/TEZ), I made some changes to allow for 
 a cleaner ExecutionEngine abstraction than existed before. The changes are 
 not that major as Pig was already relatively abstracted out between the 
 frontend and backend. The changes in the attached commit are essentially the 
 barebones changes -- I tried to not change the structure of Pig's different 
 components too much. I think it will be interesting to see in the future how 
 we can refactor more areas of Pig to really honor this abstraction between 
 the frontend and backend. 
 Some of the changes were to reinstate an ExecutionEngine interface to tie 
 together the frontend and backend, to make Pig delegate to the EE when 
 necessary, and to create an MRExecutionEngine that implements this interface. 
 Other work included changing ExecType to cycle through the ExecutionEngines 
 on the classpath and select the appropriate one (this is done using Java 
 ServiceLoader, exactly how MapReduce chooses the framework to use between 
 local and distributed mode). Also, I tried to make ScriptState, JobStats, and 
 PigStats as abstract as possible in their current state. I think in the 
 future some work will need to be done here to re-evaluate the usage of 
 ScriptState and the responsibilities of the different statistics classes. I 
 haven't touched the PPNL, but I think more abstraction is needed there too, 
 perhaps in a separate patch. 
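The ServiceLoader mechanism described above can be illustrated with a self-contained sketch. The `Engine` interface and engine names here are invented stand-ins for Pig's real ExecutionEngine types; with no provider registered under `META-INF/services`, the loader yields nothing and the code falls back to a default, mirroring how a built-in engine might be chosen.

```java
import java.util.ServiceLoader;

// Sketch of ServiceLoader-based engine discovery, as described in the
// ticket. Providers would normally be registered via a
// META-INF/services/EngineLookup$Engine file on the classpath.
public class EngineLookup {

    public interface Engine {
        String name();
    }

    static String chooseEngine(String requested) {
        // Iterate every Engine implementation found on the classpath.
        for (Engine e : ServiceLoader.load(Engine.class)) {
            if (e.name().equalsIgnoreCase(requested)) {
                return e.name(); // first provider matching the requested exec type
            }
        }
        return "mapreduce"; // fall back to the built-in engine
    }
}
```

This is the same pattern MapReduce uses to choose between local and distributed frameworks: discovery by classpath rather than hard-coded class names.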



Re: [ANNOUNCE] Congratulations to our new PMC members Rohini Palaniswamy and Cheolsoo Park

2013-09-12 Thread Bill Graham
Congrats guys! Well deserved indeed.


On Wed, Sep 11, 2013 at 10:58 PM, Jarek Jarcec Cecho jar...@apache.org wrote:

 Congratulations Rohini and Cheolsoo, awesome work!

 Jarcec

 On Wed, Sep 11, 2013 at 04:24:21PM -0700, Julien Le Dem wrote:
  Please welcome Rohini Palaniswamy and Cheolsoo Park as our latest Pig
 PMC members.
 
  Congrats Rohini and Cheolsoo !




-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*


Re: Welcome new Pig Committer - Koji Noguchi

2013-09-11 Thread Bill Graham
Congrats Koji!


On Tue, Sep 10, 2013 at 10:29 PM, Cheolsoo Park piaozhe...@gmail.com wrote:

 Congratulations Koji!


 On Wed, Sep 11, 2013 at 7:32 AM, Prashant Kommireddi prash1...@gmail.com
 wrote:

  Congrats Koji!
 
 
  On Tue, Sep 10, 2013 at 10:01 AM, Xuefu Zhang xzh...@cloudera.com
 wrote:
 
   Congratulations, Koji. Looking forward to more of your contributions.
  
   --Xuefu
  
  
   On Tue, Sep 10, 2013 at 8:58 AM, Olga Natkovich onatkov...@yahoo.com
   wrote:
  
It is my pleasure to announce that Koji Noguchi became the newest
   addition
to the Pig Committers!
   
Koji has been actively contributing to Pig for over a year now and
 has
been a part of larger Hadoop community (including Hadoop Committer)
 for
many years now.
   
Please, join me in congratulating Koji!
   
Olga
   
  
 




-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*


[jira] [Commented] (PIG-3455) Pig 0.11.1 OutOfMemory error

2013-09-10 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763912#comment-13763912
 ] 

Bill Graham commented on PIG-3455:
--

Thanks [~rohini.u] for kicking this off. Yes, a streaming based hash function 
would be a much better approach. No need for backward compatibility. The 
signature contract is that it could change between Pig releases.
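The streaming-hash idea suggested here could look roughly like the following; the class and method names are hypothetical, using only the JDK's MessageDigest so that no giant plan string is ever materialized (the OOM site in the stack trace quoted below is the StringBuilder inside LogicalPlanPrinter).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Rough sketch of a streaming plan signature: update a digest incrementally
// as each node is visited, instead of printing the whole logical plan into
// one StringBuilder and hashing the result. Hypothetical names, not Pig's API.
public class StreamingSignature {

    static int signature(Iterable<String> nodeDescriptions) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (String node : nodeDescriptions) {
                // Bounded memory: only one node's description is in flight at a time.
                md.update(node.getBytes(StandardCharsets.UTF_8));
            }
            byte[] d = md.digest();
            // Fold the first digest bytes into an int, hashCode-style.
            return ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                 | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // MD5 is always present in the JDK
        }
    }
}
```

Because digest updates are associative over the byte stream, hashing node-by-node produces the same signature as hashing the fully printed plan, without the memory cost.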




[jira] [Commented] (PIG-3048) Add mapreduce workflow information to job configuration

2013-08-29 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753997#comment-13753997
 ] 

Bill Graham commented on PIG-3048:
--

+1 to commit.

Just one style nit re spaces:

{noformat}
(getFileName() != null)?getFileName():"default"
{noformat}
should instead be:
{noformat}
(getFileName() != null) ? getFileName() : "default"
{noformat}


 Add mapreduce workflow information to job configuration
 ---

 Key: PIG-3048
 URL: https://issues.apache.org/jira/browse/PIG-3048
 Project: Pig
  Issue Type: Improvement
Reporter: Billie Rinaldi
Assignee: Billie Rinaldi
 Fix For: 0.12

 Attachments: PIG-3048.patch, PIG-3048.patch, PIG-3048.patch


 Adding workflow properties to the job configuration would enable logging and 
 analysis of workflows in addition to individual MapReduce jobs.  Suggested 
 properties include a workflow ID, workflow name, adjacency list connecting 
 nodes in the workflow, and the name of the current node in the workflow.
 mapreduce.workflow.id - a unique ID for the workflow, ideally prepended with 
 the application name
 e.g. pig_pigScriptId
 mapreduce.workflow.name - a name for the workflow, to distinguish this 
 workflow from other workflows and to group different runs of the same workflow
 e.g. pig command line
 mapreduce.workflow.adjacency - an adjacency list for the workflow graph, 
 encoded as mapreduce.workflow.adjacency.<source node> = <comma-separated list 
 of target nodes>
 mapreduce.workflow.node.name - the name of the node corresponding to this 
 MapReduce job in the workflow adjacency list
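A sketch of how the proposed properties might be populated follows. The keys come from the proposal above; the workflow-graph values (`scope-1` feeding `scope-2` and `scope-3`) are invented examples, not real Pig output, and plain `java.util.Properties` stands in for a Hadoop `Configuration`.

```java
import java.util.Properties;

// Illustrative population of the proposed workflow properties for one
// MapReduce job in a two-stage workflow. Values are made-up examples.
public class WorkflowProps {

    static Properties forNode(String workflowId, String workflowName, String node) {
        Properties conf = new Properties();
        conf.setProperty("mapreduce.workflow.id", workflowId);     // e.g. pig_<pigScriptId>
        conf.setProperty("mapreduce.workflow.name", workflowName); // e.g. the pig command line
        // Adjacency list: one property per source node; value = its target nodes.
        conf.setProperty("mapreduce.workflow.adjacency.scope-1", "scope-2,scope-3");
        conf.setProperty("mapreduce.workflow.node.name", node);    // this job's node
        return conf;
    }
}
```

A log-analysis tool can then reconstruct the whole workflow graph by collecting the `adjacency.*` keys across the jobs that share a workflow id.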



[jira] [Commented] (PIG-3048) Add mapreduce workflow information to job configuration

2013-08-29 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753998#comment-13753998
 ] 

Bill Graham commented on PIG-3048:
--

Whoops, I was a minute too late. :)




[jira] [Commented] (PIG-3382) Store data in hbase with more than 2 column family

2013-07-17 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711315#comment-13711315
 ] 

Bill Graham commented on PIG-3382:
--

Would you please attach your script and ideally some sample data, along with 
any errors or other relevant info that might help us troubleshoot and reproduce 
the issue?

 Store data in hbase with more than 2 column family
 --

 Key: PIG-3382
 URL: https://issues.apache.org/jira/browse/PIG-3382
 Project: Pig
  Issue Type: Improvement
  Components: build, internal-udfs, parser
Reporter: vikram s

 I am not able to store data in HBase with more than 2 column families.
 I used STORE api from pig with internal-udf 
 org.apache.pig.backend.hadoop.hbase.HBaseStorage



[jira] [Resolved] (PIG-3382) Store data in hbase with more than 2 column family

2013-07-17 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-3382.
--

Resolution: Not A Problem

 Store data in hbase with more than 2 column family
 --

 Key: PIG-3382
 URL: https://issues.apache.org/jira/browse/PIG-3382
 Project: Pig
  Issue Type: Improvement
  Components: build, internal-udfs, parser
Reporter: vikram s

 I am not able to store data in HBase with more than 2 column families.
 I used STORE api from pig with internal-udf 
 org.apache.pig.backend.hadoop.hbase.HBaseStorage



[jira] [Updated] (PIG-3330) please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants

2013-05-17 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3330:
-

Assignee: Bill Graham

 please fix the change that created a dependency on 
 org.apache.pig.impl.PigImplConstants
 ---

 Key: PIG-3330
 URL: https://issues.apache.org/jira/browse/PIG-3330
 Project: Pig
  Issue Type: Bug
Reporter: Joseph Adler
Assignee: Bill Graham
Priority: Blocker

 I can't build Pig from trunk because several source files (including 
 org.apache.pig.Main.java) require org.apache.pig.impl.PigImplConstants, but 
 that class isn't available.
 I'm assuming someone left out a file on a recent commit.



[jira] [Resolved] (PIG-3330) please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants

2013-05-17 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-3330.
--

Resolution: Fixed

My bad, I made the commit last night and forgot 'svn add'. Just made the fix by 
adding the missing file.

 please fix the change that created a dependency on 
 org.apache.pig.impl.PigImplConstants
 ---

 Key: PIG-3330
 URL: https://issues.apache.org/jira/browse/PIG-3330
 Project: Pig
  Issue Type: Bug
Reporter: Joseph Adler
Assignee: Bill Graham
Priority: Blocker

 I can't build Pig from trunk because several source files (including 
 org.apache.pig.Main.java) require org.apache.pig.impl.PigImplConstants, but 
 that class isn't available.
 I'm assuming someone left out a file on a recent commit.



[jira] [Updated] (PIG-3317) disable optimizations via pig properties

2013-05-16 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3317:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed, thanks Travis!

 disable optimizations via pig properties
 

 Key: PIG-3317
 URL: https://issues.apache.org/jira/browse/PIG-3317
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.12
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: PIG-3317_disable_opts.1.patch, 
 PIG-3317_disable_opts.2.patch, PIG-3317_disable_opts.3.patch, 
 PIG-3317_disable_opts.4.patch


 Pig provides a number of optimizations which are described at 
 [http://pig.apache.org/docs/r0.11.1/perf.html#optimization-rules]. As is 
 described in the docs, all or specific optimizations can be disabled via the 
 command-line.
 Currently the caller of a pig script must know which optimizations to disable 
 when running because that information cannot be set in the script itself. Nor 
 can optimizations be disabled site-wide through pig.properties.
 Pig should allow disabling optimizations via properties so that pig scripts 
 themselves can disable optimizations as needed, rather than the caller 
 needing to know what optimizations to disable on the command-line.



[jira] [Updated] (PIG-3326) Add PiggyBank to Maven Repository

2013-05-15 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3326:
-

Fix Version/s: 0.12

 Add PiggyBank to Maven Repository
 -

 Key: PIG-3326
 URL: https://issues.apache.org/jira/browse/PIG-3326
 Project: Pig
  Issue Type: New Feature
  Components: piggybank
Reporter: Aaron Mitchell
Priority: Minor
 Fix For: 0.12


 PiggyBank should be uploaded to the apache maven repository.



[jira] [Commented] (PIG-3326) Add PiggyBank to Maven Repository

2013-05-15 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658831#comment-13658831
 ] 

Bill Graham commented on PIG-3326:
--

Support for publishing piggybank to maven will be in Pig 0.12 thanks to 
PIG-3233.

 Add PiggyBank to Maven Repository
 -

 Key: PIG-3326
 URL: https://issues.apache.org/jira/browse/PIG-3326
 Project: Pig
  Issue Type: New Feature
  Components: piggybank
Reporter: Aaron Mitchell
Priority: Minor
 Fix For: 0.12


 PiggyBank should be uploaded to the apache maven repository.



Re: Review Request: disable optimizations via pig properties

2013-05-14 Thread Bill Graham


 On May 13, 2013, 11:35 p.m., Bill Graham wrote:
  src/docs/src/documentation/content/xdocs/perf.xml, line 493
  https://reviews.apache.org/r/11032/diff/2/?file=290925#file290925line493
 
  Would you please specify that setting this value in both the pig 
  properties file and the command line (or script) will be additive.
 
 Travis Crawford wrote:
 Currently it works like this:
 
 (a) -optimizer_off command-line rules are always disabled.
 (b) The pig.optimizer.rules.disabled property works like other 
 properties, where setting in the script itself overwrites previously set 
 values (from either the command-line or pig.properties).
 
 Disabled rules are additive in that (a) + (b) will be disabled. However, 
 within (b) only the last specified value of pig.optimizer.rules.disabled 
 takes effect.
 
 I think this makes sense for how people will want to use the feature (and 
 I think is consistent with how other properties work).
 
 * Site administrators can specify default rules to disable via 
 pig.properties.
 * Individual scripts can override the site defaults if needed.
 * Invokers of pig can supplement the rules to disable.
 
 Thoughts? If we want to be additive within (b) we'd also need a way to 
 remove defaults set by site administrators, since the default should be a 
 suggestion not requirement. That would easily be achieved with a - prefix 
 that would remove disabled rules, but I think we've covered the common 
 use-cases without introducing extra complexity.
 
 Bill Graham wrote:
  (b) The pig.optimizer.rules.disabled property works like other 
 properties, where setting in the script itself overwrites previously set 
 values (from either the command-line or pig.properties).
 
 This implies SET in a script will override the command line (or 
 properties).
 
  Disabled rules are additive in that (a) + (b) will be disabled. 
 However, within (b) only the last specified value of 
 pig.optimizer.rules.disabled takes effect.
 
 This implies SET in a script (or properties) would be additive with the 
 command line.
 
 Can you help clarify what I think sounds like a contradiction? Just 
 trying to understand the implemented behavior more than propose a change to 
 it.
 
 Travis Crawford wrote:
 RE: This implies SET in a script will override the command line (or 
 properties).
 A: Rules disabled via the -optimizer_off command-line flag are treated 
 separately from values disabled via the pig property.
 
 RE: This implies SET in a script (or properties) would be additive with 
 the command line.
 A: Correct - SET in a script is additive with rules disabled via the 
 command-line flag.
 
 
 Rules to disable are the set of rules disabled on the command line + 
 rules disabled via the pig.optimizer.rules.disabled property.
 
 Pig's code currently uses a command-line flag to disable optimization 
 rules rather than standard pig properties. I think the ideal state would be 
 using a single property to disable rules because properties are how pig 
 configuration works in general. However, since there's currently a 
 command-line flag to disable rules it seems like we should keep it (perhaps 
 deprecating to allow removing in a future release).
 
 The proposal in this change is to:
 
 * Preserve existing behavior by ensuring the command-line flag continues to 
 disable rules as it does today.
 * Add a new property that also lets you disable optimization rules. This 
 is a standard pig property that can be set in all the ways one can currently 
 set properties.
 
 Then we add rules disabled via the command-line flag with rules disabled 
 via the property and that's the full list of rules to disable.
 
 Bill Graham wrote:
 Got it, thanks. In your original text I was overlooking the fact that you 
 were discussing a flag, which is different from the property. All's clear now.
 
 The behavior makes sense to me. We should add a line to the docs to 
 mention that when both -optimizer_off and pig.optimizer.rules.disabled are 
 set, the union of the two rule sets is disabled.
 
 Travis Crawford wrote:
 Sounds good, I'll clarify that section.
 
 Digging around I found this section which defined how pig properties are 
 set: http://pig.apache.org/docs/r0.11.1/start.html#properties
 
 pig.properties < -D Pig property < -P properties file < set command
 
 Which is interesting because until poking around in this change I didn't 
 know the exact order they were applied in (and just now learning about -P).
 
 What are your thoughts on deprecating the command-line argument? To 
 minimize impact I'm fine leaving as-is, but long-term I think it makes sense 
 for pig to simply be configured through properties without a good reason to 
 do things differently.

I agree deprecating would be nice, but I'm ok leaving
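The semantics agreed in this thread (the union of the `-optimizer_off` flag's rules and the `pig.optimizer.rules.disabled` property's rules, where the property itself follows normal last-write-wins resolution) could be sketched as follows; the method and rule names are illustrative, not Pig's actual implementation.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the agreed behavior: rules disabled on the command line via
// -optimizer_off are always disabled, and the comma-separated
// pig.optimizer.rules.disabled property (after normal property resolution)
// adds to that set. Illustrative only.
public class DisabledRules {

    static Set<String> disabledRules(List<String> optimizerOffFlags, String ruleProperty) {
        Set<String> disabled = new LinkedHashSet<>(optimizerOffFlags);
        if (ruleProperty != null && !ruleProperty.isEmpty()) {
            for (String rule : ruleProperty.split(",")) {
                disabled.add(rule.trim()); // tolerate spaces after commas
            }
        }
        return disabled; // union of (a) flag rules and (b) property rules
    }
}
```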

Re: Review Request: disable optimizations via pig properties

2013-05-14 Thread Bill Graham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11032/#review20545
---

Ship it!


My concerns have been addressed. Thanks!

- Bill Graham


On May 14, 2013, 5:23 p.m., Travis Crawford wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11032/
 ---
 
 (Updated May 14, 2013, 5:23 p.m.)
 
 
 Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng.
 
 
 Description
 ---
 
 Update pig to allow disabling optimizations via pig properties. Currently 
 optimizations must be disabled via command-line options. Pig properties can 
 be set in pig.properties, via set commands in scripts themselves, and via 
 command-line -D options.
 
 The use-case is, for scripts that require certain optimizations to be 
 disabled, allowing the script itself to disable the optimization. Currently 
 whatever runs the script needs to specially handle disabling the optimization 
 for that specific query.
 
 
 This addresses bug PIG-3317.
 https://issues.apache.org/jira/browse/PIG-3317
 
 
 Diffs
 -
 
   src/docs/src/documentation/content/xdocs/perf.xml 108ae7e 
   src/org/apache/pig/Main.java f97ed9f 
   src/org/apache/pig/PigConstants.java ea77e97 
   src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 
 4dab4e8 
   src/org/apache/pig/impl/PigImplConstants.java PRE-CREATION 
   src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java 
 d26f381 
   test/org/apache/pig/test/TestEvalPipeline2.java 39cf807 
 
 Diff: https://reviews.apache.org/r/11032/diff/
 
 
 Testing
 ---
 
 Manually tested on a fully-distributed cluster.
 
 THIS FAILS:
 PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig
 
 THIS WORKS:
 PIG_CONF_DIR=/etc/pig/conf ./bin/pig 
 -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig
 
 Notice how -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune specifies a pig 
 property, which could be in pig.properties, or the script itself.
 
 
 Failure message:
 
 Pig Stack Trace
 ---
 ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: 
 bytearray Uid: 97550 Input: 0 Column: 1)
 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to 
 explain alias null
   at org.apache.pig.PigServer.explain(PigServer.java:1057)
   at 
 org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:419)
   at 
 org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:351)
   at org.apache.pig.tools.grunt.Grunt.checkScript(Grunt.java:98)
   at org.apache.pig.Main.run(Main.java:607)
   at org.apache.pig.Main.main(Main.java:152)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: 
 Error processing rule ColumnMapKeyPrune. Try -t ColumnMapKeyPrune
   at 
 org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:122)
   at 
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:281)
   at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
   at org.apache.pig.PigServer.explain(PigServer.java:1042)
   ... 10 more
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2229: 
 Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 
 97550 Input: 0 Column: 1)
   at 
 org.apache.pig.newplan.logical.optimizer.ProjectionPatcher$ProjectionRewriter.visit(ProjectionPatcher.java:91)
   at 
 org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:207)
   at 
 org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
   at 
 org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
   at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
   at 
 org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:142)
   at 
 org.apache.pig.newplan.logical.relational.LOInnerLoad.accept(LOInnerLoad.java:128)
   at 
 org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
   at 
 org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:124)
   at 
 org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:76

[jira] [Updated] (PIG-3317) disable optimizations via pig properties

2013-05-13 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3317:
-

Status: Open  (was: Patch Available)

Canceling patch since Travis and Julien identified issues with SET in scripts 
in https://reviews.apache.org/r/11032/.

 disable optimizations via pig properties
 

 Key: PIG-3317
 URL: https://issues.apache.org/jira/browse/PIG-3317
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.12
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: PIG-3317_disable_opts.1.patch


 Pig provides a number of optimizations which are described at 
 [http://pig.apache.org/docs/r0.11.1/perf.html#optimization-rules]. As is 
 described in the docs, all or specific optimizations can be disabled via the 
 command-line.
 Currently the caller of a pig script must know which optimizations to disable 
 when running because that information cannot be set in the script itself. Nor 
 can optimizations be disabled site-wide through pig.properties.
 Pig should allow disabling optimizations via properties so that pig scripts 
 themselves can disable optimizations as needed, rather than the caller 
 needing to know what optimizations to disable on the command-line.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: disable optimizations via pig properties

2013-05-13 Thread Bill Graham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11032/#review20516
---



src/docs/src/documentation/content/xdocs/perf.xml
https://reviews.apache.org/r/11032/#comment42293

Would you please specify that setting this value in both the pig properties 
file and the command line (or script) will be additive.


- Bill Graham


On May 13, 2013, 8:35 p.m., Travis Crawford wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11032/
 ---
 
 (Updated May 13, 2013, 8:35 p.m.)
 
 
 Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng.
 
 
 Description
 ---
 
 Update pig to allow disabling optimizations via pig properties. Currently 
 optimizations must be disabled via command-line options. Pig properties can 
 be set in pig.properties, set commands in scripts themselves, and 
 command-line -D options.
 
 The use-case is, for scripts that require certain optimizations to be 
 disabled, allowing the script itself to disable the optimization. Currently 
 whatever runs the script needs to specially handle disabling the optimization 
 for that specific query.
 
 
 This addresses bug PIG-3317.
 https://issues.apache.org/jira/browse/PIG-3317
 
 
 Diffs
 -
 
   src/docs/src/documentation/content/xdocs/perf.xml 108ae7e 
   src/org/apache/pig/Main.java f97ed9f 
   src/org/apache/pig/PigConstants.java ea77e97 
   src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 
 4dab4e8 
   src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java 
 d26f381 
   test/org/apache/pig/test/TestEvalPipeline2.java 39cf807 
 
 Diff: https://reviews.apache.org/r/11032/diff/
 
 
 Testing
 ---
 
 Manually tested on a fully-distributed cluster.
 
 THIS FAILS:
 PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig
 
 THIS WORKS:
 PIG_CONF_DIR=/etc/pig/conf ./bin/pig 
 -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig
 
 Notice how -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune specifies a pig 
 property, which could be in pig.properties, or the script itself.
 
 
 Failure message:
 
 Pig Stack Trace
 ---
 ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: 
 bytearray Uid: 97550 Input: 0 Column: 1)
 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to 
 explain alias null
   at org.apache.pig.PigServer.explain(PigServer.java:1057)
   at 
 org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:419)
   at 
 org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:351)
   at org.apache.pig.tools.grunt.Grunt.checkScript(Grunt.java:98)
   at org.apache.pig.Main.run(Main.java:607)
   at org.apache.pig.Main.main(Main.java:152)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: 
 Error processing rule ColumnMapKeyPrune. Try -t ColumnMapKeyPrune
   at 
 org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:122)
   at 
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:281)
   at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
   at org.apache.pig.PigServer.explain(PigServer.java:1042)
   ... 10 more
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2229: 
 Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 
 97550 Input: 0 Column: 1)
   at 
 org.apache.pig.newplan.logical.optimizer.ProjectionPatcher$ProjectionRewriter.visit(ProjectionPatcher.java:91)
   at 
 org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:207)
   at 
 org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
   at 
 org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
   at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
   at 
 org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:142)
   at 
 org.apache.pig.newplan.logical.relational.LOInnerLoad.accept(LOInnerLoad.java:128)
   at 
 org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
   at 
 org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:124

Re: Review Request: disable optimizations via pig properties

2013-05-13 Thread Bill Graham


 On May 13, 2013, 11:35 p.m., Bill Graham wrote:
  src/docs/src/documentation/content/xdocs/perf.xml, line 493
  https://reviews.apache.org/r/11032/diff/2/?file=290925#file290925line493
 
  Would you please specify that setting this value in both the pig 
  properties file and the command line (or script) will be additive.
 
 Travis Crawford wrote:
 Currently it works like this:
 
 (a) -optimizer_off command-line rules are always disabled.
 (b) The pig.optimizer.rules.disabled property works like other 
 properties, where setting in the script itself overwrites previously set 
 values (from either the command-line or pig.properties).
 
 Disabled rules are additive in that (a) + (b) will be disabled. However, 
 within (b) only the last specified value of pig.optimizer.rules.disabled 
 takes effect.
 
 I think this makes sense for how people will want to use the feature (and 
 I think is consistent with how other properties work).
 
 * Site administrators can specify default rules to disable via 
 pig.properties.
 * Individual scripts can override the site defaults if needed.
 * Invokers of pig can supplement the rules to disable.
 
 Thoughts? If we want to be additive within (b) we'd also need a way to 
 remove defaults set by site administrators, since the default should be a 
 suggestion not requirement. That would easily be achieved with a - prefix 
 that would remove disabled rules, but I think we've covered the common 
 use-cases without introducing extra complexity.

 (b) The pig.optimizer.rules.disabled property works like other properties, 
 where setting in the script itself overwrites previously set values (from 
 either the command-line or pig.properties).

This implies SET in a script will override the command line (or properties).

 Disabled rules are additive in that (a) + (b) will be disabled. However, 
 within (b) only the last specified value of pig.optimizer.rules.disabled 
 takes effect.

This implies SET in a script (or properties) would be additive with the command 
line.

Can you help clarify what I think sounds like a contradiction? Just trying to 
understand the implemented behavior more than propose a change to it.


- Bill


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11032/#review20516
---


On May 13, 2013, 8:35 p.m., Travis Crawford wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11032/
 ---
 
 (Updated May 13, 2013, 8:35 p.m.)
 
 
 Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng.
 
 
 Description
 ---
 
 Update pig to allow disabling optimizations via pig properties. Currently 
 optimizations must be disabled via command-line options. Pig properties can 
 be set in pig.properties, set commands in scripts themselves, and 
 command-line -D options.
 
 The use-case is, for scripts that require certain optimizations to be 
 disabled, allowing the script itself to disable the optimization. Currently 
 whatever runs the script needs to specially handle disabling the optimization 
 for that specific query.
 
 
 This addresses bug PIG-3317.
 https://issues.apache.org/jira/browse/PIG-3317
 
 
 Diffs
 -
 
   src/docs/src/documentation/content/xdocs/perf.xml 108ae7e 
   src/org/apache/pig/Main.java f97ed9f 
   src/org/apache/pig/PigConstants.java ea77e97 
   src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 
 4dab4e8 
   src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java 
 d26f381 
   test/org/apache/pig/test/TestEvalPipeline2.java 39cf807 
 
 Diff: https://reviews.apache.org/r/11032/diff/
 
 
 Testing
 ---
 
 Manually tested on a fully-distributed cluster.
 
 THIS FAILS:
 PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig
 
 THIS WORKS:
 PIG_CONF_DIR=/etc/pig/conf ./bin/pig 
 -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig
 
 Notice how -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune specifies a pig 
 property, which could be in pig.properties, or the script itself.
 
 
 Failure message:
 
 Pig Stack Trace
 ---
 ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: 
 bytearray Uid: 97550 Input: 0 Column: 1)
 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to 
 explain alias null
   at org.apache.pig.PigServer.explain(PigServer.java:1057)
   at 
 org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:419)
   at 
 org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:351)
   at org.apache.pig.tools.grunt.Grunt.checkScript(Grunt.java:98)
   at org.apache.pig.Main.run(Main.java:607

[jira] [Created] (PIG-3324) STARTSWITH documentation

2013-05-13 Thread Bill Graham (JIRA)
Bill Graham created PIG-3324:


 Summary: STARTSWITH documentation
 Key: PIG-3324
 URL: https://issues.apache.org/jira/browse/PIG-3324
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham


PIG-2879 added support for STARTSWITH udf, which should be documented here:
http://pig.apache.org/docs/r0.11.1/func.html
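
For reference, a minimal usage sketch of the kind the docs could include. The relation, field names, and file path below are all hypothetical; per PIG-2879, STARTSWITH takes a chararray and a prefix and returns a boolean:

```pig
-- users.tsv: (name:chararray, age:int) -- illustrative input
users   = LOAD 'users.tsv' AS (name:chararray, age:int);
-- keep only rows whose name begins with the given prefix
doctors = FILTER users BY STARTSWITH(name, 'Dr.');
DUMP doctors;
```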



[jira] [Updated] (PIG-3324) STARTSWITH documentation

2013-05-13 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3324:
-

Fix Version/s: 0.12

 STARTSWITH documentation
 

 Key: PIG-3324
 URL: https://issues.apache.org/jira/browse/PIG-3324
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
  Labels: documentation, newbie, simple
 Fix For: 0.12


 PIG-2879 added support for STARTSWITH udf, which should be documented here:
 http://pig.apache.org/docs/r0.11.1/func.html



Re: Review Request: disable optimizations via pig properties

2013-05-13 Thread Bill Graham


 On May 13, 2013, 11:35 p.m., Bill Graham wrote:
  src/docs/src/documentation/content/xdocs/perf.xml, line 493
  https://reviews.apache.org/r/11032/diff/2/?file=290925#file290925line493
 
  Would you please specify that setting this value in both the pig 
  properties file and the command line (or script) will be additive.
 
 Travis Crawford wrote:
 Currently it works like this:
 
 (a) -optimizer_off command-line rules are always disabled.
 (b) The pig.optimizer.rules.disabled property works like other 
 properties, where setting in the script itself overwrites previously set 
 values (from either the command-line or pig.properties).
 
 Disabled rules are additive in that (a) + (b) will be disabled. However, 
 within (b) only the last specified value of pig.optimizer.rules.disabled 
 takes effect.
 
 I think this makes sense for how people will want to use the feature (and 
 I think is consistent with how other properties work).
 
 * Site administrators can specify default rules to disable via 
 pig.properties.
 * Individual scripts can override the site defaults if needed.
 * Invokers of pig can supplement the rules to disable.
 
 Thoughts? If we want to be additive within (b) we'd also need a way to 
 remove defaults set by site administrators, since the default should be a 
 suggestion not requirement. That would easily be achieved with a - prefix 
 that would remove disabled rules, but I think we've covered the common 
 use-cases without introducing extra complexity.
 
 Bill Graham wrote:
  (b) The pig.optimizer.rules.disabled property works like other 
 properties, where setting in the script itself overwrites previously set 
 values (from either the command-line or pig.properties).
 
 This implies SET in a script will override the command line (or 
 properties).
 
  Disabled rules are additive in that (a) + (b) will be disabled. 
 However, within (b) only the last specified value of 
 pig.optimizer.rules.disabled takes effect.
 
 This implies SET in a script (or properties) would be additive with the 
 command line.
 
 Can you help clarify what I think sounds like a contradiction? Just 
 trying to understand the implemented behavior more than propose a change to 
 it.
 
 Travis Crawford wrote:
 RE: This implies SET in a script will override the command line (or 
 properties).
 A: Rules disabled via the -optimizer_off command-line flag are treated 
 separately from values disabled via the pig property.
 
 RE: This implies SET in a script (or properties) would be additive with 
 the command line.
 A: Correct - SET in a script is additive with rules disabled via the 
 command-line flag.
 
 
 Rules to disable are the set of rules disabled on the command line + 
 rules disabled via the pig.optimizer.rules.disabled property.
 
 Pig's code currently uses a command-line flag to disable optimization 
 rules rather than standard pig properties. I think the ideal state would be 
 using a single property to disable rules because properties are how pig 
 configuration works in general. However, since there's currently a 
 command-line flag to disable rules it seems like we should keep it (perhaps 
 deprecating to allow removing in a future release).
 
 The proposal in this change is to:
 
 * preserve existing behavior by making the command-line flag continues to 
 disable rules as it does today
 * Add a new property that also lets you disable optimization rules. This 
 is a standard pig property that can be set in all the ways one can currently 
 set properties.
 
 Then we add rules disabled via the command-line flag with rules disabled 
 via the property and that's the full list of rules to disable.

Got it, thanks. In your original text I was overlooking the fact that you were 
discussing a flag, which is different than the property. All's clear now.

The behavior makes sense to me. We should add a line to the docs to mention 
that when both -optimizer_off and pig.optimizer.rules.disabled are set, the 
union of the two rule sets is disabled.
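
The precedence discussed in this thread can be sketched as follows. This is illustrative Python, not Pig's actual implementation; the function and variable names are invented, and the rule names are only examples:

```python
# Sketch of the semantics Travis describes:
#  - rules from the -optimizer_off command-line flag are always disabled;
#  - the pig.optimizer.rules.disabled property is last-write-wins across
#    pig.properties, -D options, and SET statements in the script;
#  - the final disabled set is the union of the two sources.

def disabled_rules(cli_flag_rules, property_values):
    """cli_flag_rules: rules passed via -optimizer_off (always honored).
    property_values: successive assignments of pig.optimizer.rules.disabled,
    in the order they were seen."""
    prop_rules = set()
    if property_values:
        # Only the last assignment of the property takes effect.
        prop_rules = set(property_values[-1].split(","))
    return set(cli_flag_rules) | prop_rules

# pig.properties disables ColumnMapKeyPrune, the script's SET overrides
# that with two other rules, and the command-line flag adds one more.
result = disabled_rules(
    cli_flag_rules=["SplitFilter"],
    property_values=["ColumnMapKeyPrune", "MergeFilter,PushUpFilter"],
)
# result == {"SplitFilter", "MergeFilter", "PushUpFilter"}
```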


- Bill


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11032/#review20516
---


On May 13, 2013, 8:35 p.m., Travis Crawford wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11032/
 ---
 
 (Updated May 13, 2013, 8:35 p.m.)
 
 
 Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng.
 
 
 Description
 ---
 
 Update pig to allow disabling optimizations via pig properties. Currently 
 optimizations must be disabled via command-line options. Pig properties can 
 be set

Re: Review Request: disable optimizations via pig properties

2013-05-09 Thread Bill Graham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11032/#review20406
---

Ship it!


Looks good to me. Way to be #prostyle with including the docs edits in the 
patch.

The way we still use optimizerRules and pig.optimizer.rules in places for rules 
that are disabled and not enabled is way confusing, but we can fix that 
separately.

- Bill Graham


On May 9, 2013, 9:03 p.m., Travis Crawford wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11032/
 ---
 
 (Updated May 9, 2013, 9:03 p.m.)
 
 
 Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng.
 
 
 Description
 ---
 
 Update pig to allow disabling optimizations via pig properties. Currently 
 optimizations must be disabled via command-line options. Pig properties can 
 be set in pig.properties, set commands in scripts themselves, and 
 command-line -D options.
 
 The use-case is, for scripts that require certain optimizations to be 
 disabled, allowing the script itself to disable the optimization. Currently 
 whatever runs the script needs to specially handle disabling the optimization 
 for that specific query.
 
 
 This addresses bug PIG-3317.
 https://issues.apache.org/jira/browse/PIG-3317
 
 
 Diffs
 -
 
   src/docs/src/documentation/content/xdocs/perf.xml 108ae7e 
   src/org/apache/pig/Main.java f97ed9f 
   src/org/apache/pig/PigConstants.java ea77e97 
   src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java 
 d26f381 
 
 Diff: https://reviews.apache.org/r/11032/diff/
 
 
 Testing
 ---
 
 Manually tested on a fully-distributed cluster.
 
 THIS FAILS:
 PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig
 
 THIS WORKS:
 PIG_CONF_DIR=/etc/pig/conf ./bin/pig 
 -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig
 
 Notice how -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune specifies a pig 
 property, which could be in pig.properties, or the script itself.
 
 
 Failure message:
 
 Pig Stack Trace
 ---
 ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: 
 bytearray Uid: 97550 Input: 0 Column: 1)
 
 org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to 
 explain alias null
   at org.apache.pig.PigServer.explain(PigServer.java:1057)
   at 
 org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:419)
   at 
 org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:351)
   at org.apache.pig.tools.grunt.Grunt.checkScript(Grunt.java:98)
   at org.apache.pig.Main.run(Main.java:607)
   at org.apache.pig.Main.main(Main.java:152)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: 
 Error processing rule ColumnMapKeyPrune. Try -t ColumnMapKeyPrune
   at 
 org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:122)
   at 
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:281)
   at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
   at org.apache.pig.PigServer.explain(PigServer.java:1042)
   ... 10 more
 Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2229: 
 Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 
 97550 Input: 0 Column: 1)
   at 
 org.apache.pig.newplan.logical.optimizer.ProjectionPatcher$ProjectionRewriter.visit(ProjectionPatcher.java:91)
   at 
 org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:207)
   at 
 org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
   at 
 org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
   at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
   at 
 org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:142)
   at 
 org.apache.pig.newplan.logical.relational.LOInnerLoad.accept(LOInnerLoad.java:128)
   at 
 org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
   at 
 org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:124)
   at 
 org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:76

[jira] [Commented] (PIG-3317) disable optimizations via pig properties

2013-05-09 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13653332#comment-13653332
 ] 

Bill Graham commented on PIG-3317:
--

Commented as such in the rb, but this patch looks good to me. 

 disable optimizations via pig properties
 

 Key: PIG-3317
 URL: https://issues.apache.org/jira/browse/PIG-3317
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.12
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: PIG-3317_disable_opts.1.patch


 Pig provides a number of optimizations which are described at 
 [http://pig.apache.org/docs/r0.11.1/perf.html#optimization-rules]. As is 
 described in the docs, all or specific optimizations can be disabled via the 
 command-line.
 Currently the caller of a pig script must know which optimizations to disable 
 when running because that information cannot be set in the script itself. Nor 
 can optimizations be disabled site-wide through pig.properties.
 Pig should allow disabling optimizations via properties so that pig scripts 
 themselves can disable optimizations as needed, rather than the caller 
 needing to know what optimizations to disable on the command-line.



[jira] [Commented] (PIG-3311) add pig-withouthadoop-h2 to mvn-jar

2013-05-03 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13648931#comment-13648931
 ] 

Bill Graham commented on PIG-3311:
--

+1

 add pig-withouthadoop-h2 to mvn-jar
 ---

 Key: PIG-3311
 URL: https://issues.apache.org/jira/browse/PIG-3311
 Project: Pig
  Issue Type: Improvement
  Components: build
Reporter: Julien Le Dem
Assignee: Julien Le Dem
 Attachments: PIG-3311.patch


 mvn-jar currently creates pig-version.jar and pig-version-h2.jar
 I'm adding pig-version-withouthadoop.jar and pig-version-withouthadoop-h2.jar 
 that are needed to run pig from the command line.
 This will allow a dual-version package.



Re: Welcome our newest committer Prashant Kommireddi

2013-05-02 Thread Bill Graham
Congrats Prashant!


On Thu, May 2, 2013 at 1:11 PM, Daniel Dai da...@hortonworks.com wrote:

 Congratulation!


 On Thu, May 2, 2013 at 1:06 PM, Cheolsoo Park piaozhe...@gmail.com
 wrote:

  Congrats Prashant!
 
 
  On Thu, May 2, 2013 at 12:56 PM, Julien Le Dem jul...@ledem.net wrote:
 
   All,
  
   Please join me in welcoming Prashant Kommireddi as our newest Pig
   committer.
   He's been contributing to Pig for a while now. We look forward to him
   being a part of the project.
  
   Julien
 




-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*


[jira] [Updated] (PIG-3303) add hadoop h2 artifact to publications in ivy.xml

2013-05-01 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3303:
-

Assignee: Julien Le Dem

 add hadoop h2 artifact to publications in ivy.xml
 -

 Key: PIG-3303
 URL: https://issues.apache.org/jira/browse/PIG-3303
 Project: Pig
  Issue Type: Bug
Reporter: Julien Le Dem
Assignee: Julien Le Dem
 Attachments: PIG-3303.patch






[jira] [Created] (PIG-3306) Publish h2 artifact to maven

2013-05-01 Thread Bill Graham (JIRA)
Bill Graham created PIG-3306:


 Summary: Publish h2 artifact to maven
 Key: PIG-3306
 URL: https://issues.apache.org/jira/browse/PIG-3306
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham


The Pig artifact built with hadoopversion=23 should be published to maven.



[jira] [Commented] (PIG-3303) add hadoop h2 artifact to publications in ivy.xml

2013-05-01 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646720#comment-13646720
 ] 

Bill Graham commented on PIG-3303:
--

+1

Created PIG-3306 for publishing the h2 artifact to maven. 

 add hadoop h2 artifact to publications in ivy.xml
 -

 Key: PIG-3303
 URL: https://issues.apache.org/jira/browse/PIG-3303
 Project: Pig
  Issue Type: Bug
Reporter: Julien Le Dem
Assignee: Julien Le Dem
 Attachments: PIG-3303.patch






[jira] [Resolved] (PIG-3306) Publish h2 artifact to maven

2013-05-01 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-3306.
--

Resolution: Not A Problem

Yup [~rohini] you're right we already do that.

I should have known, I've published the last two releases. :)

 Publish h2 artifact to maven
 

 Key: PIG-3306
 URL: https://issues.apache.org/jira/browse/PIG-3306
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham

 The Pig artifact built with hadoopversion=23 should be published to maven.



Re: Pig 0.10.1 to Pig 0.11.1 API compatibility break

2013-04-19 Thread Bill Graham
Hi Gerrit,

Sorry to hear these changes caused you problems. The PPNL interface is
marked as Evolving, so it should be expected that the interface will change
(i.e., break) in future releases. I'm open to ways to better communicate
these changes when they occur, beyond the current release notes process.

thanks,
Bill


On Fri, Apr 19, 2013 at 12:11 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote:

 Hi Gerrit, we do try to keep backwards incompatible changes to a minimum,
 but sometimes they are needed to make progress. How about we make a
 practice of tagging notifications about new pig release candidates with
 [RC] so you can set up your filters and get a heads up to try your software
 with the latest release candidate? That will at least let you prepare for
 changes before a release is made, or perhaps argue that we should revert
 something that is backwards incompatible.

 On Apr 18, 2013, at 2:23 AM, Gerrit Jansen van Vuuren gerrit...@gmail.com
 wrote:

  Hi,
 
  I'm the developer of http://gerritjvv.github.io/glue/ that uses the Pig
 API
  directly to launch pig jobs in separate JVM instances.
 
  Recently I've updated to use pig-0.11.1 and found two API compatibility
  breaks.
 
  PigServer.parseExecType does not exist anymore (it was a static method up
  to pig-0.10.1)
 
  New method for PigProgressNotificationListener
 
  public void initialPlanNotification(String scriptId, MROperPlan plan)
 
  It would be nice if you guys (when possible) could look out for these kinds
  of breaks in the future.
 
  Thanks,
  Gerrit




-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*


[jira] [Updated] (PIG-3159) TestAvroStorage.testArrayWithSnappyCompression fails on mac with Java 7

2013-04-17 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3159:
-

Description: 
Seems like snappy isn't being properly loaded when run on Mac. This is the 
exception from the {{TestAvroStorage.testArrayWithSnappyCompression}} test.

{noformat}
13/02/03 13:20:49 INFO mapReduceLayer.PigMapOnly$Map: Aliases being processed 
per job phase (AliasName[line,offset]): M: in[1,6] C:  R: 
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:315)
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:218)
at org.xerial.snappy.Snappy.clinit(Snappy.java:42)
at org.apache.avro.file.SnappyCodec.compress(SnappyCodec.java:43)
at 
org.apache.avro.file.DataFileStream$DataBlock.compressUsing(DataFileStream.java:349)
at 
org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:347)
at org.apache.avro.file.DataFileWriter.sync(DataFileWriter.java:359)
at org.apache.avro.file.DataFileWriter.flush(DataFileWriter.java:366)
at org.apache.avro.file.DataFileWriter.close(DataFileWriter.java:373)
at 
org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.close(PigAvroRecordWriter.java:44)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.close(PigOutputFormat.java:149)
at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
at java.lang.Runtime.loadLibrary0(Runtime.java:845)
at java.lang.System.loadLibrary(System.java:1084)
at 
org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
... 19 more
13/02/03 13:20:49 WARN mapred.LocalJobRunner: job_local_0001
org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:227)
at org.xerial.snappy.Snappy.clinit(Snappy.java:42)
at org.apache.avro.file.SnappyCodec.compress(SnappyCodec.java:43)
at 
org.apache.avro.file.DataFileStream$DataBlock.compressUsing(DataFileStream.java:349)
at 
org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:347)
at org.apache.avro.file.DataFileWriter.sync(DataFileWriter.java:359)
at org.apache.avro.file.DataFileWriter.flush(DataFileWriter.java:366)
at org.apache.avro.file.DataFileWriter.close(DataFileWriter.java:373)
at 
org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.close(PigAvroRecordWriter.java:44)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.close(PigOutputFormat.java:149)
at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
{noformat}

  was:
Seems like snappy isn't being properly loaded when run on Mac. This is the 
exception from the {{testArrayWithSnappyCompression}} test.

{noformat}
13/02/03 13:20:49 INFO mapReduceLayer.PigMapOnly$Map: Aliases being processed 
per job phase (AliasName[line,offset]): M: in[1,6] C:  R: 
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:315)
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:218)
at org.xerial.snappy.Snappy.clinit(Snappy.java:42)
at org.apache.avro.file.SnappyCodec.compress(SnappyCodec.java:43)
at 
org.apache.avro.file.DataFileStream$DataBlock.compressUsing(DataFileStream.java:349)
at 
org.apache.avro.file.DataFileWriter.writeBlock(DataFileWriter.java:347
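
The UnsatisfiedLinkError in the trace above means the JVM could not locate the
native snappyjava library in any directory on java.library.path. A small
standalone diagnostic sketch (generic JVM behavior, not part of the Pig test
suite) that lists the searched directories and reproduces the same failure mode
when the library is absent:

```java
public class LibPathCheck {
    public static void main(String[] args) {
        // Directories the JVM searches when System.loadLibrary is called.
        String path = System.getProperty("java.library.path", "");
        for (String dir : path.split(java.io.File.pathSeparator)) {
            System.out.println(dir);
        }
        try {
            // Looks for libsnappyjava.so / libsnappyjava.jnilib on the path.
            System.loadLibrary("snappyjava");
            System.out.println("snappyjava loaded");
        } catch (UnsatisfiedLinkError e) {
            // Same "no snappyjava in java.library.path" failure as the test log.
            System.out.println("not found: " + e.getMessage());
        }
    }
}
```

Running this inside the failing test JVM (or with the same -Djava.library.path)
shows whether the native library directory is actually on the search path.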

[jira] [Created] (PIG-3273) bad %default directives can cause pig dry run to silently fail

2013-04-10 Thread Bill Graham (JIRA)
Bill Graham created PIG-3273:


 Summary: bad %default directives can cause pig dry run to silently 
fail
 Key: PIG-3273
 URL: https://issues.apache.org/jira/browse/PIG-3273
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham


{{pig -r myscript.pig}} will silently fail without producing output or error 
messaging for the following script:

{noformat}
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema')
A = LOAD 'foo' using $STORAGE_WITH_SCHEMA;
dump A;
{noformat}

Changing the first line to any of these will cause dry run to parse without 
problems:
{noformat}
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t')
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', 
'-schema')
%default STORAGE_WITH_SCHEMA 'org.apache.pig.builtin.PigStorage(\'\t\', 
\'-schema\')'
{noformat}

The issue seems to occur when there is more than one set of single quotes that are not the outermost pair.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3273) bad %default directives can cause pig dry run to silently fail

2013-04-10 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3273:
-

Description: 
{{pig -r myscript.pig}} will silently fail without producing output or error 
messaging for the following script:

{noformat}
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema')
A = LOAD 'foo' using $STORAGE_WITH_SCHEMA;
dump A;
{noformat}

Changing the first line to any of these will cause dry run to parse without 
problems:
{noformat}
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\\t')
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\\t', 
'-schema')
%default STORAGE_WITH_SCHEMA 'org.apache.pig.builtin.PigStorage(\'\\t\', 
\'-schema\')'
{noformat}

The issue seems to occur when there is more than one set of single quotes that are not the outermost pair.

  was:
{{pig -r myscript.pig}} will silently fail without producing output or error 
messaging for the following script:

{noformat}
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', '-schema')
A = LOAD 'foo' using $STORAGE_WITH_SCHEMA;
dump A;
{noformat}

Changing the first line to any of these will cause dry run to parse without 
problems:
{noformat}
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t')
%default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', 
'-schema')
%default STORAGE_WITH_SCHEMA 'org.apache.pig.builtin.PigStorage(\'\t\', 
\'-schema\')'
{noformat}

The issue seems to occur when there is more than one set of single quotes that are not the outermost pair.


 bad %default directives can cause pig dry run to silently fail
 --

 Key: PIG-3273
 URL: https://issues.apache.org/jira/browse/PIG-3273
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham

 {{pig -r myscript.pig}} will silently fail without producing output or error 
 messaging for the following script:
 {noformat}
 %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\t', 
 '-schema')
 A = LOAD 'foo' using $STORAGE_WITH_SCHEMA;
 dump A;
 {noformat}
 Changing the first line to any of these will cause dry run to parse without 
 problems:
 {noformat}
 %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\\t')
 %default STORAGE_WITH_SCHEMA org.apache.pig.builtin.PigStorage('\\t', 
 '-schema')
 %default STORAGE_WITH_SCHEMA 'org.apache.pig.builtin.PigStorage(\'\\t\', 
 \'-schema\')'
 {noformat}
 The issue seems to occur when there is more than one set of single quotes that are not the outermost pair.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
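
The working variants in the description above double the backslash so that one
escaping layer survives parameter substitution before the storage function
parses its own arguments. A toy Java sketch of two unescape passes (purely
illustrative; the single-pass model and the method name are assumptions, not
Pig's actual preprocessor):

```java
public class EscapeDemo {
    // Apply one round of backslash unescaping: "\\\\" -> "\\", "\\t" -> tab.
    static String unescapeOnce(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i);
                if (next == 't') out.append('\t'); // "\t" -> real tab
                else out.append(next);             // "\\" -> "\", "\'" -> "'"
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String literal = "\\\\t";                  // what the script contains: \\t
        String afterSub = unescapeOnce(literal);   // first layer consumed: \t
        String afterUdf = unescapeOnce(afterSub);  // second layer: a real tab
        if (!afterSub.equals("\\t")) throw new AssertionError(afterSub);
        if (!afterUdf.equals("\t")) throw new AssertionError(afterUdf);
        System.out.println("two escaping layers consumed");
    }
}
```

The sketch only shows why a single backslash ('\t') would be consumed too
early when two unescaping passes are applied.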


[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit, pigsmoke and piggybank

2013-04-04 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3264:
-

Fix Version/s: 0.12

 mvn signanddeploy target broken for pigunit, pigsmoke and piggybank
 ---

 Key: PIG-3264
 URL: https://issues.apache.org/jira/browse/PIG-3264
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12, 0.11.2

 Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch


 Build fails with:
 {noformat}
 [artifact:deploy] Invalid reference: 'pigunit'
 {noformat}
 Patch on the way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[ANNOUNCE] Pig 0.11.1 has been released!

2013-04-01 Thread Bill Graham
The Pig team is happy to announce the Pig 0.11.1 release.

Apache Pig provides a high-level data-flow language and execution
framework for parallel computation on Hadoop clusters.
More details about Pig can be found at http://pig.apache.org/.

This is a maintenance release of Pig 0.11 and contains several
critical bug fixes. The details of the release can be found at
http://pig.apache.org/releases.html.


Re: Apache Pig 0.11.1 release candidate

2013-03-31 Thread Bill Graham
Hi Mark,

Thanks for the work you're doing to support Pig in Bigtop. Starting with
Pig 0.12, our release process will be simplified to not include rpm/deb
packages, thanks to Bigtop.

I've built Pig on multiple RHEL versions, so this issue might not be as
broad in scope as you describe. The RPMs for 0.11.0 and 0.11.1 were both
built on rhel5 instances from ec2 (ami-2d8e4c44).

While I don't mind putting together another release, I think we should
proceed to release 0.11.1rc0 for the following reasons:

- since the vote passed and to respect the time people put in
testing/validating this release
- 0.11.1 contains support for Hadoop 0.20.2 and other critical bug fixes,
which people are anxious for. For fairness to those stakeholders, these
fixes were not put into a 0.11.0 RC when discovered late in that release
process.
- Pig 0.11.1 will contain an RPM as part of its release artifacts.

That said, if the Pig community feels strongly that we should cancel the
release and re-issue a new one, I'm fine with shepherding that process.


As an alternative, is it possible for you to build by setting the default
encoding externally? Or could you apply this patch to the pig 0.11.1 distro?

thanks,
Bill

On Fri, Mar 29, 2013 at 5:41 PM, Mark Grover grover.markgro...@gmail.comwrote:

 Hi all,
 I am a contributor to Apache Bigtop http://bigtop.apache.org and have a
 question for you.
 Bigtop is a TLP responsible for performing packaging and interoperability
 testing of various projects in the Hadoop ecosystem, including Apache Pig.

 We are planning to include Pig 0.11 in our soon to be released Bigtop 0.6
 distribution. However, while upgrading Pig from 0.10 to 0.11, I wasn't able
 to compile Pig 0.11.1 on RPM-based systems:
 http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Pig/313/label=centos6/console
 There doesn't seem to be anything Bigtop-specific here; I would expect this
 issue to impact all Pig users. It seems like Pig's contrib sub-project uses
 the system's default encoding for compiling code; however, on RPM-based
 systems, the default encoding is not suitable and breaks the build. I
 created PIG-3262 https://issues.apache.org/jira/browse/PIG-3262 to track
 this and Cheolsoo graciously committed this to Pig trunk. The essence of
 Bigtop is exactly to find integration issues like this.

 Now, I do realize that Bill and the community has done some excellent work
 in putting together 0.11.1. Perhaps, I am a little too late to ask this
 question but I thought I'd ask it anyway. Is there a possibility that the
 Pig community can release a new release candidate for 0.11.1 with the fix
 in PIG-3262?

 The pros:
 1. It would allow Pig users to compile Pig contrib on RPM machines
 (RHEL/CentOS 5, 6, SLES 11, Fedora, etc.) which doesn't seem to be possible
 as of now.
 2. It would enable Apache Bigtop 0.6 to include a Pig version that builds
 on all OS variants.

 The cons:
 1. There is a cost of cutting out another release candidate to the Pig
 community. I completely understand and appreciate the cost involved;
 however, I would anticipate the cost to be minimal since a) the change
 https://issues.apache.org/jira/secure/attachment/12575962/PIG-3262.2.patch
 is quite trivial; b) the change only affects the contrib functionality and
 not the core functionality, per se.

 If we do decide to release another release candidate, I would be more than
 happy to perform integration testing on it by means of Apache Bigtop.

 I do realize the unfortunate timing of this email, it would have been ideal
 if we were having this conversation a week ago while the vote was still
 going on. I will try to change that in future so please do accept my
 apologies in advance.

 Regards,
 Mark






Re: [VOTE] Release Pig 0.11.1 (candidate 0)

2013-03-29 Thread Bill Graham
With 3 binding +1s (Daniel, Julien, BillG) this vote passes. I'll start the
release process.


On Wed, Mar 27, 2013 at 6:22 PM, Bill Graham billgra...@gmail.com wrote:

 +1


 On Mon, Mar 25, 2013 at 3:42 PM, Daniel Dai da...@hortonworks.com wrote:

 Yes, it is Ok with me.

 Daniel

 On Mon, Mar 25, 2013 at 2:44 PM, Julien Le Dem jul...@twitter.com
 wrote:
  +1
  The full test suite is passing.
  I don't think we need to make a new rc just for one missing license header.
  Daniel, is it OK for you ?
  Thanks,
  Julien
 
  On Mon, Mar 25, 2013 at 11:02 AM, Daniel Dai da...@hortonworks.com
 wrote:
  My fault for missing license header for
  UDFContextTestLoaderWithSignature. Added it to both files, Thanks
  Prashant!
 
  I run unit tests and e2e tests; both passed. +1 for the rc except for the
  license header issue.
 
  Daniel
 
  On Sun, Mar 24, 2013 at 11:18 PM, Prashant Kommireddi
  prash1...@gmail.com wrote:
  Downloaded tarball and performed the following:
 
 1. ant releaseaudit - UDFContextTestLoaderWithSignature (
 http://svn.apache.org/viewvc?view=revisionrevision=r1458036) and
 DOTParser.jjt do not have Apache License header.
 2. Verified RELEASE_NOTES.txt for correct version numbers
 3. Verified build.xml points to next version (0.11.2) SNAPSHOT
 4. Built and tested Piggybank, Built tutorial - looks good.
 5. Tested jar by running scripts against 0.20.2 hadoop cluster
 (would be
 great to have someone else test the same)
 6. ant test-commit - all tests pass
 
  Except for #1, RC looks good to me.
  Thanks,
  -Prashant
 
  On Fri, Mar 22, 2013 at 7:58 AM, Bill Graham billgra...@gmail.com
 wrote:
 
  Hi,
 
  I have created a candidate build for Pig 0.11.1. This is a
 maintenance
  release
  of Pig 0.11.
 
  Keys used to sign the release are available at:
  http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup
 
  Please download, test, and try it out:
  http://people.apache.org/~billgraham/pig-0.11.1-candidate-0/
 
  Should we release this? Vote closes on next Thursday EOD, Mar 28th.
 
  Thanks,
  Bill
 




 --
 *Note that I'm no longer using my Yahoo! email address. Please email me
 at billgra...@gmail.com going forward.*






[jira] [Created] (PIG-3264) mvn signanddeploy target broken for pigunit and pigsmoke

2013-03-29 Thread Bill Graham (JIRA)
Bill Graham created PIG-3264:


 Summary: mvn signanddeploy target broken for pigunit and pigsmoke
 Key: PIG-3264
 URL: https://issues.apache.org/jira/browse/PIG-3264
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham


Build fails with:

{noformat}
[artifact:deploy] Invalid reference: 'pigunit'
{noformat}

Patch on the way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit and pigsmoke

2013-03-29 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3264:
-

Attachment: PIG_3264.1.patch
PIG_3264_branch11.1.patch

Attaching trunk and branch 11 patches.

 mvn signanddeploy target broken for pigunit and pigsmoke
 

 Key: PIG-3264
 URL: https://issues.apache.org/jira/browse/PIG-3264
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch


 Build fails with:
 {noformat}
 [artifact:deploy] Invalid reference: 'pigunit'
 {noformat}
 Patch on the way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit and pigsmoke

2013-03-29 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3264:
-

Status: Patch Available  (was: Open)

 mvn signanddeploy target broken for pigunit and pigsmoke
 

 Key: PIG-3264
 URL: https://issues.apache.org/jira/browse/PIG-3264
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch


 Build fails with:
 {noformat}
 [artifact:deploy] Invalid reference: 'pigunit'
 {noformat}
 Patch on the way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit, pigsmoke and piggybank

2013-03-29 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3264:
-

Summary: mvn signanddeploy target broken for pigunit, pigsmoke and 
piggybank  (was: mvn signanddeploy target broken for pigunit and pigsmoke)

 mvn signanddeploy target broken for pigunit, pigsmoke and piggybank
 ---

 Key: PIG-3264
 URL: https://issues.apache.org/jira/browse/PIG-3264
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch


 Build fails with:
 {noformat}
 [artifact:deploy] Invalid reference: 'pigunit'
 {noformat}
 Patch on the way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3264) mvn signanddeploy target broken for pigunit, pigsmoke and piggybank

2013-03-29 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3264:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 mvn signanddeploy target broken for pigunit, pigsmoke and piggybank
 ---

 Key: PIG-3264
 URL: https://issues.apache.org/jira/browse/PIG-3264
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Attachments: PIG_3264.1.patch, PIG_3264_branch11.1.patch


 Build fails with:
 {noformat}
 [artifact:deploy] Invalid reference: 'pigunit'
 {noformat}
 Patch on the way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [VOTE] Release Pig 0.11.1 (candidate 0)

2013-03-27 Thread Bill Graham
+1

On Mon, Mar 25, 2013 at 3:42 PM, Daniel Dai da...@hortonworks.com wrote:

 Yes, it is Ok with me.

 Daniel

 On Mon, Mar 25, 2013 at 2:44 PM, Julien Le Dem jul...@twitter.com wrote:
  +1
  The full test suite is passing.
  I don't think we need to make a new rc just for one missing license header.
  Daniel, is it OK for you ?
  Thanks,
  Julien
 
  On Mon, Mar 25, 2013 at 11:02 AM, Daniel Dai da...@hortonworks.com
 wrote:
  My fault for missing license header for
  UDFContextTestLoaderWithSignature. Added it to both files, Thanks
  Prashant!
 
  I run unit tests and e2e tests; both passed. +1 for the rc except for the
  license header issue.
 
  Daniel
 
  On Sun, Mar 24, 2013 at 11:18 PM, Prashant Kommireddi
  prash1...@gmail.com wrote:
  Downloaded tarball and performed the following:
 
 1. ant releaseaudit - UDFContextTestLoaderWithSignature (
 http://svn.apache.org/viewvc?view=revisionrevision=r1458036) and
 DOTParser.jjt do not have Apache License header.
 2. Verified RELEASE_NOTES.txt for correct version numbers
 3. Verified build.xml points to next version (0.11.2) SNAPSHOT
 4. Built and tested Piggybank, Built tutorial - looks good.
 5. Tested jar by running scripts against 0.20.2 hadoop cluster
 (would be
 great to have someone else test the same)
 6. ant test-commit - all tests pass
 
  Except for #1, RC looks good to me.
  Thanks,
  -Prashant
 
  On Fri, Mar 22, 2013 at 7:58 AM, Bill Graham billgra...@gmail.com
 wrote:
 
  Hi,
 
  I have created a candidate build for Pig 0.11.1. This is a maintenance
  release
  of Pig 0.11.
 
  Keys used to sign the release are available at:
  http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup
 
  Please download, test, and try it out:
  http://people.apache.org/~billgraham/pig-0.11.1-candidate-0/
 
  Should we release this? Vote closes on next Thursday EOD, Mar 28th.
 
  Thanks,
  Bill
 






[VOTE] Release Pig 0.11.1 (candidate 0)

2013-03-22 Thread Bill Graham
Hi,

I have created a candidate build for Pig 0.11.1. This is a maintenance release
of Pig 0.11.

Keys used to sign the release are available at:
http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup

Please download, test, and try it out:
http://people.apache.org/~billgraham/pig-0.11.1-candidate-0/

Should we release this? Vote closes on next Thursday EOD, Mar 28th.

Thanks,
Bill


Re: Are we ready for 0.11.1 release?

2013-03-18 Thread Bill Graham
Sure, I can get a RC out this week.


On Mon, Mar 18, 2013 at 10:51 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote:

 Yeah adding new types seems like a big thing, would prefer for it to be
 0.12 only.


 Sounds like we are ready to roll 0.11.1. Bill, want to do the honors
 again?


 On Mon, Mar 18, 2013 at 10:40 AM, Julien Le Dem jul...@ledem.net wrote:

  Agreed with Daniel,
  PIG-2764 will go in Pig 0.12
  Julien
 
  On Mar 18, 2013, at 10:32 AM, Daniel Dai wrote:
 
   Dimitry: Just committed PIG-3132.
  
   Richard: PIG-2764 is a new feature; we usually don't include new features
   in a minor release.
  
   Daniel
  
   On Mon, Mar 18, 2013 at 10:21 AM, Richard Ding pigu...@gmail.com
  wrote:
   How about PIG-2764? It would be nice to include this feature.
  
  
   On Mon, Mar 18, 2013 at 1:04 AM, Dmitriy Ryaboy dvrya...@gmail.com
  wrote:
  
   Just +1'd it.
   I think after this one we are good to go?
  
  
   On Sun, Mar 17, 2013 at 9:09 PM, Daniel Dai da...@hortonworks.com
  wrote:
  
   Can I include PIG-3132?
  
   Thanks,
   Daniel
  
   On Fri, Mar 15, 2013 at 5:57 PM, Julien Le Dem jul...@ledem.net
  wrote:
   +1 for a new release
  
   Julien
  
   On Mar 15, 2013, at 17:08, Dmitriy Ryaboy dvrya...@gmail.com
  wrote:
  
   I think all the critical patches we discussed as required for
 0.11.1
   have
   gone in -- is there anything else people want to finish up, or can
  we
   roll
   this?  Current change log:
  
   Release 0.11.1 (unreleased)
  
   INCOMPATIBLE CHANGES
  
   IMPROVEMENTS
  
   PIG-2988: start deploying pigunit maven artifact part of Pig
 release
   process (njw45 via rohini)
  
   PIG-3148: OutOfMemory exception while spilling stale
 DefaultDataBag.
   Extra
   option to gc() before spilling large bag. (knoguchi via rohini)
  
   PIG-3216: Groovy UDFs documentation has minor typos (herberts via
   rohini)
  
   PIG-3202: CUBE operator not documented in user docs (prasanth_j
 via
   billgraham)
  
   OPTIMIZATIONS
  
   BUG FIXES
  
   PIG-3194: Changes to ObjectSerializer.java break compatibility
 with
   Hadoop
   0.20.2 (prkommireddi via dvryaboy)
  
   PIG-3241: ConcurrentModificationException in POPartialAgg
 (dvryaboy)
  
   PIG-3144: Erroneous map entry alias resolution leading to
 Duplicate
   schema
   alias errors (jcoveney via cheolsoo)
  
   PIG-3212: Race Conditions in POSort and (Internal)SortedBag during
   Proactive Spill (kadeng via dvryaboy)
  
   PIG-3206: HBaseStorage does not work with Oozie pig action and
  secure
   HBase
   (rohini)
  
  
 
 






[jira] [Updated] (PIG-3241) ConcurrentModificationException in POPartialAgg

2013-03-07 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3241:
-

Fix Version/s: 0.12

 ConcurrentModificationException in POPartialAgg
 ---

 Key: PIG-3241
 URL: https://issues.apache.org/jira/browse/PIG-3241
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Lohit Vijayarenu
 Fix For: 0.12, 0.11.1


 While running a few Pig scripts against Hadoop 2.0, I consistently see a
 ConcurrentModificationException:
 {noformat}
 at java.util.HashMap$HashIterator.remove(HashMap.java:811)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregate(POPartialAgg.java:365)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregateSecondLevel(POPartialAgg.java:379)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.getNext(POPartialAgg.java:203)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:729)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
 {noformat}
 It looks like rawInputMap is being modified while elements are being
 removed from it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3241) ConcurrentModificationException in POPartialAgg

2013-03-07 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3241:
-

Fix Version/s: 0.11.1

 ConcurrentModificationException in POPartialAgg
 ---

 Key: PIG-3241
 URL: https://issues.apache.org/jira/browse/PIG-3241
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Lohit Vijayarenu
 Fix For: 0.11.1


 While running a few Pig scripts against Hadoop 2.0, I consistently see a
 ConcurrentModificationException:
 {noformat}
 at java.util.HashMap$HashIterator.remove(HashMap.java:811)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregate(POPartialAgg.java:365)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregateSecondLevel(POPartialAgg.java:379)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.getNext(POPartialAgg.java:203)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
   at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:729)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
 {noformat}
 It looks like rawInputMap is being modified while elements are being
 removed from it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
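
The trace above is the classic symptom of removing entries from a
java.util.HashMap while iterating it through anything other than the iterator
itself. A minimal standalone sketch of the hazard and the usual fix (generic
Java, not Pig's POPartialAgg code):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class CmeDemo {
    public static void main(String[] args) {
        // Unsafe: structurally modifying the map while a for-each iterates it.
        Map<Integer, Integer> raw = new HashMap<>();
        for (int i = 0; i < 10; i++) raw.put(i, i);
        boolean threw = false;
        try {
            boolean first = true;
            for (Integer k : raw.keySet()) {
                if (first) { raw.remove(k); first = false; } // bumps modCount
            }
        } catch (ConcurrentModificationException e) {
            threw = true; // fail-fast iterator detects the modification
        }
        if (!threw) throw new AssertionError("expected CME");

        // Safe: remove through the iterator, which keeps modCount in sync.
        Map<Integer, Integer> agg = new HashMap<>();
        for (int i = 0; i < 10; i++) agg.put(i, i);
        Iterator<Map.Entry<Integer, Integer>> it = agg.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 0) it.remove();
        }
        if (agg.size() != 5) throw new AssertionError("expected 5 entries");
        System.out.println("unsafe removal threw CME; iterator removal is fine");
    }
}
```

The fix for code like this is generally to route all removals through
Iterator.remove() (or collect keys and remove after the loop).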


[jira] [Commented] (PIG-3214) New/improved mascot

2013-03-05 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13593587#comment-13593587
 ] 

Bill Graham commented on PIG-3214:
--

I like where #4 is going. I think the fonts look current and it's simple. The 
shape of the dots in the P could use some tweaking though to be a bit more 
rounded, but I think it's close.

I'm not a big fan of #2 because I think it looks like it says "Pij" and the 
details of the Pig head won't scale well when shown as a thumbnail. I also tend 
toward more modern fonts than the one used for the "Pig" part.

Just my $.02 

 New/improved mascot
 ---

 Key: PIG-3214
 URL: https://issues.apache.org/jira/browse/PIG-3214
 Project: Pig
  Issue Type: Wish
  Components: site
Affects Versions: 0.11
Reporter: Andrew Musselman
Priority: Minor
 Fix For: 0.12

 Attachments: newlogo1.png, newlogo2.png, newlogo3.png, newlogo4.png, 
 newlogo5.png


 Request to change pig mascot to something more graphically appealing.



[jira] [Commented] (PIG-3233) Deploy a Piggybank Jar

2013-03-05 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594245#comment-13594245
 ] 

Bill Graham commented on PIG-3233:
--

Using the imports works. It looks like you used the versions from 
{{pigunit-template.xml}} but we should instead use those in 
{{ivy/libraries.properties}} since that's what piggybank is built against. 
Could you please update the versions to be in sync there.

We can tackle keeping synced with ivy in another JIRA, but one thought is 
variable substitution like we do with @version.

 Deploy a Piggybank Jar
 --

 Key: PIG-3233
 URL: https://issues.apache.org/jira/browse/PIG-3233
 Project: Pig
  Issue Type: New Feature
  Components: piggybank
Affects Versions: 0.10.0, 0.11
Reporter: Nick White
Assignee: Nick White
 Fix For: 0.10.1, 0.11.1

 Attachments: PIG-3233.0.patch


 The attached patch adds the piggybank contrib jar to the mvn-install and 
 mvn-deploy ant targets in the same way as the pigunit & pigsmoke artifacts.



[jira] [Commented] (PIG-3233) Deploy a Piggybank Jar

2013-03-04 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592712#comment-13592712
 ] 

Bill Graham commented on PIG-3233:
--

Thanks for tackling this one Nick!

The mechanics of the patch looks good, but from where did you get the deps that 
you included in {{ivy/piggybank-template.xml}}?

 Deploy a Piggybank Jar
 --

 Key: PIG-3233
 URL: https://issues.apache.org/jira/browse/PIG-3233
 Project: Pig
  Issue Type: New Feature
  Components: piggybank
Affects Versions: 0.10.0, 0.11
Reporter: Nick White
Assignee: Nick White
 Fix For: 0.10.1, 0.11.1

 Attachments: PIG-3233.0.patch


 The attached patch adds the piggybank contrib jar to the mvn-install and 
 mvn-deploy ant targets in the same way as the pigunit & pigsmoke artifacts.



Re: pig 0.11 candidate 2 feedback: Several problems

2013-03-01 Thread Bill Graham
+1 to releasing Pig 0.11.1 when this is addressed. I should be able to help
with the release again.



On Fri, Mar 1, 2013 at 11:25 AM, Prashant Kommireddi prash1...@gmail.comwrote:

 Hey Guys,

 I wanted to start a conversation on this again. If Kai is not looking at
 PIG-3194 I can start working on it to get 0.11 compatible with 20.2. If
 everyone agrees, we should roll out 0.11.1 sooner than usual and I
 volunteer to help with it in anyway possible.

 Any objections to getting 0.11.1 out soon after 3194 is fixed?

 -Prashant

 On Wed, Feb 20, 2013 at 3:34 PM, Russell Jurney russell.jur...@gmail.com
 wrote:

  I stand corrected. Cool, 0.11 is good!
 
 
  On Wed, Feb 20, 2013 at 1:15 PM, Jarek Jarcec Cecho jar...@apache.org
  wrote:
 
   Just a unrelated note: The CDH3 is more closer to Hadoop 1.x than to
  0.20.
  
   Jarcec
  
   On Wed, Feb 20, 2013 at 12:04:51PM -0800, Dmitriy Ryaboy wrote:
I agree -- this is a good release. The bugs Kai pointed out should be
fixed, but as they are not critical regressions, we can fix them in
   0.11.1
(if someone wants to roll 0.11.1 the minute these fixes are
 committed,
  I
won't mind and will dutifully vote for the release).
   
I think the Hadoop 20.2 incompatibility is unfortunate but iirc this
 is
fixable by setting HADOOP_USER_CLASSPATH_FIRST=true (was that in
 20.2?)
   
FWIW Twitter's running CDH3 and this release works in our
 environment.
   
At this point things that block a release are critical regressions in
performance or correctness.
   
D
   
   
On Wed, Feb 20, 2013 at 11:52 AM, Alan Gates ga...@hortonworks.com
   wrote:
   
 No.  Bugs like these are supposed to be found and fixed after we
  branch
 from trunk (which happened several months ago in the case of 0.11).
The
 point of RCs are to check that it's a good build, licenses are
 right,
   etc.
  Any bugs found this late in the game have to be seen as failures
 of
 earlier testing.

 Alan.

 On Feb 20, 2013, at 11:33 AM, Russell Jurney wrote:

  Isn't the point of an RC to find and fix bugs like these
 
 
  On Wed, Feb 20, 2013 at 11:31 AM, Bill Graham 
  billgra...@gmail.com
 wrote:
 
  Regarding Pig 11 rc2, I propose we continue with the current
 vote
   as is
  (which closes today EOD). Patches for 0.20.2 issues can be
 rolled
   into a
  Pig 0.11.1 release whenever they're available and tested.
 
 
 
  On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich 
   onatkov...@yahoo.com
  wrote:
 
  I agree that supporting as much as we can is a good goal. The
   issue is
  who
  is going to be testing against all these versions? We found the
   issues
  under discussion because of a customer report, not because we
  consistently
  test against all versions. Perhaps when we decide which
 versions
  to
  support
  for next release we need also to agree who is going to be
 testing
   and
  maintaining compatibility with a particular version.
 
  For instance since Hadoop 23 compatibility is important for us
 at
   Yahoo
  we
  have been maintaining compatibility with this version for 0.9,
   0.10 and
  will do the same for 0.11 and going forward. I think we would
  need
 others
  to step in and claim the versions of their interest.
 
  Olga
 
 
  
  From: Kai Londenberg kai.londenb...@googlemail.com
  To: dev@pig.apache.org
  Sent: Wednesday, February 20, 2013 1:51 AM
  Subject: Re: pig 0.11 candidate 2 feedback: Several problems
 
  Hi,
 
  I strongly agree with Jonathan here. If there are good reasons
 why
   you
  can't support an older version of Hadoop any more, that's one
   thing.
  But having to change 2 lines of code doesn't really qualify as
   such in
  my point of view ;)
 
  At least for me, pig support for 0.20.2 is essential - without
  it,
   I
  can't use it. If it doesn't support it, I'll have to branch pig
  and
  hack it myself, or stop using it.
 
  I guess, there are a lot of people still running 0.20.2
 Clusters.
   If
  you really have lots of data stored on HDFS and a continuously
  busy
  cluster, an upgrade is nothing you do just because.
 
 
  2013/2/20 Jonathan Coveney jcove...@gmail.com:
  I agree that we shouldn't have to support old versions
 forever.
   That
  said,
  I also don't think we should be too blase about supporting
 older
  versions
  where it is not odious to do so. We have a lot of competition
 in
   the
  language space and the broader the versions we can support,
 the
   better
  (assuming it isn't too odious to do so). In this case, I don't
   think
 it
  should be too hard to change ObjectSerializer so that the
 commons-codec
  code used is compatible with both

[jira] [Resolved] (PIG-3002) Pig client should handle CountersExceededException

2013-02-28 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-3002.
--

   Resolution: Fixed
Fix Version/s: 0.12

Committed. Thanks [~jarcec] for digging into this one and sorry for the delay.

 Pig client should handle CountersExceededException
 --

 Key: PIG-3002
 URL: https://issues.apache.org/jira/browse/PIG-3002
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Jarek Jarcec Cecho
  Labels: newbie, simple
 Fix For: 0.12

 Attachments: PIG-3002.2.patch, PIG-3002.patch


 Running a pig job that uses more than 120 counters will succeed, but a grunt 
 exception will occur when trying to output counter info to the console. This 
 exception should be caught and handled with friendly messaging:
 {noformat}
 org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected 
 error during execution.
 at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
 at 
 org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
 at org.apache.pig.PigServer.execute(PigServer.java:1239)
 at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
 at 
 org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
 at org.apache.pig.Main.run(Main.java:604)
 at org.apache.pig.Main.main(Main.java:154)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: 
 Error: Exceeded limits on number of counters - Counters=120 Limit=120
 at 
 org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312)
 at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431)
 at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442)
 at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
 {noformat}



[jira] [Commented] (PIG-3002) Pig client should handle CountersExceededException

2013-02-27 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589284#comment-13589284
 ] 

Bill Graham commented on PIG-3002:
--

I don't think we should be modifying the shims code in this way for the 
contrived case. Swallowing exceptions and returning 0 doesn't seem like the 
right thing to do for the reasons I've described above. If Hadoop is throwing 
exceptions because we've used too many counters, let's catch it and log it and 
move on. Surfacing the exception to a user in the console is better than trying 
to print some of them. Any counters captured by Hadoop will still be reported 
in the JT and the job history. 

 Pig client should handle CountersExceededException
 --

 Key: PIG-3002
 URL: https://issues.apache.org/jira/browse/PIG-3002
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Jarek Jarcec Cecho
  Labels: newbie, simple
 Attachments: PIG-3002.2.patch, PIG-3002.patch


 Running a pig job that uses more than 120 counters will succeed, but a grunt 
 exception will occur when trying to output counter info to the console. This 
 exception should be caught and handled with friendly messaging:
 {noformat}
 org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected 
 error during execution.
 at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
 at 
 org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
 at org.apache.pig.PigServer.execute(PigServer.java:1239)
 at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
 at 
 org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
 at org.apache.pig.Main.run(Main.java:604)
 at org.apache.pig.Main.main(Main.java:154)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: 
 Error: Exceeded limits on number of counters - Counters=120 Limit=120
 at 
 org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312)
 at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431)
 at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442)
 at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
 {noformat}



[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage

2013-02-26 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587239#comment-13587239
 ] 

Bill Graham commented on PIG-1832:
--

Yes, read via time ranges is done. Work on PIG-2114 seems stalled though and 
there's a lot going on in that patch. I propose this JIRA just add write 
support for -timestamp=millis_since_the_epoch_utc for consistency with the 
current read API. That's a quick change that would be useful and would give 
full read/write support for timestamps. That would also help reduce the 
somewhat broad scope of PIG-2114.

 Support timestamp in HBaseStorage
 -

 Key: PIG-1832
 URL: https://issues.apache.org/jira/browse/PIG-1832
 Project: Pig
  Issue Type: Improvement
 Environment: Java 6, Mac OS X 10.6
Reporter: Eric Yang

 When storing data into HBase using 
 org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is 
 stored with insertion time of the mapreduce job.  It would be nice to have a 
 way to populate timestamp from user data.



[jira] [Updated] (PIG-1832) Support timestamp in HBaseStorage when storing

2013-02-26 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-1832:
-

Summary: Support timestamp in HBaseStorage when storing  (was: Support 
timestamp in HBaseStorage)

 Support timestamp in HBaseStorage when storing
 --

 Key: PIG-1832
 URL: https://issues.apache.org/jira/browse/PIG-1832
 Project: Pig
  Issue Type: Improvement
 Environment: Java 6, Mac OS X 10.6
Reporter: Eric Yang

 When storing data into HBase using 
 org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is 
 stored with insertion time of the mapreduce job.  It would be nice to have a 
 way to populate timestamp from user data.



[jira] [Updated] (PIG-1832) Support timestamp in HBaseStorage when storing

2013-02-26 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-1832:
-

Environment: (was: Java 6, Mac OS X 10.6)

 Support timestamp in HBaseStorage when storing
 --

 Key: PIG-1832
 URL: https://issues.apache.org/jira/browse/PIG-1832
 Project: Pig
  Issue Type: Improvement
Reporter: Eric Yang

 When storing data into HBase using 
 org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is 
 stored with insertion time of the mapreduce job.  It would be nice to have a 
 way to populate timestamp from user data.



[jira] [Commented] (PIG-1832) Support timestamp in HBaseStorage when storing

2013-02-26 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587328#comment-13587328
 ] 

Bill Graham commented on PIG-1832:
--

I don't think there is a ticket to support returning multiple cell versions 
with timestamps, but we did discuss ideas for an approach here:

https://issues.apache.org/jira/browse/PIG-1782?focusedCommentId=12988192&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12988192

Basically the idea is to create a new class to support this, since it would be 
fundamentally very different than what we currently support with 
{{HBaseStorage}}. That work might be better handled after we tackle PIG-3067 
(HBaseStorage should be split up to become more manageable).

 Support timestamp in HBaseStorage when storing
 --

 Key: PIG-1832
 URL: https://issues.apache.org/jira/browse/PIG-1832
 Project: Pig
  Issue Type: Improvement
Reporter: Eric Yang

 When storing data into HBase using 
 org.apache.pig.backend.hadoop.hbase.HBaseStorage, HBase timestamp field is 
 stored with insertion time of the mapreduce job.  It would be nice to have a 
 way to populate timestamp from user data.



[jira] [Updated] (PIG-3067) HBaseStorage should be split up to become more manageable

2013-02-26 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3067:
-

Summary: HBaseStorage should be split up to become more manageable  (was: 
HBaseStorage should be split up to become more managable)

 HBaseStorage should be split up to become more manageable
 -

 Key: PIG-3067
 URL: https://issues.apache.org/jira/browse/PIG-3067
 Project: Pig
  Issue Type: Improvement
Reporter: Christoph Bauer
Assignee: Christoph Bauer
 Attachments: hbasestorage-split.patch


 HBaseStorage has become quite big (1100 lines).
 I propose to split it up into more manageable parts. I believe it will become 
 a lot easier to maintain.
 I split it up like this:
 HBaseStorage
 * settings:LoadStoreFuncSettings
 ** options
 ** caster
 ** udfProperties
 ** contextSignature
 ** columns:ColumnInfo - moved to its own class-file
 * loadFuncDelegate:HBaseLoadFunc - LoadFunc implementation
 ** settings:LoadStoreFuncSettings (s.a.)
 ** scanner:HBaseLoadFuncScanner - everything scan-specific
 ** tupleIterator:HBaseTupleIterator - interface for _public Tuple getNext()_
 * storeFuncDelegate:HBaseStorFunc - StorFunc implementation
 ** settings:LoadStoreFuncSettings (s.a.)



[jira] [Resolved] (PIG-3220) No docs for CUBE in Pig 0.11 :(

2013-02-25 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-3220.
--

Resolution: Duplicate

Duplicate of PIG-3202. Until the next release, docs can be found here:

https://issues.apache.org/jira/browse/PIG-2765?focusedCommentId=13427021&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13427021

 No docs for CUBE in Pig 0.11 :(
 ---

 Key: PIG-3220
 URL: https://issues.apache.org/jira/browse/PIG-3220
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Russell Jurney
Priority: Blocker
 Fix For: 0.11.1


 There are no docs for CUBE in this release.



[jira] [Updated] (PIG-3202) CUBE operator not documented in user docs

2013-02-24 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3202:
-

Attachment: PIG-3202.2.patch

Looks great, thanks Prasanth! I just made a few minor formatting tweaks and 
committed.

 CUBE operator not documented in user docs
 -

 Key: PIG-3202
 URL: https://issues.apache.org/jira/browse/PIG-3202
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Bill Graham
Assignee: Prasanth J
 Fix For: 0.12

 Attachments: PIG-3202.1.git.patch, PIG-3202.2.patch


 This is not documented in the user docs.



[jira] [Updated] (PIG-3202) CUBE operator not documented in user docs

2013-02-24 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3202:
-

Fix Version/s: 0.11.1

 CUBE operator not documented in user docs
 -

 Key: PIG-3202
 URL: https://issues.apache.org/jira/browse/PIG-3202
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Bill Graham
Assignee: Prasanth J
 Fix For: 0.12, 0.11.1

 Attachments: PIG-3202.1.git.patch, PIG-3202.2.patch


 This is not documented in the user docs.



[jira] [Resolved] (PIG-3202) CUBE operator not documented in user docs

2013-02-24 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-3202.
--

Resolution: Fixed

Committed to trunk and pig 11 branch.

 CUBE operator not documented in user docs
 -

 Key: PIG-3202
 URL: https://issues.apache.org/jira/browse/PIG-3202
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Bill Graham
Assignee: Prasanth J
 Fix For: 0.12, 0.11.1

 Attachments: PIG-3202.1.git.patch, PIG-3202.2.patch


 This is not documented in the user docs.



[jira] [Updated] (PIG-3214) New/improved mascot

2013-02-24 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3214:
-

Fix Version/s: (was: 0.11.1)
   0.12

 New/improved mascot
 ---

 Key: PIG-3214
 URL: https://issues.apache.org/jira/browse/PIG-3214
 Project: Pig
  Issue Type: Wish
  Components: site
Affects Versions: 0.11
Reporter: Andrew Musselman
Priority: Minor
 Fix For: 0.12


 Request to change pig mascot to something more graphically appealing.



[jira] [Commented] (PIG-3214) New/improved mascot

2013-02-24 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585623#comment-13585623
 ] 

Bill Graham commented on PIG-3214:
--

+1 to a new mascot.

 New/improved mascot
 ---

 Key: PIG-3214
 URL: https://issues.apache.org/jira/browse/PIG-3214
 Project: Pig
  Issue Type: Wish
  Components: site
Affects Versions: 0.11
Reporter: Andrew Musselman
Priority: Minor
 Fix For: 0.12


 Request to change pig mascot to something more graphically appealing.



[jira] [Updated] (PIG-3002) Pig client should handle CountersExceededException

2013-02-24 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3002:
-

Attachment: PIG-3002.2.patch

I was able to run some tests using both the patched and unpatched version and 
both behave the same w.r.t. what's output to the console.

The behavior of {{MapReduceLauncher.computeWarningAggregate()}} is already to 
catch {{IOException}} and log a warning, so it would be acceptable to catch 
{{Exception}} around the iterator on the {{PigWarning}} enum and just 
{{log.warn}} there as well. No need to modify the shim code. Just log the error 
and move on. This will cause the console output to show the success state of 
the jobs, along with the logged counter exception.
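
A minimal sketch of that catch-log-and-continue pattern (illustrative only, not Pig's actual MapReduceLauncher code; the enum and counter-source interface below are hypothetical stand-ins for PigWarning and Hadoop's Counters, which can throw once the counter limit is exceeded):

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.logging.Logger;

public class WarningAggregator {
    private static final Logger LOG =
            Logger.getLogger(WarningAggregator.class.getName());

    // Stand-in for the PigWarning enum.
    enum PigWarningLike { UDF_WARNING, IMPLICIT_CAST, DIVIDE_BY_ZERO }

    // Stand-in for a counter source that may throw a runtime
    // exception (as Hadoop's Counters does when the limit is hit).
    interface CounterSource {
        long getCounter(PigWarningLike w);
    }

    // One failing lookup is logged and skipped; the remaining
    // warnings are still aggregated and reported.
    static Map<PigWarningLike, Long> aggregate(CounterSource src) {
        Map<PigWarningLike, Long> totals = new EnumMap<>(PigWarningLike.class);
        for (PigWarningLike w : PigWarningLike.values()) {
            try {
                totals.put(w, src.getCounter(w));
            } catch (RuntimeException e) {
                LOG.warning("Could not read counter " + w + ": " + e.getMessage());
            }
        }
        return totals;
    }
}
```

The point of the sketch is that the try/catch sits inside the loop, so a counter exception never aborts the whole aggregation and the job's success state still reaches the console.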

 Pig client should handle CountersExceededException
 --

 Key: PIG-3002
 URL: https://issues.apache.org/jira/browse/PIG-3002
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Jarek Jarcec Cecho
  Labels: newbie, simple
 Attachments: PIG-3002.2.patch, PIG-3002.patch


 Running a pig job that uses more than 120 counters will succeed, but a grunt 
 exception will occur when trying to output counter info to the console. This 
 exception should be caught and handled with friendly messaging:
 {noformat}
 org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected 
 error during execution.
 at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
 at 
 org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
 at org.apache.pig.PigServer.execute(PigServer.java:1239)
 at org.apache.pig.PigServer.executeBatch(PigServer.java:333)
 at 
 org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:136)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:197)
 at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
 at org.apache.pig.Main.run(Main.java:604)
 at org.apache.pig.Main.main(Main.java:154)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Caused by: org.apache.hadoop.mapred.Counters$CountersExceededException: 
 Error: Exceeded limits on number of counters - Counters=120 Limit=120
 at 
 org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:312)
 at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:431)
 at org.apache.hadoop.mapred.Counters.getCounter(Counters.java:495)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:707)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:442)
 at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
 {noformat}



[jira] [Created] (PIG-3219) Script to run Pig ant targets on AWS

2013-02-24 Thread Bill Graham (JIRA)
Bill Graham created PIG-3219:


 Summary: Script to run Pig ant targets on AWS
 Key: PIG-3219
 URL: https://issues.apache.org/jira/browse/PIG-3219
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham


During the Pig 11 release I wrote a script to install software required to 
build Pig on ec2 instances before running the build. This script could be 
helpful for future releases or for running unit tests remotely.



[jira] [Updated] (PIG-3219) Script to run Pig ant targets on AWS

2013-02-24 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3219:
-

Attachment: pig-ec2-release-build.sh

Not sure where the best place to put this would be. Suggestions welcome.

 Script to run Pig ant targets on AWS
 

 Key: PIG-3219
 URL: https://issues.apache.org/jira/browse/PIG-3219
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Attachments: pig-ec2-release-build.sh


 During the Pig 11 release I wrote a script to install software required to 
 build Pig on ec2 instances before running the build. This script could be 
 helpful for future releases or for running unit tests remotely.



Re: How do we post to the apache pig blog?

2013-02-22 Thread Bill Graham
Looks like an infra jira is needed to get a PMC member admin rights:

Creating new Project Blog users

The blogs.apache.org backend is not currently connected to Apache LDAP
services, so blog users need to be created by our infrastructure team. To
get a username, create an INFRA issue
(https://issues.apache.org/jira/browse/INFRA)
using *Blogs* as its component name. Indicate your Apache user ID and to
which blog you're requesting access.

Granting Project blog rights to other committers

PMC members with blog admin rights can grant access rights to blog users
via https://blogs.apache.org/admin




On Fri, Feb 22, 2013 at 12:37 AM, Aniket Mokashi aniket...@gmail.com wrote:

 Just a guess- http://www.apache.org/dev/project-blogs


 On Fri, Feb 22, 2013 at 12:07 AM, Dmitriy Ryaboy dvrya...@gmail.com
 wrote:

  I prepared a detailed post going over the pig 0.11 release, and realized
 I
  don't know how to post to the apache pig blog. Does anyone have a
 pointer?
 



 --
 ...:::Aniket:::... Quetzalco@tl




-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*


[jira] [Commented] (PIG-3174) Remove rpm and deb artifacts from build.xml

2013-02-21 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583301#comment-13583301
 ] 

Bill Graham commented on PIG-3174:
--

+1 to patch and doc approach. A link from the releases page makes sense.

 Remove rpm and deb artifacts from build.xml
 ---

 Key: PIG-3174
 URL: https://issues.apache.org/jira/browse/PIG-3174
 Project: Pig
  Issue Type: Task
  Components: build
Affects Versions: 0.12
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.12

 Attachments: PIG-3174.2.patch, PIG-3174.patch


 I propose that we remove the targets to build rpms and debs from build.xml 
 and consequently quit publishing them as part of our releases.  Bigtop 
 publishes these packages now.  And building them takes infrastructure that 
 not every committer/PMC member has.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (PIG-3202) CUBE operator not documented in user docs

2013-02-21 Thread Bill Graham (JIRA)
Bill Graham created PIG-3202:


 Summary: CUBE operator not documented in user docs
 Key: PIG-3202
 URL: https://issues.apache.org/jira/browse/PIG-3202
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Bill Graham
 Fix For: 0.12


This is not documented in the user docs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (PIG-3203) ROLLUP not documented in user docs

2013-02-21 Thread Bill Graham (JIRA)
Bill Graham created PIG-3203:


 Summary: ROLLUP not documented in user docs
 Key: PIG-3203
 URL: https://issues.apache.org/jira/browse/PIG-3203
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Bill Graham
 Fix For: 0.12


This is not documented in the user docs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-3202) CUBE operator not documented in user docs

2013-02-21 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583490#comment-13583490
 ] 

Bill Graham commented on PIG-3202:
--

Sure, that would be great. See https://cwiki.apache.org/PIG/howtodocument.html 
for a description of how to modify user docs.

 CUBE operator not documented in user docs
 -

 Key: PIG-3202
 URL: https://issues.apache.org/jira/browse/PIG-3202
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Bill Graham
 Fix For: 0.12


 This is not documented in the user docs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3144) Erroneous map entry alias resolution leading to Duplicate schema alias errors

2013-02-21 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3144:
-

Fix Version/s: (was: 0.11)

 Erroneous map entry alias resolution leading to Duplicate schema alias 
 errors
 ---

 Key: PIG-3144
 URL: https://issues.apache.org/jira/browse/PIG-3144
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11, 0.10.1
Reporter: Kai Londenberg
Assignee: Jonathan Coveney
 Fix For: 0.12

 Attachments: PIG-3144-0.patch


 The following code illustrates a problem concerning alias resolution in pig 
 The schema of D2 will incorrectly be described as containing two age 
 fields. And the last step in the following script will lead to a Duplicate 
 schema alias error message.
 I only encountered this bug when using aliases for map fields. 
 {code}
 DATA = LOAD 'file:///whatever' as (a:map[chararray], b:chararray);
 D1 = FOREACH DATA GENERATE a#'name' as name, a#'age' as age, b;
 D2 = FOREACH D1 GENERATE name, age, b;
 DESCRIBE D2;
 {code}
 Output:
 {code}
 D2: {
 age: chararray,
 age: chararray,
 b: chararray
 }
 {code}
 {code}
 D3 = FOREACH D2 GENERATE *;
 DESCRIBE D3;
 {code}
 Output:
 {code}
 file file:///.../pig-bug-example.pig, line 20, column 16 Duplicate schema 
 alias: age
 {code}
 This error occurs in this form in Apache Pig version 0.11.0-SNAPSHOT (r6408). 
 A less severe variant of this bug is also present in pig 0.10.1. In 0.10.1, 
 the Duplicate schema alias error message won't occur, but the schema of D2 
 (see above) will still have wrong duplicate alias entries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3177) Fix Pig project SEO so latest, 0.11 docs show when you google things

2013-02-21 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3177:
-

Fix Version/s: (was: 0.11)
   0.12

 Fix Pig project SEO so latest, 0.11 docs show when you google things
 

 Key: PIG-3177
 URL: https://issues.apache.org/jira/browse/PIG-3177
 Project: Pig
  Issue Type: Bug
  Components: site
Affects Versions: 0.11
Reporter: Russell Jurney
Assignee: Russell Jurney
Priority: Critical
 Fix For: 0.12


 http://pig.apache.org/docs/r0.7.0/api/org/apache/pig/piggybank/storage/SequenceFileLoader.html
 The 0.7.0 docs are what everyone references. FOR POOPS SAKES.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2

2013-02-21 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3194:
-

Summary: Changes to ObjectSerializer.java break compatibility with Hadoop 
0.20.2  (was: Pig 0.11 candidate 2: Changes to ObjectSerializer.java break 
compatibility with Hadoop 0.20.2)

 Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
 ---

 Key: PIG-3194
 URL: https://issues.apache.org/jira/browse/PIG-3194
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Kai Londenberg

 The changes to ObjectSerializer.java in the following commit
 http://svn.apache.org/viewvc?view=revision&revision=1403934 break 
 compatibility with Hadoop 0.20.2 Clusters.
 The reason is, that the code uses methods from Apache Commons Codec 1.4 - 
 which are not available in Apache Commons Codec 1.3 which is shipping with 
 Hadoop 0.20.2.
 The offending methods are Base64.decodeBase64(String) and 
 Base64.encodeBase64URLSafeString(byte[])
 If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 
 0.20.2 Clusters.
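For reference, the URL-safe behavior the codec-1.4 method provides can be emulated on top of any plain Base64 codec, which is one way to restore commons-codec 1.3 compatibility. This is a hedged sketch, not Pig's actual fix: the class name UrlSafeCompat and both helper methods are hypothetical, and java.util.Base64 stands in here for the underlying encoder (on a codec-1.3 classpath the same mapping would sit on top of Base64.encodeBase64(byte[]) and Base64.decodeBase64(byte[]), which exist in both 1.3 and 1.4).

```java
import java.util.Arrays;
import java.util.Base64;

public class UrlSafeCompat {

    // Emulates codec 1.4's Base64.encodeBase64URLSafeString(byte[]):
    // standard Base64, then swap to the URL-safe alphabet and drop padding.
    static String encodeUrlSafe(byte[] data) {
        return Base64.getEncoder().encodeToString(data)
                .replace('+', '-')
                .replace('/', '_')
                .replace("=", "");
    }

    // Inverse mapping: restore the standard alphabet and padding, then decode.
    static byte[] decodeUrlSafe(String s) {
        StringBuilder std = new StringBuilder(s.replace('-', '+').replace('_', '/'));
        while (std.length() % 4 != 0) {
            std.append('=');
        }
        return Base64.getDecoder().decode(std.toString());
    }

    public static void main(String[] args) {
        // 0xfb 0xff 0x3e encodes to "+//+" in the standard alphabet,
        // so the URL-safe form exercises both substituted characters.
        byte[] payload = { (byte) 0xfb, (byte) 0xff, 0x3e };
        String enc = encodeUrlSafe(payload);
        System.out.println(enc);                                        // -__-
        System.out.println(Arrays.equals(decodeUrlSafe(enc), payload)); // true
    }
}
```

Inlining this character mapping in ObjectSerializer would avoid the two codec-1.4-only methods without changing the wire format for URL-safe data.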

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3194) Pig 0.11 candidate 2: Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2

2013-02-21 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3194:
-

Affects Version/s: 0.11

 Pig 0.11 candidate 2: Changes to ObjectSerializer.java break compatibility 
 with Hadoop 0.20.2
 -

 Key: PIG-3194
 URL: https://issues.apache.org/jira/browse/PIG-3194
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Kai Londenberg

 The changes to ObjectSerializer.java in the following commit
 http://svn.apache.org/viewvc?view=revision&revision=1403934 break 
 compatibility with Hadoop 0.20.2 Clusters.
 The reason is, that the code uses methods from Apache Commons Codec 1.4 - 
 which are not available in Apache Commons Codec 1.3 which is shipping with 
 Hadoop 0.20.2.
 The offending methods are Base64.decodeBase64(String) and 
 Base64.encodeBase64URLSafeString(byte[])
 If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 
 0.20.2 Clusters.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets

2013-02-21 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3189:
-

Attachment: PIG-3189.4.patch

Patch #3 had my *.iws changes in it. Uploading cleaned-up patch #4.

 Remove ivy/pig.pom and improve build mvn targets
 

 Key: PIG-3189
 URL: https://issues.apache.org/jira/browse/PIG-3189
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12

 Attachments: PIG-3189.1.patch, PIG-3189.2.patch, PIG-3189.3.patch, 
 PIG-3189.4.patch


 {{ivy/pig.pom}} in SVN seems to no longer be used.  At build time ({{ant 
 set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
 {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.
 It would also be good to decouple building the maven artifacts from 
 publishing them, since those two tasks might be done on different hosts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets

2013-02-21 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3189:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Remove ivy/pig.pom and improve build mvn targets
 

 Key: PIG-3189
 URL: https://issues.apache.org/jira/browse/PIG-3189
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12

 Attachments: PIG-3189.1.patch, PIG-3189.2.patch, PIG-3189.3.patch, 
 PIG-3189.4.patch


 {{ivy/pig.pom}} in SVN seems to no longer be used.  At build time ({{ant 
 set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
 {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.
 It would also be good to decouple building the maven artifacts from 
 publishing them, since those two tasks might be done on different hosts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: When making a change to pig.apache.org, do we attach just the patch for the changes to author, or to the post-forrest changes to publish as well?

2013-02-20 Thread Bill Graham
Just the xdoc XML gets submitted IIRC, not the generated files.

On Wed, Feb 20, 2013 at 1:35 AM, Jonathan Coveney jcove...@gmail.com wrote:

 I believe that this is how it works, but it has been a while and I want to
 make sure...



Re: When making a change to pig.apache.org, do we attach just the patch for the changes to author, or to the post-forrest changes to publish as well?

2013-02-20 Thread Bill Graham
My bad, I was talking about the user docs, not the site docs.


On Wed, Feb 20, 2013 at 8:17 AM, Alan Gates ga...@hortonworks.com wrote:

 You need to check in both author and publish.  The site is now directly
 loaded from SVN using what's under publish.

 Alan.

 On Feb 20, 2013, at 1:35 AM, Jonathan Coveney wrote:

  I believe that this is how it works, but it has been a while and I want
 to
  make sure...






Re: pig 0.11 candidate 2 feedback: Several problems

2013-02-20 Thread Bill Graham
Regarding Pig 11 rc2, I propose we continue with the current vote as is
(which closes today EOD). Patches for 0.20.2 issues can be rolled into a
Pig 0.11.1 release whenever they're available and tested.



On Wed, Feb 20, 2013 at 9:24 AM, Olga Natkovich onatkov...@yahoo.com wrote:

 I agree that supporting as much as we can is a good goal. The issue is who
 is going to be testing against all these versions? We found the issues
 under discussion because of a customer report, not because we consistently
 test against all versions. Perhaps when we decide which versions to support
 for next release we need also to agree who is going to be testing and
 maintaining compatibility with a particular version.

 For instance since Hadoop 23 compatibility is important for us at Yahoo we
 have been maintaining compatibility with this version for 0.9, 0.10 and
 will do the same for 0.11 and going forward. I think we would need others
 to step in and claim the versions of their interest.

 Olga


 
  From: Kai Londenberg kai.londenb...@googlemail.com
 To: dev@pig.apache.org
 Sent: Wednesday, February 20, 2013 1:51 AM
 Subject: Re: pig 0.11 candidate 2 feedback: Several problems

 Hi,

 I strongly agree with Jonathan here. If there are good reasons why you
 can't support an older version of Hadoop any more, that's one thing.
 But having to change 2 lines of code doesn't really qualify as such in
 my point of view ;)

 At least for me, pig support for 0.20.2 is essential - without it, I
 can't use it. If it doesn't support it, I'll have to branch pig and
 hack it myself, or stop using it.

 I guess, there are a lot of people still running 0.20.2 Clusters. If
 you really have lots of data stored on HDFS and a continuously busy
 cluster, an upgrade is nothing you do just because.


 2013/2/20 Jonathan Coveney jcove...@gmail.com:
  I agree that we shouldn't have to support old versions forever. That
 said,
  I also don't think we should be too blase about supporting older versions
  where it is not odious to do so. We have a lot of competition in the
  language space and the broader the versions we can support, the better
  (assuming it isn't too odious to do so). In this case, I don't think it
  should be too hard to change ObjectSerializer so that the commons-codec
  code used is compatible with both versions...we could just in-line some
 of
  the Base64 code, and comment accordingly.
 
  That said, we also should be clear about what versions we support, but
 6-12
  months seems short. The upgrade cycles on Hadoop are really, really long.
 
 
  2013/2/20 Prashant Kommireddi prash1...@gmail.com
 
  Agreed, that makes sense. Probably supporting older hadoop version for
 a 1
  or 2 pig releases before moving to a newer/stable version?
 
  Having said that, should we use 0.11 period to communicate the same to
 the
  community and start moving on 0.12 onwards? I know we are way past 6-12
  months (1-2 release) time frame with 0.20.2, but we also need to make
 sure
  users are aware and plan accordingly.
 
  I'd also be interested to hear how other projects (Hive, Oozie) are
  handling this.
 
  -Prashant
 
  On Tue, Feb 19, 2013 at 3:22 PM, Olga Natkovich onatkov...@yahoo.com
  wrote:
 
   It seems that for each Pig release we need to agree and clearly state
   which Hadoop versions it will support. I guess the main question is
 how
  we
   decide on this. Perhaps we should say that Pig no longer supports
 older
   Hadoop versions once the newer one is out for at least 6-12 month to
 make
   sure it is stable. I don't think we can support old versions
  indefinitely.
   It is in everybody's interest to keep moving forward.
  
   Olga
  
  
   
From: Prashant Kommireddi prash1...@gmail.com
   To: dev@pig.apache.org
   Sent: Tuesday, February 19, 2013 10:57 AM
   Subject: Re: pig 0.11 candidate 2 feedback: Several problems
  
   What do you guys feel about the JIRA to do with 0.20.2 compatibility
   (PIG-3194)? I am interested in discussing the strategy around backward
   compatibility as this is something that would haunt us each time we
 move
  to
   the next hadoop version. For eg, we might be in a similar situation
 while
   moving to Hadoop 2.0, when some of the stuff might break for 1.0.
  
   I feel it would be good to get this JIRA fix in for 0.11, as 0.20.2
 users
   might be caught unaware. Of course, I must admit there is selfish
  interest
   here and it's probably easier for us to have a workaround on Pig
 rather
   than upgrade hadoop in all our production DCs.
  
   -Prashant
  
  
   On Tue, Feb 19, 2013 at 9:54 AM, Russell Jurney 
  russell.jur...@gmail.com
   wrote:
  
I think someone should step up and fix the easy ones, if possible.
   
   
On Tue, Feb 19, 2013 at 9:51 AM, Bill Graham billgra...@gmail.com
   wrote:
   
 Thanks Kai for reporting these.

 What do people think about the severity of these issues w.r.t. Pig
  11

Re: [VOTE] Release Pig 0.11.0 (candidate 2)

2013-02-20 Thread Bill Graham
With 3 binding +1s (Daniel, Dmitriy and Julien) and 1 non-binding +1
(Cheolsoo),
the vote passes. I will start the release process.

thanks,
Bill

On Wed, Feb 20, 2013 at 7:46 PM, Cheolsoo Park cheol...@cloudera.com wrote:

 +1 (non-binding)

 I downloaded and compiled source tarball. I tested jars against Hadoop 1.x
 and 2.x based clusters.


 On Wed, Feb 20, 2013 at 5:10 PM, Julien Le Dem jul...@twitter.com wrote:

  +1
  I've run a subset of the tests on the src tar
  run some jobs in local mode on the binary tar
  checked the release note
  looks good to me
  Julien
 
  On Feb 14, 2013, at 3:59 PM, Bill Graham wrote:
 
   Hi,
  
   I have created a candidate build for Pig 0.11.0.
  
   Keys used to sign the release are available at:
   http://svn.apache.org/viewvc/pig/trunk/KEYS?view=markup
  
   Please download, test, and try it out:
   http://people.apache.org/~billgraham/pig-0.11.0-candidate-2/
  
   Should we release this? Vote closes on next Wednesday EOD, Feb 20th.
  
   Thanks,
   Bill
 
 



Re: What do we need to change site documentation?

2013-02-19 Thread Bill Graham
It's the '~'. Swap that out for $HOME.

On Tue, Feb 19, 2013 at 7:15 AM, Jonathan Coveney jcove...@gmail.com wrote:

 Hm, that's what I thought. Not sure why I'm having an issue then...I tried
 to build it and it failed. I was able to successfully build the example
 that came with forrest. Anyone seen something like this before?

 $ ls ~/workspace/apache-forrest-0.9/
 KEYSLICENSE.txtNOTICE.txtREADME.txtbinbuild
 etcindex.htmllibmainplugins
 site-authortoolswhiteboard
 [jonathancoveney@Jonathans-MacBook-Pro site]$ ant
 -Dforrest.home=~/workspace/apache-forrest-0.9
 Buildfile: /Users/jonathancoveney/workspace/pig_full/site/build.xml

 clean:

 forrest.check:

 update:

 BUILD FAILED
 /Users/jonathancoveney/workspace/pig_full/site/build.xml:11: Execute
 failed: java.io.IOException: Cannot run program
 ~/workspace/apache-forrest-0.9/bin/forrest (in directory
 /Users/jonathancoveney/workspace/pig_full/site/author): error=2, No such
 file or directory
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
 at java.lang.Runtime.exec(Runtime.java:593)
 at

 org.apache.tools.ant.taskdefs.Execute$Java13CommandLauncher.exec(Execute.java:862)
 at org.apache.tools.ant.taskdefs.Execute.launch(Execute.java:481)
 at org.apache.tools.ant.taskdefs.Execute.execute(Execute.java:495)
 at org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:631)
 at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:672)
 at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:498)
 at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at

 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at

 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
 at org.apache.tools.ant.Task.perform(Task.java:348)
 at org.apache.tools.ant.Target.execute(Target.java:390)
 at org.apache.tools.ant.Target.performTasks(Target.java:411)
 at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399)
 at org.apache.tools.ant.Project.executeTarget(Project.java:1368)
 at

 org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
 at org.apache.tools.ant.Project.executeTargets(Project.java:1251)
 at org.apache.tools.ant.Main.runBuild(Main.java:809)
 at org.apache.tools.ant.Main.startAnt(Main.java:217)
 at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
 at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
 Caused by: java.io.IOException: error=2, No such file or directory
 at java.lang.UNIXProcess.forkAndExec(Native Method)
 at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
 at java.lang.ProcessImpl.start(ProcessImpl.java:91)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
 ... 24 more

 Total time: 0 seconds



 2013/2/19 Alan Gates ga...@hortonworks.com

  No, somebody fixed it a while ago so it works with java 6.  Just checkout
  pig/site, make your changes, build with ant -Dforrest.home=whatever,
  view the changes locally under the publish directory, add any new files,
  and check in.  The publication from SVN to web is now automatic.  It all
  works fine with the default Java on my mac.
 
  Alan.
 
  On Feb 19, 2013, at 4:39 AM, Jonathan Coveney wrote:
 
   I know we need forrest, but do we still need java 5?
 
 






[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets

2013-02-19 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3189:
-

Attachment: PIG-3189.2.patch

{{git rm}} did not, but just {{rm}} did the trick. Here's patch 2 which 
reflects the delete.

 Remove ivy/pig.pom and improve build mvn targets
 

 Key: PIG-3189
 URL: https://issues.apache.org/jira/browse/PIG-3189
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12

 Attachments: PIG-3189.1.patch, PIG-3189.2.patch


 {{ivy/pig.pom}} SVN seems to no longer be used.  At build time ({{ant 
 set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
 {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.
 It would also be good to decouple building the maven artifacts from 
 publishing them, since those two tasks might be done on different hosts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets

2013-02-19 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3189:
-

Description: 
{{ivy/pig.pom}} in SVN seems to no longer be used.  At build time ({{ant 
set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
{{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.

It would also be good to decouple building the maven artifacts from publishing 
them, since those two tasks might be done on different hosts.

  was:
{{ivy/pig.pom}} SVN seems to no longer be used.  At build time ({{ant 
set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
{{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.

It would also be good to decouple building the maven artifacts from publishing 
them, since those two tasks might be done on different hosts.


 Remove ivy/pig.pom and improve build mvn targets
 

 Key: PIG-3189
 URL: https://issues.apache.org/jira/browse/PIG-3189
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12

 Attachments: PIG-3189.1.patch, PIG-3189.2.patch


 {{ivy/pig.pom}} in SVN seems to no longer be used.  At build time ({{ant 
 set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
 {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.
 It would also be good to decouple building the maven artifacts from 
 publishing them, since those two tasks might be done on different hosts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: What do we need to change site documentation?

2013-02-19 Thread Bill Graham
You and me both. I've been burned by that one about 342 times now...

On Tue, Feb 19, 2013 at 8:11 AM, Jonathan Coveney jcove...@gmail.com wrote:

 I am such a scrub :) thanks Bill!


 2013/2/19 Bill Graham billgra...@gmail.com

 It's the '~'. Swap that out for $HOME.

 On Tue, Feb 19, 2013 at 7:15 AM, Jonathan Coveney jcove...@gmail.com
 wrote:

  Hm, that's what I thought. Not sure why I'm having an issue then...I
 tried
  to build it and it failed. I was able to successfully build the example
  that came with forrest. Anyone seen something like this before?
 
  $ ls ~/workspace/apache-forrest-0.9/
  KEYSLICENSE.txtNOTICE.txtREADME.txtbinbuild
  etcindex.htmllibmainplugins
  site-authortoolswhiteboard
  [jonathancoveney@Jonathans-MacBook-Pro site]$ ant
  -Dforrest.home=~/workspace/apache-forrest-0.9
  Buildfile: /Users/jonathancoveney/workspace/pig_full/site/build.xml
 
  clean:
 
  forrest.check:
 
  update:
 
  BUILD FAILED
  /Users/jonathancoveney/workspace/pig_full/site/build.xml:11: Execute
  failed: java.io.IOException: Cannot run program
  ~/workspace/apache-forrest-0.9/bin/forrest (in directory
  /Users/jonathancoveney/workspace/pig_full/site/author): error=2, No
 such
  file or directory
  at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
  at java.lang.Runtime.exec(Runtime.java:593)
  at
 
 
 org.apache.tools.ant.taskdefs.Execute$Java13CommandLauncher.exec(Execute.java:862)
  at org.apache.tools.ant.taskdefs.Execute.launch(Execute.java:481)
  at org.apache.tools.ant.taskdefs.Execute.execute(Execute.java:495)
  at
 org.apache.tools.ant.taskdefs.ExecTask.runExecute(ExecTask.java:631)
  at org.apache.tools.ant.taskdefs.ExecTask.runExec(ExecTask.java:672)
  at org.apache.tools.ant.taskdefs.ExecTask.execute(ExecTask.java:498)
  at
 org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at
 
 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at
 
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at
 
 org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
  at org.apache.tools.ant.Task.perform(Task.java:348)
  at org.apache.tools.ant.Target.execute(Target.java:390)
  at org.apache.tools.ant.Target.performTasks(Target.java:411)
  at
 org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399)
  at org.apache.tools.ant.Project.executeTarget(Project.java:1368)
  at
 
 
 org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
  at org.apache.tools.ant.Project.executeTargets(Project.java:1251)
  at org.apache.tools.ant.Main.runBuild(Main.java:809)
  at org.apache.tools.ant.Main.startAnt(Main.java:217)
  at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
  at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)
  Caused by: java.io.IOException: error=2, No such file or directory
  at java.lang.UNIXProcess.forkAndExec(Native Method)
  at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
  at java.lang.ProcessImpl.start(ProcessImpl.java:91)
  at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
  ... 24 more
 
  Total time: 0 seconds
 
 
 
  2013/2/19 Alan Gates ga...@hortonworks.com
 
   No, somebody fixed it a while ago so it works with java 6.  Just
 checkout
   pig/site, make your changes, build with ant
 -Dforrest.home=whatever,
   view the changes locally under the publish directory, add any new
 files,
   and check in.  The publication from SVN to web is now automatic.  It
 all
   works fine with the default Java on my mac.
  
   Alan.
  
   On Feb 19, 2013, at 4:39 AM, Jonathan Coveney wrote:
  
I know we need forrest, but do we still need java 5?
  
  
 










Re: pig 0.11 candidate 2 feedback: Several problems

2013-02-19 Thread Bill Graham
Thanks Kai for reporting these.

What do people think about the severity of these issues w.r.t. Pig 11? I
see a few possible options:

1. We include some or all of these patches in a new Pig 11 rc. We'd want to
make sure that they don't destabilize the current branch. This approach
makes sense if we think Pig 11 wouldn't be a good release without one or
more of these included.

2. We continue with the Pig 11 release without these, but then include one
or more in a 0.11.1 release.

3. We continue with the Pig 11 release without these, but then include them
in a 0.12 release.

Jon has a patch for the MAP issue (PIG-3144,
https://issues.apache.org/jira/browse/PIG-3144) ready, which seems like the
most pressing of the three to me.

thanks,
Bill

On Mon, Feb 18, 2013 at 2:27 AM, Kai Londenberg 
kai.londenb...@googlemail.com wrote:

 Hi,

 I just subscribed to the dev mailing list in order to give you some
 feedback on pig 0.11 candidate 2.

 The following three issues are currently present in 0.11 candidate 2:

 https://issues.apache.org/jira/browse/PIG-3144 - 'Erroneous map entry
 alias resolution leading to Duplicate schema alias errors'
 https://issues.apache.org/jira/browse/PIG-3194 - Changes to
 ObjectSerializer.java break compatibility with Hadoop 0.20.2
 https://issues.apache.org/jira/browse/PIG-3195 - Race Condition in
 PhysicalOperator leads to ExecException Error while trying to get
 next result in POStream

 The last two of these are easily solveable (see the tickets for
 details on that). The first one is a bit trickier I think, but at
 least there is a workaround for it (pass Map fields through an UDF)

 In my personal opinion, each of these problems is pretty severe, but
 opinions about the importance of the MAP Datatype and STREAM Operator,
 as well as Hadoop 0.20.2 compatibility might differ.

 so far ..

 Kai Londenberg






[jira] [Commented] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets

2013-02-19 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581539#comment-13581539
 ] 

Bill Graham commented on PIG-3189:
--

There have been changes to {{ivy/pig.pom}} that are not reflected in 
{{ivy/pig-template.xml}}. Particularly antlr is not being included in the 
published pom because it's not in the template. Will submit a new patch.

{noformat}
$ diff ivy/pig-template.xml ivy/pig.pom
24c24
<    <version>@version</version>
---
>    <version>0.9.0-SNAPSHOT</version>
85,86c85,86
<  </dependency>
<  <dependency>
---
> </dependency>
> <dependency>
122c122,132
<    <groupId>org.apache.avro</groupId>
---
>    <groupId>org.antlr</groupId>
>    <artifactId>antlr-runtime</artifactId>
>    <version>3.4</version>
>  </dependency>
>  <dependency>
>    <groupId>org.antlr</groupId>
>    <artifactId>ST4</artifactId>
>    <version>4.0.4</version>
>  </dependency>
> <dependency>
>    <groupId>org.apache.hadoop</groupId>
124c134
<    <version>1.5.3</version>
---
>    <version>1.3.2</version>
{noformat}

 Remove ivy/pig.pom and improve build mvn targets
 

 Key: PIG-3189
 URL: https://issues.apache.org/jira/browse/PIG-3189
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12

 Attachments: PIG-3189.1.patch, PIG-3189.2.patch


 {{ivy/pig.pom}} in SVN seems to no longer be used.  At build time ({{ant 
 set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
 {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.
 It would also be good to decouple building the maven artifacts from 
 publishing them, since those two tasks might be done on different hosts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Missing ANTLR dependency in Pig 0.10.1

2013-02-19 Thread Bill Graham
No need to file a jira. I can roll this fix into this bug, which is the
source of the problem:

https://issues.apache.org/jira/browse/PIG-3189#comment-13581539


On Tue, Feb 19, 2013 at 11:07 AM, Rohini Palaniswamy 
rohini.adi...@gmail.com wrote:

 You mean it is not pulled as a transitive dependency? Currently you have to
 manually specify that as a dependency in your pom. Can you file a jira to
 make that part of the pig pom?

 Regards,
 Rohini

 On Tue, Feb 19, 2013 at 10:42 AM, Minh Lê ngocminh@gmail.com wrote:

  I tried to run PigServer in my Java code and get NoClassDefFoundError:
  org/antlr/runtime/RecognitionException. The Pig Maven repo appears to lack a
  reference to the ANTLR jars, although it uses ANTLR in the QueryParserDriver class.
 
  http://mvnrepository.com/artifact/org.apache.pig/pig/0.10.1
 
  Bests,
 
  --
  Minh, Lê Ngọc
  Trento University, Master in Cognitive Science - Class of 2014
  Skype: ngocminh_oss | Yahoo: ngocminh_oss | Tel: +39 389 603 7251
 




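For readers hitting the same NoClassDefFoundError, the workaround Rohini describes above is to declare the ANTLR runtime explicitly in your own project's pom until the published Pig pom includes it. A minimal sketch; the ANTLR version (3.4) is an assumption and should be matched to the version your Pig release actually bundles:

```xml
<!-- Hypothetical workaround: declare antlr-runtime explicitly alongside Pig,
     since the published Pig pom omits it as a transitive dependency.
     Version 3.4 is an assumption; match the ANTLR version your Pig uses. -->
<dependencies>
  <dependency>
    <groupId>org.apache.pig</groupId>
    <artifactId>pig</artifactId>
    <version>0.10.1</version>
  </dependency>
  <dependency>
    <groupId>org.antlr</groupId>
    <artifactId>antlr-runtime</artifactId>
    <version>3.4</version>
  </dependency>
</dependencies>
```

Once PIG-3189 lands and the published pom carries antlr-runtime, the extra declaration can be dropped.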


[jira] [Commented] (PIG-3131) Document PluckTuple UDF

2013-02-19 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13581748#comment-13581748
 ] 

Bill Graham commented on PIG-3131:
--

+1 to the hotfix.

 Document PluckTuple UDF
 ---

 Key: PIG-3131
 URL: https://issues.apache.org/jira/browse/PIG-3131
 Project: Pig
  Issue Type: Task
Affects Versions: 0.12
Reporter: Jonathan Coveney
Assignee: Russell Jurney
Priority: Blocker
 Fix For: 0.12

 Attachments: PIG-3131-hotfix.patch, PIG-3131.patch






Re: [ANNOUNCE] Welcome Bill Graham to join Pig PMC

2013-02-19 Thread Bill Graham
Thanks guys!

On Tue, Feb 19, 2013 at 3:20 PM, Cheolsoo Park cheol...@cloudera.comwrote:

 Congratulations!


 On Tue, Feb 19, 2013 at 2:35 PM, Prasanth J buckeye.prasa...@gmail.com
 wrote:

  Congrats Bill!
 
  Thanks
  -- Prasanth
 
  On Feb 19, 2013, at 4:52 PM, Prashant Kommireddi prash1...@gmail.com
  wrote:
 
   Congrats Bill!
  
   On Tue, Feb 19, 2013 at 1:48 PM, Daniel Dai da...@hortonworks.com
  wrote:
  
   Please welcome Bill Graham as our latest Pig PMC member.
  
   Congrats Bill!
  
 
 






[jira] [Updated] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets

2013-02-19 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3189:
-

Attachment: PIG-3189.3.patch

Adding patch #3 which includes antlr in the template.

 Remove ivy/pig.pom and improve build mvn targets
 

 Key: PIG-3189
 URL: https://issues.apache.org/jira/browse/PIG-3189
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12

 Attachments: PIG-3189.1.patch, PIG-3189.2.patch, PIG-3189.3.patch


 {{ivy/pig.pom}} in SVN seems to no longer be used.  At build time ({{ant 
 set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
 {{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.
 It would also be good to decouple building the maven artifacts from 
 publishing them, since those two tasks might be done on different hosts.



[jira] [Created] (PIG-3188) pig.script.submitted.timestamp not always consistent for jobs launched in a given script

2013-02-15 Thread Bill Graham (JIRA)
Bill Graham created PIG-3188:


 Summary: pig.script.submitted.timestamp not always consistent for 
jobs launched in a given script
 Key: PIG-3188
 URL: https://issues.apache.org/jira/browse/PIG-3188
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12


{{pig.script.submitted.timestamp}} is set in {{MapReduceLauncher.launchPig()}} 
when an MR plan is launched. Some scripts (e.g., those with an exec in the 
middle) will cause multiple plans to be launched. In these cases, jobs launched 
from the same script can have different {{pig.script.submitted.timestamp}} 
values, which is a bug.
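The multiple-plan behavior described above can be sketched in Pig Latin; the script, aliases, and paths are hypothetical:

```pig
-- plan 1: everything above the exec is compiled and launched as its own
-- MR plan, stamping its jobs with one pig.script.submitted.timestamp
a = LOAD 'input1' AS (x:int, y:chararray);
STORE a INTO 'out1';

exec; -- forces immediate execution of the plan above

-- plan 2: launched later by a second call to launchPig(), so its jobs
-- may carry a different pig.script.submitted.timestamp than plan 1's
b = LOAD 'out1' AS (x:int, y:chararray);
STORE b INTO 'out2';
```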



[jira] [Commented] (PIG-3174) Remove rpm and deb artifacts from build.xml

2013-02-15 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13579643#comment-13579643
 ] 

Bill Graham commented on PIG-3174:
--

I'll take a look at the patch, but what are your thoughts on documentation or 
more generally, how we should point people to Bigtop for distros?

 Remove rpm and deb artifacts from build.xml
 ---

 Key: PIG-3174
 URL: https://issues.apache.org/jira/browse/PIG-3174
 Project: Pig
  Issue Type: Task
  Components: build
Affects Versions: 0.12
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.12

 Attachments: PIG-3174.2.patch, PIG-3174.patch


 I propose that we remove the targets to build rpms and debs from build.xml 
 and consequently quit publishing them as part of our releases.  Bigtop 
 publishes these packages now.  And building them takes infrastructure that 
 not every committer/PMC member has.



[jira] [Created] (PIG-3189) Remove ivy/pig.pom and improve build mvn targets

2013-02-15 Thread Bill Graham (JIRA)
Bill Graham created PIG-3189:


 Summary: Remove ivy/pig.pom and improve build mvn targets
 Key: PIG-3189
 URL: https://issues.apache.org/jira/browse/PIG-3189
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.12


{{ivy/pig.pom}} in SVN seems to no longer be used.  At build time ({{ant 
set-version}} via {{ant mvn-deploy}}) {{ivy/pig.pom}} is generated from 
{{ivy/pig-template.xml}}. We should remove {{ivy/pig.pom}} from SVN.

It would also be good to decouple building the maven artifacts from publishing 
them, since those two tasks might be done on different hosts.


