[jira] [Created] (PIG-3240) DLS_DEAD_LOCAL_STORE fails to recognize an occurrence

2013-03-07 Thread Jan de Ruiter (JIRA)
Jan de Ruiter created PIG-3240:
--

 Summary: DLS_DEAD_LOCAL_STORE fails to recognize an occurrence
 Key: PIG-3240
 URL: https://issues.apache.org/jira/browse/PIG-3240
 Project: Pig
  Issue Type: Bug
  Components: impl
 Environment: eclipse
Reporter: Jan de Ruiter


A statement like
BigDecimal scoreValue = new BigDecimal(0);
will correctly trigger a DLS_DEAD_LOCAL_STORE message, because scoreValue is 
assigned a new value shortly thereafter without being read in between.
But if the line is changed to
BigDecimal scoreValue = new BigDecimal(0).setScale(8, BigDecimal.ROUND_HALF_UP);
it will incorrectly not trigger the message.
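
The two variants from the report can be placed side by side as a minimal sketch. The class and method names below are hypothetical, and RoundingMode.HALF_UP stands in for the BigDecimal.ROUND_HALF_UP constant used in the report. In both methods the initial value of scoreValue is overwritten before it is read, so both contain the same kind of dead store; the sketch does not try to explain why the detector reports only the first.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DeadStoreDemo {
    // Variant 1: the initial value is overwritten before it is ever read.
    // This is the form the report says is flagged as DLS_DEAD_LOCAL_STORE.
    static BigDecimal plainInit(BigDecimal input) {
        BigDecimal scoreValue = new BigDecimal(0);
        scoreValue = input.setScale(8, RoundingMode.HALF_UP);
        return scoreValue;
    }

    // Variant 2: the same dead store, but the initializer chains a method call.
    // This is the form the report says is not flagged.
    static BigDecimal chainedInit(BigDecimal input) {
        BigDecimal scoreValue = new BigDecimal(0).setScale(8, RoundingMode.HALF_UP);
        scoreValue = input.setScale(8, RoundingMode.HALF_UP);
        return scoreValue;
    }

    public static void main(String[] args) {
        BigDecimal in = new BigDecimal("1.5");
        // Both variants compute the same result; only the unused initializer differs.
        System.out.println(plainInit(in).equals(chainedInit(in)));
    }
}
```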

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (PIG-3241) ConcurrentModificationException in POPartialAgg

2013-03-07 Thread Lohit Vijayarenu (JIRA)
Lohit Vijayarenu created PIG-3241:
-

 Summary: ConcurrentModificationException in POPartialAgg
 Key: PIG-3241
 URL: https://issues.apache.org/jira/browse/PIG-3241
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Lohit Vijayarenu


While running a few Pig scripts against Hadoop 2.0, I consistently see 
ConcurrentModificationException 

{noformat}
at java.util.HashMap$HashIterator.remove(HashMap.java:811)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregate(POPartialAgg.java:365)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregateSecondLevel(POPartialAgg.java:379)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.getNext(POPartialAgg.java:203)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:729)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
{noformat}

It looks like rawInputMap is being modified while elements are being removed 
from it. 
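
The failure mode at the top of the trace (HashMap$HashIterator.remove) can be reproduced in isolation. The following is a minimal sketch using only java.util. It is not the POPartialAgg code, the class and method names are hypothetical, and it does not identify what modifies rawInputMap in Pig. It only shows that Iterator.remove() throws once the map has been structurally changed behind the iterator's back, and that removal purely through the iterator is safe.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class CmeDemo {
    // Returns true if Iterator.remove() throws after the map was changed
    // through the map itself rather than through the iterator.
    static boolean removeAfterOutsideChange() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        it.next();
        map.put("c", 3); // structural modification the iterator does not know about
        try {
            it.remove();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Safe pattern: every removal goes through the iterator that is in use.
    static int drain(Map<String, Integer> map) {
        int sum = 0;
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            sum += it.next().getValue();
            it.remove();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("CME on remove: " + removeAfterOutsideChange());
        Map<String, Integer> m = new HashMap<>();
        m.put("x", 1);
        m.put("y", 2);
        System.out.println("drained sum: " + drain(m) + ", left: " + m.size());
    }
}
```

If the map really is shared between code paths, as the report suspects, the usual options are to route all mutation through one iterator or to iterate over a snapshot of the entries.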



[jira] [Created] (PIG-3242) Standardize where TupleFactory is called from in LoadFuncs

2013-03-07 Thread Jim Plush (JIRA)
Jim Plush created PIG-3242:
--

 Summary: Standardize where TupleFactory is called from in LoadFuncs
 Key: PIG-3242
 URL: https://issues.apache.org/jira/browse/PIG-3242
 Project: Pig
  Issue Type: Improvement
Reporter: Jim Plush
Priority: Minor


While looking through the PiggyBank LoadFunc classes, I noticed that the 
TupleFactory instance is sometimes created in the constructor and sometimes 
in the getNext() override method. 

Is there a particular reason for this, or can I submit a patch to standardize 
on using it in the constructor across the loaders?


CSVLoader.java - constructor method
private TupleFactory mTupleFactory = TupleFactory.getInstance();


RegExLoader.java - getNext method
TupleFactory mTupleFactory = DefaultTupleFactory.getInstance();
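
The difference between the two placements can be sketched without Pig on the classpath. Everything below is hypothetical: StubTupleFactory is a stand-in that only counts getInstance() calls, and the sketch assumes the real getInstance() hands back a shared instance. It illustrates how often the lookup runs in each style, not how the actual loaders behave.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TupleFactoryPlacementDemo {
    // Hypothetical stand-in for Pig's TupleFactory; counts getInstance() calls.
    static class StubTupleFactory {
        private static final StubTupleFactory INSTANCE = new StubTupleFactory();
        static int lookups = 0;

        static StubTupleFactory getInstance() {
            lookups++;
            return INSTANCE;
        }

        List<Object> newTuple(int size) {
            return new ArrayList<>(Collections.<Object>nCopies(size, null));
        }
    }

    // CSVLoader style: the factory is looked up once, when the loader is built.
    static class FieldLoader {
        private final StubTupleFactory mTupleFactory = StubTupleFactory.getInstance();

        List<Object> getNext() {
            return mTupleFactory.newTuple(2);
        }
    }

    // RegExLoader style: the factory is looked up on every getNext() call.
    static class PerCallLoader {
        List<Object> getNext() {
            StubTupleFactory mTupleFactory = StubTupleFactory.getInstance();
            return mTupleFactory.newTuple(2);
        }
    }

    public static void main(String[] args) {
        FieldLoader fieldStyle = new FieldLoader();
        for (int i = 0; i < 3; i++) fieldStyle.getNext();
        System.out.println("lookups after field style: " + StubTupleFactory.lookups);

        PerCallLoader perCallStyle = new PerCallLoader();
        for (int i = 0; i < 3; i++) perCallStyle.getNext();
        System.out.println("lookups after per-call style: " + StubTupleFactory.lookups);
    }
}
```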





[jira] [Commented] (PIG-3214) New/improved mascot

2013-03-07 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596273#comment-13596273
 ] 

Julien Le Dem commented on PIG-3214:


Who let the trolls out?

 New/improved mascot
 ---

 Key: PIG-3214
 URL: https://issues.apache.org/jira/browse/PIG-3214
 Project: Pig
  Issue Type: Wish
  Components: site
Affects Versions: 0.11
Reporter: Andrew Musselman
Priority: Minor
 Fix For: 0.12

 Attachments: apache-pig-yellow-logo.png, newlogo1.png, newlogo2.png, 
 newlogo3.png, newlogo4.png, newlogo5.png, new_logo_7.png, pig_6.JPG, 
 pig-logo-10.png, pig-logo-11.png, pig-logo-12.png, pig-logo-13.png, 
 pig-logo-8a.png, pig-logo-8b.png, pig-logo-9a.png, pig-logo-9b.png, 
 pig_logo_new.png


 Request to change pig mascot to something more graphically appealing.



[jira] [Commented] (PIG-3241) ConcurrentModificationException in POPartialAgg

2013-03-07 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596300#comment-13596300
 ] 

Dmitriy V. Ryaboy commented on PIG-3241:


Lohit, can you post the svn revision you compiled from?

[~rohini] did you guys get Pig running on Hadoop 2.0 in production? Seeing 
anything like this?

 ConcurrentModificationException in POPartialAgg
 ---

 Key: PIG-3241
 URL: https://issues.apache.org/jira/browse/PIG-3241
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Lohit Vijayarenu




[jira] [Issue Comment Deleted] (PIG-3214) New/improved mascot

2013-03-07 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PIG-3214:
---

Comment: was deleted

(was: Who let the trolls out?)

 New/improved mascot
 ---

 Key: PIG-3214
 URL: https://issues.apache.org/jira/browse/PIG-3214
 Project: Pig
  Issue Type: Wish
  Components: site
Affects Versions: 0.11
Reporter: Andrew Musselman
Priority: Minor
 Fix For: 0.12

 Attachments: apache-pig-yellow-logo.png, newlogo1.png, newlogo2.png, 
 newlogo3.png, newlogo4.png, newlogo5.png, new_logo_7.png, pig_6.JPG, 
 pig-logo-10.png, pig-logo-11.png, pig-logo-12.png, pig-logo-13.png, 
 pig-logo-8a.png, pig-logo-8b.png, pig-logo-9a.png, pig-logo-9b.png, 
 pig_logo_new.png


 Request to change pig mascot to something more graphically appealing.



[jira] [Commented] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2

2013-03-07 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596307#comment-13596307
 ] 

Dmitriy V. Ryaboy commented on PIG-3194:


Prashant, were you able to test in environments where 1.4 is and is not 
available?

 Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
 ---

 Key: PIG-3194
 URL: https://issues.apache.org/jira/browse/PIG-3194
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Kai Londenberg
Assignee: Prashant Kommireddi
 Fix For: 0.11.1

 Attachments: PIG-3194.patch


 The changes to ObjectSerializer.java in the following commit
 http://svn.apache.org/viewvc?view=revision&revision=1403934 break 
 compatibility with Hadoop 0.20.2 clusters.
 The reason is that the code uses methods from Apache Commons Codec 1.4 
 that are not available in Apache Commons Codec 1.3, which ships with 
 Hadoop 0.20.2.
 The offending methods are Base64.decodeBase64(String) and 
 Base64.encodeBase64URLSafeString(byte[]).
 If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 
 0.20.2 clusters.
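
One way to see which Commons Codec version a given classpath provides is to probe for the two methods named above by reflection. This is a minimal sketch with a hypothetical helper, not part of the attached patch. It runs without Commons Codec present (both probes then report false) and relies only on the class and method names given in the description.

```java
class CodecApiCheck {
    // Reports whether the named class exposes a public method with this signature.
    // A missing class is treated the same as a missing method.
    static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class.forName(className).getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String base64 = "org.apache.commons.codec.binary.Base64";
        System.out.println("decodeBase64(String): "
                + hasMethod(base64, "decodeBase64", String.class));
        System.out.println("encodeBase64URLSafeString(byte[]): "
                + hasMethod(base64, "encodeBase64URLSafeString", byte[].class));
    }
}
```

Run against the classpath of a Hadoop 0.20.2 task, a probe like this would show whether the 1.3 jar is the one being picked up.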



[jira] [Commented] (PIG-3241) ConcurrentModificationException in POPartialAgg

2013-03-07 Thread Lohit Vijayarenu (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596440#comment-13596440
 ] 

Lohit Vijayarenu commented on PIG-3241:
---

I compiled the 0.11 release against Hadoop 2.0.
I also see ConcurrentModificationExceptions with processedInputMap in the same class:
{noformat}
at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
at java.util.HashMap$EntryIterator.next(HashMap.java:834)
at java.util.HashMap$EntryIterator.next(HashMap.java:832)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.spillResult(POPartialAgg.java:339)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.getNext(POPartialAgg.java:156)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:729)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
{noformat}


 ConcurrentModificationException in POPartialAgg
 ---

 Key: PIG-3241
 URL: https://issues.apache.org/jira/browse/PIG-3241
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Lohit Vijayarenu




Re: Review Request: PIG-3215 [piggybank] Add LTSVLoader to load LTSV files

2013-03-07 Thread Taku Miyakawa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9685/
---

(Updated March 7, 2013, 10:46 p.m.)


Review request for pig.


Changes
---

Removes excess blank lines.


Description
---

This is a review board for  https://issues.apache.org/jira/browse/PIG-3215

The patch adds LTSVLoader function and its test class.


This addresses bug PIG-3215.
https://issues.apache.org/jira/browse/PIG-3215


Diffs (updated)
-

  
contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/LTSVLoader.java
 PRE-CREATION 
  
contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestLTSVLoader.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/9685/diff/


Testing
---

ant compile-test
cd contrib/piggybank/java/
ant -Dtestcase=TestLTSVLoader test


Thanks,

Taku Miyakawa



[jira] [Updated] (PIG-3241) ConcurrentModificationException in POPartialAgg

2013-03-07 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3241:
-

Fix Version/s: 0.12

 ConcurrentModificationException in POPartialAgg
 ---

 Key: PIG-3241
 URL: https://issues.apache.org/jira/browse/PIG-3241
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Lohit Vijayarenu
 Fix For: 0.12, 0.11.1





[jira] [Updated] (PIG-3241) ConcurrentModificationException in POPartialAgg

2013-03-07 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3241:
-

Fix Version/s: 0.11.1

 ConcurrentModificationException in POPartialAgg
 ---

 Key: PIG-3241
 URL: https://issues.apache.org/jira/browse/PIG-3241
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Lohit Vijayarenu
 Fix For: 0.11.1





[jira] [Updated] (PIG-3241) ConcurrentModificationException in POPartialAgg

2013-03-07 Thread Lohit Vijayarenu (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lohit Vijayarenu updated PIG-3241:
--

Priority: Blocker  (was: Major)

 ConcurrentModificationException in POPartialAgg
 ---

 Key: PIG-3241
 URL: https://issues.apache.org/jira/browse/PIG-3241
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Lohit Vijayarenu
Priority: Blocker
 Fix For: 0.12, 0.11.1





Re: Review Request: PIG-3215 [piggybank] Add LTSVLoader to load LTSV files

2013-03-07 Thread Taku Miyakawa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9685/
---

(Updated March 7, 2013, 11:31 p.m.)


Review request for pig.


Changes
---

Outputs skipped labels to the task log at the first occurrence.


Description
---

This is a review board for  https://issues.apache.org/jira/browse/PIG-3215

The patch adds LTSVLoader function and its test class.


This addresses bug PIG-3215.
https://issues.apache.org/jira/browse/PIG-3215


Diffs (updated)
-

  
contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/LTSVLoader.java
 PRE-CREATION 
  
contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestLTSVLoader.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/9685/diff/


Testing
---

ant compile-test
cd contrib/piggybank/java/
ant -Dtestcase=TestLTSVLoader test


Thanks,

Taku Miyakawa



[jira] [Updated] (PIG-3015) Rewrite of AvroStorage

2013-03-07 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3015:
---

Attachment: PIG-3015-11.patch

I got the unit tests working with both hadoop 20 and 23. In fact, the problem 
was very simple. The option parser code was not correct (i.e. 
CommandLine.getOptionValue() takes the option name, not the long option name), 
and thus the namespace of output Avro files was not set properly.

What surprised me was that this issue only showed up with hadoop23. 
We need better test coverage.

PIG-3015-11.patch includes the following changes:
* Fixed the option parser code.
* Removed commented-out code.
* Added with_dates.pig.

 Rewrite of AvroStorage
 --

 Key: PIG-3015
 URL: https://issues.apache.org/jira/browse/PIG-3015
 Project: Pig
  Issue Type: Improvement
  Components: piggybank
Reporter: Joseph Adler
Assignee: Joseph Adler
 Attachments: bad.avro, good.avro, PIG-3015-10.patch, 
 PIG-3015-11.patch, PIG-3015-2.patch, PIG-3015-3.patch, PIG-3015-4.patch, 
 PIG-3015-5.patch, PIG-3015-6.patch, PIG-3015-7.patch, PIG-3015-9.patch, 
 PIG-3015-doc.patch, TestInput.java, Test.java, with_dates.pig


 The current AvroStorage implementation has a lot of issues: it requires old 
 versions of Avro, it copies data much more than needed, and it's verbose and 
 complicated. (One pet peeve of mine is that old versions of Avro don't 
 support Snappy compression.)
 I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
 new implementation is significantly faster, and the code is a lot simpler. 
 Rewriting AvroStorage also enabled me to implement support for Trevni (as 
 TrevniStorage).
 I'm opening this ticket to facilitate discussion while I figure out the best 
 way to contribute the changes back to Apache.



[jira] [Commented] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2

2013-03-07 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596588#comment-13596588
 ] 

Prashant Kommireddi commented on PIG-3194:
--

Dmitriy, yes. Here is what I have done after the fix:

*1. compile pig with 1.4*
1a. start up pig with hadoop 0.20.2 - runs [expected]
1b. start up pig with hadoop 1 - runs [expected]
1c. test-commit and a few other tests mentioned in my previous comment all pass.
1d. run sample scripts against 1a and 1b - works well.


*2. compile pig with 1.3*
2a. start up pig with hadoop 0.20.2 - runs [expected]
2b. start up pig with hadoop-1 - runs [expected]
2c. test-commit and other tests pass but TestMRCompiler fails - *[not expected]*
2d. run sample scripts against 2a and 2b - works well


The issue with 2c (TestMRCompiler.testMergeJoin failing against 1.3) is that 
the gold file being used contains a serialized string that was generated using 
1.4. There seem to be a few differences between 1.3 and 1.4 in encode/decode 
behavior - please see https://issues.apache.org/jira/browse/CODEC-89 and 
https://issues.apache.org/jira/browse/CODEC-91

Considering that our tests run against hadoop-1 and all pass, I am not sure 
whether we should spend time making the test case codec 1.3 aware. 


 Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
 ---

 Key: PIG-3194
 URL: https://issues.apache.org/jira/browse/PIG-3194
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Kai Londenberg
Assignee: Prashant Kommireddi
 Fix For: 0.11.1

 Attachments: PIG-3194.patch





[jira] Subscription: PIG patch available

2013-03-07 Thread jira
Issue Subscription
Filter: PIG patch available (33 issues)

Subscriber: pigdaily

Key       Summary
PIG-3238  Pig current releases lack a UDF Stuff(). This UDF deletes a
          specified length of characters and inserts another set of
          characters at a specified starting point.
          https://issues.apache.org/jira/browse/PIG-3238
PIG-3237  Pig current releases lack a UDF MakeSet(). This UDF returns a set
          value (a string containing substrings separated by , characters)
          consisting of the strings that have the corresponding bit in the
          first argument
          https://issues.apache.org/jira/browse/PIG-3237
PIG-3235  Enable DEBUG log messages in unit tests by default
          https://issues.apache.org/jira/browse/PIG-3235
PIG-3233  Deploy a Piggybank Jar
          https://issues.apache.org/jira/browse/PIG-3233
PIG-3215  [piggybank] Add LTSVLoader to load LTSV (Labeled Tab-separated
          Values) files
          https://issues.apache.org/jira/browse/PIG-3215
PIG-3210  Pig fails to start when it cannot write log to log files
          https://issues.apache.org/jira/browse/PIG-3210
PIG-3208  [zebra] TFile should not set io.compression.codec.lzo.buffersize
          https://issues.apache.org/jira/browse/PIG-3208
PIG-3205  Passing arguments to python script does not work with -f option
          https://issues.apache.org/jira/browse/PIG-3205
PIG-3198  Let users use any function from PigType - PigType as if it were
          builtlin
          https://issues.apache.org/jira/browse/PIG-3198
PIG-3194  Changes to ObjectSerializer.java break compatibility with Hadoop
          0.20.2
          https://issues.apache.org/jira/browse/PIG-3194
PIG-3183  rm or rmf commands should respect globbing/regex of path
          https://issues.apache.org/jira/browse/PIG-3183
PIG-3172  Partition filter push down does not happen when there is a non
          partition key map column filter
          https://issues.apache.org/jira/browse/PIG-3172
PIG-3166  Update eclipse .classpath according to ivy library.properties
          https://issues.apache.org/jira/browse/PIG-3166
PIG-3164  Pig current releases lack a UDF endsWith.This UDF tests if a given
          string ends with the specified suffix.
          https://issues.apache.org/jira/browse/PIG-3164
PIG-3141  Giving CSVExcelStorage an option to handle header rows
          https://issues.apache.org/jira/browse/PIG-3141
PIG-3123  Simplify Logical Plans By Removing Unneccessary Identity Projections
          https://issues.apache.org/jira/browse/PIG-3123
PIG-3122  Operators should not implicitly become reserved keywords
          https://issues.apache.org/jira/browse/PIG-3122
PIG-3114  Duplicated macro name error when using pigunit
          https://issues.apache.org/jira/browse/PIG-3114
PIG-3105  Fix TestJobSubmission unit test failure.
          https://issues.apache.org/jira/browse/PIG-3105
PIG-3088  Add a builtin udf which removes prefixes
          https://issues.apache.org/jira/browse/PIG-3088
PIG-3069  Native Windows Compatibility for Pig E2E Tests and Harness
          https://issues.apache.org/jira/browse/PIG-3069
PIG-3028  testGrunt dev test needs some command filters to run correctly
          without cygwin
          https://issues.apache.org/jira/browse/PIG-3028
PIG-3027  pigTest unit test needs a newline filter for comparisons of golden
          multi-line
          https://issues.apache.org/jira/browse/PIG-3027
PIG-3026  Pig checked-in baseline comparisons need a pre-filter to address
          OS-specific newline differences
          https://issues.apache.org/jira/browse/PIG-3026
PIG-3024  TestEmptyInputDir unit test - hadoop version detection logic is
          brittle
          https://issues.apache.org/jira/browse/PIG-3024
PIG-3015  Rewrite of AvroStorage
          https://issues.apache.org/jira/browse/PIG-3015
PIG-3010  Allow UDF's to flatten themselves
          https://issues.apache.org/jira/browse/PIG-3010
PIG-2959  Add a pig.cmd for Pig to run under Windows
          https://issues.apache.org/jira/browse/PIG-2959
PIG-2955  Fix bunch of Pig e2e tests on Windows
          https://issues.apache.org/jira/browse/PIG-2955
PIG-2643  Use bytecode generation to make a performance replacement for
          InvokeForLong, InvokeForString, etc
          https://issues.apache.org/jira/browse/PIG-2643
PIG-2641  Create toJSON function for all complex types: tuples, bags and maps
          https://issues.apache.org/jira/browse/PIG-2641
PIG-2591  Unit tests should not write to /tmp but respect java.io.tmpdir
          https://issues.apache.org/jira/browse/PIG-2591
PIG-1914  Support load/store JSON data in Pig
          https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


Possible issue when using non-bundled hadoop

2013-03-07 Thread Prashant Kommireddi
A Pig user brought to my attention that PIG_OPTS is not being picked up,
although pig -Dfoo=bar script.pig works. I dug into it and found that when
an external hadoop (not bundled with Pig) is used, we rely on the bin/hadoop
script to start Pig. In the process we pass PIG_OPTS to the hadoop script
as HADOOP_OPTS.

bin/hadoop in turn picks up HADOOP_OPTS from
HADOOP_HOME/conf/hadoop-env.sh. If that file sets HADOOP_OPTS outright
instead of extending it, the value passed in from PIG_OPTS is lost. For example:

# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS=-server

This has been noticeable since Pig 0.9.1:
https://issues.apache.org/jira/browse/PIG-2239

Users need to be aware of this and make sure they do not reset HADOOP_OPTS.
Instead, this will work:
# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS="-server $HADOOP_OPTS"
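
A minimal sketch of the difference, assuming the launcher simply hands
PIG_OPTS to the hadoop script as HADOOP_OPTS before hadoop-env.sh is sourced.
The names env_resets, env_appends and launch_with are hypothetical stand-ins
for illustration, not part of bin/pig or bin/hadoop:

```shell
#!/bin/sh
# Hypothetical stand-ins for the real scripts; they only model the
# order in which HADOOP_OPTS is assigned.

# Models a hadoop-env.sh that overwrites HADOOP_OPTS.
env_resets() {
    HADOOP_OPTS="-server"
}

# Models a hadoop-env.sh that keeps whatever the caller already set.
env_appends() {
    HADOOP_OPTS="-server $HADOOP_OPTS"
}

# Models the launcher: pass PIG_OPTS along as HADOOP_OPTS, then apply
# the given env function and print the options the JVM would receive.
launch_with() {
    HADOOP_OPTS="$PIG_OPTS"
    "$1"
    printf '%s\n' "$HADOOP_OPTS"
}

PIG_OPTS="-Dfoo=bar"
reset_result=$(launch_with env_resets)
append_result=$(launch_with env_appends)

echo "reset:  $reset_result"
echo "append: $append_result"
```

In the reset case only -server survives; in the append case -Dfoo=bar is
still present alongside -server. Note the quotes around the value: without
them the shell would treat the expanded options as further arguments to
export rather than as part of the value.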

Thought it would be useful info if this ever comes up.

-Prashant


[jira] [Updated] (PIG-3015) Rewrite of AvroStorage

2013-03-07 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3015:
---

Attachment: PIG-3015-doc-2.patch

PIG-3015-doc-2.patch fixes ant docs errors.

 Rewrite of AvroStorage
 --

 Key: PIG-3015
 URL: https://issues.apache.org/jira/browse/PIG-3015
 Project: Pig
  Issue Type: Improvement
  Components: piggybank
Reporter: Joseph Adler
Assignee: Joseph Adler
 Attachments: bad.avro, good.avro, PIG-3015-10.patch, 
 PIG-3015-11.patch, PIG-3015-2.patch, PIG-3015-3.patch, PIG-3015-4.patch, 
 PIG-3015-5.patch, PIG-3015-6.patch, PIG-3015-7.patch, PIG-3015-9.patch, 
 PIG-3015-doc-2.patch, PIG-3015-doc.patch, TestInput.java, Test.java, 
 with_dates.pig


 The current AvroStorage implementation has a lot of issues: it requires old 
 versions of Avro, it copies data much more than needed, and it's verbose and 
 complicated. (One pet peeve of mine is that old versions of Avro don't 
 support Snappy compression.)
 I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
 new implementation is significantly faster, and the code is a lot simpler. 
 Rewriting AvroStorage also enabled me to implement support for Trevni (as 
 TrevniStorage).
 I'm opening this ticket to facilitate discussion while I figure out the best 
 way to contribute the changes back to Apache.



[jira] [Updated] (PIG-3214) New/improved mascot

2013-03-07 Thread Justin Dorfman (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Dorfman updated PIG-3214:


Attachment: apache-pig-14.png

It will be colored in by Monday.  Please give feedback.

 New/improved mascot
 ---

 Key: PIG-3214
 URL: https://issues.apache.org/jira/browse/PIG-3214
 Project: Pig
  Issue Type: Wish
  Components: site
Affects Versions: 0.11
Reporter: Andrew Musselman
Priority: Minor
 Fix For: 0.12

 Attachments: apache-pig-14.png, apache-pig-yellow-logo.png, 
 newlogo1.png, newlogo2.png, newlogo3.png, newlogo4.png, newlogo5.png, 
 new_logo_7.png, pig_6.JPG, pig-logo-10.png, pig-logo-11.png, pig-logo-12.png, 
 pig-logo-13.png, pig-logo-8a.png, pig-logo-8b.png, pig-logo-9a.png, 
 pig-logo-9b.png, pig_logo_new.png


 Request to change pig mascot to something more graphically appealing.



[jira] [Commented] (PIG-3214) New/improved mascot

2013-03-07 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596896#comment-13596896
 ] 

Dmitriy V. Ryaboy commented on PIG-3214:


Justin, I like it from the artistic point of view, but I don't think it's much 
of a departure from what we currently have -- more of a cartoon than a logo. I 
would have to echo Alan's thoughts on this.

You are clearly far more artistically skilled than the rest of us lot though!
Think you can take a pass at formalizing the sketch Julien posted? This seemed 
to resonate with people the most.



[jira] [Commented] (PIG-3214) New/improved mascot

2013-03-07 Thread Justin Dorfman (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596905#comment-13596905
 ] 

Justin Dorfman commented on PIG-3214:
-

[~dvryaboy] Totally get what you are saying.  First I would like to say I am 
art directing this.  The company I work for has hired an illustrator to donate
the logo to the ASF.  Any feedback will go back to him so please use this as an 
opportunity to get exactly what you want, professionally, for free. =)

I did a rough fill in Photoshop to show how it could look on the site:
http://note.io/XuqI3U

Again please use us to get exactly what you want.  I will try my best to 
deliver.
