date:20091008


 [ 
https://issues.apache.org/jira/browse/PIG-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-989:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Patch committed

 Allow type merge between numerical type and non-numerical type
 --

 Key: PIG-989
 URL: https://issues.apache.org/jira/browse/PIG-989
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.5.0
Reporter: Daniel Dai
 Attachments: PIG-989-1.patch, PIG-989-2.patch


 Currently, we do not allow a type merge between a numerical type and a
 non-numerical type, and the resulting error message is confusing.
 For example, if you run:
 a = load '1.txt' as (a0:chararray, a1:chararray);
 b = load '2.txt' as (b0:long, b1:chararray);
 c = join a by a0, b by b0;
 dump c;
 the error message is "ERROR 1051: Cannot cast to Unknown".
 We should either:
 1. Allow the type merge between a numerical type and a non-numerical type, or
 2. At least provide a more meaningful error message to the user.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (PIG-948) [Usability] Relating pig script with MR jobs


 [ 
https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-948:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed.

 [Usability] Relating pig script with MR jobs
 

 Key: PIG-948
 URL: https://issues.apache.org/jira/browse/PIG-948
 Project: Pig
  Issue Type: Improvement
  Components: impl
Affects Versions: 0.4.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Minor
 Fix For: 0.6.0

 Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, 
 pig-948.patch


 Currently it's hard to relate a Pig script to specific MR jobs.
 In a loaded cluster with multiple simultaneous job submissions, it's not easy
 to figure out which MR jobs were launched for a given Pig script. If
 Pig can provide this information, it will be useful for debugging and monitoring
 the jobs resulting from a Pig script.
 At the very least, Pig should be able to provide the user with the following
 information:
 1) Job ID of the launched job.
 2) Complete web URL of the jobtracker running this job.


[jira] Commented: (PIG-976) Multi-query optimization throws ClassCastException


[ 
https://issues.apache.org/jira/browse/PIG-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763645#action_12763645
 ] 

Pradeep Kamath commented on PIG-976:


Reviewed the new patch - one comment is on POMultiQueryPackage:
{code}
Object obj = tuple.get(0);
if (obj instanceof PigNullableWritable) {
    ((PigNullableWritable)obj).setIndex(origIndex);
}
else {
    PigNullableWritable myObj =
        HDataType.getWritableComparableTypes(obj, (byte)0);
    myObj.setIndex(origIndex);
    tuple.set(0, myObj);
}
{code}

If obj is null, the code in the else branch above would throw an exception. I
think the code should check for obj == null and, if so, create a NullWritable
object, where NullWritable is a subclass of PigNullableWritable representing a
null. Since only the getValueAsPigType() method is used in PODemux, that would
always return null for this use case.
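
The null check suggested in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch: SketchWritable and SketchNullWritable are hypothetical stand-ins for PigNullableWritable and the proposed null-representing subclass, the tuple is modelled as a plain List, and the HDataType conversion is replaced by a simple wrapper constructor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PigNullableWritable: a value plus an index byte.
class SketchWritable {
    private final Object value;
    private byte index;

    SketchWritable(Object value) { this.value = value; }

    void setIndex(byte index) { this.index = index; }
    byte getIndex() { return index; }

    // Plays the role of getValueAsPigType(): returns the wrapped value.
    Object getValueAsPigType() { return value; }
}

// Hypothetical stand-in for the proposed subclass that represents a null key.
class SketchNullWritable extends SketchWritable {
    SketchNullWritable() { super(null); }
}

final class NullSafeKeyWrapSketch {
    // Ensures field 0 of the tuple (modelled as a List) is a wrapper carrying origIndex.
    static void wrapKey(List<Object> tuple, byte origIndex) {
        Object obj = tuple.get(0);
        SketchWritable wrapped;
        if (obj instanceof SketchWritable) {
            wrapped = (SketchWritable) obj;      // already wrapped: reuse it
        } else if (obj == null) {
            wrapped = new SketchNullWritable();  // explicit representation of a null key
        } else {
            wrapped = new SketchWritable(obj);   // stand-in for the HDataType conversion
        }
        wrapped.setIndex(origIndex);
        tuple.set(0, wrapped);
    }

    public static void main(String[] args) {
        List<Object> tuple = new ArrayList<>();
        tuple.add(null);
        wrapKey(tuple, (byte) 2);
        SketchWritable key = (SketchWritable) tuple.get(0);
        System.out.println(key.getValueAsPigType() + " @ index " + key.getIndex());
    }
}
```

The point of the extra branch is only that a null key never reaches the conversion call; how the real null wrapper is constructed is left to the patch.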

 Multi-query optimization throws ClassCastException
 --

 Key: PIG-976
 URL: https://issues.apache.org/jira/browse/PIG-976
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Ankur
Assignee: Richard Ding
 Attachments: PIG-976.patch, PIG-976.patch


 Multi-query optimization fails to merge 2 branches when one is the result of
 Group By ALL and the other is the result of Group By field1, where field1 is of
 type long. Here is the script that fails with multi-query on:
 data = LOAD 'test' USING PigStorage('\t') AS (a:long, b:double, c:double); 
 A = GROUP data ALL;
 B = FOREACH A GENERATE SUM(data.b) AS sum1, SUM(data.c) AS sum2;
 C = FOREACH B GENERATE (sum1/sum2) AS rate; 
 STORE C INTO 'result1';
 D = GROUP data BY a; 
 E = FOREACH D GENERATE group AS a, SUM(data.b), SUM(data.c);
 STORE E into 'result2';
  
 Here is the exception from the logs
 java.lang.ClassCastException: org.apache.pig.data.DefaultTuple cannot be cast to org.apache.pig.data.DataBag
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:399)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:180)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:145)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:197)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:235)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
   at

[jira] Updated: (PIG-922) Logical optimizer: push up project


 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Attachment: PIG-922-p3_9.patch

Fix the unit test

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
 PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
 PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
 PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, 
 PIG-922-p3_8.patch, PIG-922-p3_9.patch


 This is a continuation of
 [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add
 another rule to the logical optimizer: push up project, i.e., prune columns as
 early as possible.


[jira] Updated: (PIG-922) Logical optimizer: push up project


 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Status: Open  (was: Patch Available)

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
 PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
 PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
 PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, 
 PIG-922-p3_8.patch, PIG-922-p3_9.patch



[jira] Updated: (PIG-922) Logical optimizer: push up project


 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Status: Patch Available  (was: Open)

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
 PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
 PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
 PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, 
 PIG-922-p3_8.patch, PIG-922-p3_9.patch



Pig 0.4.0 is released!

2009-10-08 Thread Olga Natkovich

The Pig team is happy to announce the Pig 0.4.0 release!

Pig is a Hadoop subproject that provides a high-level data-flow language
and an execution framework for parallel computation on a Hadoop cluster.
More details about Pig can be found at http://hadoop.apache.org/pig/.

This release introduces two new types of join. The skewed join improves
join performance for data with a large skew in the join key. The merge
join improves performance when both inputs are sorted on
the join key. The release also includes support for outer joins. The
details of the release can be found at
http://hadoop.apache.org/pig/releases.html

The publishing of this release has been delayed due to problems with
Apache infrastructure that prevented us from publishing the updated
site.

 

Olga

[jira] Commented: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections


[ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763694#action_12763694
 ] 

Pradeep Kamath commented on PIG-995:


+1

 Limit Optimizer throw exception ERROR 2156: Error while fixing projections
 

 Key: PIG-995
 URL: https://issues.apache.org/jira/browse/PIG-995
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-995-1.patch


 The following script fails:
 A = load '1.txt' AS (a0, a1, a2);
 B = order A by a1;
 C = limit B 10;
 D = foreach C generate $0;
 dump D;
 Error log:
 Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2156: Error while fixing projections. Projection map of node to be replaced is null.
 at org.apache.pig.impl.logicalLayer.ProjectFixerUpper.visit(ProjectFixerUpper.java:138)
 at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:408)
 at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:58)
 at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:65)
 at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
 at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
 at org.apache.pig.impl.logicalLayer.LOForEach.rewire(LOForEach.java:761)


[jira] Commented: (PIG-922) Logical optimizer: push up project


[ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763705#action_12763705
 ] 

Pradeep Kamath commented on PIG-922:


Reviewed changes per my last review comments - looks good - +1

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
 PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
 PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
 PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, 
 PIG-922-p3_8.patch, PIG-922-p3_9.patch



[jira] Updated: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections


 [ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-995:
---

Attachment: PIG-995-2.patch

After discussion with Santhosh, I have a better patch. The problem is that we do
not generate the projection map before applying optimization rules. If an
optimization rule changes the structure of the logical plan and the projection
map is generated only afterwards, we end up using the wrong projection map. In
the new patch, we regenerate the projection map before applying each optimization rule.
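
The ordering described in this comment can be sketched as a simple optimizer loop. This is an illustration under stated assumptions, not the code in PIG-995-2.patch: SketchPlan, SketchRule and OptimizerLoopSketch are hypothetical names, and the projection map is reduced to a snapshot of the operator list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a logical plan: operator names plus a derived
// "projection map" that has to be rebuilt whenever the structure changes.
class SketchPlan {
    final List<String> operators = new ArrayList<>();
    List<String> projectionMap;   // null until generated
    int regenerations = 0;

    void regenerateProjectionMap() {
        projectionMap = new ArrayList<>(operators);  // snapshot of the current structure
        regenerations++;
    }
}

// Hypothetical rule interface; real optimizer rules have a richer API.
interface SketchRule {
    void apply(SketchPlan plan);
}

final class OptimizerLoopSketch {
    // Rebuilds the projection map before every rule, so each rule sees a map
    // that matches the plan as the previous rule left it.
    static void optimize(SketchPlan plan, List<SketchRule> rules) {
        for (SketchRule rule : rules) {
            plan.regenerateProjectionMap();
            rule.apply(plan);
        }
    }

    public static void main(String[] args) {
        SketchPlan plan = new SketchPlan();
        plan.operators.add("load");
        plan.operators.add("order");
        plan.operators.add("limit");
        List<SketchRule> rules = new ArrayList<>();
        rules.add(p -> p.operators.remove("order"));  // a rule that restructures the plan
        rules.add(p -> System.out.println("map seen by second rule: " + p.projectionMap));
        optimize(plan, rules);
    }
}
```

Regenerating once per rule costs extra work, but it means no rule can observe a map computed for an earlier shape of the plan.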

 Limit Optimizer throw exception ERROR 2156: Error while fixing projections
 

 Key: PIG-995
 URL: https://issues.apache.org/jira/browse/PIG-995
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-995-1.patch, PIG-995-2.patch



[jira] Updated: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections


 [ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-995:
---

Status: Patch Available  (was: Open)

 Limit Optimizer throw exception ERROR 2156: Error while fixing projections
 

 Key: PIG-995
 URL: https://issues.apache.org/jira/browse/PIG-995
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-995-1.patch, PIG-995-2.patch



[jira] Updated: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections


 [ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-995:
---

Status: Open  (was: Patch Available)

 Limit Optimizer throw exception ERROR 2156: Error while fixing projections
 

 Key: PIG-995
 URL: https://issues.apache.org/jira/browse/PIG-995
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-995-1.patch, PIG-995-2.patch



[jira] Commented: (PIG-922) Logical optimizer: push up project

[
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763728#action_12763728
]

Hadoop QA commented on PIG-922:
---

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12421651/PIG-922-p3_9.patch
against trunk revision 823257.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 30 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

-1 release audit. The applied patch generated 287 release audit warnings
(more than the trunk's current 280 warnings).

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/15/testReport/
Release audit warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/15/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/15/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/15/console

This message is automatically generated.

Logical optimizer: push up project
--

Key: PIG-922
URL: https://issues.apache.org/jira/browse/PIG-922
Project: Pig
Issue Type: New Feature
Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.6.0

Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch,
PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch,
PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch,
PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch,
PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch,
PIG-922-p3_8.patch, PIG-922-p3_9.patch


[jira] Created: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning

2009-10-08 Thread Ying He (JIRA)

InternalCachedBag.java generates javac warning and findbug warning
--

 Key: PIG-1000
 URL: https://issues.apache.org/jira/browse/PIG-1000
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Ying He
Assignee: Ying He
 Fix For: 0.6.0


POPackage uses DefaultDataBag during the reduce process to hold data. It is
registered with SpillableMemoryManager and is prone to OutOfMemoryException. It is
better to proactively manage memory usage: the bag fills
memory up to a specified amount and dumps the rest to disk. The amount of memory
used to hold tuples is configurable. This can avoid out-of-memory errors.


[jira] Updated: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning

2009-10-08 Thread Ying He (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying He updated PIG-1000:
-

Attachment: PIG-1000.patch

fix javac warning and findbug warning

 InternalCachedBag.java generates javac warning and findbug warning
 --

 Key: PIG-1000
 URL: https://issues.apache.org/jira/browse/PIG-1000
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Ying He
Assignee: Ying He
 Fix For: 0.6.0

 Attachments: PIG-1000.patch



[jira] Updated: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning

2009-10-08 Thread Ying He (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying He updated PIG-1000:
-

Description: The patch submitted for PIG-975 generates a javac warning and a Findbugs
warning  (was: POPackage uses DefaultDataBag during the reduce process to hold
data. It is registered with SpillableMemoryManager and is prone to
OutOfMemoryException. It is better to proactively manage memory usage:
the bag fills memory up to a specified amount and dumps the rest to
disk. The amount of memory used to hold tuples is configurable. This can avoid
out-of-memory errors.)
 Patch Info: [Patch Available]

 InternalCachedBag.java generates javac warning and findbug warning
 --

 Key: PIG-1000
 URL: https://issues.apache.org/jira/browse/PIG-1000
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Ying He
Assignee: Ying He
 Fix For: 0.6.0

 Attachments: PIG-1000.patch


 The patch submitted for PIG-975 generates a javac warning and a Findbugs warning.


[jira] Assigned: (PIG-894) order-by fails when input is empty


 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned PIG-894:
--

Assignee: Daniel Dai

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Attachments: PIG-894-1.patch


 grunt> l = load 'students.txt' ;
 grunt> f = filter l by 1 == 2;
 grunt> o = order f by $0 ;
 grunt> dump o;
 This results in 3 MR jobs. The 2nd (sampling) MR job creates an empty sample file,
 and the 3rd MR job (order-by) fails with the following error in the map task:
 java.lang.RuntimeException: java.lang.RuntimeException: Empty samples file
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:104)
   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:348)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
 Caused by: java.lang.RuntimeException: Empty samples file
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:89)
   ... 5 more
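
One possible way for a range partitioner to tolerate an empty sample, sketched under assumptions: RangePartitionSketch is a hypothetical stand-in, not the code of WeightedRangePartitioner or of the attached patch, and keys are simplified to long values. With no sampled split points it routes every key to a single partition instead of throwing.

```java
import java.util.Arrays;

final class RangePartitionSketch {
    // Sorted split points derived from the sample; empty when nothing was sampled.
    private final long[] boundaries;

    RangePartitionSketch(long[] sampledBoundaries) {
        // Treat a missing or empty sample as "no split points" rather than an error.
        this.boundaries = sampledBoundaries == null ? new long[0] : sampledBoundaries.clone();
        Arrays.sort(this.boundaries);
    }

    int getPartition(long key, int numPartitions) {
        if (boundaries.length == 0) {
            return 0;  // empty input upstream: everything goes to one partition
        }
        int pos = Arrays.binarySearch(boundaries, key);
        int bucket = pos >= 0 ? pos : -(pos + 1);  // insertion point when the key is absent
        return Math.min(bucket, numPartitions - 1);
    }

    public static void main(String[] args) {
        RangePartitionSketch empty = new RangePartitionSketch(new long[0]);
        System.out.println(empty.getPartition(123L, 4));
    }
}
```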


[jira] Updated: (PIG-894) order-by fails when input is empty


 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Attachment: PIG-894-1.patch

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
 Attachments: PIG-894-1.patch



[jira] Updated: (PIG-894) order-by fails when input is empty


 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Status: Patch Available  (was: Open)

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Attachments: PIG-894-1.patch



[jira] Created: (PIG-1001) Generate more meaningful error message when one input file does not exist

Generate more meaningful error message when one input file does not exist
-

 Key: PIG-1001
 URL: https://issues.apache.org/jira/browse/PIG-1001
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
 Fix For: 0.6.0


In the following query, if 2.txt does not exist, 

a = load '1.txt';
b = order a by $0;
c = load '2.txt';
d = order c by $0;
e = join b by $0, d by $0;
dump e;

Pig throws the error message "ERROR 2100: file:/tmp/temp155054664/tmp1144108421
does not exist." Pig should instead report a message such as "Input file
2.txt does not exist" rather than this confusing one.
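
A minimal sketch of the kind of up-front check this asks for, assuming the inputs live on the local file system (a real implementation would presumably go through Hadoop's file system API instead). InputCheckSketch and missingInputErrors are hypothetical names, not Pig APIs.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class InputCheckSketch {
    // Returns one message per missing input, naming the path exactly as the user wrote it.
    static List<String> missingInputErrors(List<String> inputPaths) {
        List<String> errors = new ArrayList<>();
        for (String input : inputPaths) {
            Path path = Paths.get(input);
            if (!Files.exists(path)) {
                errors.add("Input file " + input + " does not exist");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Check every load path before any job is launched.
        for (String error : missingInputErrors(Arrays.asList("1.txt", "2.txt"))) {
            System.err.println(error);
        }
    }
}
```

Checking the original load paths before planning means the user sees the name they typed, not the path of an intermediate temporary file.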


Hudson build is back to normal: Pig-trunk #581

2009-10-08 Thread Apache Hudson Server

See http://hudson.zones.apache.org/hudson/job/Pig-trunk/581/changes

[jira] Updated: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-08 Thread Yan Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-987:
-

Attachment: ColumnGroupSecurity.patch

 [zebra] Zebra Column Group Access Control
 -

 Key: PIG-987
 URL: https://issues.apache.org/jira/browse/PIG-987
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
 ColumnGroupSecurity.patch, TEST-org.apache.hadoop.zebra.io.TestCheckin.txt, 
 TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch


 Access Control: when processes try to read from the column groups, Zebra
 should be able to handle allowed vs. disallowed user/application accesses.
 Security is ultimately granted by the corresponding HDFS security of the
 stored data.
 Expected behavior when column group permissions are set:
 When a user selects only columns that they do not have permission to
 access, Zebra should return an error with the message "Error #: Permission denied
 for accessing column <column name or names>".
 Access control applies to an entire column group, so all columns in a column
 group have the same permissions.


[jira] Commented: (PIG-894) order-by fails when input is empty


[ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763776#action_12763776
 ] 

Pradeep Kamath commented on PIG-894:


The patch uses the pig.inputs property from the jobconf, which does not
directly hold the input file name. It actually holds a serialized
ArrayList<Pair<FileSpec, Boolean>> in string form, containing the filespec and
the issplittable flag for each input of the job. This serialized string will
need to be deserialized using ObjectSerializer.deserialize, and the file name
will then need to be retrieved from the filespec.
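
The two steps in this comment (deserialize the property, then pull the file name out of each filespec) can be sketched as below. This is a Python analogy only: Pig's real code path is Java and uses ObjectSerializer.deserialize as named above. The base64-over-JSON encoding, the dictionary standing in for a FileSpec, and the helper names are all assumptions made for illustration and do not reflect Pig's actual serialized format.

```python
import base64
import json


def serialize_inputs(inputs):
    """Encode [(filespec, splittable), ...] as one opaque string.

    Stand-in for whatever wrote the property; each filespec is modelled
    as a dict with a 'file_name' key (hypothetical layout).
    """
    raw = json.dumps(inputs).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def input_file_names(job_conf):
    """Recover the input file names from the serialized 'pig.inputs' entry."""
    # Step 1: the property value is not a file name, so decode it first.
    pairs = json.loads(base64.b64decode(job_conf["pig.inputs"]))
    # Step 2: each entry is (filespec, splittable); keep only the name.
    return [filespec["file_name"] for filespec, _splittable in pairs]
```

The point of the sketch is that reading the property directly yields an opaque string, so any code that treats it as a path will not see the real input file name.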

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Attachments: PIG-894-1.patch


 grunt> l = load 'students.txt' ;
 grunt> f = filter l by 1 == 2;
 grunt> o = order f by $0 ;
 grunt> dump o;
 This results in 3 MR jobs. The 2nd (sampling) MR job creates an empty sample 
 file, and the 3rd MR job (order-by) fails with the following error in the map job:
 java.lang.RuntimeException: java.lang.RuntimeException: Empty samples file
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:104)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:348)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
 Caused by: java.lang.RuntimeException: Empty samples file
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:89)
    ... 5 more
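
To make the failure mode concrete: an order-by done as a range partition needs split points taken from a sample of the keys, and with an empty input there is nothing to sample. The Python sketch below shows one possible way a partitioner could tolerate an empty sample, by sending every key to partition 0. It is an illustration under that assumption, not the approach taken in the attached patches, and `RangePartitioner` and its methods are hypothetical names rather than Pig classes.

```python
import bisect


class RangePartitioner:
    """Toy range partitioner driven by a sample of keys."""

    def __init__(self, sample, num_partitions):
        self.num_partitions = num_partitions
        ordered = sorted(sample)
        if not ordered or num_partitions <= 1:
            # Empty sample (for example, empty input): no split points,
            # so every key falls into partition 0 instead of failing.
            self.boundaries = []
        else:
            # Pick num_partitions - 1 evenly spaced split points.
            step = len(ordered) / num_partitions
            self.boundaries = [ordered[int(i * step)]
                               for i in range(1, num_partitions)]

    def partition(self, key):
        """Return the index of the partition that should receive key."""
        return bisect.bisect_right(self.boundaries, key)
```

With no boundaries the partitioner degenerates to a single partition, which is harmless when there are no records to route anyway.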


[jira] Updated: (PIG-986) [zebra] Zebra Column Group Naming Support

2009-10-08 Thread Yan Zhou (JIRA)

[
https://issues.apache.org/jira/browse/PIG-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Yan Zhou updated PIG-986:
-

Attachment: ColumnGroupName.patch

Removed the hard-coded group in a few test cases.

[zebra] Zebra Column Group Naming Support
-

Key: PIG-986
URL: https://issues.apache.org/jira/browse/PIG-986
Project: Pig
Issue Type: New Feature
Components: impl
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
Fix For: 0.6.0

Attachments: ColumnGroupName.patch, ColumnGroupName.patch,
ColumnGroupName.patch

We introduce column group names to Zebra and make them first-class citizens in
Zebra. This can ease management of column groups.
We plan to introduce an AS clause for column group names in Zebra's syntax.
Functional Specifications:
1) Column group names are optional. For column groups that do not have a
user-provided name, Zebra will internally assign default column group names
that are unique within that table: CG0, CG1, CG2, ... Note: if CGx is used by
the user, then it cannot be used as an internal name.
2) We introduce an AS clause in Zebra's syntax for column group names. If
it occurs, it has to immediately follow [ ]. For example, [a1, a2] as PI
secure by user:joe group:secure perm:640; [a3, a4] as General compress by
lzo. Note that the keyword AS is case insensitive.
3) Column group names are unique within one table and are case sensitive,
i.e., c1 and C1 are different.
4) Column group names will be used as the physical column group directory
path names.
5) Zebra V2 will support dropColumnGroup by column group name (this will be
integrated with Raghu's A29 drop column work).
6) Zebra V2 can support backward compatibility (if there are Zebra V1-created
tables in production when V2 is released). More specifically, this means that
Zebra V2 can load from V1-created tables and do dropColumnGroup on them.
7) Renaming is NOT supported.
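
Items 1 and 3 of the specification can be sketched as follows. This is a minimal Python illustration, not Zebra's implementation: the function name `assign_column_group_names` and its list-of-optional-names input are hypothetical, and numbering the defaults with a running counter that skips user-claimed names is an assumption about how "CGx used by the user cannot be used for internal names" might be honored.

```python
def assign_column_group_names(user_names):
    """Fill in default names for unnamed column groups.

    user_names holds one entry per column group: a string for a
    user-provided name, or None when the user gave no name.
    """
    provided = [n for n in user_names if n is not None]
    if len(set(provided)) != len(provided):
        raise ValueError("column group names must be unique within a table")

    # Plain string comparison keeps names case sensitive: c1 != C1.
    taken = set(provided)
    result = []
    counter = 0
    for name in user_names:
        if name is None:
            # Skip any CGx that the user has already claimed.
            while f"CG{counter}" in taken:
                counter += 1
            name = f"CG{counter}"
            taken.add(name)
        result.append(name)
    return result
```

For instance, a table declared with the groups `PI`, an unnamed group, `CG0`, and another unnamed group would, under these assumptions, end up with the two unnamed groups called `CG1` and `CG2`.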


[jira] Commented: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-08 Thread Yan Zhou (JIRA)

[
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763781#action_12763781
]

Yan Zhou commented on PIG-987:
--

Removed the hardcoded group name from a few test scripts. This patch and the
ones in PIG-991 and PIG-986 are ready to be committed, but please hold off on
committing PIG-992 and later ones.

[zebra] Zebra Column Group Access Control
-

Key: PIG-987
URL: https://issues.apache.org/jira/browse/PIG-987
Project: Pig
Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch,
ColumnGroupSecurity.patch, TEST-org.apache.hadoop.zebra.io.TestCheckin.txt,
TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch


[jira] Updated: (PIG-894) order-by fails when input is empty


 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Status: Open  (was: Patch Available)

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Attachments: PIG-894-1.patch, PIG-894-2.patch



[jira] Updated: (PIG-894) order-by fails when input is empty


 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Fix Version/s: 0.6.0
Affects Version/s: 0.4.0
   Status: Patch Available  (was: Open)

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-894-1.patch, PIG-894-2.patch



[jira] Updated: (PIG-894) order-by fails when input is empty


 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Attachment: PIG-894-2.patch

Fixed the issue Pradeep found.

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-894-1.patch, PIG-894-2.patch



[jira] Commented: (PIG-894) order-by fails when input is empty

[
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763786#action_12763786
]

Hadoop QA commented on PIG-894:
---

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12421681/PIG-894-1.patch
against trunk revision 823257.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/16/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/16/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/16/console

This message is automatically generated.

order-by fails when input is empty
--

Key: PIG-894
URL: https://issues.apache.org/jira/browse/PIG-894
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
Fix For: 0.6.0

Attachments: PIG-894-1.patch, PIG-894-2.patch


[jira] Commented: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections

[
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763797#action_12763797
]

Hadoop QA commented on PIG-995:
---

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12421671/PIG-995-2.patch
against trunk revision 823257.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/67/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/67/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/67/console

This message is automatically generated.

Limit Optimizer throw exception ERROR 2156: Error while fixing projections

Key: PIG-995
URL: https://issues.apache.org/jira/browse/PIG-995
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.6.0

Attachments: PIG-995-1.patch, PIG-995-2.patch

The following script fails:
A = load '1.txt' AS (a0, a1, a2);
B = order A by a1;
C = limit B 10;
D = foreach C generate $0;
dump D;
Error log:
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2156: Error while fixing projections. Projection map of node to be replaced is null.
    at org.apache.pig.impl.logicalLayer.ProjectFixerUpper.visit(ProjectFixerUpper.java:138)
    at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:408)
    at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:58)
    at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:65)
    at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
    at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
    at org.apache.pig.impl.logicalLayer.LOForEach.rewire(LOForEach.java:761)


[jira] Commented: (PIG-987) [zebra] Zebra Column Group Access Control

[
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763836#action_12763836
]

Raghu Angadi commented on PIG-987:
--

Thanks Yan. It might be better to remove gauravj also since it is ignored
anyway.

This implies column access control is not tested in this patch, right?

[zebra] Zebra Column Group Access Control
-


[jira] Commented: (PIG-894) order-by fails when input is empty

[
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763844#action_12763844
]

Hadoop QA commented on PIG-894:
---

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12421694/PIG-894-2.patch
against trunk revision 823257.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/17/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/17/console

This message is automatically generated.

order-by fails when input is empty
--

Key: PIG-894
URL: https://issues.apache.org/jira/browse/PIG-894
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
Fix For: 0.6.0

Attachments: PIG-894-1.patch, PIG-894-2.patch


[jira] Updated: (PIG-991) [zebra] A few minor bugs as described in the Description section


 [ 
https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-991:
-

Attachment: Bugs-2.patch

I am committing a slightly modified patch. I removed the following lines that 
modified build.xml at the top level. Please ask one of the PIG committers to 
commit that change.

The part that is removed :
{noformat}
@@ -940,4 +942,13 @@

  <target name="published" depends="ivy-publish-local, maven-artifacts"/>

+<target name="pig-test">
+<jar
+  jarfile="${build.dir}/pig-test-${version}.jar"
+  basedir="${build.dir}/test/classes"
+  excludes="**/Test*.class">
+
+</jar>
+</target>
+
 </project>
{noformat}

 [zebra] A few minor bugs as described in the Description section
 

 Key: PIG-991
 URL: https://issues.apache.org/jira/browse/PIG-991
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0

 Attachments: Bugs-2.patch, Bugs.patch


 1) lzo2 was used as the compressor name for the LZO compression algorithm; 
 it should be lzo instead;
 2) the default compression is changed from lzo to gz for gzip;
 3) In JAVACC file SchemaParser.jjt, the package name was wrong using the old 
 package org.apache.pig.table.types;
 4) in build.xml, two new javacc targets are added to generate 
 TableSchemaParser and TableStorageParser java codes;
 5) Support of column group security ( 
 https://issues.apache.org/jira/browse/PIG-987 ) lacked support of the 
 dumpinfo method: the groups and permissions were not displayed. Note that as 
 a consequence, the patch herein must be applied after that of JIRA987.
 6) and 7) a couple of issues reported in Jira917.


[jira] Updated: (PIG-987) [zebra] Zebra Column Group Access Control


 [ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-987:
-

   Resolution: Fixed
Fix Version/s: 0.6.0
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks Yan!

 [zebra] Zebra Column Group Access Control
 -

 Key: PIG-987
 URL: https://issues.apache.org/jira/browse/PIG-987
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Fix For: 0.6.0

 Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
 ColumnGroupSecurity.patch, TEST-org.apache.hadoop.zebra.io.TestCheckin.txt, 
 TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch



[jira] Updated: (PIG-991) [zebra] A few minor bugs as described in the Description section