[jira] Commented: (PIG-992) [zebra] Separate Schema-related files into a Schema package

2009-10-08 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763439#action_12763439
 ] 

Hong Tang commented on PIG-992:
---

Comments:
- In many places, both types.ParseException and schema.ParseException are 
thrown. Do you really want both?
- In the following
{noformat}
+public enum ColumnType implements Writable {
{noformat}
Is the Writable interface actually used? You have a rather odd pattern of 
asymmetric readFields and write:
{noformat}
+  @Override
+  public void readFields(DataInput in) throws IOException {
+// no op, instantiated by the caller
+  }
+
+  @Override
+  public void write(DataOutput out) throws IOException {
+Utils.writeString(out, name);
+  }
{noformat}
- In the following code
{noformat}
+  public static class ColumnSchema {
+public String name;
+public ColumnType type;
+public Schema schema;
+public int index; // field index in schema
{noformat}
Exposing fields as all-public seems like a bad idea.
- Is there a specific usage case to allow schema to be mutable at any time? 
(minor nit: the comment says add a field, but the code seems to add a column to 
the schema).
{noformat}
+  /**
+   * add a field
+   */
+  public void add(ColumnSchema f) throws ParseException
+  {
+add(f, false);
+  }
{noformat}
- Why is Schema.equals(Object) not implemented on top of the static version of 
the method (or vice versa)?
- In Schema.readFields(), the Version string from the input is not checked for 
compatibility.
- In the following
{noformat}
+  private void init(String[] columnNames) throws ParseException {
+// the arg must be of type or they will be treated as the default type
+// TODO: verify column names don't contain COLUMN_DELIMITER
{noformat}
The TODO does not seem to involve much work; please consider doing it now 
rather than deferring it.
- Need more detailed documentation on the spec of the parameter for 
Schema.getColumnSchema(String name)
{noformat}
+  /**
+   * Get a column's schema
+   */
+  public ColumnSchema getColumnSchema(String name) throws ParseException
+  {
{noformat}
- Schema.getColumnSchemaOnParsedName and Schema.getColumnSchema seem to be 
copy/paste code.
- Schema.getColumnSchema(ParsedName pn) has the side effect of modifying the 
parameter pn. The javadoc reads as cryptic to me.
- There are many classes generated by JavaCC. It is probably better not to 
include them in the patch (and to put the generated source under build/src).
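
One way to make the enum's read and write paths symmetric is sketched below, under stated assumptions. `ColumnTypeSketch`, its constants, `readFrom` and `ColumnTypeRoundTrip` are hypothetical names that do not come from the patch, and `writeUTF`/`readUTF` stand in for `Utils.writeString` so that the example runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the patch's ColumnType; the constants are illustrative.
enum ColumnTypeSketch {
  ANY, INT, STRING;

  // Write side: persist the constant's name.
  void write(DataOutput out) throws IOException {
    out.writeUTF(name());
  }

  // Read side: the mirror image of write(), so callers need not know the format.
  static ColumnTypeSketch readFrom(DataInput in) throws IOException {
    String name = in.readUTF();
    try {
      return valueOf(name);
    } catch (IllegalArgumentException e) {
      throw new IOException("Unknown column type: " + name, e);
    }
  }
}

class ColumnTypeRoundTrip {
  // Serializes a value to bytes and reads it back through the static factory.
  static ColumnTypeSketch roundTrip(ColumnTypeSketch t) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (DataOutputStream out = new DataOutputStream(bytes)) {
        t.write(out);
      }
      try (DataInputStream in =
               new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return ColumnTypeSketch.readFrom(in);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(ColumnTypeSketch.INT));
  }
}
```

Because an enum constant cannot be populated after construction, a static factory is a natural home for the read side; whether the enum needs to implement Writable at all is the open question raised above.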

Other minor issues:
- Typically contrib projects should use the same version string as the parent 
project.
- Style: there are some very long lines.
- There are a few whitespace changes. These should be avoided if possible.
- In the following
{noformat}
+} catch (org.apache.hadoop.zebra.schema.ParseException e) {
+  throw new AssertionError("Invalid Projection: " + e.getMessage());
{noformat}
consider changing AssertionError to IllegalArgumentException.
- In the following:
{noformat}
+  /*
+   * helper class to parse a column name string one section at a time and find 
the required
+   * type for the parsed part.
+   */
+  public static class ParsedName {
+public String mName;
+int mKeyOffset; // the offset where the keysstring starts
+public ColumnType mDT = ColumnType.ANY; // parent's type
{noformat}
The description seems to indicate that this should not be a public class. I 
tried to understand the body of the class and do not feel that it serves a 
general purpose.
- The following seems like a useless assignment:
{noformat}
+  private long mVersion = schemaVersion;
{noformat}
- {noformat}
  /**
+   * Normalize the schema string.
+   * 
+   * @param value
+   *  the input string representation of the schema.
+   * @return the normalized string representation.
+   */
+  public static String normalize(String value) {
+String result = new String();
+
+if (value == null || value.trim().isEmpty())
+  return result;
+
+StringBuilder sb = new StringBuilder();
+String[] parts = value.trim().split(COLUMN_DELIMITER);
+for (int nx = 0; nx < parts.length; nx++) {
+  if (nx > 0) sb.append(COLUMN_DELIMITER);
+  sb.append(parts[nx].trim());
+}
+return sb.toString();
+  }

{noformat}
There is a wasted value.trim().
- In Schema.equals(Object), instead of comparing class equality, using 
instanceof is typically better.
- Use StringBuilder instead in the following code:
{noformat}
+String merged = new String();
+for (int i = 0; i < columnNames.length; i++) {
+  if (i > 0) merged += ",";
+  merged += columnNames[i];
+}
{noformat}
- There are a few indentation problems.
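
The comments on `normalize` and on the string merge could be addressed roughly as follows. This is a minimal sketch, not the patch's code: the class name `SchemaStringsSketch` and the `merge` helper are hypothetical, and the comma value of `COLUMN_DELIMITER` is an assumption, since the review does not show its definition. The sketch also quotes the delimiter before splitting, which the original does not do.

```java
import java.util.regex.Pattern;

class SchemaStringsSketch {
  // Assumption: the real COLUMN_DELIMITER is a comma; its value is not shown in the review.
  static final String COLUMN_DELIMITER = ",";

  // Normalizes a schema string by trimming each delimiter-separated part.
  static String normalize(String value) {
    if (value == null) return "";
    String trimmed = value.trim();            // trim once and reuse the result
    if (trimmed.isEmpty()) return "";

    StringBuilder sb = new StringBuilder();
    // Pattern.quote treats the delimiter literally rather than as a regex.
    String[] parts = trimmed.split(Pattern.quote(COLUMN_DELIMITER));
    for (int nx = 0; nx < parts.length; nx++) {
      if (nx > 0) sb.append(COLUMN_DELIMITER);
      sb.append(parts[nx].trim());
    }
    return sb.toString();
  }

  // Joins column names with a StringBuilder instead of repeated String concatenation.
  static String merge(String[] columnNames) {
    StringBuilder merged = new StringBuilder();
    for (int i = 0; i < columnNames.length; i++) {
      if (i > 0) merged.append(COLUMN_DELIMITER);
      merged.append(columnNames[i]);
    }
    return merged.toString();
  }

  public static void main(String[] args) {
    System.out.println(normalize(" a , b ,c "));
    System.out.println(merge(new String[] {"x", "y"}));
  }
}
```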

 [zebra] Separate Schema-related files into a Schema package
 -

 Key: PIG-992
 URL: https://issues.apache.org/jira/browse/PIG-992
 Project: Pig
  Issue Type: Improvement

[jira] Updated: (PIG-976) Multi-query optimization throws ClassCastException

2009-10-08 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-976:
-

Status: Patch Available  (was: Open)

 Multi-query optimization throws ClassCastException
 --

 Key: PIG-976
 URL: https://issues.apache.org/jira/browse/PIG-976
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.4.0
Reporter: Ankur
Assignee: Richard Ding
 Attachments: PIG-976.patch, PIG-976.patch


 Multi-query optimization fails to merge two branches when one is the result of 
 GROUP BY ALL and the other is the result of GROUP BY field1, where field1 is of 
 type long. Here is the script that fails with multi-query on:
 data = LOAD 'test' USING PigStorage('\t') AS (a:long, b:double, c:double); 
 A = GROUP data ALL;
 B = FOREACH A GENERATE SUM(data.b) AS sum1, SUM(data.c) AS sum2;
 C = FOREACH B GENERATE (sum1/sum2) AS rate; 
 STORE C INTO 'result1';
 D = GROUP data BY a; 
 E = FOREACH D GENERATE group AS a, SUM(data.b), SUM(data.c);
 STORE E into 'result2';
  
 Here is the exception from the logs
 java.lang.ClassCastException: org.apache.pig.data.DefaultTuple cannot be cast to org.apache.pig.data.DataBag
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:399)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:180)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:145)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:197)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:235)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:264)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:254)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:196)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:174)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:63)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:906)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:786)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:228)
   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2206)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-989) Allow type merge between numerical type and non-numerical type

2009-10-08 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763592#action_12763592
 ] 

Olga Natkovich commented on PIG-989:


+1, looks good. Please, commit

 Allow type merge between numerical type and non-numerical type
 --

 Key: PIG-989
 URL: https://issues.apache.org/jira/browse/PIG-989
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.5.0
Reporter: Daniel Dai
 Attachments: PIG-989-1.patch, PIG-989-2.patch


 Currently, we do not allow a type merge between a numerical type and a 
 non-numerical type, and the error message is confusing. 
 For example, if you run:
 a = load '1.txt' as (a0:chararray, a1:chararray);
 b = load '2.txt' as (b0:long, b1:chararray);
 c = join a by a0, b by b0;
 dump c;
 the error message is: ERROR 1051: Cannot cast to Unknown
 We should either:
 1. Allow the type merge between a numerical type and a non-numerical type, or
 2. At least provide a more meaningful error message to the user




[jira] Commented: (PIG-948) [Usability] Relating pig script with MR jobs

2009-10-08 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763593#action_12763593
 ] 

Olga Natkovich commented on PIG-948:


+1, please, commit

 [Usability] Relating pig script with MR jobs
 

 Key: PIG-948
 URL: https://issues.apache.org/jira/browse/PIG-948
 Project: Pig
  Issue Type: Improvement
  Components: impl
Affects Versions: 0.4.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Minor
 Fix For: 0.6.0

 Attachments: pig-948-2.patch, pig-948-3.patch, PIG-948-4.patch, 
 pig-948.patch


 Currently it's hard to relate a Pig script to a specific MR job. 
 In a loaded cluster with multiple simultaneous job submissions, it's not easy 
 to figure out which specific MR jobs were launched for a given Pig script. If 
 Pig can provide this info, it will be useful for debugging and monitoring the jobs 
 resulting from a Pig script.
 At the very least, Pig should be able to provide the user with the following 
 information:
 1) Job id of the launched job.
 2) Complete web URL of the jobtracker running this job. 




[jira] Updated: (PIG-989) Allow type merge between numerical type and non-numerical type

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-989:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Patch committed




[jira] Updated: (PIG-948) [Usability] Relating pig script with MR jobs

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-948:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed.




[jira] Commented: (PIG-976) Multi-query optimization throws ClassCastException

2009-10-08 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763645#action_12763645
 ] 

Pradeep Kamath commented on PIG-976:


Reviewed the new patch - one comment is on POMultiQueryPackage:
{code}
Object obj = tuple.get(0);
if (obj instanceof PigNullableWritable) {
    ((PigNullableWritable)obj).setIndex(origIndex);
}
else {
    PigNullableWritable myObj = HDataType.getWritableComparableTypes(obj, (byte)0);
    myObj.setIndex(origIndex);
    tuple.set(0, myObj);
}
{code}

If obj is null, the code in the else branch would throw an exception. I 
think the code should check for obj == null and, if so, create a NullWritable 
object, where NullWritable is a subclass of PigNullableWritable representing a 
null. Since only the getValueAsPigType() method is used in PODemux, that would 
always return null for this use case.
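
The suggested null branch might look roughly like the sketch below. It deliberately avoids the real Pig classes so that it compiles on its own: `KeySketch` plays the role of PigNullableWritable, `NullKeySketch` the role of the proposed null wrapper, and a `List<Object>` stands in for the tuple. All of these names are hypothetical, and the plain constructor call replaces `HDataType.getWritableComparableTypes`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PigNullableWritable.
class KeySketch {
  final Object value;
  byte index;

  KeySketch(Object value) { this.value = value; }

  void setIndex(byte index) { this.index = index; }
}

// Hypothetical stand-in for the proposed wrapper that represents a null key.
class NullKeySketch extends KeySketch {
  NullKeySketch() { super(null); }
}

class NullSafeWrap {
  // Wraps the first field of the tuple if needed and stamps the index on it.
  static void wrapKey(List<Object> tuple, byte origIndex) {
    Object obj = tuple.get(0);
    KeySketch key;
    if (obj instanceof KeySketch) {
      key = (KeySketch) obj;           // already wrapped
    } else if (obj == null) {
      key = new NullKeySketch();       // explicit null branch, as suggested in the review
    } else {
      key = new KeySketch(obj);        // stands in for the HDataType conversion
    }
    key.setIndex(origIndex);
    tuple.set(0, key);
  }

  public static void main(String[] args) {
    List<Object> tuple = new ArrayList<>();
    tuple.add(null);
    wrapKey(tuple, (byte) 3);
    System.out.println(tuple.get(0) instanceof NullKeySketch);
  }
}
```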


[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Attachment: PIG-922-p3_9.patch

Fix the unit test

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
 PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
 PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
 PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, 
 PIG-922-p3_8.patch, PIG-922-p3_9.patch


 This is a continuation of 
 [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
 another rule to the logical optimizer: push up project, i.e., prune columns as 
 early as possible.




[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Status: Open  (was: Patch Available)




[jira] Updated: (PIG-922) Logical optimizer: push up project

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Status: Patch Available  (was: Open)




Pig 0.4.0 is released!

2009-10-08 Thread Olga Natkovich
The Pig team is happy to announce the Pig 0.4.0 release!

 

Pig is a Hadoop subproject that provides a high-level data-flow language
and an execution framework for parallel computation on a Hadoop cluster.
More details about Pig can be found at http://hadoop.apache.org/pig/.

 

This release introduces two new types of join. The skewed join improves
join performance for data with a large skew in the join key. The merge
join improves performance for the case where both inputs are sorted on
the join key. The release also includes support for outer join.  The
details of the release can be found at
http://hadoop.apache.org/pig/releases.html

 

The publishing of this release has been delayed due to problems with
Apache infrastructure that prevented us from publishing the updated
site.

 

Olga



[jira] Commented: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections

2009-10-08 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763694#action_12763694
 ] 

Pradeep Kamath commented on PIG-995:


+1

 Limit Optimizer throw exception ERROR 2156: Error while fixing projections
 

 Key: PIG-995
 URL: https://issues.apache.org/jira/browse/PIG-995
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-995-1.patch


 The following script fails:
 A = load '1.txt' AS (a0, a1, a2);
 B = order A by a1;
 C = limit B 10;
 D = foreach C generate $0;
 dump D;
 Error log:
 Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2156: Error while fixing projections. Projection map of node to be replaced is null.
 at org.apache.pig.impl.logicalLayer.ProjectFixerUpper.visit(ProjectFixerUpper.java:138)
 at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:408)
 at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:58)
 at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:65)
 at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
 at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
 at org.apache.pig.impl.logicalLayer.LOForEach.rewire(LOForEach.java:761)




[jira] Commented: (PIG-922) Logical optimizer: push up project

2009-10-08 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763705#action_12763705
 ] 

Pradeep Kamath commented on PIG-922:


Reviewed changes per my last review comments - looks good - +1




[jira] Updated: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-995:
---

Attachment: PIG-995-2.patch

After discussing this with Santhosh, I have a better patch. The problem is that we do not 
generate the projection map before applying optimization rules. If an optimization 
rule changes the structure of the logical plan and the projection map is generated 
only afterwards, we end up using a wrong projection map. In the new patch, we 
regenerate the projection map before applying each optimization rule.

 Limit Optimizer throw exception ERROR 2156: Error while fixing projections
 

 Key: PIG-995
 URL: https://issues.apache.org/jira/browse/PIG-995
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-995-1.patch, PIG-995-2.patch


 The following script fails:
 A = load '1.txt' AS (a0, a1, a2);
 B = order A by a1;
 C = limit B 10;
 D = foreach C generate $0;
 dump D;
 Error log:
 Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2156: Error while fixing projections. Projection map of node to be replaced is null.
 at org.apache.pig.impl.logicalLayer.ProjectFixerUpper.visit(ProjectFixerUpper.java:138)
 at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:408)
 at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:58)
 at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:65)
 at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
 at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
 at org.apache.pig.impl.logicalLayer.LOForEach.rewire(LOForEach.java:761)




[jira] Updated: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-995:
---

Status: Patch Available  (was: Open)




[jira] Updated: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-995:
---

Status: Open  (was: Patch Available)




[jira] Commented: (PIG-922) Logical optimizer: push up project

2009-10-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763728#action_12763728
 ] 

Hadoop QA commented on PIG-922:
---

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421651/PIG-922-p3_9.patch
  against trunk revision 823257.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 30 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

-1 release audit.  The applied patch generated 287 release audit warnings 
(more than the trunk's current 280 warnings).

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/15/testReport/
Release audit warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/15/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/15/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/15/console

This message is automatically generated.

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
 PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch, 
 PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, 
 PIG-922-p3_5.patch, PIG-922-p3_6.patch, PIG-922-p3_7.patch, 
 PIG-922-p3_8.patch, PIG-922-p3_9.patch


 This is a continuation of 
 [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
 another rule to the logical optimizer: push up project, i.e., prune columns as 
 early as possible.
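
 As a rough illustration of what "prune columns as early as possible" means, here is a minimal sketch. It is not the optimizer code from the patch; the class and method names are hypothetical. It collects the column indexes that downstream operators read and applies that projection right after the load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration of "push up project"; none of these names are Pig's.
class ColumnPruningSketch {

    // Collect every column index that some downstream operator reads.
    static SortedSet<Integer> requiredColumns(List<int[]> columnsUsedPerOperator) {
        SortedSet<Integer> required = new TreeSet<>();
        for (int[] used : columnsUsedPerOperator) {
            for (int c : used) {
                required.add(c);
            }
        }
        return required;
    }

    // Apply the projection right after the load, so later operators
    // only ever see the columns they need.
    static List<Object> project(List<Object> row, SortedSet<Integer> required) {
        List<Object> pruned = new ArrayList<>();
        for (int c : required) {
            pruned.add(row.get(c));
        }
        return pruned;
    }
}
```

 In a script that loads four columns but only ever uses $0 and $2, the remaining columns would be dropped before any sort or join sees them.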

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning

2009-10-08 Thread Ying He (JIRA)
InternalCachedBag.java generates javac warning and findbug warning
--

 Key: PIG-1000
 URL: https://issues.apache.org/jira/browse/PIG-1000
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Ying He
Assignee: Ying He
 Fix For: 0.6.0


POPackage uses DefaultDataBag during the reduce phase to hold data. The bag is 
registered with SpillableMemoryManager and is prone to OutOfMemoryException. It is 
better to manage memory usage proactively: the bag fills memory up to a 
specified amount and dumps the rest to disk. The amount of memory used to hold 
tuples is configurable. This can avoid out-of-memory errors.
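
The idea can be sketched as follows. This is a simplified illustration, not the InternalCachedBag code from the patch: the class name is hypothetical, tuples are plain strings, and the limit is a tuple count rather than an amount of memory.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a bag that caches up to a limit and spills the rest.
class CachedBagSketch implements AutoCloseable {
    private final int maxInMemory;               // assumed knob: tuples kept in memory
    private final List<String> cached = new ArrayList<>();
    private File spillFile;                      // created lazily on first spill
    private DataOutputStream spillOut;
    private int spilled;

    CachedBagSketch(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    void add(String tuple) {
        if (cached.size() < maxInMemory) {
            cached.add(tuple);
            return;
        }
        try {
            if (spillOut == null) {
                spillFile = File.createTempFile("bag", ".spill");
                spillFile.deleteOnExit();
                spillOut = new DataOutputStream(new FileOutputStream(spillFile));
            }
            spillOut.writeUTF(tuple);
            spilled++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the in-memory tuples first, then the spilled ones, in insertion
    // order. A real bag would stream these through an iterator instead.
    List<String> readAll() {
        List<String> all = new ArrayList<>(cached);
        if (spillOut == null) {
            return all;
        }
        try {
            spillOut.flush();
            try (DataInputStream in = new DataInputStream(new FileInputStream(spillFile))) {
                for (int i = 0; i < spilled; i++) {
                    all.add(in.readUTF());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return all;
    }

    int inMemoryCount() { return cached.size(); }
    int spilledCount() { return spilled; }

    @Override
    public void close() {
        try {
            if (spillOut != null) {
                spillOut.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the bag decides up front how much it keeps in memory, it does not depend on a memory manager noticing pressure in time.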




[jira] Updated: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning

2009-10-08 Thread Ying He (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying He updated PIG-1000:
-

Attachment: PIG-1000.patch

Fix the javac warning and the FindBugs warning.

 InternalCachedBag.java generates javac warning and findbug warning
 --

 Key: PIG-1000
 URL: https://issues.apache.org/jira/browse/PIG-1000
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Ying He
Assignee: Ying He
 Fix For: 0.6.0

 Attachments: PIG-1000.patch


 POPackage uses DefaultDataBag during the reduce phase to hold data. The bag is 
 registered with SpillableMemoryManager and is prone to OutOfMemoryException. 
 It is better to manage memory usage proactively: the bag fills memory up to a 
 specified amount and dumps the rest to disk. The amount of memory used to 
 hold tuples is configurable. This can avoid out-of-memory errors.




[jira] Updated: (PIG-1000) InternalCachedBag.java generates javac warning and findbug warning

2009-10-08 Thread Ying He (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ying He updated PIG-1000:
-

Description: The patch submitted for PIG-975 generates a javac warning and a 
FindBugs warning.  (was: POPackage uses DefaultDataBag during the reduce phase 
to hold data. The bag is registered with SpillableMemoryManager and is prone to 
OutOfMemoryException. It is better to manage memory usage proactively: the bag 
fills memory up to a specified amount and dumps the rest to disk. The amount of 
memory used to hold tuples is configurable. This can avoid out-of-memory 
errors.)
 Patch Info: [Patch Available]

 InternalCachedBag.java generates javac warning and findbug warning
 --

 Key: PIG-1000
 URL: https://issues.apache.org/jira/browse/PIG-1000
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.4.0
Reporter: Ying He
Assignee: Ying He
 Fix For: 0.6.0

 Attachments: PIG-1000.patch


 The patch submitted for PIG-975 generates a javac warning and a FindBugs warning.




[jira] Assigned: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned PIG-894:
--

Assignee: Daniel Dai

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Attachments: PIG-894-1.patch


 grunt> l = load 'students.txt';
 grunt> f = filter l by 1 == 2;
 grunt> o = order f by $0;
 grunt> dump o;
 This results in 3 MR jobs. The 2nd (sampling) MR job creates an empty sample 
 file, and the 3rd MR job (order-by) fails with the following error in the map task:
 java.lang.RuntimeException: java.lang.RuntimeException: Empty samples file
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:104)
   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:348)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
 Caused by: java.lang.RuntimeException: Empty samples file
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:89)
   ... 5 more
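
 One way a range partitioner could tolerate an empty sample is sketched below. This is only an illustration under simplifying assumptions (string keys, unweighted split points, a hypothetical class name). It is not the attached patch, and Pig's WeightedRangePartitioner is more involved.

```java
import java.util.Arrays;

// Hypothetical sketch of a range partitioner that accepts an empty sample.
class RangePartitionerSketch {
    private final String[] boundaries; // sorted split points taken from the sample

    RangePartitionerSketch(String[] sampledKeys, int numPartitions) {
        if (sampledKeys.length == 0 || numPartitions <= 1) {
            // Empty input produces an empty sample: use no split points
            // instead of throwing "Empty samples file".
            boundaries = new String[0];
            return;
        }
        String[] sorted = sampledKeys.clone();
        Arrays.sort(sorted);
        boundaries = new String[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            boundaries[i - 1] = sorted[i * sorted.length / numPartitions];
        }
    }

    // Partition index = number of boundaries that are <= key.
    int getPartition(String key) {
        int p = 0;
        while (p < boundaries.length && key.compareTo(boundaries[p]) >= 0) {
            p++;
        }
        return p;
    }
}
```

 With no split points every key maps to partition 0, which is harmless when the input is empty because no records arrive anyway.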




[jira] Updated: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Attachment: PIG-894-1.patch

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
 Attachments: PIG-894-1.patch





[jira] Updated: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Status: Patch Available  (was: Open)

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Attachments: PIG-894-1.patch





[jira] Created: (PIG-1001) Generate more meaningful error message when one input file does not exist

2009-10-08 Thread Daniel Dai (JIRA)
Generate more meaningful error message when one input file does not exist
-

 Key: PIG-1001
 URL: https://issues.apache.org/jira/browse/PIG-1001
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
 Fix For: 0.6.0


In the following query, if 2.txt does not exist, 

a = load '1.txt';
b = order a by $0;
c = load '2.txt';
d = order c by $0;
e = join b by $0, d by $0;
dump e;

Pig throws the error message "ERROR 2100: file:/tmp/temp155054664/tmp1144108421 
does not exist." Pig should instead report something like "Input file 
2.txt does not exist" rather than this confusing message.
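
A minimal sketch of the requested behavior: check the paths the user actually wrote before any job is launched, and name those paths in the error. The class and method names are hypothetical, and the sketch checks the local file system with java.io.File, whereas Pig would have to ask whatever file system the script runs against.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: validate user-visible inputs before launching any job.
class InputCheckSketch {
    // Returns one message per missing input, naming the path the user wrote
    // rather than an intermediate temporary file.
    static List<String> missingInputs(List<String> loadPaths) {
        List<String> errors = new ArrayList<>();
        for (String path : loadPaths) {
            if (!new File(path).exists()) {
                errors.add("Input file " + path + " does not exist");
            }
        }
        return errors;
    }
}
```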




Hudson build is back to normal: Pig-trunk #581

2009-10-08 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Pig-trunk/581/changes




[jira] Updated: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-08 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-987:
-

Attachment: ColumnGroupSecurity.patch

 [zebra] Zebra Column Group Access Control
 -

 Key: PIG-987
 URL: https://issues.apache.org/jira/browse/PIG-987
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
 ColumnGroupSecurity.patch, TEST-org.apache.hadoop.zebra.io.TestCheckin.txt, 
 TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch


 Access Control: when processes try to read from the column groups, Zebra 
 should be able to handle allowed vs. disallowed user/application accesses. 
 The security is eventually granted by the corresponding HDFS security of the 
 stored data.
 Expected behavior when column group permissions are set:
 When a user selects only columns that they do not have permission to 
 access, Zebra should return an error with the message "Error #: Permission denied 
 for accessing column <column name or names>".
 Access control applies to an entire column group, so all columns in a column 
 group have the same permissions.
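
 The expected behavior could look roughly like the sketch below. It is not Zebra's implementation: the names are hypothetical, the set of readable groups is passed in rather than derived from HDFS permissions, and returning the readable subset for a mixed selection is an assumption, since the description only covers the all-denied case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of per-column-group access checks.
class ColumnGroupAccessSketch {
    // columnToGroup: which column group each column lives in.
    // readableGroups: groups the current user may read (assumed to be known;
    // in Zebra this would come from HDFS permissions on the stored data).
    static List<String> readableColumns(List<String> selected,
                                        Map<String, String> columnToGroup,
                                        Set<String> readableGroups) {
        List<String> allowed = new ArrayList<>();
        for (String column : selected) {
            // Permissions are per group, so every column in a group shares them.
            if (readableGroups.contains(columnToGroup.get(column))) {
                allowed.add(column);
            }
        }
        if (allowed.isEmpty() && !selected.isEmpty()) {
            // Only an all-denied selection is an error in the description above.
            throw new SecurityException(
                "Permission denied for accessing column " + String.join(", ", selected));
        }
        return allowed;
    }
}
```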




[jira] Commented: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763776#action_12763776
 ] 

Pradeep Kamath commented on PIG-894:


The patch uses the pig.inputs property from the jobconf, which does not directly 
hold the input file name. It actually holds a serialized 
ArrayList<Pair<FileSpec, Boolean>> in string form, containing the filespec and 
the issplittable flag for each input of the job. This serialized string will 
need to be deserialized using ObjectSerializer.deserialize, and the file name 
will then need to be retrieved from the filespec.
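
The round trip described above can be illustrated with a self-contained stand-in. The sketch below does not use Pig's ObjectSerializer, FileSpec, or Pair; InputEntry and PigInputsSketch are hypothetical names, and plain Java serialization plus Base64 is only an assumed encoding chosen to make the example runnable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Stand-in for Pair<FileSpec, Boolean>; hypothetical, not Pig's classes.
class InputEntry implements Serializable {
    private static final long serialVersionUID = 1L;
    final String fileName;
    final boolean splittable;

    InputEntry(String fileName, boolean splittable) {
        this.fileName = fileName;
        this.splittable = splittable;
    }
}

class PigInputsSketch {
    // Encode the list as a string, as a job property would carry it.
    static String serialize(ArrayList<InputEntry> inputs) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(inputs);
            }
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decode the property value first, then pull the file names out of it.
    static List<String> fileNames(String propertyValue) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(Base64.getDecoder().decode(propertyValue)))) {
            @SuppressWarnings("unchecked")
            ArrayList<InputEntry> inputs = (ArrayList<InputEntry>) in.readObject();
            List<String> names = new ArrayList<>();
            for (InputEntry entry : inputs) {
                names.add(entry.fileName);
            }
            return names;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point is that the property value is an opaque encoded string, so treating it as a file name directly cannot work.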

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Attachments: PIG-894-1.patch





[jira] Updated: (PIG-986) [zebra] Zebra Column Group Naming Support

2009-10-08 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-986:
-

Attachment: ColumnGroupName.patch

Removed the hard-coded group in a few test cases.

 [zebra] Zebra Column Group Naming Support
 -

 Key: PIG-986
 URL: https://issues.apache.org/jira/browse/PIG-986
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.4.0
Reporter: Chao Wang
Assignee: Chao Wang
 Fix For: 0.6.0

 Attachments: ColumnGroupName.patch, ColumnGroupName.patch, 
 ColumnGroupName.patch


 We introduce column group names to Zebra and make them first-class citizens in 
 Zebra. This can ease management of column groups.
 We plan to introduce an AS clause for column group names in Zebra's syntax.
 Functional Specifications:
 1) Column group names are optional. For column groups which do not have a 
 user-provided name, Zebra will internally assign default column group names 
 that are unique for that table: CG0, CG1, CG2 ... Note: if CGx is 
 used by the user, then it cannot be used as an internal name.
 2) We introduce an AS clause in Zebra's syntax for column group names. If 
 it occurs, it has to immediately follow [ ]. For example, [a1, a2] as PI 
 secure by user:joe group:secure perm:640; [a3, a4] as General compress by 
 lzo. Note that the keyword AS is case insensitive.
 3) Column group names are unique within one table and are case sensitive, 
 i.e., c1 and C1 are different.
 4) Column group names will be used as the physical column group directory 
 path names.
 5) Zebra V2 will support dropColumnGroup by column group name (will 
 integrate with Raghu's A29 drop column work).
 6) Zebra V2 can support backward compatibility (if there are Zebra V1-created 
 tables in production when V2 is released). More specifically, this means that 
 Zebra V2 can load from V1-created tables and do dropColumnGroup on them.
 7) Renaming is NOT supported.
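
 Rules 1 and 3 can be sketched as below. This is an illustration only, not Zebra's code: the class and method names are hypothetical, and a null entry stands for a column group without an AS clause.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of default column group naming.
class ColumnGroupNamerSketch {
    // userNames holds one entry per column group; null means "no AS clause".
    static List<String> assignNames(List<String> userNames) {
        Set<String> taken = new HashSet<>();
        for (String name : userNames) {
            // String equality is case sensitive, so c1 and C1 can coexist.
            if (name != null && !taken.add(name)) {
                throw new IllegalArgumentException("Duplicate column group name: " + name);
            }
        }
        List<String> result = new ArrayList<>();
        int next = 0;
        for (String name : userNames) {
            if (name == null) {
                // Skip any CGx the user has already claimed.
                while (taken.contains("CG" + next)) {
                    next++;
                }
                name = "CG" + next;
                taken.add(name);
            }
            result.add(name);
        }
        return result;
    }
}
```

 For a table whose second group is explicitly named CG0, the unnamed groups would receive CG1, CG2, and so on.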




[jira] Commented: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-08 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763781#action_12763781
 ] 

Yan Zhou commented on PIG-987:
--

Removed the hardcoded group name from a few test scripts. This patch and the 
ones in PIG-991 and PIG-986 are ready to be committed. But please hold off on 
committing PIG-992 and anything after it.

 [zebra] Zebra Column Group Access Control
 -

 Key: PIG-987
 URL: https://issues.apache.org/jira/browse/PIG-987
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
 ColumnGroupSecurity.patch, TEST-org.apache.hadoop.zebra.io.TestCheckin.txt, 
 TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch





[jira] Updated: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Status: Open  (was: Patch Available)

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Attachments: PIG-894-1.patch, PIG-894-2.patch





[jira] Updated: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Fix Version/s: 0.6.0
Affects Version/s: 0.4.0
   Status: Patch Available  (was: Open)

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-894-1.patch, PIG-894-2.patch





[jira] Updated: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-894:
---

Attachment: PIG-894-2.patch

Fix the issue Pradeep found.

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-894-1.patch, PIG-894-2.patch





[jira] Commented: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763786#action_12763786
 ] 

Hadoop QA commented on PIG-894:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421681/PIG-894-1.patch
  against trunk revision 823257.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/16/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/16/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/16/console

This message is automatically generated.

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-894-1.patch, PIG-894-2.patch





[jira] Commented: (PIG-995) Limit Optimizer throw exception ERROR 2156: Error while fixing projections

2009-10-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763797#action_12763797
 ] 

Hadoop QA commented on PIG-995:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421671/PIG-995-2.patch
  against trunk revision 823257.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/67/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/67/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/67/console

This message is automatically generated.

 Limit Optimizer throw exception ERROR 2156: Error while fixing projections
 

 Key: PIG-995
 URL: https://issues.apache.org/jira/browse/PIG-995
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-995-1.patch, PIG-995-2.patch


 The following script fails:
 A = load '1.txt' AS (a0, a1, a2);
 B = order A by a1;
 C = limit B 10;
 D = foreach C generate $0;
 dump D;
 Error log:
 Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2156: Error while 
 fixing projections. Projection map of node to be replaced is null.
 at org.apache.pig.impl.logicalLayer.ProjectFixerUpper.visit(ProjectFixerUpper.java:138)
 at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:408)
 at org.apache.pig.impl.logicalLayer.LOProject.visit(LOProject.java:58)
 at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:65)
 at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
 at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
 at org.apache.pig.impl.logicalLayer.LOForEach.rewire(LOForEach.java:761)




[jira] Commented: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-08 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763836#action_12763836
 ] 

Raghu Angadi commented on PIG-987:
--

Thanks Yan. It might be better to remove gauravj also since it is ignored 
anyway. 

This implies column access control is not tested in this patch, right?

 [zebra] Zebra Column Group Access Control
 -

 Key: PIG-987
 URL: https://issues.apache.org/jira/browse/PIG-987
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
 ColumnGroupSecurity.patch, TEST-org.apache.hadoop.zebra.io.TestCheckin.txt, 
 TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch





[jira] Commented: (PIG-894) order-by fails when input is empty

2009-10-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12763844#action_12763844
 ] 

Hadoop QA commented on PIG-894:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12421694/PIG-894-2.patch
  against trunk revision 823257.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/17/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/17/console

This message is automatically generated.

 order-by fails when input is empty
 --

 Key: PIG-894
 URL: https://issues.apache.org/jira/browse/PIG-894
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Thejas M Nair
Assignee: Daniel Dai
 Fix For: 0.6.0

 Attachments: PIG-894-1.patch, PIG-894-2.patch


 grunt l = load 'students.txt' ;
 grunt f = filter l by 1 == 2;
 grunt o = order f by $0 ;
 grunt dump o;
 This results in 3 MR jobs . The 2nd (sampling) MR creates empty sample file, 
 and 3rd MR (order-by) fails with following error in Map job -
 java.lang.RuntimeException: java.lang.RuntimeException: Empty samples file
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:104)
   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:348)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
 Caused by: java.lang.RuntimeException: Empty samples file
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:89)
   ... 5 more
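
 The trace above shows the partitioner rejecting an empty samples file. The 
 following is only a minimal sketch of one possible guard, not the code in 
 WeightedRangePartitioner and not necessarily what PIG-894-2.patch does. The 
 class name EmptySafeRangePartitioner, the use of int keys, and the quantile 
 selection are all assumptions made for illustration: with no samples there 
 are no split points, so every key falls into partition 0.

```java
import java.util.Arrays;

/** Hypothetical sketch; not Pig's WeightedRangePartitioner. */
final class EmptySafeRangePartitioner {
    // Sorted split points derived from the sample; empty when there is no sample.
    private final int[] boundaries;

    EmptySafeRangePartitioner(int[] sampledKeys, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        this.boundaries = computeBoundaries(sampledKeys, numPartitions);
    }

    private static int[] computeBoundaries(int[] sampledKeys, int numPartitions) {
        // Assumed guard: an empty sample yields no split points instead of an exception.
        if (sampledKeys.length == 0 || numPartitions == 1) {
            return new int[0];
        }
        int[] sorted = sampledKeys.clone();
        Arrays.sort(sorted);
        // Pick numPartitions - 1 evenly spaced quantiles as split points.
        int[] result = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            result[i - 1] = sorted[(int) ((long) i * sorted.length / numPartitions)];
        }
        return result;
    }

    /** Returns a partition index in [0, boundaries.length]. */
    int getPartition(int key) {
        int pos = Arrays.binarySearch(boundaries, key);
        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        return pos >= 0 ? pos : -(pos + 1);
    }
}
```

 With an empty sample, getPartition returns 0 for any key, so an empty input 
 such as the filtered relation above would simply yield an empty result 
 rather than a failed Map task.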

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-991) [zebra] A few minor bugs as described in the Description section

2009-10-08 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-991:
-

Attachment: Bugs-2.patch

I am committing a slightly modified patch. I removed the following lines that 
modified build.xml at the top level. Please ask one of the PIG committers to 
commit that change.

The part that was removed:
{noformat}
@@ -940,4 +942,13 @@

   <target name="published" depends="ivy-publish-local, maven-artifacts"/>

+<target name="pig-test">
+<jar
+  jarfile="${build.dir}/pig-test-${version}.jar"
+  basedir="${build.dir}/test/classes"
+  excludes="**/Test*.class"
+>
+</jar>
+</target>
+
 </project>
{noformat}

 [zebra] A few minor bugs as described in the Description section
 

 Key: PIG-991
 URL: https://issues.apache.org/jira/browse/PIG-991
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0

 Attachments: Bugs-2.patch, Bugs.patch


 1) lzo2 was used as the compressor name for the LZO compression algorithm; 
 it should be lzo instead;
 2) the default compression is changed from lzo to gz (gzip);
 3) in the JavaCC file SchemaParser.jjt, the package name was wrong: it used 
 the old package org.apache.pig.table.types;
 4) in build.xml, two new javacc targets are added to generate the 
 TableSchemaParser and TableStorageParser Java code;
 5) support for column group security ( 
 https://issues.apache.org/jira/browse/PIG-987 ) lacked support in the 
 dumpinfo method: the groups and permissions were not displayed. Note that, 
 as a consequence, this patch must be applied after that of PIG-987;
 6) and 7) a couple of issues reported in PIG-917.




[jira] Updated: (PIG-987) [zebra] Zebra Column Group Access Control

2009-10-08 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-987:
-

   Resolution: Fixed
Fix Version/s: 0.6.0
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks Yan!

 [zebra] Zebra Column Group Access Control
 -

 Key: PIG-987
 URL: https://issues.apache.org/jira/browse/PIG-987
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
 Fix For: 0.6.0

 Attachments: ColumnGroupSecurity.patch, ColumnGroupSecurity.patch, 
 ColumnGroupSecurity.patch, TEST-org.apache.hadoop.zebra.io.TestCheckin.txt, 
 TEST-org.apache.hadoop.zebra.mapred.TestCheckin.txt, tmp-987-plus-991.patch


 Access Control: when processes try to read from the column groups, Zebra 
 should be able to handle allowed vs. disallowed user/application accesses. 
 The security is ultimately granted by the corresponding HDFS security of the 
 stored data.
 Expected behavior when column group permissions are set:
 When a user selects only columns that they do not have permission to 
 access, Zebra should return an error with the message "Error #: Permission 
 denied for accessing column <column name or names>".
 Access control applies to an entire column group, so all columns in a column 
 group have the same permissions. 
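
 The expected behavior described above can be sketched roughly as follows. 
 This is an illustration under stated assumptions, not Zebra's implementation: 
 the class ColumnGroupAccessCheck, the column-to-group map, and the set of 
 readable groups are hypothetical stand-ins for whatever Zebra derives from 
 HDFS permissions, and treating a column with no known group as denied is a 
 simplification. The error number is left out because the description does 
 not give one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of per-column-group read checks; not Zebra code. */
final class ColumnGroupAccessCheck {
    private ColumnGroupAccessCheck() {}

    /**
     * Returns the selected columns the caller may read. Throws only when the
     * selection is non-empty and none of its columns is readable.
     */
    static List<String> readableColumns(List<String> selected,
                                        Map<String, String> columnToGroup,
                                        Set<String> readableGroups) {
        List<String> readable = new ArrayList<>();
        List<String> denied = new ArrayList<>();
        for (String column : selected) {
            // Permissions are per column group, so look up the group first.
            String group = columnToGroup.get(column);
            if (group != null && readableGroups.contains(group)) {
                readable.add(column);
            } else {
                denied.add(column); // assumption: unknown columns count as denied
            }
        }
        if (!selected.isEmpty() && readable.isEmpty()) {
            throw new SecurityException(
                "Permission denied for accessing column " + String.join(", ", denied));
        }
        return readable;
    }
}
```

 Because the check is made per group, two columns stored in the same column 
 group are always either both readable or both denied.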




[jira] Updated: (PIG-991) [zebra] A few minor bugs as described in the Description section

2009-10-08 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-991:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this. Thanks Yan.
