[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated PIG-1608:


Status: Patch Available  (was: Open)

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1608_0.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated PIG-1608:


Status: Open  (was: Patch Available)

Resubmitting the patch to Hudson.

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1608_0.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated PIG-1608:


Status: Open  (was: Patch Available)

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1608_0.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread Giridharan Kesavan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated PIG-1608:


Status: Patch Available  (was: Open)

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1608_0.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Commented: (PIG-1611) use enums for error code

2010-09-15 Thread Gianmarco De Francisci Morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909695#action_12909695
 ] 

Gianmarco De Francisci Morales commented on PIG-1611:
-

Would be nice to see it!

 use enums for error code
 

 Key: PIG-1611
 URL: https://issues.apache.org/jira/browse/PIG-1611
 Project: Pig
  Issue Type: Sub-task
Reporter: Thejas M Nair
 Fix For: 0.9.0


 Pig code uses integer constants for error codes, and each error code value is 
 reserved through 
 http://wiki.apache.org/pig/PigErrorHandlingFunctionalSpecification .
 This process is cumbersome and error-prone.
 It would be better to use enum values instead. The enum value can contain the 
 error message and encapsulate the error code. 
 For example -
 {code}
 Replace 
 throw new SchemaMergeException("Error in merging schema", 2124, 
 PigException.BUG); 
 with
 throw new SchemaMergeException(SCHEMA_MERGE_EX, PigException.BUG); 
 {code}
 Where SCHEMA_MERGE_EX belongs to an error codes enum. We can use the ordinal 
 value of the enum and an offset to determine the error code. 
 The error message will be passed through the constructor of the enum.
 {code}
 SCHEMA_MERGE_EX("Error in merging schema");
 {code}
 For documentation, the error codes and error messages can be dumped using code 
 that uses the enum error code class.
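
The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not code from Pig: the enum name ErrorCode, the second constant, the OFFSET value of 2000, and the CodedException and ErrorCodeDemo classes are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical error-code enum; names and the offset are illustrative, not Pig's. */
enum ErrorCode {
    // The message is passed through the constructor; the numeric code is derived.
    SCHEMA_MERGE_EX("Error in merging schema"),
    DUPLICATE_ALIAS_EX("Duplicate schema alias");

    /** Assumed base value for this group of codes. */
    private static final int OFFSET = 2000;

    private final String message;

    ErrorCode(String message) {
        this.message = message;
    }

    /** Error code computed from the ordinal plus the offset. */
    int code() {
        return OFFSET + ordinal();
    }

    String message() {
        return message;
    }
}

/** Hypothetical exception that takes the enum instead of a message and an int. */
class CodedException extends Exception {
    private final ErrorCode errorCode;

    CodedException(ErrorCode errorCode) {
        super("ERROR " + errorCode.code() + ": " + errorCode.message());
        this.errorCode = errorCode;
    }

    ErrorCode errorCode() {
        return errorCode;
    }
}

class ErrorCodeDemo {
    /** Dumps every code and message, e.g. for generating documentation. */
    static List<String> dump() {
        List<String> lines = new ArrayList<>();
        for (ErrorCode e : ErrorCode.values()) {
            lines.add(e.code() + "\t" + e.message());
        }
        return lines;
    }

    public static void main(String[] args) {
        dump().forEach(System.out::println);
    }
}
```

One trade-off of deriving the code from the ordinal: reordering or inserting constants changes the codes of every constant that follows, so new constants would have to be appended at the end to keep published codes stable.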




[jira] Updated: (PIG-1610) 'union onschema' does not handle some cases involving 'namespaced' column names in schema

2010-09-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1610:
---

Attachment: PIG-1610.1.patch

 'union onschema' does not handle some cases involving 'namespaced' column names 
 in schema
 -

 Key: PIG-1610
 URL: https://issues.apache.org/jira/browse/PIG-1610
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1610.1.patch


 case 1:
 grunt> describe f;
 f: {l1::a: bytearray,l1::b: bytearray}
 grunt> describe l1;
 l1: {a: bytearray,b: bytearray}
 grunt> dump f;
 (1,11)
 (2,22)
 (3,33)
 grunt> dump l1;
 (1,11)
 (2,22)
 (3,33)
 grunt> u = union onschema f, l1;
 grunt> describe u;
 u: {l1::a: bytearray,l1::b: bytearray}
 -- the dump u gives incorrect results
 grunt> dump u;
 (,)
 (,)
 (,)
 (1,11)
 (2,22)
 (3,33)
 case 2:
 grunt> u = union onschema l1, f;
 grunt> describe u;
 2010-09-13 15:11:13,877 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1108: Duplicate schema alias: l1::a
 Details at logfile: /Users/tejas/pig_unions_err2/trunk/pig_1284410413970.log




[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Gibbon updated PIG-366:
--

Attachment: org.apache.pig.pigpen-0.7.3.tar.gz

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.3.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Updated: (PIG-1610) 'union onschema' does not handle some cases involving 'namespaced' column names in schema

2010-09-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1610:
---

  Status: Patch Available  (was: Open)
Release Note: 
This fixes the behavior when merging column aliases that have a 
'namespace' portion in them.

- Aliases such as 'nm::c1' and 'c1' in two separate relations specified in 'union 
onschema' are considered mergeable, and in the schema of the union the merged 
column alias will be 'c1'. 
- Aliases such as 'nm1::c1' and 'nm2::c1' in two separate relations specified in 
'union onschema' will not be merged; the schema of the union will contain 
two columns with these names.

Example -

 describe f;
f: {l1::a: int, l1::b: int, l1::c: int}
 describe l1;
l1: {a: int, b: int}

 u = union onschema f,l1;
 desc u;
u: {a: int, b: int, l1::c: int}
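
The two merge rules in this release note can be illustrated with a small stand-alone sketch. It is not the code from the patch: the class AliasMergeSketch and its methods are hypothetical, column types are ignored, and it does not try to handle a single relation that contains both 'c' and 'nm::c'.

```java
import java.util.ArrayList;
import java.util.List;

class AliasMergeSketch {
    /** Part of the alias after the last "::", or the alias itself if unqualified. */
    static String baseName(String alias) {
        int i = alias.lastIndexOf("::");
        return i < 0 ? alias : alias.substring(i + 2);
    }

    static boolean isQualified(String alias) {
        return alias.contains("::");
    }

    /** Aliases merge if identical, or if exactly one is qualified and the base names match. */
    static boolean mergeable(String a, String b) {
        if (a.equals(b)) {
            return true;
        }
        return isQualified(a) != isQualified(b) && baseName(a).equals(baseName(b));
    }

    /** Computes the alias list of the union of two relations, left columns first. */
    static List<String> unionAliases(List<String> left, List<String> right) {
        List<String> result = new ArrayList<>();
        List<String> unmatched = new ArrayList<>(right);
        for (String l : left) {
            String merged = l;
            for (String r : unmatched) {
                if (mergeable(l, r)) {
                    // Keep the unqualified form as the merged alias.
                    merged = isQualified(l) ? r : l;
                    unmatched.remove(r);
                    break;
                }
            }
            result.add(merged);
        }
        // Columns that exist only in the right relation are appended.
        result.addAll(unmatched);
        return result;
    }
}
```

With the relations from the example, unionAliases applied to the aliases of f (l1::a, l1::b, l1::c) and l1 (a, b) yields a, b, l1::c, while nm1::c1 and nm2::c1 stay as two separate columns.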

Test-patch and unit test cases have succeeded.


 'union onschema' does not handle some cases involving 'namespaced' column names 
 in schema
 -

 Key: PIG-1610
 URL: https://issues.apache.org/jira/browse/PIG-1610
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1610.1.patch


 case 1:
 grunt> describe f;
 f: {l1::a: bytearray,l1::b: bytearray}
 grunt> describe l1;
 l1: {a: bytearray,b: bytearray}
 grunt> dump f;
 (1,11)
 (2,22)
 (3,33)
 grunt> dump l1;
 (1,11)
 (2,22)
 (3,33)
 grunt> u = union onschema f, l1;
 grunt> describe u;
 u: {l1::a: bytearray,l1::b: bytearray}
 -- the dump u gives incorrect results
 grunt> dump u;
 (,)
 (,)
 (,)
 (1,11)
 (2,22)
 (3,33)
 case 2:
 grunt> u = union onschema l1, f;
 grunt> describe u;
 2010-09-13 15:11:13,877 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1108: Duplicate schema alias: l1::a
 Details at logfile: /Users/tejas/pig_unions_err2/trunk/pig_1284410413970.log




[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Gibbon updated PIG-366:
--

Attachment: org.apache.pig.pigpen_0.7.3.jar

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.3.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 org.apache.pig.pigpen_0.7.3.jar, pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Updated: (PIG-1607) pig should have separate javadoc.jar in the maven repository

2010-09-15 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1607:
---

Attachment: PIG-1607_0.patch

Things fixed with this patch:
1. Created javadoc.jar.
2. Cleaned sources.jar: removed the generated sources and the Zebra-related 
files.
3. Changed build.xml to upload the javadoc.jar to maven.

 pig should have separate javadoc.jar in the maven repository
 

 Key: PIG-1607
 URL: https://issues.apache.org/jira/browse/PIG-1607
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1607_0.patch


 At this moment, javadoc is part of the source.jar but pig should have 
 separate javadoc.jar in the maven repository.




[jira] Updated: (PIG-1607) pig should have separate javadoc.jar in the maven repository

2010-09-15 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1607:
---

Attachment: PIG-1607_1.patch

Fixed the javadoc-jar dependency.

 pig should have separate javadoc.jar in the maven repository
 

 Key: PIG-1607
 URL: https://issues.apache.org/jira/browse/PIG-1607
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1607_0.patch, PIG-1607_1.patch


 At this moment, javadoc is part of the source.jar but pig should have 
 separate javadoc.jar in the maven repository.




[jira] Assigned: (PIG-1247) Error Number makes it hard to debug: ERROR 2999: Unexpected internal error. org.apache.pig.backend.datastorage.DataStorageException cannot be cast to java.lang.Error

2010-09-15 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-1247:
---

Assignee: Xuefu Zhang

 Error Number makes it hard to debug: ERROR 2999: Unexpected internal error. 
 org.apache.pig.backend.datastorage.DataStorageException cannot be cast to 
 java.lang.Error
 -

 Key: PIG-1247
 URL: https://issues.apache.org/jira/browse/PIG-1247
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Assignee: Xuefu Zhang
 Fix For: 0.9.0


 I have a large script containing intermediate store statements, one 
 of which writes to a directory I do not have permission to write to. 
 The stack trace I get from Pig is this:
 2010-02-20 02:16:32,055 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 2999: Unexpected internal error. 
 org.apache.pig.backend.datastorage.DataStorageException cannot be cast to 
 java.lang.Error
 Details at logfile: /home/viraj/pig_1266632145355.log
 Pig Stack Trace
 ---
 ERROR 2999: Unexpected internal error. 
 org.apache.pig.backend.datastorage.DataStorageException cannot be cast to 
 java.lang.Error
 java.lang.ClassCastException: org.apache.pig.backend.datastorage.DataStorageException cannot be cast to java.lang.Error
 at org.apache.pig.impl.logicalLayer.parser.QueryParser.StoreClause(QueryParser.java:3583)
 at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1407)
 at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:949)
 at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:762)
 at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
 at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1036)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:986)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:386)
 at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:720)
 at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
 at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
 at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
 at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
 at org.apache.pig.Main.main(Main.java:386)
 
 The only way to find the error was to look at the JavaCC-generated 
 QueryParser.java code and add a System.out.println().
 Here is a script to reproduce the problem:
 {code}
 A = load '/user/viraj/three.txt' using PigStorage();
 B = foreach A generate ['a'#'12'] as b:map[] ;
 store B into '/user/secure/pigtest' using PigStorage();
 {code}
 three.txt has 3 lines which contain nothing but the number 1.
 {code}
 $ hadoop fs -ls /user/secure/
 ls: could not get get listing for 'hdfs://mynamenode/user/secure' : 
 org.apache.hadoop.security.AccessControlException: Permission denied: 
 user=viraj, access=READ_EXECUTE, inode=secure:secure:users:rwx--
 {code}
 Viraj
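
The ClassCastException in the trace above is consistent with an exception handler that casts a caught Throwable to Error without first checking its type. The following is a sketch of that failure mode under that assumption, not the actual QueryParser code; StorageException, unsafeRethrow and saferRethrow are hypothetical names used only for illustration.

```java
/** Stand-in for a checked exception such as the DataStorageException in the trace. */
class StorageException extends Exception {
    StorageException(String message) {
        super(message);
    }
}

class RethrowSketch {
    /** Mimics a handler that assumes anything that is not unchecked must be an Error. */
    static void unsafeRethrow(Throwable t) {
        if (t instanceof RuntimeException) {
            throw (RuntimeException) t;
        }
        // Fails with ClassCastException when t is a checked exception,
        // which hides the original message from the user.
        throw (Error) t;
    }

    /** Wraps unexpected checked exceptions so that the original cause survives. */
    static void saferRethrow(Throwable t) {
        if (t instanceof RuntimeException) {
            throw (RuntimeException) t;
        }
        if (t instanceof Error) {
            throw (Error) t;
        }
        throw new IllegalStateException("Unexpected checked exception", t);
    }

    public static void main(String[] args) {
        Throwable cause = new StorageException("Permission denied");
        try {
            unsafeRethrow(cause);
        } catch (ClassCastException e) {
            System.out.println("masked by: " + e.getClass().getName());
        }
        try {
            saferRethrow(cause);
        } catch (IllegalStateException e) {
            System.out.println("cause preserved: " + e.getCause().getMessage());
        }
    }
}
```

In the second variant the permission problem remains reachable through getCause(), so an error report could show it instead of the cast failure.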




[jira] Assigned: (PIG-1592) ORDER BY distribution is uneven when record size is correlated with order key

2010-09-15 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-1592:
---

Assignee: Thejas M Nair

 ORDER BY distribution is uneven when record size is correlated with order key
 -

 Key: PIG-1592
 URL: https://issues.apache.org/jira/browse/PIG-1592
 Project: Pig
  Issue Type: Improvement
Reporter: Dmitriy V. Ryaboy
Assignee: Thejas M Nair
 Fix For: 0.9.0


 The partitioner contributed in PIG-545 distributes the order key space 
 between partitions so that each partition gets approximately the same number 
 of keys, even when the keys have a non-uniform distribution over the key 
 space.
 Unfortunately this still allows for severe partition imbalance when record 
 size is correlated with the order key. By way of motivating example, consider 
 this script, which attempts to produce a list of genuses ordered by how many 
 species each genus contains:
 {code}
 set default_parallel 60;
 critters = load 'biodata' as (genus, species);
 genus_counts = foreach (group critters by genus) generate group as genus, 
 COUNT(critters) as num_species, critters;
 ordered_genuses = order genus_counts by num_species desc;
 store ordered_genuses
 {code}
 The higher the value of num_species, the more species tuples the critters 
 bag contains and the wider the row. This can cause a severe 
 processing imbalance: the partition holding the records with the 
 highest values of num_species will have the same number of *records* as the 
 partition holding the lowest values, but it will have far more actual 
 *bytes* to work on.
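
One way to picture the imbalance is to compare partition boundaries chosen by record count with boundaries chosen by record size. The sketch below is an illustration only; it is neither the partitioner from PIG-545 nor a proposed patch, and the class WeightedSplitSketch, its method, and the sample sizes are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class WeightedSplitSketch {
    /**
     * Picks up to (numPartitions - 1) split indexes over key-sorted samples so that
     * each partition receives roughly the same total weight. weights[i] is the
     * weight of the i-th sample in key order; a split index s means that a new
     * partition starts at sample s.
     */
    static List<Integer> splitIndexes(long[] weights, int numPartitions) {
        long total = 0;
        for (long w : weights) {
            total += w;
        }
        List<Integer> splits = new ArrayList<>();
        long running = 0;
        int next = 1; // number of the next boundary to place
        // Stop one sample early so that the last partition is never empty.
        for (int i = 0; i < weights.length - 1 && next < numPartitions; i++) {
            running += weights[i];
            // Place a boundary once the running weight reaches next/numPartitions
            // of the total; at most one boundary is placed per sample.
            if (running * numPartitions >= total * next) {
                splits.add(i + 1);
                next++;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long[] byCount = {1, 1, 1, 1, 1, 1}; // every record weighs the same
        long[] byBytes = {1, 1, 1, 1, 8, 8}; // rows get wider as the key grows
        System.out.println(splitIndexes(byCount, 2));
        System.out.println(splitIndexes(byBytes, 2));
    }
}
```

With two partitions, weighting every record equally splits the six samples three and three, which puts 3 bytes in one partition and 17 in the other. Weighting by size moves the boundary so that the partitions hold 12 and 8 bytes.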




[jira] Updated: (PIG-1607) pig should have separate javadoc.jar in the maven repository

2010-09-15 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1607:
---

Attachment: PIG-1607_2.patch

 pig should have separate javadoc.jar in the maven repository
 

 Key: PIG-1607
 URL: https://issues.apache.org/jira/browse/PIG-1607
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1607_0.patch, PIG-1607_1.patch, PIG-1607_2.patch


 At this moment, javadoc is part of the source.jar but pig should have 
 separate javadoc.jar in the maven repository.




[jira] Commented: (PIG-1606) flatten documentation does not discuss flatten of empty bag

2010-09-15 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909810#action_12909810
 ] 

Olga Natkovich commented on PIG-1606:
-

If we are not planning to change the semantics, I will ask Corinne to document 
this for 0.8.

 flatten documentation does not discuss flatten of empty bag
 ---

 Key: PIG-1606
 URL: https://issues.apache.org/jira/browse/PIG-1606
 Project: Pig
  Issue Type: Bug
  Components: documentation
Reporter: Thejas M Nair
 Fix For: 0.9.0


 From the existing flatten documentation, it is not clear that flatten of an 
 empty bag results in that row being discarded.
 For example the following query gives no output -
 {code}
 grunt> cat /tmp/empty.bag
 {}  1
 grunt> l = load '/tmp/empty.bag' as (b : bag{}, i : int);
 grunt> f = foreach l generate flatten(b), i;
 grunt> dump f;
 grunt>
 {code}
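
The behavior described above follows if flatten is thought of as emitting one output row per bag element. The following stand-alone sketch models that reading; it is not Pig code, and the class FlattenSketch and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FlattenSketch {
    /** Emits one output row per bag element, each followed by the remaining field. */
    static List<List<Object>> flattenBag(List<Object> bag, Object other) {
        List<List<Object>> out = new ArrayList<>();
        for (Object element : bag) {
            out.add(Arrays.asList(element, other));
        }
        // An empty bag never enters the loop, so the input row yields no output rows.
        return out;
    }

    public static void main(String[] args) {
        System.out.println(flattenBag(Arrays.asList("x", "y"), 1));
        System.out.println(flattenBag(Collections.emptyList(), 1));
    }
}
```

Under this model the value of i in the example row is lost together with the row, which is the point the documentation would need to state.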




[jira] Updated: (PIG-1606) flatten documentation does not discuss flatten of empty bag

2010-09-15 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1606:


 Assignee: Corinne Chandel
Fix Version/s: 0.8.0
   (was: 0.9.0)

 flatten documentation does not discuss flatten of empty bag
 ---

 Key: PIG-1606
 URL: https://issues.apache.org/jira/browse/PIG-1606
 Project: Pig
  Issue Type: Bug
  Components: documentation
Reporter: Thejas M Nair
Assignee: Corinne Chandel
 Fix For: 0.8.0


 From the existing flatten documentation, it is not clear that flatten of an 
 empty bag results in that row being discarded.
 For example the following query gives no output -
 {code}
 grunt> cat /tmp/empty.bag
 {}  1
 grunt> l = load '/tmp/empty.bag' as (b : bag{}, i : int);
 grunt> f = foreach l generate flatten(b), i;
 grunt> dump f;
 grunt>
 {code}




[jira] Commented: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909821#action_12909821
 ] 

Daniel Dai commented on PIG-1608:
-

Two comments:
1. The buildJar-withouthadoop target should also include this change.
2. Formatting: use spaces instead of tabs.

The jar and package targets look good.

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1608_0.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Updated: (PIG-282) Custom Partitioner

2010-09-15 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-282:
---

Release Note: 
This feature allows you to specify a Hadoop Partitioner for the following operations: 
GROUP/COGROUP, CROSS, DISTINCT, JOIN (except 'skewed' join). The Partitioner 
controls the partitioning of the keys of the intermediate map outputs. See 
http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/Partitioner.html
 for more details.

To use this feature, add a PARTITION BY clause to the appropriate operator:
A = load 'input_data';
B = group A by $0 PARTITION BY 
org.apache.pig.test.utils.SimpleCustomPartitioner parallel 2;
.
Here is the code for SimpleCustomPartitioner:

public class SimpleCustomPartitioner extends Partitioner<PigNullableWritable, Writable> {
    //@Override
    public int getPartition(PigNullableWritable key, Writable value, int numPartitions) {
        if (key.getValueAsPigType() instanceof Integer) {
            int ret = (((Integer) key.getValueAsPigType()).intValue() % numPartitions);
            return ret;
        } else {
            return (key.hashCode()) % numPartitions;
        }
    }
}

  was:
This feature allows to specify Hadoop Partitioner for the following operations: 
GROUP/COGROUP, CROSS, DISTINCT, JOIN (except 'skewed'  join). Partitioner 
controls the partitioning of the keys of the intermediate map-outputs. See 
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Partitioner.html
 for more details.

To use this feature you can add PARTITION BY clause to the appropriate operator:
A = load 'input_data';
B = group A by $0 PARTITION BY 
org.apache.pig.test.utils.SimpleCustomPartitioner parallel 2;
.
Here is the code for SimpleCustomPartitioner

public class SimpleCustomPartitioner extends Partitioner<PigNullableWritable, Writable> {
    //@Override
    public int getPartition(PigNullableWritable key, Writable value, int numPartitions) {
        if (key.getValueAsPigType() instanceof Integer) {
            int ret = (((Integer) key.getValueAsPigType()).intValue() % numPartitions);
            return ret;
        } else {
            return (key.hashCode()) % numPartitions;
        }
    }
}


 Custom Partitioner
 --

 Key: PIG-282
 URL: https://issues.apache.org/jira/browse/PIG-282
 Project: Pig
  Issue Type: New Feature
Affects Versions: 0.7.0
Reporter: Amir Youssefi
Assignee: Aniket Mokashi
Priority: Minor
 Fix For: 0.8.0

 Attachments: CustomPartitioner.patch, CustomPartitionerFinale.patch, 
 CustomPartitionerTest.patch


 By adding a custom partitioner, we can give users control over which output 
 partition a key (/value) goes to. We can add keywords to the language, e.g. 
 PARTITION BY UDF(...)
 or a similar syntax. The UDF returns a number between 0 and n-1, where n is the 
 number of output partitions.
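
The requirement that the result lie between 0 and n-1 deserves some care in Java, because the % operator returns a negative remainder for a negative left operand, as can happen with negative integer keys or hash codes. A minimal stand-alone sketch of one way to keep the result in range follows; the class and method names are hypothetical and it does not depend on Hadoop or Pig.

```java
class NonNegativePartitionSketch {
    /** Maps any int, including a negative hash code, to a value in [0, numPartitions). */
    static int partitionFor(int hash, int numPartitions) {
        // Math.floorMod takes the sign of the divisor, so the result is
        // non-negative whenever numPartitions is positive.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 2;
        System.out.println(-7 % numPartitions);              // plain remainder keeps the sign of -7
        System.out.println(partitionFor(-7, numPartitions)); // stays within the valid range
    }
}
```

A partitioner written along these lines could call such a helper in both branches of getPartition instead of using % directly.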




[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1608:
---

Status: Patch Available  (was: Open)

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1608_0.patch, PIG-1608_1.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1608:
---

Attachment: PIG-1608_1.patch

Updated the patch to accommodate the review comments.

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1608_0.patch, PIG-1608_1.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1608:
---

Status: Open  (was: Patch Available)

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1608_0.patch, PIG-1608_1.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Gibbon updated PIG-366:
--

Attachment: (was: org.apache.pig.pigpen_0.7.3.jar)

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.3.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Commented: (PIG-1610) 'union onschema' does not handle some cases involving 'namespaced' column names in schema

2010-09-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909859#action_12909859
 ] 

Thejas M Nair commented on PIG-1610:


Richard pointed out an issue with the patch: the schema of 'union 
onschema' differs depending on the order of the relations in the statement. For 
example -
{code}
l = load 'x' as (c, nm::c);
f = load 'y' as (i,j);

u = union onschema f,l;
describe u;
u: {i: bytearray,j: bytearray,c: bytearray}

u = union onschema l,f;
describe u;
u: {c: bytearray,nm::c: bytearray,i: bytearray,j: bytearray}
{code}

Another issue found with the feature is that the schema of the union is null when a 
column in one of the relations has a complex type with a null inner schema.

I will submit another patch that fixes these issues.



 'union onschema' does not handle some cases involving 'namespaced' column names 
 in schema
 -

 Key: PIG-1610
 URL: https://issues.apache.org/jira/browse/PIG-1610
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.8.0

 Attachments: PIG-1610.1.patch


 case 1:
 grunt> describe f;
 f: {l1::a: bytearray,l1::b: bytearray}
 grunt> describe l1;
 l1: {a: bytearray,b: bytearray}
 grunt> dump f;
 (1,11)
 (2,22)
 (3,33)
 grunt> dump l1;
 (1,11)
 (2,22)
 (3,33)
 grunt> u = union onschema f, l1;
 grunt> describe u;
 u: {l1::a: bytearray,l1::b: bytearray}
 -- the dump u gives incorrect results
 grunt> dump u;
 (,)
 (,)
 (,)
 (1,11)
 (2,22)
 (3,33)
 case 2:
 grunt> u = union onschema l1, f;
 grunt> describe u;
 2010-09-13 15:11:13,877 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1108: Duplicate schema alias: l1::a
 Details at logfile: /Users/tejas/pig_unions_err2/trunk/pig_1284410413970.log




[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Gibbon updated PIG-366:
--

Attachment: org.apache.pig.pigpen_0.7.4.jar

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.3.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Gibbon updated PIG-366:
--

Attachment: org.apache.pig.pigpen-0.7.4.tar.gz

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Gibbon updated PIG-366:
--

Attachment: (was: org.apache.pig.pigpen-0.7.3.tar.gz)

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Gibbon updated PIG-366:
--

Attachment: org.apache.pig.pigpen_0.7.4.jar

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 org.apache.pig.pigpen_0.7.4.jar, org.apache.pig.pigpen_0.7.4.jar, 
 pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Updated: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Gibbon updated PIG-366:
--

Attachment: (was: org.apache.pig.pigpen_0.7.4.jar)

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-09-15 Thread Robert Gibbon (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909897#action_12909897
 ] 

Robert Gibbon commented on PIG-366:
---

Here's the README

* Download the latest version of the jar binary. Right now that's 
org.apache.pig.pigpen_0.7.4.jar

* Put the jar in the $ECLIPSE_HOME/plugins/ directory, where $ECLIPSE_HOME is 
your eclipse installation directory.

* Optionally edit the file $ECLIPSE_HOME/eclipse.ini and add the parameter 
-clean (no quotes) to make sure the deployment is fresh

* Start eclipse.

* Open a .pig script or make a new one. It should come up with syntax colouring 
etc. when you start keying in your script. If it doesn't, post your report here.

* When you save a script for the first time there might be a slight delay. 
This is *normal*. Any errors in your script should be marked up so you can see 
them.

* To run a script on the cluster, first open the eclipse preferences dialog, 
select PigPen, and change the settings to your liking. You must have 
configuration.path pointing to where your pig config files are located 
(typically $PIG_HOME/conf/). You should specify your preferred pig runtime jar 
(typically $PIG_HOME/pig-x.x.x-core.jar). log.path defaults to your temp 
directory, but you can set it to whatever you like. ssh.gateway is untested, so 
if you don't use it, delete that key. Then click OK.

* Select the script you want to run from the package explorer and click the 
little pig icon on the toolbar. This kicks off a new JVM and submits your job. 
You can track it in the console. If you want to cancel it for some reason, 
click the little pig icon with the red cross next to the console tab. This 
simply kills the JVM process.

That's it for now. have fun...


 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Robert Gibbon
Priority: Minor
 Attachments: org.apache.pig.pigpen-0.7.0.tar.gz, 
 org.apache.pig.pigpen-0.7.2.tar.gz, org.apache.pig.pigpen-0.7.4.tar.gz, 
 org.apache.pig.pigpen_0.0.1.jar, org.apache.pig.pigpen_0.0.1.tgz, 
 org.apache.pig.pigpen_0.0.4.jar, org.apache.pig.pigpen_0.7.2.jar, 
 org.apache.pig.pigpen_0.7.4.jar, pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Commented: (PIG-1479) Embed Pig in scripting languages

2010-09-15 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909898#action_12909898
 ] 

Julien Le Dem commented on PIG-1479:


The -g parameter on the command line should take two parameters, the scripting 
implementation instance name and the script itself.
That way we can have several scripting implementations.
{noformat}
java -cp pig.jar:<jython jar> org.apache.pig.Main -x local -g jython 
script/tc.py
{noformat}
{code}
case GREEK: {
    ScriptEngine scriptEngine = ScriptEngine.getInstance(instanceName);
    scriptEngine.run(new PigServer(pigContext), file);
    return ReturnCode.SUCCESS;
}
{code}
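
As a rough illustration of the name-based lookup described above, here is a minimal Python sketch. It is not the code from the patch: the registry, the register/get_instance helpers, the JythonEngine stub and the main function are all hypothetical names made up for this example. It only shows how the two values following -g (instance name, then script path) could select one of several scripting implementations.

```python
class ScriptEngine:
    """Hypothetical base class; one subclass per scripting implementation."""

    _registry = {}  # maps an instance name such as "jython" to an engine class

    @classmethod
    def register(cls, name, engine_cls):
        cls._registry[name] = engine_cls

    @classmethod
    def get_instance(cls, name):
        # Look up the implementation by the name given on the command line.
        try:
            return cls._registry[name]()
        except KeyError:
            raise ValueError("unknown scripting implementation: " + name) from None

    def run(self, script_path):
        raise NotImplementedError


class JythonEngine(ScriptEngine):
    """Stub engine (assumption): a real one would execute the script."""

    def run(self, script_path):
        return "jython would run " + script_path


ScriptEngine.register("jython", JythonEngine)


def main(argv):
    # Expects the two values that follow -g: instance name, then script path.
    i = argv.index("-g")
    name, script = argv[i + 1], argv[i + 2]
    return ScriptEngine.get_instance(name).run(script)
```

Adding another scripting implementation would then only mean registering one more name, without touching the command-line handling.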


 Embed Pig in scripting languages
 

 Key: PIG-1479
 URL: https://issues.apache.org/jira/browse/PIG-1479
 Project: Pig
  Issue Type: New Feature
Reporter: Julien Le Dem
 Attachments: PIG-1479.patch, PIG-1479_2.patch, pig-greek-test.tar, 
 pig-greek.tgz


 It should be possible to embed Pig calls in a scripting language and make 
 functions defined in the same script available as UDFs.
 This is a spin off of https://issues.apache.org/jira/browse/PIG-928 which 
 lets users define UDFs in scripting languages.




[jira] Updated: (PIG-1607) pig should have separate javadoc.jar in the maven repository

2010-09-15 Thread niraj rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niraj rai updated PIG-1607:
---

Attachment: PIG-1607_4.patch

fixed the package structure of the javadoc.

 pig should have separate javadoc.jar in the maven repository
 

 Key: PIG-1607
 URL: https://issues.apache.org/jira/browse/PIG-1607
 Project: Pig
  Issue Type: Bug
Reporter: niraj rai
Assignee: niraj rai
 Attachments: PIG-1607_0.patch, PIG-1607_1.patch, PIG-1607_2.patch, 
 PIG-1607_3.patch, PIG-1607_4.patch


 At this moment, javadoc is part of the source.jar but pig should have 
 separate javadoc.jar in the maven repository.




[jira] Commented: (PIG-1479) Embed Pig in scripting languages

2010-09-15 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909901#action_12909901
 ] 

Julien Le Dem commented on PIG-1479:


The end-of-loop condition in the script can just test whether to_join_n is 
empty. It was testing both because it did not know which one was to_join_n.
{code}
if (not P.result(to_join_n).iterator().hasNext()):
{code}

 Embed Pig in scripting languages
 

 Key: PIG-1479
 URL: https://issues.apache.org/jira/browse/PIG-1479
 Project: Pig
  Issue Type: New Feature
Reporter: Julien Le Dem
 Attachments: PIG-1479.patch, PIG-1479_2.patch, pig-greek-test.tar, 
 pig-greek.tgz


 It should be possible to embed Pig calls in a scripting language and make 
 functions defined in the same script available as UDFs.
 This is a spin off of https://issues.apache.org/jira/browse/PIG-928 which 
 lets users define UDFs in scripting languages.




[jira] Created: (PIG-1613) Explain how different UDF interfaces are used

2010-09-15 Thread Olga Natkovich (JIRA)
Explain how different UDF interfaces are used
-

 Key: PIG-1613
 URL: https://issues.apache.org/jira/browse/PIG-1613
 Project: Pig
  Issue Type: Improvement
  Components: documentation
Affects Versions: 0.7.0
Reporter: Olga Natkovich
Assignee: Corinne Chandel
 Fix For: 0.8.0


The current documentation describes individual UDF interfaces such as Algebraic 
and Accumulator but not their precedence or how they interact with each other 
and why you might want to implement several of them.

Corinne, I will add release notes to this JIRA shortly. Don't worry about it 
till then.




[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1608:


Fix Version/s: 0.9.0
Affects Version/s: 0.8.0

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: niraj rai
Assignee: niraj rai
 Fix For: 0.9.0

 Attachments: PIG-1608_0.patch, PIG-1608_1.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Updated: (PIG-1608) pig should always include pig-default.properties and pig.properties in the pig.jar

2010-09-15 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1608:


  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

Patch committed to trunk. Thanks Niraj!

 pig should always include pig-default.properties and pig.properties in the 
 pig.jar
 --

 Key: PIG-1608
 URL: https://issues.apache.org/jira/browse/PIG-1608
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: niraj rai
Assignee: niraj rai
 Fix For: 0.9.0

 Attachments: PIG-1608_0.patch, PIG-1608_1.patch


 pig should always include pig-default.properties and pig.properties as a part 
 of the pig.jar file




[jira] Updated: (PIG-1613) Explain how different UDF interfaces are used

2010-09-15 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1613:


Release Note: 
I think this should go into Advanced Topics in the UDF manual

There are multiple ways for a UDF to be invoked. The simplest UDF can just 
extend EvalFunc, which requires only the exec function to be implemented, as 
described in the How to Write a Simple Eval Function section. Every eval UDF 
must implement this. Additionally, if a function is algebraic, it can implement 
the Algebraic interface to significantly improve query performance in cases 
where the combiner can be used. The Aggregate Functions section covers this 
topic in detail. Finally, a function that can process tuples incrementally can 
also implement the Accumulator interface to reduce query memory consumption. 
The Accumulator Interface section explains this interface.

The exact method by which a UDF is invoked is selected by the optimizer based 
on the UDF type and the query. Note that only a single interface is used at any 
given time. The optimizer tries to find the most efficient way to execute the 
function. If the combiner is used and the function implements the Algebraic 
interface, then that interface is used to invoke the function. If the combiner 
is not invoked but the accumulator can be used and the function implements the 
Accumulator interface, then that interface is used. If neither condition is 
satisfied, then the exec function is used to invoke the UDF.
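
The precedence described above can be sketched roughly as follows. This is plain Python for illustration only, not Pig's optimizer code: UdfCapabilities, choose_invocation and the two boolean query flags are hypothetical names invented for the example. The note does not say what happens when the combiner is used but the function is not Algebraic, so the sketch assumes that case simply falls through to the next check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UdfCapabilities:
    """Which optional interfaces a (hypothetical) eval UDF implements."""
    algebraic: bool = False
    accumulator: bool = False


def choose_invocation(udf, combiner_used, accumulator_usable):
    """Return the single interface used to invoke the UDF for one query."""
    # 1. Algebraic is used when the combiner is used for this query.
    if combiner_used and udf.algebraic:
        return "Algebraic"
    # 2. Otherwise Accumulator is used when the query allows it.
    #    (Assumption: a non-Algebraic UDF with the combiner on lands here too.)
    if accumulator_usable and udf.accumulator:
        return "Accumulator"
    # 3. exec is the fallback; every eval UDF must implement it.
    return "exec"
```

A function that implements both optional interfaces would therefore be invoked through Algebraic in a combiner-friendly query and through Accumulator in a query where only the accumulator applies, which is one reason to implement several of them.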


Can one of the developers review the release notes to make sure they are 
accurate? Thanks.

 Explain how different UDF interfaces are used
 -

 Key: PIG-1613
 URL: https://issues.apache.org/jira/browse/PIG-1613
 Project: Pig
  Issue Type: Improvement
  Components: documentation
Affects Versions: 0.7.0
Reporter: Olga Natkovich
Assignee: Corinne Chandel
 Fix For: 0.8.0


 The current documentation describes individual UDF interfaces such as 
 Algebraic and Accumulator but not their precedence or how they interact with 
 each other and why you might want to implement several of them.
 Corinne, I will add release notes to this JIRA shortly. Don't worry about it 
 till then.




[jira] Created: (PIG-1614) javacc.jar pulled twice from maven repository

2010-09-15 Thread Daniel Dai (JIRA)
javacc.jar pulled twice from maven repository
-

 Key: PIG-1614
 URL: https://issues.apache.org/jira/browse/PIG-1614
 Project: Pig
  Issue Type: Bug
  Components: build
Reporter: Daniel Dai
Priority: Trivial


ant pulls javacc.jar twice from maven: once as javacc.jar and once as 
javacc-4.2.jar.
