date:20090914


[ 
https://issues.apache.org/jira/browse/PIG-891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755081#action_12755081
 ] 

Daniel Dai commented on PIG-891:


Not quite sure about it now. But I will figure out and let you know. Thanks.

 Fixing dfs statement for Pig
 

 Key: PIG-891
 URL: https://issues.apache.org/jira/browse/PIG-891
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Daniel Dai
Assignee: Jeff Zhang
Priority: Minor
 Fix For: 0.4.0

 Attachments: Pig_891.patch


 Several Hadoop dfs commands are unsupported or restricted in current Pig. We 
 need to fix that. The problems include:
 1. Several commands are not supported: lsr, dus, count, rmr, expunge, put, 
 moveFromLocal, get, getmerge, text, moveToLocal, mkdir, touchz, test, stat, 
 tail, chmod, chown, chgrp. A reference for these commands can be found at 
 http://hadoop.apache.org/common/docs/current/hdfs_shell.html
 2. None of the existing dfs commands support globbing.
 3. Pig should provide a programmatic way to perform dfs commands. Several of 
 them exist in PigServer, but not all of them.
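
 To illustrate the globbing point in the list above, here is a minimal Python
 sketch of what glob support for a dfs command could look like. It is a toy
 model over an in-memory list of paths, not Pig or Hadoop code: expand_glob and
 dfs_rm are hypothetical names, and Python's fnmatch only approximates Hadoop's
 glob rules (for example, '*' here also matches '/').

```python
from fnmatch import fnmatchcase


def expand_glob(pattern, paths):
    """Return the paths that match a shell-style pattern, sorted."""
    return sorted(p for p in paths if fnmatchcase(p, pattern))


def dfs_rm(pattern, paths):
    """Hypothetical glob-aware 'rm' over an in-memory path list.

    Returns a (removed, remaining) pair instead of touching a real file system.
    """
    matched = set(expand_glob(pattern, paths))
    remaining = sorted(p for p in paths if p not in matched)
    return sorted(matched), remaining


if __name__ == "__main__":
    files = ["/data/part-00000", "/data/part-00001", "/data/_logs"]
    print(dfs_rm("/data/part-*", files))
```

 Expanding the pattern first and then applying the command to each match keeps
 the per-command logic unchanged, which is one plausible way to add globbing to
 every existing command at once.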

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (PIG-957) Tutorial is broken with 0.4 branch and trunk


 [ 
https://issues.apache.org/jira/browse/PIG-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-957:
---

Status: Open  (was: Patch Available)

 Tutorial is broken with 0.4 branch and trunk
 

 Key: PIG-957
 URL: https://issues.apache.org/jira/browse/PIG-957
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.3.0
Reporter: Olga Natkovich
Assignee: Pradeep Kamath
 Fix For: 0.4.0

 Attachments: PIG-957-2.patch, PIG-957.patch


 As I was testing the Pig Tutorial in preparation for the release, I found 
 that we broke the second script both in local mode and in MR mode. The issue 
 has to do with schema and naming fields.  
 Here is what I see:
  
 java -cp pig.jar org.apache.pig.Main -x local script2-local.pig
 2009-09-11 12:52:46,961 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
 1000: Error during parsing. Invalid alias: hour00::group::ngram in 
 {group::ngram: chararray,group::hour: chararray,hour00::count: long,ngram: 
 chararray,hour: chararray,hour12::count: long}
 09/09/11 12:52:46 ERROR grunt.Grunt: ERROR 1000: Error during parsing. 
 Invalid alias: hour00::group::ngram in {group::ngram: chararray,group::hour: 
 chararray,hour00::count: long,ngram: chararray,hour: chararray,hour12::count: 
 long}
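
 The error above can be read as an alias lookup that finds no matching field.
 The following Python sketch is a toy model of such a lookup, assuming an alias
 resolves either by exact name or by a unique '::'-qualified suffix. The
 function resolve_alias is hypothetical and is not Pig's actual resolver.

```python
def resolve_alias(alias, schema_fields):
    """Resolve an alias against a list of field names.

    Toy rule (an assumption, not Pig's implementation): an exact match wins;
    otherwise the alias must be the unique '::'-qualified suffix of one field.
    """
    if alias in schema_fields:
        return alias
    suffix = "::" + alias
    candidates = [f for f in schema_fields if f.endswith(suffix)]
    if len(candidates) == 1:
        return candidates[0]
    raise KeyError(
        "Invalid alias: %s in {%s}" % (alias, ", ".join(schema_fields))
    )


if __name__ == "__main__":
    # Field names taken from the schema printed in the error message.
    fields = [
        "group::ngram", "group::hour", "hour00::count",
        "ngram", "hour", "hour12::count",
    ]
    print(resolve_alias("hour00::count", fields))
    try:
        resolve_alias("hour00::group::ngram", fields)
    except KeyError as err:
        print(err)
```

 Under this model, hour00::group::ngram fails because the schema only contains
 group::ngram and ngram; no field carries the hour00:: prefix on that column.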


[jira] Updated: (PIG-957) Tutorial is broken with 0.4 branch and trunk

[
https://issues.apache.org/jira/browse/PIG-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Pradeep Kamath updated PIG-957:
---

Attachment: PIG-957-2.patch

There were two unit test failures in the last patch:
1) TestPigServer failed because join's describe now prefixes each field with
the outer relation alias. I corrected the test case to update the expected
result.
2) TestSkewedJoin timed out, but it ran fine on my local box.

Resubmitting with just the change in 1) above.


[jira] Updated: (PIG-957) Tutorial is broken with 0.4 branch and trunk


 [ 
https://issues.apache.org/jira/browse/PIG-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-957:
---

Status: Patch Available  (was: Open)


[jira] Commented: (PIG-949) Zebra Bug: splitting map into multiple column group using storage hint causes unexpected behaviour

2009-09-14 Thread Yan Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755094#action_12755094
 ] 

Yan Zhou commented on PIG-949:
--

The problem is caused by not adding the ColumnMappingEntry objects from the 
key-split specs in the storage info to an explicitly specified MAP item in the 
storage info, which leaves out the CGs (column groups) that the key-split specs 
need. Everything falls apart thereafter. I will create a patch for the R1 patch 
release soon.

 Zebra Bug: splitting map into multiple column group using storage hint causes 
 unexpected behaviour
 --

 Key: PIG-949
 URL: https://issues.apache.org/jira/browse/PIG-949
 Project: Pig
  Issue Type: Bug
 Environment: linux
Reporter: Alok Singh

 Hi,
 The storage hint specification plays an important part in whether the output 
 table is readable or not.
 Say we have a map column named 'map'.
 One can split the map into a column group using [map#{k1}, map#{k2}...]; 
 the remaining map fields are then automatically added to the default group.
 If the user tries to create a new column group for the remaining fields, as in 
 [map#{k1}, map#{k2}, ..][map], i.e. a separate column group, 
 the table writer will still create the table.
 However, if one then tries to load the created table via Pig or via MapReduce 
 using TableInputFormat, the reader has a problem reading the map.
 We get the following stack trace:
 09/09/09 00:09:45 INFO mapred.JobClient: Task Id : attempt_200908191538_33939_m_21_2, Status : FAILED
 java.io.IOException: getValue() failed: null
     at org.apache.hadoop.zebra.io.BasicTable$Reader$BTScanner.getValue(BasicTable.java:775)
     at org.apache.hadoop.zebra.mapred.TableRecordReader.next(TableInputFormat.java:717)
     at org.apache.hadoop.zebra.mapred.TableRecordReader.next(TableInputFormat.java:651)
     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:191)
     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:175)
     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
     at org.apache.hadoop.mapred.Child.main(Child.java:170)
 Alok


[jira] Commented: (PIG-957) Tutorial is broken with 0.4 branch and trunk

2009-09-14 Thread Olga Natkovich (JIRA)


[ 
https://issues.apache.org/jira/browse/PIG-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755101#action_12755101
 ] 

Olga Natkovich commented on PIG-957:


Pradeep, please commit. The change is trivial enough that we need not wait for 
another automated test run.


[jira] Updated: (PIG-955) Skewed join generates incorrect results

2009-09-14 Thread Olga Natkovich (JIRA)


 [ 
https://issues.apache.org/jira/browse/PIG-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-955:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to trunk and branch-0.4. Thanks, Ying.

 Skewed join generates  incorrect results 
 -

 Key: PIG-955
 URL: https://issues.apache.org/jira/browse/PIG-955
 Project: Pig
  Issue Type: Improvement
Reporter: Ying He
 Attachments: PIG-955.patch, PIG-955.patch2


 SkewedPartitioner doesn't partition the skewed keys in partition table (first 
 table) correctly. This can cause data loss.
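
 The report does not say how the partitioning goes wrong. As background only,
 the Python sketch below models the general idea behind partitioning for a
 skewed join: tuples of a skewed key from the first table are spread round-robin
 over a set of reducers, and matching tuples from the second table are copied to
 every reducer in that set. ToySkewedPartitioner and its methods are
 hypothetical names, and this is not the SkewedPartitioner code from Pig.

```python
class ToySkewedPartitioner:
    """Toy model of skew-aware partitioning (an assumption, not Pig's code)."""

    def __init__(self, reducer_ranges, num_reducers):
        # reducer_ranges maps a skewed key to the reducers reserved for it.
        self.reducer_ranges = reducer_ranges
        self.num_reducers = num_reducers
        self._next = {}  # per-key round-robin counter

    def partition_left(self, key):
        """Pick one reducer for a tuple of the first (partitioned) table."""
        reducers = self.reducer_ranges.get(key)
        if reducers is None:
            return hash(key) % self.num_reducers
        i = self._next.get(key, 0)
        self._next[key] = i + 1
        return reducers[i % len(reducers)]

    def partition_right(self, key):
        """Return every reducer that must receive a second-table tuple."""
        reducers = self.reducer_ranges.get(key)
        if reducers is None:
            return [hash(key) % self.num_reducers]
        return list(reducers)


if __name__ == "__main__":
    p = ToySkewedPartitioner({"hot": [0, 1, 2]}, num_reducers=4)
    print([p.partition_left("hot") for _ in range(5)])
    print(p.partition_right("hot"))
```

 In this model, a join result is lost whenever a first-table tuple lands on a
 reducer that did not receive the matching second-table tuples, which is why
 the two sides must agree on the reducer set for each skewed key.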


[jira] Commented: (PIG-957) Tutorial is broken with 0.4 branch and trunk

2009-09-14 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/PIG-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755140#action_12755140
]

Hadoop QA commented on PIG-957:
---

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12419544/PIG-957-2.patch
against trunk revision 814075.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 6 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/27/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/27/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/27/console

This message is automatically generated.


[jira] Updated: (PIG-957) Tutorial is broken with 0.4 branch and trunk


 [ 
https://issues.apache.org/jira/browse/PIG-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-957:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Patch committed to both trunk and branch-0.4


[jira] Updated: (PIG-922) Logical optimizer: push up project


 [ 
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-922:
---

Attachment: PIG-922-p3_1.patch

Attaching the phase 3 patch. I am still working on adding more unit tests.

 Logical optimizer: push up project
 --

 Key: PIG-922
 URL: https://issues.apache.org/jira/browse/PIG-922
 Project: Pig
  Issue Type: New Feature
  Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch, 
 PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch, 
 PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch


 This is a continuation of the work in 
 [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add 
 another rule to the logical optimizer: push up project, i.e., prune columns as 
 early as possible.
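
 As a rough illustration of what pruning columns as early as possible means,
 the Python sketch below walks a linear plan backwards to find which input
 columns are still needed, then drops the rest right after the load. The plan
 representation and the names required_columns and prune are hypothetical, and
 the sketch assumes every operator refers to columns by their position in the
 original input. It is not the optimizer rule from the patch.

```python
def required_columns(final_columns, operators):
    """Collect the input columns needed by a linear plan.

    operators is a list of (name, columns_read) pairs in execution order;
    final_columns are the positions projected at the end of the plan.
    """
    needed = set(final_columns)
    for _name, columns_read in reversed(operators):
        needed |= set(columns_read)
    return sorted(needed)


def prune(rows, keep):
    """Project each row down to the columns in keep, right after loading."""
    return [tuple(row[i] for i in keep) for row in rows]


if __name__ == "__main__":
    plan = [("filter", [2]), ("order", [0])]
    keep = required_columns([0, 5], plan)
    rows = [("a", "b", "c", "d", "e", "f")]
    print(keep, prune(rows, keep))
```

 Only columns 0, 2 and 5 flow through the rest of the plan in this example, so
 the other three never need to be carried between operators.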


RE: [VOTE] Release Pig 0.4.0 (candidate 0)

2009-09-14 Thread Pradeep Kamath

+1 for release.

-Original Message-
From: Olga Natkovich [mailto:ol...@yahoo-inc.com] 
Sent: Monday, September 14, 2009 2:06 PM
To: pig-dev@hadoop.apache.org; priv...@hadoop.apache.org
Subject: [VOTE] Release Pig 0.4.0 (candidate 0)

Hi,

I created a candidate build for Pig 0.4.0 release. The highlights of
this release are

-  Performance improvements especially in the area of JOIN
support where we introduced two new join types: skew join to deal with
data skew and sort merge join to take advantage of the sorted data sets.

-  Support for Outer join.

-  Works with Hadoop 0.18

I ran the release audit and rat report looked fine. The relevant part is
attached below.

Keys used to sign the release are available at
http://svn.apache.org/viewvc/hadoop/pig/trunk/KEYS?view=markup.

Please download the release and try it out:
http://people.apache.org/~olga/pig-0.4.0-candidate-0.

Should we release this? Vote closes on Thursday, 9/17.

Olga

 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/contrib/CHANGES.txt
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/contrib/zebra/CHANGES.txt
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/broken-links.xml
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/cookbook.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/index.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/linkmap.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/piglatin_reference.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/piglatin_users.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/setup.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/tutorial.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/udf.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/api/package-list
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/missingSinces.txt
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/user_comments_for_pig_0.3.1_to_pig_0.5.0-dev.xml
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_additions.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_all.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_changes.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/alldiffs_index_removals.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/changes-summary.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_additions.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_all.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_changes.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/classes_index_removals.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_additions.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_all.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_changes.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/constructors_index_removals.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_additions.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_all.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_changes.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/fields_index_removals.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/jdiff_help.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/jdiff_statistics.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/jdiff_topleftframe.html
 [java]  !? /home/olgan/src/pig-apache/trunk/build/pig-0.5.0-dev/docs/jdiff/changes/methods_index_additions.html
 [java]

[jira] Updated: (PIG-592) schema inferred incorrectly


 [ 
https://issues.apache.org/jira/browse/PIG-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-592:
---

Fix Version/s: 0.5.0
Affects Version/s: (was: 0.2.0)
   0.4.0
   Status: Patch Available  (was: Open)

 schema inferred incorrectly
 ---

 Key: PIG-592
 URL: https://issues.apache.org/jira/browse/PIG-592
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Christopher Olston
 Fix For: 0.5.0

 Attachments: PIG-592-1.patch


 A simple Pig script that never introduces any schema information:
 A = load 'foo';
 B = foreach (group A by $8) generate group, COUNT($1);
 C = load 'bar';   -- ('bar' has two columns)
 D = join B by $0, C by $0;
 E = foreach D generate $0, $1, $3;
 fails, complaining that $3 does not exist:
 java.io.IOException: Out of bound access. Trying to access non-existent column: 3. Schema {B::group: bytearray,long,bytearray} has 3 column(s).
 Apparently Pig gets confused and thinks it knows the schema for C (a single 
 bytearray column).
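
 The behaviour one would expect here can be sketched with a toy schema model in
 Python: when either join input has an unknown schema, the join output schema
 is unknown too, and positional references cannot be rejected at plan time.
 The functions join_schema and check_projection are hypothetical and only
 model the reasoning; they are not Pig's schema code.

```python
def join_schema(left, right):
    """Combine two input schemas for a join.

    A schema is a list of (name, type) pairs, or None when it is unknown.
    If either side is unknown, the result is unknown.
    """
    if left is None or right is None:
        return None
    return list(left) + list(right)


def check_projection(schema, index):
    """Return True if $index may be referenced against the given schema."""
    if schema is None:
        return True  # nothing to validate against; defer to run time
    return 0 <= index < len(schema)


if __name__ == "__main__":
    b = [("group", "bytearray"), (None, "long")]
    # Expected: C has no declared schema, so $3 cannot be rejected.
    print(check_projection(join_schema(b, None), 3))
    # Reported: C is treated as a single bytearray column, so $3 is rejected.
    print(check_projection(join_schema(b, [(None, "bytearray")]), 3))
```

 The second call mirrors the three-column schema shown in the error message,
 where the two columns of B are followed by a single assumed column for C.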


[jira] Updated: (PIG-592) schema inferred incorrectly


 [ 
https://issues.apache.org/jira/browse/PIG-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-592:
---

Attachment: PIG-592-1.patch


[jira] Assigned: (PIG-858) Order By followed by replicated join fails while compiling MR-plan from physical plan


 [ 
https://issues.apache.org/jira/browse/PIG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned PIG-858:


Assignee: Ashutosh Chauhan

 Order By followed by replicated join fails while compiling MR-plan from 
 physical plan
 ---

 Key: PIG-858
 URL: https://issues.apache.org/jira/browse/PIG-858
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: pig-858.patch


 Consider the query:
 {code}
 A = load 'a';
 B = order A by $0;
 C = join A by $0, B by $0;
 explain C;
 {code}
 works. But if replicated join is used instead
 {code}
 A = load 'a';
 B = order A by $0;
 C = join A by $0, B by $0 using replicated;
 explain C;
 {code}
 this fails with ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2034: Error 
 compiling operator POFRJoin
 relevant stacktrace:
 {code}
 Caused by: java.lang.RuntimeException: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompilerException: ERROR 2034: Error compiling operator POFRJoin
     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.explain(HExecutionEngine.java:306)
     at org.apache.pig.PigServer.explain(PigServer.java:574)
     ... 8 more
 Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompilerException: ERROR 2034: Error compiling operator POFRJoin
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:942)
     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.visit(POFRJoin.java:173)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:342)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:327)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.compile(MRCompiler.java:233)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:301)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.explain(MapReduceLauncher.java:278)
     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.explain(HExecutionEngine.java:303)
     ... 9 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler.visitFRJoin(MRCompiler.java:901)
     ... 16 more
 {code}


[jira] Updated: (PIG-858) Order By followed by replicated join fails while compiling MR-plan from physical plan


 [ 
https://issues.apache.org/jira/browse/PIG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-858:
-

Attachment: pig-858.patch

Patch as discussed in the previous comment. It also includes test cases where a 
blocking operator (order-by, distinct) occurs before an FRJoin.


[jira] Assigned: (PIG-959) Merge Join fails when there is a blocking operator before it in query.


 [ 
https://issues.apache.org/jira/browse/PIG-959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned PIG-959:


Assignee: Ashutosh Chauhan

 Merge Join fails when there is a blocking operator before it in query.
 --

 Key: PIG-959
 URL: https://issues.apache.org/jira/browse/PIG-959
 Project: Pig
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan

 If a query has an order-by, distinct, or any other blocking operator followed 
 by a Merge Join, Pig fails to compile it.


[jira] Commented: (PIG-959) Merge Join fails when there is a blocking operator before it in query.


[ 
https://issues.apache.org/jira/browse/PIG-959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755270#action_12755270
 ] 

Ashutosh Chauhan commented on PIG-959:
--

This issue is blocked on PIG-858


[jira] Created: (PIG-959) Merge Join fails when there is a blocking operator before it in query.