[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-01-26 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804986#action_12804986
 ] 

Jeff Zhang commented on PIG-366:


Is anyone still maintaining this issue? If the author could contribute the 
latest source code, I can help with this jira.



 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Shubham Chopra
Priority: Minor
 Attachments: org.apache.pig.pigpen_0.0.1.jar, 
 org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, 
 pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-366) PigPen - Eclipse plugin for a graphical PigLatin editor

2010-01-26 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805115#action_12805115
 ] 

Olga Natkovich commented on PIG-366:


I don't think we have an owner. This code is looking for one :)

 PigPen - Eclipse plugin for a graphical PigLatin editor
 ---

 Key: PIG-366
 URL: https://issues.apache.org/jira/browse/PIG-366
 Project: Pig
  Issue Type: New Feature
Reporter: Shubham Chopra
Assignee: Shubham Chopra
Priority: Minor
 Attachments: org.apache.pig.pigpen_0.0.1.jar, 
 org.apache.pig.pigpen_0.0.1.tgz, org.apache.pig.pigpen_0.0.4.jar, 
 pigpen.patch, pigPen.patch, PigPen.tgz


 This is an Eclipse plugin that provides a GUI that can help users create 
 PigLatin scripts and see the example generator outputs on the fly and submit 
 the jobs to hadoop clusters.




[jira] Updated: (PIG-1169) Top-N queries produce incorrect results when a store statement is added between order by and limit statement

2010-01-26 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1169:
--

  Description: 
??We tried to get top N results after a groupby and sort, and got different 
results with or without storing the full sorted results. Here is a skeleton of 
our pig script.??

{code}
raw_data = Load 'input_files' AS (f1, f2, ..., fn);
grouped = group raw_data by (f1, f2);
data = foreach grouped generate FLATTEN(group), SUM(raw_data.fk) as value;
ordered = order data by value DESC parallel 10;
topn = limit ordered 10;
store ordered into 'outputdir/full';
store topn into 'outputdir/topn';
{code}

??With the statement 'store ordered ...', top N results are incorrect, but 
without the statement, results are correct. Has anyone seen this before? I know 
a similar bug has been fixed in the multi-query release. We are on Pig 0.4 and 
Hadoop 0.20.1.??
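
As a point of reference for what the skeleton script is expected to compute, here is a minimal Python sketch of the same group, sum, order, limit pipeline on a small in-memory list. The sample rows and the helper names (group_sum, order_desc, limit) are made up for illustration and are not Pig APIs. It says nothing about the cause of the bug; it only shows that topn should equal the first N rows of ordered, whether or not ordered is also stored.

```python
from collections import defaultdict


def group_sum(rows):
    """Group (f1, f2, fk) rows by (f1, f2) and sum fk, like GROUP + SUM."""
    totals = defaultdict(int)
    for f1, f2, fk in rows:
        totals[(f1, f2)] += fk
    return [(f1, f2, value) for (f1, f2), value in totals.items()]


def order_desc(data):
    """Sort by the summed value, largest first, like ORDER ... DESC."""
    return sorted(data, key=lambda row: row[2], reverse=True)


def limit(relation, n):
    """Keep the first n rows of an already ordered relation, like LIMIT."""
    return relation[:n]


# Hypothetical sample input standing in for 'input_files'.
rows = [("a", "x", 3), ("a", "x", 4), ("b", "y", 10), ("c", "z", 1), ("b", "y", -2)]

ordered = order_desc(group_sum(rows))
topn = limit(ordered, 2)
```

Comparing the stored top-N output against the head of the stored full output in this way is one simple check for the discrepancy described above.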



  was:

??We tried to get top N results after a groupby and sort, and got different 
results with or without storing the full sorted results. Here is a skeleton of 
our pig script.??

{code}
raw_data = Load 'input_files' AS (f1, f2, ..., fn);
grouped = group raw_data by (f1, f2);
data = foreach grouped generate FLATTEN(group), SUM(raw_data.fk) as value;
ordered = order data by value DESC parallel 10;
topn = limit ordered 10;
store ordered into 'outputdir/full';
store topn into 'outputdir/topn';
{code}

??With the statement 'store ordered ...', top N results are incorrect, but 
without the statement, results are correct. Has anyone seen this before? I know 
a similar bug has been fixed in the multi-query release. We are on Pig 0.4 and 
Hadoop 0.20.1.??



Fix Version/s: 0.7.0

 Top-N queries produce incorrect results when a store statement is added 
 between order by and limit statement
 

 Key: PIG-1169
 URL: https://issues.apache.org/jira/browse/PIG-1169
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.7.0






[jira] Assigned: (PIG-1169) Top-N queries produce incorrect results when a store statement is added between order by and limit statement

2010-01-26 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding reassigned PIG-1169:
-

Assignee: Daniel Dai  (was: Richard Ding)

 Top-N queries produce incorrect results when a store statement is added 
 between order by and limit statement
 

 Key: PIG-1169
 URL: https://issues.apache.org/jira/browse/PIG-1169
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Richard Ding
Assignee: Daniel Dai
 Fix For: 0.7.0






[jira] Created: (PIG-1202) explain plan throws out exception

2010-01-26 Thread Ying He (JIRA)
explain plan throws out exception 
--

 Key: PIG-1202
 URL: https://issues.apache.org/jira/browse/PIG-1202
 Project: Pig
  Issue Type: Bug
Reporter: Ying He


Run the following script:

a = load 's/part*' as (id:int, f:chararray);
b = load 's/part*' as (id:int, f:chararray);
c = join a by id, b by id;
d = filter c by a::f == 'apple';
explain d;

It fails with the error message:
ERROR 1067: Unable to explain alias d




[jira] Created: (PIG-1203) Temporarily disable failed unit test in load-store-redesign branch which have external dependency

2010-01-26 Thread Daniel Dai (JIRA)
Temporarily disable failed unit test in load-store-redesign branch which have 
external dependency
-

 Key: PIG-1203
 URL: https://issues.apache.org/jira/browse/PIG-1203
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.7.0


In the load-store-redesign branch, two test suites, TestHBaseStorage and 
TestCounters, always fail. TestHBaseStorage depends on 
https://issues.apache.org/jira/browse/PIG-1200, and TestCounters depends on a 
future version of Hadoop. We will disable these two test suites temporarily and 
enable them again once the dependent issues are resolved.




[jira] Updated: (PIG-1203) Temporarily disable failed unit test in load-store-redesign branch which have external dependency

2010-01-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1203:


Issue Type: Sub-task  (was: Bug)
Parent: PIG-966

 Temporarily disable failed unit test in load-store-redesign branch which have 
 external dependency
 -

 Key: PIG-1203
 URL: https://issues.apache.org/jira/browse/PIG-1203
 Project: Pig
  Issue Type: Sub-task
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.7.0






[jira] Updated: (PIG-1201) [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all JobConf contents including those unused by zebra

2010-01-26 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-1201:
--

Status: Open  (was: Patch Available)

 [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all 
 JobConf contents including those unused by zebra
 --

 Key: PIG-1201
 URL: https://issues.apache.org/jira/browse/PIG-1201
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG-1201.patch







[jira] Updated: (PIG-1203) Temporarily disable failed unit test in load-store-redesign branch which have external dependency

2010-01-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1203:


Attachment: PIG-1203-1.patch

Patch for the load-store-redesign branch

 Temporarily disable failed unit test in load-store-redesign branch which have 
 external dependency
 -

 Key: PIG-1203
 URL: https://issues.apache.org/jira/browse/PIG-1203
 Project: Pig
  Issue Type: Sub-task
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.7.0

 Attachments: PIG-1203-1.patch






[jira] Commented: (PIG-1201) [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all JobConf contents including those unused by zebra

2010-01-26 Thread Yan Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805195#action_12805195
 ] 

Yan Zhou commented on PIG-1201:
---

A listStatus call from every mapper to the name node is costly, particularly 
if the target has a huge number of disk entries, i.e., files and directories. 
Zebra has this problem in two places:

1) For unsorted tables, the index is not built on disk. The input split, which 
is a tfile row split, carries a file index that each and every mapper must map 
to a file name using the index, which lists the file names in order along with 
their sizes. Building the index makes a listStatus call because it needs 
information about all files, and if the number of files is huge this strains 
name node resources. Instead, the file index in the split can be replaced with 
the file name, so that the mapping, and consequently the index, is not needed 
at all for routine operations such as queries against the tables. For other 
informational requests such as dumpInfo, where a comprehensive picture is 
required, the index can be built as needed. An on-disk index is still 
preferable because it would save the one listStatus call made by the front 
end, but it would require more changes to support backward compatibility, and 
the meta file that holds the index does not support versioning. Consequently, 
this work is deferred to a future release, although the on-disk index will be 
built for future convenience.

2) Each BasicTable.Reader, at construction, checks and marks all deleted CGs 
in the SchemaFile.setCGDeletedFlags method, which makes a listStatus call. 
This may not be as bad as the case in 1), but for tables with many CGs it 
could present a problem. Instead, the check can be made once by the front end 
and the result passed to the mappers.
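
The idea in 1) can be illustrated with a small Python sketch. Everything in it is hypothetical (FakeNameNode, IndexSplit, NameSplit and the helper functions are stand-ins, not Zebra or HDFS classes); it only counts how many directory listings are issued when a split carries a file index versus a file name.

```python
from dataclasses import dataclass


class FakeNameNode:
    """Stand-in for the name node that counts directory listing calls."""

    def __init__(self, files):
        self._files = list(files)
        self.list_calls = 0

    def list_status(self):
        self.list_calls += 1
        return list(self._files)


@dataclass(frozen=True)
class IndexSplit:
    file_index: int  # the mapper must rebuild the listing to resolve this


@dataclass(frozen=True)
class NameSplit:
    file_name: str  # the mapper can open the file directly


def front_end_index_splits(name_node):
    return [IndexSplit(i) for i, _ in enumerate(name_node.list_status())]


def front_end_name_splits(name_node):
    return [NameSplit(name) for name in name_node.list_status()]


def mapper_resolve_index(name_node, split):
    # Every mapper issues its own listing just to translate index to name.
    return name_node.list_status()[split.file_index]


def mapper_resolve_name(split):
    # No name node round trip is needed.
    return split.file_name


files = ["part-0", "part-1", "part-2"]

nn_index = FakeNameNode(files)
by_index = [mapper_resolve_index(nn_index, s) for s in front_end_index_splits(nn_index)]

nn_name = FakeNameNode(files)
by_name = [mapper_resolve_name(s) for s in front_end_name_splits(nn_name)]
```

With three files, the index-based variant lists the directory once in the front end and once per mapper, while the name-based variant lists it only once in total. The same pattern (compute once in the front end, ship the result) applies to the deleted-CG check in 2).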

The huge JobConf serialization size in the Pig loader implementation will be 
fixed by serializing only the few configuration variables that Zebra needs.
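
A rough Python sketch of that last point, assuming the configuration can be treated as a flat key/value map. The key names below are invented for the example (the comment does not say which variables Zebra needs), and JSON stands in for whatever serialization the loader actually uses.

```python
import json

# Hypothetical key names; the real Zebra configuration keys are not listed here.
NEEDED_KEYS = ("example.zebra.projection", "example.zebra.sorted")


def serialize_needed(conf, needed_keys=NEEDED_KEYS):
    """Serialize only the whitelisted entries of a job configuration."""
    subset = {key: conf[key] for key in needed_keys if key in conf}
    return json.dumps(subset, sort_keys=True)


# A large configuration in which only two entries matter to the loader.
full_conf = {f"other.setting.{i}": "x" * 50 for i in range(1000)}
full_conf["example.zebra.projection"] = "a,b"
full_conf["example.zebra.sorted"] = "false"

small = serialize_needed(full_conf)
full = json.dumps(full_conf, sort_keys=True)
```

The serialized subset stays the same size no matter how many unrelated settings the job configuration accumulates.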

 [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all 
 JobConf contents including those unused by zebra
 --

 Key: PIG-1201
 URL: https://issues.apache.org/jira/browse/PIG-1201
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG-1201.patch







[jira] Commented: (PIG-1090) Update sources to reflect recent changes in load-store interfaces

2010-01-26 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805207#action_12805207
 ] 

Pradeep Kamath commented on PIG-1090:
-

+1 for PIG-1090-15.patch, patch committed.

Here are the results of running ant test-patch:
 [exec] +1 overall.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 6 new or modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec] 


 Update sources to reflect recent changes in load-store interfaces
 -

 Key: PIG-1090
 URL: https://issues.apache.org/jira/browse/PIG-1090
 Project: Pig
  Issue Type: Sub-task
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
 Attachments: PIG-1090-10.patch, PIG-1090-11.patch, PIG-1090-12.patch, 
 PIG-1090-13.patch, PIG-1090-14.patch, PIG-1090-15.patch, PIG-1090-2.patch, 
 PIG-1090-3.patch, PIG-1090-4.patch, PIG-1090-6.patch, PIG-1090-7.patch, 
 PIG-1090-8.patch, PIG-1090-9.patch, PIG-1090.patch, PIG-1190-5.patch


 There have been some changes (as recorded in the Changes Section, Nov 2 2009 
 sub section of http://wiki.apache.org/pig/LoadStoreRedesignProposal) in the 
 load/store interfaces - this jira is to track the task of making those 
 changes under src. Changes under test will be addressed in a different jira.




[jira] Updated: (PIG-1045) Integration with Hadoop 20 New API

2010-01-26 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1045:
--

Fix Version/s: 0.7.0

 Integration with Hadoop 20 New API
 --

 Key: PIG-1045
 URL: https://issues.apache.org/jira/browse/PIG-1045
 Project: Pig
  Issue Type: New Feature
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.7.0

 Attachments: PIG-1045.patch, PIG-1045.patch


 Hadoop 21 is not yet released, but we know that the switch to the new MR API 
 is coming there. This JIRA is for early integration with the portion of this 
 API that has been implemented in Hadoop 20.




[jira] Updated: (PIG-1200) Using TableInputFormat in HBaseStorage

2010-01-26 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1200:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

+1, patch committed on Jeff's behalf - thanks Jeff!

 Using TableInputFormat in HBaseStorage
 --

 Key: PIG-1200
 URL: https://issues.apache.org/jira/browse/PIG-1200
 Project: Pig
  Issue Type: Sub-task
Affects Versions: 0.7.0
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Fix For: 0.7.0

 Attachments: Pig_1200.patch







[jira] Commented: (PIG-1203) Temporarily disable failed unit test in load-store-redesign branch which have external dependency

2010-01-26 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805226#action_12805226
 ] 

Pradeep Kamath commented on PIG-1203:
-

I committed the patch with a change to disable only TestCounters, since 
PIG-1200 addresses the TestHBaseStorage failures.

 Temporarily disable failed unit test in load-store-redesign branch which have 
 external dependency
 -

 Key: PIG-1203
 URL: https://issues.apache.org/jira/browse/PIG-1203
 Project: Pig
  Issue Type: Sub-task
  Components: impl
Affects Versions: 0.7.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.7.0

 Attachments: PIG-1203-1.patch






[jira] Updated: (PIG-1090) Update sources to reflect recent changes in load-store interfaces

2010-01-26 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-1090:


Affects Version/s: 0.7.0
Fix Version/s: 0.7.0

 Update sources to reflect recent changes in load-store interfaces
 -

 Key: PIG-1090
 URL: https://issues.apache.org/jira/browse/PIG-1090
 Project: Pig
  Issue Type: Sub-task
Affects Versions: 0.7.0
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
 Fix For: 0.7.0

 Attachments: PIG-1090-10.patch, PIG-1090-11.patch, PIG-1090-12.patch, 
 PIG-1090-13.patch, PIG-1090-14.patch, PIG-1090-15.patch, PIG-1090-2.patch, 
 PIG-1090-3.patch, PIG-1090-4.patch, PIG-1090-6.patch, PIG-1090-7.patch, 
 PIG-1090-8.patch, PIG-1090-9.patch, PIG-1090.patch, PIG-1190-5.patch


 There have been some changes (as recorded in the Changes Section, Nov 2 2009 
 sub section of http://wiki.apache.org/pig/LoadStoreRedesignProposal) in the 
 load/store interfaces - this jira is to track the task of making those 
 changes under src. Changes under test will be addressed in a different jira.




[jira] Resolved: (PIG-1094) Fix unit tests corresponding to source changes so far

2010-01-26 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath resolved PIG-1094.
-

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Incompatible change]

Marking this issue as fixed since all unit tests except TestCounters now pass. 
The TestCounters failure will be tracked in PIG-1203.

 Fix unit tests corresponding to source changes so far
 -

 Key: PIG-1094
 URL: https://issues.apache.org/jira/browse/PIG-1094
 Project: Pig
  Issue Type: Sub-task
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
 Fix For: 0.7.0

 Attachments: PIG-1094.patch, PIG-1094_2.patch, PIG-1094_3.patch, 
 PIG-1094_4.patch, PIG-1094_5.patch, PIG-1094_6.patch, PIG-1094_7.patch


 The check-ins so far on the load-store-redesign branch have not addressed 
 unit test failures due to interface changes. This jira is to track the task 
 of making the common case unit tests work with the new interfaces. Some 
 aspects of the new proposal, like using the LoadCaster interface for casting 
 and making local mode work, have not been completed yet. Tests which are 
 failing for those reasons will not be fixed in this jira; they will be 
 addressed in the jiras corresponding to those tasks.




[jira] Updated: (PIG-613) Casting elements inside a tuple does not take effect

2010-01-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-613:
---

Fix Version/s: 0.7.0
 Assignee: Daniel Dai

 Casting elements inside a tuple does not take effect
 

 Key: PIG-613
 URL: https://issues.apache.org/jira/browse/PIG-613
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.2.0
Reporter: Viraj Bhat
Assignee: Daniel Dai
 Fix For: 0.7.0

 Attachments: myfloatdata.txt, SQUARE.java


 Consider the following Pig script, which casts the return values of the 
 SQUARE UDF, which are tuples of doubles, to tuples of longs. The describe 
 output of B shows the field as long; however, the result is still double.
 {code}
 register statistics.jar;
 A = load 'myfloatdata.txt' using PigStorage() as (doublecol:double);
 B = foreach A generate (tuple(long))statistics.SQUARE(doublecol) as 
 squares:(loadtimesq);
 describe B;
 explain B;
 dump B;
 {code}
 ===
 Describe output of B:
 B: {squares: (loadtimesq: long)}
 ===
 Sample output of B:
 ((7885.44))
 ((792098.2200010001))
 ((1497360.9268889998))
 ((50023.7956))
 ((0.972196))
 ((0.30980356))
 ((9.9760144E-7))
 ===
 Cause: The cast for Tuples has not been implemented in POCast.java
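
For comparison, here is a small Python sketch of the result the cast would be expected to produce once tuple casts are applied element by element. The function name cast_tuple_to_long is hypothetical, and truncation toward zero is an assumption borrowed from Java's double-to-long conversion rather than something stated in this issue.

```python
import math


def cast_tuple_to_long(values):
    """Cast every element of a tuple to an integer, truncating toward zero.

    Truncation mirrors Java's double-to-long conversion; that Pig's cast
    behaves the same way is an assumption of this sketch.
    """
    return tuple(math.trunc(value) for value in values)


# A few of the sample values reported above, as one-element tuples.
sample = [(7885.44,), (792098.2200010001,), (0.972196,), (9.9760144e-7,)]
casted = [cast_tuple_to_long(t) for t in sample]
```

Under that assumption the four sample tuples become (7885), (792098), (0) and (0), which is consistent with the long type reported by describe.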




[jira] Updated: (PIG-1201) [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all JobConf contents including those unused by zebra

2010-01-26 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-1201:
--

Attachment: PIG-1201.patch

 [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all 
 JobConf contents including those unused by zebra
 --

 Key: PIG-1201
 URL: https://issues.apache.org/jira/browse/PIG-1201
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG-1201.patch, PIG-1201.patch







[jira] Updated: (PIG-1201) [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all JobConf contents including those unused by zebra

2010-01-26 Thread Yan Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Zhou updated PIG-1201:
--

Status: Patch Available  (was: Open)

 [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all 
 JobConf contents including those unused by zebra
 --

 Key: PIG-1201
 URL: https://issues.apache.org/jira/browse/PIG-1201
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG-1201.patch, PIG-1201.patch







[jira] Updated: (PIG-1204) Join two streaming relations hang in local mode

2010-01-26 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1204:
--

Affects Version/s: (was: 0.5.0)
   0.6.0
   Status: Patch Available  (was: Open)

 Join two streaming relations hang in local mode
 ---

 Key: PIG-1204
 URL: https://issues.apache.org/jira/browse/PIG-1204
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.7.0

 Attachments: PIG-1204.patch


 The following script hangs when running in local mode if the input files 
 contain many lines (e.g. 10K). The same script works when running in MR mode.
 {code}
 A = load 'input1' as (a0, a1, a2);
 B = stream A through `head -1` as (a0, a1, a2);
 C = load 'input2' as (a0, a1, a2);
 D = stream C through `head -1` as (a0, a1, a2);
 E = join B by a0, D by a0;
 dump E;
 {code}  
 Here is one stack trace:
 Thread-13 prio=10 tid=0x09938400 nid=0x1232 in Object.wait() [0x8fffe000..0x8030]
   java.lang.Thread.State: WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  - waiting on 0x9b8e0a40 (a org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream)
  at java.lang.Object.wait(Object.java:485)
  at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream.getNextHelper(POStream.java:291)
  - locked 0x9b8e0a40 (a org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream)
  at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream.getNext(POStream.java:214)
  at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
  at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
  at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:162)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:232)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)




[jira] Updated: (PIG-1204) Join two streaming relations hang in local mode

2010-01-26 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1204:
--

Attachment: PIG-1204.patch

The cause was that a final class variable was modified by another class. In 
local mode all the mappers run in the same JVM, so this resulted in a 
deadlock.

This patch provides a fix.
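
The general hazard can be shown with a short Python sketch. It does not reproduce the deadlock or the actual Pig code; the class names and the input_done flag are invented, and the sketch only illustrates how class-level state is shared by every operator instance living in one process, while per-instance state is not.

```python
class SharedStateOperator:
    # Class-level flag: one copy per process, shared by every instance.
    input_done = False

    def finish(self):
        SharedStateOperator.input_done = True

    def is_done(self):
        return SharedStateOperator.input_done


class PerInstanceOperator:
    def __init__(self):
        # Instance-level flag: each operator keeps its own copy.
        self.input_done = False

    def finish(self):
        self.input_done = True

    def is_done(self):
        return self.input_done


# Two operators in the same process, as with two mappers in local mode.
first, second = SharedStateOperator(), SharedStateOperator()
first.finish()
shared_leak = second.is_done()  # the second operator sees the first one's update

third, fourth = PerInstanceOperator(), PerInstanceOperator()
third.finish()
isolated = fourth.is_done()  # the fourth operator is unaffected
```

When each mapper runs in its own JVM, as in MR mode, shared class-level state of this kind goes unnoticed, which fits the observation that the script only hangs in local mode.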

 Join two streaming relations hang in local mode
 ---

 Key: PIG-1204
 URL: https://issues.apache.org/jira/browse/PIG-1204
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.7.0

 Attachments: PIG-1204.patch






[jira] Commented: (PIG-1204) Join two streaming relations hang in local mode

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805364#action_12805364
 ] 

Hadoop QA commented on PIG-1204:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431490/PIG-1204.patch
  against trunk revision 903030.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/190/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/190/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/190/console

This message is automatically generated.

 Join two streaming relations hang in local mode
 ---

 Key: PIG-1204
 URL: https://issues.apache.org/jira/browse/PIG-1204
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.7.0

 Attachments: PIG-1204.patch


 The following script hangs in local mode when the input files contain 
 many lines (e.g. 10K). The same script works when run in MR mode.
 {code}
 A = load 'input1' as (a0, a1, a2);
 B = stream A through `head -1` as (a0, a1, a2);
 C = load 'input2' as (a0, a1, a2);
 D = stream C through `head -1` as (a0, a1, a2);
 E = join B by a0, D by a0;
 dump E;
 {code}  
 Here is one stack trace:
 Thread-13 prio=10 tid=0x09938400 nid=0x1232 in Object.wait() [0x8fffe000..0x8030]
 java.lang.Thread.State: WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 - waiting on 0x9b8e0a40 (a org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream)
 at java.lang.Object.wait(Object.java:485)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream.getNextHelper(POStream.java:291)
 - locked 0x9b8e0a40 (a org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStream.getNext(POStream.java:214)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
 at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:162)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:232)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)




[jira] Commented: (PIG-1201) [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all JobConf contents including those unused by zebra

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805366#action_12805366
 ] 

Hadoop QA commented on PIG-1201:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431488/PIG-1201.patch
  against trunk revision 903030.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/178/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/178/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/178/console

This message is automatically generated.

 [zebra] HDFS meta queries are issued by all mappers; Pig Loader serialize all 
 JobConf contents including those unused by zebra
 --

 Key: PIG-1201
 URL: https://issues.apache.org/jira/browse/PIG-1201
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Yan Zhou
Assignee: Yan Zhou
Priority: Minor
 Fix For: 0.6.0, 0.7.0

 Attachments: PIG-1201.patch, PIG-1201.patch



