[jira] Commented: (PIG-818) Explain doesn't handle PODemux properly

2009-05-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714421#action_12714421
 ] 

Hudson commented on PIG-818:


Integrated in Pig-trunk #457 (see 
[http://hudson.zones.apache.org/hudson/job/Pig-trunk/457/]):
Explain doesn't handle PODemux properly (hagleitn via olgan)


 Explain doesn't handle PODemux properly
 ---

 Key: PIG-818
 URL: https://issues.apache.org/jira/browse/PIG-818
 Project: Pig
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: explain.patch


 The PODemux operator has nested plans, but they are not expanded in the -dot 
 version of explain.
 Also, both split and demux are displayed as clusters of nodes, but it makes 
 more sense to show them as multi-output operators.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-819) run -param -param; is a valid grunt command

2009-05-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714420#action_12714420
 ] 

Hudson commented on PIG-819:


Integrated in Pig-trunk #457 (see 
[http://hudson.zones.apache.org/hudson/job/Pig-trunk/457/]):
run -param -param; is a valid grunt command (milindb via olgan)


 run -param -param; is a valid grunt command
 ---

 Key: PIG-819
 URL: https://issues.apache.org/jira/browse/PIG-819
 Project: Pig
  Issue Type: Bug
  Components: grunt
Affects Versions: 0.3.0
 Environment: all
Reporter: Milind Bhandarkar
Assignee: Milind Bhandarkar
 Attachments: invalidparam.patch


 By mistake, I typed 
 {code}
 run -param -param;
 {code}
 in grunt, and was surprised to find that it is a valid grunt command.
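
For illustration only, here is a minimal Python sketch of the kind of check that would reject this input. It is not Grunt's parser (Grunt is written in Java, and the contents of the attached patch are not shown in this thread). The function names `parse_run_params` and `is_valid`, and the rule that every -param must be followed by a name=value token, are assumptions made for the example.

```python
def parse_run_params(tokens):
    """Collect -param name=value pairs from a tokenized 'run' command.

    Hypothetical helper, not part of Pig: it assumes each -param must be
    followed by a single name=value token and rejects anything else.
    """
    params = {}
    i = 0
    while i < len(tokens):
        if tokens[i] == "-param":
            # The next token must exist and look like name=value.
            if i + 1 >= len(tokens):
                raise ValueError("-param requires a name=value argument")
            name, sep, value = tokens[i + 1].partition("=")
            if not sep or not name or name.startswith("-"):
                raise ValueError("invalid -param argument: %r" % tokens[i + 1])
            params[name] = value
            i += 2
        else:
            i += 1  # other tokens (e.g. the script name) are ignored here
    return params


def is_valid(tokens):
    """Return True when every -param in tokens has a name=value argument."""
    try:
        parse_run_params(tokens)
        return True
    except ValueError:
        return False
```

Under this assumed rule, the tokens from `run -param -param;` are rejected because the second -param is not a name=value pair, while something like `-param input=/data/in` is accepted.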




[jira] Commented: (PIG-796) support conversion from numeric types to chararray

2009-05-29 Thread Yiping Han (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714526#action_12714526
 ] 

Yiping Han commented on PIG-796:


I had the same idea that Alan proposed. I agree that in the common case most 
values are of the same type. Caching the type, and changing the cached type 
only when a ClassCastException is caught, would be the most efficient way.
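
A rough sketch of that idea, written in Python rather than Pig's Java, so a TypeError raised by the hypothetical `cast` helper stands in for the ClassCastException. The names `cast`, `NumericToChararray`, and `type_changes` are made up for illustration; this is not code from Pig.

```python
def cast(value, expected_type):
    """Stand-in for a Java cast: raise TypeError if the runtime type differs."""
    if type(value) is not expected_type:
        raise TypeError("expected %s, got %s"
                        % (expected_type.__name__, type(value).__name__))
    return value


class NumericToChararray:
    """Convert numeric values to strings, remembering the last type seen."""

    def __init__(self):
        self.cached_type = None   # type of the most recent value, if any
        self.type_changes = 0     # how often the cached type was set or replaced

    def convert(self, value):
        if self.cached_type is not None:
            try:
                # Fast path: assume the value has the cached type.
                return str(cast(value, self.cached_type))
            except TypeError:
                pass  # the type changed; fall through and re-detect it
        # Slow path: look at the actual type and cache it for later values.
        self.cached_type = type(value)
        self.type_changes += 1
        return str(value)
```

When most values share a type, the slow path runs only at the first value and at each change of type, which is the point of the caching.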

 support  conversion from numeric types to chararray
 ---

 Key: PIG-796
 URL: https://issues.apache.org/jira/browse/PIG-796
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich






[jira] Commented: (PIG-816) PigStorage() does not accept Unicode characters in its contructor

2009-05-29 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714528#action_12714528
 ] 

Olga Natkovich commented on PIG-816:


+1, the fix looks good

 PigStorage() does not accept Unicode characters in its contructor 
 --

 Key: PIG-816
 URL: https://issues.apache.org/jira/browse/PIG-816
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
Priority: Critical
 Fix For: 0.3.0

 Attachments: PIG-816.patch, pig_1243043613713.log


 A simple Pig script that uses Unicode characters in the PigStorage() 
 constructor fails with the error shown after the script:
 {code}
 studenttab = LOAD '/user/viraj/studenttab10k' AS (name:chararray, 
 age:int,gpa:float);
 X2 = GROUP studenttab by age;
 Y2 = FOREACH X2 GENERATE group, COUNT(studenttab);
 store Y2 into '/user/viraj/y2' using PigStorage('\u0001');
 {code}
 
 ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate 
 exception from backend error: org.apache.hadoop.ipc.RemoteException: 
 java.io.IOException: java.lang.RuntimeException: 
 org.xml.sax.SAXParseException: Character reference #1 is an invalid XML 
 character.
 
 Attaching log file.
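
The SAXParseException suggests that the delimiter ends up as a numeric character reference (that is, `&#1;`) in some XML document, and XML 1.0 does not allow U+0001 even when it is written as a reference. Where exactly that XML is produced is not stated in this thread. The Python sketch below only demonstrates the XML rule itself with the standard-library parser; the element names and the helper `is_well_formed` are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A tab (U+0009) is a legal XML 1.0 character, so its reference parses.
tab_ok = is_well_formed("<property><value>&#9;</value></property>")

# Ctrl-A (U+0001) is outside the XML 1.0 character range, so the same
# document with &#1; is rejected, mirroring the error reported above.
ctrl_a_ok = is_well_formed("<property><value>&#1;</value></property>")
```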




[jira] Created: (PIG-823) Hadoop Metadata Service

2009-05-29 Thread Olga Natkovich (JIRA)

Hadoop Metadata Service
---

 Key: PIG-823
 URL: https://issues.apache.org/jira/browse/PIG-823
 Project: Pig
  Issue Type: New Feature
Reporter: Olga Natkovich


This JIRA is created to track development of a metadata system for Hadoop. The 
goal of the system is to allow users and applications to register data stored 
on HDFS, search for the data available on HDFS, and associate metadata such as 
schema, statistics, etc. with a particular data unit or a data set stored on 
HDFS. The initial goal is to provide a fairly generic, low-level abstraction 
that any user or application on HDFS can use to store and retrieve metadata. 
Over time, higher-level abstractions closely tied to particular applications 
or tools can be developed.

Over time, it would make sense for the metadata service to become a subproject 
within Hadoop. For now, the proposal is to make it a contrib to Pig since Pig 
SQL is likely to be the first user of the system.





[jira] Issue Comment Edited: (PIG-823) Hadoop Metadata Service

2009-05-29 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714539#action_12714539
 ] 

Jeff Hammerbacher edited comment on PIG-823 at 5/29/09 11:20 AM:
-

Hey,

Hadoop already has a metadata service (well defined at 
http://svn.apache.org/viewvc/hadoop/hive/trunk/metastore/if/hive_metastore.thrift)
 and a SQL implementation in production use at scale at several organizations. 
Can any of that work be reused for this purpose? It seems like duplicating 
effort across subprojects is a bad idea.

Later,
Jeff

  was (Author: hammer):
Hey,

Hadoop already had a metadata service (well defined at 
http://svn.apache.org/viewvc/hadoop/hive/trunk/metastore/if/hive_metastore.thrift)
 and a SQL implementation in production use at scale at several organizations. 
Can any of that work be reused for this purpose? It seems like duplicating 
effort across subprojects is a bad idea.

Later,
Jeff
  
 Hadoop Metadata Service
 ---

 Key: PIG-823
 URL: https://issues.apache.org/jira/browse/PIG-823
 Project: Pig
  Issue Type: New Feature
Reporter: Olga Natkovich




[jira] Updated: (PIG-802) PERFORMANCE: not creating bags for ORDER BY

2009-05-29 Thread Rakesh Setty (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Setty updated PIG-802:
-

Attachment: (was: OrderByOptimization.patch)

 PERFORMANCE: not creating bags for ORDER BY
 ---

 Key: PIG-802
 URL: https://issues.apache.org/jira/browse/PIG-802
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich
 Attachments: OrderByOptimization.patch


 Order by should be changed to not use POPackage to put all of the tuples in a 
 bag on the reduce side, as the bag is just immediately flattened. It can 
 instead work like join does for the last input in the join. 




[jira] Commented: (PIG-823) Hadoop Metadata Service

2009-05-29 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714543#action_12714543
 ] 

Olga Natkovich commented on PIG-823:


We looked at metadata in Hive, and it is really focused on a higher level of 
abstraction, such as tables and partitions. We would like to have something 
lower level, more generic, and closer to HDFS. We see a wider use for this 
system than just supporting SQL, though SQL for Pig might be the first user.


 Hadoop Metadata Service
 ---

 Key: PIG-823
 URL: https://issues.apache.org/jira/browse/PIG-823
 Project: Pig
  Issue Type: New Feature
Reporter: Olga Natkovich




[jira] Updated: (PIG-802) PERFORMANCE: not creating bags for ORDER BY

2009-05-29 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-802:
---

Status: Open  (was: Patch Available)

 PERFORMANCE: not creating bags for ORDER BY
 ---

 Key: PIG-802
 URL: https://issues.apache.org/jira/browse/PIG-802
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich
 Attachments: OrderByOptimization.patch





[jira] Updated: (PIG-802) PERFORMANCE: not creating bags for ORDER BY

2009-05-29 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-802:
---

Status: Patch Available  (was: Open)

 PERFORMANCE: not creating bags for ORDER BY
 ---

 Key: PIG-802
 URL: https://issues.apache.org/jira/browse/PIG-802
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich
 Attachments: OrderByOptimization.patch





[jira] Commented: (PIG-823) Hadoop Metadata Service

2009-05-29 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714547#action_12714547
 ] 

Jeff Hammerbacher commented on PIG-823:
---

It's an open source project and easily extensible. There are many extensions to 
the service within Facebook to support more general information. Why not try to 
add them to the existing service, since it's already got pluggable backends and 
a server implementation already defined?

 Hadoop Metadata Service
 ---

 Key: PIG-823
 URL: https://issues.apache.org/jira/browse/PIG-823
 Project: Pig
  Issue Type: New Feature
Reporter: Olga Natkovich




[jira] Issue Comment Edited: (PIG-823) Hadoop Metadata Service

2009-05-29 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714547#action_12714547
 ] 

Jeff Hammerbacher edited comment on PIG-823 at 5/29/09 11:48 AM:
-

It's an open source project and easily extensible. There are many extensions to 
the service within Facebook to support more general information. Why not try to 
add the desired lower level metadata to the existing service as a patch to 
Hive, since it's already got pluggable backends and a server implementation 
already defined? Also, could you better define what close to HDFS means? 
There's a lot of HDFS metadata stored in the NameNode. Also, the initial 
implementation of the metadata repository for Hive stored data in HDFS, but it 
was found to be quite useful to have a separate service for metadata. Perhaps 
you could learn from their experiences?

  was (Author: hammer):
It's an open source project and easily extensible. There are many 
extensions to the service within Facebook to support more general information. 
Why not try to add them to the existing service, since it's already got 
pluggable backends and a server implementation already defined?
  
 Hadoop Metadata Service
 ---

 Key: PIG-823
 URL: https://issues.apache.org/jira/browse/PIG-823
 Project: Pig
  Issue Type: New Feature
Reporter: Olga Natkovich




Hudson build is back to normal: Pig-Patch-minerva.apache.org #63

2009-05-29 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/63/




[jira] Commented: (PIG-816) PigStorage() does not accept Unicode characters in its contructor

2009-05-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714559#action_12714559
 ] 

Hadoop QA commented on PIG-816:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12409405/PIG-816.patch
  against trunk revision 779788.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/63/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/63/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/63/console

This message is automatically generated.

 PigStorage() does not accept Unicode characters in its contructor 
 --

 Key: PIG-816
 URL: https://issues.apache.org/jira/browse/PIG-816
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
Priority: Critical
 Fix For: 0.3.0

 Attachments: PIG-816.patch, pig_1243043613713.log





[jira] Commented: (PIG-824) SQL interface for Pig

2009-05-29 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714571#action_12714571
 ] 

Jeff Hammerbacher commented on PIG-824:
---

Sigh. Really? Why build another SQL interface to Hadoop when we have two 
already (CloudBase, Hive)? Extending Pig to share Hive's metadata repository 
seems to be a much, much shorter path to a solution.

 SQL interface for Pig
 -

 Key: PIG-824
 URL: https://issues.apache.org/jira/browse/PIG-824
 Project: Pig
  Issue Type: New Feature
Reporter: Olga Natkovich

 In the last 18 months, PigLatin has gained significant popularity within the 
 open source community. Many users like its data flow model, its rich type 
 system, and its ability to work with any data available on HDFS or outside. We 
 have also heard from many users that having Pig speak SQL would bring many 
 more users. Having a single system that exports multiple interfaces is a big 
 advantage, as it guarantees consistent semantics, allows custom code reuse, 
 and reduces the amount of maintenance. This is especially relevant for 
 projects that use both interfaces for different parts of the system. For 
 instance, in a data warehousing system, you would have an ETL component that 
 brings data into the warehouse and a component that analyzes the data and 
 produces reports. PigLatin is uniquely suited for ETL processing, while SQL 
 might be a better fit for report generation.
 To start, it would make sense to implement a subset of the SQL92 standard and 
 to be as standards-compliant as possible. This would include all the 
 standard constructs: select, from, where, group-by + having, order by, limit, 
 join (inner + outer). Several extensions, such as support for Pig's UDFs and 
 possibly streaming, multiquery, and support for Pig's complex types, would be 
 helpful.
 This work is dependent on the metadata support outlined in 
 https://issues.apache.org/jira/browse/PIG-823




[jira] Created: (PIG-825) PIG_HADOOP_VERSION should be 18

2009-05-29 Thread Dmitriy V. Ryaboy (JIRA)

PIG_HADOOP_VERSION should be 18
---

 Key: PIG-825
 URL: https://issues.apache.org/jira/browse/PIG-825
 Project: Pig
  Issue Type: Bug
  Components: grunt
Reporter: Dmitriy V. Ryaboy


PIG_HADOOP_VERSION should be set to 18, not 17, as Hadoop 0.18 is now 
considered default.
Patch coming.




[jira] Updated: (PIG-825) PIG_HADOOP_VERSION should be 18

2009-05-29 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-825:
--

Attachment: pig-825.patch

Attached trivial patch, please review.

 PIG_HADOOP_VERSION should be 18
 ---

 Key: PIG-825
 URL: https://issues.apache.org/jira/browse/PIG-825
 Project: Pig
  Issue Type: Bug
  Components: grunt
Reporter: Dmitriy V. Ryaboy
 Attachments: pig-825.patch





Build failed in Hudson: Pig-Patch-minerva.apache.org #64

2009-05-29 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/64/

--
[...truncated 90881 lines...]
 [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: PacketResponder 0 
for block blk_-1834358976001448559_1011 terminating
 [exec] [junit] 09/05/29 14:18:34 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:51804 is added to 
blk_-1834358976001448559_1011 size 6
 [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: Received block 
blk_-1834358976001448559_1011 of size 6 from /127.0.0.1
 [exec] [junit] 09/05/29 14:18:34 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:39345 is added to 
blk_-1834358976001448559_1011 size 6
 [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: Received block 
blk_-1834358976001448559_1011 of size 6 from /127.0.0.1
 [exec] [junit] 09/05/29 14:18:34 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:59762 is added to 
blk_-1834358976001448559_1011 size 6
 [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: PacketResponder 1 
for block blk_-1834358976001448559_1011 terminating
 [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: PacketResponder 2 
for block blk_-1834358976001448559_1011 terminating
 [exec] [junit] 09/05/29 14:18:34 INFO 
executionengine.HExecutionEngine: Connecting to hadoop file system at: 
hdfs://localhost:51173
 [exec] [junit] 09/05/29 14:18:34 INFO 
executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: 
localhost:48177
 [exec] [junit] 09/05/29 14:18:34 INFO 
mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1
 [exec] [junit] 09/05/29 14:18:34 INFO 
mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1
 [exec] [junit] 09/05/29 14:18:35 WARN dfs.DataNode: Unexpected error 
trying to delete block blk_-5391508296031911272_1004. BlockInfo not found in 
volumeMap.
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Deleting block 
blk_4184519850683123566_1005 file dfs/data/data7/current/blk_4184519850683123566
 [exec] [junit] 09/05/29 14:18:35 WARN dfs.DataNode: 
java.io.IOException: Error in deleting blocks.
 [exec] [junit] at 
org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:1146)
 [exec] [junit] at 
org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:793)
 [exec] [junit] at 
org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:663)
 [exec] [junit] at 
org.apache.hadoop.dfs.DataNode.run(DataNode.java:2888)
 [exec] [junit] at java.lang.Thread.run(Thread.java:619)
 [exec] [junit] 
 [exec] [junit] 09/05/29 14:18:35 INFO 
mapReduceLayer.JobControlCompiler: Setting up single store job
 [exec] [junit] 09/05/29 14:18:35 WARN mapred.JobClient: Use 
GenericOptionsParser for parsing the arguments. Applications should implement 
Tool for the same.
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/tmp/hadoop-hudson/mapred/system/job_200905291417_0002/job.jar. 
blk_9150403780694500298_1012
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Receiving block 
blk_9150403780694500298_1012 src: /127.0.0.1:43863 dest: /127.0.0.1:48879
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Receiving block 
blk_9150403780694500298_1012 src: /127.0.0.1:33948 dest: /127.0.0.1:51804
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Receiving block 
blk_9150403780694500298_1012 src: /127.0.0.1:57145 dest: /127.0.0.1:59762
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Received block 
blk_9150403780694500298_1012 of size 1411199 from /127.0.0.1
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:59762 is added to 
blk_9150403780694500298_1012 size 1411199
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Received block 
blk_9150403780694500298_1012 of size 1411199 from /127.0.0.1
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: PacketResponder 0 
for block blk_9150403780694500298_1012 terminating
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: PacketResponder 1 
for block blk_9150403780694500298_1012 terminating
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:51804 is added to 
blk_9150403780694500298_1012 size 1411199
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Received block 
blk_9150403780694500298_1012 of size 1411199 from /127.0.0.1
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: PacketResponder 2 
for block blk_9150403780694500298_1012 terminating
 [exec] [junit] 09/05/29 14:18:35 INFO dfs.StateChange: BLOCK* 

[jira] Commented: (PIG-802) PERFORMANCE: not creating bags for ORDER BY

2009-05-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714594#action_12714594
 ] 

Hadoop QA commented on PIG-802:
---

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12409408/OrderByOptimization.patch
  against trunk revision 779788.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/64/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/64/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/64/console

This message is automatically generated.

 PERFORMANCE: not creating bags for ORDER BY
 ---

 Key: PIG-802
 URL: https://issues.apache.org/jira/browse/PIG-802
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich
 Attachments: OrderByOptimization.patch





[jira] Updated: (PIG-816) PigStorage() does not accept Unicode characters in its contructor

2009-05-29 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated PIG-816:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Patch committed.

 PigStorage() does not accept Unicode characters in its contructor 
 --

 Key: PIG-816
 URL: https://issues.apache.org/jira/browse/PIG-816
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.3.0
Reporter: Viraj Bhat
Assignee: Pradeep Kamath
Priority: Critical
 Fix For: 0.3.0

 Attachments: PIG-816.patch, pig_1243043613713.log





[jira] Updated: (PIG-822) Flatten semantics are unknown

2009-05-29 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-822:
---

Description: 
There is no formal specification of the flatten keyword in 
http://hadoop.apache.org/pig/docs/r0.2.0/piglatin.html 
There are only some examples.

I have found flatten to be very fragile and unpredictable with the data types 
it reads and creates. 

Please document:
Flatten to be explained formally in its own dedicated section: What are the 
valid input types, the output types it creates, what transformation it does 
from input to output and how the resulting data are named.



  was:
There is no formal specification of the flatten keyword in 
http://hadoop.apache.org/pig/docs/r0.2.0/piglatin.html 
There are only some examples.

I have found flatten to be very fragile and unpredictable with the data types 
it reads and creates. I have wasted too many hours (and Viraj too) trying to 
figure out its peculiarities, the latest of which is here: 
http://bug.corp.yahoo.com/show_bug.cgi?id=2768016 comment #15

Please document:
Flatten to be explained formally in its own dedicated section: What are the 
valid input types, the output types it creates, what transformation it does 
from input to output and how the resulting data are named.




 Flatten semantics are unknown
 -

 Key: PIG-822
 URL: https://issues.apache.org/jira/browse/PIG-822
 Project: Pig
  Issue Type: Bug
  Components: documentation
Reporter: George Mavromatis
Priority: Critical




[jira] Commented: (PIG-825) PIG_HADOOP_VERSION should be 18

2009-05-29 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714621#action_12714621
 ] 

Alan Gates commented on PIG-825:


I'll take a look at this patch.

 PIG_HADOOP_VERSION should be 18
 ---

 Key: PIG-825
 URL: https://issues.apache.org/jira/browse/PIG-825
 Project: Pig
  Issue Type: Bug
  Components: grunt
Reporter: Dmitriy V. Ryaboy
 Attachments: pig-825.patch





[jira] Updated: (PIG-825) PIG_HADOOP_VERSION should be 18

2009-05-29 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-825:
--

Attachment: pig-825.patch

Minor update to minor patch: fixed a typo in the bug number in CHANGES.txt

 PIG_HADOOP_VERSION should be 18
 ---

 Key: PIG-825
 URL: https://issues.apache.org/jira/browse/PIG-825
 Project: Pig
  Issue Type: Bug
  Components: grunt
Reporter: Dmitriy V. Ryaboy
 Attachments: pig-825.patch, pig-825.patch





[jira] Updated: (PIG-796) support conversion from numeric types to chararray

2009-05-29 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-796:
-

Attachment: pig-796.patch

This patch implements the fix as suggested by Alan.

 support  conversion from numeric types to chararray
 ---

 Key: PIG-796
 URL: https://issues.apache.org/jira/browse/PIG-796
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.2.0
Reporter: Olga Natkovich
 Attachments: pig-796.patch



