[jira] Commented: (PIG-818) Explain doesn't handle PODemux properly
[ https://issues.apache.org/jira/browse/PIG-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714421#action_12714421 ] Hudson commented on PIG-818:
Integrated in Pig-trunk #457 (See [http://hudson.zones.apache.org/hudson/job/Pig-trunk/457/]) : Explain doesn't handle PODemux properly (hagleitn via olgan)
Explain doesn't handle PODemux properly --- Key: PIG-818 URL: https://issues.apache.org/jira/browse/PIG-818 Project: Pig Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: explain.patch
The PODemux operator has nested plans but they are not expanded in the -dot version of explain. Also, both split and demux are displayed as clusters of nodes, but it really makes more sense to just show them as multi output operators.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-819) run -param -param; is a valid grunt command
[ https://issues.apache.org/jira/browse/PIG-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714420#action_12714420 ] Hudson commented on PIG-819:
Integrated in Pig-trunk #457 (See [http://hudson.zones.apache.org/hudson/job/Pig-trunk/457/]) : run -param -param; is a valid grunt command (milindb via olgan)
run -param -param; is a valid grunt command --- Key: PIG-819 URL: https://issues.apache.org/jira/browse/PIG-819 Project: Pig Issue Type: Bug Components: grunt Affects Versions: 0.3.0 Environment: all Reporter: Milind Bhandarkar Assignee: Milind Bhandarkar Attachments: invalidparam.patch
By mistake, I typed
{code}
run -param -param;
{code}
in grunt, and was surprised to find that it is a valid grunt command.
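To illustrate the kind of check the report is asking for, here is a minimal Java sketch. It is not the grunt parser and not the contents of invalidparam.patch, neither of which appears in this thread. The class RunArgsCheck and the method parseParams are hypothetical names, and the sketch assumes that each -param is meant to be followed by a name=value pair.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical validator for illustration only; this is not the grunt parser. */
final class RunArgsCheck {
    /** Returns the name=value pair that follows each -param, or throws if one is malformed. */
    static List<String> parseParams(String[] args) {
        List<String> params = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (!"-param".equals(args[i])) {
                continue; // script names and other options are ignored in this sketch
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("-param must be followed by name=value");
            }
            String pair = args[++i];
            // Another option (such as a second "-param") or text without "name=" is not a value.
            if (pair.startsWith("-") || pair.indexOf('=') <= 0) {
                throw new IllegalArgumentException("expected name=value after -param, got: " + pair);
            }
            params.add(pair);
        }
        return params;
    }
}
```

With this check, the arguments of the command from the report ("-param", "-param") are rejected, while "-param", "x=1", "script.pig" yields the single pair x=1.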
[jira] Commented: (PIG-796) support conversion from numeric types to chararray
[ https://issues.apache.org/jira/browse/PIG-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714526#action_12714526 ] Yiping Han commented on PIG-796:
I had the same idea that Alan proposed. I agree that in the common case most values are of the same type. Caching the type, and changing the cached type only when a ClassCastException is caught, would be the most efficient way.
support conversion from numeric types to chararray --- Key: PIG-796 URL: https://issues.apache.org/jira/browse/PIG-796 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich
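One possible reading of the caching idea in the comment above is sketched below in Java. Alan's original proposal and pig-796.patch are not part of this thread, so this is an illustration under assumptions, not the committed fix. NumericToString, Kind, and the method names are hypothetical.

```java
/**
 * Sketch of "cache the type, recover on ClassCastException".
 * All names here are hypothetical; none of them are Pig classes.
 */
final class NumericToString {
    /** The numeric kinds this sketch knows how to render. */
    private enum Kind { INT, LONG, FLOAT, DOUBLE }

    // Type seen most recently; assumed to match the next value in the common case.
    private Kind cached = Kind.INT;

    /** Converts a numeric value to its string form, trying the cached type first. */
    String convert(Object value) {
        if (value == null) {
            return null;
        }
        try {
            return render(cached, value);
        } catch (ClassCastException e) {
            // Rare path: the guess was wrong, so detect the real type and cache it.
            cached = detect(value);
            return render(cached, value);
        }
    }

    /** Casts without checking; throws ClassCastException if the kind does not match. */
    private static String render(Kind kind, Object value) {
        switch (kind) {
            case INT:   return Integer.toString((Integer) value);
            case LONG:  return Long.toString((Long) value);
            case FLOAT: return Float.toString((Float) value);
            default:    return Double.toString((Double) value);
        }
    }

    private static Kind detect(Object value) {
        if (value instanceof Integer) return Kind.INT;
        if (value instanceof Long)    return Kind.LONG;
        if (value instanceof Float)   return Kind.FLOAT;
        if (value instanceof Double)  return Kind.DOUBLE;
        throw new IllegalArgumentException("not a supported numeric type: " + value.getClass());
    }
}
```

When consecutive values share a type, each conversion costs one cast and no instanceof chain. The exception path runs only when the type changes, which is the trade-off the comment argues for.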
[jira] Commented: (PIG-816) PigStorage() does not accept Unicode characters in its constructor
[ https://issues.apache.org/jira/browse/PIG-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714528#action_12714528 ] Olga Natkovich commented on PIG-816:
+1, the fix looks good
PigStorage() does not accept Unicode characters in its constructor -- Key: PIG-816 URL: https://issues.apache.org/jira/browse/PIG-816 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.3.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Priority: Critical Fix For: 0.3.0 Attachments: PIG-816.patch, pig_1243043613713.log
A simple Pig script that uses Unicode characters in the PigStorage() constructor fails with the following error:
{code}
studenttab = LOAD '/user/viraj/studenttab10k' AS (name:chararray, age:int,gpa:float);
X2 = GROUP studenttab by age;
Y2 = FOREACH X2 GENERATE group, COUNT(studenttab);
store Y2 into '/user/viraj/y2' using PigStorage('\u0001');
{code}
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backend error: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference #1 is an invalid XML character.
Attaching log file.
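The SAXParseException in this report is consistent with the XML 1.0 rules: U+0001 is outside the Char production, so it cannot appear in an XML 1.0 document, not even as a character reference. The issue does not say where the delimiter ends up in XML, and PIG-816.patch is not shown here, so the Java sketch below only demonstrates the character check itself. XmlCharCheck and its methods are hypothetical helpers, not part of Pig or Hadoop.

```java
/**
 * Checks code points against the Char production of XML 1.0.
 * Hypothetical helper for illustration; not taken from Pig or Hadoop.
 */
final class XmlCharCheck {
    /** Returns true if the code point may appear in an XML 1.0 document. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first character not allowed in XML 1.0, or -1 if all are allowed. */
    static int firstInvalidIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isXml10Char(cp)) {
                return i;
            }
            i += Character.charCount(cp); // step over surrogate pairs as one code point
        }
        return -1;
    }
}
```

A delimiter such as a tab or comma passes this check, while the U+0001 delimiter from the script does not. Any value written to an XML file would therefore need to be encoded or escaped in some other way before serialization.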
[jira] Created: (PIG-823) Hadoop Metadata Service
Hadoop Metadata Service --- Key: PIG-823 URL: https://issues.apache.org/jira/browse/PIG-823 Project: Pig Issue Type: New Feature Reporter: Olga Natkovich
This JIRA is created to track development of a metadata system for Hadoop. The goal of the system is to allow users and applications to register data stored on HDFS, search for the data available on HDFS, and associate metadata such as schema, statistics, etc. with a particular data unit or a data set stored on HDFS. The initial goal is to provide a fairly generic, low-level abstraction that any user or application on HDFS can use to store and retrieve metadata. Over time, higher-level abstractions closely tied to particular applications or tools can be developed.
Over time, it would make sense for the metadata service to become a subproject within Hadoop. For now, the proposal is to make it a contrib to Pig since Pig SQL is likely to be the first user of the system.
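To make the three operations named in the description (register, search, associate) concrete, here is a toy in-memory Java sketch. The issue proposes no API, so every name below (InMemoryMetadataStore, register, associate, metadata, search) is hypothetical, and a real service would need persistence, access control, and a way to track HDFS changes that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy illustration of a path-keyed metadata store; all names are hypothetical. */
final class InMemoryMetadataStore {
    // Data unit (for example an HDFS path) -> key/value metadata; TreeMap keeps paths sorted.
    private final Map<String, Map<String, String>> byPath = new TreeMap<>();

    /** Registers a data unit with no metadata yet; registering twice is harmless. */
    void register(String path) {
        byPath.computeIfAbsent(path, p -> new HashMap<>());
    }

    /** Associates one metadata entry (such as a schema or a statistic) with a data unit. */
    void associate(String path, String key, String value) {
        byPath.computeIfAbsent(path, p -> new HashMap<>()).put(key, value);
    }

    /** Returns a read-only view of the metadata for a path, empty if the path is unknown. */
    Map<String, String> metadata(String path) {
        Map<String, String> found = byPath.get(path);
        return found == null
                ? Collections.<String, String>emptyMap()
                : Collections.unmodifiableMap(found);
    }

    /** Returns, in sorted order, every registered path whose metadata has key == value. */
    List<String> search(String key, String value) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : byPath.entrySet()) {
            if (value.equals(entry.getValue().get(key))) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}
```

Keeping the values as plain strings keyed by path is one way to read "fairly generic, low-level abstraction". Table or partition concepts would then be layered on top rather than built in.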
[jira] Issue Comment Edited: (PIG-823) Hadoop Metadata Service
[ https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714539#action_12714539 ] Jeff Hammerbacher edited comment on PIG-823 at 5/29/09 11:20 AM:
Hey, Hadoop already has a metadata service (well defined at http://svn.apache.org/viewvc/hadoop/hive/trunk/metastore/if/hive_metastore.thrift) and a SQL implementation in production use at scale at several organizations. Can any of that work be reused for this purpose? It seems like duplicating effort across subprojects is a bad idea. Later, Jeff
was (Author: hammer): Hey, Hadoop already had a metadata service (well defined at http://svn.apache.org/viewvc/hadoop/hive/trunk/metastore/if/hive_metastore.thrift) and a SQL implementation in production use at scale at several organizations. Can any of that work be reused for this purpose? It seems like duplicating effort across subprojects is a bad idea. Later, Jeff
Hadoop Metadata Service --- Key: PIG-823 URL: https://issues.apache.org/jira/browse/PIG-823 Project: Pig Issue Type: New Feature Reporter: Olga Natkovich
This JIRA is created to track development of a metadata system for Hadoop. The goal of the system is to allow users and applications to register data stored on HDFS, search for the data available on HDFS, and associate metadata such as schema, statistics, etc. with a particular data unit or a data set stored on HDFS. The initial goal is to provide a fairly generic, low-level abstraction that any user or application on HDFS can use to store and retrieve metadata. Over time, higher-level abstractions closely tied to particular applications or tools can be developed.
Over time, it would make sense for the metadata service to become a subproject within Hadoop. For now, the proposal is to make it a contrib to Pig since Pig SQL is likely to be the first user of the system.
[jira] Updated: (PIG-802) PERFORMANCE: not creating bags for ORDER BY
[ https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh Setty updated PIG-802: - Attachment: (was: OrderByOptimization.patch) PERFORMANCE: not creating bags for ORDER BY --- Key: PIG-802 URL: https://issues.apache.org/jira/browse/PIG-802 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich Attachments: OrderByOptimization.patch Order by should be changed to not use POPackage to put all of the tuples in a bag on the reduce side, as the bag is just immediately flattened. It can instead work like join does for the last input in the join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-823) Hadoop Metadata Service
[ https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714543#action_12714543 ] Olga Natkovich commented on PIG-823:
We looked at metadata in Hive, and it is really focused on a higher level of abstraction such as tables, partitions, etc. We would like to have something lower level, more generic, and closer to HDFS. We see a wider use for this system than just support for SQL, though SQL for Pig might be the first user.
Hadoop Metadata Service --- Key: PIG-823 URL: https://issues.apache.org/jira/browse/PIG-823 Project: Pig Issue Type: New Feature Reporter: Olga Natkovich
This JIRA is created to track development of a metadata system for Hadoop. The goal of the system is to allow users and applications to register data stored on HDFS, search for the data available on HDFS, and associate metadata such as schema, statistics, etc. with a particular data unit or a data set stored on HDFS. The initial goal is to provide a fairly generic, low-level abstraction that any user or application on HDFS can use to store and retrieve metadata. Over time, higher-level abstractions closely tied to particular applications or tools can be developed.
Over time, it would make sense for the metadata service to become a subproject within Hadoop. For now, the proposal is to make it a contrib to Pig since Pig SQL is likely to be the first user of the system.
[jira] Updated: (PIG-802) PERFORMANCE: not creating bags for ORDER BY
[ https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-802: --- Status: Open (was: Patch Available) PERFORMANCE: not creating bags for ORDER BY --- Key: PIG-802 URL: https://issues.apache.org/jira/browse/PIG-802 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich Attachments: OrderByOptimization.patch Order by should be changed to not use POPackage to put all of the tuples in a bag on the reduce side, as the bag is just immediately flattened. It can instead work like join does for the last input in the join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-802) PERFORMANCE: not creating bags for ORDER BY
[ https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-802: --- Status: Patch Available (was: Open) PERFORMANCE: not creating bags for ORDER BY --- Key: PIG-802 URL: https://issues.apache.org/jira/browse/PIG-802 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich Attachments: OrderByOptimization.patch Order by should be changed to not use POPackage to put all of the tuples in a bag on the reduce side, as the bag is just immediately flattened. It can instead work like join does for the last input in the join. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-823) Hadoop Metadata Service
[ https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714547#action_12714547 ] Jeff Hammerbacher commented on PIG-823:
It's an open source project and easily extensible. There are many extensions to the service within Facebook to support more general information. Why not try to add them to the existing service, since it's already got pluggable backends and a server implementation already defined?
Hadoop Metadata Service --- Key: PIG-823 URL: https://issues.apache.org/jira/browse/PIG-823 Project: Pig Issue Type: New Feature Reporter: Olga Natkovich
This JIRA is created to track development of a metadata system for Hadoop. The goal of the system is to allow users and applications to register data stored on HDFS, search for the data available on HDFS, and associate metadata such as schema, statistics, etc. with a particular data unit or a data set stored on HDFS. The initial goal is to provide a fairly generic, low-level abstraction that any user or application on HDFS can use to store and retrieve metadata. Over time, higher-level abstractions closely tied to particular applications or tools can be developed.
Over time, it would make sense for the metadata service to become a subproject within Hadoop. For now, the proposal is to make it a contrib to Pig since Pig SQL is likely to be the first user of the system.
[jira] Issue Comment Edited: (PIG-823) Hadoop Metadata Service
[ https://issues.apache.org/jira/browse/PIG-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714547#action_12714547 ] Jeff Hammerbacher edited comment on PIG-823 at 5/29/09 11:48 AM:
It's an open source project and easily extensible. There are many extensions to the service within Facebook to support more general information. Why not try to add the desired lower level metadata to the existing service as a patch to Hive, since it's already got pluggable backends and a server implementation already defined? Also, could you better define what close to HDFS means? There's a lot of HDFS metadata stored in the NameNode. Also, the initial implementation of the metadata repository for Hive stored data in HDFS, but it was found to be quite useful to have a separate service for metadata. Perhaps you could learn from their experiences?
was (Author: hammer): It's an open source project and easily extensible. There are many extensions to the service within Facebook to support more general information. Why not try to add them to the existing service, since it's already got pluggable backends and a server implementation already defined?
Hadoop Metadata Service --- Key: PIG-823 URL: https://issues.apache.org/jira/browse/PIG-823 Project: Pig Issue Type: New Feature Reporter: Olga Natkovich
This JIRA is created to track development of a metadata system for Hadoop. The goal of the system is to allow users and applications to register data stored on HDFS, search for the data available on HDFS, and associate metadata such as schema, statistics, etc. with a particular data unit or a data set stored on HDFS. The initial goal is to provide a fairly generic, low-level abstraction that any user or application on HDFS can use to store and retrieve metadata. Over time, higher-level abstractions closely tied to particular applications or tools can be developed.
Over time, it would make sense for the metadata service to become a subproject within Hadoop. For now, the proposal is to make it a contrib to Pig since Pig SQL is likely to be the first user of the system.
Hudson build is back to normal: Pig-Patch-minerva.apache.org #63
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/63/
[jira] Commented: (PIG-816) PigStorage() does not accept Unicode characters in its constructor
[ https://issues.apache.org/jira/browse/PIG-816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714559#action_12714559 ] Hadoop QA commented on PIG-816:
+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12409405/PIG-816.patch against trunk revision 779788.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/63/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/63/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/63/console
This message is automatically generated.
PigStorage() does not accept Unicode characters in its constructor -- Key: PIG-816 URL: https://issues.apache.org/jira/browse/PIG-816 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.3.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Priority: Critical Fix For: 0.3.0 Attachments: PIG-816.patch, pig_1243043613713.log
A simple Pig script that uses Unicode characters in the PigStorage() constructor fails with the following error:
{code}
studenttab = LOAD '/user/viraj/studenttab10k' AS (name:chararray, age:int,gpa:float);
X2 = GROUP studenttab by age;
Y2 = FOREACH X2 GENERATE group, COUNT(studenttab);
store Y2 into '/user/viraj/y2' using PigStorage('\u0001');
{code}
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backend error: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference #1 is an invalid XML character.
Attaching log file.
[jira] Commented: (PIG-824) SQL interface for Pig
[ https://issues.apache.org/jira/browse/PIG-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714571#action_12714571 ] Jeff Hammerbacher commented on PIG-824:
Sigh. Really? Why build another SQL interface to Hadoop when we have two already (CloudBase, Hive)? Extending Pig to share Hive's metadata repository seems to be a much, much shorter path to a solution.
SQL interface for Pig - Key: PIG-824 URL: https://issues.apache.org/jira/browse/PIG-824 Project: Pig Issue Type: New Feature Reporter: Olga Natkovich
In the last 18 months, PigLatin has gained significant popularity within the open source community. Many users like its data flow model, its rich type system and its ability to work with any data available on HDFS or outside. We have also heard from many users that having Pig speak SQL would bring many more users. Having a single system that exports multiple interfaces is a big advantage, as it guarantees consistent semantics, enables custom code reuse, and reduces the amount of maintenance. This matters especially for projects that use both interfaces for different parts of the system. For instance, in a data warehousing system, you would have an ETL component that brings data into the warehouse and a component that analyzes the data and produces reports. PigLatin is uniquely suited for ETL processing while SQL might be a better fit for report generation.
To start, it would make sense to implement a subset of the SQL92 standard and to be as standard compliant as possible. This would include all the standard constructs: select, from, where, group-by + having, order by, limit, join (inner + outer). Several extensions such as support for Pig's UDFs and possibly streaming, multiquery and support for Pig's complex types would be helpful.
This work is dependent on metadata support outlined in https://issues.apache.org/jira/browse/PIG-823
[jira] Created: (PIG-825) PIG_HADOOP_VERSION should be 18
PIG_HADOOP_VERSION should be 18 --- Key: PIG-825 URL: https://issues.apache.org/jira/browse/PIG-825 Project: Pig Issue Type: Bug Components: grunt Reporter: Dmitriy V. Ryaboy PIG_HADOOP_VERSION should be set to 18, not 17, as Hadoop 0.18 is now considered default. Patch coming. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-825) PIG_HADOOP_VERSION should be 18
[ https://issues.apache.org/jira/browse/PIG-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-825: -- Attachment: pig-825.patch Attached trivial patch, please review. PIG_HADOOP_VERSION should be 18 --- Key: PIG-825 URL: https://issues.apache.org/jira/browse/PIG-825 Project: Pig Issue Type: Bug Components: grunt Reporter: Dmitriy V. Ryaboy Attachments: pig-825.patch PIG_HADOOP_VERSION should be set to 18, not 17, as Hadoop 0.18 is now considered default. Patch coming. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Build failed in Hudson: Pig-Patch-minerva.apache.org #64
See http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/64/ -- [...truncated 90881 lines...] [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: PacketResponder 0 for block blk_-1834358976001448559_1011 terminating [exec] [junit] 09/05/29 14:18:34 INFO dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:51804 is added to blk_-1834358976001448559_1011 size 6 [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: Received block blk_-1834358976001448559_1011 of size 6 from /127.0.0.1 [exec] [junit] 09/05/29 14:18:34 INFO dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:39345 is added to blk_-1834358976001448559_1011 size 6 [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: Received block blk_-1834358976001448559_1011 of size 6 from /127.0.0.1 [exec] [junit] 09/05/29 14:18:34 INFO dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:59762 is added to blk_-1834358976001448559_1011 size 6 [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: PacketResponder 1 for block blk_-1834358976001448559_1011 terminating [exec] [junit] 09/05/29 14:18:34 INFO dfs.DataNode: PacketResponder 2 for block blk_-1834358976001448559_1011 terminating [exec] [junit] 09/05/29 14:18:34 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://localhost:51173 [exec] [junit] 09/05/29 14:18:34 INFO executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: localhost:48177 [exec] [junit] 09/05/29 14:18:34 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1 [exec] [junit] 09/05/29 14:18:34 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1 [exec] [junit] 09/05/29 14:18:35 WARN dfs.DataNode: Unexpected error trying to delete block blk_-5391508296031911272_1004. BlockInfo not found in volumeMap. 
[exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Deleting block blk_4184519850683123566_1005 file dfs/data/data7/current/blk_4184519850683123566 [exec] [junit] 09/05/29 14:18:35 WARN dfs.DataNode: java.io.IOException: Error in deleting blocks. [exec] [junit] at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:1146) [exec] [junit] at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:793) [exec] [junit] at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:663) [exec] [junit] at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2888) [exec] [junit] at java.lang.Thread.run(Thread.java:619) [exec] [junit] [exec] [junit] 09/05/29 14:18:35 INFO mapReduceLayer.JobControlCompiler: Setting up single store job [exec] [junit] 09/05/29 14:18:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. [exec] [junit] 09/05/29 14:18:35 INFO dfs.StateChange: BLOCK* NameSystem.allocateBlock: /tmp/hadoop-hudson/mapred/system/job_200905291417_0002/job.jar. 
blk_9150403780694500298_1012 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Receiving block blk_9150403780694500298_1012 src: /127.0.0.1:43863 dest: /127.0.0.1:48879 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Receiving block blk_9150403780694500298_1012 src: /127.0.0.1:33948 dest: /127.0.0.1:51804 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Receiving block blk_9150403780694500298_1012 src: /127.0.0.1:57145 dest: /127.0.0.1:59762 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Received block blk_9150403780694500298_1012 of size 1411199 from /127.0.0.1 [exec] [junit] 09/05/29 14:18:35 INFO dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:59762 is added to blk_9150403780694500298_1012 size 1411199 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Received block blk_9150403780694500298_1012 of size 1411199 from /127.0.0.1 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: PacketResponder 0 for block blk_9150403780694500298_1012 terminating [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: PacketResponder 1 for block blk_9150403780694500298_1012 terminating [exec] [junit] 09/05/29 14:18:35 INFO dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:51804 is added to blk_9150403780694500298_1012 size 1411199 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: Received block blk_9150403780694500298_1012 of size 1411199 from /127.0.0.1 [exec] [junit] 09/05/29 14:18:35 INFO dfs.DataNode: PacketResponder 2 for block blk_9150403780694500298_1012 terminating [exec] [junit] 09/05/29 14:18:35 INFO dfs.StateChange: BLOCK*
[jira] Commented: (PIG-802) PERFORMANCE: not creating bags for ORDER BY
[ https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714594#action_12714594 ] Hadoop QA commented on PIG-802:
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12409408/OrderByOptimization.patch against trunk revision 779788.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/64/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/64/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/64/console
This message is automatically generated.
PERFORMANCE: not creating bags for ORDER BY --- Key: PIG-802 URL: https://issues.apache.org/jira/browse/PIG-802 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich Attachments: OrderByOptimization.patch
Order by should be changed to not use POPackage to put all of the tuples in a bag on the reduce side, as the bag is just immediately flattened. It can instead work like join does for the last input in the join.
[jira] Updated: (PIG-816) PigStorage() does not accept Unicode characters in its constructor
[ https://issues.apache.org/jira/browse/PIG-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pradeep Kamath updated PIG-816: --- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available)
Patch committed.
PigStorage() does not accept Unicode characters in its constructor -- Key: PIG-816 URL: https://issues.apache.org/jira/browse/PIG-816 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.3.0 Reporter: Viraj Bhat Assignee: Pradeep Kamath Priority: Critical Fix For: 0.3.0 Attachments: PIG-816.patch, pig_1243043613713.log
A simple Pig script that uses Unicode characters in the PigStorage() constructor fails with the following error:
{code}
studenttab = LOAD '/user/viraj/studenttab10k' AS (name:chararray, age:int,gpa:float);
X2 = GROUP studenttab by age;
Y2 = FOREACH X2 GENERATE group, COUNT(studenttab);
store Y2 into '/user/viraj/y2' using PigStorage('\u0001');
{code}
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backend error: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference #1 is an invalid XML character.
Attaching log file.
[jira] Updated: (PIG-822) Flatten semantics are unknown
[ https://issues.apache.org/jira/browse/PIG-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olga Natkovich updated PIG-822: --- Description: There is no formal specification of the flatten keyword in http://hadoop.apache.org/pig/docs/r0.2.0/piglatin.html There are only some examples. I have found flatten to be very fragile and unpredictable with the data types it reads and creates. Please document: Flatten to be explained formally in its own dedicated section: What are the valid input types, the output types it creates, what transformation it does from input to output and how the resulting data are named. was: There is no formal specification of the flatten keyword in http://hadoop.apache.org/pig/docs/r0.2.0/piglatin.html There are only some examples. I have found flatten to be very fragile and unpredictable with the data types it reads and creates. I have wasted too many hours (and Viraj too) trying to figure out its peculiarities, the latest of which is here: http://bug.corp.yahoo.com/show_bug.cgi?id=2768016 comment #15 Please document: Flatten to be explained formally in its own dedicated section: What are the valid input types, the output types it creates, what transformation it does from input to output and how the resulting data are named. Flatten semantics are unknown - Key: PIG-822 URL: https://issues.apache.org/jira/browse/PIG-822 Project: Pig Issue Type: Bug Components: documentation Reporter: George Mavromatis Priority: Critical There is no formal specification of the flatten keyword in http://hadoop.apache.org/pig/docs/r0.2.0/piglatin.html There are only some examples. I have found flatten to be very fragile and unpredictable with the data types it reads and creates. Please document: Flatten to be explained formally in its own dedicated section: What are the valid input types, the output types it creates, what transformation it does from input to output and how the resulting data are named. 
[jira] Commented: (PIG-825) PIG_HADOOP_VERSION should be 18
[ https://issues.apache.org/jira/browse/PIG-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714621#action_12714621 ] Alan Gates commented on PIG-825:
I'll take a look at this patch.
PIG_HADOOP_VERSION should be 18 --- Key: PIG-825 URL: https://issues.apache.org/jira/browse/PIG-825 Project: Pig Issue Type: Bug Components: grunt Reporter: Dmitriy V. Ryaboy Attachments: pig-825.patch
PIG_HADOOP_VERSION should be set to 18, not 17, as Hadoop 0.18 is now considered default. Patch coming.
[jira] Updated: (PIG-825) PIG_HADOOP_VERSION should be 18
[ https://issues.apache.org/jira/browse/PIG-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-825: -- Attachment: pig-825.patch Minor update to minor patch --fixed a typo in the bug number in CHANGES.txt PIG_HADOOP_VERSION should be 18 --- Key: PIG-825 URL: https://issues.apache.org/jira/browse/PIG-825 Project: Pig Issue Type: Bug Components: grunt Reporter: Dmitriy V. Ryaboy Attachments: pig-825.patch, pig-825.patch PIG_HADOOP_VERSION should be set to 18, not 17, as Hadoop 0.18 is now considered default. Patch coming. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-796) support conversion from numeric types to chararray
[ https://issues.apache.org/jira/browse/PIG-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-796: - Attachment: pig-796.patch This patch implements the fix as suggested by Alan. support conversion from numeric types to chararray --- Key: PIG-796 URL: https://issues.apache.org/jira/browse/PIG-796 Project: Pig Issue Type: Improvement Affects Versions: 0.2.0 Reporter: Olga Natkovich Attachments: pig-796.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.