[jira] Commented: (PIG-788) Proposal to remove float from Pig data types
[ https://issues.apache.org/jira/browse/PIG-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708341#action_12708341 ]
Mridul Muralidharan commented on PIG-788:
We do use floats quite a bit in our projects, so the assertion that "we do not see anyone using the float type" is not correct. Even the webdata (and webmap too, iirc) uses float for some of its fields. Agree with the rest of Santhosh's comments above (11/May/09 05:52 PM) too.
Proposal to remove float from Pig data types
Key: PIG-788
URL: https://issues.apache.org/jira/browse/PIG-788
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.2.0
Reporter: Alan Gates
Assignee: Alan Gates
Pig would like to use the new Hadoop Avro serialization package to pass data between MR jobs, and eventually between Pig and UDFs that are not written in Java. Avro will not be supporting the float data type, but only double (see AVRO-17). Pig currently supports both float and double. Double is the default floating point type (so if the user says x + 1.0, 1.0 is taken to be a double, not a float). Float was initially included in the list of Pig types because Hadoop supported it as one of the Writable types, and we were trying to make sure all of Hadoop's writable types could be represented in Pig. In practice we do not see anyone using the float type. In order to be able to easily use Avro I propose dropping the float type. Please speak up if you are using the float type and you have a compelling reason not to use double.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
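What folding float into double would mean for existing values can be illustrated in plain Java, whose literal rules match the default described in the issue (1.0 is a double, 1.0f is a float). This is only a sketch and not Pig code; the class and method names are made up for illustration.

```java
// Illustrative only: plain Java, not Pig code. All names here are hypothetical.
final class FloatWidening {
    /** Widening a float to a double and narrowing it back yields the same float. */
    static boolean survivesRoundTrip(float f) {
        return Float.compare((float) (double) f, f) == 0;
    }

    /** True only if the double value is exactly representable as a float. */
    static boolean fitsInFloat(double d) {
        return Double.compare((double) (float) d, d) == 0;
    }

    public static void main(String[] args) {
        System.out.println(survivesRoundTrip(0.1f)); // float -> double -> float
        System.out.println(fitsInFloat(0.1));        // the double literal 0.1
        System.out.println(fitsInFloat(0.5));        // a value exact in both types
    }
}
```

In Java terms, every float value can be carried in a double without loss, so the cost of dropping float would be in width (8 bytes instead of 4 per value) and in how widened values print, not in lost precision. Whether that holds for a given Pig workload is for the users of the type to say.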
[jira] Commented: (PIG-806) to remove author tags in the pig source code
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708422#action_12708422 ]
Giridharan Kesavan commented on PIG-806:
This issue blocks: https://issues.apache.org/jira/browse/PIG-765
to remove author tags in the pig source code
Key: PIG-806
URL: https://issues.apache.org/jira/browse/PIG-806
Project: Pig
Issue Type: Bug
Reporter: Giridharan Kesavan
The following Java source files have author tags in them, which need to be cleaned:
src/org/apache/pig/Algebraic.java
src/org/apache/pig/backend/local/executionengine/physicalLayer/relationalOperators/POCross.java
src/org/apache/pig/backend/local/executionengine/physicalLayer/relationalOperators/POCogroup.java
src/org/apache/pig/impl/io/FileSpec.java
src/org/apache/pig/impl/streaming/StreamingCommand.java
src/org/apache/pig/StoreFunc.java
src/org/apache/pig/tools/cmdline/CmdLineParser.java
src/org/apache/pig/tools/timer/PerformanceTimer.java
src/org/apache/pig/tools/timer/PerformanceTimerFactory.java
Thanks,
Hudson build is back to normal: Pig-trunk #432
See http://hudson.zones.apache.org/hudson/job/Pig-trunk/432/changes
[jira] Updated: (PIG-765) to implement jdiff
[ https://issues.apache.org/jira/browse/PIG-765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Giridharan Kesavan updated PIG-765:
Attachment: pig-765.patch
ported patch to resolve the jdiff dependencies using ivy. tnx!
to implement jdiff
Key: PIG-765
URL: https://issues.apache.org/jira/browse/PIG-765
Project: Pig
Issue Type: Improvement
Components: build
Reporter: Giridharan Kesavan
Assignee: Giridharan Kesavan
Attachments: pig-765.patch, pig-765.patch
[jira] Commented: (PIG-806) to remove author tags in the pig source code
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708462#action_12708462 ] Olga Natkovich commented on PIG-806: what does author tags mean? Are you talking about control characters?
[jira] Commented: (PIG-806) to remove author tags in the pig source code
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708464#action_12708464 ] Thejas M Nair commented on PIG-806: Example of author tag in a java file - /* * @author xyz */
[jira] Issue Comment Edited: (PIG-806) to remove author tags in the pig source code
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708464#action_12708464 ] Thejas M Nair edited comment on PIG-806 at 5/12/09 8:28 AM: Example of author tag in a java file - {code} /* * @author xyz */ {code} was (Author: thejas): Example of author tag in a java file - /* * @author xyz */
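For anyone who wants to check a file for the tag shown in the example above, a scan along the following lines would do. This is a sketch only: AuthorTagScanner is a hypothetical helper, not part of Pig or of the Hadoop QA scripts, and it simply flags any line containing @author rather than parsing comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig or of the automated patch checks.
final class AuthorTagScanner {
    // Matches "@author" as a whole tag, e.g. " * @author xyz", but not "@authority".
    private static final Pattern AUTHOR_TAG = Pattern.compile("@author\\b");

    /** Returns the 1-based numbers of the lines that contain an @author tag. */
    static List<Integer> findAuthorTagLines(String source) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = source.split("\\R"); // split on any line terminator
        for (int i = 0; i < lines.length; i++) {
            if (AUTHOR_TAG.matcher(lines[i]).find()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = "/*\n * @author xyz\n */\npublic interface Example {}\n";
        System.out.println(findAuthorTagLines(sample));
    }
}
```

Reading each of the listed source files into a string and passing it to findAuthorTagLines would give the lines to clean up.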
[jira] Commented: (PIG-806) to remove author tags in the pig source code
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708469#action_12708469 ] Olga Natkovich commented on PIG-806: Thanks, Tejas. What is wrong with author tag? The error on PIG-765 says that Pig community agreed to disallow that but I don't remember that.
[jira] Commented: (PIG-799) Unit tests on windows are failing after multiquery commit
[ https://issues.apache.org/jira/browse/PIG-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708471#action_12708471 ]
Olga Natkovich commented on PIG-799:
Daniel, thanks for the patch! Looks like the automated patch testing is not working. If the tests pass in both windows and unix, please, commit the patch.
Unit tests on windows are failing after multiquery commit
Key: PIG-799
URL: https://issues.apache.org/jira/browse/PIG-799
Project: Pig
Issue Type: Bug
Reporter: Olga Natkovich
Assignee: Daniel Dai
Attachments: PIG-799.patch
Daniel could you take a look. It should be reproducible with the latest trunk. Thanks
Re: [jira] Commented: (PIG-806) to remove author tags in the pig source code
Hi @author tags are not allowed in Pig (or any apache project I suppose). Refer How to Contribute page - http://wiki.apache.org/pig/HowToContribute
--nitesh
-- Nitesh Bhatia Dhirubhai Ambani Institute of Information Communication Technology Gandhinagar Gujarat
[jira] Commented: (PIG-806) to remove author tags in the pig source code
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708508#action_12708508 ] Alan Gates commented on PIG-806: http://wiki.apache.org/pig/HowToContribute see section on Making Changes.
[jira] Commented: (PIG-788) Proposal to remove float from Pig data types
[ https://issues.apache.org/jira/browse/PIG-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708512#action_12708512 ] Alan Gates commented on PIG-788: Reading the latest comments on AVRO-17 it looks like they are leaning towards keeping float, so this may be becoming a non-issue.
[jira] Updated: (PIG-806) to remove author tags in the pig source code
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan updated PIG-806: Affects Version/s: 0.3.0 Fix Version/s: 0.3.0 Assignee: Santhosh Srinivasan
[jira] Resolved: (PIG-806) to remove author tags in the pig source code
[ https://issues.apache.org/jira/browse/PIG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Santhosh Srinivasan resolved PIG-806. Resolution: Fixed Committed the changes. Except for StreamingCommand.java all the other files noted in the bug report were modified to remove the @author tag.
[jira] Created: (PIG-807) PERFORMANCE: Provide a way for UDFs to use read-once bags (backed by the Hadoop values iterator)
PERFORMANCE: Provide a way for UDFs to use read-once bags (backed by the Hadoop values iterator)
Key: PIG-807
URL: https://issues.apache.org/jira/browse/PIG-807
Project: Pig
Issue Type: Improvement
Affects Versions: 0.2.1
Reporter: Pradeep Kamath
Fix For: 0.3.0
Currently all bags resulting from a group or cogroup are materialized as bags containing all of the contents. The issue with this is that if a particular key has many corresponding values, all these values get stuffed into a bag, which may run out of memory and hence spill, causing a slowdown in performance and sometimes memory exceptions. In many cases, the UDFs which use these bags coming out of a group or cogroup only need to iterate over the bag in a unidirectional, read-once manner. This can be implemented by having the bag implement its iterator by simply iterating over the underlying Hadoop iterator provided in the reduce. This kind of bag is also needed in http://issues.apache.org/jira/browse/PIG-802, so the code can be reused for this issue too.
The other part of this issue is to have some way for the UDFs to communicate to Pig that any input bags they need are read-once bags. This can be achieved by having an interface, say UsesReadOnceBags, which serves as a tag to indicate the intent to Pig. Pig can then rewire its execution plan to use ReadOnceBags if feasible.
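The two parts of the proposal, a bag that only wraps the reduce-side iterator and a marker interface the UDF uses to opt in, could look roughly like the following. This is a sketch under assumptions: UsesReadOnceBags is the name suggested in the issue, while ReadOnceBag as written here, SumUdf and ReadOnceBagDemo are hypothetical stand-ins and not Pig's DataBag or EvalFunc types.

```java
import java.util.Arrays;
import java.util.Iterator;

// Sketch only: hypothetical stand-ins, not Pig's DataBag or EvalFunc.

/** Marker interface from the proposal: the UDF declares it reads its input bags only once. */
interface UsesReadOnceBags {}

/** A bag that hands out the underlying iterator exactly once and never buffers values. */
final class ReadOnceBag<T> implements Iterable<T> {
    private final Iterator<T> source;
    private boolean handedOut = false;

    ReadOnceBag(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public Iterator<T> iterator() {
        if (handedOut) {
            throw new IllegalStateException("ReadOnceBag can only be iterated once");
        }
        handedOut = true;
        return source;
    }
}

/** Example UDF that needs a single forward pass, so it opts in via the marker. */
final class SumUdf implements UsesReadOnceBags {
    long exec(Iterable<Long> bag) {
        long sum = 0;
        for (long v : bag) {
            sum += v;
        }
        return sum;
    }
}

final class ReadOnceBagDemo {
    /** Planner-side check: the opt-in is detected with a plain instanceof test. */
    static boolean mayUseReadOnceBag(Object udf) {
        return udf instanceof UsesReadOnceBags;
    }

    public static void main(String[] args) {
        SumUdf udf = new SumUdf();
        // Stands in for the values iterator Hadoop passes to reduce().
        Iterator<Long> reduceValues = Arrays.asList(1L, 2L, 3L).iterator();
        if (mayUseReadOnceBag(udf)) {
            System.out.println(udf.exec(new ReadOnceBag<>(reduceValues)));
        }
    }
}
```

Failing loudly on a second call to iterator() is one possible choice; it makes a UDF that wrongly claims read-once access fail fast instead of silently seeing an empty bag.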
[jira] Commented: (PIG-802) PERFORMANCE: not creating bags for ORDER BY
[ https://issues.apache.org/jira/browse/PIG-802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708551#action_12708551 ]
Pradeep Kamath commented on PIG-802:
Adding some more details: A new kind of bag, ReadOnceBag, needs to be implemented. This bag will have a reference to the key currently being processed and the iterator over values provided by Hadoop in reduce(). The ReadOnceBag's iterator will simply advance the Hadoop iterator at each call and construct a tuple from the key and value (see POPackage.java for details on how this is done). POPackage should also be changed, or a new class introduced, to create ReadOnceBags instead of regular bags. Creating the bag should only initialize it with the key and iterator.
PERFORMANCE: not creating bags for ORDER BY
Key: PIG-802
URL: https://issues.apache.org/jira/browse/PIG-802
Project: Pig
Issue Type: Improvement
Reporter: Olga Natkovich
Order by should be changed to not use POPackage to put all of the tuples in a bag on the reduce side, as the bag is just immediately flattened. It can instead work like join does for the last input in the join.
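The iterator described in the comment above, one that holds the current key and builds a tuple per call from the next value, might be sketched as follows. The assumptions are significant: List<Object> stands in for a Pig Tuple, KeyedTupleIterator is a hypothetical name, and the real key/value recombination in POPackage.java is more involved than pairing the two.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: List<Object> stands in for a Pig Tuple; the class name is hypothetical.
final class KeyedTupleIterator<K, V> implements Iterator<List<Object>> {
    private final K key;              // the key currently being processed in reduce()
    private final Iterator<V> values; // stands in for the values iterator supplied by Hadoop

    KeyedTupleIterator(K key, Iterator<V> values) {
        this.key = key;
        this.values = values;
    }

    @Override
    public boolean hasNext() {
        return values.hasNext();
    }

    @Override
    public List<Object> next() {
        V value = values.next(); // pulls exactly one value from the underlying iterator
        List<Object> tuple = new ArrayList<>(2);
        tuple.add(key);
        tuple.add(value);
        return tuple;
    }

    public static void main(String[] args) {
        Iterator<List<Object>> it =
                new KeyedTupleIterator<>("k1", Arrays.asList(10, 20).iterator());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Construction only stores the key and the iterator, which matches the point that creating the bag should do no work beyond initialization; tuples exist one at a time, as the consumer asks for them.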
[jira] Commented: (PIG-794) Use Avro serialization in Pig
[ https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708701#action_12708701 ]
Olga Natkovich commented on PIG-794:
I integrated the latest patch and ran the unit tests. All the AVRO unit tests failed with the following stack trace:
Could not initialize class org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AvroTupleSchema
java.lang.NoClassDefFoundError: Could not initialize class org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AvroTupleSchema
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.TupleAvroWriter.writeDatum(AvroStorage.java:359)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.TupleAvroWriter.writeTuple(AvroStorage.java:408)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.TupleAvroWriter.write(AvroStorage.java:353)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.AvroStorage.putNext(AvroStorage.java:571)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:121)
    at org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:129)
    at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:102)
    at org.apache.pig.test.TestAvroStorage.store(TestAvroStorage.java:117)
    at org.apache.pig.test.TestAvroStorage.testLoadStoreComplexDataWithNull(TestAvroStorage.java:178)
Use Avro serialization in Pig
Key: PIG-794
URL: https://issues.apache.org/jira/browse/PIG-794
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.2.0
Reporter: Rakesh Setty
Fix For: 0.2.0
Attachments: avro-0.1-dev-java.jar, AvroStorage.patch, jackson-asl-0.9.4.jar
We would like to use Avro serialization in Pig to pass data between MR jobs instead of the current BinStorage. Attached is an implementation of AvroBinStorage which performs significantly better compared to BinStorage on our benchmarks.
[jira] Updated: (PIG-799) Unit tests on windows are failing after multiquery commit
[ https://issues.apache.org/jira/browse/PIG-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-799: Resolution: Fixed Fix Version/s: 0.3.0 Status: Resolved (was: Patch Available)
[jira] Commented: (PIG-626) Statistics (records read by each mapper and reducer)
[ https://issues.apache.org/jira/browse/PIG-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708737#action_12708737 ]
Hadoop QA commented on PIG-626:
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407672/PIG-626.patch against trunk revision 774167.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 12 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/34/console
This message is automatically generated.
Statistics (records read by each mapper and reducer)
Key: PIG-626
URL: https://issues.apache.org/jira/browse/PIG-626
Project: Pig
Issue Type: New Feature
Components: impl
Affects Versions: 0.2.0
Reporter: Shubham Chopra
Assignee: Shubham Chopra
Priority: Minor
Fix For: 0.3.0
Attachments: PIG-626.patch, pigStats.patch, pigStats.patch, pigStats.patch, pigStats.patch, pigStats.patch, TEST-org.apache.pig.test.TestBZip.txt
This uses the counters framework that hadoop has. Initially, I am just interested in finding out the number of records read by each mapper/reducer, particularly for the last job in any script. Sample code to access the statistics for the last job:
    String reducePlan = stats.getPigStats().get(stats.getLastJobID()).get(PIG_STATS_REDUCE_PLAN);
    if (reducePlan == null) {
        System.out.println("Records written : " + stats.getPigStats().get(stats.getLastJobID()).get(PIG_STATS_MAP_OUTPUT_RECORDS));
    } else {
        System.out.println("Records written : " + stats.getPigStats().get(stats.getLastJobID()).get(PIG_STATS_REDUCE_OUTPUT_RECORDS));
    }
The patch contains 7 test cases. These include tests for PigStorage and BinStorage, along with one for the multiple MR jobs case.
[jira] Commented: (PIG-781) Error reporting for failed MR jobs
[ https://issues.apache.org/jira/browse/PIG-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708752#action_12708752 ]
Hadoop QA commented on PIG-781:
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12407727/partial_failure.patch against trunk revision 774167.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 14 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/37/console
This message is automatically generated.
Error reporting for failed MR jobs
Key: PIG-781
URL: https://issues.apache.org/jira/browse/PIG-781
Project: Pig
Issue Type: Improvement
Reporter: Gunther Hagleitner
Attachments: partial_failure.patch, partial_failure.patch
If we have multiple MR jobs to run and some of them fail, the behavior of the system is to not stop on the first failure but to keep going. That way jobs that do not depend on the failed job might still succeed. The question is how best to report this scenario to a user. How do we tell which jobs failed and which didn't? One way could be to tie jobs to stores and report which store locations won't have data and which ones do.
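The suggestion of tying jobs to stores could be sketched as a small mapping step from job outcomes to store locations. This is only an illustration of the idea and not what partial_failure.patch does: StoreOutcomeReport, the job ids and the paths are hypothetical, and the set of failed jobs is assumed to already include jobs that were skipped because they depend on a failed one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the "tie jobs to stores" idea; not taken from the patch.
final class StoreOutcomeReport {
    /**
     * Maps each store location to true if the job writing it succeeded.
     * failedJobs is assumed to include jobs skipped because an upstream job failed.
     */
    static Map<String, Boolean> byStore(Map<String, List<String>> storesByJob,
                                        Set<String> failedJobs) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : storesByJob.entrySet()) {
            boolean succeeded = !failedJobs.contains(entry.getKey());
            for (String store : entry.getValue()) {
                result.put(store, succeeded);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> storesByJob = new LinkedHashMap<>();
        storesByJob.put("job_1", Arrays.asList("/out/a"));
        storesByJob.put("job_2", Arrays.asList("/out/b", "/out/c"));
        Map<String, Boolean> report = byStore(storesByJob, Collections.singleton("job_2"));
        for (Map.Entry<String, Boolean> e : report.entrySet()) {
            System.out.println(e.getKey() + (e.getValue() ? " has data" : " will not have data"));
        }
    }
}
```

Reporting by store location rather than by job id keeps the message in terms the script author wrote, since store paths appear in the Pig script while MR job ids do not.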
[jira] Updated: (PIG-794) Use Avro serialization in Pig
[ https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giridharan Kesavan updated PIG-794: Attachment: PIG-794.patch this patch resolves jackson-asl.jar from the mvn repo through ivy and avro from the local lib dir. While submitting this patch to svn we have to add avro jar to the lib dir tnx!