[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066924#comment-16066924 ]

Kuhu Shukla commented on TEZ-3605:

Committing this to master.

> Detect and prune empty partitions for the Ordered case
>
> Key: TEZ-3605
> URL: https://issues.apache.org/jira/browse/TEZ-3605
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: TEZ-3605.001.patch, TEZ-3605.002.patch, TEZ-3605.003.patch, TEZ-3605.004.patch, TEZ-3605.005.patch, TEZ-3605.006.patch, TEZ-3605.007.patch, TEZ-3605.008.patch, TEZ-3605.009.patch, TEZ-3605.010.patch, TEZ-3605.011.patch, TEZ-3605.012.patch, TEZ-3605.013.patch
>
> Analogous to the Unordered case we should not have empty partition entries/segments in the Ordered/DefaultSorter case. This will save writing unnecessary data.
> Additionally, with tez_shuffle feature (TEZ-3334), in a heavily auto reduced job, this change would allow not fetching empty partitions and then throwing them away.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065525#comment-16065525 ]

Kuhu Shukla commented on TEZ-3605:

[~sseth], the latest patch ran a clean pre-commit. Requesting one last review if needed; otherwise I will commit this tomorrow if there are no objections from the community by then. Thanks!
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065516#comment-16065516 ]

TezQA commented on TEZ-3605:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12874729/TEZ-3605.013.patch
against master revision de72fbe.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2555//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2555//console

This message is automatically generated.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065275#comment-16065275 ]

Siddharth Seth commented on TEZ-3605:

Thanks for the updated patch. It fixes large records as well. +1, with one minor fix before committing: in PipelinedSorter, the if (combiner != null) branch will run into an NPE. Adding a simple hasNext check there as well fixes this.
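The review comment above does not show the code involved, so the following is only a sketch of what a hasNext guard ahead of a combiner branch might look like. It assumes the NPE comes from running the combiner path for a partition that has no records. HasNextGuardSketch, RecordingWriter and spillPartition are hypothetical names invented for illustration; none of them are Tez classes or methods.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiConsumer;

/** Illustrative only: these types are stand-ins, not the Tez PipelinedSorter API. */
final class HasNextGuardSketch {

    /** Hypothetical stand-in for a per-partition writer. */
    static final class RecordingWriter {
        final List<String> written = new ArrayList<>();

        void append(String record) {
            written.add(record);
        }
    }

    /**
     * Writes one partition of a spill. Returns null when the partition has no
     * records, so the caller can mark it as empty instead of writing a segment.
     */
    static RecordingWriter spillPartition(
            Iterator<String> records,
            BiConsumer<Iterator<String>, RecordingWriter> combiner) {
        // The hasNext check comes first, so neither the combiner path nor the
        // plain path ever runs for an empty partition.
        if (!records.hasNext()) {
            return null;
        }
        RecordingWriter writer = new RecordingWriter();
        if (combiner != null) {
            combiner.accept(records, writer);
        } else {
            while (records.hasNext()) {
                writer.append(records.next());
            }
        }
        return writer;
    }
}
```

Returning null here is just one way to signal "nothing written"; the actual patch may record the empty partition differently.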
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064752#comment-16064752 ]

Kuhu Shukla commented on TEZ-3605:

Request for comments/review on the latest patch [~sseth], [~jeagles]. Thanks a lot!
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064225#comment-16064225 ]

TezQA commented on TEZ-3605:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12874554/TEZ-3605.012.patch
against master revision 5b0f5a0.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2546//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2546//console

This message is automatically generated.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063606#comment-16063606 ]

Kuhu Shukla commented on TEZ-3605:

bq. and invokes a merger on an empty list (not sure how this is handled)

An empty list is handled fine:
{quote}
if (segments.size() == 0) {
  LOG.info("Nothing to merge. Returning an empty iterator");
  return new EmptyIterator();
}
{quote}
The trouble arises when a segment in the list has size zero, because the merger then opens a stream with no bytes to read.
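The distinction drawn in the comment above (an empty list of segments is safe, a zero-length segment inside the list is not) can be sketched as follows. This is a simplified illustration under stated assumptions, not the Tez merger: Segment, dropZeroLength and merge are hypothetical names, Collections.emptyIterator() stands in for the EmptyIterator quoted above, and the "merge" is a plain collect-and-sort rather than a real k-way merge over streams.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Illustrative only: a toy merger, not the Tez merger. */
final class MergeInputSketch {

    /** Hypothetical stand-in for one partition's segment in a spill file. */
    static final class Segment {
        final long length;
        final List<String> sortedRecords;

        Segment(long length, List<String> sortedRecords) {
            this.length = length;
            this.sortedRecords = sortedRecords;
        }
    }

    /** Keeps only segments that actually contain bytes to read. */
    static List<Segment> dropZeroLength(List<Segment> segments) {
        List<Segment> nonEmpty = new ArrayList<>();
        for (Segment s : segments) {
            if (s.length > 0) {
                nonEmpty.add(s);
            }
        }
        return nonEmpty;
    }

    /** Filters first, then falls back to an empty iterator if nothing is left. */
    static Iterator<String> merge(List<Segment> segments) {
        List<Segment> nonEmpty = dropZeroLength(segments);
        if (nonEmpty.isEmpty()) {
            return Collections.emptyIterator();
        }
        // Stand-in for a real k-way merge: collect everything and sort.
        List<String> merged = new ArrayList<>();
        for (Segment s : nonEmpty) {
            merged.addAll(s.sortedRecords);
        }
        Collections.sort(merged);
        return merged.iterator();
    }
}
```

Filtering before the size check means a list made up only of zero-length segments takes the same safe path as a list with no segments at all.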
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045101#comment-16045101 ]

Siddharth Seth commented on TEZ-3605:

In PipelinedSorter, the final merge does not necessarily skip a fully empty partition. The check while creating DiskSegments can end up with an empty list and then invoke a merger on it (not sure how this is handled).

Similarly in DefaultSorter, I think mergeParts needs some work.

It would be useful to have tests for both, i.e. when there are multiple spills involved: 1) where one spill has a partition and another does not, 2) where no spill has the partition.
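The two multi-spill scenarios requested above can be expressed as a small check on per-spill "has data" flags. The sketch below is an assumption about how such a test might reason, not the tests that were added in the patch: hasData[s][p] is a hypothetical matrix saying whether spill s wrote any records for partition p, and emptyPartitions is an invented helper. A partition counts as empty in the final output only if no spill has data for it.

```java
import java.util.BitSet;

/** Illustrative only: models per-spill partition flags, not Tez spill records. */
final class EmptyPartitionAcrossSpillsSketch {

    /**
     * Returns the set of partitions that are empty in every spill.
     * hasData[s][p] is true when spill s contains records for partition p.
     */
    static BitSet emptyPartitions(boolean[][] hasData, int numPartitions) {
        BitSet empty = new BitSet(numPartitions);
        empty.set(0, numPartitions); // start by assuming every partition is empty
        for (boolean[] spill : hasData) {
            for (int p = 0; p < numPartitions; p++) {
                if (spill[p]) {
                    empty.clear(p); // one spill with data is enough
                }
            }
        }
        return empty;
    }
}
```

With two spills and three partitions, flags of {true, false, false} and {false, false, false} cover scenario 1 for partition 0 (present in one spill only, so it must still be merged) and scenario 2 for partitions 1 and 2 (absent everywhere, so they can be pruned).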
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031088#comment-16031088 ]

Kuhu Shukla commented on TEZ-3605:

The TestExceptionPropagation failure seems unrelated and is not reproducible locally. I will continue to investigate this. The other test failures are known and already have JIRAs associated with them. [~jeagles], looking for some comments on the latest patch. Thanks a lot!
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022893#comment-16022893 ]

Kuhu Shukla commented on TEZ-3605:

The test failure is irreproducible locally and is due to an unexpected state transition. I ran the same test in a loop and did not see it failing even once.
{code}
2017-05-22 20:58:09,104 ERROR [Dispatcher thread {Central}] impl.TaskAttemptImpl (TaskAttemptImpl.java:handle(861)) - Can't handle this event at current state for attempt_1495486688894_0001_1_00_03_1
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_SUBMITTED at KILL_IN_PROGRESS
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
	at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:859)
	at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:124)
	at org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:2299)
	at org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:2284)
	at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:180)
	at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
	at java.lang.Thread.run(Thread.java:745)
{code}
Would appreciate any comments/review on the latest patch, [~sseth] / [~jeagles] / [~rajesh.balamohan]. Thanks a lot!
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016327#comment-16016327 ]

Kuhu Shukla commented on TEZ-3605:

Looking at the related test failures. Will update shortly.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016297#comment-16016297 ]

TezQA commented on TEZ-3605:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12868801/TEZ-3605.008.patch
against master revision e3ee7a6.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in :
org.apache.tez.runtime.library.common.sort.impl.TestPipelinedSorter

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2464//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2464//console

This message is automatically generated.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016051#comment-16016051 ]

TezQA commented on TEZ-3605:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12868788/TEZ-3605.007.patch
against master revision e3ee7a6.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:red}-1 javac{color}. The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2463//console

This message is automatically generated.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948010#comment-15948010 ]

Siddharth Seth commented on TEZ-3605:

Took a while to get to this, and to recollect what is done in the UnorderedWriter / DefaultWriter.

If I'm not mistaken, the patch is trying to avoid writing out the default 4 bytes(?) that are generated by an IFile.Writer when the partition does not have data? (TEZ-941)

The changes to track numRecordsPerPartition are required for this. The Sorters already know how to generate the empty partition bitset by making use of TezSpillRecord and TezIndexRecord.hasData.

The current changes to track numRecordsPerPartition also break PipelinedShuffle / AvoidFinalMerge, since the partition stats are cumulative and not per partition. Synchronization will also need to be looked at (I suspect there may be some issues with the size stats as well).

The unordered case does not respect "sendEmptyPartitionsViaEvents" as a configuration parameter, and always sends empty partition information. IIRC this is why it is able to avoid the Writer for an empty partition: the reader will never access it. In the ordered case, if sendEmptyPartitionsViaEvents is disabled, the reader may try interpreting the contents of TezIndexRecord, which was not written, and fail (need to check how this will behave).

I think the changes to track the number of records should be removed. Instead, the main changes should be in DefaultSorter (and maybe the same changes in PipelinedSorter). These changes should skip creating the writer only if sendEmptyPartitionsViaEvents is enabled.

Also, in the current changes to DefaultSorter, is it possible to move the if (writer == null) check outside of the while loop?
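The suggestion above, to skip creating the writer for an empty partition only when sendEmptyPartitionsViaEvents is enabled, reduces to a small decision rule. The sketch below illustrates that rule under stated assumptions; WriterGateSketch, shouldCreateWriter and countWriters are hypothetical names, and the record counts are passed in as a plain array rather than taken from any Tez data structure.

```java
/** Illustrative only: the gating rule from the review comment, not Tez code. */
final class WriterGateSketch {

    /**
     * A writer is needed when the partition has records, or when empty
     * partitions are not reported via events (so a reader may still try to
     * read the segment and expects it to exist).
     */
    static boolean shouldCreateWriter(boolean partitionHasRecords,
                                      boolean sendEmptyPartitionsViaEvents) {
        return partitionHasRecords || !sendEmptyPartitionsViaEvents;
    }

    /** Counts how many writers one spill would create for the given partitions. */
    static int countWriters(int[] recordsPerPartition,
                            boolean sendEmptyPartitionsViaEvents) {
        int writers = 0;
        for (int records : recordsPerPartition) {
            if (shouldCreateWriter(records > 0, sendEmptyPartitionsViaEvents)) {
                writers++;
            }
        }
        return writers;
    }
}
```

For partitions holding 3, 0, 0 and 5 records, this rule creates two writers when the flag is enabled and four when it is disabled, which matches the concern that readers must never be pointed at a segment that was not written.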
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938398#comment-15938398 ]

Jonathan Eagles commented on TEZ-3605:

This approach seems correct. I will want to spend some time testing this to understand the implications.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15927083#comment-15927083 ]

Kuhu Shukla commented on TEZ-3605:

[~jeagles], [~rajesh.balamohan], request for review/comments. Appreciate it!
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15927078#comment-15927078 ]

TezQA commented on TEZ-3605:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12858951/TEZ-3605.006.patch
against master revision 57c857d.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2330//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2330//console

This message is automatically generated.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906612#comment-15906612 ]

Kuhu Shukla commented on TEZ-3605:

[~rajesh.balamohan], could you take a look at this patch and share your comments? Thanks a lot!
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900398#comment-15900398 ]

Kuhu Shukla commented on TEZ-3605:

[~sseth], [~jeagles], would appreciate any initial comments on the patch and how to proceed with this fix. Thanks a lot!
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900376#comment-15900376 ]

TezQA commented on TEZ-3605:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12856686/TEZ-3605.005.patch
against master revision c6d4908.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2313//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2313//console

This message is automatically generated.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899672#comment-15899672 ]

TezQA commented on TEZ-3605:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12856609/TEZ-3605.004.patch
against master revision d40f3ad.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 3.0.1) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in :
org.apache.tez.test.TestRecovery

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2307//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2307//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2307//console

This message is automatically generated.
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898019#comment-15898019 ]

TezQA commented on TEZ-3605:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12856328/TEZ-3605.003.patch
against master revision a5ffdea.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified test files.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 javadoc. There were no new javadoc warning messages.

-1 findbugs. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2303//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2303//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2303//console

This message is automatically generated.

> Detect and prune empty partitions for the Ordered case
> ------------------------------------------------------
>
> Key: TEZ-3605
> URL: https://issues.apache.org/jira/browse/TEZ-3605
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: TEZ-3605.001.patch, TEZ-3605.002.patch,
> TEZ-3605.003.patch
>
>
> Analogous to the Unordered case we should not have empty partition
> entries/segments in the Ordered/DefaultSorter case. This will save writing
> unnecessary data.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897595#comment-15897595 ]

TezQA commented on TEZ-3605:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12856292/TEZ-3605.002.patch
against master revision 518deb6.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 2 new or modified test files.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 javadoc. There were no new javadoc warning messages.

-1 findbugs. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2302//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2302//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2302//console

This message is automatically generated.

> Detect and prune empty partitions for the Ordered case
> ------------------------------------------------------
>
> Key: TEZ-3605
> URL: https://issues.apache.org/jira/browse/TEZ-3605
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: TEZ-3605.001.patch, TEZ-3605.002.patch
>
>
> Analogous to the Unordered case we should not have empty partition
> entries/segments in the Ordered/DefaultSorter case. This will save writing
> unnecessary data.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (TEZ-3605) Detect and prune empty partitions for the Ordered case
[ https://issues.apache.org/jira/browse/TEZ-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896696#comment-15896696 ]

TezQA commented on TEZ-3605:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12856006/TEZ-3605.001.patch
against master revision 1b1eb1d.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2299//console

This message is automatically generated.

> Detect and prune empty partitions for the Ordered case
> ------------------------------------------------------
>
> Key: TEZ-3605
> URL: https://issues.apache.org/jira/browse/TEZ-3605
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Attachments: TEZ-3605.001.patch
>
>
> Analogous to the Unordered case we should not have empty partition
> entries/segments in the Ordered/DefaultSorter case. This will save writing
> unnecessary data.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)