[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652670#comment-15652670 ]

Siddharth Seth commented on TEZ-3269:

+1. Looks good. Thanks [~mingma]

> Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
>
> Key: TEZ-3269
> URL: https://issues.apache.org/jira/browse/TEZ-3269
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Ming Ma
> Assignee: Ming Ma
> Attachments: TEZ-3269-2.patch, TEZ-3269-3.patch, TEZ-3269-4.patch, TEZ-3269-5.patch, TEZ-3269.patch
>
> With TEZ-3206 and TEZ-3216, we can build a custom VertexManager and EdgeManager that uses partition stats to do fair routing as well as the scheduling based on destination tasks’ dependency on source tasks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622970#comment-15622970 ]

Siddharth Seth commented on TEZ-3269:

Will take a look, hopefully, soon.

bq. Will the parallelism ever end up getting increased?
bq. yes. When FairRoutingType#FAIR_PARALLELISM is used, there will be multiple destination tasks processing the same partition and each destination task will process a range of source tasks of that partition.

I'm not sure increasing the parallelism of a vertex, beyond the initial value (other than -1) is supported yet. Will need to take a look.
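The quoted FAIR_PARALLELISM behaviour (several destination tasks share one partition, each reading a range of source tasks) can be illustrated with a small sketch. This is not the code from the patch: the class `SourceTaskRangeSketch`, the `Range` type and `splitSourceTasks` are hypothetical names, and the even split into contiguous ranges is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SourceTaskRangeSketch {

    /** Half-open range [start, end) of source task indices (hypothetical type). */
    public static final class Range {
        public final int start;
        public final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /**
     * Splits the source tasks of one partition into contiguous ranges, one per
     * destination task, with range sizes differing by at most one.
     * Assumption: an even split; the real manager may use partition stats instead.
     */
    public static List<Range> splitSourceTasks(int numSourceTasks, int numDestTasks) {
        if (numSourceTasks <= 0 || numDestTasks <= 0) {
            throw new IllegalArgumentException("task counts must be positive");
        }
        // Never create more ranges than there are source tasks.
        int n = Math.min(numSourceTasks, numDestTasks);
        int base = numSourceTasks / n;
        int extra = numSourceTasks % n;
        List<Range> ranges = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < n; i++) {
            // The first 'extra' ranges take one additional source task.
            int size = base + (i < extra ? 1 : 0);
            ranges.add(new Range(start, start + size));
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 10 source tasks of one partition shared by 3 destination tasks.
        System.out.println(splitSourceTasks(10, 3)); // prints [[0, 4), [4, 7), [7, 10)]
    }
}
```

Under this reading, the destination vertex ends up with more tasks than partitions, which is why the question of raising parallelism above its initial value comes up.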
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622958#comment-15622958 ]

TezQA commented on TEZ-3269:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12836176/TEZ-3269-5.patch
  against master revision a328d46.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 4 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. There were no new javadoc warning messages.
    +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2081//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2081//console

This message is automatically generated.
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621538#comment-15621538 ]

TezQA commented on TEZ-3269:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12836122/TEZ-3269-4.patch
  against master revision a328d46.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 4 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. There were no new javadoc warning messages.
    -1 findbugs. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2079//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2079//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2079//console

This message is automatically generated.
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15578791#comment-15578791 ]

Siddharth Seth commented on TEZ-3269:

Apologies for the long delay in the review. Mostly looks good. Would be a lot easier to review if this were split into smaller jiras... think it combines a bunch of things like long to int, with the core logic changes.

Minor Stuff:
- final where possible - e.g. PartitionsGroupingCalculator.sourceVertexInfo, all variables in FairEdgeConfiguration
- This is a fairly complicated patch. Would be good to have some more documentation.
  - the ceil method
  - Within various conditions in compute and iterator
- Obligatory rename request: getTotalStatsAtIndex to getCurrentlyKnownStatsAtIndex - this method will normally not return totalStats.
- Nit: expectedTotalSourceTasksOutputSize / numOfPartitions; - can be done once outside the loop
- onVertexStarted - Should this be split up a little more? It's possible for quite a bit to happen at the moment, before the "single vertex only" check is hit in FairShuffleVertexManager

Question:
- estimatePartitionSize.partitionstatSizeInMB is across all partitions. This ensures that averaging of stats based on output size isn't accidentally hit on a 0 sized partition? (Could break earlier from the loop)
- In case of reduce_parallelism - this considers the partition size and may produce groups with different number of partitions to consume, which the current ShuffleVertexManager doesn't do yet?
- Will the parallelism ever end up getting increased?

Any thoughts on what it will take to move this to support multiple source vertices?
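The reduce_parallelism question in the review above (size-aware grouping that can yield groups with different numbers of partitions) can be made concrete with a short sketch. It is an assumption-laden illustration, not the logic of PartitionsGroupingCalculator: the class `PartitionGroupingSketch`, the method `groupSizes` and the greedy packing rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionGroupingSketch {

    /**
     * Greedily packs consecutive partitions into groups whose total size stays
     * at or below desiredGroupSize. A partition larger than the limit forms a
     * group of its own. Returns the number of partitions in each group.
     */
    public static List<Integer> groupSizes(long[] partitionSizes, long desiredGroupSize) {
        if (desiredGroupSize <= 0) {
            throw new IllegalArgumentException("desiredGroupSize must be positive");
        }
        List<Integer> groups = new ArrayList<>();
        int count = 0;   // partitions in the group being built
        long total = 0;  // accumulated size of that group
        for (long size : partitionSizes) {
            // Close the current group if adding this partition would overflow it.
            if (count > 0 && total + size > desiredGroupSize) {
                groups.add(count);
                count = 0;
                total = 0;
            }
            count++;
            total += size;
        }
        if (count > 0) {
            groups.add(count);
        }
        return groups;
    }

    public static void main(String[] args) {
        long[] sizes = {60, 50, 10, 10, 10, 10, 30, 100};
        System.out.println(groupSizes(sizes, 100)); // prints [1, 5, 1, 1]
    }
}
```

With skewed partition sizes, one destination task may consume five small partitions while another consumes a single large one, unlike a scheme that assigns the same number of partitions to every task.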
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569378#comment-15569378 ]

Zhiyuan Yang commented on TEZ-3269:

+1 (non-binding)
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569234#comment-15569234 ]

Ming Ma commented on TEZ-3269:

[~aplusplus] [~sseth] or anyone else, any additional comments? If it looks good, I'd appreciate it if you could give a +1 so that I can commit this and use it for our integration testing. Thanks.
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546088#comment-15546088 ]

Zhiyuan Yang commented on TEZ-3269:

Patch looks good to me.
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15544146#comment-15544146 ]

TezQA commented on TEZ-3269:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12831443/TEZ-3269-3.patch
  against master revision ad1fb62.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 4 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. There were no new javadoc warning messages.
    +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2008//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2008//console

This message is automatically generated.
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15542944#comment-15542944 ]

Zhiyuan Yang commented on TEZ-3269:

I'm sorry for the late review... Generally the code looks good, as the core parts are mostly the same as in the TEZ-3209 patch. Just some minor things:

In FairShuffleVertexManager:
{code:java}
if (pendingTasks.size() == 0) {
  // don't change routing when number of tasks is set to zero.
  return null;
}
{code}
Given that the initial parallelism of the reducer determines the number of partitions of the source vertices' output, why would pendingTasks.size() be zero?

In TestFairShuffleVertexManager:
{code:java}
// The 2nd destination task fetches one partition from the first source
// task.
// The 3rd destination task fetches one partition from the 2nd and 3rd
// source task.
{code}
The comments don't match the code.

{code:java}
Assert.assertTrue(manager.config.isAutoParallelismEnabled() == true);
{code}
'== true' can be removed.
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460260#comment-15460260 ]

TezQA commented on TEZ-3269:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12826952/TEZ-3269-2.patch
  against master revision af82469.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 4 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. There were no new javadoc warning messages.
    +1 findbugs. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1955//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1955//console

This message is automatically generated.
[jira] [Commented] (TEZ-3269) Provide basic fair routing and scheduling functionality via custom VertexManager and EdgeManager
[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440355#comment-15440355 ]

TezQA commented on TEZ-3269:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12825741/TEZ-3269.patch
  against master revision a9eb937.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 4 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. There were no new javadoc warning messages.
    -1 findbugs. The patch appears to introduce 7 new Findbugs (version 3.0.1) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1938//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1938//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1938//console

This message is automatically generated.