[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...
Github user ppadma commented on the issue: https://github.com/apache/drill/pull/639 Merged with latest code. All review comments have been taken care of. All tests pass with the option `store.parquet.use_local_affinity` set to both true and false. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...
Github user ppadma commented on the issue: https://github.com/apache/drill/pull/639 Parallelization logic is affected for the following reasons: Depending on how many row groups a node has to scan (based on locality information), i.e. how much work the node has to do, we want to adjust the number of fragments on that node (constrained by the usual global and per-node limits). We do not want to schedule fragments on a node that does not have data. Because we want pure locality, we may have fewer fragments doing more work.
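The per-node adjustment described in this comment can be illustrated with a minimal sketch. It is not the code from the pull request: the class `LocalAffinitySketch`, the method `fragmentsPerNode`, and the proportional-share rule are all hypothetical, chosen only to show how a fragment count could follow local row-group counts while staying within a global and a per-node limit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; names and the sharing rule are assumptions,
// not taken from the Drill code base or from the pull request.
class LocalAffinitySketch {

    /**
     * Decide how many fragments to run on each node from the number of
     * row groups that are local to it.
     *
     * @param rowGroupsPerNode node name -> number of local row groups
     * @param maxGlobalWidth   upper bound on the total number of fragments
     * @param maxWidthPerNode  upper bound on the fragments on a single node
     * @return node name -> fragment count; nodes without fragments are omitted
     */
    static Map<String, Integer> fragmentsPerNode(Map<String, Integer> rowGroupsPerNode,
                                                 int maxGlobalWidth,
                                                 int maxWidthPerNode) {
        Map<String, Integer> result = new LinkedHashMap<>();

        // Total amount of work, counted in row groups.
        long totalRowGroups = 0;
        for (int count : rowGroupsPerNode.values()) {
            totalRowGroups += Math.max(count, 0);
        }
        if (totalRowGroups == 0 || maxGlobalWidth <= 0 || maxWidthPerNode <= 0) {
            return result;
        }

        // Never plan more fragments than there are row groups to scan.
        int targetWidth = (int) Math.min(maxGlobalWidth, totalRowGroups);
        int assigned = 0;

        for (Map.Entry<String, Integer> entry : rowGroupsPerNode.entrySet()) {
            int local = entry.getValue();
            if (local <= 0) {
                continue; // no local data: do not schedule a fragment here
            }
            // Share of the target width proportional to the node's local work,
            // with at least one fragment for any node that holds data.
            int share = (int) Math.max(1, (long) targetWidth * local / totalRowGroups);
            // Respect the per-node limit and the number of local row groups.
            share = Math.min(share, Math.min(local, maxWidthPerNode));
            // Respect what is left of the global limit.
            share = Math.min(share, targetWidth - assigned);
            if (share > 0) {
                result.put(entry.getKey(), share);
                assigned += share;
            }
        }
        return result;
    }
}
```

For example, with 6, 2, and 0 local row groups on three nodes, a global limit of 8, and a per-node limit of 4, this sketch assigns 4 fragments to the first node, 2 to the second, and none to the third. That is 6 fragments rather than 8, which matches the remark that pure locality can mean fewer fragments doing more work. Note that in this simplified version, if the global limit is smaller than the number of nodes holding data, some of those nodes receive no fragment, so strict locality could not be kept in that case.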
[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...
Github user sudheeshkatkam commented on the issue: https://github.com/apache/drill/pull/639 Hmm the answer seems like a rephrasing of the question. Sorry, I misspoke. Better asked: The issue is regarding assigning **_work to_** fragments based on strict locality (**_decide which fragment does what_**). So why is the parallelization (**_decide how many fragments_**) logic affected?
[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...
Github user ppadma commented on the issue: https://github.com/apache/drill/pull/639 Some initial comments. The issue is regarding assigning fragments based on strict locality. So why is the parallelization logic affected, and not exclusively locality? Parallelization logic is affected because it decides how many fragments to run on each node, and that depends on locality.
[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...
Github user ppadma commented on the issue: https://github.com/apache/drill/pull/639 Updated with all review comments taken care of. Added TestLocalAffinityFragmentParallelizer.java, which has a bunch of test cases with examples.
[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...
Github user ppadma commented on the issue: https://github.com/apache/drill/pull/639 Updated the JIRA with details on how the current algorithm works, why remote reads were happening, and how the new algorithm works. https://issues.apache.org/jira/browse/DRILL-4706