[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...

2016-11-09 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/639
  
Merged with the latest code. All review comments have been addressed. All 
tests pass with the option `store.parquet.use_local_affinity` set to true 
and with it set to false.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...

2016-11-04 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/639
  
Parallelization logic is affected for the following reasons:
Depending on how many rowGroups a node has to scan (based on locality 
information), i.e. how much work the node has to do, we want to adjust the 
number of fragments on the node (subject to the usual global and per-node 
limits). 
We do not want to schedule fragments on a node that does not have data. 
Because we want pure locality, we may have fewer fragments doing more work.
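
As a rough illustration of the reasoning above, here is a minimal sketch of sizing the fragment count on each node from the number of rowGroups local to it. It is not the code from this pull request: the class name, the method name, and the `rowGroupsPerFragment` parameter are hypothetical and exist only for this example, and the greedy handling of the global limit is a simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Drill's actual parallelizer.
final class LocalAffinitySketch {

    /**
     * Decide how many fragments to run on each node.
     *
     * @param rowGroupsPerNode     number of rowGroups local to each node
     * @param rowGroupsPerFragment assumed amount of work one fragment should take on
     * @param maxPerNode           per-node fragment limit
     * @param maxGlobal            global fragment limit
     */
    static Map<String, Integer> fragmentsPerNode(Map<String, Integer> rowGroupsPerNode,
                                                 int rowGroupsPerFragment,
                                                 int maxPerNode,
                                                 int maxGlobal) {
        Map<String, Integer> plan = new LinkedHashMap<>();
        int remaining = maxGlobal;
        for (Map.Entry<String, Integer> entry : rowGroupsPerNode.entrySet()) {
            int rowGroups = entry.getValue();
            if (rowGroups <= 0) {
                continue; // no local data: schedule no fragment on this node
            }
            // More local work means more fragments (rounded up) ...
            int wanted = (rowGroups + rowGroupsPerFragment - 1) / rowGroupsPerFragment;
            // ... but never beyond the per-node limit or what is left of the global limit.
            int granted = Math.min(wanted, Math.min(maxPerNode, remaining));
            if (granted > 0) {
                plan.put(entry.getKey(), granted);
                remaining -= granted;
            }
            // Simplification: once the global limit is used up, later nodes get
            // nothing here. A real planner would need to handle that case.
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, Integer> rowGroups = new LinkedHashMap<>();
        rowGroups.put("node1", 6);
        rowGroups.put("node2", 0);
        rowGroups.put("node3", 1);
        System.out.println(fragmentsPerNode(rowGroups, 2, 2, 10));
    }
}
```

In this example node1 would want three fragments but is capped at two by the per-node limit, node2 has no data and gets none, and node3 gets a single fragment. The result is fewer fragments overall, each reading only local rowGroups.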





[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...

2016-11-04 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/639
  
Hmm the answer seems like a rephrasing of the question. Sorry, I misspoke. 
Better asked:

The issue is regarding assigning **_work to_** fragments based on strict 
locality (**_decide which fragment does what_**). So why is the parallelization 
(**_decide how many fragments_**) logic affected?




[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...

2016-11-04 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/639
  
Some initial comments.

> The issue is regarding assigning fragments based on strict locality. So why 
> is the parallelization logic affected, and not exclusively locality?

Parallelization logic is affected because it decides how many fragments to 
run on each node and that is dependent on locality.  




[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...

2016-11-04 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/639
  
Updated with all review comments addressed. Added 
TestLocalAffinityFragmentParallelizer.java, which has a bunch of test cases 
with examples. 




[GitHub] drill issue #639: DRILL-4706: Fragment planning causes Drillbits to read rem...

2016-10-31 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/639
  
Updated the JIRA with details on how the current algorithm works, why remote 
reads were happening, and how the new algorithm works.
https://issues.apache.org/jira/browse/DRILL-4706


