[GitHub] drill issue #671: DRILL-4347: Propagate distinct row count for joins from lo...

2016-12-07 Thread jinfengni
Github user jinfengni commented on the issue:

https://github.com/apache/drill/pull/671
  
Agreed with your comment about the importance of doing calcite rebasing.  
AFAIK, someone(s) else in Ukraine have been working on calcite rebasing for a 
while. Last time I heard is they managed to get a rebased calcite branch and 
are dealing with regressions on Drill side.
 

 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #671: DRILL-4347: Propagate distinct row count for joins from lo...

2016-12-07 Thread julianhyde
Github user julianhyde commented on the issue:

https://github.com/apache/drill/pull/671
  
Rebasing onto Calcite is like running after a train: you can't just stop 
and take a rest. :)

And by the way, the state of Drill-Arrow integration makes me very sad. Now 
Drill has fallen behind there, I doubt whether it will ever catch up.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #671: DRILL-4347: Propagate distinct row count for joins from lo...

2016-12-07 Thread jinfengni
Github user jinfengni commented on the issue:

https://github.com/apache/drill/pull/671
  
@jacques-n , CachingRelMetadataProvider provides caching capability per 
meta-method / rel node [1]. Since Drill logical rel (DrillJoinRel) and Drill 
physical rel (JoinPrel) are different rels, CachingRelMetadataProvider probably 
would not help avoiding the first meta data call for the physical rel nodes, 
even the meta data for logical rels are in the cache. 

@julianhyde , I probably once tried to cherry-pick CALCITE-604 to Drill's 
calcite fork, and I aborted that effort after seeing many merging conflicts (If 
I remember correctly).  Since there has been ongoing effort to rebase Drill 
onto latest Calcite,  it might make sense to see if the rebase work could be 
done shortly. At that time, Drill will benefit from CALCITE-604.

 
1. 
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/CachingRelMetadataProvider.java#L113-L120



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #671: DRILL-4347: Propagate distinct row count for joins from lo...

2016-11-29 Thread amansinha100
Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/671
  
Thanks @julianhyde... CALCITE-604 should potentially help with this.  
Drill's calcite version has not caught up to this yet.  Let me confer with 
@jinfengni  sometime next week (he is on vacation until then) and get back on 
what can be done to get this into Drill.  In the meantime, even though my patch 
addresses the hang issue, I will hold it for now. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #671: DRILL-4347: Propagate distinct row count for joins from lo...

2016-11-29 Thread julianhyde
Github user julianhyde commented on the issue:

https://github.com/apache/drill/pull/671
  
Yes, CachingRelMetadataProvider is meant for this.

After https://issues.apache.org/jira/browse/CALCITE-604, providers are 
considerably more efficient (not called via reflection), and RelMetadataQuery 
contains a per-request cache (because some metadata does change, but slowly).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #671: DRILL-4347: Propagate distinct row count for joins from lo...

2016-11-29 Thread amansinha100
Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/671
  
It is quite likely that the CachingRelMetadataProvider is meant for this.  
Based on the stack trace, there are multiple instances of "at 
org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.invoke(CachingRelMetadataProvider.java:132)"
  and that line # indicates that there was either a cache miss or the entry was 
stale.   So, the caching provider does in fact get used but then subsequently 
gets stuck in the apply() method of the ReflectiveRelMetadataProvider.  I did 
not attempt to debug why it got stuck there...partly because I am not very 
familiar with the way reflection is used in this provider.  Hence, my fix is an 
attempt to circumvent the issue.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #671: DRILL-4347: Propagate distinct row count for joins from lo...

2016-11-29 Thread jacques-n
Github user jacques-n commented on the issue:

https://github.com/apache/drill/pull/671
  
Random question: won't a caching metadata provider achieve the same thing 
without having to do per rel node changes? At least that is what I thought it 
was for. Maybe @julianhyde can provide additional feedback?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---