[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-06-14 Thread Bridget Bevens (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513119#comment-16513119
 ] 

Bridget Bevens commented on DRILL-6173:
---

Updated doc here: https://drill.apache.org/docs/parquet-filter-pushdown/ 
I'll also include the following mention in the 1.14 release notes:
Drill supports the planner rule, JoinPushTransitivePredicatesRule, which 
enables Drill to infer filter conditions for join queries and push the filter 
conditions down to the data source.
Setting doc status to doc-complete. Please let me know if you see any issues 
with the content.
Thanks,
Bridget

> Support transitive closure during filter push down and partition pruning
> 
>
> Key: DRILL-6173
> URL: https://issues.apache.org/jira/browse/DRILL-6173
> Project: Apache Drill
> Issue Type: Improvement
> Components: Query Planning & Optimization
> Affects Versions: 1.12.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Major
> Labels: doc-complete, ready-to-commit
> Fix For: 1.14.0
>
>
> Calcite provides the rule JoinPushTransitivePredicatesRule, but it does not 
> work in Drill. 
> Applying it in Drill will allow equi-join queries to push a filter 
> condition from one table to the other:
> {code:sql}
> select * 
> from A, B 
> where
> A.id = B.id and 
> B.id = 100
> {code}
> In that case the Scan operator for table A may not need to scan all of the 
> data. 
> For table A, this can enable: 
> 1. [Partition pruning for Hive tables and partition/directory pruning for 
> file system 
> tables|https://drill.apache.org/docs/partition-pruning-introduction/] 
> 2. [Parquet filter 
> pushdown|https://drill.apache.org/docs/parquet-filter-pushdown/]
>  
> Note: transitive closure does not work in some cases. The following Calcite 
> issues should resolve them:
> CALCITE-1048, CALCITE-2274, CALCITE-2275, CALCITE-2241. They are tracked by 
> DRILL-6350.
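
For readers unfamiliar with the idea, the inference described in the issue can be illustrated with a small standalone sketch. This is not Calcite's JoinPushTransitivePredicatesRule or Drill's planner code; the class name TransitivePredicateSketch and its methods are hypothetical, and the sketch assumes only simple column equalities and column = literal filters. It groups columns joined by equality into equivalence classes and copies a constant filter to every other column in the same class, so that `A.id = B.id and B.id = 100` also yields `A.id = 100`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not the Calcite or Drill implementation. */
public class TransitivePredicateSketch {

  // Union-find parent pointers keyed by column name, e.g. "A.id".
  private final Map<String, String> parent = new HashMap<>();

  // Returns the representative column of the equivalence class containing col.
  private String find(String col) {
    parent.putIfAbsent(col, col);
    String p = parent.get(col);
    if (!p.equals(col)) {
      p = find(p);
      parent.put(col, p); // path compression
    }
    return p;
  }

  /** Records an equi-join condition such as A.id = B.id. */
  public void addEquality(String left, String right) {
    parent.put(find(left), find(right));
  }

  /**
   * Given constant filters (column -> literal), returns the filters that can be
   * inferred for the other columns in the same equivalence class.
   * Simplification: if one class carries several different constants, the last
   * one seen wins; a real planner would detect the contradiction instead.
   */
  public Map<String, Object> infer(Map<String, Object> constantFilters) {
    Map<String, Object> constantByRoot = new HashMap<>();
    for (Map.Entry<String, Object> e : constantFilters.entrySet()) {
      constantByRoot.put(find(e.getKey()), e.getValue());
    }
    Map<String, Object> inferred = new LinkedHashMap<>();
    // Iterate over a copy because find() may update the parent map.
    for (String col : new ArrayList<>(parent.keySet())) {
      Object value = constantByRoot.get(find(col));
      if (value != null && !constantFilters.containsKey(col)) {
        inferred.put(col, value);
      }
    }
    return inferred;
  }
}
```

Calling `addEquality("A.id", "B.id")` and then `infer` with the single filter `B.id -> 100` returns a map containing `A.id -> 100`, which is the extra predicate that makes partition pruning or Parquet filter pushdown possible on table A.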



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16458158#comment-16458158
 ] 

ASF GitHub Bot commented on DRILL-6173:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1216


> Support transitive closure during filter push down and partition pruning
> 
>
> Key: DRILL-6173
> URL: https://issues.apache.org/jira/browse/DRILL-6173
> Project: Apache Drill
> Issue Type: Improvement
> Components: Query Planning & Optimization
> Affects Versions: 1.12.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Major
> Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>
> Calcite provides the rule JoinPushTransitivePredicatesRule, but it does not 
> work in Drill. 
> Applying it in Drill will allow equi-join queries to push a filter 
> condition from one table to the other:
> {code:sql}
> select * 
> from A, B 
> where
> A.id = B.id and 
> B.id = 100
> {code}
> In that case the Scan operator for table A may not need to scan all of the 
> data. 
> For table A, this can enable: 
> 1. [Partition pruning for Hive tables and partition/directory pruning for 
> file system 
> tables|https://drill.apache.org/docs/partition-pruning-introduction/] 
> 2. [Parquet filter 
> pushdown|https://drill.apache.org/docs/parquet-filter-pushdown/]
>  
> Note: transitive closure does not work in some cases. The following Calcite 
> issues should resolve them:
> CALCITE-1048, CALCITE-2274, CALCITE-2275, CALCITE-2241.





[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448828#comment-16448828
 ] 

ASF GitHub Bot commented on DRILL-6173:
---

Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/1216
  
+1


> Support transitive closure during filter push down and partition pruning
> 
>
> Key: DRILL-6173
> URL: https://issues.apache.org/jira/browse/DRILL-6173
> Project: Apache Drill
> Issue Type: Improvement
> Components: Query Planning & Optimization
> Affects Versions: 1.12.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Major
> Labels: doc-impacting
> Fix For: 1.14.0
>
>
> Calcite provides the rule JoinPushTransitivePredicatesRule, but it does not 
> work in Drill. 
> Applying it in Drill will allow equi-join queries to push a filter 
> condition from one table to the other:
> {code:sql}
> select * 
> from A, B 
> where
> A.id = B.id and 
> B.id = 100
> {code}
> In that case the Scan operator for table A may not need to scan all of the 
> data. 
> For table A, this can enable: 
> 1. [Partition pruning for Hive or file system Parquet 
> tables|https://drill.apache.org/docs/partition-pruning-introduction/] 
> 2. [Parquet filter 
> pushdown|https://drill.apache.org/docs/parquet-filter-pushdown/]





[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-04-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16448738#comment-16448738
 ] 

ASF GitHub Bot commented on DRILL-6173:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1216#discussion_r183329162
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/planner/logical/TestTransitiveClosure.java
 ---
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.drill.PlanTestBase;
+import org.apache.drill.categories.PlannerTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.nio.file.Paths;
+
+import static org.junit.Assert.assertEquals;
+
+@Category(PlannerTest.class)
+public class TestTransitiveClosure extends PlanTestBase {
+
+  @BeforeClass
+  public static void setupTestFiles() {
+    dirTestWatcher.copyResourceToRoot(Paths.get("join"));
+  }
+
+
+  @Test // CALCITE-2200: (query with infinite loop)
+  public void simpleInfiniteLoop() throws Exception {
+    String query = "SELECT t1.department_id FROM cp.`employee.json` t1 " +
+        " WHERE t1.department_id IN (SELECT department_id FROM cp.`department.json` t2 " +
+        "WHERE t1.department_id = t2.department_id " +
+        "OR (t1.department_id IS NULL and t2.department_id IS NULL))";
+    int actualRowCount = testSql(query);
+    int expectedRowCount = 1155;
+    assertEquals("Expected and actual row count should match", expectedRowCount, actualRowCount);
+
+    // TODO: After resolving CALCITE-2257 there will not be Filter with OR(IS NOT NULL($0), IS NULL($0)) condition
+    // Then remove testPlanMatchingPatterns() from this test
+    final String[] expectedPlan =
+        new String[] {"Filter\\(condition=\\[OR\\(IS NOT NULL\\(\\$0\\), IS NULL\\(\\$0\\)\\)\\]\\)"};
+    final String[] excludedPlan ={};
+    testPlanMatchingPatterns(query, expectedPlan, excludedPlan);
+  }
+
+  @Test // CALCITE-2205 (query with infinite loop)
+  public void infiniteLoopWhilePlaningComplexQuery() throws Exception {
--- End diff --

This test and the other one in TestTransitiveClosure.java come from the 
functional test suite, so we can remove them from the unit tests.




[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16445959#comment-16445959
 ] 

ASF GitHub Bot commented on DRILL-6173:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/1216#discussion_r183097212
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/planner/logical/TestTransitiveClosure.java
 ---
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.drill.PlanTestBase;
+import org.apache.drill.categories.PlannerTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.nio.file.Paths;
+
+import static org.junit.Assert.assertEquals;
+
+@Category(PlannerTest.class)
+public class TestTransitiveClosure extends PlanTestBase {
+
+  @BeforeClass
+  public static void setupTestFiles() {
+    dirTestWatcher.copyResourceToRoot(Paths.get("join"));
+  }
+
+
+  @Test // CALCITE-2200: (query with infinite loop)
+  public void simpleInfiniteLoop() throws Exception {
+    String query = "SELECT t1.department_id FROM cp.`employee.json` t1 " +
+        " WHERE t1.department_id IN (SELECT department_id FROM cp.`department.json` t2 " +
+        "WHERE t1.department_id = t2.department_id " +
+        "OR (t1.department_id IS NULL and t2.department_id IS NULL))";
+    int actualRowCount = testSql(query);
+    int expectedRowCount = 1155;
+    assertEquals("Expected and actual row count should match", expectedRowCount, actualRowCount);
+
+    // TODO: After resolving CALCITE-2257 there will not be Filter with OR(IS NOT NULL($0), IS NULL($0)) condition
+    // Then remove testPlanMatchingPatterns() from this test
+    final String[] expectedPlan =
+        new String[] {"Filter\\(condition=\\[OR\\(IS NOT NULL\\(\\$0\\), IS NULL\\(\\$0\\)\\)\\]\\)"};
+    final String[] excludedPlan ={};
+    testPlanMatchingPatterns(query, expectedPlan, excludedPlan);
+  }
+
+  @Test // CALCITE-2205 (query with infinite loop)
+  public void infiniteLoopWhilePlaningComplexQuery() throws Exception {
--- End diff --

For complex queries such as this one, I would suggest moving them out of the 
unit tests and into the functional test suite. 




[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443635#comment-16443635
 ] 

ASF GitHub Bot commented on DRILL-6173:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1216
  
@vvysotskyi / @amansinha100 please review.




[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441476#comment-16441476
 ] 

ASF GitHub Bot commented on DRILL-6173:
---

GitHub user vdiravka opened a pull request:

https://github.com/apache/drill/pull/1216

DRILL-6173: Support transitive closure during filter push down and pa…

…rtition pruning

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vdiravka/drill DRILL-6173

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1216.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1216


commit ad673cd8030ce1540a1c5bb5a55f6a2c8dbcb5d3
Author: Vitalii Diravka 
Date:   2018-04-17T11:38:03Z

DRILL-6173: Support transitive closure during filter push down and 
partition pruning






[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-03-06 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388281#comment-16388281
 ] 

Vitalii Diravka commented on DRILL-6173:


Enabling "transitive closure" on the drill-calcite-1.15.0-r0 version would 
require at least 5 commits (maybe more) from the Apache Calcite master branch. 
It is therefore more reasonable to pick up all of these changes as part of the 
upgrade to Calcite 1.16.0.

> Support transitive closure during filter push down and partition pruning
> 
>
> Key: DRILL-6173
> URL: https://issues.apache.org/jira/browse/DRILL-6173
> Project: Apache Drill
> Issue Type: Improvement
> Components: Query Planning & Optimization
> Affects Versions: 1.12.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Major
> Labels: doc-impacting
> Fix For: 1.13.0
>
>
> Calcite provides the rule JoinPushTransitivePredicatesRule, but it does not 
> work in Drill. 
> Applying it in Drill will allow equi-join queries to push a filter 
> condition from one table to the other:
> {code:sql}
> select * 
> from A, B 
> where
> A.id = B.id and 
> B.id = 100
> {code}
> In that case the Scan operator for table A may not need to scan all of the 
> data. 
> For table A, this can enable: 
> 1. [Partition pruning for Hive or file system Parquet 
> tables|https://drill.apache.org/docs/partition-pruning-introduction/] 
> 2. [Parquet filter 
> pushdown|https://drill.apache.org/docs/parquet-filter-pushdown/]





[jira] [Commented] (DRILL-6173) Support transitive closure during filter push down and partition pruning

2018-03-06 Thread Pritesh Maker (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388131#comment-16388131
 ] 

Pritesh Maker commented on DRILL-6173:
--

[~vitalii] can you add a comment about our discussion this morning? As I 
understand, the fix requires that we upgrade to Calcite 1.16.
