[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r104825321 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,342 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r104825193 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,342 @@ package

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-07 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @gatorsmile @wzhfy Would you please review this PR. Thank you. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-01-04 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r94650968 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -42,7 +366,7 @@ object ReorderJoin extends Rule

[GitHub] spark pull request #16228: [WIP] [SPARK-17076] [SQL] Cardinality estimation ...

2016-12-14 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/16228#discussion_r92523690 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/JoinEstimation.scala --- @@ -0,0 +1,175

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2016-12-14 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 The following updates were made: 1. Incorporate table and column statistics into the star join detection algorithm. Fact table is chosen based on table cardinality, and dimensions are

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2016-11-04 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r86595007 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -83,10 +88,221 @@ object ReorderJoin extends Rule

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2016-11-03 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 retest this please

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2016-10-27 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r85478316 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -83,10 +88,221 @@ object ReorderJoin extends Rule

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2016-10-27 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r85478259 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -83,10 +88,221 @@ object ReorderJoin extends Rule

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2016-10-27 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r85477835 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -83,10 +88,221 @@ object ReorderJoin extends Rule

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2016-10-27 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r85477511 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -373,6 +373,11 @@ object SQLConf { .booleanConf

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2016-10-27 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @davies Thank you for reviewing the code! I see this work as evolving and improving with the support of CBO. Without statistics and features such as cardinality and selectivity, we cannot

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2016-10-27 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r85477459 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -261,3 +262,34 @@ object PhysicalAggregation

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2016-10-24 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r84777635 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StarJoinSuite.scala --- @@ -0,0 +1,354 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...

2016-10-19 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/14847 @viirya Hi Simon, I have some general comments/questions: 1. It will help to include in the design document some example queries together with their corresponding optimized + physical

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2016-10-05 Thread ioana-delaney
GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/15363 [SPARK-17791][SQL] Join reordering using star schema detection ## What changes were proposed in this pull request? Star schema consists of one or more fact tables referencing a

[GitHub] spark issue #15289: [SPARK-17712][SQL] Fix invalid pushdown of data-independ...

2016-09-30 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15289 @JoshRosen In your example, we don't want to first count one million rows coming from the base table and then to return zero rows based on the false predicate in the outer query

[GitHub] spark issue #15289: [SPARK-17712][SQL] Fix invalid pushdown of data-independ...

2016-09-30 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15289 @JoshRosen The original predicate has to be kept above the aggregation. An optimization would be to also push down the predicate below the aggregation, lower in the plan for early filtering

[GitHub] spark issue #15289: [SPARK-17712][SQL] Fix invalid pushdown of data-independ...

2016-09-28 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15289 @JoshRosen Wouldn't it be a better design to push down the predicate but keep the original predicate as well? If the aggregate is above a complex join, not pushing down the predicate may

[GitHub] spark pull request #13867: [SPARK-16161][SQL] Ambiguous error message for un...

2016-08-24 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/13867#discussion_r76094136 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -912,19 +912,24 @@ class Analyzer

[GitHub] spark pull request #13867: [SPARK-16161][SQL] Ambiguous error message for un...

2016-08-24 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/13867#discussion_r76093305 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -912,19 +912,24 @@ class Analyzer

[GitHub] spark issue #13867: [SPARK-16161][SQL] Ambiguous error message for unsupport...

2016-08-22 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13867 @hvanhovell I rebased my changes to the latest build. Please take a look. Thank you.

[GitHub] spark issue #13867: [SPARK-16161][SQL] Ambiguous error message for unsupport...

2016-07-08 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13867 Can someone please review the changes? Thank you.

[GitHub] spark issue #13867: [SPARK-16161][SQL] Ambiguous error message for unsupport...

2016-06-29 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13867 @hvanhovell Would you please let me know if you agree with my previous reply? An alternative design is to remove the try-catch expression from resolveOuterReferences() altogether

[GitHub] spark issue #13867: [SPARK-16161][SQL] Ambiguous error message for unsupport...

2016-06-26 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13867 @hvanhovell Thank you for reviewing the changes. I replied to your comments and made some updates. Please let me know.

[GitHub] spark pull request #13867: [SPARK-16161][SQL] Ambiguous error message for un...

2016-06-26 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/13867#discussion_r68513197 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -911,19 +911,30 @@ class Analyzer

[GitHub] spark pull request #13867: [SPARK-16161][SQL] Ambiguous error message for un...

2016-06-26 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/13867#discussion_r68513152 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -911,19 +911,30 @@ class Analyzer

[GitHub] spark pull request #13867: [SPARK-16161][SQL] Ambiguous error message for un...

2016-06-22 Thread ioana-delaney
GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/13867 [SPARK-16161][SQL] Ambiguous error message for unsupported correlated predicate subqueries ## What changes were proposed in this pull request? Subqueries with deep correlation fail with

[GitHub] spark issue #13570: [SPARK-15832][SQL] Embedded IN/EXISTS predicate subquery...

2016-06-10 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13570 @hvanhovell The EXISTS/NOT EXISTS predicates will have an empty condition. e.g. select c1 from t1 where EXISTS (select c2 from t2) == Optimized Logical Plan == Project

[GitHub] spark issue #13570: [SPARK-15832][SQL] Embedded IN/EXISTS predicate subquery...

2016-06-10 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13570 @hvanhovell Thank you for reviewing the changes and I apologize for the delay in replying. I simplified the code. However, I don't think this is what you suggested. What you sugg

[GitHub] spark pull request #13570: [SPARK-15832][SQL] Embedded IN/EXISTS predicate s...

2016-06-08 Thread ioana-delaney
GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/13570 [SPARK-15832][SQL] Embedded IN/EXISTS predicate subquery throws TreeNodeException ## What changes were proposed in this pull request? Queries with embedded existential sub-query

[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

2016-06-03 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13418 @cloud-fan @gatorsmile @davies @rxin @hvanhovell Thank you all. This was my first PR!

[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

2016-06-03 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13418 @cloud-fan Thank you.

[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

2016-06-02 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13418 Thank you @cloud-fan. I mentioned the local relations in the test case description and moved the test cases under withTempTable.

[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

2016-06-02 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13418 @cloud-fan I moved the unit tests to a new test case. Thank you.

[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

2016-06-02 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13418 @cloud-fan I replaced p.expressions with projectList. Thanks.

[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

2016-06-01 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13418 @gatorsmile @davies @rxin @cloud-fan I've incorporated the comments. Please advise. Thank you.

[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-01 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/13418#discussion_r65438111 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala --- @@ -84,6 +84,13 @@ object ScalarSubquery

[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

2016-06-01 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/13418#discussion_r65437967 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1468,7 +1468,8 @@ object DecimalAggregates

[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

2016-05-31 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/13418#discussion_r65294821 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1468,7 +1468,8 @@ object DecimalAggregates

[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

2016-05-31 Thread ioana-delaney
GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/13418 [SPARK-15677][SQL] Query with scalar sub-query in the SELECT list throws UnsupportedOperationException ## What changes were proposed in this pull request? Queries with scalar sub-query
