[GitHub] spark pull request #16696: [SPARK-19350] [SQL] Cardinality estimation of Lim...

2017-02-04 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16696#discussion_r99474494 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/StatsEstimationSuite.scala --- @@ -18,12 +18,41 @@ package

[GitHub] spark pull request #16696: [SPARK-19350] [SQL] Cardinality estimation of Lim...

2017-01-31 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16696#discussion_r98776491 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -727,37 +728,18 @@ case class

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-23 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16594 To show a very large Long number, there is no need to print every digit. We can use exponent notation instead. For example, the number 120,000,000,005,123 can be printed as 1.2*10**14, where 10**14
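The formatting idea in the comment above can be sketched as follows. This is a minimal illustration in Python, not the code from the pull request: the helper name `format_large_long` and the `threshold` parameter are hypothetical, and keeping two significant digits is an assumption taken from the 1.2*10**14 example.

```python
def format_large_long(value: int, threshold: int = 10**6) -> str:
    """Render a large integer as mantissa*10**exponent.

    Hypothetical helper for illustration only; it is not part of Spark.
    Values below `threshold` (an assumed cutoff) are printed in full.
    """
    if abs(value) < threshold:
        return str(value)
    # '.1e' keeps one digit after the decimal point, e.g. '1.2e+14'.
    mantissa, exponent = f"{value:.1e}".split("e")
    # int() drops the '+' sign and any leading zeros of the exponent.
    return f"{mantissa}*10**{int(exponent)}"


print(format_large_long(120_000_000_005_123))  # 1.2*10**14
print(format_large_long(42))                   # 42
```

The mantissa is rounded, so the printed value is an approximation; that is acceptable here because the number is an estimate shown for readability rather than an exact count.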

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-20 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r102133390 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,389

[GitHub] spark pull request #17065: [SPARK-17075][SQL][followup] fix some minor issue...

2017-02-25 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17065#discussion_r103087483 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -95,15 +84,16 @@ case class

[GitHub] spark pull request #17065: [SPARK-17075][SQL][followup] fix some minor issue...

2017-02-25 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17065#discussion_r103087345 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -297,6 +278,8 @@ case class

[GitHub] spark pull request #17065: [SPARK-17075][SQL][followup] fix some minor issue...

2017-02-25 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17065#discussion_r103090517 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -361,57 +343,52 @@ case

[GitHub] spark issue #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-19 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16395 @cloud-fan I have updated the code based on your feedback. Please review it again. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-13 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r100955392 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,623

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-13 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r100941831 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,623

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-13 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r100940942 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,623

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-13 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r100952454 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,623

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-13 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r100954234 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,623

[GitHub] spark issue #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-16 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16395 Hi @cloud-fan I revised the code to use the latest Range class. Thanks for reviewing the code.

[GitHub] spark issue #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-18 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16395 @wzhfy We do not support the predicate condition d_date >= '2000-01-27' because Spark SQL first casts the d_date column to String before the comparison. For predicate condition d_date >= cast('2

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-16 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r96308583 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,309

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-18 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r96718076 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,303

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-16 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r96324552 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,620

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-16 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r96325991 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,620

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-19 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79508043 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,159 @@ +/* + * Licensed

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-04 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86647571 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,310 @@ +/* + * Licensed

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86409090 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,332 @@ +/* + * Licensed

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86409288 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,332 @@ +/* + * Licensed

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86408951 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,332 @@ +/* + * Licensed

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86410847 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,332 @@ +/* + * Licensed

[GitHub] spark issue #15637: [SPARK-18000] [SQL] Aggregation function for computing e...

2016-11-03 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/15637 test

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86411132 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,332 @@ +/* + * Licensed

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86411252 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,332 @@ +/* + * Licensed

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86413803 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,332 @@ +/* + * Licensed

[GitHub] spark pull request #15637: [SPARK-18000] [SQL] Aggregation function for comp...

2016-11-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15637#discussion_r86415255 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/MapAggregate.scala --- @@ -0,0 +1,332 @@ +/* + * Licensed

[GitHub] spark issue #15297: [WIP][SPARK-9862]Handling data skew

2016-10-13 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/15297 cc @rxin @hvanhovell Can you review and comment on this PR? Thanks.

[GitHub] spark issue #15297: [WIP][SPARK-9862]Handling data skew

2016-10-13 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/15297 The design note for this PR has been posted on the JIRA page.

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95719350 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95716353 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95720653 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95719141 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95718502 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95717298 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala --- @@ -52,3 +56,12 @@ object

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95717900 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95718952 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95718997 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-10 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95511085 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/estimation/FilterEstimationSuite.scala --- @@ -0,0 +1,226 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-12 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95935452 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -116,6 +116,12 @@ case class Filter

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-12 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95916520 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark issue #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-10 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16395 cc @rxin @wzhfy I have updated the code based on rxin's comments. Please review again.

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95695560 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95690239 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95691485 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95707748 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95707722 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95709024 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95707644 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,173

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-11 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95710436 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,173

[GitHub] spark issue #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-04 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16395 cc @wzhfy @rxin @srinathshankar @hvanhovell @cloud-fan Happy New Year! This PR is ready for code review.

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-08 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95096451 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/FilterEstimation.scala --- @@ -0,0 +1,479

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-08 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95096418 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/FilterEstimation.scala --- @@ -0,0 +1,479

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-08 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95096432 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/FilterEstimation.scala --- @@ -0,0 +1,479

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-08 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95096610 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/FilterEstimation.scala --- @@ -0,0 +1,479

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-08 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95096602 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/FilterEstimation.scala --- @@ -0,0 +1,479

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-08 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95096571 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/FilterEstimation.scala --- @@ -0,0 +1,479

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-08 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95096457 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/FilterEstimation.scala --- @@ -0,0 +1,479

[GitHub] spark pull request #16333: Filter estimate

2016-12-18 Thread ron8hu
Github user ron8hu closed the pull request at: https://github.com/apache/spark/pull/16333

[GitHub] spark pull request #16333: Filter estimate

2016-12-18 Thread ron8hu
Github user ron8hu closed the pull request at: https://github.com/apache/spark/pull/16333

[GitHub] spark issue #16333: Filter estimate

2016-12-18 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16333 This is a mistake. I pointed to the wrong repository.

[GitHub] spark pull request #16333: Filter estimate

2016-12-18 Thread ron8hu
GitHub user ron8hu reopened a pull request: https://github.com/apache/spark/pull/16333 Filter estimate ## What changes were proposed in this pull request? This is a WIP PR. In this version, we set up the framework to traverse predicate and evaluate the equality

[GitHub] spark pull request #16334: estimate filter cardinality

2016-12-18 Thread ron8hu
Github user ron8hu closed the pull request at: https://github.com/apache/spark/pull/16334

[GitHub] spark issue #16334: estimate filter cardinality

2016-12-18 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16334 Sorry. This is a mistake.

[GitHub] spark pull request #16334: estimate filter cardinality

2016-12-18 Thread ron8hu
GitHub user ron8hu opened a pull request: https://github.com/apache/spark/pull/16334 estimate filter cardinality ## What changes were proposed in this pull request? This is a WIP PR. In this version, we set up the framework to traverse predicate and evaluate the equality

[GitHub] spark issue #16333: Filter estimate

2016-12-18 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16333 cc @wzhfy

[GitHub] spark pull request #16333: Filter estimate

2016-12-18 Thread ron8hu
GitHub user ron8hu opened a pull request: https://github.com/apache/spark/pull/16333 Filter estimate ## What changes were proposed in this pull request? This is a WIP PR. In this version, we set up the framework to traverse predicate and evaluate the equality

[GitHub] spark issue #16333: Filter estimate

2016-12-18 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16333 cc @wzhfy Please preview it and make comments.

[GitHub] spark issue #16395: [SPARK-17075][SQL][WIP] implemented filter estimation

2016-12-23 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16395 cc @wzhfy @rxin @hvanhovell @cloud-fan

[GitHub] spark pull request #16395: implemented first version of filter estimation

2016-12-23 Thread ron8hu
GitHub user ron8hu opened a pull request: https://github.com/apache/spark/pull/16395 implemented first version of filter estimation ## What changes were proposed in this pull request? This is a WIP PR. In this version, we set up the framework to traverse predicate

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-14 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r96128021 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/Range.scala --- @@ -0,0 +1,75 @@ +/* + * Licensed

[GitHub] spark issue #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-14 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/16395 cc @rxin @wzhfy I have updated the code. Please review again. Thanks.

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-14 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r96128057 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/Range.scala --- @@ -0,0 +1,75 @@ +/* + * Licensed

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-27 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108246637 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -509,8 +524,131 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-24 Thread ron8hu
GitHub user ron8hu opened a pull request: https://github.com/apache/spark/pull/17415 [SPARK-19408][SQL] filter estimation on two columns of same table ## What changes were proposed in this pull request? In SQL queries, we also see predicate expressions involving two columns

[GitHub] spark issue #17415: [SPARK-19408][SQL] filter estimation on two columns of s...

2017-03-24 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/17415 cc @sameeragarwal @cloud-fan @gatorsmile This Jira is not on the Spark 2.2 blocker list. If time permits, we can include it in Spark 2.2. If not, we can wait for a maintenance release. Thanks

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-29 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108754614 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -515,8 +530,138 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-29 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108753830 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -515,8 +530,138 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-29 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108751882 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -515,8 +530,138 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-29 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108752975 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -515,8 +530,138 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-28 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108582594 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -515,8 +530,135 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-28 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108582540 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -509,8 +524,131 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-28 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108583109 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -515,8 +530,135 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-27 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108307672 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -381,7 +461,22 @@ class

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-27 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108307490 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -381,7 +461,22 @@ class

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-03-27 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r108308949 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -509,8 +524,131 @@ case

[GitHub] spark issue #17446: [SPARK-17075][SQL][followup] Add Estimation of Constant ...

2017-03-27 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/17446 The logic is straightforward. LGTM.

[GitHub] spark issue #17415: [SPARK-19408][SQL] filter estimation on two columns of s...

2017-03-30 Thread ron8hu
Github user ron8hu commented on the issue: https://github.com/apache/spark/pull/17415 cc @sameeragarwal @cloud-fan @gatorsmile @wzhfy After a few rounds of changes and commits, this PR should be in good shape. If we can include it in Spark 2.2, then it can help TPC-H queries

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-17 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106766281 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-15 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106237955 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-15 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106271250 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-14 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106084556 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-15 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106285527 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-04-01 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r109293607 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -550,6 +565,220 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-04-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r109476460 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -491,7 +599,22 @@ class

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-04-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r109471156 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -550,6 +565,220 @@ case

[GitHub] spark pull request #17415: [SPARK-19408][SQL] filter estimation on two colum...

2017-04-03 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/17415#discussion_r109476536 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -550,6 +565,221 @@ case
