[GitHub] spark pull request #23249: [SPARK-26297][SQL] improve the doc of Distributio...

2018-12-06 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/23249#discussion_r239694008 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -243,10 +248,19 @@ case class

[GitHub] spark pull request #23249: [SPARK-26297][SQL] improve the doc of Distributio...

2018-12-06 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/23249#discussion_r239693849 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -243,10 +248,19 @@ case class

[GitHub] spark pull request #23249: [SPARK-26297][SQL] improve the doc of Distributio...

2018-12-06 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/23249#discussion_r239689874 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -22,13 +22,12 @@ import

[GitHub] spark pull request #23249: [SPARK-26297][SQL] improve the doc of Distributio...

2018-12-06 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/23249#discussion_r239540987 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -22,13 +22,12 @@ import

[GitHub] spark issue #23036: [SPARK-26065][SQL] Change query hint from a `LogicalPlan...

2018-11-14 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/23036 @gatorsmile @cloud-fan @rxin @juliuszsompolski --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #23036: [SPARK-26065][SQL] Change query hint from a `Logi...

2018-11-14 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/23036 [SPARK-26065][SQL] Change query hint from a `LogicalPlan` to a field ## What changes were proposed in this pull request? The existing query hint implementation relies on a logical plan

[GitHub] spark pull request #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule ch...

2018-11-10 Thread maryannxue
Github user maryannxue closed the pull request at: https://github.com/apache/spark/pull/22060

[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-11-10 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22060 Thank you for reminding me, @HyukjinKwon! And thanks to @mgaido91's contribution, this has been fixed already

[GitHub] spark pull request #22778: [SPARK-25784][SQL] Infer filters from constraints...

2018-10-30 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22778#discussion_r229356678 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -171,10 +171,13 @@ abstract class Optimizer

[GitHub] spark issue #21156: [SPARK-24087][SQL] Avoid shuffle when join keys are a su...

2018-10-22 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21156 The idea is good. Is it possible to make it an optimization rule? Another suggestion is we need more test cases

[GitHub] spark issue #21156: [SPARK-24087][SQL] Avoid shuffle when join keys are a su...

2018-10-22 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21156 Sorry for the delay. I’ll take another look today.

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-18 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226527439 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -932,6 +935,23 @@ trait ScalaReflection

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-18 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226521257 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -932,6 +935,23 @@ trait ScalaReflection

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-18 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226384713 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -39,29 +42,29 @@ import

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226156109 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -73,19 +73,21 @@ case class UserDefinedFunction

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226155205 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala --- @@ -393,4 +393,30 @@ class UDFSuite extends QueryTest with SharedSQLContext

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r226155153 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala --- @@ -393,4 +393,30 @@ class UDFSuite extends QueryTest with SharedSQLContext

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-17 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225986215 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatDataWriter.scala --- @@ -179,7 +179,8 @@ class

[GitHub] spark issue #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF constructor sig...

2018-10-17 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22732 @srowen What @cloud-fan described is a change introduced in #22259. We can fix it by keeping each call to `ScalaReflection.schemaFor` in their own `Try` blocks. As to `UserDefinedFunction
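The per-call `Try` idea mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `schemaFor` here is a hypothetical stand-in that throws for unsupported type names, not the real `ScalaReflection.schemaFor`, and `allOrNothing` and `perCall` are made-up helper names.

```scala
import scala.util.Try

object PerCallTry {
  // Hypothetical stand-in for a schema lookup that fails on unsupported types.
  def schemaFor(typeName: String): String = typeName match {
    case "Int"    => "IntegerType"
    case "String" => "StringType"
    case other    =>
      throw new UnsupportedOperationException(s"Schema for type $other is not supported")
  }

  // One Try around all calls: a single failure discards every result.
  def allOrNothing(types: Seq[String]): Option[Seq[String]] =
    Try(types.map(schemaFor)).toOption

  // One Try per call: a failure only affects the type that caused it.
  def perCall(types: Seq[String]): Seq[Option[String]] =
    types.map(t => Try(schemaFor(t)).toOption)

  def main(args: Array[String]): Unit = {
    val types = Seq("Int", "Any", "String")
    println(allOrNothing(types)) // None
    println(perCall(types))      // List(Some(IntegerType), None, Some(StringType))
  }
}
```

With one `Try` per call, the supported types keep their schema information even when a neighbouring type cannot be resolved.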

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225762708 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -81,11 +81,11 @@ case class UserDefinedFunction

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225724505 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -81,11 +81,11 @@ case class UserDefinedFunction

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225619242 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -31,6 +31,7 @@ import

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225606907 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -39,29 +40,29 @@ import

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225605581 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -73,27 +73,27 @@ case class UserDefinedFunction

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225588931 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -73,27 +73,27 @@ case class UserDefinedFunction

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225587391 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -31,6 +31,7 @@ import

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225586820 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -314,24 +314,24 @@ class AnalysisSuite extends

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225585971 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2137,36 +2137,27 @@ class Analyzer

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225585740 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -39,29 +40,29 @@ import

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225583730 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2137,36 +2137,27 @@ class Analyzer

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22732#discussion_r225580591 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2137,36 +2137,27 @@ class Analyzer

[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-15 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22732 [SPARK-25044][FOLLOW-UP] Change ScalaUDF constructor signature ## What changes were proposed in this pull request? This is a follow-up PR for #22259. The extra field added in `ScalaUDF

[GitHub] spark issue #22706: [SPARK-25716][SQL][MINOR] remove unnecessary collection ...

2018-10-15 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22706 @srowen I don't think this would make a big difference performance-wise, but if it's the right change, it just looks cleaner now. Anyone have any idea why it wasn't like this before

[GitHub] spark issue #22713: [SPARK-25691][SQL] Use semantic equality in AliasViewChi...

2018-10-15 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22713 We do need a test case here anyway. Ideally it would be just as simple as #22701 but the difficulty is in declaring a view

[GitHub] spark pull request #22701: [SPARK-25690][SQL] Analyzer rule HandleNullInputs...

2018-10-11 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22701#discussion_r224658264 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2150,8 +2150,10 @@ class Analyzer

[GitHub] spark pull request #22701: [SPARK-25690][SQL] Analyzer rule HandleNullInputs...

2018-10-11 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22701#discussion_r224602234 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -351,8 +351,8 @@ class AnalysisSuite extends

[GitHub] spark pull request #22259: [SPARK-25044][SQL] (take 2) Address translation o...

2018-10-11 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22259#discussion_r224527615 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -47,7 +48,8 @@ case class ScalaUDF

[GitHub] spark pull request #22701: [SPARK-25690][SQL] Analyzer rule HandleNullInputs...

2018-10-11 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22701 [SPARK-25690][SQL] Analyzer rule HandleNullInputsForUDF does not stabilize and can be applied infinitely ## What changes were proposed in this pull request? The HandleNullInputsForUDF

[GitHub] spark pull request #22259: [SPARK-25044][SQL] (take 2) Address translation o...

2018-10-11 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22259#discussion_r224510115 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -47,7 +48,8 @@ case class ScalaUDF

[GitHub] spark pull request #22259: [SPARK-25044][SQL] (take 2) Address translation o...

2018-10-10 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22259#discussion_r224295469 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -47,7 +48,8 @@ case class ScalaUDF

[GitHub] spark pull request #22259: [SPARK-25044][SQL] (take 2) Address translation o...

2018-10-10 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22259#discussion_r224252642 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala --- @@ -47,7 +48,8 @@ case class ScalaUDF

[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-10-04 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22060 @maropu I'll follow up on this. I started the test again and I'll keep track of "which rules violate the assumption" and "which tests can reproduce the violat

[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-10-04 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22060 retest this please

[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-09-28 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22060 Sorry for the late reply. The purpose of this is to find out the rules that violate the once-policy assumption and also tests that can reproduce the issues. I think we should eventually turn
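A toy sketch of what a once-policy check could look like: apply a rule once, apply it again, and flag the rule if the second application still changes the plan. Everything here is an assumption made for illustration; `Expr`, `Udf`, `IfNotNull`, and `isStableAfterOnce` are hypothetical names and are not Catalyst classes or the code in this PR.

```scala
object OncePolicyCheck {
  // Toy expression tree; these types are illustrative, not Catalyst classes.
  sealed trait Expr
  case class Udf(name: String) extends Expr
  case class IfNotNull(child: Expr) extends Expr

  type Rule = Expr => Expr

  // Adds a null-check wrapper on every application, so it never stabilizes.
  val alwaysWrap: Rule = e => IfNotNull(e)

  // Adds the wrapper only when it is missing, so a second run is a no-op.
  val wrapOnce: Rule = {
    case guarded: IfNotNull => guarded
    case other              => IfNotNull(other)
  }

  // A rule holds the once-policy assumption on `plan` if re-applying it changes nothing.
  def isStableAfterOnce(rule: Rule, plan: Expr): Boolean = {
    val once = rule(plan)
    rule(once) == once
  }

  def main(args: Array[String]): Unit = {
    val plan = Udf("f")
    println(isStableAfterOnce(alwaysWrap, plan)) // false
    println(isStableAfterOnce(wrapOnce, plan))   // true
  }
}
```

Case-class equality makes the comparison structural, which is why a rule that keeps nesting wrappers is caught after a single extra application.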

[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...

2018-09-27 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r221090624 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -554,8 +554,11 @@ class Analyzer

[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...

2018-09-21 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219624150 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -554,8 +554,10 @@ class Analyzer

[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...

2018-09-21 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219624067 --- Diff: sql/core/src/test/resources/sql-tests/inputs/pivot.sql --- @@ -287,3 +287,13 @@ PIVOT ( sum(earnings) FOR (course, m

[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...

2018-09-21 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219623907 --- Diff: sql/core/src/test/resources/sql-tests/results/pivot.sql.out --- @@ -1,5 +1,5 @@ --- Automatically generated by SQLQueryTestSuite

[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...

2018-09-21 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22519 [SPARK-25505][SQL] The output order of grouping columns in Pivot is different from the input order ## What changes were proposed in this pull request? The grouping columns from a Pivot

[GitHub] spark issue #22447: [SPARK-25450][SQL] PushProjectThroughUnion rule uses the...

2018-09-19 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22447 retest this please

[GitHub] spark pull request #22447: [SPARK-25450][SQL] PushProjectThroughUnion rule u...

2018-09-17 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22447 [SPARK-25450][SQL] PushProjectThroughUnion rule uses the same exprId for project expressions in each Union child, causing mistakes in constant propagation ## What changes were proposed

[GitHub] spark pull request #22406: [SPARK-25415][SQL] Make plan change log in RuleEx...

2018-09-12 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22406 [SPARK-25415][SQL] Make plan change log in RuleExecutor configurable by SQLConf ## What changes were proposed in this pull request? In RuleExecutor, after applying a rule, if the plan

[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-08-09 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22060 retest this please

[GitHub] spark pull request #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule ch...

2018-08-09 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22060 [DO NOT MERGE][TEST ONLY] Add once-policy rule check ## What changes were proposed in this pull request? Rules like `HandleNullInputsForUDF` (https://issues.apache.org/jira/browse

[GitHub] spark pull request #22049: [SPARK-25063][SQL] Rename class KnowNotNull to Kn...

2018-08-08 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22049 [SPARK-25063][SQL] Rename class KnowNotNull to KnownNotNull ## What changes were proposed in this pull request? Correct the class name typo checked in through SPARK-24891

[GitHub] spark issue #22049: [SPARK-25063][SQL] Rename class KnowNotNull to KnownNotN...

2018-08-08 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/22049 @gatorsmile

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-08 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208468677 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -403,20 +415,29 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-08 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208466779 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -335,7 +337,7 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208460101 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -403,20 +415,29 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208459011 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -403,20 +415,29 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208458861 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -335,7 +337,7 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208458789 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -403,20 +415,29 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208453178 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -403,20 +415,29 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208451663 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -335,7 +337,7 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208423936 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -403,20 +415,29 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208411022 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -384,6 +392,10 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208410422 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -384,6 +392,10 @@ class RelationalGroupedDataset protected

[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument

2018-08-01 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 > Actually I am mostly worry of the pivotColumn. Specifying multiple columns via struct is not intuitive I believe. It depends on whether we'd like to add extra interfaces for multi

[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument

2018-08-01 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 Thank you for the change, @MaxGekk! @HyukjinKwon my idea was actually that the overloaded versions of pivot would be `pivot(column: Column, values: Seq[Column])`, so that we can construct

[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument

2018-07-31 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 @MaxGekk LGTM, but one more thing to consider: Since we support column list in SQL, it would be nice to support it and test it in DataFrame pivot too. The only thing that we need to enable

[GitHub] spark issue #21926: [SPARK-24972][SQL] PivotFirst could not handle pivot col...

2018-07-30 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21926 retest this please

[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument

2018-07-30 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 @MaxGekk Please take a look at https://github.com/apache/spark/pull/21926. There was a bug in PivotFirst and this PR should fix your test here

[GitHub] spark pull request #21926: [SPARK-24972][SQL] PivotFirst could not handle pi...

2018-07-30 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21926#discussion_r206354004 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -574,10 +578,14 @@ class Analyzer

[GitHub] spark pull request #21926: [SPARK-24972][SQL] PivotFirst could not handle pi...

2018-07-30 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/21926 [SPARK-24972][SQL] PivotFirst could not handle pivot columns of complex types ## What changes were proposed in this pull request? When the pivot column is of a complex type, the eval

[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument

2018-07-30 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 @MaxGekk Yes, it was caused by my previous PR. The change in my PR was a workaround for an existing problem in either Aggregate or PivotFirst (I suspect it's Aggregate) with struct-type columns

[GitHub] spark issue #21875: [SPARK-24288][SQL] Add a JDBC Option to enable preventin...

2018-07-25 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21875 Programming guide updated. Thank you, @dilipbiswal and @HyukjinKwon!

[GitHub] spark pull request #21876: [SPARK-24802][SQL][FOLLOW-UP] Add a new config fo...

2018-07-25 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/21876 [SPARK-24802][SQL][FOLLOW-UP] Add a new config for Optimization Rule Exclusion ## What changes were proposed in this pull request? This is an extension to the original PR, in which

[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21875#discussion_r205268701 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala --- @@ -172,7 +172,11 @@ private[sql] case class

[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21875#discussion_r205267327 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala --- @@ -183,6 +183,9 @@ class JDBCOptions

[GitHub] spark pull request #21875: [SPARK-24288][SQL] Add a JDBC Option to enable pr...

2018-07-25 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21875#discussion_r205266067 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala --- @@ -172,7 +172,11 @@ private[sql] case class

[GitHub] spark issue #21403: [SPARK-24341][SQL] Support only IN subqueries with the s...

2018-07-25 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21403 @mgaido91 I see. But by using Seq[Expression] in `In`, can we hopefully remove `ResolveInValues`? I wouldn't mind changing the parser if it's necessary and if it saves work elsewhere. Having

[GitHub] spark issue #21875: [SPARK-24288][SQL] Enable preventing predicate pushdown

2018-07-25 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21875 @gatorsmile @TomaszGaweda

[GitHub] spark pull request #21875: [SPARK-24288][SQL] Enable preventing predicate pu...

2018-07-25 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/21875 [SPARK-24288][SQL] Enable preventing predicate pushdown ## What changes were proposed in this pull request? Add a JDBC Option "pushDownPredicate" (default `true`) to allo

[GitHub] spark pull request #21360: [SPARK-24288] Enable preventing predicate pushdow...

2018-07-25 Thread maryannxue
Github user maryannxue closed the pull request at: https://github.com/apache/spark/pull/21360

[GitHub] spark issue #21821: [SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWrit...

2018-07-25 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21821 LGTM.

[GitHub] spark issue #21403: [SPARK-24341][SQL] Support only IN subqueries with the s...

2018-07-24 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21403 I think the behavior definition is good and clear. But just a question on the implementation: is it necessary to introduce a new class `InValues`? Or could we simply make `In` have its first

[GitHub] spark pull request #21851: [SPARK-24891][SQL] Fix HandleNullInputsForUDF rul...

2018-07-23 Thread maryannxue
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/21851 [SPARK-24891][SQL] Fix HandleNullInputsForUDF rule ## What changes were proposed in this pull request? The HandleNullInputsForUDF would always add a new `If` node every time

[GitHub] spark issue #21821: [SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWrit...

2018-07-23 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21821 I just ran a test with once-strategy check and found out that a few batches/rules do not stop, e.g. AggregatePushDown, "Convert to Spark client exec", PartitionPruning. I believe mo

[GitHub] spark pull request #21764: [SPARK-24802][SQL] Add a new config for Optimizat...

2018-07-22 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r204279843 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -175,6 +191,41 @@ abstract class Optimizer

[GitHub] spark issue #21821: [SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWrit...

2018-07-22 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21821 Yes, @gatorsmile. Code is ready. Will post a PR shortly.

[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-19 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r203731087 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -127,6 +127,14 @@ object SQLConf

[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-19 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r203730778 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizerRuleExclusionSuite.scala --- @@ -0,0 +1,84

[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-19 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r203730652 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -160,6 +160,13 @@ abstract class Optimizer

[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-19 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r203730125 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -175,6 +182,44 @@ abstract class Optimizer

[GitHub] spark issue #21720: [SPARK-24163][SPARK-24164][SQL] Support column list as t...

2018-07-17 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21720 retest this please

[GitHub] spark issue #21720: [SPARK-24163][SPARK-24164][SQL] Support column list as t...

2018-07-16 Thread maryannxue
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21720 retest this please

[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r202786530 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -127,6 +127,14 @@ object SQLConf

[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r202762054 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -127,6 +127,14 @@ object SQLConf

[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r202760924 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -175,6 +179,35 @@ abstract class Optimizer

[GitHub] spark pull request #21764: [SPARK-24802] Optimization Rule Exclusion

2018-07-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/21764#discussion_r202759884 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -46,7 +47,23 @@ abstract class Optimizer
