[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21699 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94123/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #94123 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94123/testReport)** for PR 21699 at commit [`ca1250b`](https://github.com/apache/spark/commit/ca1250b29f4edf8f38eb81c27773e04068e0fdf4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #94123 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94123/testReport)** for PR 21699 at commit [`ca1250b`](https://github.com/apache/spark/commit/ca1250b29f4edf8f38eb81c27773e04068e0fdf4). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699 Jenkins, retest this, please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94109/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #94109 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94109/testReport)** for PR 21699 at commit [`ca1250b`](https://github.com/apache/spark/commit/ca1250b29f4edf8f38eb81c27773e04068e0fdf4). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #94109 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94109/testReport)** for PR 21699 at commit [`ca1250b`](https://github.com/apache/spark/commit/ca1250b29f4edf8f38eb81c27773e04068e0fdf4). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699 @HyukjinKwon I reverted the last changes. Please take a look at the PR again.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21699 If this PR proposes a different API than an overloaded version of `pivot(String, Seq[Any])`, that's a different issue, I guess. If we do this, I would prefer to have `pivot(Column, Seq[Any])` and reuse it for multiple pivot columns, since that would be more consistent with the existing `pivot(String, Seq[Any])`. I guess this is still possible and we can do it later. For now, I would defer the multiple-`Column` support. I am not quite sure about it yet, and I think it's better to leave it as is until it's actually requested or needed.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699
> Actually, I am mostly worried about the pivotColumn. Specifying multiple columns via struct is not intuitive, I believe.

It depends on whether we'd like to add extra interfaces for multiple columns. I don't have a preference between reusing this interface for multiple pivot columns and adding new ones, and we can always decide later. But back to this interface: I'd assume it is for more advanced users, and the pivot column, even when it is a single column, can have a complex type, so the "literal object" values might be insufficient. Plus, this is a new interface we haven't pushed out yet; once we have, we are more likely to end up adding a new one than changing it if we want to make it more sophisticated later on.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699
> pivot(column: Column, values: Seq[Column]), so that we can construct different types in "values".

Actually, I am mostly worried about the `pivotColumn`. Specifying multiple columns via `struct` is not intuitive, I believe.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 Thank you for the change, @MaxGekk! @HyukjinKwon my idea was actually that the overloaded versions of pivot would be `pivot(column: Column, values: Seq[Column])`, so that we can construct different types in "values". The constant check will be done in the Analyzer, so we don't need to worry about it here. Ultimately we would like to support complex-typed values in `pivot(column: Column)` as well, but I think we can do that in a different PR.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93848/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93848 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93848/testReport)** for PR 21699 at commit [`5da5a2c`](https://github.com/apache/spark/commit/5da5a2c94a1e99cc3edd920080470b3d17cfc699). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93848 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93848/testReport)** for PR 21699 at commit [`5da5a2c`](https://github.com/apache/spark/commit/5da5a2c94a1e99cc3edd920080470b3d17cfc699). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699 > it would be nice to support it and test it in DataFrame pivot too. @maryannxue Supported and tested. Please, have a look at: https://github.com/MaxGekk/spark-1/blob/5da5a2c94a1e99cc3edd920080470b3d17cfc699/sql/core/src/test/scala/org/apache/spark/sql/DataFramePivotSuite.scala#L312-L342 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 @MaxGekk LGTM, but one more thing to consider: since we support a column list in SQL, it would be nice to support it and test it in DataFrame pivot too. The only thing we need to enable is to make pivot values `Expression`s instead of `Literal`s, because `Literal`s do not include struct-type literals, e.g., `struct(1, 2)`. The `Pivot` node already has pivot values as `Seq[Expression]`, so all that is left to be done is in the DataFrame interfaces.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93832/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93832 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93832/testReport)** for PR 21699 at commit [`cf55135`](https://github.com/apache/spark/commit/cf55135f430b2012723f8e09a1aa4651d6c7161b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93832 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93832/testReport)** for PR 21699 at commit [`cf55135`](https://github.com/apache/spark/commit/cf55135f430b2012723f8e09a1aa4651d6c7161b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93826/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93826 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93826/testReport)** for PR 21699 at commit [`34535a9`](https://github.com/apache/spark/commit/34535a9cc5ec7a2ba880f7f525feb7dbbc0b0c37). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93826/testReport)** for PR 21699 at commit [`34535a9`](https://github.com/apache/spark/commit/34535a9cc5ec7a2ba880f7f525feb7dbbc0b0c37). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21699 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93824/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93824 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93824/testReport)** for PR 21699 at commit [`34535a9`](https://github.com/apache/spark/commit/34535a9cc5ec7a2ba880f7f525feb7dbbc0b0c37). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93824/testReport)** for PR 21699 at commit [`34535a9`](https://github.com/apache/spark/commit/34535a9cc5ec7a2ba880f7f525feb7dbbc0b0c37). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21699 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 @MaxGekk Please take a look at https://github.com/apache/spark/pull/21926. There was a bug in PivotFirst and this PR should fix your test here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user maryannxue commented on the issue: https://github.com/apache/spark/pull/21699 @MaxGekk Yes, it was caused by my previous PR. The change in my PR was a workaround for an existing problem with struct-type columns in either Aggregate or PivotFirst (I suspect it's Aggregate). The change itself worked as designed, because Pivot SQL support wouldn't allow any function (like "lowercase") in the pivot column. However, it broke your PR because your PR aims to allow any expression. That said, we have two options here: 1) Give up the PivotFirst approach and fall back to the "else" branch for struct-type pivot columns, i.e., multiple columns in the pivot FOR clause. 2) Fix the bug in Aggregate or PivotFirst. I will do a little investigation into option 2) tomorrow and get back to you :)
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93746/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93746 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93746/testReport)** for PR 21699 at commit [`34535a9`](https://github.com/apache/spark/commit/34535a9cc5ec7a2ba880f7f525feb7dbbc0b0c37). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93746 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93746/testReport)** for PR 21699 at commit [`34535a9`](https://github.com/apache/spark/commit/34535a9cc5ec7a2ba880f7f525feb7dbbc0b0c37). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699

I merged the `master` branch into my branch `pivot-column`, and the changes break my test. It seems recent changes in pivoting introduced a correctness bug. See the test https://github.com/apache/spark/pull/21699/files#diff-50aa7d3b7b7934a7df6f414396e74c3cR271 .

Here is the result without pivots:
```
val df = trainingSales
  .groupBy($"sales.year", lower($"sales.course"))
  .agg(sum($"sales.earnings"))
df.show(false)
```
```
+----+-------------------+-------------------+
|year|lower(sales.course)|sum(sales.earnings)|
+----+-------------------+-------------------+
|2012|java               |20000.0            |
|2012|dotnet             |15000.0            |
|2013|java               |30000.0            |
|2013|dotnet             |48000.0            |
+----+-------------------+-------------------+
```
with pivoting:
```
val df = trainingSales
  .groupBy($"sales.year")
  .pivot(lower($"sales.course"), Seq("dotNet", "Java").map(_.toLowerCase))
  .agg(sum($"sales.earnings"))
df.show(false)
```
the result must be as the test expects:
```
+----+-------+-------+
|year|dotnet |java   |
+----+-------+-------+
|2012|15000.0|20000.0|
|2013|48000.0|30000.0|
+----+-------+-------+
```
but the returned result for `dotnet` in `2012` is wrong:
```
+----+-------+-------+
|year|dotnet |java   |
+----+-------+-------+
|2012|5000.0 |20000.0|
|2013|48000.0|30000.0|
+----+-------+-------+
```
@maryannxue Please, take a look at it. Maybe the bug was introduced by your recent changes. /cc @gatorsmile
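The comparison in this report can be turned into a small cross-check: the totals read from the pivoted table should match the totals from a plain `groupBy` on the same keys. The following is a minimal sketch of that idea, not the test from the PR. The object name `PivotCrossCheck`, the flat column names `course`, `year`, `earnings`, and the sample rows are assumptions made for illustration; the real `trainingSales` fixture nests these fields under `sales`. It also assumes a Spark version that has the `pivot(Column, Seq[Any])` overload this PR adds.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lower, sum}

object PivotCrossCheck {
  // Hypothetical stand-in data; NOT the actual `trainingSales` test fixture.
  def sampleSales(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      ("dotNET", 2012, 10000.0),
      ("Java", 2012, 20000.0),
      ("dotNET", 2012, 5000.0),
      ("dotNET", 2013, 48000.0),
      ("Java", 2013, 30000.0)
    ).toDF("course", "year", "earnings")
  }

  // (year, lower-cased course) -> summed earnings, computed without pivot.
  def grouped(df: DataFrame): Map[(Int, String), Double] =
    df.groupBy(col("year"), lower(col("course")).as("course"))
      .agg(sum(col("earnings")).as("total"))
      .collect()
      .map(r => (r.getInt(0), r.getString(1)) -> r.getDouble(2))
      .toMap

  // The same totals read back from the pivoted (wide) layout.
  // Assumes every (year, course) pair has data, so no cell is null.
  def pivoted(df: DataFrame, courses: Seq[String]): Map[(Int, String), Double] = {
    val wide = df.groupBy(col("year"))
      .pivot(lower(col("course")), courses)
      .agg(sum(col("earnings")))
    wide.collect().flatMap { r =>
      courses.map(c => (r.getAs[Int]("year"), c) -> r.getAs[Double](c))
    }.toMap
  }
}
```

If the two maps differ for any key, the pivot path and the plain aggregation disagree, which is the kind of discrepancy described above for `dotnet` in `2012`.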
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93733/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93733 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93733/testReport)** for PR 21699 at commit [`e76e7ad`](https://github.com/apache/spark/commit/e76e7adcca6787cb334b19f8db35f3a4ec61bafc). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #93733 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93733/testReport)** for PR 21699 at commit [`e76e7ad`](https://github.com/apache/spark/commit/e76e7adcca6787cb334b19f8db35f3a4ec61bafc). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user rxin commented on the issue: https://github.com/apache/spark/pull/21699 I'm OK with it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699 > it's basically reverting Reynold's decision. is he okay with it? As far as I know, yes. /cc @rxin --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21699 it's basically reverting Reynold's decision. is he okay with it? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699 @gatorsmile The PR https://github.com/apache/spark/pull/21753 has been merged already. Can we continue with this PR? @maryannxue I added the tests you asked for in https://github.com/MaxGekk/spark-1/pull/7#discussion_r200516296 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92719/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92719 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92719/testReport)** for PR 21699 at commit [`f32a85b`](https://github.com/apache/spark/commit/f32a85bd7d114adb85e7281e2a039b383392a17b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699 @maryannxue Please, have a look at the PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92719/testReport)** for PR 21699 at commit [`f32a85b`](https://github.com/apache/spark/commit/f32a85bd7d114adb85e7281e2a039b383392a17b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21699 @MaxGekk, I think you should talk with @rxin rather than address @aray's comment.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699
> Considering you can just make a call to withColumn first I'm not really convinced in the utility of this PR.

The purpose of this PR is to make the pivot API clear and consistent with `groupBy`. Our users/clients shouldn't have to google workarounds like `withColumn` to apply the `pivot()` functions. I believe we should follow [the principle of least astonishment (POLA)](https://en.wikipedia.org/wiki/Principle_of_least_astonishment). Some of our clients form `Column` expressions programmatically as the result of a calculation. For now, instead of just passing a `Column` variable to `pivot()`, they have to modify the whole expression and inject a projection or `withColumn`. /cc @ssimeonov
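To make the two call shapes concrete, here is a hedged sketch of the `withColumn` workaround next to the `Column` overload proposed in this PR. The object name `PivotByColumn`, the temporary column name `__pivot_key`, and the `year` and `earnings` column names are assumptions for illustration only; the sketch also assumes a Spark version in which `pivot(Column, Seq[Any])` is available.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, sum}

object PivotByColumn {
  // Workaround: materialize the expression under a temporary name
  // (hypothetical name `__pivot_key`), then pivot by that name.
  def viaWithColumn(df: DataFrame, pivotExpr: Column, values: Seq[Any]): DataFrame =
    df.withColumn("__pivot_key", pivotExpr)
      .groupBy(col("year"))
      .pivot("__pivot_key", values)
      .agg(sum(col("earnings")))

  // With the Column overload the expression is passed straight to pivot(),
  // so a programmatically built Column needs no extra projection.
  def viaColumn(df: DataFrame, pivotExpr: Column, values: Seq[Any]): DataFrame =
    df.groupBy(col("year"))
      .pivot(pivotExpr, values)
      .agg(sum(col("earnings")))
}
```

Both functions are expected to produce the same wide table for the same inputs; the difference is only that the first has to invent a column name and rewrite the input before grouping.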
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92577/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92577 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92577/testReport)** for PR 21699 at commit [`fae4fd2`](https://github.com/apache/spark/commit/fae4fd2f607c0b44adb03827039f26c5ff592d31). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user aray commented on the issue: https://github.com/apache/spark/pull/21699 Using either `Column` or `String` type was actually in my original PR: https://github.com/apache/spark/pull/7841 @rxin later modified the API to only take a `String` prior to the release as part of an API audit: https://github.com/apache/spark/pull/9929 Considering you can just make a call to `withColumn` first, I'm not really convinced of the utility of this PR.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92577/testReport)** for PR 21699 at commit [`fae4fd2`](https://github.com/apache/spark/commit/fae4fd2f607c0b44adb03827039f26c5ff592d31).
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699

> Were you planning to add a new overload for each existing String version, e.g. pivot(Column) and pivot(Column, java.util.List[Any])?

The methods have already been added. @rednaxelafx, please look at these lines:
https://github.com/apache/spark/pull/21699/files#diff-95bb2228c67e3cce4c729e44e2d82422R362
https://github.com/apache/spark/pull/21699/files#diff-95bb2228c67e3cce4c729e44e2d82422R377

> Yes you've already included that in test case examples so that's already good

I added this test case: https://github.com/apache/spark/pull/21699/files#diff-50aa7d3b7b7934a7df6f414396e74c3cR271. I am also trying to find a case where the existing `String` API doesn't work well (perhaps some `Column` expressions).
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test PASSed.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92568/ Test PASSed.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92568 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92568/testReport)** for PR 21699 at commit [`d62b7e7`](https://github.com/apache/spark/commit/d62b7e789f38219b62fb5b010fb2cacc0324fe29).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user rednaxelafx commented on the issue: https://github.com/apache/spark/pull/21699
This mostly looks good, but I'd like to ask a few things first:
1. The new overloaded `pivot()` that takes a `Column` only exists for `pivot(Column, Seq[Any])`. Were you planning to add a new overload for each existing `String` version, e.g. `pivot(Column)` and `pivot(Column, java.util.List[Any])`?
2. Since you're adding the `Column` version(s) to address accessing nested columns, would it be nice to highlight that capability in the doc example? (Yes, you've already included that in the test case examples, so that's already good.)
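To make the three overload shapes from question 1 concrete, the following Scala sketch calls each one on a nested column. It is an illustration under assumptions, not code from the PR: the `Sale` and `TrainingSales` case classes and the data are hypothetical, and all three `Column` overloads are assumed to be available in the Spark build used.

```scala
import java.util.{Arrays => JArrays, List => JList}

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lower, sum}

// Hypothetical nested schema, used only for illustration.
case class Sale(course: String, year: Int, earnings: Double)
case class TrainingSales(training: String, sales: Sale)

object PivotOverloadsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("pivot-overloads-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      TrainingSales("Experts", Sale("dotNET", 2012, 10000.0)),
      TrainingSales("Experts", Sale("Java", 2012, 20000.0)),
      TrainingSales("Dummies", Sale("dotNET", 2013, 48000.0)),
      TrainingSales("Dummies", Sale("Java", 2013, 30000.0))
    ).toDF()

    val byYear = df.groupBy(col("sales.year"))

    // pivot(Column): Spark computes the distinct pivot values itself.
    val inferred = byYear
      .pivot(col("sales.course"))
      .agg(sum(col("sales.earnings")))

    // pivot(Column, Seq[Any]): pivot on an expression with explicit values.
    val explicitSeq = byYear
      .pivot(lower(col("sales.course")), Seq("dotnet", "java"))
      .agg(sum(col("sales.earnings")))

    // pivot(Column, java.util.List[Any]): the Java-friendly variant.
    val javaValues: JList[Any] = JArrays.asList[Any]("dotNET", "Java")
    val explicitList = byYear
      .pivot(col("sales.course"), javaValues)
      .agg(sum(col("sales.earnings")))

    inferred.show()
    explicitSeq.show()
    explicitList.show()

    spark.stop()
  }
}
```

Passing explicit values avoids the extra job Spark otherwise runs to collect the distinct pivot values, which is the usual reason to prefer the two-argument forms.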
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21699 cc @aray
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92568 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92568/testReport)** for PR 21699 at commit [`d62b7e7`](https://github.com/apache/spark/commit/d62b7e789f38219b62fb5b010fb2cacc0324fe29).
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21699 jenkins, retest this, please
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92561/ Test FAILed.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test FAILed.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92561/testReport)** for PR 21699 at commit [`d62b7e7`](https://github.com/apache/spark/commit/d62b7e789f38219b62fb5b010fb2cacc0324fe29).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92561/testReport)** for PR 21699 at commit [`d62b7e7`](https://github.com/apache/spark/commit/d62b7e789f38219b62fb5b010fb2cacc0324fe29).
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21699 cc: @rxin @gatorsmile
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21699 `def pivot(pivotColumn: String)`, too?
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92541/ Test PASSed.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Merged build finished. Test PASSed.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92541 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92541/testReport)** for PR 21699 at commit [`0fdd11f`](https://github.com/apache/spark/commit/0fdd11ff26b4f4ca3b79bdd116aaf1c558643698).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Can one of the admins verify this patch?
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21699 Can one of the admins verify this patch?
[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21699 **[Test build #92541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92541/testReport)** for PR 21699 at commit [`0fdd11f`](https://github.com/apache/spark/commit/0fdd11ff26b4f4ca3b79bdd116aaf1c558643698).