[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65807/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65807 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65807/consoleFull)** for PR 15082 at commit [`c56de6d`](https://github.com/apache/spark/commit/c56de6da72c18b2cd1f65eed956cdee89371b075). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65807 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65807/consoleFull)** for PR 15082 at commit [`c56de6d`](https://github.com/apache/spark/commit/c56de6da72c18b2cd1f65eed956cdee89371b075). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15082 I re-targeted it to 2.1 only. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15082 What's the status of this issue? I see that it's currently targeted for 2.0.1 in JIRA and wanted to ping to make sure that this doesn't miss the cut in case we prepare an RC soon. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65561/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65561/consoleFull)** for PR 15082 at commit [`da6a285`](https://github.com/apache/spark/commit/da6a2859c8fe0c90b4aa14a68c2234e6ede36fce). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65561/consoleFull)** for PR 15082 at commit [`da6a285`](https://github.com/apache/spark/commit/da6a2859c8fe0c90b4aa14a68c2234e6ede36fce). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15082 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65418/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65418/consoleFull)** for PR 15082 at commit [`da6a285`](https://github.com/apache/spark/commit/da6a2859c8fe0c90b4aa14a68c2234e6ede36fce). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65418/consoleFull)** for PR 15082 at commit [`da6a285`](https://github.com/apache/spark/commit/da6a2859c8fe0c90b4aa14a68c2234e6ede36fce). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65386/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65386 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65386/consoleFull)** for PR 15082 at commit [`f83e291`](https://github.com/apache/spark/commit/f83e2916c7da09be90c1cfad5387bab51512b0f0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65386/consoleFull)** for PR 15082 at commit [`f83e291`](https://github.com/apache/spark/commit/f83e2916c7da09be90c1cfad5387bab51512b0f0). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15082 After some more discussion, I'm going to reopen it: 1. `InternalRow.copy` breaks the copy contract and the fix in this PR is necessary 2. `MutableProjection` may bite us in the future if we don't fix it 3. It turns out `Collect` is the only `ImperativeAggregate` that we need to fix while updating aggregation buffer, so it's not a lot of work. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15082 My understanding of the main concern of closing this PR is that: 1. Although this issue can be potentially dangerous, the current code work fine without fixing this issue. 1. We still can't remove the safe projection in `SortAggregateExec` after fixing this issue (as @cloud-fan explained above), thus data may be copied twice and leads to performance regression. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15082 After thinking more about it, it seems not a problem and we don't need to fix it. Currently `MutableProjection` is used in 3 places: 1. hash based aggregate. It's fine because hash aggregate only supports primitive type buffer, so it doesn't have the copy problem. 2. sort based aggregate. It's also fine because we always apply a safe projection to the input row before process it. I have a PR to fix the comment of it: https://github.com/apache/spark/pull/15095 3. window execution. It's also fine because its input rows are come from `RowBuffer`, not unsafe rows produced by unsafe projection. This PR tried to fix `MutableProjection` so that we can remove the safe projection in sort based aggregate, but I was wrong because `MutableProjection` is not the only way to update the aggregation buffer, `ImperativeAggregate` can also update the buffer via the `update` and `merge` API, and it's not a good idea to fix all `ImperativeAggregate` to do copy correctly while updating the aggregate buffer(and it's not future proof, we may add more `ImperativeAggragate` implementations and miss this requirement). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user davies commented on the issue: https://github.com/apache/spark/pull/15082 @JoshRosen I think this is a potential bug (not now). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15082 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65328/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65328/consoleFull)** for PR 15082 at commit [`75b5749`](https://github.com/apache/spark/commit/75b57490d1be2ca4d7ca88bcd4d6d9461d470ce8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15082 Is this a "wrong answer" correctness bug? If so, let me label accordingly on the JIRA. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15082 **[Test build #65328 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65328/consoleFull)** for PR 15082 at commit [`75b5749`](https://github.com/apache/spark/commit/75b57490d1be2ca4d7ca88bcd4d6d9461d470ce8). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15082 cc @davies @yhuai @liancheng @clockfly --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org