[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23139#discussion_r236150260

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala ---
    @@ -0,0 +1,107 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
    +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
    +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
    +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.Rule
    +import org.apache.spark.sql.types.BooleanType
    +
    +
    +/**
    + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search
    + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator
    + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is
    + * semantically equivalent to `FalseLiteral` when evaluating the whole search condition.
    + *
    + * Please note that FALSE and NULL are not exchangeable in most cases, when the search condition
    + * contains NOT and NULL-tolerant expressions. Thus, the rule is very conservative and applicable
    + * in very limited cases.
    + *
    + * For example, `Filter(Literal(null, BooleanType))` is equal to `Filter(FalseLiteral)`.
    + *
    + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`;
    + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually
    + * `Filter(FalseLiteral)`.
    + *
    + * Moreover, this rule also transforms predicates in all [[If]] expressions as well as branch
    + * conditions in all [[CaseWhen]] expressions, even if they are not part of the search conditions.
    + *
    + * For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))` can be simplified
    + * into `Project(Literal(2))`.
    + */
    +object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
    +    case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond)))
    +    case p: LogicalPlan => p transformExpressions {
    +      case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
    +      case cw @ CaseWhen(branches, _) =>
    +        val newBranches = branches.map { case (cond, value) =>
    +          replaceNullWithFalse(cond) -> value
    +        }
    +        cw.copy(branches = newBranches)
    +      case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        af.copy(function = newLambda)
    +      case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        ae.copy(function = newLambda)
    +      case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) =>
    +        val newLambda = lf.copy(function = replaceNullWithFalse(func))
    +        mf.copy(function = newLambda)
    +    }
    +  }
    +
    +  /**
    +   * Recursively traverse the Boolean-type expression to replace
    +   * `Literal(null, BooleanType)` with `FalseLiteral`, if possible.
    +   *
    +   * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit
    +   * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or
    +   * `Literal(null, BooleanType)`.
    +   */
    +  private def replaceNullWithFalse(e: Expression): Expression = {
    +    if (e.dataType != BooleanType) {
    --- End diff --

    We don't handle `LambdaFunction` inside this method; that is done on the caller side.
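The traversal described in the quoted Scaladoc can be illustrated with a minimal, self-contained sketch on a toy expression tree. This is not Spark's implementation: the `Expr` type and every case class below are hypothetical stand-ins for Catalyst expressions, and the sketch assumes every node is Boolean-typed, so there is no `dataType` check. Only the idea of stopping at the first node that is not And/Or/If/null literal is taken from the comment above.

```scala
object NullToFalseSketch {
  // Hypothetical toy expression tree; these are NOT Catalyst classes.
  sealed trait Expr
  case object NullLit extends Expr                 // stands in for Literal(null, BooleanType)
  case object FalseLit extends Expr                // stands in for FalseLiteral
  case object TrueLit extends Expr
  final case class Attr(name: String) extends Expr // an opaque Boolean column
  final case class And(l: Expr, r: Expr) extends Expr
  final case class Or(l: Expr, r: Expr) extends Expr
  final case class Not(child: Expr) extends Expr
  final case class If(pred: Expr, t: Expr, f: Expr) extends Expr

  // Replace null with false only along paths made of And/Or/If nodes.
  // Any other node (for example Not) stops the traversal, because
  // NOT(NULL) is NULL while NOT(FALSE) is TRUE.
  def replaceNullWithFalse(e: Expr): Expr = e match {
    case NullLit => FalseLit
    case And(l, r) => And(replaceNullWithFalse(l), replaceNullWithFalse(r))
    case Or(l, r) => Or(replaceNullWithFalse(l), replaceNullWithFalse(r))
    case If(p, t, f) =>
      If(replaceNullWithFalse(p), replaceNullWithFalse(t), replaceNullWithFalse(f))
    case other => other
  }
}
```

In this sketch `replaceNullWithFalse(Not(NullLit))` returns its input unchanged, which is the reason a blanket top-down transform would be unsafe here.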
[GitHub] spark pull request #23135: [SPARK-26168][SQL] Update the code comments in Ex...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23135 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5351/ Test PASSed.
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23083 **[Test build #99267 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99267/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23083 Merged build finished. Test PASSed.
[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23130#discussion_r236149203

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ---
    @@ -388,7 +388,7 @@ case class FileSourceScanExec(
           logInfo(s"Planning with ${bucketSpec.numBuckets} buckets")
           val filesGroupedToBuckets =
             selectedPartitions.flatMap { p =>
    -          p.files.map { f =>
    +          p.files.filter(_.getLen > 0).map { f =>
    --- End diff --

    do you mean changing `filter...map...` to `flatMap`? I don't have a strong preference about it.

    The updated test cases and the new test case are for this change.
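For readers weighing the `filter...map...` versus `flatMap` question raised in this comment, the following sketch shows that the shapes are interchangeable on a plain Scala collection. It is only an illustration: `FileEntry` is a hypothetical stand-in for the file status objects in the diff, and the mapping to a path string is an assumption made to keep the example small.

```scala
object EmptyFileFilterSketch {
  // Hypothetical stand-in for a file entry; not Hadoop's FileStatus.
  final case class FileEntry(path: String, len: Long)

  // Variant 1: filter first, then map (the shape used in the diff).
  def viaFilterMap(files: Seq[FileEntry]): Seq[String] =
    files.filter(_.len > 0).map(_.path)

  // Variant 2: a single flatMap that emits nothing for empty files.
  def viaFlatMap(files: Seq[FileEntry]): Seq[String] =
    files.flatMap(f => if (f.len > 0) Some(f.path) else None)

  // Variant 3: collect with a guarded partial function.
  def viaCollect(files: Seq[FileEntry]): Seq[String] =
    files.collect { case f if f.len > 0 => f.path }
}
```

All three variants return the same sequence; the choice is mostly a matter of readability, which matches the "no strong preference" stance in the comment.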
[GitHub] spark pull request #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23137
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23135 thanks, merging to master!
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23083 retest this please
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23135 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99257/ Test PASSed.
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23135 Merged build finished. Test PASSed.
[GitHub] spark issue #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23137 thanks, merging to master!
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23135 **[Test build #99257 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99257/testReport)** for PR 23135 at commit [`cd682ff`](https://github.com/apache/spark/commit/cd682ff4377856b969f4745f782b7f49f2fc85c8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #23142: [SPARK-26170][SS] Add missing metrics in FlatMapGroupsWi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23142 **[Test build #99266 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99266/testReport)** for PR 23142 at commit [`56f39cc`](https://github.com/apache/spark/commit/56f39cc5838c3f609c8657639ac3a45991fde99f).
[GitHub] spark issue #23142: [SPARK-26170][SS] Add missing metrics in FlatMapGroupsWi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23142 Can one of the admins verify this patch?
[GitHub] spark pull request #23142: [SPARK-26170][SS] Add missing metrics in FlatMapG...
GitHub user HeartSaVioR opened a pull request:

https://github.com/apache/spark/pull/23142

[SPARK-26170][SS] Add missing metrics in FlatMapGroupsWithState

## What changes were proposed in this pull request?

This patch measures, in FlatMapGroupsWithStateExec, the StateStoreWriter metrics that can be measured there. Note that some metrics, such as the time to remove elements, are not covered because they are coupled with the state function.

## How was this patch tested?

Manually tested with https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredSessionization.scala.

Snapshots below:

![screen shot 2018-11-26 at 4 13 40 pm](https://user-images.githubusercontent.com/1317309/48999346-b5f7b400-f199-11e8-89c7-8795f13470d6.png)

![screen shot 2018-11-26 at 4 13 54 pm](https://user-images.githubusercontent.com/1317309/48999347-b5f7b400-f199-11e8-91ef-ef0b2f816b2e.png)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HeartSaVioR/spark SPARK-26170

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23142.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #23142

commit 56f39cc5838c3f609c8657639ac3a45991fde99f
Author: Jungtaek Lim (HeartSaVioR)
Date:   2018-11-26T07:33:08Z

    SPARK-26170 Add missing metrics in FlatMapGroupsWithState
[GitHub] spark issue #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23137 Merged build finished. Test PASSed.
[GitHub] spark issue #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99256/ Test PASSed.
[GitHub] spark issue #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23137 **[Test build #99256 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99256/testReport)** for PR 23137 at commit [`5cfe08d`](https://github.com/apache/spark/commit/5cfe08d75383069d0ac62f9603685ea1860b74e1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #23141: [SPARK-26021][SQL][followup] add test for special...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23141#discussion_r236143942

    --- Diff: docs/sql-migration-guide-upgrade.md ---
    @@ -17,14 +17,16 @@ displayTitle: Spark SQL Upgrading Guide
       - Since Spark 3.0, the `from_json` functions supports two modes - `PERMISSIVE` and `FAILFAST`. The modes can be set via the `mode` option. The default mode became `PERMISSIVE`. In previous versions, behavior of `from_json` did not conform to either `PERMISSIVE` nor `FAILFAST`, especially in processing of malformed JSON records. For example, the JSON string `{"a" 1}` with the schema `a INT` is converted to `null` by previous versions but Spark 3.0 converts it to `Row(null)`.
    -  - In Spark version 2.4 and earlier, the `from_json` function produces `null`s for JSON strings and JSON datasource skips the same independetly of its mode if there is no valid root JSON token in its input (` ` for example). Since Spark 3.0, such input is treated as a bad record and handled according to specified mode. For example, in the `PERMISSIVE` mode the ` ` input is converted to `Row(null, null)` if specified schema is `key STRING, value INT`.
    +  - In Spark version 2.4 and earlier, the `from_json` function produces `null`s for JSON strings and JSON datasource skips the same independetly of its mode if there is no valid root JSON token in its input (` ` for example). Since Spark 3.0, such input is treated as a bad record and handled according to specified mode. For example, in the `PERMISSIVE` mode the ` ` input is converted to `Row(null, null)` if specified schema is `key STRING, value INT`.
       - The `ADD JAR` command previously returned a result set with the single value 0. It now returns an empty result set.
       - In Spark version 2.4 and earlier, users can create map values with map type key via built-in function like `CreateMap`, `MapFromArrays`, etc. Since Spark 3.0, it's not allowed to create map values with map type key with these built-in functions. Users can still read map values with map type key from data source or Java/Scala collections, though they are not very useful.
    -
    +
       - In Spark version 2.4 and earlier, `Dataset.groupByKey` results to a grouped dataset with key attribute wrongly named as "value", if the key is non-struct type, e.g. int, string, array, etc. This is counterintuitive and makes the schema of aggregation queries weird. For example, the schema of `ds.groupByKey(...).count()` is `(value, count)`. Since Spark 3.0, we name the grouping attribute to "key". The old behaviour is preserved under a newly added configuration `spark.sql.legacy.dataset.nameNonStructGroupingKeyAsValue` with a default value of `false`.
    +  - In Spark version 2.4 and earlier, float/double -0.0 is semantically equal to 0.0, but users can still distinguish them via `Dataset.show`, `Dataset.collect` etc. Since Spark 3.0, float/double -0.0 is replaced by 0.0 internally, and users can't distinguish them any more.
    --- End diff --

    I checked Presto and Postgres; their behavior is the same. Hive distinguishes -0.0 and 0.0, but it has the group-by bug.
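The distinction the migration note draws between "semantically equal" and "distinguishable" zeros can be reproduced on the JVM without Spark. The sketch below is only an illustration of that IEEE 754 behavior; `normalizeZero` and `bits` are hypothetical helper names, and nothing here claims to mirror how Spark performs the replacement internally.

```scala
object NegativeZeroSketch {
  // Hypothetical helper: map -0.0 to 0.0 and leave every other value alone.
  // -0.0 == 0.0 is true under IEEE 754 comparison, so the test below
  // matches both zeros; NaN compares unequal and is returned unchanged.
  def normalizeZero(d: Double): Double = if (d == 0.0) 0.0 else d

  // Bit-level view, which is what makes the two zeros distinguishable.
  def bits(d: Double): Long = java.lang.Double.doubleToRawLongBits(d)
}
```

Comparing with `==` treats the two zeros as equal, while their string form and raw bits differ, which is why output paths such as `show` or `collect` could expose the difference before normalization.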
[GitHub] spark pull request #23104: [SPARK-26138][SQL] Cross join requires push Local...
Github user guoxiaolongzte commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23104#discussion_r236143436

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -459,6 +459,7 @@ object LimitPushDown extends Rule[LogicalPlan] {
           val newJoin = joinType match {
             case RightOuter => join.copy(right = maybePushLocalLimit(exp, right))
             case LeftOuter => join.copy(left = maybePushLocalLimit(exp, left))
    +        case Cross => join.copy(left = maybePushLocalLimit(exp, left), right = maybePushLocalLimit(exp, right))
    --- End diff --

    I think that when spark.sql.crossJoin.enabled=true is set, for an Inner, LeftOuter, RightOuter, or FullOuter join without a condition, the limit should be pushed down on both sides, just like the cross join limit in this PR.
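Why a limit can be pushed to both sides of a cross join can be checked on plain Scala collections. The sketch below is an illustration under simplifying assumptions (in-memory sequences, a fixed iteration order); the function names are hypothetical and it does not model Spark's `LocalLimit` operator or any join type other than the unconditional Cartesian product.

```scala
object CrossJoinLimitSketch {
  // Cartesian product of two in-memory "tables".
  def cross[A, B](left: Seq[A], right: Seq[B]): Seq[(A, B)] =
    for (l <- left; r <- right) yield (l, r)

  // Limit applied after the full cross join.
  def limitAfter[A, B](left: Seq[A], right: Seq[B], n: Int): Seq[(A, B)] =
    cross(left, right).take(n)

  // Limit pushed to both sides first, then applied again on top.
  def limitPushed[A, B](left: Seq[A], right: Seq[B], n: Int): Seq[(A, B)] =
    cross(left.take(n), right.take(n)).take(n)
}
```

Because every left row pairs with every right row, truncating each side to `n` rows still leaves at least `min(n, |left| * |right|)` pairs, so the outer limit sees enough rows. With a join condition this no longer holds, since the truncated inputs may drop exactly the rows that would have matched.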
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23138 Merged build finished. Test PASSed.
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23138 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5350/ Test PASSed.
[GitHub] spark issue #23104: [SPARK-26138][SQL] Cross join requires push LocalLimit i...
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/23104
> The title has a typo.

Sorry, it has been fixed.
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23138 **[Test build #99265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99265/testReport)** for PR 23138 at commit [`471d114`](https://github.com/apache/spark/commit/471d1144d41f767b3227d78b663eaa79efef738c).
[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22163 **[Test build #99264 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99264/testReport)** for PR 22163 at commit [`90726db`](https://github.com/apache/spark/commit/90726dbcbde2c5f165a870a8038488f09a3c92d2).
[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22163 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5349/ Test PASSed.
[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22163 Merged build finished. Test PASSed.
[GitHub] spark pull request #23104: [SPARK-26138][SQL] LimitPushDown cross join requi...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23104#discussion_r236137768

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -459,6 +459,7 @@ object LimitPushDown extends Rule[LogicalPlan] {
           val newJoin = joinType match {
             case RightOuter => join.copy(right = maybePushLocalLimit(exp, right))
             case LeftOuter => join.copy(left = maybePushLocalLimit(exp, left))
    +        case Cross => join.copy(left = maybePushLocalLimit(exp, left), right = maybePushLocalLimit(exp, right))
    --- End diff --

    @guoxiaolongzte nope.
[GitHub] spark issue #23104: [SPARK-26138][SQL] LimitPushDown cross join requires may...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/23104 The title has a typo.
[GitHub] spark pull request #23104: [SPARK-26138][SQL] LimitPushDown cross join requi...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23104#discussion_r236137426

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -459,6 +459,7 @@ object LimitPushDown extends Rule[LogicalPlan] {
           val newJoin = joinType match {
             case RightOuter => join.copy(right = maybePushLocalLimit(exp, right))
             case LeftOuter => join.copy(left = maybePushLocalLimit(exp, left))
    +        case Cross => join.copy(left = maybePushLocalLimit(exp, left), right = maybePushLocalLimit(exp, right))
    --- End diff --

    +1
[GitHub] spark pull request #23104: [SPARK-26138][SQL] LimitPushDown cross join requi...
Github user guoxiaolongzte commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23104#discussion_r236137253

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -459,6 +459,7 @@ object LimitPushDown extends Rule[LogicalPlan] {
           val newJoin = joinType match {
             case RightOuter => join.copy(right = maybePushLocalLimit(exp, right))
             case LeftOuter => join.copy(left = maybePushLocalLimit(exp, left))
    +        case Cross => join.copy(left = maybePushLocalLimit(exp, left), right = maybePushLocalLimit(exp, right))
    --- End diff --

    When spark.sql.crossJoin.enabled=true is set, are an inner join without condition, LeftOuter without condition, RightOuter without condition, and FullOuter without condition all literally cross joins?
[GitHub] spark issue #23141: [SPARK-26021][SQL][followup] add test for special floati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23141 **[Test build #99263 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99263/testReport)** for PR 23141 at commit [`8a9103c`](https://github.com/apache/spark/commit/8a9103c47931eb61cb329ece046d5efc50e855c2).
[GitHub] spark issue #23141: [SPARK-26021][SQL][followup] add test for special floati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23141 Merged build finished. Test PASSed.
[GitHub] spark issue #23141: [SPARK-26021][SQL][followup] add test for special floati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23141 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5348/ Test PASSed.
[GitHub] spark issue #23141: [SPARK-26021][SQL][followup] add test for special floati...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23141 cc @adoron @kiszk @viirya
[GitHub] spark pull request #23141: [SPARK-26021][SQL][followup] add test for special...
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/23141

[SPARK-26021][SQL][followup] add test for special floating point values

## What changes were proposed in this pull request?

A followup of https://github.com/apache/spark/pull/23124. Add a test to show the minor behavior change introduced by #23124, and add a migration guide entry.

## How was this patch tested?

a new test

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark follow

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23141.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #23141

commit 8a9103c47931eb61cb329ece046d5efc50e855c2
Author: Wenchen Fan
Date:   2018-11-26T06:11:09Z

    add test for special floating point values
[GitHub] spark pull request #23138: [SPARK-23356][SQL][TEST] add new test cases for a...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23138#discussion_r236136273

    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SetOperationSuite.scala ---
    @@ -196,4 +196,31 @@ class SetOperationSuite extends PlanTest {
           ))
         comparePlans(expectedPlan, rewrittenPlan)
       }
    +
    +  test("SPARK-23356 union: expressions in project list are addition to each side") {
    +    val unionQuery = testUnion.select(('a + 1).as("aa"))
    +    val unionOptimized = Optimize.execute(unionQuery.analyze)
    +    val unionCorrectAnswer =
    +      Union(testRelation.select(('a + 1).as("aa")) ::
    +        testRelation2.select(('d + 1).as("aa")) ::
    +        testRelation3.select(('g + 1).as("aa")) :: Nil).analyze
    +    comparePlans(unionOptimized, unionCorrectAnswer)
    +  }
    +
    +  test("SPARK-23356 union: expressions in project list are attribute addition to each side") {
    +    val unionQuery = testUnion.select(('a + 'b).as("ab"))
    +    val unionOptimized = Optimize.execute(unionQuery.analyze)
    +    val unionCorrectAnswer =
    +      Union(testRelation.select(('a + 'b).as("ab")) ::
    +        testRelation2.select(('d + 'e).as("ab")) ::
    +        testRelation3.select(('g + 'h).as("ab")) :: Nil).analyze
    +    comparePlans(unionOptimized, unionCorrectAnswer)
    +  }
    +
    +  test("SPARK-23356 union: project don't each side with non-deterministic expression") {
    --- End diff --

    no pushdown for non-deterministic expression
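The equivalence these tests check, that a deterministic projection gives the same rows whether it runs above the union or on each side, can be sketched with plain Scala collections. This is an analogy only: the function names are hypothetical, rows are modelled as plain values, and the sketch does not use Catalyst or the test DSL from the diff.

```scala
object UnionProjectionSketch {
  // Project after the union: one pass over the concatenated rows.
  def projectAfterUnion[A, B](sides: Seq[Seq[A]])(f: A => B): Seq[B] =
    sides.flatten.map(f)

  // Project each side first, then union: the "pushed down" shape.
  def projectEachSide[A, B](sides: Seq[Seq[A]])(f: A => B): Seq[B] =
    sides.map(_.map(f)).flatten
}
```

The two shapes agree for a pure function such as `_ + 1`. The sketch deliberately covers only the deterministic case, which is the one the first two tests exercise; the third test name concerns the non-deterministic case, where the review comment says no pushdown happens.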
[GitHub] spark pull request #23138: [SPARK-23356][SQL][TEST] add new test cases for a...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23138#discussion_r236136228

    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SetOperationSuite.scala ---
    @@ -196,4 +196,31 @@ class SetOperationSuite extends PlanTest {
           ))
         comparePlans(expectedPlan, rewrittenPlan)
       }
    +
    +  test("SPARK-23356 union: expressions in project list are addition to each side") {
    +    val unionQuery = testUnion.select(('a + 1).as("aa"))
    +    val unionOptimized = Optimize.execute(unionQuery.analyze)
    +    val unionCorrectAnswer =
    +      Union(testRelation.select(('a + 1).as("aa")) ::
    +        testRelation2.select(('d + 1).as("aa")) ::
    +        testRelation3.select(('g + 1).as("aa")) :: Nil).analyze
    +    comparePlans(unionOptimized, unionCorrectAnswer)
    +  }
    +
    +  test("SPARK-23356 union: expressions in project list are attribute addition to each side") {
    --- End diff --

    the same here
[GitHub] spark pull request #23138: [SPARK-23356][SQL][TEST] add new test cases for a...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23138#discussion_r236136178

    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SetOperationSuite.scala ---
    @@ -196,4 +196,31 @@ class SetOperationSuite extends PlanTest {
           ))
         comparePlans(expectedPlan, rewrittenPlan)
       }
    +
    +  test("SPARK-23356 union: expressions in project list are addition to each side") {
    --- End diff --

    `are addition to each side` -> `are pushed down`
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23083 Merged build finished. Test FAILed.
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23083 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99254/ Test FAILed.
[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r236135787 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -388,7 +388,7 @@ case class FileSourceScanExec( logInfo(s"Planning with ${bucketSpec.numBuckets} buckets") val filesGroupedToBuckets = selectedPartitions.flatMap { p => -p.files.map { f => +p.files.filter(_.getLen > 0).map { f => --- End diff -- Do we have a test case for this line? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/23130#discussion_r236135647 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -388,7 +388,7 @@ case class FileSourceScanExec( logInfo(s"Planning with ${bucketSpec.numBuckets} buckets") val filesGroupedToBuckets = selectedPartitions.flatMap { p => -p.files.map { f => +p.files.filter(_.getLen > 0).map { f => --- End diff -- do the filtering inside the map? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
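The two shapes under discussion for skipping zero-length files can be sketched on plain collections. This is only an illustration: `FileEntry` and `EmptyFileFilter` are hypothetical stand-ins, not Spark classes, and `collect` is just one possible reading of "do the filtering inside the map".

```scala
// Hypothetical stand-in for a file status entry; not Spark's actual class.
final case class FileEntry(path: String, len: Long)

object EmptyFileFilter {
  // Variant 1: filter first, then map (the shape of the change in the diff).
  def filterThenMap(files: Seq[FileEntry]): Seq[String] =
    files.filter(_.len > 0).map(_.path)

  // Variant 2: a single pass with collect, which combines the filter and the map.
  def collectNonEmpty(files: Seq[FileEntry]): Seq[String] =
    files.collect { case f if f.len > 0 => f.path }
}
```

Both variants return the same elements in the same order; the second avoids building the intermediate filtered sequence.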
[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23083 **[Test build #99254 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99254/testReport)** for PR 23083 at commit [`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23139 **[Test build #99262 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99262/testReport)** for PR 23139 at commit [`e416810`](https://github.com/apache/spark/commit/e41681096867cbc6d2556da83ce733092d6df841). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23139 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5347/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23139 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for resolution of...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23108 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99251/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for resolution of...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23108 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for resolution of...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23108 **[Test build #99251 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99251/testReport)** for PR 23108 at commit [`d6e582b`](https://github.com/apache/spark/commit/d6e582b3ff33f767d41c9c7cf1710107d7901e0f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23138 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23138 **[Test build #99261 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99261/testReport)** for PR 23138 at commit [`ebe10e1`](https://github.com/apache/spark/commit/ebe10e171a8fd6fd8afa4f22eb47ee643562db5a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23138 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5346/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23138 **[Test build #99260 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99260/testReport)** for PR 23138 at commit [`ebe10e1`](https://github.com/apache/spark/commit/ebe10e171a8fd6fd8afa4f22eb47ee643562db5a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/23138 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23138 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23138 **[Test build #99253 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99253/testReport)** for PR 23138 at commit [`ebe10e1`](https://github.com/apache/spark/commit/ebe10e171a8fd6fd8afa4f22eb47ee643562db5a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23138 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99253/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23139 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99255/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23139 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23139 **[Test build #99255 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99255/testReport)** for PR 23139 at commit [`6b6997d`](https://github.com/apache/spark/commit/6b6997d6c5eedb9a75af61345ae808c9d98e6f4d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23127 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99259/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23127 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23127 **[Test build #99259 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99259/testReport)** for PR 23127 at commit [`23c2d91`](https://github.com/apache/spark/commit/23c2d9111f1cff9059746bb7b48bb8ef7ad7027b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait InputRDDCodegen extends CodegenSupport ` * `case class InputAdapter(child: SparkPlan) extends UnaryExecNode with InputRDDCodegen ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23127 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99258/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23127 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23127 **[Test build #99258 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99258/testReport)** for PR 23127 at commit [`23c2d91`](https://github.com/apache/spark/commit/23c2d9111f1cff9059746bb7b48bb8ef7ad7027b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait InputRDDCodegen extends CodegenSupport ` * `case class InputAdapter(child: SparkPlan) extends UnaryExecNode with InputRDDCodegen ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark
Github user sujith71955 commented on the issue: https://github.com/apache/spark/pull/22575 @stczwd Can you provide a detailed design document for this PR, mentioning the scenarios that are handled and the constraints, if any? This will give a complete picture of this PR. Thanks --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23131: [SPARK-25908][SQL][FOLLOW-UP] Add back unionAll
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/23131 Thanks! Merged to master. Yes. Adding Distinct over Union is super expensive especially when the underlying data set is huge. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/23137 LGTM, pending Jenkins --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/23135 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/23139#discussion_r236120731 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala --- @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.optimizer + +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If} +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or} +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan} +import org.apache.spark.sql.catalyst.rules.Rule +import org.apache.spark.sql.types.BooleanType + + +/** + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is + * semantically equivalent to `FalseLiteral` when evaluating the whole search condition. 
+ * + * Please note that FALSE and NULL are not exchangeable in most cases, when the search condition + * contains NOT and NULL-tolerant expressions. Thus, the rule is very conservative and applicable + * in very limited cases. + * + * For example, `Filter(Literal(null, BooleanType))` is equal to `Filter(FalseLiteral)`. + * + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`; + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually + * `Filter(FalseLiteral)`. + * + * Moreover, this rule also transforms predicates in all [[If]] expressions as well as branch + * conditions in all [[CaseWhen]] expressions, even if they are not part of the search conditions. + * + * For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))` can be simplified + * into `Project(Literal(2))`. + */ +object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] { + + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond)) +case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond))) +case p: LogicalPlan => p transformExpressions { + case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred)) + case cw @ CaseWhen(branches, _) => +val newBranches = branches.map { case (cond, value) => + replaceNullWithFalse(cond) -> value +} +cw.copy(branches = newBranches) + case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) => +val newLambda = lf.copy(function = replaceNullWithFalse(func)) +af.copy(function = newLambda) + case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) => +val newLambda = lf.copy(function = replaceNullWithFalse(func)) +ae.copy(function = newLambda) + case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) => +val newLambda = lf.copy(function = replaceNullWithFalse(func)) +mf.copy(function = newLambda) +} + } + + /** + * Recursively traverse the 
Boolean-type expression to replace + * `Literal(null, BooleanType)` with `FalseLiteral`, if possible. + * + * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit + * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or + * `Literal(null, BooleanType)`. + */ + private def replaceNullWithFalse(e: Expression): Expression = { +if (e.dataType != BooleanType) { --- End diff -- How about the LambdaFunction? My major concern is the future changes might
[GitHub] spark pull request #23088: [SPARK-26119][CORE][WEBUI]Task summary table shou...
Github user shahidki31 commented on a diff in the pull request: https://github.com/apache/spark/pull/23088#discussion_r236120634 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala --- @@ -222,29 +223,20 @@ private[spark] class AppStatusStore( val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) } def scanTasks(index: String)(fn: TaskDataWrapper => Long): IndexedSeq[Double] = { - Utils.tryWithResource( -store.view(classOf[TaskDataWrapper]) - .parent(stageKey) - .index(index) - .first(0L) - .closeableIterator() - ) { it => -var last = Double.NaN -var currentIdx = -1L -indices.map { idx => - if (idx == currentIdx) { -last - } else { -val diff = idx - currentIdx -currentIdx = idx -if (it.skip(diff - 1)) { - last = fn(it.next()).toDouble - last -} else { - Double.NaN -} - } -}.toIndexedSeq + val quantileTasks = store.view(classOf[TaskDataWrapper]) --- End diff -- Yes. If we check `if (status == "SUCCESS")` for every iterator value, we can't use the skip function. Earlier, we knew the exact index we needed, i.e. we could skip directly to the 25th percentile, the 50th percentile, and so on. Now we don't know which index holds the 25th percentile of the "SUCCESS" tasks unless we iterate over each one. Otherwise, we have to filter the "SUCCESS" tasks beforehand, as I have done in the PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
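A minimal sketch of the "filter first, then index" approach described in the comment above, on an in-memory sequence. `Task` and `SuccessQuantiles` are hypothetical names, and sorting in memory is an assumption made for illustration; the real code reads from an indexed store. Only the index formula is taken from the quoted diff.

```scala
// Hypothetical task record; not Spark's TaskDataWrapper.
final case class Task(status: String, durationMs: Long)

object SuccessQuantiles {
  // Same index formula as in the quoted diff: min((q * count).toLong, count - 1).
  def quantileIndices(quantiles: Seq[Double], count: Long): Seq[Long] =
    quantiles.map(q => math.min((q * count).toLong, count - 1))

  // Keep only successful tasks first, so that quantile indices refer to
  // positions among successful tasks rather than among all tasks.
  def durations(tasks: Seq[Task], quantiles: Seq[Double]): Seq[Double] = {
    val succeeded =
      tasks.filter(_.status == "SUCCESS").map(_.durationMs).sorted.toIndexedSeq
    if (succeeded.isEmpty) quantiles.map(_ => Double.NaN)
    else quantileIndices(quantiles, succeeded.size.toLong)
      .map(i => succeeded(i.toInt).toDouble)
  }
}
```

Once the non-successful tasks are gone, the precomputed indices are valid again, which is the property the skip-based iterator relied on.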
[GitHub] spark pull request #23027: [SPARK-26049][SQL][TEST] FilterPushdownBenchmark ...
Github user wangyum closed the pull request at: https://github.com/apache/spark/pull/23027 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23127 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23127 **[Test build #99259 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99259/testReport)** for PR 23127 at commit [`23c2d91`](https://github.com/apache/spark/commit/23c2d9111f1cff9059746bb7b48bb8ef7ad7027b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23127 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5345/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23104: [SPARK-26138][SQL] LimitPushDown cross join requi...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23104#discussion_r236118983 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -459,6 +459,7 @@ object LimitPushDown extends Rule[LogicalPlan] { val newJoin = joinType match { case RightOuter => join.copy(right = maybePushLocalLimit(exp, right)) case LeftOuter => join.copy(left = maybePushLocalLimit(exp, left)) +case Cross => join.copy(left = maybePushLocalLimit(exp, left), right = maybePushLocalLimit(exp, right)) --- End diff -- inner join without condition is literally cross join. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
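Why a limit can be pushed to both sides of a join without a condition can be sketched with plain Scala sequences. This is an illustration under stated assumptions, not the optimizer rule itself: `CrossJoinLimit` and its methods are hypothetical, and the outer `take(n)` stands in for the limit that stays above the join.

```scala
object CrossJoinLimit {
  // Cartesian product: every left row paired with every right row.
  def crossJoin[A, B](left: Seq[A], right: Seq[B]): Seq[(A, B)] =
    for (l <- left; r <- right) yield (l, r)

  // Limit applied only after the full product is built.
  def limitAfterJoin[A, B](left: Seq[A], right: Seq[B], n: Int): Seq[(A, B)] =
    crossJoin(left, right).take(n)

  // Limit applied to each side first; the outer limit is still kept.
  def limitPushedDown[A, B](left: Seq[A], right: Seq[B], n: Int): Seq[(A, B)] =
    crossJoin(left.take(n), right.take(n)).take(n)
}
```

With no join condition, no row on either side can be filtered out by the join, so the first n rows of the product never need more than n rows from each input. In SQL the row order is not guaranteed, so only the row count is comparable in general; the in-memory version happens to return identical rows.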
[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23139#discussion_r236118914 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala --- @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.optimizer + +import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If} +import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or} +import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral +import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan} +import org.apache.spark.sql.catalyst.rules.Rule +import org.apache.spark.sql.types.BooleanType + + +/** + * A rule that replaces `Literal(null, BooleanType)` with `FalseLiteral`, if possible, in the search + * condition of the WHERE/HAVING/ON(JOIN) clauses, which contain an implicit Boolean operator + * "(search condition) = TRUE". The replacement is only valid when `Literal(null, BooleanType)` is + * semantically equivalent to `FalseLiteral` when evaluating the whole search condition. 
+ * + * Please note that FALSE and NULL are not exchangeable in most cases, when the search condition + * contains NOT and NULL-tolerant expressions. Thus, the rule is very conservative and applicable + * in very limited cases. + * + * For example, `Filter(Literal(null, BooleanType))` is equal to `Filter(FalseLiteral)`. + * + * Another example containing branches is `Filter(If(cond, FalseLiteral, Literal(null, _)))`; + * this can be optimized to `Filter(If(cond, FalseLiteral, FalseLiteral))`, and eventually + * `Filter(FalseLiteral)`. + * + * Moreover, this rule also transforms predicates in all [[If]] expressions as well as branch + * conditions in all [[CaseWhen]] expressions, even if they are not part of the search conditions. + * + * For example, `Project(If(And(cond, Literal(null)), Literal(1), Literal(2)))` can be simplified + * into `Project(Literal(2))`. + */ +object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] { + + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond)) +case j @ Join(_, _, _, Some(cond)) => j.copy(condition = Some(replaceNullWithFalse(cond))) +case p: LogicalPlan => p transformExpressions { + case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred)) + case cw @ CaseWhen(branches, _) => +val newBranches = branches.map { case (cond, value) => + replaceNullWithFalse(cond) -> value +} +cw.copy(branches = newBranches) + case af @ ArrayFilter(_, lf @ LambdaFunction(func, _, _)) => +val newLambda = lf.copy(function = replaceNullWithFalse(func)) +af.copy(function = newLambda) + case ae @ ArrayExists(_, lf @ LambdaFunction(func, _, _)) => +val newLambda = lf.copy(function = replaceNullWithFalse(func)) +ae.copy(function = newLambda) + case mf @ MapFilter(_, lf @ LambdaFunction(func, _, _)) => +val newLambda = lf.copy(function = replaceNullWithFalse(func)) +mf.copy(function = newLambda) +} + } + + /** + * Recursively traverse the 
Boolean-type expression to replace + * `Literal(null, BooleanType)` with `FalseLiteral`, if possible. + * + * Note that `transformExpressionsDown` can not be used here as we must stop as soon as we hit + * an expression that is not [[CaseWhen]], [[If]], [[And]], [[Or]] or + * `Literal(null, BooleanType)`. + */ + private def replaceNullWithFalse(e: Expression): Expression = { +if (e.dataType != BooleanType) { --- End diff -- do we need this? `And`, `Or`, `If` all return boolean, and we already
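The traversal described in the quoted doc comment can be sketched on a toy expression tree. Everything below is hypothetical and self-contained (`Expr`, `NullBool`, `ToyReplaceNull`, and so on are not Catalyst classes), and the body is a guess at the shape implied by the comment, since the quoted diff is cut off before the method body.

```scala
// Toy Boolean expression tree; these are not Catalyst classes.
sealed trait Expr
case object NullBool extends Expr                  // stands in for Literal(null, BooleanType)
case object TrueLit extends Expr
case object FalseLit extends Expr
final case class Attr(name: String) extends Expr   // opaque Boolean column
final case class And(l: Expr, r: Expr) extends Expr
final case class Or(l: Expr, r: Expr) extends Expr
final case class Not(child: Expr) extends Expr
final case class If(pred: Expr, t: Expr, f: Expr) extends Expr

object ToyReplaceNull {
  // Descend only through And, Or and If; stop at anything else.
  def replaceNullWithFalse(e: Expr): Expr = e match {
    case NullBool => FalseLit
    case And(l, r) => And(replaceNullWithFalse(l), replaceNullWithFalse(r))
    case Or(l, r) => Or(replaceNullWithFalse(l), replaceNullWithFalse(r))
    case If(p, t, f) =>
      If(replaceNullWithFalse(p), replaceNullWithFalse(t), replaceNullWithFalse(f))
    // Stop here: NOT NULL is NULL but NOT FALSE is TRUE, so rewriting
    // under Not (or any unknown expression) could change the result.
    case other => other
  }
}
```

The catch-all case is what makes a generic top-down transform unsuitable: the rewrite must not reach a null literal nested under an expression such as `Not`.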
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23135 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23127 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23127 **[Test build #99258 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99258/testReport)** for PR 23127 at commit [`23c2d91`](https://github.com/apache/spark/commit/23c2d9111f1cff9059746bb7b48bb8ef7ad7027b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23127 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23127: [SPARK-26159] Codegen for LocalTableScanExec and ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23127#discussion_r236118569 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -350,6 +350,15 @@ trait CodegenSupport extends SparkPlan { */ def needStopCheck: Boolean = parent.needStopCheck + /** + * Helper default should stop check code. + */ + def shouldStopCheckCode: String = if (needStopCheck) { --- End diff -- We can use it in more places. This can be done in a follow-up. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23135 **[Test build #99257 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99257/testReport)** for PR 23135 at commit [`cd682ff`](https://github.com/apache/spark/commit/cd682ff4377856b969f4745f782b7f49f2fc85c8). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23135 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5344/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23135: [SPARK-26168][SQL] Update the code comments in Expressio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23135 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23140: SPARK-25774 truncate table with partition and path
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23140 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23135: [SPARK-26168][SQL] Update the code comments in Ex...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/23135#discussion_r236117773 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -43,9 +43,24 @@ import org.apache.spark.sql.types._ * There are a few important traits: * * - [[Nondeterministic]]: an expression that is not deterministic. + * - [[Stateful]]: an expression that contains mutable state. For example, MonotonicallyIncreasingID + * and Rand. A stateful expression is always non-deterministic. * - [[Unevaluable]]: an expression that is not supposed to be evaluated. * - [[CodegenFallback]]: an expression that does not have code gen implemented and falls back to * interpreted mode. + * - [[NullIntolerant]]: an expression that is null intolerant (i.e. any null input will result in + * null output). + * - [[NonSQLExpression]]: a common base trait for the expressions that doesn't have SQL + * expressions like representation. For example, `ScalaUDF`, `ScalaUDAF`, + * and object `MapObjects` and `Invoke`. + * - [[UserDefinedExpression]]: a common base trait for user-defined functions, including + * UDF/UDAF/UDTF. + * - [[HigherOrderFunction]]: a common base trait for higher order functions that take one or more + * (lambda) functions and applies these to some objects. The function + * produces a number of variables which can be consumed by some lambda + * function. + * - [[NamedExpression]]: An [[Expression]] that is named. + * - [[TimeZoneAwareExpression]]: A common base trait for time zone aware expressions. --- End diff -- Added. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
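For readers unfamiliar with the "null intolerant" wording in the doc comment above, the following is a rough sketch of the idea using a toy expression tree. `Expr`, `Lit`, `Add`, and `Coalesce` here are hypothetical stand-ins written for illustration, not Catalyst's classes, and SQL NULL is modelled as `None`.

```scala
// Toy model only: these types are assumptions for illustration, not Spark's API.
sealed trait Expr {
  // None stands in for SQL NULL.
  def eval(): Option[Int]
}

final case class Lit(value: Option[Int]) extends Expr {
  def eval(): Option[Int] = value
}

// Null intolerant: if either input is NULL, the result is NULL.
final case class Add(left: Expr, right: Expr) extends Expr {
  def eval(): Option[Int] =
    for (a <- left.eval(); b <- right.eval()) yield a + b
}

// Not null intolerant: a NULL input does not force a NULL output.
final case class Coalesce(children: Seq[Expr]) extends Expr {
  def eval(): Option[Int] =
    children.iterator.map(_.eval()).collectFirst { case Some(v) => v }
}
```

In this sketch `Add(Lit(Some(1)), Lit(None))` evaluates to NULL, while `Coalesce(Seq(Lit(None), Lit(Some(7))))` still produces a value, which is the distinction the trait is meant to capture.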
[GitHub] spark pull request #23140: SPARK-25774 truncate table with partition and pat...
GitHub user lcqzte10192193 opened a pull request: https://github.com/apache/spark/pull/23140 SPARK-25774 truncate table with partition and path ## What changes were proposed in this pull request? When we run the Spark SQL TRUNCATE TABLE command on a managed table in Hive, it deletes the files in HDFS but leaves the partitions and the partition folder structure in place. See SPARK-25774 for more details. This PR resolves the problem. ## How was this patch tested? DDLSuite Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/lcqzte10192193/spark wid-lcq-1126 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/23140.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #23140 commit a902f4c233d8199d2461dbad9492d34d5179a1cc Author: lichaoqun Date: 2018-11-26T03:20:24Z SPARK-25774 truncate table with partition and path --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21732: [SPARK-24762][SQL] Enable Option of Product encod...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21732#discussion_r236117313 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala --- @@ -253,10 +247,24 @@ case class ExpressionEncoder[T]( }) /** - * Returns true if the type `T` is serialized as a struct. + * Returns true if the type `T` is serialized as a struct by `objSerializer`. */ def isSerializedAsStruct: Boolean = objSerializer.dataType.isInstanceOf[StructType] + /** + * Returns true if the type `T` is an `Option` type. + */ + def isOptionType: Boolean = classOf[Option[_]].isAssignableFrom(clsTag.runtimeClass) + + /** + * If the type `T` is serialized as a struct, when it is encoded to a Spark SQL row, fields in + * the struct are naturally mapped to top-level columns in a row. In other words, the serialized + * struct is flattened to row. But in case of the `T` is also an `Option` type, it can't be + * flattened to top-level row, because in Spark SQL top-level row can't be null. This method + * returns true if `T` is serialized as struct and is not `Option` type. + */ + def isSerializedAsStructForTopLevel: Boolean = isSerializedAsStruct && !isOptionType --- End diff -- ok. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
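The `isOptionType` and `isSerializedAsStructForTopLevel` checks quoted in the diff above can be tried in isolation. This is a minimal sketch under stated assumptions: `EncoderSketch` and `Point` are hypothetical names, and the `serializedAsStruct` flag is passed in by hand where the real encoder derives it from `objSerializer.dataType`.

```scala
import scala.reflect.ClassTag

// Hypothetical stand-in for the encoder; not Spark's ExpressionEncoder.
final class EncoderSketch[T](val serializedAsStruct: Boolean)(implicit clsTag: ClassTag[T]) {
  // Same check as in the diff: is the runtime class of T an Option (or a subclass)?
  def isOptionType: Boolean =
    classOf[Option[_]].isAssignableFrom(clsTag.runtimeClass)

  // A struct can be flattened to top-level columns only when T is not an Option,
  // because a top-level row cannot be null.
  def isSerializedAsStructForTopLevel: Boolean =
    serializedAsStruct && !isOptionType
}

// Hypothetical product type used only for the demo.
final case class Point(x: Int, y: Int)

object EncoderSketchDemo {
  val plain = new EncoderSketch[Point](serializedAsStruct = true)
  val optional = new EncoderSketch[Option[Point]](serializedAsStruct = true)
  val primitive = new EncoderSketch[Int](serializedAsStruct = false)
}
```

Under these assumptions, only `plain` reports `isSerializedAsStructForTopLevel` as true: `optional` is excluded by the `Option` check, and `primitive` is not a struct in the first place.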
[GitHub] spark issue #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23137 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22991: [SPARK-25989][ML] OneVsRestModel handle empty outputCols...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22991 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23137 **[Test build #99256 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99256/testReport)** for PR 23137 at commit [`5cfe08d`](https://github.com/apache/spark/commit/5cfe08d75383069d0ac62f9603685ea1860b74e1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23137: [SPARK-26169] Create DataFrameSetOperationsSuite
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5343/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22991: [SPARK-25989][ML] OneVsRestModel handle empty outputCols...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22991 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99252/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org