[spark] branch branch-2.4 updated: [SPARK-27485][BRANCH-2.4] EnsureRequirements.reorder should handle duplicate expressions gracefully
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 198f2f3  [SPARK-27485][BRANCH-2.4] EnsureRequirements.reorder should handle duplicate expressions gracefully
198f2f3 is described below

commit 198f2f331b79fc0a1d3afcf4d999c0dd4ad7a818
Author: herman
AuthorDate: Tue Jul 16 18:01:15 2019 -0700

    [SPARK-27485][BRANCH-2.4] EnsureRequirements.reorder should handle duplicate expressions gracefully

    Backport of 421d9d56efd447d31787e77316ce0eafb5fe45a5

    ## What changes were proposed in this pull request?

    When reordering join keys, EnsureRequirements only checks whether all the join keys are present in the partitioning expression seq. This is problematic when the join keys and the partitioning expressions both contain duplicates, but not the same number of duplicates for each expression, e.g. `Seq(a, a, b)` vs `Seq(a, b, b)`. This fails with an index lookup failure in the `reorder` function.

    This PR fixes this by removing the equality checking logic from the `reorderJoinKeys` function, and by doing the multiset equality check in the `reorder` function while building the reordered key sequences.

    ## How was this patch tested?

    Added a unit test to the `PlannerSuite` and added an integration test to `JoinSuite`.

    Closes #25174 from hvanhovell/SPARK-27485-2.4.

    Authored-by: herman
    Signed-off-by: Dongjoon Hyun
---
 .../execution/exchange/EnsureRequirements.scala    | 72 --
 .../scala/org/apache/spark/sql/JoinSuite.scala     | 20 ++
 .../apache/spark/sql/execution/PlannerSuite.scala  | 26
 3 files changed, 86 insertions(+), 32 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
index d2d5011..bdb9a31 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
@@ -24,8 +24,7 @@ import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.physical._
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.execution._
-import org.apache.spark.sql.execution.joins.{BroadcastHashJoinExec, ShuffledHashJoinExec,
-  SortMergeJoinExec}
+import org.apache.spark.sql.execution.joins.{ShuffledHashJoinExec, SortMergeJoinExec}
 import org.apache.spark.sql.internal.SQLConf
 
 /**
@@ -221,25 +220,41 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
   }
 
   private def reorder(
-      leftKeys: Seq[Expression],
-      rightKeys: Seq[Expression],
+      leftKeys: IndexedSeq[Expression],
+      rightKeys: IndexedSeq[Expression],
       expectedOrderOfKeys: Seq[Expression],
       currentOrderOfKeys: Seq[Expression]): (Seq[Expression], Seq[Expression]) = {
-    val leftKeysBuffer = ArrayBuffer[Expression]()
-    val rightKeysBuffer = ArrayBuffer[Expression]()
-    val pickedIndexes = mutable.Set[Int]()
-    val keysAndIndexes = currentOrderOfKeys.zipWithIndex
+    if (expectedOrderOfKeys.size != currentOrderOfKeys.size) {
+      return (leftKeys, rightKeys)
+    }
+
+    // Build a lookup between an expression and the positions its holds in the current key seq.
+    val keyToIndexMap = mutable.Map.empty[Expression, mutable.BitSet]
+    currentOrderOfKeys.zipWithIndex.foreach {
+      case (key, index) =>
+        keyToIndexMap.getOrElseUpdate(key.canonicalized, mutable.BitSet.empty).add(index)
+    }
+
+    // Reorder the keys.
+    val leftKeysBuffer = new ArrayBuffer[Expression](leftKeys.size)
+    val rightKeysBuffer = new ArrayBuffer[Expression](rightKeys.size)
+    val iterator = expectedOrderOfKeys.iterator
+    while (iterator.hasNext) {
+      // Lookup the current index of this key.
+      keyToIndexMap.get(iterator.next().canonicalized) match {
+        case Some(indices) if indices.nonEmpty =>
+          // Take the first available index from the map.
+          val index = indices.firstKey
+          indices.remove(index)
-
-    expectedOrderOfKeys.foreach(expression => {
-      val index = keysAndIndexes.find { case (e, idx) =>
-        // As we may have the same key used many times, we need to filter out its occurrence we
-        // have already used.
-        e.semanticEquals(expression) && !pickedIndexes.contains(idx)
-      }.map(_._2).get
-      pickedIndexes += index
-      leftKeysBuffer.append(leftKeys(index))
-      rightKeysBuffer.append(rightKeys(index))
-    })
+          // Add the keys for that index to the reordered keys.
+          leftKeysBuffer += leftKeys(index)
+          rightKeysBuffer += rightKeys(index)
+        case _ =>
+          // The expression cannot be found, or we have exhausted all indices for that expression.
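As a quick illustration of the multiset-matching idea this commit introduces, here is a minimal standalone sketch in the same shape as the new `reorder` (generic types rather than the Spark-internal `Expression` signature; the BitSet-of-positions trick mirrors the diff above):

```scala
import scala.collection.mutable

// Map each key to the set of positions it occupies in `current`, then consume
// one position per key in `expected`. If any key runs out of positions, the two
// sequences are not multiset-equal and we signal "no reordering" with None.
def reorderIndices[T](current: IndexedSeq[T], expected: Seq[T]): Option[Seq[Int]] = {
  if (expected.size != current.size) return None
  val keyToIndices = mutable.Map.empty[T, mutable.BitSet]
  current.zipWithIndex.foreach { case (k, i) =>
    keyToIndices.getOrElseUpdate(k, mutable.BitSet.empty).add(i)
  }
  val order = expected.map { k =>
    keyToIndices.get(k) match {
      case Some(indices) if indices.nonEmpty =>
        val i = indices.firstKey   // take the first still-available position
        indices.remove(i)
        i
      case _ => return None        // multiset mismatch: caller keeps original order
    }
  }
  Some(order)
}

// reorderIndices(IndexedSeq("a", "a", "b"), Seq("a", "b", "b")) == None  (the failing case)
// reorderIndices(IndexedSeq("a", "b", "a"), Seq("a", "a", "b")) == Some(Seq(0, 2, 1))
```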
[spark] branch master updated: [SPARK-27963][CORE] Allow dynamic allocation without a shuffle service.
This is an automated email from the ASF dual-hosted git repository.

vanzin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 2ddeff9  [SPARK-27963][CORE] Allow dynamic allocation without a shuffle service.
2ddeff9 is described below

commit 2ddeff97d7329942a98ef363991eeabc3fa71a76
Author: Marcelo Vanzin
AuthorDate: Tue Jul 16 16:37:38 2019 -0700

    [SPARK-27963][CORE] Allow dynamic allocation without a shuffle service.

    This change adds a new option that enables dynamic allocation without the need for a shuffle service. This mode works by tracking which stages generate shuffle files, and keeping executors that generate data for those shuffles alive while the jobs that use them are active.

    A separate timeout is also added for shuffle data, so that executors holding shuffle data can use a different idle timeout before being removed. This allows the shuffle data to be kept around in case it is needed by some new job, or allows users to be more aggressive in timing out executors that don't have shuffle data in active use.

    The code also hooks up to the context cleaner, so that shuffles that are garbage collected are detected and the respective executors are not held unnecessarily.

    Testing done with added unit tests, and also with TPC-DS workloads on YARN without a shuffle service.

    Closes #24817 from vanzin/SPARK-27963.

    Authored-by: Marcelo Vanzin
    Signed-off-by: Marcelo Vanzin
---
 .../apache/spark/ExecutorAllocationManager.scala   |  16 +-
 .../main/scala/org/apache/spark/SparkContext.scala |  20 +-
 .../org/apache/spark/internal/config/package.scala |  11 +
 .../org/apache/spark/scheduler/StageInfo.scala     |  10 +-
 .../spark/scheduler/dynalloc/ExecutorMonitor.scala | 225 -
 .../spark/ExecutorAllocationManagerSuite.scala     |   2 +-
 .../scheduler/dynalloc/ExecutorMonitorSuite.scala  | 132 +++-
 docs/configuration.md                              |  20 ++
 8 files changed, 407 insertions(+), 29 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
index bceb26c..5114cf7 100644
--- a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
+++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
@@ -94,6 +94,7 @@ private[spark] class ExecutorAllocationManager(
     client: ExecutorAllocationClient,
     listenerBus: LiveListenerBus,
     conf: SparkConf,
+    cleaner: Option[ContextCleaner] = None,
     clock: Clock = new SystemClock())
   extends Logging {
 
@@ -148,7 +149,7 @@ private[spark] class ExecutorAllocationManager(
   // Listener for Spark events that impact the allocation policy
   val listener = new ExecutorAllocationListener
 
-  val executorMonitor = new ExecutorMonitor(conf, client, clock)
+  val executorMonitor = new ExecutorMonitor(conf, client, listenerBus, clock)
 
   // Executor that handles the scheduling task.
   private val executor =
@@ -194,11 +195,13 @@ private[spark] class ExecutorAllocationManager(
       throw new SparkException(
         s"s${DYN_ALLOCATION_SUSTAINED_SCHEDULER_BACKLOG_TIMEOUT.key} must be > 0!")
     }
-    // Require external shuffle service for dynamic allocation
-    // Otherwise, we may lose shuffle files when killing executors
-    if (!conf.get(config.SHUFFLE_SERVICE_ENABLED) && !testing) {
-      throw new SparkException("Dynamic allocation of executors requires the external " +
-        "shuffle service. You may enable this through spark.shuffle.service.enabled.")
+    if (!conf.get(config.SHUFFLE_SERVICE_ENABLED)) {
+      if (conf.get(config.DYN_ALLOCATION_SHUFFLE_TRACKING)) {
+        logWarning("Dynamic allocation without a shuffle service is an experimental feature.")
+      } else if (!testing) {
+        throw new SparkException("Dynamic allocation of executors requires the external " +
+          "shuffle service. You may enable this through spark.shuffle.service.enabled.")
+      }
     }
 
     if (executorAllocationRatio > 1.0 || executorAllocationRatio <= 0.0) {
@@ -214,6 +217,7 @@ private[spark] class ExecutorAllocationManager(
   def start(): Unit = {
     listenerBus.addToManagementQueue(listener)
     listenerBus.addToManagementQueue(executorMonitor)
+    cleaner.foreach(_.attachListener(executorMonitor))
 
     val scheduleTask = new Runnable() {
       override def run(): Unit = {
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala
index f289c17..75182b0 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -553,14 +553,22 @@ class SparkContext(config: SparkConf) extends
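For anyone wanting to try the new mode, a hedged setup sketch follows. The shuffle-tracking key name below is the name this feature shipped under in released Spark; since this commit predates the 3.0 release, verify the key against the version you run.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Sketch: dynamic allocation with shuffle tracking instead of an external
// shuffle service. Assumed config key: spark.dynamicAllocation.shuffleTracking.enabled.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "false")
  .set("spark.dynamicAllocation.shuffleTracking.enabled", "true")

val spark = SparkSession.builder().config(conf).getOrCreate()
```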
[spark] branch master updated: [SPARK-18299][SQL] Allow more aggregations on KeyValueGroupedDataset
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 1134fae  [SPARK-18299][SQL] Allow more aggregations on KeyValueGroupedDataset
1134fae is described below

commit 1134faecf4fe2cc1bf7c3670f5f8f2b9d0c6f2e7
Author: nooberfsh
AuthorDate: Tue Jul 16 16:35:04 2019 -0700

    [SPARK-18299][SQL] Allow more aggregations on KeyValueGroupedDataset

    ## What changes were proposed in this pull request?

    Add 4 additional `agg` overloads (for five to eight typed columns) to KeyValueGroupedDataset.

    ## How was this patch tested?

    New tests in DatasetSuite for typed aggregation.

    Closes #24993 from nooberfsh/sqlagg.

    Authored-by: nooberfsh
    Signed-off-by: Dongjoon Hyun
---
 .../apache/spark/sql/KeyValueGroupedDataset.scala  | 65 ++++++++++
 .../scala/org/apache/spark/sql/DatasetSuite.scala  | 64 ++++++++++
 2 files changed, 129 insertions(+)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala b/sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala
index a3cbea9..0da52d4 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala
@@ -521,6 +521,71 @@ class KeyValueGroupedDataset[K, V] private[sql](
     aggUntyped(col1, col2, col3, col4).asInstanceOf[Dataset[(K, U1, U2, U3, U4)]]
 
   /**
+   * Computes the given aggregations, returning a [[Dataset]] of tuples for each unique key
+   * and the result of computing these aggregations over all elements in the group.
+   *
+   * @since 3.0.0
+   */
+  def agg[U1, U2, U3, U4, U5](
+      col1: TypedColumn[V, U1],
+      col2: TypedColumn[V, U2],
+      col3: TypedColumn[V, U3],
+      col4: TypedColumn[V, U4],
+      col5: TypedColumn[V, U5]): Dataset[(K, U1, U2, U3, U4, U5)] =
+    aggUntyped(col1, col2, col3, col4, col5).asInstanceOf[Dataset[(K, U1, U2, U3, U4, U5)]]
+
+  /**
+   * Computes the given aggregations, returning a [[Dataset]] of tuples for each unique key
+   * and the result of computing these aggregations over all elements in the group.
+   *
+   * @since 3.0.0
+   */
+  def agg[U1, U2, U3, U4, U5, U6](
+      col1: TypedColumn[V, U1],
+      col2: TypedColumn[V, U2],
+      col3: TypedColumn[V, U3],
+      col4: TypedColumn[V, U4],
+      col5: TypedColumn[V, U5],
+      col6: TypedColumn[V, U6]): Dataset[(K, U1, U2, U3, U4, U5, U6)] =
+    aggUntyped(col1, col2, col3, col4, col5, col6)
+      .asInstanceOf[Dataset[(K, U1, U2, U3, U4, U5, U6)]]
+
+  /**
+   * Computes the given aggregations, returning a [[Dataset]] of tuples for each unique key
+   * and the result of computing these aggregations over all elements in the group.
+   *
+   * @since 3.0.0
+   */
+  def agg[U1, U2, U3, U4, U5, U6, U7](
+      col1: TypedColumn[V, U1],
+      col2: TypedColumn[V, U2],
+      col3: TypedColumn[V, U3],
+      col4: TypedColumn[V, U4],
+      col5: TypedColumn[V, U5],
+      col6: TypedColumn[V, U6],
+      col7: TypedColumn[V, U7]): Dataset[(K, U1, U2, U3, U4, U5, U6, U7)] =
+    aggUntyped(col1, col2, col3, col4, col5, col6, col7)
+      .asInstanceOf[Dataset[(K, U1, U2, U3, U4, U5, U6, U7)]]
+
+  /**
+   * Computes the given aggregations, returning a [[Dataset]] of tuples for each unique key
+   * and the result of computing these aggregations over all elements in the group.
+   *
+   * @since 3.0.0
+   */
+  def agg[U1, U2, U3, U4, U5, U6, U7, U8](
+      col1: TypedColumn[V, U1],
+      col2: TypedColumn[V, U2],
+      col3: TypedColumn[V, U3],
+      col4: TypedColumn[V, U4],
+      col5: TypedColumn[V, U5],
+      col6: TypedColumn[V, U6],
+      col7: TypedColumn[V, U7],
+      col8: TypedColumn[V, U8]): Dataset[(K, U1, U2, U3, U4, U5, U6, U7, U8)] =
+    aggUntyped(col1, col2, col3, col4, col5, col6, col7, col8)
+      .asInstanceOf[Dataset[(K, U1, U2, U3, U4, U5, U6, U7, U8)]]
+
   /**
    * Returns a [[Dataset]] that contains a tuple with each key and the number of items present
    * for that key.
    *
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
index 4b08a4b..ff61431 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
@@ -603,6 +603,70 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
       ("a", 30L, 32L, 2L, 15.0), ("b", 3L, 5L, 2L, 1.5), ("c", 1L, 2L, 1L, 1.0))
   }
 
+  test("typed aggregation: expr, expr, expr, expr, expr") {
+    val ds = Seq(("a", 10), ("a", 20), ("b", 1), ("b", 2), ("c", 1)).toDS()
+
+    checkDatasetUnorderly(
+      ds.groupByKey(_._1).agg(
+        sum("_2").as[Long],
+        sum($"_2" + 1).as[Long],
+        count("*").a
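A usage sketch of one of the new overloads, adapted from the DatasetSuite test quoted above (the `avg`/`max` columns are illustrative additions, not taken from the commit):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("agg-demo").getOrCreate()
import spark.implicits._

val ds = Seq(("a", 10), ("a", 20), ("b", 1), ("b", 2), ("c", 1)).toDS()

// Five typed aggregate columns now type-check, yielding
// Dataset[(String, Long, Long, Long, Double, Int)] for the six-column variant.
val agged = ds.groupByKey(_._1).agg(
  sum("_2").as[Long],
  sum($"_2" + 1).as[Long],
  count("*").as[Long],
  avg("_2").as[Double],
  max("_2").as[Int])
agged.show()
```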
[spark] branch master updated: [SPARK-27959][YARN] Change YARN resource configs to use .amount
This is an automated email from the ASF dual-hosted git repository.

vanzin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 43d68cd  [SPARK-27959][YARN] Change YARN resource configs to use .amount
43d68cd is described below

commit 43d68cd4ff84530c3d597f07352984225ab1db7c
Author: Thomas Graves
AuthorDate: Tue Jul 16 10:56:07 2019 -0700

    [SPARK-27959][YARN] Change YARN resource configs to use .amount

    ## What changes were proposed in this pull request?

    We are adding generic resource support to Spark, where configs carry a suffix for the amount of the resource so that other configs can be supported later. Spark on YARN had already added configs to request resources via spark.yarn.{executor/driver/am}.resource=, where the value and unit are specified together. We should change those configs to have a `.amount` suffix on them to match the Spark configs and to allow future configs to be added more easily. YARN itself already supports tags and attributes, so if we want the user to be able to pass those from Spark at some point, having a suffix makes sense. It wou [...]

    ## How was this patch tested?

    Tested via unit tests and manually on a YARN 3.x cluster with GPU resources configured.

    Closes #24989 from tgravescs/SPARK-27959-yarn-resourceconfigs.

    Authored-by: Thomas Graves
    Signed-off-by: Marcelo Vanzin
---
 docs/running-on-yarn.md                            | 14 ++--
 .../org/apache/spark/deploy/yarn/Client.scala      | 14 ++--
 .../spark/deploy/yarn/ResourceRequestHelper.scala  | 46 -
 .../apache/spark/deploy/yarn/YarnAllocator.scala   |  5 +-
 .../spark/deploy/yarn/YarnSparkHadoopUtil.scala    | 18 -
 .../org/apache/spark/deploy/yarn/ClientSuite.scala | 19 +++---
 .../deploy/yarn/ResourceRequestHelperSuite.scala   | 77 +-
 .../spark/deploy/yarn/YarnAllocatorSuite.scala     | 13 ++--
 8 files changed, 137 insertions(+), 69 deletions(-)

diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index dc93e9c..9d9b253 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -142,20 +142,20 @@ To use a custom metrics.properties for the application master and executors, upd
-  spark.yarn.am.resource.{resource-type}
+  spark.yarn.am.resource.{resource-type}.amount
   (none)
 
   Amount of resource to use for the YARN Application Master in client mode.
-  In cluster mode, use spark.yarn.driver.resource.{resource-type} instead.
+  In cluster mode, use spark.yarn.driver.resource.{resource-type}.amount instead.
   Please note that this feature can be used only with YARN 3.0+
   For reference, see YARN Resource Model documentation: https://hadoop.apache.org/docs/r3.0.1/hadoop-yarn/hadoop-yarn-site/ResourceModel.html
 
   Example:
-  To request GPU resources from YARN, use: spark.yarn.am.resource.yarn.io/gpu
+  To request GPU resources from YARN, use: spark.yarn.am.resource.yarn.io/gpu.amount
 
-  spark.yarn.driver.resource.{resource-type}
+  spark.yarn.driver.resource.{resource-type}.amount
   (none)
 
   Amount of resource to use for the YARN Application Master in cluster mode.
@@ -163,11 +163,11 @@ To use a custom metrics.properties for the application master and executors, upd
   For reference, see YARN Resource Model documentation: https://hadoop.apache.org/docs/r3.0.1/hadoop-yarn/hadoop-yarn-site/ResourceModel.html
 
   Example:
-  To request GPU resources from YARN, use: spark.yarn.driver.resource.yarn.io/gpu
+  To request GPU resources from YARN, use: spark.yarn.driver.resource.yarn.io/gpu.amount
 
-  spark.yarn.executor.resource.{resource-type}
+  spark.yarn.executor.resource.{resource-type}.amount
   (none)
 
   Amount of resource to use per executor process.
@@ -175,7 +175,7 @@ To use a custom metrics.properties for the application master and executors, upd
   For reference, see YARN Resource Model documentation: https://hadoop.apache.org/docs/r3.0.1/hadoop-yarn/hadoop-yarn-site/ResourceModel.html
 
   Example:
-  To request GPU resources from YARN, use: spark.yarn.executor.resource.yarn.io/gpu
+  To request GPU resources from YARN, use: spark.yarn.executor.resource.yarn.io/gpu.amount
 
diff --git a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
index 5b361d1..651e706 100644
--- a/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
+++ b/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
@@ -51,7 +51,7 @@ import org.apache.spark.{SecurityManager, SparkConf, SparkException}
 import org.apache.spark.api.python.PythonUtils
 import org.apache.spark.deploy.{SparkApplication, SparkHadoopUtil}
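A configuration sketch using the renamed keys from the docs diff above (assumes a YARN 3.0+ cluster with the `yarn.io/gpu` resource plugin configured; the counts are illustrative):

```scala
import org.apache.spark.SparkConf

// Request GPUs for the AM/driver and per executor via the new `.amount` suffix.
val conf = new SparkConf()
  .set("spark.yarn.am.resource.yarn.io/gpu.amount", "1")
  .set("spark.yarn.driver.resource.yarn.io/gpu.amount", "1")
  .set("spark.yarn.executor.resource.yarn.io/gpu.amount", "2")
```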
[spark] branch master updated: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 71882f1  [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
71882f1 is described below

commit 71882f119e72934a02d4c177f0d52c785e2df79f
Author: Yuming Wang
AuthorDate: Tue Jul 16 08:46:01 2019 -0700

    [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing

    ## What changes were proposed in this pull request?

    This PR enables `spark.sql.function.preferIntegralDivision` for PostgreSQL testing.

    ## How was this patch tested?

    N/A

    Closes #25170 from wangyum/SPARK-28343-2.

    Authored-by: Yuming Wang
    Signed-off-by: Dongjoon Hyun
---
 .../test/resources/sql-tests/inputs/pgSQL/int2.sql |  6 +-
 .../test/resources/sql-tests/inputs/pgSQL/int4.sql |  1 -
 .../test/resources/sql-tests/inputs/pgSQL/int8.sql |  1 -
 .../resources/sql-tests/results/pgSQL/case.sql.out | 18 ++---
 .../resources/sql-tests/results/pgSQL/int2.sql.out |  4 +-
 .../resources/sql-tests/results/pgSQL/int4.sql.out | 32 -
 .../resources/sql-tests/results/pgSQL/int8.sql.out | 78 +++---
 .../sql-tests/results/udf/pgSQL/udf-case.sql.out   |  8 +--
 .../org/apache/spark/sql/SQLQueryTestSuite.scala   |  1 +
 9 files changed, 73 insertions(+), 76 deletions(-)

diff --git a/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int2.sql b/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int2.sql
index 61f350d..f64ec5d 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int2.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int2.sql
@@ -88,11 +88,9 @@ WHERE f1 > -32767;
 
 SELECT '' AS five, i.f1, i.f1 - int('2') AS x FROM INT2_TBL i;
 
--- PostgreSQL `/` is the same with Spark `div` since SPARK-2659.
-SELECT '' AS five, i.f1, i.f1 div smallint('2') AS x FROM INT2_TBL i;
+SELECT '' AS five, i.f1, i.f1 / smallint('2') AS x FROM INT2_TBL i;
 
--- PostgreSQL `/` is the same with Spark `div` since SPARK-2659.
-SELECT '' AS five, i.f1, i.f1 div int('2') AS x FROM INT2_TBL i;
+SELECT '' AS five, i.f1, i.f1 / int('2') AS x FROM INT2_TBL i;
 
 -- corner cases
 SELECT string(shiftleft(smallint(-1), 15));
diff --git a/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int4.sql b/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int4.sql
index 675636e..86432a8 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int4.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int4.sql
@@ -135,7 +135,6 @@ SELECT int('1000') < int('999') AS `false`;
 
 SELECT 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 AS ten;
 
--- [SPARK-2659] HiveQL: Division operator should always perform fractional division
 SELECT 2 + 2 / 2 AS three;
 
 SELECT (2 + 2) / 2 AS two;
diff --git a/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int8.sql b/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int8.sql
index 32ac877..d29bf3b 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int8.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/pgSQL/int8.sql
@@ -85,7 +85,6 @@ SELECT 37 - q1 AS minus4 FROM INT8_TBL;
 SELECT '' AS five, 2 * q1 AS `twice int4` FROM INT8_TBL;
 SELECT '' AS five, q1 * 2 AS `twice int4` FROM INT8_TBL;
 
--- [SPARK-2659] HiveQL: Division operator should always perform fractional division
 -- int8 op int4
 SELECT q1 + int(42) AS `8plus4`, q1 - int(42) AS `8minus4`, q1 * int(42) AS `8mul4`, q1 / int(42) AS `8div4` FROM INT8_TBL;
 -- int4 op int8
diff --git a/sql/core/src/test/resources/sql-tests/results/pgSQL/case.sql.out b/sql/core/src/test/resources/sql-tests/results/pgSQL/case.sql.out
index 9b20b31..f95adcd 100644
--- a/sql/core/src/test/resources/sql-tests/results/pgSQL/case.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/pgSQL/case.sql.out
@@ -176,28 +176,28 @@ struct
 -- !query 18
 SELECT CASE WHEN 1=0 THEN 1/0 WHEN 1=1 THEN 1 ELSE 2/0 END
 -- !query 18 schema
-struct
+struct
 -- !query 18 output
-1.0
+1
 
 -- !query 19
 SELECT CASE 1 WHEN 0 THEN 1/0 WHEN 1 THEN 1 ELSE 2/0 END
 -- !query 19 schema
-struct
+struct
 -- !query 19 output
-1.0
+1
 
 -- !query 20
 SELECT CASE WHEN i > 100 THEN 1/0 ELSE 0 END FROM case_tbl
 -- !query 20 schema
-struct<CASE WHEN (i > 100) THEN (CAST(1 AS DOUBLE) / CAST(0 AS DOUBLE)) ELSE CAST(0 AS DOUBLE) END:double>
+struct<CASE WHEN (i > 100) THEN (1 div 0) ELSE 0 END:int>
 -- !query 20 output
-0.0
-0.0
-0.0
-0.0
+0
+0
+0
+0
 
 -- !query 21
diff --git a/sql/core/src/test/resources/sql-tests/results/pgSQL/int2.sql.out b/sql/core/src/test/resources/sql-tests/results/pgSQL/int2.sql.out
index 6b9246f..7a7ce5f 100644
--- a/sql/core/src/test/resources/sql-tests/results/pgSQL/int2.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/pgSQL/int2.sql.out
@@ -266,7 +2
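As a hedged illustration of why the test files can go back to PostgreSQL's `/` syntax: with the flag enabled (as the test suite now does), `/` on integral inputs behaves like Spark's `div`. Assumes a running SparkSession `spark` and that the flag is settable at runtime on the version in use.

```scala
// With preferIntegralDivision on, integer `/` matches `div`.
spark.conf.set("spark.sql.function.preferIntegralDivision", "true")
spark.sql("SELECT smallint('10') / smallint('3') AS slash, 10 div 3 AS div").show()
// both columns: 3
```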
[spark] branch branch-2.4 updated: Revert "[SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully"
This is an automated email from the ASF dual-hosted git repository.

lixiao pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 63898cb  Revert "[SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully"
63898cb is described below

commit 63898cbc1db46be8bcbb46d21fe01340bc883520
Author: gatorsmile
AuthorDate: Tue Jul 16 06:57:14 2019 -0700

    Revert "[SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully"

    This reverts commit 72f547d4a960ba0ba9cace53a0a5553eca1b4dd6.
---
 .../execution/exchange/EnsureRequirements.scala    | 72 ++
 .../scala/org/apache/spark/sql/JoinSuite.scala     | 20 --
 .../apache/spark/sql/execution/PlannerSuite.scala  | 26
 3 files changed, 32 insertions(+), 86 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
index bdb9a31..d2d5011 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
@@ -24,7 +24,8 @@ import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.physical._
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.execution._
-import org.apache.spark.sql.execution.joins.{ShuffledHashJoinExec, SortMergeJoinExec}
+import org.apache.spark.sql.execution.joins.{BroadcastHashJoinExec, ShuffledHashJoinExec,
+  SortMergeJoinExec}
 import org.apache.spark.sql.internal.SQLConf
 
 /**
@@ -220,41 +221,25 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
   }
 
   private def reorder(
-      leftKeys: IndexedSeq[Expression],
-      rightKeys: IndexedSeq[Expression],
+      leftKeys: Seq[Expression],
+      rightKeys: Seq[Expression],
       expectedOrderOfKeys: Seq[Expression],
       currentOrderOfKeys: Seq[Expression]): (Seq[Expression], Seq[Expression]) = {
-    if (expectedOrderOfKeys.size != currentOrderOfKeys.size) {
-      return (leftKeys, rightKeys)
-    }
-
-    // Build a lookup between an expression and the positions its holds in the current key seq.
-    val keyToIndexMap = mutable.Map.empty[Expression, mutable.BitSet]
-    currentOrderOfKeys.zipWithIndex.foreach {
-      case (key, index) =>
-        keyToIndexMap.getOrElseUpdate(key.canonicalized, mutable.BitSet.empty).add(index)
-    }
-
-    // Reorder the keys.
-    val leftKeysBuffer = new ArrayBuffer[Expression](leftKeys.size)
-    val rightKeysBuffer = new ArrayBuffer[Expression](rightKeys.size)
-    val iterator = expectedOrderOfKeys.iterator
-    while (iterator.hasNext) {
-      // Lookup the current index of this key.
-      keyToIndexMap.get(iterator.next().canonicalized) match {
-        case Some(indices) if indices.nonEmpty =>
-          // Take the first available index from the map.
-          val index = indices.firstKey
-          indices.remove(index)
+    val leftKeysBuffer = ArrayBuffer[Expression]()
+    val rightKeysBuffer = ArrayBuffer[Expression]()
+    val pickedIndexes = mutable.Set[Int]()
+    val keysAndIndexes = currentOrderOfKeys.zipWithIndex
 
-          // Add the keys for that index to the reordered keys.
-          leftKeysBuffer += leftKeys(index)
-          rightKeysBuffer += rightKeys(index)
-        case _ =>
-          // The expression cannot be found, or we have exhausted all indices for that expression.
-          return (leftKeys, rightKeys)
-      }
-    }
+    expectedOrderOfKeys.foreach(expression => {
+      val index = keysAndIndexes.find { case (e, idx) =>
+        // As we may have the same key used many times, we need to filter out its occurrence we
+        // have already used.
+        e.semanticEquals(expression) && !pickedIndexes.contains(idx)
+      }.map(_._2).get
+      pickedIndexes += index
+      leftKeysBuffer.append(leftKeys(index))
+      rightKeysBuffer.append(rightKeys(index))
+    })
     (leftKeysBuffer, rightKeysBuffer)
   }
 
@@ -264,13 +249,20 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
       leftPartitioning: Partitioning,
       rightPartitioning: Partitioning): (Seq[Expression], Seq[Expression]) = {
     if (leftKeys.forall(_.deterministic) && rightKeys.forall(_.deterministic)) {
-      (leftPartitioning, rightPartitioning) match {
-        case (HashPartitioning(leftExpressions, _), _) =>
-          reorder(leftKeys.toIndexedSeq, rightKeys.toIndexedSeq, leftExpressions, leftKeys)
-        case (_, HashPartitioning(rightExpressions, _)) =>
-          reorder(leftKeys.toIndexedSeq, rightKeys.toIndexedSeq, rightExpressions, rightKeys)
-        case _ =>
-          (leftKeys, rightKeys)
+
[spark] branch master updated (113f62d -> 282a12d)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 113f62d  [SPARK-27485][FOLLOWUP] Do not reduce the number of partitions for repartition in adaptive execution - fix compilation
     add 282a12d  [SPARK-27944][ML] Unify the behavior of checking empty output column names

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/ml/Predictor.scala |  4 +--
 .../spark/ml/classification/Classifier.scala       |  6 ++---
 .../apache/spark/ml/classification/OneVsRest.scala | 18 ++---
 .../classification/ProbabilisticClassifier.scala   |  4 +--
 .../spark/ml/clustering/GaussianMixture.scala      | 30 +-
 .../scala/org/apache/spark/ml/clustering/LDA.scala | 15 +++
 .../ml/regression/AFTSurvivalRegression.scala      | 27 ++-
 .../ml/regression/DecisionTreeRegressor.scala      | 26 ++-
 .../regression/GeneralizedLinearRegression.scala   | 25 +-
 9 files changed, 103 insertions(+), 52 deletions(-)
[spark] branch branch-2.4 updated (72f547d -> 3f5a114)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 72f547d  [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
     add 3f5a114  [SPARK-28247][SS][BRANCH-2.4] Fix flaky test "query without test harness" on ContinuousSuite

No new revisions were added by this update.

Summary of changes:
 .../sql/streaming/continuous/ContinuousSuite.scala | 63 --
 1 file changed, 47 insertions(+), 16 deletions(-)
[spark] branch branch-2.4 updated: [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 72f547d  [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
72f547d is described below

commit 72f547d4a960ba0ba9cace53a0a5553eca1b4dd6
Author: herman
AuthorDate: Tue Jul 16 17:09:52 2019 +0800

    [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully

    ## What changes were proposed in this pull request?

    When reordering join keys, EnsureRequirements only checks whether all the join keys are present in the partitioning expression seq. This is problematic when the join keys and the partitioning expressions both contain duplicates, but not the same number of duplicates for each expression, e.g. `Seq(a, a, b)` vs `Seq(a, b, b)`. This fails with an index lookup failure in the `reorder` function.

    This PR fixes this by removing the equality checking logic from the `reorderJoinKeys` function, and by doing the multiset equality check in the `reorder` function while building the reordered key sequences.

    ## How was this patch tested?

    Added a unit test to the `PlannerSuite` and added an integration test to `JoinSuite`.

    Closes #25167 from hvanhovell/SPARK-27485.

    Authored-by: herman
    Signed-off-by: Wenchen Fan
---
 .../execution/exchange/EnsureRequirements.scala    | 72 --
 .../scala/org/apache/spark/sql/JoinSuite.scala     | 20 ++
 .../apache/spark/sql/execution/PlannerSuite.scala  | 26
 3 files changed, 86 insertions(+), 32 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
index d2d5011..bdb9a31 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala
@@ -24,8 +24,7 @@ import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.physical._
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.execution._
-import org.apache.spark.sql.execution.joins.{BroadcastHashJoinExec, ShuffledHashJoinExec,
-  SortMergeJoinExec}
+import org.apache.spark.sql.execution.joins.{ShuffledHashJoinExec, SortMergeJoinExec}
 import org.apache.spark.sql.internal.SQLConf
 
 /**
@@ -221,25 +220,41 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
   }
 
   private def reorder(
-      leftKeys: Seq[Expression],
-      rightKeys: Seq[Expression],
+      leftKeys: IndexedSeq[Expression],
+      rightKeys: IndexedSeq[Expression],
       expectedOrderOfKeys: Seq[Expression],
       currentOrderOfKeys: Seq[Expression]): (Seq[Expression], Seq[Expression]) = {
-    val leftKeysBuffer = ArrayBuffer[Expression]()
-    val rightKeysBuffer = ArrayBuffer[Expression]()
-    val pickedIndexes = mutable.Set[Int]()
-    val keysAndIndexes = currentOrderOfKeys.zipWithIndex
+    if (expectedOrderOfKeys.size != currentOrderOfKeys.size) {
+      return (leftKeys, rightKeys)
+    }
+
+    // Build a lookup between an expression and the positions its holds in the current key seq.
+    val keyToIndexMap = mutable.Map.empty[Expression, mutable.BitSet]
+    currentOrderOfKeys.zipWithIndex.foreach {
+      case (key, index) =>
+        keyToIndexMap.getOrElseUpdate(key.canonicalized, mutable.BitSet.empty).add(index)
+    }
+
+    // Reorder the keys.
+    val leftKeysBuffer = new ArrayBuffer[Expression](leftKeys.size)
+    val rightKeysBuffer = new ArrayBuffer[Expression](rightKeys.size)
+    val iterator = expectedOrderOfKeys.iterator
+    while (iterator.hasNext) {
+      // Lookup the current index of this key.
+      keyToIndexMap.get(iterator.next().canonicalized) match {
+        case Some(indices) if indices.nonEmpty =>
+          // Take the first available index from the map.
+          val index = indices.firstKey
+          indices.remove(index)
-
-    expectedOrderOfKeys.foreach(expression => {
-      val index = keysAndIndexes.find { case (e, idx) =>
-        // As we may have the same key used many times, we need to filter out its occurrence we
-        // have already used.
-        e.semanticEquals(expression) && !pickedIndexes.contains(idx)
-      }.map(_._2).get
-      pickedIndexes += index
-      leftKeysBuffer.append(leftKeys(index))
-      rightKeysBuffer.append(rightKeys(index))
-    })
+          // Add the keys for that index to the reordered keys.
+          leftKeysBuffer += leftKeys(index)
+          rightKeysBuffer += rightKeys(index)
+        case _ =>
+          // The expression cannot be found, or we have exhausted all indices for that expression.
+
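For readers who want to see the failure shape end to end, a hedged repro sketch in the spirit of the `JoinSuite` integration test mentioned above (the data and column names are illustrative, not the test's exact contents; before this fix the planner could hit the index lookup failure on such duplicated equi-join keys):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("dup-keys").getOrCreate()
import spark.implicits._

// Left join keys Seq(a, a, b) vs right join keys Seq(a, b, b):
// duplicates on both sides, but with different multiplicities.
val t1 = spark.range(10).select(($"id" % 3).as("a"), ($"id" % 5).as("b"))
val t2 = spark.range(10).select(($"id" % 3).as("a"), ($"id" % 5).as("b"))

t1.join(t2, t1("a") === t2("a") && t1("a") === t2("b") && t1("b") === t2("b"))
  .collect()
```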
[spark] branch master updated (f74ad3d -> 113f62d)
This is an automated email from the ASF dual-hosted git repository.

hvanhovell pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from f74ad3d  [SPARK-28129][SQL][TEST] Port float8.sql
     add 113f62d  [SPARK-27485][FOLLOWUP] Do not reduce the number of partitions for repartition in adaptive execution - fix compilation

No new revisions were added by this update.

Summary of changes:
 .../src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
[spark] branch master updated: [SPARK-28129][SQL][TEST] Port float8.sql
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new f74ad3d  [SPARK-28129][SQL][TEST] Port float8.sql
f74ad3d is described below

commit f74ad3d7004e833b6dbc07d6281407ab89ef2d32
Author: Yuming Wang
AuthorDate: Tue Jul 16 19:31:20 2019 +0900

    [SPARK-28129][SQL][TEST] Port float8.sql

    ## What changes were proposed in this pull request?

    This PR ports float8.sql from the PostgreSQL regression tests: https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/float8.sql

    The expected results can be found in the link: https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/expected/float8.out

    When porting the test cases, we found six PostgreSQL-specific features that do not exist in Spark SQL:
    [SPARK-28060](https://issues.apache.org/jira/browse/SPARK-28060): Double type can not accept some special inputs
    [SPARK-28027](https://issues.apache.org/jira/browse/SPARK-28027): Spark SQL does not support prefix operator `` and `|/`
    [SPARK-28061](https://issues.apache.org/jira/browse/SPARK-28061): Support for converting float to binary format
    [SPARK-23906](https://issues.apache.org/jira/browse/SPARK-23906): Support Truncate number
    [SPARK-28134](https://issues.apache.org/jira/browse/SPARK-28134): Missing Trigonometric Functions

    We also found two bugs:
    [SPARK-28024](https://issues.apache.org/jira/browse/SPARK-28024): Incorrect value when out of range
    [SPARK-28135](https://issues.apache.org/jira/browse/SPARK-28135): ceil/ceiling/floor/power returns incorrect values

    And four behavior differences:
    [SPARK-27923](https://issues.apache.org/jira/browse/SPARK-27923): Spark SQL inserts bad inputs as NULL
    [SPARK-28028](https://issues.apache.org/jira/browse/SPARK-28028): Cast numeric to integral type need round
    [SPARK-27923](https://issues.apache.org/jira/browse/SPARK-27923): Spark SQL returns NULL when dividing by zero
    [SPARK-28007](https://issues.apache.org/jira/browse/SPARK-28007): Caret operator (^) means bitwise XOR in Spark/Hive and exponentiation in Postgres

    ## How was this patch tested?

    N/A

    Closes #24931 from wangyum/SPARK-28129.

    Authored-by: Yuming Wang
    Signed-off-by: HyukjinKwon
---
 .../resources/sql-tests/inputs/pgSQL/float8.sql    | 500 ++++++++
 .../sql-tests/results/pgSQL/float8.sql.out         | 839 +++++++++++
 2 files changed, 1339 insertions(+)

diff --git a/sql/core/src/test/resources/sql-tests/inputs/pgSQL/float8.sql b/sql/core/src/test/resources/sql-tests/inputs/pgSQL/float8.sql
new file mode 100644
index 000..6f8e3b5
--- /dev/null
+++ b/sql/core/src/test/resources/sql-tests/inputs/pgSQL/float8.sql
@@ -0,0 +1,500 @@
+--
+-- Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+--
+--
+-- FLOAT8
+-- https://github.com/postgres/postgres/blob/REL_12_BETA2/src/test/regress/sql/float8.sql
+
+CREATE TABLE FLOAT8_TBL(f1 double) USING parquet;
+
+INSERT INTO FLOAT8_TBL VALUES ('0.0 ');
+INSERT INTO FLOAT8_TBL VALUES ('1004.30 ');
+INSERT INTO FLOAT8_TBL VALUES (' -34.84');
+INSERT INTO FLOAT8_TBL VALUES ('1.2345678901234e+200');
+INSERT INTO FLOAT8_TBL VALUES ('1.2345678901234e-200');
+
+-- [SPARK-28024] Incorrect numeric values when out of range
+-- test for underflow and overflow handling
+SELECT double('10e400');
+SELECT double('-10e400');
+SELECT double('10e-400');
+SELECT double('-10e-400');
+
+-- [SPARK-28061] Support for converting float to binary format
+-- test smallest normalized input
+-- SELECT float8send('2.2250738585072014E-308'::float8);
+
+-- [SPARK-27923] Spark SQL insert there bad inputs to NULL
+-- bad input
+-- INSERT INTO FLOAT8_TBL VALUES ('');
+-- INSERT INTO FLOAT8_TBL VALUES ('     ');
+-- INSERT INTO FLOAT8_TBL VALUES ('xyz');
+-- INSERT INTO FLOAT8_TBL VALUES ('5.0.0');
+-- INSERT INTO FLOAT8_TBL VALUES ('5 . 0');
+-- INSERT INTO FLOAT8_TBL VALUES ('5.   0');
+-- INSERT INTO FLOAT8_TBL VALUES ('- 3');
+-- INSERT INTO FLOAT8_TBL VALUES ('123          5');
+
+-- special inputs
+SELECT double('NaN');
+-- [SPARK-28060] Double type can not accept some special inputs
+SELECT double('nan');
+SELECT double('   NAN  ');
+SELECT double('infinity');
+SELECT double('          -INFINiTY   ');
+-- [SPARK-27923] Spark SQL insert there bad special inputs to NULL
+-- bad special inputs
+SELECT double('N A N');
+SELECT double('NaN x');
+SELECT double(' INFINITY    x');
+
+SELECT double('Infinity') + 100.0;
+-- [SPARK-27768] Infinity, -Infinity, NaN should be recognized in a case insensitive manner
+SELECT double('Infinity') / double('Infinity');
+SELECT double('NaN') / double('NaN');
+-- [SPARK-28315] Decimal can not accept NaN as input
+SELECT double(decimal('nan'));
+
+SEL
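A few of the ported behaviors can be exercised directly from Scala; this is a hedged sketch assuming a SparkSession `spark` (the `double(...)` cast-style function is the same one the test file above uses; exact outputs depend on the JIRAs listed in the message):

```scala
// Special double inputs from the ported float8 tests.
spark.sql("SELECT double('NaN'), double('infinity')").show()
spark.sql("SELECT double('Infinity') + 100.0").show()
spark.sql("SELECT double('NaN') / double('NaN')").show()
// Case/whitespace variants like ' -INFINiTY ' are the SPARK-27768 gap noted above.
```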
[spark] branch master updated (421d9d5 -> d1a1376)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 421d9d5  [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
     add d1a1376  [SPARK-28356][SQL] Do not reduce the number of partitions for repartition in adaptive execution

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/execution/SparkStrategies.scala       |  6 --
 .../adaptive/ReduceNumShufflePartitions.scala       | 21 ++--
 .../sql/execution/exchange/EnsureRequirements.scala |  4 ++--
 .../execution/exchange/ShuffleExchangeExec.scala    |  3 ++-
 .../scala/org/apache/spark/sql/DatasetSuite.scala   |  2 +-
 .../execution/ReduceNumShufflePartitionsSuite.scala | 15 +--
 6 files changed, 25 insertions(+), 26 deletions(-)
[spark] branch master updated (9a7f01d -> 421d9d5)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 9a7f01d  [SPARK-28201][SQL][TEST][FOLLOWUP] Fix Integration test suite according to the new exception message
     add 421d9d5  [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully

No new revisions were added by this update.

Summary of changes:
 .../execution/exchange/EnsureRequirements.scala    | 74 --
 .../scala/org/apache/spark/sql/JoinSuite.scala     | 20 ++
 .../apache/spark/sql/execution/PlannerSuite.scala  | 26
 3 files changed, 87 insertions(+), 33 deletions(-)
[spark] branch master updated (6926849 -> 9a7f01d)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6926849  [SPARK-28395][SQL] Division operator support integral division
     add 9a7f01d  [SPARK-28201][SQL][TEST][FOLLOWUP] Fix Integration test suite according to the new exception message

No new revisions were added by this update.

Summary of changes:
 .../test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
[spark] branch master updated: [SPARK-28395][SQL] Division operator support integral division
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 6926849  [SPARK-28395][SQL] Division operator support integral division
6926849 is described below

commit 69268492471137dd7a3da54c218026c3b1fa7db3
Author: Yuming Wang
AuthorDate: Tue Jul 16 15:43:15 2019 +0800

    [SPARK-28395][SQL] Division operator support integral division

    ## What changes were proposed in this pull request?

    PostgreSQL, Teradata, SQL Server, DB2 and Presto perform integral division with the `/` operator. But Oracle, Vertica, Hive, MySQL and MariaDB perform fractional division with the `/` operator. This PR adds a flag (`spark.sql.function.preferIntegralDivision`) to control whether to use integral division with the `/` operator.

    Examples:

    **PostgreSQL**:
    ```sql
    postgres=# select substr(version(), 0, 16), cast(10 as int) / cast(3 as int), cast(10.1 as float8) / cast(3 as int), cast(10 as int) / cast(3.1 as float8), cast(10.1 as float8)/cast(3.1 as float8);
         substr      | ?column? | ?column? |    ?column?     |     ?column?
    -----------------+----------+----------+-----------------+------------------
     PostgreSQL 11.3 |        3 |     3.37 | 3.2258064516129 | 3.25806451612903
    (1 row)
    ```

    **SQL Server**:
    ```sql
    1> select cast(10 as int) / cast(3 as int), cast(10.1 as float) / cast(3 as int), cast(10 as int) / cast(3.1 as float), cast(10.1 as float)/cast(3.1 as float);
    2> go
    ---
    3  3.3667  3.225806451612903  3.258064516129032

    (1 rows affected)
    ```

    **DB2**:
    ```sql
    [db2inst12f3c821d36b7 ~]$ db2 "select cast(10 as int) / cast(3 as int), cast(10.1 as double) / cast(3 as int), cast(10 as int) / cast(3.1 as double), cast(10.1 as double)/cast(3.1 as double) from table (sysproc.env_get_inst_info())"

    1           2                      3                      4
    ----------- ---------------------- ---------------------- ----------------------
              3  +3.37E+000  +3.22580645161290E+000 +3.25806451612903E+000

      1 record(s) selected.
    ```

    **Presto**:
    ```sql
    presto> select cast(10 as int) / cast(3 as int), cast(10.1 as double) / cast(3 as int), cast(10 as int) / cast(3.1 as double), cast(10.1 as double)/cast(3.1 as double);
     _col0 | _col1  |       _col2       |       _col3
    -------+--------+-------------------+-------------------
         3 | 3.3667 | 3.225806451612903 | 3.258064516129032
    (1 row)
    ```

    **Teradata**:
    ![image](https://user-images.githubusercontent.com/5399861/61200701-e97d5380-a714-11e9-9a1d-57fd99d38c8d.png)

    **Oracle**:
    ```sql
    SQL> select 10 / 3 from dual;

          10/3
    ----------
    3.
    ```

    **Vertica**:
    ```sql
    dbadmin=> select version(), cast(10 as int) / cast(3 as int), cast(10.1 as float8) / cast(3 as int), cast(10 as int) / cast(3.1 as float8), cast(10.1 as float8)/cast(3.1 as float8);
                  version               | ?column? | ?column? |    ?column?     |     ?column?
    ------------------------------------+----------+----------+-----------------+------------------
     Vertica Analytic Database v9.1.1-0 |     3.33 |     3.37 | 3.2258064516129 | 3.25806451612903
    (1 row)
    ```

    **Hive**:
    ```sql
    hive> select cast(10 as int) / cast(3 as int), cast(10.1 as double) / cast(3 as int), cast(10 as int) / cast(3.1 as double), cast(10.1 as double)/cast(3.1 as double);
    OK
    3.3335  3.3667  3.225806451612903  3.258064516129032
    Time taken: 0.143 seconds, Fetched: 1 row(s)
    ```

    **MariaDB**:
    ```sql
    MariaDB [(none)]> select version(), cast(10 as int) / cast(3 as int), cast(10.1 as double) / cast(3 as int), cast(10 as int) / cast(3.1 as double), cast(10.1 as double)/cast(3.1 as double);
    +--------------------------------------+
    | version()                            |
    +--------------------------------------+
    | 10.4.6-MariaDB-1:10.4.6+maria~bionic |
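To make the flag's effect concrete, a hedged Spark-side sketch (assumes a SparkSession `spark`; the flag is named in this commit, and whether it is settable per-session may vary by version):

```scala
// Default Spark behavior: `/` on integers is fractional division.
spark.sql("SELECT cast(10 AS int) / cast(3 AS int)").show()  // 3.3333333333333335 (double)

// With the new flag, `/` on integral inputs performs integral division,
// matching `div` (which Spark already supports unconditionally).
spark.sql("SET spark.sql.function.preferIntegralDivision=true")
spark.sql("SELECT cast(10 AS int) / cast(3 AS int), 10 div 3").show()  // 3, 3
```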
[spark] branch master updated (b94fa97 -> be4a552)
This is an automated email from the ASF dual-hosted git repository.

jshao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b94fa97  [SPARK-28345][SQL][PYTHON] PythonUDF predicate should be able to pushdown to join
     add be4a552  [SPARK-28106][SQL] When Spark SQL use "add jar", before add to SparkContext, check jar path exist first.

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/SparkContext.scala | 34 ++
 .../scala/org/apache/spark/SparkContextSuite.scala | 11 +++
 2 files changed, 40 insertions(+), 5 deletions(-)
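A hedged sketch of the behavior this change targets (the jar path below is hypothetical; the expectation is that a nonexistent local path now fails at ADD JAR time rather than surfacing later when tasks run):

```scala
// Assumes a SparkSession `spark`. A missing local jar should be rejected early.
try {
  spark.sql("ADD JAR /tmp/does-not-exist.jar")  // hypothetical, nonexistent path
} catch {
  case e: Exception => println(s"rejected at ADD JAR time: ${e.getMessage}")
}
```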
[spark] branch master updated (8e26d4d -> b94fa97)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 8e26d4d  [SPARK-28408][SQL][TEST] Restrict test values for DateType, TimestampType and CalendarIntervalType
     add b94fa97  [SPARK-28345][SQL][PYTHON] PythonUDF predicate should be able to pushdown to join

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/expressions/predicates.scala      |  4 +++
 .../catalyst/optimizer/FilterPushdownSuite.scala   | 34 --
 .../scala/org/apache/spark/sql/JoinSuite.scala     | 25 +++-
 3 files changed, 59 insertions(+), 4 deletions(-)