svn commit: r32116 - in /dev/spark/2.4.1-SNAPSHOT-2019_01_23_23_03-63fa6f5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Jan 24 07:18:16 2019 New Revision: 32116 Log: Apache Spark 2.4.1-SNAPSHOT-2019_01_23_23_03-63fa6f5 docs [This commit notification would consist of 1476 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r32113 - in /dev/spark/3.0.0-SNAPSHOT-2019_01_23_20_51-d5a97c1-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Jan 24 05:03:45 2019 New Revision: 32113 Log: Apache Spark 3.0.0-SNAPSHOT-2019_01_23_20_51-d5a97c1 docs [This commit notification would consist of 1781 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
[spark] branch master updated: [SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Hadoop.
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d5a97c1 [SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Hadoop. d5a97c1 is described below commit d5a97c1c2c86ae335e91008fa25b3359c4560915 Author: Ryan Blue AuthorDate: Thu Jan 24 12:45:25 2019 +0800 [SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Hadoop. ## What changes were proposed in this pull request? Updates the attempt ID used by FileFormatWriter. Tasks in stage attempts use the same task attempt number and could conflict. Using Spark's task attempt ID guarantees that Hadoop TaskAttemptID instances are unique. ## How was this patch tested? Existing tests. Also validated that we no longer detect this failure case in our logs after deployment. Closes #23608 from rdblue/SPARK-26682-fix-hadoop-task-attempt-id. 
Authored-by: Ryan Blue Signed-off-by: Wenchen Fan --- .../org/apache/spark/sql/execution/datasources/FileFormatWriter.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala index 260ad97..91e92d8 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala @@ -170,7 +170,7 @@ object FileFormatWriter extends Logging { description = description, sparkStageId = taskContext.stageId(), sparkPartitionId = taskContext.partitionId(), -sparkAttemptNumber = taskContext.attemptNumber(), +sparkAttemptNumber = taskContext.taskAttemptId().toInt & Integer.MAX_VALUE, committer, iterator = iter) }, - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
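The patched line replaces the per-stage attempt number with Spark's task attempt ID, which is unique across the whole application, masked down to a non-negative int for Hadoop's TaskAttemptID. A minimal sketch (a hypothetical helper, not Spark code) of why the `& Integer.MAX_VALUE` mask is needed after the long-to-int truncation:

```java
public class AttemptIdMask {
    // Mirrors the patched expression `taskAttemptId().toInt & Integer.MAX_VALUE`:
    // the task attempt ID is a Long that grows monotonically for the lifetime of
    // the app, so truncating it to int can yield a negative value once it passes
    // 2^31. Masking with Integer.MAX_VALUE clears the sign bit, guaranteeing a
    // non-negative int while keeping the low 31 bits distinct.
    static int toNonNegativeInt(long taskAttemptId) {
        return (int) taskAttemptId & Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(toNonNegativeInt(7L));          // small IDs pass through: 7
        System.out.println(toNonNegativeInt(2147483648L)); // 2^31 truncates negative; mask -> 0
        System.out.println(toNonNegativeInt(-1L));         // all bits set -> 2147483647
    }
}
```

Distinct task attempt IDs can still collide after masking (the high bits are discarded), but unlike the stage-attempt number, collisions are no longer systematic across retried stage attempts.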
[spark] branch branch-2.4 updated: [SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Hadoop.
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 63fa6f5 [SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Hadoop. 63fa6f5 is described below commit 63fa6f5abc0c529d017243a4eea505c1c4cbbbd4 Author: Ryan Blue AuthorDate: Thu Jan 24 12:45:25 2019 +0800 [SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Hadoop. ## What changes were proposed in this pull request? Updates the attempt ID used by FileFormatWriter. Tasks in stage attempts use the same task attempt number and could conflict. Using Spark's task attempt ID guarantees that Hadoop TaskAttemptID instances are unique. ## How was this patch tested? Existing tests. Also validated that we no longer detect this failure case in our logs after deployment. Closes #23608 from rdblue/SPARK-26682-fix-hadoop-task-attempt-id. 
Authored-by: Ryan Blue Signed-off-by: Wenchen Fan (cherry picked from commit d5a97c1c2c86ae335e91008fa25b3359c4560915) Signed-off-by: Wenchen Fan --- .../org/apache/spark/sql/execution/datasources/FileFormatWriter.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala index 774fe38..2103a2d 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala @@ -170,7 +170,7 @@ object FileFormatWriter extends Logging { description = description, sparkStageId = taskContext.stageId(), sparkPartitionId = taskContext.partitionId(), -sparkAttemptNumber = taskContext.attemptNumber(), +sparkAttemptNumber = taskContext.taskAttemptId().toInt & Integer.MAX_VALUE, committer, iterator = iter) }, - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r32112 - in /dev/spark/2.4.1-SNAPSHOT-2019_01_23_18_40-921c22b-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Jan 24 02:56:27 2019 New Revision: 32112 Log: Apache Spark 2.4.1-SNAPSHOT-2019_01_23_18_40-921c22b docs [This commit notification would consist of 1476 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r32111 - in /dev/spark/2.3.4-SNAPSHOT-2019_01_23_18_40-23e35d4-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Jan 24 02:54:33 2019 New Revision: 32111 Log: Apache Spark 2.3.4-SNAPSHOT-2019_01_23_18_40-23e35d4 docs [This commit notification would consist of 1443 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
[spark] branch master updated: [SPARK-26617][SQL] Cache manager locks
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d0e9219 [SPARK-26617][SQL] Cache manager locks d0e9219 is described below commit d0e9219e03373b0db918d51bafe6a97775e2c65f Author: Dave DeCaprio AuthorDate: Thu Jan 24 10:48:48 2019 +0800 [SPARK-26617][SQL] Cache manager locks ## What changes were proposed in this pull request? Fixed several places in CacheManager where a write lock was being held while running the query optimizer. This could cause a very long lock block if the query optimization takes a long time. This builds on changes from [SPARK-26548] that fixed this issue for one specific case in the CacheManager. gatorsmile This is very similar to the PR you approved last week. ## How was this patch tested? Has been tested on a live system where the blocking was causing major issues and it is working well. CacheManager has no explicit unit test but is used in many places internally as part of the SharedState. Closes #23539 from DaveDeCaprio/cache-manager-locks. 
Lead-authored-by: Dave DeCaprio Co-authored-by: David DeCaprio Signed-off-by: Wenchen Fan --- .../apache/spark/sql/execution/CacheManager.scala | 69 ++ 1 file changed, 43 insertions(+), 26 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala index 728fde5..0e6627c 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala @@ -20,6 +20,7 @@ package org.apache.spark.sql.execution import java.util.concurrent.locks.ReentrantReadWriteLock import scala.collection.JavaConverters._ +import scala.collection.mutable import org.apache.hadoop.fs.{FileSystem, Path} @@ -120,7 +121,7 @@ class CacheManager extends Logging { def uncacheQuery( query: Dataset[_], cascade: Boolean, - blocking: Boolean = true): Unit = writeLock { + blocking: Boolean = true): Unit = { uncacheQuery(query.sparkSession, query.logicalPlan, cascade, blocking) } @@ -136,21 +137,27 @@ class CacheManager extends Logging { spark: SparkSession, plan: LogicalPlan, cascade: Boolean, - blocking: Boolean): Unit = writeLock { + blocking: Boolean): Unit = { val shouldRemove: LogicalPlan => Boolean = if (cascade) { _.find(_.sameResult(plan)).isDefined } else { _.sameResult(plan) } -val it = cachedData.iterator() -while (it.hasNext) { - val cd = it.next() - if (shouldRemove(cd.plan)) { -cd.cachedRepresentation.cacheBuilder.clearCache(blocking) -it.remove() +val plansToUncache = mutable.Buffer[CachedData]() +writeLock { + val it = cachedData.iterator() + while (it.hasNext) { +val cd = it.next() +if (shouldRemove(cd.plan)) { + plansToUncache += cd + it.remove() +} } } +plansToUncache.foreach { cd => + cd.cachedRepresentation.cacheBuilder.clearCache(blocking) +} // Re-compile dependent cached queries after removing the cached query. 
if (!cascade) { recacheByCondition(spark, _.find(_.sameResult(plan)).isDefined, clearCache = false) @@ -160,7 +167,7 @@ class CacheManager extends Logging { /** * Tries to re-cache all the cache entries that refer to the given plan. */ - def recacheByPlan(spark: SparkSession, plan: LogicalPlan): Unit = writeLock { + def recacheByPlan(spark: SparkSession, plan: LogicalPlan): Unit = { recacheByCondition(spark, _.find(_.sameResult(plan)).isDefined) } @@ -168,26 +175,36 @@ class CacheManager extends Logging { spark: SparkSession, condition: LogicalPlan => Boolean, clearCache: Boolean = true): Unit = { -val it = cachedData.iterator() val needToRecache = scala.collection.mutable.ArrayBuffer.empty[CachedData] -while (it.hasNext) { - val cd = it.next() - if (condition(cd.plan)) { -if (clearCache) { - cd.cachedRepresentation.cacheBuilder.clearCache() +writeLock { + val it = cachedData.iterator() + while (it.hasNext) { +val cd = it.next() +if (condition(cd.plan)) { + needToRecache += cd + // Remove the cache entry before we create a new one, so that we can have a different + // physical plan. + it.remove() +} + } +} +needToRecache.map { cd => + if (clearCache) { +cd.cachedRepresentation.cacheBuilder.clearCache() + } + val plan = spark.sessionState.executePlan(cd.plan).executedPlan + val newCache =
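The shape of the change is visible in the diff: matching entries are collected and unlinked while the write lock is held, and the potentially slow `clearCache` calls happen only after it is released. A minimal sketch of that lock-narrowing pattern, under stated assumptions (illustrative names, not Spark's actual CacheManager):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

// Hold the write lock only while scanning and unlinking entries; run the
// expensive per-entry cleanup (clearCache in Spark's case) lock-free.
class NarrowLockCache<E> {
    private final List<E> entries = new ArrayList<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void add(E e) {
        lock.writeLock().lock();
        try { entries.add(e); } finally { lock.writeLock().unlock(); }
    }

    int size() {
        lock.readLock().lock();
        try { return entries.size(); } finally { lock.readLock().unlock(); }
    }

    // Returns the removed entries; the caller cleans them up with no lock
    // held, so readers are never blocked behind slow cleanup work.
    List<E> removeMatching(Predicate<E> shouldRemove) {
        List<E> toUncache = new ArrayList<>();
        lock.writeLock().lock();
        try {
            Iterator<E> it = entries.iterator();
            while (it.hasNext()) {
                E e = it.next();
                if (shouldRemove.test(e)) {
                    toUncache.add(e);
                    it.remove();
                }
            }
        } finally {
            lock.writeLock().unlock(); // released before the slow part
        }
        return toUncache;
    }
}
```

The trade-off, as in the patch, is that cleanup runs outside the critical section, so a concurrent reader may briefly observe an entry as already removed while its underlying cache is still being cleared.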
[spark] branch master updated: [SPARK-25713][SQL] implementing copy for ColumnArray
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 11be22b [SPARK-25713][SQL] implementing copy for ColumnArray 11be22b is described below commit 11be22bb5e4784578c2f9ec1b80b30b2cf0ac3c7 Author: ayudovin AuthorDate: Thu Jan 24 10:35:44 2019 +0800 [SPARK-25713][SQL] implementing copy for ColumnArray ## What changes were proposed in this pull request? Implement copy() for ColumnarArray ## How was this patch tested? Updating test case to existing tests in ColumnVectorSuite Closes #23569 from ayudovin/copy-for-columnArray. Authored-by: ayudovin Signed-off-by: Wenchen Fan --- .../apache/spark/sql/vectorized/ColumnarArray.java | 22 +- .../execution/vectorized/ColumnVectorSuite.scala | 18 ++ 2 files changed, 39 insertions(+), 1 deletion(-) diff --git a/sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarArray.java b/sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarArray.java index dd2bd78..1471627 100644 --- a/sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarArray.java +++ b/sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarArray.java @@ -17,7 +17,9 @@ package org.apache.spark.sql.vectorized; import org.apache.spark.annotation.Evolving; +import org.apache.spark.sql.catalyst.expressions.UnsafeArrayData; import org.apache.spark.sql.catalyst.util.ArrayData; +import org.apache.spark.sql.catalyst.util.GenericArrayData; import org.apache.spark.sql.types.*; import org.apache.spark.unsafe.types.CalendarInterval; import org.apache.spark.unsafe.types.UTF8String; @@ -46,7 +48,25 @@ public final class ColumnarArray extends ArrayData { @Override public ArrayData copy() { -throw new UnsupportedOperationException(); +DataType dt = data.dataType(); + +if (dt instanceof BooleanType) { + return UnsafeArrayData.fromPrimitiveArray(toBooleanArray()); +} 
else if (dt instanceof ByteType) { + return UnsafeArrayData.fromPrimitiveArray(toByteArray()); +} else if (dt instanceof ShortType) { + return UnsafeArrayData.fromPrimitiveArray(toShortArray()); +} else if (dt instanceof IntegerType) { + return UnsafeArrayData.fromPrimitiveArray(toIntArray()); +} else if (dt instanceof LongType) { + return UnsafeArrayData.fromPrimitiveArray(toLongArray()); +} else if (dt instanceof FloatType) { + return UnsafeArrayData.fromPrimitiveArray(toFloatArray()); +} else if (dt instanceof DoubleType) { + return UnsafeArrayData.fromPrimitiveArray(toDoubleArray()); +} else { + return new GenericArrayData(toObjectArray(dt)); +} } @Override diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala index 2d1ad4b..866fcb1 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala @@ -58,9 +58,11 @@ class ColumnVectorSuite extends SparkFunSuite with BeforeAndAfterEach { } val array = new ColumnarArray(testVector, 0, 10) +val arrayCopy = array.copy() (0 until 10).foreach { i => assert(array.get(i, BooleanType) === (i % 2 == 0)) + assert(arrayCopy.get(i, BooleanType) === (i % 2 == 0)) } } @@ -70,9 +72,11 @@ class ColumnVectorSuite extends SparkFunSuite with BeforeAndAfterEach { } val array = new ColumnarArray(testVector, 0, 10) +val arrayCopy = array.copy() (0 until 10).foreach { i => assert(array.get(i, ByteType) === i.toByte) + assert(arrayCopy.get(i, ByteType) === i.toByte) } } @@ -82,9 +86,11 @@ class ColumnVectorSuite extends SparkFunSuite with BeforeAndAfterEach { } val array = new ColumnarArray(testVector, 0, 10) +val arrayCopy = array.copy() (0 until 10).foreach { i => assert(array.get(i, ShortType) === i.toShort) + assert(arrayCopy.get(i, ShortType) === i.toShort) } } 
@@ -94,9 +100,11 @@ class ColumnVectorSuite extends SparkFunSuite with BeforeAndAfterEach { } val array = new ColumnarArray(testVector, 0, 10) +val arrayCopy = array.copy() (0 until 10).foreach { i => assert(array.get(i, IntegerType) === i) + assert(arrayCopy.get(i, IntegerType) === i) } } @@ -106,9 +114,11 @@ class ColumnVectorSuite extends SparkFunSuite with BeforeAndAfterEach { } val array = new ColumnarArray(testVector, 0, 10) +val arrayCopy = array.copy() (0 until 10).foreach { i => assert(array.get(i, LongType) === i) +
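ColumnarArray is a view over a ColumnVector, and before this patch its `copy()` simply threw `UnsupportedOperationException`. The updated tests check that a copy keeps returning the original values. A minimal sketch (plain Java over an `int[]`, not Spark's classes) of the view-versus-snapshot distinction those tests exercise:

```java
// A view reads through to its backing storage, so later writes show up in
// it; copy() materializes a snapshot that is unaffected by such writes.
class IntArrayView {
    private final int[] data; // stands in for the backing ColumnVector
    private final int offset;
    private final int length;

    IntArrayView(int[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    int get(int i) { return data[offset + i]; }

    // Analogous to the primitive-type branches in the patch: for a primitive
    // element type the slice can be copied directly into a fresh array.
    int[] copy() {
        int[] out = new int[length];
        System.arraycopy(data, offset, out, 0, length);
        return out;
    }
}
```

This also mirrors the patch's dispatch structure: primitive element types get a direct primitive-array copy (UnsafeArrayData.fromPrimitiveArray in Spark), while everything else falls back to a generic object-array copy.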
[spark] branch branch-2.3 updated: [SPARK-26706][SQL][FOLLOWUP] Fix `illegalNumericPrecedence` for ByteType
This is an automated email from the ASF dual-hosted git repository. dbtsai pushed a commit to branch branch-2.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.3 by this push: new 23e35d4 [SPARK-26706][SQL][FOLLOWUP] Fix `illegalNumericPrecedence` for ByteType 23e35d4 is described below commit 23e35d4f6cd75d3cc2a8054f8872fdf61c1a8bcc Author: DB Tsai AuthorDate: Wed Jan 23 17:43:43 2019 -0800 [SPARK-26706][SQL][FOLLOWUP] Fix `illegalNumericPrecedence` for ByteType Removed automatically generated tests in ``` test("canSafeCast and mayTruncate must be consistent for numeric types") ``` since `canSafeCast` doesn't exist in the 2.3 branch. We have enough test coverage in the explicit casting tests. Authored-by: DB Tsai Signed-off-by: DB Tsai --- .../spark/sql/catalyst/expressions/CastSuite.scala | 24 -- 1 file changed, 24 deletions(-) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala index 777295c..079fc6b 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala @@ -934,28 +934,4 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper { assert(Cast.mayTruncate(DoubleType, ByteType)) assert(Cast.mayTruncate(DecimalType.IntDecimal, ByteType)) } - - test("canSafeCast and mayTruncate must be consistent for numeric types") { -import DataTypeTestUtils._ - -def isCastSafe(from: NumericType, to: NumericType): Boolean = (from, to) match { - case (_, dt: DecimalType) => dt.isWiderThan(from) - case (dt: DecimalType, _) => dt.isTighterThan(to) - case _ => numericPrecedence.indexOf(from) <= numericPrecedence.indexOf(to) -} - -numericTypes.foreach { from => - val (safeTargetTypes, unsafeTargetTypes) = numericTypes.partition(to => 
isCastSafe(from, to)) - - safeTargetTypes.foreach { to => -assert(Cast.canSafeCast(from, to), s"It should be possible to safely cast $from to $to") -assert(!Cast.mayTruncate(from, to), s"No truncation is expected when casting $from to $to") - } - - unsafeTargetTypes.foreach { to => -assert(!Cast.canSafeCast(from, to), s"It shouldn't be possible to safely cast $from to $to") -assert(Cast.mayTruncate(from, to), s"Truncation is expected when casting $from to $to") - } -} - } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r32108 - in /dev/spark/3.0.0-SNAPSHOT-2019_01_23_16_28-0df29bf-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Thu Jan 24 00:41:10 2019 New Revision: 32108 Log: Apache Spark 3.0.0-SNAPSHOT-2019_01_23_16_28-0df29bf docs [This commit notification would consist of 1781 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
[spark] branch branch-2.3 updated: [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType
This is an automated email from the ASF dual-hosted git repository. dbtsai pushed a commit to branch branch-2.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.3 by this push: new de3b5c4 [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType de3b5c4 is described below commit de3b5c459869e9ff0979e579828e822e6b01f0e3 Author: Anton Okolnychyi AuthorDate: Thu Jan 24 00:12:26 2019 + [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType This PR contains a minor change in `Cast$mayTruncate` that fixes its logic for bytes. Right now, `mayTruncate(ByteType, LongType)` returns `false` while `mayTruncate(ShortType, LongType)` returns `true`. Consequently, `spark.range(1, 3).as[Byte]` and `spark.range(1, 3).as[Short]` behave differently. Potentially, this bug can silently corrupt someone's data. ```scala // executes silently even though Long is converted into Byte spark.range(Long.MaxValue - 10, Long.MaxValue).as[Byte] .map(b => b - 1) .show() +-+ |value| +-+ | -12| | -11| | -10| | -9| | -8| | -7| | -6| | -5| | -4| | -3| +-+ // throws an AnalysisException: Cannot up cast `id` from bigint to smallint as it may truncate spark.range(Long.MaxValue - 10, Long.MaxValue).as[Short] .map(s => s - 1) .show() ``` This PR comes with a set of unit tests. Closes #23632 from aokolnychyi/cast-fix. 
Authored-by: Anton Okolnychyi Signed-off-by: DB Tsai --- .../spark/sql/catalyst/expressions/Cast.scala | 2 +- .../spark/sql/catalyst/expressions/CastSuite.scala | 36 ++ .../scala/org/apache/spark/sql/DatasetSuite.scala | 9 ++ 3 files changed, 46 insertions(+), 1 deletion(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala index 79b0516..5a156c5 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala @@ -130,7 +130,7 @@ object Cast { private def illegalNumericPrecedence(from: DataType, to: DataType): Boolean = { val fromPrecedence = TypeCoercion.numericPrecedence.indexOf(from) val toPrecedence = TypeCoercion.numericPrecedence.indexOf(to) -toPrecedence > 0 && fromPrecedence > toPrecedence +toPrecedence >= 0 && fromPrecedence > toPrecedence } def forceNullable(from: DataType, to: DataType): Boolean = (from, to) match { diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala index 5b25bdf..777295c 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala @@ -23,6 +23,7 @@ import java.util.{Calendar, Locale, TimeZone} import org.apache.spark.SparkFunSuite import org.apache.spark.sql.Row import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCoercion.numericPrecedence import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext import org.apache.spark.sql.catalyst.util.DateTimeTestUtils._ import org.apache.spark.sql.catalyst.util.DateTimeUtils @@ -922,4 +923,39 @@ class CastSuite extends SparkFunSuite with 
ExpressionEvalHelper { val ret6 = cast(Literal.create((1, Map(1 -> "a", 2 -> "b", 3 -> "c"))), StringType) checkEvaluation(ret6, "[1, [1 -> a, 2 -> b, 3 -> c]]") } + + test("SPARK-26706: Fix Cast.mayTruncate for bytes") { +assert(!Cast.mayTruncate(ByteType, ByteType)) +assert(!Cast.mayTruncate(DecimalType.ByteDecimal, ByteType)) +assert(Cast.mayTruncate(ShortType, ByteType)) +assert(Cast.mayTruncate(IntegerType, ByteType)) +assert(Cast.mayTruncate(LongType, ByteType)) +assert(Cast.mayTruncate(FloatType, ByteType)) +assert(Cast.mayTruncate(DoubleType, ByteType)) +assert(Cast.mayTruncate(DecimalType.IntDecimal, ByteType)) + } + + test("canSafeCast and mayTruncate must be consistent for numeric types") { +import DataTypeTestUtils._ + +def isCastSafe(from: NumericType, to: NumericType): Boolean = (from, to) match { + case (_, dt: DecimalType) => dt.isWiderThan(from) + case (dt: DecimalType, _) => dt.isTighterThan(to) + case _ => numericPrecedence.indexOf(from) <= numericPrecedence.indexOf(to) +} + +numericTypes.foreach { from => + val (safeTargetTypes, unsafeTargetTypes) = numericTypes.partition(to => isCastSafe(from, to)) + + safeTargetTypes.foreach { to => +assert(Cast.canSafeCast(from, to), s"It
[spark] branch branch-2.4 updated: [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType
This is an automated email from the ASF dual-hosted git repository. dbtsai pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 921c22b [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType 921c22b is described below commit 921c22b1fffc4844aa05c201ba15986be34a3782 Author: Anton Okolnychyi AuthorDate: Thu Jan 24 00:12:26 2019 + [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType This PR contains a minor change in `Cast$mayTruncate` that fixes its logic for bytes. Right now, `mayTruncate(ByteType, LongType)` returns `false` while `mayTruncate(ShortType, LongType)` returns `true`. Consequently, `spark.range(1, 3).as[Byte]` and `spark.range(1, 3).as[Short]` behave differently. Potentially, this bug can silently corrupt someone's data. ```scala // executes silently even though Long is converted into Byte spark.range(Long.MaxValue - 10, Long.MaxValue).as[Byte] .map(b => b - 1) .show() +-+ |value| +-+ | -12| | -11| | -10| | -9| | -8| | -7| | -6| | -5| | -4| | -3| +-+ // throws an AnalysisException: Cannot up cast `id` from bigint to smallint as it may truncate spark.range(Long.MaxValue - 10, Long.MaxValue).as[Short] .map(s => s - 1) .show() ``` This PR comes with a set of unit tests. Closes #23632 from aokolnychyi/cast-fix. 
Authored-by: Anton Okolnychyi Signed-off-by: DB Tsai --- .../spark/sql/catalyst/expressions/Cast.scala | 2 +- .../spark/sql/catalyst/expressions/CastSuite.scala | 36 ++ .../scala/org/apache/spark/sql/DatasetSuite.scala | 9 ++ 3 files changed, 46 insertions(+), 1 deletion(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala index ee463bf..ac02dac 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala @@ -131,7 +131,7 @@ object Cast { private def illegalNumericPrecedence(from: DataType, to: DataType): Boolean = { val fromPrecedence = TypeCoercion.numericPrecedence.indexOf(from) val toPrecedence = TypeCoercion.numericPrecedence.indexOf(to) -toPrecedence > 0 && fromPrecedence > toPrecedence +toPrecedence >= 0 && fromPrecedence > toPrecedence } /** diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala index d9f32c0..b1531ba 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala @@ -23,6 +23,7 @@ import java.util.{Calendar, Locale, TimeZone} import org.apache.spark.SparkFunSuite import org.apache.spark.sql.Row import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.catalyst.analysis.TypeCoercion.numericPrecedence import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext import org.apache.spark.sql.catalyst.util.DateTimeTestUtils._ import org.apache.spark.sql.catalyst.util.DateTimeUtils @@ -953,4 +954,39 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper { val ret6 = cast(Literal.create((1, Map(1 -> "a", 2 -> 
"b", 3 -> "c"))), StringType) checkEvaluation(ret6, "[1, [1 -> a, 2 -> b, 3 -> c]]") } + + test("SPARK-26706: Fix Cast.mayTruncate for bytes") { +assert(!Cast.mayTruncate(ByteType, ByteType)) +assert(!Cast.mayTruncate(DecimalType.ByteDecimal, ByteType)) +assert(Cast.mayTruncate(ShortType, ByteType)) +assert(Cast.mayTruncate(IntegerType, ByteType)) +assert(Cast.mayTruncate(LongType, ByteType)) +assert(Cast.mayTruncate(FloatType, ByteType)) +assert(Cast.mayTruncate(DoubleType, ByteType)) +assert(Cast.mayTruncate(DecimalType.IntDecimal, ByteType)) + } + + test("canSafeCast and mayTruncate must be consistent for numeric types") { +import DataTypeTestUtils._ + +def isCastSafe(from: NumericType, to: NumericType): Boolean = (from, to) match { + case (_, dt: DecimalType) => dt.isWiderThan(from) + case (dt: DecimalType, _) => dt.isTighterThan(to) + case _ => numericPrecedence.indexOf(from) <= numericPrecedence.indexOf(to) +} + +numericTypes.foreach { from => + val (safeTargetTypes, unsafeTargetTypes) = numericTypes.partition(to => isCastSafe(from, to)) + + safeTargetTypes.foreach { to => +assert(Cast.canSafeCast(from, to), s"It should be possible to safely cast $from to $to") +
[spark] branch master updated: [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType
This is an automated email from the ASF dual-hosted git repository. dbtsai pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 0df29bf [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType 0df29bf is described below commit 0df29bfbdc6ec7f3ba7d7b54e024059b3894da16 Author: Anton Okolnychyi AuthorDate: Thu Jan 24 00:12:26 2019 + [SPARK-26706][SQL] Fix `illegalNumericPrecedence` for ByteType ## What changes were proposed in this pull request? This PR contains a minor change in `Cast$mayTruncate` that fixes its logic for bytes. Right now, `mayTruncate(ByteType, LongType)` returns `false` while `mayTruncate(ShortType, LongType)` returns `true`. Consequently, `spark.range(1, 3).as[Byte]` and `spark.range(1, 3).as[Short]` behave differently. Potentially, this bug can silently corrupt someone's data. ```scala // executes silently even though Long is converted into Byte spark.range(Long.MaxValue - 10, Long.MaxValue).as[Byte] .map(b => b - 1) .show() +-+ |value| +-+ | -12| | -11| | -10| | -9| | -8| | -7| | -6| | -5| | -4| | -3| +-+ // throws an AnalysisException: Cannot up cast `id` from bigint to smallint as it may truncate spark.range(Long.MaxValue - 10, Long.MaxValue).as[Short] .map(s => s - 1) .show() ``` ## How was this patch tested? This PR comes with a set of unit tests. Closes #23632 from aokolnychyi/cast-fix. 
Authored-by: Anton Okolnychyi
Signed-off-by: DB Tsai
---
 .../spark/sql/catalyst/expressions/Cast.scala      |  2 +-
 .../spark/sql/catalyst/expressions/CastSuite.scala | 36 ++
 .../scala/org/apache/spark/sql/DatasetSuite.scala  |  9 ++
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
index ff6a68b..a6926d8 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
@@ -131,7 +131,7 @@ object Cast {
   private def illegalNumericPrecedence(from: DataType, to: DataType): Boolean = {
     val fromPrecedence = TypeCoercion.numericPrecedence.indexOf(from)
     val toPrecedence = TypeCoercion.numericPrecedence.indexOf(to)
-    toPrecedence > 0 && fromPrecedence > toPrecedence
+    toPrecedence >= 0 && fromPrecedence > toPrecedence
   }

   /**
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
index 94dee7e..11956e1 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
@@ -25,6 +25,7 @@ import scala.util.Random
 import org.apache.spark.SparkFunSuite
 import org.apache.spark.sql.Row
 import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.analysis.TypeCoercion.numericPrecedence
 import org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext
 import org.apache.spark.sql.catalyst.util.DateTimeTestUtils._
 import org.apache.spark.sql.catalyst.util.DateTimeUtils
@@ -955,4 +956,39 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     val ret6 = cast(Literal.create((1, Map(1 -> "a", 2 -> "b", 3 -> "c"))), StringType)
     checkEvaluation(ret6, "[1, [1 -> a, 2 -> b, 3 -> c]]")
   }
+
+  test("SPARK-26706: Fix Cast.mayTruncate for bytes") {
+    assert(!Cast.mayTruncate(ByteType, ByteType))
+    assert(!Cast.mayTruncate(DecimalType.ByteDecimal, ByteType))
+    assert(Cast.mayTruncate(ShortType, ByteType))
+    assert(Cast.mayTruncate(IntegerType, ByteType))
+    assert(Cast.mayTruncate(LongType, ByteType))
+    assert(Cast.mayTruncate(FloatType, ByteType))
+    assert(Cast.mayTruncate(DoubleType, ByteType))
+    assert(Cast.mayTruncate(DecimalType.IntDecimal, ByteType))
+  }
+
+  test("canSafeCast and mayTruncate must be consistent for numeric types") {
+    import DataTypeTestUtils._
+
+    def isCastSafe(from: NumericType, to: NumericType): Boolean = (from, to) match {
+      case (_, dt: DecimalType) => dt.isWiderThan(from)
+      case (dt: DecimalType, _) => dt.isTighterThan(to)
+      case _ => numericPrecedence.indexOf(from) <= numericPrecedence.indexOf(to)
+    }
+
+    numericTypes.foreach { from =>
+      val (safeTargetTypes, unsafeTargetTypes) = numericTypes.partition(to => isCastSafe(from, to))
+
+      safeTargetTypes.foreach { to =>
+        assert(Cast.canSafeCast(from, to), s"It should be
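The one-character fix in the diff above is easiest to see in isolation. A self-contained sketch (plain strings stand in for Spark's `DataType` objects; names are hypothetical): with the old `> 0` test, the type at index 0 of the precedence list, `ByteType`, was never treated as a narrowing target.

```scala
// Standalone sketch of the fixed precedence check. Spark's real version
// operates on DataType values via TypeCoercion.numericPrecedence; here
// plain strings stand in, ordered narrowest to widest.
object PrecedenceSketch {
  val numericPrecedence: Seq[String] =
    Seq("byte", "short", "int", "long", "float", "double")

  def illegalNumericPrecedence(from: String, to: String): Boolean = {
    val fromPrecedence = numericPrecedence.indexOf(from)
    val toPrecedence = numericPrecedence.indexOf(to)
    // The fix: `>= 0` instead of `> 0`, so index 0 ("byte") is included.
    toPrecedence >= 0 && fromPrecedence > toPrecedence
  }
}
```

With `> 0`, `illegalNumericPrecedence("long", "byte")` would have returned `false` even though the cast clearly truncates; with `>= 0` it returns `true`, matching the behavior already in place for `short` and wider targets.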
svn commit: r32105 - in /dev/spark/3.0.0-SNAPSHOT-2019_01_23_07_25-0446363-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Wed Jan 23 15:37:51 2019 New Revision: 32105 Log: Apache Spark 3.0.0-SNAPSHOT-2019_01_23_07_25-0446363 docs [This commit notification would consist of 1781 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-26660] Add warning logs when broadcasting large task binary
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 0446363  [SPARK-26660] Add warning logs when broadcasting large task binary

0446363 is described below

commit 0446363ef466922a42de019ca14fada62959c1f3
Author: Liupengcheng
AuthorDate: Wed Jan 23 08:51:39 2019 -0600

[SPARK-26660] Add warning logs when broadcasting large task binary

## What changes were proposed in this pull request?

Some ML libraries generate large models that end up referenced in task closures, so the driver broadcasts a large task binary that executors may be unable to deserialize, resulting in OOM failures (for instance, when executor memory is insufficient). The problem is not limited to apps using ML libraries: any user-supplied closure or function that captures large data can hit it.

To make memory problems caused by a large taskBinary broadcast easier to debug, this PR adds a warning log on the driver side when broadcasting a large task binary. It also includes some minor log improvements on the broadcast-read path.

## How was this patch tested?

N/A: log changes only.

Closes #23580 from liupc/Add-warning-logs-for-large-taskBinary-size.
Authored-by: Liupengcheng
Signed-off-by: Sean Owen
---
 core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala | 4 +++-
 core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala     | 4 ++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala b/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala
index 6410866..7680587 100644
--- a/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala
+++ b/core/src/main/scala/org/apache/spark/broadcast/TorrentBroadcast.scala
@@ -236,7 +236,9 @@ private[spark] class TorrentBroadcast[T: ClassTag](obj: T, id: Long)
             throw new SparkException(s"Failed to get locally stored broadcast data: $broadcastId")
           }
         case None =>
-          logInfo("Started reading broadcast variable " + id)
+          val estimatedTotalSize = Utils.bytesToString(numBlocks * blockSize)
+          logInfo(s"Started reading broadcast variable $id with $numBlocks pieces " +
+            s"(estimated total size $estimatedTotalSize)")
           val startTimeMs = System.currentTimeMillis()
           val blocks = readBlocks()
           logInfo("Reading broadcast variable " + id + " took" + Utils.getUsedTimeMs(startTimeMs))
diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index f6ade18..ecb8ac0 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -1162,6 +1162,10 @@ private[spark] class DAGScheduler(
         partitions = stage.rdd.partitions
       }

+      if (taskBinaryBytes.length * 1000 > TaskSetManager.TASK_SIZE_TO_WARN_KB) {
+        logWarning(s"Broadcasting large task binary with size " +
+          s"${Utils.bytesToString(taskBinaryBytes.length)}")
+      }
       taskBinary = sc.broadcast(taskBinaryBytes)
     } catch {
       // In the case of a failure during serialization, abort the stage.
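The driver-side check this commit adds is easy to model standalone. In the sketch below, the threshold constant is a local stand-in for `TaskSetManager.TASK_SIZE_TO_WARN_KB` (100 at the time, i.e. warn above roughly 100 KB), the logger is a stand-in for Spark's `Logging` trait, and the comparison is written in the dimensionally explicit form, bytes against KB-threshold times 1000.

```scala
// Sketch (not Spark's actual classes) of a "warn when the serialized task
// binary is large" check, as added to DAGScheduler by this commit.
object TaskBinarySizeCheck {
  // Stand-in for TaskSetManager.TASK_SIZE_TO_WARN_KB.
  val TaskSizeToWarnKb: Int = 100

  def shouldWarn(taskBinaryBytes: Array[Byte]): Boolean =
    taskBinaryBytes.length > TaskSizeToWarnKb * 1000

  def checkAndWarn(taskBinaryBytes: Array[Byte]): Unit =
    if (shouldWarn(taskBinaryBytes)) {
      // Stand-in for logWarning(...) on the driver.
      println(s"WARN Broadcasting large task binary with size ${taskBinaryBytes.length} B")
    }
}
```

The point of logging on the driver, before `sc.broadcast(taskBinaryBytes)`, is that the warning appears even when the executor later fails to deserialize the blocks, which is exactly the failure mode the commit message describes.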
[spark] branch master updated: [SPARK-26681][SQL] Support Ammonite inner-class scopes.
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new d008e23  [SPARK-26681][SQL] Support Ammonite inner-class scopes.

d008e23 is described below

commit d008e23ab50637ebe54eebe8784f5410f15486cc
Author: Ryan Blue
AuthorDate: Wed Jan 23 08:50:03 2019 -0600

[SPARK-26681][SQL] Support Ammonite inner-class scopes.

## What changes were proposed in this pull request?

This adds a new pattern to recognize Ammonite REPL classes and return the correct scope.

## How was this patch tested?

Manually tested with Spark in an Ammonite session.

Closes #23607 from rdblue/SPARK-26681-support-ammonite-scopes.

Authored-by: Ryan Blue
Signed-off-by: Sean Owen
---
 .../org/apache/spark/sql/catalyst/encoders/OuterScopes.scala | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala
index a1f0312..665b2cd 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala
@@ -53,6 +53,12 @@ object OuterScopes {
     val outer = outerScopes.get(outerClassName)
     if (outer == null) {
       outerClassName match {
+        case AmmoniteREPLClass(cellClassName) =>
+          () => {
+            val objClass = Utils.classForName(cellClassName)
+            val objInstance = objClass.getField("MODULE$").get(null)
+            objClass.getMethod("instance").invoke(objInstance)
+          }
         // If the outer class is generated by REPL, users don't need to register it as it has
         // only one instance and there is a way to retrieve it: get the `$read` object, call the
         // `INSTANCE()` method to get the single instance of class `$read`. Then call `$iw()`
@@ -95,4 +101,8 @@ object OuterScopes {

   // The format of REPL generated wrapper class's name, e.g. `$line12.$read$$iw$$iw`
   private[this] val REPLClass = """^(\$line(?:\d+)\.\$read)(?:\$\$iw)+$""".r
+
+  // The format of ammonite REPL generated wrapper class's name,
+  // e.g. `ammonite.$sess.cmd8$Helper$Foo` -> `ammonite.$sess.cmd8.instance.Foo`
+  private[this] val AmmoniteREPLClass = """^(ammonite\.\$sess\.cmd(?:\d+)\$).*""".r
 }
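The regex in the diff above can be exercised on its own. In the sketch below, the pattern is copied from the commit; the wrapper object and helper around it are ours, showing how pattern matching against the regex extracts the per-command wrapper prefix from an Ammonite-generated class name (whereas a Scala-REPL `$line12.$read$$iw$$iw` name does not match).

```scala
// Sketch of the AmmoniteREPLClass pattern added by this commit. The regex is
// verbatim from the diff; AmmoniteScopeSketch/wrapperPrefix are illustrative.
object AmmoniteScopeSketch {
  private val AmmoniteREPLClass = """^(ammonite\.\$sess\.cmd(?:\d+)\$).*""".r

  // Returns the cmdN$ wrapper prefix when the class was generated by
  // Ammonite, None otherwise. Regex patterns in Scala match the full string.
  def wrapperPrefix(outerClassName: String): Option[String] =
    outerClassName match {
      case AmmoniteREPLClass(prefix) => Some(prefix)
      case _                         => None
    }
}
```

For example, `wrapperPrefix("ammonite.$sess.cmd8$Helper$Foo")` yields `Some("ammonite.$sess.cmd8$")`, the prefix from which the commit's new `OuterScopes` branch derives the cell class whose `instance` holds the outer object.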
[spark] branch master updated: [SPARK-26653][SQL] Use Proleptic Gregorian calendar in parsing JDBC lower/upper bounds
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 46d5bb9  [SPARK-26653][SQL] Use Proleptic Gregorian calendar in parsing JDBC lower/upper bounds

46d5bb9 is described below

commit 46d5bb9a0fa4f44ea857e2cb15bf15acd773b839
Author: Maxim Gekk
AuthorDate: Wed Jan 23 20:23:17 2019 +0800

[SPARK-26653][SQL] Use Proleptic Gregorian calendar in parsing JDBC lower/upper bounds

## What changes were proposed in this pull request?

In the PR, I propose using the `stringToDate` and `stringToTimestamp` methods to parse JDBC lower/upper bounds of the partition column if it has `DateType` or `TimestampType`. Since those methods were ported to the Proleptic Gregorian calendar by #23512, the PR switches parsing of JDBC bounds of the partition column to that calendar as well.

## How was this patch tested?

This was tested by `JDBCSuite`.

Closes #23597 from MaxGekk/jdbc-parse-timestamp-bounds.

Lead-authored-by: Maxim Gekk
Co-authored-by: Maxim Gekk
Signed-off-by: Wenchen Fan
---
 docs/sql-migration-guide-upgrade.md                |  2 ++
 .../execution/datasources/jdbc/JDBCRelation.scala  | 27 -
 .../org/apache/spark/sql/jdbc/JDBCSuite.scala      | 34 +-
 3 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/docs/sql-migration-guide-upgrade.md b/docs/sql-migration-guide-upgrade.md
index d442087..98fc9fa 100644
--- a/docs/sql-migration-guide-upgrade.md
+++ b/docs/sql-migration-guide-upgrade.md
@@ -93,6 +93,8 @@ displayTitle: Spark SQL Upgrading Guide

   - Since Spark 3.0, the `weekofyear`, `weekday` and `dayofweek` functions use java.time API for calculation week number of year and day number of week based on Proleptic Gregorian calendar. In Spark version 2.4 and earlier, the hybrid calendar (Julian + Gregorian) is used for the same purpose. Results of the functions returned by Spark 3.0 and previous versions can be different for dates before October 15, 1582 (Gregorian).

+  - Since Spark 3.0, the JDBC options `lowerBound` and `upperBound` are converted to TimestampType/DateType values in the same way as casting strings to TimestampType/DateType values. The conversion is based on Proleptic Gregorian calendar, and time zone defined by the SQL config `spark.sql.session.timeZone`. In Spark version 2.4 and earlier, the conversion is based on the hybrid calendar (Julian + Gregorian) and on default system time zone.
+
 ## Upgrading From Spark SQL 2.3 to 2.4

   - In Spark version 2.3 and earlier, the second parameter to array_contains function is implicitly promoted to the element type of first array type parameter. This type promotion can be lossy and may cause `array_contains` function to return wrong result. This problem has been addressed in 2.4 by employing a safer type promotion mechanism. This can cause some change in behavior and are illustrated in the table below.
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
index 13ed105..c0f78b5 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
@@ -17,8 +17,6 @@
 package org.apache.spark.sql.execution.datasources.jdbc

-import java.sql.{Date, Timestamp}
-
 import scala.collection.mutable.ArrayBuffer

 import org.apache.spark.Partition
@@ -27,10 +25,12 @@
 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.{AnalysisException, DataFrame, Row, SaveMode, SparkSession, SQLContext}
 import org.apache.spark.sql.catalyst.analysis._
 import org.apache.spark.sql.catalyst.util.{DateFormatter, DateTimeUtils, TimestampFormatter}
+import org.apache.spark.sql.catalyst.util.DateTimeUtils.{getTimeZone, stringToDate, stringToTimestamp}
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.jdbc.JdbcDialects
 import org.apache.spark.sql.sources._
 import org.apache.spark.sql.types.{DataType, DateType, NumericType, StructType, TimestampType}
+import org.apache.spark.unsafe.types.UTF8String

 /**
  * Instructions on how to partition the table among workers.
@@ -85,8 +85,8 @@ private[sql] object JDBCRelation extends Logging {
     val (column, columnType) = verifyAndGetNormalizedPartitionColumn(
       schema, partitionColumn.get, resolver, jdbcOptions)

-    val lowerBoundValue = toInternalBoundValue(lowerBound.get, columnType)
-    val upperBoundValue = toInternalBoundValue(upperBound.get, columnType)
+    val lowerBoundValue = toInternalBoundValue(lowerBound.get, columnType, timeZoneId)
+    val
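The calendar switch above can be illustrated with `java.time`, whose ISO chronology is the Proleptic Gregorian calendar that Spark 3.0's `stringToDate`/`stringToTimestamp` adopt. The sketch below is not Spark's code: the object and method names are ours, `sessionTz` stands in for `spark.sql.session.timeZone`, and the return values mirror the internal representations of `DateType` (days since epoch) and `TimestampType` (microseconds since epoch).

```scala
// Sketch of parsing JDBC partition-column bounds on the Proleptic Gregorian
// calendar via java.time (illustrative; not Spark's stringToDate/stringToTimestamp).
import java.time.{LocalDate, LocalDateTime, ZoneId}

object BoundParseSketch {
  // DateType bound -> days since 1970-01-01.
  def parseDateBound(value: String): Int =
    LocalDate.parse(value).toEpochDay.toInt

  // TimestampType bound -> microseconds since epoch, interpreted in the
  // given zone (stand-in for the session time zone).
  def parseTimestampBound(value: String, sessionTz: ZoneId): Long = {
    val instant = LocalDateTime.parse(value.replace(' ', 'T')).atZone(sessionTz).toInstant
    instant.getEpochSecond * 1000000L + instant.getNano / 1000L
  }
}
```

Because the proleptic calendar has no Julian-to-Gregorian cutover, dates such as `1582-10-05` parse normally and day arithmetic is continuous across October 1582, which is where results diverge from the hybrid-calendar behavior of Spark 2.4 and earlier.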
svn commit: r32102 - in /dev/spark/3.0.0-SNAPSHOT-2019_01_23_02_49-1ed1b4d-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell Date: Wed Jan 23 11:01:13 2019 New Revision: 32102 Log: Apache Spark 3.0.0-SNAPSHOT-2019_01_23_02_49-1ed1b4d docs [This commit notification would consist of 1781 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]