[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99645/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23213 Merged build finished. Test PASSed.
[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99646/ Test PASSed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Merged build finished. Test FAILed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99645/testReport)** for PR 23088 at commit [`f8cfb54`](https://github.com/apache/spark/commit/f8cfb544a805ecb5d1056f703dde4e7705ad1810). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23213 **[Test build #99646 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99646/testReport)** for PR 23213 at commit [`2ced0ca`](https://github.com/apache/spark/commit/2ced0cab0c16ee7a2400035a5a7794033eae3ed9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23200: [SPARK-26033][SPARK-26034][PYTHON][FOLLOW-UP] Small clea...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23200 Thank you @srowen and @BryanCutler
[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23212 Merged build finished. Test PASSed.
[GitHub] spark pull request #23196: [SPARK-26243][SQL] Use java.time API for parsing ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23196#discussion_r238495344 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -33,6 +33,8 @@ displayTitle: Spark SQL Upgrading Guide - Spark applications which are built with Spark version 2.4 and prior, and call methods of `UserDefinedFunction`, need to be re-compiled with Spark 3.0, as they are not binary compatible with Spark 3.0. + - Since Spark 3.0, JSON datasource uses java.time API for parsing and generating JSON content. New formatting implementation supports date/timestamp patterns conformed to ISO 8601. To switch back to the implementation used in Spark 2.4 and earlier, set `spark.sql.legacy.timeParser.enabled` to `true`. --- End diff -- I think we can add an example that shows the difference. IIRC, the two implementations differ in whether they require an exact match or allow a non-exact match.
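A minimal sketch of the exact versus non-exact match difference recalled above, using only JDK classes and no Spark code. It assumes that `java.text.SimpleDateFormat` is a fair stand-in for the lenient pre-3.0 parser; the object name `TimeParserSketch` and the sample inputs are made up for illustration.

```scala
import java.text.SimpleDateFormat
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical helper, not part of Spark.
object TimeParserSketch {
  val pattern = "yyyy-MM-dd"

  // Legacy-style parsing: parse(String) stops once the pattern is satisfied
  // and silently ignores any trailing text (non-exact match).
  def legacyAccepts(s: String): Boolean =
    Try(new SimpleDateFormat(pattern).parse(s)).isSuccess

  // java.time parsing: the whole input must be consumed by the pattern,
  // otherwise a DateTimeParseException is thrown (exact match).
  def javaTimeAccepts(s: String): Boolean =
    Try(LocalDate.parse(s, DateTimeFormatter.ofPattern(pattern))).isSuccess
}
```

With the input `2018-12-03T10:15:30`, the legacy-style parser accepts the string and the java.time parser rejects it, while both accept `2018-12-03`.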
[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23212 LGTM
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238515227 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala --- @@ -166,29 +166,40 @@ object UnsafeProjection } } -/** - * A projection that could turn UnsafeRow into GenericInternalRow --- End diff -- Can we keep this comment?
[GitHub] spark issue #22911: [SPARK-25815][k8s] Support kerberos in client mode, keyt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22911 Merged build finished. Test FAILed.
[GitHub] spark issue #22911: [SPARK-25815][k8s] Support kerberos in client mode, keyt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22911 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99630/ Test FAILed.
[GitHub] spark issue #22911: [SPARK-25815][k8s] Support kerberos in client mode, keyt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22911 **[Test build #99630 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99630/testReport)** for PR 22911 at commit [`05333ea`](https://github.com/apache/spark/commit/05333ea11bc50ade8f3ba8e1d2116bad9d3d7c89). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23174 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99629/ Test PASSed.
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23174 Merged build finished. Test PASSed.
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23169 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99631/ Test FAILed.
[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238524763 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -145,9 +145,14 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] { assert(requiredChildDistributions.length == children.length) assert(requiredChildOrderings.length == children.length) +val aliasMap = AttributeMap[Expression](children.flatMap(_.expressions.collect { + case a: Alias => (a.toAttribute, a) +})) + // Ensure that the operator's children satisfy their output distribution requirements. children = children.zip(requiredChildDistributions).map { - case (child, distribution) if child.outputPartitioning.satisfies(distribution) => + case (child, distribution) if child.outputPartitioning.satisfies( + distribution.mapExpressions(replaceAlias(_, aliasMap))) => --- End diff -- As an example, `ProjectExec.outputPartitioning` can be wrong, as it doesn't consider the aliases in the project list. I think it's clearer to adjust the `outputPartitioning` there, instead of dealing with it in a rule. What if more rules need to check `outputPartitioning` and `requiredChildDistribution`?
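A toy sketch of the alternative suggested above: let the projection itself report a partitioning that accounts for aliases, so no rule has to rewrite distributions. The types below (`Alias`, `HashPartitioning`) are simplified, hypothetical stand-ins keyed by column name, not the Catalyst classes, and `projectedPartitioning` is an assumed helper name.

```scala
// Hypothetical model, not Spark's Catalyst API.
object AliasPartitioningSketch {
  // `child AS name` in a project list.
  final case class Alias(child: String, name: String)
  // Hash partitioning described by the column names it hashes on.
  final case class HashPartitioning(columns: Seq[String])

  // Partitioning that a projection could report for its output:
  // pass-through columns keep their name, aliased columns take the alias name,
  // and if any partitioning column is dropped the partitioning is unknown (None).
  def projectedPartitioning(
      child: HashPartitioning,
      passThrough: Set[String],
      aliases: Seq[Alias]): Option[HashPartitioning] = {
    val rename = aliases.map(a => a.child -> a.name).toMap
    val mapped = child.columns.map { c =>
      if (passThrough.contains(c)) Some(c) else rename.get(c)
    }
    if (mapped.forall(_.isDefined)) Some(HashPartitioning(mapped.flatten)) else None
  }
}
```

In this model, a child hashed on `a` and projected as `a AS x` reports a partitioning on `x`, so a requirement expressed on `x` can be checked directly against the projection's output.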
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23194 Merged build finished. Test PASSed.
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23194 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5705/ Test PASSed.
[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22899 To be honest, we might still need to revisit it, since it is still very confusing for developers which one they should use: top-down or bottom-up? The current use case for top-down is mainly resolving higher-order functions. This PR at least improves the description. We might need to combine them in the future.
[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22899 LGTM Merged to master.
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23169 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99648/ Test FAILed.
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23169 **[Test build #99648 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99648/testReport)** for PR 23169 at commit [`f0f75c2`](https://github.com/apache/spark/commit/f0f75c25b95010d63ecdf83bb9f280687361d154). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22857: [SPARK-25860][SQL] Replace Literal(null, _) with ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22857#discussion_r238489445 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -31,14 +31,14 @@ import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd} import org.apache.spark.sql.catalyst.TableIdentifier import org.apache.spark.sql.catalyst.expressions.Uuid import org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation -import org.apache.spark.sql.catalyst.plans.logical.{Filter, OneRowRelation, Union} +import org.apache.spark.sql.catalyst.plans.logical.{OneRowRelation, Union} --- End diff -- Yea, it's also an unrelated import cleanup. That should be discouraged because it can make backporting and reverting harder, and such changes sometimes confuse readers.
[GitHub] spark pull request #23088: [SPARK-26119][CORE][WEBUI]Task summary table shou...
Github user shahidki31 commented on a diff in the pull request: https://github.com/apache/spark/pull/23088#discussion_r238492653 --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusStoreSuite.scala --- @@ -77,6 +77,34 @@ class AppStatusStoreSuite extends SparkFunSuite { assert(store.count(classOf[CachedQuantile]) === 2) } + test("only successfull task have taskSummary") { +val store = new InMemoryStore() +(0 until 5).foreach { i => store.write(newTaskData(i, "FAILED")) } --- End diff -- Done.
[GitHub] spark pull request #23088: [SPARK-26119][CORE][WEBUI]Task summary table shou...
Github user shahidki31 commented on a diff in the pull request: https://github.com/apache/spark/pull/23088#discussion_r238492582 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala --- @@ -148,11 +148,20 @@ private[spark] class AppStatusStore( // cheaper for disk stores (avoids deserialization). val count = { Utils.tryWithResource( -store.view(classOf[TaskDataWrapper]) - .parent(stageKey) - .index(TaskIndexNames.EXEC_RUN_TIME) - .first(0L) - .closeableIterator() +if (store.isInstanceOf[LevelDB]) { --- End diff -- Done. Thanks
[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23212 **[Test build #99640 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99640/testReport)** for PR 23212 at commit [`ca09bb8`](https://github.com/apache/spark/commit/ca09bb82d49d29a5d6f088d2250e980bab64a8b7).
[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23080 It's fixed in the upcoming Spark release. Spark 2.4 does not support it.
[GitHub] spark issue #23092: [SPARK-26094][CORE][STREAMING] createNonEcFile creates p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23092 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99625/ Test PASSed.
[GitHub] spark issue #23092: [SPARK-26094][CORE][STREAMING] createNonEcFile creates p...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23092 Merged build finished. Test PASSed.
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238520267 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala --- @@ -535,4 +535,100 @@ class UnsafeRowConverterSuite extends SparkFunSuite with Matchers with PlanTestB assert(unsafeRow.getSizeInBytes == 8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + roundedSize(field2.getSizeInBytes)) } + + testBothCodegenAndInterpreted("SPARK-25374 converts back into safe representation") { +def convertBackToInternalRow(inputRow: InternalRow, fields: Array[DataType]): InternalRow = { + val unsafeProj = UnsafeProjection.create(fields) + val unsafeRow = unsafeProj(inputRow) + val safeProj = SafeProjection.create(fields) + safeProj(unsafeRow) +} + +// Simple tests +val inputRow = InternalRow.fromSeq(Seq( + false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, UTF8String.fromString("test"), + Decimal(255), CalendarInterval.fromString("interval 1 day"), Array[Byte](1, 2) +)) +val fields1 = Array( + BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType, + DoubleType, StringType, DecimalType.defaultConcreteType, CalendarIntervalType, + BinaryType) + +assert(convertBackToInternalRow(inputRow, fields1) === inputRow) + +// Array tests +val arrayRow = InternalRow.fromSeq(Seq( + createArray(1, 2, 3), + createArray( +createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*), +createArray(Seq("d").map(UTF8String.fromString): _*)) +)) +val fields2 = Array[DataType]( + ArrayType(IntegerType), + ArrayType(ArrayType(StringType))) + +assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow) + +// Struct tests +val structRow = InternalRow.fromSeq(Seq( + InternalRow.fromSeq(Seq[Any](1, 4.0)), + InternalRow.fromSeq(Seq( +UTF8String.fromString("test"), +InternalRow.fromSeq(Seq( + 1, + createArray(Seq("2", "3").map(UTF8String.fromString): _*) +)) + )) +)) +val fields3 = Array[DataType]( + StructType( 
+StructField("c0", IntegerType) :: +StructField("c1", DoubleType) :: +Nil), + StructType( +StructField("c2", StringType) :: +StructField("c3", StructType( + StructField("c4", IntegerType) :: + StructField("c5", ArrayType(StringType)) :: + Nil)) :: +Nil)) + +assert(convertBackToInternalRow(structRow, fields3) === structRow) + +// Map tests +val mapRow = InternalRow.fromSeq(Seq( + createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2), + createMap( +createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*), +createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*) + )( +createMap(Seq("k3", "k4").map(UTF8String.fromString): _*)(3.toShort, 4.toShort), +createMap(Seq("k5", "k6").map(UTF8String.fromString): _*)(5.toShort, 6.toShort) + ))) +val fields4 = Array[DataType]( + MapType(StringType, IntegerType), + MapType(MapType(IntegerType, StringType), MapType(StringType, ShortType))) + +// Since `ArrayBasedMapData` does not override `equals` and `hashCode`, +// we need to take care of it to compare rows. +def toComparable(d: Any): Any = d match { --- End diff -- This does nothing, does it?
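A small sketch of the problem the quoted code comment describes: a map class that does not override `equals` compares by reference, so rows containing it must be converted to something comparable first. `ArrayBackedMap` is a hypothetical stand-in, not Spark's `ArrayBasedMapData`, and this `toComparable` is an illustration rather than the helper from the PR.

```scala
// Hypothetical stand-in types, not Spark classes.
object MapCompareSketch {
  // A map stored as parallel key/value arrays; it inherits reference equality.
  final class ArrayBackedMap(val keys: Array[Any], val values: Array[Any])

  // Recursively replace maps with sequences of (key, value) pairs,
  // which compare element by element.
  def toComparable(d: Any): Any = d match {
    case m: ArrayBackedMap =>
      m.keys.toSeq.map(toComparable).zip(m.values.toSeq.map(toComparable))
    case s: Seq[_] => s.map(toComparable)
    case other => other
  }
}
```

Two `ArrayBackedMap` instances with the same contents are not equal under `==`, but their `toComparable` forms are, including when one map is nested inside another as a key.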
[GitHub] spark issue #23211: [SPARK-19712][SQL] Move PullupCorrelatedPredicates and R...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23211 Merged build finished. Test PASSed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Merged build finished. Test FAILed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99635/ Test FAILed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99635 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99635/testReport)** for PR 23088 at commit [`f8cfb54`](https://github.com/apache/spark/commit/f8cfb544a805ecb5d1056f703dde4e7705ad1810). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23211: [SPARK-19712][SQL] Move PullupCorrelatedPredicates and R...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23211 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99636/ Test PASSed.
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23169 retest this, please
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238534101 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala --- @@ -535,4 +535,98 @@ class UnsafeRowConverterSuite extends SparkFunSuite with Matchers with PlanTestB assert(unsafeRow.getSizeInBytes == 8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + roundedSize(field2.getSizeInBytes)) } + + testBothCodegenAndInterpreted("SPARK-25374 converts back into safe representation") { +def convertBackToInternalRow(inputRow: InternalRow, fields: Array[DataType]): InternalRow = { + val unsafeProj = UnsafeProjection.create(fields) + val unsafeRow = unsafeProj(inputRow) + val safeProj = SafeProjection.create(fields) + safeProj(unsafeRow) +} + +// Simple tests +val inputRow = InternalRow.fromSeq(Seq( + false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, UTF8String.fromString("test"), + Decimal(255), CalendarInterval.fromString("interval 1 day"), Array[Byte](1, 2) +)) +val fields1 = Array( + BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType, + DoubleType, StringType, DecimalType.defaultConcreteType, CalendarIntervalType, + BinaryType) + +assert(convertBackToInternalRow(inputRow, fields1) === inputRow) + +// Array tests +val arrayRow = InternalRow.fromSeq(Seq( + createArray(1, 2, 3), + createArray( +createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*), +createArray(Seq("d").map(UTF8String.fromString): _*)) +)) +val fields2 = Array[DataType]( + ArrayType(IntegerType), + ArrayType(ArrayType(StringType))) + +assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow) + +// Struct tests +val structRow = InternalRow.fromSeq(Seq( + InternalRow.fromSeq(Seq[Any](1, 4.0)), + InternalRow.fromSeq(Seq( +UTF8String.fromString("test"), +InternalRow.fromSeq(Seq( + 1, + createArray(Seq("2", "3").map(UTF8String.fromString): _*) +)) + )) +)) +val fields3 = Array[DataType]( + StructType( 
+StructField("c0", IntegerType) :: +StructField("c1", DoubleType) :: +Nil), + StructType( +StructField("c2", StringType) :: +StructField("c3", StructType( + StructField("c4", IntegerType) :: + StructField("c5", ArrayType(StringType)) :: + Nil)) :: +Nil)) + +assert(convertBackToInternalRow(structRow, fields3) === structRow) + +// Map tests +val mapRow = InternalRow.fromSeq(Seq( + createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2), + createMap( +createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*), +createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*) + )( +createMap(Seq("k3", "k4").map(UTF8String.fromString): _*)(3.toShort, 4.toShort), +createMap(Seq("k5", "k6").map(UTF8String.fromString): _*)(5.toShort, 6.toShort) + ))) +val fields4 = Array[DataType]( + MapType(StringType, IntegerType), + MapType(MapType(IntegerType, StringType), MapType(StringType, ShortType))) + +val mapResultRow = convertBackToInternalRow(mapRow, fields4).toSeq(fields4) +val mapExpectedRow = mapRow.toSeq(fields4) +// Since `ArrayBasedMapData` does not override `equals` and `hashCode`, --- End diff -- Or we can use `ExpressionEvalHelper.checkResult` here.
[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22899 Merged build finished. Test PASSed.
[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22899 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99642/ Test PASSed.
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23194 **[Test build #99644 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99644/testReport)** for PR 23194 at commit [`867ee5e`](https://github.com/apache/spark/commit/867ee5e5ee73d00ea492655aabaa407089ea0f91). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23207 Merged build finished. Test PASSed.
[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23214 **[Test build #99651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99651/testReport)** for PR 23214 at commit [`a267e6b`](https://github.com/apache/spark/commit/a267e6bbf874038573c598e4c411274c8b459701).
[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23207 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99643/ Test PASSed.
[GitHub] spark pull request #22512: [SPARK-25498][SQL] InterpretedMutableProjection s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22512#discussion_r238489997 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala --- @@ -148,12 +156,25 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext { }) // When we are regenerating the golden files we don't need to run all the configs as they // all need to return the same result - if (regenerateGoldenFiles && configs.nonEmpty) { -configs.take(1) + if (regenerateGoldenFiles) { +if (configs.nonEmpty) { + configs.take(1) +} else { + Array.empty[Array[(String, String)]] --- End diff -- ok
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99635/testReport)** for PR 23088 at commit [`f8cfb54`](https://github.com/apache/spark/commit/f8cfb544a805ecb5d1056f703dde4e7705ad1810).
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20433 retest this please
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238490121 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InternalRow.scala --- @@ -157,4 +157,22 @@ object InternalRow { getValueNullSafe } } + + /** + * Returns a writer for an `InternalRow` with given data type. + */ + def getWriter(ordinal: Int, dt: DataType): (InternalRow, Any) => Unit = dt match { --- End diff -- ok
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23174 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99634/ Test FAILed.
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23174 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @zsxwing @gaborgsomogyi What we were trying to do is enforce the archive path so that moved files do not overlap with the source path. Files with the same name may exist in different directories, so I'm also trying to preserve each file's own path in the final archived path, which means archived files will not all be placed in the same directory. Based on the above, I thought enforcing the archive path by checking against the glob path is not easy to do, because without knowing the final archive path (per file) we can't check whether it matches the glob pattern. That's why I would rather just restrict all subdirectories instead of finding a way to check against the glob pattern. Actually I'm a bit afraid that we might be putting too much complexity into enforcing the archive path. If we are OK with not enforcing the archive path and just verifying that the final archive path doesn't overlap the source path for each source file, it would be simple to do. We can make Spark not move the file and log a warning message to let end users specify another directory. I would like to hear everyone's thoughts and ideas. Thanks in advance! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
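The per-file check described in this comment could look roughly like the sketch below. It is a minimal illustration on plain `java.nio.file` paths, not the code in the PR: the object and method names are hypothetical, glob patterns are not handled, and the source is assumed to be a single directory.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical helper names; this is not the implementation from the PR.
object ArchivePathCheck {
  // Re-root the source file under the archive base directory so that files
  // with the same name in different directories do not collide.
  def finalArchivePath(archiveBase: String, sourceFile: String): Path = {
    val src = Paths.get(sourceFile).toAbsolutePath.normalize()
    // Drop the filesystem root so the remaining path can be appended.
    val relative = src.getRoot.relativize(src)
    Paths.get(archiveBase).toAbsolutePath.normalize().resolve(relative)
  }

  // True when the archived file would land inside the source directory,
  // in which case the file should not be moved and a warning logged instead.
  def overlapsSource(sourceDir: String, archived: Path): Boolean =
    archived.startsWith(Paths.get(sourceDir).toAbsolutePath.normalize())
}
```

With this shape, an archive base outside the source directory passes the check, while an archive base nested under the source directory is rejected for every file.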
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23174 **[Test build #99634 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99634/testReport)** for PR 23174 at commit [`791b5ee`](https://github.com/apache/spark/commit/791b5ee88d2f3c856aad640f662472124e098c66). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99645/testReport)** for PR 23088 at commit [`f8cfb54`](https://github.com/apache/spark/commit/f8cfb544a805ecb5d1056f703dde4e7705ad1810). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user maropu commented on the issue: https://github.com/apache/spark/pull/23194 LGTM except for minor comments --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23194: [MINOR][SQL] Combine the same codes in test cases
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/23194#discussion_r238526892 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -377,41 +377,41 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } - test("CTAS a managed table with the existing empty directory") { -val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab1"))) + protected def withEmptyDirInTablePath(dirName: String)(f: File => Unit): Unit = { --- End diff -- private? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/23088 Retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22468#discussion_r238533700 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala --- @@ -535,4 +535,98 @@ class UnsafeRowConverterSuite extends SparkFunSuite with Matchers with PlanTestB assert(unsafeRow.getSizeInBytes == 8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + roundedSize(field2.getSizeInBytes)) } + + testBothCodegenAndInterpreted("SPARK-25374 converts back into safe representation") { +def convertBackToInternalRow(inputRow: InternalRow, fields: Array[DataType]): InternalRow = { + val unsafeProj = UnsafeProjection.create(fields) + val unsafeRow = unsafeProj(inputRow) + val safeProj = SafeProjection.create(fields) + safeProj(unsafeRow) +} + +// Simple tests +val inputRow = InternalRow.fromSeq(Seq( + false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, UTF8String.fromString("test"), + Decimal(255), CalendarInterval.fromString("interval 1 day"), Array[Byte](1, 2) +)) +val fields1 = Array( + BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType, + DoubleType, StringType, DecimalType.defaultConcreteType, CalendarIntervalType, + BinaryType) + +assert(convertBackToInternalRow(inputRow, fields1) === inputRow) + +// Array tests +val arrayRow = InternalRow.fromSeq(Seq( + createArray(1, 2, 3), + createArray( +createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*), +createArray(Seq("d").map(UTF8String.fromString): _*)) +)) +val fields2 = Array[DataType]( + ArrayType(IntegerType), + ArrayType(ArrayType(StringType))) + +assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow) + +// Struct tests +val structRow = InternalRow.fromSeq(Seq( + InternalRow.fromSeq(Seq[Any](1, 4.0)), + InternalRow.fromSeq(Seq( +UTF8String.fromString("test"), +InternalRow.fromSeq(Seq( + 1, + createArray(Seq("2", "3").map(UTF8String.fromString): _*) +)) + )) +)) +val fields3 = Array[DataType]( + StructType( 
+StructField("c0", IntegerType) :: +StructField("c1", DoubleType) :: +Nil), + StructType( +StructField("c2", StringType) :: +StructField("c3", StructType( + StructField("c4", IntegerType) :: + StructField("c5", ArrayType(StringType)) :: + Nil)) :: +Nil)) + +assert(convertBackToInternalRow(structRow, fields3) === structRow) + +// Map tests +val mapRow = InternalRow.fromSeq(Seq( + createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2), + createMap( +createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*), +createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*) + )( +createMap(Seq("k3", "k4").map(UTF8String.fromString): _*)(3.toShort, 4.toShort), +createMap(Seq("k5", "k6").map(UTF8String.fromString): _*)(5.toShort, 6.toShort) + ))) +val fields4 = Array[DataType]( + MapType(StringType, IntegerType), + MapType(MapType(IntegerType, StringType), MapType(StringType, ShortType))) + +val mapResultRow = convertBackToInternalRow(mapRow, fields4).toSeq(fields4) +val mapExpectedRow = mapRow.toSeq(fields4) +// Since `ArrayBasedMapData` does not override `equals` and `hashCode`, --- End diff -- maybe we should implement `equals` and `hashCode` in `ArrayBasedMapData` and `UnsafeMapData`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23212: [SPARK-25498][SQL][FOLLOW-UP] Return an empty config set...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5697/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23169 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23092: [SPARK-26094][CORE][STREAMING] createNonEcFile creates p...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23092 **[Test build #99625 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99625/testReport)** for PR 23092 at commit [`891e37c`](https://github.com/apache/spark/commit/891e37cd371350906035cfbf93c618035a4293b1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23169 **[Test build #99631 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99631/testReport)** for PR 23169 at commit [`a4be985`](https://github.com/apache/spark/commit/a4be985b5e7eeed43409ba0b0cda2a45c44e5110). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23174 **[Test build #99629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99629/testReport)** for PR 23174 at commit [`791b5ee`](https://github.com/apache/spark/commit/791b5ee88d2f3c856aad640f662472124e098c66). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23169 Build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99638/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #23108: [Spark-25993][SQL][TEST]Add test cases for resolu...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/23108#discussion_r238524452 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala --- @@ -186,6 +186,54 @@ abstract class OrcSuite extends OrcTest with BeforeAndAfterAll { } } + protected def testORCTableLocation(isConvertMetastore: Boolean): Unit = { +val tableName1 = "spark_orc1" +val tableName2 = "spark_orc2" + +withTempDir { dir => + val someDF1 = Seq((1, 1, "orc1"), (2, 2, "orc2")).toDF("c1", "c2", "c3").repartition(1) + withTable(tableName1, tableName2) { +val dataDir = s"${dir.getCanonicalPath}/dir1/" +val parentDir = s"${dir.getCanonicalPath}/" +val wildCardDir = new File(s"${dir}/*").toURI +someDF1.write.orc(dataDir) +val parentDirStatement = + s""" + |CREATE EXTERNAL TABLE $tableName1( + | c1 int, + | c2 int, + | c3 string) + |STORED AS orc + |LOCATION '${parentDir}'""".stripMargin +sql(parentDirStatement) +val parentDirSqlStatement = s"select * from ${tableName1}" +if (isConvertMetastore) { + checkAnswer(sql(parentDirSqlStatement), Nil) +} else { + checkAnswer(sql(parentDirSqlStatement), + (1 to 2).map(i => Row(i, i, s"orc$i"))) +} + +val wildCardStatement = + s""" + |CREATE EXTERNAL TABLE $tableName2( + | c1 int, + | c2 int, + | c3 string) + |STORED AS orc + |LOCATION '$wildCardDir'""".stripMargin --- End diff -- Thanks, @kevinyu98 . Also, please update the PR title ``` - [Spark-25993][SQL][TEST]Add test cases for resolution of ORC table location + [SPARK-25993][SQL][TEST] Add test cases for CREATE EXTERNAL TABLE with subdirectories ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints
Github user rxin commented on the issue: https://github.com/apache/spark/pull/23171 That probably means we should just optimize InSet to have the switch version though? Rather than do it in In? On Mon, Dec 03, 2018 at 8:20 PM, Wenchen Fan < notificati...@github.com > wrote: > @rxin I proposed the same thing before, but one problem is that, we only convert In to InSet when the length of list reaches the threshold. If the switch way is faster than hash set when the list is small, it seems still worth to optimize In using switch. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints
Github user rxin commented on the issue: https://github.com/apache/spark/pull/23171 I thought InSwitch logically is the same as InSet, in which all the child expressions are literals? On Mon, Dec 03, 2018 at 8:38 PM, Wenchen Fan < notificati...@github.com > wrote: > I think InSet is not an optimized version of In, but just a way to separate the implementation for different conditions (the length of the list). Maybe we should do the same thing here, create a InSwitch and convert In to it when meeting some conditions. One problem is, In and InSwitch is same in the interpreted version, maybe we should create a base class for them. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23213 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5703/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23194 **[Test build #99649 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99649/testReport)** for PR 23194 at commit [`877751b`](https://github.com/apache/spark/commit/877751bdab5310dae4c5d8a1f2b7c3bbd761b718). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23171 How about, we create an `OptimizedIn`, and convert `In` to `OptimizedIn` if the list is all literals? `OptimizedIn` will pick `switch` or hash set based on the length of the list. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
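As a rough illustration of the dispatch proposed here, the sketch below picks an implementation from the length of an all-literal list. It is plain Scala, not Catalyst code: `OptimizedInSketch` and `SwitchThreshold` are hypothetical names, the threshold value is arbitrary, a linear scan over a primitive array stands in for the generated Java `switch`, and null semantics are ignored.

```scala
// Hypothetical sketch; not the Catalyst expression discussed in the thread.
object OptimizedInSketch {
  // Arbitrary cut-off for illustration; the thread does not settle on a value.
  val SwitchThreshold = 10

  // Returns a membership test specialised for the size of the literal list.
  def build(literals: Seq[Int]): Int => Boolean = {
    if (literals.length <= SwitchThreshold) {
      // Small list: scan a primitive array (stand-in for a generated switch).
      val arr = literals.toArray
      (v: Int) => {
        var i = 0
        var found = false
        while (i < arr.length && !found) {
          found = arr(i) == v
          i += 1
        }
        found
      }
    } else {
      // Large list: fall back to a hash set lookup.
      val set = literals.toSet
      (v: Int) => set.contains(v)
    }
  }
}
```

The caller only sees an `Int => Boolean`, so the choice between the two strategies stays an internal detail, which is the point of having a single expression make the decision.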
[GitHub] spark pull request #23214: [SPARK-26155] Optimizing the performance of LongT...
GitHub user LuciferYang opened a pull request: https://github.com/apache/spark/pull/23214 [SPARK-26155] Optimizing the performance of LongToUnsafeRowMap ## What changes were proposed in this pull request? To solve the problem reported by @JkSelf in [SPARK-26155](https://issues.apache.org/jira/browse/SPARK-26155), use `LongAdder` instead of `Long` for `numKeyLookups` and `numProbes` to reduce the time spent on add operations. @JkSelf tested this patch in an Intel performance testing environment: with this patch, TPC-DS queries on Spark 2.3 and master are no longer slower than on Spark 2.1. ## How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/LuciferYang/spark spark-26155 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/23214.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #23214 commit 34decea2f59d9fb3930f4d2ca1ad8a625d0409ea Author: yangjie01 Date: 2018-12-04T05:49:19Z use LongAdder instead of Long --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
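For readers unfamiliar with `java.util.concurrent.atomic.LongAdder`, the sketch below shows the kind of counter replacement the description refers to. It is only an illustration: `LookupStats` and its methods are hypothetical names, and nothing here is taken from the patch itself.

```scala
import java.util.concurrent.atomic.LongAdder

// Hypothetical holder for the two counters named in the PR description.
final class LookupStats {
  private val numKeyLookups = new LongAdder
  private val numProbes = new LongAdder

  // Record one key lookup that needed `probes` probe steps.
  def recordLookup(probes: Long): Unit = {
    numKeyLookups.increment()
    numProbes.add(probes)
  }

  // Average number of probes per lookup, or 0.0 before any lookup happened.
  def avgProbesPerLookup: Double = {
    val lookups = numKeyLookups.sum()
    if (lookups == 0) 0.0 else numProbes.sum().toDouble / lookups
  }
}
```

`LongAdder` is designed for counters that are updated far more often than they are read; whether that is the reason for the speed-up reported in SPARK-26155 is not established in this thread.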
[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...
Github user LuciferYang commented on the issue: https://github.com/apache/spark/pull/23214 @adrian-wang ok~ I will add some comments to explain the reason --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22899: [SPARK-25573] Combine resolveExpression and resolve in t...
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/22899 @gatorsmile Thanks a lot. I completely agree that we should try and combine these two. I will continue to think about it :-) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5709/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23214 It's easy to track `numKeyLookups` at `HashedRelation`, but it's hard to track `numProbes`. One idea is, we pass a `MutableInt` to `LongToUnsafeRowMap.getValue` as a parameter, and in the method we set the actual `numProbes` of this look up to the `MutableInt` parameter. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
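The out-parameter idea can be sketched on a toy open-addressing map. Everything below is hypothetical and far simpler than `LongToUnsafeRowMap`: `MutableInt` here is a local stand-in class, and `getValue` returns an `Option[Long]` rather than filling a row.

```scala
// Hypothetical stand-in for a mutable out-parameter; not a Spark class.
final class MutableInt(var value: Int = 0)

// Toy open-addressing map with linear probing, for illustration only.
final class ToyLongMap(capacity: Int) {
  require(capacity > 1 && (capacity & (capacity - 1)) == 0,
    "capacity must be a power of two")
  private val keys = new Array[Long](capacity)
  private val values = new Array[Long](capacity)
  private val used = new Array[Boolean](capacity)
  private val mask = capacity - 1
  private var size = 0

  private def firstSlot(key: Long): Int = (key & mask).toInt

  def put(key: Long, value: Long): Unit = {
    var pos = firstSlot(key)
    while (used(pos) && keys(pos) != key) pos = (pos + 1) & mask
    if (!used(pos)) {
      // Keep one slot free so that probing always terminates.
      require(size < capacity - 1, "map is full")
      used(pos) = true
      keys(pos) = key
      size += 1
    }
    values(pos) = value
  }

  // Looks up `key` and writes the number of slots inspected into `probes`.
  def getValue(key: Long, probes: MutableInt): Option[Long] = {
    var pos = firstSlot(key)
    var n = 1
    while (used(pos) && keys(pos) != key) {
      pos = (pos + 1) & mask
      n += 1
    }
    probes.value = n
    if (used(pos)) Some(values(pos)) else None
  }
}
```

The caller owns the `MutableInt` and can add its value to whatever metric it keeps, so the map itself no longer needs to maintain a probe counter.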
[GitHub] spark issue #23037: [SPARK-26083][k8s] Add Copy pyspark into corresponding d...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/23037 I thought there was already one for that Hive suite failing... SPARK-23622? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238484698 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) +logInfo(s"Auto-configuring K8S client using " + + s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" + + s" from users K8S config file") + +// Start from an auto-configured config with the desired context +// Fabric 8 uses null to indicate that the users current context should be used so if no +// explicit setting pass null +val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null))) --- End diff -- > we don't propagate the submission clients config file We don't propagate the config *file* itself, but we do propagate all its contents, as far as I remember. But given your explanation it should work fine, unless there's a different config context with the same name inside the container... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
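One detail in the quoted diff is worth spelling out: as quoted, the log message calls `kubeContext.get` in the branch where `kubeContext.isEmpty` is true, which would throw on an empty `Option`. The sketch below shows the same intent written with `Option` combinators. It is a standalone illustration with hypothetical names, and `_.trim.nonEmpty` only approximates `StringUtils.isNotBlank`.

```scala
// Hypothetical helper; not code from the PR.
object KubeContextMessage {
  // Treat blank values as "not set" (approximation of StringUtils.isNotBlank).
  def normalize(raw: Option[String]): Option[String] =
    raw.filter(_.trim.nonEmpty)

  // Describe which context will be used without calling .get on an empty Option.
  def describe(kubeContext: Option[String]): String =
    kubeContext.map(c => s"context $c").getOrElse("current context")

  // Equivalent of getOrElse(null) for APIs that use null to mean "default".
  def contextOrNull(kubeContext: Option[String]): String = kubeContext.orNull
}
```

`orNull` keeps the null hand-off to the client library in one visible place, while the rest of the code continues to work with `Option`.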
[GitHub] spark issue #23037: [SPARK-26083][k8s] Add Copy pyspark into corresponding d...
Github user AzureQ commented on the issue: https://github.com/apache/spark/pull/23037 > ok, I give up on flaky tests. > > Merging to master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23055: [SPARK-26080][PYTHON] Skips Python resource limit on Win...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23055 The last comment was a minor comment on a doc - actually the whole point was a minor one. It is related to Windows. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23174 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5691/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23174 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5691/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23174 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5691/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23174 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23080: [SPARK-26108][SQL] Support custom lineSep in CSV datasou...
Github user pooja-murarka commented on the issue: https://github.com/apache/spark/pull/23080 I am testing **lineSep** with Spark 2.4. data.csv : "a",1 "c",2 "d",3 val schema : StructType = StructType( Seq( StructField(name = "dteday", dataType = StringType), StructField(name = "hr", dataType = IntegerType) ) ) val logData = spark.read.format("csv").schema(schema).option("lineSep", "\t").load("data.csv") But I can only see the schema without any data. scala> logData.show() +------+----+ |dteday| hr| +------+----+ | null|null| +------+----+ Can you please suggest whether I missed something, or whether the above fix has not been merged into the branch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23155: [MINOR][K8S] add missing docs for podTemplateContainerNa...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/23155 Oh, I thought it could be handled by other Kubernetes committers. I'll take a look again. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20433 **[Test build #99638 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99638/testReport)** for PR 20433 at commit [`7cb8f87`](https://github.com/apache/spark/commit/7cb8f87ba83fb04e891e069b8339dcef6d835ed9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23169 **[Test build #99648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99648/testReport)** for PR 23169 at commit [`f0f75c2`](https://github.com/apache/spark/commit/f0f75c25b95010d63ecdf83bb9f280687361d154). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23169 @DaveDeCaprio You might have missed rolling back a change in the test. https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99632/testReport/org.apache.spark.sql.catalyst.trees/TreeNodeSuite/treeString_limits_plan_length/ I also think you need to add a new test that sets the configuration to some value and checks whether it works properly. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...
Github user adrian-wang commented on the issue: https://github.com/apache/spark/pull/23214 maybe add some detailed test result in description and explain the reason for this in code comment? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...
Github user LuciferYang commented on the issue: https://github.com/apache/spark/pull/23214 ping @viirya --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/23171 @rxin `switch` in Java is still significantly faster than a hash set, even without boxing / unboxing problems, when the number of elements is small. We were thinking about having two implementations in `InSet`, and picking `switch` if the number of elements is small, or otherwise picking the hash set one. But this is the same complexity as having two implementations in `In`, as in this PR. @cloud-fan do you suggest creating an `OptimizedIn` which has `switch` and hash set implementations based on the number of elements, and removing `InSet`? That is basically what we were thinking above. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23194 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23194 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99644/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23194 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23194 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99649/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22904: [SPARK-25887][K8S] Configurable K8S context suppo...
Github user rvesse commented on a diff in the pull request: https://github.com/apache/spark/pull/22904#discussion_r238484145 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala --- @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory { val dispatcher = new Dispatcher( ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher")) -// TODO [SPARK-25887] Create builder in a way that respects configurable context -val config = new ConfigBuilder() +// Allow for specifying a context used to auto-configure from the users K8S config file +val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c)) --- End diff -- Fixed in latest commit --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org