[GitHub] spark pull request #21859: [SPARK-24900][SQL]Speed up sort when the dataset ...
Github user sddyljsx commented on a diff in the pull request: https://github.com/apache/spark/pull/21859#discussion_r209417551
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala ---
    @@ -294,7 +296,12 @@ object ShuffleExchangeExec {
               sorter.sort(iter.asInstanceOf[Iterator[UnsafeRow]])
             }
           } else {
    -        rdd
    +        part match {
    +          case partitioner: RangePartitioner[InternalRow @unchecked, _]
    +            if partitioner.getSampledArray != null =>
    +            sparkContext.parallelize(partitioner.getSampledArray.toSeq, rdd.getNumPartitions)
    --- End diff --
```
part match {
  case partitioner: RangePartitioner[InternalRow @unchecked, _]
    if partitioner.getSampledArray != null =>
    sparkContext.parallelize(partitioner.getSampledArray.toSeq, rdd.getNumPartitions)
  case _ => rdd
}
```
When the optimization applies, this returns the parallelized sampled data instead of the rdd, so I keep the number of partitions the same as the rdd's here.
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Merged build finished. Test FAILed.
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94590/ Test FAILed.
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22011 **[Test build #94590 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94590/testReport)** for PR 22011 at commit [`cf38531`](https://github.com/apache/spark/commit/cf3853177d0ed76efbffee8ced1021003b085a26).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #22037: [SPARK-24774][SQL] Avro: Support logical decimal ...
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22037#discussion_r209417216
    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala ---
    @@ -114,32 +129,35 @@ object SchemaConverters {
           prevNameSpace: String = "",
           outputTimestampType: AvroOutputTimestampType.Value = AvroOutputTimestampType.TIMESTAMP_MICROS)
         : Schema = {
    -    val builder = if (nullable) {
    -      SchemaBuilder.builder().nullable()
    -    } else {
    -      SchemaBuilder.builder()
    -    }
    +    val builder = SchemaBuilder.builder()
    -    catalystType match {
    +    val schema = catalystType match {
           case BooleanType => builder.booleanType()
           case ByteType | ShortType | IntegerType => builder.intType()
           case LongType => builder.longType()
    -      case DateType => builder
    -        .intBuilder()
    -        .prop(LogicalType.LOGICAL_TYPE_PROP, LogicalTypes.date().getName)
    -        .endInt()
    +      case DateType =>
    +        LogicalTypes.date().addToSchema(builder.intType())
           case TimestampType =>
             val timestampType = outputTimestampType match {
               case AvroOutputTimestampType.TIMESTAMP_MILLIS => LogicalTypes.timestampMillis()
               case AvroOutputTimestampType.TIMESTAMP_MICROS => LogicalTypes.timestampMicros()
               case other => throw new IncompatibleSchemaException(s"Unexpected output timestamp type $other.")
             }
    -        builder.longBuilder().prop(LogicalType.LOGICAL_TYPE_PROP, timestampType.getName).endLong()
    +        timestampType.addToSchema(builder.longType())
           case FloatType => builder.floatType()
           case DoubleType => builder.doubleType()
    -      case _: DecimalType | StringType => builder.stringType()
    +      case StringType => builder.stringType()
    +      case d: DecimalType =>
    +        val avroType = LogicalTypes.decimal(d.precision, d.scale)
    +        val fixedSize = minBytesForPrecision(d.precision)
    +        // Use random name to avoid conflict in naming of fixed field.
    +        // Field names must start with [A-Za-z_], while the charset of Random.alphanumeric contains
    +        // [0-9]. So add a single character "f" to ensure the name is valid.
    +        val name = "f" + Random.alphanumeric.take(32).mkString("")
    --- End diff --
No. If there are two decimal fields, there will be a name conflict. I tried it.
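For illustration only, here is a minimal Scala sketch of the naming scheme described in the diff comment above. The names `FixedNameSketch`, `randomFixedName`, and `isValidAvroName` are hypothetical and not part of the PR; the pattern is the `[A-Za-z_][A-Za-z0-9_]*` rule for names from the Avro specification.

```scala
import scala.util.Random

object FixedNameSketch {
  // Avro names must start with [A-Za-z_] and continue with [A-Za-z0-9_].
  private val AvroName = "[A-Za-z_][A-Za-z0-9_]*".r

  def isValidAvroName(name: String): Boolean =
    AvroName.pattern.matcher(name).matches()

  // Random.alphanumeric may yield a leading digit, so a fixed letter is
  // prepended to keep the generated name valid.
  def randomFixedName(length: Int = 32): String =
    "f" + Random.alphanumeric.take(length).mkString
}
```

Two calls produce different names with overwhelming probability, which is how a random suffix sidesteps the conflict between two decimal fields mentioned in the reply.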
[GitHub] spark pull request #21859: [SPARK-24900][SQL]Speed up sort when the dataset ...
Github user sddyljsx commented on a diff in the pull request: https://github.com/apache/spark/pull/21859#discussion_r209417115
    --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
    @@ -166,9 +169,17 @@ class RangePartitioner[K : Ordering : ClassTag, V](
           // Assume the input partitions are roughly balanced and over-sample a little bit.
           val sampleSizePerPartition = math.ceil(3.0 * sampleSize / rdd.partitions.length).toInt
           val (numItems, sketched) = RangePartitioner.sketch(rdd.map(_._1), sampleSizePerPartition)
    +      val numSampled = sketched.map(_._3.length).sum
           if (numItems == 0L) {
             Array.empty
           } else {
    +        // already got the whole data
    +        if (sampleCacheEnabled && numItems == numSampled) {
    +          // get the sampled data
    +          sampledArray = sketched.foldLeft(Array.empty[K])((total, sample) => {
    --- End diff --
As @kiszk suggests in his review:
> Do we need to always create sampledArray and to store into var? It may lead to overhead when the execution would go to L182. It would be good to calculate only length here and to create the array at L179.
Allocating it only when necessary may be a better choice.
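A rough sketch of the "allocate only when necessary" idea from this comment, assuming the sketch result has the `(partitionIndex, itemCount, sample)` shape that the diff reads via `_._3`. `SampleCacheSketch` and `sampledArrayIfComplete` are made-up names for illustration, not the PR's code.

```scala
import scala.reflect.ClassTag

object SampleCacheSketch {
  // Returns the concatenated samples only when they cover the whole dataset.
  def sampledArrayIfComplete[K: ClassTag](
      numItems: Long,
      sketched: Array[(Int, Long, Array[K])],
      sampleCacheEnabled: Boolean): Option[Array[K]] = {
    // Cheap check first: only the lengths are summed, nothing is copied.
    val numSampled = sketched.map(_._3.length.toLong).sum
    if (sampleCacheEnabled && numItems > 0L && numItems == numSampled) {
      // The combined array is built only on the path that actually uses it.
      Some(sketched.iterator.flatMap(_._3.iterator).toArray)
    } else {
      None
    }
  }
}
```

Returning an `Option` instead of assigning a `var` is a design choice of this sketch; it keeps the "no cached sample" case explicit without a null check.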
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20838
> @HyukjinKwon Your advise on next steps?
Shall we make the tests pass?
> Have you run the tests locally and do they pass?
Same question.
[GitHub] spark pull request #22070: Fix typos detected by github.com/client9/misspell
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22070#discussion_r209416269
    --- Diff: sql/hive/src/test/resources/golden/udf_translate-2-f7aa38a33ca0df73b7a1e6b6da4b7fe8 ---
    @@ -6,8 +6,8 @@ translate('abcdef', 'adc', '19') returns '1b9ef' replacing 'a' with '1', 'd' wit
     translate('a b c d', ' ', '') return 'abcd' removing all spaces from the input string
    -If the same character is present multiple times in the input string, the first occurence of the character is the one that's considered for matching. However, it is not recommended to have the same character more than once in the from string since it's not required and adds to confusion.
    +If the same character is present multiple times in the input string, the first occurrence of the character is the one that's considered for matching. However, it is not recommended to have the same character more than once in the from string since it's not required and adds to confusion.
     For example,
    -translate('abcdef', 'ada', '192') returns '1bc9ef' replaces 'a' with '1' and 'd' with '9' ignoring the second occurence of 'a' in the from string mapping it to '2'
    +ththtranslate('abcdef', 'ada', '192') returns '1bc9ef' replaces 'a' with '1' and 'd' with '9' ignoring the second occurrence of 'a' in the from string mapping it to '2'
    --- End diff --
`ththtranslate`: is this correct?
[GitHub] spark issue #22069: [MINOR][DOC] Fix Java example code in Column's comments
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22069 **[Test build #94595 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94595/testReport)** for PR 22069 at commit [`8520df8`](https://github.com/apache/spark/commit/8520df899a3364f2bb41d4155d2bed9e68772a07).
[GitHub] spark issue #22069: [MINOR][DOC] Fix Java example code in Column's comments
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22069 retest this please
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22068 Eh, I think it's okay. Let's just get this in.
[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/20637#discussion_r209416053
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala ---
    @@ -43,25 +43,29 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         case _ => false
       }
    -  // TODO: if the nullability of field is correct, we can use it to save null check.
       private def writeStructToBuffer(
           ctx: CodegenContext,
           input: String,
           index: String,
    -      fieldTypes: Seq[DataType],
    +      fieldTypeAndNullables: Seq[(DataType, Boolean)],
    --- End diff --
@cloud-fan What name do you suggest for the case class? `DataTypeNullable`, or something else?
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Merged build finished. Test PASSed.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2066/ Test PASSed.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22068 **[Test build #94594 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94594/testReport)** for PR 22068 at commit [`5b4562a`](https://github.com/apache/spark/commit/5b4562a19e1b29a2d0d20f4b046c88837c536dac).
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94593/ Test FAILed.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22068 **[Test build #94593 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94593/testReport)** for PR 22068 at commit [`59f6080`](https://github.com/apache/spark/commit/59f6080030fddece04f5e8cda0a27b6aa60393a6).
* This patch **fails Python style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Merged build finished. Test FAILed.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22068 **[Test build #94593 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94593/testReport)** for PR 22068 at commit [`59f6080`](https://github.com/apache/spark/commit/59f6080030fddece04f5e8cda0a27b6aa60393a6).
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Merged build finished. Test PASSed.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2065/ Test PASSed.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22068 Thanks. BTW, I found another instance in a test, not in the docs. Should we address it in this PR or in a separate one? @HyukjinKwon WDYT?
```
class ParquetCompressionCodecPrecedenceSuite extends ParquetTest with SharedSQLContext {
  test("Test `spark.sql.parquet.compression.codec` config") {
    Seq("NONE", "UNCOMPRESSED", "SNAPPY", "GZIP", "LZO").foreach { c =>
      withSQLConf(SQLConf.PARQUET_COMPRESSION.key -> c) {
        val expected = if (c == "NONE") "UNCOMPRESSED" else c
        val option = new ParquetOptions(Map.empty[String, String], spark.sessionState.conf)
        assert(option.compressionCodecClassName == expected)
      }
    }
  }
```
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/22067 It seems #22066 has already changed the implementation using a similar approach, so I will close this one.
[GitHub] spark pull request #22067: [SPARK-25084][SQL] distribute by on multiple colu...
Github user LantaoJin closed the pull request at: https://github.com/apache/spark/pull/22067
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22068 I will update, thanks @kiszk @HyukjinKwon
[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22063 Merged build finished. Test FAILed.
[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22063 **[Test build #94592 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94592/testReport)** for PR 22063 at commit [`3c67d9d`](https://github.com/apache/spark/commit/3c67d9d30b2afd38ccf7a01d49c75d4c17f31445).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22063 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94592/ Test FAILed.
[GitHub] spark issue #22047: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22047 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94588/ Test FAILed.
[GitHub] spark issue #22047: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22047 Merged build finished. Test FAILed.
[GitHub] spark issue #22047: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22047 **[Test build #94588 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94588/testReport)** for PR 22047 at commit [`6593cf4`](https://github.com/apache/spark/commit/6593cf43ed89da88c4bfae8b10682bea21746e5a).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20838 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94587/ Test FAILed.
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20838 Merged build finished. Test FAILed.
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20838 **[Test build #94587 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94587/testReport)** for PR 20838 at commit [`da58029`](https://github.com/apache/spark/commit/da58029f6164247a37c9c33b3b94fff3453cb920).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22063 **[Test build #94592 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94592/testReport)** for PR 22063 at commit [`3c67d9d`](https://github.com/apache/spark/commit/3c67d9d30b2afd38ccf7a01d49c75d4c17f31445).
[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22063 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2064/ Test PASSed.
[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22063 Merged build finished. Test PASSed.
[GitHub] spark issue #22070: Fix typos detected by github.com/client9/misspell
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22070 **[Test build #4242 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4242/testReport)** for PR 22070 at commit [`9e95df2`](https://github.com/apache/spark/commit/9e95df24206bbcc51ae09bd488d72a2bcf84ee7b).
[GitHub] spark pull request #22037: [SPARK-24774][SQL] Avro: Support logical decimal ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22037#discussion_r209412587
    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala ---
    @@ -114,32 +129,35 @@ object SchemaConverters {
           prevNameSpace: String = "",
           outputTimestampType: AvroOutputTimestampType.Value = AvroOutputTimestampType.TIMESTAMP_MICROS)
         : Schema = {
    -    val builder = if (nullable) {
    -      SchemaBuilder.builder().nullable()
    -    } else {
    -      SchemaBuilder.builder()
    -    }
    +    val builder = SchemaBuilder.builder()
    -    catalystType match {
    +    val schema = catalystType match {
           case BooleanType => builder.booleanType()
           case ByteType | ShortType | IntegerType => builder.intType()
           case LongType => builder.longType()
    -      case DateType => builder
    -        .intBuilder()
    -        .prop(LogicalType.LOGICAL_TYPE_PROP, LogicalTypes.date().getName)
    -        .endInt()
    +      case DateType =>
    +        LogicalTypes.date().addToSchema(builder.intType())
           case TimestampType =>
             val timestampType = outputTimestampType match {
               case AvroOutputTimestampType.TIMESTAMP_MILLIS => LogicalTypes.timestampMillis()
               case AvroOutputTimestampType.TIMESTAMP_MICROS => LogicalTypes.timestampMicros()
               case other => throw new IncompatibleSchemaException(s"Unexpected output timestamp type $other.")
             }
    -        builder.longBuilder().prop(LogicalType.LOGICAL_TYPE_PROP, timestampType.getName).endLong()
    +        timestampType.addToSchema(builder.longType())
           case FloatType => builder.floatType()
           case DoubleType => builder.doubleType()
    -      case _: DecimalType | StringType => builder.stringType()
    +      case StringType => builder.stringType()
    +      case d: DecimalType =>
    +        val avroType = LogicalTypes.decimal(d.precision, d.scale)
    +        val fixedSize = minBytesForPrecision(d.precision)
    +        // Use random name to avoid conflict in naming of fixed field.
    +        // Field names must start with [A-Za-z_], while the charset of Random.alphanumeric contains
    +        // [0-9]. So add a single character "f" to ensure the name is valid.
    +        val name = "f" + Random.alphanumeric.take(32).mkString("")
    --- End diff --
Can we use `recordName` here?
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22072 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94589/ Test FAILed.
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22072 Merged build finished. Test FAILed.
[GitHub] spark pull request #22037: [SPARK-24774][SQL] Avro: Support logical decimal ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22037#discussion_r209412578
    --- Diff: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ---
    @@ -494,6 +522,68 @@ class AvroSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
         checkAnswer(df, expected)
       }
    +  test("Logical type: Decimal") {
    +    val expected = Seq("1.23", "4.56", "78.90", "-1", "-2.31")
    +      .map { x => Row(new java.math.BigDecimal(x), new java.math.BigDecimal(x)) }
    +    val df = spark.read.format("avro").load(decimalAvro)
    +
    +    checkAnswer(df, expected)
    +
    +    val avroSchema = s"""
    +      {
    +        "namespace": "logical",
    +        "type": "record",
    +        "name": "test",
    +        "fields": [
    +          {"name": "bytes", "type":
    +            {"type": "bytes", "logicalType": "decimal", "precision": 4, "scale": 2}
    +          },
    +          {"name": "fixed", "type":
    +            {"type": "fixed", "size": 5, "logicalType": "decimal",
    +              "precision": 4, "scale": 2, "name": "foo"}
    +          }
    +        ]
    +      }
    +    """
    +
    +    checkAnswer(spark.read.format("avro").option("avroSchema", avroSchema).load(decimalAvro),
    +      expected)
    +
    +    withTempPath { dir =>
    +      df.write.format("avro").save(dir.toString)
    +      checkAnswer(spark.read.format("avro").load(dir.toString), expected)
    +    }
    +  }
    +
    +  test("Logical type: Decimal with too large precision") {
    +    withTempDir { dir =>
    +      val schema = new Schema.Parser().parse("""{
    +        "namespace": "logical",
    +        "type": "record",
    +        "name": "test",
    +        "fields": [{
    +          "name": "decimal",
    +          "type": {"type": "bytes", "logicalType": "decimal", "precision": 4, "scale": 2}
    +        }]
    +      }""")
    +      val datumWriter = new GenericDatumWriter[GenericRecord](schema)
    +      val dataFileWriter = new DataFileWriter[GenericRecord](datumWriter)
    +      dataFileWriter.create(schema, new File(s"$dir.avro"))
    --- End diff --
Let's be consistent: either always use Python to write test files, or always use Java.
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22072 **[Test build #94589 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94589/testReport)** for PR 22072 at commit [`1a6452e`](https://github.com/apache/spark/commit/1a6452ef0939c09c09801cff78b0214d7979bf6d).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22001 Merged build finished. Test FAILed.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22001 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94591/ Test FAILed.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22001 **[Test build #94591 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94591/testReport)** for PR 22001 at commit [`458c78f`](https://github.com/apache/spark/commit/458c78fb076f642f5eee24a7a0911f3822254084).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22001 **[Test build #94591 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94591/testReport)** for PR 22001 at commit [`458c78f`](https://github.com/apache/spark/commit/458c78fb076f642f5eee24a7a0911f3822254084).
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Merged build finished. Test PASSed.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22001 Merged build finished. Test PASSed.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22001 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2063/ Test PASSed.
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2062/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22011 **[Test build #94590 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94590/testReport)** for PR 22011 at commit [`cf38531`](https://github.com/apache/spark/commit/cf3853177d0ed76efbffee8ced1021003b085a26). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22001 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22011 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21859: [SPARK-24900][SQL]Speed up sort when the dataset ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21859#discussion_r209410871 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -294,7 +296,12 @@ object ShuffleExchangeExec { sorter.sort(iter.asInstanceOf[Iterator[UnsafeRow]]) } } else { -rdd +part match { + case partitioner: RangePartitioner[InternalRow @unchecked, _] +if partitioner.getSampledArray != null => +sparkContext.parallelize(partitioner.getSampledArray.toSeq, rdd.getNumPartitions) --- End diff -- When you just parallelize the sampled data, it might not have the required partitioning (range partitioning). Doesn't that affect the result? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
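To make the question above concrete, here is a minimal sketch in plain Scala (not Spark code, and not taken from the PR). The helpers `evenSlices` and `rangeBuckets` are hypothetical names introduced only for illustration. The sketch shows that cutting a sequence into contiguous, equally sized slices groups keys differently from assigning each key to a bucket by range bounds. Whether that difference matters in the PR depends on whether the shuffle still applies the range partitioner afterwards, which this sketch does not model.

```scala
object SlicingVsRange {
  // Positional split: cut the sequence into numSlices contiguous chunks
  // of roughly equal length, ignoring the key values entirely.
  def evenSlices[T](data: Seq[T], numSlices: Int): Seq[Seq[T]] =
    (0 until numSlices).map { i =>
      val start = ((i * data.length.toLong) / numSlices).toInt
      val end = (((i + 1) * data.length.toLong) / numSlices).toInt
      data.slice(start, end)
    }

  // Value-based split: a key goes to the first bucket whose upper bound is
  // greater than or equal to it; keys above every bound go to the last bucket.
  def rangeBuckets(data: Seq[Int], bounds: Seq[Int]): Seq[Seq[Int]] = {
    val grouped = data.groupBy { k =>
      val i = bounds.indexWhere(k <= _)
      if (i < 0) bounds.length else i
    }
    (0 to bounds.length).map(i => grouped.getOrElse(i, Seq.empty))
  }
}
```

With unsorted input, the positional slices mix small and large keys in the same slice, while the bound-based buckets keep every key in one bucket no larger than any key in the next.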
[GitHub] spark pull request #21859: [SPARK-24900][SQL]Speed up sort when the dataset ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21859#discussion_r209407875 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -166,9 +169,17 @@ class RangePartitioner[K : Ordering : ClassTag, V]( // Assume the input partitions are roughly balanced and over-sample a little bit. val sampleSizePerPartition = math.ceil(3.0 * sampleSize / rdd.partitions.length).toInt val (numItems, sketched) = RangePartitioner.sketch(rdd.map(_._1), sampleSizePerPartition) + val numSampled = sketched.map(_._3.length).sum if (numItems == 0L) { Array.empty } else { +// already got the whole data +if (sampleCacheEnabled && numItems == numSampled) { + // get the sampled data + sampledArray = sketched.foldLeft(Array.empty[K])((total, sample) => { --- End diff -- As you already know the size of `sampledArray`, maybe you can allocate it at once in advance. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
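The preallocation suggested above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `PreallocateSketch`, `concatFold`, and `concatPrealloc` are hypothetical names, and the input is assumed to have the shape used in the diff, where the third tuple element holds the per-partition sample (the first two are assumed to be a partition index and an item count).

```scala
import scala.reflect.ClassTag

object PreallocateSketch {
  // Fold-based concatenation, similar in spirit to the foldLeft in the diff:
  // every step allocates a new array and copies everything accumulated so far.
  def concatFold[K: ClassTag](sketched: Array[(Int, Long, Array[K])]): Array[K] =
    sketched.foldLeft(Array.empty[K])((total, s) => total ++ s._3)

  // Preallocated variant: the total length is known up front, so allocate the
  // result once and copy each sample into place exactly once.
  def concatPrealloc[K: ClassTag](sketched: Array[(Int, Long, Array[K])]): Array[K] = {
    val out = new Array[K](sketched.map(_._3.length).sum)
    var offset = 0
    sketched.foreach { case (_, _, sample) =>
      Array.copy(sample, 0, out, offset, sample.length)
      offset += sample.length
    }
    out
  }
}
```

Both functions return the same elements in the same order. The second avoids the repeated reallocation and copying that `++` performs at each fold step.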
[GitHub] spark pull request #21859: [SPARK-24900][SQL]Speed up sort when the dataset ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21859#discussion_r209408395 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -294,7 +296,12 @@ object ShuffleExchangeExec { sorter.sort(iter.asInstanceOf[Iterator[UnsafeRow]]) } } else { -rdd +part match { + case partitioner: RangePartitioner[InternalRow @unchecked, _] +if partitioner.getSampledArray != null => +sparkContext.parallelize(partitioner.getSampledArray.toSeq, rdd.getNumPartitions) --- End diff -- Instead of `rdd.getNumPartitions`, I think we should use `partitioner.numPartitions`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22017 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94585/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22017 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22007: [SPARK-25033] Bump Apache commons.{httpclient, httpcore}
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22007 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94580/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22007: [SPARK-25033] Bump Apache commons.{httpclient, httpcore}
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22007 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22007: [SPARK-25033] Bump Apache commons.{httpclient, httpcore}
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22007 **[Test build #94580 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94580/testReport)** for PR 22007 at commit [`316b9ad`](https://github.com/apache/spark/commit/316b9adc3be3e2d12ce5c092421901929d5455d4). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22017 **[Test build #94585 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94585/testReport)** for PR 22017 at commit [`595161f`](https://github.com/apache/spark/commit/595161fefbf55711b76530a9e53aff73491febd6). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22074: [SPARK-25089][R] removing lintr checks for 2.0
Github user shaneknapp closed the pull request at: https://github.com/apache/spark/pull/22074 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22073: [SPARK-25089][R] removing lintr checks for 2.1
Github user shaneknapp closed the pull request at: https://github.com/apache/spark/pull/22073 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22016: Fix typos
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22016 #22070 addresses more typos. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22074: [SPARK-25089][R] removing lintr checks for 2.0
Github user srowen commented on the issue: https://github.com/apache/spark/pull/22074 You can close both PRs now, they're merged (and won't close automatically) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22073: [SPARK-25089][R] removing lintr checks for 2.1
Github user srowen commented on the issue: https://github.com/apache/spark/pull/22073 Per other PR, merging as this did succeed in removing lintr, just uncovered test failures. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94579/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22037 **[Test build #94579 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94579/testReport)** for PR 22037 at commit [`0384a1a`](https://github.com/apache/spark/commit/0384a1af69573af317f9e644bcf04e12bf38f1f3). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22074: [SPARK-25089][R] removing lintr checks for 2.0
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/22074 yeah, the R linting takes place at the beginning of the build, and would have failed in spectacular fashion hours ago. same w/the other PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22074: [SPARK-25089][R] removing lintr checks for 2.0
Github user srowen commented on the issue: https://github.com/apache/spark/pull/22074 Unclear, but it seems this successfully removes lintr (and the associated failures)? Seems OK to merge if so. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22073: [SPARK-25089][R] removing lintr checks for 2.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22073 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22073: [SPARK-25089][R] removing lintr checks for 2.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22073 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94583/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22073: [SPARK-25089][R] removing lintr checks for 2.1
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22073 **[Test build #94583 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94583/testReport)** for PR 22073 at commit [`f2974fb`](https://github.com/apache/spark/commit/f2974fbaf518f9e5350324ea0bf32c2fcea6f9b3). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22074: [SPARK-25089][R] removing lintr checks for 2.0
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/22074 ugh... i wonder how long this has been broken? :\ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22074: [SPARK-25089][R] removing lintr checks for 2.0
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22074 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94584/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22074: [SPARK-25089][R] removing lintr checks for 2.0
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22074 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22074: [SPARK-25089][R] removing lintr checks for 2.0
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22074 **[Test build #94584 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94584/consoleFull)** for PR 22074 at commit [`b204a88`](https://github.com/apache/spark/commit/b204a88f9d0116a384682422f82f1a55be32443b). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22010: [SPARK-21436][CORE] Take advantage of known partitioner ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22010 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22010: [SPARK-21436][CORE] Take advantage of known partitioner ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22010 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94577/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22010: [SPARK-21436][CORE] Take advantage of known partitioner ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22010 **[Test build #94577 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94577/testReport)** for PR 22010 at commit [`5fd3659`](https://github.com/apache/spark/commit/5fd36592a26b07fdb58e79e4efbb6b70daea54df). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22072 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22072 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2061/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22072 **[Test build #94589 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94589/testReport)** for PR 22072 at commit [`1a6452e`](https://github.com/apache/spark/commit/1a6452ef0939c09c09801cff78b0214d7979bf6d). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22072 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20838 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94576/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20838 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20838 **[Test build #94576 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94576/testReport)** for PR 20838 at commit [`e41a8cc`](https://github.com/apache/spark/commit/e41a8cc303eb428fe99bacae3b9c87be57b03125). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22072 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94581/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22072 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22072 **[Test build #94581 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94581/testReport)** for PR 22072 at commit [`1a6452e`](https://github.com/apache/spark/commit/1a6452ef0939c09c09801cff78b0214d7979bf6d). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22047: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22047 **[Test build #94588 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94588/testReport)** for PR 22047 at commit [`6593cf4`](https://github.com/apache/spark/commit/6593cf43ed89da88c4bfae8b10682bea21746e5a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22047: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22047 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22047: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22047 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2060/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20838 **[Test build #94587 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94587/testReport)** for PR 20838 at commit [`da58029`](https://github.com/apache/spark/commit/da58029f6164247a37c9c33b3b94fff3453cb920). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94575/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22011 **[Test build #94575 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94575/testReport)** for PR 22011 at commit [`cf38531`](https://github.com/apache/spark/commit/cf3853177d0ed76efbffee8ced1021003b085a26). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org