[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20890 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88921/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20890 Merged build finished. Test FAILed.
[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20890 **[Test build #88921 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88921/testReport)** for PR 20890 at commit [`9abcf09`](https://github.com/apache/spark/commit/9abcf09d99d5a42a7620882c4a7d6ebbc43fff6a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #20979: [SPARK-23588][SQL] CatalystToExternalMap should s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/20979#discussion_r179355419

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -988,8 +988,20 @@ case class CatalystToExternalMap private(
   override def children: Seq[Expression] =
     keyLambdaFunction :: valueLambdaFunction :: inputData :: Nil

-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported")
+  private lazy val toScalaValue: Any => Any = {
+    assert(inputData.dataType.isInstanceOf[MapType])
+    val mapType = inputData.dataType.asInstanceOf[MapType]
+    CatalystTypeConverters.createToScalaConverter(mapType)
--- End diff --

yea, you're right. I'll update soon.
[GitHub] spark pull request #20797: [SPARK-23583][SQL] Invoke should support interpre...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/20797#discussion_r179355326

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
@@ -794,6 +794,52 @@ object ScalaReflection extends ScalaReflection {
     "interface", "long", "native", "new", "null", "package", "private", "protected", "public",
     "return", "short", "static", "strictfp", "super", "switch", "synchronized", "this", "throw",
     "throws", "transient", "true", "try", "void", "volatile", "while")
+
+  val typeJavaMapping = Map[DataType, Class[_]](
+    BooleanType -> classOf[Boolean],
+    ByteType -> classOf[Byte],
+    ShortType -> classOf[Short],
+    IntegerType -> classOf[Int],
+    LongType -> classOf[Long],
+    FloatType -> classOf[Float],
+    DoubleType -> classOf[Double],
+    StringType -> classOf[UTF8String],
+    DateType -> classOf[DateType.InternalType],
+    TimestampType -> classOf[TimestampType.InternalType],
+    BinaryType -> classOf[BinaryType.InternalType],
+    CalendarIntervalType -> classOf[CalendarInterval]
+  )
+
+  val typeBoxedJavaMapping = Map[DataType, Class[_]](
+    BooleanType -> classOf[java.lang.Boolean],
+    ByteType -> classOf[java.lang.Byte],
+    ShortType -> classOf[java.lang.Short],
+    IntegerType -> classOf[java.lang.Integer],
+    LongType -> classOf[java.lang.Long],
+    FloatType -> classOf[java.lang.Float],
+    DoubleType -> classOf[java.lang.Double],
+    DateType -> classOf[java.lang.Integer],
+    TimestampType -> classOf[java.lang.Long]
+  )
+
+  def dataTypeJavaClass(dt: DataType): Class[_] = {
+    dt match {
+      case _: DecimalType => classOf[Decimal]
+      case _: StructType => classOf[InternalRow]
+      case _: ArrayType => classOf[ArrayData]
+      case _: MapType => classOf[MapData]
+      case ObjectType(cls) => cls
+      case _ => typeJavaMapping.getOrElse(dt, classOf[java.lang.Object])
+    }
+  }
--- End diff --

ok
[GitHub] spark pull request #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should s...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/20980#discussion_r179355354

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -1176,8 +1176,28 @@ case class ExternalMapToCatalyst private(
   override def dataType: MapType = MapType(
     keyConverter.dataType, valueConverter.dataType, valueContainsNull = valueConverter.nullable)

-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported")
+  private lazy val keyCatalystConverter =
+    CatalystTypeConverters.createToCatalystConverter(dataType.keyType)
+  private lazy val valueCatalystConverter =
+    CatalystTypeConverters.createToCatalystConverter(dataType.valueType)
+
+  override def eval(input: InternalRow): Any = {
+    val result = child.eval(input)
+    if (result != null) {
+      val mapValue = result.asInstanceOf[Map[Any, Any]]
--- End diff --

ok
[GitHub] spark pull request #20797: [SPARK-23583][SQL] Invoke should support interpre...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20797#discussion_r179354457

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
@@ -794,6 +794,52 @@ object ScalaReflection extends ScalaReflection {
     "interface", "long", "native", "new", "null", "package", "private", "protected", "public",
     "return", "short", "static", "strictfp", "super", "switch", "synchronized", "this", "throw",
     "throws", "transient", "true", "try", "void", "volatile", "while")
+
+  val typeJavaMapping = Map[DataType, Class[_]](
+    BooleanType -> classOf[Boolean],
+    ByteType -> classOf[Byte],
+    ShortType -> classOf[Short],
+    IntegerType -> classOf[Int],
+    LongType -> classOf[Long],
+    FloatType -> classOf[Float],
+    DoubleType -> classOf[Double],
+    StringType -> classOf[UTF8String],
+    DateType -> classOf[DateType.InternalType],
+    TimestampType -> classOf[TimestampType.InternalType],
+    BinaryType -> classOf[BinaryType.InternalType],
+    CalendarIntervalType -> classOf[CalendarInterval]
+  )
+
+  val typeBoxedJavaMapping = Map[DataType, Class[_]](
+    BooleanType -> classOf[java.lang.Boolean],
+    ByteType -> classOf[java.lang.Byte],
+    ShortType -> classOf[java.lang.Short],
+    IntegerType -> classOf[java.lang.Integer],
+    LongType -> classOf[java.lang.Long],
+    FloatType -> classOf[java.lang.Float],
+    DoubleType -> classOf[java.lang.Double],
+    DateType -> classOf[java.lang.Integer],
+    TimestampType -> classOf[java.lang.Long]
+  )
+
+  def dataTypeJavaClass(dt: DataType): Class[_] = {
+    dt match {
+      case _: DecimalType => classOf[Decimal]
+      case _: StructType => classOf[InternalRow]
+      case _: ArrayType => classOf[ArrayData]
+      case _: MapType => classOf[MapData]
+      case ObjectType(cls) => cls
+      case _ => typeJavaMapping.getOrElse(dt, classOf[java.lang.Object])
+    }
+  }
--- End diff --

cc @maropu We can now use this instead of `CallMethodViaReflection.typeMapping`.
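For readers following the `typeJavaMapping` / `typeBoxedJavaMapping` discussion above, the pattern can be sketched outside of Spark. This is a hypothetical, simplified analogue (the `SimpleType` enum and class names are illustrative, not Spark's API): each logical type maps to a primitive JVM class and a boxed class, with a fallback to `Object`, mirroring what `dataTypeJavaClass` does.

```java
import java.util.HashMap;
import java.util.Map;

public class TypeMappingSketch {
    // Illustrative logical types, standing in for Catalyst's DataType hierarchy.
    enum SimpleType { BOOLEAN, INT, LONG, DOUBLE, STRING }

    static final Map<SimpleType, Class<?>> PRIMITIVE = new HashMap<>();
    static final Map<SimpleType, Class<?>> BOXED = new HashMap<>();
    static {
        PRIMITIVE.put(SimpleType.BOOLEAN, boolean.class);
        PRIMITIVE.put(SimpleType.INT, int.class);
        PRIMITIVE.put(SimpleType.LONG, long.class);
        PRIMITIVE.put(SimpleType.DOUBLE, double.class);
        PRIMITIVE.put(SimpleType.STRING, String.class);

        BOXED.put(SimpleType.BOOLEAN, Boolean.class);
        BOXED.put(SimpleType.INT, Integer.class);
        BOXED.put(SimpleType.LONG, Long.class);
        BOXED.put(SimpleType.DOUBLE, Double.class);
    }

    // Analogue of dataTypeJavaClass: look up the primitive class, fall back to Object.
    static Class<?> javaClass(SimpleType t) {
        return PRIMITIVE.getOrDefault(t, Object.class);
    }

    // Analogue of the boxed mapping, used where a nullable (object) slot is needed;
    // types without a boxed form just reuse the unboxed lookup.
    static Class<?> boxedJavaClass(SimpleType t) {
        return BOXED.getOrDefault(t, javaClass(t));
    }
}
```

The design point is the same as in the diff: a single shared table, consulted at expression-binding time, instead of each expression re-deriving the JVM class for a type.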
[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20757 Merged build finished. Test PASSed.
[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20757 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88919/ Test PASSed.
[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20757 **[Test build #88919 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88919/testReport)** for PR 20757 at commit [`dd1159b`](https://github.com/apache/spark/commit/dd1159b2ead8c605a71b73d927d22f640ef9f64e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20965 @mgaido91 @viirya Could you check again? Thanks!
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88918/ Test PASSed.
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Merged build finished. Test PASSed.
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20965 **[Test build #88918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88918/testReport)** for PR 20965 at commit [`640fd8f`](https://github.com/apache/spark/commit/640fd8f51e57edea12a36832b5c686a062ead7ab).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should s...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20980#discussion_r179349022

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -1176,8 +1176,28 @@ case class ExternalMapToCatalyst private(
   override def dataType: MapType = MapType(
     keyConverter.dataType, valueConverter.dataType, valueContainsNull = valueConverter.nullable)

-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported")
+  private lazy val keyCatalystConverter =
+    CatalystTypeConverters.createToCatalystConverter(dataType.keyType)
+  private lazy val valueCatalystConverter =
+    CatalystTypeConverters.createToCatalystConverter(dataType.valueType)
+
+  override def eval(input: InternalRow): Any = {
+    val result = child.eval(input)
+    if (result != null) {
+      val mapValue = result.asInstanceOf[Map[Any, Any]]
--- End diff --

The external map can be `java.util.Map` or `scala.collection.Map`.
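The point viirya raises — interpreted eval cannot blindly cast to one map type, because the external value may be a `java.util.Map` or a `scala.collection.Map` — amounts to a runtime dispatch on the value's class before converting entries. A hypothetical plain-Java sketch of that dispatch (`ScalaLikeMap` and `convertEntries` are illustrative stand-ins, not Spark code):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MapDispatchSketch {
    // Stand-in for scala.collection.Map seen through a minimal interface.
    interface ScalaLikeMap {
        Iterable<Map.Entry<Object, Object>> entries();
    }

    // Normalize either external representation into a list of entries,
    // the way an interpreted map-to-catalyst conversion would have to,
    // instead of a single unchecked cast.
    static List<Map.Entry<Object, Object>> convertEntries(Object externalMap) {
        List<Map.Entry<Object, Object>> out = new ArrayList<>();
        if (externalMap instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) externalMap).entrySet()) {
                out.add(new AbstractMap.SimpleEntry<Object, Object>(e.getKey(), e.getValue()));
            }
        } else if (externalMap instanceof ScalaLikeMap) {
            for (Map.Entry<Object, Object> e : ((ScalaLikeMap) externalMap).entries()) {
                out.add(new AbstractMap.SimpleEntry<Object, Object>(e.getKey(), e.getValue()));
            }
        } else {
            throw new IllegalArgumentException("Unsupported map type: " + externalMap.getClass());
        }
        return out;
    }
}
```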
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20962 Merged build finished. Test PASSed.
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20962 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88916/ Test PASSed.
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20962 **[Test build #88916 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88916/testReport)** for PR 20962 at commit [`c3550c2`](https://github.com/apache/spark/commit/c3550c23d4ae7a871ce8ee6198d418b4970c0669).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #20979: [SPARK-23588][SQL] CatalystToExternalMap should s...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20979#discussion_r179347546

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -988,8 +988,20 @@ case class CatalystToExternalMap private(
   override def children: Seq[Expression] =
     keyLambdaFunction :: valueLambdaFunction :: inputData :: Nil

-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported")
+  private lazy val toScalaValue: Any => Any = {
+    assert(inputData.dataType.isInstanceOf[MapType])
+    val mapType = inputData.dataType.asInstanceOf[MapType]
+    CatalystTypeConverters.createToScalaConverter(mapType)
--- End diff --

hmm, don't we need to support `collClass`?
[GitHub] spark pull request #20931: [SPARK-23815][Core]Spark writer dynamic partition...
Github user fangshil commented on a diff in the pull request: https://github.com/apache/spark/pull/20931#discussion_r179346911

--- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
@@ -186,7 +186,9 @@ class HadoopMapReduceCommitProtocol(
       logDebug(s"Clean up default partition directories for overwriting: $partitionPaths")
       for (part <- partitionPaths) {
         val finalPartPath = new Path(path, part)
-        fs.delete(finalPartPath, true)
+        if (!fs.delete(finalPartPath, true) && !fs.exists(finalPartPath.getParent)) {
+          fs.mkdirs(finalPartPath.getParent)
--- End diff --

@cloud-fan yes, the official HDFS documentation (https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html) states a precondition for the rename command: "dest must be root, or have a parent that exists".
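The precondition matters because a later rename into `finalPartPath` fails if its parent directory is missing after cleanup. The same guard can be illustrated with the local filesystem via `java.nio.file` — a sketch of the pattern, not the Hadoop `FileSystem` API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ParentGuardSketch {
    // Delete a directory tree, then make sure its parent exists, so that a
    // subsequent rename/move into the same location cannot violate the
    // "dest must have a parent that exists" precondition.
    static void deleteKeepingParent(Path dir) throws IOException {
        if (Files.exists(dir)) {
            try (Stream<Path> walk = Files.walk(dir)) {
                // Delete children before parents.
                walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                });
            }
        }
        Path parent = dir.getParent();
        if (parent != null && !Files.exists(parent)) {
            Files.createDirectories(parent);
        }
    }
}
```

As in the diff, the guard only recreates the parent when it is actually missing (e.g. when the delete was a no-op because the path never existed).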
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1991/ Test PASSed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Merged build finished. Test PASSed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20981 **[Test build #88926 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88926/testReport)** for PR 20981 at commit [`35539be`](https://github.com/apache/spark/commit/35539be14bc79647409bfcdc5dac1fb99d00f5ec).
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20981 cc @hvanhovell
[GitHub] spark pull request #20981: [SPARK-23873][SQL] Use accessors in interpreted L...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/20981

[SPARK-23873][SQL] Use accessors in interpreted LambdaVariable

## What changes were proposed in this pull request?

Currently, interpreted execution of `LambdaVariable` just uses `InternalRow.get` to access elements. We should use specialized accessors where possible.

## How was this patch tested?

Added a test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-23873

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20981.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20981

commit 35539be14bc79647409bfcdc5dac1fb99d00f5ec
Author: Liang-Chi Hsieh
Date: 2018-04-05T03:57:17Z

    Use accessors in interpreted LambdaVariable.
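The idea behind the PR — bind a type-specific accessor once instead of calling a generic getter on every evaluation — can be sketched in isolation. The `Row`, `Kind`, and `accessor` names below are hypothetical stand-ins, not Spark's `InternalRow` API:

```java
import java.util.function.Function;

public class AccessorSketch {
    // Minimal stand-in for an internal row with typed slot accessors.
    static final class Row {
        final Object[] values;
        Row(Object... values) { this.values = values; }
        int getInt(int ordinal) { return (Integer) values[ordinal]; }
        String getString(int ordinal) { return (String) values[ordinal]; }
        Object get(int ordinal) { return values[ordinal]; }
    }

    enum Kind { INT, STRING, OTHER }

    // Resolve the accessor once, at "bind" time; per-row evaluation then calls
    // the specialized method directly instead of dispatching on the type each time.
    static Function<Row, Object> accessor(Kind kind, int ordinal) {
        switch (kind) {
            case INT:
                return row -> row.getInt(ordinal);
            case STRING:
                return row -> row.getString(ordinal);
            default:
                return row -> row.get(ordinal);
        }
    }
}
```

The payoff is the same as in interpreted `LambdaVariable`: the type decision is made once per expression, not once per row.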
[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20979 Merged build finished. Test PASSed.
[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1990/ Test PASSed.
[GitHub] spark issue #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should support ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20980 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1989/ Test PASSed.
[GitHub] spark issue #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should support ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20980 Merged build finished. Test PASSed.
[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20979 **[Test build #88925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88925/testReport)** for PR 20979 at commit [`352777b`](https://github.com/apache/spark/commit/352777b162752397efac0556a87c924935000e4a).
[GitHub] spark issue #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should support ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20980 **[Test build #88924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88924/testReport)** for PR 20980 at commit [`52e76d7`](https://github.com/apache/spark/commit/52e76d760b2c9e1c9c72e1ee98a664bcf92c2c35).
[GitHub] spark pull request #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should s...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/20980

[SPARK-23589][SQL] ExternalMapToCatalyst should support interpreted execution

## What changes were proposed in this pull request?

This PR adds interpreted-mode support for `ExternalMapToCatalyst`.

## How was this patch tested?

Added tests in `ObjectExpressionsSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-23589

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20980.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20980

commit 52e76d760b2c9e1c9c72e1ee98a664bcf92c2c35
Author: Takeshi Yamamuro
Date: 2018-03-09T07:37:57Z

    ExternalMapToCatalyst should support interpreted execution
[GitHub] spark pull request #20979: [SPARK-23588][SQL] CatalystToExternalMap should s...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/20979

[SPARK-23588][SQL] CatalystToExternalMap should support interpreted execution

## What changes were proposed in this pull request?

This PR adds interpreted-mode support for `CatalystToExternalMap`.

## How was this patch tested?

Added tests in `ObjectExpressionsSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-23588

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20979.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20979

commit 352777b162752397efac0556a87c924935000e4a
Author: Takeshi Yamamuro
Date: 2018-03-12T05:32:22Z

    CatalystToExternalMap should support interpreted execution
[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20933 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1988/ Test PASSed.
[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20933 Merged build finished. Test PASSed.
[GitHub] spark pull request #20953: [SPARK-23822][SQL] Improve error message for Parq...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20953#discussion_r179343192

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaColumnConvertNotSupportedException.java ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.parquet;
+
+import org.apache.spark.annotation.InterfaceStability;
+
+/**
+ * Exception thrown when the parquet reader finds column type mismatches.
+ */
+@InterfaceStability.Unstable
+public class ParquetSchemaColumnConvertNotSupportedException extends RuntimeException {
--- End diff --

@gatorsmile and @yuchenhuo Why don't we generalize this exception by removing `Parquet` and moving it to the parent `datasources` package? Since this exception doesn't carry Parquet-specific information at all, that would be better for the future.
[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20933 **[Test build #88923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88923/testReport)** for PR 20933 at commit [`35b74c0`](https://github.com/apache/spark/commit/35b74c08c3aefcd3baaee24b35a3d962a8e069ac).
[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/20933 retest this please.
[GitHub] spark pull request #20953: [SPARK-23822][SQL] Improve error message for Parq...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20953#discussion_r179342708

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala ---
@@ -382,6 +384,59 @@ class ParquetSchemaSuite extends ParquetSchemaTest {
     }
   }

+  // ===
+  // Tests for parquet schema mismatch error
+  // ===
+  def testSchemaMismatch(path: String, vectorizedReaderEnabled: Boolean): SparkException = {
+    import testImplicits._
+
+    var e: SparkException = null
+    // Disable databricks' vectorized parquet reader and use open source version.
--- End diff --

@yuchenhuo. In Apache Spark, this comment should be removed.
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20786 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88914/ Test FAILed.
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20786 Merged build finished. Test FAILed.
[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20786 **[Test build #88914 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88914/testReport)** for PR 20786 at commit [`48c17d4`](https://github.com/apache/spark/commit/48c17d4dff6a4e82b86d70f3845e6d524b4807e5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `sealed trait ClassificationNode extends Node `
   * `sealed trait RegressionNode extends Node `
   * `sealed trait LeafNode extends Node `
   * `sealed trait InternalNode extends Node `
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20753 **[Test build #88922 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88922/testReport)** for PR 20753 at commit [`7b30c30`](https://github.com/apache/spark/commit/7b30c30a68107880cda108e6a0ed923c27ac6d56).
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20753 Merged build finished. Test PASSed.
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20753 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1987/ Test PASSed.
[GitHub] spark pull request #20977: [SPARK-23867][Scheduler] use droppedCount in logW...
Github user Ngone51 commented on a diff in the pull request: https://github.com/apache/spark/pull/20977#discussion_r179338562

--- Diff: core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala ---
@@ -166,7 +166,7 @@ private class AsyncEventQueue(val name: String, conf: SparkConf, metrics: LiveLi
       val prevLastReportTimestamp = lastReportTimestamp
       lastReportTimestamp = System.currentTimeMillis()
       val previous = new java.util.Date(prevLastReportTimestamp)
-      logWarning(s"Dropped $droppedEvents events from $name since $previous.")
+      logWarning(s"Dropped $droppedCount events from $name since $previous.")
--- End diff --

I think the change is OK, since it tells us how many events were dropped during the period `[lastReportTimestamp, System.currentTimeMillis()]`.

BTW, I still have the same question about `lastReportTimestamp` (though this is none of this PR's business): the first time we log a warning, `lastReportTimestamp` is `0` because of its default value and the lack of any initialization. So `previous` will be `1970/1/1 8:0:0`, and the warning claims we dropped several events since `1970/1/1 8:0:0`. That feels quite weird.
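The window semantics under discussion can be made concrete with a small stand-alone sketch (a hypothetical class, not the `AsyncEventQueue` code): a counter accumulates drops, and each report snapshots the count and the previous report time, so the warning covers `[lastReportTimestamp, now]`. Initializing the timestamp to construction time rather than `0` avoids the "since 1970" first message.

```java
import java.util.Date;
import java.util.concurrent.atomic.AtomicLong;

public class DropReporterSketch {
    private final AtomicLong droppedCount = new AtomicLong();
    // Initialize to "now" so the first warning does not claim drops since 1970.
    private long lastReportTimestamp = System.currentTimeMillis();

    void onDrop() {
        droppedCount.incrementAndGet();
    }

    // Snapshot and reset the window; returns the warning text that would be logged.
    String report(long now) {
        long dropped = droppedCount.getAndSet(0);
        long prev = lastReportTimestamp;
        lastReportTimestamp = now;
        return "Dropped " + dropped + " events since " + new Date(prev) + ".";
    }
}
```

`getAndSet(0)` makes the snapshot-and-reset atomic, so drops that arrive while a report is being formatted are counted in the next window instead of being lost.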
[GitHub] spark issue #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when window f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19193 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88910/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when window f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19193 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when window f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19193 **[Test build #88910 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88910/testReport)** for PR 19193 at commit [`cb19ad6`](https://github.com/apache/spark/commit/cb19ad614b7b757b71e6995aa0c5d8295d8f2485). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20890 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20890 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1986/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20890 **[Test build #88921 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88921/testReport)** for PR 20890 at commit [`9abcf09`](https://github.com/apache/spark/commit/9abcf09d99d5a42a7620882c4a7d6ebbc43fff6a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20890 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20778: [SPARK-23584][SQL] NewInstance should support interprete...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20778 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1985/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20778: [SPARK-23584][SQL] NewInstance should support interprete...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20778 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20778: [SPARK-23584][SQL] NewInstance should support interprete...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20778 **[Test build #88920 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88920/testReport)** for PR 20778 at commit [`264f129`](https://github.com/apache/spark/commit/264f1298e099356f53ea7bf1f0dfcd38fe514532). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179335313 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java --- @@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws IOException { final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8); for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) { - UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i) + new UTF8String( +new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i)) --- End diff -- Sure, I was just saying that it looks strange. I know you didn't change the unit test's behavior. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179335019 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java --- @@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws IOException { final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8); for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) { - UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i) + new UTF8String( +new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i)) --- End diff -- This change keeps the original behavior for backward compatibility. If we want to change the behavior, update the test, or delete it, that would be better addressed in a separate PR. Shall we discuss this UT in another PR? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20757 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20757 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1984/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20757 **[Test build #88919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88919/testReport)** for PR 20757 at commit [`dd1159b`](https://github.com/apache/spark/commit/dd1159b2ead8c605a71b73d927d22f640ef9f64e). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179333954 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java --- @@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws IOException { final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8); for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) { - UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i) + new UTF8String( +new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i)) --- End diff -- It depends on what we are testing at this unit test. From the comment in this test `// offset underflow is apparently supported?`, looks like this behavior is uncertain. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179332522 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java --- @@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws IOException { final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8); for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) { - UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i) + new UTF8String( +new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i)) --- End diff -- What would you suggest to make it less strange? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179332281 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import com.google.common.primitives.Ints; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a byte array on Java heap.
+ */ +public final class ByteArrayMemoryBlock extends MemoryBlock { + + private final byte[] array; + + public ByteArrayMemoryBlock(byte[] obj, long offset, long size) { +super(obj, offset, size); +this.array = obj; +assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) : + "The sum of size " + size + " and offset " + offset + " should not be larger than " + +"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET); + } + + public ByteArrayMemoryBlock(long length) { +this(new byte[Ints.checkedCast(length)], Platform.BYTE_ARRAY_OFFSET, length); + } + + @Override + public MemoryBlock subBlock(long offset, long size) { +checkSubBlockRange(offset, size); +return new ByteArrayMemoryBlock(array, this.offset + offset, size); + } + + public byte[] getByteArray() { return array; } + + /** + * Creates a memory block pointing to the memory used by the byte array. + */ + public static ByteArrayMemoryBlock fromArray(final byte[] array) { +return new ByteArrayMemoryBlock(array, Platform.BYTE_ARRAY_OFFSET, array.length); + } + + @Override + public final int getInt(long offset) { +return Platform.getInt(array, this.offset + offset); + } + + @Override + public final void putInt(long offset, int value) { +Platform.putInt(array, this.offset + offset, value); + } + + @Override + public final boolean getBoolean(long offset) { +return Platform.getBoolean(array, this.offset + offset); + } + + @Override + public final void putBoolean(long offset, boolean value) { +Platform.putBoolean(array, this.offset + offset, value); + } + + @Override + public final byte getByte(long offset) { +return array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)]; + } + + @Override + public final void putByte(long offset, byte value) { +array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)] = value; --- End diff -- ditto --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179331924 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import com.google.common.primitives.Ints; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a byte array on Java heap.
+ */ +public final class ByteArrayMemoryBlock extends MemoryBlock { + + private final byte[] array; + + public ByteArrayMemoryBlock(byte[] obj, long offset, long size) { +super(obj, offset, size); +this.array = obj; +assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) : + "The sum of size " + size + " and offset " + offset + " should not be larger than " + +"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET); + } + + public ByteArrayMemoryBlock(long length) { +this(new byte[Ints.checkedCast(length)], Platform.BYTE_ARRAY_OFFSET, length); + } + + @Override + public MemoryBlock subBlock(long offset, long size) { +checkSubBlockRange(offset, size); +return new ByteArrayMemoryBlock(array, this.offset + offset, size); + } + + public byte[] getByteArray() { return array; } + + /** + * Creates a memory block pointing to the memory used by the byte array. + */ + public static ByteArrayMemoryBlock fromArray(final byte[] array) { +return new ByteArrayMemoryBlock(array, Platform.BYTE_ARRAY_OFFSET, array.length); + } + + @Override + public final int getInt(long offset) { +return Platform.getInt(array, this.offset + offset); + } + + @Override + public final void putInt(long offset, int value) { +Platform.putInt(array, this.offset + offset, value); + } + + @Override + public final boolean getBoolean(long offset) { +return Platform.getBoolean(array, this.offset + offset); + } + + @Override + public final void putBoolean(long offset, boolean value) { +Platform.putBoolean(array, this.offset + offset, value); + } + + @Override + public final byte getByte(long offset) { +return array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)]; --- End diff -- Suggestion from @cloud-fan. The Java array access is faster than `Platform.getByte`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
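The access style @cloud-fan suggested above can be sketched in a few lines: translate the absolute offset back to a plain array index and use ordinary array indexing. This is an illustrative stand-alone sketch, not Spark's `ByteArrayMemoryBlock`; the base-offset constant is a hypothetical stand-in for `Platform.BYTE_ARRAY_OFFSET`.

```java
// Minimal sketch of offset-to-index translation for on-heap byte access.
public class ByteBlockSketch {
    static final long BYTE_ARRAY_OFFSET = 16; // hypothetical stand-in value

    private final byte[] array;
    private final long offset; // absolute offset, already includes the base offset

    ByteBlockSketch(byte[] array, long offset) {
        this.array = array;
        this.offset = offset;
    }

    // Plain array indexing after translating the absolute offset back to an
    // index; JIT-compiled array access is typically faster than an Unsafe call.
    byte getByte(long off) {
        return array[(int) (this.offset + off - BYTE_ARRAY_OFFSET)];
    }
}
```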
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20937 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20937 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88907/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20937 **[Test build #88907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88907/testReport)** for PR 20937 at commit [`7e20891`](https://github.com/apache/spark/commit/7e208912a370af0466dfcadbff0787f2b3248dd9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1983/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20965 **[Test build #88917 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88917/testReport)** for PR 20965 at commit [`b96aa8d`](https://github.com/apache/spark/commit/b96aa8d4d2d0021439499e90e051982b1ed121bd). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88917/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20965 **[Test build #88918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88918/testReport)** for PR 20965 at commit [`640fd8f`](https://github.com/apache/spark/commit/640fd8f51e57edea12a36832b5c686a062ead7ab). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1982/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20965 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20965 **[Test build #88917 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88917/testReport)** for PR 20965 at commit [`b96aa8d`](https://github.com/apache/spark/commit/b96aa8d4d2d0021439499e90e051982b1ed121bd). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20962 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20962 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1981/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20962 **[Test build #88916 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88916/testReport)** for PR 20962 at commit [`c3550c2`](https://github.com/apache/spark/commit/c3550c23d4ae7a871ce8ee6198d418b4970c0669). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/20962 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20961: [SPARK-23823][SQL] Keep origin in transformExpression
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20961 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88909/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20961: [SPARK-23823][SQL] Keep origin in transformExpression
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20961 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20961: [SPARK-23823][SQL] Keep origin in transformExpression
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20961 **[Test build #88909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88909/testReport)** for PR 20961 at commit [`52e8875`](https://github.com/apache/spark/commit/52e8875f040a327c7045fdba3fbd4671690cb0c5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class QueryPlanSuite extends SparkFunSuite ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20965: [SPARK-21870][SQL] Split aggregation code into sm...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/20965#discussion_r179322582 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -1063,6 +1064,29 @@ class CodegenContext { } } +object CodegenContext { --- End diff -- ok --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20913: [SPARK-23799] FilterEstimation.evaluateInSet prod...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/20913#discussion_r179322407 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -427,7 +427,11 @@ case class FilterEstimation(plan: Filter) extends Logging { // return the filter selectivity. Without advanced statistics such as histograms, // we have to assume uniform distribution. -Some(math.min(newNdv.toDouble / ndv.toDouble, 1.0)) +if (ndv.toDouble != 0) { --- End diff -- Can you add a test for the empty table case? I think we need to fix the other places if they have the same issue. cc: @wzhfy --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
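A minimal sketch of the guard under review: when `ndv` is `0` (e.g. an empty table), the unguarded division would produce `NaN` or `Infinity`. The method name and the `0.0` fallback here are assumptions for illustration, not necessarily what Spark's `FilterEstimation` does in its else branch.

```java
// Hedged sketch of guarding the selectivity division discussed above.
public class SelectivitySketch {
    static double estimate(double newNdv, double ndv) {
        if (ndv != 0.0) {
            // assume uniform distribution; cap the ratio at 1.0
            return Math.min(newNdv / ndv, 1.0);
        }
        // ndv == 0 (e.g. an empty table): 0/0 would yield NaN,
        // so fall back to a selectivity of 0.0 (assumed here)
        return 0.0;
    }
}
```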
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179321572 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java --- @@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws IOException { final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8); for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) { - UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i) + new UTF8String( +new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i)) --- End diff -- Looks a bit strange. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179321207 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/UnsafeMemoryAllocator.java --- @@ -19,15 +19,21 @@ import org.apache.spark.unsafe.Platform; +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.nio.ByteBuffer; + +import sun.nio.ch.DirectBuffer; --- End diff -- Seems unused imports? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179320913 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/OnHeapMemoryBlock.java --- @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import com.google.common.primitives.Ints; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a long array on Java heap. + */ +public final class OnHeapMemoryBlock extends MemoryBlock { + + private final long[] array; + + public OnHeapMemoryBlock(long[] obj, long offset, long size) { +super(obj, offset, size); +this.array = obj; +assert(offset + size <= obj.length * 8L + Platform.LONG_ARRAY_OFFSET) : + "The sum of size " + size + " and offset " + offset + " should not be larger than " + +"the array size " + (obj.length * 8L + Platform.LONG_ARRAY_OFFSET); --- End diff -- ditto. maybe "the size of the given memory space"? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
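The range assertion being discussed above can be sketched as a standalone predicate: a block of `size` bytes starting at `offset` must fit within the memory spanned by the backing `long[]` (8 bytes per element, past the base offset). The base-offset constant is a hypothetical stand-in for `Platform.LONG_ARRAY_OFFSET`.

```java
// Sketch of the on-heap memory range check discussed above.
public class RangeCheckSketch {
    static final long LONG_ARRAY_OFFSET = 16; // hypothetical stand-in value

    // Returns true when [offset, offset + size) stays inside the memory
    // space backed by the long[] array.
    static boolean fitsIn(long offset, long size, long[] obj) {
        return offset + size <= obj.length * 8L + LONG_ARRAY_OFFSET;
    }
}
```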
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179320302 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import com.google.common.primitives.Ints; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a byte array on Java heap.
+ */ +public final class ByteArrayMemoryBlock extends MemoryBlock { + + private final byte[] array; + + public ByteArrayMemoryBlock(byte[] obj, long offset, long size) { +super(obj, offset, size); +this.array = obj; +assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) : + "The sum of size " + size + " and offset " + offset + " should not be larger than " + +"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET); + } + + public ByteArrayMemoryBlock(long length) { +this(new byte[Ints.checkedCast(length)], Platform.BYTE_ARRAY_OFFSET, length); + } + + @Override + public MemoryBlock subBlock(long offset, long size) { +checkSubBlockRange(offset, size); +return new ByteArrayMemoryBlock(array, this.offset + offset, size); + } + + public byte[] getByteArray() { return array; } + + /** + * Creates a memory block pointing to the memory used by the byte array. + */ + public static ByteArrayMemoryBlock fromArray(final byte[] array) { +return new ByteArrayMemoryBlock(array, Platform.BYTE_ARRAY_OFFSET, array.length); + } + + @Override + public final int getInt(long offset) { +return Platform.getInt(array, this.offset + offset); + } + + @Override + public final void putInt(long offset, int value) { +Platform.putInt(array, this.offset + offset, value); + } + + @Override + public final boolean getBoolean(long offset) { +return Platform.getBoolean(array, this.offset + offset); + } + + @Override + public final void putBoolean(long offset, boolean value) { +Platform.putBoolean(array, this.offset + offset, value); + } + + @Override + public final byte getByte(long offset) { +return array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)]; + } + + @Override + public final void putByte(long offset, byte value) { +array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)] = value; --- End diff -- ditto.
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179320267 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import com.google.common.primitives.Ints; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a byte array on Java heap. 
+ */ +public final class ByteArrayMemoryBlock extends MemoryBlock { + + private final byte[] array; + + public ByteArrayMemoryBlock(byte[] obj, long offset, long size) { +super(obj, offset, size); +this.array = obj; +assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) : + "The sum of size " + size + " and offset " + offset + " should not be larger than " + +"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET); + } + + public ByteArrayMemoryBlock(long length) { +this(new byte[Ints.checkedCast(length)], Platform.BYTE_ARRAY_OFFSET, length); + } + + @Override + public MemoryBlock subBlock(long offset, long size) { +checkSubBlockRange(offset, size); +return new ByteArrayMemoryBlock(array, this.offset + offset, size); + } + + public byte[] getByteArray() { return array; } + + /** + * Creates a memory block pointing to the memory used by the byte array. + */ + public static ByteArrayMemoryBlock fromArray(final byte[] array) { +return new ByteArrayMemoryBlock(array, Platform.BYTE_ARRAY_OFFSET, array.length); + } + + @Override + public final int getInt(long offset) { +return Platform.getInt(array, this.offset + offset); + } + + @Override + public final void putInt(long offset, int value) { +Platform.putInt(array, this.offset + offset, value); + } + + @Override + public final boolean getBoolean(long offset) { +return Platform.getBoolean(array, this.offset + offset); + } + + @Override + public final void putBoolean(long offset, boolean value) { +Platform.putBoolean(array, this.offset + offset, value); + } + + @Override + public final byte getByte(long offset) { +return array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)]; --- End diff -- Why not use `Platform.getByte`?
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179320092 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/OnHeapMemoryBlock.java --- @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a long array on Java heap. + */ +public final class OnHeapMemoryBlock extends MemoryBlock { --- End diff -- Can we put @cloud-fan's comment https://github.com/apache/spark/pull/19222/files#r172356146 into `OnHeapMemoryBlock`'s doc?
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179319738 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import com.google.common.primitives.Ints; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a byte array on Java heap. + */ +public final class ByteArrayMemoryBlock extends MemoryBlock { + + private final byte[] array; + + public ByteArrayMemoryBlock(byte[] obj, long offset, long size) { +super(obj, offset, size); +this.array = obj; +assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) : --- End diff -- hmm, from here and the accessors below, it looks like the given `offset` includes `Platform.BYTE_ARRAY_OFFSET`? What will happen if a given offset is less than that?
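The addressing convention being questioned can be illustrated with a minimal sketch (a hypothetical class, not the PR's actual code, using a made-up stand-in value for `Platform.BYTE_ARRAY_OFFSET`): the constructor's `offset` is absolute, i.e. it already includes the platform base offset, and the byte accessors subtract that base to recover a plain array index. An offset below the base would produce a negative index, which the assert does not currently guard against.

```java
// Sketch of the offset convention in the quoted diff.
// ByteBlockSketch is a hypothetical class, not part of the PR.
public class ByteBlockSketch {
    // Stand-in for Platform.BYTE_ARRAY_OFFSET; the real value is JVM-dependent.
    static final long BYTE_ARRAY_OFFSET = 16;

    final byte[] array;
    final long offset;  // absolute: already includes BYTE_ARRAY_OFFSET

    ByteBlockSketch(byte[] array, long offset, long size) {
        // Same upper bound the PR asserts; note there is no lower-bound check.
        assert offset + size <= BYTE_ARRAY_OFFSET + array.length;
        this.array = array;
        this.offset = offset;
    }

    byte getByte(long off) {
        // Subtract the base offset to turn the absolute address into an index.
        return array[(int) (this.offset + off - BYTE_ARRAY_OFFSET)];
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30, 40};
        // A block starting one byte into the array.
        ByteBlockSketch block = new ByteBlockSketch(data, BYTE_ARRAY_OFFSET + 1, 2);
        System.out.println(block.getByte(0));  // reads data[1]
    }
}
```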
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r179319478 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import com.google.common.primitives.Ints; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a byte array on Java heap. + */ +public final class ByteArrayMemoryBlock extends MemoryBlock { + + private final byte[] array; + + public ByteArrayMemoryBlock(byte[] obj, long offset, long size) { +super(obj, offset, size); +this.array = obj; +assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) : + "The sum of size " + size + " and offset " + offset + " should not be larger than " + +"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET); --- End diff -- Describing `obj.length + Platform.BYTE_ARRAY_OFFSET` as the array size might be a bit confusing in an error message.
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20962 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88908/ Test FAILed.
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20962 Merged build finished. Test FAILed.
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20962 **[Test build #88908 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88908/testReport)** for PR 20962 at commit [`c3550c2`](https://github.com/apache/spark/commit/c3550c23d4ae7a871ce8ee6198d418b4970c0669). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20970: [SPARK-23870][ML] Forward RFormula handleInvalid Param t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20970 Merged build finished. Test PASSed.
[GitHub] spark issue #20970: [SPARK-23870][ML] Forward RFormula handleInvalid Param t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20970 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88911/ Test PASSed.