[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88921/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88921 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88921/testReport)**
 for PR 20890 at commit 
[`9abcf09`](https://github.com/apache/spark/commit/9abcf09d99d5a42a7620882c4a7d6ebbc43fff6a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20979: [SPARK-23588][SQL] CatalystToExternalMap should s...

2018-04-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20979#discussion_r179355419
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -988,8 +988,20 @@ case class CatalystToExternalMap private(
   override def children: Seq[Expression] =
 keyLambdaFunction :: valueLambdaFunction :: inputData :: Nil
 
-  override def eval(input: InternalRow): Any =
-throw new UnsupportedOperationException("Only code-generated 
evaluation is supported")
+  private lazy val toScalaValue: Any => Any = {
+assert(inputData.dataType.isInstanceOf[MapType])
+val mapType = inputData.dataType.asInstanceOf[MapType]
+CatalystTypeConverters.createToScalaConverter(mapType)
--- End diff --

yea, you're right. I'll update soon. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20797: [SPARK-23583][SQL] Invoke should support interpre...

2018-04-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20797#discussion_r179355326
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala 
---
@@ -794,6 +794,52 @@ object ScalaReflection extends ScalaReflection {
 "interface", "long", "native", "new", "null", "package", "private", 
"protected", "public",
 "return", "short", "static", "strictfp", "super", "switch", 
"synchronized", "this", "throw",
 "throws", "transient", "true", "try", "void", "volatile", "while")
+
+  val typeJavaMapping = Map[DataType, Class[_]](
+BooleanType -> classOf[Boolean],
+ByteType -> classOf[Byte],
+ShortType -> classOf[Short],
+IntegerType -> classOf[Int],
+LongType -> classOf[Long],
+FloatType -> classOf[Float],
+DoubleType -> classOf[Double],
+StringType -> classOf[UTF8String],
+DateType -> classOf[DateType.InternalType],
+TimestampType -> classOf[TimestampType.InternalType],
+BinaryType -> classOf[BinaryType.InternalType],
+CalendarIntervalType -> classOf[CalendarInterval]
+  )
+
+  val typeBoxedJavaMapping = Map[DataType, Class[_]](
+BooleanType -> classOf[java.lang.Boolean],
+ByteType -> classOf[java.lang.Byte],
+ShortType -> classOf[java.lang.Short],
+IntegerType -> classOf[java.lang.Integer],
+LongType -> classOf[java.lang.Long],
+FloatType -> classOf[java.lang.Float],
+DoubleType -> classOf[java.lang.Double],
+DateType -> classOf[java.lang.Integer],
+TimestampType -> classOf[java.lang.Long]
+  )
+
+  def dataTypeJavaClass(dt: DataType): Class[_] = {
+dt match {
+  case _: DecimalType => classOf[Decimal]
+  case _: StructType => classOf[InternalRow]
+  case _: ArrayType => classOf[ArrayData]
+  case _: MapType => classOf[MapData]
+  case ObjectType(cls) => cls
+  case _ => typeJavaMapping.getOrElse(dt, classOf[java.lang.Object])
+}
+  }
--- End diff --

ok


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should s...

2018-04-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20980#discussion_r179355354
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -1176,8 +1176,28 @@ case class ExternalMapToCatalyst private(
   override def dataType: MapType = MapType(
 keyConverter.dataType, valueConverter.dataType, valueContainsNull = 
valueConverter.nullable)
 
-  override def eval(input: InternalRow): Any =
-throw new UnsupportedOperationException("Only code-generated 
evaluation is supported")
+  private lazy val keyCatalystConverter =
+CatalystTypeConverters.createToCatalystConverter(dataType.keyType)
+  private lazy val valueCatalystConverter =
+CatalystTypeConverters.createToCatalystConverter(dataType.valueType)
+
+  override def eval(input: InternalRow): Any = {
+val result = child.eval(input)
+if (result != null) {
+  val mapValue = result.asInstanceOf[Map[Any, Any]]
--- End diff --

ok


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20797: [SPARK-23583][SQL] Invoke should support interpre...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20797#discussion_r179354457
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala 
---
@@ -794,6 +794,52 @@ object ScalaReflection extends ScalaReflection {
 "interface", "long", "native", "new", "null", "package", "private", 
"protected", "public",
 "return", "short", "static", "strictfp", "super", "switch", 
"synchronized", "this", "throw",
 "throws", "transient", "true", "try", "void", "volatile", "while")
+
+  val typeJavaMapping = Map[DataType, Class[_]](
+BooleanType -> classOf[Boolean],
+ByteType -> classOf[Byte],
+ShortType -> classOf[Short],
+IntegerType -> classOf[Int],
+LongType -> classOf[Long],
+FloatType -> classOf[Float],
+DoubleType -> classOf[Double],
+StringType -> classOf[UTF8String],
+DateType -> classOf[DateType.InternalType],
+TimestampType -> classOf[TimestampType.InternalType],
+BinaryType -> classOf[BinaryType.InternalType],
+CalendarIntervalType -> classOf[CalendarInterval]
+  )
+
+  val typeBoxedJavaMapping = Map[DataType, Class[_]](
+BooleanType -> classOf[java.lang.Boolean],
+ByteType -> classOf[java.lang.Byte],
+ShortType -> classOf[java.lang.Short],
+IntegerType -> classOf[java.lang.Integer],
+LongType -> classOf[java.lang.Long],
+FloatType -> classOf[java.lang.Float],
+DoubleType -> classOf[java.lang.Double],
+DateType -> classOf[java.lang.Integer],
+TimestampType -> classOf[java.lang.Long]
+  )
+
+  def dataTypeJavaClass(dt: DataType): Class[_] = {
+dt match {
+  case _: DecimalType => classOf[Decimal]
+  case _: StructType => classOf[InternalRow]
+  case _: ArrayType => classOf[ArrayData]
+  case _: MapType => classOf[MapData]
+  case ObjectType(cls) => cls
+  case _ => typeJavaMapping.getOrElse(dt, classOf[java.lang.Object])
+}
+  }
--- End diff --

cc @maropu We can now use this instead of 
`CallMethodViaReflection.typeMapping`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20757
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20757
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88919/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20757
  
**[Test build #88919 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88919/testReport)**
 for PR 20757 at commit 
[`dd1159b`](https://github.com/apache/spark/commit/dd1159b2ead8c605a71b73d927d22f640ef9f64e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20965
  
@mgaido91 @viirya Could you check again? Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20965
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88918/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20965
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20965
  
**[Test build #88918 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88918/testReport)**
 for PR 20965 at commit 
[`640fd8f`](https://github.com/apache/spark/commit/640fd8f51e57edea12a36832b5c686a062ead7ab).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should s...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20980#discussion_r179349022
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -1176,8 +1176,28 @@ case class ExternalMapToCatalyst private(
   override def dataType: MapType = MapType(
 keyConverter.dataType, valueConverter.dataType, valueContainsNull = 
valueConverter.nullable)
 
-  override def eval(input: InternalRow): Any =
-throw new UnsupportedOperationException("Only code-generated 
evaluation is supported")
+  private lazy val keyCatalystConverter =
+CatalystTypeConverters.createToCatalystConverter(dataType.keyType)
+  private lazy val valueCatalystConverter =
+CatalystTypeConverters.createToCatalystConverter(dataType.valueType)
+
+  override def eval(input: InternalRow): Any = {
+val result = child.eval(input)
+if (result != null) {
+  val mapValue = result.asInstanceOf[Map[Any, Any]]
--- End diff --

The external map can be `java.util.Map` or `scala.collection.Map`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20962
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20962
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88916/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20962
  
**[Test build #88916 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88916/testReport)**
 for PR 20962 at commit 
[`c3550c2`](https://github.com/apache/spark/commit/c3550c23d4ae7a871ce8ee6198d418b4970c0669).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20979: [SPARK-23588][SQL] CatalystToExternalMap should s...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20979#discussion_r179347546
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -988,8 +988,20 @@ case class CatalystToExternalMap private(
   override def children: Seq[Expression] =
 keyLambdaFunction :: valueLambdaFunction :: inputData :: Nil
 
-  override def eval(input: InternalRow): Any =
-throw new UnsupportedOperationException("Only code-generated 
evaluation is supported")
+  private lazy val toScalaValue: Any => Any = {
+assert(inputData.dataType.isInstanceOf[MapType])
+val mapType = inputData.dataType.asInstanceOf[MapType]
+CatalystTypeConverters.createToScalaConverter(mapType)
--- End diff --

hmm, don't we need to support `collClass`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20931: [SPARK-23815][Core]Spark writer dynamic partition...

2018-04-04 Thread fangshil
Github user fangshil commented on a diff in the pull request:

https://github.com/apache/spark/pull/20931#discussion_r179346911
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
 ---
@@ -186,7 +186,9 @@ class HadoopMapReduceCommitProtocol(
 logDebug(s"Clean up default partition directories for overwriting: 
$partitionPaths")
 for (part <- partitionPaths) {
   val finalPartPath = new Path(path, part)
-  fs.delete(finalPartPath, true)
+  if (!fs.delete(finalPartPath, true) && 
!fs.exists(finalPartPath.getParent)) {
+fs.mkdirs(finalPartPath.getParent)
--- End diff --

@cloud-fan  yes, in official HDFS 
document(https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html),
 the rename command has precondition "dest must be root, or have a parent that 
exists"


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1991/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20981
  
**[Test build #88926 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88926/testReport)**
 for PR 20981 at commit 
[`35539be`](https://github.com/apache/spark/commit/35539be14bc79647409bfcdc5dac1fb99d00f5ec).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-04 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20981
  
cc @hvanhovell 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20981: [SPARK-23873][SQL] Use accessors in interpreted L...

2018-04-04 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/20981

[SPARK-23873][SQL] Use accessors in interpreted LambdaVariable

## What changes were proposed in this pull request?

Currently, interpreted execution of `LambdaVariable` just uses 
`InternalRow.get` to access element. We should use specified accessors if 
possible.

## How was this patch tested?

Added test.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-23873

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20981.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20981


commit 35539be14bc79647409bfcdc5dac1fb99d00f5ec
Author: Liang-Chi Hsieh 
Date:   2018-04-05T03:57:17Z

Use accessors in interpreted LambdaVariable.




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20979
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20979
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1990/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should support ...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20980
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1989/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should support ...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20980
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20979: [SPARK-23588][SQL] CatalystToExternalMap should support ...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20979
  
**[Test build #88925 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88925/testReport)**
 for PR 20979 at commit 
[`352777b`](https://github.com/apache/spark/commit/352777b162752397efac0556a87c924935000e4a).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should support ...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20980
  
**[Test build #88924 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88924/testReport)**
 for PR 20980 at commit 
[`52e76d7`](https://github.com/apache/spark/commit/52e76d760b2c9e1c9c72e1ee98a664bcf92c2c35).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20980: [SPARK-23589][SQL] ExternalMapToCatalyst should s...

2018-04-04 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/20980

[SPARK-23589][SQL] ExternalMapToCatalyst should support interpreted 
execution

## What changes were proposed in this pull request?
This pr supported interpreted mode for `ExternalMapToCatalyst`.

## How was this patch tested?
Added tests in `ObjectExpressionsSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-23589

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20980.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20980


commit 52e76d760b2c9e1c9c72e1ee98a664bcf92c2c35
Author: Takeshi Yamamuro 
Date:   2018-03-09T07:37:57Z

ExternalMapToCatalyst should support interpreted execution




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20979: [SPARK-23588][SQL] CatalystToExternalMap should s...

2018-04-04 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/20979

[SPARK-23588][SQL] CatalystToExternalMap should support interpreted 
execution

## What changes were proposed in this pull request?
This pr supported interpreted mode for `CatalystToExternalMap`.

## How was this patch tested?
Added tests in `ObjectExpressionsSuite`.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-23588

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20979.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20979


commit 352777b162752397efac0556a87c924935000e4a
Author: Takeshi Yamamuro 
Date:   2018-03-12T05:32:22Z

CatalystToExternalMap should support interpreted execution




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20933
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1988/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20933
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20953: [SPARK-23822][SQL] Improve error message for Parq...

2018-04-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20953#discussion_r179343192
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaColumnConvertNotSupportedException.java
 ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.parquet;
+
+import org.apache.spark.annotation.InterfaceStability;
+
+/**
+ * Exception thrown when the parquet reader find column type mismatches.
+ */
+@InterfaceStability.Unstable
+public class ParquetSchemaColumnConvertNotSupportedException extends 
RuntimeException {
--- End diff --

@gatorsmile and @yuchenhuo 
Why don't we generalize this exception by removing `Parquet` and move to 
parent `datasources` package?
Since this exception doesn't carry Parquet-specific information at all, 
that would be better for the future.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20933
  
**[Test build #88923 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88923/testReport)**
 for PR 20933 at commit 
[`35b74c0`](https://github.com/apache/spark/commit/35b74c08c3aefcd3baaee24b35a3d962a8e069ac).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20933: [SPARK-23817][SQL]Migrate ORC file format read path to d...

2018-04-04 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/20933
  
retest this please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20953: [SPARK-23822][SQL] Improve error message for Parq...

2018-04-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20953#discussion_r179342708
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala
 ---
@@ -382,6 +384,59 @@ class ParquetSchemaSuite extends ParquetSchemaTest {
 }
   }
 
+  // ===
+  // Tests for parquet schema mismatch error
+  // ===
+  def testSchemaMismatch(path: String, vectorizedReaderEnabled: Boolean): 
SparkException = {
+import testImplicits._
+
+var e: SparkException = null
+// Disable databricks' vectorized parquet reader and use open source 
version.
--- End diff --

@yuchenhuo . In Apache Spark, this comment should be removed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20786
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88914/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20786
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20786: [SPARK-14681][ML] Provide label/impurity stats for spark...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20786
  
**[Test build #88914 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88914/testReport)**
 for PR 20786 at commit 
[`48c17d4`](https://github.com/apache/spark/commit/48c17d4dff6a4e82b86d70f3845e6d524b4807e5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `sealed trait ClassificationNode extends Node `
  * `sealed trait RegressionNode extends Node `
  * `sealed trait LeafNode extends Node `
  * `sealed trait InternalNode extends Node `


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20753
  
**[Test build #88922 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88922/testReport)**
 for PR 20753 at commit 
[`7b30c30`](https://github.com/apache/spark/commit/7b30c30a68107880cda108e6a0ed923c27ac6d56).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20753
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20753
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1987/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20977: [SPARK-23867][Scheduler] use droppedCount in logW...

2018-04-04 Thread Ngone51
Github user Ngone51 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20977#discussion_r179338562
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala ---
@@ -166,7 +166,7 @@ private class AsyncEventQueue(val name: String, conf: 
SparkConf, metrics: LiveLi
   val prevLastReportTimestamp = lastReportTimestamp
   lastReportTimestamp = System.currentTimeMillis()
   val previous = new java.util.Date(prevLastReportTimestamp)
-  logWarning(s"Dropped $droppedEvents events from $name since 
$previous.")
+  logWarning(s"Dropped $droppedCount events from $name since 
$previous.")
--- End diff --

I think the change is OK, since it tells us how many events dropped during 
the period `[System.currentTimeMillis(), lastReportTimestamp]`. 

BTW, I still have the same question with `lastReportTimestamp `(though, 
this is none of this pr's business):
When first time we log warning, `lastReportTimestamp ` should be `0` due to 
the default value and  without any initial. So, `previous` will be `1970/1/1 
8:0:0 `.  And, it means we got several dropped events since `1970/1/1 8:0:0 `. 
I feel quite weird about it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when window f...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19193
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88910/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when window f...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19193
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19193: [WIP][SPARK-21896][SQL] Fix Stack Overflow when window f...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19193
  
**[Test build #88910 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88910/testReport)**
 for PR 19193 at commit 
[`cb19ad6`](https://github.com/apache/spark/commit/cb19ad614b7b757b71e6995aa0c5d8295d8f2485).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1986/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88921 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88921/testReport)**
 for PR 20890 at commit 
[`9abcf09`](https://github.com/apache/spark/commit/9abcf09d99d5a42a7620882c4a7d6ebbc43fff6a).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-04-04 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20890
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20778: [SPARK-23584][SQL] NewInstance should support interprete...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20778
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1985/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20778: [SPARK-23584][SQL] NewInstance should support interprete...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20778
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20778: [SPARK-23584][SQL] NewInstance should support interprete...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20778
  
**[Test build #88920 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88920/testReport)**
 for PR 20778 at commit 
[`264f129`](https://github.com/apache/spark/commit/264f1298e099356f53ea7bf1f0dfcd38fe514532).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179335313
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java 
---
@@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws 
IOException {
 final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8);
 
 for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) {
-  UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i)
+  new UTF8String(
+new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i))
--- End diff --

Sure, I just say it looks strange. I know you didn't change the unit test's 
behavior.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179335019
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java 
---
@@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws 
IOException {
 final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8);
 
 for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) {
-  UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i)
+  new UTF8String(
+new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i))
--- End diff --

This change follows the original behavior for the backward compatibility.

If we want to change the behavior, update test, or delete this, it would be 
good to address in another PR. Is it better to discuss this UT in another PR?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20757
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20757
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1984/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20757
  
**[Test build #88919 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88919/testReport)**
 for PR 20757 at commit 
[`dd1159b`](https://github.com/apache/spark/commit/dd1159b2ead8c605a71b73d927d22f640ef9f64e).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179333954
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java 
---
@@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws 
IOException {
 final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8);
 
 for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) {
-  UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i)
+  new UTF8String(
+new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i))
--- End diff --

It depends on what we are testing at this unit test. From the comment in 
this test `// offset underflow is apparently supported?`, looks like this 
behavior is uncertain.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179332522
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java 
---
@@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws 
IOException {
 final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8);
 
 for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) {
-  UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i)
+  new UTF8String(
+new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i))
--- End diff --

What is your suggestion to make it non-strange?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179332281
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.unsafe.memory;
+
+import com.google.common.primitives.Ints;
+
+import org.apache.spark.unsafe.Platform;
+
+/**
+ * A consecutive block of memory with a byte array on Java heap.
+ */
+public final class ByteArrayMemoryBlock extends MemoryBlock {
+
+  private final byte[] array;
+
+  public ByteArrayMemoryBlock(byte[] obj, long offset, long size) {
+super(obj, offset, size);
+this.array = obj;
+assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) :
+  "The sum of size " + size + " and offset " + offset + " should not 
be larger than " +
+"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET);
+  }
+
+  public ByteArrayMemoryBlock(long length) {
+this(new byte[Ints.checkedCast(length)], Platform.BYTE_ARRAY_OFFSET, 
length);
+  }
+
+  @Override
+  public MemoryBlock subBlock(long offset, long size) {
+checkSubBlockRange(offset, size);
+return new ByteArrayMemoryBlock(array, this.offset + offset, size);
+  }
+
+  public byte[] getByteArray() { return array; }
+
+  /**
+   * Creates a memory block pointing to the memory used by the byte array.
+   */
+  public static ByteArrayMemoryBlock fromArray(final byte[] array) {
+return new ByteArrayMemoryBlock(array, Platform.BYTE_ARRAY_OFFSET, 
array.length);
+  }
+
+  @Override
+  public final int getInt(long offset) {
+return Platform.getInt(array, this.offset + offset);
+  }
+
+  @Override
+  public final void putInt(long offset, int value) {
+Platform.putInt(array, this.offset + offset, value);
+  }
+
+  @Override
+  public final boolean getBoolean(long offset) {
+return Platform.getBoolean(array, this.offset + offset);
+  }
+
+  @Override
+  public final void putBoolean(long offset, boolean value) {
+Platform.putBoolean(array, this.offset + offset, value);
+  }
+
+  @Override
+  public final byte getByte(long offset) {
+return array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)];
+  }
+
+  @Override
+  public final void putByte(long offset, byte value) {
+array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)] = 
value;
--- End diff --

ditto


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179331924
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.unsafe.memory;
+
+import com.google.common.primitives.Ints;
+
+import org.apache.spark.unsafe.Platform;
+
+/**
+ * A consecutive block of memory with a byte array on Java heap.
+ */
+public final class ByteArrayMemoryBlock extends MemoryBlock {
+
+  private final byte[] array;
+
+  public ByteArrayMemoryBlock(byte[] obj, long offset, long size) {
+super(obj, offset, size);
+this.array = obj;
+assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) :
+  "The sum of size " + size + " and offset " + offset + " should not 
be larger than " +
+"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET);
+  }
+
+  public ByteArrayMemoryBlock(long length) {
+this(new byte[Ints.checkedCast(length)], Platform.BYTE_ARRAY_OFFSET, 
length);
+  }
+
+  @Override
+  public MemoryBlock subBlock(long offset, long size) {
+checkSubBlockRange(offset, size);
+return new ByteArrayMemoryBlock(array, this.offset + offset, size);
+  }
+
+  public byte[] getByteArray() { return array; }
+
+  /**
+   * Creates a memory block pointing to the memory used by the byte array.
+   */
+  public static ByteArrayMemoryBlock fromArray(final byte[] array) {
+return new ByteArrayMemoryBlock(array, Platform.BYTE_ARRAY_OFFSET, 
array.length);
+  }
+
+  @Override
+  public final int getInt(long offset) {
+return Platform.getInt(array, this.offset + offset);
+  }
+
+  @Override
+  public final void putInt(long offset, int value) {
+Platform.putInt(array, this.offset + offset, value);
+  }
+
+  @Override
+  public final boolean getBoolean(long offset) {
+return Platform.getBoolean(array, this.offset + offset);
+  }
+
+  @Override
+  public final void putBoolean(long offset, boolean value) {
+Platform.putBoolean(array, this.offset + offset, value);
+  }
+
+  @Override
+  public final byte getByte(long offset) {
+return array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)];
--- End diff --

Suggestion from @cloud-fan. The Java array access is faster than 
`Platform.getByte`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20937
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20937
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88907/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20937
  
**[Test build #88907 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88907/testReport)**
 for PR 20937 at commit 
[`7e20891`](https://github.com/apache/spark/commit/7e208912a370af0466dfcadbff0787f2b3248dd9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20965
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20965
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20965
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1983/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20965
  
**[Test build #88917 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88917/testReport)**
 for PR 20965 at commit 
[`b96aa8d`](https://github.com/apache/spark/commit/b96aa8d4d2d0021439499e90e051982b1ed121bd).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20965
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88917/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20965
  
**[Test build #88918 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88918/testReport)**
 for PR 20965 at commit 
[`640fd8f`](https://github.com/apache/spark/commit/640fd8f51e57edea12a36832b5c686a062ead7ab).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20965
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1982/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20965
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20965: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20965
  
**[Test build #88917 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88917/testReport)**
 for PR 20965 at commit 
[`b96aa8d`](https://github.com/apache/spark/commit/b96aa8d4d2d0021439499e90e051982b1ed121bd).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20962
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20962
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1981/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20962
  
**[Test build #88916 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88916/testReport)**
 for PR 20962 at commit 
[`c3550c2`](https://github.com/apache/spark/commit/c3550c23d4ae7a871ce8ee6198d418b4970c0669).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread huaxingao
Github user huaxingao commented on the issue:

https://github.com/apache/spark/pull/20962
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20961: [SPARK-23823][SQL] Keep origin in transformExpression

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20961
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88909/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20961: [SPARK-23823][SQL] Keep origin in transformExpression

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20961
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20961: [SPARK-23823][SQL] Keep origin in transformExpression

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20961
  
**[Test build #88909 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88909/testReport)**
 for PR 20961 at commit 
[`52e8875`](https://github.com/apache/spark/commit/52e8875f040a327c7045fdba3fbd4671690cb0c5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class QueryPlanSuite extends SparkFunSuite `


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20965: [SPARK-21870][SQL] Split aggregation code into sm...

2018-04-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20965#discussion_r179322582
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -1063,6 +1064,29 @@ class CodegenContext {
   }
 }
 
+object CodegenContext {
--- End diff --

ok


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20913: [SPARK-23799] FilterEstimation.evaluateInSet prod...

2018-04-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20913#discussion_r179322407
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
 ---
@@ -427,7 +427,11 @@ case class FilterEstimation(plan: Filter) extends 
Logging {
 
 // return the filter selectivity.  Without advanced statistics such as 
histograms,
 // we have to assume uniform distribution.
-Some(math.min(newNdv.toDouble / ndv.toDouble, 1.0))
+if (ndv.toDouble != 0) {
--- End diff --

Can you add a test for the empty table case?
I think we need to fix the other places if they have the same issue. cc: 
@wzhfy 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20913: [SPARK-23799] FilterEstimation.evaluateInSet prod...

2018-04-04 Thread apache-hivemall
Github user apache-hivemall commented on a diff in the pull request:

https://github.com/apache/spark/pull/20913#discussion_r179322297
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
 ---
@@ -427,7 +427,11 @@ case class FilterEstimation(plan: Filter) extends 
Logging {
 
 // return the filter selectivity.  Without advanced statistics such as 
histograms,
 // we have to assume uniform distribution.
-Some(math.min(newNdv.toDouble / ndv.toDouble, 1.0))
+if (ndv.toDouble != 0) {
--- End diff --

Can you add a test for the empty table case?
I think we need to fix the other places if they have the same issue. cc: 
@wzhfy 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179321572
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java 
---
@@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws 
IOException {
 final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8);
 
 for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) {
-  UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i)
+  new UTF8String(
+new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, 
test.length + i))
--- End diff --

Looks a bit strange.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179321207
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/UnsafeMemoryAllocator.java
 ---
@@ -19,15 +19,21 @@
 
 import org.apache.spark.unsafe.Platform;
 
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.nio.ByteBuffer;
+
+import sun.nio.ch.DirectBuffer;
--- End diff --

Seems unused imports?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179320913
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/OnHeapMemoryBlock.java
 ---
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.unsafe.memory;
+
+import com.google.common.primitives.Ints;
+
+import org.apache.spark.unsafe.Platform;
+
+/**
+ * A consecutive block of memory with a long array on Java heap.
+ */
+public final class OnHeapMemoryBlock extends MemoryBlock {
+
+  private final long[] array;
+
+  public OnHeapMemoryBlock(long[] obj, long offset, long size) {
+super(obj, offset, size);
+this.array = obj;
+assert(offset + size <= obj.length * 8L + Platform.LONG_ARRAY_OFFSET) :
+  "The sum of size " + size + " and offset " + offset + " should not 
be larger than " +
+"the array size " + (obj.length * 8L + Platform.LONG_ARRAY_OFFSET);
--- End diff --

ditto. maybe "the size of the given memory space"?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179320302
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.unsafe.memory;
+
+import com.google.common.primitives.Ints;
+
+import org.apache.spark.unsafe.Platform;
+
+/**
+ * A consecutive block of memory with a byte array on Java heap.
+ */
+public final class ByteArrayMemoryBlock extends MemoryBlock {
+
+  private final byte[] array;
+
+  public ByteArrayMemoryBlock(byte[] obj, long offset, long size) {
+super(obj, offset, size);
+this.array = obj;
+assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) :
+  "The sum of size " + size + " and offset " + offset + " should not 
be larger than " +
+"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET);
+  }
+
+  public ByteArrayMemoryBlock(long length) {
+this(new byte[Ints.checkedCast(length)], Platform.BYTE_ARRAY_OFFSET, 
length);
+  }
+
+  @Override
+  public MemoryBlock subBlock(long offset, long size) {
+checkSubBlockRange(offset, size);
+return new ByteArrayMemoryBlock(array, this.offset + offset, size);
+  }
+
+  public byte[] getByteArray() { return array; }
+
+  /**
+   * Creates a memory block pointing to the memory used by the byte array.
+   */
+  public static ByteArrayMemoryBlock fromArray(final byte[] array) {
+return new ByteArrayMemoryBlock(array, Platform.BYTE_ARRAY_OFFSET, 
array.length);
+  }
+
+  @Override
+  public final int getInt(long offset) {
+return Platform.getInt(array, this.offset + offset);
+  }
+
+  @Override
+  public final void putInt(long offset, int value) {
+Platform.putInt(array, this.offset + offset, value);
+  }
+
+  @Override
+  public final boolean getBoolean(long offset) {
+return Platform.getBoolean(array, this.offset + offset);
+  }
+
+  @Override
+  public final void putBoolean(long offset, boolean value) {
+Platform.putBoolean(array, this.offset + offset, value);
+  }
+
+  @Override
+  public final byte getByte(long offset) {
+return array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)];
+  }
+
+  @Override
+  public final void putByte(long offset, byte value) {
+array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)] = 
value;
--- End diff --

ditto.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179320267
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.unsafe.memory;
+
+import com.google.common.primitives.Ints;
+
+import org.apache.spark.unsafe.Platform;
+
+/**
+ * A consecutive block of memory with a byte array on Java heap.
+ */
+public final class ByteArrayMemoryBlock extends MemoryBlock {
+
+  private final byte[] array;
+
+  public ByteArrayMemoryBlock(byte[] obj, long offset, long size) {
+super(obj, offset, size);
+this.array = obj;
+assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) :
+  "The sum of size " + size + " and offset " + offset + " should not 
be larger than " +
+"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET);
+  }
+
+  public ByteArrayMemoryBlock(long length) {
+this(new byte[Ints.checkedCast(length)], Platform.BYTE_ARRAY_OFFSET, 
length);
+  }
+
+  @Override
+  public MemoryBlock subBlock(long offset, long size) {
+checkSubBlockRange(offset, size);
+return new ByteArrayMemoryBlock(array, this.offset + offset, size);
+  }
+
+  public byte[] getByteArray() { return array; }
+
+  /**
+   * Creates a memory block pointing to the memory used by the byte array.
+   */
+  public static ByteArrayMemoryBlock fromArray(final byte[] array) {
+return new ByteArrayMemoryBlock(array, Platform.BYTE_ARRAY_OFFSET, 
array.length);
+  }
+
+  @Override
+  public final int getInt(long offset) {
+return Platform.getInt(array, this.offset + offset);
+  }
+
+  @Override
+  public final void putInt(long offset, int value) {
+Platform.putInt(array, this.offset + offset, value);
+  }
+
+  @Override
+  public final boolean getBoolean(long offset) {
+return Platform.getBoolean(array, this.offset + offset);
+  }
+
+  @Override
+  public final void putBoolean(long offset, boolean value) {
+Platform.putBoolean(array, this.offset + offset, value);
+  }
+
+  @Override
+  public final byte getByte(long offset) {
+return array[(int)(this.offset + offset - Platform.BYTE_ARRAY_OFFSET)];
--- End diff --

Why don't use `Platform.getByte`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179320092
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/OnHeapMemoryBlock.java
 ---
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.unsafe.memory;
+
+import org.apache.spark.unsafe.Platform;
+
+/**
+ * A consecutive block of memory with a long array on Java heap.
+ */
+public final class OnHeapMemoryBlock extends MemoryBlock {
--- End diff --

Can we put @cloud-fan's comment 
https://github.com/apache/spark/pull/19222/files#r172356146 into 
`OnHeapMemoryBlock`'s doc?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179319738
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.unsafe.memory;
+
+import com.google.common.primitives.Ints;
+
+import org.apache.spark.unsafe.Platform;
+
+/**
+ * A consecutive block of memory with a byte array on Java heap.
+ */
+public final class ByteArrayMemoryBlock extends MemoryBlock {
+
+  private final byte[] array;
+
+  public ByteArrayMemoryBlock(byte[] obj, long offset, long size) {
+super(obj, offset, size);
+this.array = obj;
+assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) :
--- End diff --

hmm, from here and the accessors below, looks like the given `offset` 
includes `Platform.BYTE_ARRAY_OFFSET`? What will happen if a given offset is 
less than that?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-04-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r179319478
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.unsafe.memory;
+
+import com.google.common.primitives.Ints;
+
+import org.apache.spark.unsafe.Platform;
+
+/**
+ * A consecutive block of memory with a byte array on Java heap.
+ */
+public final class ByteArrayMemoryBlock extends MemoryBlock {
+
+  private final byte[] array;
+
+  public ByteArrayMemoryBlock(byte[] obj, long offset, long size) {
+super(obj, offset, size);
+this.array = obj;
+assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) :
+  "The sum of size " + size + " and offset " + offset + " should not 
be larger than " +
+"the array size " + (obj.length + Platform.BYTE_ARRAY_OFFSET);
--- End diff --

Said `obj.length + Platform.BYTE_ARRAY_OFFSET` as the array size might be a 
bit confusing in an error message.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20962
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88908/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20962
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...

2018-04-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20962
  
**[Test build #88908 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88908/testReport)**
 for PR 20962 at commit 
[`c3550c2`](https://github.com/apache/spark/commit/c3550c23d4ae7a871ce8ee6198d418b4970c0669).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20970: [SPARK-23870][ML] Forward RFormula handleInvalid Param t...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20970
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20970: [SPARK-23870][ML] Forward RFormula handleInvalid Param t...

2018-04-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20970
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88911/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   3   4   5   >