[GitHub] spark issue #13836: [SPARK][YARN] Fix not test yarn cluster mode correctly i...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13836
  
**[Test build #61013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61013/consoleFull)** for PR 13836 at commit [`7820382`](https://github.com/apache/spark/commit/7820382a39dc41382d62b7a5e6fd871338ac00b3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13836: [SPARK][YARN] Fix not test yarn cluster mode corr...

2016-06-21 Thread renozhang
GitHub user renozhang opened a pull request:

https://github.com/apache/spark/pull/13836

[SPARK][YARN] Fix not test yarn cluster mode correctly in YarnClusterSuite

## What changes were proposed in this pull request?

Since SPARK-13220 (Deprecate "yarn-client" and "yarn-cluster"), 
YarnClusterSuite no longer tests "yarn-cluster" mode correctly.
This pull request fixes that.

## How was this patch tested?
Unit test





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/renozhang/spark SPARK-16125-test-yarn-cluster-mode

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13836.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13836


commit 7820382a39dc41382d62b7a5e6fd871338ac00b3
Author: peng.zhang 
Date:   2016-06-22T05:37:26Z

Fix not test yarn cluster mode actually in YarnClusterSuite







[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67997948
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala ---
@@ -142,3 +164,415 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(private val primitiveArray: Array[Int])
+  extends GenericArrayData(Array.empty) {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+    val array = new Array[Int](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
--- End diff --

Yeah you did. So that is good. I am just saying that we could reduce the 
size of the PR by using things that are already provided by the JDK.
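As a hedged illustration of the suggestion (this sketch is not code from the PR; `IntArrayDataSketch` is a hypothetical stand-in class), `java.util.Arrays` already provides type-specialized equality and hashing for primitive arrays, so a specialized subclass could delegate to the JDK instead of hand-rolling element loops:

```scala
// Sketch only: java.util.Arrays.equals(int[], int[]) and
// java.util.Arrays.hashCode(int[]) give correct, type-specialized
// semantics for a class backed by a primitive array.
final class IntArrayDataSketch(val primitiveArray: Array[Int]) {
  override def equals(o: Any): Boolean = o match {
    case other: IntArrayDataSketch =>
      java.util.Arrays.equals(primitiveArray, other.primitiveArray)
    case _ => false
  }
  override def hashCode: Int = java.util.Arrays.hashCode(primitiveArray)
}

object IntArrayDataSketch {
  def main(args: Array[String]): Unit = {
    val a = new IntArrayDataSketch(Array(1, 2, 3))
    val b = new IntArrayDataSketch(Array(1, 2, 3))
    // structurally equal arrays compare equal and hash identically
    assert(a == b && a.hashCode == b.hashCode)
  }
}
```

This keeps the equals/hashCode contract (equal objects hash equally) without any per-type loop code.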





[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13778
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13778
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61000/
Test PASSed.





[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13778
  
**[Test build #61000 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61000/consoleFull)** for PR 13778 at commit [`a0b81ba`](https://github.com/apache/spark/commit/a0b81ba448018200eb8947bc496d8f07d87a64b8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13820
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13820
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61007/
Test PASSed.





[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13820
  
**[Test build #61007 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61007/consoleFull)** for PR 13820 at commit [`8ee701f`](https://github.com/apache/spark/commit/8ee701f99d63e921cf3bab86b4d30e7230b104d2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13825: [SPARK-16120] [STREAMING] getCurrentLogFiles in Receiver...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13825
  
**[Test build #61012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61012/consoleFull)** for PR 13825 at commit [`f689cce`](https://github.com/apache/spark/commit/f689cce93ba21fd25341c5c142747ba0821f0572).





[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13835
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61003/
Test PASSed.





[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13835
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67997326
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala ---
@@ -142,3 +164,415 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(private val primitiveArray: Array[Int])
+  extends GenericArrayData(Array.empty) {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+    val array = new Array[Int](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
--- End diff --

I should implement ```equals()``` and ```hashCode()``` for the classes related 
to ```GenericArrayData```.

I have already implemented type-specialized ```equals()``` and ```hashCode()``` 
in ```GenericArrayData```. One issue with ```equals()``` and 
```hashCode()``` in ```GenericRefArrayData``` is that the type of each element may 
differ, since the elements are held in an ```Array[Any]```.

If I misunderstood your suggestion, could you please let me know?





[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13835
  
**[Test build #61003 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61003/consoleFull)** for PR 13835 at commit [`447ddcd`](https://github.com/apache/spark/commit/447ddcd3812e0253d3548f9462f21282abc086eb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13825: [SPARK-16120] [STREAMING] getCurrentLogFiles in Receiver...

2016-06-21 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/13825
  
Jenkins test this please





[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13834
  
**[Test build #61011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61011/consoleFull)** for PR 13834 at commit [`04c8637`](https://github.com/apache/spark/commit/04c86373e3b259471adef37a2c4aa7650f19134e).





[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-21 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/13834
  
Jenkins retest this please





[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...

2016-06-21 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/13824
  
Also, would you please add a unit test for this?





[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-21 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/13834
  
SGTM





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67996555
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayDataBenchmark.scala ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import org.apache.spark.util.Benchmark
+
+/**
+ * Benchmark [[GenericArrayData]] for Dense and Sparse with primitive type
+ */
+object GenericArrayDataBenchmark {
+/*
+  def allocateGenericIntArray(iters: Int): Unit = {
--- End diff --

I am quite curious to see the results here. I cannot really imagine these 
being different.





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67996447
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayDataBenchmark.scala ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import org.apache.spark.util.Benchmark
+
+/**
+ * Benchmark [[GenericArrayData]] for Dense and Sparse with primitive type
+ */
+object GenericArrayDataBenchmark {
+/*
+  def allocateGenericIntArray(iters: Int): Unit = {
+    val count = 1024 * 1024 * 10
+    var array: GenericArrayData = null
+
+    val primitiveIntArray = new Array[Int](count)
+    val denseIntArray = { i: Int =>
+      for (n <- 0L until iters) {
--- End diff --

`for (n <- 0L until iters)` is expensive and it will probably dominate the 
cost of the benchmark. Please use a while loop.
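A minimal sketch of the suggested rewrite (illustrative only; `WhileLoopSketch` and the summing body are hypothetical stand-ins for the benchmarked allocation): a Scala `for (n <- 0L until iters)` builds a `Range` and runs the body through a closure, whereas a plain `while` loop compiles to a simple counter, so the benchmark measures its body rather than loop overhead.

```scala
// Sketch: replacing a Range-based for-comprehension with a while loop
// inside a benchmark's hot loop.
object WhileLoopSketch {
  def allocateLoop(iters: Long): Long = {
    var n = 0L
    var checksum = 0L
    while (n < iters) { // cheap counter loop, no Range/closure overhead
      checksum += n     // stand-in for the benchmarked allocation body
      n += 1
    }
    checksum
  }

  def main(args: Array[String]): Unit = {
    assert(allocateLoop(5L) == 10L) // 0 + 1 + 2 + 3 + 4
  }
}
```

The same transformation applies to each of the benchmark's inner loops: only the loop construct changes, not the measured body.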





[GitHub] spark issue #13818: [SPARK-15968][SQL] Nonempty partitioned metastore tables...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13818
  
**[Test build #3124 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3124/consoleFull)** for PR 13818 at commit [`8a058c6`](https://github.com/apache/spark/commit/8a058c65c6c20e311bde5c0ade87c14c6b6b5f37).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67996267
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayDataBenchmark.scala ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import org.apache.spark.util.Benchmark
+
+/**
+ * Benchmark [[GenericArrayData]] for Dense and Sparse with primitive type
+ */
+object GenericArrayDataBenchmark {
--- End diff --

This is different from `MiscBenchmark`, which is a class that extends 
`BenchmarkBase`. You probably have to move it to sql/core though.





[GitHub] spark issue #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment for Dat...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13764
  
**[Test build #61010 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61010/consoleFull)** for PR 13764 at commit [`87d32d7`](https://github.com/apache/spark/commit/87d32d70b18fb040a351c754182ea11f807fc238).





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67996014
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala ---
@@ -112,8 +112,8 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper {
     val plan = GenerateMutableProjection.generate(expressions)
     val actual = plan(new GenericMutableRow(length)).toSeq(expressions.map(_.dataType))
     val expected = Seq(new ArrayBasedMapData(
-      new GenericArrayData(0 until length),
-      new GenericArrayData(Seq.fill(length)(true
+      GenericArrayData.allocate(0 until length),
+      GenericArrayData.allocate(Seq.fill(length)(true
--- End diff --

Ok, I will do both.





[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13072
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60999/
Test FAILed.





[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13072
  
Merged build finished. Test FAILed.





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67995993
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+    val array = new Array[Int](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
--- End diff --

Yes, I will move it up.





[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...

2016-06-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13764#discussion_r67995925
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala ---
@@ -223,6 +223,31 @@ class DataFrameReaderWriterSuite extends QueryTest with SharedSQLContext with Be
     }
   }
 
+  test("column nullability and comment - write and then read") {
--- End diff --

Sure, let me change it. Thanks!





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67995936
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
--- End diff --

Hmm, I should clone the backing array. I should return the actual type.
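For illustration of the point being conceded here (hypothetical class names, in Java rather than Scala, not the PR's code), a `copy()` that both clones the backing array and returns the concrete type would look like:

```java
final class IntArrayData {
    private final int[] values;

    IntArrayData(int[] values) {
        this.values = values;
    }

    int[] values() {
        return values;
    }

    // Clone the backing array so mutations of the copy cannot leak back
    // into the original, and declare the concrete return type rather
    // than a supertype.
    IntArrayData copy() {
        return new IntArrayData(values.clone());
    }
}
```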





[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13072
  
**[Test build #60999 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60999/consoleFull)**
 for PR 13072 at commit 
[`2f306a7`](https://github.com/apache/spark/commit/2f306a785f3adbbba7420d27630b6f1553139074).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #13820: [SPARK-16107] [R] group glm methods in documentat...

2016-06-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/13820#discussion_r67995869
  
--- Diff: R/pkg/R/mllib.R ---
@@ -124,24 +138,21 @@ setMethod("spark.glm", signature(data = "SparkDataFrame", formula = "formula"),
 #' summary(model)
 #' }
 #' @note glm since 1.5.0
+#' @seealso \link{spark.glm}
 setMethod("glm", signature(formula = "formula", family = "ANY", data = "SparkDataFrame"),
           function(formula, family = gaussian, data, epsilon = 1e-6, maxit = 25) {
             spark.glm(data, formula, family, tol = epsilon, maxIter = maxit)
           })
 
-#' Get the summary of a generalized linear model
-#'
-#' Returns the summary of a model produced by glm() or spark.glm(), similarly to R's summary().
+#  Returns the summary of a model produced by glm() or spark.glm(), similarly to R's summary().
--- End diff --

I get that this is intentional, but I'd suggest adding an extra empty line 
between the `#` and `#'` lines - it's very easy for newcomers to make mistakes 
when copy/pasting doc comments





[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...

2016-06-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13764#discussion_r67995781
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala
 ---
@@ -223,6 +223,31 @@ class DataFrameReaderWriterSuite extends QueryTest with SharedSQLContext with Be
     }
   }
 
+  test("column nullability and comment - write and then read") {
--- End diff --

also remove this test, let's focus on SQL CREATE TABLE in this PR.





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-21 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/13758
  
@hvanhovell , I added [a 
file](https://github.com/kiszk/spark/blob/133d4c0085b5ca2f20870c05d077e25d8715e07a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayDataBenchmark.scala)
 with a Benchmark (not run yet). I would appreciate it if you have time to look 
at it.

It is very strange to me that I can run a Benchmark program under 
```sql/core``` (e.g. ```MiscBenchmark```) by using ```build/sbt "sql/test-only 
*MiscBenchmark*"```.






[GitHub] spark issue #13715: [SPARK-15992] [MESOS] Refactor MesosCoarseGrainedSchedul...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13715
  
Merged build finished. Test FAILed.





[GitHub] spark issue #13715: [SPARK-15992] [MESOS] Refactor MesosCoarseGrainedSchedul...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13715
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60997/
Test FAILed.





[GitHub] spark issue #13715: [SPARK-15992] [MESOS] Refactor MesosCoarseGrainedSchedul...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13715
  
**[Test build #60997 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60997/consoleFull)**
 for PR 13715 at commit 
[`9e0aedf`](https://github.com/apache/spark/commit/9e0aedf12816c317db0a65e21adc921258608a4b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment for Dat...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13764
  
**[Test build #61008 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61008/consoleFull)**
 for PR 13764 at commit 
[`94b7264`](https://github.com/apache/spark/commit/94b7264480709c8ebfd551c85582967020111e97).





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13758
  
**[Test build #61009 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61009/consoleFull)**
 for PR 13758 at commit 
[`133d4c0`](https://github.com/apache/spark/commit/133d4c0085b5ca2f20870c05d077e25d8715e07a).





[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13820
  
**[Test build #61007 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61007/consoleFull)**
 for PR 13820 at commit 
[`8ee701f`](https://github.com/apache/spark/commit/8ee701f99d63e921cf3bab86b4d30e7230b104d2).





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67995256
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala
 ---
@@ -159,17 +159,17 @@ object CatalystTypeConverters {
     override def toCatalystImpl(scalaValue: Any): ArrayData = {
       scalaValue match {
         case a: Array[_] =>
-          new GenericArrayData(a.map(elementConverter.toCatalyst))
+          GenericArrayData.allocate(a.map(elementConverter.toCatalyst))
--- End diff --

These allocates will create a GenericRefArrayData object?
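The concern here can be sketched as follows (illustrative names in Java, not Spark's actual classes): a factory that specializes only on genuine primitive arrays will route an `Object[]`, such as the result of a mapping step, to the reference-backed variant:

```java
interface ArrayDataSketch {}

final class IntBacked implements ArrayDataSketch {
    final int[] values;
    IntBacked(int[] values) { this.values = values; }
}

final class RefBacked implements ArrayDataSketch {
    final Object[] values;
    RefBacked(Object[] values) { this.values = values; }
}

final class ArrayDataFactory {
    // Only genuine primitive arrays get the specialized representation;
    // an Object[] of boxed elements falls through to the reference-backed
    // variant, even when every element happens to be an Integer.
    static ArrayDataSketch allocate(Object input) {
        if (input instanceof int[]) {
            return new IntBacked((int[]) input);
        }
        return new RefBacked((Object[]) input);
    }
}
```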





[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...

2016-06-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13764#discussion_r67995158
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
@@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest {
       }
     }
   }
+
+  test("column nullability and comment - write and then read") {
+    val schema = StructType(
+      StructField("cl1", IntegerType, nullable = false,
+        new MetadataBuilder().putString("comment", "test").build()) ::
--- End diff --

Sure, let me remove it and submit a new PR soon. Thanks!





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67995044
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+    val array = new Array[Int](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+    if (!o.isInstanceOf[GenericIntArrayData]) {
+      return false
+    }
+
+    val other = o.asInstanceOf[GenericIntArrayData]
+    if (other eq null) {
+      return false
+    }
+
+    val len = numElements()
+    if (len != other.numElements()) {
+      return false
+    }
+
+    var i = 0
+    while (i < len) {
+      val o1 = primitiveArray(i)
+      val o2 = other.primitiveArray(i)
+      if (o1 != o2) {
+        return false
+      }
+      i += 1
+    }
+    true
+  }
+
+  override def hashCode: Int = {
+    var result: Int = 37
+    var i = 0
+    val len = numElements()
+    while (i < len) {
+      val update: Int = primitiveArray(i)
+      result = 37 * result + update
+      i += 1
+    }
+    result
+  }
+}
+
+final class GenericLongArrayData(val primitiveArray: Array[Long])
+  extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericLongArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getLong(ordinal: Int): Long = primitiveArray(ordinal)
+  override def toLongArray(): Array[Long] = {
+    val array = new Array[Long](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+    if (!o.isInstanceOf[GenericLongArrayData]) {
+      return false
+    }
+
+    val other = o.asInstanceOf[GenericLongArrayData]
+    if (other eq null) {
+      return false
+    }
+
+    val len = numElements()
+    if (len != other.numElements()) {
+      return false
+    }
+
+    var i = 0
+    while (i < len) {
+      val o1 = primitiveArray(i)
+      val o2 = other.primitiveArray(i)
+      if (o1 != o2) {
+        return false
+      }
+      i += 1
+    }
+    true
+  }
+
+  override def hashCode: Int = {
+    var result: Int = 37
+    var i = 0
+    val len = numElements()
+    while (i < len) {
+      val l = primitiveArray(i)
+      val update: Int = (l ^ (l >>> 32)).toInt
+      result = 37 * result + update
+      i += 1
+    }
+    result
+  }
+}
+
+final class GenericFloatArrayData(val primitiveArray: Array[Float])
+  extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericFloatArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getFloat(ordinal: Int): Float = primitiveArray(ordinal)
+  override def toFloatArray(): Array[Float] = {
+    val array = new Array[Float](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+    if (!o.isInstanceOf[GenericFloatArrayData]) {
+      return false
+    }
+
+    val other = o.asInstanceOf[GenericFloatArrayData]
+    if (other eq null) {
+      return false
+    }
+
+    val len = numElements()
+    if (len != other.numElements()) {
+      return false
+    }
+
+    var i = 0
+    while (i < len) {
+      val o1 = 
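An aside on the `hashCode` implementations in the quoted diff above: the per-element fold used for longs, `(l ^ (l >>> 32)).toInt`, is the standard JVM long hash, and the 37-based accumulation can be sketched in Java (illustrative helper, not Spark's code) as:

```java
final class LongArrayHash {
    // 37-based rolling hash over a long[], folding each element the same
    // way java.lang.Long.hashCode does: (int) (l ^ (l >>> 32)).
    static int hash(long[] values) {
        int result = 37;
        for (long l : values) {
            int update = (int) (l ^ (l >>> 32));
            result = 37 * result + update;
        }
        return result;
    }
}
```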

[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...

2016-06-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13764#discussion_r67994899
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
@@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest {
       }
     }
   }
+
+  test("column nullability and comment - write and then read") {
+    val schema = StructType(
+      StructField("cl1", IntegerType, nullable = false,
+        new MetadataBuilder().putString("comment", "test").build()) ::
--- End diff --

you can open a new PR and move this test there.





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67994789
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+    val array = new Array[Int](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+    if (!o.isInstanceOf[GenericIntArrayData]) {
+      return false
+    }
+
+    val other = o.asInstanceOf[GenericIntArrayData]
+    if (other eq null) {
+      return false
+    }
+
+    val len = numElements()
+    if (len != other.numElements()) {
+      return false
+    }
+
+    var i = 0
+    while (i < len) {
+      val o1 = primitiveArray(i)
+      val o2 = other.primitiveArray(i)
+      if (o1 != o2) {
+        return false
+      }
+      i += 1
+    }
+    true
+  }
+
+  override def hashCode: Int = {
+    var result: Int = 37
+    var i = 0
+    val len = numElements()
+    while (i < len) {
+      val update: Int = primitiveArray(i)
+      result = 37 * result + update
+      i += 1
+    }
+    result
+  }
+}
+
+final class GenericLongArrayData(val primitiveArray: Array[Long])
+  extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericLongArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getLong(ordinal: Int): Long = primitiveArray(ordinal)
+  override def toLongArray(): Array[Long] = {
+    val array = new Array[Long](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+    if (!o.isInstanceOf[GenericLongArrayData]) {
+      return false
+    }
+
+    val other = o.asInstanceOf[GenericLongArrayData]
+    if (other eq null) {
+      return false
+    }
+
+    val len = numElements()
+    if (len != other.numElements()) {
+      return false
+    }
+
+    var i = 0
+    while (i < len) {
+      val o1 = primitiveArray(i)
+      val o2 = other.primitiveArray(i)
+      if (o1 != o2) {
+        return false
+      }
+      i += 1
+    }
+    true
+  }
+
+  override def hashCode: Int = {
+    var result: Int = 37
+    var i = 0
+    val len = numElements()
+    while (i < len) {
+      val l = primitiveArray(i)
+      val update: Int = (l ^ (l >>> 32)).toInt
+      result = 37 * result + update
+      i += 1
+    }
+    result
+  }
+}
+
+final class GenericFloatArrayData(val primitiveArray: Array[Float])
+  extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericFloatArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getFloat(ordinal: Int): Float = primitiveArray(ordinal)
+  override def toFloatArray(): Array[Float] = {
+    val array = new Array[Float](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+    if (!o.isInstanceOf[GenericFloatArrayData]) {
+      return false
+    }
+
+    val other = o.asInstanceOf[GenericFloatArrayData]
+    if (other eq null) {
+      return false
+    }
+
+    val len = numElements()
+    if (len != other.numElements()) {
+      return false
+    }
+
+    var i = 0
+    while (i < len) {
+      val o1 = primitiveArray(i)

[GitHub] spark pull request #13809: [SPARK-16104][SQL] Do not create CSV writer obje...

2016-06-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13809





[GitHub] spark pull request #13818: [SPARK-15968][SQL] Nonempty partitioned metastore...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13818#discussion_r67994573
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala ---
@@ -425,6 +425,28 @@ class ParquetMetastoreSuite extends ParquetPartitioningTest {
     }
   }
 
+  test("SPARK-15968: nonempty partitioned metastore Parquet table lookup should use cached " +
--- End diff --

Could you take a look at 
[CachedTableSuite](https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/CachedTableSuite.scala)
 and add the test there (using a similar approach)?





[GitHub] spark issue #13809: [SPARK-16104][SQL] Do not create CSV writer object for ...

2016-06-21 Thread davies
Github user davies commented on the issue:

https://github.com/apache/spark/pull/13809
  
LGTM, 
Merging this into master, thanks!





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/13758
  
Did you model it after MiscBenchmark? 
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/MiscBenchmark.scala

I can take a look if you add it to the PR.





[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...

2016-06-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13764#discussion_r67994391
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
@@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest {
       }
     }
   }
+
+  test("column nullability and comment - write and then read") {
+    val schema = StructType(
+      StructField("cl1", IntegerType, nullable = false,
+        new MetadataBuilder().putString("comment", "test").build()) ::
--- End diff --

Do you want me to submit a new PR for this? Or add it into this?





[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13764#discussion_r67994276
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
@@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest {
       }
     }
   }
+
+  test("column nullability and comment - write and then read") {
+    val schema = StructType(
+      StructField("cl1", IntegerType, nullable = false,
+        new MetadataBuilder().putString("comment", "test").build()) ::
--- End diff --

Yeah I am afraid it is. I just grepped through the code base and there are 
a few places where we do this, for example:

https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala#L1434-L1438

https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala#L397-L404

+1 for adding a convenience method.





[GitHub] spark issue #13814: [SPARK-16003] SerializationDebugger runs into infinite l...

2016-06-21 Thread davies
Github user davies commented on the issue:

https://github.com/apache/spark/pull/13814
  
What does the error message look like in the case of an unserializable object 
(for example, Iterator in Scala 2.10)?





[GitHub] spark issue #13811: [SPARK-16100] [SQL] avoid crash in TreeNode.withNewChild...

2016-06-21 Thread inouehrs
Github user inouehrs commented on the issue:

https://github.com/apache/spark/pull/13811
  
Hi @cloud-fan, your fix looks cleaner than mine.
As the TODO comment at withNewChildren says, adding validation of the order 
of children somewhere would help avoid future bugs caused by classes other 
than MapObjects.
Thank you.





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-21 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/13758
  
@hvanhovell, yes, that is a good idea. Actually, I wrote a benchmark program 
```org.apache.spark.sql.catalyst.util.GenericArrayBenchmark``` (not committed 
yet). An issue in my environment is that I cannot run a benchmark program under 
sql/catalyst.

The following command does not execute my benchmark program...
```
build/sbt "catalyst/test-only *GenericArrayBenchmark*"
```





[GitHub] spark issue #13802: [SPARK-16094][SQL] Support HashAggregateExec for non-par...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/13802
  
@maropu all aggregates that currently set `supportsPartial = false` cannot be 
partially aggregated and require that the entire group is processed in one 
step. So the name is a bit misleading; I suppose we could rename it.

`UnsafeMapData` is typically part of an `UnsafeRow`, which already 
implements equals() and hashCode() without requiring its elements to implement 
these methods (it uses the backing byte array). I suppose we can add these 
methods.
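
For readers following along, the byte-array-backed equality described above can be sketched roughly like this (a standalone illustration, not Spark's actual `UnsafeRow`/`UnsafeMapData` code; the class name is made up for the sketch):

```scala
import java.util.Arrays

// Hypothetical stand-in for an unsafe row/map backed by raw bytes:
// equality and hashing delegate to the backing byte array, so the
// element types never need to implement equals()/hashCode() themselves.
final class ByteBackedData(val bytes: Array[Byte]) {
  override def equals(o: Any): Boolean = o match {
    case other: ByteBackedData => Arrays.equals(bytes, other.bytes)
    case _ => false
  }
  override def hashCode: Int = Arrays.hashCode(bytes)
}
```

Two instances built over equal byte arrays compare equal and hash identically, which is all the row-level contract needs.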





[GitHub] spark issue #13380: [SPARK-15644] [MLlib] [SQL] Replace SQLContext with Spar...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13380
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60994/
Test PASSed.





[GitHub] spark issue #13380: [SPARK-15644] [MLlib] [SQL] Replace SQLContext with Spar...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13380
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
I guess it is related to 
https://github.com/apache/spark/commit/f4af6a8b3ce5cea4dc4096e43001c7d60fce8cdb.
 I will look into this more deeply.





[GitHub] spark issue #13380: [SPARK-15644] [MLlib] [SQL] Replace SQLContext with Spar...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13380
  
**[Test build #60994 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60994/consoleFull)**
 for PR 13380 at commit 
[`65534a0`](https://github.com/apache/spark/commit/65534a04bd4fc68347ee9aff71d4de186e9656a0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13728: [SPARK-16010] [SQL] Code Refactoring, Test Case Improvem...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13728
  
**[Test build #61006 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61006/consoleFull)**
 for PR 13728 at commit 
[`d1b2cbb`](https://github.com/apache/spark/commit/d1b2cbbe73e74ee80dd3afa6a9a1fe5214138b22).





[GitHub] spark issue #13728: [SPARK-16010] [SQL] Code Refactoring, Test Case Improvem...

2016-06-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/13728
  
retest this please





[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13834
  
Merged build finished. Test FAILed.





[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...

2016-06-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/13764#discussion_r67993473
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
@@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with 
BeforeAndAfterAll with OrcTest {
   }
 }
   }
+
+  test("column nullability and comment - write and then read") {
+val schema = StructType(
+  StructField("cl1", IntegerType, nullable = false,
+new MetadataBuilder().putString("comment", "test").build()) ::
--- End diff --

:) It is a little hacky. Maybe we should add a new API for users to 
add comments?





[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13834
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61002/
Test FAILed.





[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13834
  
**[Test build #61002 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61002/consoleFull)**
 for PR 13834 at commit 
[`04c8637`](https://github.com/apache/spark/commit/04c86373e3b259471adef37a2c4aa7650f19134e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13756
  
**[Test build #61005 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61005/consoleFull)**
 for PR 13756 at commit 
[`ae15ea9`](https://github.com/apache/spark/commit/ae15ea99e4f9d98b245ac7d6dcc98b7b5d30fffc).





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67993179
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -23,7 +23,60 @@ import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.types.{DataType, Decimal}
 import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String}
 
-class GenericArrayData(val array: Array[Any]) extends ArrayData {
+object GenericArrayData {
+  def allocate(seq: Seq[Any]): GenericArrayData = new 
GenericRefArrayData(seq)
--- End diff --

Definitely, you are right





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/13758
  
@kiszk it would be nice to know what the influence of this PR is on 
performance. Could you perhaps elaborate on this?





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67992883
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
 ---
@@ -112,8 +112,8 @@ class CodeGenerationSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 val plan = GenerateMutableProjection.generate(expressions)
 val actual = plan(new 
GenericMutableRow(length)).toSeq(expressions.map(_.dataType))
 val expected = Seq(new ArrayBasedMapData(
-  new GenericArrayData(0 until length),
-  new GenericArrayData(Seq.fill(length)(true
+  GenericArrayData.allocate(0 until length),
+  GenericArrayData.allocate(Seq.fill(length)(true
--- End diff --

Same as before...





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67992836
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
 ---
@@ -96,7 +96,7 @@ class CodeGenerationSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 val expressions = 
Seq(CreateArray(List.fill(length)(EqualTo(Literal(1), Literal(1)
 val plan = GenerateMutableProjection.generate(expressions)
 val actual = plan(new 
GenericMutableRow(length)).toSeq(expressions.map(_.dataType))
-val expected = Seq(new GenericArrayData(Seq.fill(length)(true)))
+val expected = Seq(GenericArrayData.allocate(Seq.fill(length)(true)))
--- End diff --

Use the `Array.fill(...)`?





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67992774
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends 
ArrayData {
 result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends 
GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+val array = new Array[Int](numElements)
+System.arraycopy(primitiveArray, 0, array, 0, numElements)
+array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+if (!o.isInstanceOf[GenericIntArrayData]) {
+  return false
+}
+
+val other = o.asInstanceOf[GenericIntArrayData]
+if (other eq null) {
+  return false
+}
+
+val len = numElements()
+if (len != other.numElements()) {
+  return false
+}
+
+var i = 0
+while (i < len) {
+  val o1 = primitiveArray(i)
+  val o2 = other.primitiveArray(i)
+  if (o1 != o2) {
+return false
+  }
+  i += 1
+}
+true
+  }
+
+  override def hashCode: Int = {
+var result: Int = 37
+var i = 0
+val len = numElements()
+while (i < len) {
+  val update: Int = primitiveArray(i)
+  result = 37 * result + update
+  i += 1
+}
+result
+  }
+}
+
+final class GenericLongArrayData(val primitiveArray: Array[Long])
+  extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericLongArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getLong(ordinal: Int): Long = primitiveArray(ordinal)
+  override def toLongArray(): Array[Long] = {
+val array = new Array[Long](numElements)
+System.arraycopy(primitiveArray, 0, array, 0, numElements)
+array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+if (!o.isInstanceOf[GenericLongArrayData]) {
+  return false
+}
+
+val other = o.asInstanceOf[GenericLongArrayData]
+if (other eq null) {
+  return false
+}
+
+val len = numElements()
+if (len != other.numElements()) {
+  return false
+}
+
+var i = 0
+while (i < len) {
+  val o1 = primitiveArray(i)
+  val o2 = other.primitiveArray(i)
+  if (o1 != o2) {
+return false
+  }
+  i += 1
+}
+true
+  }
+
+  override def hashCode: Int = {
+var result: Int = 37
+var i = 0
+val len = numElements()
+while (i < len) {
+  val l = primitiveArray(i)
+  val update: Int = (l ^ (l >>> 32)).toInt
+  result = 37 * result + update
+  i += 1
+}
+result
+  }
+}
+
+final class GenericFloatArrayData(val primitiveArray: Array[Float])
+  extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new 
GenericFloatArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getFloat(ordinal: Int): Float = primitiveArray(ordinal)
+  override def toFloatArray(): Array[Float] = {
+val array = new Array[Float](numElements)
+System.arraycopy(primitiveArray, 0, array, 0, numElements)
+array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
+if (!o.isInstanceOf[GenericFloatArrayData]) {
+  return false
+}
+
+val other = o.asInstanceOf[GenericFloatArrayData]
+if (other eq null) {
+  return false
+}
+
+val len = numElements()
+if (len != other.numElements()) {
+  return false
+}
+
+var i = 0
+while (i < len) {
+  val o1 = 

[GitHub] spark issue #13802: [SPARK-16094][SQL] Support HashAggregateExec for non-par...

2016-06-21 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/13802
  
@hvanhovell oh, I see. Okay, I'll check whether we can implement mutable 
`ArrayData` and `MapData`.
By the way, I have some questions: 
1. Is there any reason to use `SortAggregateExec` for all the non-partial 
aggregates? It seems okay to use `HashAggregateExec` for non-partial ones 
except for `collect_xxx` and `hive_udaf`.
2. Why do we have no `hashCode` and `equals` in `UnsafeMapData`? 
`ArrayBasedMapData` already overrides these functions.





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67992728
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends 
ArrayData {
 result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends 
GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
--- End diff --

Return the actual type instead of the interface





[GitHub] spark issue #13831: [SPARK-16119][sql] Support PURGE option to drop table / ...

2016-06-21 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/13831
  
@vanzin Thank you for the PR. I probably will not be able to review it 
until we get 2.0 out. Will take a look after the release.





[GitHub] spark issue #13831: [SPARK-16119][sql] Support PURGE option to drop table / ...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13831
  
**[Test build #61004 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61004/consoleFull)**
 for PR 13831 at commit 
[`b084081`](https://github.com/apache/spark/commit/b0840819c6c8cfcf45aa31b54d164802536ae4df).





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67992711
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends 
ArrayData {
 result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends 
GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+val array = new Array[Int](numElements)
+System.arraycopy(primitiveArray, 0, array, 0, numElements)
+array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
--- End diff --

We could move this into the abstract class. Performance is not a real concern here.





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67992617
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends 
ArrayData {
 result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends 
GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
--- End diff --

Return `this` if you are not copying the backing array.
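
The suggestion can be sketched like this (a minimal, hypothetical class, not the actual patch), assuming the backing array is treated as immutable:

```scala
// Sketch of the review suggestion: when copy() does not duplicate the
// backing array anyway, returning this avoids allocating a new wrapper.
// Safe only as long as primitiveArray is never mutated after construction.
final class ImmutableIntArrayData(val primitiveArray: Array[Int]) {
  def copy(): ImmutableIntArrayData = this
}
```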





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67992529
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -23,7 +23,60 @@ import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.types.{DataType, Decimal}
 import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String}
 
-class GenericArrayData(val array: Array[Any]) extends ArrayData {
+object GenericArrayData {
+  def allocate(seq: Seq[Any]): GenericArrayData = new 
GenericRefArrayData(seq)
--- End diff --

Please make all allocate methods return the type they are actually 
allocating. That will increase the chance that we deal with a monomorphic 
callsite in further code.
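
A rough sketch of what the reviewer is asking for (class bodies reduced to stubs; names follow the diff above, logic is illustrative only): each `allocate` overload declares the concrete subclass it constructs rather than the `GenericArrayData` supertype, so code that immediately uses the result sees one concrete type.

```scala
// Minimal stubs standing in for the classes in the diff above.
abstract class GenericArrayData
final class GenericRefArrayData(val seq: Seq[Any]) extends GenericArrayData
final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData

object GenericArrayData {
  // Concrete return types instead of the GenericArrayData supertype:
  def allocate(seq: Seq[Any]): GenericRefArrayData = new GenericRefArrayData(seq)
  def allocate(primitiveArray: Array[Int]): GenericIntArrayData =
    new GenericIntArrayData(primitiveArray)
}
```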





[GitHub] spark issue #13830: [SPARK-16121] ListingFileCatalog does not list in parall...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13830
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60995/
Test PASSed.





[GitHub] spark issue #13830: [SPARK-16121] ListingFileCatalog does not list in parall...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13830
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13758#discussion_r67992307
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala
 ---
@@ -142,3 +164,415 @@ class GenericArrayData(val array: Array[Any]) extends 
ArrayData {
 result
   }
 }
+
+final class GenericIntArrayData(private val primitiveArray: Array[Int])
+  extends GenericArrayData(Array.empty) {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+val array = new Array[Int](numElements)
+System.arraycopy(primitiveArray, 0, array, 0, numElements)
+array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
--- End diff --

`UnsafeArrayData` should always be part of an `UnsafeRow` (which 
implements equals() and hashCode()). We should implement equals() and 
hashCode() for these classes, but please use the methods provided by 
`java.util.Arrays`. We can take care of UnsafeArrayData in your dense 
UnsafeArrayData PR. 
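
Applied to the hand-rolled loops in the diff, the `java.util.Arrays` suggestion would look roughly like this (a simplified sketch, not the final patch; the class here drops the `GenericArrayData` parent to stay self-contained):

```scala
import java.util.Arrays

// Simplified GenericIntArrayData: the manual while-loops for equality
// and the 37-based hash accumulator are replaced by the equivalent
// java.util.Arrays helpers for primitive int arrays.
final class IntArrayData(val primitiveArray: Array[Int]) {
  override def equals(o: Any): Boolean = o match {
    case other: IntArrayData => Arrays.equals(primitiveArray, other.primitiveArray)
    case _ => false
  }
  override def hashCode: Int = Arrays.hashCode(primitiveArray)
}
```

`Arrays.equals` handles the length check and element loop, and `Arrays.hashCode` follows the same `31 * result + element` convention as `java.util.List`, so the behavior matches the intent of the hand-written versions with far less code.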





[GitHub] spark issue #13830: [SPARK-16121] ListingFileCatalog does not list in parall...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13830
  
**[Test build #60995 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60995/consoleFull)**
 for PR 13830 at commit 
[`d898735`](https://github.com/apache/spark/commit/d898735cdb41cbab190d6bc267b2347d3978481e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13758
  
Merged build finished. Test FAILed.





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13758
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61001/
Test FAILed.





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13758
  
**[Test build #61001 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61001/consoleFull)**
 for PR 13758 at commit 
[`e04ca2c`](https://github.com/apache/spark/commit/e04ca2ce51d7d587f12535fdeee5a6107c4e0c26).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13811: [SPARK-16100] [SQL] avoid crash in TreeNode.withNewChild...

2016-06-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13811
  
hi @inouehrs, sorry I hadn't noticed that you already have a PR for this 
JIRA ticket.
Yea, the root cause of this bug is exactly what you described in your PR 
description, thanks for finding this out!
However, I don't think adding a check to stop processing the children in 
`TreeNode.withNewChildren` is a good fix, since we would still be replacing 
the wrong children there.
Instead, I think we should avoid mistakenly treating `MapObjects.loopVar` as 
a child. Do you mind reviewing my PR at 
https://github.com/apache/spark/pull/13835? Thanks!





[GitHub] spark issue #13818: [SPARK-15968][SQL] Nonempty partitioned metastore tables...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13818
  
**[Test build #3124 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3124/consoleFull)**
 for PR 13818 at commit 
[`8a058c6`](https://github.com/apache/spark/commit/8a058c65c6c20e311bde5c0ade87c14c6b6b5f37).





[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...

2016-06-21 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/13835
  
Is this PR https://github.com/apache/spark/pull/13811 related?





[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...

2016-06-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13764#discussion_r67991258
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
@@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with 
BeforeAndAfterAll with OrcTest {
   }
 }
   }
+
+  test("column nullability and comment - write and then read") {
+val schema = StructType(
+  StructField("cl1", IntegerType, nullable = false,
+new MetadataBuilder().putString("comment", "test").build()) ::
--- End diff --

hmmm, is this the official way to add a column comment when creating a table 
using `DataFrameWriter`?

cc @yhuai  @hvanhovell 





[GitHub] spark pull request #13823: [MINOR] [MLLIB] deprecate setLabelCol in ChiSqSel...

2016-06-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13823





[GitHub] spark issue #13823: [MINOR] [MLLIB] deprecate setLabelCol in ChiSqSelectorMo...

2016-06-21 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/13823
  
Merging with master





[GitHub] spark issue #13380: [SPARK-15644] [MLlib] [SQL] Replace SQLContext with Spar...

2016-06-21 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/13380
  
LGTM pending tests!





[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13835
  
**[Test build #61003 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61003/consoleFull)**
 for PR 13835 at commit 
[`447ddcd`](https://github.com/apache/spark/commit/447ddcd3812e0253d3548f9462f21282abc086eb).





[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...

2016-06-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13835
  
cc @yhuai @liancheng @clockfly 





[GitHub] spark pull request #13835: [SPARK-16100][SQL] fix bug when use Map as the bu...

2016-06-21 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/13835

[SPARK-16100][SQL] fix bug when use Map as the buffer type of Aggregator

## What changes were proposed in this pull request?

The root cause is in `MapObjects`. Its parameter `loopVar` is not declared 
as a child, but it can sometimes be the same object as `lambdaFunction` (e.g. 
when the function that takes `loopVar` and produces `lambdaFunction` is 
`identity`), which is a child. This causes trouble when calling 
`withNewChildren`: it may mistakenly treat `loopVar` as a child and cause an 
`IndexOutOfBoundsException: 0` later.

This PR fixes the bug by simply pulling the parameters out of 
`LambdaVariable` and passing them to `MapObjects` directly.
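
The failure mode described above can be illustrated with a toy model (this 
is not Spark's actual `TreeNode`; `Node`, `Tree`, and the exception type are 
simplified stand-ins, and the real code throws `IndexOutOfBoundsException` 
rather than `NoSuchElementException`):

```scala
// Toy model of withNewChildren's reference-equality replacement, showing
// why a non-child field (loopVar) that is the SAME object as a child
// (lambdaFunction) consumes the replacement meant for the real child.
case class Node(name: String)

class Tree(val loopVar: Node, val lambdaFunction: Node) {
  def withNewChildren(newChildren: Seq[Node]): Tree = {
    val remaining = scala.collection.mutable.Queue(newChildren: _*)
    val children = Seq(lambdaFunction) // loopVar is NOT supposed to be a child
    def replace(arg: Node): Node =
      if (children.exists(_ eq arg)) remaining.dequeue() else arg
    // If loopVar eq lambdaFunction, replace(loopVar) wrongly dequeues,
    // so the queue is empty by the time lambdaFunction is processed.
    new Tree(replace(loopVar), replace(lambdaFunction))
  }
}
```

When `loopVar` and `lambdaFunction` are distinct objects everything works; 
when they are the same object, the second `dequeue()` hits an empty queue, 
which mirrors the `IndexOutOfBoundsException: 0` in Spark.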

## How was this patch tested?

new test in `DatasetAggregatorSuite`



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark map-objects

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13835.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13835


commit 447ddcd3812e0253d3548f9462f21282abc086eb
Author: Wenchen Fan 
Date:   2016-06-22T03:30:31Z

fix bug of MapObjects







[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.





[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60990/
Test FAILed.





[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #60990 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60990/consoleFull)**
 for PR 13806 at commit 
[`2a55091`](https://github.com/apache/spark/commit/2a550912f1194e9c212d9f4f78824eaf375ddccc).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...

2016-06-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13834
  
**[Test build #61002 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61002/consoleFull)**
 for PR 13834 at commit 
[`04c8637`](https://github.com/apache/spark/commit/04c86373e3b259471adef37a2c4aa7650f19134e).





[GitHub] spark pull request #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing ...

2016-06-21 Thread tejasapatil
GitHub user tejasapatil opened a pull request:

https://github.com/apache/spark/pull/13834

[TRIVIAL] [CORE] [ScriptTransform] move printing of stderr buffer before 
closing the outstream

## What changes were proposed in this pull request?

Currently, if the outstream gets destroyed or closed due to some failure, a 
later `outstream.close()` leads to an IOException. Because of this, the 
`stderrBuffer` does not get logged and there is no way for users to see why 
the job failed.

The change is to first display the stderr buffer and then try closing the 
outstream.
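
A minimal sketch of the reordering, with hypothetical names 
(`drainAndClose` and its parameters are illustrative, not Spark's actual 
ScriptTransform code):

```scala
// Sketch: surface the child process's stderr BEFORE closing the output
// stream, so a failing close() can no longer swallow the diagnostics.
def drainAndClose(stderrBuffer: StringBuilder,
                  outstream: java.io.OutputStream,
                  log: String => Unit): Unit = {
  // Log diagnostics first, before any call that may throw.
  if (stderrBuffer.nonEmpty) log(stderrBuffer.toString)
  // May still throw IOException if the stream was already destroyed,
  // but by now the user has already seen why the job failed.
  outstream.close()
}
```

Even when `close()` throws, the stderr buffer has already been logged, which 
is exactly the behavior the patch is after.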

## How was this patch tested?

The correct way to test this fix would be to grep the log to see if the 
`stderrBuffer` gets logged, but I don't think having test cases which do 
that is a good idea.




…

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tejasapatil/spark script_transform

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13834.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13834


commit 04c86373e3b259471adef37a2c4aa7650f19134e
Author: Tejas Patil 
Date:   2016-06-22T03:22:33Z

[TRIVIAL] [CORE] [ScriptTransform] move printing of stderr buffer before 
closing the outstream






