[GitHub] spark issue #13836: [SPARK][YARN] Fix not test yarn cluster mode correctly i...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13836

**[Test build #61013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61013/consoleFull)** for PR 13836 at commit [`7820382`](https://github.com/apache/spark/commit/7820382a39dc41382d62b7a5e6fd871338ac00b3).

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13836: [SPARK][YARN] Fix not test yarn cluster mode corr...
GitHub user renozhang opened a pull request: https://github.com/apache/spark/pull/13836

[SPARK][YARN] Fix not test yarn cluster mode correctly in YarnClusterSuite

## What changes were proposed in this pull request?

Since SPARK-13220 (Deprecate "yarn-client" and "yarn-cluster"), YarnClusterSuite no longer tests "yarn cluster" mode correctly. This pull request fixes it.

## How was this patch tested?

Unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/renozhang/spark SPARK-16125-test-yarn-cluster-mode

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13836.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #13836

commit 7820382a39dc41382d62b7a5e6fd871338ac00b3
Author: peng.zhang
Date: 2016-06-22T05:37:26Z

    Fix not test yarn cluster mode actually in YarnClusterSuite
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67997948

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala ---
@@ -142,3 +164,415 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(private val primitiveArray: Array[Int])
+  extends GenericArrayData(Array.empty) {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+    val array = new Array[Int](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
--- End diff --

Yeah, you did. So that is good. I am just saying that we could reduce the size of the PR by using things that are already provided by the JDK.
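As an illustration of the JDK facilities being suggested (a sketch only, with hypothetical names, not code from this PR), `java.util.Arrays` already provides type-specialized equality and hashing for primitive arrays, which can replace hand-written loops:

```scala
// Sketch: delegating equals()/hashCode() of a primitive-array-backed
// wrapper to the JDK. IntArrayWrapper is an illustrative name.
class IntArrayWrapper(private val values: Array[Int]) {
  override def equals(o: Any): Boolean = o match {
    case other: IntArrayWrapper =>
      // Element-wise comparison provided by the JDK
      java.util.Arrays.equals(values, other.values)
    case _ => false
  }
  // Hash code consistent with equals, also provided by the JDK
  override def hashCode(): Int = java.util.Arrays.hashCode(values)
}
```

Two wrappers over equal element sequences then compare equal and hash identically, without any per-type loop code in the PR itself.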
[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13778

Merged build finished. Test PASSed.
[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13778

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61000/
[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13778

**[Test build #61000 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61000/consoleFull)** for PR 13778 at commit [`a0b81ba`](https://github.com/apache/spark/commit/a0b81ba448018200eb8947bc496d8f07d87a64b8).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13820

Merged build finished. Test PASSed.
[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13820

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61007/
[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13820

**[Test build #61007 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61007/consoleFull)** for PR 13820 at commit [`8ee701f`](https://github.com/apache/spark/commit/8ee701f99d63e921cf3bab86b4d30e7230b104d2).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #13825: [SPARK-16120] [STREAMING] getCurrentLogFiles in Receiver...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13825

**[Test build #61012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61012/consoleFull)** for PR 13825 at commit [`f689cce`](https://github.com/apache/spark/commit/f689cce93ba21fd25341c5c142747ba0821f0572).
[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13835

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61003/
[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13835

Merged build finished. Test PASSed.
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67997326

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala ---
@@ -142,3 +164,415 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(private val primitiveArray: Array[Int])
+  extends GenericArrayData(Array.empty) {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+    val array = new Array[Int](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
+
+  override def equals(o: Any): Boolean = {
--- End diff --

I should implement `equals()` and `hashCode()` for the classes related to `GenericArrayData`. I already implemented type-specialized `equals()` and `hashCode()` in `GenericArrayData`. An issue with `equals()` and `hashCode()` in `GenericRefArrayData` is that the type of each element may differ, since the elements are held in an `Array[Any]`. If I misunderstood your suggestion, could you please let me know?
[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13835

**[Test build #61003 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61003/consoleFull)** for PR 13835 at commit [`447ddcd`](https://github.com/apache/spark/commit/447ddcd3812e0253d3548f9462f21282abc086eb).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #13825: [SPARK-16120] [STREAMING] getCurrentLogFiles in Receiver...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13825

Jenkins test this please
[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13834

**[Test build #61011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61011/consoleFull)** for PR 13834 at commit [`04c8637`](https://github.com/apache/spark/commit/04c86373e3b259471adef37a2c4aa7650f19134e).
[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13834

Jenkins retest this please
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/13824

Also, would you please add a unit test for this?
[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13834

SGTM
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67996555

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayDataBenchmark.scala ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import org.apache.spark.util.Benchmark
+
+/**
+ * Benchmark [[GenericArrayData]] for Dense and Sparse with primitive type
+ */
+object GenericArrayDataBenchmark {
+/*
+  def allocateGenericIntArray(iters: Int): Unit = {
--- End diff --

I am quite curious to see the results here. I cannot really imagine these being different.
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67996447

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayDataBenchmark.scala ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import org.apache.spark.util.Benchmark
+
+/**
+ * Benchmark [[GenericArrayData]] for Dense and Sparse with primitive type
+ */
+object GenericArrayDataBenchmark {
+/*
+  def allocateGenericIntArray(iters: Int): Unit = {
+    val count = 1024 * 1024 * 10
+    var array: GenericArrayData = null
+
+    val primitiveIntArray = new Array[Int](count)
+    val denseIntArray = { i: Int =>
+      for (n <- 0L until iters) {
--- End diff --

`for (n <- 0L until iters)` is expensive and it will probably dominate the costs of the benchmark. Please use a while loop.
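To illustrate the rewrite being requested (a sketch with illustrative names, not code from the PR), a Scala for-comprehension over a `Range` desugars into a `foreach` with a closure, while an equivalent while loop compiles to a plain counter loop:

```scala
// Sketch: the same hot loop written both ways. In benchmark-critical
// code the while-loop form avoids per-iteration Range/closure overhead.
def sumWithFor(iters: Long): Long = {
  var sum = 0L
  for (n <- 0L until iters) { // desugars to NumericRange.foreach(closure)
    sum += n
  }
  sum
}

def sumWithWhile(iters: Long): Long = {
  var sum = 0L
  var n = 0L
  while (n < iters) { // plain loop, no Range or closure allocation
    sum += n
    n += 1
  }
  sum
}
```

Both produce the same result; the difference only matters when the loop body is cheap enough that the loop machinery itself shows up in the measurement, which is exactly the situation in a micro-benchmark.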
[GitHub] spark issue #13818: [SPARK-15968][SQL] Nonempty partitioned metastore tables...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13818

**[Test build #3124 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3124/consoleFull)** for PR 13818 at commit [`8a058c6`](https://github.com/apache/spark/commit/8a058c65c6c20e311bde5c0ade87c14c6b6b5f37).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67996267

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayDataBenchmark.scala ---
@@ -0,0 +1,188 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import org.apache.spark.util.Benchmark
+
+/**
+ * Benchmark [[GenericArrayData]] for Dense and Sparse with primitive type
+ */
+object GenericArrayDataBenchmark {
--- End diff --

This is different from `MiscBenchmark`. It is a class that extends `BenchmarkBase`. You probably have to move it to sql/core though.
[GitHub] spark issue #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment for Dat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13764

**[Test build #61010 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61010/consoleFull)** for PR 13764 at commit [`87d32d7`](https://github.com/apache/spark/commit/87d32d70b18fb040a351c754182ea11f807fc238).
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67996014

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala ---
@@ -112,8 +112,8 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper {
     val plan = GenerateMutableProjection.generate(expressions)
     val actual = plan(new GenericMutableRow(length)).toSeq(expressions.map(_.dataType))
     val expected = Seq(new ArrayBasedMapData(
-      new GenericArrayData(0 until length),
-      new GenericArrayData(Seq.fill(length)(true
+      GenericArrayData.allocate(0 until length),
+      GenericArrayData.allocate(Seq.fill(length)(true
--- End diff --

Ok, I will do both.
[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13072

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60999/
[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13072

Merged build finished. Test FAILed.
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67995993

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
+
+  override def numElements(): Int = primitiveArray.length
+
+  override def isNullAt(ordinal: Int): Boolean = false
+  override def getInt(ordinal: Int): Int = primitiveArray(ordinal)
+  override def toIntArray(): Array[Int] = {
+    val array = new Array[Int](numElements)
+    System.arraycopy(primitiveArray, 0, array, 0, numElements)
+    array
+  }
+  override def toString(): String = primitiveArray.mkString("[", ",", "]")
--- End diff --

Yes, I will move it up.
[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/13764#discussion_r67995925

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala ---
@@ -223,6 +223,31 @@ class DataFrameReaderWriterSuite extends QueryTest with SharedSQLContext with Be
     }
   }
 
+  test("column nullability and comment - write and then read") {
--- End diff --

Sure, let me change it. Thanks!
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67995936

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala ---
@@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData {
     result
   }
 }
+
+final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData {
+  override def array(): Array[Any] = primitiveArray.toArray
+
+  override def copy(): ArrayData = new GenericIntArrayData(primitiveArray)
--- End diff --

Hmm, I should clone the backing array. I should also return the actual type.
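As a sketch of the defensive copy being discussed (illustrative class name, not the PR's final code), `copy()` should clone the backing primitive array instead of sharing it, so later mutation of the original cannot leak into the copy:

```scala
// Sketch: copy() that clones the backing primitive array rather than
// sharing the reference, and returns the concrete wrapper type.
// IntArrayData is a hypothetical stand-in for the PR's class.
final class IntArrayData(val primitiveArray: Array[Int]) {
  // clone() on a primitive array copies the elements, so the copy is
  // independent of the original backing store
  def copy(): IntArrayData = new IntArrayData(primitiveArray.clone())
}
```

With this version, mutating the original array after `copy()` leaves the copied instance unchanged, which is the semantics a `copy()` method is normally expected to have.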
[GitHub] spark issue #13072: [SPARK-15288] [Mesos] Mesos dispatcher should handle gra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13072 **[Test build #60999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60999/consoleFull)** for PR 13072 at commit [`2f306a7`](https://github.com/apache/spark/commit/2f306a785f3adbbba7420d27630b6f1553139074). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #13820: [SPARK-16107] [R] group glm methods in documentat...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/13820#discussion_r67995869 --- Diff: R/pkg/R/mllib.R --- @@ -124,24 +138,21 @@ setMethod("spark.glm", signature(data = "SparkDataFrame", formula = "formula"), #' summary(model) #' } #' @note glm since 1.5.0 +#' @seealso \link{spark.glm} setMethod("glm", signature(formula = "formula", family = "ANY", data = "SparkDataFrame"), function(formula, family = gaussian, data, epsilon = 1e-6, maxit = 25) { spark.glm(data, formula, family, tol = epsilon, maxIter = maxit) }) -#' Get the summary of a generalized linear model -#' -#' Returns the summary of a model produced by glm() or spark.glm(), similarly to R's summary(). +# Returns the summary of a model produced by glm() or spark.glm(), similarly to R's summary(). --- End diff -- I get that this is intentional, but I'd suggest adding an extra empty newline between the `#` and `#'` lines - it's very easy for newcomers to make mistakes when copy/pasting doc comments.
[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13764#discussion_r67995781 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala --- @@ -223,6 +223,31 @@ class DataFrameReaderWriterSuite extends QueryTest with SharedSQLContext with Be } } + test("column nullability and comment - write and then read") { --- End diff -- also remove this test, let's focus on SQL CREATE TABLE in this PR.
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13758 @hvanhovell, I added [a file](https://github.com/kiszk/spark/blob/133d4c0085b5ca2f20870c05d077e25d8715e07a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayDataBenchmark.scala) with a Benchmark (not run yet). I would appreciate it if you have time to look at this. It is very strange to me that I can run a Benchmark program under ```sql/core``` (e.g. ```MiscBenchmark```) by using ```build/sbt "sql/test-only *MiscBenchmark*"```, but not under ```sql/catalyst```.
[GitHub] spark issue #13715: [SPARK-15992] [MESOS] Refactor MesosCoarseGrainedSchedul...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13715 Merged build finished. Test FAILed.
[GitHub] spark issue #13715: [SPARK-15992] [MESOS] Refactor MesosCoarseGrainedSchedul...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13715 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60997/ Test FAILed.
[GitHub] spark issue #13715: [SPARK-15992] [MESOS] Refactor MesosCoarseGrainedSchedul...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13715 **[Test build #60997 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60997/consoleFull)** for PR 13715 at commit [`9e0aedf`](https://github.com/apache/spark/commit/9e0aedf12816c317db0a65e21adc921258608a4b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment for Dat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13764 **[Test build #61008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61008/consoleFull)** for PR 13764 at commit [`94b7264`](https://github.com/apache/spark/commit/94b7264480709c8ebfd551c85582967020111e97).
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13758 **[Test build #61009 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61009/consoleFull)** for PR 13758 at commit [`133d4c0`](https://github.com/apache/spark/commit/133d4c0085b5ca2f20870c05d077e25d8715e07a).
[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13820 **[Test build #61007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61007/consoleFull)** for PR 13820 at commit [`8ee701f`](https://github.com/apache/spark/commit/8ee701f99d63e921cf3bab86b4d30e7230b104d2).
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67995256 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala --- @@ -159,17 +159,17 @@ object CatalystTypeConverters { override def toCatalystImpl(scalaValue: Any): ArrayData = { scalaValue match { case a: Array[_] => - new GenericArrayData(a.map(elementConverter.toCatalyst)) + GenericArrayData.allocate(a.map(elementConverter.toCatalyst)) --- End diff -- These allocates will create a GenericRefArrayData object?
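For readers following the thread: whether such an `allocate` factory returns a specialized primitive wrapper depends on the runtime type of its argument. A hypothetical dispatch (illustrative names only, not the PR's actual implementation) shows why the `Array[Any]` produced by `a.map(elementConverter.toCatalyst)` would land in the reference-array case:

```scala
// Hypothetical factory dispatch: specialized wrappers for primitive arrays,
// reference wrapper otherwise. a.map(elementConverter.toCatalyst) yields an
// Array[Any] of boxed values (an Object[] at runtime), so it matches the
// final case, answering the question in the review comment above.
object AllocateSketch {
  def allocate(values: AnyRef): String = values match {
    case _: Array[Int]  => "GenericIntArrayData"   // backed by int[]
    case _: Array[Long] => "GenericLongArrayData"  // backed by long[]
    case _: Array[_]    => "GenericRefArrayData"   // backed by Object[]
  }
}
```

The pattern match compiles to runtime `instanceof` checks against `int[]`/`long[]`, so only arrays that were primitive at construction time get the specialized path.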
[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/13764#discussion_r67995158 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest { } } } + + test("column nullability and comment - write and then read") { +val schema = StructType( + StructField("cl1", IntegerType, nullable = false, +new MetadataBuilder().putString("comment", "test").build()) :: --- End diff -- Sure, let me remove it and submit a new PR soon. Thanks!
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67995044 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData { result } } + +final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericIntArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getInt(ordinal: Int): Int = primitiveArray(ordinal) + override def toIntArray(): Array[Int] = { +val array = new Array[Int](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericIntArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericIntArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 = primitiveArray(i) + val o2 = other.primitiveArray(i) + if (o1 != o2) { +return false + } + i += 1 +} +true + } + + override def hashCode: Int = { +var result: Int = 37 +var i = 0 +val len = numElements() +while (i < len) { + val update: Int = primitiveArray(i) + result = 37 * result + update + i += 1 +} +result + } +} + +final class GenericLongArrayData(val primitiveArray: Array[Long]) + extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericLongArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getLong(ordinal: 
Int): Long = primitiveArray(ordinal) + override def toLongArray(): Array[Long] = { +val array = new Array[Long](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericLongArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericLongArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 = primitiveArray(i) + val o2 = other.primitiveArray(i) + if (o1 != o2) { +return false + } + i += 1 +} +true + } + + override def hashCode: Int = { +var result: Int = 37 +var i = 0 +val len = numElements() +while (i < len) { + val l = primitiveArray(i) + val update: Int = (l ^ (l >>> 32)).toInt + result = 37 * result + update + i += 1 +} +result + } +} + +final class GenericFloatArrayData(val primitiveArray: Array[Float]) + extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericFloatArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getFloat(ordinal: Int): Float = primitiveArray(ordinal) + override def toFloatArray(): Array[Float] = { +val array = new Array[Float](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericFloatArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericFloatArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 =
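The hand-rolled `equals`/`hashCode` loops in the diff quoted above can, for the non-null case, be expressed by delegating to `java.util.Arrays`. This is an observational sketch, not the PR's code; note that `Arrays.hashCode` uses a 31-based polynomial rather than the 37-based one in the diff, so the hash values differ even though the equals/hashCode contract is satisfied either way:

```scala
import java.util.Arrays

// Sketch of primitive-array equality via java.util.Arrays: element-wise
// comparison plus a polynomial hash over the elements, matching the intent
// of the explicit while-loops in the quoted diff.
final class LongArrayEqualitySketch(val primitiveArray: Array[Long]) {
  override def equals(o: Any): Boolean = o match {
    case other: LongArrayEqualitySketch =>
      Arrays.equals(primitiveArray, other.primitiveArray)
    case _ => false
  }
  override def hashCode: Int = Arrays.hashCode(primitiveArray)
}
```

The `case other: ... => / case _ => false` match also subsumes the explicit `isInstanceOf`/`eq null` checks in the diff, since a null or foreign-typed argument simply falls through to `false`.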
[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13764#discussion_r67994899 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest { } } } + + test("column nullability and comment - write and then read") { +val schema = StructType( + StructField("cl1", IntegerType, nullable = false, +new MetadataBuilder().putString("comment", "test").build()) :: --- End diff -- you can open a new PR and move this test there.
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67994789 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData { result } } + +final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericIntArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getInt(ordinal: Int): Int = primitiveArray(ordinal) + override def toIntArray(): Array[Int] = { +val array = new Array[Int](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericIntArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericIntArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 = primitiveArray(i) + val o2 = other.primitiveArray(i) + if (o1 != o2) { +return false + } + i += 1 +} +true + } + + override def hashCode: Int = { +var result: Int = 37 +var i = 0 +val len = numElements() +while (i < len) { + val update: Int = primitiveArray(i) + result = 37 * result + update + i += 1 +} +result + } +} + +final class GenericLongArrayData(val primitiveArray: Array[Long]) + extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericLongArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getLong(ordinal: 
Int): Long = primitiveArray(ordinal) + override def toLongArray(): Array[Long] = { +val array = new Array[Long](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericLongArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericLongArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 = primitiveArray(i) + val o2 = other.primitiveArray(i) + if (o1 != o2) { +return false + } + i += 1 +} +true + } + + override def hashCode: Int = { +var result: Int = 37 +var i = 0 +val len = numElements() +while (i < len) { + val l = primitiveArray(i) + val update: Int = (l ^ (l >>> 32)).toInt + result = 37 * result + update + i += 1 +} +result + } +} + +final class GenericFloatArrayData(val primitiveArray: Array[Float]) + extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericFloatArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getFloat(ordinal: Int): Float = primitiveArray(ordinal) + override def toFloatArray(): Array[Float] = { +val array = new Array[Float](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericFloatArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericFloatArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 = primitiveArray(i)
[GitHub] spark pull request #13809: [SPARK-16104][SQL] Do not create CSV writer obje...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13809
[GitHub] spark pull request #13818: [SPARK-15968][SQL] Nonempty partitioned metastore...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13818#discussion_r67994573 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala --- @@ -425,6 +425,28 @@ class ParquetMetastoreSuite extends ParquetPartitioningTest { } } + test("SPARK-15968: nonempty partitioned metastore Parquet table lookup should use cached " + --- End diff -- Could you take a look at [CachedTableSuite](https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/CachedTableSuite.scala) and add the test there (and also use a similar approach)?
[GitHub] spark issue #13809: [SPARK-16104][SQL] Do not create CSV writer object for ...
Github user davies commented on the issue: https://github.com/apache/spark/pull/13809 LGTM, merging this into master, thanks!
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/13758 Did you model it after MiscBenchmark? https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/MiscBenchmark.scala I can take a look if you add it to the PR.
[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/13764#discussion_r67994391 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest { } } } + + test("column nullability and comment - write and then read") { +val schema = StructType( + StructField("cl1", IntegerType, nullable = false, +new MetadataBuilder().putString("comment", "test").build()) :: --- End diff -- Do you want me to submit a new PR for this? Or add it into this one?
[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13764#discussion_r67994276 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest { } } } + + test("column nullability and comment - write and then read") { +val schema = StructType( + StructField("cl1", IntegerType, nullable = false, +new MetadataBuilder().putString("comment", "test").build()) :: --- End diff -- Yeah I am afraid it is. I just grepped through the code base and there are a few places where we do this, for example: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala#L1434-L1438 https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala#L397-L404 +1 for adding a convenience method.
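The convenience method being +1'd would wrap the `MetadataBuilder` boilerplate quoted in the diffs above. A minimal stand-in (hypothetical `FieldSketch`; the real method would live on `org.apache.spark.sql.types.StructField` and use `MetadataBuilder`) shows the intended shape without depending on Spark:

```scala
// Hypothetical model of StructField metadata as a plain Map, to show the
// shape of a withComment convenience method. The immutable copy-on-write
// style mirrors new MetadataBuilder().putString("comment", ...).build().
case class FieldSketch(name: String, metadata: Map[String, String] = Map.empty) {
  // returns a new field with the comment stored under the "comment" key
  def withComment(comment: String): FieldSketch =
    copy(metadata = metadata + ("comment" -> comment))
  def getComment: Option[String] = metadata.get("comment")
}
```

With such a helper, the test fixture in the diff would read `StructField("cl1", IntegerType, nullable = false).withComment("test")` instead of spelling out the builder each time.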
[GitHub] spark issue #13814: [SPARK-16003] SerializationDebugger runs into infinite l...
Github user davies commented on the issue: https://github.com/apache/spark/pull/13814 What does the error message look like in the case of an unserializable object (for example, Iterator in Scala 2.10)?
[GitHub] spark issue #13811: [SPARK-16100] [SQL] avoid crash in TreeNode.withNewChild...
Github user inouehrs commented on the issue: https://github.com/apache/spark/pull/13811 Hi @cloud-fan, your fix looks cleaner than mine. As the TODO comment at withNewChildren says, adding validation of the order of children somewhere will help avoid future bugs caused by classes other than MapObjects. Thank you.
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13758 @hvanhovell, yes, that is a good idea. Actually, I wrote a benchmark program ```org.apache.spark.sql.catalyst.util.GenericArrayBenchmark``` (not committed yet). An issue in my environment is that I cannot run a benchmark program under sql/catalyst. The following command does not execute my benchmark program... ``` build/sbt "catalyst/test-only *GenericArrayBenchmark*" ```
[GitHub] spark issue #13802: [SPARK-16094][SQL] Support HashAggregateExec for non-par...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/13802 @maropu all aggregates that currently set `supportsPartial = false` cannot be partially aggregated and require that the entire group is processed in one step. So the name is a bit misleading. I suppose we could rename it. `UnsafeMapData` is typically part of an `UnsafeRow`, which already implements equals() and hashCode() without requiring its elements to implement these methods (it uses the backing byte array). I suppose we can add these methods.
[GitHub] spark issue #13380: [SPARK-15644] [MLlib] [SQL] Replace SQLContext with Spar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13380 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60994/ Test PASSed.
[GitHub] spark issue #13380: [SPARK-15644] [MLlib] [SQL] Replace SQLContext with Spar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13380 Merged build finished. Test PASSed.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13806 I guess it is about https://github.com/apache/spark/commit/f4af6a8b3ce5cea4dc4096e43001c7d60fce8cdb. I will look into this deeper.
[GitHub] spark issue #13380: [SPARK-15644] [MLlib] [SQL] Replace SQLContext with Spar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13380 **[Test build #60994 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60994/consoleFull)** for PR 13380 at commit [`65534a0`](https://github.com/apache/spark/commit/65534a04bd4fc68347ee9aff71d4de186e9656a0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13728: [SPARK-16010] [SQL] Code Refactoring, Test Case Improvem...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13728 **[Test build #61006 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61006/consoleFull)** for PR 13728 at commit [`d1b2cbb`](https://github.com/apache/spark/commit/d1b2cbbe73e74ee80dd3afa6a9a1fe5214138b22).
[GitHub] spark issue #13728: [SPARK-16010] [SQL] Code Refactoring, Test Case Improvem...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13728 retest this please
[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13834 Merged build finished. Test FAILed.
[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/13764#discussion_r67993473 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest { } } } + + test("column nullability and comment - write and then read") { +val schema = StructType( + StructField("cl1", IntegerType, nullable = false, +new MetadataBuilder().putString("comment", "test").build()) :: --- End diff -- : ) It is a little bit hacky. Maybe we should add a new API for users to add comments?
[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13834 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61002/ Test FAILed.
[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13834 **[Test build #61002 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61002/consoleFull)** for PR 13834 at commit [`04c8637`](https://github.com/apache/spark/commit/04c86373e3b259471adef37a2c4aa7650f19134e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13756 **[Test build #61005 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61005/consoleFull)** for PR 13756 at commit [`ae15ea9`](https://github.com/apache/spark/commit/ae15ea99e4f9d98b245ac7d6dcc98b7b5d30fffc).
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67993179 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -23,7 +23,60 @@ import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.types.{DataType, Decimal} import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String} -class GenericArrayData(val array: Array[Any]) extends ArrayData { +object GenericArrayData { + def allocate(seq: Seq[Any]): GenericArrayData = new GenericRefArrayData(seq) --- End diff -- Definitely, you are right
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/13758 @kiszk it would be nice to know what the influence of this PR is on performance. Could you perhaps elaborate on this?
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67992883 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -112,8 +112,8 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper { val plan = GenerateMutableProjection.generate(expressions) val actual = plan(new GenericMutableRow(length)).toSeq(expressions.map(_.dataType)) val expected = Seq(new ArrayBasedMapData( - new GenericArrayData(0 until length), - new GenericArrayData(Seq.fill(length)(true + GenericArrayData.allocate(0 until length), + GenericArrayData.allocate(Seq.fill(length)(true --- End diff -- Same as before...
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67992836 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -96,7 +96,7 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper { val expressions = Seq(CreateArray(List.fill(length)(EqualTo(Literal(1), Literal(1) val plan = GenerateMutableProjection.generate(expressions) val actual = plan(new GenericMutableRow(length)).toSeq(expressions.map(_.dataType)) -val expected = Seq(new GenericArrayData(Seq.fill(length)(true))) +val expected = Seq(GenericArrayData.allocate(Seq.fill(length)(true))) --- End diff -- Use the `Array.fill(...)`?
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67992774 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData { result } } + +final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericIntArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getInt(ordinal: Int): Int = primitiveArray(ordinal) + override def toIntArray(): Array[Int] = { +val array = new Array[Int](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericIntArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericIntArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 = primitiveArray(i) + val o2 = other.primitiveArray(i) + if (o1 != o2) { +return false + } + i += 1 +} +true + } + + override def hashCode: Int = { +var result: Int = 37 +var i = 0 +val len = numElements() +while (i < len) { + val update: Int = primitiveArray(i) + result = 37 * result + update + i += 1 +} +result + } +} + +final class GenericLongArrayData(val primitiveArray: Array[Long]) + extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericLongArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getLong(ordinal: 
Int): Long = primitiveArray(ordinal) + override def toLongArray(): Array[Long] = { +val array = new Array[Long](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericLongArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericLongArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 = primitiveArray(i) + val o2 = other.primitiveArray(i) + if (o1 != o2) { +return false + } + i += 1 +} +true + } + + override def hashCode: Int = { +var result: Int = 37 +var i = 0 +val len = numElements() +while (i < len) { + val l = primitiveArray(i) + val update: Int = (l ^ (l >>> 32)).toInt + result = 37 * result + update + i += 1 +} +result + } +} + +final class GenericFloatArrayData(val primitiveArray: Array[Float]) + extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericFloatArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getFloat(ordinal: Int): Float = primitiveArray(ordinal) + override def toFloatArray(): Array[Float] = { +val array = new Array[Float](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { +if (!o.isInstanceOf[GenericFloatArrayData]) { + return false +} + +val other = o.asInstanceOf[GenericFloatArrayData] +if (other eq null) { + return false +} + +val len = numElements() +if (len != other.numElements()) { + return false +} + +var i = 0 +while (i < len) { + val o1 =
[GitHub] spark issue #13802: [SPARK-16094][SQL] Support HashAggregateExec for non-par...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/13802 @hvanhovell oh, I see. okay, I'll check whether we can implement mutable `ArrayData` and `MapData`. btw, I have some questions: 1. Any reason to use `SortAggregateExec` for all the non-partial aggregates? It seems it is okay to use `HashAggregateExec` for non-partial ones except for `collect_xxx` and `hive_udaf`. 2. Why do we have no `hashCode` and `equals` in `UnsafeMapData`? `ArrayBasedMapData` already overrides these functions.
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67992728 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData { result } } + +final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericIntArrayData(primitiveArray) --- End diff -- Return the actual type instead of the interface
[GitHub] spark issue #13831: [SPARK-16119][sql] Support PURGE option to drop table / ...
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13831 @vanzin Thank you for the PR. I probably will not be able to review it until we get 2.0 out. Will take a look after the release.
[GitHub] spark issue #13831: [SPARK-16119][sql] Support PURGE option to drop table / ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13831 **[Test build #61004 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61004/consoleFull)** for PR 13831 at commit [`b084081`](https://github.com/apache/spark/commit/b0840819c6c8cfcf45aa31b54d164802536ae4df).
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67992711 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData { result } } + +final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericIntArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getInt(ordinal: Int): Int = primitiveArray(ordinal) + override def toIntArray(): Array[Int] = { +val array = new Array[Int](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") --- End diff -- We could move this into the abstract class. Perf is not a real concern here.
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67992617 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -142,3 +196,414 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData { result } } + +final class GenericIntArrayData(val primitiveArray: Array[Int]) extends GenericArrayData { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericIntArrayData(primitiveArray) --- End diff -- Return `this` if you are not copying the backing array.
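The trade-off behind this review comment can be sketched as follows. This is a hypothetical Java example (class and method names are invented for illustration, not Spark's): a "copy" that returns `this` is only safe if the backing array is treated as immutable, because otherwise mutations to the shared array leak through the copy.

```java
// Hypothetical sketch of the copy() trade-off: returning `this` aliases
// the caller's backing array, while a defensive copy isolates the result.
final class IntArrayData {
    private final int[] backing;

    IntArrayData(int[] backing) { this.backing = backing; }

    // Shallow "copy": cheap, but shares the backing array with the caller.
    IntArrayData shallowCopy() { return this; }

    // Defensive copy: independent of later mutations to the original array.
    IntArrayData deepCopy() { return new IntArrayData(backing.clone()); }

    int get(int ordinal) { return backing[ordinal]; }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        IntArrayData original = new IntArrayData(data);
        IntArrayData shallow = original.shallowCopy();
        IntArrayData deep = original.deepCopy();

        data[0] = 99; // mutate the shared backing array

        System.out.println(shallow.get(0)); // 99 -- the mutation leaks through
        System.out.println(deep.get(0));    // 1  -- the defensive copy is unaffected
    }
}
```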
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67992529 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -23,7 +23,60 @@ import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.types.{DataType, Decimal} import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String} -class GenericArrayData(val array: Array[Any]) extends ArrayData { +object GenericArrayData { + def allocate(seq: Seq[Any]): GenericArrayData = new GenericRefArrayData(seq) --- End diff -- Please make all allocate methods return the type they are actually allocating. That will increase the chance that we deal with a monomorphic callsite in further code.
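The "monomorphic call site" point can be sketched as follows. This is a hypothetical Java example (the class names are invented, not Spark's): when a factory declares the concrete type it allocates rather than the common supertype, code compiled against its return type dispatches on exactly one receiver class, which makes it easier for the JIT to devirtualize and inline subsequent calls.

```java
// Hypothetical sketch: concrete vs. supertype return types on a factory.
abstract class ArrayDataSketch {
    abstract int numElements();
}

final class IntArrayDataSketch extends ArrayDataSketch {
    private final int[] values;
    IntArrayDataSketch(int[] values) { this.values = values; }
    @Override int numElements() { return values.length; }
}

final class Factories {
    // Weaker: callers only see the supertype, so calls through the result
    // can become polymorphic once several subclasses flow into them.
    static ArrayDataSketch allocateGeneric(int[] values) {
        return new IntArrayDataSketch(values);
    }

    // Stronger: the declared return type is exactly the allocated class, so
    // downstream code dispatches on a single receiver type (monomorphic).
    static IntArrayDataSketch allocateInt(int[] values) {
        return new IntArrayDataSketch(values);
    }

    public static void main(String[] args) {
        IntArrayDataSketch a = Factories.allocateInt(new int[]{1, 2, 3});
        System.out.println(a.numElements()); // 3
    }
}
```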
[GitHub] spark issue #13830: [SPARK-16121] ListingFileCatalog does not list in parall...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13830 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60995/ Test PASSed.
[GitHub] spark issue #13830: [SPARK-16121] ListingFileCatalog does not list in parall...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13830 Merged build finished. Test PASSed.
[GitHub] spark pull request #13758: [SPARK-16043][SQL] Prepare GenericArrayData imple...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13758#discussion_r67992307 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/GenericArrayData.scala --- @@ -142,3 +164,415 @@ class GenericArrayData(val array: Array[Any]) extends ArrayData { result } } + +final class GenericIntArrayData(private val primitiveArray: Array[Int]) + extends GenericArrayData(Array.empty) { + override def array(): Array[Any] = primitiveArray.toArray + + override def copy(): ArrayData = new GenericIntArrayData(primitiveArray) + + override def numElements(): Int = primitiveArray.length + + override def isNullAt(ordinal: Int): Boolean = false + override def getInt(ordinal: Int): Int = primitiveArray(ordinal) + override def toIntArray(): Array[Int] = { +val array = new Array[Int](numElements) +System.arraycopy(primitiveArray, 0, array, 0, numElements) +array + } + override def toString(): String = primitiveArray.mkString("[", ",", "]") + + override def equals(o: Any): Boolean = { --- End diff -- `UnsafeArrayData` should always be part of an `UnsafeRow` (which implements equals() and hashCode()). We should implement equals() and hashCode() for these classes, but please use the methods provided by `java.util.Arrays`. We can take care of the UnsafeArrayData in your dense UnsafeArrayData PR.
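The suggestion above can be sketched as follows. This is a hypothetical Java example (the class name is invented for illustration): the hand-rolled equals()/hashCode() loops over a primitive array reduce to `java.util.Arrays.equals` and `java.util.Arrays.hashCode`, which implement the same element-wise contract (`Arrays.hashCode(long[])` folds each element as `(int)(l ^ (l >>> 32))`, the same per-element mixing as the manual loop in the diff, though with a 31-based rather than 37-based polynomial).

```java
import java.util.Arrays;

// Hypothetical sketch: equality and hashing for a primitive-array-backed
// wrapper via java.util.Arrays instead of hand-rolled loops.
final class LongArrayData {
    private final long[] values;

    LongArrayData(long[] values) { this.values = values; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof LongArrayData)) return false;
        // Element-wise comparison, equivalent to the manual while-loop.
        return Arrays.equals(values, ((LongArrayData) o).values);
    }

    @Override
    public int hashCode() {
        // Polynomial hash over the elements, mixing each long down to an int.
        return Arrays.hashCode(values);
    }

    public static void main(String[] args) {
        LongArrayData a = new LongArrayData(new long[]{1L, 2L, 3L});
        LongArrayData b = new LongArrayData(new long[]{1L, 2L, 3L});
        LongArrayData c = new LongArrayData(new long[]{1L, 2L, 4L});

        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(c));                  // false
    }
}
```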
[GitHub] spark issue #13830: [SPARK-16121] ListingFileCatalog does not list in parall...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13830 **[Test build #60995 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60995/consoleFull)** for PR 13830 at commit [`d898735`](https://github.com/apache/spark/commit/d898735cdb41cbab190d6bc267b2347d3978481e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13758 Merged build finished. Test FAILed.
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13758 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61001/ Test FAILed.
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13758 **[Test build #61001 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61001/consoleFull)** for PR 13758 at commit [`e04ca2c`](https://github.com/apache/spark/commit/e04ca2ce51d7d587f12535fdeee5a6107c4e0c26). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13811: [SPARK-16100] [SQL] avoid crash in TreeNode.withNewChild...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13811 hi @inouehrs , sorry I haven't noticed that you already have a PR for this JIRA ticket. Yea the root cause of this bug is exactly what you described in your PR description, thanks for finding this out! However, I don't think adding a check to stop processing the children in `TreeNode.withNewChildren` is a good fix; we are replacing the wrong children there. Instead, I think we should avoid mistakenly treating `MapObjects.loopVar` as a child. Do you mind reviewing my PR at https://github.com/apache/spark/pull/13835? thanks!
[GitHub] spark issue #13818: [SPARK-15968][SQL] Nonempty partitioned metastore tables...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13818 **[Test build #3124 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3124/consoleFull)** for PR 13818 at commit [`8a058c6`](https://github.com/apache/spark/commit/8a058c65c6c20e311bde5c0ade87c14c6b6b5f37).
[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/13835 Is this PR https://github.com/apache/spark/pull/13811 related?
[GitHub] spark pull request #13764: [SPARK-16024] [SQL] [TEST] Verify Column Comment ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/13764#discussion_r67991258 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -462,4 +463,27 @@ class OrcQuerySuite extends QueryTest with BeforeAndAfterAll with OrcTest { } } } + + test("column nullability and comment - write and then read") { +val schema = StructType( + StructField("cl1", IntegerType, nullable = false, +new MetadataBuilder().putString("comment", "test").build()) :: --- End diff -- hmmm, is this the official way to add a column comment when creating a table using `DataFrameWriter`? cc @yhuai @hvanhovell
[GitHub] spark pull request #13823: [MINOR] [MLLIB] deprecate setLabelCol in ChiSqSel...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13823
[GitHub] spark issue #13823: [MINOR] [MLLIB] deprecate setLabelCol in ChiSqSelectorMo...
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/13823 Merging with master
[GitHub] spark issue #13380: [SPARK-15644] [MLlib] [SQL] Replace SQLContext with Spar...
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/13380 LGTM pending tests!
[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13835 **[Test build #61003 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61003/consoleFull)** for PR 13835 at commit [`447ddcd`](https://github.com/apache/spark/commit/447ddcd3812e0253d3548f9462f21282abc086eb).
[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13835 cc @yhuai @liancheng @clockfly
[GitHub] spark pull request #13835: [SPARK-16100][SQL] fix bug when use Map as the bu...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/13835 [SPARK-16100][SQL] fix bug when use Map as the buffer type of Aggregator ## What changes were proposed in this pull request? The root cause is in `MapObjects`. Its parameter `loopVar` is not declared as a child, but can sometimes be the same object as `lambdaFunction` (e.g. when the function that takes `loopVar` and produces `lambdaFunction` is `identity`), which is a child. This causes trouble when calling `withNewChildren`: it may mistakenly treat `loopVar` as a child and later fail with `IndexOutOfBoundsException: 0`. This PR fixes the bug by simply pulling the parameters out of `LambdaVariable` and passing them to `MapObjects` directly. ## How was this patch tested? New test in `DatasetAggregatorSuite`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloud-fan/spark map-objects Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/13835.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #13835 commit 447ddcd3812e0253d3548f9462f21282abc086eb Author: Wenchen Fan Date: 2016-06-22T03:30:31Z fix bug of MapObjects
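The failure mode described in the PR can be sketched with a minimal model. This is an illustrative Python mock-up, not Spark's actual Scala code: `MapObjectsModel`, `fields()`, and `with_new_children` are hypothetical stand-ins for Catalyst's `MapObjects`, `productIterator`, and `TreeNode.withNewChildren`. The point is that a field which is *not* a declared child can be the very same object as a child, so a naive field-by-field replacement consumes the replacement list twice.

```python
class MapObjectsModel:
    """Toy analogue of MapObjects: loop_var is a parameter, NOT a child;
    lambda_function is the only child."""
    def __init__(self, loop_var, lambda_function):
        self.loop_var = loop_var
        self.lambda_function = lambda_function

    def children(self):
        return [self.lambda_function]

    def fields(self):
        # analogue of productIterator: every constructor argument
        return [self.loop_var, self.lambda_function]


def with_new_children(node, new_children):
    """Rebuild the field list, swapping each field that is (by identity)
    one of the current children for the next replacement in order."""
    remaining = list(new_children)
    rebuilt = []
    for field in node.fields():
        if any(field is child for child in node.children()):
            rebuilt.append(remaining.pop(0))  # raises IndexError when exhausted
        else:
            rebuilt.append(field)
    return rebuilt


x, y = object(), object()
ok = with_new_children(MapObjectsModel(x, y), ["new"])  # distinct objects: fine
# When the lambda function is the identity, loop_var IS the child object:
# with_new_children(MapObjectsModel(x, x), ["new"]) pops twice from a
# one-element list and raises IndexError -- the analogue of
# `IndexOutOfBoundsException: 0` mentioned above.
```

The fix the PR describes avoids the ambiguity at the source: the loop variable's pieces are passed to `MapObjects` as plain parameters, so no field can alias a child.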
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Merged build finished. Test FAILed.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60990/ Test FAILed.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #60990 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60990/consoleFull)** for PR 13806 at commit [`2a55091`](https://github.com/apache/spark/commit/2a550912f1194e9c212d9f4f78824eaf375ddccc). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing of stde...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13834 **[Test build #61002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61002/consoleFull)** for PR 13834 at commit [`04c8637`](https://github.com/apache/spark/commit/04c86373e3b259471adef37a2c4aa7650f19134e).
[GitHub] spark pull request #13834: [TRIVIAL] [CORE] [ScriptTransform] move printing ...
GitHub user tejasapatil opened a pull request: https://github.com/apache/spark/pull/13834 [TRIVIAL] [CORE] [ScriptTransform] move printing of stderr buffer before closing the outstream ## What changes were proposed in this pull request? Currently, if the outstream gets destroyed or closed due to some failure, a later `outstream.close()` throws an IOException. When that happens, the `stderrBuffer` does not get logged and there is no way for users to see why the job failed. The change is to first display the stderr buffer and then try closing the outstream. ## How was this patch tested? The correct way to test this fix would be to grep the log to see if the `stderrBuffer` gets logged, but I don't think having test cases which do that is a good idea. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tejasapatil/spark script_transform Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/13834.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #13834 commit 04c86373e3b259471adef37a2c4aa7650f19134e Author: Tejas Patil Date: 2016-06-22T03:22:33Z [TRIVIAL] [CORE] [ScriptTransform] move printing of stderr buffer before closing the outstream
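The ordering change the PR describes can be sketched as follows. This is a hedged Python illustration with hypothetical names (`finish`, `BrokenStream`), not the actual ScriptTransform Scala code: log the captured stderr buffer *before* attempting `close()`, so a failing close can no longer swallow the diagnostics.

```python
class BrokenStream:
    """Stands in for an outstream that was already destroyed or closed."""
    def close(self):
        raise IOError("stream already closed")


def finish(stderr_buffer, outstream, log):
    # 1. Surface the diagnostics first -- this is the fix's key ordering.
    log.append("stderr: " + stderr_buffer)
    # 2. Only then attempt the close; a failure here no longer hides step 1.
    try:
        outstream.close()
    except IOError as e:
        log.append("close failed: %s" % e)


log = []
finish("script exited with code 1", BrokenStream(), log)
# log[0] holds the stderr buffer even though close() raised
```

With the old ordering (close first, log after), the IOException would propagate before the stderr line was ever appended, which is exactly the information loss the PR removes.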