date:20181108

[GitHub] spark pull request #22939: [SPARK-25446][R] Add schema_of_json() and schema_...

2018-11-08 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22939#discussion_r232166370
  
--- Diff: R/pkg/R/functions.R ---
@@ -2230,6 +2237,32 @@ setMethod("from_json", signature(x = "Column", schema = "characterOrstructType")
             column(jc)
           })
 
+#' @details
+#' \code{schema_of_json}: Parses a JSON string and infers its schema in DDL format.
+#'
+#' @rdname column_collection_functions
+#' @aliases schema_of_json schema_of_json,characterOrColumn-method
+#' @examples
+#'
+#' \dontrun{
+#' json <- '{"name":"Bob"}'
+#' df <- sql("SELECT * FROM range(1)")
+#' head(select(df, schema_of_json(json)))}
+#' @note schema_of_json since 3.0.0
+setMethod("schema_of_json", signature(x = "characterOrColumn"),
+          function(x, ...) {
+            if (class(x) == "character") {
+              col <- callJStatic("org.apache.spark.sql.functions", "lit", x)
+            } else {
+              col <- x@jc
--- End diff --

hm.. why not just support string then? Usage like `schema_of_csv(lit("Amsterdam,2018"))` is kinda odd in R.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22987
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98636/
Test PASSed.


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22987
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22987
  
**[Test build #98636 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98636/testReport)**
 for PR 22987 at commit 
[`471092d`](https://github.com/apache/spark/commit/471092d417666f5cf8908318aed098d6f06c4900).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFInJoinCo...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22955
  
**[Test build #98644 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98644/testReport)**
 for PR 22955 at commit 
[`38b1555`](https://github.com/apache/spark/commit/38b15552995355d5e00186fb2b332928a83d248a).


---


[GitHub] spark issue #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFInJoinCo...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22955
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4882/
Test PASSed.


---


[GitHub] spark issue #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFInJoinCo...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22955
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFI...

2018-11-08 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22955#discussion_r232163956
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PullOutPythonUDFInJoinConditionSuite.scala ---
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.scalatest.Matchers._
+
+import org.apache.spark.api.python.PythonEvalType
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.PythonUDF
+import org.apache.spark.sql.catalyst.plans._
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+import org.apache.spark.sql.internal.SQLConf._
+import org.apache.spark.sql.types.BooleanType
+
+class PullOutPythonUDFInJoinConditionSuite extends PlanTest {
+
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches =
+      Batch("Extract PythonUDF From JoinCondition", Once,
+        PullOutPythonUDFInJoinCondition) ::
+      Batch("Check Cartesian Products", Once,
+        CheckCartesianProducts) :: Nil
+  }
+
+  val testRelationLeft = LocalRelation('a.int, 'b.int)
+  val testRelationRight = LocalRelation('c.int, 'd.int)
+
+  // Dummy python UDF for testing. Unable to execute.
+  val pythonUDF = PythonUDF("pythonUDF", null,
+    BooleanType,
+    Seq.empty,
+    PythonEvalType.SQL_BATCHED_UDF,
+    udfDeterministic = true)
+
+  val notSupportJoinTypes = Seq(LeftOuter, RightOuter, FullOuter, LeftAnti)
+
+  test("inner join condition with python udf only") {
+    val query = testRelationLeft.join(
+      testRelationRight,
+      joinType = Inner,
+      condition = Some(pythonUDF))
+    val expected = testRelationLeft.join(
+      testRelationRight,
+      joinType = Inner,
+      condition = None).where(pythonUDF).analyze
+
+    // AnalysisException thrown by CheckCartesianProducts while spark.sql.crossJoin.enabled=false
+    val exception = the [AnalysisException] thrownBy {
+      Optimize.execute(query.analyze)
+    }
+    assert(exception.message.startsWith("Detected implicit cartesian product"))
+
+    // pull out the python udf while set spark.sql.crossJoin.enabled=true
+    withSQLConf(CROSS_JOINS_ENABLED.key -> "true") {
+      val optimized = Optimize.execute(query.analyze)
+      comparePlans(optimized, expected)
+    }
+  }
+
+  test("left semi join condition with python udf only") {
+    val query = testRelationLeft.join(
+      testRelationRight,
+      joinType = LeftSemi,
+      condition = Some(pythonUDF))
+    val expected = testRelationLeft.join(
+      testRelationRight,
+      joinType = Inner,
+      condition = None).where(pythonUDF).select('a, 'b).analyze
+
+    // AnalysisException thrown by CheckCartesianProducts while spark.sql.crossJoin.enabled=false
+    val exception = the [AnalysisException] thrownBy {
+      Optimize.execute(query.analyze)
+    }
+    assert(exception.message.startsWith("Detected implicit cartesian product"))
+
+    // pull out the python udf while set spark.sql.crossJoin.enabled=true
+    withSQLConf(CROSS_JOINS_ENABLED.key -> "true") {
+      val optimized = Optimize.execute(query.analyze)
+      comparePlans(optimized, expected)
+    }
+  }
+
+  test("python udf with other common condition") {
--- End diff --

Thanks, added more cases in 38b1555.


---


[GitHub] spark pull request #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFI...

2018-11-08 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22955#discussion_r232163715
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PullOutPythonUDFInJoinConditionSuite.scala ---
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.scalatest.Matchers._
+
+import org.apache.spark.api.python.PythonEvalType
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.PythonUDF
+import org.apache.spark.sql.catalyst.plans._
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+import org.apache.spark.sql.internal.SQLConf._
+import org.apache.spark.sql.types.BooleanType
+
+class PullOutPythonUDFInJoinConditionSuite extends PlanTest {
+
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches =
+      Batch("Extract PythonUDF From JoinCondition", Once,
+        PullOutPythonUDFInJoinCondition) ::
+      Batch("Check Cartesian Products", Once,
+        CheckCartesianProducts) :: Nil
+  }
+
+  val testRelationLeft = LocalRelation('a.int, 'b.int)
+  val testRelationRight = LocalRelation('c.int, 'd.int)
+
+  // Dummy python UDF for testing. Unable to execute.
+  val pythonUDF = PythonUDF("pythonUDF", null,
+    BooleanType,
+    Seq.empty,
+    PythonEvalType.SQL_BATCHED_UDF,
+    udfDeterministic = true)
+
+  val notSupportJoinTypes = Seq(LeftOuter, RightOuter, FullOuter, LeftAnti)
--- End diff --

Thanks, done in 38b1555.


---


[GitHub] spark pull request #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFI...

2018-11-08 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22955#discussion_r232163787
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PullOutPythonUDFInJoinConditionSuite.scala ---
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.scalatest.Matchers._
+
+import org.apache.spark.api.python.PythonEvalType
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.PythonUDF
+import org.apache.spark.sql.catalyst.plans._
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+import org.apache.spark.sql.internal.SQLConf._
+import org.apache.spark.sql.types.BooleanType
+
+class PullOutPythonUDFInJoinConditionSuite extends PlanTest {
+
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches =
+      Batch("Extract PythonUDF From JoinCondition", Once,
+        PullOutPythonUDFInJoinCondition) ::
+      Batch("Check Cartesian Products", Once,
+        CheckCartesianProducts) :: Nil
+  }
+
+  val testRelationLeft = LocalRelation('a.int, 'b.int)
+  val testRelationRight = LocalRelation('c.int, 'd.int)
+
+  // Dummy python UDF for testing. Unable to execute.
+  val pythonUDF = PythonUDF("pythonUDF", null,
+    BooleanType,
+    Seq.empty,
+    PythonEvalType.SQL_BATCHED_UDF,
+    udfDeterministic = true)
+
+  val notSupportJoinTypes = Seq(LeftOuter, RightOuter, FullOuter, LeftAnti)
+
+  test("inner join condition with python udf only") {
--- End diff --

Sorry for this, done in 38b1555.


---


[GitHub] spark pull request #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFI...

2018-11-08 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22955#discussion_r232163738
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PullOutPythonUDFInJoinConditionSuite.scala ---
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.scalatest.Matchers._
+
+import org.apache.spark.api.python.PythonEvalType
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.PythonUDF
+import org.apache.spark.sql.catalyst.plans._
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+import org.apache.spark.sql.internal.SQLConf._
+import org.apache.spark.sql.types.BooleanType
+
+class PullOutPythonUDFInJoinConditionSuite extends PlanTest {
+
+  object Optimize extends RuleExecutor[LogicalPlan] {
+    val batches =
+      Batch("Extract PythonUDF From JoinCondition", Once,
+        PullOutPythonUDFInJoinCondition) ::
+      Batch("Check Cartesian Products", Once,
+        CheckCartesianProducts) :: Nil
+  }
+
+  val testRelationLeft = LocalRelation('a.int, 'b.int)
+  val testRelationRight = LocalRelation('c.int, 'd.int)
+
+  // Dummy python UDF for testing. Unable to execute.
+  val pythonUDF = PythonUDF("pythonUDF", null,
+    BooleanType,
+    Seq.empty,
+    PythonEvalType.SQL_BATCHED_UDF,
+    udfDeterministic = true)
+
+  val notSupportJoinTypes = Seq(LeftOuter, RightOuter, FullOuter, LeftAnti)
+
+  test("inner join condition with python udf only") {
+    val query = testRelationLeft.join(
+      testRelationRight,
+      joinType = Inner,
+      condition = Some(pythonUDF))
+    val expected = testRelationLeft.join(
+      testRelationRight,
+      joinType = Inner,
+      condition = None).where(pythonUDF).analyze
+
+    // AnalysisException thrown by CheckCartesianProducts while spark.sql.crossJoin.enabled=false
+    val exception = the [AnalysisException] thrownBy {
--- End diff --

Thanks, done in 38b1555.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22990
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98638/
Test FAILed.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22990
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22990
  
**[Test build #98638 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98638/testReport)**
 for PR 22990 at commit 
[`17b725c`](https://github.com/apache/spark/commit/17b725c79ad602df20c44cacb92e7c6abd84cdda).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22974
  
**[Test build #98643 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98643/testReport)**
 for PR 22974 at commit 
[`2fc7247`](https://github.com/apache/spark/commit/2fc72471b1ce0c701bae20555c6b34126ec620bc).


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4881/
Test PASSed.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22990
  
**[Test build #98642 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98642/testReport)**
 for PR 22990 at commit 
[`52f2b1e`](https://github.com/apache/spark/commit/52f2b1e84596c8b877c3557c9821e6d0c9948397).


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22990
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4880/
Test PASSed.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22990
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-08 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22966#discussion_r232155608
  
--- Diff: external/avro/src/test/scala/org/apache/spark/sql/execution/benchmark/AvroReadBenchmark.scala ---
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution.benchmark
+
+import java.io.File
+
+import scala.util.Random
+
+import org.apache.spark.SparkConf
+import org.apache.spark.benchmark.{Benchmark, BenchmarkBase}
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.plans.SQLHelper
+import org.apache.spark.sql.types._
+
+/**
+ * Benchmark to measure Avro read performance.
+ * {{{
+ *   To run this benchmark:
+ *   1. without sbt: bin/spark-submit --class 
+ *        --jars , 
+ *   2. build/sbt "avro/test:runMain "
+ *   3. generate result: SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "avro/test:runMain "
+ *      Results will be written to "benchmarks/AvroReadBenchmark-results.txt".
+ * }}}
+ */
+object AvroReadBenchmark extends BenchmarkBase with SQLHelper {
+  val conf = new SparkConf()
+  conf.set("spark.sql.avro.compression.codec", "snappy")
--- End diff --

Since this is the default value, I think we can remove lines 41 to 49.


---


[GitHub] spark pull request #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-08 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22966#discussion_r232155430
  
--- Diff: external/avro/src/test/scala/org/apache/spark/sql/execution/benchmark/AvroReadBenchmark.scala ---
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution.benchmark
+
+import java.io.File
+
+import scala.util.Random
+
+import org.apache.spark.SparkConf
+import org.apache.spark.benchmark.{Benchmark, BenchmarkBase}
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.plans.SQLHelper
+import org.apache.spark.sql.types._
+
+/**
+ * Benchmark to measure Avro read performance.
+ * {{{
+ *   To run this benchmark:
+ *   1. without sbt: bin/spark-submit --class 
+ *        --jars , 
+ *   2. build/sbt "avro/test:runMain "
+ *   3. generate result: SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "avro/test:runMain "
+ *      Results will be written to "benchmarks/AvroReadBenchmark-results.txt".
+ * }}}
+ */
+object AvroReadBenchmark extends BenchmarkBase with SQLHelper {
--- End diff --

@gengliangwang, can we use `SqlBasedBenchmark` for consistency?


---


[GitHub] spark issue #22973: [SPARK-25972][PYTHON] Missed JSON options in streaming.p...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22973
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98639/
Test PASSed.


---


[GitHub] spark issue #22973: [SPARK-25972][PYTHON] Missed JSON options in streaming.p...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22973
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22973: [SPARK-25972][PYTHON] Missed JSON options in streaming.p...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22973
  
**[Test build #98639 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98639/testReport)**
 for PR 22973 at commit 
[`4ca71fc`](https://github.com/apache/spark/commit/4ca71fc75d0a25ced9803372b0594ae8342b5eb9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22975
  
**[Test build #98641 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98641/testReport)**
 for PR 22975 at commit 
[`aa5aa8e`](https://github.com/apache/spark/commit/aa5aa8e2094ded81cf13e15bd3c59beac2886f7b).


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22975
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22975
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4879/
Test PASSed.


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22975
  
retest this please


---


[GitHub] spark issue #22979: [SPARK-25977][SQL] Parsing decimals from CSV using local...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22979
  
**[Test build #98640 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98640/testReport)**
 for PR 22979 at commit 
[`64a97a2`](https://github.com/apache/spark/commit/64a97a27e4b22e605f3b2ddfebb7eaebdebc).


---


[GitHub] spark issue #22973: [SPARK-25972][PYTHON] Missed JSON options in streaming.p...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22973
  
**[Test build #98639 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98639/testReport)**
 for PR 22973 at commit 
[`4ca71fc`](https://github.com/apache/spark/commit/4ca71fc75d0a25ced9803372b0594ae8342b5eb9).


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98635/
Test FAILed.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22974
  
**[Test build #98635 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98635/testReport)**
 for PR 22974 at commit 
[`90a4d54`](https://github.com/apache/spark/commit/90a4d54387fcb110b01e34a5603a3fdbe2d35731).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22990
  
good catch! LGTM


---


[GitHub] spark pull request #22990: [SPARK-25988] [SQL] Keep names unchanged when ded...

2018-11-08 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22990#discussion_r232148751
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2856,6 +2856,59 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   checkAnswer(sql("select 26393499451 / (1e6 * 1000)"), 
Row(BigDecimal("26.393499451")))
 }
   }
+
+  test("self join with aliases on partitioned tables #1") {
--- End diff --

let's put the JIRA ticket number in the test name


---


[GitHub] spark pull request #22990: [SPARK-25988] [SQL] Keep names unchanged when ded...

2018-11-08 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22990#discussion_r232148583
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2856,6 +2856,59 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   checkAnswer(sql("select 26393499451 / (1e6 * 1000)"), 
Row(BigDecimal("26.393499451")))
 }
   }
+
+  test("self join with aliases on partitioned tables #1") {
+withTempView("tmpView1", "tmpView2") {
+  withTable("tab1", "tab2") {
+sql(
+  """
+|CREATE TABLE `tab1` (`col1` INT, `TDATE` DATE)
+|USING CSV
+|PARTITIONED BY (TDATE)
+  """.stripMargin)
+spark.table("tab1").where("TDATE >= 
'2017-08-15'").createOrReplaceTempView("tmpView1")
+sql("CREATE TABLE `tab2` (`TDATE` DATE) USING parquet")
+sql(
+  """
+|CREATE OR REPLACE TEMPORARY VIEW tmpView2 AS
+|SELECT N.tdate, col1 AS aliasCol1
+|FROM tmpView1 N
+|JOIN tab2 Z
+|ON N.tdate = Z.tdate
+  """.stripMargin)
+withSQLConf(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "0") {
+  sql("SELECT * FROM tmpView2 x JOIN tmpView2 y ON x.tdate = 
y.tdate").collect()
+}
+  }
+}
+  }
+
+  test("self join with aliases on partitioned tables #2") {
+withTempView("tmp") {
+  withTable("tab1", "tab2") {
+sql(
+  """
+|CREATE TABLE `tab1` (`EX` STRING, `TDATE` DATE)
+|USING parquet
+|PARTITIONED BY (tdate)
+  """.stripMargin)
+sql("CREATE TABLE `tab2` (`TDATE` DATE) USING parquet")
+sql(
+  """
+|CREATE OR REPLACE TEMPORARY VIEW TMP as
+|SELECT  N.tdate, EX AS new_ex
+|FROM tab1 N
+|JOIN tab2 Z
+|ON  N.tdate = Z.tdate
--- End diff --

nit: `ON N.tdate = Z.tdate`


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22975
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22975
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98634/
Test FAILed.


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22975
  
**[Test build #98634 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98634/testReport)**
 for PR 22975 at commit 
[`aa5aa8e`](https://github.com/apache/spark/commit/aa5aa8e2094ded81cf13e15bd3c59beac2886f7b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22990
  
**[Test build #98638 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98638/testReport)**
 for PR 22990 at commit 
[`17b725c`](https://github.com/apache/spark/commit/17b725c79ad602df20c44cacb92e7c6abd84cdda).


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22990
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4878/
Test PASSed.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22990
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22990: [SPARK-25988] [SQL] Keep names unchanged when deduplicat...

2018-11-08 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22990
  
cc @cloud-fan 


---


[GitHub] spark pull request #22990: [SPARK-25988] [SQL] Keep names unchanged when ded...

2018-11-08 Thread gatorsmile

GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/22990

[SPARK-25988] [SQL] Keep names unchanged when deduplicating the column 
names in Analyzer

## What changes were proposed in this pull request?
When queries refer to a column using a different case than its definition, 
users might hit various errors. Below is a typical test failure they can hit.
```
Expected only partition pruning predicates: 
ArrayBuffer(isnotnull(tdate#237), (cast(tdate#237 as string) >= 2017-08-15));
org.apache.spark.sql.AnalysisException: Expected only partition pruning 
predicates: ArrayBuffer(isnotnull(tdate#237), (cast(tdate#237 as string) >= 
2017-08-15));
at 
org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$.prunePartitionsByFilter(ExternalCatalogUtils.scala:146)
at 
org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.listPartitionsByFilter(InMemoryCatalog.scala:560)
at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.listPartitionsByFilter(SessionCatalog.scala:925)
```

## How was this patch tested?
Added two test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark fix1283

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22990.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22990


commit 5e9f6f345b93d3370906c7b2d73ede15f4089c29
Author: gatorsmile 
Date:   2018-11-09T05:27:37Z

fix

commit 17b725c79ad602df20c44cacb92e7c6abd84cdda
Author: gatorsmile 
Date:   2018-11-09T05:33:58Z

fix




---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22987
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98633/
Test PASSed.


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22987
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22987
  
**[Test build #98633 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98633/testReport)**
 for PR 22987 at commit 
[`2da6f99`](https://github.com/apache/spark/commit/2da6f998e4ee95d6cfbf2e8258c3a160220a366c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send u...

2018-11-08 Thread BryanCutler

Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/22275
  
ping @HyukjinKwon and @viirya to maybe take another look at the recent 
changes to make this cleaner, if you are able to. Thanks!


---


[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-11-08 Thread BryanCutler

Github user BryanCutler commented on a diff in the pull request:

https://github.com/apache/spark/pull/22275#discussion_r232145973
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -4923,6 +4923,28 @@ def test_timestamp_dst(self):
 self.assertPandasEqual(pdf, df_from_python.toPandas())
 self.assertPandasEqual(pdf, df_from_pandas.toPandas())
 
+def test_toPandas_batch_order(self):
+
+# Collects Arrow RecordBatches out of order in driver JVM then 
re-orders in Python
+def run_test(num_records, num_parts, max_records):
+df = self.spark.range(num_records, 
numPartitions=num_parts).toDF("a")
+with 
self.sql_conf({"spark.sql.execution.arrow.maxRecordsPerBatch": max_records}):
+pdf, pdf_arrow = self._toPandas_arrow_toggle(df)
+self.assertPandasEqual(pdf, pdf_arrow)
+
+cases = [
+(1024, 512, 2),  # Try large num partitions for good chance of 
not collecting in order
+(512, 64, 2),# Try medium num partitions to test out of 
order collection
+(64, 8, 2),  # Try small number of partitions to test out 
of order collection
+(64, 64, 1), # Test single batch per partition
+(64, 1, 64), # Test single partition, single batch
+(64, 1, 8),  # Test single partition, multiple batches
+(30, 7, 2),  # Test different sized partitions
+]
--- End diff --

@holdenk , I updated the tests, please take another look when you get a 
chance. Thanks!
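
The comment in the test above describes batches that are collected out of order and then re-ordered in Python. A minimal pure-Python sketch of that idea follows, assuming each batch arrives tagged with its original position. The names `reorder_batches` and `arrived` are hypothetical, and this is not the PySpark implementation.

```python
def reorder_batches(arrived):
    """Restore the original order of batches that arrived out of order.

    `arrived` is a list of (original_index, batch) pairs in arrival order.
    """
    ordered = [None] * len(arrived)
    for index, batch in arrived:
        if ordered[index] is not None:
            raise ValueError("duplicate batch index: %d" % index)
        ordered[index] = batch
    return ordered


# Three batches of two rows each, arriving shuffled (illustrative data only).
arrived = [(2, [4, 5]), (0, [0, 1]), (1, [2, 3])]
rows = [row for batch in reorder_batches(arrived) for row in batch]
```

Placing each batch directly at its index avoids a sort and makes a duplicated index easy to detect.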


---


[GitHub] spark issue #22880: [SPARK-25407][SQL] Ensure we pass a compatible pruned sc...

2018-11-08 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22880
  
Let me take a look at this over the weekend.


---


[GitHub] spark issue #22989: [SPARK-25986][Build] Banning throw new OutOfMemoryErrors

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22989
  
**[Test build #98637 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98637/testReport)**
 for PR 22989 at commit 
[`d678751`](https://github.com/apache/spark/commit/d67875115f622082519b1dbcb1c1e34c2184b34f).


---


[GitHub] spark issue #22989: [SPARK-25986][Build] Banning throw new OutOfMemoryErrors

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22989
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22989: [SPARK-25986][Build] Banning throw new OutOfMemoryErrors

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22989
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4877/
Test PASSed.


---


[GitHub] spark pull request #22989: [SPARK-25986][Build] Banning throw new OutOfMemor...

2018-11-08 Thread xuanyuanking

GitHub user xuanyuanking opened a pull request:

https://github.com/apache/spark/pull/22989

[SPARK-25986][Build] Banning throw new OutOfMemoryErrors

## What changes were proposed in this pull request?

Add Scala and Java lint check rules to ban the usage of `throw new 
OutOfMemoryError`, because it causes the whole executor to be killed. See more 
details in https://github.com/apache/spark/pull/22969.
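
As a rough illustration of what such a lint check does, here is a minimal Python sketch that flags lines containing the banned construct. The PR's actual rule definitions are not shown in this thread, so the pattern `BANNED`, the helper `find_violations`, and the class name `CustomOutOfMemoryError` are all hypothetical.

```python
import re

# Hypothetical pattern; the real lint rules are not shown in this thread.
BANNED = re.compile(r"\bthrow\s+new\s+OutOfMemoryError\b")


def find_violations(source):
    """Return the 1-based numbers of lines that match the banned pattern."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if BANNED.search(line)
    ]


sample = "\n".join([
    "val x = 1",
    'throw new OutOfMemoryError("no memory")',
    # A differently named (hypothetical) error class is not flagged.
    'throw new CustomOutOfMemoryError("no memory")',
])
violations = find_violations(sample)
```

A plain line-based regex like this would also flag the construct inside comments, which a real lint rule might or might not do.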

## How was this patch tested?

Local test with lint-scala and lint-java.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuanyuanking/spark SPARK-25986

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22989.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22989


commit d67875115f622082519b1dbcb1c1e34c2184b34f
Author: Yuanjian Li 
Date:   2018-11-09T05:23:01Z

banning throw new OutOfMemoryErrors




---


[GitHub] spark issue #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-08 Thread gengliangwang

Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22966
  
@dongjoon-hyun I think we can merge this one first.


---


[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22976
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98632/
Test PASSed.


---


[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22976
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22976
  
**[Test build #98632 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98632/testReport)**
 for PR 22976 at commit 
[`5acf2a4`](https://github.com/apache/spark/commit/5acf2a44ef12b1af4457f07ff1bee6476c9b27d1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22976
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22976
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98631/
Test PASSed.


---


[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22976
  
**[Test build #98631 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98631/testReport)**
 for PR 22976 at commit 
[`b07acdb`](https://github.com/apache/spark/commit/b07acdbb95b43f3cbfdf6c5c5e42dcab828937bc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22988: [SPARK-25984][CORE][SQL][STREAMING] Remove deprecated .n...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22988
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22988: [SPARK-25984][CORE][SQL][STREAMING] Remove deprecated .n...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22988
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98627/
Test PASSed.


---


[GitHub] spark issue #22988: [SPARK-25984][CORE][SQL][STREAMING] Remove deprecated .n...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22988
  
**[Test build #98627 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98627/testReport)**
 for PR 22988 at commit 
[`55ac7c0`](https://github.com/apache/spark/commit/55ac7c09d251ecb0ca21eac3c2fcffafe53c2960).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22987
  
**[Test build #98636 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98636/testReport)**
 for PR 22987 at commit 
[`471092d`](https://github.com/apache/spark/commit/471092d417666f5cf8908318aed098d6f06c4900).


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22987
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22987
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4876/
Test PASSed.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22974
  
Not all public serializable classes need to be registered. Only those that 
actually go through ser-deser should be registered; one important group is 
transformers and prediction models.


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22987
  
LGTM


---


[GitHub] spark pull request #22987: [SPARK-25979][SQL] Window function: allow parenth...

2018-11-08 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22987#discussion_r232136609
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/window.sql ---
@@ -109,3 +109,9 @@ last_value(false, false) OVER w AS 
last_value_contain_null
 FROM testData
 WINDOW w AS ()
 ORDER BY cate, val;
+
+-- parentheses around window reference
+SELECT cate, sum(val) OVER (w)
+FROM testData
+WHERE val is not null
+WINDOW w AS (PARTITION BY cate ORDER BY val);
--- End diff --

+1


---


[GitHub] spark pull request #22985: [SPARK-25510][SQL][TEST][FOLLOW-UP] Remove Benchm...

2018-11-08 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22985


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22974
  
I am not sure, but maybe all serializable classes need to be registered. 
Since `MultivariateGaussian` is a public class, I think we need to add it.
I also wonder whether a test is needed. If it is not needed, I can list all 
the other public ones in ML in this PR.


---


[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send u...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22275
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send u...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22275
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98629/
Test PASSed.


---


[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send u...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22275
  
**[Test build #98629 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98629/testReport)**
 for PR 22275 at commit 
[`7dc92c8`](https://github.com/apache/spark/commit/7dc92c8d0dca69e254088fd6e1f3e15da1f90fbe).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22985: [SPARK-25510][SQL][TEST][FOLLOW-UP] Remove BenchmarkWith...

2018-11-08 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22985
  
Merged to master.


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-08 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22939
  
Hey @felixcheung, it should be ready for another look.


---


[GitHub] spark pull request #22987: [SPARK-25979][SQL] Window function: allow parenth...

2018-11-08 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22987#discussion_r232135028
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/window.sql ---
@@ -109,3 +109,9 @@ last_value(false, false) OVER w AS 
last_value_contain_null
 FROM testData
 WINDOW w AS ()
 ORDER BY cate, val;
+
+-- parentheses around window reference
+SELECT cate, sum(val) OVER (w)
+FROM testData
+WHERE val is not null
+WINDOW w AS (PARTITION BY cate ORDER BY val);
--- End diff --

need a new line at the end.


---


[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send u...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22275
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send u...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22275
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98630/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22954
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98628/
Test PASSed.


---


[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22954
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22954
  
**[Test build #98628 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98628/testReport)**
 for PR 22954 at commit 
[`2ba6add`](https://github.com/apache/spark/commit/2ba6addbcd52940ef989880bff69fe126a4dd2e1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send u...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22275
  
**[Test build #98630 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98630/testReport)**
 for PR 22275 at commit 
[`8045fac`](https://github.com/apache/spark/commit/8045facbe523c89b91b930203bb6874d82d08a4d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22974
  
OK, that's the issue, yeah. Registration is an optimization. I wonder, what 
other classes should we add if we're going to add this one? I don't know if it 
needs a test. But if there are 10 other somewhat commonly-used classes that are 
serialized during Spark ML operations, they should be registered.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22974
  
Do you mean the failure in this PR? It was caused by a non-registered field 
`BDM[Double]`.
`MultivariateGaussian` is used in GMM, so kryo registration should help 
performance.

As to mllib-local's dependency, that is a separate matter: the currently 
kryo-registered classes, like 'ml.linalg.Vector' and 'ml.linalg.Matrix', do not 
have kryo tests in their test suites.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22974
  
You're requiring registration, which is what makes this fail, right? Why do 
that? I think I'm missing something.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22974
  
@srowen The existing kryo-registration test suites need to import spark-core:
```
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoSerializer

val conf = new SparkConf(false)
conf.set("spark.kryo.registrationRequired", "true")
val ser = new KryoSerializer(conf).newInstance()
```

Since mllib-local does not depend on spark-core, the current classes in 
mllib-local do not test kryo serialization at all. E.g. 
`mllib.linalg.VectorsSuite` contains the test `test("kryo class register")`, while 
`ml.linalg.VectorsSuite` does not have it.
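
For illustration only, here is a minimal sketch of the kind of round-trip check such a test performs. It assumes spark-core and spark-mllib are on the classpath and that the Spark version in use already registers `mllib.linalg.DenseVector` with Kryo. The object name `KryoRoundTripSketch` and the helper `roundTrip` are hypothetical and not taken from the PR.

```scala
import scala.reflect.ClassTag

import org.apache.spark.SparkConf
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.serializer.KryoSerializer

// Hypothetical sketch, not the PR's test code.
object KryoRoundTripSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf(false)
    // With registrationRequired set, Kryo throws on any unregistered class,
    // so a missing registration shows up as a failure instead of a silent
    // fallback to writing full class names.
    conf.set("spark.kryo.registrationRequired", "true")
    val ser = new KryoSerializer(conf).newInstance()

    // Serialize a value and read it back (hypothetical helper).
    def roundTrip[T: ClassTag](t: T): T =
      ser.deserialize[T](ser.serialize(t))

    val dense = Vectors.dense(1.0, 2.0)
    assert(roundTrip(dense) == dense)
  }
}
```

Because the serializer comes from spark-core, a check like this cannot live in a module that has no spark-core dependency, which is the limitation described above.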


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4875/
Test PASSed.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22974
  
**[Test build #98635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98635/testReport)** for PR 22974 at commit [`90a4d54`](https://github.com/apache/spark/commit/90a4d54387fcb110b01e34a5603a3fdbe2d35731).


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4874/
Test FAILed.


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22975
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4873/
Test PASSed.


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22975
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22975
  
@srowen Yes, we should keep user input data and column names. Thanks for 
your explanation!


---


[GitHub] spark issue #22975: [SPARK-20156][SQL][ML][FOLLOW-UP] Java String toLowerCas...

2018-11-08 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22975
  
**[Test build #98634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98634/testReport)** for PR 22975 at commit [`aa5aa8e`](https://github.com/apache/spark/commit/aa5aa8e2094ded81cf13e15bd3c59beac2886f7b).


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22987
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4872/
Test PASSed.


---


[GitHub] spark issue #22987: [SPARK-25979][SQL] Window function: allow parentheses ar...

2018-11-08 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22987
  
Merged build finished. Test PASSed.


---

