date:20181111

[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22518#discussion_r232558384
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala 
---
@@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest with 
SharedSQLContext {
   assert(getNumSortsInQuery(query5) == 1)
 }
   }
+
+  test("SPARK-25482: Reuse same Subquery in order to execute it only 
once") {
+withTempView("t1", "t2") {
+  sql("create temporary view t1(a int) using parquet")
+  sql("create temporary view t2(b int) using parquet")
+  val plan = sql("select * from t2 where b > (select max(a) from t1)")
--- End diff --

Sorry, it has been a long time and I don't quite remember the context.

What problem are we trying to fix? This test doesn't look related 
to subquery reuse.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22887: [SPARK-25880][CORE] user set's hadoop conf should not ov...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22887
  
looks reasonable, cc @gatorsmile 


---


[GitHub] spark pull request #22693: [SPARK-25701][SQL] Supports calculation of table ...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22693#discussion_r232556859
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -115,26 +116,45 @@ class ResolveHiveSerdeTable(session: SparkSession) 
extends Rule[LogicalPlan] {
 
 class DetermineTableStats(session: SparkSession) extends Rule[LogicalPlan] 
{
   override def apply(plan: LogicalPlan): LogicalPlan = plan 
resolveOperators {
+case filterPlan @ Filter(_, SubqueryAlias(_, relation: 
HiveTableRelation)) =>
+  val predicates = 
PhysicalOperation.unapply(filterPlan).map(_._2).getOrElse(Nil)
+  computeTableStats(relation, predicates)
 case relation: HiveTableRelation
 if DDLUtils.isHiveTable(relation.tableMeta) && 
relation.tableMeta.stats.isEmpty =>
-  val table = relation.tableMeta
-  val sizeInBytes = if 
(session.sessionState.conf.fallBackToHdfsForStatsEnabled) {
-try {
-  val hadoopConf = session.sessionState.newHadoopConf()
-  val tablePath = new Path(table.location)
-  val fs: FileSystem = tablePath.getFileSystem(hadoopConf)
-  fs.getContentSummary(tablePath).getLength
-} catch {
-  case e: IOException =>
-logWarning("Failed to get table size from hdfs.", e)
-session.sessionState.conf.defaultSizeInBytes
-}
-  } else {
-session.sessionState.conf.defaultSizeInBytes
+  computeTableStats(relation)
+  }
+
+  private def computeTableStats(
+  relation: HiveTableRelation,
+  predicates: Seq[Expression] = Nil): LogicalPlan = {
+val table = relation.tableMeta
+val sizeInBytes = if 
(session.sessionState.conf.fallBackToHdfsForStatsEnabled) {
+  try {
+val hadoopConf = session.sessionState.newHadoopConf()
+val tablePath = new Path(table.location)
+val fs: FileSystem = tablePath.getFileSystem(hadoopConf)
+BigInt(fs.getContentSummary(tablePath).getLength)
+  } catch {
+case e: IOException =>
+  logWarning("Failed to get table size from hdfs.", e)
+  getSizeInBytesFromTablePartitions(table.identifier, predicates)
   }
+} else {
+  getSizeInBytesFromTablePartitions(table.identifier, predicates)
+}
+val withStats = table.copy(stats = Some(CatalogStatistics(sizeInBytes 
= sizeInBytes)))
+relation.copy(tableMeta = withStats)
+  }
 
-  val withStats = table.copy(stats = 
Some(CatalogStatistics(sizeInBytes = BigInt(sizeInBytes))))
-  relation.copy(tableMeta = withStats)
+  private def getSizeInBytesFromTablePartitions(
+  tableIdentifier: TableIdentifier,
+  predicates: Seq[Expression] = Nil): BigInt = {
+session.sessionState.catalog.listPartitionsByFilter(tableIdentifier, 
predicates) match {
--- End diff --

How does https://github.com/apache/spark/pull/22743 solve this problem? 
That PR aims to invalidate the cache when configurations change. This PR 
aims to compute stats from HDFS when they are not available.
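
To make the distinction concrete, the "compute stats when they are not available" idea can be sketched without Spark. This is only an illustration under stated assumptions: `TableStatsSketch`, `estimateSize`, and `defaultSizeInBytes` are hypothetical names, `java.nio` stands in for the Hadoop `FileSystem` call in the diff, and the sketch falls back to a default on `IOException` rather than to partition sizes.

```scala
import java.io.IOException
import java.nio.file.{Files, Path}
import scala.util.{Failure, Success, Try}

object TableStatsSketch {
  // Hypothetical default, standing in for a configured default size.
  val defaultSizeInBytes: BigInt = BigInt(Long.MaxValue)

  // Sum the sizes of the regular files directly under `dir`.
  private def sizeOnDisk(dir: Path): BigInt = {
    val stream = Files.list(dir)
    try {
      val it = stream.iterator()
      var total = BigInt(0)
      while (it.hasNext) {
        val p = it.next()
        if (Files.isRegularFile(p)) total = total + BigInt(Files.size(p))
      }
      total
    } finally stream.close()
  }

  // Prefer known stats; otherwise measure the files if the fallback is
  // enabled; otherwise (or on an I/O failure) use the default.
  def estimateSize(known: Option[BigInt], fallbackEnabled: Boolean, dir: Path): BigInt =
    known.getOrElse {
      if (fallbackEnabled) {
        Try(sizeOnDisk(dir)) match {
          case Success(size) => size
          case Failure(_: IOException) => defaultSizeInBytes
          case Failure(other) => throw other
        }
      } else {
        defaultSizeInBytes
      }
    }
}
```

Nothing here touches a cache, which is why cache invalidation alone would not seem to cover this case.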


---


[GitHub] spark pull request #22944: [SPARK-25942][SQL] Aggregate expressions shouldn'...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22944#discussion_r232556359
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala 
---
@@ -1556,6 +1556,20 @@ class DatasetSuite extends QueryTest with 
SharedSQLContext {
   df.where($"city".contains(new java.lang.Character('A'))),
   Seq(Row("Amsterdam")))
   }
+
+  test("SPARK-25942: typed aggregation on primitive type") {
+val ds = Seq(1, 2, 3).toDS()
+
+val agg = ds.groupByKey(_ >= 2)
+  .agg(sum("value").as[Long], sum($"value" + 1).as[Long])
--- End diff --

I think we should not make decisions for users. With the untyped APIs, users can 
refer to the grouping columns in aggregate expressions, and I think the typed APIs 
should behave the same way.

For this particular case, Spark currently allows grouping columns inside 
aggregate functions, so the `value` here is indeed ambiguous. There is nothing 
we can do but fail and ask users to add an alias.

BTW, we should check other databases and see whether "grouping columns inside 
aggregate functions" should be allowed.
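
A toy model of the ambiguity, not Spark's analyzer: it assumes the clash arises because both the grouping key and the input column end up named `value`. `ResolveSketch`, `Attr`, and `resolve` are hypothetical names for illustration.

```scala
object ResolveSketch {
  // A toy attribute: a column name plus a note on where it came from.
  final case class Attr(name: String, origin: String)

  // Resolve `name` against the candidates; more than one match is an error,
  // which is the "fail and ask users to add an alias" outcome.
  def resolve(name: String, candidates: Seq[Attr]): Either[String, Attr] =
    candidates.filter(_.name == name) match {
      case Seq() => Left(s"cannot resolve '$name'")
      case Seq(single) => Right(single)
      case many =>
        Left(s"reference '$name' is ambiguous, could be: " + many.map(_.origin).mkString(", "))
    }
}
```

With `Attr("value", "grouping key")` and `Attr("value", "input column")` as candidates, `resolve("value", ...)` returns a `Left`; renaming either one (the alias) makes it resolve again.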


---


[GitHub] spark issue #23005: [SPARK-26005] [SQL] Upgrade ANTRL from 4.7 to 4.7.1

2018-11-11 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/23005
  
Thanks! Merged to master.


---


[GitHub] spark pull request #23005: [SPARK-26005] [SQL] Upgrade ANTRL from 4.7 to 4.7...

2018-11-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23005


---


[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22429
  
This is hard to review. Do you mean we should add `maxFields: Option[Int]` 
to all the string-related methods?
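
For reference, this is roughly what I understand such a parameter to do. It is a minimal sketch with hypothetical names (`TruncateSketch`, `renderFields`), not the signature proposed in the PR.

```scala
object TruncateSketch {
  // Join fields with `sep`. When `maxFields` is set and exceeded, keep the
  // first `max - 1` fields and summarise the rest in a trailing marker.
  def renderFields(fields: Seq[String], sep: String, maxFields: Option[Int]): String =
    maxFields match {
      case Some(max) if fields.length > max =>
        val kept = fields.take(math.max(max - 1, 0))
        val dropped = fields.length - kept.length
        (kept :+ s"... $dropped more fields").mkString(sep)
      case _ =>
        fields.mkString(sep)
    }
}
```

Passing `None` means "no limit", so every caller that builds a plan string would need to thread the option through.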


---


[GitHub] spark issue #22976: [SPARK-25974][SQL]Optimizes Generates bytecode for order...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22976
  
LGTM except one comment, cc @rednaxelafx


---


[GitHub] spark pull request #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFI...

2018-11-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22955


---


[GitHub] spark pull request #22976: [SPARK-25974][SQL]Optimizes Generates bytecode fo...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22976#discussion_r232552336
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala
 ---
@@ -68,62 +68,55 @@ object GenerateOrdering extends 
CodeGenerator[Seq[SortOrder], Ordering[InternalR
 genComparisons(ctx, ordering)
   }
 
+  /**
+   * Creates the variables for ordering based on the given order.
+   */
+  private def createOrderKeys(
+ctx: CodegenContext,
--- End diff --

4-space indentation


---


[GitHub] spark issue #22955: [SPARK-25949][SQL] Add test for PullOutPythonUDFInJoinCo...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22955
  
thanks, merging to master!


---


[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22954
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22954
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98713/
Test PASSed.


---


[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22954
  
**[Test build #98713 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98713/testReport)**
 for PR 22954 at commit 
[`d9d9f98`](https://github.com/apache/spark/commit/d9d9f982d26a5dd2141515e0c9089243b7b93554).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22938#discussion_r232550860
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -1813,6 +1817,7 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   val path = dir.getCanonicalPath
   primitiveFieldAndType
 .toDF("value")
+.repartition(1)
--- End diff --

why is the `repartition` required?


---


[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22938#discussion_r232550733
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -1115,6 +1115,7 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
 Row(null, null, null),
 Row(null, null, null),
 Row(null, null, null),
+Row(null, null, null),
--- End diff --

So for the JSON data source, the previous behavior was that we would skip the row even 
in PERMISSIVE mode. Shall we clearly mention it in the migration guide?
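
To spell out the difference for the guide, here is a toy sketch of the three modes. It is not the JSON data source: `ParseModeSketch`, its `parse` rule (a record is valid when it is an integer string), and `read` are hypothetical and only model "keep as nulls" versus "skip" versus "fail".

```scala
object ParseModeSketch {
  sealed trait Mode
  case object Permissive extends Mode
  case object DropMalformed extends Mode
  case object FailFast extends Mode

  // Toy "parser": a record is valid when it is a non-blank integer string.
  private def parse(record: String): Option[Int] =
    scala.util.Try(record.trim.toInt).toOption

  // Permissive keeps a bad record as a null-like `None` row, DropMalformed
  // skips it, and FailFast throws on the first bad record.
  def read(records: Seq[String], mode: Mode): Seq[Option[Int]] = {
    val rows: Seq[Option[Int]] = records.flatMap { r =>
      val out: Seq[Option[Int]] = (parse(r), mode) match {
        case (ok @ Some(_), _) => Seq(ok)
        case (None, Permissive) => Seq(None)
        case (None, DropMalformed) => Seq.empty
        case (None, FailFast) =>
          throw new IllegalArgumentException(s"Malformed record: '$r'")
      }
      out
    }
    rows
  }
}
```

The old behavior described above looks like `DropMalformed` even when `Permissive` was requested; the extra `Row(null, null, null)` in the test is the `Permissive` branch.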


---


[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22938#discussion_r232550502
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -550,15 +550,23 @@ case class JsonToStructs(
   s"Input schema ${nullableSchema.catalogString} must be a struct, an 
array or a map.")
   }
 
-  // This converts parsed rows to the desired output by the given schema.
   @transient
-  lazy val converter = nullableSchema match {
-case _: StructType =>
-  (rows: Iterator[InternalRow]) => if (rows.hasNext) rows.next() else 
null
-case _: ArrayType =>
-  (rows: Iterator[InternalRow]) => if (rows.hasNext) 
rows.next().getArray(0) else null
-case _: MapType =>
-  (rows: Iterator[InternalRow]) => if (rows.hasNext) 
rows.next().getMap(0) else null
+  private lazy val castRow = nullableSchema match {
+case _: StructType => (row: InternalRow) => row
+case _: ArrayType => (row: InternalRow) => row.getArray(0)
+case _: MapType => (row: InternalRow) => row.getMap(0)
+  }
+
+  // This converts parsed rows to the desired output by the given schema.
+  private def convertRow(rows: Iterator[InternalRow]) = {
+if (rows.hasNext) {
+  val result = rows.next()
+  // JSON's parser produces one record only.
+  assert(!rows.hasNext)
+  castRow(result)
+} else {
+  throw new IllegalArgumentException("Expected one row from JSON 
parser.")
--- End diff --

This can only happen when we have a bug, right?


---


[GitHub] spark pull request #22966: [SPARK-25965][SQL][TEST] Add avro read benchmark

2018-11-11 Thread gengliangwang

Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/22966#discussion_r232550388
  
--- Diff: 
external/avro/src/test/scala/org/apache/spark/sql/execution/benchmark/AvroReadBenchmark.scala
 ---
@@ -0,0 +1,226 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution.benchmark
+
+import java.io.File
+
+import scala.util.Random
+
+import org.apache.spark.SparkConf
+import org.apache.spark.benchmark.{Benchmark, BenchmarkBase}
+import org.apache.spark.sql.{DataFrame, SparkSession}
+import org.apache.spark.sql.catalyst.plans.SQLHelper
+import org.apache.spark.sql.types._
+
+/**
+ * Benchmark to measure Avro read performance.
+ * {{{
+ *   To run this benchmark:
+ *   1. without sbt: bin/spark-submit --class 
+ *--jars , 
+ *   2. build/sbt "avro/test:runMain "
+ *   3. generate result: SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt 
"avro/test:runMain "
+ *  Results will be written to 
"benchmarks/AvroReadBenchmark-results.txt".
+ * }}}
+ */
+object AvroReadBenchmark extends BenchmarkBase with SQLHelper {
--- End diff --

@dongjoon-hyun OK, then I think this one is ready.


---


[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22938#discussion_r232550186
  
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -15,6 +15,8 @@ displayTitle: Spark SQL Upgrading Guide
 
   - Since Spark 3.0, the `from_json` functions supports two modes - 
`PERMISSIVE` and `FAILFAST`. The modes can be set via the `mode` option. The 
default mode became `PERMISSIVE`. In previous versions, behavior of `from_json` 
did not conform to either `PERMISSIVE` nor `FAILFAST`, especially in processing 
of malformed JSON records. For example, the JSON string `{"a" 1}` with the 
schema `a INT` is converted to `null` by previous versions but Spark 3.0 
converts it to `Row(null)`.
 
+  - In Spark version 2.4 and earlier, JSON data source and the `from_json` 
function produced `null`s if there is no valid root JSON token in its input (` 
` for example). Since Spark 3.0, such input is treated as a bad record and 
handled according to specified mode. For example, in the `PERMISSIVE` mode the 
` ` input is converted to `Row(null, null)` if specified schema is `key STRING, 
value INT`. 
--- End diff --

> In Spark version 2.4 and earlier, JSON data source and the `from_json` 
function produced `null`s

Shall we update this? According to what you said, JSON data source can't 
produce null.


---


[GitHub] spark issue #22998: [SPARK-26001][SQL]Reduce memory copy when writing decima...

2018-11-11 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22998
  
I think this is wrong. We have to zero out the bytes even when writing a null 
decimal, so that two unsafe rows with the same values (including null values) are 
exactly the same in binary format.
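
A small sketch of why the zeroing matters when a buffer is reused. The layout is made up for illustration (a 16-byte slot followed by a one-byte null flag) and is not the actual unsafe row format; `NullSlotSketch` and `write` are hypothetical names.

```scala
import java.util.Arrays

object NullSlotSketch {
  // Hypothetical fixed-width slot for a decimal; byte `SlotSize` is the null flag.
  val SlotSize = 16

  // Write `value` into the slot, or mark the slot as null. With `zeroOnNull`
  // the slot bytes are cleared, so stale bytes from an earlier write cannot
  // survive underneath the null flag.
  def write(buf: Array[Byte], value: Option[Array[Byte]], zeroOnNull: Boolean): Unit =
    value match {
      case Some(bytes) =>
        require(bytes.length <= SlotSize, "value does not fit in the slot")
        Arrays.fill(buf, 0, SlotSize, 0.toByte)
        System.arraycopy(bytes, 0, buf, 0, bytes.length)
        buf(SlotSize) = 0.toByte // null flag off
      case None =>
        if (zeroOnNull) Arrays.fill(buf, 0, SlotSize, 0.toByte)
        buf(SlotSize) = 1.toByte // null flag on
    }
}
```

If a reused buffer skips the zeroing, a null written over an old value and a null written into a fresh buffer are logically equal but differ byte for byte, which breaks anything that compares or hashes rows by their bytes.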


---


[GitHub] spark pull request #23012: [SPARK-26014][R] Deprecate R prior to version 3.4...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23012#discussion_r232549234
  
--- Diff: docs/index.md ---
@@ -31,7 +31,8 @@ Spark runs on both Windows and UNIX-like systems (e.g. 
Linux, Mac OS). It's easy
 locally on one machine --- all you need is to have `java` installed on 
your system `PATH`,
 or the `JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. For the Scala API, 
Spark {{site.SPARK_VERSION}}
+Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. R prior to version 3.4 
is deprecated as of Spark 3.0.
--- End diff --

Ah, yea, I switched this to deprecate it for now. I was a bit curious 
about that.


---


[GitHub] spark pull request #23012: [SPARK-26014][R] Deprecate R prior to version 3.4...

2018-11-11 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23012#discussion_r232549062
  
--- Diff: docs/index.md ---
@@ -31,7 +31,8 @@ Spark runs on both Windows and UNIX-like systems (e.g. 
Linux, Mac OS). It's easy
 locally on one machine --- all you need is to have `java` installed on 
your system `PATH`,
 or the `JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. For the Scala API, 
Spark {{site.SPARK_VERSION}}
+Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. R prior to version 3.4 
is deprecated as of Spark 3.0.
--- End diff --

Hmm, so R prior to version 3.4 is just deprecated, not dropped, in Spark 
3.0?


---


[GitHub] spark pull request #23012: [SPARK-26014][R] Deprecate R prior to version 3.4...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23012#discussion_r232549053
  
--- Diff: docs/index.md ---
@@ -31,7 +31,8 @@ Spark runs on both Windows and UNIX-like systems (e.g. 
Linux, Mac OS). It's easy
 locally on one machine --- all you need is to have `java` installed on 
your system `PATH`,
 or the `JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. For the Scala API, 
Spark {{site.SPARK_VERSION}}
+Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. R prior to version 3.4 
is deprecated as of Spark 3.0.
--- End diff --

Hm .. I was thinking we could change them when we actually drop the 
support. Technically it still supports 3.1+, although 3.1, 3.2, and 3.3 are 
deprecated.


---


[GitHub] spark pull request #23012: [SPARK-26014][R] Deprecate R prior to version 3.4...

2018-11-11 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23012#discussion_r232548211
  
--- Diff: docs/index.md ---
@@ -31,7 +31,8 @@ Spark runs on both Windows and UNIX-like systems (e.g. 
Linux, Mac OS). It's easy
 locally on one machine --- all you need is to have `java` installed on 
your system `PATH`,
 or the `JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. For the Scala API, 
Spark {{site.SPARK_VERSION}}
+Spark runs on Java 8+, Python 2.7+/3.4+ and R 3.1+. R prior to version 3.4 
is deprecated as of Spark 3.0.
--- End diff --

3.1+ -> 3.4+?


---


[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22974
  
**[Test build #98719 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98719/testReport)**
 for PR 22974 at commit 
[`0c529fb`](https://github.com/apache/spark/commit/0c529fb7830b78c45b3f2a98046da9fa3061185f).


---


[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4945/
Test PASSed.


---


[GitHub] spark issue #23005: [SPARK-26005] [SQL] Upgrade ANTRL from 4.7 to 4.7.1

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23005
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23005: [SPARK-26005] [SQL] Upgrade ANTRL from 4.7 to 4.7.1

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23005
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98711/
Test PASSed.


---


[GitHub] spark issue #23005: [SPARK-26005] [SQL] Upgrade ANTRL from 4.7 to 4.7.1

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23005
  
**[Test build #98711 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98711/testReport)**
 for PR 23005 at commit 
[`4545977`](https://github.com/apache/spark/commit/45459776f2dd08f8180e152aae2702dfed190ed9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23012
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23012
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98718/
Test FAILed.


---


[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23012
  
**[Test build #98718 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98718/testReport)**
 for PR 23012 at commit 
[`dc2dbd9`](https://github.com/apache/spark/commit/dc2dbd923a1396ca5a7a950df35da57cc70c2ab8).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23009: SPARK-26011: pyspark app with "spark.jars.packages" conf...

2018-11-11 Thread imatiach-msft

Github user imatiach-msft commented on the issue:

https://github.com/apache/spark/pull/23009
  
@shanyu can you update the title to [SPARK-26011][CORE][PYSPARK] according 
to the guidelines?


---


[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98712/
Test FAILed.


---


[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22974
  
**[Test build #98712 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98712/testReport)**
 for PR 22974 at commit 
[`7e97e45`](https://github.com/apache/spark/commit/7e97e450e110b9cdbe3610ee03e1ea65d5575d63).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22939: [SPARK-25446][R] Add schema_of_json() and schema_...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22939#discussion_r232540931
  
--- Diff: R/pkg/R/functions.R ---
@@ -2230,6 +2237,32 @@ setMethod("from_json", signature(x = "Column", 
schema = "characterOrstructType")
 column(jc)
   })
 
+#' @details
+#' \code{schema_of_json}: Parses a JSON string and infers its schema in 
DDL format.
+#'
+#' @rdname column_collection_functions
+#' @aliases schema_of_json schema_of_json,characterOrColumn-method
+#' @examples
+#'
+#' \dontrun{
+#' json <- '{"name":"Bob"}'
+#' df <- sql("SELECT * FROM range(1)")
+#' head(select(df, schema_of_json(json)))}
+#' @note schema_of_json since 3.0.0
+setMethod("schema_of_json", signature(x = "characterOrColumn"),
+  function(x, ...) {
+if (class(x) == "character") {
+  col <- callJStatic("org.apache.spark.sql.functions", "lit", 
x)
+} else {
+  col <- x@jc
--- End diff --

Hmm .. do you mind if we go ahead with this one and discuss it later within 3.0? I 
think we're going to deal with this problem within 3.0, if I am not mistaken. I 
need to make one follow-up after this anyway.


---


[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23012
  
**[Test build #98718 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98718/testReport)**
 for PR 23012 at commit 
[`dc2dbd9`](https://github.com/apache/spark/commit/dc2dbd923a1396ca5a7a950df35da57cc70c2ab8).


---


[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23012
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22998: [SPARK-26001][SQL]Reduce memory copy when writing decima...

2018-11-11 Thread heary-cao

Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/22998
  
@kiszk thank you for reviewing it.
- when writing null decimals:
```
OpenJDK 64-Bit Server VM 1.8.0_163-b01 on Windows 7 6.1
Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
iter length 1048576:          Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------
before PR (input == null)             51 /   56         20.4          49.0       1.0X
after PR (input == null)               8 /    9        125.2           8.0       6.1X
```

- when writing non-null decimals
```
OpenJDK 64-Bit Server VM 1.8.0_163-b01 on Windows 7 6.1
Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
iter length 1048576:          Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
before PR (input != null)            52 /   53         20.3          49.2       1.0X
after PR (input != null)             54 /   56         19.3          51.7       1.0X
```
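
For reading the two tables above: assuming the `Relative` column is the baseline per-row time divided by each case's per-row time (an assumption about the benchmark output, not something stated in this thread), the reported ratios can be reproduced from the `Per Row(ns)` values. The helper name `relative` is hypothetical.

```python
def relative(baseline_ns: float, candidate_ns: float) -> float:
    """Speedup of candidate over baseline, rounded to one decimal like the table."""
    return round(baseline_ns / candidate_ns, 1)


# Per Row(ns) values copied from the two tables above.
null_speedup = relative(49.0, 8.0)        # writing null decimals
non_null_speedup = relative(49.2, 51.7)   # writing non-null decimals

print(f"{null_speedup}X", f"{non_null_speedup}X")
```

Under that assumption, the null path gets noticeably faster while the non-null path stays within rounding of the baseline.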


---

[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23012
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4944/
Test PASSed.


---

[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23012
  
adding @srowen too.


---

[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23012
  
Tests will probably fail since this produces warnings. 

cc @felixcheung. @shaneknapp, @viirya, @shivaram, @falaki, @mengxr, 
@yanboliang FYI.

This PR is made per 
http://apache-spark-developers-list.1001551.n3.nabble.com/discuss-SparkR-CRAN-feasibility-check-server-problem-td25605.html


---

[GitHub] spark issue #22721: [SPARK-25403][SQL] Refreshes the table after inserting t...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22721
  
**[Test build #98717 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98717/testReport)** for PR 22721 at commit [`1e62a24`](https://github.com/apache/spark/commit/1e62a24bba8aaa949f3481ae3befe2db5c286edc).


---

[GitHub] spark issue #22998: [SPARK-26001][SQL]Reduce memory copy when writing decima...

2018-11-11 Thread heary-cao

Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/22998
  
@mgaido91 
Thank you for reviewing it. I added a test case for "write a decimal with 
16 bytes and then one with less than 8". With the current change, the remaining 
8 bytes are not left dirty. 
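
To make the property under test concrete, here is a minimal sketch (not the PR's implementation). It assumes a fixed 16-byte slot that is reused for consecutive writes; `SLOT_BYTES` and `write_decimal` are hypothetical names chosen for illustration.

```python
SLOT_BYTES = 16  # assumed fixed slot width for a large decimal


def write_decimal(buf: bytearray, offset: int, payload: bytes) -> None:
    """Write payload into a fixed-width slot, clearing stale bytes first."""
    if len(payload) > SLOT_BYTES:
        raise ValueError("payload does not fit in the slot")
    # Zero the whole slot so bytes of an earlier, longer value cannot survive.
    buf[offset:offset + SLOT_BYTES] = bytes(SLOT_BYTES)
    buf[offset:offset + len(payload)] = payload


buf = bytearray(SLOT_BYTES)
write_decimal(buf, 0, b"\xff" * 16)  # first value fills all 16 bytes
write_decimal(buf, 0, b"\x01" * 4)   # second value is shorter than 8 bytes
```

After the second write, every byte past the short payload is zero, which is what "not dirty" means here.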


---

[GitHub] spark pull request #23012: [SPARK-26014][R] Deprecate R prior to version 3.4...

2018-11-11 Thread HyukjinKwon

GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/23012

[SPARK-26014][R] Deprecate R prior to version 3.4 in SparkR

## What changes were proposed in this pull request?

This PR proposes to bump up the minimum R version from 3.1 to 3.4.

R 3.1.x is too old: it was released 4.5 years ago, and R 3.4.0 was 
released 1.5 years ago. Considering the timing of Spark 3.0, deprecating the 
lower versions and bumping R up to 3.4 might be a reasonable option.

It should be good to deprecate and drop support for R < 3.4.

In practice, nothing in particular is required within the R code 
as far as I can tell, except:

1. https://github.com/apache/spark/blob/master/R/pkg/src-native/string_hash_code.c
2. `env` becomes immutable, but in some lower versions it is mutable, if 
I remember correctly. This shouldn't be a big deal on the SparkR side.
3. We will need to upgrade Jenkins's R version to 3.4, which means we're not 
going to test R 3.1. This should be okay because we're already not 
testing R 3.2, 3.3 and 3.4. We test 3.5 in AppVeyor, and 3.1 in Jenkins.

## How was this patch tested?

Jenkins tests. 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-26014

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23012.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23012


commit dc2dbd923a1396ca5a7a950df35da57cc70c2ab8
Author: hyukjinkwon 
Date:   2018-11-12T05:39:14Z

Deprecate R prior to version 3.4 in SparkR




---

[GitHub] spark issue #22721: [SPARK-25403][SQL] Refreshes the table after inserting t...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Merged build finished. Test PASSed.


---

[GitHub] spark issue #22721: [SPARK-25403][SQL] Refreshes the table after inserting t...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4943/
Test PASSed.


---

[GitHub] spark issue #23009: SPARK-26011: pyspark app with "spark.jars.packages" conf...

2018-11-11 Thread imatiach-msft

Github user imatiach-msft commented on the issue:

https://github.com/apache/spark/pull/23009
  
Jenkins test this please


---

[GitHub] spark issue #23009: SPARK-26011: pyspark app with "spark.jars.packages" conf...

2018-11-11 Thread imatiach-msft

Github user imatiach-msft commented on the issue:

https://github.com/apache/spark/pull/23009
  
LGTM, nice find


---

[GitHub] spark issue #23011: [SPARK-26013][R][BUILD] Upgrade R tools version from 3.4...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23011
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4942/
Test PASSed.


---

[GitHub] spark issue #23011: [SPARK-26013][R][BUILD] Upgrade R tools version from 3.4...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23011
  
Merged build finished. Test PASSed.


---

[GitHub] spark issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dy...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23010
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4941/
Test PASSed.


---

[GitHub] spark issue #23011: [SPARK-26013][R][BUILD] Upgrade R tools version from 3.4...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23011
  
**[Test build #98716 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98716/testReport)** for PR 23011 at commit [`b94d04a`](https://github.com/apache/spark/commit/b94d04ac80052ed50830239b06a08bf5b07603e6).


---

[GitHub] spark issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dy...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23010
  
Merged build finished. Test PASSed.


---

[GitHub] spark issue #23011: [SPARK-26013][R][BUILD] Upgrade R tools version from 3.4...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23011
  
cc @felixcheung 


---

[GitHub] spark pull request #23011: [SPARK-26013][R][BUILD] Upgrade R tools version f...

2018-11-11 Thread HyukjinKwon

GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/23011

[SPARK-26013][R][BUILD] Upgrade R tools version from 3.4.0 to 3.5.1 in AppVeyor build

## What changes were proposed in this pull request?

R tools 3.5.1 was released a few months ago. Spark currently uses 3.4.0. We 
should upgrade it in AppVeyor.

## How was this patch tested?

AppVeyor builds.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-26013

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23011.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23011


commit b94d04ac80052ed50830239b06a08bf5b07603e6
Author: hyukjinkwon 
Date:   2018-11-12T05:02:23Z

Upgrade R tools version to 3.5.1 in AppVeyor build




---

[GitHub] spark issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dy...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23010
  
**[Test build #98715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98715/testReport)** for PR 23010 at commit [`1f18e27`](https://github.com/apache/spark/commit/1f18e2786a26eb64c52925d8ecff2d6a2295ca16).


---

[GitHub] spark pull request #23010: [SPARK-26012][SQL]Null and '' values should not c...

2018-11-11 Thread eatoncys

GitHub user eatoncys opened a pull request:

https://github.com/apache/spark/pull/23010

[SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types

## What changes were proposed in this pull request?

Dynamic partitioning fails when both '' and null values are used as 
dynamic partition values simultaneously.
For example, the test below fails before this PR:

  test("Null and '' values should not cause dynamic partition failure of string types") {
    withTable("t1", "t2") {
      spark.range(3).write.saveAsTable("t1")
      spark.sql("select id, cast(case when id = 1 then '' else null end as string) as p" +
        " from t1").write.partitionBy("p").saveAsTable("t2")
      checkAnswer(spark.table("t2").sort("id"), Seq(Row(0, null), Row(1, null), Row(2, null)))
    }
  }

The error is: 'org.apache.hadoop.fs.FileAlreadyExistsException: File 
already exists'.
This PR adds protection against file conflicts by renaming the file when a 
conflict occurs.
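
A rough sketch of why the two values can collide, under an assumption that is not spelled out in this PR text: both null and '' partition values are mapped to the same default partition directory name (taken here to be `__HIVE_DEFAULT_PARTITION__`). The function `partition_dir` is hypothetical and only models the naming step, not the writer.

```python
DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__"  # assumed default name


def partition_dir(column, value):
    """Model of the directory name chosen for one dynamic partition value."""
    if value is None or value == "":
        return f"{column}={DEFAULT_PARTITION_NAME}"
    return f"{column}={value}"


# Same shape as the test above: id = 1 gets '', the other rows get null.
rows = [(0, None), (1, ""), (2, None)]
distinct_values = {p for _, p in rows}
dirs = {partition_dir("p", p) for _, p in rows}

print(len(distinct_values), "distinct values ->", len(dirs), "directory")
```

Two distinct partition values end up in one directory, so a writer that opens one file per distinct value can attempt to create the same path twice. That would be consistent with the expected answer in the test, where every row reads back `null`.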



## How was this patch tested?
Newly added test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/eatoncys/spark dynamicPartition

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23010.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23010


commit 1f18e2786a26eb64c52925d8ecff2d6a2295ca16
Author: 10129659 
Date:   2018-11-12T04:41:53Z

Null and '' values should not cause dynamic partition failure of string types




---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22962
  
cc @jiangxb1987 @MrBago 


---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22962
  
Merged build finished. Test PASSed.


---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22962
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98714/
Test PASSed.


---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22962
  
**[Test build #98714 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98714/testReport)** for PR 22962 at commit [`02555b8`](https://github.com/apache/spark/commit/02555b8fbdf85c3f2b5a92420479c168e14b573c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-11-11 Thread seancxmao

Github user seancxmao commented on the issue:

https://github.com/apache/spark/pull/22184
  
@HyukjinKwon Thank you for your comments. Yes, this is only valid when 
upgrading from Spark 2.3 to 2.4. I will do it.


---

[GitHub] spark issue #22989: [SPARK-25986][Build] Banning throw new OutOfMemoryErrors

2018-11-11 Thread xuanyuanking

Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/22989
  
Sorry for the late reply. Many thanks for all the reviewers' advice; I will 
address it soon.


---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22962
  
This makes sense to me in general.


---

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix barrier task run witho...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22962#discussion_r232528655
  
--- Diff: python/pyspark/tests.py ---
@@ -618,10 +618,13 @@ def test_barrier_with_python_worker_reuse(self):
 """
 Verify that BarrierTaskContext.barrier() with reused python worker.
 """
+self.sc._conf.set("spark.python.work.reuse", "true")
--- End diff --

@xuanyuanking, this will probably need a separate test suite since it's 
also related to whether and how we start the worker. You can make a new class, 
run a simple job to make sure workers are created and being reused, test it, and 
stop.


---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread xuanyuanking

Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/22962
  
@HyukjinKwon Thanks for your review. The comment is addressed, and the PR 
description/title has been changed.


---

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix barrier task run witho...

2018-11-11 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22962#discussion_r232528333
  
--- Diff: python/pyspark/taskcontext.py ---
@@ -144,10 +144,19 @@ def __init__(self):
 """Construct a BarrierTaskContext, use get instead"""
 pass
 
+def __new__(cls):
--- End diff --

Yep, doing this in `_getOrCreate` has the same effect; this was an over-consideration of 
https://github.com/apache/spark/blob/aec0af4a952df2957e21d39d1e0546a36ab7ab86/python/pyspark/taskcontext.py#L44-L45
Deleted in 02555b8.
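
For context, a simplified sketch of the get-or-create idea (not the PySpark source). It assumes a single shared class-level slot that survives between tasks when the Python worker is reused; all class and attribute names below are hypothetical stand-ins.

```python
class TaskContextSketch:
    _task_context = None  # shared slot, survives across tasks in a reused worker

    @classmethod
    def _get_or_create(cls):
        if TaskContextSketch._task_context is None:
            TaskContextSketch._task_context = object.__new__(cls)
        return TaskContextSketch._task_context


class BarrierTaskContextSketch(TaskContextSketch):
    @classmethod
    def _get_or_create(cls):
        # A reused worker may already hold a plain context from an earlier
        # non-barrier task, so check the type rather than only "is None".
        if not isinstance(TaskContextSketch._task_context, BarrierTaskContextSketch):
            TaskContextSketch._task_context = object.__new__(cls)
        return TaskContextSketch._task_context

    def barrier(self):
        return "barrier reached"


plain = TaskContextSketch._get_or_create()            # earlier, non-barrier task
barrier_ctx = BarrierTaskContextSketch._get_or_create()  # later barrier task
```

With the type check living in the get-or-create path, a separate `__new__` override is not needed to obtain the right kind of context.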


---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22962
  
Merged build finished. Test PASSed.


---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22962
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4940/
Test PASSed.


---

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix barrier task run witho...

2018-11-11 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22962#discussion_r232527808
  
--- Diff: python/pyspark/tests.py ---
@@ -614,6 +614,18 @@ def context_barrier(x):
 times = rdd.barrier().mapPartitions(f).map(context_barrier).collect()
 self.assertTrue(max(times) - min(times) < 1)
 
+def test_barrier_with_python_worker_reuse(self):
+"""
+Verify that BarrierTaskContext.barrier() with reused python worker.
+"""
+rdd = self.sc.parallelize(range(4), 4)
--- End diff --

Thanks, done in 02555b8.


---

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix BarrierTaskContext while pyth...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22962
  
**[Test build #98714 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98714/testReport)** for PR 22962 at commit [`02555b8`](https://github.com/apache/spark/commit/02555b8fbdf85c3f2b5a92420479c168e14b573c).


---

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22954
  
**[Test build #98713 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98713/testReport)** for PR 22954 at commit [`d9d9f98`](https://github.com/apache/spark/commit/d9d9f982d26a5dd2141515e0c9089243b7b93554).


---

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22954
  
Merged build finished. Test PASSed.


---

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22954
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4939/
Test PASSed.


---

[GitHub] spark issue #23009: SPARK-26011: pyspark app with "spark.jars.packages" conf...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23009
  
Can one of the admins verify this patch?


---

[GitHub] spark issue #23009: SPARK-26011: pyspark app with "spark.jars.packages" conf...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23009
  
Can one of the admins verify this patch?


---

[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-11-11 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22087
  
I also exposed GMM's predictProbability.
Could you please make a final pass? @srowen @felixcheung 


---

[GitHub] spark issue #23009: SPARK-26011: pyspark app with "spark.jars.packages" conf...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23009
  
Can one of the admins verify this patch?


---

[GitHub] spark pull request #23009: SPARK-26011: pyspark app with "spark.jars.package...

2018-11-11 Thread shanyu

GitHub user shanyu opened a pull request:

https://github.com/apache/spark/pull/23009

SPARK-26011: pyspark app with "spark.jars.packages" config does not work

SparkSubmit determines whether an app is a pyspark app by the suffix of the 
primary resource, but Livy uses "spark-internal" as the primary resource when 
calling spark-submit, so args.isPython is set to false in SparkSubmit.scala.

The fix is to resolve maven coordinates not only when args.isPython is true,
but also when primary resource is spark-internal.

Tested the patch with Livy submitting pyspark app, spark-submit, pyspark 
with or without packages config.
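
The described condition can be sketched as a small predicate. This is a simplified illustration, not the SparkSubmit code: it assumes the Python check is only a `.py` suffix test, and the function names are hypothetical.

```python
SPARK_INTERNAL = "spark-internal"  # primary resource name Livy passes


def is_python_resource(primary_resource: str) -> bool:
    """Simplified stand-in for the suffix-based check behind args.isPython."""
    return primary_resource.endswith(".py")


def should_resolve_python_packages(primary_resource: str) -> bool:
    """Resolve Maven coordinates for Python apps and for spark-internal."""
    return is_python_resource(primary_resource) or primary_resource == SPARK_INTERNAL


for resource in ("app.py", "spark-internal", "app.jar"):
    print(resource, should_resolve_python_packages(resource))
```

The second clause is what covers the Livy case, where the suffix check alone reports a non-Python app.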

Signed-off-by: Shanyu Zhao 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shanyu/spark shanyu-26011

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23009.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23009


commit c8424aff80e33f9a3f5a7d19a04442c7dac701a4
Author: Shanyu Zhao 
Date:   2018-11-12T02:57:01Z

SPARK-26011: pyspark app with "spark.jars.packages" config does not work

SparkSubmit determines whether an app is a pyspark app by the suffix of the 
primary resource, but Livy uses "spark-internal" as the primary resource when 
calling spark-submit, so args.isPython is set to false in SparkSubmit.scala.

The fix is to resolve maven coordinates not only when args.isPython is true,
but also when primary resource is spark-internal.

Signed-off-by: Shanyu Zhao 




---

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22954
  
Yeah, I will do the follow-up work right after this one gets merged. 
Thanks @felixcheung. Let me address the rest of the comments and wait for the 
Arrow release.

@BryanCutler BTW, do you know the rough expected timing of the Arrow 0.12.0 
release?


---

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232525184
  
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -307,6 +307,64 @@ test_that("create DataFrame from RDD", {
   unsetHiveContext()
 })
 
+test_that("createDataFrame Arrow optimization", {
+  skip_if_not_installed("arrow")
+  skip_if_not_installed("withr")
--- End diff --

Maybe we should hold off on it for now, because I realised the R API for Arrow 
requires R 3.5.x and Jenkins's R is 3.1.x, if I remember correctly. 
Ideally, we could probably do that via AppVeyor if everything goes fine.


---

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232525068
  
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -307,6 +307,64 @@ test_that("create DataFrame from RDD", {
   unsetHiveContext()
 })
 
+test_that("createDataFrame Arrow optimization", {
+  skip_if_not_installed("arrow")
+  skip_if_not_installed("withr")
+
+  conf <- callJMethod(sparkSession, "conf")
+  arrowEnabled <- sparkR.conf("spark.sql.execution.arrow.enabled")[[1]]
+
+  callJMethod(conf, "set", "spark.sql.execution.arrow.enabled", "false")
+  tryCatch({
--- End diff --

Just to inject the `finally` clause :-)
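
The same restore-in-finally pattern, sketched in Python with a plain dict standing in for the session conf (an assumption made only so the example is self-contained; `with_conf` is a hypothetical helper, not a Spark or SparkR API):

```python
KEY = "spark.sql.execution.arrow.enabled"
conf = {KEY: "true"}  # stand-in for the real session configuration


def with_conf(conf, key, value, body):
    """Run body() with conf[key] set to value, restoring the old value after."""
    previous = conf.get(key)
    conf[key] = value
    try:
        return body()
    finally:
        # Runs on success and on error, so the setting never leaks to later tests.
        if previous is None:
            conf.pop(key, None)
        else:
            conf[key] = previous


seen_inside = with_conf(conf, KEY, "false", lambda: conf[KEY])
print(seen_inside, conf[KEY])
```

The point of the `finally` branch is that a failing assertion inside the body still leaves the configuration as it was before the test.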


---

[GitHub] spark issue #23008: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23008
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98710/
Test PASSed.


---

[GitHub] spark issue #23008: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23008
  
**[Test build #98710 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98710/testReport)** for PR 23008 at commit [`9a81879`](https://github.com/apache/spark/commit/9a818797603f5804b32202d28474493c80966f58).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

[GitHub] spark issue #23008: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23008
  
Merged build finished. Test PASSed.


---

[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22974
  
**[Test build #98712 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98712/testReport)** for PR 22974 at commit [`7e97e45`](https://github.com/apache/spark/commit/7e97e450e110b9cdbe3610ee03e1ea65d5575d63).


---

[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4938/
Test PASSed.


---

[GitHub] spark issue #22974: [SPARK-22450][WIP][Core][MLLib][FollowUp] Safely registe...

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22974
  
Merged build finished. Test PASSed.


---

[GitHub] spark issue #23005: [SPARK-26005] [SQL] Upgrade ANTRL from 4.7 to 4.7.1

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23005
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4937/
Test PASSed.


---

[GitHub] spark issue #23005: [SPARK-26005] [SQL] Upgrade ANTRL from 4.7 to 4.7.1

2018-11-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23005
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23005: [SPARK-26005] [SQL] Upgrade ANTRL from 4.7 to 4.7.1

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23005
  
**[Test build #98711 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98711/testReport)** for PR 23005 at commit [`4545977`](https://github.com/apache/spark/commit/45459776f2dd08f8180e152aae2702dfed190ed9).


---


[GitHub] spark issue #22974: [SPARK-22450][Core][MLLib][FollowUp] Safely register Mul...

2018-11-11 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22974
  
@srowen I have some spare time and will work on it.


---


[GitHub] spark issue #21766: [SPARK-24803][SQL] add support for numeric

2018-11-11 Thread wangtao605

Github user wangtao605 commented on the issue:

https://github.com/apache/spark/pull/21766
  
> @wangtao605 Do you mind documenting our behavior in our Spark SQL doc?

Yes, it's ok.


---


[GitHub] spark issue #23008: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...

2018-11-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23008
  
**[Test build #98710 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98710/testReport)** for PR 23008 at commit [`9a81879`](https://github.com/apache/spark/commit/9a818797603f5804b32202d28474493c80966f58).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22979#discussion_r232520110
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala ---
@@ -149,8 +156,8 @@ class UnivocityParser(
 
 case dt: DecimalType => (d: String) =>
   nullSafeDatum(d, name, nullable, options) { datum =>
-val value = new BigDecimal(datum.replaceAll(",", ""))
-Decimal(value, dt.precision, dt.scale)
+val bigDecimal = decimalParser.parse(datum).asInstanceOf[BigDecimal]
--- End diff --

Sounds good if that's not difficult.
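
For readers without the full diff, here is a minimal sketch of the difference between the removed and the added line. It assumes `decimalParser` is a `java.text.DecimalFormat` configured with `setParseBigDecimal(true)`; the quoted hunk does not show how it is built. The object name, both helper names, and the `#,##0.###` pattern are hypothetical and not taken from the PR.

```scala
import java.math.BigDecimal
import java.text.{DecimalFormat, DecimalFormatSymbols, ParsePosition}
import java.util.Locale

// Hypothetical illustration only; not the code from the pull request.
object DecimalParsingSketch {

  // Old approach from the removed line: drop every comma, then parse.
  // This only gives the intended value when ',' is a grouping separator.
  def parseByStrippingCommas(text: String): BigDecimal =
    new BigDecimal(text.replaceAll(",", ""))

  // Locale-aware approach: the locale's symbols decide what the grouping
  // and decimal separators are. The pattern is an assumption of this sketch.
  def parseDecimal(text: String, locale: Locale): BigDecimal = {
    val format = new DecimalFormat("#,##0.###", new DecimalFormatSymbols(locale))
    format.setParseBigDecimal(true) // make parse() return java.math.BigDecimal
    val position = new ParsePosition(0)
    format.parse(text, position) match {
      // Accept the result only if the whole input was consumed.
      case value: BigDecimal if position.getIndex == text.length => value
      case _ =>
        throw new IllegalArgumentException(s"Cannot parse decimal: $text")
    }
  }
}
```

With `Locale.GERMANY`, where `.` groups digits and `,` marks the fraction, `parseDecimal("1.000,5", Locale.GERMANY)` yields one thousand and a half, while `parseByStrippingCommas("1.000,5")` reads the same input as 1.0005.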


---


[GitHub] spark issue #23008: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...

2018-11-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23008
  
ok to test


---


[GitHub] spark issue #22764: [SPARK-25765][ML] Add training cost to BisectingKMeans s...

2018-11-11 Thread dbtsai

Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/22764
  
@mgaido91 I'm on Thanksgiving vacation and will be back to help with code review on Nov 21st. Sorry for the delay.


---

