[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99726 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99726/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-05 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/22952
  
@gaborgsomogyi @steveloughran 
OK. I'll change the approach to just check against final path for each 
moving. As @steveloughran stated, it may bring performance hit for each 
checking when dealing with object stores, so we may also need to provide a way 
to disable checking as well with caution. (Btw, if moving file in object store 
requires huge overhead rather than globing, slow globing may not be a big deal.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239060606
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLShuffleMetricsReporter.scala
 ---
@@ -95,3 +96,59 @@ private[spark] object SQLShuffleMetricsReporter {
 FETCH_WAIT_TIME -> SQLMetrics.createTimingMetric(sc, "fetch wait 
time"),
 RECORDS_READ -> SQLMetrics.createMetric(sc, "records read"))
 }
+
+/**
+ * A shuffle write metrics reporter for SQL exchange operators. Different 
with
+ * [[SQLShuffleReadMetricsReporter]], we need a function of (reporter => 
reporter) set in
+ * shuffle dependency, so the local SQLMetric should transient and create 
on executor.
+ * @param metrics Shuffle write metrics in current SparkPlan.
+ * @param metricsReporter Other reporter need to be updated in this 
SQLShuffleWriteMetricsReporter.
+ */
+private[spark] case class SQLShuffleWriteMetricsReporter(
+metrics: Map[String, SQLMetric])(metricsReporter: 
ShuffleWriteMetricsReporter)
+  extends ShuffleWriteMetricsReporter with Serializable {
+  @transient private[this] lazy val _bytesWritten =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_BYTES_WRITTEN)
+  @transient private[this] lazy val _recordsWritten =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_RECORDS_WRITTEN)
+  @transient private[this] lazy val _writeTime =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_WRITE_TIME)
+
+  override private[spark] def incBytesWritten(v: Long): Unit = {
+metricsReporter.incBytesWritten(v)
+_bytesWritten.add(v)
+  }
+  override private[spark] def decRecordsWritten(v: Long): Unit = {
+metricsReporter.decBytesWritten(v)
+_recordsWritten.set(_recordsWritten.value - v)
+  }
+  override private[spark] def incRecordsWritten(v: Long): Unit = {
+metricsReporter.incRecordsWritten(v)
+_recordsWritten.add(v)
+  }
+  override private[spark] def incWriteTime(v: Long): Unit = {
+metricsReporter.incWriteTime(v)
+_writeTime.add(v)
+  }
+  override private[spark] def decBytesWritten(v: Long): Unit = {
+metricsReporter.decBytesWritten(v)
+_bytesWritten.set(_bytesWritten.value - v)
+  }
+}
+
+private[spark] object SQLShuffleWriteMetricsReporter {
+  val SHUFFLE_BYTES_WRITTEN = "shuffleBytesWritten"
+  val SHUFFLE_RECORDS_WRITTEN = "shuffleRecordsWritten"
+  val SHUFFLE_WRITE_TIME = "shuffleWriteTime"
--- End diff --

do we have other time metrics using nanoseconds?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99728/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99724/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23235
  
**[Test build #99728 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99728/testReport)**
 for PR 23235 at commit 
[`463f9e1`](https://github.com/apache/spark/commit/463f9e16ead2291a7f0f3893e485a56b77da2f06).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239059162
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala 
---
@@ -163,6 +171,8 @@ object SQLMetrics {
 Utils.bytesToString
   } else if (metricsType == TIMING_METRIC) {
 Utils.msDurationToString
+  } else if (metricsType == NS_TIMING_METRIC) {
+duration => Utils.msDurationToString(duration / 1000 / 1000)
--- End diff --

will this string lose the nanosecond precision?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99724 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99724/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99729 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99729/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5771/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5770/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23235
  
**[Test build #99728 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99728/testReport)**
 for PR 23235 at commit 
[`463f9e1`](https://github.com/apache/spark/commit/463f9e16ead2291a7f0f3893e485a56b77da2f06).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99727 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99727/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23224: [SPARK-26277][SQL][TEST] WholeStageCodegen metrics shoul...

2018-12-05 Thread seancxmao
Github user seancxmao commented on the issue:

https://github.com/apache/spark/pull/23224
  
@HyukjinKwon Thank you for your comments! I have filed a JIRA and updated 
the PR title accordingly.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial resul...

2018-12-05 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/23235#discussion_r239055208
  
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -35,6 +35,8 @@ displayTitle: Spark SQL Upgrading Guide
 
   - Since Spark 3.0, CSV datasource uses java.time API for parsing and 
generating CSV content. New formatting implementation supports date/timestamp 
patterns conformed to ISO 8601. To switch back to the implementation used in 
Spark 2.4 and earlier, set `spark.sql.legacy.timeParser.enabled` to `true`.
 
+  - In Spark version 2.4 and earlier, CSV datasource converts a malformed 
CSV string to a row with all `null`s in the PERMISSIVE mode if specified schema 
is `StructType`. Since Spark 3.0, returned row can contain non-`null` fields if 
some of CSV column values were parsed and converted to desired types 
successfully.
--- End diff --

you are right. I will remove the part about `StructType`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
test this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5769/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  

test this please
--





---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239054315
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
 ---
@@ -299,12 +312,25 @@ class SQLMetricsSuite extends SparkFunSuite with 
SQLMetricsTestUtils with Shared
   val df1 = Seq((1, "1"), (2, "2")).toDF("key", "value")
   val df2 = (1 to 10).map(i => (i, i.toString)).toSeq.toDF("key", 
"value")
   // Assume the execution plan is
-  // ... -> ShuffledHashJoin(nodeId = 1) -> Project(nodeId = 0)
+  // Project(nodeId = 0)
+  // +- ShuffledHashJoin(nodeId = 1)
+  // :- Exchange(nodeId = 2)
+  // :  +- Project(nodeId = 3)
+  // : +- LocalTableScan(nodeId = 4)
+  // +- Exchange(nodeId = 5)
+  // +- Project(nodeId = 6)
+  // +- LocalTableScan(nodeId = 7)
   val df = df1.join(df2, "key")
   testSparkPlanMetrics(df, 1, Map(
 1L -> (("ShuffledHashJoin", Map(
   "number of output rows" -> 2L,
-  "avg hash probe (min, med, max)" -> "\n(1, 1, 1)"
+  "avg hash probe (min, med, max)" -> "\n(1, 1, 1)"))),
+2L -> (("Exchange", Map(
+  "shuffle records written" -> 2L,
+  "records read" -> 2L))),
--- End diff --

For most scenario the answer is yes, but like sort merge join cases, 2 sort 
node reuse same child will make shuffle records written/records read different, 
I also add cases in here:


https://github.com/xuanyuanking/spark/blob/SPARK-26193/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala#L217-L222


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99726 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99726/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23195: [SPARK-26236][SS] Add kafka delegation token support doc...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23195
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23195: [SPARK-26236][SS] Add kafka delegation token support doc...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23195
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99723/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23195: [SPARK-26236][SS] Add kafka delegation token support doc...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23195
  
**[Test build #99723 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99723/testReport)**
 for PR 23195 at commit 
[`3ad9cb7`](https://github.com/apache/spark/commit/3ad9cb704c7d3daa181aeba0be78dc025dde24e2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5768/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
test this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
test this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
test this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99725/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99724 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99724/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23120: [SPARK-26151][SQL] Return partial results for bad CSV re...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23120
  
a late LGTM as well


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99721/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23230
  
Thanks @HyukjinKwon 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99702/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99721 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99721/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23195: [SPARK-26236][SS] Add kafka delegation token support doc...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23195
  
**[Test build #99723 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99723/testReport)**
 for PR 23195 at commit 
[`3ad9cb7`](https://github.com/apache/spark/commit/3ad9cb704c7d3daa181aeba0be78dc025dde24e2).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23227
  
**[Test build #99702 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99702/testReport)**
 for PR 23227 at commit 
[`5cb416d`](https://github.com/apache/spark/commit/5cb416df5f03b0d750c83e1a8a344b8ea44b1735).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239050549
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
 ---
@@ -170,13 +172,23 @@ class SQLMetricsSuite extends SparkFunSuite with 
SQLMetricsTestUtils with Shared
 val df = testData2.groupBy().agg(collect_set('a)) // 2 partitions
 testSparkPlanMetrics(df, 1, Map(
   2L -> (("ObjectHashAggregate", Map("number of output rows" -> 2L))),
+  1L -> (("Exchange", Map(
+"shuffle records written" -> 2L,
+"records read" -> 2L,
+"local blocks fetched" -> 2L,
--- End diff --

I agree "fetch" is a more code name in `ShuffleBlockFetcherIterator`, but 
do you mean just change the display in ui? Cause there's many place even 
api.scala use the name `localBlocksFetched`, change them all maybe not a good 
choice for code backport, WDYT?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial resul...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23235#discussion_r239049825
  
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -35,6 +35,8 @@ displayTitle: Spark SQL Upgrading Guide
 
   - Since Spark 3.0, CSV datasource uses java.time API for parsing and 
generating CSV content. New formatting implementation supports date/timestamp 
patterns conformed to ISO 8601. To switch back to the implementation used in 
Spark 2.4 and earlier, set `spark.sql.legacy.timeParser.enabled` to `true`.
 
+  - In Spark version 2.4 and earlier, CSV datasource converts a malformed 
CSV string to a row with all `null`s in the PERMISSIVE mode if specified schema 
is `StructType`. Since Spark 3.0, returned row can contain non-`null` fields if 
some of CSV column values were parsed and converted to desired types 
successfully.
--- End diff --

Ah, `from_csv` and `to_csv` are added in 3.0 so it's intentionally not 
mentioned. BTW, I think CSV functionalities can only have `StructType` so maybe 
we don't have to mention.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239049398
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala 
---
@@ -163,6 +171,8 @@ object SQLMetrics {
 Utils.bytesToString
   } else if (metricsType == TIMING_METRIC) {
 Utils.msDurationToString
+  } else if (metricsType == NANO_TIMING_METRIC) {
+duration => Utils.msDurationToString(duration / 10)
--- End diff --

Sorry...Sorry for this, change it to `1000 / 1000` as other place do for 
safety.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23207
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239049121
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala 
---
@@ -78,6 +78,7 @@ object SQLMetrics {
   private val SUM_METRIC = "sum"
   private val SIZE_METRIC = "size"
   private val TIMING_METRIC = "timing"
+  private val NANO_TIMING_METRIC = "nanosecond"
--- End diff --

Done in cf35b9f.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23207
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5767/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239049030
  
--- Diff: core/src/main/scala/org/apache/spark/shuffle/metrics.scala ---
@@ -50,3 +50,57 @@ private[spark] trait ShuffleWriteMetricsReporter {
   private[spark] def decBytesWritten(v: Long): Unit
   private[spark] def decRecordsWritten(v: Long): Unit
 }
+
+
+/**
+ * A proxy class of ShuffleWriteMetricsReporter which proxy all metrics 
updating to the input
+ * reporters.
+ */
+private[spark] class GroupedShuffleWriteMetricsReporter(
+reporters: Seq[ShuffleWriteMetricsReporter]) extends 
ShuffleWriteMetricsReporter {
+  override private[spark] def incBytesWritten(v: Long): Unit = {
+reporters.foreach(_.incBytesWritten(v))
+  }
+  override private[spark] def decRecordsWritten(v: Long): Unit = {
+reporters.foreach(_.decRecordsWritten(v))
+  }
+  override private[spark] def incRecordsWritten(v: Long): Unit = {
+reporters.foreach(_.incRecordsWritten(v))
+  }
+  override private[spark] def incWriteTime(v: Long): Unit = {
+reporters.foreach(_.incWriteTime(v))
+  }
+  override private[spark] def decBytesWritten(v: Long): Unit = {
+reporters.foreach(_.decBytesWritten(v))
+  }
+}
+
+
+/**
+ * A proxy class of ShuffleReadMetricsReporter which proxy all metrics 
updating to the input
+ * reporters.
+ */
+private[spark] class GroupedShuffleReadMetricsReporter(
--- End diff --

Got it, thanks for your guidance, revert to old approach and just little 
changes for `SQLShuffleReadMetricsReporter` which followed 
https://github.com/apache/spark/pull/23147.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239048356
  
--- Diff: core/src/main/scala/org/apache/spark/shuffle/metrics.scala ---
@@ -50,3 +50,57 @@ private[spark] trait ShuffleWriteMetricsReporter {
   private[spark] def decBytesWritten(v: Long): Unit
   private[spark] def decRecordsWritten(v: Long): Unit
 }
+
+
+/**
+ * A proxy class of ShuffleWriteMetricsReporter which proxy all metrics 
updating to the input
+ * reporters.
+ */
+private[spark] class GroupedShuffleWriteMetricsReporter(
--- End diff --

Thanks for your guidance Reynold and Wenchen, I choose the second 
implementation, it takes account of both less heavy option and similar use 
patten as `SQLShuffleReadMetricsReporter`. Done in cf35b9f.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23207
  
**[Test build #99722 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99722/testReport)**
 for PR 23207 at commit 
[`cf35b9f`](https://github.com/apache/spark/commit/cf35b9f948f174a5726a7feba611224c4ac495e7).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22683
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22683
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99700/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22683
  
**[Test build #99700 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99700/testReport)**
 for PR 22683 at commit 
[`8cc05a5`](https://github.com/apache/spark/commit/8cc05a57e8ecaa3e2a2f67d125b12645bb4eb3a2).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23235
  
**[Test build #99720 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99720/testReport)**
 for PR 23235 at commit 
[`8c115f7`](https://github.com/apache/spark/commit/8c115f7871d4db66b13ee21ea3a1231f7153791e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99720/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23222
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99701/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23222
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23228
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99703/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23228
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23222
  
**[Test build #99701 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99701/testReport)**
 for PR 23222 at commit 
[`1270e89`](https://github.com/apache/spark/commit/1270e89026d80c862137c03edbeee53e56f3ed6d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23228
  
**[Test build #99703 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99703/testReport)**
 for PR 23228 at commit 
[`d5dadbf`](https://github.com/apache/spark/commit/d5dadbf30d5429c36ec3d5c2845a71c2717fd6f3).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5765/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99721 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99721/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5766/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/23235
  
@cloud-fan Please, have a look at the PR.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23235
  
**[Test build #99720 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99720/testReport)**
 for PR 23235 at commit 
[`8c115f7`](https://github.com/apache/spark/commit/8c115f7871d4db66b13ee21ea3a1231f7153791e).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
cc @BryanCutler and @viirya 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial resul...

2018-12-05 Thread MaxGekk
GitHub user MaxGekk opened a pull request:

https://github.com/apache/spark/pull/23235

[SPARK-26151][SQL][FOLLOWUP] Return partial results for bad CSV records

## What changes were proposed in this pull request?

Updated SQL migration guide according to changes in 
https://github.com/apache/spark/pull/23120

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/MaxGekk/spark-1 
failuresafe-partial-result-followup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23235.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23235


commit 8c115f7871d4db66b13ee21ea3a1231f7153791e
Author: Maxim Gekk 
Date:   2018-12-05T12:13:26Z

Updating the migration guide




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
(cc @squito as well since it's from #23111)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23236
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23236: [SPARK-26275][PYTHON][ML] Increases timeout for S...

2018-12-05 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/23236

[SPARK-26275][PYTHON][ML] Increases timeout for 
StreamingLogisticRegressionWithSGDTests.test_training_and_prediction test

## What changes were proposed in this pull request?

Looks this test is flaky


https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99704/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99569/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99644/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99548/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99454/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99609/console

```
==
FAIL: test_training_and_prediction 
(pyspark.mllib.tests.test_streaming_algorithms.StreamingLogisticRegressionWithSGDTests)
Test that the model improves on toy data with no. of batches
--
Traceback (most recent call last):
  File 
"/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py",
 line 367, in test_training_and_prediction
self._eventually(condition)
  File 
"/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py",
 line 78, in _eventually
% (timeout, lastValue))
AssertionError: Test failed due to timeout after 30 sec, with last 
condition returning: Latest errors: 0.67, 0.71, 0.78, 0.7, 0.75, 0.74, 0.73, 
0.69, 0.62, 0.71, 0.69, 0.75, 0.72, 0.77, 0.71, 0.74

--
Ran 13 tests in 185.051s

FAILED (failures=1, skipped=1)
```

This looks happening after increasing the parallelism in Jenkins to speed 
up at https://github.com/apache/spark/pull/23111. I am able to reproduce this 
manually when the resource usage is heavy (with manual decrease of timeout).

## How was this patch tested?

Manually tested by 

```
cd python
./run-tests --testnames 'pyspark.mllib.tests.test_streaming_algorithms 
StreamingLogisticRegressionWithSGDTests.test_training_and_prediction' 
--python-executables=python
```


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-26275

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23236


commit 3c4ee75c4d0585702cd87cc4df9af74e235bb431
Author: Hyukjin Kwon 
Date:   2018-12-05T12:17:21Z

Increases timeout for 
StreamingLogisticRegressionWithSGDTests.test_training_and_prediction test




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23235: [SPARK-26151][SQL][FOLLOWUP] Return partial results for ...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23235
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23120: [SPARK-26151][SQL] Return partial results for bad CSV re...

2018-12-05 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/23120
  
The PR https://github.com/apache/spark/pull/23235 updates the sql migration 
guide


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5764/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23227
  
**[Test build #99719 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99719/testReport)**
 for PR 23227 at commit 
[`5cb416d`](https://github.com/apache/spark/commit/5cb416df5f03b0d750c83e1a8a344b8ea44b1735).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99704/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23227
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23227
  
**[Test build #99704 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99704/testReport)**
 for PR 23227 at commit 
[`5cb416d`](https://github.com/apache/spark/commit/5cb416df5f03b0d750c83e1a8a344b8ea44b1735).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23232
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5763/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23232
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23233
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5762/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23233
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when enco...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23234
  
**[Test build #99718 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99718/testReport)**
 for PR 23234 at commit 
[`930c510`](https://github.com/apache/spark/commit/930c51029b845c74357305e7ec30a4f2e6ea748a).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when enco...

2018-12-05 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23233
  
**[Test build #99717 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99717/testReport)**
 for PR 23233 at commit 
[`a1e7744`](https://github.com/apache/spark/commit/a1e77445c2675137fbcddf73181c47469f159dbf).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23234
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5761/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23234
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23210: [SPARK-26233][SQL] CheckOverflow when encoding a decimal...

2018-12-05 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/23210
  
thanks @cloud-fan  @dongjoon-hyun, I created the PRs for the backports.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



<    1   2   3   4   5   6   7   >