date:20181205

[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23231
  
**[Test build #99712 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99712/testReport)**
 for PR 23231 at commit 
[`453d60f`](https://github.com/apache/spark/commit/453d60f42b99de621a7ee3fab6bc6138fc20ed05).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23207
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5776/
Test PASSed.


---


[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23207
  
**[Test build #99736 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99736/testReport)**
 for PR 23207 at commit 
[`76d1ca0`](https://github.com/apache/spark/commit/76d1ca0036bbb50a005e9d12f8b22bf21697af7f).


---


[GitHub] spark issue #23207: [SPARK-26193][SQL] Implement shuffle write metrics in SQ...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23207
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23238
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5775/
Test PASSed.


---


[GitHub] spark issue #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23238
  
**[Test build #99732 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99732/testReport)**
 for PR 23238 at commit 
[`5bbcf41`](https://github.com/apache/spark/commit/5bbcf41f34f2ca160da7ef4ebe4c54d15a2d09b5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23238
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99732/
Test PASSed.


---


[GitHub] spark issue #23223: [SPARK-26269][YARN]Yarnallocator should have same blackl...

2018-12-05 Thread tgravescs

Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/23223
  
The approach here makes sense. Are you seeing actual issues where this 
blacklists a node when it shouldn't? I could see that being possible, and if so we 
should change this to a defect and make sure it goes into 2.4.1.


---


[GitHub] spark pull request #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-12-05 Thread stczwd

Github user stczwd commented on a diff in the pull request:

https://github.com/apache/spark/pull/22575#discussion_r239113033
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -631,6 +631,33 @@ object SQLConf {
 .intConf
 .createWithDefault(200)
 
+  val SQLSTREAM_WATERMARK_ENABLE = buildConf("spark.sqlstreaming.watermark.enable")
+.doc("Whether use watermark in sqlstreaming.")
+.booleanConf
+.createWithDefault(false)
+
+  val SQLSTREAM_OUTPUTMODE = buildConf("spark.sqlstreaming.outputMode")
+.doc("The output mode used in sqlstreaming")
+.stringConf
+.createWithDefault("append")
+
+  val SQLSTREAM_TRIGGER = buildConf("spark.sqlstreaming.trigger")
--- End diff --

I don't think there are any problems with this. SQLStreaming uses a 
Command to run the streaming query, which is similar to InsertIntoHiveTable. 
In addition, currently an application can only run one streaming SQL query. 
Therefore, the batch SQL and streaming SQL solution is expected.


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23196
  
**[Test build #99734 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99734/testReport)**
 for PR 23196 at commit 
[`07fcf46`](https://github.com/apache/spark/commit/07fcf4666a96928c8096db7a131e6514013679f0).


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23159
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread MaxGekk

Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/23196
  
jenkins, retest this, please


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23159
  
**[Test build #99735 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99735/testReport)**
 for PR 23159 at commit 
[`e0aa626`](https://github.com/apache/spark/commit/e0aa626c886976489348a6c0179d160bbe3252da).


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23159
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5774/
Test PASSed.


---


[GitHub] spark pull request #23223: [SPARK-26269][YARN]Yarnallocator should have same...

2018-12-05 Thread tgravescs

Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/23223#discussion_r239110361
  
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -612,11 +612,14 @@ private[yarn] class YarnAllocator(
 val message = "Container killed by YARN for exceeding physical memory limits. " +
   s"$diag Consider boosting ${EXECUTOR_MEMORY_OVERHEAD.key}."
 (true, message)
+  case exit_status if NOT_APP_AND_SYSTEM_FAULT_EXIT_STATUS.contains(exit_status) =>
--- End diff --

Yeah, I agree this should be cleaned up; we already handle cases above that 
are in the NOT_APP_AND_SYSTEM_FAULT_EXIT_STATUS set.
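
A minimal sketch of why the ordering matters, assuming made-up exit codes and a made-up set (the names `PreemptedCode`, `NotAppOrSystemFault`, and so on are hypothetical and are not the values used in `YarnAllocator`). Scala tries `match` cases top to bottom, so a guard on set membership placed after specific cases only ever sees the members that no earlier case has claimed:

```scala
object ExitStatusMatch {
  // Hypothetical exit codes, chosen for illustration only.
  val PreemptedCode = -102
  val PmemExceededCode = -104
  val AbortedCode = -100
  val DisksFailedCode = -101

  // Hypothetical stand-in for NOT_APP_AND_SYSTEM_FAULT_EXIT_STATUS.
  // It deliberately overlaps with a code that has its own case below.
  val NotAppOrSystemFault: Set[Int] = Set(PreemptedCode, AbortedCode, DisksFailedCode)

  /** Returns (exitCausedByApp, message) for a completed container. */
  def classify(exitStatus: Int): (Boolean, String) = exitStatus match {
    case PreemptedCode =>
      (false, "preempted")
    case PmemExceededCode =>
      (true, "exceeded physical memory limits")
    // Cases are tried top to bottom, so this guard only ever sees
    // members of the set that no earlier case has already matched.
    case other if NotAppOrSystemFault.contains(other) =>
      (false, s"not caused by the app or the system (status $other)")
    case other =>
      (true, s"exited with status $other")
  }
}
```

In this sketch `PreemptedCode` is in the set but is always handled by its own case, which is the kind of overlap the comment above suggests cleaning up.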


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99714/
Test PASSed.


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23196
  
**[Test build #99714 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99714/testReport)**
 for PR 23196 at commit 
[`07fcf46`](https://github.com/apache/spark/commit/07fcf4666a96928c8096db7a131e6514013679f0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23223: [SPARK-26269][YARN]Yarnallocator should have same blackl...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23223
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23238
  
**[Test build #99732 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99732/testReport)**
 for PR 23238 at commit 
[`5bbcf41`](https://github.com/apache/spark/commit/5bbcf41f34f2ca160da7ef4ebe4c54d15a2d09b5).


---


[GitHub] spark issue #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23238
  
Can one of the admins verify this patch?


---


[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22957
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23223: [SPARK-26269][YARN]Yarnallocator should have same blackl...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23223
  
**[Test build #99733 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99733/testReport)**
 for PR 23223 at commit 
[`65a70dc`](https://github.com/apache/spark/commit/65a70dcbb7993731104deab2592a5b969a31414e).


---


[GitHub] spark issue #23223: [SPARK-26269][YARN]Yarnallocator should have same blackl...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23223
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5773/
Test PASSed.


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread MaxGekk

Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/23159
  
jenkins, retest this, please


---


[GitHub] spark issue #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23238
  
Can one of the admins verify this patch?


---


[GitHub] spark issue #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23233
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-12-05 Thread stczwd

Github user stczwd commented on a diff in the pull request:

https://github.com/apache/spark/pull/22575#discussion_r239109280
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/SQLStreamingSink.scala ---
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming
+
+import java.util.concurrent.TimeUnit
+
+import org.apache.spark.sql._
+import org.apache.spark.sql.catalyst.catalog.CatalogTable
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.streaming.InternalOutputModes
+import org.apache.spark.sql.execution.command.RunnableCommand
+import org.apache.spark.sql.execution.datasources.DataSource
+import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils
+import org.apache.spark.sql.sources.v2.StreamingWriteSupportProvider
+import org.apache.spark.sql.streaming.Trigger
+import org.apache.spark.util.Utils
+
+/**
+ * The basic RunnableCommand for SQLStreaming, using Command.run to start a streaming query.
+ *
+ * @param sparkSession
+ * @param extraOptions
+ * @param partitionColumnNames
+ * @param child
+ */
+case class SQLStreamingSink(sparkSession: SparkSession,
+table: CatalogTable,
+child: LogicalPlan)
+  extends RunnableCommand {
+
+  private val sqlConf = sparkSession.sqlContext.conf
+
+  /**
+   * The given column name may not be equal to any of the existing column names if we were in
+   * case-insensitive context. Normalize the given column name to the real one so that we don't
+   * need to care about case sensitivity afterwards.
+   */
+  private def normalize(df: DataFrame, columnName: String, columnType: String): String = {
+val validColumnNames = df.logicalPlan.output.map(_.name)
+validColumnNames.find(sparkSession.sessionState.analyzer.resolver(_, columnName))
+  .getOrElse(throw new AnalysisException(s"$columnType column $columnName not found in " +
+s"existing columns (${validColumnNames.mkString(", ")})"))
+  }
+
+  /**
+   * Parse spark.sqlstreaming.trigger.seconds to Trigger
+   */
+  private def parseTrigger(): Trigger = {
+val trigger = Utils.timeStringAsMs(sqlConf.sqlStreamTrigger)
+Trigger.ProcessingTime(trigger, TimeUnit.MICROSECONDS)
--- End diff --

Yeah, I will change it to milliseconds.
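
To illustrate the unit mismatch discussed here: a value computed in milliseconds but tagged as `TimeUnit.MICROSECONDS` yields an interval 1000 times shorter than intended. The sketch below uses only `java.util.concurrent.TimeUnit`; `intervalMs` is a hypothetical, deliberately tiny parser and is not Spark's `Utils.timeStringAsMs`.

```scala
import java.util.concurrent.TimeUnit

object TriggerUnits {
  // Hypothetical stand-in for a time-string parser: accepts only "<n>s" or "<n>ms"
  // and returns milliseconds. It is not Spark's parser.
  def intervalMs(s: String): Long = {
    val t = s.trim.toLowerCase
    if (t.endsWith("ms")) t.dropRight(2).trim.toLong
    else if (t.endsWith("s")) TimeUnit.SECONDS.toMillis(t.dropRight(1).trim.toLong)
    else throw new IllegalArgumentException(s"Unsupported interval: $s")
  }

  // The effective interval, in milliseconds, when `value` is tagged with `unit`.
  def effectiveMs(value: Long, unit: TimeUnit): Long = unit.toMillis(value)
}
```

With these definitions, `intervalMs("10s")` is 10000; tagging that number as `TimeUnit.MICROSECONDS` gives an effective interval of 10 ms, while `TimeUnit.MILLISECONDS` keeps the intended 10000 ms.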


---


[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22957
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99713/
Test PASSed.


---


[GitHub] spark issue #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when enco...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23233
  
**[Test build #99717 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99717/testReport)**
 for PR 23233 at commit 
[`a1e7744`](https://github.com/apache/spark/commit/a1e77445c2675137fbcddf73181c47469f159dbf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23223: [SPARK-26269][YARN]Yarnallocator should have same blackl...

2018-12-05 Thread tgravescs

Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/23223
  
ok to test


---


[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22957
  
**[Test build #99713 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99713/testReport)**
 for PR 22957 at commit 
[`e4f617f`](https://github.com/apache/spark/commit/e4f617fc7e47d7c49f3d773ac2d91c5508c0a239).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23233
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99717/
Test PASSed.


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23159
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5772/
Test PASSed.


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23159
  
**[Test build #99715 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99715/testReport)**
 for PR 23159 at commit 
[`e0aa626`](https://github.com/apache/spark/commit/e0aa626c886976489348a6c0179d160bbe3252da).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object...

2018-12-05 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23227


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23159
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99715/
Test FAILed.


---


[GitHub] spark pull request #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc fo...

2018-12-05 Thread seancxmao

GitHub user seancxmao opened a pull request:

https://github.com/apache/spark/pull/23238

[SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-insensitive field 
resolution when reading from Parquet

## What changes were proposed in this pull request?
#22148 introduces a behavior change. According to the discussion at #22184, 
this PR updates the migration guide for upgrading from Spark 2.3 to 2.4.
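
As a rough illustration of what case-insensitive field resolution can mean, here is a sketch that is not Spark's implementation. The object `FieldResolution`, its `resolve` signature, and in particular the choice to fail on ambiguous matches are assumptions made for this example; the text above only says that resolution is case-insensitive when reading from Parquet.

```scala
object FieldResolution {
  /**
   * Resolves `requested` against the field names found in a file schema.
   * Returns None when nothing matches; fails when a case-insensitive
   * lookup matches more than one field (an assumed policy).
   */
  def resolve(fileFields: Seq[String], requested: String, caseSensitive: Boolean): Option[String] = {
    if (caseSensitive) {
      fileFields.find(_ == requested)
    } else {
      fileFields.filter(_.equalsIgnoreCase(requested)) match {
        case Seq() => None
        case Seq(single) => Some(single)
        case many => throw new IllegalArgumentException(
          s"Ambiguous field '$requested': matches ${many.mkString(", ")}")
      }
    }
  }
}
```

For example, `resolve(Seq("ID", "name"), "id", caseSensitive = false)` finds `ID`, while the same call with `caseSensitive = true` finds nothing.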

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/seancxmao/spark SPARK-25132-doc-2.4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23238.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23238


commit 5bbcf41f34f2ca160da7ef4ebe4c54d15a2d09b5
Author: seancxmao 
Date:   2018-12-05T15:05:38Z

[SPARK-25132][SQL][FOLLOWUP] Update migration doc for case-insensitive 
field resolution when reading from Parquet




---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23227
  
**[Test build #99731 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99731/testReport)**
 for PR 23227 at commit 
[`5cb416d`](https://github.com/apache/spark/commit/5cb416df5f03b0d750c83e1a8a344b8ea44b1735).


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23227
  
thanks, merging to master!


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Merged build finished. Test FAILed.


---


[GitHub] spark pull request #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as a...

2018-12-05 Thread viirya

Github user viirya closed the pull request at:

https://github.com/apache/spark/pull/23231


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23231
  
Then let me close this now.


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread heary-cao

Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/23227
  
retest this please


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99719/
Test FAILed.


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23227
  
**[Test build #99719 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99719/testReport)**
 for PR 23227 at commit 
[`5cb416d`](https://github.com/apache/spark/commit/5cb416df5f03b0d750c83e1a8a344b8ea44b1735).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23231
  
OK. Maybe we can add a few words to the ML migration guide to clearly announce 
this.


---


[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23213
  
`wholeStage=false, factoryMode=CODEGEN_ONLY` and `wholeStage=false, 
factoryMode=NO_CODEGEN` should have more complete test coverage for 
`GenerateUnsafeProjection`, `GenerateMutableProjection`, etc.
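
A sketch of how such a matrix of config sets could be enumerated so that a test suite runs once per combination. The object `CodegenMatrix` is hypothetical, and the key names and mode spellings (`CODEGEN_ONLY`, `NO_CODEGEN`) are assumptions to check against `SQLConf` rather than a statement of the PR's actual code.

```scala
object CodegenMatrix {
  // Key names and values are assumptions for illustration; check them against SQLConf.
  val WholeStageKey = "spark.sql.codegen.wholeStage"
  val FactoryModeKey = "spark.sql.codegen.factoryMode"

  val wholeStageValues: Seq[String] = Seq("true", "false")
  val factoryModeValues: Seq[String] = Seq("CODEGEN_ONLY", "NO_CODEGEN")

  /** Every (wholeStage, factoryMode) pair as a map of config overrides. */
  def configSets: Seq[Map[String, String]] =
    for {
      ws <- wholeStageValues
      fm <- factoryModeValues
    } yield Map(WholeStageKey -> ws, FactoryModeKey -> fm)
}
```

A runner would then apply each map in `configSets` before executing the same set of queries, which gives four runs for the two-by-two matrix sketched here.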




---


[GitHub] spark issue #23163: [SPARK-26164][SQL] Allow FileFormatWriter to write multi...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23163
  
Build finished. Test PASSed.


---


[GitHub] spark issue #23163: [SPARK-26164][SQL] Allow FileFormatWriter to write multi...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23163
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99708/
Test PASSed.


---


[GitHub] spark issue #23163: [SPARK-26164][SQL] Allow FileFormatWriter to write multi...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23163
  
**[Test build #99708 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99708/testReport)**
 for PR 23163 at commit 
[`6cb993b`](https://github.com/apache/spark/commit/6cb993b26e6b6867b3315228b55624b98acf1dcb).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99707/
Test PASSed.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/23231
  
I'm not seeing it in the migration guide, maybe I'm missing it. In any 
event, I don't think we need to keep this for 3.0.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23231
  
**[Test build #99707 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99707/testReport)**
 for PR 23231 at commit 
[`1716071`](https://github.com/apache/spark/commit/17160710cadc49b54f4385ae3ca9ddb0eb4034b0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class OneHotEncoderEstimator @Since(\"2.3.0\") (@Since(\"2.3.0\") 
override val uid: String)`
  * `class OneHotEncoderEstimator(JavaEstimator, HasInputCols, 
HasOutputCols, HasHandleInvalid,`


---


[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239090244
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLShuffleMetricsReporter.scala
 ---
@@ -95,3 +96,59 @@ private[spark] object SQLShuffleMetricsReporter {
 FETCH_WAIT_TIME -> SQLMetrics.createTimingMetric(sc, "fetch wait 
time"),
 RECORDS_READ -> SQLMetrics.createMetric(sc, "records read"))
 }
+
+/**
+ * A shuffle write metrics reporter for SQL exchange operators. Different 
with
+ * [[SQLShuffleReadMetricsReporter]], we need a function of (reporter => 
reporter) set in
+ * shuffle dependency, so the local SQLMetric should transient and create 
on executor.
+ * @param metrics Shuffle write metrics in current SparkPlan.
+ * @param metricsReporter Other reporter need to be updated in this 
SQLShuffleWriteMetricsReporter.
+ */
+private[spark] case class SQLShuffleWriteMetricsReporter(
+metrics: Map[String, SQLMetric])(metricsReporter: 
ShuffleWriteMetricsReporter)
+  extends ShuffleWriteMetricsReporter with Serializable {
+  @transient private[this] lazy val _bytesWritten =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_BYTES_WRITTEN)
+  @transient private[this] lazy val _recordsWritten =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_RECORDS_WRITTEN)
+  @transient private[this] lazy val _writeTime =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_WRITE_TIME)
+
+  override private[spark] def incBytesWritten(v: Long): Unit = {
+metricsReporter.incBytesWritten(v)
+_bytesWritten.add(v)
+  }
+  override private[spark] def decRecordsWritten(v: Long): Unit = {
+metricsReporter.decRecordsWritten(v)
+_recordsWritten.set(_recordsWritten.value - v)
+  }
+  override private[spark] def incRecordsWritten(v: Long): Unit = {
+metricsReporter.incRecordsWritten(v)
+_recordsWritten.add(v)
+  }
+  override private[spark] def incWriteTime(v: Long): Unit = {
+metricsReporter.incWriteTime(v)
+_writeTime.add(v)
+  }
+  override private[spark] def decBytesWritten(v: Long): Unit = {
+metricsReporter.decBytesWritten(v)
+_bytesWritten.set(_bytesWritten.value - v)
+  }
+}
+
+private[spark] object SQLShuffleWriteMetricsReporter {
+  val SHUFFLE_BYTES_WRITTEN = "shuffleBytesWritten"
+  val SHUFFLE_RECORDS_WRITTEN = "shuffleRecordsWritten"
+  val SHUFFLE_WRITE_TIME = "shuffleWriteTime"
--- End diff --

cc @rxin , do you think we should change this metric to use ms as well? In 
all the places that read/write it.


---


[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23213
  
If we look at test coverage, `wholeStage=false, factoryMode=CODE_ONLY` will 
go through code paths that wholeStageCodegen doesn't cover. Or did I miss 
something? 


---


[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23213
  
But whole stage codegen will not test `GenerateUnsafeProject`, 
`GenerateMutableProject`, etc., right?


---


[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-12-05 Thread seancxmao

Github user seancxmao commented on the issue:

https://github.com/apache/spark/pull/22184
  
@srowen Sorry for the late reply! I'd like to close this PR and file a new 
one since our SQL doc has changed a lot. Thank you all for your comments and 
time!


---


[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-12-05 Thread seancxmao

Github user seancxmao closed the pull request at:

https://github.com/apache/spark/pull/22184


---


[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23222
  
Jenkins passes, which means the previously added end-to-end test cannot 
show the benefit of this rule. We should update it.


---


[GitHub] spark issue #23224: [SPARK-26277][SQL][TEST] WholeStageCodegen metrics shoul...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23224
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99711/
Test PASSed.


---


[GitHub] spark issue #23224: [SPARK-26277][SQL][TEST] WholeStageCodegen metrics shoul...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23224
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23237: [SPARK-26279][CORE] Remove unused method in Logging

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23237
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99730/
Test FAILed.


---


[GitHub] spark issue #23237: [SPARK-26279][CORE] Remove unused method in Logging

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23237
  
**[Test build #99730 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99730/testReport)**
 for PR 23237 at commit 
[`90b111f`](https://github.com/apache/spark/commit/90b111f900d8f11e4d730e0cfbe56a1683f96faa).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23237: [SPARK-26279][CORE] Remove unused method in Logging

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23237
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #23224: [SPARK-26277][SQL][TEST] WholeStageCodegen metrics shoul...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23224
  
**[Test build #99711 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99711/testReport)**
 for PR 23224 at commit 
[`021728c`](https://github.com/apache/spark/commit/021728ccc70cf971592c560cfc5492dedbdc362a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread jiangxb1987

Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/23222
  
Shall we add a SQL tag to the title?


---


[GitHub] spark issue #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when enco...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23234
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99718/
Test PASSed.


---


[GitHub] spark issue #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when enco...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23234
  
**[Test build #99718 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99718/testReport)**
 for PR 23234 at commit 
[`930c510`](https://github.com/apache/spark/commit/930c51029b845c74357305e7ec30a4f2e6ea748a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23231
  
This is because the ML migration guide claims that we will keep 
OneHotEncoderEstimator as an alias. I'm fine if we now have consensus that 
we can drop the alias.


---


[GitHub] spark issue #23237: [SPARK-26279][CORE] Remove unused method in Logging

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23237
  
**[Test build #99730 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99730/testReport)**
 for PR 23237 at commit 
[`90b111f`](https://github.com/apache/spark/commit/90b111f900d8f11e4d730e0cfbe56a1683f96faa).


---


[GitHub] spark issue #23237: [SPARK-26279][CORE] Remove unused method in Logging

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23237
  
Can one of the admins verify this patch?


---


[GitHub] spark pull request #23237: [SPARK-26279][CORE] Remove unused method in Loggi...

2018-12-05 Thread seancxmao

GitHub user seancxmao opened a pull request:

https://github.com/apache/spark/pull/23237

[SPARK-26279][CORE] Remove unused method in Logging

## What changes were proposed in this pull request?
The method `Logging.isTraceEnabled` is not used anywhere. We should remove 
it to avoid confusion.

## How was this patch tested?
Test locally with existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/seancxmao/spark clean-logging

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23237.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23237


commit 90b111f900d8f11e4d730e0cfbe56a1683f96faa
Author: seancxmao 
Date:   2018-12-05T14:07:49Z

[SPARK-26279][CORE] Remove unused methods in Logging




---


[GitHub] spark issue #23237: [SPARK-26279][CORE] Remove unused method in Logging

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23237
  
Can one of the admins verify this patch?


---


[GitHub] spark pull request #23223: [SPARK-26269][YARN]Yarnallocator should have same...

2018-12-05 Thread attilapiros

Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/23223#discussion_r239052799
  
--- Diff: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
 ---
@@ -612,11 +612,14 @@ private[yarn] class YarnAllocator(
 val message = "Container killed by YARN for exceeding physical 
memory limits. " +
   s"$diag Consider boosting ${EXECUTOR_MEMORY_OVERHEAD.key}."
 (true, message)
+  case exit_status if 
NOT_APP_AND_SYSTEM_FAULT_EXIT_STATUS.contains(exit_status) =>
+(true, "Container marked as failed: " + containerId + 
onHostStr +
--- End diff --

Nit: Use string interpolation. 
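
To illustrate the nit, here is a minimal sketch comparing concatenation with the `s` interpolator. The values of `containerId` and `onHostStr` are stand-in strings invented for this example, not the real YARN objects, and the remainder of the message is not visible in the diff, so only the visible prefix is reproduced.

```scala
object InterpolationSketch {
  // Stand-in values for illustration only; in YarnAllocator these come from YARN.
  val containerId = "container_01"
  val onHostStr = " on host: host1"

  // Concatenation, in the style of the diff under review.
  val concatenated: String = "Container marked as failed: " + containerId + onHostStr

  // The same message built with the s-interpolator.
  val interpolated: String = s"Container marked as failed: $containerId$onHostStr"

  def main(args: Array[String]): Unit = {
    println(interpolated)
  }
}
```

Both forms should produce the same string; the interpolated one just keeps the literal text and the variables in one place.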


---


[GitHub] spark pull request #23223: [SPARK-26269][YARN]Yarnallocator should have same...

2018-12-05 Thread attilapiros

Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/23223#discussion_r239059997
  
--- Diff: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
 ---
@@ -612,11 +612,14 @@ private[yarn] class YarnAllocator(
 val message = "Container killed by YARN for exceeding physical 
memory limits. " +
   s"$diag Consider boosting ${EXECUTOR_MEMORY_OVERHEAD.key}."
 (true, message)
+  case exit_status if 
NOT_APP_AND_SYSTEM_FAULT_EXIT_STATUS.contains(exit_status) =>
--- End diff --

I would prefer not to have it as a separate case but just a new `if` around 
`handleResourceAllocationFailure`. Since NOT_APP_AND_SYSTEM_FAULT_EXIT_STATUS 
is being introduced, it would make sense to separate it from the huge match on 
exitStatus. This way it would be easier to follow when it is really triggered 
(one would not have to check all the previous case branches and then consider 
this condition with contains). That way values like ContainerExitStatus.SUCCESS 
from the set would really be used. 
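
One possible reading of this suggestion, as a rough sketch only: the match classifies the exit status, and a single `if` guards the failure handling. Every name here (`ExitStatusSketch`, `classify`, `process`, `notAppAndSystemFault`) and every status code is a placeholder made up for the example; the real constants live in YARN's `ContainerExitStatus` and the real logic in `YarnAllocator`.

```scala
object ExitStatusSketch {
  // Placeholder codes for illustration; not the real ContainerExitStatus values.
  val Success = 0
  val Preempted = 1
  val ExceededPmem = 2

  // Hypothetical stand-in for NOT_APP_AND_SYSTEM_FAULT_EXIT_STATUS.
  val notAppAndSystemFault: Set[Int] = Set(Success, Preempted)

  // The match only classifies the exit status and builds the message.
  def classify(exitStatus: Int): (Boolean, String) = exitStatus match {
    case Success => (false, "Executor exited successfully.")
    case Preempted => (false, "Container was preempted.")
    case ExceededPmem => (true, "Container killed for exceeding physical memory limits.")
    case other => (true, s"Container exited with status $other.")
  }

  // The failure handling is guarded by one `if`, independent of the match order.
  def process(exitStatus: Int, handleFailure: () => Unit): (Boolean, String) = {
    val result = classify(exitStatus)
    if (!notAppAndSystemFault.contains(exitStatus)) {
      handleFailure()
    }
    result
  }
}
```

With this shape, a reader can see when the handler fires without walking through every earlier case branch.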


---


[GitHub] spark pull request #23223: [SPARK-26269][YARN]Yarnallocator should have same...

2018-12-05 Thread attilapiros

Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/23223#discussion_r239070925
  
--- Diff: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
 ---
@@ -114,13 +116,20 @@ class YarnAllocatorSuite extends SparkFunSuite with 
Matchers with BeforeAndAfter
   clock)
   }
 
-  def createContainer(host: String, resource: Resource = 
containerResource): Container = {
-val containerId = ContainerId.newContainerId(appAttemptId, 
containerNum)
+  def createContainer(
+  host: String,
+  containerId: ContainerId = ContainerId.newContainerId(appAttemptId, 
containerNum),
--- End diff --

Just containerNumber as parameter with default value of containerNum?
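
A minimal sketch of what that signature might look like. `ContainerId`, `Container`, and `Suite` below are simplified stand-ins invented for this example, not the YARN or test-suite types, and the counter increment is an assumption about how the suite behaves.

```scala
object DefaultParamSketch {
  // Simplified, hypothetical stand-ins for the YARN types used in the suite.
  final case class ContainerId(appAttempt: String, number: Int)
  final case class Container(id: ContainerId, host: String)

  class Suite {
    val appAttemptId = "attempt_1"
    var containerNum = 0

    // The default argument reads `containerNum` at call time, so callers that
    // omit it pick up the current counter value.
    def createContainer(host: String, containerNumber: Int = containerNum): Container = {
      val containerId = ContainerId(appAttemptId, containerNumber)
      containerNum += 1 // assumed: the suite advances its counter per container
      Container(containerId, host)
    }
  }
}
```

Callers that need a specific ID pass just the number, and the construction of the ID stays in one place.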


---


[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23228
  
**[Test build #4453 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4453/testReport)**
 for PR 23228 at commit 
[`d5dadbf`](https://github.com/apache/spark/commit/d5dadbf30d5429c36ec3d5c2845a71c2717fd6f3).


---


[GitHub] spark issue #23229: [MINOR][CORE] Modify some field name because it may be c...

2018-12-05 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/23229
  
Agree, this isn't worthwhile.


---


[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/23213
  
yea, I think they're not totally the same..., but I'm not sure that the 
test run (`wholeStage=false, factoryMode=CODE_ONLY`) is worth the time cost.


---


[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239069014
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLShuffleMetricsReporter.scala
 ---
@@ -95,3 +96,59 @@ private[spark] object SQLShuffleMetricsReporter {
 FETCH_WAIT_TIME -> SQLMetrics.createTimingMetric(sc, "fetch wait 
time"),
 RECORDS_READ -> SQLMetrics.createMetric(sc, "records read"))
 }
+
+/**
+ * A shuffle write metrics reporter for SQL exchange operators. Different 
with
+ * [[SQLShuffleReadMetricsReporter]], we need a function of (reporter => 
reporter) set in
+ * shuffle dependency, so the local SQLMetric should transient and create 
on executor.
+ * @param metrics Shuffle write metrics in current SparkPlan.
+ * @param metricsReporter Other reporter need to be updated in this 
SQLShuffleWriteMetricsReporter.
+ */
+private[spark] case class SQLShuffleWriteMetricsReporter(
+metrics: Map[String, SQLMetric])(metricsReporter: 
ShuffleWriteMetricsReporter)
+  extends ShuffleWriteMetricsReporter with Serializable {
+  @transient private[this] lazy val _bytesWritten =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_BYTES_WRITTEN)
+  @transient private[this] lazy val _recordsWritten =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_RECORDS_WRITTEN)
+  @transient private[this] lazy val _writeTime =
+metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_WRITE_TIME)
+
+  override private[spark] def incBytesWritten(v: Long): Unit = {
+metricsReporter.incBytesWritten(v)
+_bytesWritten.add(v)
+  }
+  override private[spark] def decRecordsWritten(v: Long): Unit = {
+metricsReporter.decRecordsWritten(v)
+_recordsWritten.set(_recordsWritten.value - v)
+  }
+  override private[spark] def incRecordsWritten(v: Long): Unit = {
+metricsReporter.incRecordsWritten(v)
+_recordsWritten.add(v)
+  }
+  override private[spark] def incWriteTime(v: Long): Unit = {
+metricsReporter.incWriteTime(v)
+_writeTime.add(v)
+  }
+  override private[spark] def decBytesWritten(v: Long): Unit = {
+metricsReporter.decBytesWritten(v)
+_bytesWritten.set(_bytesWritten.value - v)
+  }
+}
+
+private[spark] object SQLShuffleWriteMetricsReporter {
+  val SHUFFLE_BYTES_WRITTEN = "shuffleBytesWritten"
+  val SHUFFLE_RECORDS_WRITTEN = "shuffleRecordsWritten"
+  val SHUFFLE_WRITE_TIME = "shuffleWriteTime"
--- End diff --

Only the shuffle write time is affected in this PR. The remaining time metric 
is `fetch wait time`, which is set in ms in `ShuffleBlockFetcherIterator`.


---


[GitHub] spark pull request #23196: [SPARK-26243][SQL] Use java.time API for parsing ...

2018-12-05 Thread srowen

Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/23196#discussion_r239068840
  
--- Diff: 
sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
 ---
@@ -49,8 +49,8 @@ class HiveCompatibilitySuite extends HiveQueryFileTest 
with BeforeAndAfter {
   override def beforeAll() {
 super.beforeAll()
 TestHive.setCacheTables(true)
-// Timezone is fixed to America/Los_Angeles for those timezone 
sensitive tests (timestamp_*)
-TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))
+// Timezone is fixed to GMT for those timezone sensitive tests 
(timestamp_*)
--- End diff --

I think consistency is indeed a problem, but why disable the new parser, 
rather than make this consistent? I haven't looked into whether there's a good 
reason they behave differently but suspect not.
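
As background on why these tests are timezone sensitive, here is a small sketch using only `java.time`. It is not the parser from the PR; it only shows that a timestamp string without zone information maps to different instants depending on the zone it is interpreted in. The object name and sample timestamp are chosen for illustration.

```scala
import java.time.{Duration, Instant, LocalDateTime, ZoneId}

object TimeZoneSketch {
  // A timestamp string with no zone information.
  val local: LocalDateTime = LocalDateTime.parse("2018-12-05T00:00:00")

  // Interpreting the same wall-clock time in two zones yields different instants.
  val inGmt: Instant = local.atZone(ZoneId.of("GMT")).toInstant
  val inLosAngeles: Instant = local.atZone(ZoneId.of("America/Los_Angeles")).toInstant

  // Los Angeles is 8 hours behind GMT in December (standard time).
  val gap: Duration = Duration.between(inGmt, inLosAngeles)

  def main(args: Array[String]): Unit = println(gap)
}
```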


---


[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-05 Thread xuanyuanking

Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r239067552
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala 
---
@@ -163,6 +171,8 @@ object SQLMetrics {
 Utils.bytesToString
   } else if (metricsType == TIMING_METRIC) {
 Utils.msDurationToString
+  } else if (metricsType == NS_TIMING_METRIC) {
+duration => Utils.msDurationToString(duration / 1000 / 1000)
--- End diff --

Maybe it's OK; I tested this locally with the UT in SQLMetricsSuites, and the 
result is below:
```
shuffle records written: 2
shuffle write time total (min, med, max): 37 ms (37 ms, 37 ms, 37 ms)
shuffle bytes written total (min, med, max): 66.0 B (66.0 B, 66.0 B, 66.0 
```
In real scenarios the shuffle bytes written will be much larger, so keeping 
the time in ms may be enough. WDYT?
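
For reference, a small sketch of the nanosecond-to-millisecond conversion used in the diff. The object and method names are made up for the example; only the `duration / 1000 / 1000` expression comes from the PR. It shows that integer division truncates, so anything under 1 ms is reported as 0 ms.

```scala
import java.util.concurrent.TimeUnit

object NanosToMillisSketch {
  // Integer division, as in the diff: truncates toward zero.
  def nanosToMillis(durationNs: Long): Long = durationNs / 1000 / 1000

  // Equivalent conversion using the JDK helper.
  def nanosToMillisJdk(durationNs: Long): Long = TimeUnit.NANOSECONDS.toMillis(durationNs)

  def main(args: Array[String]): Unit = {
    println(nanosToMillis(37000000L)) // 37 ms expressed in nanoseconds
    println(nanosToMillis(999999L))   // just under 1 ms
  }
}
```

Whether that loss of sub-millisecond detail matters depends on how small the measured write times are in practice, which is the question raised above.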


---


[GitHub] spark issue #23225: [MINOR][CORE]Don't need to create an empty spill file wh...

2018-12-05 Thread wangjiaochun

Github user wangjiaochun commented on the issue:

https://github.com/apache/spark/pull/23225
  
1. I think the test case writeEmptyIterator in UnsafeShuffleWriterSuite.java 
covers this scenario.
2. I will file a JIRA soon.


---


[GitHub] spark issue #23218: [SPARK-26266][BUILD] Update to Scala 2.12.8

2018-12-05 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/23218
  
Ah OK, so all of them were a JVM crash. It would probably be a good idea to 
update the JVM on all the workers as _60 is over 3 years old. It's probably not 
as simple as it sounds but WDYT @shaneknapp ?


---


[GitHub] spark pull request #23226: [MINOR][TEST] Add MAXIMUM_PAGE_SIZE_BYTES Excepti...

2018-12-05 Thread wangjiaochun

Github user wangjiaochun commented on a diff in the pull request:

https://github.com/apache/spark/pull/23226#discussion_r239066440
  
--- Diff: 
core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java
 ---
@@ -622,6 +622,17 @@ public void initialCapacityBoundsChecking() {
 } catch (IllegalArgumentException e) {
   // expected exception
 }
+
+try {
+  new BytesToBytesMap(
+  taskMemoryManager,
--- End diff --

OK, I will correct this indentation and file a JIRA.


---


[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99729/
Test PASSed.


---


[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99729 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99729/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-05 Thread gaborgsomogyi

Github user gaborgsomogyi commented on the issue:

https://github.com/apache/spark/pull/22952
  
@HeartSaVioR What counts as "not a big deal" is debatable; I've seen a glob 
request take ~1 hour when a huge number of files were stored :)
If file moves are even worse, that's one more reason to move it to a separate thread.



---


[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99727/
Test PASSed.


---


[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23236
  
**[Test build #99727 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99727/testReport)**
 for PR 23236 at commit 
[`3c4ee75`](https://github.com/apache/spark/commit/3c4ee75c4d0585702cd87cc4df9af74e235bb431).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99726/
Test PASSed.


---


[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23236
  
Merged build finished. Test PASSed.


---

