[GitHub] spark issue #20892: [SPARK-23700][PYTHON] Cleanup imports in pyspark.sql

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20892
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20892: [SPARK-23700][PYTHON] Cleanup imports in pyspark....

2018-03-23 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request:

https://github.com/apache/spark/pull/20892#discussion_r176829827
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -28,10 +27,10 @@
 
 from pyspark import since, SparkContext
 from pyspark.rdd import ignore_unicode_prefix, PythonEvalType
-from pyspark.serializers import PickleSerializer, AutoBatchedSerializer
 from pyspark.sql.column import Column, _to_java_column, _to_seq
 from pyspark.sql.dataframe import DataFrame
 from pyspark.sql.types import StringType, DataType
+# Keep UserDefinedFunction import for backwards compatible import; moved 
in SPARK-22409
 from pyspark.sql.udf import UserDefinedFunction, _create_udf
--- End diff --

Not sure if there is a better way to do this other than importing 
`UserDefinedFunction` here, but hopefully the note will show the intent.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20892: [SPARK-23700][PYTHON] Cleanup imports in pyspark.sql

2018-03-23 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/20892
  
@HyukjinKwon and @ueshin I tried to be pretty conservative and only remove 
imports that were obviously not being used.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet st...

2018-03-23 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18982#discussion_r176828963
  
--- Diff: python/pyspark/ml/wrapper.py ---
@@ -118,11 +118,18 @@ def _transfer_params_to_java(self):
 """
 Transforms the embedded params to the companion Java object.
 """
-paramMap = self.extractParamMap()
+pair_defaults = []
 for param in self.params:
-if param in paramMap:
-pair = self._make_java_param_pair(param, paramMap[param])
+if self.isSet(param):
+pair = self._make_java_param_pair(param, 
self._paramMap[param])
 self._java_obj.set(pair)
+if self.hasDefault(param):
+pair = self._make_java_param_pair(param, 
self._defaultParamMap[param])
+pair_defaults.append(pair)
+if len(pair_defaults) > 0:
+sc = SparkContext._active_spark_context
+pair_defaults_seq = sc._jvm.PythonUtils.toSeq(pair_defaults)
+self._java_obj.setDefault(pair_defaults_seq)
--- End diff --

I think this is reasonable, a few extra lines to avoid potential unwanted 
user surprise is worth it.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to Spark ...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19876
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to Spark ...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19876
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1727/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20892: [SPARK-23700][PYTHON] Cleanup imports in pyspark....

2018-03-23 Thread BryanCutler
GitHub user BryanCutler opened a pull request:

https://github.com/apache/spark/pull/20892

[SPARK-23700][PYTHON] Cleanup imports in pyspark.sql

## What changes were proposed in this pull request?

This cleans up unused imports, mainly from pyspark.sql module.  Added a 
note in function.py that imports `UserDefinedFunction` only to maintain 
backwards compatibility for using `from pyspark.sql.function import 
UserDefinedFunction`. 

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/BryanCutler/spark 
pyspark-cleanup-imports-SPARK-23700

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20892.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20892


commit 5da8c3d39cbbe8e78df2fad4c04d7a7ab1d9db9d
Author: Bryan Cutler 
Date:   2018-03-22T00:09:01Z

tests passing

commit 5214f411d28a19b244a97ffe25f8be5852e273c1
Author: Bryan Cutler 
Date:   2018-03-23T18:25:28Z

change note description




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20891: [SPARK-23782][CORE][UI] SHS should list only application...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20891
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88545/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20891: [SPARK-23782][CORE][UI] SHS should list only application...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20891
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20891: [SPARK-23782][CORE][UI] SHS should list only application...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20891
  
**[Test build #88545 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88545/testReport)**
 for PR 20891 at commit 
[`bc87945`](https://github.com/apache/spark/commit/bc879455d8c7057a181989461cae19e60c82966d).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20327: [SPARK-12963][CORE] NM host for driver end points

2018-03-23 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20327#discussion_r176827527
  
--- Diff: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
 ---
@@ -136,6 +135,39 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
 checkResult(finalState, result)
   }
 
+  private def testClusterDriverBind(
+  uiEnabled: Boolean,
+  localHost: String,
+  localIp: String,
+  success: Boolean): Unit = {
+val result = File.createTempFile("result", null, tempDir)
+val finalState = runSpark(false, 
mainClassName(YarnClusterDriver.getClass),
+  appArgs = Seq(result.getAbsolutePath()),
+  extraConf = Map(
+"spark.yarn.appMasterEnv.SPARK_LOCAL_HOSTNAME" -> localHost,
+"spark.yarn.appMasterEnv.SPARK_LOCAL_IP" -> localIp,
+"spark.ui.enabled" -> uiEnabled.toString
+  ))
+if (success) {
+  checkResult(finalState, result, "success")
+} else {
+  finalState should be (SparkAppHandle.State.FAILED)
+}
+  }
+
+  test("yarn-cluster driver should be able to bind listeners to MM_HOST") {
--- End diff --

Yes, we are, sorry. I'm referring to your change in YarnRMClient.scala.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to Spark ...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19876
  
**[Test build #88546 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88546/testReport)**
 for PR 19876 at commit 
[`cb6fd70`](https://github.com/apache/spark/commit/cb6fd70d0c61b6477f7514431ee2e1c097ec0aff).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to Spark ...

2018-03-23 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/19876
  
LGTM pending Jenkins will merge.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-03-23 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19876#discussion_r176826923
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
@@ -86,7 +88,80 @@ private[util] sealed trait BaseReadWrite {
 }
 
 /**
- * Abstract class for utility classes that can save ML instances.
+ * Implemented by objects that provide ML exportability.
+ *
+ * A new instance of this class will be instantiated each time a save call 
is made.
+ *
+ * Must have a valid zero argument constructor which will be called to 
instantiate.
+ *
+ * @since 2.3.0
+ */
+@InterfaceStability.Unstable
+@Since("2.3.0")
+trait MLWriterFormat {
+  /**
+   * Function to write the provided pipeline stage out.
+   *
+   * @param path  The path to write the result out to.
+   * @param session  SparkSession associated with the write request.
+   * @param optionMap  User provided options stored as strings.
+   * @param stage  The pipeline stage to be saved.
+   */
+  @Since("2.3.0")
+  def write(path: String, session: SparkSession, optionMap: 
mutable.Map[String, String],
+stage: PipelineStage): Unit
+}
+
+/**
+ * ML export formats for should implement this trait so that users can 
specify a shortname rather
+ * than the fully qualified class name of the exporter.
+ *
+ * A new instance of this class will be instantiated each time a save call 
is made.
--- End diff --

done


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20327: [SPARK-12963][CORE] NM host for driver end points

2018-03-23 Thread gerashegalov
Github user gerashegalov commented on a diff in the pull request:

https://github.com/apache/spark/pull/20327#discussion_r176825696
  
--- Diff: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
 ---
@@ -136,6 +135,39 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
 checkResult(finalState, result)
   }
 
+  private def testClusterDriverBind(
+  uiEnabled: Boolean,
+  localHost: String,
+  localIp: String,
+  success: Boolean): Unit = {
+val result = File.createTempFile("result", null, tempDir)
+val finalState = runSpark(false, 
mainClassName(YarnClusterDriver.getClass),
+  appArgs = Seq(result.getAbsolutePath()),
+  extraConf = Map(
+"spark.yarn.appMasterEnv.SPARK_LOCAL_HOSTNAME" -> localHost,
+"spark.yarn.appMasterEnv.SPARK_LOCAL_IP" -> localIp,
+"spark.ui.enabled" -> uiEnabled.toString
+  ))
+if (success) {
+  checkResult(finalState, result, "success")
+} else {
+  finalState should be (SparkAppHandle.State.FAILED)
+}
+  }
+
+  test("yarn-cluster driver should be able to bind listeners to MM_HOST") {
--- End diff --

Maybe we are talking about different parts of the patch. This thread is 
attached to my test code where I am demoing how this happens. It doesn't 
explicitly validate the generated URL's, only how bind works/fails.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet st...

2018-03-23 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request:

https://github.com/apache/spark/pull/18982#discussion_r176825209
  
--- Diff: python/pyspark/ml/wrapper.py ---
@@ -118,11 +118,18 @@ def _transfer_params_to_java(self):
 """
 Transforms the embedded params to the companion Java object.
 """
-paramMap = self.extractParamMap()
+pair_defaults = []
 for param in self.params:
-if param in paramMap:
-pair = self._make_java_param_pair(param, paramMap[param])
+if self.isSet(param):
+pair = self._make_java_param_pair(param, 
self._paramMap[param])
 self._java_obj.set(pair)
+if self.hasDefault(param):
+pair = self._make_java_param_pair(param, 
self._defaultParamMap[param])
+pair_defaults.append(pair)
+if len(pair_defaults) > 0:
+sc = SparkContext._active_spark_context
+pair_defaults_seq = sc._jvm.PythonUtils.toSeq(pair_defaults)
+self._java_obj.setDefault(pair_defaults_seq)
--- End diff --

My take is that while they should be the same, it's still possible they 
might not be. The user could extend their own classes or it's quite easy to 
change in Python. Although we don't really support this, if there was a 
mismatch the user would probably just get bad results and it would be really 
hard to figure out why. From the Python API, it would look like it was one 
value but actually using another in Scala. 

If you all think it's overly cautious to do this, I can take it out. I just 
thought it would be cheap insurance to just set these values regardless.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20327: [SPARK-12963][CORE] NM host for driver end points

2018-03-23 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20327#discussion_r176823881
  
--- Diff: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
 ---
@@ -136,6 +135,39 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
 checkResult(finalState, result)
   }
 
+  private def testClusterDriverBind(
+  uiEnabled: Boolean,
+  localHost: String,
+  localIp: String,
+  success: Boolean): Unit = {
+val result = File.createTempFile("result", null, tempDir)
+val finalState = runSpark(false, 
mainClassName(YarnClusterDriver.getClass),
+  appArgs = Seq(result.getAbsolutePath()),
+  extraConf = Map(
+"spark.yarn.appMasterEnv.SPARK_LOCAL_HOSTNAME" -> localHost,
+"spark.yarn.appMasterEnv.SPARK_LOCAL_IP" -> localIp,
+"spark.ui.enabled" -> uiEnabled.toString
+  ))
+if (success) {
+  checkResult(finalState, result, "success")
+} else {
+  finalState should be (SparkAppHandle.State.FAILED)
+}
+  }
+
+  test("yarn-cluster driver should be able to bind listeners to MM_HOST") {
--- End diff --

Yes, but your change isn't touching that argument. I was wondering what is 
the effect of the code you're actually changing.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20891: [SPARK-23782][CORE][UI] SHS should list only application...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20891
  
**[Test build #88545 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88545/testReport)**
 for PR 20891 at commit 
[`bc87945`](https://github.com/apache/spark/commit/bc879455d8c7057a181989461cae19e60c82966d).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20891: [SPARK-23782][CORE][UI] SHS should list only application...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20891
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20891: [SPARK-23782][CORE][UI] SHS should list only application...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20891
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1726/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20891: [SPARK-23782][CORE][UI] SHS should list only appl...

2018-03-23 Thread mgaido91
GitHub user mgaido91 opened a pull request:

https://github.com/apache/spark/pull/20891

[SPARK-23782][CORE][UI] SHS should list only applications with read 
permissions

## What changes were proposed in this pull request?

Before the PR, all applications were returned to all users.
The PR, instead, changes the behavior, so that, If an authentication method 
is specified, the remote user is used to check the read permissions of the 
applications and only the applications the user can read are returned.

## How was this patch tested?

added UT

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mgaido91/spark SPARK-23782

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20891.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20891


commit bc879455d8c7057a181989461cae19e60c82966d
Author: Marco Gaido 
Date:   2018-03-21T16:26:35Z

[SPARK-23782][CORE][UI] SHS should list only applications with read 
permissions




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20327: [SPARK-12963][CORE] NM host for driver end points

2018-03-23 Thread gerashegalov
Github user gerashegalov commented on a diff in the pull request:

https://github.com/apache/spark/pull/20327#discussion_r176822894
  
--- Diff: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
 ---
@@ -136,6 +135,39 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
 checkResult(finalState, result)
   }
 
+  private def testClusterDriverBind(
+  uiEnabled: Boolean,
+  localHost: String,
+  localIp: String,
+  success: Boolean): Unit = {
+val result = File.createTempFile("result", null, tempDir)
+val finalState = runSpark(false, 
mainClassName(YarnClusterDriver.getClass),
+  appArgs = Seq(result.getAbsolutePath()),
+  extraConf = Map(
+"spark.yarn.appMasterEnv.SPARK_LOCAL_HOSTNAME" -> localHost,
+"spark.yarn.appMasterEnv.SPARK_LOCAL_IP" -> localIp,
+"spark.ui.enabled" -> uiEnabled.toString
+  ))
+if (success) {
+  checkResult(finalState, result, "success")
+} else {
+  finalState should be (SparkAppHandle.State.FAILED)
+}
+  }
+
+  test("yarn-cluster driver should be able to bind listeners to MM_HOST") {
--- End diff --

correct, SPARK_LOCAL_IP/HOSTNAME and SPARK_PUBLIC_DNS play a role in how 
tracking URL is generated


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-03-23 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19876#discussion_r176821909
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
@@ -86,7 +88,80 @@ private[util] sealed trait BaseReadWrite {
 }
 
 /**
- * Abstract class for utility classes that can save ML instances.
+ * Implemented by objects that provide ML exportability.
+ *
+ * A new instance of this class will be instantiated each time a save call 
is made.
+ *
+ * Must have a valid zero argument constructor which will be called to 
instantiate.
+ *
+ * @since 2.3.0
+ */
+@InterfaceStability.Unstable
+@Since("2.3.0")
+trait MLWriterFormat {
+  /**
+   * Function to write the provided pipeline stage out.
+   *
+   * @param path  The path to write the result out to.
+   * @param session  SparkSession associated with the write request.
+   * @param optionMap  User provided options stored as strings.
+   * @param stage  The pipeline stage to be saved.
+   */
+  @Since("2.3.0")
+  def write(path: String, session: SparkSession, optionMap: 
mutable.Map[String, String],
+stage: PipelineStage): Unit
+}
+
+/**
+ * ML export formats for should implement this trait so that users can 
specify a shortname rather
+ * than the fully qualified class name of the exporter.
+ *
+ * A new instance of this class will be instantiated each time a save call 
is made.
--- End diff --

Add a comment about zero arg constructor requirement


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-03-23 Thread holdenk
Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/19876#discussion_r176821060
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
@@ -86,7 +88,80 @@ private[util] sealed trait BaseReadWrite {
 }
 
 /**
- * Abstract class for utility classes that can save ML instances.
+ * Implemented by objects that provide ML exportability.
+ *
+ * A new instance of this class will be instantiated each time a save call 
is made.
+ *
+ * Must have a valid zero argument constructor which will be called to 
instantiate.
+ *
+ * @since 2.3.0
--- End diff --

Need to update since annotations to 2.4.0


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20885: [SPARK-23724][SPARK-23765][SQL] Line separator fo...

2018-03-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20885#discussion_r176821110
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala
 ---
@@ -85,6 +85,38 @@ private[sql] class JSONOptions(
 
   val multiLine = 
parameters.get("multiLine").map(_.toBoolean).getOrElse(false)
 
+  val charset: Option[String] = Some("UTF-8")
+
+  /**
+   * A sequence of bytes between two consecutive json records. Format of 
the option is:
+   *   selector (1 char) + delimiter body (any length) | sequence of chars
--- End diff --

I'm afraid of defining our own rule here, is there any standard we can 
follow?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20885: [SPARK-23724][SPARK-23765][SQL] Line separator fo...

2018-03-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20885#discussion_r176820312
  
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -770,12 +773,15 @@ def json(self, path, mode=None, compression=None, 
dateFormat=None, timestampForm
 formats follow the formats at 
``java.text.SimpleDateFormat``.
 This applies to timestamp type. If None is 
set, it uses the
 default value, 
``-MM-dd'T'HH:mm:ss.SSSXXX``.
+:param lineSep: defines the line separator that should be used for 
writing. If None is
+set, it uses the default value, ``\\n``.
--- End diff --

```
it covers all ``\\r``, ``\\r\\n`` and ``\\n``.
```



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20327: [SPARK-12963][CORE] NM host for driver end points

2018-03-23 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20327#discussion_r176817667
  
--- Diff: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
 ---
@@ -136,6 +135,39 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
 checkResult(finalState, result)
   }
 
+  private def testClusterDriverBind(
+  uiEnabled: Boolean,
+  localHost: String,
+  localIp: String,
+  success: Boolean): Unit = {
+val result = File.createTempFile("result", null, tempDir)
+val finalState = runSpark(false, 
mainClassName(YarnClusterDriver.getClass),
+  appArgs = Seq(result.getAbsolutePath()),
+  extraConf = Map(
+"spark.yarn.appMasterEnv.SPARK_LOCAL_HOSTNAME" -> localHost,
+"spark.yarn.appMasterEnv.SPARK_LOCAL_IP" -> localIp,
+"spark.ui.enabled" -> uiEnabled.toString
+  ))
+if (success) {
+  checkResult(finalState, result, "success")
+} else {
+  finalState should be (SparkAppHandle.State.FAILED)
+}
+  }
+
+  test("yarn-cluster driver should be able to bind listeners to MM_HOST") {
--- End diff --

Isn't that the `trackingUrl` which is a separate parameter to the call?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20327: [SPARK-12963][CORE] NM host for driver end points

2018-03-23 Thread gerashegalov
Github user gerashegalov commented on a diff in the pull request:

https://github.com/apache/spark/pull/20327#discussion_r176816848
  
--- Diff: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
 ---
@@ -136,6 +135,39 @@ class YarnClusterSuite extends BaseYarnClusterSuite {
 checkResult(finalState, result)
   }
 
+  private def testClusterDriverBind(
+  uiEnabled: Boolean,
+  localHost: String,
+  localIp: String,
+  success: Boolean): Unit = {
+val result = File.createTempFile("result", null, tempDir)
+val finalState = runSpark(false, 
mainClassName(YarnClusterDriver.getClass),
+  appArgs = Seq(result.getAbsolutePath()),
+  extraConf = Map(
+"spark.yarn.appMasterEnv.SPARK_LOCAL_HOSTNAME" -> localHost,
+"spark.yarn.appMasterEnv.SPARK_LOCAL_IP" -> localIp,
+"spark.ui.enabled" -> uiEnabled.toString
+  ))
+if (success) {
+  checkResult(finalState, result, "success")
+} else {
+  finalState should be (SparkAppHandle.State.FAILED)
+}
+  }
+
+  test("yarn-cluster driver should be able to bind listeners to MM_HOST") {
--- End diff --

The URL should have a reachable authority for proxying and direct use.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20717: [SPARK-23564][SQL] Add isNotNull check for left anti and...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20717
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88542/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20717: [SPARK-23564][SQL] Add isNotNull check for left anti and...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20717
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20717: [SPARK-23564][SQL] Add isNotNull check for left anti and...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20717
  
**[Test build #88542 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88542/testReport)**
 for PR 20717 at commit 
[`5cadd86`](https://github.com/apache/spark/commit/5cadd86ec4fae40c8d2606f0c00aed99a96d0027).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20883: [SPARK-23759][UI] Unable to bind Spark UI to spec...

2018-03-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20883


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20883: [SPARK-23759][UI] Unable to bind Spark UI to specific ho...

2018-03-23 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/20883
  
Merging to master, 2.3, 2.2.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20864: [SPARK-23745][SQL]Remove the directories of the “hive....

2018-03-23 Thread liufengdb
Github user liufengdb commented on the issue:

https://github.com/apache/spark/pull/20864
  
I thought the directory is also created from this line: 
https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java#L143.
 For this one, we need to think about whether we can remove all the temp 
directories creation, because the statements are executed by spark sql and it 
has nothing about the Hive in the thrift server.

You are right that HiveClientImpl (the Hive inside spark sql) will also 
produce such temp directories. However, it seems like the following line alone 
is sufficient to add the jar to the class loader: 
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L836.
 So I doubt we still need the `runSqlHive(s"ADD JAR $path")` to download the 
jar to a temp directory.

Overall, I think we need an overall design to remove the Hive legacy in 
both the thrift server and Spark SQL. Adding more temp fixes will make such a 
design harder.




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20884: [SPARK-23773][SQL] JacksonGenerator does not incl...

2018-03-23 Thread sameeragarwal
Github user sameeragarwal commented on a diff in the pull request:

https://github.com/apache/spark/pull/20884#discussion_r176796998
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/json/JacksonGeneratorSuite.scala
 ---
@@ -56,7 +56,7 @@ class JacksonGeneratorSuite extends SparkFunSuite {
 val gen = new JacksonGenerator(dataType, writer, option)
 gen.write(input)
 gen.flush()
-assert(writer.toString === """[{}]""")
+assert(writer.toString === """[{"a":null}]""")
--- End diff --

+1


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3

2018-03-23 Thread cclauss
Github user cclauss commented on the issue:

https://github.com/apache/spark/pull/20838
  
@HyukjinKwon Was there something more to do on this one?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20811
  
Kubernetes integration test status success
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1705/



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread liyinan926
Github user liyinan926 commented on the issue:

https://github.com/apache/spark/pull/20811
  
LGTM. @foxish.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1725/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88544 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88544/testReport)**
 for PR 20890 at commit 
[`4a0a5e3`](https://github.com/apache/spark/commit/4a0a5e34d5efebcdf9f58d70bff8d9e46d953099).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20811
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1705/



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20811
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20811
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1724/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20851: [SPARK-23727][SQL] Support for pushing down filte...

2018-03-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20851#discussion_r176762766
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -353,6 +353,13 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val PARQUET_FILTER_PUSHDOWN_DATE_ENABLED = 
buildConf("spark.sql.parquet.filterPushdown.date")
+.doc("If true, enables Parquet filter push-down optimization for Date. 
" +
+  "This configuration only has an effect when 
'spark.sql.parquet.filterPushdown' is enabled.")
+.internal()
+.booleanConf
+.createWithDefault(false)
--- End diff --

@yucai . The reason is that `spark.sql.orc.filterPushdown` is still `false` 
in Spark 2.3 while `spark.sql.parquet.filterPushdown` is `true`. We don't know 
this is safe or not.

Anyway, we have 6 or more months for Apache Spark 2.4. We may enable this 
in `master` branch temporarily for testing purpose only, and are able to 
disable this at the last moment of 2.4 release like we did about ORC conf if 
there is some issue.

BTW, did you use this in your company a lot in production?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88543/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88543 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88543/testReport)**
 for PR 20890 at commit 
[`33b692a`](https://github.com/apache/spark/commit/33b692a36f33d97a6e9cce7772d38db001e02203).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88543 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88543/testReport)**
 for PR 20890 at commit 
[`33b692a`](https://github.com/apache/spark/commit/33b692a36f33d97a6e9cce7772d38db001e02203).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1723/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20885: [SPARK-23724][SPARK-23765][SQL] Line separator for the j...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20885
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20885: [SPARK-23724][SPARK-23765][SQL] Line separator for the j...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20885
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88540/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20885: [SPARK-23724][SPARK-23765][SQL] Line separator for the j...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20885
  
**[Test build #88540 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88540/testReport)**
 for PR 20885 at commit 
[`bbff402`](https://github.com/apache/spark/commit/bbff40206e6871ea9ab035e7a8876f495bdf3d90).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20818: [SPARK-23675][WEB-UI]Title add spark logo, use sp...

2018-03-23 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/20818#discussion_r176753823
  
--- Diff: core/src/main/scala/org/apache/spark/ui/UIUtils.scala ---
@@ -265,6 +266,7 @@ private[spark] object UIUtils extends Logging {
   
 {commonHeaderNodes}
 {if (useDataTables) dataTablesHeaderNodes else Seq.empty}
+
--- End diff --

@guoxiaolongzte don't you need to change this too?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20717: [SPARK-23564][SQL] Add isNotNull check for left anti and...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20717
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20717: [SPARK-23564][SQL] Add isNotNull check for left anti and...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20717
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1722/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20717: [SPARK-23564][SQL] Add isNotNull check for left anti and...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20717
  
**[Test build #88542 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88542/testReport)**
 for PR 20717 at commit 
[`5cadd86`](https://github.com/apache/spark/commit/5cadd86ec4fae40c8d2606f0c00aed99a96d0027).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20717: [SPARK-23564][SQL] Add isNotNull check for left a...

2018-03-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20717#discussion_r176750470
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
 ---
@@ -341,6 +341,26 @@ case class Join(
 case UsingJoin(_, _) => false
 case _ => resolvedExceptNatural
   }
+
+  override protected def constructAllConstraints: Set[Expression] = {
+// additional constraints which are not enforced on the result of join 
operations, but can be
+// enforced either on the left or the right side
+val additionalConstraints = joinType match {
+  case LeftAnti | LeftOuter if condition.isDefined =>
+
splitConjunctivePredicates(condition.get).flatMap(inferIsNotNullConstraints).filter(
+  _.references.subsetOf(right.outputSet))
+  case RightOuter if condition.isDefined =>
+
splitConjunctivePredicates(condition.get).flatMap(inferIsNotNullConstraints).filter(
+  _.references.subsetOf(left.outputSet))
+  case _ => Seq.empty[Expression]
+}
+super.constructAllConstraints ++ additionalConstraints
+  }
+
+  override lazy val constraints: ExpressionSet = ExpressionSet(
+super.constructAllConstraints.filter { c =>
+  c.references.nonEmpty && c.references.subsetOf(outputSet) && 
c.deterministic
+})
--- End diff --

thanks, I added some statements to the `ConstraintPropagationSuite`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88541 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88541/testReport)**
 for PR 20890 at commit 
[`fce58f8`](https://github.com/apache/spark/commit/fce58f8c40a54c9730ceefd0c9eb46e1aa3358a0).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88541/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20883: [SPARK-23759][UI] Unable to bind Spark UI to specific ho...

2018-03-23 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/20883
  
Thanks, LGTM


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20890
  
**[Test build #88541 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88541/testReport)**
 for PR 20890 at commit 
[`fce58f8`](https://github.com/apache/spark/commit/fce58f8c40a54c9730ceefd0c9eb46e1aa3358a0).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSort...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20890
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1721/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20883: [SPARK-23759][UI] Unable to bind Spark UI to specific ho...

2018-03-23 Thread felixalbani
Github user felixalbani commented on the issue:

https://github.com/apache/spark/pull/20883
  
@mgaido91 Thanks, I updated description with your suggestions


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20890: [WIP][SPARK-23779][SQL] TaskMemoryManager and Uns...

2018-03-23 Thread kiszk
GitHub user kiszk opened a pull request:

https://github.com/apache/spark/pull/20890

[WIP][SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter related classes 
use MemoryBlock

## What changes were proposed in this pull request?

Waiting for merging #19222

This PR tries to use `MemoryBlock` in `TaskMemoryManager` and 
`UnsafeSorter` related classes. There are two advantages to use `MemoryBlock`.

1. Has clean API calls rather than using a Java array or `PlatformMemory`
2. Improve runtime performance of memory access instead of using `Object` 
with `Platform.get/put...`.

## How was this patch tested?

Used existing UTs

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kiszk/spark SPARK-23779

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20890.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20890


commit f427ca2a0f30d70f42b4755d0d72318cfa0e4c77
Author: Kazuaki Ishizaki 
Date:   2017-09-13T10:16:19Z

introduce ByteArrayMemoryBlock, IntArrayMemoryBlock, LongArrayMemoryBlock, 
and OffheaMemoryBlock

commit 5d7ccdb0e845afcaa430bac3d21b519d35d1e6f4
Author: Kazuaki Ishizaki 
Date:   2017-09-13T17:15:25Z

OffHeapColumnVector uses UnsafeMemoryAllocator

commit 251fa09d4421085a6c97f56bb38108980a79b5c8
Author: Kazuaki Ishizaki 
Date:   2017-09-13T17:27:09Z

UTF8String uses UnsafeMemoryAllocator

commit 790bbe7f3ac52b2b1e4375684998737d6127a552
Author: Kazuaki Ishizaki 
Date:   2017-09-13T17:36:57Z

Platform.copymemory() in UsafeInMemorySorter uses new MemoryBlock

commit 93a792e7729f213fcef66f0ed9b33f45259d1ec3
Author: Kazuaki Ishizaki 
Date:   2017-09-14T15:34:48Z

address review comments

commit 0beab0308cdac57e85daf7794170a0ca899ab568
Author: Kazuaki Ishizaki 
Date:   2017-09-14T17:53:33Z

fix test failures (e.g. String in UnsafeArrayData)

commit fcf764c1aebdc847675f710c17ec8477d6022a40
Author: Kazuaki Ishizaki 
Date:   2017-09-18T16:13:13Z

fix failures

commit d2d2e50f8a2baf41d5b85127bf888da6f8bca343
Author: Kazuaki Ishizaki 
Date:   2017-09-21T18:45:55Z

minor update of UTF8String constructor

commit f5e10bb52c33856ddd3e1b1f8483b170e0167c53
Author: Kazuaki Ishizaki 
Date:   2017-09-22T11:00:12Z

rename method name

commit 1905e8ca4b3b8200fa56f5fc91899bb420a07628
Author: Kazuaki Ishizaki 
Date:   2017-09-22T11:01:13Z

remove unused code

commit 7778e586e94130749cec3f54a60b6fb24514647a
Author: Kazuaki Ishizaki 
Date:   2017-09-22T11:02:30Z

update arrayEquals

commit 4f96c82b151b78641bcfc92a65913048b055cfee
Author: Kazuaki Ishizaki 
Date:   2017-09-22T13:18:01Z

rebase master

commit d1d6ae90589c0fae6c64af0cb95696e270446228
Author: Kazuaki Ishizaki 
Date:   2017-09-22T14:51:40Z

make more methods final

commit 914dcd11d0d5ef014284868f2794cb4e5baa0958
Author: Kazuaki Ishizaki 
Date:   2017-09-22T15:39:57Z

make fill method final in MemoryBlock

commit 336e4b7bfd7edcb861edeac3ca115dead785b68a
Author: Kazuaki Ishizaki 
Date:   2017-09-23T11:35:15Z

fix test failures

commit 5be9ccb163832e1895b045730a06e79eb3b171cf
Author: Kazuaki Ishizaki 
Date:   2017-09-24T14:00:40Z

add testsuite

commit 43e6b572bd893bd42e58df930a34b7e31549a49a
Author: Kazuaki Ishizaki 
Date:   2017-09-24T18:10:49Z

pass concrete type to the first argument of Platform.get*/put* to get 
better performance

commit 05f024e566f828e9c3f836430c9c7b34da5e954b
Author: Kazuaki Ishizaki 
Date:   2017-09-28T01:51:38Z

rename methods related to hash

commit 9071cf6449123400f3d774664e1709337b05c555
Author: Kazuaki Ishizaki 
Date:   2017-09-28T01:52:48Z

added methods for MemoryBlock

commit 37ee9fa07a8f8faaeb097164422bb15958fa4b1c
Author: Kazuaki Ishizaki 
Date:   2017-09-28T03:44:30Z

rebase with master

commit d0b5d59bb31fe2845477ee243008992686e2f2a2
Author: Kazuaki Ishizaki 
Date:   2017-09-28T04:37:25Z

fix scala style error

commit 5cdad44717ccb510d4114d14cfff304bef9f5bb4
Author: Kazuaki Ishizaki 
Date:   2017-10-14T07:29:14Z

use MemoryBlock in Murmur3 for performance reason

commit 91028fa2ae34bb3ae667692112b8455d4394cbbd
Author: Kazuaki Ishizaki 
Date:   2017-10-14T07:29:30Z

fix typo in comment

commit 0210bd1e5f46f81617a35493d2cd0b737b4cf85d
Author: Kazuaki Ishizaki 
Date:   2017-10-29T12:32:38Z

address review comment

commit df6dad3762f4e918d503df75ae8fce052af8bf43
Author: Kazuaki Ishizaki 
Date:   2017-11-28T06:08:52Z


[GitHub] spark issue #20883: [SPARK-23759][UI] Unable to bind Spark UI to specific ho...

2018-03-23 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/20883
  
please update the description moving

> Fixes SPARK-23759 by moving connector.start() after connector.setHost()

after the title (What changes...) and please remove the sentence:

> This pull is to fix SPARK-23759 issue

since you are referencing the JIRA in the commit message, thus it is 
obvious that it fixes that JIRA...

Other than this, LGTM


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20888: [SPARK-23775][TEST] DataFrameRangeSuite should wait for ...

2018-03-23 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/20888
  
ah ok, yes when run in isolation, the stage will be 0, so your change makes 
sense.  But that is not what is making it flaky in a full test run


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20877: [SPARK-23765][SQL] Supports custom line separator for js...

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20877
  
I think we are fine to change the behaviour of `lineSep` before the release 
..


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset

2018-03-23 Thread goungoun
Github user goungoun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20800#discussion_r176728379
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -511,6 +511,14 @@ class Dataset[T] private[sql](
*/
   def isLocal: Boolean = logicalPlan.isInstanceOf[LocalRelation]
 
+  /**
+   * Returns true if the `Dataset` is empty.
+   *
+   * @group basic
+   * @since 2.4.0
+   */
+  def isEmpty: Boolean = rdd.isEmpty()
--- End diff --

@gatorsmile, just simply running df.rdd.isEmpty in spark-shell was quite 
responsive even in tera byte sized tables. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19881: [SPARK-22683][CORE] Add a fullExecutorAllocationDivisor ...

2018-03-23 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/19881
  
+1


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19881: [SPARK-22683][CORE] Add a fullExecutorAllocationD...

2018-03-23 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/19881#discussion_r176727530
  
--- Diff: 
core/src/test/scala/org/apache/spark/ExecutorAllocationManagerSuite.scala ---
@@ -145,6 +145,39 @@ class ExecutorAllocationManagerSuite
 assert(numExecutorsToAdd(manager) === 1)
   }
 
+  def testParallelismDivisor(cores: Int, divisor: Double, expected: Int): 
Unit = {
+val conf = new SparkConf()
+  .setMaster("myDummyLocalExternalClusterManager")
+  .setAppName("test-executor-allocation-manager")
+  .set("spark.dynamicAllocation.enabled", "true")
+  .set("spark.dynamicAllocation.testing", "true")
+  .set("spark.dynamicAllocation.maxExecutors", "15")
+  .set("spark.dynamicAllocation.minExecutors", "3")
+  .set("spark.dynamicAllocation.fullExecutorAllocationDivisor", 
divisor.toString)
+  .set("spark.executor.cores", cores.toString)
+val sc = new SparkContext(conf)
+contexts += sc
+var manager = sc.executorAllocationManager.get
+post(sc.listenerBus, SparkListenerStageSubmitted(createStageInfo(0, 
20)))
+for (i <- 0 to 5) {
+  addExecutors(manager)
--- End diff --

ok


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20877: [SPARK-23765][SQL] Supports custom line separator for js...

2018-03-23 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/20877
  
I have only one concern: if we merge this PR, we close the possibility for 
changing format of `lineSep` and future extensions. Your changes allow any 
sequence of chars. It is not clear for me, how we can restrict it and assign 
different meanings to it in the future.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20889: [MINOR][DOC] Fix ml-guide markdown typos

2018-03-23 Thread Lemonjing
Github user Lemonjing commented on the issue:

https://github.com/apache/spark/pull/20889
  
@HyukjinKwon Thanks a lot, could u help me to review these commits if u 
have time。


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20885: [SPARK-23724][SPARK-23765][SQL] Line separator for the j...

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20885
  
@cloud-fan and @hvanhovell. Do you think we need the flexible option for 
line separator?


https://github.com/apache/spark/blob/bbff40206e6871ea9ab035e7a8876f495bdf3d90/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala#L91-L98



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20877: [SPARK-23765][SQL] Supports custom line separator for js...

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20877
  
Yup, yup. I don't object for now. Shall we merge this one first and talk 
more about it in your PR?
I believe this PR itself proposes a complete option and I saw many the 
requests for this feature here and there like mailing list.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20884: [SPARK-23773][SQL] JacksonGenerator does not incl...

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20884#discussion_r176715906
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/json/JacksonGeneratorSuite.scala
 ---
@@ -56,7 +56,7 @@ class JacksonGeneratorSuite extends SparkFunSuite {
 val gen = new JacksonGenerator(dataType, writer, option)
 gen.write(input)
 gen.flush()
-assert(writer.toString === """[{}]""")
+assert(writer.toString === """[{"a":null}]""")
--- End diff --

I think previous result was a valid test case .. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20884: [SPARK-23773][SQL] JacksonGenerator does not include key...

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20884
  
Shall we add a configuration to control its behaviour if this is something 
we need to support?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20889: [MINOR][DOC] Fix ml-guide markdown typos

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20889
  
I think you can add all other typos into here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20879: [MINOR][R] Fix R lint failure

2018-03-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20879


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20880: [SPARK-23769][Core]Remove comments that unnecessa...

2018-03-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20880


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20880: [SPARK-23769][Core]Remove comments that unnecessarily di...

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20880
  
Merged to master and branch-2.3.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20879: [MINOR][R] Fix R lint failure

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20879
  
Merged to master and branch-2.3.

Thanks for reviewing this @shaneknapp and @felixcheung.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20879: [MINOR][R] Fix R lint failure

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20879
  
Yup, we are running old lint in PR builders and those are on the newer one.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20885: [SPARK-23724][SPARK-23765][SQL] Line separator for the j...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20885
  
**[Test build #88540 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88540/testReport)**
 for PR 20885 at commit 
[`bbff402`](https://github.com/apache/spark/commit/bbff40206e6871ea9ab035e7a8876f495bdf3d90).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20877: [SPARK-23765][SQL] Supports custom line separator for js...

2018-03-23 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/20877
  
> Does that fix actual usecases?

I see the following use cases:

1. Jsons coming usually from embedded systems have not-standard separators 
(invisible in some cases). It is very convenient to open a file in hex editor 
and copy bytes between }{ to the lineSep option. This is the use case for the 
format with `'x'` selector like: `x0d 54 45`

2. In Json Streaming, records could be separated in pretty different ways. 
We should leave room for improvement I believe. See `'r'` (for regexp) and 
`'/'` reserved selectors

3. Some UTF-8 chars could cause errors from style (format) checkers. It is 
easier to represent such chars in hexadecimal format instead of disabling the 
checkers.

4. In near future, json datasource will support input json in different 
charsets. If the source code in UTF-8 but input json in different charset, it 
is slightly hard to put such chars as values for the lineSep option. The 
`x` format is more convenient here again. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20811
  
Kubernetes integration test status failure
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1701/



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20889: [MINOR][DOC] Fix ml-guide markdown typos

2018-03-23 Thread Lemonjing
Github user Lemonjing commented on the issue:

https://github.com/apache/spark/pull/20889
  
@felixcheung hi,In my reading process,I found another md file typos and 
fix it. Do I need to close this pr and merge it into one commit and give a new 
pr?  Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20811
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1720/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20811
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20811: [SPARK-23668][K8S] Add config option for passing through...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20811
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1701/



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20811: [SPARK-23668][K8S] Add config option for passing ...

2018-03-23 Thread andrusha
Github user andrusha commented on a diff in the pull request:

https://github.com/apache/spark/pull/20811#discussion_r176687092
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala
 ---
@@ -108,6 +109,8 @@ private[spark] class ExecutorPodFactory(
   nodeToLocalTaskCount: Map[String, Int]): Pod = {
 val name = s"$executorPodNamePrefix-exec-$executorId"
 
+val imagePullSecrets = imagePullSecret.map(new 
LocalObjectReference(_)).toList
--- End diff --

@liyinan926 reverted! Should be good to merge now. Rebased against master 
too.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20860: [SPARK-23743][SQL] Changed a comparison logic from conta...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20860
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20860: [SPARK-23743][SQL] Changed a comparison logic from conta...

2018-03-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20860
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88538/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20860: [SPARK-23743][SQL] Changed a comparison logic from conta...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20860
  
**[Test build #88538 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88538/testReport)**
 for PR 20860 at commit 
[`2ea9b7a`](https://github.com/apache/spark/commit/2ea9b7a58279d0e5d7cdfad8d67ab9227983be1a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20541: [SPARK-23356][SQL]Pushes Project to both sides of Union ...

2018-03-23 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/20541
  
oh, I see, I fallback to the modification of the non-deterministic 
expression, and to keep the newly added test cases for a+1 and a+b, can you?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20851: [SPARK-23727][SQL] Support for pushing down filte...

2018-03-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20851#discussion_r176676405
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -353,6 +353,13 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val PARQUET_FILTER_PUSHDOWN_DATE_ENABLED = 
buildConf("spark.sql.parquet.filterPushdown.date")
+.doc("If true, enables Parquet filter push-down optimization for Date. 
" +
+  "This configuration only has an effect when 
'spark.sql.parquet.filterPushdown' is enabled.")
+.internal()
+.booleanConf
+.createWithDefault(false)
--- End diff --

I am fine either way.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20885: [SPARK-23724][SPARK-23765][SQL] Line separator for the j...

2018-03-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20885
  
**[Test build #88539 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88539/testReport)**
 for PR 20885 at commit 
[`d632706`](https://github.com/apache/spark/commit/d632706bf14c7a7c2688237e6dc552ca5aa9c98a).
 * This patch **fails to generate documentation**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



<    1   2   3   >