[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20057
  
**[Test build #93327 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93327/testReport)**
 for PR 20057 at commit 
[`a365f79`](https://github.com/apache/spark/commit/a365f79b2f29326621a4cd0177780e66c56eaceb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21828: Update regression.py

2018-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21828#discussion_r204012751
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre
   maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0,
   maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0,
   checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None,
-  impuriy="variance", featureSubsetStrategy="all"):
+  impurity="variance", featureSubsetStrategy="all"):
--- End diff --

We could. I would rather use a `_NoValue` instance for that purpose, though. 
Also, I would emit a warning via the `warnings` package, as we do elsewhere in the code base. 
Can we add a simple test for that as well while we are here?
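
A minimal sketch of what such a backward-compatible signature might look like. It is an illustration only: `_NoValueType`, the `_NoValue` object defined here, and `set_params` are stand-ins written for this example, not the PySpark code under review.

```python
import warnings


class _NoValueType(object):
    """Stand-in sentinel type (hypothetical, defined only for this sketch)."""

    def __repr__(self):
        return "<no value>"


# A unique object that means "the caller did not pass this argument".
_NoValue = _NoValueType()


def set_params(impurity="variance", impuriy=_NoValue):
    """Accept the misspelled keyword for now, but warn when it is used."""
    if impuriy is not _NoValue:
        warnings.warn(
            "'impuriy' is deprecated; use 'impurity' instead.",
            DeprecationWarning,
        )
        impurity = impuriy
    return {"impurity": impurity}
```

A sentinel (rather than `None`) lets the function tell "not passed" apart from an explicit `None`, and a simple test can assert both the returned value and that exactly one warning was raised.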


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21815
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21815
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93334/
Test FAILed.


---




[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21805
  
**[Test build #93341 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93341/testReport)**
 for PR 21805 at commit 
[`cf2eae2`](https://github.com/apache/spark/commit/cf2eae2b93df12e8418897c9bb770abb416cbe1e).


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21815
  
**[Test build #93334 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93334/testReport)**
 for PR 21815 at commit 
[`7f531bd`](https://github.com/apache/spark/commit/7f531bd3962685ff2bd271af8721653319f618bf).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20761
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93340/
Test FAILed.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20761
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20761
  
**[Test build #93340 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93340/testReport)**
 for PR 20761 at commit 
[`ad96372`](https://github.com/apache/spark/commit/ad96372f51fc1920da7b0173e2bf0dcc5ef626fe).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20761
  
**[Test build #93340 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93340/testReport)**
 for PR 20761 at commit 
[`ad96372`](https://github.com/apache/spark/commit/ad96372f51fc1920da7b0173e2bf0dcc5ef626fe).


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20761
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20761
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93339/
Test FAILed.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20761
  
**[Test build #93339 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93339/testReport)**
 for PR 20761 at commit 
[`b58df80`](https://github.com/apache/spark/commit/b58df80d3e20ceea7e08a6394804e02847addb05).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20761
  
**[Test build #93339 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93339/testReport)**
 for PR 20761 at commit 
[`b58df80`](https://github.com/apache/spark/commit/b58df80d3e20ceea7e08a6394804e02847addb05).


---




[GitHub] spark pull request #21805: [SPARK-24850][SQL] fix str representation of Cach...

2018-07-20 Thread onursatici
Github user onursatici commented on a diff in the pull request:

https://github.com/apache/spark/pull/21805#discussion_r204001527
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
@@ -206,4 +206,19 @@ class DatasetCacheSuite extends QueryTest with SharedSQLContext with TimeLimits
 // first time use, load cache
 checkDataset(df5, Row(10))
   }
+
+  test("SPARK-24850 InMemoryRelation string representation does not include cached plan") {
+val dummyQueryExecution = spark.range(0, 1).toDF().queryExecution
+val inMemoryRelation = InMemoryRelation(
+  true,
+  1000,
+  StorageLevel.MEMORY_ONLY,
+  dummyQueryExecution.sparkPlan,
+  Some("test-relation"),
+  dummyQueryExecution.logical)
+
+assert(!inMemoryRelation.simpleString.contains(dummyQueryExecution.sparkPlan.toString))
+assert(inMemoryRelation.simpleString.contains(
+  "CachedRDDBuilder(true, 1000, StorageLevel(memory, deserialized, 1 replicas))"))
--- End diff --

@gatorsmile I tried to keep this close to its default value. Maybe we can do 
something like `CachedRDDBuilder(useCompression = true, batchSize = 1000, 
...)`? But that would break consistency with how other case classes are logged.


---




[GitHub] spark pull request #21805: [SPARK-24850][SQL] fix str representation of Cach...

2018-07-20 Thread onursatici
Github user onursatici commented on a diff in the pull request:

https://github.com/apache/spark/pull/21805#discussion_r204001546
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
@@ -206,4 +206,19 @@ class DatasetCacheSuite extends QueryTest with SharedSQLContext with TimeLimits
 // first time use, load cache
 checkDataset(df5, Row(10))
   }
+
+  test("SPARK-24850 InMemoryRelation string representation does not include cached plan") {
+val dummyQueryExecution = spark.range(0, 1).toDF().queryExecution
+val inMemoryRelation = InMemoryRelation(
+  true,
+  1000,
+  StorageLevel.MEMORY_ONLY,
+  dummyQueryExecution.sparkPlan,
+  Some("test-relation"),
+  dummyQueryExecution.logical)
+
+assert(!inMemoryRelation.simpleString.contains(dummyQueryExecution.sparkPlan.toString))
+assert(inMemoryRelation.simpleString.contains(
+  "CachedRDDBuilder(true, 1000, StorageLevel(memory, deserialized, 1 replicas))"))
--- End diff --

@maropu Wouldn't that be testing the same thing, since explain calls 
`plan.treeString`, which calls `elem.simpleString` for every child? I think 
testing `InMemoryRelation.simpleString` also covers other possible places where 
a `plan.treeString` is logged. Happy to change it if you have concerns.
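
The argument above can be illustrated with a small sketch. It is not Spark's implementation: `Node`, `simple_string`, and `tree_string` are hypothetical names that only mirror the roles of `simpleString` and `treeString`, under the assumption that the tree rendering is built from each node's one-line rendering.

```python
class Node(object):
    """Hypothetical plan node used only for this illustration."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def simple_string(self):
        # One-line description of this node alone.
        return self.name

    def tree_string(self, depth=0):
        # The tree rendering calls simple_string on this node and,
        # through the recursion, on every descendant.
        lines = ["  " * depth + self.simple_string()]
        for child in self.children:
            lines.append(child.tree_string(depth + 1))
        return "\n".join(lines)


plan = Node("Project", [Node("InMemoryRelation")])
rendered = plan.tree_string()
```

Under that assumption, anything kept out of the one-line rendering is also kept out of every tree rendering built from it, which is why a test on the one-line form may be sufficient.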


---




[GitHub] spark issue #21103: [SPARK-23915][SQL] Add array_except function

2018-07-20 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21103
  
cc @ueshin


---




[GitHub] spark pull request #21828: Update regression.py

2018-07-20 Thread woodthom2
Github user woodthom2 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21828#discussion_r203997163
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre
   maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0,
   maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0,
   checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None,
-  impuriy="variance", featureSubsetStrategy="all"):
+  impurity="variance", featureSubsetStrategy="all"):
--- End diff --



What about this until the next major release?
```
def setParams(self, featuresCol="features", labelCol="label", predictionCol="prediction",
              maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0,
              maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0,
              checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None,
              impuriy=None, impurity="variance", featureSubsetStrategy="all"):
    if impuriy is not None:  # for backward compatibility
        impurity = impuriy
```


---




[GitHub] spark pull request #21828: Update regression.py

2018-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21828#discussion_r203995996
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre
   maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0,
   maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0,
   checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None,
-  impuriy="variance", featureSubsetStrategy="all"):
+  impurity="variance", featureSubsetStrategy="all"):
--- End diff --

Of course that's possible: a user upgrades Spark and suddenly their code 
fails with a "no such keyword" exception.
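
A minimal sketch of that failure mode in plain Python, assuming nothing about PySpark itself: `set_params_old` and `set_params_new` are hypothetical functions that differ only in the spelling of one keyword argument.

```python
def set_params_old(impuriy="variance"):
    """Signature with the misspelled keyword (before the rename)."""
    return impuriy


def set_params_new(impurity="variance"):
    """Signature after the keyword is renamed with no compatibility alias."""
    return impurity


# Caller code written against the old signature keeps working there...
old_result = set_params_old(impuriy="variance")

# ...but raises TypeError against the renamed signature.
try:
    set_params_new(impuriy="variance")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Only callers that pass the misspelled name explicitly as a keyword are affected; positional callers and callers relying on the default are not.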


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21822
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21822
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93325/
Test FAILed.


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21822
  
**[Test build #93325 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93325/testReport)**
 for PR 21822 at commit 
[`83ffa51`](https://github.com/apache/spark/commit/83ffa51f4b165152dea214be4d73dd518d742a56).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21820: [SPARK-24868][PYTHON]add sequence function in Pyt...

2018-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21820


---




[GitHub] spark issue #21828: Update regression.py

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21828
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21828: Update regression.py

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21828
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #21828: Update regression.py

2018-07-20 Thread woodthom2
Github user woodthom2 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21828#discussion_r203994440
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre
   maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0,
   maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0,
   checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None,
-  impuriy="variance", featureSubsetStrategy="all"):
+  impurity="variance", featureSubsetStrategy="all"):
--- End diff --

Is anyone depending on the typoed version?


---




[GitHub] spark pull request #21828: Update regression.py

2018-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21828#discussion_r203994108
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre
   maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0,
   maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0,
   checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None,
-  impuriy="variance", featureSubsetStrategy="all"):
+  impurity="variance", featureSubsetStrategy="all"):
--- End diff --

I think this can't just be changed like this, since it's going to break 
other users' code.


---




[GitHub] spark issue #21820: [SPARK-24868][PYTHON]add sequence function in Python

2018-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21820
  
Merged to master


---




[GitHub] spark issue #21828: Update regression.py

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21828
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #21828: Update regression.py

2018-07-20 Thread woodthom2
GitHub user woodthom2 opened a pull request:

https://github.com/apache/spark/pull/21828

Update regression.py

Correct typo impuriy -> impurity
(this would have stopped GBT working for some hyperparameter configurations)

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/woodthom2/spark patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21828.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21828


commit d3973d0149ce313210c1f1930a9450bb33d70960
Author: woodthom2 
Date:   2018-07-20T09:51:59Z

Update regression.py

Correct typo impuriy -> impurity
(this would have stopped GBT working for some hyperparameter configurations)




---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21823
  
**[Test build #93338 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93338/testReport)**
 for PR 21823 at commit 
[`86c7ed6`](https://github.com/apache/spark/commit/86c7ed6dd4e2790e64148ff2dc6e856b2b2fd80a).


---




[GitHub] spark issue #21821: [SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWrit...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21821
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93323/
Test FAILed.


---




[GitHub] spark issue #21821: [SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWrit...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21821
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21823
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1169/
Test PASSed.


---




[GitHub] spark issue #21827: [SPARK-24873]Increase switch to shielding frequent inter...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21827
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21823
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21827: [SPARK-24873]Increase switch to shielding frequent inter...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21827
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21821: [SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWrit...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21821
  
**[Test build #93323 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93323/testReport)**
 for PR 21821 at commit 
[`9edc28f`](https://github.com/apache/spark/commit/9edc28fdcb7261f01db716f65e723668a493327e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21823
  
**[Test build #93337 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93337/testReport)**
 for PR 21823 at commit 
[`b5b2a1b`](https://github.com/apache/spark/commit/b5b2a1b5c1c2e8b04fe40c165c5827f3380a472b).


---




[GitHub] spark issue #21827: [SPARK-24873]Increase switch to shielding frequent inter...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21827
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #21827: [SPARK-24873]Increase switch to shielding frequen...

2018-07-20 Thread hejiefang
GitHub user hejiefang opened a pull request:

https://github.com/apache/spark/pull/21827

[SPARK-24873]Increase switch to shielding frequent interaction report…

https://issues.apache.org/jira/browse/SPARK-24873
[SPARK-24873]Increase switch to shielding frequent interaction report…


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hejiefang/spark spark-24873

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21827.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21827


commit 7f9ef388679e8b9a282befc3c5a031a2199d0eb0
Author: hejiefang 
Date:   2018-07-20T09:39:11Z

[SPARK-24873]Increase switch to shielding frequent interaction reports with 
yarn




---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21823
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1168/
Test PASSed.


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21823
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21823: [SPARK-24870][SQL]Cache can't work normally if th...

2018-07-20 Thread eatoncys
Github user eatoncys commented on a diff in the pull request:

https://github.com/apache/spark/pull/21823#discussion_r203990617
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CanonicalizeSuite.scala ---
@@ -50,4 +52,30 @@ class CanonicalizeSuite extends SparkFunSuite {
 assert(range.where(arrays1).sameResult(range.where(arrays2)))
 assert(!range.where(arrays1).sameResult(range.where(arrays3)))
   }
+
+  test("Canonicalized result is not case-insensitive") {
--- End diff --

OK, modified, thanks.


---




[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21826
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21826
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20761
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20761
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93336/
Test FAILed.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20761
  
**[Test build #93336 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93336/testReport)**
 for PR 20761 at commit 
[`0ff9dee`](https://github.com/apache/spark/commit/0ff9dee17e720fd448ad3c3939e5a2937a13b711).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21826
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #21826: [SPARK-24872] Remove the symbol “||” of the ...

2018-07-20 Thread httfighter
GitHub user httfighter opened a pull request:

https://github.com/apache/spark/pull/21826

[SPARK-24872] Remove the symbol “||” of the “OR” operation

## What changes were proposed in this pull request?
“||” performs STRING concat, and it is also the symbol of the "OR" 
operation.

When I want to use "||" as the "OR" operation, I find that it performs 
STRING concat instead:

  spark-sql> explain extended select * from aa where id==1 || id==2;

   == Parsed Logical Plan ==
'Project [*]
 +- 'Filter (('id = concat(1, 'id)) = 2)
  +- 'UnresolvedRelation `aa`

   spark-sql> select "abc" || "DFF" ;

   And the result is "abcDFF".

In predicates.scala, "||" is the symbol of the "Or" operation. Could we 
remove it?
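
The grouping in the parsed plan above is what one would expect if `||` binds more tightly than `==`. The following is a toy sketch of that effect, not Spark's parser: the `PRECEDENCE` table, `tokenize`, and `parse` are assumptions made up for this illustration.

```python
import re

# Hypothetical precedence table for this sketch: a higher number binds tighter.
PRECEDENCE = {"==": 1, "||": 2}


def tokenize(text):
    """Split the input into operators and bare words/numbers."""
    return re.findall(r"\|\||==|\w+", text)


def parse(tokens, min_prec=1):
    """Precedence climbing over left-associative binary operators."""
    left = tokens.pop(0)
    while tokens and PRECEDENCE.get(tokens[0], 0) >= min_prec:
        op = tokens.pop(0)
        # The right operand may only absorb operators that bind tighter than op.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = "(%s %s %s)" % (left, op, right)
    return left


grouped = parse(tokenize("id == 1 || id == 2"))
```

With this table the predicate groups as `((id == (1 || id)) == 2)`, which has the same shape as the `Filter (('id = concat(1, 'id)) = 2)` node in the plan.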

## How was this patch tested?

We can test this patch with unit tests.

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/httfighter/spark SPARK-24872

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21826.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21826


commit fb98029c451023789a2c7fa0e758c6c8790bbaea
Author: 韩田田00222924 
Date:   2018-07-20T09:19:54Z

SPARK-24872 Remove the symbol “||” of the “OR” operation




---




[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-20 Thread zuotingbing
Github user zuotingbing commented on the issue:

https://github.com/apache/spark/pull/21789
  
@mgaido91 Yes, I have updated it. Thanks.


---




[GitHub] spark issue #20761: [SPARK-20327][CORE][YARN] Add CLI support for YARN custo...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20761
  
**[Test build #93336 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93336/testReport)**
 for PR 20761 at commit 
[`0ff9dee`](https://github.com/apache/spark/commit/0ff9dee17e720fd448ad3c3939e5a2937a13b711).


---




[GitHub] spark issue #21652: [SPARK-24551][K8S] Add integration tests for secrets

2018-07-20 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/21652
  
@felixcheung can I have a merge pls?


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21815
  
**[Test build #93335 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93335/testReport)**
 for PR 21815 at commit 
[`be6e594`](https://github.com/apache/spark/commit/be6e5941991ca045100456e11a59a9b2eb77a1ea).


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21815
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21815
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1167/
Test PASSed.


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21815
  
**[Test build #93334 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93334/testReport)**
 for PR 21815 at commit 
[`7f531bd`](https://github.com/apache/spark/commit/7f531bd3962685ff2bd271af8721653319f618bf).


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21815
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1166/
Test PASSed.


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21815
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21815: [SPARK-23731][SQL] Make FileSourceScanExec canonicalizab...

2018-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21815
  
Let me update it soon.


---




[GitHub] spark issue #21789: [SPARK-24829][SQL]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-20 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/21789
  
@zuotingbing I think what @dongjoon-hyun was suggesting was to put 
`[STS]` instead of `[SQL]` in the title of the PR. Could you please update it 
accordingly? Thanks.


---




[GitHub] spark pull request #21815: [SPARK-23731][SQL] Make FileSourceScanExec canoni...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21815#discussion_r203980465
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/FileSourceScanExecSuite.scala
 ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileSourceScanExecSuite extends SharedSQLContext {
+  test("FileSourceScanExec should be canonicalizable on executor side") {
--- End diff --

`SparkPlanSuite` SGTM


---




[GitHub] spark pull request #21823: [SPARK-24870][SQL]Cache can't work normally if th...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21823#discussion_r203980186
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CanonicalizeSuite.scala
 ---
@@ -50,4 +52,30 @@ class CanonicalizeSuite extends SparkFunSuite {
 assert(range.where(arrays1).sameResult(range.where(arrays2)))
 assert(!range.where(arrays1).sameResult(range.where(arrays3)))
   }
+
+  test("Canonicalized result is not case-insensitive") {
--- End diff --

let's move it to `SameResultSuite`; also, let's pick a simpler test, like 
using a `Project` with one column instead of `Aggregate`.


---




[GitHub] spark pull request #21815: [SPARK-23731][SQL] Make FileSourceScanExec canoni...

2018-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21815#discussion_r203979619
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/FileSourceScanExecSuite.scala
 ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileSourceScanExecSuite extends SharedSQLContext {
+  test("FileSourceScanExec should be canonicalizable on executor side") {
--- End diff --

I found that `SparkPlanSuite` could be another place to add this, to address 
your comment. Let me stick to `FileSourceScanExec` for now, but please let me 
know if you prefer `SparkPlanSuite`. I don't mind changing it.


---




[GitHub] spark pull request #21823: [SPARK-24870][SQL]Cache can't work normally if th...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21823#discussion_r203979598
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala 
---
@@ -282,9 +282,9 @@ object QueryPlan extends PredicateHelper {
   case ar: AttributeReference =>
 val ordinal = input.indexOf(ar.exprId)
 if (ordinal == -1) {
-  ar
+  ar.withName("")
 } else {
-  ar.withExprId(ExprId(ordinal))
+  ar.withExprId(ExprId(ordinal)).withName("")
--- End diff --

I think we just need to add a `.canonicalized` at the end.


---




[GitHub] spark pull request #21823: [SPARK-24870][SQL]Cache can't work normally if th...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21823#discussion_r203979413
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala 
---
@@ -237,7 +237,7 @@ abstract class QueryPlan[PlanType <: 
QueryPlan[PlanType]] extends TreeNode[PlanT
 // Top level `AttributeReference` may also be used for output like 
`Alias`, we should
 // normalize the epxrId too.
 id += 1
-ar.withExprId(ExprId(id)).canonicalized
+ar.withExprId(ExprId(id)).withName("").canonicalized
--- End diff --

oh wait. I think we've already erased the name, in 
`Expression#canonicalized`
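
A minimal sketch of the idea under discussion, using plain Scala stand-ins 
rather than Catalyst classes: canonicalization replaces each expression ID 
with its ordinal in the input and erases the attribute name, so two plans 
that differ only in name casing or in concrete IDs compare equal. `Attr`, 
`canonicalize`, and the placeholder name `"none"` are assumptions made for 
illustration, not Spark's API.

```scala
// Simplified stand-ins for Catalyst classes; nothing here is Spark's real API.
object CanonicalizeSketch {
  case class Attr(name: String, exprId: Long)

  // Replace each attribute's exprId with its position in `input` and erase
  // its name. Attributes not found in `input` are left untouched, like the
  // `ordinal == -1` branch in the quoted diff.
  def canonicalize(attrs: Seq[Attr], input: Seq[Long]): Seq[Attr] =
    attrs.map { a =>
      val ordinal = input.indexOf(a.exprId)
      if (ordinal == -1) a else Attr("none", ordinal.toLong)
    }
}
```

For example, `canonicalize(Seq(Attr("Key", 17L)), Seq(17L))` and 
`canonicalize(Seq(Attr("key", 42L)), Seq(42L))` produce the same value, which 
is the property a case-insensitive cache lookup needs.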


---




[GitHub] spark pull request #21815: [SPARK-23731][SQL] Make FileSourceScanExec canoni...

2018-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21815#discussion_r203979151
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/FileSourceScanExecSuite.scala
 ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileSourceScanExecSuite extends SharedSQLContext {
+  test("FileSourceScanExec should be canonicalizable on executor side") {
--- End diff --

I think I can actually put this under `SparkPlanSuite`. Let me put it in 
there.


---




[GitHub] spark pull request #21823: [SPARK-24870][SQL]Cache can't work normally if th...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21823#discussion_r203978990
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala 
---
@@ -282,9 +282,9 @@ object QueryPlan extends PredicateHelper {
   case ar: AttributeReference =>
 val ordinal = input.indexOf(ar.exprId)
 if (ordinal == -1) {
-  ar
+  ar.withName("")
--- End diff --

let's leave it. We don't even normalize the exprId here.


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21823
  
**[Test build #9 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/9/testReport)**
 for PR 21823 at commit 
[`1aefcb3`](https://github.com/apache/spark/commit/1aefcb370ad972cfc17315d000569da1f11c61ef).


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21823
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21823
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1165/
Test PASSed.


---




[GitHub] spark pull request #21815: [SPARK-23731][SQL] Make FileSourceScanExec canoni...

2018-07-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21815#discussion_r203976429
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/FileSourceScanExecSuite.scala
 ---
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.test.SharedSQLContext
+
+class FileSourceScanExecSuite extends SharedSQLContext {
+  test("FileSourceScanExec should be canonicalizable on executor side") {
--- End diff --

A few things actually bother me about that: it's kind of messy to 
create `FileSourceScanExec` without `SparkSession` (and also without other 
utils from `SharedSQLContext`), and `QueryPlanSuite` is under `catalyst` 
whereas this plan itself is under `execution` in SQL core.

Also, I believe this PR is more about making the plan canonicalizable 
after it's de/serialized: the plan itself is already serializable and 
deserializable, but it's not canonicalizable after that.

Let me try to clean up based on your comment anyway.
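
A minimal sketch of how an object can survive a serialization round trip and 
still fail afterwards. It assumes the cause is a `@transient` field that 
comes back as `null` after deserialization; the thread does not spell out 
whether that is the exact mechanism in `FileSourceScanExec`. `Relation`, 
`Scan`, `canonicalName`, and `roundTrip` are hypothetical names.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-ins; not the real FileSourceScanExec or its relation.
object TransientFieldSketch {
  class Relation(val path: String) // deliberately not Serializable

  class Scan(@transient val relation: Relation, val table: String) extends Serializable {
    // Anything derived from `relation` only works where the field survives.
    def canonicalName: String = relation.path.toLowerCase
  }

  // Serialize and deserialize with plain Java serialization.
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

After `roundTrip`, the copy still carries `table`, but its `relation` is 
`null`, so calling `canonicalName` on the copy would throw a 
`NullPointerException`. A test for this kind of bug therefore has to exercise 
the object after the round trip, not just the round trip itself.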


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21823
  
**[Test build #93332 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93332/testReport)**
 for PR 21823 at commit 
[`c01cf89`](https://github.com/apache/spark/commit/c01cf897daea314c43c96253c0b41aace72637ac).


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21823
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21823: [SPARK-24870][SQL]Cache can't work normally if there are...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21823
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1164/
Test PASSed.


---




[GitHub] spark pull request #21802: [SPARK-23928][SQL] Add shuffle collection functio...

2018-07-20 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21802#discussion_r203974038
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -1184,6 +1186,137 @@ case class ArraySort(child: Expression) extends 
UnaryExpression with ArraySortLi
   override def prettyName: String = "array_sort"
 }
 
+/**
+ * Returns a random permutation of the given array.
+ *
+ * This implementation uses the modern version of Fisher-Yates algorithm.
+ * Reference: 
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#Modern_method
--- End diff --

Oh, I see. Let me try.


---




[GitHub] spark pull request #21823: [SPARK-24870][SQL]Cache can't work normally if th...

2018-07-20 Thread eatoncys
Github user eatoncys commented on a diff in the pull request:

https://github.com/apache/spark/pull/21823#discussion_r203972375
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala 
---
@@ -237,7 +239,7 @@ abstract class QueryPlan[PlanType <: 
QueryPlan[PlanType]] extends TreeNode[PlanT
 // Top level `AttributeReference` may also be used for output like 
`Alias`, we should
 // normalize the epxrId too.
 id += 1
-ar.withExprId(ExprId(id)).canonicalized
+
ar.withExprId(ExprId(id)).withName(ar.name.toLowerCase(Locale.ROOT)).canonicalized
--- End diff --

I think it is OK; Spark version 2.0.2 erases the attribute name.


---




[GitHub] spark issue #21774: [SPARK-24811][SQL]Avro: add new function from_avro and t...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21774
  
**[Test build #93331 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93331/testReport)**
 for PR 21774 at commit 
[`7179e85`](https://github.com/apache/spark/commit/7179e85f49fbd2f6f1a6a0d27dae474d6df12cea).


---




[GitHub] spark issue #21774: [SPARK-24811][SQL]Avro: add new function from_avro and t...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21774
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21754: [SPARK-24705][SQL] Cannot reuse an exchange opera...

2018-07-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21754#discussion_r203972003
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/Exchange.scala 
---
@@ -85,14 +85,20 @@ case class ReusedExchangeExec(override val output: 
Seq[Attribute], child: Exchan
  */
 case class ReuseExchange(conf: SQLConf) extends Rule[SparkPlan] {
 
+  private def supportReuseExchange(exchange: Exchange): Boolean = exchange 
match {
+// If a coordinator defined in an exchange operator, the exchange 
cannot be reused
--- End diff --

Ah, ok. I’ll check if we can.


---




[GitHub] spark issue #21774: [SPARK-24811][SQL]Avro: add new function from_avro and t...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21774
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1163/
Test PASSed.


---




[GitHub] spark issue #21774: [SPARK-24811][SQL]Avro: add new function from_avro and t...

2018-07-20 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/21774
  
I will create another separate PR to totally remove SerializableSchema.


---




[GitHub] spark pull request #21823: [SPARK-24870][SQL]Cache can't work normally if th...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21823#discussion_r203969926
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala 
---
@@ -237,7 +239,7 @@ abstract class QueryPlan[PlanType <: 
QueryPlan[PlanType]] extends TreeNode[PlanT
 // Top level `AttributeReference` may also be used for output like 
`Alias`, we should
 // normalize the epxrId too.
 id += 1
-ar.withExprId(ExprId(id)).canonicalized
+
ar.withExprId(ExprId(id)).withName(ar.name.toLowerCase(Locale.ROOT)).canonicalized
--- End diff --

shall we just erase the attribute name like alias?


---




[GitHub] spark pull request #21802: [SPARK-23928][SQL] Add shuffle collection functio...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21802#discussion_r203968939
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -2086,6 +2087,20 @@ class Analyzer(
 }
   }
 
+  /**
+   * Set the seed for random number generation in Shuffle expressions.
+   */
+  object ResolvedShuffleExpressions extends Rule[LogicalPlan] {
+private lazy val random = new Random()
+
+override def apply(plan: LogicalPlan): LogicalPlan = plan.transformUp {
+  case p if p.resolved => p
+  case p => p transformExpressionsUp {
+case Shuffle(child, None) => Shuffle(child, 
Some(random.nextLong()))
--- End diff --

then can we use a single rule to assign a seed to these randomized functions?
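
A minimal sketch of what a single seeding rule could look like, on a toy 
expression tree instead of Catalyst. The case classes only mimic the shapes 
quoted in the diffs (`Shuffle(child, seed)`, `Uuid(seed)`); `Column`, 
`Concat`, and `assignSeeds` are hypothetical, and this is not the rule that 
either PR implements.

```scala
import scala.util.Random

// Toy expression tree; the case classes only mimic the shapes quoted in the diffs.
object SeedRuleSketch {
  sealed trait Expr
  case class Column(name: String) extends Expr
  case class Shuffle(child: Expr, seed: Option[Long]) extends Expr
  case class Uuid(seed: Option[Long]) extends Expr
  case class Concat(children: List[Expr]) extends Expr

  // One bottom-up pass that fills in every missing seed, whatever the
  // expression type; seeds that are already set are kept as they are.
  def assignSeeds(expr: Expr, random: Random): Expr = expr match {
    case Shuffle(child, seed) =>
      Shuffle(assignSeeds(child, random), seed.orElse(Some(random.nextLong())))
    case Uuid(None) => Uuid(Some(random.nextLong()))
    case Concat(children) => Concat(children.map(assignSeeds(_, random)))
    case other => other
  }
}
```

Because existing seeds are preserved, running the pass a second time leaves 
the tree unchanged, which matters for a rule that may be applied repeatedly 
until a fixed point is reached.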


---




[GitHub] spark pull request #20861: [SPARK-23599][SQL] Use RandomUUIDGenerator in Uui...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20861#discussion_r203968743
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1994,6 +1996,20 @@ class Analyzer(
 }
   }
 
+  /**
+   * Set the seed for random number generation in Uuid expressions.
+   */
+  object ResolvedUuidExpressions extends Rule[LogicalPlan] {
+private lazy val random = new Random()
+
+override def apply(plan: LogicalPlan): LogicalPlan = plan.transformUp {
+  case p if p.resolved => p
+  case p => p transformExpressionsUp {
+case Uuid(None) => Uuid(Some(random.nextLong()))
--- End diff --

shall we do the same thing for `Rand`?


---




[GitHub] spark pull request #21802: [SPARK-23928][SQL] Add shuffle collection functio...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21802#discussion_r203968608
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -1184,6 +1186,137 @@ case class ArraySort(child: Expression) extends 
UnaryExpression with ArraySortLi
   override def prettyName: String = "array_sort"
 }
 
+/**
+ * Returns a random permutation of the given array.
+ *
+ * This implementation uses the modern version of Fisher-Yates algorithm.
+ * Reference: 
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#Modern_method
--- End diff --

if we create a new array, I guess there should be some simpler algorithms 
without swapping...
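
One candidate for such an algorithm is the "inside-out" variant of 
Fisher-Yates, which fills a fresh array while reading the source once. The 
thread does not say which algorithm was meant, so this is only a sketch under 
that assumption; `ShuffleSketch.shuffled` is a hypothetical helper over 
`Array[Int]`, not the expression implemented in the PR.

```scala
import scala.util.Random

// Hypothetical helper; not the implementation proposed in the PR.
object ShuffleSketch {
  // "Inside-out" Fisher-Yates: builds the permutation in a new array and
  // leaves the input untouched.
  def shuffled(source: Array[Int], seed: Long): Array[Int] = {
    val random = new Random(seed)
    val result = new Array[Int](source.length)
    for (i <- source.indices) {
      val j = random.nextInt(i + 1) // uniform in [0, i]
      // Move the element currently at j to the new last slot, then
      // place the next source element at j.
      if (j != i) result(i) = result(j)
      result(j) = source(i)
    }
    result
  }
}
```

The result is always a permutation of the input, and passing the same seed 
yields the same permutation, which fits the idea of assigning the seed once 
during analysis.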


---




[GitHub] spark pull request #21754: [SPARK-24705][SQL] Cannot reuse an exchange opera...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21754#discussion_r203968311
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/Exchange.scala 
---
@@ -85,14 +85,20 @@ case class ReusedExchangeExec(override val output: 
Seq[Attribute], child: Exchan
  */
 case class ReuseExchange(conf: SQLConf) extends Rule[SparkPlan] {
 
+  private def supportReuseExchange(exchange: Exchange): Boolean = exchange 
match {
+// If a coordinator defined in an exchange operator, the exchange 
cannot be reused
--- End diff --

I think an object reference also works, since currently if it's the same 
coordinator, it's the same object.
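
A minimal sketch of the object-reference check, using hypothetical stand-ins 
(`Coordinator`, `Exchange`, `sameCoordinator`) rather than the real 
`ExchangeCoordinator`. It assumes, as stated above, that one logical 
coordinator is always represented by one instance.

```scala
// Hypothetical stand-ins for the classes discussed in the thread.
object CoordinatorIdentitySketch {
  // A plain class with no equals override, so instances are only
  // distinguishable by reference.
  class Coordinator(val targetSize: Long)
  case class Exchange(name: String, coordinator: Option[Coordinator])

  // Two exchanges share a coordinator only if both point at the same instance.
  def sameCoordinator(a: Exchange, b: Exchange): Boolean =
    (a.coordinator, b.coordinator) match {
      case (Some(x), Some(y)) => x eq y // reference equality
      case (None, None) => true
      case _ => false
    }
}
```

Two coordinators built with identical settings still compare as different 
here, which is the behaviour an explicit id would otherwise have to provide.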


---




[GitHub] spark issue #21825: [SPARK-18188][DOC][FOLLOW-UP]Add `spark.broadcast.checks...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21825
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21825: [SPARK-18188][DOC][FOLLOW-UP]Add `spark.broadcast.checks...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21825
  
**[Test build #93329 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93329/testReport)**
 for PR 21825 at commit 
[`6a85aad`](https://github.com/apache/spark/commit/6a85aadc33a6d1ba18d028eeafce3167e5b7aaf7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21825: [SPARK-18188][DOC][FOLLOW-UP]Add `spark.broadcast.checks...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21825
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93329/
Test PASSed.


---




[GitHub] spark pull request #21754: [SPARK-24705][SQL] Cannot reuse an exchange opera...

2018-07-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21754#discussion_r203968016
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/Exchange.scala 
---
@@ -85,14 +85,20 @@ case class ReusedExchangeExec(override val output: 
Seq[Attribute], child: Exchan
  */
 case class ReuseExchange(conf: SQLConf) extends Rule[SparkPlan] {
 
+  private def supportReuseExchange(exchange: Exchange): Boolean = exchange 
match {
+// If a coordinator defined in an exchange operator, the exchange 
cannot be reused
--- End diff --

can we assign an id to the `ExchangeCoordinator` so that we can correctly 
tell if they are the same coordinator?


---




[GitHub] spark pull request #21754: [SPARK-24705][SQL] Cannot reuse an exchange opera...

2018-07-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21754#discussion_r203966689
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/Exchange.scala ---
@@ -85,14 +85,20 @@ case class ReusedExchangeExec(override val output: Seq[Attribute], child: Exchan
  */
 case class ReuseExchange(conf: SQLConf) extends Rule[SparkPlan] {
 
+  private def supportReuseExchange(exchange: Exchange): Boolean = exchange match {
+    // If a coordinator defined in an exchange operator, the exchange cannot be reused
--- End diff --

We might be able to logically reuse the same coordinator, but it seems 
difficult to implement on the current master, I think. In the current 
adaptive query execution, the exchanges (between stages) registered in a 
coordinator and their partition sizes are decided at runtime (inside 
`SparkPlan.execute()`). Since `ReuseExchange` runs in the final phase of 
planning, it is difficult to tell which coordinator can be reused at that 
time. So, to achieve the reuse, we might need some refactoring of this 
logic...
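
For reference, the conservative rule described in the diff comment (skip reuse whenever a coordinator is defined) could look roughly like the following. This is only a sketch in plain Scala under my own assumptions: `PlanNode`, `ShuffleNode`, `OtherNode`, and `ConservativeReuse` are hypothetical names, and the body of the actual `supportReuseExchange` is not shown in the quoted hunk.

```scala
// Hypothetical plan nodes; not Spark's SparkPlan hierarchy.
sealed trait PlanNode
final case class ShuffleNode(id: Int, coordinator: Option[String]) extends PlanNode
final case class OtherNode(id: Int) extends PlanNode

object ConservativeReuse {
  // At planning time we cannot know what a coordinator will decide at runtime,
  // so any node that carries a coordinator is excluded from reuse.
  def supportsReuse(node: PlanNode): Boolean = node match {
    case ShuffleNode(_, Some(_)) => false
    case _                       => true
  }

  // Keep only the nodes that are safe candidates for reuse.
  def reuseCandidates(nodes: Seq[PlanNode]): Seq[PlanNode] =
    nodes.filter(supportsReuse)
}
```

The trade-off is that some exchanges that could in principle be shared are left alone, in return for not depending on runtime state during planning.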


---




[GitHub] spark issue #21824: [SPARK-24871][SQL] Refactor Concat and MapConcat to avoi...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21824
  
**[Test build #93326 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93326/testReport)**
 for PR 21824 at commit 
[`c254523`](https://github.com/apache/spark/commit/c2545232d2157311ab3ea3ccf6dd45f1a5024f02).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21824: [SPARK-24871][SQL] Refactor Concat and MapConcat to avoi...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21824
  
**[Test build #93330 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93330/testReport)**
 for PR 21824 at commit 
[`c254523`](https://github.com/apache/spark/commit/c2545232d2157311ab3ea3ccf6dd45f1a5024f02).


---




[GitHub] spark issue #21824: [SPARK-24871][SQL] Refactor Concat and MapConcat to avoi...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21824
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1162/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21824: [SPARK-24871][SQL] Refactor Concat and MapConcat to avoi...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21824
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21103: [SPARK-23915][SQL] Add array_except function

2018-07-20 Thread mn-mikke
Github user mn-mikke commented on a diff in the pull request:

https://github.com/apache/spark/pull/21103#discussion_r203964623
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3805,3 +3799,332 @@ object ArrayUnion {
 new GenericArrayData(arrayBuffer)
   }
 }
+
+/**
+ * Returns an array of the elements in the intersect of x and y, without duplicates
+ */
+@ExpressionDescription(
+  usage = """
+  _FUNC_(array1, array2) - Returns an array of the elements in array1 but not in array2,
+without duplicates.
+  """,
+  examples = """
+Examples:Fun
+  > SELECT _FUNC_(array(1, 2, 3), array(1, 3, 5));
+   array(2)
+  """,
+  since = "2.4.0")
+case class ArrayExcept(left: Expression, right: Expression) extends ArraySetLike {
+  override def dataType: DataType = ArrayType(elementType,
--- End diff --

Yeah, this case is valid. But the ```containsNull``` flag is defined for the 
whole column (across multiple rows). Since this flag could cause removal of 
null-safe checks in expressions that use ```ArrayExcept``` as a child, it 
could lead to failures with ```NullPointerException``` for cases like the 
second row of the example dataset. WDYT?
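
A small sketch of the concern, in plain Scala without Spark. Everything here is an assumption for illustration: the two-row dataset is made up (it is not the dataset from the earlier comment), `None` models a SQL NULL element, and `optimisticFlag` is a hypothetical static rule, not something proposed in the PR.

```scala
object ContainsNullSketch {
  type Arr = Seq[Option[Int]]

  // Elements of left that are not in right, without duplicates.
  def except(left: Arr, right: Arr): Arr = {
    val exclude = right.toSet
    left.distinct.filterNot(exclude.contains)
  }

  // Column-level flag: true if any row's array holds a null element.
  def containsNull(column: Seq[Arr]): Boolean = column.exists(_.contains(None))

  // Made-up two-row dataset. Only the first row of the right column has a null.
  val lefts: Seq[Arr]  = Seq(Seq(Some(1), None), Seq(Some(1), None))
  val rights: Seq[Arr] = Seq(Seq(Some(1), None), Seq(Some(1)))

  // Hypothetical static rule: "a null in the right column removes nulls from the result".
  val optimisticFlag: Boolean = containsNull(lefts) && !containsNull(rights)

  // What actually happens row by row.
  val results: Seq[Arr] = lefts.zip(rights).map { case (l, r) => except(l, r) }
  val actualFlag: Boolean = containsNull(results)
}
```

In this made-up data the static rule reports that the result has no nulls, while the second row still yields a null element, which is the kind of mismatch that could let a parent expression skip a null check it needs.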


---




[GitHub] spark issue #21825: [SPARK-18188][DOC][FOLLOW-UP]Add `spark.broadcast.checks...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21825
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1161/
Test PASSed.


---




[GitHub] spark issue #21825: [SPARK-18188][DOC][FOLLOW-UP]Add `spark.broadcast.checks...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21825
  
Merged build finished. Test PASSed.


---



