[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23124
  
**[Test build #99310 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99310/testReport)**
 for PR 23124 at commit 
[`abd0ec5`](https://github.com/apache/spark/commit/abd0ec543a944fa02320337f4fab7fff6ffe9667).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23124
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23124
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5393/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23145: [MINOR][Docs] "a R interpreter" -> "an R interpre...

2018-11-26 Thread kjmrknsn
Github user kjmrknsn commented on a diff in the pull request:

https://github.com/apache/spark/pull/23145#discussion_r236549454
  
--- Diff: docs/index.md ---
@@ -67,7 +67,7 @@ Example applications are also provided in Python. For 
example,
 ./bin/spark-submit examples/src/main/python/pi.py 10
 
 Spark also provides an experimental [R API](sparkr.html) since 1.4 (only 
DataFrames APIs included).
--- End diff --

Should we remove the "experimental" adjective?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22887: [SPARK-25880][CORE] user set's hadoop conf should not ov...

2018-11-26 Thread gjhkael
Github user gjhkael commented on the issue:

https://github.com/apache/spark/pull/22887
  
@vanzin Thanks for you review, I add a new commit to let the user's "set" 
command take effect. Let me know if you have an easier way. Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23144
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23144
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99307/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23144
  
**[Test build #99307 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99307/testReport)**
 for PR 23144 at commit 
[`27429fb`](https://github.com/apache/spark/commit/27429fb5ae95d9ee68dc0f0a769fd3412c54ebfc).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23124
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99304/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23124
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23124
  
**[Test build #99304 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99304/testReport)**
 for PR 23124 at commit 
[`0c77c41`](https://github.com/apache/spark/commit/0c77c4173606edb09d4d2c7a721dee48c976233f).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23145: [MINOR][Docs] "a R interpreter" -> "an R interpre...

2018-11-26 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/23145#discussion_r236546043
  
--- Diff: docs/index.md ---
@@ -67,7 +67,7 @@ Example applications are also provided in Python. For 
example,
 ./bin/spark-submit examples/src/main/python/pi.py 10
 
 Spark also provides an experimental [R API](sparkr.html) since 1.4 (only 
DataFrames APIs included).
--- End diff --

tbh, I'm not sure this should be called "an experimental [R API]"


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-26 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/23144
  
I am not sure about `$$` or `%%`, we can replace them with other names.
I want to resolve the confusion of case-insensitivity, and wonder whether a 
new flag can do this.
If we want to keep the return of getter identical to value passed to 
setter, we will need two version of 'getter', one to return the original value, 
the other to return the lower/upper value.
Another issue is that, there is no `StringParam` trait, so we have to 
modify `Param` trait directly.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23144
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23128
  
**[Test build #99308 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99308/testReport)**
 for PR 23128 at commit 
[`8689acb`](https://github.com/apache/spark/commit/8689acbc6c1f106d7b2d906bcd627147f649e6ff).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23083
  
**[Test build #99309 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99309/testReport)**
 for PR 23083 at commit 
[`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23144
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5390/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23128
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5391/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23128
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23083
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23144
  
**[Test build #99307 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99307/testReport)**
 for PR 23144 at commit 
[`27429fb`](https://github.com/apache/spark/commit/27429fb5ae95d9ee68dc0f0a769fd3412c54ebfc).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23083
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5392/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23083
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23128: [SPARK-26142][SQL] Support passing shuffle metrics to ex...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23128
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23122: [MINOR][ML] add missing params to Instr

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23122
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23122: [MINOR][ML] add missing params to Instr

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23122
  
**[Test build #99306 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99306/testReport)**
 for PR 23122 at commit 
[`b74a766`](https://github.com/apache/spark/commit/b74a76639ce9db1c6d002a2361e7be598cbd7dc7).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23122: [MINOR][ML] add missing params to Instr

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23122
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5389/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23122: [MINOR][ML] add missing params to Instr

2018-11-26 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request:

https://github.com/apache/spark/pull/23122#discussion_r236537309
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala 
---
@@ -671,7 +671,7 @@ class ALS(@Since("1.4.0") override val uid: String) 
extends Estimator[ALSModel]
 instr.logDataset(dataset)
 instr.logParams(this, rank, numUserBlocks, numItemBlocks, 
implicitPrefs, alpha, userCol,
   itemCol, ratingCol, predictionCol, maxIter, regParam, nonnegative, 
checkpointInterval,
-  seed, intermediateStorageLevel, finalStorageLevel)
+  seed, intermediateStorageLevel, finalStorageLevel, coldStartStrategy)
--- End diff --

Yes, we could let coldStartStrategy alone. BTW, I made a rapid scan and 
found that some algs do not log the columns.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23106: [SPARK-26141] Enable custom metrics implementatio...

2018-11-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23106


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23106: [SPARK-26141] Enable custom metrics implementation in sh...

2018-11-26 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/23106
  
Merging in master. Thanks @squito.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23152
  
**[Test build #99305 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99305/testReport)**
 for PR 23152 at commit 
[`1c1edcc`](https://github.com/apache/spark/commit/1c1edccc43ac9cdb6da53ad93ac998b06bf8b6dd).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23152
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5388/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23152
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-26 Thread adrian-wang
GitHub user adrian-wang opened a pull request:

https://github.com/apache/spark/pull/23152

[SPARK-26181][SQL] the `hasMinMaxStats` method of `ColumnStatsMap` is not 
correct

## What changes were proposed in this pull request?

For now the `hasMinMaxStats` will return the same as `hasCountStats`, which 
is obviously not as expected.

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/adrian-wang/spark minmaxstats

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23152.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23152


commit 1c1edccc43ac9cdb6da53ad93ac998b06bf8b6dd
Author: Daoyuan Wang 
Date:   2018-11-27T06:18:34Z

fix ColumnStatsMap.hasMinMaxStats




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23148: [SPARK-26177] Automated formatting for Scala code

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23148
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23148: [SPARK-26177] Automated formatting for Scala code

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23148
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99299/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23148: [SPARK-26177] Automated formatting for Scala code

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23148
  
**[Test build #99299 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99299/testReport)**
 for PR 23148 at commit 
[`92dd105`](https://github.com/apache/spark/commit/92dd105b4f2d3be2d2059f5a983dab01339e81a9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule Repla...

2018-11-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23139


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

2018-11-26 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/23139
  
Thanks. Merged into master.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23130#discussion_r236515927
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala 
---
@@ -388,7 +388,7 @@ case class FileSourceScanExec(
 logInfo(s"Planning with ${bucketSpec.numBuckets} buckets")
 val filesGroupedToBuckets =
   selectedPartitions.flatMap { p =>
-p.files.map { f =>
+p.files.filter(_.getLen > 0).map { f =>
--- End diff --

yes, and the same change is also in `createNonBucketedReadRDD`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23124
  
**[Test build #99304 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99304/testReport)**
 for PR 23124 at commit 
[`0c77c41`](https://github.com/apache/spark/commit/0c77c4173606edb09d4d2c7a721dee48c976233f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23124
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5387/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23124
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23143: [SPARK-24762][SQL][Followup] Enable Option of Pro...

2018-11-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23143


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23138
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23138
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99300/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23138: [SPARK-23356][SQL][TEST] add new test cases for a + 1,a ...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23138
  
**[Test build #99300 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99300/testReport)**
 for PR 23138 at commit 
[`5ef0035`](https://github.com/apache/spark/commit/5ef0035cb64ede974cf0dbdbd095f6ee207c6e76).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23151
  
**[Test build #99303 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99303/testReport)**
 for PR 23151 at commit 
[`6402d2d`](https://github.com/apache/spark/commit/6402d2d35fd1588a730441755b3aa47c11b08f1c).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5386/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23143: [SPARK-24762][SQL][Followup] Enable Option of Product en...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23143
  
thanks, merging to master!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23106: [SPARK-26141] Enable custom metrics implementation in sh...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23106
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23106: [SPARK-26141] Enable custom metrics implementation in sh...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23106
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99298/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23086#discussion_r236514039
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java 
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.Evolving;
+import org.apache.spark.sql.sources.v2.reader.Scan;
+import org.apache.spark.sql.sources.v2.reader.ScanBuilder;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a logical structured data set of a data 
source. For example, the
+ * implementation can be a directory on the file system, a topic of Kafka, 
or a table in the
+ * catalog, etc.
+ *
+ * This interface can mixin the following interfaces to support different 
operations:
+ * 
+ *   {@link SupportsBatchRead}: this table can be read in batch 
queries.
+ * 
+ */
+@Evolving
+public interface Table {
+
+  /**
+   * Returns the schema of this table.
+   */
+  StructType schema();
+
+  /**
+   * Returns a {@link ScanBuilder} which can be used to build a {@link 
Scan} later. Spark will call
+   * this method for each data scanning query.
+   *
+   * The builder can take some query specific information to do operators 
pushdown, and keep these
+   * information in the created {@link Scan}.
+   */
+  ScanBuilder newScanBuilder(DataSourceOptions options);
--- End diff --

I agree with it. Since `CaseInsensitiveStringMap` is not in the code base 
yet, shall we do it in the followup?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23106: [SPARK-26141] Enable custom metrics implementation in sh...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23106
  
**[Test build #99298 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99298/testReport)**
 for PR 23106 at commit 
[`4e9b756`](https://github.com/apache/spark/commit/4e9b7565a4e78d52cc042a2768043b9db0b21ff6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23086#discussion_r236513622
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java 
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
--- End diff --

why does this `Table` API need to be in catalyst? It's not even a plan. We 
can define a table plan interface in catalyst, and implement it in the SQL 
module with this `Table` API.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22991: [SPARK-25989][ML] OneVsRestModel handle empty outputCols...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22991
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22991: [SPARK-25989][ML] OneVsRestModel handle empty outputCols...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22991
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99301/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22991: [SPARK-25989][ML] OneVsRestModel handle empty outputCols...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22991
  
**[Test build #99301 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99301/testReport)**
 for PR 22991 at commit 
[`74cc277`](https://github.com/apache/spark/commit/74cc277dc5668ad59efd19fbf47d4cfa824ba9bf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r236512196
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", 
true)
   // each python worker gets an equal part of the allocation. the worker 
pool will grow to the
   // number of concurrent tasks, which is determined by the number of 
cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

I mean, even if it succeeds to allocate in Yarn and Python worker doesn't 
have the control on that, what's the point? 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-26 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23127
  
there are still 2 golden file test failures because of the plan change...


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23151
  
**[Test build #99302 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99302/testReport)**
 for PR 23151 at commit 
[`d29d4ec`](https://github.com/apache/spark/commit/d29d4ece73dd17ee3ba5e1e85e2d35096f524810).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99302/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23151
  
**[Test build #99302 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99302/testReport)**
 for PR 23151 at commit 
[`d29d4ec`](https://github.com/apache/spark/commit/d29d4ece73dd17ee3ba5e1e85e2d35096f524810).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5385/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir functi...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-26 Thread heary-cao
GitHub user heary-cao opened a pull request:

https://github.com/apache/spark/pull/23151

[SPARK-26180][CORE][TEST] Add a withCreateTempDir function to the SparkCore 
test case

## What changes were proposed in this pull request?

Currently, the common `withTempDir` function is used in Spark SQL test 
cases. To handle `val dir = Utils. createTempDir()` and `Utils. 
deleteRecursively (dir)`. Unfortunately, the `withTempDir` function cannot be 
used in the Spark Core test case. This PR adds a common `withCreateTempDir` 
function to clean up SparkCore test cases. thanks.

## How was this patch tested?

N / A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/heary-cao/spark withCreateTempDir

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23151.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23151


commit d29d4ece73dd17ee3ba5e1e85e2d35096f524810
Author: caoxuewen 
Date:   2018-11-27T03:00:08Z

Add a withCreateTempDir function to the SparkCore test case




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23019: [SPARK-26025][k8s] Speed up docker image build on dev re...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23019
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23019: [SPARK-26025][k8s] Speed up docker image build on dev re...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23019
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99293/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22995: [SPARK-25998] [CORE] Change TorrentBroadcast to hold wea...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22995
  
**[Test build #4443 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4443/testReport)**
 for PR 22995 at commit 
[`09ae762`](https://github.com/apache/spark/commit/09ae762962098e58be7ba8f777f9dffde2f81d81).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23019: [SPARK-26025][k8s] Speed up docker image build on dev re...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23019
  
**[Test build #99293 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99293/testReport)**
 for PR 23019 at commit 
[`4439543`](https://github.com/apache/spark/commit/4439543cc8e320542cd8f6826662c541d18e0a72).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22991: [SPARK-25989][ML] OneVsRestModel handle empty outputCols...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22991
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22991: [SPARK-25989][ML] OneVsRestModel handle empty outputCols...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22991
  
**[Test build #99301 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99301/testReport)**
 for PR 22991 at commit 
[`74cc277`](https://github.com/apache/spark/commit/74cc277dc5668ad59efd19fbf47d4cfa824ba9bf).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22991: [SPARK-25989][ML] OneVsRestModel handle empty outputCols...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22991
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5384/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23149: [SPARK-25451][HOTFIX] Call stage.attemptNumber instead o...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23149
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99292/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23149: [SPARK-25451][HOTFIX] Call stage.attemptNumber instead o...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23149
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23149: [SPARK-25451][HOTFIX] Call stage.attemptNumber instead o...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23149
  
**[Test build #99292 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99292/testReport)**
 for PR 23149 at commit 
[`9c84f18`](https://github.com/apache/spark/commit/9c84f181a460b7b22afca942f38ffb3eee6d545c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23139
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23139
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99294/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-26 Thread mccheah
Github user mccheah commented on a diff in the pull request:

https://github.com/apache/spark/pull/23086#discussion_r236500170
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java 
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.Evolving;
+import org.apache.spark.sql.sources.v2.reader.Scan;
+import org.apache.spark.sql.sources.v2.reader.ScanBuilder;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a logical structured data set of a data 
source. For example, the
+ * implementation can be a directory on the file system, a topic of Kafka, 
or a table in the
+ * catalog, etc.
+ *
+ * This interface can mixin the following interfaces to support different 
operations:
+ * 
+ *   {@link SupportsBatchRead}: this table can be read in batch 
queries.
+ * 
+ */
+@Evolving
+public interface Table {
+
+  /**
+   * Returns the schema of this table.
+   */
+  StructType schema();
+
+  /**
+   * Returns a {@link ScanBuilder} which can be used to build a {@link 
Scan} later. Spark will call
+   * this method for each data scanning query.
+   *
+   * The builder can take some query specific information to do operators 
pushdown, and keep these
+   * information in the created {@link Scan}.
+   */
+  ScanBuilder newScanBuilder(DataSourceOptions options);
--- End diff --

Makes sense to me - `DataSourceOptions` was carrying along identifiers that 
really belong to a table identifier and that should be interpreted at the 
catalog level, not the data read level. In other words the implementation of 
this `Table` should already know _what_ locations to look up (e.g. "files 
comprising dataset D"), now it's a matter of _how_ (e.g. pushdown, filter 
predicates).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23139: [SPARK-25860][SPARK-26107] [FOLLOW-UP] Rule ReplaceNullW...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23139
  
**[Test build #99294 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99294/testReport)**
 for PR 23139 at commit 
[`8b0401c`](https://github.com/apache/spark/commit/8b0401c4440136e33d4580a3f8da80164de3d4b4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r236494667
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", 
true)
   // each python worker gets an equal part of the allocation. the worker 
pool will grow to the
   // number of concurrent tasks, which is determined by the number of 
cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

I mean .. what I am disabling here is setting Python memory limit on 
Windows; looks Yarn side still can allocate more but not tested.

I'm thinking we should disable whole feature on Windows ideally but I 
couldn't either test it on Windows because I don't have Windows Yarn cluster (I 
only have one Windows machine that has dev environment).

I was trying to at least document that it doesn't work because it's going 
to work differently comparing to other OSes that don't have `resource` module 
(For current status, PySpark apps that uses Python worker do not work at all 
and we don't necessarily document it works on Windows). I think it's more 
correct to say it does not work because it's not tested (or at least just 
simply say it's not guaranteed on Windows).

For the reason that I prefer to check it on JVM side instead is, 
1. It's really usual to check it on JVM side when it's possible and make 
the worker simple. It could make it coupled to JVM but it's already strongly 
coupled

2. If Yarn code can also be tested, and if it doesn't work, it should also 
be disabled in JVM side.
If we can find a workaround on Windows, we can fix the Python worker and 
enable it.

3. If it's checked JVM side ahead, it reduces one state in JVM (`memoryMb`) 
is enabled in JVM side but Python skips it.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-26 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/23086#discussion_r236492408
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java 
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
--- End diff --

Everything in catalyst is considered private (although public visibility 
for debugging) and it's best to stay that way.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-26 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/23086#discussion_r236491385
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java 
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.Evolving;
+import org.apache.spark.sql.sources.v2.reader.Scan;
+import org.apache.spark.sql.sources.v2.reader.ScanBuilder;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a logical structured data set of a data 
source. For example, the
+ * implementation can be a directory on the file system, a topic of Kafka, 
or a table in the
+ * catalog, etc.
+ *
+ * This interface can mixin the following interfaces to support different 
operations:
+ * 
+ *   {@link SupportsBatchRead}: this table can be read in batch 
queries.
+ * 
+ */
+@Evolving
+public interface Table {
+
+  /**
+   * Returns the schema of this table.
+   */
+  StructType schema();
+
+  /**
+   * Returns a {@link ScanBuilder} which can be used to build a {@link 
Scan} later. Spark will call
+   * this method for each data scanning query.
+   *
+   * The builder can take some query specific information to do operators 
pushdown, and keep these
+   * information in the created {@link Scan}.
+   */
+  ScanBuilder newScanBuilder(DataSourceOptions options);
--- End diff --

`DataSourceOptions` isn't simply a map for two main reasons that I can 
tell: first, it forces options to be case insensitive, and second, it exposes 
helper methods to identify tables, like `tableName`, `databaseName`, and 
`paths`. In the new abstraction, the second use of `DataSourceOptions` is no 
longer needed. The table is already instantiated by the time that this is 
called.

We should to reconsider `DataSourceOptions`. The `tableName` methods aren't 
needed and we also no longer need to forward properties from the session config 
because the way tables are configured has changed (catalogs handle that). I 
think we should remove this class and instead use the more direct 
implementation, `CaseInsensitiveStringMap` from #21306. The behavior of that 
class is obvious from its name and it would be shared between the v2 APIs, both 
catalog and data source.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-26 Thread mccheah
Github user mccheah commented on a diff in the pull request:

https://github.com/apache/spark/pull/23086#discussion_r236490800
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java 
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
--- End diff --

Moving this to the Catalyst package would set a precedent for 
user-overridable behavior to live in the catalyst project. I'm not aware of 
anything in the Catalyst package being considered as public API right now. Are 
we allowed to start such a convention at this juncture?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23117: [WIP][SPARK-7721][INFRA] Run and generate test co...

2018-11-26 Thread shaneknapp
Github user shaneknapp commented on a diff in the pull request:

https://github.com/apache/spark/pull/23117#discussion_r236488706
  
--- Diff: dev/run-tests.py ---
@@ -434,6 +434,63 @@ def run_python_tests(test_modules, parallelism):
 run_cmd(command)
 
 
+def run_python_tests_with_coverage(test_modules, parallelism):
+set_title_and_block("Running PySpark tests with coverage report", 
"BLOCK_PYSPARK_UNIT_TESTS")
+
+command = [os.path.join(SPARK_HOME, "python", 
"run-tests-with-coverage")]
+if test_modules != [modules.root]:
+command.append("--modules=%s" % ','.join(m.name for m in 
test_modules))
+command.append("--parallelism=%i" % parallelism)
+run_cmd(command)
+post_python_tests_results()
+
+
+def post_python_tests_results():
+if "SPARK_TEST_KEY" not in os.environ:
+print("[error] 'SPARK_TEST_KEY' environment variable was not set. 
Unable to post"
+  "PySpark coverage results.")
+sys.exit(1)
--- End diff --

sure, i can do that tomorrow (currently heading out for the day).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r236488396
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", 
true)
   // each python worker gets an equal part of the allocation. the worker 
pool will grow to the
   // number of concurrent tasks, which is determined by the number of 
cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

@vanzin, I mean the first change I did at 
https://github.com/apache/spark/pull/23055/commits/2d3315a7dab429abc4d9ef5ed7f8f5484e8421f1


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23136: [SPARK-25515][K8s] Adds a config option to keep e...

2018-11-26 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/23136#discussion_r236488306
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala
 ---
@@ -86,11 +88,14 @@ private[spark] class ExecutorPodsAllocator(
   s" cluster after $podCreationTimeout milliseconds despite the 
fact that a" +
   " previous allocation attempt tried to create it. The executor 
may have been" +
   " deleted but the application missed the deletion event.")
-Utils.tryLogNonFatalError {
-  kubernetesClient
-.pods()
-.withLabel(SPARK_EXECUTOR_ID_LABEL, execId.toString)
-.delete()
+
+if (shouldDeleteExecutors) {
+  Utils.tryLogNonFatalError {
+kubernetesClient
+  .pods()
+  .withLabel(SPARK_EXECUTOR_ID_LABEL, execId.toString)
--- End diff --

I just run multiple jobs in parallel:
```
NAMEREADY STATUS  RESTARTS   AGE
spark-pi-1543281035622-driver   1/1   Running 0  2m
spark-pi-1543281035622-exec-1   1/1   Running 0  1m
spark-pi-1543281098418-driver   0/1   Completed   0  1m
spark-pi-1543281107591-driver   0/1   Completed   0  57s
```
The long running was not terminated... 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-26 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/23086#discussion_r236487464
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java 
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
--- End diff --

#21306 (TableCatalog support) adds this class as 
`org.apache.spark.sql.catalog.v2.Table` in the `spark-catalyst` module. I think 
it needs to be in the catalyst module and should probably be in the 
`o.a.s.sql.catalog.v2` package as well.

The important one is moving this to the catalyst module. The analyzer is in 
catalyst and all of the v2 logical plans and analysis rules will be in catalyst 
as well, because we are standardizing behavior. The standard validation rules 
should be in catalyst, not in a source-specific or hive-specific package in the 
sql-core or hive modules.

Because the logical plans and validation rules are in the catalyst package, 
the `TableCatalog` API needs to be there as well. For example, when a [catalog 
table identifier](https://github.com/apache/spark/pull/21978) is resolved for a 
read query, one of the results is a `TableCatalog` instance for the catalog 
portion of the identifier. That catalog is used to load the v2 table, which is 
then wrapped in a v2 relation for further analysis. Similarly, the write path 
should also validate that the catalog exists during analysis by loading it, and 
would then pass the catalog in a v2 logical plan for `CreateTable` or 
`CreateTableAsSelect`.

I also think that it makes sense to use the 
`org.apache.spark.sql.catalog.v2` package for `Table` because `Table` is more 
closely tied to the `TableCatalog` API than to the data source API. The link to 
DSv2 is that `Table` carries `newScanBuilder`, but the rest of the methods 
exposed by `Table` are for catalog functions, like inspecting a table's 
partitioning or table properties.

Moving this class would make adding `TableCatalog` less intrusive.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23147: [SPARK-26140] followup: rename ShuffleMetricsRepo...

2018-11-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23147


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r236487153
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", 
true)
   // each python worker gets an equal part of the allocation. the worker 
pool will grow to the
   // number of concurrent tasks, which is determined by the number of 
cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

> The point Ryan and I are making is that there's no point in having this 
check in the Java code, because the Python code already handles it. It doesn't 
work on Windows, and the updated Python code handles that.

My point is we should remove the Python condition. 

Also, I mean I wonder if this whole feature itself introduced at 
https://github.com/apache/spark/pull/21977 works or not since it looks not ever 
tested.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23117: [WIP][SPARK-7721][INFRA] Run and generate test coverage ...

2018-11-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23117
  
> does this add time to running the tests?

Given my local tests, the time diff looked slightly increasing .. I want to 
see how it works in Jenkins ..


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23117: [WIP][SPARK-7721][INFRA] Run and generate test co...

2018-11-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23117#discussion_r236486298
  
--- Diff: dev/run-tests.py ---
@@ -434,6 +434,63 @@ def run_python_tests(test_modules, parallelism):
 run_cmd(command)
 
 
+def run_python_tests_with_coverage(test_modules, parallelism):
+set_title_and_block("Running PySpark tests with coverage report", 
"BLOCK_PYSPARK_UNIT_TESTS")
+
+command = [os.path.join(SPARK_HOME, "python", 
"run-tests-with-coverage")]
+if test_modules != [modules.root]:
+command.append("--modules=%s" % ','.join(m.name for m in 
test_modules))
+command.append("--parallelism=%i" % parallelism)
+run_cmd(command)
+post_python_tests_results()
+
+
+def post_python_tests_results():
+if "SPARK_TEST_KEY" not in os.environ:
+print("[error] 'SPARK_TEST_KEY' environment variable was not set. 
Unable to post"
+  "PySpark coverage results.")
+sys.exit(1)
--- End diff --

@shaneknapp can you add another environment variable that indicates PR 
builder and spark-master-test-sbt-hadoop-2.7 where we're going to run Python 
coverage? I can check it and explicitly enable it only in that condition.

True, if the condition below (which I checked before at #17669):

```python
os.environ.get("AMPLAB_JENKINS_BUILD_PROFILE", "") == 
"hadoop2.7"
and os.environ.get("SPARK_BRANCH", "") == "master"
and os.environ.get("AMPLAB_JENKINS", "") == "true"
and os.environ.get("AMPLAB_JENKINS_BUILD_TOOL", "") == "sbt")
```

is `True` in Jenkins build or other users environment, it might cause some 
problems (even though looks quite unlikely).

For similar instance, if `AMPLAB_JENKINS` is set in users environment who 
run the tests locally, it wouldn't work anyway tho.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23136: [SPARK-25515][K8s] Adds a config option to keep e...

2018-11-26 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/23136#discussion_r236485835
  
--- Diff: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala
 ---
@@ -86,11 +88,14 @@ private[spark] class ExecutorPodsAllocator(
   s" cluster after $podCreationTimeout milliseconds despite the 
fact that a" +
   " previous allocation attempt tried to create it. The executor 
may have been" +
   " deleted but the application missed the deletion event.")
-Utils.tryLogNonFatalError {
-  kubernetesClient
-.pods()
-.withLabel(SPARK_EXECUTOR_ID_LABEL, execId.toString)
-.delete()
+
+if (shouldDeleteExecutors) {
+  Utils.tryLogNonFatalError {
+kubernetesClient
+  .pods()
+  .withLabel(SPARK_EXECUTOR_ID_LABEL, execId.toString)
--- End diff --

Correct here is a sample output:
```Labels:spark-app-selector=spark-application-1543242814683
spark-exec-id=2
spark-role=executor
```
Weird that didnt see one job interfering with the other before.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23150
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99296/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...

2018-11-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23150
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...

2018-11-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23150
  
**[Test build #99296 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99296/testReport)**
 for PR 23150 at commit 
[`f287b77`](https://github.com/apache/spark/commit/f287b77d94de9e9f466c0ff2c2370f22a46b48f7).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-26 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r236485186
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", 
true)
   // each python worker gets an equal part of the allocation. the worker 
pool will grow to the
   // number of concurrent tasks, which is determined by the number of 
cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

I don't understand what you're getting at here.

The point Ryan and I are making is that there's no point in having this 
check in the Java code, because the Python code already handles it. It doesn't 
work on Windows, and the updated Python code handles that.

The only shortcoming here is passing an env variable to Python that won't 
be used if you're on Windows. That's really super trivial and not a big deal at 
all.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   3   4   5   >