[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22606
  
**[Test build #96841 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96841/testReport)**
 for PR 22606 at commit 
[`2e117b2`](https://github.com/apache/spark/commit/2e117b2eff1f4162eac76c4a30e079a65252d998).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22339
  
**[Test build #96843 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96843/testReport)**
 for PR 22339 at commit 
[`2fba9af`](https://github.com/apache/spark/commit/2fba9af597349fc023e04a845d1cfacfc3ab7d9e).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22605: [SPARK-25589][SQL][TEST] Add BloomFilterBenchmark

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22605
  
**[Test build #96840 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96840/testReport)**
 for PR 22605 at commit 
[`8e9a0b1`](https://github.com/apache/spark/commit/8e9a0b1c1e6a569f919947e2e49b621a9fb30147).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22482
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96838/
Test FAILed.


---




[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22605: [SPARK-25589][SQL][TEST] Add BloomFilterBenchmark

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22605
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22606
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96843/
Test FAILed.


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22606
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22606
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96842/
Test FAILed.


---




[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22482
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22606
  
**[Test build #96842 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96842/testReport)**
 for PR 22606 at commit 
[`a0467da`](https://github.com/apache/spark/commit/a0467da5a71018e681385d780bc94af8854c7e08).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22605: [SPARK-25589][SQL][TEST] Add BloomFilterBenchmark

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22605
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96840/
Test FAILed.


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22606
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96841/
Test FAILed.


---




[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22482
  
**[Test build #96838 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96838/testReport)**
 for PR 22482 at commit 
[`9a60cf3`](https://github.com/apache/spark/commit/9a60cf36461fb86dc716f0a9ec6e25c4ef6031c4).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22607: [SPARK-24530][followup] run Sphinx with python 3 ...

2018-10-02 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/22607

[SPARK-24530][followup] run Sphinx with python 3 in docker

## What changes were proposed in this pull request?

SPARK-24530 discovered a problem with generating the Python docs and provided a 
fix: setting SPHINXPYTHON to python 3.

This PR makes this fix automatic in the release script using docker.

## How was this patch tested?

Verified by the 2.4.0 RC2.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark python

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22607.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22607


commit e76463d80c8414768b9af59a43cf55fd46bc0723
Author: Wenchen Fan 
Date:   2018-09-18T15:47:43Z

run Sphinx with python 3 in docker




---




[GitHub] spark issue #22607: [SPARK-24530][followup] run Sphinx with python 3 in dock...

2018-10-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22607
  
cc @HyukjinKwon @vanzin @jerryshao @srowen 


---




[GitHub] spark issue #22607: [SPARK-24530][followup] run Sphinx with python 3 in dock...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22607
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22607: [SPARK-24530][followup] run Sphinx with python 3 in dock...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22607
  
**[Test build #96844 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96844/testReport)**
 for PR 22607 at commit 
[`e76463d`](https://github.com/apache/spark/commit/e76463d80c8414768b9af59a43cf55fd46bc0723).


---




[GitHub] spark issue #22607: [SPARK-24530][followup] run Sphinx with python 3 in dock...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22607
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3617/
Test PASSed.


---




[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...

2018-10-02 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20405
  
probably.


---




[GitHub] spark pull request #22601: [SPARK-25583][DOCS]Add history-server related con...

2018-10-02 Thread shahidki31
Github user shahidki31 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22601#discussion_r221847126
  
--- Diff: docs/configuration.md ---
@@ -807,6 +814,14 @@ Apart from these, the following properties are also available, and may be useful
 Allows jobs and stages to be killed from the web UI.
   
 
+
+  spark.ui.liveUpdate.period
+  100ms
+  
+How often to update live entities. For live applications, this avoids a few
--- End diff --

Thanks updated.


---




[GitHub] spark issue #22601: [SPARK-25583][DOC]Add history-server related configurati...

2018-10-02 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/22601
  
Hi @dongjoon-hyun , I have addressed the comments. Thank you.


---




[GitHub] spark issue #21710: [SPARK-24207][R]add R API for PrefixSpan

2018-10-02 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/21710
  
Actually, as of 2 hrs ago it seems we are skipping 2.5 and going to 3.0. We could 
wait a bit until that is merged.


---




[GitHub] spark pull request #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integr...

2018-10-02 Thread ifilonenko
GitHub user ifilonenko opened a pull request:

https://github.com/apache/spark/pull/22608

[SPARK-23257][K8S][TESTS] Kerberos Support Integration Tests

## What changes were proposed in this pull request?

This fix includes just the integration tests for Kerberos Support

## How was this patch tested?

This patch includes a single-node, pseudo-distributed Kerberized Hadoop 
cluster for the purpose of testing Kerberos interaction. The keytabs are shared 
via Persistent Volumes, and all communication happens within the same 
Kubernetes cluster. 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ifilonenko/spark SPARK-25152-e2e-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22608.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22608


commit 31fc536103538543cd7e114cf737b2712cfec15c
Author: Ilan Filonenko 
Date:   2018-09-27T00:41:21Z

initial commit

commit 9bfa86a947b4ff764762fe27b356480a6e957baa
Author: Ilan Filonenko 
Date:   2018-09-29T01:55:31Z

initial work on secure-hdfs integration testing

commit 77ea92a0c1303f7b4c7dd4a6131e49e691b19b84
Author: Ilan Filonenko 
Date:   2018-09-29T02:06:42Z

small fix

commit 761254c3d4bdd1b35e707077acf0a70defc88ea9
Author: Ilan Filonenko 
Date:   2018-09-29T03:02:11Z

fixed issue of docker building

commit 6e3966fbc98809a962bd9cbd589266d9b8b95834
Author: Ilan Filonenko 
Date:   2018-09-29T03:04:16Z

fixes and organizations

commit 776617dc5328a7a88afde854240d750efd52959f
Author: Ilan Filonenko 
Date:   2018-09-29T17:42:42Z

traits and polymorphosim

commit 7f1ccb6451d53f04d46263f7bb7e81211bfb809f
Author: Ilan Filonenko 
Date:   2018-09-30T07:39:19Z

polymorphism fixes and generuc class types

commit 3ab4358787e5cfb0de289f963122b7f22108fc36
Author: Ilan Filonenko 
Date:   2018-10-01T05:28:03Z

working test cases (just need clusterrolebindings)

commit cfe799033139251df44e584ab06b699cb437ed11
Author: Ilan Filonenko 
Date:   2018-10-02T07:40:28Z

small changes with addition of old tests

commit 54316ba4fbc5ec7b46184d01f6404bd26d3c0f5d
Author: Ilan Filonenko 
Date:   2018-10-02T07:48:26Z

bring back sparkr




---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread ifilonenko
Github user ifilonenko commented on the issue:

https://github.com/apache/spark/pull/22608
  
@mccheah @liyinan926 @erikerlandson for review

Things to note: 
- [ ] `clusterrolebindings` might be needed to ensure the driver can set up 
the necessary resources. 
- [ ] Is there any way to include hadoop-2.7.3.tgz so that the 
`hadoop-base:latest` image can be built on the fly, as opposed to pulling from 
`ifilonenko/hadoop-base:latest`?




---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
**[Test build #96845 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96845/testReport)**
 for PR 22608 at commit 
[`54316ba`](https://github.com/apache/spark/commit/54316ba4fbc5ec7b46184d01f6404bd26d3c0f5d).


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
**[Test build #96845 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96845/testReport)**
 for PR 22608 at commit 
[`54316ba`](https://github.com/apache/spark/commit/54316ba4fbc5ec7b46184d01f6404bd26d3c0f5d).
 * This patch **fails RAT tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96845/
Test FAILed.


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3618/



---




[GitHub] spark pull request #22609: [SPARK-25594] [Core] Avoid maintaining task infor...

2018-10-02 Thread mridulm
GitHub user mridulm opened a pull request:

https://github.com/apache/spark/pull/22609

[SPARK-25594] [Core] Avoid maintaining task information when UI is disabled

## What changes were proposed in this pull request?

Avoid maintaining task information in a live Spark application when the UI is 
disabled.
For a long-running application with a large number of tasks, maintaining it ended up 
causing OOM in our tests.
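
As a rough illustration of the idea (not the actual patch, which is in Spark's Scala code), the sketch below keeps the cheap aggregate counters unconditionally but only retains per-task records when a flag says something will read them. The class name `LiveStageState`, the `track_tasks` flag, and the field names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LiveStageState:
    """Hypothetical per-stage state kept by a status listener."""
    track_tasks: bool                 # assumed to mirror whether the UI is enabled
    completed_tasks: int = 0          # cheap aggregate, always maintained
    tasks: Dict[int, str] = field(default_factory=dict)  # per-task detail, grows with task count

    def on_task_end(self, task_id: int, status: str) -> None:
        self.completed_tasks += 1
        if self.track_tasks:
            # Retain per-task records only when something (the UI) will read them.
            self.tasks[task_id] = status


with_ui = LiveStageState(track_tasks=True)
without_ui = LiveStageState(track_tasks=False)
for task_id in range(1000):
    with_ui.on_task_end(task_id, "SUCCESS")
    without_ui.on_task_end(task_id, "SUCCESS")
```

With the flag off, memory held for task detail stays constant no matter how many tasks finish, which is the property a long-running application needs.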

## How was this patch tested?

A long-running test ran successfully for 34 hours with steady memory usage; 
previously it failed with OOM within 28 hours.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mridulm/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22609.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22609


commit c2f6b4a50ef3693292f600a2d4d7743ea870b96e
Author: Mridul Muralidharan 
Date:   2018-10-02T08:03:41Z

SPARK-25594: Avoid maintaining task information when UI is disabled




---




[GitHub] spark issue #22609: [SPARK-25594] [Core] Avoid maintaining task information ...

2018-10-02 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/22609
  
ok to test


---




[GitHub] spark issue #22609: [SPARK-25594] [Core] Avoid maintaining task information ...

2018-10-02 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/22609
  
+CC @vanzin, @tgravescs 


---




[GitHub] spark issue #22609: [SPARK-25594] [Core] Avoid maintaining task information ...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22609
  
**[Test build #96846 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96846/testReport)**
 for PR 22609 at commit 
[`c2f6b4a`](https://github.com/apache/spark/commit/c2f6b4a50ef3693292f600a2d4d7743ea870b96e).


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22606
  
retest this please


---




[GitHub] spark issue #22605: [SPARK-25589][SQL][TEST] Add BloomFilterBenchmark

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22605
  
retest this please


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3618/
Test FAILed.


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
Kubernetes integration test status failure
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3618/



---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22609: [SPARK-25594] [Core] Avoid maintaining task information ...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22609
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22609: [SPARK-25594] [Core] Avoid maintaining task information ...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22609
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3619/
Test PASSed.


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22606
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3620/
Test PASSed.


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22606
  
**[Test build #96847 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96847/testReport)**
 for PR 22606 at commit 
[`a0467da`](https://github.com/apache/spark/commit/a0467da5a71018e681385d780bc94af8854c7e08).


---




[GitHub] spark issue #22606: [SPARK-25592] Setting version to 3.0.0-SNAPSHOT

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22606
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22605: [SPARK-25589][SQL][TEST] Add BloomFilterBenchmark

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22605
  
**[Test build #96848 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96848/testReport)**
 for PR 22605 at commit 
[`8e9a0b1`](https://github.com/apache/spark/commit/8e9a0b1c1e6a569f919947e2e49b621a9fb30147).


---




[GitHub] spark issue #22605: [SPARK-25589][SQL][TEST] Add BloomFilterBenchmark

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22605
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22605: [SPARK-25589][SQL][TEST] Add BloomFilterBenchmark

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22605
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3621/
Test PASSed.


---




[GitHub] spark pull request #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning wh...

2018-10-02 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/22610

[WIP][SPARK-25461][PySpark][SQL] Print warning when return type of 
Pandas.Series mismatches the arrow return type of pandas udf

## What changes were proposed in this pull request?

For Pandas UDFs, we derive the Arrow type from the Catalyst return data type 
declared for the UDF, and we use this Arrow type to serialize the data. If the 
declared return data type doesn't match the actual type of the Pandas.Series 
returned by the Pandas UDF, there is a risk of returning incorrect data from the 
Python side.

This WIP proposes to check whether the dtype of the returned Pandas.Series matches 
the declared return type of the Pandas UDF.

We could disallow a mismatch by throwing an exception to let users know they 
might need to set the correct return type, but it looks like the current codebase 
relies on such behavior. For example, there is a test 
`test_vectorized_udf_null_short`:

```python
data = [(None,), (2,), (3,), (4,)]
schema = StructType().add("short", ShortType())
df = self.spark.createDataFrame(data, schema)
short_f = pandas_udf(lambda x: x, ShortType())
res = df.select(short_f(col('short')))
self.assertEquals(df.collect(), res.collect())
```
So instead, for now this work just prints a warning message if such a 
mismatch is detected, so that users can see it when debugging why their 
Pandas UDFs don't produce the expected results.
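
To make the warn-instead-of-raise idea concrete, here is a minimal, standalone sketch of such a check. It is not the code in this PR: the lookup table `DTYPE_TO_ARROW` and the function `check_return_dtype` are hypothetical, and a real check would ask pyarrow for the Arrow type of the returned series rather than use a hand-written table.

```python
import warnings

# Hypothetical lookup from pandas dtype names to Arrow type names; a real
# check would obtain the Arrow type from pyarrow instead of a table.
DTYPE_TO_ARROW = {
    "bool": "bool",
    "int16": "int16",
    "int64": "int64",
    "float64": "double",
}


def check_return_dtype(series_dtype, declared_arrow_type, declared_spark_type):
    """Warn (rather than raise) when the returned dtype maps to a different Arrow type.

    Returns True when the types agree, False when a warning was issued.
    """
    actual_arrow_type = DTYPE_TO_ARROW.get(series_dtype)
    if actual_arrow_type == declared_arrow_type:
        return True
    warnings.warn(
        "Arrow type %s of returned pandas.Series (dtype %s) does not match "
        "Arrow type %s of declared return type %s"
        % (actual_arrow_type, series_dtype, declared_arrow_type, declared_spark_type)
    )
    return False


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    matched = check_return_dtype("float64", "bool", "BooleanType")
    same = check_return_dtype("float64", "double", "DoubleType")
```

Because the function only warns, callers that currently return a mismatching dtype keep working, while the message gives a starting point for debugging unexpected results.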

## How was this patch tested?

Manually tested by running:

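```python
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import BooleanType
import pandas as pd

# `spark` is assumed to be the SparkSession provided by the PySpark shell.
values = [1.0] * 5 + [2.0] * 5
pdf = pd.DataFrame({'A': values})
df = spark.createDataFrame(pdf)

# Declared return type is boolean, but the UDF returns its float64 input unchanged.
@pandas_udf(returnType=BooleanType())
def to_boolean(column):
    return column

df.select(['A']).withColumn('to_boolean', to_boolean('A')).show()
```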

Output:

```
WARN: Arrow type double of return Pandas.Series of the user-defined function's dtype float64 doesn't match the arrow type bool of defined return type BooleanType

+---+----------+
|  A|to_boolean|
+---+----------+
|1.0|     false|
|1.0|     false|
|1.0|     false|
|1.0|     false|
|1.0|     false|
|2.0|     false|
|2.0|     false|
|2.0|     false|
|2.0|     false|
|2.0|     false|
+---+----------+
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-25461

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22610.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22610


commit 2fa15bda48ba64a102f114dc9119cb3c310200c4
Author: Liang-Chi Hsieh 
Date:   2018-09-26T09:01:40Z

Ensure return type of Pandas.Series matches the arrow return type of pandas 
udf.

commit d206b7cf78f898e622f539a15e45515fcbd9e54a
Author: Liang-Chi Hsieh 
Date:   2018-10-02T05:29:44Z

Print warning message instead of throwing exception.




---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22610
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3622/
Test PASSed.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22610
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22610
  
**[Test build #96849 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96849/testReport)**
 for PR 22610 at commit 
[`d206b7c`](https://github.com/apache/spark/commit/d206b7cf78f898e622f539a15e45515fcbd9e54a).


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22610
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22610
  
**[Test build #96849 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96849/testReport)**
 for PR 22610 at commit 
[`d206b7c`](https://github.com/apache/spark/commit/d206b7cf78f898e622f539a15e45515fcbd9e54a).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22610
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96849/
Test FAILed.


---




[GitHub] spark pull request #22611: [SPARK-25595] Ignore corrupt Avro files if flag I...

2018-10-02 Thread gengliangwang
GitHub user gengliangwang opened a pull request:

https://github.com/apache/spark/pull/22611

[SPARK-25595] Ignore corrupt Avro files if flag IGNORE_CORRUPT_FILES enabled

## What changes were proposed in this pull request?

With the flag IGNORE_CORRUPT_FILES enabled, schema inference should ignore 
corrupt Avro files, which is consistent with the Parquet and ORC data sources.
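
As a rough illustration of the intended behavior (not the actual Avro data source code), the sketch below skips unreadable files during schema inference only when the flag is on. `IgnoreCorruptSketch`, `inferSchema`, and the `readSchema` function are hypothetical names, and it assumes inference only needs the first readable file.

```scala
import scala.util.{Failure, Success, Try}

object IgnoreCorruptSketch {
  // Hypothetical reader: returns a schema string, or throws for a corrupt file.
  type SchemaReader = String => String

  // Returns the schema of the first readable file, if any.
  // When ignoreCorruptFiles is false, the first read failure is rethrown.
  def inferSchema(
      files: Seq[String],
      readSchema: SchemaReader,
      ignoreCorruptFiles: Boolean): Option[String] = {
    // Iterator operations are lazy, so files after the first readable one are not opened.
    val schemas = files.iterator.flatMap { file =>
      Try(readSchema(file)) match {
        case Success(schema) => Some(schema)
        case Failure(_) if ignoreCorruptFiles => None
        case Failure(e) => throw e
      }
    }
    if (schemas.hasNext) Some(schemas.next()) else None
  }
}
```

With the flag off, the same call fails on the first corrupt file instead of moving on to the next one.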

## How was this patch tested?

Unit test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gengliangwang/spark ignoreCorruptAvro

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22611


commit e96ea209731acba240943f890773e6eb1d87dee8
Author: Gengliang Wang 
Date:   2018-10-02T08:21:03Z

Ignore corrupt Avro file if flag IGNORE_CORRUPT_FILES enabled




---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22611
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22611
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3623/
Test PASSed.


---




[GitHub] spark pull request #22227: [SPARK-25202] [SQL] Implements split with limit s...

2018-10-02 Thread phegstrom
Github user phegstrom closed the pull request at:

https://github.com/apache/spark/pull/22227


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22611
  
**[Test build #96850 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96850/testReport)**
 for PR 22611 at commit 
[`e96ea20`](https://github.com/apache/spark/commit/e96ea209731acba240943f890773e6eb1d87dee8).


---




[GitHub] spark pull request #22227: [SPARK-25202] [SQL] Implements split with limit s...

2018-10-02 Thread phegstrom
GitHub user phegstrom reopened a pull request:

https://github.com/apache/spark/pull/22227

[SPARK-25202] [SQL] Implements split with limit sql function

## What changes were proposed in this pull request?

Adds support for setting a limit in the SQL split function
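
For intuition about what a limit argument usually means on the JVM, here is a small sketch built on `java.lang.String.split(regex, limit)`. It is an assumption that the SQL function follows similar semantics; `SplitLimitSketch` and `splitWithLimit` are illustrative names only, not part of this patch.

```scala
object SplitLimitSketch {
  // Thin wrapper over java.lang.String#split(regex, limit); not Spark's implementation.
  def splitWithLimit(s: String, regex: String, limit: Int): List[String] =
    s.split(regex, limit).toList

  def main(args: Array[String]): Unit = {
    // Positive limit: at most `limit` elements, the last one keeps the remainder.
    println(splitWithLimit("one,two,three", ",", 2))
    // Negative limit: split as often as possible and keep trailing empty strings.
    println(splitWithLimit("a,b,,", ",", -1))
    // Zero limit: split as often as possible and drop trailing empty strings.
    println(splitWithLimit("a,b,,", ",", 0))
  }
}
```

How the SQL function treats a limit of 0 is not stated in this description, so the last case only shows the plain JVM behavior.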

## How was this patch tested?

1. Updated unit tests
2. Tested using Scala spark shell

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/phegstrom/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22227.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22227


commit 15362be4764d2b33757488efe38667cc762246ad
Author: Parker Hegstrom 
Date:   2018-08-24T19:19:58Z

implement split with limit

commit ceb3f41238c8731606164cea5c45a0b87bb5d6f2
Author: Parker Hegstrom 
Date:   2018-08-24T20:26:42Z

linting

commit e564a68ce8db001f5ebcec49bbeb8a4b893a8d70
Author: Parker Hegstrom 
Date:   2018-08-27T20:27:45Z

most comments

commit 4e107337a47ce590c703b757b0a44d60d6b862e1
Author: Parker Hegstrom 
Date:   2018-08-27T21:01:14Z

sql function tests

commit 5135cb28e505f5c61ce325ac6e5589b35450b44a
Author: Parker Hegstrom 
Date:   2018-08-28T21:52:25Z

fixing test file, comments

commit e8c8c8c5e3811fd8e8f2c7b8c77ae404b4acc157
Author: Parker Hegstrom 
Date:   2018-08-28T21:57:24Z

adding another example

commit 8e163283a71fa272ae8d7c43632e38ef01f8d12b
Author: Parker Hegstrom 
Date:   2018-08-30T13:57:53Z

better docs, fixing sql output files

commit ca23ea363fe2beaecc4cf16385256aabbdd7d626
Author: Parker Hegstrom 
Date:   2018-08-30T15:01:36Z

adding python support

commit 96bc875d4790f46f72e30c4845c25ed5c4dc73cc
Author: Parker Hegstrom 
Date:   2018-08-30T15:22:17Z

fixing style checks

commit 79599ebc26f089737e101b417428ceb6620f802d
Author: Parker Hegstrom 
Date:   2018-08-30T18:46:01Z

adding support for R

commit a27c848a62a08c869699bd521092987672337926
Author: Parker Hegstrom 
Date:   2018-08-30T21:07:28Z

ammending doc format

commit fa128db516be00a145af58a595143d7ccc31b7a4
Author: Parker Hegstrom 
Date:   2018-08-30T21:13:06Z

fixing typo

commit a6411069c352b30f9094a83991c35f0730b5df55
Author: Parker Hegstrom 
Date:   2018-08-31T01:58:07Z

fix docs to check ci status

commit 7e4ba981ef4636b5663f1a50df6e3fa8886186c3
Author: Parker Hegstrom 
Date:   2018-08-31T17:44:09Z

add fallback from limit = 0, update tests

commit d80b1a15ed8941bad78df2c5f7168a4196d27be4
Author: Parker Hegstrom 
Date:   2018-08-31T21:19:11Z

fix python syntax in tests

commit d17d2df530a9236471b415700533c9224c94435a
Author: Parker Hegstrom 
Date:   2018-09-01T15:39:04Z

bring limit handling into UTF8String split impl

commit 64b0afca802c4557b5a53aa62b7486c3d8d4fe8c
Author: Parker Hegstrom 
Date:   2018-09-06T19:57:37Z

HyukjinKwon comments

commit 4e84df0a80c1e610068884f937b73478be7e1c1c
Author: Parker Hegstrom 
Date:   2018-09-07T16:16:17Z

removing duplicate pattern reference

commit b12ee881c1d9025644d075a6124f9e6e465a6378
Author: Parker Hegstrom 
Date:   2018-09-10T10:04:28Z

docs comments, change to 3.0

commit 69d219018c9b2a7f3fb7dd716619f067aec0d2dc
Author: Parker Hegstrom 
Date:   2018-09-12T11:58:42Z

docs comments

commit b5994ad66c355cf104c1bd4b1e75ae2e6cc199c3
Author: Parker Hegstrom 
Date:   2018-09-14T10:55:55Z

docs comments, should be good to go

commit 5c8f48715748bdeda703761fba6a4d1828a19985
Author: Parker Hegstrom 
Date:   2018-09-16T22:46:43Z

felix comments for R tests

commit 34ba74f79aad2a0e2fe9e0d6f6110a10a51c8108
Author: Parker Hegstrom 
Date:   2018-10-01T18:18:06Z

viirya comments




---




[GitHub] spark issue #22602: [SPARK-25538][SQL] Zero-out all bytes when writing decim...

2018-10-02 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/22602
  
Thank you all for the reviews! I added the UT according to @cloud-fan's 
suggestion, as I was unable to set up a reasonable end-to-end UT. Thanks.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22611
  
**[Test build #96851 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96851/testReport)**
 for PR 22611 at commit 
[`404b1a0`](https://github.com/apache/spark/commit/404b1a0a603bc95654cf49b34378bbcc2dabcc4c).


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22611
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22611
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3624/
Test PASSed.


---




[GitHub] spark issue #22602: [SPARK-25538][SQL] Zero-out all bytes when writing decim...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22602
  
**[Test build #96852 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96852/testReport)**
 for PR 22602 at commit 
[`6b84b41`](https://github.com/apache/spark/commit/6b84b41915ae184912922bb820a147060ac13afc).


---




[GitHub] spark issue #22602: [SPARK-25538][SQL] Zero-out all bytes when writing decim...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22602
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3625/
Test PASSed.


---




[GitHub] spark issue #22602: [SPARK-25538][SQL] Zero-out all bytes when writing decim...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22602
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22611
  
**[Test build #96850 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96850/testReport)**
 for PR 22611 at commit 
[`e96ea20`](https://github.com/apache/spark/commit/e96ea209731acba240943f890773e6eb1d87dee8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22611
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22611
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96850/
Test PASSed.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22610
  
**[Test build #96853 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96853/testReport)**
 for PR 22610 at commit 
[`c084e74`](https://github.com/apache/spark/commit/c084e745007d455a6ea99e10cc403b55ead6278d).


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22610
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3626/
Test PASSed.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22610
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22611
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22611
  
**[Test build #96851 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96851/testReport)**
 for PR 22611 at commit 
[`404b1a0`](https://github.com/apache/spark/commit/404b1a0a603bc95654cf49b34378bbcc2dabcc4c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22611: [SPARK-25595] Ignore corrupt Avro files if flag IGNORE_C...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22611
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96851/
Test PASSed.


---




[GitHub] spark pull request #22527: [SPARK-17952][SQL] Nested Java beans support in c...

2018-10-02 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22527#discussion_r221857463
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -1100,13 +1101,23 @@ object SQLContext {
   attrs: Seq[AttributeReference]): Iterator[InternalRow] = {
 val extractors =
   JavaTypeInference.getJavaBeanReadableProperties(beanClass).map(_.getReadMethod)
-val methodsToConverts = extractors.zip(attrs).map { case (e, attr) =>
-  (e, CatalystTypeConverters.createToCatalystConverter(attr.dataType))
+val methodsToTypes = extractors.zip(attrs).map { case (e, attr) =>
+  (e, attr.dataType)
+}
+def invoke(element: Any)(tuple: (Method, DataType)): Any = tuple match {
--- End diff --

Can we create converters before `data.map { ... }` instead of calculating 
converters for each row?

I mean something like:

```scala
def converter(e: Method, dt: DataType): Any => Any = dt match {
  case StructType(fields) =>
    val nestedExtractors =
      JavaTypeInference.getJavaBeanReadableProperties(e.getReturnType).map(_.getReadMethod)
    val nestedConverters =
      nestedExtractors.zip(fields).map { case (extractor, field) =>
        converter(extractor, field.dataType)
      }

    element =>
      val value = e.invoke(element)
      new GenericInternalRow(nestedConverters.map(_(value)))
  case _ =>
    val convert = CatalystTypeConverters.createToCatalystConverter(dt)
    element => convert(e.invoke(element))
}
```
and then
```scala
val converters = extractors.zip(attrs).map { case (e, attr) =>
  converter(e, attr.dataType)
}
data.map { element =>
  new GenericInternalRow(converters.map(_(element))): InternalRow
}
```



---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/22610
  
cc @HyukjinKwon Can you take a look at this when you have time? Thanks.


---




[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

2018-10-02 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22603#discussion_r221876275
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
 // which is very slow on some file system (RawLocalFileSystem, which is launch a
 // subprocess and parse the stdout).
 try {
-  val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+  val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
+loc => if (loc.getClass == classOf[BlockLocation]) {
--- End diff --

`loc.isInstanceOf[BlockLocation]`? Or even better, what about using pattern 
matching?


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22610
  
**[Test build #96853 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96853/testReport)**
 for PR 22610 at commit 
[`c084e74`](https://github.com/apache/spark/commit/c084e745007d455a6ea99e10cc403b55ead6278d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22610
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22610
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96853/
Test PASSed.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22610
  
Yea, I will do it this week. Sorry, I missed the cc in the JIRA.


---




[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22610
  
The idea sounds good to me from a cursory look for now.


---




[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

2018-10-02 Thread peter-toth
Github user peter-toth commented on a diff in the pull request:

https://github.com/apache/spark/pull/22603#discussion_r221890344
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
 // which is very slow on some file system (RawLocalFileSystem, which is launch a
 // subprocess and parse the stdout).
 try {
-  val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+  val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
+loc => if (loc.getClass == classOf[BlockLocation]) {
--- End diff --

Thanks @mgaido91, but `loc` is always an instance of `BlockLocation` (it might 
be a subclass such as `HdfsBlockLocation`), so `isInstanceOf[BlockLocation]` or 
pattern matching would always return true.
I want to test that the class of `loc` is exactly `BlockLocation`; if it 
is, we don't need to convert it.
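
The difference between the two checks can be seen in a small standalone sketch. `Location` and `RichLocation` are hypothetical stand-ins for `BlockLocation` and a subclass such as `HdfsBlockLocation`; the `normalize` helper only illustrates the idea of converting subclasses and is not the code in this PR.

```scala
object ExactClassSketch {
  // Hypothetical stand-ins for a base location class and a richer subclass.
  class Location(val hosts: Seq[String])
  class RichLocation(hosts: Seq[String], val extra: String) extends Location(hosts)

  // True for Location and for every subclass of it.
  def isLocation(loc: Location): Boolean = loc.isInstanceOf[Location]

  // True only when the runtime class is exactly Location.
  def isExactlyLocation(loc: Location): Boolean = loc.getClass == classOf[Location]

  // Keep plain instances as they are; copy subclass instances down to the base class.
  def normalize(loc: Location): Location =
    if (isExactlyLocation(loc)) loc else new Location(loc.hosts)
}
```

A pattern match such as `case _: Location` behaves like `isInstanceOf` here, so it would also match the subclass.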


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

2018-10-02 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22603#discussion_r221895032
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
 // which is very slow on some file system (RawLocalFileSystem, which is launch a
 // subprocess and parse the stdout).
 try {
-  val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+  val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
+loc => if (loc.getClass == classOf[BlockLocation]) {
--- End diff --

Ah right, sorry @peter-toth. Thanks. Anyway, please move `loc` to the 
previous line and use curly braces for `map`. I think that is the most 
widespread syntax in the codebase. Thanks.


---




[GitHub] spark pull request #22603: SPARK-25062: clean up BlockLocations in InMemoryF...

2018-10-02 Thread peter-toth
Github user peter-toth commented on a diff in the pull request:

https://github.com/apache/spark/pull/22603#discussion_r221898450
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InMemoryFileIndex.scala ---
@@ -315,7 +315,12 @@ object InMemoryFileIndex extends Logging {
 // which is very slow on some file system (RawLocalFileSystem, which is launch a
 // subprocess and parse the stdout).
 try {
-  val locations = fs.getFileBlockLocations(f, 0, f.getLen)
+  val locations = fs.getFileBlockLocations(f, 0, f.getLen).map(
+loc => if (loc.getClass == classOf[BlockLocation]) {
--- End diff --

:thumbsup:


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
**[Test build #96854 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96854/testReport)**
 for PR 22608 at commit 
[`56e2c6e`](https://github.com/apache/spark/commit/56e2c6e20b427c883e330d79f45ef6f3841cd518).


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96854/
Test FAILed.


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
**[Test build #96854 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96854/testReport)**
 for PR 22608 at commit 
[`56e2c6e`](https://github.com/apache/spark/commit/56e2c6e20b427c883e330d79f45ef6f3841cd518).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3627/



---




[GitHub] spark issue #22599: [SPARK-25581][SQL] Rename method `benchmark` as `runBenc...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22599
  
**[Test build #96855 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96855/testReport)**
 for PR 22599 at commit 
[`1c15c25`](https://github.com/apache/spark/commit/1c15c25430b5084381c71215e4e2ea1f72f0af7c).


---




[GitHub] spark issue #22599: [SPARK-25581][SQL] Rename method `benchmark` as `runBenc...

2018-10-02 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22599
  
Discussed with @cloud-fan offline. Renamed method `benchmark` to 
`runBenchmarkSuite`. Also added a comment to guide developers to use 
`runBenchmark` for each scenario in implementations.

@dongjoon-hyun @wangyum Is this OK with you?
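
A minimal sketch of the naming being proposed, assuming a base trait with one suite-level entry point and a per-scenario wrapper. The trait, its signatures, and `ExampleBenchmark` are hypothetical and are not taken from the patch.

```scala
import scala.collection.mutable.ListBuffer

object BenchmarkNamingSketch {
  // Hypothetical base trait; not Spark's actual benchmark base class.
  trait BenchmarkSuiteBase {
    private val ran = ListBuffer.empty[String]

    // Entry point for the whole suite; implementations list their scenarios here.
    def runBenchmarkSuite(): Unit

    // Wraps a single scenario and records its name before running it.
    final def runBenchmark(name: String)(body: => Any): Unit = {
      ran += name
      body
    }

    // Names of the scenarios run so far, in order.
    def scenariosRun: List[String] = ran.toList
  }

  object ExampleBenchmark extends BenchmarkSuiteBase {
    override def runBenchmarkSuite(): Unit = {
      runBenchmark("sum") { (1 to 1000).sum }
      runBenchmark("sort") { Seq(3, 1, 2).sorted }
    }
  }
}
```

Keeping each scenario inside its own `runBenchmark` call makes the suite method read as a list of scenarios.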


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
Kubernetes integration test status failure
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3627/



---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3627/
Test FAILed.


---




[GitHub] spark issue #22608: [SPARK-23257][K8S][TESTS] Kerberos Support Integration T...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22599: [SPARK-25581][SQL] Rename method `benchmark` as `runBenc...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22599
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3628/
Test PASSed.


---




[GitHub] spark issue #22599: [SPARK-25581][SQL] Rename method `benchmark` as `runBenc...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22599
  
Merged build finished. Test PASSed.


---



