date:20181205

[GitHub] spark pull request #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow wh...

2018-12-05 Thread mgaido91

GitHub user mgaido91 opened a pull request:

https://github.com/apache/spark/pull/23234

[SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when encoding a decimal value

## What changes were proposed in this pull request?

When we encode a Decimal from an external source, we don't check for overflow. 
That check is useful not only to enforce that the value can be represented in 
the specified range, but also because it changes the underlying data to the 
right precision/scale. Since our code generation assumes that a decimal has 
exactly the same precision and scale as its data type, failing to enforce this 
can lead to corrupted output/results when there are subsequent transformations.
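
As a rough illustration of the kind of check described above, here is a minimal sketch using plain `java.math.BigDecimal` rather than Spark's own `Decimal` class. The object and the `toPrecision` helper are hypothetical names made up for this example; the sketch only shows the idea of rescaling a value to the declared scale and rejecting it when it no longer fits the declared precision.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DecimalOverflowSketch {
  // Hypothetical helper (not Spark's API): rescale `value` to `scale` and
  // return None if the result needs more than `precision` digits.
  def toPrecision(value: JBigDecimal, precision: Int, scale: Int): Option[JBigDecimal] = {
    val rescaled = value.setScale(scale, RoundingMode.HALF_UP)
    if (rescaled.precision > precision) None else Some(rescaled)
  }

  def main(args: Array[String]): Unit = {
    // Fits in decimal(5, 2); the scale is normalized from 1 to 2.
    println(toPrecision(new JBigDecimal("123.4"), 5, 2))
    // Needs 6 digits at scale 2, so it does not fit in decimal(5, 2).
    println(toPrecision(new JBigDecimal("1234.5"), 5, 2))
  }
}
```

Returning `None` on overflow is just one possible design; raising an error instead would be an equally valid choice for a sketch like this.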

## How was this patch tested?

added UT


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mgaido91/spark SPARK-26233_2.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23234.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23234


commit 930c51029b845c74357305e7ec30a4f2e6ea748a
Author: Marco Gaido 
Date:   2018-12-04T18:33:27Z

[SPARK-26233][SQL] CheckOverflow when encoding a decimal value

When we encode a Decimal from an external source, we don't check for overflow. 
That check is useful not only to enforce that the value can be represented in 
the specified range, but also because it changes the underlying data to the 
right precision/scale. Since our code generation assumes that a decimal has 
exactly the same precision and scale as its data type, failing to enforce this 
can lead to corrupted output/results when there are subsequent transformations.

added UT

Closes #23210 from mgaido91/SPARK-26233.

Authored-by: Marco Gaido 
Signed-off-by: Dongjoon Hyun 




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when enco...

2018-12-05 Thread mgaido91

Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/23234
  
cc @cloud-fan @dongjoon-hyun


---


[GitHub] spark issue #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow when enco...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23232
  
**[Test build #99716 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99716/testReport)** for PR 23232 at commit [`821db48`](https://github.com/apache/spark/commit/821db4854c0e685aac3168da75a1c839681dbfc4).


---


[GitHub] spark issue #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when enco...

2018-12-05 Thread mgaido91

Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/23233
  
cc @cloud-fan @dongjoon-hyun


---


[GitHub] spark pull request #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow wh...

2018-12-05 Thread mgaido91

GitHub user mgaido91 opened a pull request:

https://github.com/apache/spark/pull/23233

[SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when encoding a decimal value


## What changes were proposed in this pull request?

When we encode a Decimal from an external source, we don't check for overflow. 
That check is useful not only to enforce that the value can be represented in 
the specified range, but also because it changes the underlying data to the 
right precision/scale. Since our code generation assumes that a decimal has 
exactly the same precision and scale as its data type, failing to enforce this 
can lead to corrupted output/results when there are subsequent transformations.

## How was this patch tested?

added UT


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mgaido91/spark SPARK-26233_2.3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23233.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23233


commit a1e77445c2675137fbcddf73181c47469f159dbf
Author: Marco Gaido 
Date:   2018-12-04T18:33:27Z

[SPARK-26233][SQL] CheckOverflow when encoding a decimal value

When we encode a Decimal from an external source, we don't check for overflow. 
That check is useful not only to enforce that the value can be represented in 
the specified range, but also because it changes the underlying data to the 
right precision/scale. Since our code generation assumes that a decimal has 
exactly the same precision and scale as its data type, failing to enforce this 
can lead to corrupted output/results when there are subsequent transformations.

added UT

Closes #23210 from mgaido91/SPARK-26233.

Authored-by: Marco Gaido 
Signed-off-by: Dongjoon Hyun 




---


[GitHub] spark issue #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow when enco...

2018-12-05 Thread mgaido91

Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/23232
  
cc @cloud-fan  @dongjoon-hyun 


---


[GitHub] spark pull request #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow wh...

2018-12-05 Thread mgaido91

GitHub user mgaido91 opened a pull request:

https://github.com/apache/spark/pull/23232

[SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow when encoding a decimal value

When we encode a Decimal from an external source, we don't check for overflow. 
That check is useful not only to enforce that the value can be represented in 
the specified range, but also because it changes the underlying data to the 
right precision/scale. Since our code generation assumes that a decimal has 
exactly the same precision and scale as its data type, failing to enforce this 
can lead to corrupted output/results when there are subsequent transformations.

added UT

Closes #23210 from mgaido91/SPARK-26233.

Authored-by: Marco Gaido 
Signed-off-by: Dongjoon Hyun 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mgaido91/spark SPARK-26233_2.4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23232.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23232


commit 821db4854c0e685aac3168da75a1c839681dbfc4
Author: Marco Gaido 
Date:   2018-12-04T18:33:27Z

[SPARK-26233][SQL] CheckOverflow when encoding a decimal value

When we encode a Decimal from an external source, we don't check for overflow. 
That check is useful not only to enforce that the value can be represented in 
the specified range, but also because it changes the underlying data to the 
right precision/scale. Since our code generation assumes that a decimal has 
exactly the same precision and scale as its data type, failing to enforce this 
can lead to corrupted output/results when there are subsequent transformations.

added UT

Closes #23210 from mgaido91/SPARK-26233.

Authored-by: Marco Gaido 
Signed-off-by: Dongjoon Hyun 




---


[GitHub] spark issue #23229: [MINOR][CORE] Modify some field name because it may be c...

2018-12-05 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23229
  
I don't think it's worth renaming the variable in a standalone PR. 
Let's do that when we fix some code around here, or let other people fix it 
later.


---


[GitHub] spark issue #23224: [MINOR][SQL][TEST] WholeStageCodegen metrics should be t...

2018-12-05 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23224
  
Can we file a JIRA? I think it's not minor.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23230
  
Oh, the original one was 3.0. Although this doc change can go to branch-2.4 
alone as well, let me revert it in branch-2.4 for management simplicity.


---


[GitHub] spark pull request #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEnc...

2018-12-05 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23230


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23159
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23159
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5760/
Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23230
  
Merged to master and branch-2.4.


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5758/
Test PASSed.


---


[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23159
  
**[Test build #99715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99715/testReport)** for PR 23159 at commit [`e0aa626`](https://github.com/apache/spark/commit/e0aa626c886976489348a6c0179d160bbe3252da).


---


[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23196
  
**[Test build #99714 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99714/testReport)** for PR 23196 at commit [`07fcf46`](https://github.com/apache/spark/commit/07fcf4666a96928c8096db7a131e6514013679f0).


---


[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22957
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5759/
Test PASSed.


---


[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22957
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22957
  
**[Test build #99713 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99713/testReport)** for PR 22957 at commit [`e4f617f`](https://github.com/apache/spark/commit/e4f617fc7e47d7c49f3d773ac2d91c5508c0a239).


---


[GitHub] spark issue #23229: [MINOR][CORE] Modify some field name because it may be c...

2018-12-05 Thread heary-cao

Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/23229
  
ok to test


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5757/
Test PASSed.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23213
  
How about `wholeStage=false, factoryMode=CODEGEN_ONLY`? I think it's different 
from `wholeStage=false, factoryMode=NO_CODEGEN`.
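
To make the combinations under discussion concrete, here is a minimal sketch that enumerates config sets as the cross product of the whole-stage flag and the factory mode. It is plain Scala with no Spark dependency; the config keys `spark.sql.codegen.wholeStage` and `spark.sql.codegen.factoryMode` and the mode names passed in are assumptions and should be checked against the Spark version in use. `CodegenConfigSketch` and `configSets` are hypothetical names.

```scala
object CodegenConfigSketch {
  // Assumed config keys; verify them against the Spark version in use.
  val wholeStageKey = "spark.sql.codegen.wholeStage"
  val factoryModeKey = "spark.sql.codegen.factoryMode"

  // Cross product of whole-stage on/off with each given factory mode.
  def configSets(modes: Seq[String]): Seq[Map[String, String]] =
    for {
      wholeStage <- Seq(true, false)
      mode <- modes
    } yield Map(wholeStageKey -> wholeStage.toString, factoryModeKey -> mode)

  def main(args: Array[String]): Unit =
    // The mode names here are assumptions made for illustration.
    configSets(Seq("CODEGEN_ONLY", "NO_CODEGEN")).foreach(println)
}
```

With two modes this yields four config sets, two of which have whole-stage codegen disabled and differ only in the factory mode.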


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23231
  
**[Test build #99712 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99712/testReport)** for PR 23231 at commit [`453d60f`](https://github.com/apache/spark/commit/453d60f42b99de621a7ee3fab6bc6138fc20ed05).


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99710/
Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23230
  
**[Test build #99710 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99710/testReport)** for PR 23230 at commit [`5c7f6be`](https://github.com/apache/spark/commit/5c7f6be3c52e39924953f613d13225e32e8a63f9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23224: [MINOR][SQL][TEST] WholeStageCodegen metrics should be t...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23224
  
**[Test build #99711 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99711/testReport)** for PR 23224 at commit [`021728c`](https://github.com/apache/spark/commit/021728ccc70cf971592c560cfc5492dedbdc362a).


---


[GitHub] spark issue #23224: [MINOR][SQL][TEST] WholeStageCodegen metrics should be t...

2018-12-05 Thread heary-cao

Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/23224
  
retest this please


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99706/
Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23230
  
**[Test build #99706 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99706/testReport)** for PR 23230 at commit [`63b7183`](https://github.com/apache/spark/commit/63b71834b101c800973b73490640a44e507306d1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23230
  
**[Test build #99710 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99710/testReport)** for PR 23230 at commit [`5c7f6be`](https://github.com/apache/spark/commit/5c7f6be3c52e39924953f613d13225e32e8a63f9).


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5756/
Test PASSed.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5755/
Test PASSed.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23231
  
**[Test build #99709 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99709/testReport)** for PR 23231 at commit [`f5ed812`](https://github.com/apache/spark/commit/f5ed81279d95b765ccf11752e09e3e66230b047a).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class OneHotEncoderEstimator @Since("2.3.0") (@Since("2.3.0") override val uid: String)`
  * `class OneHotEncoderEstimator(JavaEstimator, HasInputCols, HasOutputCols, HasHandleInvalid,`


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99709/
Test FAILed.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #23229: [MINOR][CORE] Modify some field name because it may be c...

2018-12-05 Thread heary-cao

Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/23229
  
ok to test


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99705/
Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as a...

2018-12-05 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23231#discussion_r239011539
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoderEstimator.scala ---
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.feature
+
+import org.apache.spark.annotation.Since
+import org.apache.spark.ml.Estimator
+import org.apache.spark.ml.param._
+import org.apache.spark.ml.util._
+import org.apache.spark.sql.Dataset
+import org.apache.spark.sql.types.StructType
+
+/**
+ * A one-hot encoder that maps a column of category indices to a column of binary vectors, with
+ * at most a single one-value per row that indicates the input category index.
+ * For example with 5 categories, an input value of 2.0 would map to an output vector of
+ * `[0.0, 0.0, 1.0, 0.0]`.
+ * The last category is not included by default (configurable via `dropLast`),
+ * because it makes the vector entries sum up to one, and hence linearly dependent.
+ * So an input value of 4.0 maps to `[0.0, 0.0, 0.0, 0.0]`.
+ *
+ * @note This is different from scikit-learn's OneHotEncoder, which keeps all categories.
+ * The output vectors are sparse.
+ *
+ * When `handleInvalid` is configured to 'keep', an extra "category" indicating invalid values is
+ * added as last category. So when `dropLast` is true, invalid values are encoded as all-zeros
+ * vector.
+ *
+ * @note When encoding multi-column by using `inputCols` and `outputCols` params, input/output cols
+ * come in pairs, specified by the order in the arrays, and each pair is treated independently.
+ *
+ * @note `OneHotEncoderEstimator` is renamed to `OneHotEncoder` in 3.0.0. This
+ * `OneHotEncoderEstimator` is kept as an alias and will be removed in further version.
+ *
+ * @see `StringIndexer` for converting categorical values into category indices
+ */
+@Since("2.3.0")
--- End diff --

These `@Since` tags are from the original OneHotEncoderEstimator.
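
For readers skimming the doc comment in the diff above, the `dropLast` behaviour it describes can be sketched in a few lines of plain Scala. This is only an illustration, not the Spark ML implementation: `OneHotSketch` and `encode` are hypothetical names, and the sketch returns a dense `Vector[Double]` whereas the doc comment says the real output vectors are sparse.

```scala
object OneHotSketch {
  // Hypothetical helper, not the Spark ML API: encode a category index as a
  // dense vector, optionally dropping the slot for the last category.
  def encode(index: Int, numCategories: Int, dropLast: Boolean = true): Vector[Double] = {
    require(index >= 0 && index < numCategories, s"index $index out of range")
    val size = if (dropLast) numCategories - 1 else numCategories
    Vector.tabulate(size)(i => if (i == index) 1.0 else 0.0)
  }

  def main(args: Array[String]): Unit = {
    println(encode(2, 5)) // Vector(0.0, 0.0, 1.0, 0.0)
    println(encode(4, 5)) // Vector(0.0, 0.0, 0.0, 0.0): last category dropped
  }
}
```

The two calls reproduce the examples from the doc comment: with 5 categories, index 2 sets the third slot, and index 4 (the last category) becomes the all-zeros vector.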


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23231
  
**[Test build #99709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99709/testReport)** for PR 23231 at commit [`f5ed812`](https://github.com/apache/spark/commit/f5ed81279d95b765ccf11752e09e3e66230b047a).


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23230
  
**[Test build #99705 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99705/testReport)** for PR 23230 at commit [`c84886a`](https://github.com/apache/spark/commit/c84886aef9a53d0d58ca4f0f68ece57ee80f88c8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23231
  
cc @srowen @dbtsai 


---


[GitHub] spark pull request #23196: [SPARK-26243][SQL] Use java.time API for parsing ...

2018-12-05 Thread MaxGekk

Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/23196#discussion_r239010321
  
--- Diff: 
sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
 ---
@@ -49,8 +49,8 @@ class HiveCompatibilitySuite extends HiveQueryFileTest 
with BeforeAndAfter {
   override def beforeAll() {
 super.beforeAll()
 TestHive.setCacheTables(true)
-// Timezone is fixed to America/Los_Angeles for those timezone 
sensitive tests (timestamp_*)
-TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))
+// Timezone is fixed to GMT for those timezone sensitive tests 
(timestamp_*)
--- End diff --

Our current approach for converting dates is inconsistent in a few places, 
for example:
- `UTF8String` -> `num days` uses hardcoded `GMT` and ignores SQL config: 
https://github.com/apache/spark/blob/f982ca07e80074bdc1e3b742c5e21cf368e4ede2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L493
- `String` -> `java.util.Date` ignores Spark's time zone settings and uses the 
system time zone: 
https://github.com/apache/spark/blob/f982ca07e80074bdc1e3b742c5e21cf368e4ede2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L186
- In many places, even when a function accepts a `timeZone` parameter, it is not 
passed, so the default time zone is used (**not from the config but from 
`TimeZone.getDefault()`**). For example: 
https://github.com/apache/spark/blob/36edbac1c8337a4719f90e4abd58d38738b2e1fb/sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala#L187
 .
- Casting to the date type depends on the type of the argument: if it is 
`TimestampType`, the expression-wise time zone is used; otherwise `GMT` is used: 
https://github.com/apache/spark/blob/d03e0af80d7659f12821cc2442efaeaee94d3985/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala#L403-L410

I really think we should disable the new parser/formatter outside of the CSV/JSON 
datasources, because it is hard to guarantee consistent behavior in combination 
with other date/timestamp functions. @srowen @gatorsmile @HyukjinKwon WDYT?
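
To make the inconsistency concrete, here is a minimal Python sketch (not Spark code) of why the choice of time zone matters when a timestamp is reduced to a number of days: the same instant falls on different calendar days in `GMT` and in Los Angeles. The helper `days_since_epoch` is hypothetical, and the fixed UTC-8 offset standing in for `America/Los_Angeles` in December is an assumption made to keep the example self-contained.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = date(1970, 1, 1)

# Assumption: a fixed UTC-8 offset stands in for America/Los_Angeles in December.
GMT = timezone.utc
LOS_ANGELES_WINTER = timezone(timedelta(hours=-8))


def days_since_epoch(instant: datetime, zone: timezone) -> int:
    """Hypothetical helper: days from 1970-01-01 to the local date of `instant`."""
    return (instant.astimezone(zone).date() - EPOCH).days


# 2018-12-05 02:00 GMT is still the evening of 2018-12-04 in Los Angeles.
instant = datetime(2018, 12, 5, 2, 0, tzinfo=timezone.utc)
gmt_days = days_since_epoch(instant, GMT)
la_days = days_since_epoch(instant, LOS_ANGELES_WINTER)
print(gmt_days, la_days)
```

If one code path uses a hardcoded `GMT` and another uses the JVM default, results like these can disagree by one day for the same input, which is the kind of mismatch listed above.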
 


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5753/
Test PASSed.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23231
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23163: [SPARK-26164][SQL] Allow FileFormatWriter to write multi...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23163
  
Build finished. Test PASSed.


---


[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23231
  
**[Test build #99707 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99707/testReport)**
 for PR 23231 at commit 
[`1716071`](https://github.com/apache/spark/commit/17160710cadc49b54f4385ae3ca9ddb0eb4034b0).


---


[GitHub] spark issue #23163: [SPARK-26164][SQL] Allow FileFormatWriter to write multi...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23163
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5754/
Test PASSed.


---


[GitHub] spark issue #23163: [SPARK-26164][SQL] Allow FileFormatWriter to write multi...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23163
  
**[Test build #99708 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99708/testReport)**
 for PR 23163 at commit 
[`6cb993b`](https://github.com/apache/spark/commit/6cb993b26e6b6867b3315228b55624b98acf1dcb).


---


[GitHub] spark pull request #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as a...

2018-12-05 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/23231#discussion_r239008438
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/feature/OneHotEncoderEstimatorSuite.scala
 ---
@@ -0,0 +1,423 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.feature
+
+import org.apache.spark.ml.attribute.{AttributeGroup, BinaryAttribute, 
NominalAttribute}
+import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT}
+import org.apache.spark.ml.param.ParamsSuite
+import org.apache.spark.ml.util.{DefaultReadWriteTest, MLTest}
+import org.apache.spark.sql.{Encoder, Row}
+import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
+import org.apache.spark.sql.functions.col
+import org.apache.spark.sql.types._
+
+class OneHotEncoderEstimatorSuite extends MLTest with DefaultReadWriteTest 
{
--- End diff --

The fitting of OneHotEncoderEstimator is actually done by OneHotEncoder. 
OneHotEncoderEstimator is just an alias. I'm not sure if we really need to add 
this test suite for it.


---


[GitHub] spark issue #23163: [SPARK-26164][SQL] Allow FileFormatWriter to write multi...

2018-12-05 Thread heary-cao

Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/23163
  
retest this please


---


[GitHub] spark pull request #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as a...

2018-12-05 Thread viirya

GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/23231

[SPARK-26273][ML] Add OneHotEncoderEstimator as alias to OneHotEncoder

## What changes were proposed in this pull request?

SPARK-26133 removed the deprecated OneHotEncoder and renamed 
OneHotEncoderEstimator to OneHotEncoder.

Based on the ML migration doc, we need to keep OneHotEncoderEstimator as an 
alias to OneHotEncoder.

This PR adds it.
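
One common way to keep an old class name working after a rename is a thin subclass that inherits everything from the new class. The sketch below illustrates that idea with toy Python classes. They are stand-ins, not Spark's ML classes, and this message does not show how the PR itself implements the alias.

```python
class OneHotEncoder:
    """Toy stand-in for the renamed estimator; not Spark's implementation."""

    def __init__(self, input_col):
        self.input_col = input_col

    def fit(self, rows):
        # "Fitting" here only learns the number of categories from the indices.
        return max(row[self.input_col] for row in rows) + 1


class OneHotEncoderEstimator(OneHotEncoder):
    """Old name kept as a thin alias; all behavior is inherited."""


rows = [{"category": 0}, {"category": 2}, {"category": 1}]
new_size = OneHotEncoder("category").fit(rows)
old_size = OneHotEncoderEstimator("category").fit(rows)
print(new_size, old_size)
```

Because the alias adds no behavior of its own, code written against the old name gets the same result as code written against the new one.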

## How was this patch tested?

Added tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 one-hot-encoder-estimator-alias

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23231.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23231


commit 17160710cadc49b54f4385ae3ca9ddb0eb4034b0
Author: Liang-Chi Hsieh 
Date:   2018-12-05T09:27:58Z

Add OneHotEncoderEstimator as alias to OneHotEncoder.




---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23230
  
**[Test build #99706 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99706/testReport)**
 for PR 23230 at commit 
[`63b7183`](https://github.com/apache/spark/commit/63b71834b101c800973b73490640a44e507306d1).


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5752/
Test PASSed.


---


[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread mgaido91

Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/23213
  
Yes, I am wondering too: what is the difference between:

`spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN` 
and 
`spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=NO_CODEGEN`?


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23230
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23230
  
**[Test build #99705 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99705/testReport)**
 for PR 23230 at commit 
[`c84886a`](https://github.com/apache/spark/commit/c84886aef9a53d0d58ca4f0f68ece57ee80f88c8).


---


[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/23230
  
cc @HyukjinKwon @srowen 


---


[GitHub] spark pull request #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEnc...

2018-12-05 Thread viirya

GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/23230

[SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

## What changes were proposed in this pull request?

This fixes the doc of the renamed OneHotEncoder in PySpark.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 remove_one_hot_encoder_followup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23230.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23230


commit c84886aef9a53d0d58ca4f0f68ece57ee80f88c8
Author: Liang-Chi Hsieh 
Date:   2018-12-05T10:08:01Z

Fix doc for OneHotEncoder.




---


[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/23213
  
Sorry, my bad; it was about 2 times longer than the current master. That's 
because the current master already runs two config set patterns 
(`wholeStage=true,factoryMode=CODEGEN_ONLY` and 
`wholeStage=true,factoryMode=NO_CODEGEN`) in `SQLQueryTestSuite`. The second 
test run (`wholeStage=true,factoryMode=NO_CODEGEN`) was introduced in my 
previous PR (#22512).

IMHO the two config set patterns below could cover most code paths in Spark:
 - wholeStage=true, factoryMode=CODEGEN_ONLY
 - wholeStage=false, factoryMode=NO_CODEGEN

In this case, there is little change in the test time:
```
// the current master
=== Codegen/Interpreter Time Metrics ===
Total time: 358.584989321 seconds

Configs                                                                        Run Time
spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=NO_CODEGEN     165961038511
spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=CODEGEN_ONLY   192623950810

// with this pr
=== Codegen/Interpreter Time Metrics ===
Total time: 345.468455247 seconds

Configs                                                                        Run Time
spark.sql.codegen.wholeStage=true,spark.sql.codegen.factoryMode=CODEGEN_ONLY   196572976377
spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN    148895478870
```
WDYT?
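
As a quick sanity check on the two reports above, the per-config run times can be summed and compared with the reported totals. This is a small Python sketch of that arithmetic. It assumes the run times are in nanoseconds, which is inferred only from the fact that the sums match the totals in seconds; the variable names are illustrative.

```python
NANOS_PER_SECOND = 1_000_000_000

# Run times copied from the two reports above (assumed to be nanoseconds).
master_runs = {
    "wholeStage=true,factoryMode=NO_CODEGEN": 165961038511,
    "wholeStage=true,factoryMode=CODEGEN_ONLY": 192623950810,
}
pr_runs = {
    "wholeStage=true,factoryMode=CODEGEN_ONLY": 196572976377,
    "wholeStage=false,factoryMode=NO_CODEGEN": 148895478870,
}

master_total_ns = sum(master_runs.values())
pr_total_ns = sum(pr_runs.values())
saved_ns = master_total_ns - pr_total_ns

print(master_total_ns / NANOS_PER_SECOND)
print(pr_total_ns / NANOS_PER_SECOND)
print(f"{saved_ns / master_total_ns:.1%} shorter with the PR")
```

The sums reproduce both reported totals, and the difference between them is about 13 seconds.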


---


[GitHub] spark pull request #23195: [SPARK-26236][SS] Add kafka delegation token supp...

2018-12-05 Thread gaborgsomogyi

Github user gaborgsomogyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/23195#discussion_r238995809
  
--- Diff: docs/structured-streaming-kafka-integration.md ---
@@ -624,3 +624,57 @@ For experimenting on `spark-shell`, you can also use 
`--packages` to add `spark-
 
 See [Application Submission Guide](submitting-applications.html) for more 
details about submitting
 applications with external dependencies.
+
+## Security
+
+Kafka 0.9.0.0 introduced several features that increases security in a 
cluster. For detailed
+description about these possibilities, see [Kafka security 
docs](http://kafka.apache.org/documentation.html#security).
+
+It's worth noting that security is optional and turned off by default.
+
+Spark supports the following ways to authenticate against Kafka cluster:
+- **Delegation token (introduced in Kafka broker 1.1.0)**: This way the 
application can be configured
+  via Spark parameters and may not need JAAS login configuration (Spark 
can use Kafka's dynamic JAAS
+  configuration feature). For further information about delegation tokens, 
see
+  [Kafka delegation token 
docs](http://kafka.apache.org/documentation/#security_delegation_token).
+
+  The process is initiated by Spark's Kafka delegation token provider. 
When `spark.kafka.bootstrap.servers`
+  set Spark looks for authentication information in the following order 
and choose the first available to log in:
+  - **JAAS login configuration**
+  - **Keytab file**, such as,
+
+./bin/spark-submit \
+--keytab  \
+--principal  \
+--conf spark.kafka.bootstrap.servers= \
+...
+
+  - **Kerberos credential cache**, such as,
+
+./bin/spark-submit \
+--conf spark.kafka.bootstrap.servers= \
+...
+
+  Kafka delegation token provider can be turned off by setting 
`spark.security.credentials.kafka.enabled` to `false` (default: `true`).
+
+  Spark can be configured to use the following authentication protocols to 
obtain token:
+  - **SASL SSL (default)**: With `GSSAPI` mechanism Kerberos used for 
authentication and SSL for encryption.
+  - **SSL**: It's leveraging a capability from SSL called 2-way 
authentication. The server authenticates
+clients through certificates. Please note 2-way authentication must be 
enabled on Kafka brokers.
+  - **SASL PLAINTEXT (for testing)**: With `GSSAPI` mechanism Kerberos 
used for authentication but
+because there is no encryption it's only for testing purposes.
+
+  After obtaining delegation token successfully, Spark distributes it 
across nodes and renews it accordingly.
+  Delegation token uses `SCRAM` login module for authentication and 
because of that the appropriate
+  `sasl.mechanism` has to be configured on source/sink.
--- End diff --

It means exactly that. This was missing; I added an example.


---


[GitHub] spark pull request #23195: [SPARK-26236][SS] Add kafka delegation token supp...

2018-12-05 Thread gaborgsomogyi

Github user gaborgsomogyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/23195#discussion_r238995441
  
--- Diff: docs/structured-streaming-kafka-integration.md ---
@@ -624,3 +624,57 @@ For experimenting on `spark-shell`, you can also use 
`--packages` to add `spark-
 
 See [Application Submission Guide](submitting-applications.html) for more 
details about submitting
 applications with external dependencies.
+
+## Security
+
+Kafka 0.9.0.0 introduced several features that increases security in a 
cluster. For detailed
+description about these possibilities, see [Kafka security 
docs](http://kafka.apache.org/documentation.html#security).
+
+It's worth noting that security is optional and turned off by default.
+
+Spark supports the following ways to authenticate against Kafka cluster:
+- **Delegation token (introduced in Kafka broker 1.1.0)**: This way the 
application can be configured
+  via Spark parameters and may not need JAAS login configuration (Spark 
can use Kafka's dynamic JAAS
+  configuration feature). For further information about delegation tokens, 
see
+  [Kafka delegation token 
docs](http://kafka.apache.org/documentation/#security_delegation_token).
+
+  The process is initiated by Spark's Kafka delegation token provider. 
When `spark.kafka.bootstrap.servers`
+  set Spark looks for authentication information in the following order 
and choose the first available to log in:
+  - **JAAS login configuration**
+  - **Keytab file**, such as,
+
+./bin/spark-submit \
+--keytab  \
+--principal  \
+--conf spark.kafka.bootstrap.servers= \
+...
+
+  - **Kerberos credential cache**, such as,
+
+./bin/spark-submit \
+--conf spark.kafka.bootstrap.servers= \
+...
+
+  Kafka delegation token provider can be turned off by setting 
`spark.security.credentials.kafka.enabled` to `false` (default: `true`).
--- End diff --

Fixed.


---


[GitHub] spark pull request #23195: [SPARK-26236][SS] Add kafka delegation token supp...

2018-12-05 Thread gaborgsomogyi

Github user gaborgsomogyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/23195#discussion_r238995312
  
--- Diff: docs/structured-streaming-kafka-integration.md ---
@@ -624,3 +624,57 @@ For experimenting on `spark-shell`, you can also use 
`--packages` to add `spark-
 
 See [Application Submission Guide](submitting-applications.html) for more 
details about submitting
 applications with external dependencies.
+
+## Security
+
+Kafka 0.9.0.0 introduced several features that increases security in a 
cluster. For detailed
+description about these possibilities, see [Kafka security 
docs](http://kafka.apache.org/documentation.html#security).
+
+It's worth noting that security is optional and turned off by default.
+
+Spark supports the following ways to authenticate against Kafka cluster:
+- **Delegation token (introduced in Kafka broker 1.1.0)**: This way the 
application can be configured
+  via Spark parameters and may not need JAAS login configuration (Spark 
can use Kafka's dynamic JAAS
+  configuration feature). For further information about delegation tokens, 
see
+  [Kafka delegation token 
docs](http://kafka.apache.org/documentation/#security_delegation_token).
+
+  The process is initiated by Spark's Kafka delegation token provider. 
When `spark.kafka.bootstrap.servers`
+  set Spark looks for authentication information in the following order 
and choose the first available to log in:
--- End diff --

Fixed.


---


[GitHub] spark pull request #23195: [SPARK-26236][SS] Add kafka delegation token supp...

2018-12-05 Thread gaborgsomogyi

Github user gaborgsomogyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/23195#discussion_r238994314
  
--- Diff: docs/structured-streaming-kafka-integration.md ---
@@ -624,3 +624,56 @@ For experimenting on `spark-shell`, you can also use 
`--packages` to add `spark-
 
 See [Application Submission Guide](submitting-applications.html) for more 
details about submitting
 applications with external dependencies.
+
+## Security
+
+Kafka 0.9.0.0 introduced several features that increases security in a 
cluster. For detailed
+description about these possibilities, see [Kafka security 
docs](http://kafka.apache.org/documentation.html#security).
+
+It's worth noting that security is optional and turned off by default.
+
+Spark supports the following ways to authenticate against Kafka cluster:
+- **Delegation token (introduced in Kafka broker 1.1.0)**: This way the 
application can be configured
+  via Spark parameters and may not need JAAS login configuration (Spark 
can use Kafka's dynamic JAAS
+  configuration feature). For further information about delegation tokens, 
see
+  [Kafka delegation token 
docs](http://kafka.apache.org/documentation/#security_delegation_token).
+
+  The process is initiated by Spark's Kafka delegation token provider. 
This is enabled by default
+  but can be turned off with `spark.security.credentials.kafka.enabled`. 
When
+  `spark.kafka.bootstrap.servers` set Spark looks for authentication 
information in the following
+  order and choose the first available to log in:
+  - **JAAS login configuration**
+  - **Keytab file**, such as,
+
+./bin/spark-submit \
+--keytab  \
+--principal  \
+--conf spark.kafka.bootstrap.servers= \
+...
+
+  - **Kerberos credential cache**, such as,
+
+./bin/spark-submit \
+--conf spark.kafka.bootstrap.servers= \
+...
+
+  Spark supports the following authentication protocols to obtain token:
--- End diff --

OK, fixed.


---


[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-05 Thread gaborgsomogyi

Github user gaborgsomogyi commented on the issue:

https://github.com/apache/spark/pull/22952
  
@HeartSaVioR @steveloughran 
As I see it, not only `*` and `?` are missing but `[]` as well.

* Having a glob parser in Spark and supporting it would, I think, be too heavy and 
brittle.
* Considering this, I would solve it with warnings plus a caveat message in the 
doc (mentioning the slow globbing on object stores).

As a separate off-topic question, I am just wondering how Hadoop's globbing works if 
the expander doesn't support all the glob elements. Maybe other operators (like 
`[]`) are handled in a different part of the code?
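
A lightweight alternative to a full glob parser is to detect only whether a path contains glob metacharacters and warn in that case. The Python sketch below illustrates the idea. The helper `has_glob` is hypothetical, it covers only the three elements named in this comment (`*`, `?` and `[]`), and it is not Spark's or Hadoop's implementation.

```python
import re

# Only the glob elements mentioned in the comment: `*`, `?` and `[...]`.
GLOB_CHARS = re.compile(r"[*?\[\]]")


def has_glob(path: str) -> bool:
    """Hypothetical check: True if `path` contains any of the glob characters."""
    return GLOB_CHARS.search(path) is not None


for candidate in ["/data/2018-*/part", "/data/file[0-9].csv", "/data/plain/file.csv"]:
    if has_glob(candidate):
        print(f"warning: {candidate} contains a glob pattern; globbing may be slow")
```

A check like this does not expand anything, so it avoids maintaining a parser while still letting the caller log a warning or point to the caveat in the doc.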



---


[GitHub] spark issue #23229: [MINOR][CORE] Modify some field name because it may be c...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23229
  
Can one of the admins verify this patch?


---


[GitHub] spark pull request #23229: [MINOR][CORE] Modify some field name because it m...

2018-12-05 Thread wangjiaochun

GitHub user wangjiaochun opened a pull request:

https://github.com/apache/spark/pull/23229

[MINOR][CORE] Modify some field name because it may be cause confusion

## What changes were proposed in this pull request?
Different field names are used for tracking allocated data pages: 
class BytesToBytesMap uses the field name dataPages for allocated data pages, 
while classes UnsafeExternalSorter and ShuffleExternalSorter use the field name 
allocatedPages.
They all belong to memory consumers, so I think it is best to use a 
unified name.
In addition, the TaskMemoryManager field allocatedPages is renamed to 
pagesBitSet, to indicate that it functions as a bitmap.

## How was this patch tested?
Existing tests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangjiaochun/spark memory_consumer_name

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23229.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23229


commit 00fa455a6e145350a2bc5750df54cd0a9d1f0cdc
Author: 10087686 
Date:   2018-12-05T08:48:08Z

  modify field name in MemoryConsumer




---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23227
  
**[Test build #99704 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99704/testReport)**
 for PR 23227 at commit 
[`5cb416d`](https://github.com/apache/spark/commit/5cb416df5f03b0d750c83e1a8a344b8ea44b1735).


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5751/
Test PASSed.


---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23228
  
**[Test build #99703 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99703/testReport)**
 for PR 23228 at commit 
[`d5dadbf`](https://github.com/apache/spark/commit/d5dadbf30d5429c36ec3d5c2845a71c2717fd6f3).


---


[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23228
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5750/
Test PASSed.


---


[GitHub] spark issue #23228: [MINOR][DOC]The condition description of serialized shuf...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23228
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #23228: [MINOR][DOC]The condition description of serializ...

2018-12-05 Thread 10110346

GitHub user 10110346 opened a pull request:

https://github.com/apache/spark/pull/23228

[MINOR][DOC]The condition description of serialized shuffle is not very 
accurate

## What changes were proposed in this pull request?
`1. The shuffle dependency specifies no aggregation or output ordering.`
If the shuffle dependency specifies aggregation, but it only aggregates at 
the reducer side, serialized shuffle can still be used.
`3. The shuffle produces fewer than 16777216 output partitions.`
If the number of output partitions is exactly 16777216, serialized shuffle can 
still be used.
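
The two corrected conditions can be sketched as a small predicate. This is an illustrative Python model based only on the description above, not Spark's actual check: the function name and parameters are hypothetical, and any other condition for serialized shuffle is deliberately left out.

```python
# 16777216 == 2**24, the partition limit quoted in the description.
MAX_SERIALIZED_PARTITIONS = 16777216


def meets_described_conditions(map_side_combine: bool, num_partitions: int) -> bool:
    """Hypothetical model of conditions 1 and 3 as corrected in this PR.

    Aggregation is allowed as long as it happens only at the reducer side,
    and the partition limit is inclusive.
    """
    return (not map_side_combine) and num_partitions <= MAX_SERIALIZED_PARTITIONS


print(meets_described_conditions(False, 16777216))  # limit itself is allowed
print(meets_described_conditions(False, 16777217))  # one past the limit
print(meets_described_conditions(True, 10))         # map-side aggregation
```

The first call shows the boundary case that the original wording ("fewer than 16777216") excluded.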
## How was this patch tested?
N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/10110346/spark SerializedShuffle_doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23228.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23228


commit d5dadbf30d5429c36ec3d5c2845a71c2717fd6f3
Author: liuxian 
Date:   2018-12-05T08:55:20Z

fix




---


[GitHub] spark issue #23227: [SPARK-26271][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23227
  
ok to test


---


[GitHub] spark issue #23227: [SPARK-16958][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5749/
Test PASSed.


---


[GitHub] spark issue #23227: [SPARK-16958][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23227
  
**[Test build #99702 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99702/testReport)** for PR 23227 at commit [`5cb416d`](https://github.com/apache/spark/commit/5cb416df5f03b0d750c83e1a8a344b8ea44b1735).


---


[GitHub] spark issue #23227: [SPARK-16958][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #23226: [MINOR][TEST] Add MAXIMUM_PAGE_SIZE_BYTES Excepti...

2018-12-05 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23226#discussion_r238977650
  
--- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java ---
@@ -622,6 +622,17 @@ public void initialCapacityBoundsChecking() {
 } catch (IllegalArgumentException e) {
   // expected exception
 }
+
+try {
+  new BytesToBytesMap(
+  taskMemoryManager,
--- End diff --

Let's keep the indentation consistent


---


[GitHub] spark issue #23227: [SPARK-16958][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread heary-cao

Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/23227
  
cc @cloud-fan, @gatorsmile, @hvanhovell, @davies


---


[GitHub] spark issue #23225: [MINOR][CORE]Don't need to create an empty spill file wh...

2018-12-05 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23225
  
Also, it needs a JIRA. It's not a minor one.


---


[GitHub] spark issue #23227: [SPARK-16958][FOLLOW-UP][SQL] remove unuse object SparkP...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23227
  
Can one of the admins verify this patch?


---


[GitHub] spark issue #23225: [MINOR][CORE]Don't need to create an empty spill file wh...

2018-12-05 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23225
  
How do the existing tests cover whether the empty file is created or not?


---


[GitHub] spark pull request #23227: [SPARK-16958][FOLLOW-UP][SQL] remove unuse object...

2018-12-05 Thread heary-cao

GitHub user heary-cao opened a pull request:

https://github.com/apache/spark/pull/23227

[SPARK-16958][FOLLOW-UP][SQL] remove unuse object SparkPlan

## What changes were proposed in this pull request?

This code comes from PR: https://github.com/apache/spark/pull/11190,
but it has not been used since PR: 
https://github.com/apache/spark/pull/14548.
Let's continue to fix it. Thanks.

## How was this patch tested?

N / A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/heary-cao/spark unuseSparkPlan

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23227.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23227


commit 5cb416df5f03b0d750c83e1a8a344b8ea44b1735
Author: caoxuewen 
Date:   2018-12-05T08:52:23Z

[SPARK-16958][FOLLOW-UP][SQL] remove unuse object SparkPlan




---


[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23222
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23222
  
**[Test build #99701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99701/testReport)** for PR 23222 at commit [`1270e89`](https://github.com/apache/spark/commit/1270e89026d80c862137c03edbeee53e56f3ed6d).


---


[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23222
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5748/
Test PASSed.


---


[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-05 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/23222
  
Retest this please.


---


[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-05 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22683
  
**[Test build #99700 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99700/testReport)** for PR 22683 at commit [`8cc05a5`](https://github.com/apache/spark/commit/8cc05a57e8ecaa3e2a2f67d125b12645bb4eb3a2).


---


[GitHub] spark issue #23226: [MINOR][TEST] Add MAXIMUM_PAGE_SIZE_BYTES Exception test

2018-12-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23226
  
Can one of the admins verify this patch?


---


