[spark] branch master updated (045106e -> 9f540fa)

2020-06-23 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 045106e  [SPARK-32072][CORE][TESTS] Fix table formatting with benchmark results
 add 9f540fa  [SPARK-32062][SQL] Reset listenerRegistered in SparkSession

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/sql/SparkSession.scala  |  1 +
 .../org/apache/spark/sql/SparkSessionBuilderSuite.scala | 17 +
 2 files changed, 18 insertions(+)
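
The one-line SparkSession.scala change itself is not quoted in this notification. As a rough, hypothetical illustration of the pattern a "reset listenerRegistered" fix typically touches (a one-time registration flag that must be cleared when the owning session stops, so that a later session registers its listener again), here is a minimal Scala sketch; the object and method names below are invented for illustration and are not taken from the Spark source:

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical sketch of a one-time listener-registration guard.
// Names (ListenerGuard, registerOnce, reset) are illustrative, not from Spark.
object ListenerGuard {
  private val listenerRegistered = new AtomicBoolean(false)

  // Invoke the registration callback at most once per "registered" period.
  def registerOnce(register: () => Unit): Unit = {
    if (listenerRegistered.compareAndSet(false, true)) {
      register()
    }
  }

  // Without a reset like this, a session created after the previous one stops
  // would silently skip registration; clearing the flag is the kind of
  // one-line change the commit title describes.
  def reset(): Unit = listenerRegistered.set(false)
}
```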


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (986fa01 -> 045106e)

2020-06-23 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 986fa01  [SPARK-32075][DOCS] Fix a few issues in parameters table
 add 045106e  [SPARK-32072][CORE][TESTS] Fix table formatting with benchmark results

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/benchmark/Benchmark.scala |  9 +++
 .../MakeDateTimeBenchmark-jdk11-results.txt| 28 +++---
 .../benchmarks/MakeDateTimeBenchmark-results.txt   | 28 +++---
 3 files changed, 33 insertions(+), 32 deletions(-)
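
The touched Benchmark.scala is Spark's internal micro-benchmark harness, and the two *-results.txt files above hold the tables it prints. As a hedged sketch of how such a benchmark is typically written (this is a test-scope utility, so the exact constructor and method signatures may differ across Spark versions), the block below shows the general shape of the code that produces the table whose formatting SPARK-32072 fixes:

```scala
import org.apache.spark.benchmark.Benchmark

// Hedged sketch of Spark's internal benchmark harness (test scope only).
// The case body here is a placeholder loop, not the real MakeDateTime workload.
object MakeDateTimeBenchmarkSketch {
  def main(args: Array[String]): Unit = {
    val rows = 1000000L
    val benchmark = new Benchmark("make_date()", valuesPerIteration = rows)

    benchmark.addCase("placeholder case") { _ =>
      var i = 0L
      while (i < rows) { i += 1 } // stand-in for the measured work
    }

    // run() prints the aligned results table whose column layout this commit adjusts.
    benchmark.run()
  }
}
```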


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32075][DOCS] Fix a few issues in parameters table

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new a36140c  [SPARK-32075][DOCS] Fix a few issues in parameters table
a36140c is described below

commit a36140c3c300beaf50d19381ac72e2524f888e53
Author: sidedoorleftroad 
AuthorDate: Wed Jun 24 13:39:55 2020 +0900

[SPARK-32075][DOCS] Fix a few issues in parameters table

### What changes were proposed in this pull request?

Fix a few issues in the parameters tables in the structured-streaming-kafka-integration doc.

### Why are the changes needed?

Make the table headers consistent with the data in each column.

### Does this PR introduce _any_ user-facing change?

Yes.

Before:

![image](https://user-images.githubusercontent.com/67275816/85414316-8475e300-b59e-11ea-84ec-fa78ecc980b3.png)
After:

![image](https://user-images.githubusercontent.com/67275816/85414562-d61e6d80-b59e-11ea-9fe6-247e0ad4d9ee.png)

Before:

![image](https://user-images.githubusercontent.com/67275816/85414467-b8510880-b59e-11ea-92a0-7205542fe28b.png)
After:

![image](https://user-images.githubusercontent.com/67275816/85414589-de76a880-b59e-11ea-91f2-5073eaf3444b.png)

Before:

![image](https://user-images.githubusercontent.com/67275816/85414502-c69f2480-b59e-11ea-837f-1201f10a56b6.png)
After:

![image](https://user-images.githubusercontent.com/67275816/85414615-e9313d80-b59e-11ea-9b1a-fc11da0b6bc5.png)

### How was this patch tested?

Manually build and check.

Closes #28910 from sidedoorleftroad/SPARK-32075.

Authored-by: sidedoorleftroad 
Signed-off-by: HyukjinKwon 
(cherry picked from commit 986fa01747db4b52bb8ca1165e759ca2d46d26ff)
Signed-off-by: HyukjinKwon 
---
 docs/structured-streaming-kafka-integration.md | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/structured-streaming-kafka-integration.md 
b/docs/structured-streaming-kafka-integration.md
index 016faa7..8dc2a73 100644
--- a/docs/structured-streaming-kafka-integration.md
+++ b/docs/structured-streaming-kafka-integration.md
@@ -528,28 +528,28 @@ The following properties are available to configure the 
consumer pool:
 Property NameDefaultMeaningSince 
Version
 
   spark.kafka.consumer.cache.capacity
-  The maximum number of consumers cached. Please note that it's a soft 
limit.
   64
+  The maximum number of consumers cached. Please note that it's a soft 
limit.
   3.0.0
 
 
   spark.kafka.consumer.cache.timeout
-  The minimum amount of time a consumer may sit idle in the pool before it 
is eligible for eviction by the evictor.
   5m (5 minutes)
+  The minimum amount of time a consumer may sit idle in the pool before it 
is eligible for eviction by the evictor.
   3.0.0
 
 
   spark.kafka.consumer.cache.evictorThreadRunInterval
-  The interval of time between runs of the idle evictor thread for 
consumer pool. When non-positive, no idle evictor thread will be run.
   1m (1 minute)
+  The interval of time between runs of the idle evictor thread for 
consumer pool. When non-positive, no idle evictor thread will be run.
   3.0.0
 
 
   spark.kafka.consumer.cache.jmx.enable
+  false
   Enable or disable JMX for pools created with this configuration 
instance. Statistics of the pool are available via JMX instance.
   The prefix of JMX name is set to 
"kafka010-cached-simple-kafka-consumer-pool".
   
-  false
   3.0.0
 
 
@@ -578,14 +578,14 @@ The following properties are available to configure the 
fetched data pool:
 Property NameDefaultMeaningSince 
Version
 
   spark.kafka.consumer.fetchedData.cache.timeout
-  The minimum amount of time a fetched data may sit idle in the pool 
before it is eligible for eviction by the evictor.
   5m (5 minutes)
+  The minimum amount of time a fetched data may sit idle in the pool 
before it is eligible for eviction by the evictor.
   3.0.0
 
 
   spark.kafka.consumer.fetchedData.cache.evictorThreadRunInterval
-  The interval of time between runs of the idle evictor thread for fetched 
data pool. When non-positive, no idle evictor thread will be run.
   1m (1 minute)
+  The interval of time between runs of the idle evictor thread for fetched 
data pool. When non-positive, no idle evictor thread will be run.
   3.0.0
 
 
@@ -825,14 +825,14 @@ The following properties are available to configure the 
producer pool:
 Property NameDefaultMeaningSince 
Version
 
   spark.kafka.producer.cache.timeout
-  The minimum amount of time a producer may sit idle in the pool before it 
is eligible for eviction by the evictor.
   10m (10 minutes)
+  The minimum amount of time a producer may sit idle in the pool before it 
is eligible for eviction by the evictor.
   2.2.1
 
 
   spark.kafka.producer.cache.evictorThreadRunInterval
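
The table diff above comes from an HTML table that the archive has rendered as plain text, so the cell markup is lost and the listing is cut off; the substance of the change is that the Default value and the Meaning text are swapped so that each column matches its header. For reference, these documented pool settings are ordinary Spark configuration keys. A hedged sketch of setting them from user code follows; the property names are taken from the table above, while the master, app name, and values are only illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: the Kafka pool properties documented above are plain Spark configs.
// The values shown are the documented defaults used as examples, not recommendations.
val spark = SparkSession.builder()
  .master("local[*]") // illustrative; use your real master
  .appName("kafka-pool-config-sketch")
  .config("spark.kafka.consumer.cache.capacity", "64")
  .config("spark.kafka.consumer.cache.timeout", "5m")
  .config("spark.kafka.consumer.cache.evictorThreadRunInterval", "1m")
  .config("spark.kafka.producer.cache.timeout", "10m")
  .getOrCreate()
```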

[spark] branch master updated (eedc6cc -> 986fa01)

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from eedc6cc  [SPARK-32028][WEBUI] fix app id link for multi attempts app in history summary page
 add 986fa01  [SPARK-32075][DOCS] Fix a few issues in parameters table

No new revisions were added by this update.

Summary of changes:
 docs/structured-streaming-kafka-integration.md | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32028][WEBUI] fix app id link for multi attempts app in history summary page

2020-06-23 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 25b7d8b  [SPARK-32028][WEBUI] fix app id link for multi attempts app in history summary page
25b7d8b is described below

commit 25b7d8b6e6e3a11f589badfa4bbbd904b1195307
Author: Zhen Li 
AuthorDate: Tue Jun 23 21:43:02 2020 -0500

[SPARK-32028][WEBUI] fix app id link for multi attempts app in history 
summary page

### What changes were proposed in this pull request?

Fix the app id link for multi-attempt applications in the history summary page.
If an attempt id is available (YARN), the app id link url will contain the correct 
attempt id, like `/history/application_1561589317410_0002/1/jobs/`.
If an attempt id is not available (standalone), the app id link url will not 
contain a fake attempt id, like `/history/app-20190404053606-/jobs/`.

### Why are the changes needed?

This PR fixes 
[32028](https://issues.apache.org/jira/browse/SPARK-32028). The app id link uses 
the application attempt count as the attempt id, which makes the link url wrong in 
the following cases:
1. there are multiple attempts, and all links point to the last attempt

![multi_same](https://user-images.githubusercontent.com/10524738/85098505-c45c5500-b1af-11ea-8912-fa5fd72ce064.JPG)

2. there is only one attempt, but its attempt id is not 1 (an earlier attempt may 
have crashed or failed to generate an event file); the link url then points to the 
wrong attempt (1).

![wrong_attemptJPG](https://user-images.githubusercontent.com/10524738/85098513-c9b99f80-b1af-11ea-8cbc-fd7f745c1080.JPG)

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Tested this manually.

Closes #28867 from zhli1142015/fix-appid-link-in-history-page.

Authored-by: Zhen Li 
Signed-off-by: Sean Owen 
(cherry picked from commit eedc6cc37df9b32995f41bd0e1779101ba1df1b8)
Signed-off-by: Sean Owen 
---
 .../resources/org/apache/spark/ui/static/historypage-template.html  | 6 +++---
 core/src/main/resources/org/apache/spark/ui/static/historypage.js   | 5 +++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git 
a/core/src/main/resources/org/apache/spark/ui/static/historypage-template.html 
b/core/src/main/resources/org/apache/spark/ui/static/historypage-template.html
index 33eb7bf..7e9927d 100644
--- 
a/core/src/main/resources/org/apache/spark/ui/static/historypage-template.html
+++ 
b/core/src/main/resources/org/apache/spark/ui/static/historypage-template.html
@@ -77,12 +77,12 @@
   
   {{#applications}}
 
+  {{#attempts}}
   {{version}}
-  {{id}}
+  {{id}}
   {{name}}
-  {{#attempts}}
   {{#hasMultipleAttempts}}
-  {{attemptId}}
+  {{attemptId}}
   {{/hasMultipleAttempts}}
   {{startTime}}
   {{#showCompletedColumns}}
diff --git a/core/src/main/resources/org/apache/spark/ui/static/historypage.js 
b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
index 4df5f07..3a4c815 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/historypage.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
@@ -130,7 +130,7 @@ $(document).ready(function() {
 if (app["attempts"].length > 1) {
 hasMultipleAttempts = true;
 }
-var num = app["attempts"].length;
+
 for (var j in app["attempts"]) {
   var attempt = app["attempts"][j];
   attempt["startTime"] = formatTimeMillis(attempt["startTimeEpoch"]);
@@ -140,7 +140,8 @@ $(document).ready(function() {
 (attempt.hasOwnProperty("attemptId") ? attempt["attemptId"] + "/" 
: "") + "logs";
   attempt["durationMillisec"] = attempt["duration"];
   attempt["duration"] = formatDuration(attempt["duration"]);
-  var app_clone = {"id" : id, "name" : name, "version": version, "num" 
: num, "attempts" : [attempt]};
+  var hasAttemptId = attempt.hasOwnProperty("attemptId");
+  var app_clone = {"id" : id, "name" : name, "version": version, 
"hasAttemptId" : hasAttemptId, "attempts" : [attempt]};
   array.push(app_clone);
 }
   }
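
In terms of the resulting links, the patched historypage.js now includes the attempt id in the URL only when the application actually has one, as described in the commit message. A hedged sketch of that rule, written here in Scala rather than the patched JavaScript, with a hypothetical helper name:

```scala
// Sketch of the link rule from the commit message (Scala stand-in for the
// patched historypage.js logic); buildAppLink is a hypothetical helper name.
def buildAppLink(uiRoot: String, appId: String, attemptId: Option[String]): String = {
  attemptId match {
    // Attempt id available (e.g. YARN): /history/application_1561589317410_0002/1/jobs/
    case Some(attempt) => s"$uiRoot/history/$appId/$attempt/jobs/"
    // No attempt id (e.g. standalone): no fake attempt segment in the link
    case None => s"$uiRoot/history/$appId/jobs/"
  }
}
```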


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (b62e253 -> eedc6cc)

2020-06-23 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from b62e253  [SPARK-32073][R] Drop R < 3.5 support
 add eedc6cc  [SPARK-32028][WEBUI] fix app id link for multi attempts app in history summary page

No new revisions were added by this update.

Summary of changes:
 .../resources/org/apache/spark/ui/static/historypage-template.html  | 6 +++---
 core/src/main/resources/org/apache/spark/ui/static/historypage.js   | 5 +++--
 2 files changed, 6 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-2.4 updated: [SPARK-32073][R] Drop R < 3.5 support

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new 77006b2  [SPARK-32073][R] Drop R < 3.5 support
77006b2 is described below

commit 77006b2c65e0e4b6b9facddbb13aa88a264adbe2
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:05:27 2020 +0900

[SPARK-32073][R] Drop R < 3.5 support

Spark 3.0 accidentally dropped support for R < 3.5. It is built by R 3.6.3, which 
does not support R < 3.5:

```
Error in readRDS(pfile) : cannot read workspace version 3 written by R 
3.6.3; need R 3.5.0 or newer version.
```

In fact, with SPARK-31918, we will have to drop R < 3.5 entirely to support 
R 4.0.0. This is unavoidable for a CRAN release, because CRAN requires the 
tests to pass with the latest R.

To show the supported versions correctly, and support R 4.0.0 to unblock 
the releases.

In fact, no because Spark 3.0.0 already does not work with R < 3.5.
Compared to Spark 2.4, yes. R < 3.5 would not work.

Jenkins should test it out.

Closes #28908 from HyukjinKwon/SPARK-32073.

Authored-by: HyukjinKwon 
Signed-off-by: HyukjinKwon 
(cherry picked from commit b62e2536db9def0d11605ceac8990f72a515e9a0)
Signed-off-by: HyukjinKwon 
---
 R/WINDOWS.md  | 2 +-
 R/pkg/DESCRIPTION | 2 +-
 docs/index.md | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/R/WINDOWS.md b/R/WINDOWS.md
index da668a6..73f8a21 100644
--- a/R/WINDOWS.md
+++ b/R/WINDOWS.md
@@ -2,7 +2,7 @@
 
 To build SparkR on Windows, the following steps are required
 
-1. Install R (>= 3.1) and 
[Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to
+1. Install R (>= 3.5) and 
[Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to
 include Rtools and R in `PATH`.
 
 2. Install
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index b70014d..2940d04 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -15,7 +15,7 @@ URL: https://www.apache.org/ https://spark.apache.org/
 BugReports: https://spark.apache.org/contributing.html
 SystemRequirements: Java (== 8)
 Depends:
-R (>= 3.0),
+R (>= 3.5),
 methods
 Suggests:
 knitr,
diff --git a/docs/index.md b/docs/index.md
index 52f1a5a..73cb57a 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -31,7 +31,7 @@ Spark runs on both Windows and UNIX-like systems (e.g. Linux, 
Mac OS). It's easy
 locally on one machine --- all you need is to have `java` installed on your 
system `PATH`,
 or the `JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 
{{site.SPARK_VERSION}}
+Spark runs on Java 8, Python 2.7+/3.4+ and R 3.5+. For the Scala API, Spark 
{{site.SPARK_VERSION}}
 uses Scala {{site.SCALA_BINARY_VERSION}}. You will need to use a compatible 
Scala version
 ({{site.SCALA_BINARY_VERSION}}.x).
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32073][R] Drop R < 3.5 support

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new da8133f  [SPARK-32073][R] Drop R < 3.5 support
da8133f is described below

commit da8133f8f1df7d2ddfa995974d9a7db06ff4cd5a
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:05:27 2020 +0900

[SPARK-32073][R] Drop R < 3.5 support

### What changes were proposed in this pull request?

Spark 3.0 accidentally dropped support for R < 3.5. It is built with R 3.6.3, 
which does not support R < 3.5:

```
Error in readRDS(pfile) : cannot read workspace version 3 written by R 
3.6.3; need R 3.5.0 or newer version.
```

In fact, with SPARK-31918, we will have to drop R < 3.5 entirely to support 
R 4.0.0. This is unavoidable for releasing on CRAN, because CRAN requires the 
tests to pass with the latest R.
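
For reference, here is a minimal sketch (not part of this patch) of the kind 
of runtime guard implied by the new R (>= 3.5) floor; the message text and 
variable name are illustrative:

```r
# Illustrative only: refuse to proceed when the running R is older than 3.5.
# utils::compareVersion() returns -1 when its first argument is the older one.
running <- paste0(R.version$major, ".", R.version$minor)
if (utils::compareVersion(running, "3.5.0") == -1) {
  stop("SparkR requires R (>= 3.5); found R ", running)
}
```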

### Why are the changes needed?

To show the supported versions correctly, and support R 4.0.0 to unblock 
the releases.

### Does this PR introduce _any_ user-facing change?

In fact, no, because Spark 3.0.0 already does not work with R < 3.5.
Compared to Spark 2.4, yes: R < 3.5 would no longer work.

### How was this patch tested?

Jenkins should test it out.

Closes #28908 from HyukjinKwon/SPARK-32073.

Authored-by: HyukjinKwon 
Signed-off-by: HyukjinKwon 
(cherry picked from commit b62e2536db9def0d11605ceac8990f72a515e9a0)
Signed-off-by: HyukjinKwon 
---
 R/WINDOWS.md | 4 ++--
 R/pkg/DESCRIPTION| 2 +-
 R/pkg/inst/profile/general.R | 4 
 R/pkg/inst/profile/shell.R   | 4 
 docs/index.md| 3 +--
 5 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/R/WINDOWS.md b/R/WINDOWS.md
index dbc2717..9fe4a22b 100644
--- a/R/WINDOWS.md
+++ b/R/WINDOWS.md
@@ -22,8 +22,8 @@ To build SparkR on Windows, the following steps are required
 
 1. Make sure `bash` is available and in `PATH` if you already have a built-in 
`bash` on Windows. If you do not have, install 
[Cygwin](https://www.cygwin.com/).
 
-2. Install R (>= 3.1) and 
[Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
-include Rtools and R in `PATH`. Note that support for R prior to version 3.4 
is deprecated as of Spark 3.0.0.
+2. Install R (>= 3.5) and 
[Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
+include Rtools and R in `PATH`.
 
 3. Install JDK that SparkR supports (see `R/pkg/DESCRIPTION`), and set 
`JAVA_HOME` in the system environment variables.
 
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 21f3eaa..86514f2 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -15,7 +15,7 @@ URL: https://www.apache.org/ https://spark.apache.org/
 BugReports: https://spark.apache.org/contributing.html
 SystemRequirements: Java (>= 8, < 12)
 Depends:
-R (>= 3.1),
+R (>= 3.5),
 methods
 Suggests:
 knitr,
diff --git a/R/pkg/inst/profile/general.R b/R/pkg/inst/profile/general.R
index 3efb460..8c75c19 100644
--- a/R/pkg/inst/profile/general.R
+++ b/R/pkg/inst/profile/general.R
@@ -16,10 +16,6 @@
 #
 
 .First <- function() {
-  if (utils::compareVersion(paste0(R.version$major, ".", R.version$minor), 
"3.4.0") == -1) {
-warning("Support for R prior to version 3.4 is deprecated since Spark 
3.0.0")
-  }
-
   packageDir <- Sys.getenv("SPARKR_PACKAGE_DIR")
   dirs <- strsplit(packageDir, ",")[[1]]
   .libPaths(c(dirs, .libPaths()))
diff --git a/R/pkg/inst/profile/shell.R b/R/pkg/inst/profile/shell.R
index e4e0d03..f6c20e1 100644
--- a/R/pkg/inst/profile/shell.R
+++ b/R/pkg/inst/profile/shell.R
@@ -16,10 +16,6 @@
 #
 
 .First <- function() {
-  if (utils::compareVersion(paste0(R.version$major, ".", R.version$minor), 
"3.4.0") == -1) {
-warning("Support for R prior to version 3.4 is deprecated since Spark 
3.0.0")
-  }
-
   home <- Sys.getenv("SPARK_HOME")
   .libPaths(c(file.path(home, "R", "lib"), .libPaths()))
   Sys.setenv(NOAWT = 1)
diff --git a/docs/index.md b/docs/index.md
index 38f12dd4..c0771ca 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -44,10 +44,9 @@ source, visit [Building Spark](building-spark.html).
 
 Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS), and it 
should run on any platform that runs a supported version of Java. This should 
include JVMs on x86_64 and ARM64. It's easy to run locally on one machine --- 
all you need is to have `java` installed on your system `PATH`, or the 
`JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8/11, Scala 2.12, Python 2.7+/3.4+ and R 3.1+.
+Spark runs on Java 8/11, Scala 2.12, Python 2.7+/3.4+ and R 3.5+.
 Java 8 prior to version 8u92 support is deprecated as of Spark 3.0.0.
 Python 2 and Python 3 prior to version 3.6 support is deprecated as of Spark 
3.0.0.
-R prior to 

[spark] branch master updated (11d2b07 -> b62e253)

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 11d2b07  [SPARK-31918][R] Ignore S4 generic methods under SparkR 
namespace in closure cleaning to support R 4.0.0+
 add b62e253  [SPARK-32073][R] Drop R < 3.5 support

No new revisions were added by this update.

Summary of changes:
 R/WINDOWS.md | 4 ++--
 R/pkg/DESCRIPTION| 2 +-
 R/pkg/inst/profile/general.R | 4 
 R/pkg/inst/profile/shell.R   | 4 
 docs/index.md| 3 +--
 5 files changed, 4 insertions(+), 13 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-2.4 updated: [SPARK-31918][R] Ignore S4 generic methods under SparkR namespace in closure cleaning to support R 4.0.0+

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new 29873c9  [SPARK-31918][R] Ignore S4 generic methods under SparkR 
namespace in closure cleaning to support R 4.0.0+
29873c9 is described below

commit 29873c9126503bbd3edfd523d8531b3644a2dd65
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:03:05 2020 +0900

[SPARK-31918][R] Ignore S4 generic methods under SparkR namespace in 
closure cleaning to support R 4.0.0+

### What changes were proposed in this pull request?

This PR proposes to ignore S4 generic methods under SparkR namespace in 
closure cleaning to support R 4.0.0+.

Currently, when you run code that executes native R code, it fails as below 
with R 4.0.0:

```r
df <- createDataFrame(lapply(seq(100), function (e) list(value=e)))
count(dapply(df, function(x) as.data.frame(x[x$value < 50,]), schema(df)))
```

```
org.apache.spark.SparkException: R unexpectedly exited.
R worker produced errors: Error in lapply(part, FUN) : attempt to bind a 
variable to R_UnboundValue
```

The root cause seems to be that an S4 generic method is manually included in 
the closure's environment via `SparkR:::cleanClosure`. For example, when an 
RRDD is created via `createDataFrame`, which calls `lapply` to do the 
conversion, `lapply` itself:


https://github.com/apache/spark/blob/f53d8c63e80172295e2fbc805c0c391bdececcaa/R/pkg/R/RDD.R#L484

is added to the environment of the cleaned closure because it is not in an 
exposed namespace; however, this is broken in R 4.0.0+ for an unknown reason, 
with an error message such as "attempt to bind a variable to R_UnboundValue".

Actually, we don't need to add `lapply` to the closure's environment because 
it is not supposed to be called on the worker side. In fact, to my 
understanding, no private generic methods are supposed to be called on the 
worker side in SparkR at all.

Therefore, this PR takes the simpler path of working around the issue by 
explicitly excluding S4 generic methods under the SparkR namespace, so that 
SparkR supports R 4.0.0.
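
As a rough illustration of that idea (this is not the actual patch, and the 
helper name is hypothetical), a check for whether a symbol resolves to an S4 
generic defined in the SparkR namespace could look like this:

```r
# Hypothetical sketch: TRUE when `name` names an S4 generic that lives in the
# SparkR namespace, so closure cleaning could skip capturing it.
is_sparkr_s4_generic <- function(name) {
  ns <- asNamespace("SparkR")
  exists(name, envir = ns, inherits = FALSE) &&
    methods::isGeneric(name, where = ns)
}
```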

### Why are the changes needed?

To support R 4.0.0+ with SparkR and unblock the releases on CRAN. CRAN 
requires the tests to pass with the latest R.

### Does this PR introduce _any_ user-facing change?

Yes, it adds R 4.0.0 support for end users.

### How was this patch tested?

Manually tested. Both CRAN and tests with R 4.0.1:

```
══ testthat results  
═══
[ OK: 13 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 0 ]
✔ |  OK F W S | Context
✔ |  11   | binary functions [2.5 s]
✔ |   4   | functions on binary files [2.1 s]
✔ |   2   | broadcast variables [0.5 s]
✔ |   5   | functions in client.R
✔ |  46   | test functions in sparkR.R [6.3 s]
✔ |   2   | include R packages [0.3 s]
✔ |   2   | JVM API [0.2 s]
✔ |  75   | MLlib classification algorithms, except for tree-based 
algorithms [86.3 s]
✔ |  70   | MLlib clustering algorithms [44.5 s]
✔ |   6   | MLlib frequent pattern mining [3.0 s]
✔ |   8   | MLlib recommendation algorithms [9.6 s]
✔ | 136   | MLlib regression algorithms, except for tree-based 
algorithms [76.0 s]
✔ |   8   | MLlib statistics algorithms [0.6 s]
✔ |  94   | MLlib tree-based algorithms [85.2 s]
✔ |  29   | parallelize() and collect() [0.5 s]
✔ | 428   | basic RDD functions [25.3 s]
✔ |  39   | SerDe functionality [2.2 s]
✔ |  20   | partitionBy, groupByKey, reduceByKey etc. [3.9 s]
✔ |   4   | functions in sparkR.R
✔ |  16   | SparkSQL Arrow optimization [19.2 s]
✔ |   6   | test show SparkDataFrame when eager execution is enabled. 
[1.1 s]
✔ | 1175   | SparkSQL functions [134.8 s]
✔ |  42   | Structured Streaming [478.2 s]
✔ |  16   | tests RDD function take() [1.1 s]
✔ |  14   | the textFile() function [2.9 s]
✔ |  46   | functions in utils.R [0.7 s]
✔ |   0 1 | Windows-specific tests


test_Windows.R:22: skip: sparkJars tag in SparkContext
Reason: This test is only for Windows, skipped



══ Results 
═
Duration: 987.3 s

OK:   2304
Failed:   0
Warnings: 0
Skipped:  1
...
Status: OK
+ popd
Tests passed.
```

Note that I tested to build SparkR in R 4.0.0, and run the 

[spark] branch branch-3.0 updated: [SPARK-31918][R] Ignore S4 generic methods under SparkR namespace in closure cleaning to support R 4.0.0+

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new f18c0f6  [SPARK-31918][R] Ignore S4 generic methods under SparkR 
namespace in closure cleaning to support R 4.0.0+
f18c0f6 is described below

commit f18c0f61f8cd3757fc28078a3d92ce69babf04b3
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:03:05 2020 +0900

[SPARK-31918][R] Ignore S4 generic methods under SparkR namespace in 
closure cleaning to support R 4.0.0+

### What changes were proposed in this pull request?

This PR proposes to ignore S4 generic methods under SparkR namespace in 
closure cleaning to support R 4.0.0+.

Currently, when you run code that executes native R code, it fails as below 
with R 4.0.0:

```r
df <- createDataFrame(lapply(seq(100), function (e) list(value=e)))
count(dapply(df, function(x) as.data.frame(x[x$value < 50,]), schema(df)))
```

```
org.apache.spark.SparkException: R unexpectedly exited.
R worker produced errors: Error in lapply(part, FUN) : attempt to bind a 
variable to R_UnboundValue
```

The root cause seems to be that an S4 generic method is manually included in 
the closure's environment via `SparkR:::cleanClosure`. For example, when an 
RRDD is created via `createDataFrame`, which calls `lapply` to do the 
conversion, `lapply` itself:


https://github.com/apache/spark/blob/f53d8c63e80172295e2fbc805c0c391bdececcaa/R/pkg/R/RDD.R#L484

is added to the environment of the cleaned closure because it is not in an 
exposed namespace; however, this is broken in R 4.0.0+ for an unknown reason, 
with an error message such as "attempt to bind a variable to R_UnboundValue".

Actually, we don't need to add `lapply` to the closure's environment because 
it is not supposed to be called on the worker side. In fact, to my 
understanding, no private generic methods are supposed to be called on the 
worker side in SparkR at all.

Therefore, this PR takes the simpler path of working around the issue by 
explicitly excluding S4 generic methods under the SparkR namespace, so that 
SparkR supports R 4.0.0.
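
To see what closure cleaning captures for a given function, one can inspect 
the environment of the cleaned closure. This is only an illustration against 
the internal `SparkR:::cleanClosure` helper mentioned above (its behaviour may 
vary across versions), and the example closure is made up:

```r
# Illustration only: list the free variables captured for a user function.
threshold <- 50
f <- function(x) as.data.frame(x[x$value < threshold, ])
cleaned <- SparkR:::cleanClosure(f)
ls(environment(cleaned))  # expected to include "threshold"
```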

### Why are the changes needed?

To support R 4.0.0+ with SparkR and unblock the releases on CRAN. CRAN 
requires the tests to pass with the latest R.

### Does this PR introduce _any_ user-facing change?

Yes, it adds R 4.0.0 support for end users.

### How was this patch tested?

Manually tested. Both CRAN and tests with R 4.0.1:

```
══ testthat results  
═══
[ OK: 13 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 0 ]
✔ |  OK F W S | Context
✔ |  11   | binary functions [2.5 s]
✔ |   4   | functions on binary files [2.1 s]
✔ |   2   | broadcast variables [0.5 s]
✔ |   5   | functions in client.R
✔ |  46   | test functions in sparkR.R [6.3 s]
✔ |   2   | include R packages [0.3 s]
✔ |   2   | JVM API [0.2 s]
✔ |  75   | MLlib classification algorithms, except for tree-based 
algorithms [86.3 s]
✔ |  70   | MLlib clustering algorithms [44.5 s]
✔ |   6   | MLlib frequent pattern mining [3.0 s]
✔ |   8   | MLlib recommendation algorithms [9.6 s]
✔ | 136   | MLlib regression algorithms, except for tree-based 
algorithms [76.0 s]
✔ |   8   | MLlib statistics algorithms [0.6 s]
✔ |  94   | MLlib tree-based algorithms [85.2 s]
✔ |  29   | parallelize() and collect() [0.5 s]
✔ | 428   | basic RDD functions [25.3 s]
✔ |  39   | SerDe functionality [2.2 s]
✔ |  20   | partitionBy, groupByKey, reduceByKey etc. [3.9 s]
✔ |   4   | functions in sparkR.R
✔ |  16   | SparkSQL Arrow optimization [19.2 s]
✔ |   6   | test show SparkDataFrame when eager execution is enabled. 
[1.1 s]
✔ | 1175   | SparkSQL functions [134.8 s]
✔ |  42   | Structured Streaming [478.2 s]
✔ |  16   | tests RDD function take() [1.1 s]
✔ |  14   | the textFile() function [2.9 s]
✔ |  46   | functions in utils.R [0.7 s]
✔ |   0 1 | Windows-specific tests


test_Windows.R:22: skip: sparkJars tag in SparkContext
Reason: This test is only for Windows, skipped



══ Results 
═
Duration: 987.3 s

OK:   2304
Failed:   0
Warnings: 0
Skipped:  1
...
Status: OK
+ popd
Tests passed.
```

Note that I tested to build SparkR in R 4.0.0, and run the 

[spark] branch master updated (e00f43c -> 11d2b07)

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from e00f43c  [SPARK-32043][SQL] Replace Decimal by Int op in 
`make_interval` and `make_timestamp`
 add 11d2b07  [SPARK-31918][R] Ignore S4 generic methods under SparkR 
namespace in closure cleaning to support R 4.0.0+

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/utils.R   |  5 -
 R/pkg/tests/fulltests/test_context.R  |  4 +++-
 R/pkg/tests/fulltests/test_mllib_classification.R | 18 +-
 R/pkg/tests/fulltests/test_mllib_clustering.R |  2 +-
 R/pkg/tests/fulltests/test_mllib_regression.R |  2 +-
 5 files changed, 18 insertions(+), 13 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-2.4 updated: [SPARK-32073][R] Drop R < 3.5 support

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new 77006b2  [SPARK-32073][R] Drop R < 3.5 support
77006b2 is described below

commit 77006b2c65e0e4b6b9facddbb13aa88a264adbe2
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:05:27 2020 +0900

[SPARK-32073][R] Drop R < 3.5 support

Spark 3.0 accidentally dropped R < 3.5. It is built by R 3.6.3 which not 
support R < 3.5:

```
Error in readRDS(pfile) : cannot read workspace version 3 written by R 
3.6.3; need R 3.5.0 or newer version.
```

In fact, with SPARK-31918, we will have to drop R < 3.5 entirely to support 
R 4.0.0. This is inevitable to release on CRAN because they require to make the 
tests pass with the latest R.

To show the supported versions correctly, and support R 4.0.0 to unblock 
the releases.

In fact, no because Spark 3.0.0 already does not work with R < 3.5.
Compared to Spark 2.4, yes. R < 3.5 would not work.

Jenkins should test it out.

Closes #28908 from HyukjinKwon/SPARK-32073.

Authored-by: HyukjinKwon 
Signed-off-by: HyukjinKwon 
(cherry picked from commit b62e2536db9def0d11605ceac8990f72a515e9a0)
Signed-off-by: HyukjinKwon 
---
 R/WINDOWS.md  | 2 +-
 R/pkg/DESCRIPTION | 2 +-
 docs/index.md | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/R/WINDOWS.md b/R/WINDOWS.md
index da668a6..73f8a21 100644
--- a/R/WINDOWS.md
+++ b/R/WINDOWS.md
@@ -2,7 +2,7 @@
 
 To build SparkR on Windows, the following steps are required
 
-1. Install R (>= 3.1) and 
[Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to
+1. Install R (>= 3.5) and 
[Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to
 include Rtools and R in `PATH`.
 
 2. Install
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index b70014d..2940d04 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -15,7 +15,7 @@ URL: https://www.apache.org/ https://spark.apache.org/
 BugReports: https://spark.apache.org/contributing.html
 SystemRequirements: Java (== 8)
 Depends:
-R (>= 3.0),
+R (>= 3.5),
 methods
 Suggests:
 knitr,
diff --git a/docs/index.md b/docs/index.md
index 52f1a5a..73cb57a 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -31,7 +31,7 @@ Spark runs on both Windows and UNIX-like systems (e.g. Linux, 
Mac OS). It's easy
 locally on one machine --- all you need is to have `java` installed on your 
system `PATH`,
 or the `JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 
{{site.SPARK_VERSION}}
+Spark runs on Java 8, Python 2.7+/3.4+ and R 3.5+. For the Scala API, Spark 
{{site.SPARK_VERSION}}
 uses Scala {{site.SCALA_BINARY_VERSION}}. You will need to use a compatible 
Scala version
 ({{site.SCALA_BINARY_VERSION}}.x).
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32073][R] Drop R < 3.5 support

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new da8133f  [SPARK-32073][R] Drop R < 3.5 support
da8133f is described below

commit da8133f8f1df7d2ddfa995974d9a7db06ff4cd5a
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:05:27 2020 +0900

[SPARK-32073][R] Drop R < 3.5 support

### What changes were proposed in this pull request?

Spark 3.0 accidentally dropped R < 3.5. It is built by R 3.6.3 which not 
support R < 3.5:

```
Error in readRDS(pfile) : cannot read workspace version 3 written by R 
3.6.3; need R 3.5.0 or newer version.
```

In fact, with SPARK-31918, we will have to drop R < 3.5 entirely to support 
R 4.0.0. This is inevitable to release on CRAN because they require to make the 
tests pass with the latest R.

### Why are the changes needed?

To show the supported versions correctly, and support R 4.0.0 to unblock 
the releases.

### Does this PR introduce _any_ user-facing change?

In fact, no because Spark 3.0.0 already does not work with R < 3.5.
Compared to Spark 2.4, yes. R < 3.5 would not work.

### How was this patch tested?

Jenkins should test it out.

Closes #28908 from HyukjinKwon/SPARK-32073.

Authored-by: HyukjinKwon 
Signed-off-by: HyukjinKwon 
(cherry picked from commit b62e2536db9def0d11605ceac8990f72a515e9a0)
Signed-off-by: HyukjinKwon 
---
 R/WINDOWS.md | 4 ++--
 R/pkg/DESCRIPTION| 2 +-
 R/pkg/inst/profile/general.R | 4 
 R/pkg/inst/profile/shell.R   | 4 
 docs/index.md| 3 +--
 5 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/R/WINDOWS.md b/R/WINDOWS.md
index dbc2717..9fe4a22b 100644
--- a/R/WINDOWS.md
+++ b/R/WINDOWS.md
@@ -22,8 +22,8 @@ To build SparkR on Windows, the following steps are required
 
 1. Make sure `bash` is available and in `PATH` if you already have a built-in 
`bash` on Windows. If you do not have, install 
[Cygwin](https://www.cygwin.com/).
 
-2. Install R (>= 3.1) and 
[Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
-include Rtools and R in `PATH`. Note that support for R prior to version 3.4 
is deprecated as of Spark 3.0.0.
+2. Install R (>= 3.5) and 
[Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
+include Rtools and R in `PATH`.
 
 3. Install JDK that SparkR supports (see `R/pkg/DESCRIPTION`), and set 
`JAVA_HOME` in the system environment variables.
 
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 21f3eaa..86514f2 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -15,7 +15,7 @@ URL: https://www.apache.org/ https://spark.apache.org/
 BugReports: https://spark.apache.org/contributing.html
 SystemRequirements: Java (>= 8, < 12)
 Depends:
-R (>= 3.1),
+R (>= 3.5),
 methods
 Suggests:
 knitr,
diff --git a/R/pkg/inst/profile/general.R b/R/pkg/inst/profile/general.R
index 3efb460..8c75c19 100644
--- a/R/pkg/inst/profile/general.R
+++ b/R/pkg/inst/profile/general.R
@@ -16,10 +16,6 @@
 #
 
 .First <- function() {
-  if (utils::compareVersion(paste0(R.version$major, ".", R.version$minor), 
"3.4.0") == -1) {
-warning("Support for R prior to version 3.4 is deprecated since Spark 
3.0.0")
-  }
-
   packageDir <- Sys.getenv("SPARKR_PACKAGE_DIR")
   dirs <- strsplit(packageDir, ",")[[1]]
   .libPaths(c(dirs, .libPaths()))
diff --git a/R/pkg/inst/profile/shell.R b/R/pkg/inst/profile/shell.R
index e4e0d03..f6c20e1 100644
--- a/R/pkg/inst/profile/shell.R
+++ b/R/pkg/inst/profile/shell.R
@@ -16,10 +16,6 @@
 #
 
 .First <- function() {
-  if (utils::compareVersion(paste0(R.version$major, ".", R.version$minor), 
"3.4.0") == -1) {
-warning("Support for R prior to version 3.4 is deprecated since Spark 
3.0.0")
-  }
-
   home <- Sys.getenv("SPARK_HOME")
   .libPaths(c(file.path(home, "R", "lib"), .libPaths()))
   Sys.setenv(NOAWT = 1)
diff --git a/docs/index.md b/docs/index.md
index 38f12dd4..c0771ca 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -44,10 +44,9 @@ source, visit [Building Spark](building-spark.html).
 
 Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS), and it 
should run on any platform that runs a supported version of Java. This should 
include JVMs on x86_64 and ARM64. It's easy to run locally on one machine --- 
all you need is to have `java` installed on your system `PATH`, or the 
`JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8/11, Scala 2.12, Python 2.7+/3.4+ and R 3.1+.
+Spark runs on Java 8/11, Scala 2.12, Python 2.7+/3.4+ and R 3.5+.
 Java 8 prior to version 8u92 support is deprecated as of Spark 3.0.0.
 Python 2 and Python 3 prior to version 3.6 support is deprecated as of Spark 
3.0.0.
-R prior to 

[spark] branch master updated (11d2b07 -> b62e253)

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 11d2b07  [SPARK-31918][R] Ignore S4 generic methods under SparkR 
namespace in closure cleaning to support R 4.0.0+
 add b62e253  [SPARK-32073][R] Drop R < 3.5 support

No new revisions were added by this update.

Summary of changes:
 R/WINDOWS.md | 4 ++--
 R/pkg/DESCRIPTION| 2 +-
 R/pkg/inst/profile/general.R | 4 
 R/pkg/inst/profile/shell.R   | 4 
 docs/index.md| 3 +--
 5 files changed, 4 insertions(+), 13 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-2.4 updated: [SPARK-31918][R] Ignore S4 generic methods under SparkR namespace in closure cleaning to support R 4.0.0+

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new 29873c9  [SPARK-31918][R] Ignore S4 generic methods under SparkR 
namespace in closure cleaning to support R 4.0.0+
29873c9 is described below

commit 29873c9126503bbd3edfd523d8531b3644a2dd65
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:03:05 2020 +0900

[SPARK-31918][R] Ignore S4 generic methods under SparkR namespace in 
closure cleaning to support R 4.0.0+

### What changes were proposed in this pull request?

This PR proposes to ignore S4 generic methods under SparkR namespace in 
closure cleaning to support R 4.0.0+.

Currently, when you run the codes that runs R native codes, it fails as 
below with R 4.0.0:

```r
df <- createDataFrame(lapply(seq(100), function (e) list(value=e)))
count(dapply(df, function(x) as.data.frame(x[x$value < 50,]), schema(df)))
```

```
org.apache.spark.SparkException: R unexpectedly exited.
R worker produced errors: Error in lapply(part, FUN) : attempt to bind a 
variable to R_UnboundValue
```

The root cause appears to be that an S4 generic method is manually included
in the closure's environment via `SparkR:::cleanClosure`. For example, when an
RRDD is created via `createDataFrame`, which calls `lapply` to convert the
input, `lapply` itself:


https://github.com/apache/spark/blob/f53d8c63e80172295e2fbc805c0c391bdececcaa/R/pkg/R/RDD.R#L484

is added to the environment of the cleaned closure because it is not in an
exposed namespace; however, this breaks in R 4.0.0+ for an unknown reason, with
an error message such as "attempt to bind a variable to R_UnboundValue".

Actually, we don't need to add `lapply` to the closure's environment, because
it is not supposed to be called on the worker side. In fact, to my
understanding, no private generic methods in SparkR are supposed to be called
on the worker side at all.

Therefore, this PR takes the simpler path of working around the issue by
explicitly excluding the S4 generic methods under the SparkR namespace, so
that SparkR supports R 4.0.0.
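
For illustration only (not part of the original patch), here is a minimal R
sketch of the idea behind the workaround: while cleaning a closure, skip any
function whose defining namespace is SparkR, since SparkR generics are resolved
from the installed package on the worker and never need to be captured. The
helper name `shouldCapture` is hypothetical.

```r
# Illustrative sketch only; not the actual SparkR:::cleanClosure implementation.
# Decide whether a free variable found during closure cleaning should be
# captured into the serialized environment.
shouldCapture <- function(name, value) {
  if (is.function(value)) {
    env <- environment(value)
    # Skip functions defined in the SparkR namespace (e.g. private S4 generics
    # such as `lapply`); capturing them is unnecessary and breaks on R 4.0.0+.
    if (!is.null(env) && identical(topenv(env), asNamespace("SparkR"))) {
      return(FALSE)
    }
  }
  TRUE
}

# Example (assuming the SparkR package is installed):
# shouldCapture("lapply", SparkR:::lapply)  # likely FALSE: defined in SparkR
# shouldCapture("f", function(x) x + 1)     # TRUE: ordinary user function
```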

### Why are the changes needed?

To support R 4.0.0+ with SparkR and to unblock the releases on CRAN. CRAN
requires the tests to pass with the latest R.

### Does this PR introduce _any_ user-facing change?

Yes, it adds R 4.0.0 support for end users.

### How was this patch tested?

Manually tested, with both the CRAN checks and the tests, on R 4.0.1:

```
══ testthat results  
═══
[ OK: 13 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 0 ]
✔ |  OK F W S | Context
✔ |  11   | binary functions [2.5 s]
✔ |   4   | functions on binary files [2.1 s]
✔ |   2   | broadcast variables [0.5 s]
✔ |   5   | functions in client.R
✔ |  46   | test functions in sparkR.R [6.3 s]
✔ |   2   | include R packages [0.3 s]
✔ |   2   | JVM API [0.2 s]
✔ |  75   | MLlib classification algorithms, except for tree-based 
algorithms [86.3 s]
✔ |  70   | MLlib clustering algorithms [44.5 s]
✔ |   6   | MLlib frequent pattern mining [3.0 s]
✔ |   8   | MLlib recommendation algorithms [9.6 s]
✔ | 136   | MLlib regression algorithms, except for tree-based 
algorithms [76.0 s]
✔ |   8   | MLlib statistics algorithms [0.6 s]
✔ |  94   | MLlib tree-based algorithms [85.2 s]
✔ |  29   | parallelize() and collect() [0.5 s]
✔ | 428   | basic RDD functions [25.3 s]
✔ |  39   | SerDe functionality [2.2 s]
✔ |  20   | partitionBy, groupByKey, reduceByKey etc. [3.9 s]
✔ |   4   | functions in sparkR.R
✔ |  16   | SparkSQL Arrow optimization [19.2 s]
✔ |   6   | test show SparkDataFrame when eager execution is enabled. 
[1.1 s]
✔ | 1175   | SparkSQL functions [134.8 s]
✔ |  42   | Structured Streaming [478.2 s]
✔ |  16   | tests RDD function take() [1.1 s]
✔ |  14   | the textFile() function [2.9 s]
✔ |  46   | functions in utils.R [0.7 s]
✔ |   0 1 | Windows-specific tests


test_Windows.R:22: skip: sparkJars tag in SparkContext
Reason: This test is only for Windows, skipped



══ Results 
═
Duration: 987.3 s

OK:   2304
Failed:   0
Warnings: 0
Skipped:  1
...
Status: OK
+ popd
Tests passed.
```

Note that I tested building SparkR in R 4.0.0, and ran the 

[spark] branch branch-3.0 updated: [SPARK-31918][R] Ignore S4 generic methods under SparkR namespace in closure cleaning to support R 4.0.0+

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new f18c0f6  [SPARK-31918][R] Ignore S4 generic methods under SparkR 
namespace in closure cleaning to support R 4.0.0+
f18c0f6 is described below

commit f18c0f61f8cd3757fc28078a3d92ce69babf04b3
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:03:05 2020 +0900

[SPARK-31918][R] Ignore S4 generic methods under SparkR namespace in 
closure cleaning to support R 4.0.0+

### What changes were proposed in this pull request?

This PR proposes to ignore S4 generic methods under SparkR namespace in 
closure cleaning to support R 4.0.0+.

Currently, code that runs R native code fails as below with R 4.0.0:

```r
df <- createDataFrame(lapply(seq(100), function (e) list(value=e)))
count(dapply(df, function(x) as.data.frame(x[x$value < 50,]), schema(df)))
```

```
org.apache.spark.SparkException: R unexpectedly exited.
R worker produced errors: Error in lapply(part, FUN) : attempt to bind a 
variable to R_UnboundValue
```

The root cause appears to be that an S4 generic method is manually included
in the closure's environment via `SparkR:::cleanClosure`. For example, when an
RRDD is created via `createDataFrame`, which calls `lapply` to convert the
input, `lapply` itself:


https://github.com/apache/spark/blob/f53d8c63e80172295e2fbc805c0c391bdececcaa/R/pkg/R/RDD.R#L484

is added to the environment of the cleaned closure because it is not in an
exposed namespace; however, this breaks in R 4.0.0+ for an unknown reason, with
an error message such as "attempt to bind a variable to R_UnboundValue".

Actually, we don't need to add `lapply` to the closure's environment, because
it is not supposed to be called on the worker side. In fact, to my
understanding, no private generic methods in SparkR are supposed to be called
on the worker side at all.

Therefore, this PR takes the simpler path of working around the issue by
explicitly excluding the S4 generic methods under the SparkR namespace, so
that SparkR supports R 4.0.0.
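
As a purely illustrative aside (not from this patch), here is a tiny R sketch
of how an S4 generic function object can be detected at runtime; this is the
kind of object the change excludes when it comes from the SparkR namespace.
The helper name `isS4Generic` is hypothetical.

```r
# Illustrative only: check whether a function object is an S4 generic.
isS4Generic <- function(f) {
  is.function(f) && methods::is(f, "genericFunction")
}

# Example: define an S4 generic and compare with an ordinary function.
methods::setGeneric("area", function(shape) methods::standardGeneric("area"))
isS4Generic(area)        # TRUE: `area` is an S4 generic
isS4Generic(base::mean)  # FALSE: an ordinary (S3) generic function
```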

### Why are the changes needed?

To support R 4.0.0+ with SparkR and to unblock the releases on CRAN. CRAN
requires the tests to pass with the latest R.

### Does this PR introduce _any_ user-facing change?

Yes, it adds R 4.0.0 support for end users.

### How was this patch tested?

Manually tested, with both the CRAN checks and the tests, on R 4.0.1:

```
══ testthat results  
═══
[ OK: 13 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 0 ]
✔ |  OK F W S | Context
✔ |  11   | binary functions [2.5 s]
✔ |   4   | functions on binary files [2.1 s]
✔ |   2   | broadcast variables [0.5 s]
✔ |   5   | functions in client.R
✔ |  46   | test functions in sparkR.R [6.3 s]
✔ |   2   | include R packages [0.3 s]
✔ |   2   | JVM API [0.2 s]
✔ |  75   | MLlib classification algorithms, except for tree-based 
algorithms [86.3 s]
✔ |  70   | MLlib clustering algorithms [44.5 s]
✔ |   6   | MLlib frequent pattern mining [3.0 s]
✔ |   8   | MLlib recommendation algorithms [9.6 s]
✔ | 136   | MLlib regression algorithms, except for tree-based 
algorithms [76.0 s]
✔ |   8   | MLlib statistics algorithms [0.6 s]
✔ |  94   | MLlib tree-based algorithms [85.2 s]
✔ |  29   | parallelize() and collect() [0.5 s]
✔ | 428   | basic RDD functions [25.3 s]
✔ |  39   | SerDe functionality [2.2 s]
✔ |  20   | partitionBy, groupByKey, reduceByKey etc. [3.9 s]
✔ |   4   | functions in sparkR.R
✔ |  16   | SparkSQL Arrow optimization [19.2 s]
✔ |   6   | test show SparkDataFrame when eager execution is enabled. 
[1.1 s]
✔ | 1175   | SparkSQL functions [134.8 s]
✔ |  42   | Structured Streaming [478.2 s]
✔ |  16   | tests RDD function take() [1.1 s]
✔ |  14   | the textFile() function [2.9 s]
✔ |  46   | functions in utils.R [0.7 s]
✔ |   0 1 | Windows-specific tests


test_Windows.R:22: skip: sparkJars tag in SparkContext
Reason: This test is only for Windows, skipped



══ Results 
═
Duration: 987.3 s

OK:   2304
Failed:   0
Warnings: 0
Skipped:  1
...
Status: OK
+ popd
Tests passed.
```

Note that I tested building SparkR in R 4.0.0, and ran the 

[spark] branch master updated (e00f43c -> 11d2b07)

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from e00f43c  [SPARK-32043][SQL] Replace Decimal by Int op in 
`make_interval` and `make_timestamp`
 add 11d2b07  [SPARK-31918][R] Ignore S4 generic methods under SparkR 
namespace in closure cleaning to support R 4.0.0+

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/utils.R   |  5 -
 R/pkg/tests/fulltests/test_context.R  |  4 +++-
 R/pkg/tests/fulltests/test_mllib_classification.R | 18 +-
 R/pkg/tests/fulltests/test_mllib_clustering.R |  2 +-
 R/pkg/tests/fulltests/test_mllib_regression.R |  2 +-
 5 files changed, 18 insertions(+), 13 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-32073][R] Drop R < 3.5 support

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new da8133f  [SPARK-32073][R] Drop R < 3.5 support
da8133f is described below

commit da8133f8f1df7d2ddfa995974d9a7db06ff4cd5a
Author: HyukjinKwon 
AuthorDate: Wed Jun 24 11:05:27 2020 +0900

[SPARK-32073][R] Drop R < 3.5 support

### What changes were proposed in this pull request?

Spark 3.0 accidentally dropped support for R < 3.5. It is built with R 3.6.3,
which does not support R < 3.5:

```
Error in readRDS(pfile) : cannot read workspace version 3 written by R 
3.6.3; need R 3.5.0 or newer version.
```

In fact, with SPARK-31918, we will have to drop R < 3.5 entirely to support
R 4.0.0. This is unavoidable for releasing on CRAN, because CRAN requires the
tests to pass with the latest R.
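
For illustration only (not part of this patch), here is a short R sketch of the
kind of guard a user could run to confirm that a local R installation meets the
new minimum before installing SparkR 3.x; `minVersion` is an assumed constant
that mirrors the updated `Depends: R (>= 3.5)` requirement in DESCRIPTION.

```r
# Illustrative only: fail fast if the running R is older than what SparkR 3.x
# requires, instead of hitting the readRDS workspace-version error later.
minVersion <- "3.5.0"
currentVersion <- paste0(R.version$major, ".", R.version$minor)
if (utils::compareVersion(currentVersion, minVersion) == -1) {
  stop("SparkR 3.x requires R >= ", minVersion, ", but this is R ", currentVersion)
}
```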

### Why are the changes needed?

To document the supported versions correctly, and to support R 4.0.0 so that
the releases are unblocked.

### Does this PR introduce _any_ user-facing change?

Effectively no, because Spark 3.0.0 already does not work with R < 3.5.
Compared to Spark 2.4, yes: R < 3.5 will no longer work.

### How was this patch tested?

Jenkins should test it out.

Closes #28908 from HyukjinKwon/SPARK-32073.

Authored-by: HyukjinKwon 
Signed-off-by: HyukjinKwon 
(cherry picked from commit b62e2536db9def0d11605ceac8990f72a515e9a0)
Signed-off-by: HyukjinKwon 
---
 R/WINDOWS.md | 4 ++--
 R/pkg/DESCRIPTION| 2 +-
 R/pkg/inst/profile/general.R | 4 
 R/pkg/inst/profile/shell.R   | 4 
 docs/index.md| 3 +--
 5 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/R/WINDOWS.md b/R/WINDOWS.md
index dbc2717..9fe4a22b 100644
--- a/R/WINDOWS.md
+++ b/R/WINDOWS.md
@@ -22,8 +22,8 @@ To build SparkR on Windows, the following steps are required
 
 1. Make sure `bash` is available and in `PATH` if you already have a built-in 
`bash` on Windows. If you do not have, install 
[Cygwin](https://www.cygwin.com/).
 
-2. Install R (>= 3.1) and 
[Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
-include Rtools and R in `PATH`. Note that support for R prior to version 3.4 
is deprecated as of Spark 3.0.0.
+2. Install R (>= 3.5) and 
[Rtools](https://cloud.r-project.org/bin/windows/Rtools/). Make sure to
+include Rtools and R in `PATH`.
 
 3. Install JDK that SparkR supports (see `R/pkg/DESCRIPTION`), and set 
`JAVA_HOME` in the system environment variables.
 
diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 21f3eaa..86514f2 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -15,7 +15,7 @@ URL: https://www.apache.org/ https://spark.apache.org/
 BugReports: https://spark.apache.org/contributing.html
 SystemRequirements: Java (>= 8, < 12)
 Depends:
-R (>= 3.1),
+R (>= 3.5),
 methods
 Suggests:
 knitr,
diff --git a/R/pkg/inst/profile/general.R b/R/pkg/inst/profile/general.R
index 3efb460..8c75c19 100644
--- a/R/pkg/inst/profile/general.R
+++ b/R/pkg/inst/profile/general.R
@@ -16,10 +16,6 @@
 #
 
 .First <- function() {
-  if (utils::compareVersion(paste0(R.version$major, ".", R.version$minor), "3.4.0") == -1) {
-    warning("Support for R prior to version 3.4 is deprecated since Spark 3.0.0")
-  }
-
   packageDir <- Sys.getenv("SPARKR_PACKAGE_DIR")
   dirs <- strsplit(packageDir, ",")[[1]]
   .libPaths(c(dirs, .libPaths()))
diff --git a/R/pkg/inst/profile/shell.R b/R/pkg/inst/profile/shell.R
index e4e0d03..f6c20e1 100644
--- a/R/pkg/inst/profile/shell.R
+++ b/R/pkg/inst/profile/shell.R
@@ -16,10 +16,6 @@
 #
 
 .First <- function() {
-  if (utils::compareVersion(paste0(R.version$major, ".", R.version$minor), "3.4.0") == -1) {
-    warning("Support for R prior to version 3.4 is deprecated since Spark 3.0.0")
-  }
-
   home <- Sys.getenv("SPARK_HOME")
   .libPaths(c(file.path(home, "R", "lib"), .libPaths()))
   Sys.setenv(NOAWT = 1)
diff --git a/docs/index.md b/docs/index.md
index 38f12dd4..c0771ca 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -44,10 +44,9 @@ source, visit [Building Spark](building-spark.html).
 
 Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS), and it 
should run on any platform that runs a supported version of Java. This should 
include JVMs on x86_64 and ARM64. It's easy to run locally on one machine --- 
all you need is to have `java` installed on your system `PATH`, or the 
`JAVA_HOME` environment variable pointing to a Java installation.
 
-Spark runs on Java 8/11, Scala 2.12, Python 2.7+/3.4+ and R 3.1+.
+Spark runs on Java 8/11, Scala 2.12, Python 2.7+/3.4+ and R 3.5+.
 Java 8 prior to version 8u92 support is deprecated as of Spark 3.0.0.
 Python 2 and Python 3 prior to version 3.6 support is deprecated as of Spark 
3.0.0.
-R prior to 

[GitHub] [spark-website] gatorsmile merged pull request #276: Update the artifactId in the Download Page

2020-06-23 Thread GitBox


gatorsmile merged pull request #276:
URL: https://github.com/apache/spark-website/pull/276


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark-website] branch asf-site updated: Update the artifactId in the Download Page #276

2020-06-23 Thread lixiao
This is an automated email from the ASF dual-hosted git repository.

lixiao pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 2c5679f  Update the artifactId in the Download Page #276
2c5679f is described below

commit 2c5679f415c3605726e68c0a2b8c204c91131d0c
Author: Xiao Li 
AuthorDate: Tue Jun 23 17:38:28 2020 -0700

Update the artifactId in the Download Page #276

The existing artifactId is not correct. We need to update it from 2.11 to 2.12.
---
 downloads.md| 2 +-
 site/downloads.html | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/downloads.md b/downloads.md
index 2ed9870..880024a 100644
--- a/downloads.md
+++ b/downloads.md
@@ -40,7 +40,7 @@ The latest preview release is Spark 3.0.0-preview2, published 
on Dec 23, 2019.
 Spark artifacts are [hosted in Maven 
Central](https://search.maven.org/search?q=g:org.apache.spark). You can add a 
Maven dependency with the following coordinates:
 
 groupId: org.apache.spark
-artifactId: spark-core_2.11
+artifactId: spark-core_2.12
 version: 3.0.0
 
 ### Installing with PyPi
diff --git a/site/downloads.html b/site/downloads.html
index e3b060f..d820471 100644
--- a/site/downloads.html
+++ b/site/downloads.html
@@ -240,7 +240,7 @@ The latest preview release is Spark 3.0.0-preview2, 
published on Dec 23, 2019.Spark artifacts are https://search.maven.org/search?q=g:org.apache.spark;>hosted in Maven 
Central. You can add a Maven dependency with the following coordinates:
 
 groupId: org.apache.spark
-artifactId: spark-core_2.11
+artifactId: spark-core_2.12
 version: 3.0.0
 
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] gatorsmile commented on pull request #276: Update the artifactId in the Download Page

2020-06-23 Thread GitBox


gatorsmile commented on pull request #276:
URL: https://github.com/apache/spark-website/pull/276#issuecomment-648482243


   cc @srowen @cloud-fan 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] gatorsmile opened a new pull request #276: Update the artifactId in the Download Page

2020-06-23 Thread GitBox


gatorsmile opened a new pull request #276:
URL: https://github.com/apache/spark-website/pull/276


   The existing artifactId is not correct. We need to update it from 2.11 to 2.12.
   
   ![Screen Shot 2020-06-23 at 4 34 33 PM](https://user-images.githubusercontent.com/11567269/85477700-af950e00-b56f-11ea-915e-f027d383a73e.png)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (2dbfae8 -> e00f43c)

2020-06-23 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 2dbfae8  [SPARK-32049][SQL][TESTS] Upgrade Oracle JDBC Driver 8
 add e00f43c  [SPARK-32043][SQL] Replace Decimal by Int op in 
`make_interval` and `make_timestamp`

No new revisions were added by this update.

Summary of changes:
 .../catalyst/expressions/datetimeExpressions.scala | 14 +++
 .../spark/sql/catalyst/util/IntervalUtils.scala|  3 ++-
 .../MakeDateTimeBenchmark-jdk11-results.txt| 28 +++---
 .../benchmarks/MakeDateTimeBenchmark-results.txt   | 28 +++---
 4 files changed, 39 insertions(+), 34 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (2bcbe3d -> 2dbfae8)

2020-06-23 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 2bcbe3d  [SPARK-32045][BUILD] Upgrade to Apache Commons Lang 3.10
 add 2dbfae8  [SPARK-32049][SQL][TESTS] Upgrade Oracle JDBC Driver 8

No new revisions were added by this update.

Summary of changes:
 external/docker-integration-tests/pom.xml  | 12 ++--
 .../sql/jdbc/DockerJDBCIntegrationSuite.scala  |  3 +-
 .../spark/sql/jdbc/OracleIntegrationSuite.scala| 36 +-
 pom.xml|  6 
 4 files changed, 32 insertions(+), 25 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-32049][SQL][TESTS] Upgrade Oracle JDBC Driver 8

2020-06-23 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 2dbfae8  [SPARK-32049][SQL][TESTS] Upgrade Oracle JDBC Driver 8
2dbfae8 is described below

commit 2dbfae8775e00da521f59c6328428ae541396802
Author: Gabor Somogyi 
AuthorDate: Tue Jun 23 03:58:40 2020 -0700

[SPARK-32049][SQL][TESTS] Upgrade Oracle JDBC Driver 8

### What changes were proposed in this pull request?
`OracleIntegrationSuite` is not using the latest Oracle JDBC driver. In
this PR I've upgraded the driver to the latest version, which supports JDK8,
JDK9, and JDK11.

### Why are the changes needed?
Old JDBC driver.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing unit tests.
Existing integration tests (especially `OracleIntegrationSuite`)

Closes #28893 from gaborgsomogyi/SPARK-32049.

Authored-by: Gabor Somogyi 
Signed-off-by: Dongjoon Hyun 
---
 external/docker-integration-tests/pom.xml  | 12 ++--
 .../sql/jdbc/DockerJDBCIntegrationSuite.scala  |  3 +-
 .../spark/sql/jdbc/OracleIntegrationSuite.scala| 36 +-
 pom.xml|  6 
 4 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/external/docker-integration-tests/pom.xml 
b/external/docker-integration-tests/pom.xml
index 298e3d3..b240dd2 100644
--- a/external/docker-integration-tests/pom.xml
+++ b/external/docker-integration-tests/pom.xml
@@ -130,15 +130,9 @@
   postgresql
   test
 
-
-
-  com.oracle
-  ojdbc6
-  11.2.0.1.0
+
+  com.oracle.database.jdbc
+  ojdbc8
   test
 
 
diff --git 
a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala
 
b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala
index d15b366..6d1a22d 100644
--- 
a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala
+++ 
b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala
@@ -95,6 +95,7 @@ abstract class DockerJDBCIntegrationSuite extends 
SharedSparkSession with Eventu
 
   protected val dockerIp = DockerUtils.getDockerIp()
   val db: DatabaseOnDocker
+  val connectionTimeout = timeout(2.minutes)
 
   private var docker: DockerClient = _
   protected var externalPort: Int = _
@@ -155,7 +156,7 @@ abstract class DockerJDBCIntegrationSuite extends 
SharedSparkSession with Eventu
   docker.startContainer(containerId)
   jdbcUrl = db.getJdbcUrl(dockerIp, externalPort)
   var conn: Connection = null
-  eventually(timeout(2.minutes), interval(1.second)) {
+  eventually(connectionTimeout, interval(1.second)) {
 conn = getConnection()
   }
   // Run any setup queries:
diff --git 
a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
 
b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
index 24c3adb..9c59023 100644
--- 
a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
+++ 
b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
@@ -21,6 +21,8 @@ import java.math.BigDecimal
 import java.sql.{Connection, Date, Timestamp}
 import java.util.{Properties, TimeZone}
 
+import org.scalatest.time.SpanSugar._
+
 import org.apache.spark.sql.{Row, SaveMode}
 import org.apache.spark.sql.execution.{RowDataSourceScanExec, 
WholeStageCodegenExec}
 import org.apache.spark.sql.execution.datasources.LogicalRelation
@@ -31,27 +33,27 @@ import org.apache.spark.sql.types._
 import org.apache.spark.tags.DockerTest
 
 /**
- * This patch was tested using the Oracle docker. Created this integration 
suite for the same.
- * The ojdbc6-11.2.0.2.0.jar was to be downloaded from the maven repository. 
Since there was
- * no jdbc jar available in the maven repository, the jar was downloaded from 
oracle site
- * manually and installed in the local; thus tested. So, for SparkQA test case 
run, the
- * ojdbc jar might be manually placed in the local maven 
repository(com/oracle/ojdbc6/11.2.0.2.0)
- * while Spark QA test run.
- *
  * The following would be the steps to test this
  * 1. Build Oracle database in Docker, please refer below link about how to.
  *
https://github.com/oracle/docker-images/blob/master/OracleDatabase/SingleInstance/README.md
  * 2. export ORACLE_DOCKER_IMAGE_NAME=$ORACLE_DOCKER_IMAGE_NAME
  *Pull oracle $ORACLE_DOCKER_IMAGE_NAME image - docker pull 
$ORACLE_DOCKER_IMAGE_NAME
  * 3. Start docker - sudo service 

[spark] branch master updated (fcf9768 -> 2bcbe3d)

2020-06-23 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from fcf9768  [SPARK-32052][SQL] Extract common code from date-time field 
expressions
 add 2bcbe3d  [SPARK-32045][BUILD] Upgrade to Apache Commons Lang 3.10

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.7-hive-1.2 | 2 +-
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 2 +-
 pom.xml | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (979a8eb -> fcf9768)

2020-06-23 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 979a8eb  [MINOR][SQL] Simplify DateTimeUtils.cleanLegacyTimestampStr
 add fcf9768  [SPARK-32052][SQL] Extract common code from date-time field 
expressions

No new revisions were added by this update.

Summary of changes:
 .../catalyst/expressions/datetimeExpressions.scala | 290 ++---
 .../spark/sql/catalyst/util/DateTimeUtils.scala|  20 ++
 2 files changed, 93 insertions(+), 217 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org


