[spark] branch master updated (68c032c -> 93ad26b)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 68c032c  [SPARK-33364][SQL] Introduce the "purge" option in TableCatalog.dropTable for v2 catalog
   add 93ad26b  [SPARK-23432][UI] Add executor peak jvm memory metrics in executors page

No new revisions were added by this update.

Summary of changes:
 .../spark/ui/static/executorspage-template.html | 16 +++
 .../org/apache/spark/ui/static/executorspage.js | 52 --
 2 files changed, 64 insertions(+), 4 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (e66201b -> 21413b7)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from e66201b  [MINOR][SS][DOCS] Update join type in stream static joins code examples
   add 21413b7  [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

No new revisions were added by this update.

Summary of changes:
 .../state/HDFSBackedStateStoreProvider.scala       |  46 +++--
 .../sql/execution/streaming/state/StateStore.scala | 111 +
 .../execution/streaming/state/StateStoreRDD.scala  | 104 ++-
 .../state/StreamingAggregationStateManager.scala   |  22 ++--
 .../sql/execution/streaming/state/package.scala    |  35 +++
 .../execution/streaming/statefulOperators.scala    |   2 +-
 .../streaming/state/StateStoreSuite.scala          |   4 +-
 7 files changed, 261 insertions(+), 63 deletions(-)
[spark] branch branch-2.4 updated: [MINOR][SS][DOCS] Update join type in stream static joins code examples
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 8684720  [MINOR][SS][DOCS] Update join type in stream static joins code examples
8684720 is described below

commit 868472040a9eb67169086ed0fab0afcbbbd321f9
Author: Sarvesh Dave
AuthorDate: Thu Nov 5 16:22:31 2020 +0900

    [MINOR][SS][DOCS] Update join type in stream static joins code examples

    ### What changes were proposed in this pull request?
    Update the join type in the stream-static joins code examples in the structured streaming programming guide.

    1) The Scala, Java and Python examples share a common issue: the join keyword is "right_join", but it should be "left_outer".
       _Reasons:_
       a) Each of these code snippets is an example of a left outer join, since the streaming df is on the left and the static df is on the right. Also, a right outer join between a streaming df (left) and a static df (right) is not supported.
       b) The keywords "right_join"/"left_join" are unsupported; the valid forms are "right_outer"/"left_outer".
       So all of these code snippets have been updated to "left_outer".

    2) The R example is correct, but it shows a "right_outer" join with the static df on the left and the streaming df on the right. It is changed to "left_outer" to make it consistent with the other three examples in Scala, Java and Python.

    ### Why are the changes needed?
    To fix the mistake in the example code of the documentation.

    ### Does this PR introduce _any_ user-facing change?
    Yes, it is a user-facing change (but a documentation update only).

    **Screenshots 1: Scala/Java/Python example (similar issue)**
    _Before:_ https://user-images.githubusercontent.com/62717942/98155351-19e59400-1efc-11eb-8142-e6a25a5e6497.png
    _After:_ https://user-images.githubusercontent.com/62717942/98155503-5d400280-1efc-11eb-96e1-5ba0f3c35c82.png

    **Screenshots 2: R example (made consistent with the change above)**
    _Before:_ https://user-images.githubusercontent.com/62717942/98155685-ac863300-1efc-11eb-93bc-b7ca4dd34634.png
    _After:_ https://user-images.githubusercontent.com/62717942/98155739-c0ca3000-1efc-11eb-8f95-a7538fa784b7.png

    ### How was this patch tested?
    The change was tested locally.
    1) cd docs/
       SKIP_API=1 jekyll build
    2) Verify the docs/_site/structured-streaming-programming-guide.html file in a browser.

    Closes #30252 from sarveshdave1/doc-update-stream-static-joins.

    Authored-by: Sarvesh Dave
    Signed-off-by: Jungtaek Lim (HeartSaVioR)
    (cherry picked from commit e66201b30bc1f3da7284af14b32e5e6200768dbd)
    Signed-off-by: Jungtaek Lim (HeartSaVioR)
---
 docs/structured-streaming-programming-guide.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index dce4b35..aac262b 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -1089,7 +1089,7 @@ val staticDf = spark.read. ...
 val streamingDf = spark.readStream. ...
 streamingDf.join(staticDf, "type") // inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join") // right outer join with a static DF
+streamingDf.join(staticDf, "type", "left_outer") // left outer join with a static DF
 {% endhighlight %}
@@ -1100,7 +1100,7 @@ streamingDf.join(staticDf, "type", "right_join") // right outer join with a sta
 Dataset staticDf = spark.read(). ...;
 Dataset streamingDf = spark.readStream(). ...;
 streamingDf.join(staticDf, "type"); // inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join"); // right outer join with a static DF
+streamingDf.join(staticDf, "type", "left_outer"); // left outer join with a static DF
 {% endhighlight %}
@@ -,7 +,7 @@ streamingDf.join(staticDf, "type", "right_join"); // right outer join with a st
 staticDf = spark.read. ...
 streamingDf = spark.readStream. ...
 streamingDf.join(staticDf, "type") # inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join") # right outer join with a static DF
+streamingDf.join(staticDf, "type", "left_outer") # left outer join with a static DF
 {% endhighlight %}
@@ -1123,10 +1123,10 @@ staticDf <- read.df(...)
 streamingDf <- read.stream(...)
 joined <- merge(streamingDf, staticDf, sort = FALSE) # inner equi
[spark] branch branch-3.0 updated: [MINOR][SS][DOCS] Update join type in stream static joins code examples
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new c43231c  [MINOR][SS][DOCS] Update join type in stream static joins code examples
c43231c is described below

commit c43231c6cd2d38fb9e6c6435ca08b735c9aacb61
Author: Sarvesh Dave
AuthorDate: Thu Nov 5 16:22:31 2020 +0900

    [MINOR][SS][DOCS] Update join type in stream static joins code examples

    ### What changes were proposed in this pull request?
    Update the join type in the stream-static joins code examples in the structured streaming programming guide.

    1) The Scala, Java and Python examples share a common issue: the join keyword is "right_join", but it should be "left_outer".
       _Reasons:_
       a) Each of these code snippets is an example of a left outer join, since the streaming df is on the left and the static df is on the right. Also, a right outer join between a streaming df (left) and a static df (right) is not supported.
       b) The keywords "right_join"/"left_join" are unsupported; the valid forms are "right_outer"/"left_outer".
       So all of these code snippets have been updated to "left_outer".

    2) The R example is correct, but it shows a "right_outer" join with the static df on the left and the streaming df on the right. It is changed to "left_outer" to make it consistent with the other three examples in Scala, Java and Python.

    ### Why are the changes needed?
    To fix the mistake in the example code of the documentation.

    ### Does this PR introduce _any_ user-facing change?
    Yes, it is a user-facing change (but a documentation update only).

    **Screenshots 1: Scala/Java/Python example (similar issue)**
    _Before:_ https://user-images.githubusercontent.com/62717942/98155351-19e59400-1efc-11eb-8142-e6a25a5e6497.png
    _After:_ https://user-images.githubusercontent.com/62717942/98155503-5d400280-1efc-11eb-96e1-5ba0f3c35c82.png

    **Screenshots 2: R example (made consistent with the change above)**
    _Before:_ https://user-images.githubusercontent.com/62717942/98155685-ac863300-1efc-11eb-93bc-b7ca4dd34634.png
    _After:_ https://user-images.githubusercontent.com/62717942/98155739-c0ca3000-1efc-11eb-8f95-a7538fa784b7.png

    ### How was this patch tested?
    The change was tested locally.
    1) cd docs/
       SKIP_API=1 jekyll build
    2) Verify the docs/_site/structured-streaming-programming-guide.html file in a browser.

    Closes #30252 from sarveshdave1/doc-update-stream-static-joins.

    Authored-by: Sarvesh Dave
    Signed-off-by: Jungtaek Lim (HeartSaVioR)
    (cherry picked from commit e66201b30bc1f3da7284af14b32e5e6200768dbd)
    Signed-off-by: Jungtaek Lim (HeartSaVioR)
---
 docs/structured-streaming-programming-guide.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index 69d744d..31b1ca9d 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -1115,7 +1115,7 @@ val staticDf = spark.read. ...
 val streamingDf = spark.readStream. ...
 streamingDf.join(staticDf, "type") // inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join") // right outer join with a static DF
+streamingDf.join(staticDf, "type", "left_outer") // left outer join with a static DF
 {% endhighlight %}
@@ -1126,7 +1126,7 @@ streamingDf.join(staticDf, "type", "right_join") // right outer join with a sta
 Dataset staticDf = spark.read(). ...;
 Dataset streamingDf = spark.readStream(). ...;
 streamingDf.join(staticDf, "type"); // inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join"); // right outer join with a static DF
+streamingDf.join(staticDf, "type", "left_outer"); // left outer join with a static DF
 {% endhighlight %}
@@ -1137,7 +1137,7 @@ streamingDf.join(staticDf, "type", "right_join"); // right outer join with a st
 staticDf = spark.read. ...
 streamingDf = spark.readStream. ...
 streamingDf.join(staticDf, "type") # inner equi-join with a static DF
-streamingDf.join(staticDf, "type", "right_join") # right outer join with a static DF
+streamingDf.join(staticDf, "type", "left_outer") # left outer join with a static DF
 {% endhighlight %}
@@ -1149,10 +1149,10 @@ staticDf <- read.df(...)
 streamingDf <- read.stream(...)
 joined <- merge(streamingDf, staticDf, sort = FALSE) # inner equi
[spark] branch master updated (d530ed0 -> e66201b)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from d530ed0  Revert "[SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends"
   add e66201b  [MINOR][SS][DOCS] Update join type in stream static joins code examples

No new revisions were added by this update.

Summary of changes:
 docs/structured-streaming-programming-guide.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)
[spark] branch branch-2.4 updated: [MINOR][SS][DOCS] Update join type in stream static joins code examples
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 8684720 [MINOR][SS][DOCS] Update join type in stream static joins code examples 8684720 is described below commit 868472040a9eb67169086ed0fab0afcbbbd321f9 Author: Sarvesh Dave AuthorDate: Thu Nov 5 16:22:31 2020 +0900 [MINOR][SS][DOCS] Update join type in stream static joins code examples ### What changes were proposed in this pull request? Update join type in stream static joins code examples in structured streaming programming guide. 1) Scala, Java and Python examples have a common issue. The join keyword is "right_join", it should be "left_outer". _Reasons:_ a) This code snippet is an example of "left outer join" as the streaming df is on left and static df is on right. Also, right outerjoin between stream df(left) and static df(right) is not supported. b) The keyword "right_join/left_join" is unsupported and it should be "right_outer/left_outer". So, all of these code snippets have been updated to "left_outer". 2) R exmaple is correct, but the example is of "right_outer" with static df (left) and streaming df(right). It is changed to "left_outer" to make it consistent with other three examples of scala, java and python. ### Why are the changes needed? To fix the mistake in example code of documentation. ### Does this PR introduce _any_ user-facing change? Yes, it is a user-facing change (but documentation update only). 
**Screenshots 1: Scala/Java/python example (similar issue)** _Before:_ https://user-images.githubusercontent.com/62717942/98155351-19e59400-1efc-11eb-8142-e6a25a5e6497.png";> _After:_ https://user-images.githubusercontent.com/62717942/98155503-5d400280-1efc-11eb-96e1-5ba0f3c35c82.png";> **Screenshots 2: R example (Make it consistent with above change)** _Before:_ https://user-images.githubusercontent.com/62717942/98155685-ac863300-1efc-11eb-93bc-b7ca4dd34634.png";> _After:_ https://user-images.githubusercontent.com/62717942/98155739-c0ca3000-1efc-11eb-8f95-a7538fa784b7.png";> ### How was this patch tested? The change was tested locally. 1) cd docs/ SKIP_API=1 jekyll build 2) Verify docs/_site/structured-streaming-programming-guide.html file in browser. Closes #30252 from sarveshdave1/doc-update-stream-static-joins. Authored-by: Sarvesh Dave Signed-off-by: Jungtaek Lim (HeartSaVioR) (cherry picked from commit e66201b30bc1f3da7284af14b32e5e6200768dbd) Signed-off-by: Jungtaek Lim (HeartSaVioR) --- docs/structured-streaming-programming-guide.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md index dce4b35..aac262b 100644 --- a/docs/structured-streaming-programming-guide.md +++ b/docs/structured-streaming-programming-guide.md @@ -1089,7 +1089,7 @@ val staticDf = spark.read. ... val streamingDf = spark.readStream. ... streamingDf.join(staticDf, "type") // inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join") // right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer") // left outer join with a static DF {% endhighlight %} @@ -1100,7 +1100,7 @@ streamingDf.join(staticDf, "type", "right_join") // right outer join with a sta Dataset staticDf = spark.read(). ...; Dataset streamingDf = spark.readStream(). 
...; streamingDf.join(staticDf, "type"); // inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join"); // right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer"); // left outer join with a static DF {% endhighlight %} @@ -,7 +,7 @@ streamingDf.join(staticDf, "type", "right_join"); // right outer join with a st staticDf = spark.read. ... streamingDf = spark.readStream. ... streamingDf.join(staticDf, "type") # inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join") # right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer") # left outer join with a static DF {% endhighlight %} @@ -1123,10 +1123,10 @@ staticDf <- read.df(...) streamingDf <- read.stream(...) joined <- merge(streamingDf, staticDf, sort = FALSE) # inner equi
[spark] branch branch-3.0 updated: [MINOR][SS][DOCS] Update join type in stream static joins code examples
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new c43231c [MINOR][SS][DOCS] Update join type in stream static joins code examples c43231c is described below commit c43231c6cd2d38fb9e6c6435ca08b735c9aacb61 Author: Sarvesh Dave AuthorDate: Thu Nov 5 16:22:31 2020 +0900 [MINOR][SS][DOCS] Update join type in stream static joins code examples ### What changes were proposed in this pull request? Update join type in stream static joins code examples in structured streaming programming guide. 1) Scala, Java and Python examples have a common issue. The join keyword is "right_join", it should be "left_outer". _Reasons:_ a) This code snippet is an example of "left outer join" as the streaming df is on left and static df is on right. Also, right outerjoin between stream df(left) and static df(right) is not supported. b) The keyword "right_join/left_join" is unsupported and it should be "right_outer/left_outer". So, all of these code snippets have been updated to "left_outer". 2) R exmaple is correct, but the example is of "right_outer" with static df (left) and streaming df(right). It is changed to "left_outer" to make it consistent with other three examples of scala, java and python. ### Why are the changes needed? To fix the mistake in example code of documentation. ### Does this PR introduce _any_ user-facing change? Yes, it is a user-facing change (but documentation update only). 
**Screenshots 1: Scala/Java/python example (similar issue)** _Before:_ https://user-images.githubusercontent.com/62717942/98155351-19e59400-1efc-11eb-8142-e6a25a5e6497.png";> _After:_ https://user-images.githubusercontent.com/62717942/98155503-5d400280-1efc-11eb-96e1-5ba0f3c35c82.png";> **Screenshots 2: R example (Make it consistent with above change)** _Before:_ https://user-images.githubusercontent.com/62717942/98155685-ac863300-1efc-11eb-93bc-b7ca4dd34634.png";> _After:_ https://user-images.githubusercontent.com/62717942/98155739-c0ca3000-1efc-11eb-8f95-a7538fa784b7.png";> ### How was this patch tested? The change was tested locally. 1) cd docs/ SKIP_API=1 jekyll build 2) Verify docs/_site/structured-streaming-programming-guide.html file in browser. Closes #30252 from sarveshdave1/doc-update-stream-static-joins. Authored-by: Sarvesh Dave Signed-off-by: Jungtaek Lim (HeartSaVioR) (cherry picked from commit e66201b30bc1f3da7284af14b32e5e6200768dbd) Signed-off-by: Jungtaek Lim (HeartSaVioR) --- docs/structured-streaming-programming-guide.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md index 69d744d..31b1ca9d 100644 --- a/docs/structured-streaming-programming-guide.md +++ b/docs/structured-streaming-programming-guide.md @@ -1115,7 +1115,7 @@ val staticDf = spark.read. ... val streamingDf = spark.readStream. ... streamingDf.join(staticDf, "type") // inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join") // right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer") // left outer join with a static DF {% endhighlight %} @@ -1126,7 +1126,7 @@ streamingDf.join(staticDf, "type", "right_join") // right outer join with a sta Dataset staticDf = spark.read(). ...; Dataset streamingDf = spark.readStream(). 
...; streamingDf.join(staticDf, "type"); // inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join"); // right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer"); // left outer join with a static DF {% endhighlight %} @@ -1137,7 +1137,7 @@ streamingDf.join(staticDf, "type", "right_join"); // right outer join with a st staticDf = spark.read. ... streamingDf = spark.readStream. ... streamingDf.join(staticDf, "type") # inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join") # right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer") # left outer join with a static DF {% endhighlight %} @@ -1149,10 +1149,10 @@ staticDf <- read.df(...) streamingDf <- read.stream(...) joined <- merge(streamingDf, staticDf, sort = FALSE) # inner equi
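The commit message above argues that these snippets must use `left_outer` because the streaming DataFrame sits on the left of the join. For readers without a Spark session at hand, the semantics the corrected examples rely on — every left (streaming-side) row is kept, pairing with a null/`None` right side where the static table has no match — can be sketched in plain Python over in-memory rows (illustrative record shapes only, not the Spark API):

```python
# Plain-Python sketch of left outer join semantics, mirroring the corrected
# streamingDf.join(staticDf, "type", "left_outer") examples from the guide.
# The dict-based rows and field names here are illustrative, not from the commit.

def left_outer(streaming_rows, static_rows, key):
    """Every left (streaming-side) row survives; unmatched rows pair with None."""
    by_key = {}
    for row in static_rows:
        by_key.setdefault(row[key], []).append(row)
    out = []
    for left in streaming_rows:
        matches = by_key.get(left[key])
        if matches is None:
            out.append((left, None))       # unmatched left row is still kept
        else:
            for right in matches:
                out.append((left, right))  # one output row per match
    return out

streaming = [{"type": "click", "value": 1}, {"type": "scroll", "value": 2}]
static = [{"type": "click", "label": "ui"}]
for row in left_outer(streaming, static, "type"):
    print(row)
# ({'type': 'click', 'value': 1}, {'type': 'click', 'label': 'ui'})
# ({'type': 'scroll', 'value': 2}, None)
```

A `right_outer` between a streaming left and static right would instead need to keep unmatched *static* rows, which is why Spark rejects that combination for stream-static joins.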
[spark] branch master updated (d530ed0 -> e66201b)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d530ed0 Revert "[SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends" add e66201b [MINOR][SS][DOCS] Update join type in stream static joins code examples No new revisions were added by this update. Summary of changes: docs/structured-streaming-programming-guide.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [MINOR][SS][DOCS] Update join type in stream static joins code examples
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 8684720 [MINOR][SS][DOCS] Update join type in stream static joins code examples 8684720 is described below commit 868472040a9eb67169086ed0fab0afcbbbd321f9 Author: Sarvesh Dave AuthorDate: Thu Nov 5 16:22:31 2020 +0900 [MINOR][SS][DOCS] Update join type in stream static joins code examples ### What changes were proposed in this pull request? Update the join type in the stream-static joins code examples in the Structured Streaming programming guide. 1) The Scala, Java, and Python examples share a common issue: the join keyword is "right_join", but it should be "left_outer". _Reasons:_ a) These snippets are examples of a left outer join, since the streaming DF is on the left and the static DF is on the right. Also, a right outer join between a streaming DF (left) and a static DF (right) is not supported. b) The keywords "right_join"/"left_join" are unsupported; they should be "right_outer"/"left_outer". So all of these code snippets have been updated to "left_outer". 2) The R example is correct, but it shows a "right_outer" join with the static DF (left) and the streaming DF (right). It is changed to "left_outer" for consistency with the other three examples in Scala, Java, and Python. ### Why are the changes needed? To fix the mistake in the documentation's example code. ### Does this PR introduce _any_ user-facing change? Yes, it is a user-facing change (but a documentation update only). 
**Screenshots 1: Scala/Java/Python example (similar issue)** _Before:_ https://user-images.githubusercontent.com/62717942/98155351-19e59400-1efc-11eb-8142-e6a25a5e6497.png _After:_ https://user-images.githubusercontent.com/62717942/98155503-5d400280-1efc-11eb-96e1-5ba0f3c35c82.png **Screenshots 2: R example (make it consistent with the above change)** _Before:_ https://user-images.githubusercontent.com/62717942/98155685-ac863300-1efc-11eb-93bc-b7ca4dd34634.png _After:_ https://user-images.githubusercontent.com/62717942/98155739-c0ca3000-1efc-11eb-8f95-a7538fa784b7.png ### How was this patch tested? The change was tested locally: 1) cd docs/; SKIP_API=1 jekyll build 2) Verify the docs/_site/structured-streaming-programming-guide.html file in a browser. Closes #30252 from sarveshdave1/doc-update-stream-static-joins. Authored-by: Sarvesh Dave Signed-off-by: Jungtaek Lim (HeartSaVioR) (cherry picked from commit e66201b30bc1f3da7284af14b32e5e6200768dbd) Signed-off-by: Jungtaek Lim (HeartSaVioR) --- docs/structured-streaming-programming-guide.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md index dce4b35..aac262b 100644 --- a/docs/structured-streaming-programming-guide.md +++ b/docs/structured-streaming-programming-guide.md @@ -1089,7 +1089,7 @@ val staticDf = spark.read. ... val streamingDf = spark.readStream. ... streamingDf.join(staticDf, "type") // inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join") // right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer") // left outer join with a static DF {% endhighlight %} @@ -1100,7 +1100,7 @@ streamingDf.join(staticDf, "type", "right_join") // right outer join with a sta Dataset<Row> staticDf = spark.read(). ...; Dataset<Row> streamingDf = spark.readStream(). 
...; streamingDf.join(staticDf, "type"); // inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join"); // right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer"); // left outer join with a static DF {% endhighlight %} @@ -,7 +,7 @@ streamingDf.join(staticDf, "type", "right_join"); // right outer join with a st staticDf = spark.read. ... streamingDf = spark.readStream. ... streamingDf.join(staticDf, "type") # inner equi-join with a static DF -streamingDf.join(staticDf, "type", "right_join") # right outer join with a static DF +streamingDf.join(staticDf, "type", "left_outer") # left outer join with a static DF {% endhighlight %} @@ -1123,10 +1123,10 @@ staticDf <- read.df(...) streamingDf <- read.stream(...) joined <- merge(streamingDf, staticDf, sort = FALSE) # inner equi
[spark] branch master updated (9818f07 -> 4b0e23e)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9818f07 [SPARK-33243][PYTHON][BUILD] Add numpydoc into documentation dependency add 4b0e23e [SPARK-33215][WEBUI] Speed up event log download by skipping UI rebuild No new revisions were added by this update. Summary of changes: .../history/ApplicationHistoryProvider.scala | 7 +++ .../spark/deploy/history/FsHistoryProvider.scala | 34 ++ .../spark/deploy/history/HistoryServer.scala | 5 ++ .../spark/status/api/v1/ApiRootResource.scala | 15 ++ .../status/api/v1/OneApplicationResource.scala | 9 ++-- .../main/scala/org/apache/spark/ui/SparkUI.scala | 5 ++ .../deploy/history/FsHistoryProviderSuite.scala| 54 +- .../spark/deploy/history/HistoryServerSuite.scala | 18 8 files changed, 132 insertions(+), 15 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (369cc61 -> d87a0bb)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 369cc61 Revert "[SPARK-32388][SQL] TRANSFORM with schema-less mode should keep the same with hive" add d87a0bb [SPARK-32862][SS] Left semi stream-stream join No new revisions were added by this update. Summary of changes: .../analysis/UnsupportedOperationChecker.scala | 15 +- .../spark/sql/catalyst/expressions/JoinedRow.scala | 10 + .../analysis/UnsupportedOperationsSuite.scala | 66 ++- .../streaming/StreamingSymmetricHashJoinExec.scala | 121 +++-- .../state/SymmetricHashJoinStateManager.scala | 11 +- .../spark/sql/streaming/StreamingJoinSuite.scala | 502 +++-- 6 files changed, 545 insertions(+), 180 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
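SPARK-32862 adds left semi as a supported stream-stream join type. A left semi join keeps each left row that has at least one right-side match, without duplicating the row per match and without emitting any right-side columns. Those semantics can be sketched in plain Python over in-memory rows (illustrative data and field names, not the Spark implementation, which additionally handles watermarks and join state):

```python
# Plain-Python sketch of left semi join semantics — the join type that
# SPARK-32862 enables for stream-stream joins. Data is illustrative only.

def left_semi(left_rows, right_rows, key):
    """Keep each left row with at least one right match; no per-match
    duplication, and no right-side columns in the output."""
    right_keys = {row[key] for row in right_rows}
    return [row for row in left_rows if row[key] in right_keys]

orders = [{"id": 1, "item": "book"}, {"id": 2, "item": "pen"}]
payments = [{"id": 1}, {"id": 1}]  # duplicate matches still yield one output row
print(left_semi(orders, payments, "id"))
# [{'id': 1, 'item': 'book'}]
```

In the streaming case the interesting part is state management: an unmatched left row cannot be dropped immediately, since a matching right row may still arrive before the watermark passes — which is why this lands in StreamingSymmetricHashJoinExec and SymmetricHashJoinStateManager.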
[spark] branch branch-3.0 updated (0bff1f6 -> 02f80cf)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 0bff1f6 [SPARK-33123][INFRA] Ignore GitHub only changes in Amplab Jenkins build new 15ed312 [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server new 02f80cf Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../spark/deploy/history/FsHistoryProvider.scala | 7 +++- .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 55 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/02: Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS""
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 02f80cf293739f4d2881316897dbdcea74daa0bc
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Thu Oct 15 15:28:52 2020 +0900

    Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS""

    This reverts commit e40c147a5d194adbba13f12590959dc68347ec14.
---
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 ++
 .../deploy/history/FsHistoryProviderSuite.scala    | 49 ++
 2 files changed, 52 insertions(+)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index f5e7c4fa..7e63d55 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -527,6 +527,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
             reader.fileSizeForLastIndex > 0
           } catch {
             case _: FileNotFoundException => false
+            case NonFatal(e) =>
+              logWarning(s"Error while reading new log ${reader.rootPath}", e)
+              false
           }

         case NonFatal(e) =>
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
index c2f34fc..f3beb35 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
@@ -1470,6 +1470,55 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging {
     }
   }

+  test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") {
+    withTempDir { dir =>
+      val conf = createTestConf(true)
+      conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath)
+      val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
+      val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf)
+
+      val provider = new FsHistoryProvider(conf)
+
+      val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf)
+      writer.start()
+
+      writeEventsToRollingWriter(writer, Seq(
+        SparkListenerApplicationStart("app", Some("app"), 0, "user", None),
+        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(dir.listFiles().size === 1)
+      assert(provider.getListing.length === 1)
+
+      // Manually delete the appstatus file to make an invalid rolling event log
+      val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath),
+        "app", None, true)
+      fs.delete(appStatusPath, false)
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(provider.getListing.length === 0)
+
+      // Create a new application
+      val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf)
+      writer2.start()
+      writeEventsToRollingWriter(writer2, Seq(
+        SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None),
+        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
+
+      // Both folders exist but only one application found
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(provider.getListing.length === 1)
+      assert(dir.listFiles().size === 2)
+
+      // Make sure a new provider sees the valid application
+      provider.stop()
+      val newProvider = new FsHistoryProvider(conf)
+      newProvider.checkForLogs()
+      assert(newProvider.getListing.length === 1)
+    }
+  }
+
   /**
    * Asks the provider to check for logs and calls a function to perform checks on the updated
    * app list. Example:

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/02: [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 15ed3121994d4c7e6f1db7e0b5b2cea5565b495a
Author: Yan Xiaole
AuthorDate: Sun Aug 9 16:47:31 2020 -0700

    [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server

    ### What changes were proposed in this pull request?
    This PR adds a try catch wrapping the History server scan logic to log and swallow the exception per entry.

    ### Why are the changes needed?
    As discussed in #29350, one entry failure shouldn't affect others.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Manually tested.

    Closes #29374 from yanxiaole/SPARK-32557.

    Authored-by: Yan Xiaole
    Signed-off-by: Dongjoon Hyun
---
 .../scala/org/apache/spark/deploy/history/FsHistoryProvider.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index c262152..f5e7c4fa 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -27,6 +27,7 @@ import java.util.zip.ZipOutputStream
 import scala.collection.JavaConverters._
 import scala.collection.mutable
 import scala.io.Source
+import scala.util.control.NonFatal
 import scala.xml.Node

 import com.fasterxml.jackson.annotation.JsonIgnore
@@ -528,7 +529,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
           case _: FileNotFoundException => false
         }

-      case _: FileNotFoundException =>
+      case NonFatal(e) =>
+        logWarning(s"Error while filtering log ${reader.rootPath}", e)
         false
     }
   }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
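The per-entry pattern this commit applies — catch `NonFatal` inside the filter so one bad entry is logged and skipped rather than aborting the whole scan — can be sketched in isolation. The `LogEntry` case class and the failure condition below are hypothetical stand-ins for the history server's log readers:

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for a history-server log entry.
final case class LogEntry(rootPath: String, valid: Boolean)

object PerEntryScan {
  // Mirrors the SPARK-32557 pattern: a non-fatal failure on one entry is
  // logged and swallowed, so the remaining entries are still considered.
  def scan(entries: Seq[LogEntry]): Seq[LogEntry] =
    entries.filter { entry =>
      try {
        // Stand-in for the real per-entry check (e.g. reading the file size),
        // which may throw for a corrupt or inaccessible entry.
        if (!entry.valid) throw new IllegalStateException(s"bad entry ${entry.rootPath}")
        true
      } catch {
        case NonFatal(e) =>
          Console.err.println(s"Error while filtering log ${entry.rootPath}: ${e.getMessage}")
          false
      }
    }

  def main(args: Array[String]): Unit = {
    val kept = scan(Seq(
      LogEntry("a", valid = true),
      LogEntry("b", valid = false),
      LogEntry("c", valid = true)))
    println(kept.map(_.rootPath).mkString(","))  // a,c
  }
}
```

`NonFatal` deliberately excludes errors like `OutOfMemoryError`, so truly fatal failures still propagate instead of being swallowed per entry.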
[spark] 01/02: [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 15ed3121994d4c7e6f1db7e0b5b2cea5565b495a Author: Yan Xiaole AuthorDate: Sun Aug 9 16:47:31 2020 -0700 [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server ### What changes were proposed in this pull request? This PR adds a try catch wrapping the History server scan logic to log and swallow the exception per entry. ### Why are the changes needed? As discussed in #29350 , one entry failure shouldn't affect others. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually tested. Closes #29374 from yanxiaole/SPARK-32557. Authored-by: Yan Xiaole Signed-off-by: Dongjoon Hyun --- .../scala/org/apache/spark/deploy/history/FsHistoryProvider.scala | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala index c262152..f5e7c4fa 100644 --- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala +++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala @@ -27,6 +27,7 @@ import java.util.zip.ZipOutputStream import scala.collection.JavaConverters._ import scala.collection.mutable import scala.io.Source +import scala.util.control.NonFatal import scala.xml.Node import com.fasterxml.jackson.annotation.JsonIgnore @@ -528,7 +529,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) case _: FileNotFoundException => false } -case _: FileNotFoundException => +case NonFatal(e) => + logWarning(s"Error while filtering log ${reader.rootPath}", e) false } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/02: Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS""
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 02f80cf293739f4d2881316897dbdcea74daa0bc Author: Jungtaek Lim (HeartSaVioR) AuthorDate: Thu Oct 15 15:28:52 2020 +0900 Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" This reverts commit e40c147a5d194adbba13f12590959dc68347ec14. --- .../spark/deploy/history/FsHistoryProvider.scala | 3 ++ .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 52 insertions(+) diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala index f5e7c4fa..7e63d55 100644 --- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala +++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala @@ -527,6 +527,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) reader.fileSizeForLastIndex > 0 } catch { case _: FileNotFoundException => false +case NonFatal(e) => + logWarning(s"Error while reading new log ${reader.rootPath}", e) + false } case NonFatal(e) => diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala index c2f34fc..f3beb35 100644 --- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala +++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala @@ -1470,6 +1470,55 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging { } } + test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") { +withTempDir { dir => + val conf = createTestConf(true) + conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath) + val hadoopConf = 
SparkHadoopUtil.newConfiguration(conf) + val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf) + + val provider = new FsHistoryProvider(conf) + + val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf) + writer.start() + + writeEventsToRollingWriter(writer, Seq( +SparkListenerApplicationStart("app", Some("app"), 0, "user", None), +SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false) + provider.checkForLogs() + provider.cleanLogs() + assert(dir.listFiles().size === 1) + assert(provider.getListing.length === 1) + + // Manually delete the appstatus file to make an invalid rolling event log + val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath), +"app", None, true) + fs.delete(appStatusPath, false) + provider.checkForLogs() + provider.cleanLogs() + assert(provider.getListing.length === 0) + + // Create a new application + val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf) + writer2.start() + writeEventsToRollingWriter(writer2, Seq( +SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None), +SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false) + + // Both folders exist but only one application found + provider.checkForLogs() + provider.cleanLogs() + assert(provider.getListing.length === 1) + assert(dir.listFiles().size === 2) + + // Make sure a new provider sees the valid application + provider.stop() + val newProvider = new FsHistoryProvider(conf) + newProvider.checkForLogs() + assert(newProvider.getListing.length === 1) +} + } + /** * Asks the provider to check for logs and calls a function to perform checks on the updated * app list. Example: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (0bff1f6 -> 02f80cf)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 0bff1f6 [SPARK-33123][INFRA] Ignore GitHub only changes in Amplab Jenkins build new 15ed312 [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server new 02f80cf Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../spark/deploy/history/FsHistoryProvider.scala | 7 +++- .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 55 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/02: [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 15ed3121994d4c7e6f1db7e0b5b2cea5565b495a Author: Yan Xiaole AuthorDate: Sun Aug 9 16:47:31 2020 -0700 [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server ### What changes were proposed in this pull request? This PR adds a try catch wrapping the History server scan logic to log and swallow the exception per entry. ### Why are the changes needed? As discussed in #29350 , one entry failure shouldn't affect others. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually tested. Closes #29374 from yanxiaole/SPARK-32557. Authored-by: Yan Xiaole Signed-off-by: Dongjoon Hyun --- .../scala/org/apache/spark/deploy/history/FsHistoryProvider.scala | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala index c262152..f5e7c4fa 100644 --- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala +++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala @@ -27,6 +27,7 @@ import java.util.zip.ZipOutputStream import scala.collection.JavaConverters._ import scala.collection.mutable import scala.io.Source +import scala.util.control.NonFatal import scala.xml.Node import com.fasterxml.jackson.annotation.JsonIgnore @@ -528,7 +529,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) case _: FileNotFoundException => false } -case _: FileNotFoundException => +case NonFatal(e) => + logWarning(s"Error while filtering log ${reader.rootPath}", e) false } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (0bff1f6 -> 02f80cf)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 0bff1f6 [SPARK-33123][INFRA] Ignore GitHub only changes in Amplab Jenkins build new 15ed312 [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server new 02f80cf Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../spark/deploy/history/FsHistoryProvider.scala | 7 +++- .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 55 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/02: Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS""
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 02f80cf293739f4d2881316897dbdcea74daa0bc Author: Jungtaek Lim (HeartSaVioR) AuthorDate: Thu Oct 15 15:28:52 2020 +0900 Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" This reverts commit e40c147a5d194adbba13f12590959dc68347ec14. --- .../spark/deploy/history/FsHistoryProvider.scala | 3 ++ .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 52 insertions(+) diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala index f5e7c4fa..7e63d55 100644 --- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala +++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala @@ -527,6 +527,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) reader.fileSizeForLastIndex > 0 } catch { case _: FileNotFoundException => false +case NonFatal(e) => + logWarning(s"Error while reading new log ${reader.rootPath}", e) + false } case NonFatal(e) => diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala index c2f34fc..f3beb35 100644 --- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala +++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala @@ -1470,6 +1470,55 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging { } } + test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") { +withTempDir { dir => + val conf = createTestConf(true) + conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath) + val hadoopConf = 
SparkHadoopUtil.newConfiguration(conf) + val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf) + + val provider = new FsHistoryProvider(conf) + + val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf) + writer.start() + + writeEventsToRollingWriter(writer, Seq( +SparkListenerApplicationStart("app", Some("app"), 0, "user", None), +SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false) + provider.checkForLogs() + provider.cleanLogs() + assert(dir.listFiles().size === 1) + assert(provider.getListing.length === 1) + + // Manually delete the appstatus file to make an invalid rolling event log + val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath), +"app", None, true) + fs.delete(appStatusPath, false) + provider.checkForLogs() + provider.cleanLogs() + assert(provider.getListing.length === 0) + + // Create a new application + val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf) + writer2.start() + writeEventsToRollingWriter(writer2, Seq( +SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None), +SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false) + + // Both folders exist but only one application found + provider.checkForLogs() + provider.cleanLogs() + assert(provider.getListing.length === 1) + assert(dir.listFiles().size === 2) + + // Make sure a new provider sees the valid application + provider.stop() + val newProvider = new FsHistoryProvider(conf) + newProvider.checkForLogs() + assert(newProvider.getListing.length === 1) +} + } + /** * Asks the provider to check for logs and calls a function to perform checks on the updated * app list. Example: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (0bff1f6 -> 02f80cf)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 0bff1f6 [SPARK-33123][INFRA] Ignore GitHub only changes in Amplab Jenkins build new 15ed312 [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server new 02f80cf Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../spark/deploy/history/FsHistoryProvider.scala | 7 +++- .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 55 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/02: Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS""
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 02f80cf293739f4d2881316897dbdcea74daa0bc Author: Jungtaek Lim (HeartSaVioR) AuthorDate: Thu Oct 15 15:28:52 2020 +0900 Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" This reverts commit e40c147a5d194adbba13f12590959dc68347ec14. --- .../spark/deploy/history/FsHistoryProvider.scala | 3 ++ .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 52 insertions(+) diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala index f5e7c4fa..7e63d55 100644 --- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala +++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala @@ -527,6 +527,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) reader.fileSizeForLastIndex > 0 } catch { case _: FileNotFoundException => false +case NonFatal(e) => + logWarning(s"Error while reading new log ${reader.rootPath}", e) + false } case NonFatal(e) => diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala index c2f34fc..f3beb35 100644 --- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala +++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala @@ -1470,6 +1470,55 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging { } } + test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") { +withTempDir { dir => + val conf = createTestConf(true) + conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath) + val hadoopConf = 
SparkHadoopUtil.newConfiguration(conf) + val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf) + + val provider = new FsHistoryProvider(conf) + + val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf) + writer.start() + + writeEventsToRollingWriter(writer, Seq( +SparkListenerApplicationStart("app", Some("app"), 0, "user", None), +SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false) + provider.checkForLogs() + provider.cleanLogs() + assert(dir.listFiles().size === 1) + assert(provider.getListing.length === 1) + + // Manually delete the appstatus file to make an invalid rolling event log + val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath), +"app", None, true) + fs.delete(appStatusPath, false) + provider.checkForLogs() + provider.cleanLogs() + assert(provider.getListing.length === 0) + + // Create a new application + val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf) + writer2.start() + writeEventsToRollingWriter(writer2, Seq( +SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None), +SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false) + + // Both folders exist but only one application found + provider.checkForLogs() + provider.cleanLogs() + assert(provider.getListing.length === 1) + assert(dir.listFiles().size === 2) + + // Make sure a new provider sees the valid application + provider.stop() + val newProvider = new FsHistoryProvider(conf) + newProvider.checkForLogs() + assert(newProvider.getListing.length === 1) +} + } + /** * Asks the provider to check for logs and calls a function to perform checks on the updated * app list. Example: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/02: [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 15ed3121994d4c7e6f1db7e0b5b2cea5565b495a Author: Yan Xiaole AuthorDate: Sun Aug 9 16:47:31 2020 -0700 [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server ### What changes were proposed in this pull request? This PR adds a try catch wrapping the History server scan logic to log and swallow the exception per entry. ### Why are the changes needed? As discussed in #29350 , one entry failure shouldn't affect others. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually tested. Closes #29374 from yanxiaole/SPARK-32557. Authored-by: Yan Xiaole Signed-off-by: Dongjoon Hyun --- .../scala/org/apache/spark/deploy/history/FsHistoryProvider.scala | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala index c262152..f5e7c4fa 100644 --- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala +++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala @@ -27,6 +27,7 @@ import java.util.zip.ZipOutputStream import scala.collection.JavaConverters._ import scala.collection.mutable import scala.io.Source +import scala.util.control.NonFatal import scala.xml.Node import com.fasterxml.jackson.annotation.JsonIgnore @@ -528,7 +529,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) case _: FileNotFoundException => false } -case _: FileNotFoundException => +case NonFatal(e) => + logWarning(s"Error while filtering log ${reader.rootPath}", e) false } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/02: Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS""
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git commit 02f80cf293739f4d2881316897dbdcea74daa0bc Author: Jungtaek Lim (HeartSaVioR) AuthorDate: Thu Oct 15 15:28:52 2020 +0900 Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" This reverts commit e40c147a5d194adbba13f12590959dc68347ec14. --- .../spark/deploy/history/FsHistoryProvider.scala | 3 ++ .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 52 insertions(+) diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala index f5e7c4fa..7e63d55 100644 --- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala +++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala @@ -527,6 +527,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock) reader.fileSizeForLastIndex > 0 } catch { case _: FileNotFoundException => false +case NonFatal(e) => + logWarning(s"Error while reading new log ${reader.rootPath}", e) + false } case NonFatal(e) => diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala index c2f34fc..f3beb35 100644 --- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala +++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala @@ -1470,6 +1470,55 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging { } } + test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") { +withTempDir { dir => + val conf = createTestConf(true) + conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath) + val hadoopConf = 
SparkHadoopUtil.newConfiguration(conf) + val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf) + + val provider = new FsHistoryProvider(conf) + + val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf) + writer.start() + + writeEventsToRollingWriter(writer, Seq( +SparkListenerApplicationStart("app", Some("app"), 0, "user", None), +SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false) + provider.checkForLogs() + provider.cleanLogs() + assert(dir.listFiles().size === 1) + assert(provider.getListing.length === 1) + + // Manually delete the appstatus file to make an invalid rolling event log + val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath), +"app", None, true) + fs.delete(appStatusPath, false) + provider.checkForLogs() + provider.cleanLogs() + assert(provider.getListing.length === 0) + + // Create a new application + val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf) + writer2.start() + writeEventsToRollingWriter(writer2, Seq( +SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None), +SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false) + + // Both folders exist but only one application found + provider.checkForLogs() + provider.cleanLogs() + assert(provider.getListing.length === 1) + assert(dir.listFiles().size === 2) + + // Make sure a new provider sees the valid application + provider.stop() + val newProvider = new FsHistoryProvider(conf) + newProvider.checkForLogs() + assert(newProvider.getListing.length === 1) +} + } + /** * Asks the provider to check for logs and calls a function to perform checks on the updated * app list. Example: - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (0bff1f6 -> 02f80cf)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 0bff1f6 [SPARK-33123][INFRA] Ignore GitHub only changes in Amplab Jenkins build new 15ed312 [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server new 02f80cf Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../spark/deploy/history/FsHistoryProvider.scala | 7 +++- .../deploy/history/FsHistoryProviderSuite.scala| 49 ++ 2 files changed, 55 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/02: [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 15ed3121994d4c7e6f1db7e0b5b2cea5565b495a
Author: Yan Xiaole
AuthorDate: Sun Aug 9 16:47:31 2020 -0700

    [SPARK-32557][CORE] Logging and swallowing the exception per entry in History server

    ### What changes were proposed in this pull request?
    This PR adds a try catch wrapping the History server scan logic to log and swallow the exception per entry.

    ### Why are the changes needed?
    As discussed in #29350, one entry failure shouldn't affect others.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Manually tested.

    Closes #29374 from yanxiaole/SPARK-32557.

    Authored-by: Yan Xiaole
    Signed-off-by: Dongjoon Hyun
---
 .../scala/org/apache/spark/deploy/history/FsHistoryProvider.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index c262152..f5e7c4fa 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -27,6 +27,7 @@ import java.util.zip.ZipOutputStream
 import scala.collection.JavaConverters._
 import scala.collection.mutable
 import scala.io.Source
+import scala.util.control.NonFatal
 import scala.xml.Node

 import com.fasterxml.jackson.annotation.JsonIgnore
@@ -528,7 +529,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
           case _: FileNotFoundException => false
         }

-      case _: FileNotFoundException =>
+      case NonFatal(e) =>
+        logWarning(s"Error while filtering log ${reader.rootPath}", e)
         false
     }
   }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
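As a rough illustration outside of Spark, the per-entry guard in the diff above can be sketched in plain Scala. `isReadable` and the sample entries below are hypothetical stand-ins for the provider's scan logic, not the real `FsHistoryProvider` API; only the `NonFatal` catch-log-and-skip pattern is taken from the commit.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for reading one event-log entry's size.
// NonFatal matches ordinary exceptions but lets truly fatal ones
// (OutOfMemoryError, InterruptedException, ...) propagate.
def isReadable(loadSize: () => Long): Boolean =
  try {
    loadSize() > 0
  } catch {
    case NonFatal(e) =>
      // Log and swallow: one corrupt entry must not abort the whole scan.
      println(s"Error while reading log entry: ${e.getMessage}")
      false
  }

val entries = Seq(
  () => 42L,                                     // healthy entry
  () => throw new IllegalStateException("bad"),  // corrupt entry, skipped
  () => 7L                                       // still scanned afterwards
)
val readableCount = entries.count(isReadable)
```

Counting over all three entries still succeeds; only the corrupt one is dropped, which is the behavior the PR description calls "one entry failure shouldn't affect others".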
[spark] branch branch-3.0 updated: [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new d9669bd  [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS
d9669bd is described below

commit d9669bdf0ff4ed9951d7077b8dc9ad94507615c5
Author: Adam Binford
AuthorDate: Thu Oct 15 11:59:29 2020 +0900

    [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS

    ### What changes were proposed in this pull request?
    Adds an additional check for non-fatal errors when attempting to add a new entry to the history server application listing.

    ### Why are the changes needed?
    A bad rolling event log folder (missing appstatus file or no log files) would cause no applications to be loaded by the Spark history server. Figuring out why invalid event log folders are created in the first place will be addressed in separate issues; this just lets the history server skip the invalid folder and successfully load all the valid applications.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    New UT

    Closes #30037 from Kimahriman/bug/rolling-log-crashing-history.

    Authored-by: Adam Binford
    Signed-off-by: Jungtaek Lim (HeartSaVioR)
    (cherry picked from commit 9ab0ec4e38e5df0537b38cb0f89e004ad57bec90)
    Signed-off-by: Jungtaek Lim (HeartSaVioR)
---
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 ++
 .../deploy/history/FsHistoryProviderSuite.scala    | 49 ++
 2 files changed, 52 insertions(+)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index c262152..5970708 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -526,6 +526,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
             reader.fileSizeForLastIndex > 0
           } catch {
             case _: FileNotFoundException => false
+            case NonFatal(e) =>
+              logWarning(s"Error while reading new log ${reader.rootPath}", e)
+              false
           }

         case _: FileNotFoundException =>
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
index c2f34fc..f3beb35 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
@@ -1470,6 +1470,55 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging {
     }
   }

+  test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") {
+    withTempDir { dir =>
+      val conf = createTestConf(true)
+      conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath)
+      val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
+      val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf)
+
+      val provider = new FsHistoryProvider(conf)
+
+      val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf)
+      writer.start()
+
+      writeEventsToRollingWriter(writer, Seq(
+        SparkListenerApplicationStart("app", Some("app"), 0, "user", None),
+        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(dir.listFiles().size === 1)
+      assert(provider.getListing.length === 1)
+
+      // Manually delete the appstatus file to make an invalid rolling event log
+      val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath),
+        "app", None, true)
+      fs.delete(appStatusPath, false)
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(provider.getListing.length === 0)
+
+      // Create a new application
+      val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf)
+      writer2.start()
+      writeEventsToRollingWriter(writer2, Seq(
+        SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None),
+        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
+
+      // Both folders exist but only one application found
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(provider.getListing.length === 1)
+      assert(dir.listFiles().size === 2)
+
+      // Make sure a new provider sees the valid application
+      provider.stop()
+      val newProvider = new FsHistoryProvider(conf)
+      newProvider.checkForLogs()
+      assert(newProvider.getListing.length === 1)
+    }
+  }
[spark] branch master updated (f3ad32f -> 9ab0ec4)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from  f3ad32f  [SPARK-33026][SQL][FOLLOWUP] metrics name should be numOutputRows
     add  9ab0ec4  [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS

No new revisions were added by this update.

Summary of changes:
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 ++
 .../deploy/history/FsHistoryProviderSuite.scala    | 49 ++
 2 files changed, 52 insertions(+)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (7fdb571 -> d936cb3)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from  7fdb571  [SPARK-32890][SQL] Pass all `sql/hive` module UTs in Scala 2.13
     add  d936cb3  [SPARK-26425][SS] Add more constraint checks to avoid checkpoint corruption

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/execution/streaming/FileStreamSource.scala     | 12 +---
 .../spark/sql/execution/streaming/MicroBatchExecution.scala  |  4 +++-
 2 files changed, 12 insertions(+), 4 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
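The kind of defensive constraint check SPARK-26425 adds can be sketched in plain Scala. The `OffsetLog` type, `validateBatch` helper, and error messages below are hypothetical illustrations, not Spark's real `MicroBatchExecution` internals; the idea shown is simply to fail fast when the checkpoint's latest offset entry disagrees with the batch the engine believes it is running.

```scala
// Toy stand-in for a streaming checkpoint's offset log.
final case class OffsetLog(entries: Map[Long, String]) {
  def latestBatchId: Option[Long] =
    if (entries.isEmpty) None else Some(entries.keys.max)
}

// Validate that the checkpoint still reflects the batch we are running;
// a mismatch suggests a concurrent writer or a corrupted checkpoint.
def validateBatch(log: OffsetLog, currentBatchId: Long): Either[String, Unit] =
  log.latestBatchId match {
    case Some(latest) if latest != currentBatchId =>
      Left(s"Concurrent update detected: latest offset is for batch $latest, " +
        s"but this query is running batch $currentBatchId")
    case None =>
      Left(s"Offset log is empty while running batch $currentBatchId; " +
        "the checkpoint may be corrupted")
    case _ =>
      Right(())
  }

val healthy = validateBatch(OffsetLog(Map(3L -> "offsets-3")), 3L)
val stale   = validateBatch(OffsetLog(Map(2L -> "offsets-2")), 3L)
```

Checks like this turn silent checkpoint corruption into an immediate, descriptive failure instead of letting the query keep writing bad state.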
[spark] branch master updated (4a09613 -> db89b0e)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4a09613 Revert "[SPARK-32772][SQL][FOLLOWUP] Remove legacy silent support mode for spark-sql CLI" add db89b0e [SPARK-32831][SS] Refactor SupportsStreamingUpdate to represent actual meaning of the behavior No new revisions were added by this update. Summary of changes: .../apache/spark/sql/kafka010/KafkaSourceProvider.scala | 5 ++--- ...Update.scala => SupportsStreamingUpdateAsAppend.scala} | 15 +++ .../sql/execution/datasources/noop/NoopDataSource.scala | 5 ++--- .../spark/sql/execution/streaming/StreamExecution.scala | 6 +++--- .../apache/spark/sql/execution/streaming/console.scala| 7 +++ .../execution/streaming/sources/ForeachWriterTable.scala | 7 +++ .../spark/sql/execution/streaming/sources/memory.scala| 7 ++- 7 files changed, 26 insertions(+), 26 deletions(-) rename sql/catalyst/src/main/scala/org/apache/spark/sql/internal/connector/{SupportsStreamingUpdate.scala => SupportsStreamingUpdateAsAppend.scala} (64%) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (3eee915 -> 9151a58)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 3eee915 [MINOR][SQL] Add missing documentation for LongType mapping add 9151a58 [SPARK-31608][CORE][WEBUI][TEST] Add test suites for HybridStore and HistoryServerMemoryManager No new revisions were added by this update. Summary of changes: .../history/HistoryServerMemoryManager.scala | 5 +- .../apache/spark/deploy/history/HybridStore.scala | 6 +- .../deploy/history/FsHistoryProviderSuite.scala| 13 +- .../history/HistoryServerMemoryManagerSuite.scala | 55 + .../spark/deploy/history/HybridStoreSuite.scala| 232 + 5 files changed, 304 insertions(+), 7 deletions(-) create mode 100644 core/src/test/scala/org/apache/spark/deploy/history/HistoryServerMemoryManagerSuite.scala create mode 100644 core/src/test/scala/org/apache/spark/deploy/history/HybridStoreSuite.scala - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new a6df16b [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations a6df16b is described below commit a6df16b36210da32359c77205920eaee98d3e232 Author: Yuanjian Li AuthorDate: Sat Aug 22 21:32:23 2020 +0900 [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations ### What changes were proposed in this pull request? Rephrase the description for some operations to make it clearer. ### Why are the changes needed? Add more detail in the document. ### Does this PR introduce _any_ user-facing change? No, document only. ### How was this patch tested? Document only. Closes #29269 from xuanyuanking/SPARK-31792-follow. Authored-by: Yuanjian Li Signed-off-by: Jungtaek Lim (HeartSaVioR) (cherry picked from commit 8b26c69ce7f905a3c7bbabb1c47ee6a51a23) Signed-off-by: Jungtaek Lim (HeartSaVioR) --- docs/web-ui.md | 10 +- .../spark/sql/execution/streaming/MicroBatchExecution.scala| 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/docs/web-ui.md b/docs/web-ui.md index 134a8c8..fe26043 100644 --- a/docs/web-ui.md +++ b/docs/web-ui.md @@ -425,11 +425,11 @@ queries. Currently, it contains the following metrics. * **Batch Duration.** The process duration of each batch. * **Operation Duration.** The amount of time taken to perform various operations in milliseconds. The tracked operations are listed as follows. -* addBatch: Adds result data of the current batch to the sink. -* getBatch: Gets a new batch of data to process. -* latestOffset: Gets the latest offsets for sources. -* queryPlanning: Generates the execution plan. -* walCommit: Writes the offsets to the metadata log. 
+* addBatch: Time taken to read the micro-batch's input data from the sources, process it, and write the batch's output to the sink. This should take the bulk of the micro-batch's time. +* getBatch: Time taken to prepare the logical query to read the input of the current micro-batch from the sources. +* latestOffset & getOffset: Time taken to query the maximum available offset for this source. +* queryPlanning: Time taken to generates the execution plan. +* walCommit: Time taken to write the offsets to the metadata log. As an early-release version, the statistics page is still under development and will be improved in future releases. diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala index e022bfb..e0731db 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala @@ -566,8 +566,7 @@ class MicroBatchExecution( val nextBatch = new Dataset(lastExecution, RowEncoder(lastExecution.analyzed.schema)) -val batchSinkProgress: Option[StreamWriterCommitProgress] = - reportTimeTaken("addBatch") { +val batchSinkProgress: Option[StreamWriterCommitProgress] = reportTimeTaken("addBatch") { SQLExecution.withNewExecutionId(lastExecution) { sink match { case s: Sink => s.addBatch(currentBatchId, nextBatch) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (12f4331 -> 8b26c69)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 12f4331 [SPARK-32672][SQL] Fix data corruption in boolean bit set compression add 8b26c69 [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations No new revisions were added by this update. Summary of changes: docs/web-ui.md | 10 +- .../spark/sql/execution/streaming/MicroBatchExecution.scala| 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new a6df16b [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations a6df16b is described below commit a6df16b36210da32359c77205920eaee98d3e232 Author: Yuanjian Li AuthorDate: Sat Aug 22 21:32:23 2020 +0900 [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations ### What changes were proposed in this pull request? Rephrase the description for some operations to make it clearer. ### Why are the changes needed? Add more detail in the document. ### Does this PR introduce _any_ user-facing change? No, document only. ### How was this patch tested? Document only. Closes #29269 from xuanyuanking/SPARK-31792-follow. Authored-by: Yuanjian Li Signed-off-by: Jungtaek Lim (HeartSaVioR) (cherry picked from commit 8b26c69ce7f905a3c7bbabb1c47ee6a51a23) Signed-off-by: Jungtaek Lim (HeartSaVioR) --- docs/web-ui.md | 10 +- .../spark/sql/execution/streaming/MicroBatchExecution.scala| 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/docs/web-ui.md b/docs/web-ui.md index 134a8c8..fe26043 100644 --- a/docs/web-ui.md +++ b/docs/web-ui.md @@ -425,11 +425,11 @@ queries. Currently, it contains the following metrics. * **Batch Duration.** The process duration of each batch. * **Operation Duration.** The amount of time taken to perform various operations in milliseconds. The tracked operations are listed as follows. -* addBatch: Adds result data of the current batch to the sink. -* getBatch: Gets a new batch of data to process. -* latestOffset: Gets the latest offsets for sources. -* queryPlanning: Generates the execution plan. -* walCommit: Writes the offsets to the metadata log. 
+* addBatch: Time taken to read the micro-batch's input data from the sources, process it, and write the batch's output to the sink. This should take the bulk of the micro-batch's time. +* getBatch: Time taken to prepare the logical query to read the input of the current micro-batch from the sources. +* latestOffset & getOffset: Time taken to query the maximum available offset for this source. +* queryPlanning: Time taken to generates the execution plan. +* walCommit: Time taken to write the offsets to the metadata log. As an early-release version, the statistics page is still under development and will be improved in future releases. diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala index e022bfb..e0731db 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala @@ -566,8 +566,7 @@ class MicroBatchExecution( val nextBatch = new Dataset(lastExecution, RowEncoder(lastExecution.analyzed.schema)) -val batchSinkProgress: Option[StreamWriterCommitProgress] = - reportTimeTaken("addBatch") { +val batchSinkProgress: Option[StreamWriterCommitProgress] = reportTimeTaken("addBatch") { SQLExecution.withNewExecutionId(lastExecution) { sink match { case s: Sink => s.addBatch(currentBatchId, nextBatch) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (12f4331 -> 8b26c69)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 12f4331 [SPARK-32672][SQL] Fix data corruption in boolean bit set compression add 8b26c69 [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations No new revisions were added by this update. Summary of changes: docs/web-ui.md | 10 +- .../spark/sql/execution/streaming/MicroBatchExecution.scala| 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new a6df16b [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations a6df16b is described below commit a6df16b36210da32359c77205920eaee98d3e232 Author: Yuanjian Li AuthorDate: Sat Aug 22 21:32:23 2020 +0900 [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations ### What changes were proposed in this pull request? Rephrase the description for some operations to make it clearer. ### Why are the changes needed? Add more detail in the document. ### Does this PR introduce _any_ user-facing change? No, document only. ### How was this patch tested? Document only. Closes #29269 from xuanyuanking/SPARK-31792-follow. Authored-by: Yuanjian Li Signed-off-by: Jungtaek Lim (HeartSaVioR) (cherry picked from commit 8b26c69ce7f905a3c7bbabb1c47ee6a51a23) Signed-off-by: Jungtaek Lim (HeartSaVioR) --- docs/web-ui.md | 10 +- .../spark/sql/execution/streaming/MicroBatchExecution.scala| 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/docs/web-ui.md b/docs/web-ui.md index 134a8c8..fe26043 100644 --- a/docs/web-ui.md +++ b/docs/web-ui.md @@ -425,11 +425,11 @@ queries. Currently, it contains the following metrics. * **Batch Duration.** The process duration of each batch. * **Operation Duration.** The amount of time taken to perform various operations in milliseconds. The tracked operations are listed as follows. -* addBatch: Adds result data of the current batch to the sink. -* getBatch: Gets a new batch of data to process. -* latestOffset: Gets the latest offsets for sources. -* queryPlanning: Generates the execution plan. -* walCommit: Writes the offsets to the metadata log. 
+* addBatch: Time taken to read the micro-batch's input data from the sources, process it, and write the batch's output to the sink. This should take the bulk of the micro-batch's time. +* getBatch: Time taken to prepare the logical query to read the input of the current micro-batch from the sources. +* latestOffset & getOffset: Time taken to query the maximum available offset for this source. +* queryPlanning: Time taken to generates the execution plan. +* walCommit: Time taken to write the offsets to the metadata log. As an early-release version, the statistics page is still under development and will be improved in future releases. diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala index e022bfb..e0731db 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala @@ -566,8 +566,7 @@ class MicroBatchExecution( val nextBatch = new Dataset(lastExecution, RowEncoder(lastExecution.analyzed.schema)) -val batchSinkProgress: Option[StreamWriterCommitProgress] = - reportTimeTaken("addBatch") { +val batchSinkProgress: Option[StreamWriterCommitProgress] = reportTimeTaken("addBatch") { SQLExecution.withNewExecutionId(lastExecution) { sink match { case s: Sink => s.addBatch(currentBatchId, nextBatch) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (12f4331 -> 8b26c69)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 12f4331 [SPARK-32672][SQL] Fix data corruption in boolean bit set compression add 8b26c69 [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations No new revisions were added by this update. Summary of changes: docs/web-ui.md | 10 +- .../spark/sql/execution/streaming/MicroBatchExecution.scala| 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new a6df16b [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations a6df16b is described below commit a6df16b36210da32359c77205920eaee98d3e232 Author: Yuanjian Li AuthorDate: Sat Aug 22 21:32:23 2020 +0900 [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations ### What changes were proposed in this pull request? Rephrase the description for some operations to make it clearer. ### Why are the changes needed? Add more detail in the document. ### Does this PR introduce _any_ user-facing change? No, document only. ### How was this patch tested? Document only. Closes #29269 from xuanyuanking/SPARK-31792-follow. Authored-by: Yuanjian Li Signed-off-by: Jungtaek Lim (HeartSaVioR) (cherry picked from commit 8b26c69ce7f905a3c7bbabb1c47ee6a51a23) Signed-off-by: Jungtaek Lim (HeartSaVioR) --- docs/web-ui.md | 10 +- .../spark/sql/execution/streaming/MicroBatchExecution.scala| 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/docs/web-ui.md b/docs/web-ui.md index 134a8c8..fe26043 100644 --- a/docs/web-ui.md +++ b/docs/web-ui.md @@ -425,11 +425,11 @@ queries. Currently, it contains the following metrics. * **Batch Duration.** The process duration of each batch. * **Operation Duration.** The amount of time taken to perform various operations in milliseconds. The tracked operations are listed as follows. -* addBatch: Adds result data of the current batch to the sink. -* getBatch: Gets a new batch of data to process. -* latestOffset: Gets the latest offsets for sources. -* queryPlanning: Generates the execution plan. -* walCommit: Writes the offsets to the metadata log. 
+* addBatch: Time taken to read the micro-batch's input data from the sources, process it, and write the batch's output to the sink. This should take the bulk of the micro-batch's time. +* getBatch: Time taken to prepare the logical query to read the input of the current micro-batch from the sources. +* latestOffset & getOffset: Time taken to query the maximum available offset for this source. +* queryPlanning: Time taken to generates the execution plan. +* walCommit: Time taken to write the offsets to the metadata log. As an early-release version, the statistics page is still under development and will be improved in future releases. diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala index e022bfb..e0731db 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala @@ -566,8 +566,7 @@ class MicroBatchExecution( val nextBatch = new Dataset(lastExecution, RowEncoder(lastExecution.analyzed.schema)) -val batchSinkProgress: Option[StreamWriterCommitProgress] = - reportTimeTaken("addBatch") { +val batchSinkProgress: Option[StreamWriterCommitProgress] = reportTimeTaken("addBatch") { SQLExecution.withNewExecutionId(lastExecution) { sink match { case s: Sink => s.addBatch(currentBatchId, nextBatch) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (12f4331 -> 8b26c69)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 12f4331 [SPARK-32672][SQL] Fix data corruption in boolean bit set compression add 8b26c69 [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations No new revisions were added by this update. Summary of changes: docs/web-ui.md | 10 +- .../spark/sql/execution/streaming/MicroBatchExecution.scala| 3 +-- 2 files changed, 6 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (ae82768 -> 813532d)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ae82768 [SPARK-32421][SQL] Add code-gen for shuffled hash join add 813532d [SPARK-32468][SS][TESTS] Fix timeout config issue in Kafka connector tests No new revisions were added by this update. Summary of changes: .../apache/spark/sql/kafka010/KafkaContinuousSourceSuite.scala| 2 +- .../apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala | 2 +- .../apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala| 8 3 files changed, 6 insertions(+), 6 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (9d7b1d9 -> f602782)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9d7b1d9 [SPARK-32175][SPARK-32175][FOLLOWUP] Remove flaky test added in add f602782 [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests No new revisions were added by this update. Summary of changes: .../apache/spark/sql/kafka010/KafkaTestUtils.scala | 47 +++--- 1 file changed, 14 insertions(+), 33 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (39181ff -> 7b9d755)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 39181ff [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable add 7b9d755 [SPARK-32350][CORE] Add batch-write on LevelDB to improve performance of HybridStore No new revisions were added by this update. Summary of changes: .../org/apache/spark/util/kvstore/LevelDB.java | 73 ++ .../apache/spark/deploy/history/HybridStore.scala | 9 +-- 2 files changed, 66 insertions(+), 16 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
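SPARK-32350 speeds up HybridStore by grouping LevelDB puts into a single batch write rather than committing each record individually. A generic sketch of that batching pattern — a toy in-memory store for illustration only, not Spark's `LevelDB.java` or the real LevelDB WriteBatch API:

```python
class BatchedStore:
    """Toy key-value store illustrating write batching: instead of paying
    the commit cost once per put(), writes accumulate and are committed
    together, amortizing per-commit overhead. Illustrative only."""

    def __init__(self):
        self.data = {}
        self.commits = 0  # how many times the commit cost was paid

    def put(self, key, value):
        # Unbatched path: every put is its own commit.
        self.data[key] = value
        self.commits += 1

    def write_batch(self, pairs):
        # Batched path: N puts, one commit.
        self.data.update(pairs)
        self.commits += 1

unbatched = BatchedStore()
for i in range(100):
    unbatched.put(f"k{i}", i)      # 100 commits

batched = BatchedStore()
batched.write_batch({f"k{i}": i for i in range(100)})  # 1 commit

print(unbatched.commits, batched.commits)
```

The stored data ends up identical either way; only the number of commits (and hence the per-write overhead) differs, which is the performance win the change targets.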
[spark] branch master updated (fb51925 -> 9747e8f)
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from fb51925 [SPARK-32335][K8S][TESTS] Remove Python2 test from K8s IT add 9747e8f [SPARK-31831][SQL][TESTS][FOLLOWUP] Put mocks for HiveSessionImplSuite in hive version related subdirectories No new revisions were added by this update. Summary of changes: sql/hive-thriftserver/pom.xml | 12 ++ .../hive/thriftserver/HiveSessionImplSuite.scala | 29 + .../thriftserver/GetCatalogsOperationMock.scala| 50 ++ .../thriftserver/GetCatalogsOperationMock.scala| 50 ++ 4 files changed, 113 insertions(+), 28 deletions(-) create mode 100644 sql/hive-thriftserver/v1.2/src/test/scala/ org/apache/spark/sql/hive/thriftserver/GetCatalogsOperationMock.scala create mode 100644 sql/hive-thriftserver/v2.3/src/test/scala/ org/apache/spark/sql/hive/thriftserver/GetCatalogsOperationMock.scala - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org