[spark] branch master updated (ebe7bb6 -> aae7310)

2021-08-30 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


 from ebe7bb6  [SPARK-36614][CORE][UI] Correct executor loss reason caused by decommission in UI
  add aae7310  [SPARK-36611] Remove unused listener in HiveThriftServer2AppStatusStore

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala| 2 +-
 .../sql/hive/thriftserver/ui/HiveThriftServer2AppStatusStore.scala| 4 +---
 .../sql/hive/thriftserver/ui/HiveThriftServer2ListenerSuite.scala | 2 +-
 .../apache/spark/sql/hive/thriftserver/ui/ThriftServerPageSuite.scala | 2 +-
 4 files changed, 4 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch 3.0.1 created (now 3fdfce3)

2021-08-30 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch 3.0.1
in repository https://gitbox.apache.org/repos/asf/spark.git.


  at 3fdfce3  Preparing Spark release v3.0.0-rc3

No new revisions were added by this update.

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-36614][CORE][UI] Correct executor loss reason caused by decommission in UI

2021-08-30 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 7b64a75  [SPARK-36614][CORE][UI] Correct executor loss reason caused by decommission in UI
7b64a75 is described below

commit 7b64a751db38f76efe698326c2fad288ca39918c
Author: yi.wu 
AuthorDate: Mon Aug 30 09:09:22 2021 -0700

[SPARK-36614][CORE][UI] Correct executor loss reason caused by decommission in UI

### What changes were proposed in this pull request?

Post the correct executor loss reason to the UI.
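
For illustration, here is a minimal, self-contained sketch of the pattern applied in the diff below. The types and names are hypothetical placeholders, not Spark's internals: the point is that the decommission-aware loss reason is resolved once and that same value feeds both the scheduler callback and the event posted for the UI, instead of the raw RPC-level reason.

```scala
// Hypothetical stand-ins for Spark's internal types; only the shape of the fix matters here.
sealed trait LossReason
case object ExecutorDecommissioned extends LossReason {
  override def toString: String = "Executor decommission: shutting down gracefully"
}
final case class ExecutorFailed(message: String) extends LossReason {
  override def toString: String = message
}

final case class ExecutorRemovedEvent(time: Long, executorId: String, reason: String)

def removeExecutor(executorId: String, rawReason: String, decommissioning: Boolean): ExecutorRemovedEvent = {
  // Resolve the loss reason once, taking decommissioning into account.
  val lossReason: LossReason =
    if (decommissioning) ExecutorDecommissioned else ExecutorFailed(rawReason)
  // Before the fix the posted event carried rawReason; now it carries lossReason.toString,
  // so the UI and any listeners see the same, accurate reason.
  ExecutorRemovedEvent(System.currentTimeMillis(), executorId, lossReason.toString)
}
```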

### Why are the changes needed?

To show the accurate loss reason.

### Does this PR introduce _any_ user-facing change?

Yes. Users can see the difference from the UI.

Before:
https://user-images.githubusercontent.com/16397174/131341692-6f412607-87b8-405e-822d-0d28f07928da.png
https://user-images.githubusercontent.com/16397174/131341699-f2c9de09-635f-49df-8e27-2495f34276c0.png

After:

https://user-images.githubusercontent.com/16397174/131341754-e3c93b5d-5252-4006-a4cc-94d76f41303b.png
https://user-images.githubusercontent.com/16397174/131341761-e1e0644f-1e76-49c0-915a-26aad77ec272.png

### How was this patch tested?

Manually tested.

Closes #33868 from Ngone51/fix-executor-remove-reason.

Authored-by: yi.wu 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit ebe7bb62176ac3c29b0c238e411a0dc989371c33)
Signed-off-by: Dongjoon Hyun 
---
 .../spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala   | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
index ccb5eb1..548cab9 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
@@ -409,8 +409,8 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
           totalCoreCount.addAndGet(-executorInfo.totalCores)
           totalRegisteredExecutors.addAndGet(-1)
           scheduler.executorLost(executorId, lossReason)
-          listenerBus.post(
-            SparkListenerExecutorRemoved(System.currentTimeMillis(), executorId, reason.toString))
+          listenerBus.post(SparkListenerExecutorRemoved(
+            System.currentTimeMillis(), executorId, lossReason.toString))
         case None =>
           // SPARK-15262: If an executor is still alive even after the scheduler has removed
           // its metadata, we may receive a heartbeat from that executor and tell its block

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.2 updated: [SPARK-36614][CORE][UI] Correct executor loss reason caused by decommission in UI

2021-08-30 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 8be53c3  [SPARK-36614][CORE][UI] Correct executor loss reason caused by decommission in UI
8be53c3 is described below

commit 8be53c3298c41c749663bc1dbbd5b2d65246c854
Author: yi.wu 
AuthorDate: Mon Aug 30 09:09:22 2021 -0700

[SPARK-36614][CORE][UI] Correct executor loss reason caused by decommission in UI

### What changes were proposed in this pull request?

Post the correct executor loss reason to the UI.

### Why are the changes needed?

To show the accurate loss reason.

### Does this PR introduce _any_ user-facing change?

Yes. Users can see the difference from the UI.

Before:
https://user-images.githubusercontent.com/16397174/131341692-6f412607-87b8-405e-822d-0d28f07928da.png
https://user-images.githubusercontent.com/16397174/131341699-f2c9de09-635f-49df-8e27-2495f34276c0.png

After:

https://user-images.githubusercontent.com/16397174/131341754-e3c93b5d-5252-4006-a4cc-94d76f41303b.png
https://user-images.githubusercontent.com/16397174/131341761-e1e0644f-1e76-49c0-915a-26aad77ec272.png
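
Beyond the UI screenshots above, the corrected reason string is also what any registered SparkListener observes for the executor-removed event. A small consumer sketch (the application name is a placeholder, and this listener is not part of the change itself):

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorRemoved}
import org.apache.spark.sql.SparkSession

object ExecutorRemovalLogger {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("executor-removal-logger").getOrCreate()

    // Print the reason carried by each executor-removed event; with this fix the
    // string reflects the decommission-aware loss reason shown in the UI.
    spark.sparkContext.addSparkListener(new SparkListener {
      override def onExecutorRemoved(event: SparkListenerExecutorRemoved): Unit =
        println(s"Executor ${event.executorId} removed at ${event.time}: ${event.reason}")
    })

    // ... run the actual workload here ...
    spark.stop()
  }
}
```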

### How was this patch tested?

Manually tested.

Closes #33868 from Ngone51/fix-executor-remove-reason.

Authored-by: yi.wu 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit ebe7bb62176ac3c29b0c238e411a0dc989371c33)
Signed-off-by: Dongjoon Hyun 
---
 .../spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala   | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
index 3c121c5..fd68dfa 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
@@ -428,8 +428,8 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
           totalCoreCount.addAndGet(-executorInfo.totalCores)
           totalRegisteredExecutors.addAndGet(-1)
           scheduler.executorLost(executorId, lossReason)
-          listenerBus.post(
-            SparkListenerExecutorRemoved(System.currentTimeMillis(), executorId, reason.toString))
+          listenerBus.post(SparkListenerExecutorRemoved(
+            System.currentTimeMillis(), executorId, lossReason.toString))
         case None =>
           // SPARK-15262: If an executor is still alive even after the scheduler has removed
           // its metadata, we may receive a heartbeat from that executor and tell its block

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (fcc91cf -> ebe7bb6)

2021-08-30 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


 from fcc91cf  [SPARK-36574][SQL] pushDownPredicate=false should prevent push down filters to JDBC data source
  add ebe7bb6  [SPARK-36614][CORE][UI] Correct executor loss reason caused by decommission in UI

No new revisions were added by this update.

Summary of changes:
 .../spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala   | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.2 updated: [SPARK-36574][SQL] pushDownPredicate=false should prevent push down filters to JDBC data source

2021-08-30 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new d42536a  [SPARK-36574][SQL] pushDownPredicate=false should prevent push down filters to JDBC data source
d42536a is described below

commit d42536a6eedceba5d6572968a6dd0946cfaaffca
Author: gengjiaan 
AuthorDate: Mon Aug 30 19:09:28 2021 +0800

[SPARK-36574][SQL] pushDownPredicate=false should prevent push down filters to JDBC data source

### What changes were proposed in this pull request?
Spark SQL includes a data source that can read data from other databases using JDBC.
Spark also supports the case-insensitive option `pushDownPredicate`.
According to http://spark.apache.org/docs/latest/sql-data-sources-jdbc.html, if `pushDownPredicate` is set to false, no filter will be pushed down to the JDBC data source, and thus all filters will be handled by Spark.
However, filters are still pushed down to the JDBC data source.
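
To make the option concrete, a small usage sketch (the connection URL and table name are placeholders): with `pushDownPredicate` set to false, the filter below should be evaluated by Spark rather than compiled into the WHERE clause sent over JDBC, which is the behavior this patch restores.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().master("local[*]").appName("jdbc-pushdown-check").getOrCreate()

val orders = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://db-host:5432/shop")  // placeholder connection
  .option("dbtable", "public.orders")                    // placeholder table
  .option("pushDownPredicate", "false")                  // keep filtering on the Spark side
  .load()
  .filter(col("amount") > 100)

// Before the fix, the physical plan could still list the predicate under PushedFilters.
orders.explain()
```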

### Why are the changes needed?
Fix the bug where `pushDownPredicate=false` failed to prevent pushing down filters to the JDBC data source.

### Does this PR introduce _any_ user-facing change?
'No'.
The output of the query will not change.

### How was this patch tested?
Jenkins test.

Closes #33822 from beliefer/SPARK-36574.

Authored-by: gengjiaan 
Signed-off-by: Wenchen Fan 
(cherry picked from commit fcc91cfec4d939eeebfa8cd88f2791aca48645c6)
Signed-off-by: Wenchen Fan 
---
 .../sql/execution/datasources/jdbc/JDBCRDD.scala   | 14 -
 .../execution/datasources/jdbc/JDBCRelation.scala  |  8 -
 .../org/apache/spark/sql/jdbc/JDBCSuite.scala  | 34 ++
 3 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
index f26897d..e024e4b 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
@@ -172,12 +172,12 @@ object JDBCRDD extends Logging {
    *
    * @param sc - Your SparkContext.
    * @param schema - The Catalyst schema of the underlying database table.
-   * @param requiredColumns - The names of the columns to SELECT.
+   * @param requiredColumns - The names of the columns or aggregate columns to SELECT.
    * @param filters - The filters to include in all WHERE clauses.
    * @param parts - An array of JDBCPartitions specifying partition ids and
    *                per-partition WHERE clauses.
    * @param options - JDBC options that contains url, table and other information.
-   * @param outputSchema - The schema of the columns to SELECT.
+   * @param outputSchema - The schema of the columns or aggregate columns to SELECT.
    * @param groupByColumns - The pushed down group by columns.
    *
    * @return An RDD representing "SELECT requiredColumns FROM fqTable".
@@ -213,8 +213,8 @@ object JDBCRDD extends Logging {
 }
 
 /**
- * An RDD representing a table in a database accessed via JDBC.  Both the
- * driver code and the workers must be able to access the database; the driver
+ * An RDD representing a query is related to a table in a database accessed via JDBC.
+ * Both the driver code and the workers must be able to access the database; the driver
  * needs to fetch the schema while the workers need to fetch the data.
  */
 private[jdbc] class JDBCRDD(
@@ -237,11 +237,7 @@ private[jdbc] class JDBCRDD(
   /**
    * `columns`, but as a String suitable for injection into a SQL query.
    */
-  private val columnList: String = {
-    val sb = new StringBuilder()
-    columns.foreach(x => sb.append(",").append(x))
-    if (sb.isEmpty) "1" else sb.substring(1)
-  }
+  private val columnList: String = if (columns.isEmpty) "1" else columns.mkString(",")
 
   /**
    * `filters`, but as a WHERE clause suitable for injection into a SQL query.
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
index 60d88b6..8098fa0 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
@@ -278,12 +278,18 @@ private[sql] case class JDBCRelation(
   }
 
   override def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // When pushDownPredicate is false, all Filters that need to be pushed down should be ignored
+    val pushedFilters = if (jdbcOptions.pushDownPredicate) {
+      filters
+    } else {
+      Array.empty

[spark] branch master updated: [SPARK-36574][SQL] pushDownPredicate=false should prevent push down filters to JDBC data source

2021-08-30 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new fcc91cf  [SPARK-36574][SQL] pushDownPredicate=false should prevent push down filters to JDBC data source
fcc91cf is described below

commit fcc91cfec4d939eeebfa8cd88f2791aca48645c6
Author: gengjiaan 
AuthorDate: Mon Aug 30 19:09:28 2021 +0800

[SPARK-36574][SQL] pushDownPredicate=false should prevent push down filters to JDBC data source

### What changes were proposed in this pull request?
Spark SQL includes a data source that can read data from other databases using JDBC.
Spark also supports the case-insensitive option `pushDownPredicate`.
According to http://spark.apache.org/docs/latest/sql-data-sources-jdbc.html, if `pushDownPredicate` is set to false, no filter will be pushed down to the JDBC data source, and thus all filters will be handled by Spark.
However, filters are still pushed down to the JDBC data source.
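
The core of the fix in `buildScan` is a simple gate on the option. A standalone sketch of that gating (the helper name is illustrative, not Spark's internal API), using Spark's public `Filter` classes:

```scala
import org.apache.spark.sql.sources.{Filter, GreaterThan}

// Only offer filters for push-down when the JDBC option enables it;
// otherwise Spark evaluates every filter itself.
def filtersToPushDown(pushDownPredicate: Boolean, filters: Array[Filter]): Array[Filter] =
  if (pushDownPredicate) filters else Array.empty[Filter]

val candidates: Array[Filter] = Array(GreaterThan("amount", 100))
assert(filtersToPushDown(pushDownPredicate = true, candidates).length == 1)
assert(filtersToPushDown(pushDownPredicate = false, candidates).isEmpty)
```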

### Why are the changes needed?
Fix the bug where `pushDownPredicate=false` failed to prevent pushing down filters to the JDBC data source.

### Does this PR introduce _any_ user-facing change?
'No'.
The output of the query will not change.

### How was this patch tested?
Jenkins test.

Closes #33822 from beliefer/SPARK-36574.

Authored-by: gengjiaan 
Signed-off-by: Wenchen Fan 
---
 .../sql/execution/datasources/jdbc/JDBCRDD.scala   | 14 -
 .../execution/datasources/jdbc/JDBCRelation.scala  |  8 -
 .../org/apache/spark/sql/jdbc/JDBCSuite.scala  | 34 ++
 3 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
index f26897d..e024e4b 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
@@ -172,12 +172,12 @@ object JDBCRDD extends Logging {
    *
    * @param sc - Your SparkContext.
    * @param schema - The Catalyst schema of the underlying database table.
-   * @param requiredColumns - The names of the columns to SELECT.
+   * @param requiredColumns - The names of the columns or aggregate columns to SELECT.
    * @param filters - The filters to include in all WHERE clauses.
    * @param parts - An array of JDBCPartitions specifying partition ids and
    *                per-partition WHERE clauses.
    * @param options - JDBC options that contains url, table and other information.
-   * @param outputSchema - The schema of the columns to SELECT.
+   * @param outputSchema - The schema of the columns or aggregate columns to SELECT.
    * @param groupByColumns - The pushed down group by columns.
    *
    * @return An RDD representing "SELECT requiredColumns FROM fqTable".
@@ -213,8 +213,8 @@ object JDBCRDD extends Logging {
 }
 
 /**
- * An RDD representing a table in a database accessed via JDBC.  Both the
- * driver code and the workers must be able to access the database; the driver
+ * An RDD representing a query is related to a table in a database accessed via JDBC.
+ * Both the driver code and the workers must be able to access the database; the driver
  * needs to fetch the schema while the workers need to fetch the data.
  */
 private[jdbc] class JDBCRDD(
@@ -237,11 +237,7 @@ private[jdbc] class JDBCRDD(
   /**
    * `columns`, but as a String suitable for injection into a SQL query.
    */
-  private val columnList: String = {
-    val sb = new StringBuilder()
-    columns.foreach(x => sb.append(",").append(x))
-    if (sb.isEmpty) "1" else sb.substring(1)
-  }
+  private val columnList: String = if (columns.isEmpty) "1" else columns.mkString(",")
 
   /**
    * `filters`, but as a WHERE clause suitable for injection into a SQL query.
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
index 60d88b6..8098fa0 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
@@ -278,12 +278,18 @@ private[sql] case class JDBCRelation(
   }
 
   override def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    // When pushDownPredicate is false, all Filters that need to be pushed down should be ignored
+    val pushedFilters = if (jdbcOptions.pushDownPredicate) {
+      filters
+    } else {
+      Array.empty[Filter]
+    }
     // Rely on a type erasure hack to pass RDD[InternalRow] back as RDD[Row]
     JDBCRDD.scanTa