[spark] branch master updated (529b875 -> 132cbf0)

2021-04-28 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 529b875  [SPARK-35226][SQL] Support refreshKrb5Config option in JDBC 
datasources
 add 132cbf0  [SPARK-35105][SQL] Support multiple paths for ADD 
FILE/JAR/ARCHIVE commands

No new revisions were added by this update.

Summary of changes:
 docs/sql-migration-guide.md|   2 +
 ...sql-ref-syntax-aux-resource-mgmt-add-archive.md |   3 +-
 docs/sql-ref-syntax-aux-resource-mgmt-add-file.md  |   3 +-
 docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md   |   3 +-
 .../spark/sql/execution/SparkSqlParser.scala   |  11 +--
 .../spark/sql/execution/command/resources.scala|  12 +--
 .../spark/sql/execution/SparkSqlParserSuite.scala  |  19 ++--
 .../sql/hive/thriftserver/SparkSQLCLIDriver.scala  |   4 +-
 .../spark/sql/hive/client/HiveClientImpl.scala |   4 +-
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 104 -
 10 files changed, 135 insertions(+), 30 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-35226][SQL] Support refreshKrb5Config option in JDBC datasources

2021-04-28 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new c245d84  [SPARK-35226][SQL] Support refreshKrb5Config option in JDBC 
datasources
c245d84 is described below

commit c245d84066785be60ea0fcdbf69f2e09361b5cf4
Author: Kousuke Saruta 
AuthorDate: Thu Apr 29 13:55:53 2021 +0900

[SPARK-35226][SQL] Support refreshKrb5Config option in JDBC datasources

### What changes were proposed in this pull request?

This PR proposes to introduce a new JDBC option, `refreshKrb5Config`, which allows changes to `krb5.conf` to be reflected.

### Why are the changes needed?

In the current master, JDBC datasources can't accept `refreshKrb5Config`, which is defined in `Krb5LoginModule`.
So even if we change `krb5.conf` after establishing a connection, the change will not be reflected.

A similar issue happens when we run multiple `*KrbIntegrationSuite`s at the same time.
`MiniKDC` starts and stops for every KerberosIntegrationSuite, and a different port number is recorded in `krb5.conf` each time.
Because `SecureConnectionProvider.JDBCConfiguration` doesn't take `refreshKrb5Config`, every KerberosIntegrationSuite except the first one to run sees the wrong port, so those suites fail.
You can easily confirm this with the following command.
```
build/sbt -Phive -Phive-thriftserver -Pdocker-integration-tests "testOnly org.apache.spark.sql.jdbc.*KrbIntegrationSuite"
```
### Does this PR introduce _any_ user-facing change?

Yes. Users can set `refreshKrb5Config` to refresh the krb5-related configuration.
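Under the hood, this option flows into the JAAS configuration built for `Krb5LoginModule`. A minimal standalone sketch of such an entry (not Spark's actual `SecureConnectionProvider.JDBCConfiguration`; the keytab path and principal below are placeholders):

```java
import javax.security.auth.login.AppConfigurationEntry;
import java.util.Map;

public class Krb5JaasSketch {
    public static void main(String[] args) {
        // JAAS options for Krb5LoginModule, now also carrying refreshKrb5Config
        // so the JVM re-reads krb5.conf before each new login.
        Map<String, String> options = Map.of(
                "useKeyTab", "true",
                "keyTab", "/etc/security/client.keytab",   // placeholder path
                "principal", "client@EXAMPLE.COM",         // placeholder principal
                "refreshKrb5Config", "true");              // the new pass-through option
        AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                options);
        System.out.println(entry.getOptions().get("refreshKrb5Config")); // true
    }
}
```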

### How was this patch tested?

New test.

Closes #32344 from sarutak/kerberos-refresh-issue.

Authored-by: Kousuke Saruta 
Signed-off-by: Kousuke Saruta 
(cherry picked from commit 529b875901a91a03caeb73d9eb7b3008b552c736)
Signed-off-by: Kousuke Saruta 
---
 docs/sql-data-sources-jdbc.md  | 19 
 .../spark/sql/jdbc/DB2KrbIntegrationSuite.scala|  2 +-
 .../sql/jdbc/DockerKrbJDBCIntegrationSuite.scala   | 50 ++
 .../sql/jdbc/MariaDBKrbIntegrationSuite.scala  |  2 +-
 .../sql/jdbc/PostgresKrbIntegrationSuite.scala |  2 +-
 .../execution/datasources/jdbc/JDBCOptions.scala   |  3 ++
 .../jdbc/connection/SecureConnectionProvider.scala |  9 ++--
 7 files changed, 81 insertions(+), 6 deletions(-)

diff --git a/docs/sql-data-sources-jdbc.md b/docs/sql-data-sources-jdbc.md
index 7d60915..89a025c 100644
--- a/docs/sql-data-sources-jdbc.md
+++ b/docs/sql-data-sources-jdbc.md
@@ -211,6 +211,25 @@ the following case-insensitive options:
  Specifies kerberos principal name for the JDBC client. If both 
keytab and principal are defined then Spark tries to 
do kerberos authentication.
 
   
+
+  
+refreshKrb5Config
+
+  This option controls whether the kerberos configuration is to be 
refreshed or not for the JDBC client before
+  establishing a new connection. Set to true if you want to refresh the 
configuration, otherwise set to false.
+  The default value is false. Note that if you set this option to true and 
try to establish multiple connections,
+  a race condition can occur. One possible situation is as follows.
+  
+refreshKrb5Config flag is set with security context 1
+A JDBC connection provider is used for the corresponding DBMS
+The krb5.conf is modified but the JVM has not yet realized that it must be reloaded
+Spark authenticates successfully for security context 1
+The JVM loads security context 2 from the modified krb5.conf
+Spark restores the previously saved security context 1
+The modified krb5.conf content is just gone
+  
+
+
 
 
 
diff --git 
a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DB2KrbIntegrationSuite.scala
 
b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DB2KrbIntegrationSuite.scala
index 5cbe6fa..f79809f 100644
--- 
a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DB2KrbIntegrationSuite.scala
+++ 
b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DB2KrbIntegrationSuite.scala
@@ -81,7 +81,7 @@ class DB2KrbIntegrationSuite extends 
DockerKrbJDBCIntegrationSuite {
 
   override protected def setAuthentication(keytabFile: String, principal: 
String): Unit = {
 val config = new SecureConnectionProvider.JDBCConfiguration(
-  Configuration.getConfiguration, "JaasClient", keytabFile, principal)
+  Configuration.getConfiguration, "JaasClient", keytabFile, principal, 
true)
 Configuration.setConfiguration(config)
   }
 
diff --git 

[spark] branch master updated (7713565 -> 529b875)

2021-04-28 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 7713565  [SPARK-34786][SQL][FOLLOWUP] Explicitly declare 
DecimalType(20, 0) for Parquet UINT_64
 add 529b875  [SPARK-35226][SQL] Support refreshKrb5Config option in JDBC 
datasources

No new revisions were added by this update.

Summary of changes:
 docs/sql-data-sources-jdbc.md  | 19 
 .../spark/sql/jdbc/DB2KrbIntegrationSuite.scala|  2 +-
 .../sql/jdbc/DockerKrbJDBCIntegrationSuite.scala   | 50 ++
 .../sql/jdbc/MariaDBKrbIntegrationSuite.scala  |  2 +-
 .../sql/jdbc/PostgresKrbIntegrationSuite.scala |  2 +-
 .../execution/datasources/jdbc/JDBCOptions.scala   |  3 ++
 .../jdbc/connection/SecureConnectionProvider.scala |  9 ++--
 7 files changed, 81 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-34786][SQL][FOLLOWUP] Explicitly declare DecimalType(20, 0) for Parquet UINT_64

2021-04-28 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7713565  [SPARK-34786][SQL][FOLLOWUP] Explicitly declare 
DecimalType(20, 0) for Parquet UINT_64
7713565 is described below

commit 771356555c1110b898ff09ea23fe0b00749caefd
Author: Kent Yao 
AuthorDate: Thu Apr 29 04:51:27 2021 +

[SPARK-34786][SQL][FOLLOWUP] Explicitly declare DecimalType(20, 0) for 
Parquet UINT_64

### What changes were proposed in this pull request?

Explicitly declare DecimalType(20, 0) for Parquet UINT_64, avoiding the use of DecimalType.LongDecimal, which only happens to have a precision of 20.

https://github.com/apache/spark/pull/31960#discussion_r622691560
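As a quick sanity check of the chosen precision (a standalone sketch, not Spark code), the comment added in the diff can be verified directly:

```java
public class Uint64Precision {
    public static void main(String[] args) {
        // 2^64 - 1 = 18446744073709551615 is the largest unsigned 64-bit value.
        // Its decimal representation has exactly 20 digits, which is why
        // DecimalType(20, 0) can hold any Parquet UINT_64 value.
        String maxUnsigned = Long.toUnsignedString(-1L);
        System.out.println(maxUnsigned);          // 18446744073709551615
        System.out.println(maxUnsigned.length()); // 20
    }
}
```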

### Why are the changes needed?

fix ambiguity

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

Not needed; passing the existing CI is sufficient.

Closes #32390 from yaooqinn/SPARK-34786-F.

Authored-by: Kent Yao 
Signed-off-by: Wenchen Fan 
---
 .../sql/execution/datasources/parquet/ParquetSchemaConverter.scala| 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala
index 8c4e088..e751c97 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala
@@ -141,7 +141,9 @@ class ParquetToSparkSchemaConverter(
 originalType match {
   case INT_64 | null => LongType
   case DECIMAL => makeDecimalType(Decimal.MAX_LONG_DIGITS)
-  case UINT_64 => DecimalType.LongDecimal
+  // The precision to hold the largest unsigned long is:
+  // `java.lang.Long.toUnsignedString(-1).length` = 20
+  case UINT_64 => DecimalType(20, 0)
   case TIMESTAMP_MICROS => TimestampType
   case TIMESTAMP_MILLIS => TimestampType
   case _ => illegalType()

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-35135][CORE] Turn the `WritablePartitionedIterator` from a trait into a default implementation class

2021-04-28 Thread wuyi
This is an automated email from the ASF dual-hosted git repository.

wuyi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 74b9326  [SPARK-35135][CORE] Turn the `WritablePartitionedIterator` 
from a trait into a default implementation class
74b9326 is described below

commit 74b93261af29e76b9da31b1c9f20900a818d97e6
Author: yangjie01 
AuthorDate: Thu Apr 29 11:46:24 2021 +0800

[SPARK-35135][CORE] Turn the `WritablePartitionedIterator` from a trait 
into a default implementation class

### What changes were proposed in this pull request?
`WritablePartitionedIterator` is defined in `WritablePartitionedPairCollection.scala`, and there are two implementations of this trait, but the code of the two implementations is duplicated.

The main change of this PR is to turn `WritablePartitionedIterator` from a trait into a default implementation class, because there is only one implementation now.
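A hypothetical Java rendering of the resulting shape (Spark's actual class is Scala; its `((partitionId, key), value)` tuple is stood in for by a record, and `PairsWriter` is replaced by a plain `BiConsumer` for self-containment):

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.BiConsumer;

public class WritablePartitionedIteratorSketch {
    // Stand-in for Spark's ((Int, K), V) tuple.
    record PartitionedPair<K, V>(int partition, K key, V value) {}

    // The single concrete class that replaces the duplicated anonymous trait
    // implementations: it wraps the upstream iterator and tracks the current element.
    static class WritablePartitionedIterator<K, V> {
        private final Iterator<PartitionedPair<K, V>> upstream;
        private PartitionedPair<K, V> cur;

        WritablePartitionedIterator(Iterator<PartitionedPair<K, V>> upstream) {
            this.upstream = upstream;
            this.cur = upstream.hasNext() ? upstream.next() : null;
        }

        boolean hasNext() { return cur != null; }

        int nextPartition() { return cur.partition(); }

        void writeNext(BiConsumer<K, V> writer) {
            writer.accept(cur.key(), cur.value());
            cur = upstream.hasNext() ? upstream.next() : null;
        }
    }

    public static void main(String[] args) {
        var it = new WritablePartitionedIterator<>(List.of(
                new PartitionedPair<>(0, "a", 1),
                new PartitionedPair<>(1, "b", 2)).iterator());
        StringBuilder out = new StringBuilder();
        while (it.hasNext()) {
            out.append(it.nextPartition()).append(':');
            it.writeNext((k, v) -> out.append(k).append('=').append(v).append(';'));
        }
        System.out.println(out); // 0:a=1;1:b=2;
    }
}
```

Every call site then just constructs this class with its upstream iterator, which is what the `new WritablePartitionedIterator[K, C](upstream)` line in the diff does.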

### Why are the changes needed?
Cleanup duplicate code.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass the Jenkins or GitHub Actions checks.

Closes #32232 from LuciferYang/writable-partitioned-iterator.

Authored-by: yangjie01 
Signed-off-by: yi.wu 
---
 .../spark/util/collection/ExternalSorter.scala | 17 +++--
 .../WritablePartitionedPairCollection.scala| 28 +-
 project/MimaExcludes.scala |  5 +++-
 3 files changed, 18 insertions(+), 32 deletions(-)

diff --git 
a/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala 
b/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala
index 66bc3e5..1913637 100644
--- a/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala
+++ b/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala
@@ -263,7 +263,7 @@ private[spark] class ExternalSorter[K, V, C](
   /**
* Spill contents of in-memory iterator to a temporary file on disk.
*/
-  private[this] def spillMemoryIteratorToDisk(inMemoryIterator: 
WritablePartitionedIterator)
+  private[this] def spillMemoryIteratorToDisk(inMemoryIterator: 
WritablePartitionedIterator[K, C])
   : SpilledFile = {
 // Because these files may be read during shuffle, their compression must 
be controlled by
 // spark.shuffle.compress instead of spark.shuffle.spill.compress, so we 
need to use
@@ -750,7 +750,7 @@ private[spark] class ExternalSorter[K, V, C](
   // Case where we only have in-memory data
   val collection = if (aggregator.isDefined) map else buffer
   val it = 
collection.destructiveSortedWritablePartitionedIterator(comparator)
-  while (it.hasNext()) {
+  while (it.hasNext) {
 val partitionId = it.nextPartition()
 var partitionWriter: ShufflePartitionWriter = null
 var partitionPairsWriter: ShufflePartitionPairsWriter = null
@@ -866,18 +866,7 @@ private[spark] class ExternalSorter[K, V, C](
   if (hasSpilled) {
 false
   } else {
-val inMemoryIterator = new WritablePartitionedIterator {
-  private[this] var cur = if (upstream.hasNext) upstream.next() else 
null
-
-  def writeNext(writer: PairsWriter): Unit = {
-writer.write(cur._1._2, cur._2)
-cur = if (upstream.hasNext) upstream.next() else null
-  }
-
-  def hasNext(): Boolean = cur != null
-
-  def nextPartition(): Int = cur._1._1
-}
+val inMemoryIterator = new WritablePartitionedIterator[K, C](upstream)
 logInfo(s"Task ${TaskContext.get().taskAttemptId} force spilling 
in-memory map to disk " +
   s"and it will release 
${org.apache.spark.util.Utils.bytesToString(getUsed())} memory")
 val spillFile = spillMemoryIteratorToDisk(inMemoryIterator)
diff --git 
a/core/src/main/scala/org/apache/spark/util/collection/WritablePartitionedPairCollection.scala
 
b/core/src/main/scala/org/apache/spark/util/collection/WritablePartitionedPairCollection.scala
index 9624b02..3472a08 100644
--- 
a/core/src/main/scala/org/apache/spark/util/collection/WritablePartitionedPairCollection.scala
+++ 
b/core/src/main/scala/org/apache/spark/util/collection/WritablePartitionedPairCollection.scala
@@ -46,20 +46,9 @@ private[spark] trait WritablePartitionedPairCollection[K, V] 
{
* This may destroy the underlying collection.
*/
   def destructiveSortedWritablePartitionedIterator(keyComparator: 
Option[Comparator[K]])
-: WritablePartitionedIterator = {
+: WritablePartitionedIterator[K, V] = {
 val it = partitionedDestructiveSortedIterator(keyComparator)
-new WritablePartitionedIterator {
-  private[this] var cur = if (it.hasNext) it.next() else null
-
-  def writeNext(writer: PairsWriter): Unit = {
-

[GitHub] [spark-website] cloud-fan commented on pull request #333: Announce the new repository service for spark-packages

2021-04-28 Thread GitBox


cloud-fan commented on pull request #333:
URL: https://github.com/apache/spark-website/pull/333#issuecomment-828909689


   cc @srowen @dongjoon-hyun 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] xuanyuanking commented on a change in pull request #333: Announce the new repository service for spark-packages

2021-04-28 Thread GitBox


xuanyuanking commented on a change in pull request #333:
URL: https://github.com/apache/spark-website/pull/333#discussion_r622700931



##
File path: news/_posts/2021-04-28-new-repository-service.md
##
@@ -0,0 +1,18 @@
+---
+layout: post
+title: New repository service for spark-packages
+categories:
+- News
+tags: []
+status: publish
+type: post
+published: true
+meta:
+  _edit_last: '4'
+  _wpas_done_all: '1'
+---
+We have spun up a new repository service at <a href="https://repos.spark-packages.org">https://repos.spark-packages.org</a> and it will be the new home for the artifacts on spark-packages.
+
+<a href="https://bintray.com/">Bintray</a>, the original repository service used for <a href="https://spark-packages.org/">https://spark-packages.org/</a>, is in its <a href="https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/">sunset</a> process. It will no longer be available from May 1st. To consume artifacts from that, please replace "dl.bintray.com/spark-packages/maven" with "repos.spark-packages.org" in the Maven pom files or sbt build files in your repositories.

Review comment:
For the Spark release, we need to set this env, but after we backport the change to all release branches, it should not be a problem.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] xuanyuanking commented on a change in pull request #333: Announce the new repository service for spark-packages

2021-04-28 Thread GitBox


xuanyuanking commented on a change in pull request #333:
URL: https://github.com/apache/spark-website/pull/333#discussion_r622700755



##
File path: news/_posts/2021-04-28-new-repository-service.md
##
@@ -0,0 +1,18 @@
+---
+layout: post
+title: New repository service for spark-packages
+categories:
+- News
+tags: []
+status: publish
+type: post
+published: true
+meta:
+  _edit_last: '4'
+  _wpas_done_all: '1'
+---
+We have spun up a new repository service at <a href="https://repos.spark-packages.org">https://repos.spark-packages.org</a> and it will be the new home for the artifacts on spark-packages.
+
+<a href="https://bintray.com/">Bintray</a>, the original repository service used for <a href="https://spark-packages.org/">https://spark-packages.org/</a>, is in its <a href="https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/">sunset</a> process. It will no longer be available from May 1st. To consume artifacts from that, please replace "dl.bintray.com/spark-packages/maven" with "repos.spark-packages.org" in the Maven pom files or sbt build files in your repositories.

Review comment:
The system env is specific to Spark's own usage. End users should just care about their pom/build file.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] xuanyuanking commented on a change in pull request #333: Announce the new repository service for spark-packages

2021-04-28 Thread GitBox


xuanyuanking commented on a change in pull request #333:
URL: https://github.com/apache/spark-website/pull/333#discussion_r622699181



##
File path: news/_posts/2021-04-28-new-repository-service.md
##
@@ -0,0 +1,18 @@
+---
+layout: post
+title: New repository service for spark-packages
+categories:
+- News
+tags: []
+status: publish
+type: post
+published: true
+meta:
+  _edit_last: '4'
+  _wpas_done_all: '1'
+---
+We have spun up a new repository service at <a href="https://repos.spark-packages.org">https://repos.spark-packages.org</a> and it will be the new home for the artifacts on spark-packages.
+
+<a href="https://bintray.com/">Bintray</a>, the original repository service used for <a href="https://spark-packages.org/">https://spark-packages.org/</a>, is in its <a href="https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/">sunset</a> process. It will no longer be available from May 1st. To consume artifacts from that, please replace "dl.bintray.com/spark-packages/maven" with "repos.spark-packages.org" in the Maven pom files or sbt build files in your repositories.

Review comment:
   Thanks, done in 27e06ff




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (86d3bb5 -> 403e479)

2021-04-28 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 86d3bb5  [SPARK-34981][SQL] Implement V2 function resolution and 
evaluation
 add 403e479  [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception 
cause

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/expressions/objects/objects.scala| 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (0bcf348 -> 86d3bb5)

2021-04-28 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 0bcf348  [SPARK-34781][SQL][FOLLOWUP] Adjust the order of AQE 
optimizer rules
 add 86d3bb5  [SPARK-34981][SQL] Implement V2 function resolution and 
evaluation

No new revisions were added by this update.

Summary of changes:
 .../catalog/functions/AggregateFunction.java   |  15 +-
 .../catalog/functions/ScalarFunction.java  |  58 ++-
 .../spark/sql/catalyst/analysis/Analyzer.scala | 168 +++--
 .../catalyst/analysis/higherOrderFunctions.scala   |   9 +-
 .../spark/sql/catalyst/analysis/unresolved.scala   |  18 +-
 .../sql/catalyst/catalog/SessionCatalog.scala  |   5 +-
 .../expressions/ApplyFunctionExpression.scala  |  46 +++
 .../expressions/aggregate/V2Aggregator.scala   |  70 
 .../spark/sql/catalyst/parser/AstBuilder.scala |  19 +-
 .../sql/connector/catalog/CatalogV2Implicits.scala |  32 ++
 .../sql/connector/catalog/LookupCatalog.scala  |  22 ++
 .../spark/sql/errors/QueryCompilationErrors.scala  |   6 -
 .../catalyst/analysis/LookupFunctionsSuite.scala   |   7 +-
 .../connector/catalog/CatalogManagerSuite.scala|   4 +-
 ...{TableCatalogSuite.scala => CatalogSuite.scala} |  50 ++-
 .../sql/connector/catalog/InMemoryCatalog.scala|  58 +++
 .../connector/catalog/InMemoryTableCatalog.scala   |   2 +-
 .../connector/catalog/functions/JavaAverage.java   | 102 +
 .../connector/catalog/functions/JavaStrLen.java| 122 ++
 .../sql/connector/DataSourceV2FunctionSuite.scala  | 420 +
 .../spark/sql/connector/DatasourceV2SQLBase.scala  |   6 +-
 21 files changed, 1163 insertions(+), 76 deletions(-)
 create mode 100644 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ApplyFunctionExpression.scala
 create mode 100644 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/V2Aggregator.scala
 rename 
sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/{TableCatalogSuite.scala
 => CatalogSuite.scala} (93%)
 create mode 100644 
sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryCatalog.scala
 create mode 100644 
sql/core/src/test/java/test/org/apache/spark/sql/connector/catalog/functions/JavaAverage.java
 create mode 100644 
sql/core/src/test/java/test/org/apache/spark/sql/connector/catalog/functions/JavaStrLen.java
 create mode 100644 
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2FunctionSuite.scala

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (8b62c29 -> 0bcf348)

2021-04-28 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 8b62c29  [SPARK-35214][SQL] OptimizeSkewedJoin support 
ShuffledHashJoinExec
 add 0bcf348  [SPARK-34781][SQL][FOLLOWUP] Adjust the order of AQE 
optimizer rules

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] cloud-fan commented on a change in pull request #333: Announce the new repository service for spark-packages

2021-04-28 Thread GitBox


cloud-fan commented on a change in pull request #333:
URL: https://github.com/apache/spark-website/pull/333#discussion_r622186096



##
File path: news/_posts/2021-04-28-new-repository-service.md
##
@@ -0,0 +1,18 @@
+---
+layout: post
+title: New repository service for spark-packages
+categories:
+- News
+tags: []
+status: publish
+type: post
+published: true
+meta:
+  _edit_last: '4'
+  _wpas_done_all: '1'
+---
+We have spun up a new repository service at <a href="https://repos.spark-packages.org">https://repos.spark-packages.org</a> and it will be the new home for the artifacts on spark-packages.
+
+<a href="https://bintray.com/">Bintray</a>, the original repository service used for <a href="https://spark-packages.org/">https://spark-packages.org/</a>, is in its <a href="https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/">sunset</a> process. It will no longer be available from May 1st. To consume artifacts from that, please replace "dl.bintray.com/spark-packages/maven" with "repos.spark-packages.org" in the Maven pom files or sbt build files in your repositories.

Review comment:
   Is the setting in pom file or sbt build file? I thought it's controlled 
by the system env `DEFAULT_ARTIFACT_REPOSITORY`. See 
https://github.com/apache/spark/commit/f738fe07b6fc85c880b64a1cc2f6c7cc1cc1379b#diff-f8564df81d845c0cd2f621bc2ed22761cbf9731f28cb2828d9cbd0491f4e7584R1201




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] bozhang2820 commented on a change in pull request #333: Announce the new repository service for spark-packages

2021-04-28 Thread GitBox


bozhang2820 commented on a change in pull request #333:
URL: https://github.com/apache/spark-website/pull/333#discussion_r622184366



##
File path: news/_posts/2021-04-28-new-repository-service.md
##
@@ -0,0 +1,18 @@
+---
+layout: post
+title: New repository service for spark-packages
+categories:
+- News
+tags: []
+status: publish
+type: post
+published: true
+meta:
+  _edit_last: '4'
+  _wpas_done_all: '1'
+---
+We have spun up a new repository service at <a href="https://repos.spark-packages.org">https://repos.spark-packages.org</a> and it will be the new home for the artifacts on spark-packages.
+
+<a href="https://bintray.com/">Bintray</a>, the original repository service used for <a href="https://spark-packages.org/">https://spark-packages.org/</a>, is in its <a href="https://jfrog.com/blog/into-the-sunset-bintray-jcenter-gocenter-and-chartcenter/">sunset</a> process. It will no longer be available from May 1st. To consume artifacts from that, please replace "dl.bintray.com/spark-packages/maven" with "repos.spark-packages.org" in the Maven pom files or sbt build files in your repositories.

Review comment:
   Please change
   "... in its sunset process. It will no longer ..." 
   to 
   "... in its sunset process, and will no longer ..."
   
   and
   "To consume artifacts from that"
   to
   "To consume artifacts from the new repository service"




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] xuanyuanking commented on pull request #333: Announce the new repository service for spark-packages

2021-04-28 Thread GitBox


xuanyuanking commented on pull request #333:
URL: https://github.com/apache/spark-website/pull/333#issuecomment-828453546


   cc @bozhang2820 @cloud-fan @HyukjinKwon @gatorsmile 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] xuanyuanking opened a new pull request #333: Announce the new repository service for spark-packages

2021-04-28 Thread GitBox


xuanyuanking opened a new pull request #333:
URL: https://github.com/apache/spark-website/pull/333


   This PR is to add the news page for the new repository service.
   
![image](https://user-images.githubusercontent.com/4833765/116411290-283bb980-a868-11eb-9074-c63d4ca1a899.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-2.4 updated (183eb75 -> e89526d)

2021-04-28 Thread viirya
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 183eb75  [SPARK-35227][BUILD] Update the resolver for spark-packages 
in SparkSubmit
 add e89526d  Preparing Spark release v2.4.8-rc3

No new revisions were added by this update.

Summary of changes:
 R/pkg/DESCRIPTION  | 2 +-
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 4 ++--
 examples/pom.xml   | 2 +-
 external/avro/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml  | 2 +-
 external/flume-assembly/pom.xml| 2 +-
 external/flume-sink/pom.xml| 2 +-
 external/flume/pom.xml | 2 +-
 external/kafka-0-10-assembly/pom.xml   | 2 +-
 external/kafka-0-10-sql/pom.xml| 2 +-
 external/kafka-0-10/pom.xml| 2 +-
 external/kafka-0-8-assembly/pom.xml| 2 +-
 external/kafka-0-8/pom.xml | 2 +-
 external/kinesis-asl-assembly/pom.xml  | 2 +-
 external/kinesis-asl/pom.xml   | 2 +-
 external/spark-ganglia-lgpl/pom.xml| 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 python/pyspark/version.py  | 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 43 files changed, 44 insertions(+), 44 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] tag v2.4.8-rc3 created (now e89526d)

2021-04-28 Thread viirya
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a change to tag v2.4.8-rc3
in repository https://gitbox.apache.org/repos/asf/spark.git.


  at e89526d  (commit)
This tag includes the following new commits:

 new e89526d  Preparing Spark release v2.4.8-rc3

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] 01/01: Preparing Spark release v2.4.8-rc3

2021-04-28 Thread viirya
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a commit to tag v2.4.8-rc3
in repository https://gitbox.apache.org/repos/asf/spark.git

commit e89526d2401b3a04719721c923a6f630e555e286
Author: Liang-Chi Hsieh 
AuthorDate: Wed Apr 28 08:22:14 2021 +

Preparing Spark release v2.4.8-rc3
---
 R/pkg/DESCRIPTION  | 2 +-
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 4 ++--
 examples/pom.xml   | 2 +-
 external/avro/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml  | 2 +-
 external/flume-assembly/pom.xml| 2 +-
 external/flume-sink/pom.xml| 2 +-
 external/flume/pom.xml | 2 +-
 external/kafka-0-10-assembly/pom.xml   | 2 +-
 external/kafka-0-10-sql/pom.xml| 2 +-
 external/kafka-0-10/pom.xml| 2 +-
 external/kafka-0-8-assembly/pom.xml| 2 +-
 external/kafka-0-8/pom.xml | 2 +-
 external/kinesis-asl-assembly/pom.xml  | 2 +-
 external/kinesis-asl/pom.xml   | 2 +-
 external/spark-ganglia-lgpl/pom.xml| 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 python/pyspark/version.py  | 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 43 files changed, 44 insertions(+), 44 deletions(-)

diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 0b85a88..fc6149d 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 2.4.9
+Version: 2.4.8
 Title: R Front End for 'Apache Spark'
 Description: Provides an R Front end for 'Apache Spark'.
 Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),
diff --git a/assembly/pom.xml b/assembly/pom.xml
index e97cc93..1b534d1 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.9-SNAPSHOT</version>
+    <version>2.4.8</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index 217ed3f..062290a 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.9-SNAPSHOT</version>
+    <version>2.4.8</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index 5724cdf..cd57c43 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.9-SNAPSHOT</version>
+    <version>2.4.8</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index 67e5dbd..336255e 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.9-SNAPSHOT</version>
+    <version>2.4.8</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index cd6ed29..c1025d2 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.4.9-SNAPSHOT</version>
+    <version>2.4.8</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml
index 4b8b85e..fd66de0 

[spark] branch branch-3.0 updated (a556bc8 -> c6659e6)

2021-04-28 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.


from a556bc8  [SPARK-33976][SQL][DOCS][3.0] Add a SQL doc page for a TRANSFORM clause
 add c6659e6  [SPARK-35159][SQL][DOCS][3.0] Extract hive format doc

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 52 +--
 docs/sql-ref-syntax-hive-format.md | 73 ++
 docs/sql-ref-syntax-qry-select-transform.md| 48 +-
 3 files changed, 77 insertions(+), 96 deletions(-)
 create mode 100644 docs/sql-ref-syntax-hive-format.md

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated (361e684 -> db8204e)

2021-04-28 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 361e684  [SPARK-33976][SQL][DOCS][3.1] Add a SQL doc page for a TRANSFORM clause
 add db8204e  [SPARK-35159][SQL][DOCS][3.1] Extract hive format doc

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 52 +--
 docs/sql-ref-syntax-hive-format.md | 73 ++
 docs/sql-ref-syntax-qry-select-transform.md| 48 +-
 3 files changed, 77 insertions(+), 96 deletions(-)
 create mode 100644 docs/sql-ref-syntax-hive-format.md

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (26a5e33 -> 8b62c29)

2021-04-28 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 26a5e33  [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page
 add 8b62c29  [SPARK-35214][SQL] OptimizeSkewedJoin support ShuffledHashJoinExec

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/internal/SQLConf.scala|   4 +-
 .../execution/adaptive/OptimizeSkewedJoin.scala| 189 -
 .../execution/exchange/EnsureRequirements.scala|   9 +-
 .../sql/execution/joins/ShuffledHashJoinExec.scala |   3 +-
 .../spark/sql/execution/joins/ShuffledJoin.scala   |  18 +-
 .../sql/execution/joins/SortMergeJoinExec.scala|  17 --
 .../adaptive/AdaptiveQueryExecSuite.scala  | 130 +++---
 7 files changed, 204 insertions(+), 166 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated (6e83789b -> a556bc8)

2021-04-28 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 6e83789b [SPARK-35244][SQL] Invoke should throw the original exception
 add a556bc8  [SPARK-33976][SQL][DOCS][3.0] Add a SQL doc page for a TRANSFORM clause

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml|   2 +
 docs/sql-ref-syntax-qry-select-transform.md | 235 
 docs/sql-ref-syntax-qry-select.md   |   7 +-
 docs/sql-ref-syntax-qry.md  |   1 +
 docs/sql-ref-syntax.md  |   1 +
 5 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 docs/sql-ref-syntax-qry-select-transform.md

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated (e58055b -> 361e684)

2021-04-28 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.


from e58055b  [SPARK-35244][SQL] Invoke should throw the original exception
 add 361e684  [SPARK-33976][SQL][DOCS][3.1] Add a SQL doc page for a TRANSFORM clause

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml|   2 +
 docs/sql-ref-syntax-qry-select-transform.md | 235 
 docs/sql-ref-syntax-qry-select.md   |   7 +-
 docs/sql-ref-syntax-qry.md  |   1 +
 docs/sql-ref-syntax.md  |   1 +
 5 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 docs/sql-ref-syntax-qry-select-transform.md

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page

2021-04-28 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 26a5e33  [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page
26a5e33 is described below

commit 26a5e339a61ab06fb2949166db705f1b575addd3
Author: Angerszh 
AuthorDate: Wed Apr 28 16:47:02 2021 +0900

[SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page

### What changes were proposed in this pull request?
Add doc about `TRANSFORM` and related function.

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Not needed

Closes #32257 from AngersZh/SPARK-33976-followup.

Authored-by: Angerszh 
Signed-off-by: Takeshi Yamamuro 
---
 docs/sql-ref-syntax-qry-select.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-qry-select.md b/docs/sql-ref-syntax-qry-select.md
index 62a7f5f..500eda1 100644
--- a/docs/sql-ref-syntax-qry-select.md
+++ b/docs/sql-ref-syntax-qry-select.md
@@ -41,7 +41,7 @@ select_statement [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select_stat
 
 While `select_statement` is defined as
 ```sql
-SELECT [ hints , ... ] [ ALL | DISTINCT ] { [[ named_expression | regex_column_names ] [ , ... ] | TRANSFORM (...)) ] }
+SELECT [ hints , ... ] [ ALL | DISTINCT ] { [ [ named_expression | regex_column_names ] [ , ... ] | TRANSFORM (...) ] }
 FROM { from_item [ , ... ] }
 [ PIVOT clause ]
 [ LATERAL VIEW clause ] [ ... ] 

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-35244][SQL] Invoke should throw the original exception

2021-04-28 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 6e83789b [SPARK-35244][SQL] Invoke should throw the original exception
6e83789b is described below

commit 6e83789be5fe5141affee600f9b614996cd91482
Author: Wenchen Fan 
AuthorDate: Wed Apr 28 10:45:04 2021 +0900

[SPARK-35244][SQL] Invoke should throw the original exception

### What changes were proposed in this pull request?

This PR updates the interpreted code path of invoke expressions to unwrap the `InvocationTargetException`.

### Why are the changes needed?

Make interpreted and codegen path consistent for invoke expressions.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

new UT

Closes #32370 from cloud-fan/minor.

Authored-by: Wenchen Fan 
Signed-off-by: hyukjinkwon 
---
 .../spark/sql/catalyst/expressions/objects/objects.scala   |  7 ++-
 .../sql/catalyst/expressions/ObjectExpressionsSuite.scala  | 10 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
index 7ebf70b..71eacce 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
@@ -127,7 +127,12 @@ trait InvokeLike extends Expression with NonSQLExpression {
   // return null if one of arguments is null
   null
 } else {
-  val ret = method.invoke(obj, args: _*)
+  val ret = try {
+method.invoke(obj, args: _*)
+  } catch {
+// Re-throw the original exception.
+case e: java.lang.reflect.InvocationTargetException => throw e.getCause
+  }
   val boxedClass = ScalaReflection.typeBoxedJavaMapping.get(dataType)
   if (boxedClass.isDefined) {
 boxedClass.get.cast(ret)
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala
index c401493..50c76f1 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala
@@ -604,6 +604,16 @@ class ObjectExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
 checkExceptionInExpression[RuntimeException](
   serializer4, EmptyRow, "Cannot use null as map key!")
   }
+
+  test("SPARK-35244: invoke should throw the original exception") {
+val strClsType = ObjectType(classOf[String])
+checkExceptionInExpression[StringIndexOutOfBoundsException](
+  Invoke(Literal("a", strClsType), "substring", strClsType, Seq(Literal(3))), "")
+
+val mathCls = classOf[Math]
+checkExceptionInExpression[ArithmeticException](
+  StaticInvoke(mathCls, IntegerType, "addExact", Seq(Literal(Int.MaxValue), Literal(1))), "")
+  }
 }
 
 class TestBean extends Serializable {

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-35244][SQL] Invoke should throw the original exception

2021-04-28 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new e58055b  [SPARK-35244][SQL] Invoke should throw the original exception
e58055b is described below

commit e58055bdec919704ce82d659aad2d18913de7512
Author: Wenchen Fan 
AuthorDate: Wed Apr 28 10:45:04 2021 +0900

[SPARK-35244][SQL] Invoke should throw the original exception

### What changes were proposed in this pull request?

This PR updates the interpreted code path of invoke expressions to unwrap the `InvocationTargetException`.

### Why are the changes needed?

Make interpreted and codegen path consistent for invoke expressions.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

new UT

Closes #32370 from cloud-fan/minor.

Authored-by: Wenchen Fan 
Signed-off-by: hyukjinkwon 
---
 .../spark/sql/catalyst/expressions/objects/objects.scala   |  7 ++-
 .../sql/catalyst/expressions/ObjectExpressionsSuite.scala  | 10 ++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
index 76ba523..b93e2e6 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
@@ -127,7 +127,12 @@ trait InvokeLike extends Expression with NonSQLExpression {
   // return null if one of arguments is null
   null
 } else {
-  val ret = method.invoke(obj, args: _*)
+  val ret = try {
+method.invoke(obj, args: _*)
+  } catch {
+// Re-throw the original exception.
+case e: java.lang.reflect.InvocationTargetException => throw e.getCause
+  }
   val boxedClass = ScalaReflection.typeBoxedJavaMapping.get(dataType)
   if (boxedClass.isDefined) {
 boxedClass.get.cast(ret)
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala
index bc2b93e..6e71c95 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ObjectExpressionsSuite.scala
@@ -608,6 +608,16 @@ class ObjectExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
 checkExceptionInExpression[RuntimeException](
   serializer4, EmptyRow, "Cannot use null as map key!")
   }
+
+  test("SPARK-35244: invoke should throw the original exception") {
+val strClsType = ObjectType(classOf[String])
+checkExceptionInExpression[StringIndexOutOfBoundsException](
+  Invoke(Literal("a", strClsType), "substring", strClsType, Seq(Literal(3))), "")
+
+val mathCls = classOf[Math]
+checkExceptionInExpression[ArithmeticException](
+  StaticInvoke(mathCls, IntegerType, "addExact", Seq(Literal(Int.MaxValue), Literal(1))), "")
+  }
 }
 
 class TestBean extends Serializable {

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
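Editorial note on the SPARK-35244 patches above: `Method.invoke` wraps any exception thrown by the target method in `java.lang.reflect.InvocationTargetException`, and the fix re-throws the cause so callers see the original exception type. A minimal, self-contained Java sketch of that unwrapping pattern — the class and helper names here are illustrative, not from Spark:

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class InvokeUnwrap {
    // Invoke a method reflectively, but re-throw the original exception
    // instead of the InvocationTargetException wrapper that reflection adds.
    static Object invokeUnwrapped(Object target, Method m, Object... args) throws Throwable {
        try {
            return m.invoke(target, args);
        } catch (InvocationTargetException e) {
            // Re-throw the original exception, as in the SPARK-35244 fix.
            throw e.getCause();
        }
    }

    public static void main(String[] args) throws Throwable {
        // "a".substring(3) throws StringIndexOutOfBoundsException; without
        // unwrapping, reflection would surface InvocationTargetException instead.
        Method substring = String.class.getMethod("substring", int.class);
        try {
            invokeUnwrapped("a", substring, 3);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("caught original: " + e.getClass().getSimpleName());
        }
    }
}
```

This mirrors the new test in `ObjectExpressionsSuite`, where `"a".substring(3)` must surface `StringIndexOutOfBoundsException` rather than the reflection wrapper.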