(spark) branch master updated (dfb35bed522c -> c9435b8b4864)

2024-03-04 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from dfb35bed522c [SPARK-47146][CORE] Possible thread leak when doing sort merge join
 add c9435b8b4864 [SPARK-47176][SQL][FOLLOW-UP] resolveExpressions should have three versions which is the same as resolveOperators

No new revisions were added by this update.

Summary of changes:
 .../catalyst/plans/logical/AnalysisHelper.scala| 36 ++
 1 file changed, 36 insertions(+)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



(spark) branch master updated: [SPARK-47146][CORE] Possible thread leak when doing sort merge join

2024-03-04 Thread mridulm80
This is an automated email from the ASF dual-hosted git repository.

mridulm80 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new dfb35bed522c [SPARK-47146][CORE] Possible thread leak when doing sort merge join
dfb35bed522c is described below

commit dfb35bed522ca706f8fc18e37c05c1766c8d8a18
Author: JacobZheng0927 
AuthorDate: Mon Mar 4 23:17:32 2024 -0600

[SPARK-47146][CORE] Possible thread leak when doing sort merge join

### What changes were proposed in this pull request?
Add a TaskCompletionListener that closes the input stream, to avoid the thread leak caused by an unclosed ReadAheadInputStream.

### Why are the changes needed?
SPARK-40849 modified the implementation of `newDaemonSingleThreadExecutor` to use `newFixedThreadPool` instead of `newSingleThreadExecutor`. The difference is that `newSingleThreadExecutor` uses `FinalizableDelegatedExecutorService`, which provides a `finalize` method that automatically closes the thread pool. In some cases, sort merge join execution uses `ReadAheadInputStream` and does not close it, so this change caused a thread leak. Since finalization is deprecated and subject to re [...]

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Unit test

### Was this patch authored or co-authored using generative AI tooling?
No
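The cleanup pattern described above can be sketched in plain Java. This is a minimal analogue, not the real Spark API: `SimpleTaskContext` and `addCompletionListener` are illustrative stand-ins for Spark's TaskContext, and `TrackedStream` stands in for `ReadAheadInputStream`, whose read-ahead thread leaks if the stream is never closed.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ThreadLeakSketch {
  // Illustrative stand-in for Spark's TaskContext: runs callbacks on completion.
  static class SimpleTaskContext {
    private final List<Runnable> listeners = new ArrayList<>();
    void addCompletionListener(Runnable r) { listeners.add(r); }
    void markTaskCompleted() { listeners.forEach(Runnable::run); }
  }

  // Tracks whether close() was called, standing in for ReadAheadInputStream.
  static class TrackedStream extends FilterInputStream {
    boolean closed = false;
    TrackedStream(InputStream in) { super(in); }
    @Override public void close() throws IOException { closed = true; super.close(); }
  }

  public static void main(String[] args) throws IOException {
    SimpleTaskContext ctx = new SimpleTaskContext();
    TrackedStream stream = new TrackedStream(new ByteArrayInputStream(new byte[]{1, 2, 3}));
    // Mirror the patch: register close() as a completion listener so the stream
    // is released even if the consumer never drains or closes it explicitly.
    ctx.addCompletionListener(() -> {
      try {
        stream.close();
      } catch (IOException e) {
        // Like the patch, log-and-continue on close failure.
        System.err.println("error while closing stream: " + e);
      }
    });
    stream.read();           // the task reads but "forgets" to close the stream
    ctx.markTaskCompleted(); // cleanup runs here
    System.out.println(stream.closed); // prints: true
  }
}
```

The key design point the patch relies on is that task completion is a guaranteed hook, unlike finalization, which is deprecated and runs (if at all) at the GC's discretion.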

Closes #45327 from JacobZheng0927/SPARK-47146.

Authored-by: JacobZheng0927 
Signed-off-by: Mridul Muralidharan 
---
 .../unsafe/sort/UnsafeSorterSpillReader.java   | 12 
 .../scala/org/apache/spark/sql/JoinSuite.scala | 33 +-
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git 
a/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java
 
b/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java
index db79efd00853..8bd44c8c52c1 100644
--- 
a/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java
+++ 
b/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java
@@ -28,6 +28,8 @@ import org.apache.spark.io.ReadAheadInputStream;
 import org.apache.spark.serializer.SerializerManager;
 import org.apache.spark.storage.BlockId;
 import org.apache.spark.unsafe.Platform;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import java.io.*;
 
@@ -36,6 +38,7 @@ import java.io.*;
  * of the file format).
  */
 public final class UnsafeSorterSpillReader extends UnsafeSorterIterator implements Closeable {
+  private static final Logger logger = LoggerFactory.getLogger(ReadAheadInputStream.class);
   public static final int MAX_BUFFER_SIZE_BYTES = 16777216; // 16 mb
 
   private InputStream in;
@@ -82,6 +85,15 @@ public final class UnsafeSorterSpillReader extends 
UnsafeSorterIterator implemen
   Closeables.close(bs, /* swallowIOException = */ true);
   throw e;
 }
+if (taskContext != null) {
+  taskContext.addTaskCompletionListener(context -> {
+try {
+  close();
+} catch (IOException e) {
+  logger.info("error while closing UnsafeSorterSpillReader", e);
+}
+  });
+}
   }
 
   @Override
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala
index f31f60e8df56..be6862f5b96b 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala
@@ -23,6 +23,7 @@ import scala.collection.mutable.ListBuffer
 import scala.jdk.CollectionConverters._
 
 import org.apache.spark.TestUtils.{assertNotSpilled, assertSpilled}
+import 
org.apache.spark.internal.config.SHUFFLE_SPILL_NUM_ELEMENTS_FORCE_SPILL_THRESHOLD
 import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation
 import org.apache.spark.sql.catalyst.expressions.{Ascending, GenericRow, 
SortOrder}
@@ -34,7 +35,7 @@ import 
org.apache.spark.sql.execution.exchange.{ShuffleExchangeExec, ShuffleExch
 import org.apache.spark.sql.execution.joins._
 import org.apache.spark.sql.execution.python.BatchEvalPythonExec
 import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.test.SharedSparkSession
+import org.apache.spark.sql.test.{SharedSparkSession, TestSparkSession}
 import org.apache.spark.sql.types.StructType
 import org.apache.spark.tags.SlowSQLTest
 
@@ -1737,3 +1738,33 @@ class JoinSuite extends QueryTest with 
SharedSparkSession with AdaptiveSparkPlan
 }
   }
 }
+
+class ThreadLeakInSortMergeJoinSuite
+  extends QueryTest
+with SharedSparkSession
+with AdaptiveSparkPlanHelper {
+
+  setupTestData()
+  override protected def createSparkSession: TestSparkSession = {
+

(spark) branch branch-3.5 updated: [SPARK-47177][SQL] Cached SQL plan do not display final AQE plan in explain string

2024-03-04 Thread ulyssesyou
This is an automated email from the ASF dual-hosted git repository.

ulyssesyou pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new 14762b372dd6 [SPARK-47177][SQL] Cached SQL plan do not display final AQE plan in explain string
14762b372dd6 is described below

commit 14762b372dd623179aa2c985c44cd49048660dda
Author: ulysses-you 
AuthorDate: Tue Mar 5 10:13:00 2024 +0800

[SPARK-47177][SQL] Cached SQL plan do not display final AQE plan in explain string

### What changes were proposed in this pull request?

This PR adds a lock around ExplainUtils.processPlan to avoid a tag race condition.

### Why are the changes needed?

To fix the issue [SPARK-47177](https://issues.apache.org/jira/browse/SPARK-47177)

### Does this PR introduce _any_ user-facing change?

Yes, it affects the plan explain output.

### How was this patch tested?

add test

### Was this patch authored or co-authored using generative AI tooling?

no
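The race this patch guards against can be sketched in a few lines (names are illustrative, not Spark's): two concurrent explain runs tag the *same shared* node instances, so without a lock their operator-ID assignments interleave and one run reads IDs written by the other. Serializing the tag-then-read step fixes that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExplainLockSketch {
  // Illustrative stand-in for a plan node carrying mutable explain tags.
  static class Node {
    final Map<String, Integer> tags = new HashMap<>();
  }

  private static final Object LOCK = new Object();

  // Mirror of the fix: assign sequential operator IDs to the shared nodes and
  // read them back as one atomic step, so concurrent callers cannot interleave.
  static List<Integer> processPlan(List<Node> nodes) {
    synchronized (LOCK) {
      for (int i = 0; i < nodes.size(); i++) {
        nodes.get(i).tags.put("opId", i);
      }
      List<Integer> ids = new ArrayList<>();
      for (Node n : nodes) {
        ids.add(n.tags.get("opId"));
      }
      return ids;
    }
  }
}
```

As the patch comment notes, this is normally a no-op because each explain action gets its own plan instance; only cached plans (`InMemoryRelation#innerChildren`) share node instances across queries, which is when the lock matters.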

Closes #45282 from ulysses-you/SPARK-47177.

Authored-by: ulysses-you 
Signed-off-by: youxiduo 
(cherry picked from commit 6e62a5690b810edb99e4fc6ad39afbd4d49ef85e)
Signed-off-by: youxiduo 
---
 .../apache/spark/sql/catalyst/trees/TreeNode.scala |  7 ++--
 .../apache/spark/sql/execution/ExplainUtils.scala  |  6 +++-
 .../sql/execution/columnar/InMemoryRelation.scala  | 12 +--
 .../execution/columnar/InMemoryRelationSuite.scala | 41 +++---
 4 files changed, 38 insertions(+), 28 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
index 9e605a45414b..82228a5b2aaf 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
@@ -1030,10 +1030,11 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]]
 append(str)
 append("\n")
 
-if (innerChildren.nonEmpty) {
+val innerChildrenLocal = innerChildren
+if (innerChildrenLocal.nonEmpty) {
   lastChildren.add(children.isEmpty)
   lastChildren.add(false)
-  innerChildren.init.foreach(_.generateTreeString(
+  innerChildrenLocal.init.foreach(_.generateTreeString(
 depth + 2, lastChildren, append, verbose,
 addSuffix = addSuffix, maxFields = maxFields, printNodeId = 
printNodeId, indent = indent))
   lastChildren.remove(lastChildren.size() - 1)
@@ -1041,7 +1042,7 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]]
 
   lastChildren.add(children.isEmpty)
   lastChildren.add(true)
-  innerChildren.last.generateTreeString(
+  innerChildrenLocal.last.generateTreeString(
 depth + 2, lastChildren, append, verbose,
 addSuffix = addSuffix, maxFields = maxFields, printNodeId = 
printNodeId, indent = indent)
   lastChildren.remove(lastChildren.size() - 1)
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/ExplainUtils.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/ExplainUtils.scala
index 3da3e646f36b..11f6ae0e47ee 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/ExplainUtils.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/ExplainUtils.scala
@@ -75,8 +75,12 @@ object ExplainUtils extends AdaptiveSparkPlanHelper {
* Given a input physical plan, performs the following tasks.
*   1. Generates the explain output for the input plan excluding the 
subquery plans.
*   2. Generates the explain output for each subquery referenced in the 
plan.
+   *
+   * Note that, ideally this is a no-op as different explain actions operate 
on different plan,
+   * instances but cached plan is an exception. The 
`InMemoryRelation#innerChildren` use a shared
+   * plan instance across multi-queries. Add lock for this method to avoid tag 
race condition.
*/
-  def processPlan[T <: QueryPlan[T]](plan: T, append: String => Unit): Unit = {
+  def processPlan[T <: QueryPlan[T]](plan: T, append: String => Unit): Unit = 
synchronized {
 try {
   // Initialize a reference-unique set of Operators to avoid accdiental 
overwrites and to allow
   // intentional overwriting of IDs generated in previous AQE iteration
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
index 5bab8e53eb16..f750a4503be1 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
@@ -392,17 +392,7 @@ case class InMemoryRelation(
 
   @volatile var statsOfPlanToCache: 

(spark) branch master updated (73aa14466683 -> 6e62a5690b81)

2024-03-04 Thread ulyssesyou
This is an automated email from the ASF dual-hosted git repository.

ulyssesyou pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 73aa14466683 [SPARK-47252][DOCS] Clarify that pivot may trigger an eager computation
 add 6e62a5690b81 [SPARK-47177][SQL] Cached SQL plan do not display final AQE plan in explain string

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/trees/TreeNode.scala |  7 ++--
 .../apache/spark/sql/execution/ExplainUtils.scala  |  6 +++-
 .../sql/execution/columnar/InMemoryRelation.scala  | 12 +--
 .../execution/columnar/InMemoryRelationSuite.scala | 41 +++---
 4 files changed, 38 insertions(+), 28 deletions(-)





(spark) branch master updated (35bced42474e -> 73aa14466683)

2024-03-04 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 35bced42474e [SPARK-47242][BUILD] Bump ap-loader 3.0(v8) to support for async-profiler 3.0
 add 73aa14466683 [SPARK-47252][DOCS] Clarify that pivot may trigger an eager computation

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/RelationalGroupedDataset.scala| 18 +-
 python/pyspark/sql/group.py| 10 +-
 .../apache/spark/sql/RelationalGroupedDataset.scala| 16 
 3 files changed, 22 insertions(+), 22 deletions(-)





(spark) branch master updated (db0e5c7bc464 -> 35bced42474e)

2024-03-04 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from db0e5c7bc464 [SPARK-47269][BUILD] Upgrade jetty to 11.0.20
 add 35bced42474e [SPARK-47242][BUILD] Bump ap-loader 3.0(v8) to support for async-profiler 3.0

No new revisions were added by this update.

Summary of changes:
 connector/profiler/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)





(spark) branch master updated (6b5917beff30 -> db0e5c7bc464)

2024-03-04 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 6b5917beff30 [SPARK-46961][SS] Using ProcessorContext to store and retrieve handle
 add db0e5c7bc464 [SPARK-47269][BUILD] Upgrade jetty to 11.0.20

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 4 ++--
 pom.xml   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)





(spark) branch master updated: [SPARK-46961][SS] Using ProcessorContext to store and retrieve handle

2024-03-04 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 6b5917beff30 [SPARK-46961][SS] Using ProcessorContext to store and retrieve handle
6b5917beff30 is described below

commit 6b5917beff30c813a362584a135a587001df1390
Author: Eric Marnadi 
AuthorDate: Mon Mar 4 21:20:23 2024 +0300

[SPARK-46961][SS] Using ProcessorContext to store and retrieve handle

### What changes were proposed in this pull request?

Setting the processorHandle as a part of the StatefulProcessor, so that the user doesn't have to explicitly keep track of it and can instead simply call `getStatefulProcessorHandle`.

### Why are the changes needed?

This enhances the usability of the State API

### Does this PR introduce _any_ user-facing change?

Yes, this is an API change. This enhances usability of the StatefulProcessorHandle and the TransformWithState operator.

### How was this patch tested?

Existing unit tests are sufficient

### Was this patch authored or co-authored using generative AI tooling?

No
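The pattern this change introduces can be sketched as follows. This is a minimal stand-in, not the Spark API: the operator stores the handle on the processor before user code runs, and the getter fails fast with the new `STATE_STORE_HANDLE_NOT_INITIALIZED` condition when called outside the operator.

```java
import java.util.Optional;

public class StatefulProcessorSketch {
  private Optional<String> handle = Optional.empty();

  // The operator calls this before invoking any user-defined methods,
  // so user code never has to thread the handle through manually.
  void setHandle(String h) { handle = Optional.of(h); }

  // Mirrors getStatefulProcessorHandle: usable only inside the operator,
  // otherwise it throws the handle-not-initialized error.
  public String getStatefulProcessorHandle() {
    return handle.orElseThrow(() ->
        new IllegalStateException("STATE_STORE_HANDLE_NOT_INITIALIZED"));
  }
}
```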

Closes #45359 from ericm-db/handle-context.

Authored-by: Eric Marnadi 
Signed-off-by: Max Gekk 
---
 .../src/main/resources/error/error-classes.json|  7 +++
 docs/sql-error-conditions.md   |  7 +++
 .../apache/spark/sql/errors/ExecutionErrors.scala  |  6 +++
 .../spark/sql/streaming/StatefulProcessor.scala| 38 ---
 .../streaming/TransformWithStateExec.scala |  4 +-
 .../streaming/TransformWithListStateSuite.scala| 14 ++
 .../sql/streaming/TransformWithStateSuite.scala| 54 ++
 7 files changed, 84 insertions(+), 46 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-classes.json 
b/common/utils/src/main/resources/error/error-classes.json
index 6ccd841ccd0f..7cf3e9c533ca 100644
--- a/common/utils/src/main/resources/error/error-classes.json
+++ b/common/utils/src/main/resources/error/error-classes.json
@@ -3337,6 +3337,13 @@
 ],
 "sqlState" : "42802"
   },
+  "STATE_STORE_HANDLE_NOT_INITIALIZED" : {
+"message" : [
+  "The handle has not been initialized for this StatefulProcessor.",
+  "Please only use the StatefulProcessor within the transformWithState 
operator."
+],
+"sqlState" : "42802"
+  },
   "STATE_STORE_MULTIPLE_VALUES_PER_KEY" : {
 "message" : [
   "Store does not support multiple values per key"
diff --git a/docs/sql-error-conditions.md b/docs/sql-error-conditions.md
index f026c456eb2d..7be01f8cb513 100644
--- a/docs/sql-error-conditions.md
+++ b/docs/sql-error-conditions.md
@@ -2091,6 +2091,13 @@ Star (*) is not allowed in a select list when GROUP BY 
an ordinal position is us
 
 Failed to remove default column family with reserved name=``.
 
+### STATE_STORE_HANDLE_NOT_INITIALIZED
+
+[SQLSTATE: 
42802](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)
+
+The handle has not been initialized for this StatefulProcessor.
+Please only use the StatefulProcessor within the transformWithState operator.
+
 ### STATE_STORE_MULTIPLE_VALUES_PER_KEY
 
 [SQLSTATE: 
42802](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)
diff --git 
a/sql/api/src/main/scala/org/apache/spark/sql/errors/ExecutionErrors.scala 
b/sql/api/src/main/scala/org/apache/spark/sql/errors/ExecutionErrors.scala
index b74a67b49bda..7910c386fcf1 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/errors/ExecutionErrors.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/errors/ExecutionErrors.scala
@@ -53,6 +53,12 @@ private[sql] trait ExecutionErrors extends 
DataTypeErrorsBase {
   e)
   }
 
+  def stateStoreHandleNotInitialized(): SparkRuntimeException = {
+new SparkRuntimeException(
+  errorClass = "STATE_STORE_HANDLE_NOT_INITIALIZED",
+  messageParameters = Map.empty)
+  }
+
   def failToRecognizePatternAfterUpgradeError(
   pattern: String, e: Throwable): SparkUpgradeException = {
 new SparkUpgradeException(
diff --git 
a/sql/api/src/main/scala/org/apache/spark/sql/streaming/StatefulProcessor.scala 
b/sql/api/src/main/scala/org/apache/spark/sql/streaming/StatefulProcessor.scala
index 76794136dd49..42a9430bf39d 100644
--- 
a/sql/api/src/main/scala/org/apache/spark/sql/streaming/StatefulProcessor.scala
+++ 
b/sql/api/src/main/scala/org/apache/spark/sql/streaming/StatefulProcessor.scala
@@ -20,6 +20,7 @@ package org.apache.spark.sql.streaming
 import java.io.Serializable
 
 import org.apache.spark.annotation.{Evolving, Experimental}
+import org.apache.spark.sql.errors.ExecutionErrors
 
 /**
  * Represents the arbitrary stateful logic that needs to be provided by the 
user to perform
@@ -29,17 +30,18 @@ import 

(spark) branch master updated: [SPARK-47245][SQL] Improve error code for INVALID_PARTITION_COLUMN_DATA_TYPE

2024-03-04 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 0b2eca28c307 [SPARK-47245][SQL] Improve error code for INVALID_PARTITION_COLUMN_DATA_TYPE
0b2eca28c307 is described below

commit 0b2eca28c307b4feb4edd1f53c9aecc64523b7eb
Author: Stefan Kandic 
AuthorDate: Mon Mar 4 19:58:52 2024 +0300

[SPARK-47245][SQL] Improve error code for INVALID_PARTITION_COLUMN_DATA_TYPE

### What changes were proposed in this pull request?

Improving the error code for the error class `INVALID_PARTITION_COLUMN_DATA_TYPE`.

### Why are the changes needed?

`0A000` means a feature is not supported. It implies that in some future release the feature may be supported, and that the user hit a limitation rather than wrote something inherently wrong.
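The reasoning above rests on SQLSTATE encoding its class in the value's first two characters, which is why `0A000` (class 0A, feature not supported) fits better than `42601` (class 42, syntax error or access rule violation). A small illustrative sketch of that mapping, covering only the two classes mentioned here:

```java
import java.util.Map;

public class SqlStateSketch {
  // SQLSTATE class prefixes per the SQL standard; only the two classes
  // relevant to this change are listed.
  static final Map<String, String> SQLSTATE_CLASS = Map.of(
      "0A", "feature not supported",
      "42", "syntax error or access rule violation");

  static String describe(String sqlState) {
    return SQLSTATE_CLASS.getOrDefault(sqlState.substring(0, 2), "unknown class");
  }
}
```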

### Does this PR introduce _any_ user-facing change?

Yes, new sql error code

### How was this patch tested?

UTs

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #45355 from stefankandic/SPARK-47245-improveErrorCode.

Authored-by: Stefan Kandic 
Signed-off-by: Max Gekk 
---
 common/utils/src/main/resources/error/error-classes.json | 2 +-
 docs/sql-error-conditions.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-classes.json 
b/common/utils/src/main/resources/error/error-classes.json
index 493635d1f8d3..6ccd841ccd0f 100644
--- a/common/utils/src/main/resources/error/error-classes.json
+++ b/common/utils/src/main/resources/error/error-classes.json
@@ -2325,7 +2325,7 @@
 "message" : [
   "Cannot use  for partition column."
 ],
-"sqlState" : "42601"
+"sqlState" : "0A000"
   },
   "INVALID_PARTITION_OPERATION" : {
 "message" : [
diff --git a/docs/sql-error-conditions.md b/docs/sql-error-conditions.md
index 510f56f413c6..f026c456eb2d 100644
--- a/docs/sql-error-conditions.md
+++ b/docs/sql-error-conditions.md
@@ -1320,7 +1320,7 @@ For more details see 
[INVALID_PARAMETER_VALUE](sql-error-conditions-invalid-para
 
 ### INVALID_PARTITION_COLUMN_DATA_TYPE
 
-[SQLSTATE: 
42601](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)
+[SQLSTATE: 
0A000](sql-error-conditions-sqlstates.html#class-0A-feature-not-supported)
 
 Cannot use `` for partition column.
 





(spark) branch master updated (7e7f3ff9e281 -> ca7c60b49988)

2024-03-04 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 7e7f3ff9e281 [SPARK-47070] Fix invalid aggregation after subquery rewrite
 add ca7c60b49988 [SPARK-47268][SQL][COLLATIONS] Support for repartition with collations

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/hash.scala  | 43 --
 .../org/apache/spark/sql/CollationSuite.scala  | 21 ++-
 2 files changed, 52 insertions(+), 12 deletions(-)





(spark) branch master updated (479954cf73a5 -> 7e7f3ff9e281)

2024-03-04 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 479954cf73a5 [SPARK-47131][SQL][COLLATION] String function support: contains, startswith, endswith
 add 7e7f3ff9e281 [SPARK-47070] Fix invalid aggregation after subquery rewrite

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/optimizer/subquery.scala|  65 +-
 .../exists-subquery/exists-aggregate.sql.out   | 131 
 .../subquery/in-subquery/in-group-by.sql.out   | 237 +
 .../subquery/exists-subquery/exists-aggregate.sql  |  39 +++-
 .../inputs/subquery/in-subquery/in-group-by.sql|  71 ++
 .../exists-subquery/exists-aggregate.sql.out   |  83 
 .../subquery/in-subquery/in-group-by.sql.out   | 117 ++
 7 files changed, 736 insertions(+), 7 deletions(-)





(spark) branch master updated (b711efd6671a -> 479954cf73a5)

2024-03-04 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from b711efd6671a [SPARK-43258][SQL] Assign names to error _LEGACY_ERROR_TEMP_202[3,5]
 add 479954cf73a5 [SPARK-47131][SQL][COLLATION] String function support: contains, startswith, endswith

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/util/CollationFactory.java  |   3 +
 .../org/apache/spark/unsafe/types/UTF8String.java  |  44 +
 .../src/main/resources/error/error-classes.json|  18 ++
 ...ror-conditions-datatype-mismatch-error-class.md |   4 +
 ...onditions-unsupported-collation-error-class.md} |  14 +-
 docs/sql-error-conditions.md   |   8 +
 .../org/apache/spark/sql/types/StringType.scala|   3 +-
 .../catalyst/expressions/stringExpressions.scala   |  89 -
 .../sql/catalyst/types/PhysicalDataType.scala  |   4 +-
 .../spark/sql/execution/columnar/ColumnType.scala  |   3 +-
 .../org/apache/spark/sql/CollationSuite.scala  | 216 -
 11 files changed, 379 insertions(+), 27 deletions(-)
 copy docs/{sql-error-conditions-unsupported-save-mode-error-class.md => sql-error-conditions-unsupported-collation-error-class.md} (82%)





(spark) branch master updated (a1e57b32e495 -> b711efd6671a)

2024-03-04 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from a1e57b32e495 [SPARK-47266][CONNECT] Make `ProtoUtils.abbreviate` return the same type as the input
 add b711efd6671a [SPARK-43258][SQL] Assign names to error _LEGACY_ERROR_TEMP_202[3,5]

No new revisions were added by this update.

Summary of changes:
 .../src/main/resources/error/error-classes.json| 22 +-
 docs/sql-error-conditions.md   | 12 ++
 .../sql/catalyst/encoders/ExpressionEncoder.scala  |  2 +-
 .../sql/catalyst/expressions/Expression.scala  | 12 +++---
 .../sql/catalyst/expressions/arithmetic.scala  |  4 +-
 .../expressions/higherOrderFunctions.scala |  2 +-
 .../spark/sql/errors/QueryExecutionErrors.scala|  6 +--
 .../sql/errors/QueryExecutionErrorsSuite.scala | 50 --
 8 files changed, 84 insertions(+), 26 deletions(-)





(spark) branch master updated (69743ad98f34 -> a1e57b32e495)

2024-03-04 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 69743ad98f34 [SPARK-46094][CORE][FOLLOW-UP] Mark `writing` as volatile at ExecutorJVMProfiler
 add a1e57b32e495 [SPARK-47266][CONNECT] Make `ProtoUtils.abbreviate` return the same type as the input

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/connect/common/ProtoUtils.scala  |  8 ++--
 .../sql/connect/messages/AbbreviateSuite.scala | 43 +-
 2 files changed, 21 insertions(+), 30 deletions(-)

