bjornjorgensen commented on code in PR #608: URL: https://github.com/apache/spark-website/pull/608#discussion_r2104972978
########## releases/_posts/2025-05-23-spark-release-4-0-0.md: ##########
@@ -0,0 +1,694 @@
---
layout: post
title: Spark Release 4.0.0
categories: []
tags: []
status: publish
type: post
published: true
meta:
  _edit_last: '4'
  _wpas_done_all: '1'
---

Apache Spark 4.0.0 marks a significant milestone as the inaugural release in the 4.x series, embodying the collective effort of the vibrant open-source community. This release is a testament to tremendous collaboration, resolving over 5,100 tickets with contributions from more than 390 individuals.

Spark Connect continues its rapid advancement, delivering substantial improvements:
- A new lightweight Python client ([pyspark-client](https://pypi.org/project/pyspark-client)) at just 1.5 MB.
- Full API compatibility for the Java client.
- Greatly expanded API coverage.
- ML on Spark Connect.
- A new client implementation for [Swift](https://github.com/apache/spark-connect-swift).

Spark SQL is significantly enriched with powerful new features designed to boost expressiveness and versatility for SQL workloads, such as VARIANT data type support, SQL user-defined functions, session variables, pipe syntax, and string collation.

PySpark sees continued investment in both functional breadth and the overall developer experience, bringing a native plotting API, a new Python Data Source API, support for Python UDTFs, and unified profiling for PySpark UDFs, alongside numerous other enhancements.

Structured Streaming evolves with key additions that provide greater control and ease of debugging, notably the Arbitrary State API v2 for more flexible state management and the State Data Source for easier debugging.

To download Apache Spark 4.0.0, please visit the [downloads](https://spark.apache.org/downloads.html) page. For [detailed changes](https://issues.apache.org/jira/projects/SPARK/versions/12353359), you can consult JIRA. We have also curated a list of high-level changes here, grouped by major modules.
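
As a quick taste, here is a minimal PySpark sketch (session setup and literal values are illustrative, not from the release notes) exercising three of the new SQL features: pipe syntax, session variables, and the VARIANT type:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# SQL pipe syntax (SPARK-49555): query stages chained with |>.
spark.sql("""
    FROM range(10)
      |> WHERE id > 5
      |> SELECT id * 2 AS doubled
""").show()

# Session variables (SPARK-42849).
spark.sql("DECLARE VARIABLE threshold INT DEFAULT 3")
spark.sql("SELECT * FROM range(10) WHERE id < threshold").show()

# VARIANT (SPARK-45827): query semi-structured data without a fixed schema.
spark.sql("""
    SELECT variant_get(parse_json('{"a": {"b": 42}}'), '$.a.b', 'int') AS b
""").show()
```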

* This will become a table of contents (this text will be scraped).
{:toc}


### Core and Spark SQL Highlights

- [[SPARK-45314]](https://issues.apache.org/jira/browse/SPARK-45314) Drop Scala 2.12 and make Scala 2.13 the default
- [[SPARK-45315]](https://issues.apache.org/jira/browse/SPARK-45315) Drop JDK 8/11 and make JDK 17 the default
- [[SPARK-45923]](https://issues.apache.org/jira/browse/SPARK-45923) Spark Kubernetes Operator
- [[SPARK-45869]](https://issues.apache.org/jira/browse/SPARK-45869) Revisit and improve Spark Standalone Cluster
- [[SPARK-42849]](https://issues.apache.org/jira/browse/SPARK-42849) Session Variables
- [[SPARK-44444]](https://issues.apache.org/jira/browse/SPARK-44444) Use ANSI SQL mode by default
- [[SPARK-46057]](https://issues.apache.org/jira/browse/SPARK-46057) Support SQL user-defined functions (example below)
- [[SPARK-45827]](https://issues.apache.org/jira/browse/SPARK-45827) Add VARIANT data type
- [[SPARK-49555]](https://issues.apache.org/jira/browse/SPARK-49555) SQL Pipe syntax
- [[SPARK-46830]](https://issues.apache.org/jira/browse/SPARK-46830) String Collation support
- [[SPARK-44265]](https://issues.apache.org/jira/browse/SPARK-44265) Built-in XML data source support
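
SQL user-defined functions ([SPARK-46057](https://issues.apache.org/jira/browse/SPARK-46057)) can be declared entirely in SQL; a minimal sketch (the `area` function and values are made up for illustration):

```python
# A scalar SQL UDF: defined and invoked with no Python or Scala code paths.
spark.sql("""
    CREATE OR REPLACE TEMPORARY FUNCTION area(w DOUBLE, h DOUBLE)
    RETURNS DOUBLE
    RETURN w * h
""")
spark.sql("SELECT area(3.0, 4.0) AS a").show()
```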

### Spark Core

- [[SPARK-49524]](https://issues.apache.org/jira/browse/SPARK-49524) Improve K8s support
- [[SPARK-47240]](https://issues.apache.org/jira/browse/SPARK-47240) SPIP: Structured Logging Framework for Apache Spark
- [[SPARK-44893]](https://issues.apache.org/jira/browse/SPARK-44893) `ThreadInfo` improvements for monitoring APIs
- [[SPARK-46861]](https://issues.apache.org/jira/browse/SPARK-46861) Avoid Deadlock in DAGScheduler
- [[SPARK-47764]](https://issues.apache.org/jira/browse/SPARK-47764) Cleanup shuffle dependencies based on `ShuffleCleanupMode`
- [[SPARK-49459]](https://issues.apache.org/jira/browse/SPARK-49459) Support CRC32C for Shuffle Checksum
- [[SPARK-46383]](https://issues.apache.org/jira/browse/SPARK-46383) Reduce Driver Heap Usage by shortening `TaskInfo.accumulables()` lifespan
- [[SPARK-45527]](https://issues.apache.org/jira/browse/SPARK-45527) Use fraction-based resource calculation
- [[SPARK-47172]](https://issues.apache.org/jira/browse/SPARK-47172) Add AES-GCM as an optional AES cipher mode for RPC encryption
- [[SPARK-47448]](https://issues.apache.org/jira/browse/SPARK-47448) Enable `spark.shuffle.service.removeShuffle` by default
- [[SPARK-47674]](https://issues.apache.org/jira/browse/SPARK-47674) Enable `spark.metrics.appStatusSource.enabled` by default
- [[SPARK-48063]](https://issues.apache.org/jira/browse/SPARK-48063) Enable `spark.stage.ignoreDecommissionFetchFailure` by default
- [[SPARK-48268]](https://issues.apache.org/jira/browse/SPARK-48268) Add `spark.checkpoint.dir` config
- [[SPARK-48292]](https://issues.apache.org/jira/browse/SPARK-48292) Revert SPARK-39195 (OutputCommitCoordinator) to fix duplication issues
- [[SPARK-48518]](https://issues.apache.org/jira/browse/SPARK-48518) Make LZF compression run in parallel
- [[SPARK-46132]](https://issues.apache.org/jira/browse/SPARK-46132) Support key password for JKS keys for RPC SSL
- [[SPARK-46456]](https://issues.apache.org/jira/browse/SPARK-46456) Add `spark.ui.jettyStopTimeout` to set Jetty server stop timeout
- [[SPARK-46256]](https://issues.apache.org/jira/browse/SPARK-46256) Parallel Compression Support for ZSTD
- [[SPARK-45544]](https://issues.apache.org/jira/browse/SPARK-45544) Integrate SSL support into `TransportContext`
- [[SPARK-45351]](https://issues.apache.org/jira/browse/SPARK-45351) Change `spark.shuffle.service.db.backend` default value to `ROCKSDB`
- [[SPARK-44741]](https://issues.apache.org/jira/browse/SPARK-44741) Support regex-based `MetricFilter` in `StatsdSink`
- [[SPARK-43987]](https://issues.apache.org/jira/browse/SPARK-43987) Separate `finalizeShuffleMerge` Processing to Dedicated Thread Pools
- [[SPARK-45439]](https://issues.apache.org/jira/browse/SPARK-45439) Reduce memory usage of `LiveStageMetrics.accumIdsToMetricType`


### Spark SQL

#### Features

- [[SPARK-50541]](https://issues.apache.org/jira/browse/SPARK-50541) Describe Table As JSON
- [[SPARK-48031]](https://issues.apache.org/jira/browse/SPARK-48031) Support view schema evolution
- [[SPARK-50883]](https://issues.apache.org/jira/browse/SPARK-50883) Support altering multiple columns in the same command
- [[SPARK-47627]](https://issues.apache.org/jira/browse/SPARK-47627) Add `SQL MERGE` syntax to enable schema evolution
- [[SPARK-47430]](https://issues.apache.org/jira/browse/SPARK-47430) Support `GROUP BY` for `MapType`
- [[SPARK-49093]](https://issues.apache.org/jira/browse/SPARK-49093) `GROUP BY` with MapType nested inside complex type
- [[SPARK-49098]](https://issues.apache.org/jira/browse/SPARK-49098) Add write options for `INSERT`
- [[SPARK-49451]](https://issues.apache.org/jira/browse/SPARK-49451) Allow duplicate keys in `parse_json`
- [[SPARK-46536]](https://issues.apache.org/jira/browse/SPARK-46536) Support `GROUP BY calendar_interval_type`
- [[SPARK-46908]](https://issues.apache.org/jira/browse/SPARK-46908) Support star clause in `WHERE` clause
- [[SPARK-36680]](https://issues.apache.org/jira/browse/SPARK-36680) Support dynamic table options via `WITH OPTIONS` syntax
- [[SPARK-35553]](https://issues.apache.org/jira/browse/SPARK-35553) Improve correlated subqueries
- [[SPARK-47492]](https://issues.apache.org/jira/browse/SPARK-47492) Widen whitespace rules in lexer to allow Unicode
- [[SPARK-46246]](https://issues.apache.org/jira/browse/SPARK-46246) `EXECUTE IMMEDIATE` SQL support (example below)
- [[SPARK-46207]](https://issues.apache.org/jira/browse/SPARK-46207) Support `MergeInto` in DataFrameWriterV2
- [[SPARK-50129]](https://issues.apache.org/jira/browse/SPARK-50129) Add DataFrame APIs for subqueries
- [[SPARK-50075]](https://issues.apache.org/jira/browse/SPARK-50075) DataFrame APIs for table-valued functions
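
As an illustration of `EXECUTE IMMEDIATE` ([SPARK-46246](https://issues.apache.org/jira/browse/SPARK-46246)), a small sketch with positional parameters (the query string and values are ours):

```python
# Positional parameters are bound with USING; the SQL string could also
# come from a session variable instead of a literal.
spark.sql("EXECUTE IMMEDIATE 'SELECT ? * ? AS product' USING 6, 7").show()
```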

#### Functions

- [[SPARK-52016]](https://issues.apache.org/jira/browse/SPARK-52016) New built-in functions in Spark 4.0
- [[SPARK-44001]](https://issues.apache.org/jira/browse/SPARK-44001) Add option to allow unwrapping protobuf well-known wrapper types
- [[SPARK-43427]](https://issues.apache.org/jira/browse/SPARK-43427) Spark Protobuf: allow upcasting unsigned integer types
- [[SPARK-44983]](https://issues.apache.org/jira/browse/SPARK-44983) Convert `binary` to `string` by `to_char` for the formats: hex, base64, utf-8 (example below)
- [[SPARK-44868]](https://issues.apache.org/jira/browse/SPARK-44868) Convert `datetime` to `string` by `to_char`/`to_varchar`
- [[SPARK-45796]](https://issues.apache.org/jira/browse/SPARK-45796) Support `MODE() WITHIN GROUP (ORDER BY col)`
- [[SPARK-48658]](https://issues.apache.org/jira/browse/SPARK-48658) Encode/Decode functions report coding errors instead of mojibake
- [[SPARK-45034]](https://issues.apache.org/jira/browse/SPARK-45034) Support deterministic mode function
- [[SPARK-44778]](https://issues.apache.org/jira/browse/SPARK-44778) Add the alias `TIMEDIFF` for `TIMESTAMPDIFF`
- [[SPARK-47497]](https://issues.apache.org/jira/browse/SPARK-47497) Make `to_csv` support arrays/maps/binary as pretty strings
- [[SPARK-44840]](https://issues.apache.org/jira/browse/SPARK-44840) Make `array_insert()` 1-based for negative indexes
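
A couple of the new function behaviors, sketched with illustrative literals:

```python
# to_char over binary (SPARK-44983): hex, base64, or utf-8 output formats.
spark.sql("SELECT to_char(X'537061726B', 'utf-8') AS text").show()  # -> Spark

# mode() as an inverse-distribution function (SPARK-45796).
spark.sql("""
    SELECT mode() WITHIN GROUP (ORDER BY v) AS most_common
    FROM VALUES (1), (2), (2) AS t(v)
""").show()
```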

#### Query optimization

- [[SPARK-46946]](https://issues.apache.org/jira/browse/SPARK-46946) Supporting broadcast of multiple filtering keys in `DynamicPruning`
- [[SPARK-48445]](https://issues.apache.org/jira/browse/SPARK-48445) Don't inline UDFs with expensive children
- [[SPARK-41413]](https://issues.apache.org/jira/browse/SPARK-41413) Avoid shuffle in Storage-Partitioned Join when partition keys mismatch, but expressions are compatible
- [[SPARK-46941]](https://issues.apache.org/jira/browse/SPARK-46941) Prevent insertion of window group limit node with `SizeBasedWindowFunction`
- [[SPARK-46707]](https://issues.apache.org/jira/browse/SPARK-46707) Add throwable field to expressions to improve predicate pushdown
- [[SPARK-47511]](https://issues.apache.org/jira/browse/SPARK-47511) Canonicalize `WITH` expressions by reassigning IDs
- [[SPARK-46502]](https://issues.apache.org/jira/browse/SPARK-46502) Support timestamp types in `UnwrapCastInBinaryComparison`
- [[SPARK-46069]](https://issues.apache.org/jira/browse/SPARK-46069) Support unwrap timestamp type to date type
- [[SPARK-46219]](https://issues.apache.org/jira/browse/SPARK-46219) Unwrap cast in join predicates
- [[SPARK-45606]](https://issues.apache.org/jira/browse/SPARK-45606) Release restrictions on multi-layer runtime filter
- [[SPARK-45909]](https://issues.apache.org/jira/browse/SPARK-45909) Remove `NumericType` cast if it can safely up-cast in `IsNotNull`

#### Query execution

- [[SPARK-45592]](https://issues.apache.org/jira/browse/SPARK-45592) Correctness issue in AQE with `InMemoryTableScanExec`
- [[SPARK-50258]](https://issues.apache.org/jira/browse/SPARK-50258) Fix output column order changed issue after AQE
- [[SPARK-46693]](https://issues.apache.org/jira/browse/SPARK-46693) Inject `LocalLimitExec` when matching `OffsetAndLimit` or `LimitAndOffset`
- [[SPARK-48873]](https://issues.apache.org/jira/browse/SPARK-48873) Use `UnsafeRow` in JSON parser
- [[SPARK-41471]](https://issues.apache.org/jira/browse/SPARK-41471) Reduce Spark shuffle when only one side of a join is `KeyGroupedPartitioning`
- [[SPARK-45452]](https://issues.apache.org/jira/browse/SPARK-45452) Improve `InMemoryFileIndex` to use `FileSystem.listFiles` API
- [[SPARK-48649]](https://issues.apache.org/jira/browse/SPARK-48649) Add `ignoreInvalidPartitionPaths` configs for skipping invalid partition paths
- [[SPARK-45882]](https://issues.apache.org/jira/browse/SPARK-45882) `BroadcastHashJoinExec` propagate partitioning should respect CoalescedHashPartitioning


### Spark Connectors

#### Data Source V2 framework

- [[SPARK-45784]](https://issues.apache.org/jira/browse/SPARK-45784) Introduce clustering mechanism to Spark
- [[SPARK-50820]](https://issues.apache.org/jira/browse/SPARK-50820) DSv2: Conditional nullification of metadata columns in DML
- [[SPARK-51938]](https://issues.apache.org/jira/browse/SPARK-51938) Improve Storage Partition Join
- [[SPARK-50700]](https://issues.apache.org/jira/browse/SPARK-50700) `spark.sql.catalog.spark_catalog` supports builtin magic value
- [[SPARK-48781]](https://issues.apache.org/jira/browse/SPARK-48781) Add Catalog APIs for loading stored procedures
- [[SPARK-49246]](https://issues.apache.org/jira/browse/SPARK-49246) `TableCatalog#loadTable` should indicate if it's for writing
- [[SPARK-45965]](https://issues.apache.org/jira/browse/SPARK-45965) Move DSv2 partitioning expressions into functions.partitioning
- [[SPARK-46272]](https://issues.apache.org/jira/browse/SPARK-46272) Support CTAS using DSv2 sources
- [[SPARK-46043]](https://issues.apache.org/jira/browse/SPARK-46043) Support create table using DSv2 sources
- [[SPARK-48668]](https://issues.apache.org/jira/browse/SPARK-48668) Support `ALTER NAMESPACE ... UNSET PROPERTIES` in v2
- [[SPARK-46442]](https://issues.apache.org/jira/browse/SPARK-46442) DS V2 supports push down `PERCENTILE_CONT` and `PERCENTILE_DISC`
- [[SPARK-49078]](https://issues.apache.org/jira/browse/SPARK-49078) Support show columns syntax in v2 table

#### Hive Catalog

- [[SPARK-45328]](https://issues.apache.org/jira/browse/SPARK-45328) Remove Hive support prior to 2.0.0
- [[SPARK-47101]](https://issues.apache.org/jira/browse/SPARK-47101) Allow comma in top-level column names and relax HiveExternalCatalog schema check
- [[SPARK-45265]](https://issues.apache.org/jira/browse/SPARK-45265) Support Hive 4.0 metastore

#### XML

- [[SPARK-44265]](https://issues.apache.org/jira/browse/SPARK-44265) Built-in XML data source support (example below)
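
Reading and writing XML with the built-in source looks like the following sketch; the file paths and `rowTag`/`rootTag` values are placeholders:

```python
# Built-in XML data source (SPARK-44265), ported from the spark-xml package.
df = (spark.read.format("xml")
      .option("rowTag", "book")      # XML element that maps to one row
      .load("books.xml"))

(df.write.format("xml")
   .option("rootTag", "books")
   .option("rowTag", "book")
   .save("books-out"))
```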

#### CSV

- [[SPARK-46862]](https://issues.apache.org/jira/browse/SPARK-46862) Disable CSV column pruning in multi-line mode
- [[SPARK-46890]](https://issues.apache.org/jira/browse/SPARK-46890) Fix CSV parsing bug with default values and column pruning
- [[SPARK-50616]](https://issues.apache.org/jira/browse/SPARK-50616) Add File Extension Option to CSV DataSource Writer
- [[SPARK-49125]](https://issues.apache.org/jira/browse/SPARK-49125) Allow duplicated column names in CSV writing
- [[SPARK-49016]](https://issues.apache.org/jira/browse/SPARK-49016) Restore behavior for queries from raw CSV files
- [[SPARK-48807]](https://issues.apache.org/jira/browse/SPARK-48807) Binary support for CSV datasource
- [[SPARK-48602]](https://issues.apache.org/jira/browse/SPARK-48602) Make CSV generator support different output styles via `spark.sql.binaryOutputStyle`

#### ORC

- [[SPARK-46648]](https://issues.apache.org/jira/browse/SPARK-46648) Use zstd as the default ORC compression
- [[SPARK-47456]](https://issues.apache.org/jira/browse/SPARK-47456) Support ORC Brotli codec
- [[SPARK-41858]](https://issues.apache.org/jira/browse/SPARK-41858) Fix ORC reader perf regression due to DEFAULT value feature

#### Avro

- [[SPARK-47739]](https://issues.apache.org/jira/browse/SPARK-47739) Register logical Avro type
- [[SPARK-49082]](https://issues.apache.org/jira/browse/SPARK-49082) Widening type promotions in `AvroDeserializer`
- [[SPARK-46633]](https://issues.apache.org/jira/browse/SPARK-46633) Fix Avro reader to handle zero-length blocks
- [[SPARK-50350]](https://issues.apache.org/jira/browse/SPARK-50350) Avro: add new function `schema_of_avro` (Scala side)
- [[SPARK-46930]](https://issues.apache.org/jira/browse/SPARK-46930) Add support for custom prefix for Union type fields in Avro
- [[SPARK-46746]](https://issues.apache.org/jira/browse/SPARK-46746) Attach codec extension to Avro datasource files
- [[SPARK-46759]](https://issues.apache.org/jira/browse/SPARK-46759) Support compression level for xz and zstandard in Avro
- [[SPARK-46766]](https://issues.apache.org/jira/browse/SPARK-46766) Add ZSTD Buffer Pool support for Avro datasource
- [[SPARK-43380]](https://issues.apache.org/jira/browse/SPARK-43380) Fix Avro data type conversion issues without causing performance regression
- [[SPARK-48545]](https://issues.apache.org/jira/browse/SPARK-48545) Create `to_avro` and `from_avro` SQL functions
- [[SPARK-46990]](https://issues.apache.org/jira/browse/SPARK-46990) Fix loading empty Avro files (infinite loop)

#### JDBC

- [[SPARK-47361]](https://issues.apache.org/jira/browse/SPARK-47361) Improve JDBC data sources
- [[SPARK-44977]](https://issues.apache.org/jira/browse/SPARK-44977) Upgrade Derby to 10.16.1.1
- [[SPARK-47044]](https://issues.apache.org/jira/browse/SPARK-47044) Add executed query for JDBC external datasources to explain output
- [[SPARK-45139]](https://issues.apache.org/jira/browse/SPARK-45139) Add `DatabricksDialect` to handle SQL type conversion

#### Other notable Spark Connectors changes

- [[SPARK-45905]](https://issues.apache.org/jira/browse/SPARK-45905) Least common type between decimal types should retain integral digits first
- [[SPARK-45786]](https://issues.apache.org/jira/browse/SPARK-45786) Fix inaccurate Decimal multiplication and division results
- [[SPARK-50705]](https://issues.apache.org/jira/browse/SPARK-50705) Make `QueryPlan` lock-free
- [[SPARK-46743]](https://issues.apache.org/jira/browse/SPARK-46743) Fix corner-case with `COUNT` + constant folding subquery
- [[SPARK-47509]](https://issues.apache.org/jira/browse/SPARK-47509) Block subquery expressions in lambda/higher-order functions for correctness
- [[SPARK-48498]](https://issues.apache.org/jira/browse/SPARK-48498) Always do char padding in predicates
- [[SPARK-45915]](https://issues.apache.org/jira/browse/SPARK-45915) Treat decimal(x, 0) the same as IntegralType in PromoteStrings
- [[SPARK-46220]](https://issues.apache.org/jira/browse/SPARK-46220) Restrict charsets in decode()
- [[SPARK-45816]](https://issues.apache.org/jira/browse/SPARK-45816) Return `NULL` when overflowing during casting from timestamp to integers
- [[SPARK-45586]](https://issues.apache.org/jira/browse/SPARK-45586) Reduce compiler latency for plans with large expression trees
- [[SPARK-45507]](https://issues.apache.org/jira/browse/SPARK-45507) Correctness fix for nested correlated scalar subqueries with `COUNT` aggregates
- [[SPARK-44550]](https://issues.apache.org/jira/browse/SPARK-44550) Enable correctness fixes for null `IN` (empty list) under ANSI
- [[SPARK-47911]](https://issues.apache.org/jira/browse/SPARK-47911) Introduces a universal `BinaryFormatter` to make binary output consistent


### PySpark Highlights

- [[SPARK-49530]](https://issues.apache.org/jira/browse/SPARK-49530) Introducing PySpark Plotting API
- [[SPARK-47540]](https://issues.apache.org/jira/browse/SPARK-47540) SPIP: Pure Python Package (Spark Connect)
- [[SPARK-50132]](https://issues.apache.org/jira/browse/SPARK-50132) Add DataFrame API for Lateral Joins
- [[SPARK-45981]](https://issues.apache.org/jira/browse/SPARK-45981) Improve Python language test coverage
- [[SPARK-46858]](https://issues.apache.org/jira/browse/SPARK-46858) Upgrade Pandas to 2
- [[SPARK-46910]](https://issues.apache.org/jira/browse/SPARK-46910) Eliminate JDK Requirement in PySpark Installation
- [[SPARK-47274]](https://issues.apache.org/jira/browse/SPARK-47274) Provide more useful context for DataFrame API errors
- [[SPARK-44076]](https://issues.apache.org/jira/browse/SPARK-44076) SPIP: Python Data Source API (example below)
- [[SPARK-43797]](https://issues.apache.org/jira/browse/SPARK-43797) Python User-defined Table Functions
- [[SPARK-46685]](https://issues.apache.org/jira/browse/SPARK-46685) PySpark UDF Unified Profiling
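
A minimal sketch of the Python Data Source API ([SPARK-44076](https://issues.apache.org/jira/browse/SPARK-44076)); the `counter` source below is a made-up example, not a built-in:

```python
from pyspark.sql.datasource import DataSource, DataSourceReader

class CounterDataSource(DataSource):
    """A toy batch source that emits the rows 0, 1, 2."""

    @classmethod
    def name(cls):
        return "counter"

    def schema(self):
        return "id INT"

    def reader(self, schema):
        return CounterReader()

class CounterReader(DataSourceReader):
    def read(self, partition):
        # Yield plain tuples matching the declared schema.
        for i in range(3):
            yield (i,)

spark.dataSource.register(CounterDataSource)
spark.read.format("counter").load().show()
```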

#### DataFrame APIs and Features

- [[SPARK-51079]](https://issues.apache.org/jira/browse/SPARK-51079) Support large variable types in pandas UDF, `createDataFrame` and `toPandas` with Arrow
- [[SPARK-50718]](https://issues.apache.org/jira/browse/SPARK-50718) Support `addArtifact(s)` for PySpark
- [[SPARK-50778]](https://issues.apache.org/jira/browse/SPARK-50778) Add `metadataColumn` to PySpark DataFrame
- [[SPARK-50719]](https://issues.apache.org/jira/browse/SPARK-50719) Support `interruptOperation` for PySpark
- [[SPARK-50790]](https://issues.apache.org/jira/browse/SPARK-50790) Implement `parse_json` in PySpark
- [[SPARK-49306]](https://issues.apache.org/jira/browse/SPARK-49306) Create SQL function aliases for `zeroifnull` and `nullifzero`
- [[SPARK-50132]](https://issues.apache.org/jira/browse/SPARK-50132) Add DataFrame API for Lateral Joins
- [[SPARK-43295]](https://issues.apache.org/jira/browse/SPARK-43295) Support string type columns for `DataFrameGroupBy.sum`
- [[SPARK-45575]](https://issues.apache.org/jira/browse/SPARK-45575) Support time travel options for `df.read` API
- [[SPARK-45755]](https://issues.apache.org/jira/browse/SPARK-45755) Improve `Dataset.isEmpty()` by applying global limit 1
  - Improves performance of isEmpty() by pushing down a global limit of 1.
- [[SPARK-48761]](https://issues.apache.org/jira/browse/SPARK-48761) Introduce `clusterBy` DataFrameWriter API for Scala
- [[SPARK-45929]](https://issues.apache.org/jira/browse/SPARK-45929) Support `groupingSets` operation in DataFrame API (example below)
  - Extends `groupingSets(...)` to DataFrame/DS-level APIs.
- [[SPARK-40178]](https://issues.apache.org/jira/browse/SPARK-40178) Support coalesce hints with ease for PySpark and R
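
A short sketch of `DataFrame.groupingSets` ([SPARK-45929](https://issues.apache.org/jira/browse/SPARK-45929) / [SPARK-46048](https://issues.apache.org/jira/browse/SPARK-46048)) with illustrative data:

```python
from pyspark.sql import functions as sf

df = spark.createDataFrame(
    [("NY", "sedan", 5), ("NY", "suv", 3), ("SF", "sedan", 2)],
    ["city", "car_model", "quantity"])

# Aggregate over several grouping sets in one pass, GROUPING SETS-style.
(df.groupingSets([("city", "car_model"), ("city",), ()], "city", "car_model")
   .agg(sf.sum("quantity").alias("sum"))
   .show())
```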

#### Pandas API on Spark

- [[SPARK-46931]](https://issues.apache.org/jira/browse/SPARK-46931) Implement `{Frame, Series}.to_hdf`
- [[SPARK-46936]](https://issues.apache.org/jira/browse/SPARK-46936) Implement `Frame.to_feather`
- [[SPARK-46955]](https://issues.apache.org/jira/browse/SPARK-46955) Implement `Frame.to_stata`
- [[SPARK-46976]](https://issues.apache.org/jira/browse/SPARK-46976) Implement `DataFrameGroupBy.corr`
- [[SPARK-49344]](https://issues.apache.org/jira/browse/SPARK-49344) Support `json_normalize` for Pandas API on Spark (example below)
- [[SPARK-45552]](https://issues.apache.org/jira/browse/SPARK-45552) Introduce flexible parameters to `assertDataFrameEqual`
- [[SPARK-47824]](https://issues.apache.org/jira/browse/SPARK-47824) Fix nondeterminism in pyspark.pandas.series.asof
- [[SPARK-46926]](https://issues.apache.org/jira/browse/SPARK-46926) Add `convert_dtypes`, `infer_objects`, `set_axis` in fallback list
- [[SPARK-48295]](https://issues.apache.org/jira/browse/SPARK-48295) Turn on `compute.ops_on_diff_frames` by default
- [[SPARK-48336]](https://issues.apache.org/jira/browse/SPARK-48336) Implement `ps.sql` in Spark Connect
- [[SPARK-45267]](https://issues.apache.org/jira/browse/SPARK-45267) Change the default value for `numeric_only`
- [[SPARK-42619]](https://issues.apache.org/jira/browse/SPARK-42619) Add show_counts parameter for `DataFrame.info`
- [[SPARK-42620]](https://issues.apache.org/jira/browse/SPARK-42620) Add inclusive parameter for `(DataFrame|Series).between_time`
- [[SPARK-42621]](https://issues.apache.org/jira/browse/SPARK-42621) Add inclusive parameter for `pd.date_range`
- [[SPARK-45553]](https://issues.apache.org/jira/browse/SPARK-45553) Deprecate `assertPandasOnSparkEqual`
- [[SPARK-45718]](https://issues.apache.org/jira/browse/SPARK-45718) Remove remaining deprecated Pandas features from Spark 3.4.0
- [[SPARK-45550]](https://issues.apache.org/jira/browse/SPARK-45550) Remove deprecated APIs from Pandas API on Spark
- [[SPARK-45634]](https://issues.apache.org/jira/browse/SPARK-45634) Remove `DataFrame.get_dtype_counts` from Pandas API on Spark
- [[SPARK-45165]](https://issues.apache.org/jira/browse/SPARK-45165) Remove `inplace` parameter from CategoricalIndex APIs
- [[SPARK-45177]](https://issues.apache.org/jira/browse/SPARK-45177) Remove `col_space` parameter from `to_latex`
- [[SPARK-45164]](https://issues.apache.org/jira/browse/SPARK-45164) Remove deprecated Index APIs
- [[SPARK-45180]](https://issues.apache.org/jira/browse/SPARK-45180) Remove boolean inputs for inclusive parameter from `Series.between`
- [[SPARK-43709]](https://issues.apache.org/jira/browse/SPARK-43709) Remove closed parameter from `ps.date_range` & enable test
- [[SPARK-43453]](https://issues.apache.org/jira/browse/SPARK-43453) Ignore the names of `MultiIndex` when `axis=1` for `concat`
- [[SPARK-43433]](https://issues.apache.org/jira/browse/SPARK-43433) Match `GroupBy.nth` behavior to the latest Pandas
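
For example, `json_normalize` ([SPARK-49344](https://issues.apache.org/jira/browse/SPARK-49344)) flattens nested records; a small sketch with illustrative data:

```python
import pyspark.pandas as ps

# Nested dicts become dotted column names, mirroring pandas.json_normalize.
psdf = ps.json_normalize([{"id": 1, "info": {"city": "NY"}},
                          {"id": 2, "info": {"city": "SF"}}])
# columns: id, info.city
```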

#### Other notable PySpark changes

- [[SPARK-50357]](https://issues.apache.org/jira/browse/SPARK-50357) Support `Interrupt(Tag|All)` APIs for PySpark
- [[SPARK-50392]](https://issues.apache.org/jira/browse/SPARK-50392) DataFrame conversion to table argument in Spark Classic
- [[SPARK-50752]](https://issues.apache.org/jira/browse/SPARK-50752) Introduce configs for tuning Python UDF without Arrow
- [[SPARK-47366]](https://issues.apache.org/jira/browse/SPARK-47366) Add VariantVal for PySpark
- [[SPARK-47683]](https://issues.apache.org/jira/browse/SPARK-47683) Decouple PySpark core API to pyspark.core package
- [[SPARK-47565]](https://issues.apache.org/jira/browse/SPARK-47565) Improve PySpark worker pool crash resilience
- [[SPARK-47933]](https://issues.apache.org/jira/browse/SPARK-47933) Parent Column class for Spark Connect and Spark Classic
- [[SPARK-50499]](https://issues.apache.org/jira/browse/SPARK-50499) Expose metrics from `BasePythonRunner`
- [[SPARK-50220]](https://issues.apache.org/jira/browse/SPARK-50220) Support `listagg` in PySpark
- [[SPARK-46910]](https://issues.apache.org/jira/browse/SPARK-46910) Eliminate JDK Requirement in PySpark Installation
- [[SPARK-46522]](https://issues.apache.org/jira/browse/SPARK-46522) Block Python data source registration with name conflicts
- [[SPARK-48996]](https://issues.apache.org/jira/browse/SPARK-48996) Allow bare Python literals in Column `and`/`or`
- [[SPARK-48762]](https://issues.apache.org/jira/browse/SPARK-48762) Introduce `clusterBy` DataFrameWriter API for Python
- [[SPARK-49009]](https://issues.apache.org/jira/browse/SPARK-49009) Make Column APIs accept Python Enums
- [[SPARK-45891]](https://issues.apache.org/jira/browse/SPARK-45891) Add interval types in Variant Spec
- [[SPARK-48710]](https://issues.apache.org/jira/browse/SPARK-48710) Use NumPy 2.0-compatible types
- [[SPARK-48714]](https://issues.apache.org/jira/browse/SPARK-48714) Implement `DataFrame.mergeInto` in PySpark
- [[SPARK-48798]](https://issues.apache.org/jira/browse/SPARK-48798) Introduce `spark.profile.render` for SparkSession-based profiling
- [[SPARK-47346]](https://issues.apache.org/jira/browse/SPARK-47346) Make daemon mode configurable for Python planner workers
- [[SPARK-47366]](https://issues.apache.org/jira/browse/SPARK-47366) Add `parse_json` alias in PySpark/dataframe
- [[SPARK-48247]](https://issues.apache.org/jira/browse/SPARK-48247) Use all `dict` pairs in `MapType` schema inference
- [[SPARK-48340]](https://issues.apache.org/jira/browse/SPARK-48340) Support `TimestampNTZ` schema inference with `prefer_timestamp_ntz`
- [[SPARK-48220]](https://issues.apache.org/jira/browse/SPARK-48220) Allow passing PyArrow Table to `createDataFrame()`
- [[SPARK-48482]](https://issues.apache.org/jira/browse/SPARK-48482) `dropDuplicates`, `dropDuplicatesWithinWatermark` accept var-args
- [[SPARK-48508]](https://issues.apache.org/jira/browse/SPARK-48508) Client Side RPC optimization for Spark Connect
- [[SPARK-50311]](https://issues.apache.org/jira/browse/SPARK-50311) (`add`|`remove`|`get`|`clear`)Tag(s) APIs
- [[SPARK-50238]](https://issues.apache.org/jira/browse/SPARK-50238) Add Variant Support in PySpark UDFs/UDTFs/UDAFs
- [[SPARK-50446]](https://issues.apache.org/jira/browse/SPARK-50446) Concurrent level in Arrow-optimized Python UDF
- [[SPARK-50310]](https://issues.apache.org/jira/browse/SPARK-50310) Add a flag to disable DataFrameQueryContext
- [[SPARK-50471]](https://issues.apache.org/jira/browse/SPARK-50471) Support Arrow-based Python Data Source Writer
- [[SPARK-49899]](https://issues.apache.org/jira/browse/SPARK-49899) Support `deleteIfExists` for `TransformWithStateInPandas`
- [[SPARK-45597]](https://issues.apache.org/jira/browse/SPARK-45597) Support creating table using a Python data source in SQL (DSv2 exec)
- [[SPARK-46424]](https://issues.apache.org/jira/browse/SPARK-46424) Support Python metrics in Python Data Source
- [[SPARK-45525]](https://issues.apache.org/jira/browse/SPARK-45525) Support for Python data source write using DSv2
- [[SPARK-41666]](https://issues.apache.org/jira/browse/SPARK-41666) Support parameterized SQL by `sql()` (example below)
- [[SPARK-45768]](https://issues.apache.org/jira/browse/SPARK-45768) Make `faulthandler` a runtime configuration for Python execution in SQL
- [[SPARK-45555]](https://issues.apache.org/jira/browse/SPARK-45555) Include a debuggable object for failed assertions
- [[SPARK-45600]](https://issues.apache.org/jira/browse/SPARK-45600) Make Python data source registration session level
- [[SPARK-46048]](https://issues.apache.org/jira/browse/SPARK-46048) Support DataFrame.groupingSets in PySpark
- [[SPARK-46103]](https://issues.apache.org/jira/browse/SPARK-46103) Enhance PySpark documentation
- [[SPARK-40559]](https://issues.apache.org/jira/browse/SPARK-40559) Add applyInArrow to groupBy and cogroup
- [[SPARK-45420]](https://issues.apache.org/jira/browse/SPARK-45420) Add `DataType.fromDDL` into PySpark
- [[SPARK-45554]](https://issues.apache.org/jira/browse/SPARK-45554) Introduce flexible parameter to `assertSchemaEqual`
- [[SPARK-44918]](https://issues.apache.org/jira/browse/SPARK-44918) Support named arguments in scalar Python/Pandas UDFs
- [[SPARK-45017]](https://issues.apache.org/jira/browse/SPARK-45017) Add `CalendarIntervalType` to PySpark
- [[SPARK-44952]](https://issues.apache.org/jira/browse/SPARK-44952) Support named arguments in aggregate Pandas UDFs
- [[SPARK-44665]](https://issues.apache.org/jira/browse/SPARK-44665) Add support for pandas DataFrame `assertDataFrameEqual`
- [[SPARK-44705]](https://issues.apache.org/jira/browse/SPARK-44705) Make PythonRunner single-threaded
- [[SPARK-45673]](https://issues.apache.org/jira/browse/SPARK-45673) Enhance clarity and usability of PySpark error messages
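
Parameterized `sql()` ([SPARK-41666](https://issues.apache.org/jira/browse/SPARK-41666)) in a nutshell (the query and values are illustrative):

```python
# Named parameter markers are bound from the args dict, avoiding string
# interpolation in the SQL text.
spark.sql(
    "SELECT * FROM range(10) WHERE id > :lo AND id < :hi",
    args={"lo": 2, "hi": 6},
).show()
```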

### Spark Streaming Highlights

- [[SPARK-46815]](https://issues.apache.org/jira/browse/SPARK-46815) Structured Streaming - Arbitrary State API v2
- [[SPARK-45511]](https://issues.apache.org/jira/browse/SPARK-45511) SPIP: State Data Source - Reader (example below)
- [[SPARK-46962]](https://issues.apache.org/jira/browse/SPARK-46962) Implement Python worker to run Python streaming data sources
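
A minimal sketch of the State Data Source reader ([SPARK-45511](https://issues.apache.org/jira/browse/SPARK-45511)); the checkpoint path is a placeholder:

```python
# Inspect the state store of an existing streaming query for debugging.
state = (spark.read.format("statestore")
         .load("/tmp/checkpoint"))   # checkpoint location of a streaming query
state.show()

# Options such as batchId, operatorId, or storeName narrow the read;
# SPARK-48772 adds a change-feed reader mode on top of the same source.
```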

#### Other notable Streaming changes

- [[SPARK-44865]](https://issues.apache.org/jira/browse/SPARK-44865) Make StreamingRelationV2 support metadata column
- [[SPARK-45080]](https://issues.apache.org/jira/browse/SPARK-45080) Explicitly call out support for columnar in DSv2 streaming data sources
- [[SPARK-45178]](https://issues.apache.org/jira/browse/SPARK-45178) Fall back to executing a single batch for Trigger.AvailableNow with unsupported sources
- [[SPARK-45415]](https://issues.apache.org/jira/browse/SPARK-45415) Allow selective disabling of "fallocate" in RocksDB statestore
- [[SPARK-45503]](https://issues.apache.org/jira/browse/SPARK-45503) Add Conf to Set RocksDB Compression
- [[SPARK-45511]](https://issues.apache.org/jira/browse/SPARK-45511) State Data Source - Reader
- [[SPARK-45558]](https://issues.apache.org/jira/browse/SPARK-45558) Introduce a metadata file for streaming stateful operator
- [[SPARK-45794]](https://issues.apache.org/jira/browse/SPARK-45794) Introduce state metadata source to query the streaming state metadata information
- [[SPARK-45815]](https://issues.apache.org/jira/browse/SPARK-45815) Provide an interface for other Streaming sources to add `_metadata` columns
- [[SPARK-45845]](https://issues.apache.org/jira/browse/SPARK-45845) Add number of evicted state rows to streaming UI
- [[SPARK-46641]](https://issues.apache.org/jira/browse/SPARK-46641) Add `maxBytesPerTrigger` threshold
- [[SPARK-46816]](https://issues.apache.org/jira/browse/SPARK-46816) Add base support for new arbitrary state management operator (multiple state variables/column families)
- [[SPARK-46865]](https://issues.apache.org/jira/browse/SPARK-46865) Add Batch Support for TransformWithState Operator
- [[SPARK-46906]](https://issues.apache.org/jira/browse/SPARK-46906) Add a check for stateful operator change for streaming
- [[SPARK-46961]](https://issues.apache.org/jira/browse/SPARK-46961) Use `ProcessorContext` to store and retrieve handle
- [[SPARK-46962]](https://issues.apache.org/jira/browse/SPARK-46962) Add interface for Python streaming data source & worker
- [[SPARK-47107]](https://issues.apache.org/jira/browse/SPARK-47107) Partition reader for Python streaming data sources
- [[SPARK-47273]](https://issues.apache.org/jira/browse/SPARK-47273) Python data stream writer interface
- [[SPARK-47553]](https://issues.apache.org/jira/browse/SPARK-47553) Add Java support for `transformWithState` operator APIs
- [[SPARK-47653]](https://issues.apache.org/jira/browse/SPARK-47653) Add support for negative numeric types and range scan key encoder
- [[SPARK-47733]](https://issues.apache.org/jira/browse/SPARK-47733) Add custom metrics for transformWithState operator part of query progress
- [[SPARK-47960]](https://issues.apache.org/jira/browse/SPARK-47960) Allow chaining other stateful operators after transformWithState
- [[SPARK-48447]](https://issues.apache.org/jira/browse/SPARK-48447) Check `StateStoreProvider` class before constructor
- [[SPARK-48569]](https://issues.apache.org/jira/browse/SPARK-48569) Handle edge cases in `query.name` for streaming queries
- [[SPARK-48589]](https://issues.apache.org/jira/browse/SPARK-48589) Add `snapshotStartBatchId` / `snapshotPartitionId` options to state data source
- [[SPARK-48726]](https://issues.apache.org/jira/browse/SPARK-48726) Create StateSchemaV3 file for `TransformWithStateExec`
- [[SPARK-48742]](https://issues.apache.org/jira/browse/SPARK-48742) Virtual Column Family for RocksDB (arbitrary stateful API v2)
- [[SPARK-48755]](https://issues.apache.org/jira/browse/SPARK-48755) `transformWithState` PySpark base implementation and `ValueState` support (see the sketch after this list)
- [[SPARK-48772]](https://issues.apache.org/jira/browse/SPARK-48772) State Data Source Change Feed Reader Mode
- [[SPARK-48836]](https://issues.apache.org/jira/browse/SPARK-48836) Integrate SQL schema with state schema/metadata for TWS operator
- [[SPARK-48849]](https://issues.apache.org/jira/browse/SPARK-48849) Create OperatorStateMetadataV2 for `TransformWithStateExec` operator
- [[SPARK-48931]](https://issues.apache.org/jira/browse/SPARK-48931) Reduce Cloud Store List API cost for state-store maintenance
- [[SPARK-49021]](https://issues.apache.org/jira/browse/SPARK-49021) Add support for reading `transformWithState` value state variables with state data source reader
- [[SPARK-49048]](https://issues.apache.org/jira/browse/SPARK-49048) Add support for reading operator metadata at given batch id
- [[SPARK-49191]](https://issues.apache.org/jira/browse/SPARK-49191) Read `transformWithState` map state with state data source
- [[SPARK-49259]](https://issues.apache.org/jira/browse/SPARK-49259) Size-based partition creation during Kafka read
- [[SPARK-49411]](https://issues.apache.org/jira/browse/SPARK-49411) Communicate State Store Checkpoint ID
- [[SPARK-49463]](https://issues.apache.org/jira/browse/SPARK-49463) ListState support in `TransformWithStateInPandas`
- [[SPARK-49467]](https://issues.apache.org/jira/browse/SPARK-49467) Add state data source reader for list state
- [[SPARK-49513]](https://issues.apache.org/jira/browse/SPARK-49513) Add timer support in `transformWithStateInPandas`
- [[SPARK-49630]](https://issues.apache.org/jira/browse/SPARK-49630) Add flatten option for collection types in state data source reader
- [[SPARK-49656]](https://issues.apache.org/jira/browse/SPARK-49656) Support state variables with value state collection types
- [[SPARK-49676]](https://issues.apache.org/jira/browse/SPARK-49676) Chaining of operators in `transformWithStateInPandas`
- [[SPARK-49699]](https://issues.apache.org/jira/browse/SPARK-49699) Disable `PruneFilters` for streaming workloads
- [[SPARK-49744]](https://issues.apache.org/jira/browse/SPARK-49744) TTL support for ListState in `TransformWithStateInPandas`
- [[SPARK-49745]](https://issues.apache.org/jira/browse/SPARK-49745) Read registered timers in `transformWithState`
- [[SPARK-49802]](https://issues.apache.org/jira/browse/SPARK-49802) Add support for read change feed for map/list types
- [[SPARK-49846]](https://issues.apache.org/jira/browse/SPARK-49846) Add `numUpdatedStateRows`/`numRemovedStateRows` metrics
- [[SPARK-49883]](https://issues.apache.org/jira/browse/SPARK-49883) State Store Checkpoint Structure V2 Integration with RocksDB and RocksDBFileManager
- [[SPARK-50017]](https://issues.apache.org/jira/browse/SPARK-50017) Support Avro encoding for `TransformWithState` operator
- [[SPARK-50035]](https://issues.apache.org/jira/browse/SPARK-50035) Explicit `handleExpiredTimer` function in the stateful processor
- [[SPARK-50128]](https://issues.apache.org/jira/browse/SPARK-50128) Add handle APIs using implicit encoders
- [[SPARK-50152]](https://issues.apache.org/jira/browse/SPARK-50152) Support handleInitialState with state data source reader
- [[SPARK-50194]](https://issues.apache.org/jira/browse/SPARK-50194) Integration of New Timer API and Initial State API
- [[SPARK-50378]](https://issues.apache.org/jira/browse/SPARK-50378) Add custom metric for time spent populating initial state
- [[SPARK-50428]](https://issues.apache.org/jira/browse/SPARK-50428) Support `TransformWithStateInPandas` in batch queries
- [[SPARK-50573]](https://issues.apache.org/jira/browse/SPARK-50573) Adding State Schema ID to State Rows for schema evolution
- [[SPARK-50714]](https://issues.apache.org/jira/browse/SPARK-50714) Enable schema evolution for `TransformWithState` with Avro encoding
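
A rough sketch of the Arbitrary State API v2 via `transformWithStateInPandas`; the names, schemas, and the input stream `events` are illustrative, and exact state-accessor signatures may differ slightly from this sketch:

```python
import pandas as pd
from pyspark.sql.types import StructType, StructField, LongType
from pyspark.sql.streaming import StatefulProcessor, StatefulProcessorHandle

class RunningCount(StatefulProcessor):
    """Counts rows per key; the count survives across micro-batches."""

    def init(self, handle: StatefulProcessorHandle) -> None:
        schema = StructType([StructField("count", LongType(), True)])
        self._count = handle.getValueState("count", schema)

    def handleInputRows(self, key, rows, timerValues):
        total = sum(len(pdf) for pdf in rows)      # rows arrive as pandas batches
        if self._count.exists():
            total += self._count.get()[0]
        self._count.update((total,))
        yield pd.DataFrame({"key": [key[0]], "count": [total]})

    def close(self) -> None:
        pass

counts = (events.groupBy("key")   # `events` is an illustrative streaming DataFrame
          .transformWithStateInPandas(
              statefulProcessor=RunningCount(),
              outputStructType="key STRING, count BIGINT",
              outputMode="Update",
              timeMode="None"))
```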

### Spark ML Highlights

- [[SPARK-48463]](https://issues.apache.org/jira/browse/SPARK-48463) Make various ML transformers, including `StringIndexer`, support nested input columns
- [[SPARK-45757]](https://issues.apache.org/jira/browse/SPARK-45757) Avoid re-computation of NNZ in Binarizer
- [[SPARK-45397]](https://issues.apache.org/jira/browse/SPARK-45397) Add array assembler feature transformer
- [[SPARK-45547]](https://issues.apache.org/jira/browse/SPARK-45547) Validate Vectors with built-in function

### Spark UX Highlights

- [[SPARK-44893]](https://issues.apache.org/jira/browse/SPARK-44893) `ThreadInfo` improvements for monitoring APIs
- [[SPARK-45595]](https://issues.apache.org/jira/browse/SPARK-45595) Expose `SQLSTATE` in error message
- [[SPARK-45022]](https://issues.apache.org/jira/browse/SPARK-45022) Provide context for dataset API errors
- [[SPARK-45771]](https://issues.apache.org/jira/browse/SPARK-45771) Enable `spark.eventLog.rolling.enabled` by default

#### Other notable Spark UX changes

- [[SPARK-41685]](https://issues.apache.org/jira/browse/SPARK-41685) Support Protobuf serializer for the KVStore in History server
- [[SPARK-44770]](https://issues.apache.org/jira/browse/SPARK-44770) Add a `displayOrder` variable to `WebUITab` to specify the order in which tabs appear
- [[SPARK-44801]](https://issues.apache.org/jira/browse/SPARK-44801) Capture analyzing failed queries in Listener and UI
- [[SPARK-44838]](https://issues.apache.org/jira/browse/SPARK-44838) `raise_error` improvement
- [[SPARK-44863]](https://issues.apache.org/jira/browse/SPARK-44863) Add a button to download thread dump as a txt in Spark UI
- [[SPARK-44895]](https://issues.apache.org/jira/browse/SPARK-44895) Add 'daemon', 'priority' for `ThreadStackTrace`
- [[SPARK-45022]](https://issues.apache.org/jira/browse/SPARK-45022) Provide context for dataset API errors
- [[SPARK-45151]](https://issues.apache.org/jira/browse/SPARK-45151) Task Level Thread Dump Support
- [[SPARK-45207]](https://issues.apache.org/jira/browse/SPARK-45207) Implement Error Enrichment for Scala Client
- [[SPARK-45209]](https://issues.apache.org/jira/browse/SPARK-45209) FlameGraph Support For Executor Thread Dump Page
- [[SPARK-45240]](https://issues.apache.org/jira/browse/SPARK-45240) Implement Error Enrichment for Python Client
- [[SPARK-45248]](https://issues.apache.org/jira/browse/SPARK-45248) Set the timeout for spark UI server
- [[SPARK-45274]](https://issues.apache.org/jira/browse/SPARK-45274) Implementation of a new DAG drawing approach for job/stage/plan graphics
- [[SPARK-45312]](https://issues.apache.org/jira/browse/SPARK-45312) Support toggle display/hide plan svg on execution page
- [[SPARK-45439]](https://issues.apache.org/jira/browse/SPARK-45439) Reduce memory usage of `LiveStageMetrics.accumIdsToMetricType`
- [[SPARK-45462]](https://issues.apache.org/jira/browse/SPARK-45462) Show Duration in ApplicationPage
- [[SPARK-45480]](https://issues.apache.org/jira/browse/SPARK-45480) Selectable Spark Plan Node on UI
- [[SPARK-45491]](https://issues.apache.org/jira/browse/SPARK-45491) Add missing SQLSTATES
- [[SPARK-45500]](https://issues.apache.org/jira/browse/SPARK-45500) Show the number of abnormally completed drivers in MasterPage
- [[SPARK-45516]](https://issues.apache.org/jira/browse/SPARK-45516) Include `QueryContext` in `SparkThrowable` proto message
- [[SPARK-45581]](https://issues.apache.org/jira/browse/SPARK-45581) Make `SQLSTATE` mandatory
- [[SPARK-45595]](https://issues.apache.org/jira/browse/SPARK-45595) Expose `SQLSTATE` in error message
- [[SPARK-45609]](https://issues.apache.org/jira/browse/SPARK-45609) Include `SqlState` in `SparkThrowable` proto message
- [[SPARK-45641]](https://issues.apache.org/jira/browse/SPARK-45641) Display the application start time on AllJobsPage
- [[SPARK-45771]](https://issues.apache.org/jira/browse/SPARK-45771) Enable `spark.eventLog.rolling.enabled` by default
- [[SPARK-45774]](https://issues.apache.org/jira/browse/SPARK-45774) Support `spark.master.ui.historyServerUrl` in ApplicationPage
- [[SPARK-45955]](https://issues.apache.org/jira/browse/SPARK-45955) Collapse Support for Flamegraph and thread dump details
- [[SPARK-46003]](https://issues.apache.org/jira/browse/SPARK-46003) Create a `ui-test` module with Jest to test UI JavaScript code
- [[SPARK-46094]](https://issues.apache.org/jira/browse/SPARK-46094) Support Executor JVM Profiling
- [[SPARK-46399]](https://issues.apache.org/jira/browse/SPARK-46399) Add exit status to the Application End event for the use of Spark Listener
- [[SPARK-46886]](https://issues.apache.org/jira/browse/SPARK-46886) Enable `spark.ui.prometheus.enabled` by default
- [[SPARK-46893]](https://issues.apache.org/jira/browse/SPARK-46893) Remove inline scripts from UI descriptions
- [[SPARK-46903]](https://issues.apache.org/jira/browse/SPARK-46903) Support Spark History Server Log UI
- [[SPARK-46922]](https://issues.apache.org/jira/browse/SPARK-46922) Do not wrap runtime user-facing errors
- [[SPARK-46933]](https://issues.apache.org/jira/browse/SPARK-46933) Add query execution time metric to connectors using JDBCRDD
- [[SPARK-47253]](https://issues.apache.org/jira/browse/SPARK-47253) Allow LiveEventBus to stop without draining the event queue
- [[SPARK-47894]](https://issues.apache.org/jira/browse/SPARK-47894) Add Environment page to Master UI
- [[SPARK-48459]](https://issues.apache.org/jira/browse/SPARK-48459) Implement `DataFrameQueryContext` in Spark Connect
- [[SPARK-48597]](https://issues.apache.org/jira/browse/SPARK-48597) Introduce marker for `isStreaming` in text representation of logical plan
- [[SPARK-48628]](https://issues.apache.org/jira/browse/SPARK-48628) Add task peak on/off heap memory metrics
- [[SPARK-48716]](https://issues.apache.org/jira/browse/SPARK-48716) Add `jobGroupId` to `SparkListenerSQLExecutionStart`
- [[SPARK-49128]](https://issues.apache.org/jira/browse/SPARK-49128) Support custom History Server UI title
- [[SPARK-49206]](https://issues.apache.org/jira/browse/SPARK-49206) Add Environment Variables table to Master EnvironmentPage
- [[SPARK-49241]](https://issues.apache.org/jira/browse/SPARK-49241) Add OpenTelemetryPush Sink with opentelemetry profile
- [[SPARK-49445]](https://issues.apache.org/jira/browse/SPARK-49445) Support show tooltip in the progress bar of UI
- [[SPARK-50049]](https://issues.apache.org/jira/browse/SPARK-50049) Support custom driver metrics in writing to v2 table
- [[SPARK-50315]](https://issues.apache.org/jira/browse/SPARK-50315) Support custom metrics for V1Fallback writes
- [[SPARK-50915]](https://issues.apache.org/jira/browse/SPARK-50915) Add `getCondition` and deprecate `getErrorClass` in PySparkException
- [[SPARK-51021]](https://issues.apache.org/jira/browse/SPARK-51021) Add log throttler


### Spark Connect Highlights

- [[SPARK-49248]](https://issues.apache.org/jira/browse/SPARK-49248) Scala Client Parity with existing Dataset/DataFrame API
- [[SPARK-48918]](https://issues.apache.org/jira/browse/SPARK-48918) Create a unified SQL Scala interface shared by regular SQL and Connect
- [[SPARK-50812]](https://issues.apache.org/jira/browse/SPARK-50812) Support pyspark.ml on Connect
- [[SPARK-47908]](https://issues.apache.org/jira/browse/SPARK-47908) Parent classes for Spark Connect and Spark Classic
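
Connecting with the new lightweight client is a one-liner; a sketch assuming a Connect server running on the default port:

```python
# pip install pyspark-client   # the ~1.5 MB Spark Connect-only distribution
from pyspark.sql import SparkSession

spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()
spark.range(3).show()
```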

#### Other Spark Connect changes and improvements

- [[SPARK-41065]](https://issues.apache.org/jira/browse/SPARK-41065) Implement `DataFrame.freqItems` and `DataFrame.stat.freqItems`
- [[SPARK-41066]](https://issues.apache.org/jira/browse/SPARK-41066) Implement `DataFrame.sampleBy` and `DataFrame.stat.sampleBy`
- [[SPARK-41067]](https://issues.apache.org/jira/browse/SPARK-41067) Implement `DataFrame.stat.cov`
- [[SPARK-41068]](https://issues.apache.org/jira/browse/SPARK-41068) Implement `DataFrame.stat.corr`
- [[SPARK-41069]](https://issues.apache.org/jira/browse/SPARK-41069) Implement `DataFrame.approxQuantile` and `DataFrame.stat.approxQuantile`
- [[SPARK-41292]](https://issues.apache.org/jira/browse/SPARK-41292) Implement `Window` functions
- [[SPARK-41333]](https://issues.apache.org/jira/browse/SPARK-41333) Implement `GroupedData.{min, max, avg, sum}`
- [[SPARK-41364]](https://issues.apache.org/jira/browse/SPARK-41364) Implement broadcast function
- [[SPARK-41383]](https://issues.apache.org/jira/browse/SPARK-41383) Implement `rollup`, `cube`, and `pivot`
- [[SPARK-41434]](https://issues.apache.org/jira/browse/SPARK-41434) Initial LambdaFunction implementation
- [[SPARK-41440]](https://issues.apache.org/jira/browse/SPARK-41440) Implement `DataFrame.randomSplit`
- [[SPARK-41464]](https://issues.apache.org/jira/browse/SPARK-41464) Implement `DataFrame.to`
- [[SPARK-41473]](https://issues.apache.org/jira/browse/SPARK-41473) Implement `format_number` function
- [[SPARK-41503]](https://issues.apache.org/jira/browse/SPARK-41503) Implement Partition Transformation Functions
- [[SPARK-41529]](https://issues.apache.org/jira/browse/SPARK-41529) Implement `SparkSession.stop`
- [[SPARK-41629]](https://issues.apache.org/jira/browse/SPARK-41629) Support for Protocol Extensions in Relation and Expression
- [[SPARK-41663]](https://issues.apache.org/jira/browse/SPARK-41663) Implement the rest of Lambda functions
- [[SPARK-41673]](https://issues.apache.org/jira/browse/SPARK-41673) Implement `Column.astype`
- [[SPARK-41707]](https://issues.apache.org/jira/browse/SPARK-41707) Implement Catalog API in Spark Connect
- [[SPARK-41710]](https://issues.apache.org/jira/browse/SPARK-41710) Implement `Column.between`
- [[SPARK-41722]](https://issues.apache.org/jira/browse/SPARK-41722) Implement 3 missing time window functions
- [[SPARK-41723]](https://issues.apache.org/jira/browse/SPARK-41723) Implement sequence function
- [[SPARK-41724]](https://issues.apache.org/jira/browse/SPARK-41724) Implement `call_udf` function
- [[SPARK-41728]](https://issues.apache.org/jira/browse/SPARK-41728) Implement `unwrap_udt` function
- [[SPARK-41731]](https://issues.apache.org/jira/browse/SPARK-41731) Implement the column accessor (`getItem`, `getField`, `getitem`, etc.)
- [[SPARK-41740]](https://issues.apache.org/jira/browse/SPARK-41740) Implement `Column.name`
- [[SPARK-41767]](https://issues.apache.org/jira/browse/SPARK-41767) Implement `Column.{withField, dropFields}`
- [[SPARK-41785]](https://issues.apache.org/jira/browse/SPARK-41785) Implement `GroupedData.mean`
- [[SPARK-41803]](https://issues.apache.org/jira/browse/SPARK-41803) Add missing function `log(arg1, arg2)`
- [[SPARK-41811]](https://issues.apache.org/jira/browse/SPARK-41811) Implement `SQLStringFormatter` with `WithRelations`
- [[SPARK-42664]](https://issues.apache.org/jira/browse/SPARK-42664) Support `bloomFilter` function for `DataFrameStatFunctions`
- [[SPARK-43662]](https://issues.apache.org/jira/browse/SPARK-43662) Support `merge_asof` in Spark Connect
- [[SPARK-43704]](https://issues.apache.org/jira/browse/SPARK-43704) Support `MultiIndex` for `to_series()` in Spark Connect
- [[SPARK-44736]](https://issues.apache.org/jira/browse/SPARK-44736) Add `Dataset.explode` to Spark Connect Scala Client
- [[SPARK-44740]](https://issues.apache.org/jira/browse/SPARK-44740) Support specifying `session_id` in `SPARK_REMOTE` connection string
- [[SPARK-44747]](https://issues.apache.org/jira/browse/SPARK-44747) Add missing `SparkSession.Builder` methods
- [[SPARK-44761]](https://issues.apache.org/jira/browse/SPARK-44761) Support `DataStreamWriter.foreachBatch(VoidFunction2)`
- [[SPARK-44788]](https://issues.apache.org/jira/browse/SPARK-44788) Add `from_xml` and `schema_of_xml` to pyspark, Spark Connect, and SQL functions
- [[SPARK-44807]](https://issues.apache.org/jira/browse/SPARK-44807) Add `Dataset.metadataColumn` to Scala Client
- [[SPARK-44877]](https://issues.apache.org/jira/browse/SPARK-44877) Support Python protobuf functions for Spark Connect
- [[SPARK-45000]](https://issues.apache.org/jira/browse/SPARK-45000) Implement `DataFrame.foreach`
- [[SPARK-45001]](https://issues.apache.org/jira/browse/SPARK-45001) Implement `DataFrame.foreachPartition`
- [[SPARK-45091]](https://issues.apache.org/jira/browse/SPARK-45091) Function `floor`/`round`/`bround` now accept Column type scale
- [[SPARK-45121]](https://issues.apache.org/jira/browse/SPARK-45121) Support `Series.empty` for Spark Connect
- [[SPARK-45137]](https://issues.apache.org/jira/browse/SPARK-45137) Support map/array parameters in parameterized sql()
- [[SPARK-45143]](https://issues.apache.org/jira/browse/SPARK-45143) Make PySpark compatible with PyArrow 13.0.0
- [[SPARK-45190]](https://issues.apache.org/jira/browse/SPARK-45190) Make `from_xml` support `StructType` schema
- [[SPARK-45235]](https://issues.apache.org/jira/browse/SPARK-45235) Support `map` and `array` parameters by `sql()`
- [[SPARK-45485]](https://issues.apache.org/jira/browse/SPARK-45485) User agent improvements: Use `SPARK_CONNECT_USER_AGENT` env variable and include environment specific attributes
- [[SPARK-45506]](https://issues.apache.org/jira/browse/SPARK-45506) Add Ivy URI support to Spark Connect `addArtifact`
- [[SPARK-45619]](https://issues.apache.org/jira/browse/SPARK-45619) Apply the observed metrics to Observation object
- [[SPARK-45733]](https://issues.apache.org/jira/browse/SPARK-45733) Support multiple retry policies
- [[SPARK-45851]](https://issues.apache.org/jira/browse/SPARK-45851) Support multiple policies in Scala client
- [[SPARK-46039]](https://issues.apache.org/jira/browse/SPARK-46039) Upgrade `grpcio*` to 1.59.3 for Python 3.12
- [[SPARK-46048]](https://issues.apache.org/jira/browse/SPARK-46048) Support `DataFrame.groupingSets` in Python Spark Connect
- [[SPARK-46085]](https://issues.apache.org/jira/browse/SPARK-46085) `Dataset.groupingSets` in Scala Spark Connect client
- [[SPARK-46202]](https://issues.apache.org/jira/browse/SPARK-46202) Expose new `ArtifactManager` APIs to support custom target directories
- [[SPARK-46229]](https://issues.apache.org/jira/browse/SPARK-46229) Add `applyInArrow` to `groupBy` and `cogroup` in Spark Connect
- [[SPARK-46255]](https://issues.apache.org/jira/browse/SPARK-46255) Support complex type -> string conversion
- [[SPARK-46620]](https://issues.apache.org/jira/browse/SPARK-46620) Introduce a basic fallback mechanism for frame methods
- [[SPARK-46812]](https://issues.apache.org/jira/browse/SPARK-46812) Make `mapInPandas`/`mapInArrow` support `ResourceProfile`
- [[SPARK-46919]](https://issues.apache.org/jira/browse/SPARK-46919) Upgrade `grpcio*` and `grpc-java` to 1.62.x
- [[SPARK-47014]](https://issues.apache.org/jira/browse/SPARK-47014) Implement methods `dumpPerfProfiles` and `dumpMemoryProfiles` of SparkSession
- [[SPARK-47069]](https://issues.apache.org/jira/browse/SPARK-47069) Introduce `spark.profile.show`/`.dump` for SparkSession-based profiling
- [[SPARK-47081]](https://issues.apache.org/jira/browse/SPARK-47081) Support Query Execution Progress
- [[SPARK-47137]](https://issues.apache.org/jira/browse/SPARK-47137) Add `getAll` to `spark.conf` for feature parity with Scala
- [[SPARK-47233]](https://issues.apache.org/jira/browse/SPARK-47233) Client & Server logic for client-side streaming query listener
- [[SPARK-47276]](https://issues.apache.org/jira/browse/SPARK-47276) Introduce `spark.profile.clear` for SparkSession-based profiling
- [[SPARK-47367]](https://issues.apache.org/jira/browse/SPARK-47367) Support Python data sources with Spark Connect
- [[SPARK-47543]](https://issues.apache.org/jira/browse/SPARK-47543) Infer `dict` as `MapType` from Pandas DataFrame (via new config)
- [[SPARK-47545]](https://issues.apache.org/jira/browse/SPARK-47545) `Dataset.observe` for Scala Connect
- [[SPARK-47694]](https://issues.apache.org/jira/browse/SPARK-47694) Make max message size configurable on the client side
- [[SPARK-47712]](https://issues.apache.org/jira/browse/SPARK-47712) Allow connect plugins to create and process Datasets
- [[SPARK-47812]](https://issues.apache.org/jira/browse/SPARK-47812) Support Serialization of `SparkSession` for `ForEachBatch` worker
- [[SPARK-47818]](https://issues.apache.org/jira/browse/SPARK-47818) Introduce plan cache in SparkConnectPlanner to improve performance of Analyze requests
- [[SPARK-47845]](https://issues.apache.org/jira/browse/SPARK-47845) Support Column type in split function for Scala and Python
- [[SPARK-47909]](https://issues.apache.org/jira/browse/SPARK-47909) Parent DataFrame class for Spark Connect and Spark Classic
- [[SPARK-48008]](https://issues.apache.org/jira/browse/SPARK-48008) Support UDAFs in Spark Connect
- [[SPARK-48048]](https://issues.apache.org/jira/browse/SPARK-48048) Add client-side listener support for Scala
- [[SPARK-48112]](https://issues.apache.org/jira/browse/SPARK-48112) Expose session in `SparkConnectPlanner` to plugins
- [[SPARK-48113]](https://issues.apache.org/jira/browse/SPARK-48113) Allow Plugins to integrate with Spark Connect
- [[SPARK-48258]](https://issues.apache.org/jira/browse/SPARK-48258) `Checkpoint` and `localCheckpoint` in Spark Connect
- [[SPARK-48510]](https://issues.apache.org/jira/browse/SPARK-48510) Support UDAF `toColumn` API in Spark Connect
- [[SPARK-48555]](https://issues.apache.org/jira/browse/SPARK-48555) Support using Columns as parameters for several functions (`array_remove`, `array_position`, etc.)
- [[SPARK-48638]](https://issues.apache.org/jira/browse/SPARK-48638) Add `ExecutionInfo` support for DataFrame
- [[SPARK-48794]](https://issues.apache.org/jira/browse/SPARK-48794) `DataFrame.mergeInto` support for Spark Connect (Scala & Python)
- [[SPARK-48960]](https://issues.apache.org/jira/browse/SPARK-48960) Make `spark-shell` work with Spark Connect (`--remote` support)
- [[SPARK-49027]](https://issues.apache.org/jira/browse/SPARK-49027) Share Column API between Classic and Connect
- [[SPARK-49028]](https://issues.apache.org/jira/browse/SPARK-49028) Create a shared SparkSession
- [[SPARK-49029]](https://issues.apache.org/jira/browse/SPARK-49029) Create shared Dataset interface
- [[SPARK-49185]](https://issues.apache.org/jira/browse/SPARK-49185) Reimplement kde plot with Spark SQL
- [[SPARK-49201]](https://issues.apache.org/jira/browse/SPARK-49201) Reimplement hist plot with Spark SQL
- [[SPARK-49249]](https://issues.apache.org/jira/browse/SPARK-49249) Add `addArtifact` API to the Spark SQL Core
- [[SPARK-49273]](https://issues.apache.org/jira/browse/SPARK-49273) Origin support for Spark Connect Scala client
- [[SPARK-49282]](https://issues.apache.org/jira/browse/SPARK-49282) Create a shared `SparkSessionBuilder` interface
- [[SPARK-49284]](https://issues.apache.org/jira/browse/SPARK-49284) Create a shared Catalog interface
- [[SPARK-49413]](https://issues.apache.org/jira/browse/SPARK-49413) Create a shared `RuntimeConfig` interface
- [[SPARK-49416]](https://issues.apache.org/jira/browse/SPARK-49416) Add shared `DataStreamReader` interface
- [[SPARK-49417]](https://issues.apache.org/jira/browse/SPARK-49417) Add shared `StreamingQueryManager` interface
- [[SPARK-49419]](https://issues.apache.org/jira/browse/SPARK-49419) Create shared DataFrameStatFunctions
- [[SPARK-49429]](https://issues.apache.org/jira/browse/SPARK-49429) Add shared `DataStreamWriter` interface
- [[SPARK-49526]](https://issues.apache.org/jira/browse/SPARK-49526) Support Windows-style paths in ArtifactManager
- [[SPARK-49530]](https://issues.apache.org/jira/browse/SPARK-49530) Support kde/density plots (see the plotting example at the end of this section)
- [[SPARK-49531]](https://issues.apache.org/jira/browse/SPARK-49531) Support line plot with plotly backend
- [[SPARK-49626]](https://issues.apache.org/jira/browse/SPARK-49626) Support horizontal/vertical bar plots
- [[SPARK-49907]](https://issues.apache.org/jira/browse/SPARK-49907) Support spark.ml on Connect
- [[SPARK-49948]](https://issues.apache.org/jira/browse/SPARK-49948) Add "precision" parameter to pandas on Spark box plot
- [[SPARK-50050]](https://issues.apache.org/jira/browse/SPARK-50050) Make lit accept str/bool numpy ndarray
- [[SPARK-50054]](https://issues.apache.org/jira/browse/SPARK-50054) Support histogram plots
+
+
+### Build and Others
+
+- [[SPARK-44442]](https://issues.apache.org/jira/browse/SPARK-44442) Drop Mesos support
+- [[SPARK-43831]](https://issues.apache.org/jira/browse/SPARK-43831) Build and run Spark on Java 21
+- [[SPARK-47993]](https://issues.apache.org/jira/browse/SPARK-47993) Drop Python 3.8 support
+- [[SPARK-49347]](https://issues.apache.org/jira/browse/SPARK-49347) Deprecate SparkR
+- [[SPARK-49624]](https://issues.apache.org/jira/browse/SPARK-49624) Upgrade aircompressor to 2.0.2
+- [[SPARK-50439]](https://issues.apache.org/jira/browse/SPARK-50439) Upgrade Arrow to 18.1.0
+- [[SPARK-49965]](https://issues.apache.org/jira/browse/SPARK-49965) Upgrade ASM to 9.7.1
+- [[SPARK-50859]](https://issues.apache.org/jira/browse/SPARK-50859) Upgrade AWS SDK v2 to 2.25.53
+- [[SPARK-50738]](https://issues.apache.org/jira/browse/SPARK-50738) Upgrade black to 23.12.1
+- [[SPARK-48582]](https://issues.apache.org/jira/browse/SPARK-48582) Upgrade braces in ui-test to 3.0.3
+- [[SPARK-49842]](https://issues.apache.org/jira/browse/SPARK-49842) Add byte-buddy dep for mockito-core with Java 21
+- [[SPARK-50823]](https://issues.apache.org/jira/browse/SPARK-50823) Upgrade cloudpickle from 3.1.0 to 3.1.1
+- [[SPARK-49242]](https://issues.apache.org/jira/browse/SPARK-49242) Upgrade commons-cli to 1.9.0
+- [[SPARK-50754]](https://issues.apache.org/jira/browse/SPARK-50754) Upgrade commons-codec to 1.17.2
+- [[SPARK-49327]](https://issues.apache.org/jira/browse/SPARK-49327) Upgrade commons-compress to 1.27.1
+- [[SPARK-50375]](https://issues.apache.org/jira/browse/SPARK-50375) Upgrade commons-io to 2.18.0
+- [[SPARK-49483]](https://issues.apache.org/jira/browse/SPARK-49483) Upgrade commons-lang3 to 3.17.0
+- [[SPARK-50576]](https://issues.apache.org/jira/browse/SPARK-50576) Upgrade commons-text to 1.13.0
+- [[SPARK-50136]](https://issues.apache.org/jira/browse/SPARK-50136) Upgrade curator to 5.7.1
+- [[SPARK-49936]](https://issues.apache.org/jira/browse/SPARK-49936) Upgrade datasketches-java to 6.1.1
+- [[SPARK-50861]](https://issues.apache.org/jira/browse/SPARK-50861) Upgrade dropwizard metrics to 4.2.30
+- [[SPARK-50452]](https://issues.apache.org/jira/browse/SPARK-50452) Upgrade jackson to 2.18.2
+- [[SPARK-48826]](https://issues.apache.org/jira/browse/SPARK-48826) Upgrade fasterxml.jackson to 2.17.2
+- [[SPARK-51006]](https://issues.apache.org/jira/browse/SPARK-51006) Upgrade gcs-connector to 2.2.26
+- [[SPARK-49120]](https://issues.apache.org/jira/browse/SPARK-49120) Bump Gson to 2.11.0
+- [[SPARK-50972]](https://issues.apache.org/jira/browse/SPARK-50972) Upgrade Guava to 33.4.0
+- [[SPARK-49550]](https://issues.apache.org/jira/browse/SPARK-49550) Upgrade Hadoop to 3.4.1
+- [[SPARK-47715]](https://issues.apache.org/jira/browse/SPARK-47715) Upgrade hive-service-rpc to 4.0.0
+- [[SPARK-50794]](https://issues.apache.org/jira/browse/SPARK-50794) Upgrade Ivy to 2.5.3
+- [[SPARK-50047]](https://issues.apache.org/jira/browse/SPARK-50047) Upgrade jersey to 3.0.16
+- [[SPARK-49682]](https://issues.apache.org/jira/browse/SPARK-49682) Upgrade joda-time to 2.13.0
+- [[SPARK-47706]](https://issues.apache.org/jira/browse/SPARK-47706) Bump json4s to 4.0.7
+- [[SPARK-50677]](https://issues.apache.org/jira/browse/SPARK-50677) Upgrade jupiter-interface to 0.13.3 and JUnit5 to 5.11.4
+- [[SPARK-50345]](https://issues.apache.org/jira/browse/SPARK-50345) Upgrade Kafka to 3.9.0
+- [[SPARK-50493]](https://issues.apache.org/jira/browse/SPARK-50493) Migrate kubernetes-client from 6.x to 7.x
+- [[SPARK-50580]](https://issues.apache.org/jira/browse/SPARK-50580) Upgrade log4j2 to 2.24.3
+- [[SPARK-49335]](https://issues.apache.org/jira/browse/SPARK-49335) Upgrade Maven to 3.9.9
+- [[SPARK-48625]](https://issues.apache.org/jira/browse/SPARK-48625) Upgrade mssql-jdbc to 12.6.2.jre11
+- [[SPARK-47298]](https://issues.apache.org/jira/browse/SPARK-47298) Upgrade mysql-connector-j to 8.3.0 / mariadb-java-client to 2.7.12
+- [[SPARK-51054]](https://issues.apache.org/jira/browse/SPARK-51054) Upgrade Netty to 4.1.117.Final
+- [[SPARK-50278]](https://issues.apache.org/jira/browse/SPARK-50278) Upgrade netty-tcnative to 2.0.69.Final
+- [[SPARK-45590]](https://issues.apache.org/jira/browse/SPARK-45590) Upgrade okio to 1.17.6
+- [[SPARK-50728]](https://issues.apache.org/jira/browse/SPARK-50728) Update ORC to 2.1.0
+- [[SPARK-50425]](https://issues.apache.org/jira/browse/SPARK-50425) Bump Apache Parquet to 1.15.0
+- [[SPARK-48563]](https://issues.apache.org/jira/browse/SPARK-48563) Upgrade pickle to 1.5
+- [[SPARK-50894]](https://issues.apache.org/jira/browse/SPARK-50894) Bump Postgres driver to 42.7.5
+- [[SPARK-50796]](https://issues.apache.org/jira/browse/SPARK-50796) Upgrade protobuf-java to 4.29.3
+- [[SPARK-50821]](https://issues.apache.org/jira/browse/SPARK-50821) Upgrade Py4J from 0.10.9.8 to 0.10.9.9
+- [[SPARK-47737]](https://issues.apache.org/jira/browse/SPARK-47737) Bump PyArrow to 10.0.0
+- [[SPARK-47923]](https://issues.apache.org/jira/browse/SPARK-47923) Upgrade minimum version of arrow R package to 10.0.0
+- [[SPARK-49708]](https://issues.apache.org/jira/browse/SPARK-49708) Upgrade RoaringBitmap to 1.3.0
+- [[SPARK-50862]](https://issues.apache.org/jira/browse/SPARK-50862) Upgrade rocksdbjni to 9.8.4
+- [[SPARK-50871]](https://issues.apache.org/jira/browse/SPARK-50871) Upgrade scala-parallel-collections to 1.2.0
+- [[SPARK-48427]](https://issues.apache.org/jira/browse/SPARK-48427) Upgrade scala-parser-combinators to 2.4
+- [[SPARK-48609]](https://issues.apache.org/jira/browse/SPARK-48609) Upgrade scala-xml to 2.3.0
+- [[SPARK-49187]](https://issues.apache.org/jira/browse/SPARK-49187) Upgrade slf4j to 2.0.16
+- [[SPARK-49170]](https://issues.apache.org/jira/browse/SPARK-49170) Upgrade snappy to 1.1.10.6
+- [[SPARK-50632]](https://issues.apache.org/jira/browse/SPARK-50632) Upgrade tink to 1.16.0
+- [[SPARK-49234]](https://issues.apache.org/jira/browse/SPARK-49234) Upgrade xz to 1.10
+- [[SPARK-50741]](https://issues.apache.org/jira/browse/SPARK-50741) Upgrade zstd-jni to 1.5.6-9
+- [[SPARK-50952]](https://issues.apache.org/jira/browse/SPARK-50952) Include jjwt-related libraries with the jjwt-provided profile
+- [[SPARK-49964]](https://issues.apache.org/jira/browse/SPARK-49964) Remove ws-rs-api package
+- [[SPARK-50383]](https://issues.apache.org/jira/browse/SPARK-50383) Support Virtual Threads in REST Submission API
+- [[SPARK-50811]](https://issues.apache.org/jira/browse/SPARK-50811) Support enabling JVM profiler on driver
+- [[SPARK-41634]](https://issues.apache.org/jira/browse/SPARK-41634) Upgrade minimatch to 3.1.2
+- [[SPARK-41704]](https://issues.apache.org/jira/browse/SPARK-41704) Upgrade sbt-assembly from 2.0.0 to 2.1.0
+- [[SPARK-41714]](https://issues.apache.org/jira/browse/SPARK-41714) Update maven-checkstyle-plugin from 3.1.2 to 3.2.0
+- [[SPARK-41750]](https://issues.apache.org/jira/browse/SPARK-41750) Upgrade dev.ludovic.netlib to 3.0.3
+- [[SPARK-41787]](https://issues.apache.org/jira/browse/SPARK-41787) Upgrade silencer to 1.7.12
+- [[SPARK-41798]](https://issues.apache.org/jira/browse/SPARK-41798) Upgrade hive-storage-api to 2.8.1
+- [[SPARK-41802]](https://issues.apache.org/jira/browse/SPARK-41802) Upgrade Apache httpcore to 4.4.16
+- [[SPARK-45956]](https://issues.apache.org/jira/browse/SPARK-45956) Upgrade Apache ZooKeeper to 3.9.1
+- [[SPARK-46174]](https://issues.apache.org/jira/browse/SPARK-46174) Upgrade gcs-connector to 2.2.18
+- [[SPARK-45850]](https://issues.apache.org/jira/browse/SPARK-45850) Upgrade oracle jdbc driver to 23.3.0.23.09
+- [[SPARK-45540]](https://issues.apache.org/jira/browse/SPARK-45540) Upgrade jetty to 9.4.53.v20231009
+- [[SPARK-45269]](https://issues.apache.org/jira/browse/SPARK-45269) Use Java 21-jre in K8s Dockerfile
+- [[SPARK-45284]](https://issues.apache.org/jira/browse/SPARK-45284) Update SparkR minimum SystemRequirements to Java 17
+- [[SPARK-45325]](https://issues.apache.org/jira/browse/SPARK-45325) Upgrade Avro to 1.11.3
+- [[SPARK-44366]](https://issues.apache.org/jira/browse/SPARK-44366) Upgrade antlr4 to 4.13.1
+- [[SPARK-45247]](https://issues.apache.org/jira/browse/SPARK-45247) Upgrade Pandas to 2.1.1

Review Comment:
   pandas is version 2.2.3, see [SPARK-49801](https://issues.apache.org/jira/browse/SPARK-49801).
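As a footnote to the version discussion, a client-side check of the two Python dependency floors mentioned in this section could look like the sketch below. The pandas 2.2.3 figure comes from SPARK-49801 as noted in the review comment, and the PyArrow 10.0.0 minimum from SPARK-47737; the parsing assumes plain `X.Y.Z` version strings and is not an authoritative compatibility matrix.

```python
# Hedged sanity check of two client-side minimums discussed above.
from importlib.metadata import version

FLOORS = {"pandas": "2.2.3", "pyarrow": "10.0.0"}

def as_tuple(v: str) -> tuple:
    # Naive parse; assumes plain X.Y.Z version strings.
    return tuple(int(p) for p in v.split(".")[:3])

for pkg, floor in FLOORS.items():
    installed = version(pkg)
    status = "OK" if as_tuple(installed) >= as_tuple(floor) else "too old"
    print(f"{pkg}: installed {installed}, need >= {floor}: {status}")
```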