Hi all,
During local builds, I've noticed there are >800 warnings about deprecated
classes/methods.
Is there a planned schedule for removing some of them?
Since Java 9, the Java `@Deprecated` annotation has had a
`forRemoval` attribute, which is not used yet - but since Spark 4.x
requires Java 17, maybe it's a good time to start using it.
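For illustration, a minimal sketch of what adopting the attribute could look like (the class and method names below are made up, not actual Spark code):

```java
// Sketch: marking an API "terminally deprecated" (illustrative names only).
public class Example {

    // With forRemoval = true, javac emits the stronger
    // "deprecated for removal" warning at every call site.
    @Deprecated(since = "3.4.0", forRemoval = true)
    public static void oldApi() { }

    public static void main(String[] args) throws Exception {
        // @Deprecated has runtime retention, so tooling can discover
        // via reflection which deprecated members are slated for removal.
        Deprecated ann = Example.class
                .getMethod("oldApi")
                .getAnnotation(Deprecated.class);
        System.out.println("since=" + ann.since()
                + " forRemoval=" + ann.forRemoval());
        // prints: since=3.4.0 forRemoval=true
    }
}
```

Besides the stronger compiler warning, the attributes being reflectively readable means a build step could cross-check deprecations against a removal policy automatically.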
Either way, it would be beneficial to define an explicit removal
schedule/policy, whether time-based or version-based, to encourage users
to move away from deprecated code.
(P.S. the comparable Scala `@deprecated` annotation doesn't have that attribute.)
I've processed the build output (with a bit of manual filling-in) and
created a table of most of the deprecation warnings, sorted by number of
occurrences:
+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+
| Lang. | What                                               | Where                                | since | #   | Source  |
+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+
| Scala | class StreamingContext                             | package streaming                    | 3.4.0 | 132 |         |
| Scala | JavaStreamingContext                               | org.apache.spark.streaming.api.java | 3.4.0 |  58 |         |
| Scala | method Once                                        | class Trigger                        | 3.4.0 |  25 |         |
| Scala | class UserDefinedAggregateFunction                 | package expressions                  | 3.0.0 |  17 |         |
| Scala | object StreamingContext                            | package streaming                    | 3.4.0 |  15 |         |
| Scala | class ChiSqSelector                                | package feature                      | 3.1.1 |  14 |         |
| Scala | class JavaStreamingContext                         | package java                         | 3.4.0 |  14 |         |
| Scala | class SparkListenerExecutorBlacklisted             | package scheduler                    | 3.1.0 |  13 |         |
| Scala | method createExternalTable                         | class SQLContext                     | 2.2.0 |  12 |         |
| Scala | method load                                        | class SQLContext                     | 1.4.0 |  12 |         |
| Scala | class SparkListenerExecutorBlacklistedForStage     | package scheduler                    | 3.1.0 |  11 |         |
| Scala | class SparkListenerExecutorUnblacklisted           | package scheduler                    | 3.1.0 |  11 |         |
| Scala | method schema                                      | trait Table                          | 3.4.0 |  11 |         |
| Scala | class SparkListenerExecutorUnblacklistedForStage   | package scheduler                    | 3.1.0 |  10 |         |
| Scala | class SparkListenerNodeBlacklistedForStage         | package scheduler                    | 3.1.0 |  10 |         |
| Scala | method jsonRDD                                     | class SQLContext                     | 1.4.0 |  10 |         |
| Scala | value blacklistedInStages                          | class ExecutorSummary                | 3.1.0 |  10 |         |
| Scala | class SparkListenerNodeBlacklisted                 | package scheduler                    | 3.1.0 |   9 |         |
| Scala | value isBlacklisted                                | class ExecutorSummary                | 3.1.0 |   9 |         |
| Scala | method applySchema                                 | class SQLContext                     | 1.3.0 |   8 |         |
| Scala | class SparkListenerNodeUnblacklisted               | package scheduler                    | 3.1.0 |   7 |         |
| Scala | method toDegrees                                   | object functions                     | 2.1.0 |   7 |         |
| Scala | method toRadians                                   | object functions                     | 2.1.0 |   7 |         |
| Scala | method jsonFile                                    | class SQLContext                     | 1.4.0 |   6 |         |
| Scala | method jdbc                                        | class SQLContext                     | 1.4.0 |   6 |         |
| Scala | method registerTempTable                           | class Dataset                        | 2.0.0 |   6 |         |
| Scala | method udf                                         | object functions                     | 3.0.0 |   6 |         |
| Scala | method register                                    | class UDFRegistration                | 3.0.0 |   6 |         |
| Scala | value isBlacklistedForStage                        | class ExecutorStageSummary           | 3.1.0 |   5 |         |
| Scala | method approxCountDistinct                         | object functions                     | 2.1.0 |   5 |         |
| Scala | method explode                                     | class Dataset                        | 2.0.0 |   4 |         |
| Scala | method explode                                     | class Dataset                        | 3.5.0 |   4 |         |
| Java  | createTable(Identifier,Column[],Transform[],Map)   | TableCatalog                         | 4.1.0 |   4 |         |
| Java  | createTable(Identifier,StructType,Transform[],Map) | TableCatalog                         | 3.4.0 |   4 |         |
| Java  | SparkListenerExecutorBlacklisted                   | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerExecutorBlacklistedForStage           | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerExecutorUnblacklisted                 | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeBlacklisted                       | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeBlacklistedForStage               | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeUnblacklisted                     | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Scala | method !==                                         | class Column                         | 2.0.0 |   3 |         |
| Scala | method createExternalTable                         | class Catalog                        | 2.2.0 |   3 |         |
| Scala | method newTaskTempFile                             | trait FileCommitProtocol             | 3.3.0 |   2 |         |
| Scala | DEPRECATED_CHILD_CONNECTION_TIMEOUT                | SparkLauncher                        | 3.2.0 |   2 |         |
| Java  | DEPRECATED_CHILD_CONNECTION_TIMEOUT                | SparkLauncher                        | 3.2.0 |   2 |         |
| Scala | method newTaskTempFileAbsPath                      | trait FileCommitProtocol             | 3.3.0 |   1 |         |
| Scala | value holdingLocks                                 | class ThreadStackTrace               | 4.0.0 |   1 |         |
| Java  | AppStatusSource.BLACKLISTED_EXECUTORS              | AppStatusSource                      | 3.1.0 |   1 |         |
| Java  | AppStatusSource.UNBLACKLISTED_EXECUTORS            | AppStatusSource                      | 3.1.0 |   1 |         |
| Scala | method classifyException                           | class JdbcDialect                    | 4.0.0 |   1 |         |
| Scala | string sql                                         | message SqlCommand                   |       |   5 | proto   |
| Scala | trait AggregationBuffer                            | class GenericUDAFEvaluator          |       |  16 | hive    |
| Scala | method initialize                                  | class AbstractSerDe                  |       |   1 | hive    |
| Scala | method poll(long)                                  | trait Consumer                       |       |   5 | kafka   |
| Scala | class DefaultPartitioner                           | package internals                    |       |   3 | kafka   |
| Scala | method getAllStatistics                            | class FileSystem                     |       |   3 | hadoop  |
| Java  | ParquetFileReader(Path,ParquetReadOptions)         | ParquetFileReader                    |       |   3 | parquet |
| Java  | ParquetFileReader(Path,ParquetReadOptions,List)    | ParquetFileReader                    |       |   3 | parquet |
| Java  | ParquetFileReader(Configuration,List,List,boolean) | ParquetFileReader                    |       |   3 | parquet |
| Java  | readFooter(Configuration,Path)                     | ParquetFileReader                    |       |   3 | parquet |
| Java  | BytesInput toByteArray()                           | BytesInput                           |       |   1 | parquet |
| Java  | AvroParquetWriter builder(Path)                    | AvroParquetWriter                    |       |   1 | parquet |
| Scala | method readAllFootersInParallel                    | class ParquetFileReader              |       |   1 | parquet |
| Java  | RandomStringUtils random(int)                      | RandomStringUtils                    |       |   3 | c-lang3 |
| Java  | RandomStringUtils randomAlphabetic(int)            | RandomStringUtils                    |       |   3 | c-lang3 |
| Scala | method randomAlphanumeric                          | class RandomStringUtils              |       |   1 | c-lang3 |
+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+
Most are Spark's own deprecations - based on your input, I can create issues
and PRs to handle specific areas.
Regarding the third-party deprecations (at the end of the table) - most
(maybe all besides Parquet) would be trivial to fix, and I can create
separate tasks for those.