Hi all,
During local builds, I've noticed there are >800 warnings about deprecated
classes/methods.
Is there a planned schedule for removing some of them?
Since Java 9, the Java `@Deprecated` annotation has had a
`forRemoval` attribute, which is not used yet - but since Spark 4.x
requires Java 17, maybe it's a good time to start using it.
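For illustration, a minimal sketch of what adopting the attribute could look like (the class and method names below are made up, not actual Spark code):

```java
// Sketch: marking an API "terminally deprecated" (illustrative names only).
public class Example {

    // With forRemoval = true, javac emits the stronger
    // "deprecated for removal" warning at every call site.
    @Deprecated(since = "3.4.0", forRemoval = true)
    public static void oldApi() { }

    public static void main(String[] args) throws Exception {
        // @Deprecated has runtime retention, so tooling can discover
        // via reflection which deprecated members are slated for removal.
        Deprecated ann = Example.class
                .getMethod("oldApi")
                .getAnnotation(Deprecated.class);
        System.out.println("since=" + ann.since()
                + " forRemoval=" + ann.forRemoval());
        // prints: since=3.4.0 forRemoval=true
    }
}
```

Besides the stronger compiler warning, the attributes being reflectively readable means a build step could cross-check deprecations against a removal policy automatically.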
Either way, it would be beneficial to define an explicit removal
schedule/policy, whether time-based or version-based, to encourage users
to move away from deprecated code.
(P.S. the comparable Scala `@deprecated` annotation doesn't have that attribute.)
I've processed the build output (with a bit of manual filling-in) and
created a table of most of the deprecation warnings, sorted by number of
occurrences:
+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+
| Lang. | What                                               | Where                                | since | #   | Source  |
+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+
| Scala | class StreamingContext                             | package streaming                    | 3.4.0 | 132 |         |
| Scala | JavaStreamingContext                               | org.apache.spark.streaming.api.java | 3.4.0 |  58 |         |
| Scala | method Once                                        | class Trigger                        | 3.4.0 |  25 |         |
| Scala | class UserDefinedAggregateFunction                 | package expressions                  | 3.0.0 |  17 |         |
| Scala | object StreamingContext                            | package streaming                    | 3.4.0 |  15 |         |
| Scala | class ChiSqSelector                                | package feature                      | 3.1.1 |  14 |         |
| Scala | class JavaStreamingContext                         | package java                         | 3.4.0 |  14 |         |
| Scala | class SparkListenerExecutorBlacklisted             | package scheduler                    | 3.1.0 |  13 |         |
| Scala | method createExternalTable                         | class SQLContext                     | 2.2.0 |  12 |         |
| Scala | method load                                        | class SQLContext                     | 1.4.0 |  12 |         |
| Scala | class SparkListenerExecutorBlacklistedForStage     | package scheduler                    | 3.1.0 |  11 |         |
| Scala | class SparkListenerExecutorUnblacklisted           | package scheduler                    | 3.1.0 |  11 |         |
| Scala | method schema                                      | trait Table                          | 3.4.0 |  11 |         |
| Scala | class SparkListenerExecutorUnblacklistedForStage   | package scheduler                    | 3.1.0 |  10 |         |
| Scala | class SparkListenerNodeBlacklistedForStage         | package scheduler                    | 3.1.0 |  10 |         |
| Scala | method jsonRDD                                     | class SQLContext                     | 1.4.0 |  10 |         |
| Scala | value blacklistedInStages                          | class ExecutorSummary                | 3.1.0 |  10 |         |
| Scala | class SparkListenerNodeBlacklisted                 | package scheduler                    | 3.1.0 |   9 |         |
| Scala | value isBlacklisted                                | class ExecutorSummary                | 3.1.0 |   9 |         |
| Scala | method applySchema                                 | class SQLContext                     | 1.3.0 |   8 |         |
| Scala | class SparkListenerNodeUnblacklisted               | package scheduler                    | 3.1.0 |   7 |         |
| Scala | method toDegrees                                   | object functions                     | 2.1.0 |   7 |         |
| Scala | method toRadians                                   | object functions                     | 2.1.0 |   7 |         |
| Scala | method jsonFile                                    | class SQLContext                     | 1.4.0 |   6 |         |
| Scala | method jdbc                                        | class SQLContext                     | 1.4.0 |   6 |         |
| Scala | method registerTempTable                           | class Dataset                        | 2.0.0 |   6 |         |
| Scala | method udf                                         | object functions                     | 3.0.0 |   6 |         |
| Scala | method register                                    | class UDFRegistration                | 3.0.0 |   6 |         |
| Scala | value isBlacklistedForStage                        | class ExecutorStageSummary           | 3.1.0 |   5 |         |
| Scala | method approxCountDistinct                         | object functions                     | 2.1.0 |   5 |         |
| Scala | method explode                                     | class Dataset                        | 2.0.0 |   4 |         |
| Scala | method explode                                     | class Dataset                        | 3.5.0 |   4 |         |
| Java  | createTable(Identifier,Column[],Transform[],Map)   | TableCatalog                         | 4.1.0 |   4 |         |
| Java  | createTable(Identifier,StructType,Transform[],Map) | TableCatalog                         | 3.4.0 |   4 |         |
| Java  | SparkListenerExecutorBlacklisted                   | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerExecutorBlacklistedForStage           | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerExecutorUnblacklisted                 | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeBlacklisted                       | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeBlacklistedForStage               | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeUnblacklisted                     | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Scala | method !==                                         | class Column                         | 2.0.0 |   3 |         |
| Scala | method createExternalTable                         | class Catalog                        | 2.2.0 |   3 |         |
| Scala | method newTaskTempFile                             | trait FileCommitProtocol             | 3.3.0 |   2 |         |
| Scala | DEPRECATED_CHILD_CONNECTION_TIMEOUT                | SparkLauncher                        | 3.2.0 |   2 |         |
| Java  | DEPRECATED_CHILD_CONNECTION_TIMEOUT                | SparkLauncher                        | 3.2.0 |   2 |         |
| Scala | method newTaskTempFileAbsPath                      | trait FileCommitProtocol             | 3.3.0 |   1 |         |
| Scala | value holdingLocks                                 | class ThreadStackTrace               | 4.0.0 |   1 |         |
| Java  | AppStatusSource.BLACKLISTED_EXECUTORS              | AppStatusSource                      | 3.1.0 |   1 |         |
| Java  | AppStatusSource.UNBLACKLISTED_EXECUTORS            | AppStatusSource                      | 3.1.0 |   1 |         |
| Scala | method classifyException                           | class JdbcDialect                    | 4.0.0 |   1 |         |
| Scala | string sql                                         | message SqlCommand                   |       |   5 | proto   |
| Scala | trait AggregationBuffer                            | class GenericUDAFEvaluator          |       |  16 | hive    |
| Scala | method initialize                                  | class AbstractSerDe                  |       |   1 | hive    |
| Scala | method poll(long)                                  | trait Consumer                       |       |   5 | kafka   |
| Scala | class DefaultPartitioner                           | package internals                    |       |   3 | kafka   |
| Scala | method getAllStatistics                            | class FileSystem                     |       |   3 | hadoop  |
| Java  | ParquetFileReader(Path,ParquetReadOptions)         | ParquetFileReader                    |       |   3 | parquet |
| Java  | ParquetFileReader(Path,ParquetReadOptions,List)    | ParquetFileReader                    |       |   3 | parquet |
| Java  | ParquetFileReader(Configuration,List,List,boolean) | ParquetFileReader                    |       |   3 | parquet |
| Java  | readFooter(Configuration,Path)                     | ParquetFileReader                    |       |   3 | parquet |
| Java  | BytesInput toByteArray()                           | BytesInput                           |       |   1 | parquet |
| Java  | AvroParquetWriter builder(Path)                    | AvroParquetWriter                    |       |   1 | parquet |
| Scala | method readAllFootersInParallel                    | class ParquetFileReader              |       |   1 | parquet |
| Java  | RandomStringUtils random(int)                      | RandomStringUtils                    |       |   3 | c-lang3 |
| Java  | RandomStringUtils randomAlphabetic(int)            | RandomStringUtils                    |       |   3 | c-lang3 |
| Scala | method randomAlphanumeric                          | class RandomStringUtils              |       |   1 | c-lang3 |
+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+
Most are Spark's own deprecations - based on your input, I can create issues
and PRs to handle specific areas.
Regarding the third-party deprecations (at the end of the table) - most
(maybe all besides Parquet) would be trivial to fix, and I can create
separate tasks for those.