Hi all, during local builds I've noticed more than 800 warnings about deprecated classes/methods. Is there a planned schedule for removing some of them?
At least since Java 9, the Java `@Deprecated` annotation has had a `forRemoval` attribute, which is not used yet - but since Spark 4.x requires Java 17, maybe it's a good time to start using it. Either way, it would be beneficial to define some sort of explicit removal schedule/policy, whether time-based or version-based, to encourage users to move away from deprecated code. (P.S. the comparable Scala `@deprecated` annotation doesn't have such an attribute.) I've processed the build output (with a bit of manual filling-in) and created a table with most of the deprecation warnings, sorted by number of occurrences:

+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+
| Lang. | What                                               | Where                                | since |  #  | Source  |
+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+
| Scala | class StreamingContext                             | package streaming                    | 3.4.0 | 132 |         |
| Scala | JavaStreamingContext                               | org.apache.spark.streaming.api.java | 3.4.0 |  58 |         |
| Scala | method Once                                        | class Trigger                        | 3.4.0 |  25 |         |
| Scala | class UserDefinedAggregateFunction                 | package expressions                  | 3.0.0 |  17 |         |
| Scala | object StreamingContext                            | package streaming                    | 3.4.0 |  15 |         |
| Scala | class ChiSqSelector                                | package feature                      | 3.1.1 |  14 |         |
| Scala | class JavaStreamingContext                         | package java                         | 3.4.0 |  14 |         |
| Scala | class SparkListenerExecutorBlacklisted             | package scheduler                    | 3.1.0 |  13 |         |
| Scala | method createExternalTable                         | class SQLContext                     | 2.2.0 |  12 |         |
| Scala | method load                                        | class SQLContext                     | 1.4.0 |  12 |         |
| Scala | class SparkListenerExecutorBlacklistedForStage     | package scheduler                    | 3.1.0 |  11 |         |
| Scala | class SparkListenerExecutorUnblacklisted           | package scheduler                    | 3.1.0 |  11 |         |
| Scala | method schema                                      | trait Table                          | 3.4.0 |  11 |         |
| Scala | class SparkListenerExecutorUnblacklistedForStage   | package scheduler                    | 3.1.0 |  10 |         |
| Scala | class SparkListenerNodeBlacklistedForStage         | package scheduler                    | 3.1.0 |  10 |         |
| Scala | method jsonRDD                                     | class SQLContext                     | 1.4.0 |  10 |         |
| Scala | value blacklistedInStages                          | class ExecutorSummary                | 3.1.0 |  10 |         |
| Scala | class SparkListenerNodeBlacklisted                 | package scheduler                    | 3.1.0 |   9 |         |
| Scala | value isBlacklisted                                | class ExecutorSummary                | 3.1.0 |   9 |         |
| Scala | method applySchema                                 | class SQLContext                     | 1.3.0 |   8 |         |
| Scala | class SparkListenerNodeUnblacklisted               | package scheduler                    | 3.1.0 |   7 |         |
| Scala | method toDegrees                                   | object functions                     | 2.1.0 |   7 |         |
| Scala | method toRadians                                   | object functions                     | 2.1.0 |   7 |         |
| Scala | method jsonFile                                    | class SQLContext                     | 1.4.0 |   6 |         |
| Scala | method jdbc                                        | class SQLContext                     | 1.4.0 |   6 |         |
| Scala | method registerTempTable                           | class Dataset                        | 2.0.0 |   6 |         |
| Scala | method udf                                         | object functions                     | 3.0.0 |   6 |         |
| Scala | method register                                    | class UDFRegistration                | 3.0.0 |   6 |         |
| Scala | value isBlacklistedForStage                        | class ExecutorStageSummary           | 3.1.0 |   5 |         |
| Scala | method approxCountDistinct                         | object functions                     | 2.1.0 |   5 |         |
| Scala | method explode                                     | class Dataset                        | 2.0.0 |   4 |         |
| Scala | method explode                                     | class Dataset                        | 3.5.0 |   4 |         |
| Java  | createTable(Identifier,Column[],Transform[],Map)   | TableCatalog                         | 4.1.0 |   4 |         |
| Java  | createTable(Identifier,StructType,Transform[],Map) | TableCatalog                         | 3.4.0 |   4 |         |
| Java  | SparkListenerExecutorBlacklisted                   | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerExecutorBlacklistedForStage           | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerExecutorUnblacklisted                 | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeBlacklisted                       | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeBlacklistedForStage               | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Java  | SparkListenerNodeUnblacklisted                     | org.apache.spark.scheduler           | 3.1.0 |   3 |         |
| Scala | method !==                                         | class Column                         | 2.0.0 |   3 |         |
| Scala | method createExternalTable                         | class Catalog                        | 2.2.0 |   3 |         |
| Scala | method newTaskTempFile                             | trait FileCommitProtocol             | 3.3.0 |   2 |         |
| Scala | DEPRECATED_CHILD_CONNECTION_TIMEOUT                | SparkLauncher                        | 3.2.0 |   2 |         |
| Java  | DEPRECATED_CHILD_CONNECTION_TIMEOUT                | SparkLauncher                        | 3.2.0 |   2 |         |
| Scala | method newTaskTempFileAbsPath                      | trait FileCommitProtocol             | 3.3.0 |   1 |         |
| Scala | value holdingLocks                                 | class ThreadStackTrace               | 4.0.0 |   1 |         |
| Java  | AppStatusSource.BLACKLISTED_EXECUTORS              | AppStatusSource                      | 3.1.0 |   1 |         |
| Java  | AppStatusSource.UNBLACKLISTED_EXECUTORS            | AppStatusSource                      | 3.1.0 |   1 |         |
| Scala | method classifyException                           | class JdbcDialect                    | 4.0.0 |   1 |         |
| Scala | string sql                                         | message SqlCommand                   |       |   5 | proto   |
| Scala | trait AggregationBuffer                            | class GenericUDAFEvaluator           |       |  16 | hive    |
| Scala | method initialize                                  | class AbstractSerDe                  |       |   1 | hive    |
| Scala | method poll(long)                                  | trait Consumer                       |       |   5 | kafka   |
| Scala | class DefaultPartitioner                           | package internals                    |       |   3 | kafka   |
| Scala | method getAllStatistics                            | class FileSystem                     |       |   3 | hadoop  |
| Java  | ParquetFileReader(Path,ParquetReadOptions)         | ParquetFileReader                    |       |   3 | parquet |
| Java  | ParquetFileReader(Path,ParquetReadOptions,List)    | ParquetFileReader                    |       |   3 | parquet |
| Java  | ParquetFileReader(Configuration,List,List,boolean) | ParquetFileReader                    |       |   3 | parquet |
| Java  | readFooter(Configuration,Path)                     | ParquetFileReader                    |       |   3 | parquet |
| Java  | BytesInput toByteArray()                           | BytesInput                           |       |   1 | parquet |
| Java  | AvroParquetWriter builder(Path)                    | AvroParquetWriter                    |       |   1 | parquet |
| Scala | method readAllFootersInParallel                    | class ParquetFileReader              |       |   1 | parquet |
| Java  | RandomStringUtils random(int)                      | RandomStringUtils                    |       |   3 | c-lang3 |
| Java  | RandomStringUtils randomAlphabetic(int)            | RandomStringUtils                    |       |   3 | c-lang3 |
| Scala | method randomAlphanumeric                          | class RandomStringUtils              |       |   1 | c-lang3 |
+-------+----------------------------------------------------+--------------------------------------+-------+-----+---------+

Most are
Spark's own deprecations; with your input, I can create issues and PRs to handle specific areas. As for the third-party deprecations (at the end of the table), most (maybe all except Parquet) would be trivial to fix, and I can create separate tasks for those.
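For reference, here's a minimal sketch of what using the `forRemoval` attribute would look like. The class and method names below are hypothetical, not actual Spark APIs; the point is that `javac` emits a stronger "deprecated and marked for removal" warning for callers of such members:

```java
public class DeprecationExample {

    /**
     * Old API kept for compatibility. With forRemoval = true, callers
     * get a "deprecated for removal" warning instead of a plain
     * deprecation warning. (Hypothetical example, not a Spark API.)
     */
    @Deprecated(since = "3.4.0", forRemoval = true)
    public static int legacySum(int a, int b) {
        return newSum(a, b);
    }

    /** Replacement API. */
    public static int newSum(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) throws Exception {
        // The annotation is retained at runtime, so tooling (and this
        // check) can read the removal metadata reflectively.
        Deprecated d = DeprecationExample.class
                .getMethod("legacySum", int.class, int.class)
                .getAnnotation(Deprecated.class);
        System.out.println(d.since() + " " + d.forRemoval());  // prints "3.4.0 true"
    }
}
```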