Hi dev,

Please vote on retaining the migration logic for the incorrect
`spark.databricks.*` configuration in Spark 4.0.x.

- DISCUSSION:
https://lists.apache.org/thread/xzk9729lsmo397crdtk14f74g8cyv4sr
([DISCUSS] Handling spark.databricks.* config being exposed in 3.5.4 in
Spark 4.0.0+)

Specifically, please review this post
https://lists.apache.org/thread/xtq1kjhsl4ohfon78z3wld2hmfm78t9k which
explains the pros and cons of the proposal; this vote is about "Option 1".

Simply speaking, this vote is to allow streaming queries that have ever
run on Spark 3.5.4 to be upgraded to Spark 4.0.x without having to be
upgraded to Spark 3.5.5+ first. If the vote passes, we will help users
have a smooth upgrade from Spark 3.5.4 to Spark 4.0.x, which would cover
almost one year.

The (only) con of this option is having to retain the incorrect
configuration name as a "string" in the codebase a bit longer. The code
complexity of the migration logic is arguably trivial. (link
<https://github.com/apache/spark/blob/4231d58245251a34ae80a38ea4bbf7d720caa439/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala#L174-L183>
)
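
For context, the retained logic amounts to a small key remapping applied
when reading offset log metadata written by 3.5.4. The sketch below is only
an illustration of the shape of that logic; the config keys are placeholders,
not the actual names, and the real code is in the linked OffsetSeq.scala:

    // Illustration only; the actual mapping and config keys live in
    // OffsetSeq.scala (linked above).
    object ConfKeyMigration {
      // Placeholder keys: mis-exposed 3.5.4 name -> intended spark.sql.* name
      private val incorrectToCorrect: Map[String, String] = Map(
        "spark.databricks.sql.some.conf" -> "spark.sql.some.conf")

      // Remap a key read from the offset log; correct keys pass through as-is.
      def migrate(key: String): String = incorrectToCorrect.getOrElse(key, key)
    }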

This VOTE is for Spark 4.0.x, but if you support retaining the migration
logic beyond Spark 4.0.x, please cast +1 here and note the last minor
version of Spark in which you would like this migration logic retained.

The vote is open for the next 72 hours and passes if a majority of +1 PMC
votes are cast, with a minimum of 3 +1 votes.

[ ] +1 Retain the migration logic for the incorrect `spark.databricks.*`
configuration in Spark 4.0.x
[ ] -1 Remove the migration logic for the incorrect `spark.databricks.*`
configuration in Spark 4.0.0 because...

Thanks!
Jungtaek Lim (HeartSaVioR)
