xuanyuanking commented on pull request #31638: URL: https://github.com/apache/spark/pull/31638#issuecomment-788974534
> The behavioral change is bound to file data source, right? I prefer adding the source option instead of adding config, because 1) Spark has a bunch of configurations already 2) I'd prefer having smaller range of impact, per source instead of session-wide.

Agree. Especially on 1), I fully agree that we should be careful to add new configs only when they are really needed. It is my fault that I didn't provide more context, as mentioned in https://github.com/apache/spark/pull/31638#discussion_r585636401. The same user code passes in version 2.4 but fails now. If we add a source option, users would need to change their code to control the behavior, which might not be a user-friendly fix IMO. If this were a new behavior, I would definitely follow your suggestion and add an option instead of a config.

> I'd love to see the proper notice on such case as well since we are here.

Yes, thanks for the reminder. I'll add a warning log for the cases you mentioned in the proper places (such as when `hasMetadata` returns false), as well as in the SS user guides (do you think the user guides are also a reasonable place?). Do you prefer to do this in the current PR or in a separate one?
