marmbrus commented on issue #24902: [SPARK-28093][SQL] Fix TRIM/LTRIM/RTRIM function parameter order issue
URL: https://github.com/apache/spark/pull/24902#issuecomment-585979246

@dongjoon-hyun I agree that it would be helpful to come up with a more concrete API evolution rubric, both to help the Spark project continue to grow and to help resolve these debates uniformly across a growing number of reviewers.

That said, I don't think worrying about compatibility / upgradability is a big policy change for the Spark project. As far back as the debate to release Spark 1.0, when we were first debating the use of semantic versioning, Matei said:

> I know that some names are suboptimal, but I absolutely detest breaking APIs, config names, etc. I’ve seen it happen way too often in other projects (even things we depend on that are officially post-1.0, like Akka or Protobuf or Hadoop), and it’s very painful. I think that we as fairly cutting-edge users are okay with libraries occasionally changing, but many others will consider it a show-stopper. Given this, I think that any cosmetic change now, even though it might improve clarity slightly, is not worth the tradeoff in terms of creating an update barrier for existing users.

Given the above, I strongly believe that having empathy for users upgrading Spark has been part of the project's ethos since the beginning.

Nobody is arguing that we *cannot* break APIs at a major version. The argument is that the benefits of changing the API need to outweigh the costs to existing users.

In my opinion, this change is particularly insidious: because both arguments are strings, you won't even get an analysis exception. As @cloud-fan points out, the answers to queries are just going to silently change. This could cause wrong results in production, possibly for a very long time before someone notices.

In contrast, while a different parameter ordering might be more intuitive given experience with other SQL engines, I would guess people writing new Spark jobs will quickly figure out the right way to use this API (by just trying both orders, or by looking up the docs).
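
To make the silent-change concern concrete, here is a minimal sketch in plain Python. It is not Spark code: `trim_v1` and `trim_v2` are hypothetical stand-ins for a two-argument trim function before and after its parameters are swapped, and Python's `str.strip` is assumed as a rough model of "remove these characters from both ends". The point is only that an unchanged call site keeps running and returns a different answer.

```python
def trim_v1(trim_chars: str, source: str) -> str:
    """Hypothetical 'before' signature: characters to strip come first."""
    return source.strip(trim_chars)


def trim_v2(source: str, trim_chars: str) -> str:
    """Hypothetical 'after' signature: source string comes first."""
    return source.strip(trim_chars)


# The same call site, left untouched across the upgrade.
args = ("xyz", "yxTomxx")

before = trim_v1(*args)  # strips x, y, z from the ends of "yxTomxx"
after = trim_v2(*args)   # strips y, x, T, o, m from the ends of "xyz"

# Both calls succeed; nothing signals that the meaning has changed.
print(repr(before), repr(after))
```

Because both parameters have the same type, no type check or analysis step can tell the two orderings apart; only a comparison of results would reveal the difference.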
