+1 (non-binding)

Sent from my iPhone
Pardon the dumb thumb typos :)
> On Mar 6, 2020, at 7:09 PM, Sean Owen <sro...@gmail.com> wrote:
>
> +1
>
>> On Fri, Mar 6, 2020 at 8:59 PM Michael Armbrust <mich...@databricks.com> wrote:
>>
>> I propose to add the following text to Spark's Semantic Versioning policy
>> and adopt it as the rubric that should be used when deciding to break APIs
>> (even at major versions such as 3.0).
>>
>> I'll leave the vote open until Tuesday, March 10th at 2pm. As this is a
>> procedural vote, the measure will pass if there are more favourable votes
>> than unfavourable ones. PMC votes are binding, but the community is
>> encouraged to add their voice to the discussion.
>>
>> [ ] +1 - Spark should adopt this policy.
>>
>> [ ] -1 - Spark should not adopt this policy.
>>
>> <new policy>
>>
>> Considerations When Breaking APIs
>>
>> The Spark project strives to avoid breaking APIs or silently changing
>> behavior, even at major versions. While this is not always possible, the
>> balance of the following factors should be considered before choosing to
>> break an API.
>>
>> Cost of Breaking an API
>>
>> Breaking an API almost always has a non-trivial cost to the users of Spark.
>> A broken API means that Spark programs need to be rewritten before they can
>> be upgraded. However, there are a few considerations when estimating what
>> that cost will be:
>>
>> Usage - an API that is actively used in many different places is always
>> very costly to break. While it is hard to know usage for sure, there are a
>> number of ways we can estimate it:
>>
>> How long has the API been in Spark?
>>
>> Is the API common even for basic programs?
>>
>> How often do we see recent questions about it in JIRA or on the mailing lists?
>>
>> How often does it appear on StackOverflow or in blogs?
>>
>> Behavior after the break - How will a program that works today behave after
>> the break? The following are listed roughly in order of increasing severity:
>>
>> Will there be a compiler or linker error?
>> Will there be a runtime exception?
>>
>> Will that exception happen only after significant processing has been done?
>>
>> Will we silently return different answers? (very hard to debug; users might
>> not even notice!)
>>
>> Cost of Maintaining an API
>>
>> Of course, the above does not mean that we will never break any APIs. We
>> must also consider the cost, both to the project and to our users, of
>> keeping the API in question.
>>
>> Project Costs - Every API we have needs to be tested and needs to keep
>> working as other parts of the project change. These costs are significantly
>> exacerbated when external dependencies change (the JVM, Scala, etc.). In
>> some cases, while maintaining a particular API is not technically
>> infeasible, its cost can become too high.
>>
>> User Costs - APIs also have a cognitive cost to users learning Spark or
>> trying to understand Spark programs. This cost becomes even higher when the
>> API in question has confusing or undefined semantics.
>>
>> Alternatives to Breaking an API
>>
>> In cases where there is a "bad API" but the cost of removal is also high,
>> there are alternatives that should be considered that do not hurt existing
>> users but do address some of the maintenance costs.
>>
>> Avoid Bad APIs - While this is a bit obvious, it is an important point. Any
>> time we add a new interface to Spark, we should assume that we might be
>> stuck with it forever. Think deeply about how new APIs relate to existing
>> ones, as well as how you expect them to evolve over time.
>>
>> Deprecation Warnings - All deprecation warnings should point to a clear
>> alternative and should never just say that an API is deprecated.
>>
>> Updated Docs - Documentation should point to the "best" recommended way of
>> performing a given task. In cases where we maintain legacy documentation,
>> we should clearly point to newer APIs and suggest to users the "right" way.
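To make the "Deprecation Warnings" guideline above concrete, here is a minimal sketch of a deprecation that names its replacement rather than just flagging the API as deprecated. The class and method names are purely illustrative, not from Spark's actual API:

```java
// Illustrative sketch: a deprecation whose message and Javadoc point the
// caller to a clear alternative, as the proposed policy recommends.
public class DeprecationExample {

    /**
     * @deprecated Use {@link #clampedAdd(int, int)} instead, which
     *             saturates instead of overflowing silently.
     */
    @Deprecated
    static int add(int a, int b) {
        return a + b; // may silently wrap around on overflow
    }

    /** Replacement API: saturates at Integer.MIN_VALUE / Integer.MAX_VALUE. */
    static int clampedAdd(int a, int b) {
        long sum = (long) a + (long) b;
        return (int) Math.max(Integer.MIN_VALUE, Math.min(Integer.MAX_VALUE, sum));
    }

    public static void main(String[] args) {
        System.out.println(clampedAdd(Integer.MAX_VALUE, 1)); // prints 2147483647
    }
}
```

Compiling a caller of `add` then produces a deprecation warning that tells the user exactly what to migrate to, rather than leaving them to search for a replacement.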
>> Community Work - Many people learn Spark by reading blogs and other sites
>> such as StackOverflow. However, many of these resources are out of date.
>> Update them to reduce the cost of eventually removing deprecated APIs.
>>
>> </new policy>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org