+1 (non-binding) 

> On Mar 6, 2020, at 7:09 PM, Sean Owen <sro...@gmail.com> wrote:
> 
> +1
> 
>> On Fri, Mar 6, 2020 at 8:59 PM Michael Armbrust <mich...@databricks.com> 
>> wrote:
>> 
>> I propose to add the following text to Spark's Semantic Versioning policy 
>> and adopt it as the rubric that should be used when deciding to break APIs 
>> (even at major versions such as 3.0).
>> 
>> 
>> I'll leave the vote open until Tuesday, March 10th at 2pm. As this is a 
>> procedural vote, the measure will pass if there are more favourable votes 
>> than unfavourable ones. PMC votes are binding, but the community is 
>> encouraged to add their voice to the discussion.
>> 
>> 
>> [ ] +1 - Spark should adopt this policy.
>> 
>> [ ] -1 - Spark should not adopt this policy.
>> 
>> 
>> <new policy>
>> 
>> 
>> Considerations When Breaking APIs
>> 
>> The Spark project strives to avoid breaking APIs or silently changing 
>> behavior, even at major versions. While this is not always possible, the 
>> balance of the following factors should be considered before choosing to 
>> break an API.
>> 
>> 
>> Cost of Breaking an API
>> 
>> Breaking an API almost always has a non-trivial cost to the users of Spark. 
>> A broken API means that Spark programs need to be rewritten before they can 
>> be upgraded. However, there are a few considerations when thinking about 
>> what the cost will be:
>> 
>> Usage - an API that is actively used in many different places is always 
>> very costly to break. While it is hard to know usage for sure, there are 
>> several ways we can estimate it:
>> 
>> How long has the API been in Spark?
>> 
>> Is the API common even for basic programs?
>> 
>> How often do we see recent questions in JIRA or on the mailing lists?
>> 
>> How often does it appear in StackOverflow or blogs?
>> 
>> Behavior after the break - How will a program that works today behave after 
>> the break? The following are listed roughly in order of increasing severity 
>> (a short sketch of the worst case follows the list):
>> 
>> Will there be a compiler or linker error?
>> 
>> Will there be a runtime exception?
>> 
>> Will that exception happen after significant processing has been done?
>> 
>> Will we silently return different answers? (very hard to debug; users might 
>> not even notice!)
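>> 
>> As a concrete sketch of that worst case (hypothetical code, not a real Spark 
>> API): suppose a parsing helper that used to drop malformed input silently 
>> starts coercing it to 0 instead. Nothing fails, but downstream results 
>> quietly change:
>> 
>>   val rows = Seq("1", "two", "3")
>> 
>>   // Old (hypothetical) behavior: malformed input is dropped.
>>   def parseOld(s: String): Option[Int] = scala.util.Try(s.trim.toInt).toOption
>>   rows.flatMap(parseOld).size  // 2 rows survive
>> 
>>   // New (hypothetical) behavior: malformed input becomes 0.
>>   def parseNew(s: String): Int = scala.util.Try(s.trim.toInt).getOrElse(0)
>>   rows.map(parseNew).size      // 3 rows -- no error, counts silently differ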
>> 
>> 
>> Cost of Maintaining an API
>> 
>> Of course, the above does not mean that we will never break any APIs. We 
>> must also consider the cost both to the project and to our users of keeping 
>> the API in question.
>> 
>> Project Costs - Every API we have needs to be tested and needs to keep 
>> working as other parts of the project change. These costs are significantly 
>> exacerbated when external dependencies change (the JVM, Scala, etc.). In 
>> some cases, even when maintaining a particular API remains technically 
>> feasible, the cost of doing so can become too high.
>> 
>> User Costs - APIs also have a cognitive cost to users learning Spark or 
>> trying to understand Spark programs. This cost becomes even higher when the 
>> API in question has confusing or undefined semantics.
>> 
>> 
>> Alternatives to Breaking an API
>> 
>> In cases where there is a "Bad API" but the cost of removing it is also 
>> high, there are alternatives worth considering that do not hurt existing 
>> users but still address some of the maintenance costs.
>> 
>> 
>> Avoid Bad APIs - While this is a bit obvious, it is an important point. 
>> Whenever we add a new interface to Spark, we should assume that we might be 
>> stuck with it forever. Think deeply about how new APIs relate to existing 
>> ones, as well as how you expect them to evolve over time.
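>> 
>> One small illustration of API shapes that age differently (a hypothetical 
>> sketch, not an existing Spark interface): a boolean flag parameter is hard 
>> to evolve, because a third mode forces an overload or a breaking change, 
>> while a sealed set of modes can grow without breaking existing callers:
>> 
>>   // Hard to evolve: a third write mode cannot be expressed.
>>   def save(path: String, overwrite: Boolean): Unit =
>>     println(s"saving to $path, overwrite=$overwrite")
>> 
>>   // Easier to evolve: new modes can be added alongside the old ones.
>>   sealed trait WriteMode
>>   case object Append extends WriteMode
>>   case object Overwrite extends WriteMode
>>   def save(path: String, mode: WriteMode): Unit =
>>     println(s"saving to $path, mode=$mode")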
>> 
>> Deprecation Warnings - All deprecation warnings should point to a clear 
>> alternative and should never just say that an API is deprecated.
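>> 
>> For example, using Scala's standard @deprecated annotation (the method names 
>> here are hypothetical, for illustration only):
>> 
>>   // Good: the warning names a concrete replacement and the release.
>>   @deprecated("Use `rangePartition(n)` instead", "3.0.0")
>>   def legacyRangePartition(n: Int): Seq[Int] = rangePartition(n)
>>   def rangePartition(n: Int): Seq[Int] = 0 until n
>> 
>>   // Bad: warns, but gives users no alternative to migrate to.
>>   @deprecated("This method is deprecated", "3.0.0")
>>   def oldRangePartition(n: Int): Seq[Int] = rangePartition(n)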
>> 
>> Updated Docs - Documentation should point to the "best" recommended way of 
>> performing a given task. Where we maintain legacy documentation, we should 
>> clearly point to newer APIs and steer users toward the "right" way.
>> 
>> Community Work - Many people learn Spark by reading blogs and other sites 
>> such as StackOverflow. However, many of these resources are out of date. 
>> Updating them reduces the cost of eventually removing deprecated APIs.
>> 
>> 
>> </new policy>
> 

