For EMR, I think they show 3.1.2-amazon in the Spark UI, no?
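
For reference, a minimal spark-shell sketch of checking the version string a
running build reports; the "3.1.2-amazon" value is illustrative here rather
than verified on an actual EMR cluster:

    scala> spark.version
    res0: String = 3.1.2-amazon

    scala> org.apache.spark.SPARK_VERSION
    res1: String = 3.1.2-amazon

The Spark UI header shows the same string, so a suffix set at build time is
visible both in the UI and to user code.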

On Wed, Jun 7, 2023 at 11:30 Grisha Weintraub <grisha.weintr...@gmail.com>
wrote:

> Hi,
>
> I am not taking sides here, but just for fairness, I think it should be
> noted that AWS EMR does exactly the same thing.
> We choose the EMR version (e.g., 6.4.0) and it has an associated Spark
> version (e.g., 3.1.2).
> The Spark version here is not the original Apache release but the AWS Spark
> distribution.
>
> On Wed, Jun 7, 2023 at 8:24 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
>
>> I disagree with you in several ways.
>>
>> The following is not a *minor* change like the given examples
>> (alterations to the start-up and shutdown scripts, configuration files,
>> file layout, etc.).
>>
>> > The change you cite meets the 4th point, minor change, made for
>> integration reasons.
>>
>> The following is also wrong. There was no such state of Apache Spark
>> 3.4.0 after the 3.4.0 tag was created; the Apache Spark community did not
>> accept the Scala-reverting patches in either the `master` branch or
>> `branch-3.4`.
>>
>> > There is no known technical objection; this was after all at one point
>> the state of Apache Spark.
>>
>> Is the following your main point? So, you are selling a box "including
>> Harry Potter by J. K. Rowling, whose main character is Barry instead of
>> Harry", but it's okay because you didn't sell the book itself? And, as a
>> cloud vendor, it's fine because you only lend the box instead of selling
>> it, like a private library?
>>
>> > There is no standalone distribution of Apache Spark anywhere here.
>>
>> We are not asking for a big thing. Why are you so reluctant to make clear
>> that this is not "Apache Spark 3.4.0" by simply calling it "Apache Spark
>> 3.4.0-databricks"? What is the marketing reason here?
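>>
>> For illustration, a minimal sketch of why the suffix matters to users; the
>> "-databricks" string is the naming being requested here, not an existing
>> version:
>>
>>   // User code can tell a vendor build apart from the Apache release
>>   // only if the reported version carries a suffix.
>>   val reported = spark.version             // e.g. "3.4.0-databricks"
>>   val base     = reported.split("-").head  // "3.4.0", the Apache base
>>   val isVendor = reported != base          // true only when suffixed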
>>
>> Dongjoon.
>>
>>
>> On Wed, Jun 7, 2023 at 9:27 AM Sean Owen <sro...@gmail.com> wrote:
>>
>>> Hi Dongjoon, I think this conversation is not advancing anymore. I
>>> personally consider the matter closed unless you can find other support or
>>> respond with more specifics. While this perhaps should be on private@,
>>> I think it's not wrong as an instructive discussion on dev@.
>>>
>>> I don't believe you've made a clear argument about the problem, or how
>>> it relates specifically to policy. Nevertheless I will show you my logic.
>>>
>>> You are asserting that a vendor cannot call a product Apache Spark 3.4.0
>>> if it omits a patch updating a Scala maintenance version. This difference
>>> has no known impact on usage, as far as I can tell.
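>>>
>>> For context, a minimal spark-shell sketch, with illustrative version
>>> values, of how the Scala maintenance version a build was compiled against
>>> can be observed:
>>>
>>>   scala> scala.util.Properties.versionNumberString
>>>   res0: String = 2.12.17
>>>
>>>   scala> spark.version
>>>   res1: String = 3.4.0
>>>
>>> A build omitting the Scala maintenance bump would simply report an older
>>> number here; the Scala API surface is unchanged across such versions.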
>>>
>>> Let's see what policy requires:
>>>
>>> 1/ All source code changes must meet at least one of the acceptable
>>> changes criteria set out below:
>>> - The change has been accepted by the relevant Apache project community
>>> for inclusion in a future release. Note that the process used to accept
>>> changes and how that acceptance is documented varies between projects.
>>> - A change is a fix for an undisclosed security issue; and the fix is
>>> not publicly disclosed as a security fix; and the Apache project has been
>>> notified of both the issue and the proposed fix; and the PMC has rejected
>>> neither the vulnerability report nor the proposed fix.
>>> - A change is a fix for a bug; and the Apache project has been notified
>>> of both the bug and the proposed fix; and the PMC has rejected neither the
>>> bug report nor the proposed fix.
>>> - Minor changes (e.g. alterations to the start-up and shutdown scripts,
>>> configuration files, file layout etc.) to integrate with the target
>>> platform providing the Apache project has not objected to those changes.
>>>
>>> The change you cite meets the 4th point, minor change, made for
>>> integration reasons. There is no known technical objection; this was after
>>> all at one point the state of Apache Spark.
>>>
>>>
>>> 2/ A version number must be used that both clearly differentiates it
>>> from an Apache Software Foundation release and clearly identifies the
>>> Apache Software Foundation version on which the software is based.
>>>
>>> Keep in mind the product here is not "Apache Spark", but the "Databricks
>>> Runtime 13.1 (including Apache Spark 3.4.0)". That is, there is far more
>>> than a version number differentiating this product from Apache Spark. There
>>> is no standalone distribution of Apache Spark anywhere here. I believe that
>>> easily matches the intent.
>>>
>>>
>>> 3/ The documentation must clearly identify the Apache Software
>>> Foundation version on which the software is based.
>>>
>>> Clearly, yes.
>>>
>>>
>>> 4/ The end user expects that the distribution channel will back-port
>>> fixes. It is not necessary to back-port all fixes. Selection of fixes to
>>> back-port must be consistent with the update policy of that distribution
>>> channel.
>>>
>>> I think this is safe to say too. Indeed this explicitly contemplates not
>>> back-porting a change.
>>>
>>>
>>> Backing up, you can see from this document that the spirit of it is:
>>> don't include changes in your own Apache Foo x.y that aren't wanted by the
>>> project, and still call it Apache Foo x.y. I don't believe your case
>>> matches this spirit either.
>>>
>>> I do think it's not crazy to suggest: hey vendor, would you call this
>>> "Apache Spark + patches" or ".vendor123"? But that's at best a suggestion,
>>> and I think it does nothing in particular for users. You've made the
>>> suggestion, and I do not see that some police action from the PMC must
>>> follow.
>>>
>>>
>>> I think you're simply objecting to a vendor choice, but that is not
>>> on-topic here unless you can specifically rebut the reasoning above and
>>> show it's connected.
>>>
>>>
>>> On Wed, Jun 7, 2023 at 11:02 AM Dongjoon Hyun <dongj...@apache.org>
>>> wrote:
>>>
>>>> Sean, it seems that you are confused here. We are not talking about
>>>> your upper system (the notebook environment). We are talking about the
>>>> submodule, "Apache Spark 3.4.0-databricks". Whatever you call it, both of
>>>> us know that "Apache Spark 3.4.0-databricks" is different from "Apache
>>>> Spark 3.4.0". You should not use "3.4.0" in your subsystem.
>>>>
>>>> > This also is aimed at distributions of "Apache Foo", not products that
>>>> > "include Apache Foo", which are clearly not Apache Foo.
>>>>