Vlad, let's open a PR and discuss it there. We have many other committees
to review / help with as well.

On Fri, Mar 28, 2025 at 6:28 AM Rozov, Vlad <vro...@amazon.com.invalid>
wrote:

> Hi Hyukjin,
>
> I open https://issues.apache.org/jira/browse/SPARK-51643 and
> https://issues.apache.org/jira/browse/SPARK-51644. Please add more
> details to the first JIRA. As far as I can see
> https://github.com/vrozov/spark/tree/spark-shell should fix both JIRAs
> and if not I’d like to understand why the first one would not be fixed by
> adding back relocation of guava to the connect client.
>
> Thank you,
>
> Vlad
>
>
>
> On Mar 26, 2025, at 6:15 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:
>
> Nah, I wasn't clear.
>
> Maven and SBT builds are synced for this special code path, e.g.,
> https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0
> .
> If you build Maven and SBT, the results are almost the same.
>
> Now, the fix you landed in Maven (and indeed it was a Maven specific fix).
> However, it changes the syncing part between Maven and SBT, and it results
> in a different output.
> The CI in the PR passed because we run SBT in PR builders.
>
> My suggestion is to fix both at the same time as usual PRs have been so
> also considering that the fix had an issue.
> It would also need some closer looks who are used to SBT as well to make
> sure the same issue does not happen.
>
>
>
> On Thu, 27 Mar 2025 at 10:06, Rozov, Vlad <vro...@amazon.com.invalid>
> wrote:
>
>> Sorry, but I still don’t follow. My PR broke Maven and the fix I provided
>> fixes Maven. SBT was never broken except there is inconsistency between SBT
>> and Maven builds. Can the inconsistency be fixed in a follow up PR?
>>
>> Thank you,
>>
>> Vlad
>>
>> On Mar 26, 2025, at 5:57 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:
>>
>> It is not broken ... because we run SBT in PR builders for ASF resource
>> restrictions and faster build. We use Maven for release so it was found out
>> now.
>>
>> CI did not test your change. The part you are fixing is a special path ..
>>
>> On Thu, Mar 27, 2025 at 9:53 AM Rozov, Vlad <vro...@amazon.com.invalid>
>> wrote:
>>
>>> If it is not broken, can the sync between maven and SBT
>>> dependencies/shadow be done in a follow up PR?
>>>
>>> Thank you,
>>>
>>> Vlad
>>>
>>>
>>> On Mar 26, 2025, at 5:44 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:
>>>
>>> It is not broken. The fix you applied would not be applied in SBT. For
>>> example, the lines you changed (added in
>>> https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0
>>> ):
>>>
>>> diff```
>>> -            <relocation>
>>> -              <pattern>com.google.common</pattern>
>>> -
>>> <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
>>> -              <includes>
>>> -                <include>com.google.common.**</include>
>>> -              </includes>
>>> -            </relocation>
>>> ```
>>>
>>> The companion part of this has to be removed in SBT as well, for example
>>> here
>>> https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0#diff-6f545c33f2fcc975200bf208c900a600a593ce6b170180f81e2f93b3efb6cb3eR674
>>>
>>> Adding simple dependency changes usually are shared between SBT and
>>> Maven so it is not a tricky part. However, some changes like this have to
>>> be made in both places.
>>> Both Maven and SBT builds are usually sync, and changes are mode
>>> together. Let me know if I am underding your change wrongly.
>>>
>>>
>>> On Thu, 27 Mar 2025 at 09:25, Rozov, Vlad <vro...@amazon.com.invalid>
>>> wrote:
>>>
>>>> Hi Hyukjin,
>>>>
>>>> Can you clarify what is broken in SBT? As I already mentioned in my
>>>> previous e-mail, I tested both maven and SBT builds and both works. Is
>>>> there a test that I can run that shows what is broken? Please file JIRA, so
>>>> we can communicate on the JIRA and document what is broken.
>>>>
>>>> Thank you,
>>>>
>>>> Vlad
>>>>
>>>> On Mar 26, 2025, at 3:18 PM, Hyukjin Kwon <gurwls...@apache.org> wrote:
>>>>
>>>> That only fixes Maven. Both SBT build and Maven build should work in
>>>> the same or similar wat. Let's make sure both work.
>>>>
>>>> On Thu, Mar 27, 2025 at 3:18 AM Rozov, Vlad <vro...@amazon.com.invalid>
>>>> wrote:
>>>>
>>>>> Please see https://github.com/vrozov/spark/tree/spark-shell. I tested
>>>>> only spark-shell —remote local after building with maven and sbt. It may
>>>>> not be a complete fix and there is no PR. I’ll look into SBT build issue
>>>>> (assuming that there is still one after the fix) once you file JIRA.
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Vlad
>>>>>
>>>>>
>>>>> On Mar 26, 2025, at 8:27 AM, Hyukjin Kwon <gurwls...@apache.org>
>>>>> wrote:
>>>>>
>>>>> Vlad,
>>>>>
>>>>> - Please show me if there is a simple fix. If that's the case, yes, I
>>>>> will revert this out from the master branch. That works for me.
>>>>> - If not, let's make a new PR.
>>>>> - If you feel this is an issue, let's start a vote. Let me know.
>>>>>
>>>>>
>>>>> On Thu, 27 Mar 2025 at 00:13, Rozov, Vlad <vro...@amazon.com.invalid>
>>>>> wrote:
>>>>>
>>>>>> Every graduated from incubating Apache project has guards against
>>>>>> what you name “chaotic” and what other name breaking best development
>>>>>> practices. Such guards include JIRA, unit tests and PR review. Instead of
>>>>>> reverting commit, I would expect you to open JIRA and outline what is
>>>>>> broken. If you filed JIRA, please provide the link, if not, please open
>>>>>> one.
>>>>>>
>>>>>> It took me 2 hours to fix Spark shells, so should you open JIRA
>>>>>> instead of spending time to identify the commit and reverting it, you 
>>>>>> will
>>>>>> save time as well. I’ll post fix once JIRA is open and I validate that my
>>>>>> understanding of what is broken is the same as yours.
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Vlad
>>>>>>
>>>>>> On Mar 26, 2025, at 12:01 AM, Hyukjin Kwon <gurwls...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>> Rozov, please test the patch, see if there is a relevant test or not,
>>>>>> and add a test if not there. If it is difficult to add a test, describe 
>>>>>> it
>>>>>> in the PR description, and how you manually tested.
>>>>>> This is what I think you need to do instead of reverting the revert.
>>>>>> Imagine that there are many of such PRs living in the master branch.
>>>>>> Thankfully this time you are alone but imagine many people do the same
>>>>>> things - it will be chaotic. To me, it took 4 hours to identify this 
>>>>>> single
>>>>>> commit.
>>>>>>
>>>>>> About the change itself, you need to fix SBT together, and it will
>>>>>> need the changes such as
>>>>>> https://github.com/apache/spark/commit/e927a7edad47f449aeb0d5014b6185ac36b344d0
>>>>>> .
>>>>>> Should also test Spark shells, and describe how you tested it as well.
>>>>>>
>>>>>> This is what I expect:
>>>>>> - Please show me if there is a simple fix. If that's the case, yes, I
>>>>>> will revert this out from the master branch. That works for me.
>>>>>> - If not, let's make a new PR.
>>>>>> - If you feel this is an issue, let's start a vote. Let me know.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, 26 Mar 2025 at 15:07, Reynold Xin <r...@databricks.com.invalid>
>>>>>> wrote:
>>>>>>
>>>>>>> Sorry Vlad - I disagree. Where is the simple fix? As a new
>>>>>>> contributor, you should not be coming in guns blazing blaming committers
>>>>>>> who are trying to keep the master branch sane and clean.
>>>>>>>
>>>>>>> On Tue, Mar 25, 2025 at 10:53 PM Rozov, Vlad
>>>>>>> <vro...@amazon.com.invalid> wrote:
>>>>>>>
>>>>>>>> There is a simple fix. This is exactly what I outlined in the
>>>>>>>> e-mail. Prior to reverting commit (on master) it was necessary to see 
>>>>>>>> if an
>>>>>>>> easy fix exists. The PR that introduced the error was merged into
>>>>>>>> master 3 weeks ago, so I still don’t get why it was reverted 
>>>>>>>> overnight. It
>>>>>>>> was also necessary to open JIRA and provide reproducible steps. I 
>>>>>>>> assume
>>>>>>>> that steps on the PR is what needs to be fixed, but it will be better 
>>>>>>>> to
>>>>>>>> avoid guessing.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>>
>>>>>>>> Vlad
>>>>>>>>
>>>>>>>> On Mar 25, 2025, at 10:05 PM, Reynold Xin
>>>>>>>> <r...@databricks.com.INVALID> wrote:
>>>>>>>>
>>>>>>>> Is there a fix already available or a very simple fix a
>>>>>>>> committer can create quickly? If yes, we can merge the fix. If there 
>>>>>>>> isn't,
>>>>>>>> for major functionality breaking change, we should just revert. That's
>>>>>>>> fairly basic software engineering practices.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Mar 25, 2025 at 9:53 PM Hyukjin Kwon <gurwls...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> With the change, the main entry points, Spark shalls, don't work
>>>>>>>>> and developers cannot debug and test. The snapshots become uesless.
>>>>>>>>>
>>>>>>>>> The tests passed because you did not fix SBT. It needs a larger
>>>>>>>>> change.
>>>>>>>>>
>>>>>>>>> Such change cannot be in the source. I can start a vote if you
>>>>>>>>> think this is an issue.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 26, 2025 at 12:32 PM Rozov, Vlad
>>>>>>>>> <vro...@amazon.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> This does not make any sense.
>>>>>>>>>>
>>>>>>>>>> 1. There are no broken tests introduced by
>>>>>>>>>> https://github.com/apache/spark/pull/49971
>>>>>>>>>> 2. There are no JIRA filed for “the main entry point”
>>>>>>>>>> 3. “The main entry point” that does not have any unit test
>>>>>>>>>> suggests that it is not the main entry point.
>>>>>>>>>> 4. It is not practical to revert every commit that introduced
>>>>>>>>>> some functional issue or regression. If regression can’t be fixed, 
>>>>>>>>>> it is
>>>>>>>>>> reverted, if it can be reasonably fixed, the fix is applied to the
>>>>>>>>>> regression. Just as an example,
>>>>>>>>>> https://github.com/apache/spark/pull/42823 introduced a
>>>>>>>>>> regression. It was fixed by
>>>>>>>>>> https://github.com/apache/spark/pull/50348. Nobody asked for the
>>>>>>>>>> revert.
>>>>>>>>>> 5. The way how revert was handled is not "Apache way”.
>>>>>>>>>>
>>>>>>>>>> It may be easier for you to handle it in a single PR, but it is
>>>>>>>>>> easier for me to provide PR on top of existing changes and it will 
>>>>>>>>>> also
>>>>>>>>>> provide JIRA and a commit that documents why there is an issue. 
>>>>>>>>>> Otherwise
>>>>>>>>>> it will be easier to introduce the error back again.
>>>>>>>>>>
>>>>>>>>>> Thank you,
>>>>>>>>>>
>>>>>>>>>> Vlad
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mar 25, 2025, at 4:58 PM, Hyukjin Kwon <gurwls...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Rozov, this broke the main entry points of release, Spark shells.
>>>>>>>>>> Even in the mast branch, you build a Spark, and cannot use Spark 
>>>>>>>>>> shells.
>>>>>>>>>> Why don't you submit a PR that contains the proper fix? It is
>>>>>>>>>> easier to have one PR that has no issue, e.g., reverting backporting 
>>>>>>>>>> etc.
>>>>>>>>>>
>>>>>>>>>> On Wed, 26 Mar 2025 at 00:17, Rozov, Vlad
>>>>>>>>>> <vro...@amazon.com.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I kind of understand why
>>>>>>>>>>> https://github.com/apache/spark/pull/49971 was reverted on the
>>>>>>>>>>> branch-4.0 to allow testing of 4.0 release. Why was it also 
>>>>>>>>>>> reverted on the
>>>>>>>>>>> master branch? I don’t see any JIRA being open for the failure. 
>>>>>>>>>>> AFAIK, the
>>>>>>>>>>> proper way to handle the issue in Apache project:
>>>>>>>>>>>
>>>>>>>>>>> - revert  https://github.com/apache/spark/pull/49971 on the
>>>>>>>>>>> branch-4.0 (note that PR targeted master branch and SPARK-51229 
>>>>>>>>>>> targets 4.1
>>>>>>>>>>> as affected version)
>>>>>>>>>>> - open JIRA and provide details of how to reproduce the error on
>>>>>>>>>>> the master branch
>>>>>>>>>>> - wait for the author of
>>>>>>>>>>> https://github.com/apache/spark/pull/49971 (in this case it was
>>>>>>>>>>> me) to reply in a timely manner and only then revert the commit
>>>>>>>>>>>
>>>>>>>>>>> If you agree with the approach, please undo the revert and file
>>>>>>>>>>> JIRA. There is an easy fix to the issue.
>>>>>>>>>>>
>>>>>>>>>>> Thank you,
>>>>>>>>>>>
>>>>>>>>>>> Vlad
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to