Hi, All.

Unfortunately, the Apache Spark project seems to carry some technical debt
in its source code releases. This has been discussed at least twice, on both
the dev@spark and legal-discuss mailing lists. (Thank you for the heads-up,
Vlad.)

1. https://lists.apache.org/thread/3sxw9gwp51mrkzlo2xchq1g20gbgbnz8
(2018-06-21, dev@spark)
2. https://lists.apache.org/thread/xmbgpgt30n7fdd99pnbg7983qzzrx24k
(2018-06-25, legal-discuss@)
3. https://lists.apache.org/thread/z3oq1db80vc8c7r6892hwjnq4h7hnwmd
(2025-02-25, dev@spark)

In short, according to the conclusion reached in 2018, the Apache Spark
community wanted to adhere to the ASF policy by removing those jar files
from the source code releases (although it was not considered a release
blocker at that time, nor has it been since).

> it's important to be able to recreate these JARs somehow,
> and I don't think we have the source in the repo for all of them
> (at least, the ones that originate from Spark).
> That much seems like a must-do. After that, seems worth figuring out
> just how hard it is to build these artifacts from source.
> If it's easy, great. If not, either the test can be removed or
> we figure out just how hard a requirement this is.

Given that this issue has remained unresolved for seven years, I proposed
SPARK-51318 as a potential way to comply with the ASF policy. After
SPARK-51318, we can restore the test coverage one by one later by addressing
the identified TODO items, without any legal concerns during release votes.

https://issues.apache.org/jira/browse/SPARK-51318
(Remove `jar` files from Apache Spark repository and disable affected tests)
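
As a minimal, purely illustrative sketch (not the actual SPARK-51318 patch;
the suite name and the TODO's JIRA ID below are placeholders), an affected
ScalaTest suite could be disabled by switching `test` to `ignore` and leaving
a TODO that points at the follow-up JIRA, so the coverage can be restored
with a one-word change once the jar can be rebuilt from source:

  import org.scalatest.funsuite.AnyFunSuite

  class JarDependentSuite extends AnyFunSuite {
    // TODO(SPARK-XXXXX): re-enable once the test jar is rebuilt from source.
    ignore("loads classes from the bundled test jar") {
      // Original test body stays here; `ignore` keeps it compiled but skipped.
    }
  }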

WDYT?

BTW, please note that I haven't marked SPARK-51318 as a blocker for any
ongoing releases yet.

Best regards,
Dongjoon.