The two areas most at risk are artifacts from the npm package manager and PyPI.
npm is a mess for multiple reasons: not only is its security mediocre, it is now being targeted by North Korea for infiltration of enemy systems and exfiltration of cloud and bitcoin wallet credentials:

- https://snyk.io/blog/axios-npm-package-compromised-supply-chain-attack-delivers-cross-platform/
- https://gist.github.com/joe-desimone/36061dabd2bc2513705e0d083a9673e7

The malicious axios package was up for three hours. There's a good writeup of the attack somewhere. It included a whole fake company image set up on LinkedIn to lure a committer into a conference call and then, under the guise of some podcast or audio issues, get them to run a binary.

Here, I'd advocate: configure npm and dependabot to wait 7 days, and use a lockfile to lock down dependencies.

Maven currently has 7 documented malware packages, including a typosquatted jackson:
https://opensourcemalware.com/?type=package&ecosystem=maven

PyPI is a lot more exposed:
https://opensourcemalware.com/?type=package&ecosystem=pypi

Here: just be paranoid.

1. If you can build through an employer's proxy with security checks, do so. Whoever cuts a release really ought to do this.
2. If you need to build anything other than Spark, try to build in a container: WSL, Docker, Apple containers (those are new and don't need the Docker runtime).

To conclude: your main risk in Spark is probably PyPI. For ASF projects your concerns should be "what is their dependency update mechanism?" and "does their release manager do releases safely?"

Oh, then there's GitHub Actions. Separate issue, and one I'm still trying to understand.

On Thu, 23 Apr 2026 at 14:18, Wenchen Fan <[email protected]> wrote:

> > we may want to extend our exceptions to include all ASF project releases
>
> I think it's a good idea to avoid cascading the delay, but for now we are still discussing this delay for Spark, so there is no real cascading today. We can revisit this if many Apache projects start to delay dependency upgrades as well.
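
On the 7-day wait suggested above: here is a minimal sketch of the rule itself, in case it helps when writing the policy down. It is not how npm or dependabot implement anything. The Candidate record, the may_upgrade function and the library names are all hypothetical, and the release dates are made up. The only things taken from this thread are the 7-day default and the exemption for security fixes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Candidate:
    """A proposed dependency upgrade (hypothetical record, not a real tool's format)."""
    name: str
    version: str
    released_at: datetime        # when this version was published (timezone-aware)
    security_fix: bool = False   # True if it fixes a known vulnerability


def may_upgrade(c: Candidate, now: datetime, cooldown_days: int = 7) -> bool:
    """Allow the upgrade once the release is at least cooldown_days old.

    Security fixes skip the wait, as proposed in the thread.
    """
    if c.security_fix:
        return True
    return now - c.released_at >= timedelta(days=cooldown_days)


if __name__ == "__main__":
    # Illustrative data only: names and dates are invented for the example.
    now = datetime(2026, 4, 23, tzinfo=timezone.utc)
    candidates = [
        Candidate("libA", "2.0.0", datetime(2026, 4, 21, tzinfo=timezone.utc)),
        Candidate("libB", "1.4.2", datetime(2026, 4, 10, tzinfo=timezone.utc)),
        Candidate("libC", "3.1.1", datetime(2026, 4, 22, tzinfo=timezone.utc),
                  security_fix=True),
    ]
    for c in candidates:
        verdict = "ok" if may_upgrade(c, now) else "wait"
        print(f"{c.name} {c.version}: {verdict}")
```

A check like this only helps if the release timestamp comes from the registry rather than from the package itself, and it complements a lockfile rather than replacing one.
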
>
> On Thu, Apr 23, 2026 at 9:13 PM Steve Loughran <[email protected]> wrote:
>
>> The "malicious release manager" is an interesting attack, one that the ASF "we trust the community" doesn't defend against. The risk here is that someone generates a set of malicious artifacts (maybe just publishes them to maven), while the source code is safe.
>>
>> To help defend against this, here's some code which will do a bytecode-level diff between JARs, ignoring debugging stuff, generated metadata etc. Enjoy
>>
>> https://github.com/steveloughran/auditor
>>
>> This wouldn't defend against someone adding a malicious dependency to the artifacts they publish on maven, so really the tool should audit that too. But you can at least check out a spark branch, build the binaries and then audit the RC's artifacts against them to look for tangible variations.
>>
>> On Wed, 22 Apr 2026 at 23:53, Tian Gao via dev <[email protected]> wrote:
>>
>>> So are you suggesting that we don't enforce this 1-week buffer for all Apache projects? I agree that a legitimate Apache project release is well-vetted and generally safe, but there could be situations where a release is maliciously executed by stealing identities of people who have access to make releases - that's where many supply chain attacks occur. Moreover, it would be more difficult to enforce this (whether for LLM or for human) to treat Apache projects differently. Also I think a 7-day delay to accept an Apache project release is not a big deal for us.
>>>
>>> Regarding the Spark-related projects, we don't need to enforce the policy for them.
>>>
>>> I think for supply chain attacks, we are defending ourselves not only against package developers, but more importantly, we are defending ourselves against potential loopholes in the release process. We must assume that there could be something wrong during the release process of any project.
>>>
>>> Tian
>>>
>>> On Wed, Apr 22, 2026 at 3:32 PM Dongjoon Hyun <[email protected]> wrote:
>>>
>>>> To be clear, this discussion should be applied to Apache Spark main repository only.
>>>>
>>>> https://github.com/apache/spark
>>>>
>>>> It's because subprojects need to consume Apache Spark releases ASAP. For example, Apache Spark K8s Operator will upgrade its dependency on the same day of Apache Spark release because we trust our release process (including vote).
>>>>
>>>> In addition, probably, we may want to extend our exceptions to include all ASF project releases (Apache Hadoop, Avro, Parquet, ORC, Kafka, ...) which have established community vote process.
>>>>
>>>> Dongjoon.
>>>>
>>>> On 2026/04/22 22:21:41 Dongjoon Hyun wrote:
>>>> > Thank you for the suggestion.
>>>> >
>>>> > +1 for the general predefined (1-week) grace-period policy sounds good to me.
>>>> >
>>>> > For the exception cases, I believe we can let the PMC members make the final decision on merge timing like the PMC members decides the `Blocker` level priority of JIRA issues already.
>>>> >
>>>> > If we have a voted policy, it would be great if we can add the policy to AGENTS.md explicitly to apply the policy from the PR steps.
>>>> >
>>>> > Best,
>>>> > Dongjoon.
>>>> >
>>>> > On 2026/04/22 20:47:24 Steve Loughran wrote:
>>>> > > 7 days is long enough to catch most (all?) malicious attacks.
>>>> > >
>>>> > > Regarding developers, there's a strong case to be made for only doing builds and especially tests in isolated containers, even though artifacts will leak across shared containers through a shared maven repo. It still limits the damage malicious binaries can do.
>>>> > >
>>>> > > On Tue, 21 Apr 2026 at 23:58, Jungtaek Lim <[email protected]> wrote:
>>>> > >
>>>> > > > +1
>>>> > > >
>>>> > > > We tend to consider that merging to master branch gives some time to bake before releasing. But we (Spark devs) are people who build Spark and run some tests against the master branch almost day to day. For us, there is literally no time for these library upgrades to be baked - we are exposed to any kind of potential CVE from these library upgrades.
>>>> > > >
>>>> > > > It's arguable whether we should stay up to date with the recent release version for dependencies, but that'd probably be uneasy to make consensus; there is a clear trade-off. The current proposal sounds to me as a good compromise - IMHO delaying by 2 weeks (14 days) seems reasonable, but strict 1 week (7 days) is better than nothing if anyone is concerned 2 weeks is too long.
>>>> > > >
>>>> > > > On Tue, Apr 21, 2026 at 9:45 PM Szehon Ho <[email protected]> wrote:
>>>> > > >
>>>> > > >> +1 make sense to me as well. We should of course be fast for security upgrades, but make sense to avoid such eager upgrades for the rest of the hundreds of Spark dependencies, due to the increased supply chain attack risks in the ecosystem.
>>>> > > >>
>>>> > > >> Thanks
>>>> > > >> Szehon
>>>> > > >>
>>>> > > >> On Tue, Apr 21, 2026 at 3:32 AM Wenchen Fan <[email protected]> wrote:
>>>> > > >>
>>>> > > >>> Thanks for starting this discussion! I did a data analysis a while ago but didn't have time to act on it. The analysis shows:
>>>> > > >>>
>>>> > > >>> *58* maven dep upgrades in the last 3 months.
>>>> > > >>> *46%* (27/58) within 7 days of release
>>>> > > >>> ≤7d : 27 / 58 (47%)
>>>> > > >>> 8d–30d : 12 / 58 (21%)
>>>> > > >>> >30d : 19 / 58 (32%)
>>>> > > >>>
>>>> > > >>> You can find the raw data in the attached file. This does look a bit aggressive. I build Spark locally everyday, and I believe I'm not the only one. Having a couple of weeks as the buffer time is a good idea to protect developers like me from potential supply chain attacks.
>>>> > > >>>
>>>> > > >>> On Tue, Apr 21, 2026 at 6:24 AM Hyukjin Kwon <[email protected]> wrote:
>>>> > > >>>
>>>> > > >>>> SGTM I think it's good practice to give a couple of weeks before the upgrade
>>>> > > >>>>
>>>> > > >>>> On Tue, 21 Apr 2026 at 07:13, Tian Gao via dev <[email protected]> wrote:
>>>> > > >>>>
>>>> > > >>>>> Hi, I want to start a discussion about our dependency upgrade policy for active development.
>>>> > > >>>>>
>>>> > > >>>>> Our current dependency upgrade (mostly for Java, but Python should be included too) is a bit spontaneous. People find that a dependency has a new version available and we just do the upgrade.
>>>> > > >>>>>
>>>> > > >>>>> This raises concerns about potential supply chain attacks. We already established a few sets of rules (including pinning the github action versions) to avoid the supply chain attack, but manually upgrading the dependency version too eagerly could also be risky.
>>>> > > >>>>>
>>>> > > >>>>> It normally takes time for a bad release to be recognized, so I think we should set a buffer time before upgrading to the latest version. For example, we can wait a week or two after the latest release before we set our development dependency to it. This could reduce the possibility of being impacted by malicious releases, or just give them enough time to fix their own severe bugs.
>>>> > > >>>>>
>>>> > > >>>>> The cost for this policy is very low - it barely impacts us if we can’t use the “latest” version of dependencies.
>>>> > > >>>>>
>>>> > > >>>>> Of course, there should be exceptions when dependency upgrades include security fixes for known vulnerabilities; we should upgrade as fast as possible.
>>>> > > >>>>>
>>>> > > >>>>> Tian
>>>> > > >>>>>
>>>> > > >>> ---------------------------------------------------------------------
>>>> > > >>> To unsubscribe e-mail: [email protected]
