Apologies for chiming in so late on this. The Java agent route seems like a huge improvement over our existing approach. I love that it's a global protection - it doesn't rely on devs remembering to call the required "check the path" method in every single place we might read a file in our code.
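
For anyone skimming the thread, here is roughly the kind of per-call-site "check the path" guard I mean. This is only a minimal sketch under my own assumptions, not the actual CoreContainer#assertPathAllowed code: the PathGuard class and its method names are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; NOT Solr's actual
// CoreContainer#assertPathAllowed implementation.
final class PathGuard {
    private final List<Path> allowedRoots = new ArrayList<>();

    PathGuard(List<Path> roots) {
        // Normalize each root once so prefix checks compare like with like.
        for (Path root : roots) {
            allowedRoots.add(root.toAbsolutePath().normalize());
        }
    }

    // Returns true only if the path lands under one of the allowed roots.
    boolean isAllowed(String userSuppliedPath) {
        // Reject Windows UNC paths (\\host\share) outright.
        if (userSuppliedPath.startsWith("\\\\")) {
            return false;
        }
        // normalize() collapses "." and ".." segments (it does not resolve symlinks).
        Path candidate = Paths.get(userSuppliedPath).toAbsolutePath().normalize();
        for (Path root : allowedRoots) {
            // Path.startsWith compares whole name elements, not raw string prefixes.
            if (candidate.startsWith(root)) {
                return true;
            }
        }
        return false;
    }

    // Every call site that touches a user-influenced path must remember this call.
    void assertAllowed(String userSuppliedPath) {
        if (!isAllowed(userSuppliedPath)) {
            throw new SecurityException("Path not allowed: " + userSuppliedPath);
        }
    }
}
```

The weak spot is the last method: the guard only helps where someone remembered to call it, which is exactly what a global mechanism (agent or OS-level) avoids.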

But ultimately I still think systemd/seccomp is a much more sustainable
route for us than the Java-agent approach. We're a Search community that's
here to solve people's Search problems. I won't minimize for a second all
the effort we've put into security over the years. But there are entire
projects that exist to solve these sorts of problems. IMO steering users
towards something like systemd, which has a whole community of security
folks behind it, is going to leave them more secure and be more sustainable
for us. (To be clear, I'm not advocating for systemd specifically, but for
the general idea of relying on well-known, OS-level protections that exist
outside of the JVM and Solr.)

I've heard two objections to the "systemd" approach so far: that systemd
isn't cross-platform, and that some folks won't bother to enable it.

To the first objection, I'd say that our responsibility has never been to
find one security solution that works for everyone. We never mandated that
folks use a particular firewall: we let users choose how they wanted to
provide that network isolation. IMO this could be handled the same way:
point folks to a few different tools, optionally provide a few example
configs, and let them pick based on their OS/platform.

To the second objection: this is the whole point of having a security
model. It's expected and normal for projects to describe how to make a
deployment of their software "secure". It's neither fair nor possible to
secure folks who won't read our "Going to Production" docs before
deploying.

On Thu, May 28, 2026 at 8:06 AM Jan Høydahl <[email protected]> wrote:
>
> So the implementation of SIP-24 is now ready for review at
>
> https://github.com/apache/solr/pull/4471
>
> The PR description gives a thorough description of the change and what
> to look for.
> I have run several rounds of Copilot reviews and Claude reviews.
> We have unit tests and BATS tests, so coverage should be fairly good.
>
> The PR is fairly large, +5454 lines across 62 files and 54 commits, but
> mostly contained in the new gradle module, plus one hook in CoreContainer.
>
> Take it for a spin and let me know how it works.
>
> Jan
>
> > On 11 May 2026 at 00:04, Jan Høydahl <[email protected]> wrote:
> >
> > Hi,
> >
> > Welcoming more feedback on this approach.
> >
> > I'm planning to move to the implementation phase on Wednesday, to get a
> > first version of the agent ready for testing. But it would be helpful to
> > get as much feedback as possible on the SIP's high-level design before I
> > start implementing. Thanks David for your initial feedback.
> >
> > Jan
> >
> >> On 1 May 2026 at 14:30, Gus Heck <[email protected]> wrote:
> >>
> >> I know it's probably unrealistic because corporate environments into
> >> which Solr is deployed likely would have difficulty with it, but there
> >> is a JDK fork that keeps and improves the Security Manager... It's
> >> supported by the descendants of the Apache River project.
> >> https://github.com/pfirmstone/DirtyChai
> >>
> >> Probably not useful, but perhaps interesting. Certainly we are not the
> >> only ones irritated by the loss of the security manager.
> >>
> >> On Thu, Apr 30, 2026 at 4:08 AM Jan Høydahl <[email protected]> wrote:
> >>
> >>>> I should clarify something. My objections mostly relate to the
> >>>> insistent language in the SIP requiring Solr to have a substitute for
> >>>> the JSM. I'm not quite against doing something but I might vote -0 on
> >>>> something that seems to have a poor security payoff relative to the
> >>>> maintenance burden. The more off-the-shelf, the better IMO.
> >>>> Thank you for investing your time researching some options.
> >>>
> >>> The security landscape has dramatically changed during the years that
> >>> we have enjoyed JSM protection for Solr.
> >>> It's a crazy world out there and every single code flaw, existing and
> >>> future, will be found and exploited or published by so-called security
> >>> researchers. That's why I believe we need a centralized solution.
> >>>
> >>>>> Rejected Alternatives
> >>>>>
> >>>>> - *Staying on Java < 24*: not viable long-term; Solr must support
> >>>>> current Java LTS releases.
> >>>>> - *Removing JSM protections without any replacement*: unacceptable
> >>>>> security regression.
> >>>>>
> >>>> Both Eric Pugh and I have challenged this.
> >>>>>
> >>>>> - *OS-level hardening only (systemd, seccomp)*: not cross-platform;
> >>>>> does not cover Windows or macOS.
> >>>>>
> >>>> I challenge this. Why should the Solr project burden itself with
> >>>> building & maintaining security mechanisms already provided by
> >>>> off-the-shelf tools? If a user/operator wishes to run on Windows/macOS
> >>>> that may not have this protection mechanism, it is a risk for that
> >>>> user to consider, but isn't a deal-breaker. The JSM wasn't/isn't a
> >>>> front-line defense; it's a defense-in-depth strategy. Put differently,
> >>>> the protections here are "best effort" but not worthy of a CVE if they
> >>>> were to falter. I want to get Arnout's opinion on this supposition.
> >>>
> >>> Java Security Manager is a beast, but its file system read/write
> >>> controls have saved us many a CVE in 9.x, where JSM has been on by
> >>> default.
> >>> The SIP primarily focuses on sandboxing Solr wrt file and network
> >>> access.
> >>> The security landscape has changed dramatically in the last few years,
> >>> and solving file and network restrictions in a central place through
> >>> interception rather than at each of the 100+ call sites is a more
> >>> sustainable way.
> >>>
> >>> Users are free to add layers outside the JVM as well, such as systemd,
> >>> container, SELinux etc.
> >>> But be honest - how many small/medium organizations really do this?
> >>>
> >>>>> - *Dynamic ZK-watcher-based network policy*: correct but
> >>>>> significantly more complex; adds ZK client dependency to the agent
> >>>>> JAR. Superseded by port-based wildcards for intra-cluster traffic.
> >>>>> - *Building a Java agent from scratch*: higher effort with no
> >>>>> functional advantage over adapting the Apache 2.0-licensed OpenSearch
> >>>>> implementation.
> >>>>>
> >>>> I agree with not burdening this project with building & maintaining
> >>>> such a mechanism.
> >>>>
> >>>> I'm not sure to what degree we can leverage that existing agent you
> >>>> speak of without further burdening us. It's a burden/reward trade-off.
> >>>
> >>> If the agent becomes a true maintenance burden (i.e. a larger burden
> >>> than handling the CVEs it would prevent), then a valid action is to
> >>> remove the agent again, like any other feature we develop and maintain.
> >>> The good thing is that this is pure Java, and production-ready stable
> >>> code.
> >>>
> >>>> In your updated proposal, you point out
> >>>> org.apache.solr.core.CoreContainer#assertPathAllowed. That is called
> >>>> by a number of places... although I *think* the original intention was
> >>>> only to limit where cores are created? Can you elaborate on what the
> >>>> role of that method *should* be and how the JSM might also work?
> >>>
> >>> The pathAllowed checks (also the ones pre-dating the centralization in
> >>> CoreContainer) were introduced in 8.x, before JSM was introduced or
> >>> enabled by default. Initially assertPathAllowed would validate end-user
> >>> API input to avoid cores being created or loaded outside blessed
> >>> folders. It has since been reused for other code locations that may do
> >>> file access based on user input in API or config.
> >>> It was also extended to block UNC paths after many attacks with this
> >>> as a vector.
> >>>
> >>> I believe perhaps pathAllowed would not have been written had JSM
> >>> already been enforced. Although I don't know if you can block UNC with
> >>> JSM?
> >>> The SIP does not propose to remove pathAllowed now. One benefit of
> >>> early detection is that we can give better context-sensitive error and
> >>> log messages rather than throwing an exception.
> >>>
> >>> The agent approach laid out sandboxes four attack vectors:
> >>> * File access outside a limited set of folders and full block of
> >>> Windows UNC
> >>> * Network access other than peer Solr nodes and ZK nodes
> >>> * Disallow calling System.exit() (mainly useful for 3rd party plugins?)
> >>> * Disallow spawning processes
> >>>
> >>> A nice thing with the agent approach is that it is unintrusive and easy
> >>> to disable, so users who want to take care of all this on OS level may
> >>> disable it, and if we or users don't find value in it down the road
> >>> they can disable it or we can remove it.
> >>>
> >>>
> >>> Jan
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: [email protected]
> >>> For additional commands, e-mail: [email protected]
> >>>
> >>>
> >>
> >> --
> >> http://www.needhamsoftware.com (work)
> >> https://a.co/d/b2sZLD9 (my fantasy fiction book)
> >
>
