I think we can treat it like pulling in any other third-party crate or library. I see that it's marked as an optional dependency [1], which is great. It's also added as a feature [2]; I would suggest making it explicit that this is a community contribution rather than an Apache one, so maybe rename the feature to `community_json` or something similar. We can also document in LICENSE/NOTICE/README that the library is a community contribution not affiliated with the Apache Software Foundation.

Best,
Kevin Liu

[1] https://github.com/apache/datafusion-python/pull/1466/files#diff-bac59d6e5ada615de3d27a8e8f87d272613a80b5d3e4a7e2c2e4a08e63dcf0a1R56
[2] https://github.com/apache/datafusion-python/pull/1466/files#diff-bac59d6e5ada615de3d27a8e8f87d272613a80b5d3e4a7e2c2e4a08e63dcf0a1R78

On Mon, Mar 30, 2026 at 9:45 AM Andrew Lamb <[email protected]> wrote:

> Another thing to consider is the maintenance burden (maybe not that bad).
>
> In my mind, if we are going to distribute datafusion-python with the json
> functions, we should bring the datafusion-json functions under Apache
> governance. Otherwise we might end up with a situation like a security
> issue in an Apache product due to some other crate.
>
> Of course, we already do this with the other third-party dependencies (like
> `hashbrown`, for example), so maybe it isn't that different 🤔
>
> I think the most important thing about bringing in code like that is that
> we ensure that its IP provenance is clear (e.g. that the original authors
> have made the donation explicitly under the Apache license).
>
> I am not sure who wrote the code in datafusion-json -- could we get them to
> make the PR instead of a third party?
>
> Andrew
>
> On Mon, Mar 30, 2026 at 10:57 AM Luke Kim <[email protected]> wrote:
>
> > We (Spice AI) use the json crate and it would be nice to have it in, but I
> > think the API should be reviewed for consistency before making it official
> > and having people depend on it.
> >
> > It aligns to the PostgreSQL syntax but not exactly/completely.
> >
> > On Mon, Mar 30, 2026 at 7:39 AM, Tim Saucer <[email protected]> wrote:
> >
> > Hi all,
> >
> > A recent PR [1] has been opened to bring in json scalar functions from the
> > datafusion-contrib crate datafusion-functions-json. Before I move forward
> > with either approving or closing this PR, I was wondering how the broader
> > community felt about adding outside content like this. The code from
> > datafusion-contrib is unofficial, so I'm hesitant to include it in our
> > official release.
> >
> > I could see a second route, which would be to add Python support for all of
> > those functions inside that contrib crate. But that means someone who
> > maintains that code will also need to publish Python packages in addition
> > to their current Rust code. It's not a huge burden, but it is additional
> > work.
> >
> > I'd appreciate any thoughts you have on non-official crate functions being
> > included.
> >
> > [1]: https://github.com/apache/datafusion-python/pull/1466
