dmontagu opened a new pull request, #10981: URL: https://github.com/apache/datafusion/pull/10981
Opening this PR at this (early) stage primarily for the sake of getting initial feedback on its viability; @samuelcolvin has additional context.

The idea is that we want to experiment with rewriting operators to use the JSON functions in https://github.com/datafusion-contrib/datafusion-functions-json, but it would help if `datafusion` supported parsing the relevant operators. We can fork `datafusion` to add this support if necessary, but even if that's the case I was hoping to get some guidance about the implementation, and in particular feedback on whether the path this is going down is "wrong" or has a chance of getting merged eventually as the JSON functions stuff gets more mature.

## Which issue does this PR close?

Related to https://github.com/apache/datafusion/issues/7845, though it does not close that issue.

## Rationale for this change

We are using https://github.com/datafusion-contrib/datafusion-functions-json, but the problem is that, as far as I can tell, there is not a way to add support for new operators without modifying the SQL parsing in the main `datafusion` crate.

We want to add support for the common JSON operators (at least `?` and `->`/`->>`, if not the others used by Postgres), and are happy to try those out ourselves, but it would be nice if parsing support was included in the `datafusion` crate even if the new operators remain completely unused.

The good news is that the various operators can already be parsed by the `sqlparser` crate so adding support is mostly mechanical, other than determining their signatures (which admittedly may pose a problem/debate).

## What changes are included in this PR?

This PR currently just adds support for parsing the `?` (`Question`), `->` (`Arrow`), and `->>` (`LongArrow`) operators from `sqlparser::ast::BinaryOperator`.
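
To make the shape of that mapping concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `ParsedOp`, `PlannedOp`, `plan_operator`, and `symbol` are hypothetical names, and `ParsedOp` is a local stand-in that only borrows variant names from `sqlparser::ast::BinaryOperator` so the example compiles without external crates. It includes `HashArrow` (`#>`) purely to show an operator that is not handled yet.

```rust
/// Hypothetical stand-in for a few `sqlparser::ast::BinaryOperator` variants.
/// The real enum has many more cases; only the variant names are borrowed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ParsedOp {
    Question,  // `?`
    Arrow,     // `->`
    LongArrow, // `->>`
    HashArrow, // `#>` (an example of an operator not handled here)
}

/// Hypothetical planner-side operator type (assumed for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PlannedOp {
    Question,
    Arrow,
    LongArrow,
}

/// Map a parsed operator to a planner operator, returning an error for
/// anything that is not one of the three supported cases.
fn plan_operator(op: ParsedOp) -> Result<PlannedOp, String> {
    match op {
        ParsedOp::Question => Ok(PlannedOp::Question),
        ParsedOp::Arrow => Ok(PlannedOp::Arrow),
        ParsedOp::LongArrow => Ok(PlannedOp::LongArrow),
        other => Err(format!("unsupported SQL binary operator {other:?}")),
    }
}

/// Render a planner operator back to its SQL spelling.
fn symbol(op: PlannedOp) -> &'static str {
    match op {
        PlannedOp::Question => "?",
        PlannedOp::Arrow => "->",
        PlannedOp::LongArrow => "->>",
    }
}
```

Under these assumptions, `plan_operator(ParsedOp::Arrow)` succeeds and `plan_operator(ParsedOp::HashArrow)` returns an error, which is why extending coverage to further operators is mostly a matter of adding match arms once their signatures are agreed.
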
I'll note that Postgres also has `#>` (`HashArrow`), `#>>` (`HashLongArrow`), `#-` (`HashMinus`), `@?` (`AtQuestion`), `?&` (`QuestionAnd`), `?|` (`QuestionPipe`), and `@@` (`AtAt`) as established JSON operators that it might be nice to add support for at the same time, considering these are already cases in the `sqlparser::ast::BinaryOperator` enum. I would be happy to add support for some or all of those as well if desirable.

## Are these changes tested?

Not yet; I'll be happy to add some tests if this has a chance of being accepted.

## Are there any user-facing changes?

This PR adds the ability to parse these operators. For most/all of them, we'll presumably want any actual use to produce errors if the operator isn't involved in a user-defined rewrite.

Right now, given our interest in using them for JSON, I think we'd like to support `Utf8`/`LargeUtf8` as inputs where a `JSON` value would be expected, but I'm not sure if that makes sense if we want to introduce support for other types.
