dmontagu opened a new pull request, #10981:
URL: https://github.com/apache/datafusion/pull/10981

   Opening this PR at an early stage, primarily to get initial feedback on its 
viability; @samuelcolvin has additional context.
   
   The idea is that we want to experiment with rewriting operators to use the 
JSON functions in 
https://github.com/datafusion-contrib/datafusion-functions-json, but it would 
help if `datafusion` supported parsing the relevant operators. We can fork 
`datafusion` to add this support if necessary, but even then I was hoping for 
some guidance on the implementation, and in particular feedback on whether this 
approach is "wrong" or has a chance of being merged eventually as the JSON 
functions mature.
   
   ## Which issue does this PR close?
   
   Related to https://github.com/apache/datafusion/issues/7845, though it does 
not close that issue.
   
   ## Rationale for this change
   
   We are using 
https://github.com/datafusion-contrib/datafusion-functions-json, but as far as 
I can tell, there is no way to add support for new operators without modifying 
the SQL parsing in the main `datafusion` crate. We want to add support for the 
common JSON operators (at least `?` and `->`/`->>`, if not the others used by 
Postgres), and are happy to try those out ourselves, but it would be nice if 
parsing support were included in the `datafusion` crate even if the new 
operators remain completely unused.
   
   The good news is that the various operators can already be parsed by the 
`sqlparser` crate, so adding support is mostly mechanical, other than 
determining their signatures (which admittedly may pose a problem or prompt 
debate).
   
   ## What changes are included in this PR?
   
   This PR currently just adds support for parsing the `?` (`Question`), `->` 
(`Arrow`), and `->>` (`LongArrow`) operators from 
`sqlparser::ast::BinaryOperator`.
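
   As a rough illustration of the kind of mapping involved, here is a minimal, 
self-contained sketch. It does not use the real crates: `SqlBinaryOperator`, 
`PlanOperator`, and `parse_sql_binary_op` are hypothetical stand-ins (only the 
variant names `Arrow`, `LongArrow`, `Question`, and `HashArrow` come from the 
description above), so the actual change in this PR may look different.

```rust
// Hypothetical stand-in for a few variants of `sqlparser::ast::BinaryOperator`;
// the real enum has many more cases.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SqlBinaryOperator {
    Plus,
    Arrow,     // `->`
    LongArrow, // `->>`
    Question,  // `?`
    HashArrow, // `#>` (listed above as not yet handled)
}

// Hypothetical stand-in for the planner-side operator enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanOperator {
    Plus,
    Arrow,
    LongArrow,
    Question,
}

// Map a parsed SQL operator to a planner operator, returning an error for
// anything the planner does not recognize yet.
pub fn parse_sql_binary_op(op: SqlBinaryOperator) -> Result<PlanOperator, String> {
    match op {
        SqlBinaryOperator::Plus => Ok(PlanOperator::Plus),
        SqlBinaryOperator::Arrow => Ok(PlanOperator::Arrow),
        SqlBinaryOperator::LongArrow => Ok(PlanOperator::LongArrow),
        SqlBinaryOperator::Question => Ok(PlanOperator::Question),
        other => Err(format!("Unsupported SQL binary operator {other:?}")),
    }
}
```

   In this sketch, `parse_sql_binary_op(SqlBinaryOperator::Arrow)` yields 
`Ok(PlanOperator::Arrow)`, while `SqlBinaryOperator::HashArrow` still falls 
through to the error arm.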
   
   I'll note that Postgres also has `#>` (`HashArrow`), `#>>` 
(`HashLongArrow`), `#-` (`HashMinus`), `@?` (`AtQuestion`), `?&` 
(`QuestionAnd`), `?|` (`QuestionPipe`), and `@@` (`AtAt`) as established JSON 
operators. It might be nice to add support for these at the same time, since 
they are already variants of the `sqlparser::ast::BinaryOperator` enum. I would 
be happy to add support for some or all of them if desirable.
   
   ## Are these changes tested?
   
   Not yet; I'll be happy to add some tests if this has a chance of being 
accepted.
   
   ## Are there any user-facing changes?
   
   This PR adds the ability to parse these operators. For most or all of them, 
we'll presumably want any actual use to produce an error unless the operator is 
handled by a user-defined rewrite. Right now, given our interest in using them 
for JSON, I think we'd like to support `Utf8`/`LargeUtf8` as inputs where a 
`JSON` value would be expected, but I'm not sure whether that makes sense if we 
later want to introduce support for other types.
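
   To make the signature question concrete, here is a minimal sketch of one 
possible check, assuming the JSON-side operand must be `Utf8` or `LargeUtf8`. 
The `ArgType` and `JsonOp` enums and the `check_json_operand` function are 
hypothetical and are not part of `datafusion`; this is only one way the rule 
could be expressed.

```rust
// Hypothetical, simplified stand-in for a column data type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArgType {
    Utf8,
    LargeUtf8,
    Int64,
}

// Hypothetical stand-in for the three operators this PR parses.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JsonOp {
    Question,
    Arrow,
    LongArrow,
}

// Accept string-typed inputs where a JSON value is expected (an assumption
// taken from the paragraph above); reject every other type with an error.
pub fn check_json_operand(op: JsonOp, left: ArgType) -> Result<(), String> {
    match left {
        ArgType::Utf8 | ArgType::LargeUtf8 => Ok(()),
        other => Err(format!(
            "operator {op:?} expects a Utf8 or LargeUtf8 JSON input, got {other:?}"
        )),
    }
}
```

   Under these assumptions, `check_json_operand(JsonOp::Arrow, ArgType::Utf8)` 
succeeds and an `Int64` left operand is rejected; supporting other input types 
later would mean widening the first match arm.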


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

