sunng87 opened a new issue, #18210:
URL: https://github.com/apache/datafusion/issues/18210

   ### Is your feature request related to a problem or challenge?
   
   Hello community,
   
   I have been thinking about adding 
[Postgres-style JSON operators](https://www.postgresql.org/docs/18/functions-json.html#FUNCTIONS-JSON) 
for nested data structures, mainly `Struct` and `List`. These operators 
include:
   
   - `->`, `->>`, `#>>`: Access data field by index/key
   - `@>`, `<@`, `?`, `?|`, `?&`: Containment testing
   - `||`, `-`, `#-`: Data structure manipulation
   - `@?`, `@@`: Predicate testing
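
   To make the intended semantics concrete, here is a minimal sketch of `->` 
and `@>` over a toy nested value. Everything in it (`Value`, `Key`, `get`, 
`contains`, `sample_row`) is a hypothetical stand-in written with the standard 
library only; it is not an Arrow or DataFusion API, and it simplifies Postgres' 
rules (no negative indexes, no special case for a list containing a bare 
scalar).

```rust
use std::collections::BTreeMap;

// Toy stand-in for nested data; NOT an Arrow or DataFusion type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
    List(Vec<Value>),
    Struct(BTreeMap<String, Value>),
}

// Right-hand side of `->`: a field name or a list index.
enum Key<'a> {
    Field(&'a str),
    Index(usize),
}

// `->`: access a struct field by name or a list element by index.
// `None` models SQL NULL (missing key, out-of-range index, or wrong kind).
fn get<'a>(value: &'a Value, key: &Key) -> Option<&'a Value> {
    match (value, key) {
        (Value::Struct(fields), Key::Field(name)) => fields.get(*name),
        (Value::List(items), Key::Index(i)) => items.get(*i),
        _ => None,
    }
}

// `@>`: does `left` contain `right`?
// Structs: every field of `right` must be contained in the same field of `left`.
// Lists: every element of `right` must be contained in some element of `left`.
// Anything else: plain equality.
fn contains(left: &Value, right: &Value) -> bool {
    match (left, right) {
        (Value::Struct(l), Value::Struct(r)) => r
            .iter()
            .all(|(k, rv)| l.get(k).map_or(false, |lv| contains(lv, rv))),
        (Value::List(l), Value::List(r)) => {
            r.iter().all(|rv| l.iter().any(|lv| contains(lv, rv)))
        }
        _ => left == right,
    }
}

// Hypothetical sample row: {"name": "datafusion", "tags": [1, 2, 3]}
fn sample_row() -> Value {
    let mut fields = BTreeMap::new();
    fields.insert("name".to_string(), Value::Text("datafusion".to_string()));
    fields.insert(
        "tags".to_string(),
        Value::List(vec![Value::Int(1), Value::Int(2), Value::Int(3)]),
    );
    Value::Struct(fields)
}
```

   With this model, `get(&sample_row(), &Key::Field("tags"))` yields the list, 
and chaining a second `get` with `Key::Index(1)` yields the second element, 
which mirrors `row -> 'tags' -> 1`.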
   
   ### Describe the solution you'd like
   
   Before starting, I want to make sure I'm heading in the right direction.
   
   1. I assume we won't have a built-in `json` type in DataFusion, so these 
operators will be implemented directly on `Struct`, `List` and other 
`json`-like primitives, following Postgres' semantics. I noticed that VARIANT 
is coming to arrow/datafusion; will there be a new `DataType` for `VARIANT`? 
If so, it would be a good option for the input and return types of these 
operators.
   2. At the moment, we don't support operators between a nested data 
structure and a primitive. If the left input is nested, we assume the right 
array is nested too and apply comparison operators recursively: 
https://github.com/apache/datafusion/blob/main/datafusion/physical-expr/src/expressions/binary.rs#L254
 I will need to change this behavior.
   3. Some of the operators may produce dynamically typed results if the right 
input is an array. For example, if the right array of `->` is `["a", "b", "c"]`, 
the result set may contain up to 3 different data types, which breaks our type 
system. So for these access operators (`->`, `->>`, `#>>`), I'm going to 
support the scalar version only.
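
   The typing problem in point 3 can be sketched as follows. This is only an 
illustration under assumed names: `Type`, `access_type_scalar` and 
`access_type_per_row` are hypothetical and use toy type descriptors rather 
than arrow's `DataType`. With a scalar key the output type is a single value 
known at planning time; with one key per row, the rows may disagree.

```rust
// Toy type descriptors; NOT arrow's DataType.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Int64,
    Utf8,
    Boolean,
    Struct(Vec<(String, Type)>),
}

// Output type of `input -> key` when `key` is one scalar known up front.
// Returns None if `input` is not a struct or has no such field.
fn access_type_scalar(input: &Type, key: &str) -> Option<Type> {
    match input {
        Type::Struct(fields) => fields
            .iter()
            .find(|(name, _)| name.as_str() == key)
            .map(|(_, field_type)| field_type.clone()),
        _ => None,
    }
}

// Output type of `input -> keys` when the right side holds one key per row.
// A column needs a single type, so this returns None as soon as two rows
// resolve to different types (or any key is missing, or `keys` is empty).
fn access_type_per_row(input: &Type, keys: &[&str]) -> Option<Type> {
    let mut types = keys.iter().map(|key| access_type_scalar(input, key));
    let first = types.next()??;
    for resolved in types {
        if resolved? != first {
            return None;
        }
    }
    Some(first)
}
```

   For a struct with fields `a: Int64`, `b: Utf8`, `c: Boolean`, the scalar 
form with key `"a"` resolves to `Int64`, while the per-row form with keys 
`["a", "b", "c"]` has no single output type, which is why only the scalar 
version is proposed.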
   
   Some of the kernels will be implemented in `arrow-rs` first and then 
integrated into DataFusion.
   
   Let me know whether these changes make sense and align with any previous 
plans. I will then start sending pull requests to both repos.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

