clintropolis commented on PR #12753:
URL: https://github.com/apache/druid/pull/12753#issuecomment-1183687543

   >To me this sounds like the basics of the feature are production-ready. 
There may be various callouts about performance, but it seems that the 
compatibility story is tight enough that the feature doesn't need the 
experimental markings at this time.
   
   I think that is fair :+1:
   
   >You mentioned being somewhat less certain about the behavior of nested 
arrays. We should figure out if that part is going to be included in the 
production-ready feature set, or if we'll call that particular scenario out as 
an evolving area. What is your intent & recommendation in this area?
   
   It is definitely going to be an evolving area (which I think could be said 
of our array support in general), though there is probably a narrow range of 
use cases that could work today, mainly where array lengths and element 
positions are known and have some meaning, and query-time operations are 
primarily extracting and operating on individual elements. These are more or 
less the current limitations of `flattenSpec` with nested arrays, I think.
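   As a rough illustration of that "known position" case, here is a minimal 
Python sketch of extracting one element from a nested array. It is not Druid 
code; `extract_element`, the `row` shape, and the step-list form of the path 
are all hypothetical and only model the idea of addressing a fixed index.

```python
def extract_element(doc, steps):
    """Walk a parsed JSON value using object keys (str) and array positions (int).

    Hypothetical helper for illustration only; returns None when a step
    does not exist, loosely mirroring a missing value.
    """
    current = doc
    for step in steps:
        if isinstance(current, dict) and isinstance(step, str) and step in current:
            current = current[step]
        elif isinstance(current, list) and isinstance(step, int) and 0 <= step < len(current):
            current = current[step]
        else:
            return None
    return current


# Position 2 is meaningful here only because the array layout is fixed (x, y, z).
row = {"reading": {"xyz": [1.5, -0.25, 9.8]}}
z_value = extract_element(row, ["reading", "xyz", 2])
missing = extract_element(row, ["reading", "xyz", 5])
```

   This only works well when every row agrees on what each position means, 
which is the narrow range of use cases described above.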
   
   There is some lower-hanging fruit that would improve things in the near 
term, some of which might be possible to get in before the next release. The 
first is supporting wildcards in the subset of the path syntax that we support, 
which would allow `JSON_QUERY` and `JSON_VALUE` (or something like it; I'm not 
entirely sure how the `RETURNING` syntax would work with array types in SQL, so 
I need to do some tinkering there) to extract complete arrays. For `JSON_QUERY` 
these results would still be `COMPLEX<json>` typed, but `JSON_VALUE` (or its 
equivalent) would spit out Druid literal array types (`ARRAY<LONG>`, 
`ARRAY<STRING>`, etc).
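   To make the wildcard idea concrete, here is a small Python sketch of what 
fanning out across array elements could look like. It models the semantics 
only; the `"*"` step, `extract_all`, and the sample row are assumptions for 
illustration, not the path syntax or implementation Druid will end up with.

```python
WILDCARD = "*"  # hypothetical marker for "every element of this array"


def extract_all(doc, steps):
    """Return every value matched by steps; a wildcard fans out over a list."""
    matches = [doc]
    for step in steps:
        next_matches = []
        for value in matches:
            if step == WILDCARD and isinstance(value, list):
                next_matches.extend(value)
            elif isinstance(value, dict) and isinstance(step, str) and step in value:
                next_matches.append(value[step])
            elif isinstance(value, list) and isinstance(step, int) and 0 <= step < len(value):
                next_matches.append(value[step])
            # Anything else simply does not match and contributes nothing.
        matches = next_matches
    return matches


row = {"orders": [{"id": 1, "qty": 3}, {"id": 2, "qty": 5}]}
all_quantities = extract_all(row, ["orders", WILDCARD, "qty"])  # whole array of values
first_quantity = extract_all(row, ["orders", 0, "qty"])         # single known position
```

   In this sketch the wildcard result is a complete list of primitive values, 
which is the kind of output that would map onto a literal array type.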
   
   For nested arrays of JSON objects extracted by `JSON_QUERY`, I think we will 
want a way to convert a `COMPLEX<json>` into an `ARRAY<COMPLEX<json>>` so that 
they too can take part in array operations, _especially_ once we add a native 
`UNNEST` function to transform arrays into tables, which would be the path to 
exploding out these nested objects and performing operations on their contents.
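   For anyone unfamiliar with the idea, the following Python sketch shows the 
general shape of an unnest operation: one output row per array element. The 
`unnest` helper and the sample rows are hypothetical, and dropping rows with 
empty arrays is an assumption of this sketch, not a statement about how a 
native `UNNEST` would behave.

```python
def unnest(rows, array_column):
    """Yield one row per element of array_column, copying the other columns."""
    for row in rows:
        # Missing or empty arrays produce no output rows in this sketch.
        for element in row.get(array_column) or []:
            out = {key: value for key, value in row.items() if key != array_column}
            out[array_column] = element
            yield out


rows = [
    {"user": "a", "items": [{"sku": "x", "qty": 1}, {"sku": "y", "qty": 4}]},
    {"user": "b", "items": []},
]
exploded = list(unnest(rows, "items"))

# Once exploded, the nested objects can be operated on like ordinary rows.
total_qty = sum(r["items"]["qty"] for r in exploded)
```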
   
   At some point after that, I intend to introduce the option to begin storing 
literal arrays in nested `ARRAY` typed columns, instead of breaking them out 
into separate columns for individual elements as is done currently (so that 
array operations don't have to decompress a bunch of separate columns to do 
their work).
   
   I guess I'm getting a bit into the weeds, but my point is that I think this 
feature will evolve alongside, and should help us improve, array support in 
general, so I am hyped to get it there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

