mihaibudiu commented on PR #3947:
URL: https://github.com/apache/calcite/pull/3947#issuecomment-2524953060

   In the last commit I have reworked the runtime implementation of VARIANT.
   (We have tested this design in a different backend for a while, and it works 
pretty well.)
   In the previous implementation, the runtime type for generic types like ARRAY 
or MAP held information about the element types. In the current 
implementation this is no longer true: when an ARRAY is converted to a VARIANT, 
all of its elements are converted to VARIANT as well. The same holds for a MAP. 
Converting a ROW to a VARIANT generates a MAP indexed by the field names.
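
   To make the conversion rules concrete, here is a minimal Java sketch. It is 
not the code from this PR: the class names `VariantSketch`, `Variant`, and 
`Row`, the `toVariant` method, and the type tags are all hypothetical, and 
keeping map keys unwrapped is a simplification made only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these names come from the PR.
class VariantSketch {

  /** Stand-in for a runtime VARIANT: a coarse type tag plus the wrapped value. */
  static final class Variant {
    final String typeName;
    final Object value;

    Variant(String typeName, Object value) {
      this.typeName = typeName;
      this.value = value;
    }
  }

  /** Stand-in for a ROW value: parallel lists of field names and field values. */
  static final class Row {
    final List<String> fieldNames;
    final List<Object> fieldValues;

    Row(List<String> fieldNames, List<Object> fieldValues) {
      this.fieldNames = fieldNames;
      this.fieldValues = fieldValues;
    }
  }

  /** Recursively wraps a value; containers keep no element-type information. */
  static Variant toVariant(Object o) {
    if (o == null) {
      return new Variant("NULL", null); // assumption: nulls get their own tag
    }
    if (o instanceof Variant) {
      return (Variant) o; // already a variant, nothing to do
    }
    if (o instanceof List) {
      List<Variant> elements = new ArrayList<>();
      for (Object e : (List<?>) o) {
        elements.add(toVariant(e));
      }
      return new Variant("ARRAY", elements);
    }
    if (o instanceof Map) {
      // Simplification: keys are kept as-is, only the values are wrapped.
      Map<Object, Variant> entries = new LinkedHashMap<>();
      for (Map.Entry<?, ?> e : ((Map<?, ?>) o).entrySet()) {
        entries.put(e.getKey(), toVariant(e.getValue()));
      }
      return new Variant("MAP", entries);
    }
    if (o instanceof Row) {
      // A ROW becomes a MAP indexed by the field names.
      Row row = (Row) o;
      Map<Object, Variant> fields = new LinkedHashMap<>();
      for (int i = 0; i < row.fieldNames.size(); i++) {
        fields.put(row.fieldNames.get(i), toVariant(row.fieldValues.get(i)));
      }
      return new Variant("MAP", fields);
    }
    // Scalars: tag with the upper-cased Java class name (an arbitrary choice).
    return new Variant(o.getClass().getSimpleName().toUpperCase(Locale.ROOT), o);
  }

  public static void main(String[] args) {
    Variant array = toVariant(Arrays.asList(1, 2, 3));
    System.out.println(array.typeName); // the tag is ARRAY, with no element type

    Variant row = toVariant(
        new Row(Arrays.asList("id", "name"), Arrays.<Object>asList(7, "x")));
    System.out.println(row.typeName); // the tag is MAP
  }
}
```

   In this sketch the element types are still recoverable, but only by 
inspecting each wrapped element; the container's own tag no longer carries 
them.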
   
   For the user there isn't much of a difference, and, as you may notice, the 
tests have changed very little. The runtime cost is different, though. Neither 
scheme dominates the other; which one is preferable depends on the workload.
   
   I believe that this is similar to what Snowflake does, although I could not 
find a precise description of their implementation. However, I believe that 
the TYPEOF function in Snowflake applied to an ARRAY returns just "ARRAY" 
and not "INT ARRAY", so it more closely resembles the current implementation.
   
   VARIANT shines at handling JSON, so in future PRs we should add more JSON 
support. Unlike the existing Calcite JSON support, VARIANT represents JSON 
natively and does not need to convert to and from strings on every 
operation, which avoids repeated parsing and serialization.
   
   I will give potential readers a few more days to review this.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.