mihaibudiu commented on PR #3947: URL: https://github.com/apache/calcite/pull/3947#issuecomment-2524953060
In the last commit I have reworked the runtime implementation of VARIANT. (We have tested this design in a different backend for a while, and it works pretty well.)

In the previous implementation, the runtime type for generic types like ARRAY or MAP held information about the element types. In the current implementation this is no longer true: when an array is converted to a VARIANT, all of its elements are converted to VARIANT as well. The same holds for a MAP. Converting a ROW to a VARIANT generates a MAP indexed by the field names.

For the user there isn't much of a difference, and, as you may notice, the tests have changed very little. The runtime cost is different, though. Neither scheme dominates the other; which one is preferable depends on the workload.

I believe that this is similar to what Snowflake does, although I could not find a precise description of their implementation. I believe that the TYPEOF function in Snowflake applied to an ARRAY will return just "ARRAY" and not "INT ARRAY", so it more closely resembles the current implementation.

VARIANT shines at handling JSON, so in future PRs we should add more JSON support. Unlike the existing Calcite JSON support, VARIANT represents JSON natively and does not need to convert back and forth to strings on every operation, which avoids a lot of repeated work.

I will give potential readers a few more days to review this.
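To make the conversion scheme described above concrete, here is a minimal Java sketch. It is not the code from this PR: the names `VariantSketch`, `Variant`, `toVariant`, and `rowToVariant`, as well as the type-name strings, are hypothetical and chosen only for illustration. It assumes a variant carries just a coarse runtime type name, and that containers are converted element by element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Illustrative sketch only; these are NOT Calcite's actual runtime classes. */
final class VariantSketch {
  /** A value tagged with a coarse runtime type name (no element types). */
  static final class Variant {
    private final String typeName; // hypothetical names: "INTEGER", "ARRAY", "MAP", ...
    private final Object value;

    Variant(String typeName, Object value) {
      this.typeName = typeName;
      this.value = value;
    }

    String typeName() { return typeName; }
    Object value() { return value; }

    @Override public boolean equals(Object o) {
      return o instanceof Variant
          && typeName.equals(((Variant) o).typeName)
          && Objects.equals(value, ((Variant) o).value);
    }

    @Override public int hashCode() { return Objects.hash(typeName, value); }
  }

  /** Converts a value to a Variant; containers are converted recursively. */
  static Variant toVariant(Object o) {
    if (o instanceof Variant) return (Variant) o;
    if (o == null) return new Variant("NULL", null);
    if (o instanceof Integer) return new Variant("INTEGER", o);
    if (o instanceof String) return new Variant("VARCHAR", o);
    if (o instanceof List) {
      // Every element becomes a Variant, so the array itself is just "ARRAY".
      List<Variant> out = new ArrayList<>();
      for (Object e : (List<?>) o) out.add(toVariant(e));
      return new Variant("ARRAY", out);
    }
    if (o instanceof Map) {
      // Keys and values are both converted.
      Map<Variant, Variant> out = new LinkedHashMap<>();
      for (Map.Entry<?, ?> e : ((Map<?, ?>) o).entrySet()) {
        out.put(toVariant(e.getKey()), toVariant(e.getValue()));
      }
      return new Variant("MAP", out);
    }
    throw new IllegalArgumentException("Unsupported type in this sketch: " + o.getClass());
  }

  /** A row becomes a MAP indexed by its field names. */
  static Variant rowToVariant(List<String> fieldNames, List<?> fieldValues) {
    if (fieldNames.size() != fieldValues.size()) {
      throw new IllegalArgumentException("Field names and values differ in length");
    }
    Map<Variant, Variant> out = new LinkedHashMap<>();
    for (int i = 0; i < fieldNames.size(); i++) {
      out.put(toVariant(fieldNames.get(i)), toVariant(fieldValues.get(i)));
    }
    return new Variant("MAP", out);
  }

  public static void main(String[] args) {
    Variant array = toVariant(Arrays.asList(1, 2, 3));
    System.out.println(array.typeName()); // prints "ARRAY", not "INT ARRAY"

    Variant row = rowToVariant(Arrays.asList("id", "name"), Arrays.asList(7, "x"));
    System.out.println(row.typeName()); // prints "MAP"
  }
}
```

The trade-off mentioned in the comment shows up here: conversion to a variant walks the whole container eagerly, but afterwards every element can be inspected without consulting a static element type.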
