mkaravel commented on PR #47186: URL: https://github.com/apache/spark/pull/47186#issuecomment-2214443368
When Redshift introduced SUPER they faced the same issue: writing an effective query is clunky in the presence of the SUPER data type, which (like the VARIANT type in OSS) implements dynamic typing. See the [corresponding documentation](https://docs.aws.amazon.com/redshift/latest/dg/query-super.html#dynamic-typing-lax-processing) for a relevant example. Redshift’s solution was to embed try_cast semantics in all operators so that users do not have to write extremely complicated queries. The choice was clearly made to make users’ lives easier (ease of use) and to make the resulting SQL code much more readable. One has to note here, though, that Redshift does not have try_cast functionality.

My (personal) take: having LAX semantics (this is what Redshift calls it) is paramount when we are talking about dynamic typing (which is what Redshift's SUPER and OSS VARIANT are trying to do), and it makes the new types orders of magnitude easier to use.

The proposed syntax in OSS (again, in my opinion) goes one step further than what Redshift does, in the sense that we not only offer LAX semantics (which are essential and necessary), but also offer them in a user-controlled way. In our case (Spark) the alternative would have been to sprinkle try_cast calls throughout the entire query. I believe the proposed syntax makes it much simpler and, IMHO, much more intuitive (use of a cast operator when a cast is needed, as opposed to a function call).

So, bottom line: Spark is not the first to associate LAX semantics with dynamic/polymorphic data types, but with this PR OSS will be the first to make it part of the syntax in a nice manner that extends the existing and established casting syntax (I am referring to `::`).
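To make the strict vs. LAX distinction concrete, here is a minimal sketch in plain Python rather than SQL. It is not how Spark or Redshift implement casts; `strict_cast_int`, `lax_cast_int`, and the sample `rows` are hypothetical names and data made up for illustration. The only point is that a strict cast fails on the first value of an unexpected type, while a LAX (try_cast-like) cast yields a null and lets the query continue.

```python
from typing import Any, Optional


def strict_cast_int(value: Any) -> int:
    """Hypothetical strict cast: raises if the dynamic value is not int-like."""
    if isinstance(value, bool):
        # Treat booleans as a type mismatch in this sketch.
        raise TypeError("cannot cast bool to int")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        # int() raises ValueError for non-numeric strings such as "n/a".
        return int(value)
    raise TypeError(f"cannot cast {type(value).__name__} to int")


def lax_cast_int(value: Any) -> Optional[int]:
    """Hypothetical LAX cast: same conversion, but a failure becomes None."""
    try:
        return strict_cast_int(value)
    except (TypeError, ValueError):
        return None


# Dynamically typed rows (made-up data): the same field holds different types.
rows = [{"price": 10}, {"price": "25"}, {"price": "n/a"}, {"price": [1, 2]}]

# With LAX semantics every row produces a result; bad values become None.
lax_prices = [lax_cast_int(row["price"]) for row in rows]
print(lax_prices)
```

With the strict variant, the same list comprehension would stop with an exception at the third row, which is the behavior a user would otherwise have to guard against by wrapping every access in a try_cast-style call.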
