Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/15432

@gatorsmile @HyukjinKwon As a general comment, Spark SQL doesn't claim a particular level of ANSI SQL compatibility. If anything, it tries to match "whatever Hive does", and that's probably the best goal. It's always interesting to see what other RDBMSes do, especially where Hive's behavior seems ambiguous, but I would not describe this as required research; "what Hive does" is the important question in most cases. This one is funny because Hive accepts behavior I wouldn't expect, even after reading its docs.

I'm neutral on changing it if nobody is suggesting it's actually a problem in practice, but I'm for changing it to match Hive if @rxin slightly favors it. Documenting the behavior... sounds good. I suppose that commits us to the behavior, but I can't see Hive changing this in a way we'd then want to follow.

For a change that's so clearly inside the SQL engine and not language-specific, I personally wouldn't imagine we need language-specific tests. The new tests seem to cover the possibilities from a SQL perspective; we're not here testing whether R/Python correctly serialize "null". This is also, keep in mind, a corner case of the behavior to begin with.