Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1889#issuecomment-52090218

@ueshin, thanks a lot for investigating this further! This is super important and I have been meaning to get to it for a while. Here are my thoughts:

- We probably shouldn't throw an exception when trying to store arrays or maps just because `containsNull`/`valueContainsNull` is `true`. Those flags mean "could contain null", not "do contain null", and due to Hive semantics we are often very conservative in stating `nullable = false`.
- It would be great if you could explain how the format is going to change to handle null values. Is there consensus in the Parquet community about how to encode this? Will the change be backwards incompatible?
- If it's going to be backwards incompatible, it would be really good to make the change before 1.1. Please open a blocker JIRA targeted at 1.1 if that is the case. If we don't need to make backwards-incompatible changes, then this is more of a "very nice to have" for 1.1. I'm okay with throwing exceptions saying "not supported" when people try to store null values into arrays or maps (though this is obviously less than ideal).

Thanks again!
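The distinction in the first point above can be made concrete. A minimal Scala sketch (not Spark's actual implementation — the simplified `DataType`/`ArrayType` model and the check functions below are hypothetical, chosen to mirror the names in Spark SQL's type system) showing why rejecting a write based only on the schema flag is too strict, versus checking the data itself:

```scala
// Simplified stand-ins for Spark SQL's type system (assumption: real
// ArrayType also carries containsNull, but this is not Spark code).
sealed trait DataType
case object IntegerType extends DataType
// containsNull = true means the array *may* contain nulls,
// not that any given array *does* contain one.
case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType

object NullChecks {
  // Too strict: rejects every array whose schema merely *allows* nulls,
  // even when the actual data is null-free.
  def rejectBySchema(t: ArrayType): Boolean = t.containsNull

  // Better: only reject (or handle specially) when the data really
  // contains a null element.
  def dataHasNull(values: Seq[Any]): Boolean = values.contains(null)
}

// A schema conservatively marked containsNull = true (as Hive-derived
// schemas often are) still frequently holds null-free data:
val schema = ArrayType(IntegerType, containsNull = true)
val cleanData = Seq(1, 2, 3)
```

With this model, `rejectBySchema(schema)` would refuse the write even though `dataHasNull(cleanData)` is `false` — which is the conservative-schema problem the comment describes.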