[GitHub] spark pull request: [SPARK-6052][SQL]In JSON schema inference, we ...

viirya Fri, 27 Feb 2015 07:40:51 -0800

Github user viirya commented on the pull request:

    https://github.com/apache/spark/pull/4806#issuecomment-76413623
  
    Besides, I think it is odd to set `containsNull` manually during JSON 
schema inference. Sampling should not be the deciding argument, because by the 
same logic sampling may also miss arrays with different column types.
    
    So the main point is still the problem of inserting JSON data into a 
Parquet data source table. In #4729 I just copy the schema of the JSON data, 
modify `containsNull` on the copy, and use that copy for the insertion, without 
actually modifying the schema of the JSON data itself.
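
To illustrate the copy-and-modify idea, here is a minimal sketch in plain Python. It is not Spark's API and not the actual code in #4729: the classes `Atomic`, `Array`, `Field`, `Struct` and the helper `with_contains_null` are hypothetical stand-ins for a schema tree, and setting the flag to `True` is only an assumption made for the example.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union


@dataclass(frozen=True)
class Atomic:
    """Hypothetical leaf type such as "string" or "long"."""
    name: str


@dataclass(frozen=True)
class Array:
    """Hypothetical array type; contains_null mirrors the flag discussed above."""
    element: "DataType"
    contains_null: bool


@dataclass(frozen=True)
class Field:
    name: str
    data_type: "DataType"
    nullable: bool = True


@dataclass(frozen=True)
class Struct:
    fields: Tuple[Field, ...]


DataType = Union[Atomic, Array, Struct]


def with_contains_null(dt: DataType, contains_null: bool) -> DataType:
    """Return a copy of dt with contains_null set on every array, at any depth.

    The input is never mutated, so the inferred JSON schema stays as it was.
    """
    if isinstance(dt, Array):
        return Array(with_contains_null(dt.element, contains_null), contains_null)
    if isinstance(dt, Struct):
        return Struct(tuple(
            replace(f, data_type=with_contains_null(f.data_type, contains_null))
            for f in dt.fields
        ))
    return dt  # atomic types carry no flag


# Schema as (hypothetically) inferred from JSON: no nulls were seen in "tags".
json_schema = Struct((Field("tags", Array(Atomic("string"), contains_null=False)),))

# Adjusted copy used only for the insertion; the value True is an assumption.
insert_schema = with_contains_null(json_schema, True)
```

The point of the sketch is that `insert_schema` is a separate object, so the schema reported for the JSON data does not change.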
    
    Both solutions pass the unit test. @liancheng @yhuai, you can decide which 
one is more appropriate.


