GitHub user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/12313#issuecomment-222225637
  
    @rdblue Thank you for your reply. For #2, yeah, I feel it is better to be strict right now. I checked with yesterday's master, and it seems we already require that the data and the table have the same number of fields for `write.insertInto` (see https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/52316283059651/545869019913238/4814681571895601/latest.html)?
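    For reference, here is a minimal sketch of the positional behavior; the table name and schema are made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("insertInto-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical target table, used only for illustration.
spark.sql("CREATE TABLE target (a INT, b INT) USING parquet")

// insertInto resolves columns by position: the DataFrame's column names
// are ignored, so the value under "b" lands in table column a.
Seq((1, 2)).toDF("b", "a").write.insertInto("target")
spark.sql("SELECT * FROM target").show()  // a = 1, b = 2 (positional)

// A DataFrame with a different number of columns fails at analysis time:
// Seq((1, 2, 3)).toDF("a", "b", "c").write.insertInto("target")
```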
    
    I do agree that in the DataFrame API it is not obvious that `insertInto` follows SQL's behavior. But I am not sure changing its behavior is the best solution. My main concern with adding `byName` is that it makes the behavior of `insertInto` different from SQL's `INSERT INTO`. Also, I feel it will be good to have a holistic solution that handles self-describing data well (e.g. adding new columns/inner fields and omitting existing columns/inner fields).
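    To make that concern concrete, a purely hypothetical sketch (continuing the session above; `byName` is a proposed option, not an existing API, and its exact shape here is made up):

```scala
val df = Seq((1, 2)).toDF("b", "a")

// Today, matching SQL's INSERT INTO: positional resolution.
df.write.insertInto("target")            // b -> target.a, a -> target.b

// Under the proposal: name-based resolution, diverging from SQL.
// df.write.byName.insertInto("target")  // hypothetical: a -> target.a
```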
    
    (btw, right now, `write.mode("append").saveAsTable(....)` does name-based resolution on top-level columns.)
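    For comparison, a small sketch of that append path (continuing the session above):

```scala
// Append-mode saveAsTable matches top-level columns by name, so a
// reordered DataFrame still lands in the matching table columns.
Seq((3, 4)).toDF("b", "a")
  .write
  .mode("append")
  .saveAsTable("target")
// Table column a receives 4 and column b receives 3 (name-based).
```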

