dilipbiswal commented on issue #25522: [SPARK-28787][DOC][SQL]Document LOAD DATA statement in SQL Reference
URL: https://github.com/apache/spark/pull/25522#issuecomment-544687149

@srowen
> What's the move vs copy issue?

If we look at https://cwiki.apache.org/confluence/display/Hive/GettingStarted and search for the "LOAD DATA" command, we see the following comments under `NOTES`:

NO verification of data against the schema is performed by the load command.
1. If the file is in hdfs, it is moved into the Hive-controlled file system namespace.
2. The root of the Hive directory is specified by the option hive.metastore.warehouse.dir in hive-default.xml. We advise users to create this directory before trying to create tables via Hive.

The question I had was: should we document our exact behaviour, i.e. do we move the data from the original location to the target location, or do we copy it?

Can we move ahead on this PR as is and clarify it in a follow-up?
