Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/16179
> After the change these temporary data files in staging directory of
> InsertIntoHiveTable will be moved to the table location instead of copying to
> the table location. Is that right?
It depends. Before this change, the behavior depended on where the table was.
If the table was on HDFS (or any filesystem other than the local one), the
files were moved, so the behavior doesn't change. If the table was on the
local filesystem, the files were copied and then deleted when the staging
directory was removed. So in the end, it's the same thing.
With the change, the data is moved in both cases, which is also correct and
leads to the same result.
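To make the distinction concrete, here is a minimal, hypothetical sketch (not Spark's or Hive's actual code; the function name is illustrative) of why "move" and "copy then delete the staging dir" end up equivalent: a rename works within one filesystem, while across filesystems the fallback is copy followed by deleting the staged original.

```python
import os
import shutil
import tempfile

def finalize_staged_file(src, dst):
    """Promote a staged file to its final location (illustrative sketch).

    Within a single filesystem, a rename is a cheap metadata-only move.
    Across filesystems, the fallback is copy-then-delete. Either way the
    staged source is gone afterwards and the destination holds the data.
    """
    try:
        os.rename(src, dst)       # same filesystem: a move
    except OSError:
        shutil.copy2(src, dst)    # cross-filesystem: copy the bytes...
        os.remove(src)            # ...then delete the staged original

# Demo with temporary directories standing in for staging dir and table dir.
staging = tempfile.mkdtemp(prefix="staging-")
table = tempfile.mkdtemp(prefix="table-")
src = os.path.join(staging, "part-00000")
with open(src, "w") as f:
    f.write("row1\n")
dst = os.path.join(table, "part-00000")

finalize_staged_file(src, dst)
print(os.path.exists(src), os.path.exists(dst))  # False True
```

In both branches the observable outcome is identical: the staging copy no longer exists and the table location has the data, which is why the change does not alter results.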
I just want to reinforce, again, that this is not about a change in
behavior in Hive at all. This is Spark using a Hive API incorrectly.
> VersionSuite is also being used for testing end-to-end behaviors in
> #16104.
I'm not sure that's such a great idea, but in any case, the tests for this
change are the existing tests in "InsertIntoHiveTableSuite" and
"HiveCommandSuite". So basically you'd be asking to run those against all the
different versions of Hive metastores supported by Spark. That's doable, but
it's a bigger change that I don't really think is necessary here. The Hive
semantics haven't changed; Spark was depending on undocumented behavior that
happened to work, and this change fixes that.