[jira] [Work logged] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
[ https://issues.apache.org/jira/browse/HIVE-26894?focusedWorklogId=851012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851012 ]

ASF GitHub Bot logged work on HIVE-26894:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Mar/23 00:21
Start Date: 15/Mar/23 00:21
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #3909: HIVE-26894: After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
URL: https://github.com/apache/hive/pull/3909

Issue Time Tracking
-------------------

Worklog Id: (was: 851012)
Time Spent: 1h 20m (was: 1h 10m)

> After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
> ------------------------------------------------------------
>
> Key: HIVE-26894
> URL: https://issues.apache.org/jira/browse/HIVE-26894
> Project: Hive
> Issue Type: Sub-task
> Components: Hive
> Affects Versions: 3.1.3
> Reporter: Sruthi M
> Assignee: Sruthi M
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> ERROR : Failed with exception Wrong FS: abfs:///hive/warehouse/managed/tpcds_orc.db/test_sales/delta_001_001_, expected: hdfs://mycluster

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
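As a rough illustration of where an error of this shape comes from: a Hadoop `FileSystem` instance rejects a path whose scheme or authority does not match its own URI, which is what happens when a file system bound to `hdfs://mycluster` is asked to operate on an `abfs:///` path. The sketch below mimics that check using only `java.net.URI`. `WrongFsCheck` and its `checkPath` method are hypothetical stand-ins written for this note, not Hadoop's implementation, and the matching rules are deliberately simplified.

```java
import java.net.URI;

// Hypothetical stand-in for illustration only; this is not Hadoop's FileSystem.
final class WrongFsCheck {
    private WrongFsCheck() {}

    // Throws if `path` is qualified with a scheme or authority that differs from `fsUri`.
    // Simplifying assumption: a path with no scheme, or with a matching scheme and
    // no authority, is accepted as belonging to this file system.
    public static void checkPath(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return; // unqualified path, resolved against this file system
        }
        if (scheme.equalsIgnoreCase(fsUri.getScheme())) {
            String authority = path.getAuthority();
            if (authority == null || authority.equalsIgnoreCase(fsUri.getAuthority())) {
                return; // same file system
            }
        }
        throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fsUri);
    }

    public static void main(String[] args) {
        // The staging file system and target path are taken from the error in the issue.
        URI stagingFs = URI.create("hdfs://mycluster");
        URI target = URI.create(
            "abfs:///hive/warehouse/managed/tpcds_orc.db/test_sales/delta_001_001_");
        try {
            checkPath(stagingFs, target);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the failure is raised before any data moves: the source file system object simply refuses a destination path that belongs to a different file system.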
[jira] [Work logged] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
[ https://issues.apache.org/jira/browse/HIVE-26894?focusedWorklogId=849423&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-849423 ]

ASF GitHub Bot logged work on HIVE-26894:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 07/Mar/23 00:23
Start Date: 07/Mar/23 00:23
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on PR #3909:
URL: https://github.com/apache/hive/pull/3909#issuecomment-1457263005

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------

Worklog Id: (was: 849423)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
[ https://issues.apache.org/jira/browse/HIVE-26894?focusedWorklogId=837200&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-837200 ]

ASF GitHub Bot logged work on HIVE-26894:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 05/Jan/23 10:30
Start Date: 05/Jan/23 10:30
Worklog Time Spent: 10m
Work Description: sankarh commented on code in PR #3909:
URL: https://github.com/apache/hive/pull/3909#discussion_r1062211047

## ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: ##
@@ -5263,23 +5263,38 @@ public Void call() throws HiveException {
   private static void moveAcidFilesForDelta(String deltaFileType, FileSystem fs,
                                             Path dst, Set createdDeltaDirs,
-                                            List newFiles, FileStatus deltaStat) throws HiveException {
-
+                                            List newFiles, FileStatus deltaStat, HiveConf conf) throws HiveException {
+    String configuredOwner = HiveConf.getVar(conf, ConfVars.HIVE_LOAD_DATA_OWNER);
     Path deltaPath = deltaStat.getPath();
     // Create the delta directory. Don't worry if it already exists,
     // as that likely means another task got to it first. Then move each of the buckets.
     // it would be more efficient to try to move the delta with it's buckets but that is
     // harder to make race condition proof.
     Path deltaDest = new Path(dst, deltaPath.getName());
     try {
+      FileSystem destFs = deltaDest.getFileSystem(conf);
       if (!createdDeltaDirs.contains(deltaDest)) {
         try {
-          if(fs.mkdirs(deltaDest)) {
+          // Check if the src and dest filesystems are same or not
+          // if the src and dest filesystems are different, then we need to do copy, instead of rename
+          if (needToCopy(conf, deltaStat.getPath(), deltaDest, fs, destFs, configuredOwner, true)) {

Review Comment:
   Can you add tests that cover these 2 newly added flows?

## ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: ##
@@ -5263,23 +5263,38 @@ public Void call() throws HiveException {
   private static void moveAcidFilesForDelta(String deltaFileType, FileSystem fs,
                                             Path dst, Set createdDeltaDirs,
-                                            List newFiles, FileStatus deltaStat) throws HiveException {
-
+                                            List newFiles, FileStatus deltaStat, HiveConf conf) throws HiveException {
+    String configuredOwner = HiveConf.getVar(conf, ConfVars.HIVE_LOAD_DATA_OWNER);
     Path deltaPath = deltaStat.getPath();
     // Create the delta directory. Don't worry if it already exists,
     // as that likely means another task got to it first. Then move each of the buckets.
     // it would be more efficient to try to move the delta with it's buckets but that is
     // harder to make race condition proof.
     Path deltaDest = new Path(dst, deltaPath.getName());
     try {
+      FileSystem destFs = deltaDest.getFileSystem(conf);
       if (!createdDeltaDirs.contains(deltaDest)) {
         try {
-          if(fs.mkdirs(deltaDest)) {
+          // Check if the src and dest filesystems are same or not
+          // if the src and dest filesystems are different, then we need to do copy, instead of rename
+          if (needToCopy(conf, deltaStat.getPath(), deltaDest, fs, destFs, configuredOwner, true)) {
+            //copy if across file system or encryption zones.
+            LOG.debug("Copying source " + deltaStat.getPath() + " to " + deltaDest + " because HDFS encryption zones are different.");

Review Comment:
   The log message should also indicate that the copy may be across different file systems, not only across encryption zones.

## ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: ##
@@ -5294,11 +5309,24 @@ private static void moveAcidFilesForDelta(String deltaFileType, FileSystem fs,
       for (FileStatus bucketStat : bucketStats) {
         Path bucketSrc = bucketStat.getPath();
         Path bucketDest = new Path(deltaDest, bucketSrc.getName());
+        FileSystem bucketDestFs = bucketDest.getFileSystem(conf);
         LOG.info("Moving bucket " + bucketSrc.toUri().toString() + " to " + bucketDest.toUri().toString());
         try {
-          fs.rename(bucketSrc, bucketDest);
-          if (newFiles != null) {
+          // Check if the src and dest filesystems are same or not
+          // if the src and dest filesystems are different, then we need to do copy, instead of rename
+          if (needToCopy(conf, bucketSrc, bucketDest, fs, bucketDestFs, configuredOwner, true)) {
+            //copy if across file system or encryption zones.
+            LOG.debug("Copying source " + bucketSrc + " to " + bucketDest + " because HDFS encryption zones are different.");
+
[jira] [Work logged] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
[ https://issues.apache.org/jira/browse/HIVE-26894?focusedWorklogId=837020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-837020 ]

ASF GitHub Bot logged work on HIVE-26894:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 04/Jan/23 20:02
Start Date: 04/Jan/23 20:02
Worklog Time Spent: 10m
Work Description: sonarcloud[bot] commented on PR #3909:
URL: https://github.com/apache/hive/pull/3909#issuecomment-1371361113

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
8 Code Smells (rating A)
No Coverage information
No Duplication information

Issue Time Tracking
-------------------

Worklog Id: (was: 837020)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
[ https://issues.apache.org/jira/browse/HIVE-26894?focusedWorklogId=836857&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-836857 ]

ASF GitHub Bot logged work on HIVE-26894:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 04/Jan/23 10:20
Start Date: 04/Jan/23 10:20
Worklog Time Spent: 10m
Work Description: warriersruthi opened a new pull request, #3909:
URL: https://github.com/apache/hive/pull/3909

### What changes were proposed in this pull request?
The moveAcidFiles() method in Hive.java is modified to handle the case where the source FS and the destination FS are different. In that case a copy() is needed instead of a rename().

### Why are the changes needed?
With HIVE-26815, the Hive staging directory can be changed to be the same as the scratchdir. In that case the staging files are in HDFS while the target location may be blob storage or some other FS.
In such scenarios, while creating and updating ACID tables, the final rename operation failed with a Wrong FS error because the source and destination FS are different.
This patch fixes that by replacing the rename() with a copy() when the source and destination are on different file systems or in different encryption zones.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Tested with the patch applied on Hive 3.1.2.

Issue Time Tracking
-------------------

Worklog Id: (was: 836857)
Time Spent: 40m (was: 0.5h)
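The rename-versus-copy decision described in the pull request can be sketched roughly as follows. This is only an illustration under stated assumptions: `MoveStrategy`, `sameFileSystem` and `choose` are hypothetical names, the staging path is made up, and the comparison looks only at URI scheme and authority. The patch itself calls Hive's `needToCopy(...)`, which, per the code comments in the review diff, also covers encryption zones; that part is not modeled here.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of Hive or of the patch.
final class MoveStrategy {
    enum Action { RENAME, COPY }

    private MoveStrategy() {}

    // Assumption: two locations are on the same file system when their
    // URI scheme and authority match (compared case-insensitively).
    public static boolean sameFileSystem(URI a, URI b) {
        return equalsIgnoreCase(a.getScheme(), b.getScheme())
            && equalsIgnoreCase(a.getAuthority(), b.getAuthority());
    }

    private static boolean equalsIgnoreCase(String x, String y) {
        return x == null ? y == null : x.equalsIgnoreCase(y);
    }

    // A rename only works within one file system; otherwise fall back to a copy.
    public static Action choose(URI src, URI dst) {
        return sameFileSystem(src, dst) ? Action.RENAME : Action.COPY;
    }

    public static void main(String[] args) {
        // Hypothetical staging location on HDFS (path invented for the example).
        URI staging = URI.create("hdfs://mycluster/tmp/staging/delta_001_001_");
        // Target location as it appears in the reported error.
        URI target = URI.create(
            "abfs:///hive/warehouse/managed/tpcds_orc.db/test_sales/delta_001_001_");
        System.out.println(choose(staging, target));
    }
}
```

With a staging directory on `hdfs://mycluster` and a warehouse on `abfs:///`, the schemes differ, so the sketch selects a copy. When both paths are on the same file system it keeps the cheaper rename.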
[jira] [Work logged] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
[ https://issues.apache.org/jira/browse/HIVE-26894?focusedWorklogId=836856&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-836856 ]

ASF GitHub Bot logged work on HIVE-26894:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 04/Jan/23 10:17
Start Date: 04/Jan/23 10:17
Worklog Time Spent: 10m
Work Description: warriersruthi closed pull request #3906: HIVE-26894: After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
URL: https://github.com/apache/hive/pull/3906

Issue Time Tracking
-------------------

Worklog Id: (was: 836856)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
[ https://issues.apache.org/jira/browse/HIVE-26894?focusedWorklogId=836639&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-836639 ]

ASF GitHub Bot logged work on HIVE-26894:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 03/Jan/23 15:29
Start Date: 03/Jan/23 15:29
Worklog Time Spent: 10m
Work Description: sonarcloud[bot] commented on PR #3906:
URL: https://github.com/apache/hive/pull/3906#issuecomment-1369902797

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
9 Code Smells (rating A)
No Coverage information
No Duplication information

Issue Time Tracking
-------------------

Worklog Id: (was: 836639)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception
[ https://issues.apache.org/jira/browse/HIVE-26894?focusedWorklogId=836619&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-836619 ]

ASF GitHub Bot logged work on HIVE-26894:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 03/Jan/23 14:03
Start Date: 03/Jan/23 14:03
Worklog Time Spent: 10m
Work Description: warriersruthi opened a new pull request, #3906:
URL: https://github.com/apache/hive/pull/3906

After using scratchdir for staging the final job, CTAS and IOW on ACID tables are failing with wrongFS exception

### What changes were proposed in this pull request?
The moveAcidFiles() method in Hive.java is modified to handle the case where the source FS and the destination FS are different. In that case a copy() is needed instead of a rename().

### Why are the changes needed?
With HIVE-26815, the Hive staging directory can be changed to be the same as the scratchdir. In that case the staging files are in HDFS while the target location may be blob storage or some other FS.
In such scenarios, while creating and updating ACID tables, the final rename operation failed with a Wrong FS error because the source and destination FS are different.
This patch fixes that by replacing the rename() with a copy() when the source and destination are on different file systems or in different encryption zones.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Tested with the patch applied on Hive 3.1.2.

Issue Time Tracking
-------------------

Worklog Id: (was: 836619)
Remaining Estimate: 0h
Time Spent: 10m