[jira] [Work logged] (HDFS-15545) (S)Webhdfs will not use updated delegation tokens available in the ugi after the old ones expire
[ https://issues.apache.org/jira/browse/HDFS-15545?focusedWorklogId=475651&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475651 ] ASF GitHub Bot logged work on HDFS-15545: - Author: ASF GitHub Bot Created on: 28/Aug/20 02:59 Start Date: 28/Aug/20 02:59 Worklog Time Spent: 10m Work Description: ibuenros opened a new pull request #2255: URL: https://github.com/apache/hadoop/pull/2255 …rom UGI if the previous one is expired. ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475651) Remaining Estimate: 0h Time Spent: 10m > (S)Webhdfs will not use updated delegation tokens available in the ugi after > the old ones expire > > > Key: HDFS-15545 > URL: https://issues.apache.org/jira/browse/HDFS-15545 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Issac Buenrostro > Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > WebHdfsFileSystem can select a delegation token to use from the current user > UGI. The token selection is sticky, and WebHdfsFileSystem will re-use it > every time without searching the UGI again. > If the previous token expires, WebHdfsFileSystem will catch the exception and > attempt to get a new token. However, the mechanism to get a new token > bypasses searching for one on the UGI, so even if there is external logic > that has retrieved a new token, it is not possible to make the FileSystem use > the new, valid token, rendering the FileSystem object unusable.
> A simple fix would allow WebHdfsFileSystem to re-search the UGI and, if it > finds a token different from the cached one, try to use it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
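The sticky-token behaviour and the proposed fix described above can be sketched with a small, self-contained model (plain Java; these class and method names are illustrative only and are not the real WebHdfsFileSystem or UserGroupInformation API): keep a cached token, and on an authentication failure re-scan the UGI's credentials, adopting a token only if it differs from the one that just failed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the sticky-token problem and the proposed re-search fix.
// The Map stands in for the UGI's credential store; none of these names come
// from the real Hadoop API.
class StickyTokenCache {
    private final Map<String, String> ugiTokens = new LinkedHashMap<>();
    private String cachedToken;  // sticky: reused on every request once selected

    void addUgiToken(String kind, String token) {
        ugiTokens.put(kind, token);
    }

    // Current behaviour: select once, then always reuse the cached token.
    String selectToken(String kind) {
        if (cachedToken == null) {
            cachedToken = ugiTokens.get(kind);
        }
        return cachedToken;
    }

    // Proposed fix: after an auth failure, re-search the UGI and only adopt a
    // token that differs from the one that just failed.
    boolean refreshAfterFailure(String kind) {
        String fresh = ugiTokens.get(kind);
        if (fresh != null && !fresh.equals(cachedToken)) {
            cachedToken = fresh;
            return true;   // a newer token was found; retry the request with it
        }
        return false;      // nothing newer in the UGI; fall back to fetching one
    }

    String current() {
        return cachedToken;
    }
}
```

In the real failure path, WebHdfsFileSystem skips the re-search and goes straight to fetching a brand-new token, which is why a token renewed externally into the UGI is never picked up.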
[jira] [Commented] (HDFS-14694) Call recoverLease on DFSOutputStream close exception
[ https://issues.apache.org/jira/browse/HDFS-14694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186223#comment-17186223 ] Lisheng Sun commented on HDFS-14694: hi [~ayushtkn] [~weichiu] [~elgoiri] [~hexiaoqiao] Could you help review this patch? Thank you. > Call recoverLease on DFSOutputStream close exception > > > Key: HDFS-14694 > URL: https://issues.apache.org/jira/browse/HDFS-14694 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client > Reporter: Chen Zhang > Assignee: Lisheng Sun > Priority: Major > Attachments: HDFS-14694.001.patch, HDFS-14694.002.patch, > HDFS-14694.003.patch, HDFS-14694.004.patch, HDFS-14694.005.patch, > HDFS-14694.006.patch > > > HDFS uses file leases to manage opened files; when a file is not closed > normally, the NN will recover the lease automatically after the hard limit is exceeded. But > for a long-running service (e.g. HBase), the hdfs-client never dies and the NN > doesn't get any chance to recover the file. > Usually the client program needs to handle exceptions by itself to avoid this > condition (e.g. HBase automatically calls recoverLease for files that were not > closed normally), but in our experience, most services (in our company) don't > handle this condition properly, which causes lots of files in abnormal > status or even data loss. > This Jira proposes to add a feature that calls the recoverLease operation > automatically when DFSOutputStream close encounters an exception. It should be > disabled by default, but when somebody builds a long-running service based on > HDFS, they can enable this option. > We've had this feature in our internal Hadoop distribution for more than 3 > years, and it's quite useful according to our experience.
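The behaviour proposed in HDFS-14694 can be illustrated with a minimal, self-contained sketch (plain Java; the interfaces and names here are hypothetical stand-ins, not the actual DFSOutputStream patch): when close() throws and the opt-in flag is enabled, the lease is recovered immediately instead of waiting for the NameNode's hard limit.

```java
import java.io.IOException;

// Minimal model of the proposal: if close() fails and the feature is enabled,
// ask the NameNode to recover the lease so the file is not stuck open.
// These interfaces are illustrative only; the real patch modifies DFSOutputStream.
class RecoveringCloser {
    interface Closer { void close() throws IOException; }
    interface LeaseRecoverer { void recoverLease() throws IOException; }

    private final boolean recoverOnCloseException;  // off by default, per the proposal

    RecoveringCloser(boolean recoverOnCloseException) {
        this.recoverOnCloseException = recoverOnCloseException;
    }

    /** Returns true if close failed and lease recovery was invoked. */
    boolean closeSafely(Closer stream, LeaseRecoverer fs) throws IOException {
        try {
            stream.close();
            return false;                 // closed normally, nothing to recover
        } catch (IOException closeFailure) {
            if (!recoverOnCloseException) {
                throw closeFailure;       // feature disabled: today's behaviour
            }
            fs.recoverLease();            // release the lease now, instead of
            return true;                  // waiting out the NN hard limit
        }
    }
}
```

A long-running service would enable the flag once at startup; everything else keeps calling close() as before.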
[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186195#comment-17186195 ] liusheng commented on HDFS-15098: - Hi [~vinayakumarb], Thanks a lot for your review, I have updated the patch according to your comments. Please take a look again when you have time. > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature > Affects Versions: 3.4.0 > Reporter: liusheng > Assignee: liusheng > Priority: Major > Labels: sm4 > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, > HDFS-15098.009.patch, HDFS-15098.009.patch, image-2020-08-19-16-54-41-341.png > > Time Spent: 10m > Remaining Estimate: 0h > > SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. Please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use SM4 on HDFS as follows:* > 1. Configure Hadoop KMS > 2. Test HDFS SM4: > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *Requires:* > 1. openssl version >= 1.1.1
[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liusheng updated HDFS-15098: Attachment: HDFS-15098.009.patch Status: Patch Available (was: Open) > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature > Affects Versions: 3.4.0 > Reporter: liusheng > Assignee: liusheng > Priority: Major > Labels: sm4 > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, > HDFS-15098.009.patch, HDFS-15098.009.patch, image-2020-08-19-16-54-41-341.png > > Time Spent: 10m > Remaining Estimate: 0h > > SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. Please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use SM4 on HDFS as follows:* > 1. Configure Hadoop KMS > 2. Test HDFS SM4: > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *Requires:* > 1. openssl version >= 1.1.1
[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liusheng updated HDFS-15098: Status: Open (was: Patch Available) > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature > Affects Versions: 3.4.0 > Reporter: liusheng > Assignee: liusheng > Priority: Major > Labels: sm4 > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, > HDFS-15098.009.patch, image-2020-08-19-16-54-41-341.png > > Time Spent: 10m > Remaining Estimate: 0h > > SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. Please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use SM4 on HDFS as follows:* > 1. Configure Hadoop KMS > 2. Test HDFS SM4: > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *Requires:* > 1. openssl version >= 1.1.1
[jira] [Updated] (HDFS-15545) (S)Webhdfs will not use updated delegation tokens available in the ugi after the old ones expire
[ https://issues.apache.org/jira/browse/HDFS-15545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Issac Buenrostro updated HDFS-15545: Description: WebHdfsFileSystem can select a delegation token to use from the current user UGI. The token selection is sticky, and WebHdfsFileSystem will re-use it every time without searching the UGI again. If the previous token expires, WebHdfsFileSystem will catch the exception and attempt to get a new token. However, the mechanism to get a new token bypasses searching for one on the UGI, so even if there is external logic that has retrieved a new token, it is not possible to make the FileSystem use the new, valid token, rendering the FileSystem object unusable. A simple fix would allow WebHdfsFileSystem to re-search the UGI, and if it finds a different token than the cached one try to use it. was: WebHdfsFileSystem can select a delegation token to use from the current user UGI. The token selection is sticky, and WebHdfsFileSystem will re-use it every time without searching the UGI again. If the previous token expires, WebHdfsFileSystem will catch the exception and attempt to get a new token. However, the mechanism to get a new token bypasses searching for one on the UGI, so even if there is external logic that has retrieved a new token, it is not possible to make the FileSystem use the new, valid token, rendering the FileSystem object unusable. > (S)Webhdfs will not use updated delegation tokens available in the ugi after > the old ones expire > > > Key: HDFS-15545 > URL: https://issues.apache.org/jira/browse/HDFS-15545 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Issac Buenrostro >Priority: Major > > WebHdfsFileSystem can select a delegation token to use from the current user > UGI. The token selection is sticky, and WebHdfsFileSystem will re-use it > every time without searching the UGI again. 
> If the previous token expires, WebHdfsFileSystem will catch the exception and > attempt to get a new token. However, the mechanism to get a new token > bypasses searching for one on the UGI, so even if there is external logic > that has retrieved a new token, it is not possible to make the FileSystem use > the new, valid token, rendering the FileSystem object unusable. > A simple fix would allow WebHdfsFileSystem to re-search the UGI and, if it > finds a token different from the cached one, try to use it.
[jira] [Created] (HDFS-15545) (S)Webhdfs will not use updated delegation tokens available in the ugi after the old ones expire
Issac Buenrostro created HDFS-15545: --- Summary: (S)Webhdfs will not use updated delegation tokens available in the ugi after the old ones expire Key: HDFS-15545 URL: https://issues.apache.org/jira/browse/HDFS-15545 Project: Hadoop HDFS Issue Type: Bug Reporter: Issac Buenrostro WebHdfsFileSystem can select a delegation token to use from the current user UGI. The token selection is sticky, and WebHdfsFileSystem will re-use it every time without searching the UGI again. If the previous token expires, WebHdfsFileSystem will catch the exception and attempt to get a new token. However, the mechanism to get a new token bypasses searching for one on the UGI, so even if there is external logic that has retrieved a new token, it is not possible to make the FileSystem use the new, valid token, rendering the FileSystem object unusable.
[jira] [Commented] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186106#comment-17186106 ] Hadoop QA commented on HDFS-15098:

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 0s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | markdownlint | 0m 1s | markdownlint was not available. |
| 0 | buf | 0m 1s | buf was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 25m 31s | trunk passed |
| +1 | compile | 26m 54s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | compile | 19m 47s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 | checkstyle | 2m 53s | trunk passed |
| +1 | mvnsite | 3m 54s | trunk passed |
| +1 | shadedclient | 22m 38s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 2m 18s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 | javadoc | 3m 55s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| 0 | spotbugs | 3m 21s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 8m 25s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 23s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 5s | the patch passed |
| +1 | compile | 23m 31s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 | cc | 23m 31s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 22 new + 141 unchanged - 22 fixed = 163 total (was 163) |
| +1 | golang | 23m 31s | the patch passed |
| -1 | javac | 23m 31s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 1 new + 2053 unchanged - 5 fixed = 2054 total (was 2058) |
| +1 | compile | 21m 0s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 | cc | 21m 0s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 5 new + 158 unchanged - 5 fixed = 163 total (was 163) |
| +1 | golang | 21m 0s | the patch passed |
| +1 | javac |
[jira] [Work logged] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?focusedWorklogId=475535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475535 ] ASF GitHub Bot logged work on HDFS-15098: - Author: ASF GitHub Bot Created on: 27/Aug/20 20:34 Start Date: 27/Aug/20 20:34 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2211: URL: https://github.com/apache/hadoop/pull/2211#issuecomment-682177101

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----:|:----|:----|
| +0 :ok: | reexec | 1m 54s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 1s | markdownlint was not available. |
| +0 :ok: | buf | 0m 1s | buf was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 18s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 29m 4s | trunk passed |
| +1 :green_heart: | compile | 24m 1s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 21m 56s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 53s | trunk passed |
| +1 :green_heart: | mvnsite | 4m 14s | trunk passed |
| +1 :green_heart: | shadedclient | 24m 3s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 39s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 41s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 8m 15s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 58s | the patch passed |
| +1 :green_heart: | compile | 21m 58s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 21m 58s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 26 new + 137 unchanged - 26 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 21m 58s | the patch passed |
| +1 :green_heart: | javac | 21m 58s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 0 new + 2049 unchanged - 4 fixed = 2049 total (was 2053) |
| +1 :green_heart: | compile | 18m 54s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 18m 54s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 29 new + 134 unchanged - 29 fixed = 163 total (was 163) |
| +1 :green_heart: | golang | 18m 54s | the patch passed |
| +1 :green_heart: | javac | 18m 54s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 0 new + 1944 unchanged - 4 fixed = 1944 total (was 1948) |
| -0 :warning: | checkstyle | 2m 49s | root: The patch generated 3 new + 212 unchanged - 8 fixed = 215 total (was 220) |
| +1 :green_heart: | mvnsite | 3m 51s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 15m 38s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 9s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 38s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | findbugs | 2m 34s | hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 9m 53s | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 10s | hadoop-hdfs-client in the patch passed. |
| -1 :x: | unit | 131m 7s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 26s
[jira] [Commented] (HDFS-15131) FoldedTreeSet appears to degrade over time
[ https://issues.apache.org/jira/browse/HDFS-15131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186050#comment-17186050 ] Stephen O'Donnell commented on HDFS-15131: -- [~maniaabdi] We caught this bug when operations in the namenode or datanode became slow on a running cluster, and jstacks pointed to access to this data structure as the slow point. Unfortunately we don't know how to reproduce this problem. My test program did not succeed in reproducing it. I don't have that program any more, either, as I just created it as a throwaway experiment. All it did was add and remove values from the FoldedTreeSet structure, so it was not very complex. > FoldedTreeSet appears to degrade over time > -- > > Key: HDFS-15131 > URL: https://issues.apache.org/jira/browse/HDFS-15131 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, namenode > Affects Versions: 3.3.0 > Reporter: Stephen O'Donnell > Assignee: Stephen O'Donnell > Priority: Major > > We have seen some occurrences of the Namenode getting very slow on delete > operations, to the point where IBRs get blocked frequently and files fail to > close. On one cluster in particular, after about 4 weeks of uptime, the > Namenode started responding very poorly. Restarting it corrected the problem > for another 4 weeks. > In that example, jstacks in the namenode always pointed to slow operations > around an HDFS delete call which was performing an operation on the > FoldedTreeSet structure.
The captured jstacks always pointed at an operation > on the folded tree set each time they were sampled: > {code} > "IPC Server handler 573 on 8020" #663 daemon prio=5 os_prio=0 > tid=0x7fe6a4087800 nid=0x97a6 runnable [0x7fe67bdfd000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:879) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:263) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3676) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3507) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4158) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4132) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4069) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4053) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:845) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:308) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:603) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at 
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > {code} > The observation in this case, was that the namenode worked fine after a > restart and then at some point after about 4 weeks of uptime, this problem > started happening, and it would persist until the namenode was restarted. > Then the problem did not return for about another 4 weeks. > On a completely different cluster and version, I recently came across a > problem where files were again failing to close (last block does not have > sufficient number of replicas) and the datanodes were logging a lot of > messages like the following: > {code} > 2019-11-27 09:00:49,678 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Took 21540ms to process 1 commands from NN > {code} > These messages had a range of durations and were fairly frequent. Focusing on > the longer messages at around 20 seconds and checking a few different
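The throwaway experiment described in the comment above (repeatedly adding and removing values while watching for slowdown) can be sketched roughly as follows. FoldedTreeSet is internal to HDFS, so this self-contained sketch uses java.util.TreeSet as a stand-in and only shows the shape of such a degradation probe; it is not a reproduction of the reported bug, and no degradation is expected with TreeSet.

```java
import java.util.Random;
import java.util.TreeSet;

// Shape of a degradation probe: churn the set with adds/removes, then compare
// how long a batch of operations takes early on versus after sustained churn.
// TreeSet is a stand-in here for HDFS's internal FoldedTreeSet.
class SetChurnProbe {
    static long churn(TreeSet<Long> set, int ops, Random rnd) {
        long start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            long v = rnd.nextInt(100_000);
            if (!set.add(v)) {
                set.remove(v);   // roughly balance inserts and deletes over time
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        TreeSet<Long> set = new TreeSet<>();
        Random rnd = new Random(42);
        long first = churn(set, 100_000, rnd);
        for (int round = 0; round < 20; round++) {
            churn(set, 100_000, rnd);    // sustained churn between samples
        }
        long late = churn(set, 100_000, rnd);
        System.out.println("first batch ns=" + first + ", late batch ns=" + late);
    }
}
```

For the real investigation, the probe would target FoldedTreeSet itself and run long enough (the report suggests weeks of churn in production) for the degradation to appear, which the original throwaway program never managed.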
[jira] [Work logged] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?focusedWorklogId=475473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475473 ] ASF GitHub Bot logged work on HDFS-15471: - Author: ASF GitHub Bot Created on: 27/Aug/20 18:42 Start Date: 27/Aug/20 18:42 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2252: URL: https://github.com/apache/hadoop/pull/2252#issuecomment-682123897

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----:|:----|:----|
| +0 :ok: | reexec | 0m 33s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 20s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 25m 36s | trunk passed |
| +1 :green_heart: | compile | 20m 55s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 17m 33s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 39s | trunk passed |
| +1 :green_heart: | mvnsite | 2m 51s | trunk passed |
| +1 :green_heart: | shadedclient | 22m 41s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 47s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 36s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 4m 4s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 6m 52s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 35s | the patch passed |
| +1 :green_heart: | compile | 24m 47s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 24m 47s | the patch passed |
| +1 :green_heart: | compile | 22m 41s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 22m 41s | the patch passed |
| +1 :green_heart: | checkstyle | 3m 25s | the patch passed |
| +1 :green_heart: | mvnsite | 3m 28s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 18m 11s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 36s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 7m 32s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 11m 18s | hadoop-common in the patch passed. |
| -1 :x: | unit | 27m 14s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 54s | The patch does not generate ASF License warnings. |
| | | 236m 2s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.ha.TestZKFailoverController |
| | hadoop.hdfs.TestDatanodeLayoutUpgrade |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.TestReplaceDatanodeOnFailure |
| | hadoop.hdfs.TestFileCreation |
| | hadoop.hdfs.TestFileConcurrentReader |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2252/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2252 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux f5c61fc54d74 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / d1c60a53f60 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
[jira] [Resolved] (HDFS-15531) Namenode UI: List snapshots in separate table for each snapshottable directory
[ https://issues.apache.org/jira/browse/HDFS-15531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vivek Ratnavel Subramanian resolved HDFS-15531. --- Resolution: Fixed > Namenode UI: List snapshots in separate table for each snapshottable directory > -- > > Key: HDFS-15531 > URL: https://issues.apache.org/jira/browse/HDFS-15531 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ui >Reporter: Vivek Ratnavel Subramanian >Assignee: Vivek Ratnavel Subramanian >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15531) Namenode UI: List snapshots in separate table for each snapshottable directory
[ https://issues.apache.org/jira/browse/HDFS-15531?focusedWorklogId=475471=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475471 ] ASF GitHub Bot logged work on HDFS-15531: - Author: ASF GitHub Bot Created on: 27/Aug/20 18:36 Start Date: 27/Aug/20 18:36 Worklog Time Spent: 10m Work Description: vivekratnavel merged pull request #2230: URL: https://github.com/apache/hadoop/pull/2230 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475471) Remaining Estimate: 0h Time Spent: 10m > Namenode UI: List snapshots in separate table for each snapshottable directory > -- > > Key: HDFS-15531 > URL: https://issues.apache.org/jira/browse/HDFS-15531 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ui >Reporter: Vivek Ratnavel Subramanian >Assignee: Vivek Ratnavel Subramanian >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15543) RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount points with fault Tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186029#comment-17186029 ] Hemanth Boyina commented on HDFS-15543: --- RouterRpcServer#invokeAtAvailableNs invokes on the default namespace if one is set, but if the default namespace is down this causes problems for mount points with fault tolerance enabled, so I think we need to retry another namespace in invokeAtAvailableNs in case of namespace unavailability.
> RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount points with fault Tolerance enabled.
> Key: HDFS-15543
> URL: https://issues.apache.org/jira/browse/HDFS-15543
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rbf
> Affects Versions: 3.1.1
> Reporter: Harshakiran Reddy
> Assignee: Hemanth Boyina
> Priority: Major
> Attachments: HDFS-15543_testrepro.patch
>
> A RANDOM mount point should allow creating new files even if one subcluster is down when fault tolerance is enabled, but here it fails.
> MultiDestination_client]# hdfs dfsrouteradmin -ls /test_ec
> *Mount Table Entries:*
> Source Destinations Owner Group Mode Quota/Usage
> /test_ec *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x [NsQuota: -/-, SsQuota: -/-]
> *File write threw the exception:*
> 2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551
> 2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[DISK]
> 2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception
> java.io.IOException: Unable to create new block.
> at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718)
> 2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
> put: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
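The retry behaviour proposed in the comment above — try the default namespace first, and on failure fall back to the remaining namespaces instead of giving up — can be sketched as follows. This is a hypothetical Python sketch; the real logic lives in Java in RouterRpcServer#invokeAtAvailableNs, and the names here (invoke_at_available_ns, NamespaceUnavailableError) are illustrative, not Hadoop API.

```python
class NamespaceUnavailableError(Exception):
    """Stands in for a subcluster being unreachable."""


def invoke_at_available_ns(namespaces, default_ns, invoke):
    """Try the default namespace first, then every other namespace in
    order, returning the first successful result. Only raises if every
    namespace is unavailable."""
    ordered = [default_ns] + [ns for ns in namespaces if ns != default_ns]
    last_error = None
    for ns in ordered:
        try:
            return invoke(ns)
        except NamespaceUnavailableError as e:
            last_error = e  # remember the failure and try the next namespace
    raise last_error
```

With this shape, a write against a RANDOM mount point succeeds as long as at least one destination subcluster is up, which is the behaviour the issue asks for.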
[jira] [Updated] (HDFS-15543) RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount points with fault Tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Boyina updated HDFS-15543: -- Attachment: HDFS-15543_testrepro.patch
> RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount points with fault Tolerance enabled.
> Key: HDFS-15543
> URL: https://issues.apache.org/jira/browse/HDFS-15543
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rbf
> Affects Versions: 3.1.1
> Reporter: Harshakiran Reddy
> Assignee: Hemanth Boyina
> Priority: Major
> Attachments: HDFS-15543_testrepro.patch
>
> A RANDOM mount point should allow creating new files even if one subcluster is down when fault tolerance is enabled, but here it fails.
> MultiDestination_client]# hdfs dfsrouteradmin -ls /test_ec
> *Mount Table Entries:*
> Source Destinations Owner Group Mode Quota/Usage
> /test_ec *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x [NsQuota: -/-, SsQuota: -/-]
> *File write threw the exception:*
> 2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551
> 2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[DISK]
> 2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception
> java.io.IOException: Unable to create new block.
> at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718)
> 2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
> put: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
[jira] [Commented] (HDFS-15131) FoldedTreeSet appears to degrade over time
[ https://issues.apache.org/jira/browse/HDFS-15131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186025#comment-17186025 ] Mania Abdi commented on HDFS-15131: --- [~sodonnell] I am researching debugging of distributed systems, and I am building a tool that lets developers find frequent processing patterns within the workflow graphs of applications. This bug seems like a good case study for my proposed research. Would it be possible to let me know how you caught this bug, how I can reproduce it, and which test led to it? Is it possible to get access to the test program you mentioned two comments above?
> FoldedTreeSet appears to degrade over time
> Key: HDFS-15131
> URL: https://issues.apache.org/jira/browse/HDFS-15131
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, namenode
> Affects Versions: 3.3.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
>
> We have seen some occurrences of the Namenode getting very slow on delete operations, to the point where IBRs get blocked frequently and files fail to close. On one cluster in particular, after about 4 weeks of uptime, the Namenode started responding very poorly. Restarting it corrected the problem for another 4 weeks.
> In that example, jstacks in the namenode always pointed to slow operations around an HDFS delete call which was performing an operation on the FoldedTreeSet structure.
The captured jstacks always pointed at an operation > on the folded tree set each time they were sampled: > {code} > "IPC Server handler 573 on 8020" #663 daemon prio=5 os_prio=0 > tid=0x7fe6a4087800 nid=0x97a6 runnable [0x7fe67bdfd000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:879) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:263) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3676) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3507) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4158) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4132) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4069) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4053) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:845) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:308) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:603) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at 
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > {code} > The observation in this case, was that the namenode worked fine after a > restart and then at some point after about 4 weeks of uptime, this problem > started happening, and it would persist until the namenode was restarted. > Then the problem did not return for about another 4 weeks. > On a completely different cluster and version, I recently came across a > problem where files were again failing to close (last block does not have > sufficient number of replicas) and the datanodes were logging a lot of > messages like the following: > {code} > 2019-11-27 09:00:49,678 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Took 21540ms to process 1 commands from NN > {code} > These messages had a range of durations and were fairly frequent. Focusing on > the longer messages at around 20 seconds and checking a few
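The diagnosis above came from repeatedly sampling jstacks and noticing the same frame at the top each time. A small sketch of that aggregation step — count how often each stack frame appears across a set of jstack samples — looks like this (illustrative Python, not part of Hadoop; the frame-extraction heuristic simply matches the `at package.Class.method(File.java:NNN)` lines shown in the stack trace above):

```python
from collections import Counter


def hottest_frames(samples, top_n=3):
    """Given the text of several jstack samples, count how often each
    'at <frame>' line appears and return the most common frames."""
    counts = Counter()
    for sample in samples:
        for line in sample.splitlines():
            line = line.strip()
            if line.startswith("at "):
                # keep the fully-qualified method, drop "(File.java:123)"
                counts[line[3:].split("(")[0]] += 1
    return counts.most_common(top_n)
```

If one frame (here, FoldedTreeSet.removeAndGet) dominates every sample, that is strong evidence the handlers are spending most of their time inside it.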
[jira] [Comment Edited] (HDFS-14111) hdfsOpenFile on HDFS causes unnecessary IO from file offset 0
[ https://issues.apache.org/jira/browse/HDFS-14111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186020#comment-17186020 ] Mania Abdi edited comment on HDFS-14111 at 8/27/20, 5:57 PM: - Hi [~tlipcon] and [~stakiar], I am researching debugging of distributed systems, and I am building a tool that lets developers find frequent processing patterns within the workflow graphs of applications. This bug seems like a good case study for my proposed research. Would it be possible to let me know how you caught this bug, how I can reproduce it, and which test led to it? Regards Mania
was (Author: maniaabdi): Hi [~tlipcon] and [~stakiar] I am researching on debugging distributed system, I am building a tool that allows developers to find the frequent processing patterns within the workflow graphs of applications. This bug seems like a case study for my proposed research. Would it be possible to let me know what was the process of catching this bugs for you? and how can I reproduce this bug, what was the test that leads to this bug? Regards Mania
> hdfsOpenFile on HDFS causes unnecessary IO from file offset 0
> Key: HDFS-14111
> URL: https://issues.apache.org/jira/browse/HDFS-14111
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client, libhdfs
> Affects Versions: 3.2.0
> Reporter: Todd Lipcon
> Assignee: Sahil Takiar
> Priority: Major
> Fix For: 3.3.0
> Attachments: HDFS-14111.001.patch, HDFS-14111.002.patch, HDFS-14111.003.patch
>
> hdfsOpenFile() calls readDirect() with a 0-length argument in order to check whether the underlying stream supports bytebuffer reads. With DFSInputStream, the read(0) isn't short circuited, and results in the DFSClient opening a block reader.
In the case of a remote block, the block reader will actually > issue a read of the whole block, causing the datanode to perform unnecessary > IO and network transfers in order to fill up the client's TCP buffers. This > causes performance degradation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
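The core of the problem in this issue is a zero-length capability probe that is not short-circuited, so it falls through to expensive block-reader setup. A minimal sketch of the short-circuit idea (hypothetical Python; the actual fix is Java code inside DFSInputStream, and the names here are illustrative):

```python
class LazyBlockReader:
    """Sketch: open the (expensive) block reader only when a read
    actually needs bytes; a zero-length probe returns immediately."""

    def __init__(self, open_block_reader):
        self._open = open_block_reader  # callable returning a read(length) function
        self._reader = None

    def read(self, length):
        if length == 0:
            return b""                  # probe satisfied with no I/O at all
        if self._reader is None:
            self._reader = self._open()  # first real read pays the setup cost
        return self._reader(length)
```

With this shape, a `read(0)` issued just to test bytebuffer-read support never touches the datanode, so the whole-block prefetch described above cannot be triggered by the probe.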
[jira] [Comment Edited] (HDFS-14111) hdfsOpenFile on HDFS causes unnecessary IO from file offset 0
[ https://issues.apache.org/jira/browse/HDFS-14111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186020#comment-17186020 ] Mania Abdi edited comment on HDFS-14111 at 8/27/20, 5:56 PM: - Hi [~tlipcon] and [~stakiar] I am researching on debugging distributed system, I am building a tool that allows developers to find the frequent processing patterns within the workflow graphs of applications. This bug seems like a case study for my proposed research. Would it be possible to let me know what was the process of catching this bugs for you? and how can I reproduce this bug, what was the test that leads to this bug? Regards Mania was (Author: maniaabdi): Hi [~tlipcon] and [~stakiar] I am researching on debugging distributed system, I am building a tool that allows developers to find the frequent processing patterns within the workflow graphs of applications. This bug seems like a case study for my proposed research. Would it be possible to let me know what was the process of catching this bugs for you? and how can I reproduce this bug? Regards Mania > hdfsOpenFile on HDFS causes unnecessary IO from file offset 0 > - > > Key: HDFS-14111 > URL: https://issues.apache.org/jira/browse/HDFS-14111 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, libhdfs >Affects Versions: 3.2.0 >Reporter: Todd Lipcon >Assignee: Sahil Takiar >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14111.001.patch, HDFS-14111.002.patch, > HDFS-14111.003.patch > > > hdfsOpenFile() calls readDirect() with a 0-length argument in order to check > whether the underlying stream supports bytebuffer reads. With DFSInputStream, > the read(0) isn't short circuited, and results in the DFSClient opening a > block reader. In the case of a remote block, the block reader will actually > issue a read of the whole block, causing the datanode to perform unnecessary > IO and network transfers in order to fill up the client's TCP buffers. 
This > causes performance degradation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14111) hdfsOpenFile on HDFS causes unnecessary IO from file offset 0
[ https://issues.apache.org/jira/browse/HDFS-14111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186020#comment-17186020 ] Mania Abdi commented on HDFS-14111: --- Hi [~tlipcon] and [~stakiar], I am researching debugging of distributed systems, and I am building a tool that lets developers find frequent processing patterns within the workflow graphs of applications. This bug seems like a good case study for my proposed research. Would it be possible to let me know how you caught this bug, and how I can reproduce it? Regards Mania
> hdfsOpenFile on HDFS causes unnecessary IO from file offset 0
> Key: HDFS-14111
> URL: https://issues.apache.org/jira/browse/HDFS-14111
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client, libhdfs
> Affects Versions: 3.2.0
> Reporter: Todd Lipcon
> Assignee: Sahil Takiar
> Priority: Major
> Fix For: 3.3.0
> Attachments: HDFS-14111.001.patch, HDFS-14111.002.patch, HDFS-14111.003.patch
>
> hdfsOpenFile() calls readDirect() with a 0-length argument in order to check whether the underlying stream supports bytebuffer reads. With DFSInputStream, the read(0) isn't short circuited, and results in the DFSClient opening a block reader. In the case of a remote block, the block reader will actually issue a read of the whole block, causing the datanode to perform unnecessary IO and network transfers in order to fill up the client's TCP buffers. This causes performance degradation.
[jira] [Work logged] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?focusedWorklogId=475426=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475426 ] ASF GitHub Bot logged work on HDFS-15471: - Author: ASF GitHub Bot Created on: 27/Aug/20 17:11 Start Date: 27/Aug/20 17:11 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #2252: URL: https://github.com/apache/hadoop/pull/2252#issuecomment-682078871 > Ignore me, I'm just seeing why PRs are not linking to github in this case... no idea -just linked it by hand. Some discussion on common-dev about JIRA permissions This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475426) Time Spent: 0.5h (was: 20m) > TestHDFSContractMultipartUploader fails on trunk > > > Key: HDFS-15471 > URL: https://issues.apache.org/jira/browse/HDFS-15471 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.3.1, 3.4.0 >Reporter: Ahmed Hussein >Assignee: Steve Loughran >Priority: Major > Labels: test > Time Spent: 0.5h > Remaining Estimate: 0h > > {{TestHDFSContractMultipartUploader}} fails on trunk with > {{IllegalArgumentException}} > {code:bash} > [ERROR] > testConcurrentUploads(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader) > Time elapsed: 0.127 s <<< ERROR! 
> java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > at > org.apache.hadoop.test.LambdaTestUtils$ProportionalRetryInterval.(LambdaTestUtils.java:907) > at > org.apache.hadoop.fs.contract.AbstractContractMultipartUploaderTest.testConcurrentUploads(AbstractContractMultipartUploaderTest.java:815) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
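The failure above is Guava's Preconditions.checkArgument rejecting the arguments passed to ProportionalRetryInterval's constructor in LambdaTestUtils. The fail-fast pattern it uses looks roughly like this Python stand-in (illustrative only; the exact invariant the real class checks may differ from the one assumed here):

```python
class ProportionalRetryInterval:
    """Illustrative stand-in for a retry-interval helper: validate
    constructor arguments up front (as Preconditions.checkArgument does)
    so bad test parameters fail immediately with a clear error instead
    of misbehaving later in the retry loop."""

    def __init__(self, interval_ms, max_interval_ms):
        # assumed invariant: 0 < interval <= max_interval
        if not (0 < interval_ms <= max_interval_ms):
            raise ValueError(
                "invalid retry interval: interval=%s max=%s"
                % (interval_ms, max_interval_ms))
        self.interval_ms = interval_ms
        self.max_interval_ms = max_interval_ms
```

The stack trace shows exactly this pattern firing: the test passed arguments that violate the constructor's precondition, so the constructor throws before any retry logic runs.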
[jira] [Commented] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185991#comment-17185991 ] Steve Loughran commented on HDFS-15471: --- PR up at https://github.com/apache/hadoop/pull/2252 > TestHDFSContractMultipartUploader fails on trunk > > > Key: HDFS-15471 > URL: https://issues.apache.org/jira/browse/HDFS-15471 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.3.1, 3.4.0 >Reporter: Ahmed Hussein >Assignee: Steve Loughran >Priority: Major > Labels: test > Time Spent: 20m > Remaining Estimate: 0h > > {{TestHDFSContractMultipartUploader}} fails on trunk with > {{IllegalArgumentException}} > {code:bash} > [ERROR] > testConcurrentUploads(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader) > Time elapsed: 0.127 s <<< ERROR! > java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > at > org.apache.hadoop.test.LambdaTestUtils$ProportionalRetryInterval.(LambdaTestUtils.java:907) > at > org.apache.hadoop.fs.contract.AbstractContractMultipartUploaderTest.testConcurrentUploads(AbstractContractMultipartUploaderTest.java:815) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15544) Standby namenode EditLogTailerThread shouldn't acquire a lock interruptibly when tailing edits
[ https://issues.apache.org/jira/browse/HDFS-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185922#comment-17185922 ] snodawn edited comment on HDFS-15544 at 8/27/20, 3:31 PM: -- Hey, [~jianghuazhu], thanks for your advice. I will add some unit tests later.
was (Author: snodawn): Hey, jianghua zhu, thanks for your advisement. I will add some unit tests later.
> Standby namenode EditLogTailerThread shouldn't acquire a lock interruptibly when tailing edits
> Key: HDFS-15544
> URL: https://issues.apache.org/jira/browse/HDFS-15544
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.0
> Reporter: snodawn
> Priority: Major
> Attachments: HDFS-15544.001.patch
>
> In my practice, the active namenode sometimes holds the write lock for a long time in rollEditLog:
> {code:java}
> Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292)
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code}
> This happens because the standby namenode may not call triggerActiveLogRoll() as often as dfs.ha.log-roll.period (60s) specifies after its last checkpoint, which may leave a large (120m~200m) editlog for the active namenode to roll.
> When tailing edits, the standby namenode's EditLogTailerThread acquires the same lock as the checkpoint thread, but the checkpoint thread may spend a lot of time saving the fsimage file (in my practice, 5 minutes), so triggerActiveLogRoll() in EditLogTailerThread will not be called as scheduled by dfs.ha.log-roll.period.
> I propose that EditLogTailerThread shouldn't acquire the lock with cpLockInterruptibly(); tryLock() is enough.
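The proposal above — replace the blocking interruptible acquire with a tryLock so the tailer skips a cycle instead of stalling behind a long checkpoint — can be sketched with a plain non-blocking acquire. This is a Python sketch of the idea, not the actual Java EditLogTailer code; the names are illustrative:

```python
import threading

cp_lock = threading.Lock()  # stands in for the namenode's checkpoint lock


def try_trigger_log_roll(do_roll):
    """tryLock-style non-blocking acquire: if the checkpointer holds the
    lock (e.g. while saving a large fsimage), skip this cycle and let
    the next dfs.ha.log-roll.period attempt try again, rather than
    blocking the tailer thread for the whole checkpoint."""
    if cp_lock.acquire(blocking=False):
        try:
            return do_roll()
        finally:
            cp_lock.release()
    return None  # lock busy; do not block the tailer thread
```

The trade-off is that a roll attempt is dropped while a checkpoint is in progress, but the tailer keeps its period instead of accumulating delay, which is exactly what the issue argues for.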
[jira] [Commented] (HDFS-15544) Standby namenode EditLogTailerThread shouldn't acquire a lock interruptibly when tailing edits
[ https://issues.apache.org/jira/browse/HDFS-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185922#comment-17185922 ] snodawn commented on HDFS-15544: Hey, jianghua zhu, thanks for your advisement. I will add some unit tests later.
> Standby namenode EditLogTailerThread shouldn't acquire a lock interruptibly when tailing edits
> Key: HDFS-15544
> URL: https://issues.apache.org/jira/browse/HDFS-15544
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.3.0
> Reporter: snodawn
> Priority: Major
> Attachments: HDFS-15544.001.patch
>
> In my practice, the active namenode sometimes holds the write lock for a long time in rollEditLog:
> {code:java}
> Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292)
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code}
> This happens because the standby namenode may not call triggerActiveLogRoll() as often as dfs.ha.log-roll.period (60s) specifies after its last checkpoint, which may leave a large (120m~200m) editlog for the active namenode to roll.
> When tailing edits, the standby namenode's EditLogTailerThread acquires the same lock as the checkpoint thread, but the checkpoint thread may spend a lot of time saving the fsimage file (in my practice, 5 minutes), so triggerActiveLogRoll() in EditLogTailerThread will not be called as scheduled by dfs.ha.log-roll.period.
> I propose that EditLogTailerThread shouldn't acquire the lock with cpLockInterruptibly(); tryLock() is enough.
[jira] [Work logged] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?focusedWorklogId=475343=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475343 ] ASF GitHub Bot logged work on HDFS-15471: - Author: ASF GitHub Bot Created on: 27/Aug/20 15:09 Start Date: 27/Aug/20 15:09 Worklog Time Spent: 10m Work Description: Humbedooh commented on pull request #2252: URL: https://github.com/apache/hadoop/pull/2252#issuecomment-682010131 Ignore me, I'm just seeing why PRs are not linking to github in this case... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475343) Time Spent: 20m (was: 10m) > TestHDFSContractMultipartUploader fails on trunk > > > Key: HDFS-15471 > URL: https://issues.apache.org/jira/browse/HDFS-15471 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.3.1, 3.4.0 >Reporter: Ahmed Hussein >Assignee: Steve Loughran >Priority: Major > Labels: test > Time Spent: 20m > Remaining Estimate: 0h > > {{TestHDFSContractMultipartUploader}} fails on trunk with > {{IllegalArgumentException}} > {code:bash} > [ERROR] > testConcurrentUploads(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader) > Time elapsed: 0.127 s <<< ERROR! 
> java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > at > org.apache.hadoop.test.LambdaTestUtils$ProportionalRetryInterval.(LambdaTestUtils.java:907) > at > org.apache.hadoop.fs.contract.AbstractContractMultipartUploaderTest.testConcurrentUploads(AbstractContractMultipartUploaderTest.java:815) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liusheng updated HDFS-15098: Attachment: HDFS-15098.009.patch Status: Patch Available (was: Open) > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 3.4.0 >Reporter: liusheng >Assignee: liusheng >Priority: Major > Labels: sm4 > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, > HDFS-15098.009.patch, image-2020-08-19-16-54-41-341.png > > > SM4 (formerly SMS4) is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. Please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use SM4 on HDFS as follows:* > 1. Configure Hadoop KMS > 2. Test HDFS SM4 > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *Requires:* > 1. openssl version >=1.1.1
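The usage steps above create a key with cipher suite 'SM4/CTR/NoPadding' and an encryption zone backed by it. The stock JDK ships no SM4 provider (the patch relies on OpenSSL >= 1.1.1), so the sketch below uses the JDK's built-in AES/CTR/NoPadding purely to illustrate the counter-mode property such a zone depends on: CTR is a stream mode, so encryption and decryption are the same keystream XOR and the ciphertext length equals the plaintext length. All names here are illustrative, not HDFS crypto internals.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: the JDK has no SM4 provider, so AES/CTR/NoPadding
// stands in for 'SM4/CTR/NoPadding'. Both are 128-bit block ciphers, and
// in counter mode both reduce to XOR with a keystream.
public class CtrModeSketch {

    static byte[] ctr(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // 128-bit key, the size SM4 also uses
        byte[] iv = new byte[16];  // stands in for the per-file IV/counter
        byte[] plain = "encryption zone payload".getBytes(StandardCharsets.UTF_8);

        byte[] enc = ctr(Cipher.ENCRYPT_MODE, key, iv, plain);
        byte[] dec = ctr(Cipher.DECRYPT_MODE, key, iv, enc);

        // CTR adds no padding: ciphertext length equals plaintext length,
        // and decrypting with the same key/IV recovers the plaintext.
        System.out.println(enc.length == plain.length && Arrays.equals(dec, plain));
    }
}
```

The length-preserving property is what lets HDFS serve random reads inside an encryption zone: a byte range in the ciphertext maps one-to-one onto the same range in the plaintext.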
[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liusheng updated HDFS-15098: Attachment: (was: HDFS-15098.009.patch) > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 3.4.0 >Reporter: liusheng >Assignee: liusheng >Priority: Major > Labels: sm4 > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, > HDFS-15098.009.patch, image-2020-08-19-16-54-41-341.png > > > SM4 (formerly SMS4)is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed to for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use sm4 on hdfs as follows:* > 1.Configure Hadoop KMS > 2.test HDFS sm4 > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *requires:* > 1.openssl version >=1.1.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15098) Add SM4 encryption method for HDFS
[ https://issues.apache.org/jira/browse/HDFS-15098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liusheng updated HDFS-15098: Status: Open (was: Patch Available) > Add SM4 encryption method for HDFS > -- > > Key: HDFS-15098 > URL: https://issues.apache.org/jira/browse/HDFS-15098 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 3.4.0 >Reporter: liusheng >Assignee: liusheng >Priority: Major > Labels: sm4 > Attachments: HDFS-15098.001.patch, HDFS-15098.002.patch, > HDFS-15098.003.patch, HDFS-15098.004.patch, HDFS-15098.005.patch, > HDFS-15098.006.patch, HDFS-15098.007.patch, HDFS-15098.008.patch, > image-2020-08-19-16-54-41-341.png > > > SM4 (formerly SMS4)is a block cipher used in the Chinese National Standard > for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure). > SM4 was a cipher proposed to for the IEEE 802.11i standard, but has so far > been rejected by ISO. One of the reasons for the rejection has been > opposition to the WAPI fast-track proposal by the IEEE. please see: > [https://en.wikipedia.org/wiki/SM4_(cipher)] > > *Use sm4 on hdfs as follows:* > 1.Configure Hadoop KMS > 2.test HDFS sm4 > hadoop key create key1 -cipher 'SM4/CTR/NoPadding' > hdfs dfs -mkdir /benchmarks > hdfs crypto -createZone -keyName key1 -path /benchmarks > *requires:* > 1.openssl version >=1.1.1 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?focusedWorklogId=475332=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475332 ] ASF GitHub Bot logged work on HDFS-15471: - Author: ASF GitHub Bot Created on: 27/Aug/20 14:45 Start Date: 27/Aug/20 14:45 Worklog Time Spent: 10m Work Description: steveloughran opened a new pull request #2252: URL: https://github.com/apache/hadoop/pull/2252 Contributed by Steve Loughran (Also: broken by Steve Loughran) PR includes a minor change to the HDFS Code to ensure yetus is happy; main change is that the eventually() clause is skipped on a consistent store, because it is not needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475332) Remaining Estimate: 0h Time Spent: 10m > TestHDFSContractMultipartUploader fails on trunk > > > Key: HDFS-15471 > URL: https://issues.apache.org/jira/browse/HDFS-15471 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.3.1, 3.4.0 >Reporter: Ahmed Hussein >Assignee: Steve Loughran >Priority: Major > Labels: test > Time Spent: 10m > Remaining Estimate: 0h > > {{TestHDFSContractMultipartUploader}} fails on trunk with > {{IllegalArgumentException}} > {code:bash} > [ERROR] > testConcurrentUploads(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader) > Time elapsed: 0.127 s <<< ERROR! 
> java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > at > org.apache.hadoop.test.LambdaTestUtils$ProportionalRetryInterval.(LambdaTestUtils.java:907) > at > org.apache.hadoop.fs.contract.AbstractContractMultipartUploaderTest.testConcurrentUploads(AbstractContractMultipartUploaderTest.java:815) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-15471: -- Affects Version/s: 3.4.0 3.3.1 > TestHDFSContractMultipartUploader fails on trunk > > > Key: HDFS-15471 > URL: https://issues.apache.org/jira/browse/HDFS-15471 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.1, 3.4.0 >Reporter: Ahmed Hussein >Assignee: Steve Loughran >Priority: Major > Labels: test > > {{TestHDFSContractMultipartUploader}} fails on trunk with > {{IllegalArgumentException}} > {code:bash} > [ERROR] > testConcurrentUploads(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader) > Time elapsed: 0.127 s <<< ERROR! > java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > at > org.apache.hadoop.test.LambdaTestUtils$ProportionalRetryInterval.(LambdaTestUtils.java:907) > at > org.apache.hadoop.fs.contract.AbstractContractMultipartUploaderTest.testConcurrentUploads(AbstractContractMultipartUploaderTest.java:815) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Updated] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HDFS-15471: -- Component/s: test > TestHDFSContractMultipartUploader fails on trunk > > > Key: HDFS-15471 > URL: https://issues.apache.org/jira/browse/HDFS-15471 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.3.1, 3.4.0 >Reporter: Ahmed Hussein >Assignee: Steve Loughran >Priority: Major > Labels: test > > {{TestHDFSContractMultipartUploader}} fails on trunk with > {{IllegalArgumentException}} > {code:bash} > [ERROR] > testConcurrentUploads(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader) > Time elapsed: 0.127 s <<< ERROR! > java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > at > org.apache.hadoop.test.LambdaTestUtils$ProportionalRetryInterval.(LambdaTestUtils.java:907) > at > org.apache.hadoop.fs.contract.AbstractContractMultipartUploaderTest.testConcurrentUploads(AbstractContractMultipartUploaderTest.java:815) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Assigned] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran reassigned HDFS-15471: - Assignee: Steve Loughran (was: Lisheng Sun) > TestHDFSContractMultipartUploader fails on trunk > > > Key: HDFS-15471 > URL: https://issues.apache.org/jira/browse/HDFS-15471 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ahmed Hussein >Assignee: Steve Loughran >Priority: Major > Labels: test > > {{TestHDFSContractMultipartUploader}} fails on trunk with > {{IllegalArgumentException}} > {code:bash} > [ERROR] > testConcurrentUploads(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader) > Time elapsed: 0.127 s <<< ERROR! > java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > at > org.apache.hadoop.test.LambdaTestUtils$ProportionalRetryInterval.(LambdaTestUtils.java:907) > at > org.apache.hadoop.fs.contract.AbstractContractMultipartUploaderTest.testConcurrentUploads(AbstractContractMultipartUploaderTest.java:815) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Commented] (HDFS-15544) Standby namenode EditLogTailerThread shouldn't aquire a lock interruptibly when do tail edits
[ https://issues.apache.org/jira/browse/HDFS-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185851#comment-17185851 ] jianghua zhu commented on HDFS-15544: - [~snodawn] , some unit tests should be added here. > Standby namenode EditLogTailerThread shouldn't aquire a lock interruptibly > when do tail edits > - > > Key: HDFS-15544 > URL: https://issues.apache.org/jira/browse/HDFS-15544 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.0 >Reporter: snodawn >Priority: Major > Attachments: HDFS-15544.001.patch > > > In my practice, active namenode sometimes holds a long time write lock in > rollEditLog > {code:java} > Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667) > > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292) > > org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code} > because standby namenode may not triggerActiveLogRoll() as set in > dfs.ha.log-roll.period (60s) after its last checkpoint, which may lead to a > large size (120m~200m) editlog for active namenode to roll. 
> > When trying to tail edits, the standby namenode EditLogTailerThread acquires the > same lock as it does in the checkpoint thread, but the checkpoint thread may take a > long time to save the fsimage file (in my practice, 5 minutes), so > triggerActiveLogRoll() in EditLogTailerThread will not be called as set in > dfs.ha.log-roll.period. > I propose that EditLogTailerThread shouldn't acquire the lock using > cpLockInterruptibly(); tryLock() is enough.
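The proposal above swaps the tailer's blocking, interruptible lock acquisition for a timed tryLock, so a long-running checkpoint cannot stall the roll-period timer. The class below is a generic java.util.concurrent sketch of that pattern, assuming a minimal hypothetical API; it is not FSNamesystem's actual cpLock code.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Generic sketch of the proposed pattern: a periodic background task uses a
// timed tryLock() instead of lockInterruptibly(), so it skips a cycle rather
// than blocking behind a long-held lock (e.g. a checkpoint thread saving the
// fsimage). Class and method names are illustrative, not FSNamesystem's API.
public class TryLockTailerSketch {

    final ReentrantLock cpLock = new ReentrantLock(); // stands in for the checkpoint lock

    /** Returns true if the tail work ran, false if the lock was busy. */
    boolean tailOnce(long waitMillis) throws InterruptedException {
        if (!cpLock.tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
            // Lock held elsewhere (checkpoint in progress): give up for this
            // cycle instead of blocking, so the periodic schedule keeps its
            // cadence and the log-roll check is retried on the next tick.
            return false;
        }
        try {
            // ... tail edits / triggerActiveLogRoll() would run here ...
            return true;
        } finally {
            cpLock.unlock();
        }
    }
}
```

The design trade-off: with `lockInterruptibly()` the tailer thread is parked for as long as the checkpoint holds the lock (minutes here), silently stretching dfs.ha.log-roll.period; with a bounded `tryLock()` the worst case per cycle is the timeout, and a skipped cycle is recovered on the next scheduled run.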
[jira] [Commented] (HDFS-15471) TestHDFSContractMultipartUploader fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185843#comment-17185843 ] Steve Loughran commented on HDFS-15471: --- sorry, missed this. will look > TestHDFSContractMultipartUploader fails on trunk > > > Key: HDFS-15471 > URL: https://issues.apache.org/jira/browse/HDFS-15471 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ahmed Hussein >Assignee: Lisheng Sun >Priority: Major > Labels: test > > {{TestHDFSContractMultipartUploader}} fails on trunk with > {{IllegalArgumentException}} > {code:bash} > [ERROR] > testConcurrentUploads(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader) > Time elapsed: 0.127 s <<< ERROR! > java.lang.IllegalArgumentException > at > com.google.common.base.Preconditions.checkArgument(Preconditions.java:127) > at > org.apache.hadoop.test.LambdaTestUtils$ProportionalRetryInterval.(LambdaTestUtils.java:907) > at > org.apache.hadoop.fs.contract.AbstractContractMultipartUploaderTest.testConcurrentUploads(AbstractContractMultipartUploaderTest.java:815) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Updated] (HDFS-15544) Standby namenode EditLogTailerThread shouldn't aquire a lock interruptibly when do tail edits
[ https://issues.apache.org/jira/browse/HDFS-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] snodawn updated HDFS-15544: --- Description: In my practice, active namenode sometimes holds a long time write lock in rollEditLog {code:java} Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via java.lang.Thread.getStackTrace(Thread.java:1559) org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283) org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258) org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610) org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667) org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292) org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code} because standby namenode may not triggerActiveLogRoll() as set in dfs.ha.log-roll.period (60s) after its last checkpoint, which may lead to a large size (120m~200m) editlog for active namenode to roll. When try to do tail edits, standby namenode EditLogTailerThread acquire the same lock as it do in checkpoint thread, but checkpoint thread may paste a log of time to save fsimage file (in my practice, 5 minutes) , so triggerActiveLogRoll() in EditLogTailerThread will not be called as set in dfs.ha.log-roll.period. I propose that EditLogTailerThread shouldn't acquire a lock by using cpLockInterruptibly(), trylock() is enough. 
was: In my practice, active namenode sometimes holds a long time write lock in rollEditLog {code:java} Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via java.lang.Thread.getStackTrace(Thread.java:1559) org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283) org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258) org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610) org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667) org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292) org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code} because standby namenode may not triggerActiveLogRoll() as set in dfs.ha.log-roll.period (60s) after its last checkpoint, which may lead to a large size (120m~200m) editlog for active namenode to roll. When try to do tail edits, standby namenode EditLogTailerThread acquire the same lock as it do in checkpoint thread, but checkpoint thread may paste a log of time to save fsimage file (in my practice, 4 minutes) , so triggerActiveLogRoll() in EditLogTailerThread will not be called as set in dfs.ha.log-roll.period. I propose that EditLogTailerThread shouldn't acquire a lock by using cpLockInterruptibly(), trylock() is enough. 
> Standby namenode EditLogTailerThread shouldn't aquire a lock interruptibly > when do tail edits > - > > Key: HDFS-15544 > URL: https://issues.apache.org/jira/browse/HDFS-15544 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.0 >Reporter: snodawn >Priority: Major > Attachments: HDFS-15544.001.patch > > > In my practice, active namenode sometimes holds a long time write lock in > rollEditLog > {code:java} > Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667) > > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292) > > org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code} > because standby namenode may not triggerActiveLogRoll() as set in > dfs.ha.log-roll.period (60s) after its last checkpoint, which may lead to a > large size (120m~200m) editlog for
[jira] [Updated] (HDFS-15544) Standby namenode EditLogTailerThread shouldn't aquire a lock interruptibly when do tail edits
[ https://issues.apache.org/jira/browse/HDFS-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] snodawn updated HDFS-15544: --- Description: In my practice, active namenode sometimes holds a long time write lock in rollEditLog {code:java} Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via java.lang.Thread.getStackTrace(Thread.java:1559) org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283) org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258) org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610) org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667) org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292) org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code} because standby namenode may not triggerActiveLogRoll() as set in dfs.ha.log-roll.period (60s) after its last checkpoint, which may lead to a large size (120m~200m) editlog for active namenode to roll. When try to do tail edits, standby namenode EditLogTailerThread acquire the same lock as it do in checkpoint thread, but checkpoint thread may paste a log of time to save fsimage file (in my practice, 4 minutes) , so triggerActiveLogRoll() in EditLogTailerThread will not be called as set in dfs.ha.log-roll.period. I propose that EditLogTailerThread shouldn't acquire a lock by using cpLockInterruptibly(), trylock() is enough. 
was: In my practice, active namenode sometimes holds a long time write lock in rollEditLog {code:java} Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via java.lang.Thread.getStackTrace(Thread.java:1559) org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283) org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258) org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610) org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667) org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292) org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code} because standby namenode may not triggerActiveLogRoll() as set in dfs.ha.log-roll.period after its last checkpoint, which may lead to a large size editlog for active namenode to roll. When try to do tail edits, standby namenode EditLogTailerThread acquire the same lock as it do in checkpoint thread, but checkpoint thread may paste a log of time to save fsimage file (in my practice, 4 minutes) , so triggerActiveLogRoll() in EditLogTailerThread will not be called as set in dfs.ha.log-roll.period. I propose that EditLogTailerThread shouldn't acquire a lock by using cpLockInterruptibly(), trylock() is enough. 
> Standby namenode EditLogTailerThread shouldn't aquire a lock interruptibly > when do tail edits > - > > Key: HDFS-15544 > URL: https://issues.apache.org/jira/browse/HDFS-15544 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.0 >Reporter: snodawn >Priority: Major > Attachments: HDFS-15544.001.patch > > > In my practice, active namenode sometimes holds a long time write lock in > rollEditLog > {code:java} > Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667) > > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292) > > org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code} > because standby namenode may not triggerActiveLogRoll() as set in > dfs.ha.log-roll.period (60s) after its last checkpoint, which may lead to a > large size (120m~200m) editlog for active namenode to
[jira] [Updated] (HDFS-15544) Standby namenode EditLogTailerThread shouldn't aquire a lock interruptibly when do tail edits
[ https://issues.apache.org/jira/browse/HDFS-15544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] snodawn updated HDFS-15544: --- Attachment: HDFS-15544.001.patch > Standby namenode EditLogTailerThread shouldn't aquire a lock interruptibly > when do tail edits > - > > Key: HDFS-15544 > URL: https://issues.apache.org/jira/browse/HDFS-15544 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.3.0 >Reporter: snodawn >Priority: Major > Attachments: HDFS-15544.001.patch > > > In my practice, active namenode sometimes holds a long time write lock in > rollEditLog > {code:java} > Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610) > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667) > > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292) > > org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code} > because standby namenode may not triggerActiveLogRoll() as set in > dfs.ha.log-roll.period after its last checkpoint, which may lead to a large > size editlog for active namenode to roll. > > When try to do tail edits, standby namenode EditLogTailerThread acquire the > same lock as it do in checkpoint thread, but checkpoint thread may paste a > log of time to save fsimage file (in my practice, 4 minutes) , so > triggerActiveLogRoll() in EditLogTailerThread will not be called as set in > dfs.ha.log-roll.period. 
> I propose that the EditLogTailer thread shouldn't acquire the lock with cpLockInterruptibly(); tryLock() is enough.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15544) Standby namenode EditLogTailerThread shouldn't acquire a lock interruptibly when tailing edits
snodawn created HDFS-15544: -- Summary: Standby namenode EditLogTailerThread shouldn't acquire a lock interruptibly when tailing edits Key: HDFS-15544 URL: https://issues.apache.org/jira/browse/HDFS-15544 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.3.0 Reporter: snodawn

In my experience, the active namenode sometimes holds the write lock for a long time in rollEditLog:

{code:java}
Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667)
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292)
org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code}

This happens because the standby namenode may not have called triggerActiveLogRoll() within dfs.ha.log-roll.period after its last checkpoint, which can leave a large edit log for the active namenode to roll.

When tailing edits, the standby namenode's EditLogTailer thread acquires the same lock as the checkpoint thread, but the checkpoint thread may spend a long time saving the fsimage file (in my experience, 4 minutes), so triggerActiveLogRoll() in the EditLogTailer thread is not called as often as dfs.ha.log-roll.period specifies.

I propose that the EditLogTailer thread shouldn't acquire the lock with cpLockInterruptibly(); tryLock() is enough.
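The proposal above can be sketched in plain java.util.concurrent terms. This is a minimal illustration, not the actual HDFS patch: the class, method names, and the 100 ms timeout are hypothetical. The idea is that the tailer tries the shared lock with a timeout and skips the cycle if a checkpoint still holds it, instead of blocking for minutes in lockInterruptibly().

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the proposed behavior: try the lock, skip the cycle on contention.
public class TailerLockSketch {
    // Returns true if this tail cycle ran, false if it was skipped because
    // the (simulated) checkpoint thread still holds the lock.
    public static boolean tryTailOnce(ReentrantLock cpLock) {
        boolean acquired = false;
        try {
            // Hypothetical timeout; a real patch would pick a suitable value.
            acquired = cpLock.tryLock(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!acquired) {
            return false; // checkpoint in progress: skip rather than block
        }
        try {
            // ... tailing edits / calling triggerActiveLogRoll() would go here ...
            return true;
        } finally {
            cpLock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        ReentrantLock cpLock = new ReentrantLock();
        System.out.println("uncontended: " + tryTailOnce(cpLock));

        // Simulate a checkpoint thread holding the lock for a long time.
        Thread checkpoint = new Thread(() -> {
            cpLock.lock();
            try { Thread.sleep(1000); } catch (InterruptedException ignored) {}
            finally { cpLock.unlock(); }
        });
        checkpoint.start();
        Thread.sleep(200); // let the checkpoint thread grab the lock first
        System.out.println("contended: " + tryTailOnce(cpLock));
        checkpoint.join();
    }
}
```

With tryLock(), the tailer degrades to "try again next period" under contention, which is exactly the behavior the reporter wants during a slow fsimage save.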
[jira] [Commented] (HDFS-15543) RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185762#comment-17185762 ] Hemanth Boyina commented on HDFS-15543: --- Thanks [~Harsha1206] for reporting the issue. When a destination subcluster is unavailable, we allow writing a file if the mount point has fault tolerance enabled. After the file is created, the client calls newDataEncryptionKey(), which calls getServerDefaults to fetch the encryptDataTransfer settings. In the router, getServerDefaults is invoked on the default namespace if one is set; in your scenario the default namespace is unavailable, which leads to a ConnectException and makes the write fail.

> RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
>
> Key: HDFS-15543
> URL: https://issues.apache.org/jira/browse/HDFS-15543
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rbf
> Affects Versions: 3.1.1
> Reporter: Harshakiran Reddy
> Assignee: Hemanth Boyina
> Priority: Major
>
> A RANDOM mount point should allow creating new files when one subcluster is down and fault tolerance is enabled, but here it fails.
> MultiDestination_client]# hdfs dfsrouteradmin -ls /test_ec
> *Mount Table Entries:*
> Source Destinations Owner Group Mode Quota/Usage
> /test_ec *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x [NsQuota: -/-, SsQuota: -/-]
> *File write threw the exception:*
> 2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551
> 2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[DISK]
> 2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception java.io.IOException: Unable to create new block.
> at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718)
> 2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
> put: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
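The failure mode described in the comment above can be sketched outside of Hadoop. This is a hypothetical illustration, not the actual Router code: the interface, method names, and namespace names are made up. It shows why pinning an RPC such as getServerDefaults to a single default namespace fails the whole write when that subcluster is down, and how falling back to the next namespace would avoid it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ConnectException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a per-namespace call with fallback on ConnectException.
public class RouterFallbackSketch {
    interface NamespaceCall {
        String invoke(String ns) throws IOException; // e.g. a getServerDefaults RPC
    }

    // Try each namespace in order instead of failing on the first unreachable one.
    static String invokeFirstAvailable(List<String> namespaces, NamespaceCall call) {
        IOException last = new ConnectException("no namespace reachable");
        for (String ns : namespaces) {
            try {
                return call.invoke(ns);
            } catch (ConnectException e) {
                last = e; // this subcluster is down; fall through to the next
            } catch (IOException e) {
                throw new UncheckedIOException(e); // other failures are fatal
            }
        }
        throw new UncheckedIOException(last);
    }

    public static void main(String[] args) {
        // "hacluster" (the default namespace) is down; "hacluster1" is up.
        String result = invokeFirstAvailable(Arrays.asList("hacluster", "hacluster1"), ns -> {
            if (ns.equals("hacluster")) {
                throw new ConnectException(ns + " unreachable");
            }
            return "serverDefaults from " + ns;
        });
        System.out.println(result);
    }
}
```

Pinning the call to the default namespace corresponds to passing a single-element list here: the ConnectException then propagates and the write aborts, which matches the DataStreamer log in the report.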
[jira] [Updated] (HDFS-15543) RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harshakiran Reddy updated HDFS-15543: - Description: (edited to drop the "FI_" prefix from the client prompt; otherwise the same as the description quoted below)

> RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
>
> Key: HDFS-15543
> URL: https://issues.apache.org/jira/browse/HDFS-15543
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rbf
> Affects Versions: 3.1.1
> Reporter: Harshakiran Reddy
> Assignee: Hemanth Boyina
> Priority: Major
>
> A RANDOM mount point should allow creating new files when one subcluster is down and fault tolerance is enabled, but here it fails.
> MultiDestination_client]# hdfs dfsrouteradmin -ls /test_ec
> *Mount Table Entries:*
> Source Destinations Owner Group Mode Quota/Usage
> /test_ec *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x [NsQuota: -/-, SsQuota: -/-]
> *File write threw the exception:*
> 2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551
> 2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[DISK]
> 2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception java.io.IOException: Unable to create new block.
> at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718)
> 2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
> put: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
[jira] [Updated] (HDFS-15543) RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harshakiran Reddy updated HDFS-15543: - Description: (edited to move the mount-table listing out of the Environment field and into the description, which now reads as quoted below) Environment: (was: the FI_MultiDestination_client mount-table listing now shown in the description)

> RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
>
> Key: HDFS-15543
> URL: https://issues.apache.org/jira/browse/HDFS-15543
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rbf
> Affects Versions: 3.1.1
> Reporter: Harshakiran Reddy
> Assignee: Hemanth Boyina
> Priority: Major
>
> A RANDOM mount point should allow creating new files when one subcluster is down and fault tolerance is enabled, but here it fails.
> FI_MultiDestination_client]# hdfs dfsrouteradmin -ls /test_ec
> *Mount Table Entries:*
> Source Destinations Owner Group Mode Quota/Usage
> /test_ec *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x [NsQuota: -/-, SsQuota: -/-]
> *File write threw the exception:*
> 2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551
> 2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[DISK]
> 2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception java.io.IOException: Unable to create new block.
> at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718)
> 2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
> put: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
[jira] [Assigned] (HDFS-15543) RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Boyina reassigned HDFS-15543: - Assignee: Hemanth Boyina

> RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
>
> Key: HDFS-15543
> URL: https://issues.apache.org/jira/browse/HDFS-15543
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rbf
> Affects Versions: 3.1.1
> Environment: FI_MultiDestination_client]# *hdfs dfsrouteradmin -ls /test_ec*
> *Mount Table Entries:*
> Source Destinations Owner Group Mode Quota/Usage
> */test_ec* *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x [NsQuota: -/-, SsQuota: -/-]
> Reporter: Harshakiran Reddy
> Assignee: Hemanth Boyina
> Priority: Major
>
> A RANDOM mount point should allow creating new files when one subcluster is down and fault tolerance is enabled, but here it fails.
> *File write threw the exception:*
> 2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551
> 2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[DISK]
> 2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception java.io.IOException: Unable to create new block.
> at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758)
> at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718)
> 2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
> put: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
[jira] [Created] (HDFS-15543) RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled.
Harshakiran Reddy created HDFS-15543: Summary: RBF: Writes should be allowed when a subcluster is unavailable for RANDOM mount points with fault tolerance enabled. Key: HDFS-15543 URL: https://issues.apache.org/jira/browse/HDFS-15543 Project: Hadoop HDFS Issue Type: Bug Components: rbf Affects Versions: 3.1.1 Environment: FI_MultiDestination_client]# *hdfs dfsrouteradmin -ls /test_ec* *Mount Table Entries:* Source Destinations Owner Group Mode Quota/Usage */test_ec* *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x [NsQuota: -/-, SsQuota: -/-] Reporter: Harshakiran Reddy

A RANDOM mount point should allow creating new files when one subcluster is down and fault tolerance is enabled, but here it fails.

*File write threw the exception:*
2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551
2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode DatanodeInfoWithStorage[DISK]
2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758)
at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718)
2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
put: Could not get block locations. Source file "/test_ec/f1._COPYING_" - Aborting...block==null
[jira] [Work logged] (HDFS-15542) Add identified snapshot corruption tests for ordered snapshot deletion
[ https://issues.apache.org/jira/browse/HDFS-15542?focusedWorklogId=475218=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475218 ] ASF GitHub Bot logged work on HDFS-15542: - Author: ASF GitHub Bot Created on: 27/Aug/20 09:57 Start Date: 27/Aug/20 09:57 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2251: URL: https://github.com/apache/hadoop/pull/2251#issuecomment-681849593 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 35s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 29m 2s | trunk passed | | +1 :green_heart: | compile | 1m 20s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 1m 14s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 0m 49s | trunk passed | | +1 :green_heart: | mvnsite | 1m 25s | trunk passed | | +1 :green_heart: | shadedclient | 16m 32s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 53s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 24s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 3m 1s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 2m 59s | trunk passed | ||| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 8s | the patch passed | | +1 :green_heart: | compile | 1m 9s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 1m 9s | the patch passed | | +1 :green_heart: | compile | 1m 5s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 1m 5s | the patch passed | | -0 :warning: | checkstyle | 0m 39s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 10 new + 0 unchanged - 0 fixed = 10 total (was 0) | | +1 :green_heart: | mvnsite | 1m 10s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 13m 38s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 0m 46s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 24s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 3m 2s | the patch passed | ||| _ Other Tests _ | | -1 :x: | unit | 96m 2s | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 40s | The patch does not generate ASF License warnings. 
| | | | 178m 51s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap | | | hadoop.hdfs.server.datanode.TestBPOfferService | | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader | | | hadoop.hdfs.TestFileChecksum | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2251/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2251 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f0018197eeeb 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / d8aaa8c3380 | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | |
[jira] [Resolved] (HDFS-15500) In-order deletion of snapshots: Diff lists must be updated only in the last snapshot
[ https://issues.apache.org/jira/browse/HDFS-15500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDFS-15500. Fix Version/s: 3.4.0 Resolution: Fixed Thanks [~szetszwo] for the contribution.

> In-order deletion of snapshots: Diff lists must be updated only in the last snapshot
> ---
>
> Key: HDFS-15500
> URL: https://issues.apache.org/jira/browse/HDFS-15500
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: snapshots
> Reporter: Mukul Kumar Singh
> Assignee: Tsz-wo Sze
> Priority: Major
> Fix For: 3.4.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> With ordered deletions, the diff lists of the snapshots should become immutable except for the latest one.
[jira] [Work logged] (HDFS-15500) In-order deletion of snapshots: Diff lists must be updated only in the last snapshot
[ https://issues.apache.org/jira/browse/HDFS-15500?focusedWorklogId=475202=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475202 ] ASF GitHub Bot logged work on HDFS-15500: - Author: ASF GitHub Bot Created on: 27/Aug/20 09:25 Start Date: 27/Aug/20 09:25 Worklog Time Spent: 10m Work Description: bshashikant merged pull request #2233: URL: https://github.com/apache/hadoop/pull/2233 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475202) Time Spent: 0.5h (was: 20m)

> In-order deletion of snapshots: Diff lists must be updated only in the last snapshot
> ---
>
> Key: HDFS-15500
> URL: https://issues.apache.org/jira/browse/HDFS-15500
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: snapshots
> Reporter: Mukul Kumar Singh
> Assignee: Tsz-wo Sze
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> With ordered deletions, the diff lists of the snapshots should become immutable except for the latest one.
[jira] [Work logged] (HDFS-14546) Document block placement policies
[ https://issues.apache.org/jira/browse/HDFS-14546?focusedWorklogId=475171=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475171 ] ASF GitHub Bot logged work on HDFS-14546: - Author: ASF GitHub Bot Created on: 27/Aug/20 07:00 Start Date: 27/Aug/20 07:00 Worklog Time Spent: 10m Work Description: Amithsha closed pull request #1562: URL: https://github.com/apache/hadoop/pull/1562 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475171) Remaining Estimate: 0h Time Spent: 10m

> Document block placement policies
> -
>
> Key: HDFS-14546
> URL: https://issues.apache.org/jira/browse/HDFS-14546
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Íñigo Goiri
> Assignee: Amithsha
> Priority: Major
> Labels: documentation
> Fix For: 3.4.0
>
> Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, HDFS-14546-06.patch, HDFS-14546-07.patch, HDFS-14546-08.patch, HDFS-14546-09.patch, HdfsDesign.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, all the documentation refers to the default block placement policy. However, over time new policies have been added:
> * BlockPlacementPolicyRackFaultTolerant (HDFS-7891)
> * BlockPlacementPolicyWithNodeGroup (HDFS-3601)
> * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006)
> We should update the documentation to cover them, explaining their particularities and probably how to set up each one of them.
[jira] [Work logged] (HDFS-15500) In-order deletion of snapshots: Diff lists must be updated only in the last snapshot
[ https://issues.apache.org/jira/browse/HDFS-15500?focusedWorklogId=475170=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475170 ] ASF GitHub Bot logged work on HDFS-15500: - Author: ASF GitHub Bot Created on: 27/Aug/20 06:58 Start Date: 27/Aug/20 06:58 Worklog Time Spent: 10m Work Description: szetszwo commented on pull request #2233: URL: https://github.com/apache/hadoop/pull/2233#issuecomment-681643051 > ./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:1574: assert assertsEnabled = true; // Intentional side effect!!!:27: Inner assignments should be avoided. [InnerAssignment] @bshashikant The checkstyle warning is intentional, as the comment states: that idiom is the way to check whether asserts are enabled in the code. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475170) Time Spent: 20m (was: 10m)

> In-order deletion of snapshots: Diff lists must be updated only in the last snapshot
> ---
>
> Key: HDFS-15500
> URL: https://issues.apache.org/jira/browse/HDFS-15500
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: snapshots
> Reporter: Mukul Kumar Singh
> Assignee: Tsz-wo Sze
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> With ordered deletions, the diff lists of the snapshots should become immutable except for the latest one.
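The flagged line uses a well-known Java idiom for detecting at runtime whether assertions are enabled: the assignment inside the assert only executes when the JVM runs with -ea, which is exactly why checkstyle's InnerAssignment rule (incorrectly, in this case) flags it. A standalone version of the idiom, with a hypothetical class name, looks like this:

```java
// Standalone illustration of the assert-side-effect idiom flagged by checkstyle.
public class AssertCheck {
    public static boolean assertsEnabled() {
        boolean enabled = false;
        // Intentional side effect: this assignment only runs when the JVM
        // was started with -ea, so "enabled" stays false otherwise.
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("asserts enabled: " + assertsEnabled());
    }
}
```

Running with `java -ea AssertCheck` prints true; running without -ea prints false, because the assert statement (and its assignment) is skipped entirely.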
[jira] [Work logged] (HDFS-15542) Add identified snapshot corruption tests for ordered snapshot deletion
[ https://issues.apache.org/jira/browse/HDFS-15542?focusedWorklogId=475169=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475169 ] ASF GitHub Bot logged work on HDFS-15542: - Author: ASF GitHub Bot Created on: 27/Aug/20 06:56 Start Date: 27/Aug/20 06:56 Worklog Time Spent: 10m Work Description: bshashikant opened a new pull request #2251: URL: https://github.com/apache/hadoop/pull/2251 please see https://issues.apache.org/jira/browse/HDFS-15542. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475169) Remaining Estimate: 0h Time Spent: 10m

> Add identified snapshot corruption tests for ordered snapshot deletion
> --
>
> Key: HDFS-15542
> URL: https://issues.apache.org/jira/browse/HDFS-15542
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: snapshots
> Reporter: Shashikant Banerjee
> Assignee: Shashikant Banerjee
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> HDFS-13101, HDFS-15012 and HDFS-15313, along with HDFS-15470, have fsimage corruption sequences with snapshots. The idea here is to aggregate these unit tests and enable them for the ordered snapshot deletion feature.
[jira] [Work logged] (HDFS-15500) In-order deletion of snapshots: Diff lists must be updated only in the last snapshot
[ https://issues.apache.org/jira/browse/HDFS-15500?focusedWorklogId=475153=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475153 ] ASF GitHub Bot logged work on HDFS-15500: - Author: ASF GitHub Bot Created on: 27/Aug/20 06:29 Start Date: 27/Aug/20 06:29 Worklog Time Spent: 10m Work Description: bshashikant commented on pull request #2233: URL: https://github.com/apache/hadoop/pull/2233#issuecomment-681620675 @szetszwo , can you take care of the checkstyle issue? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 475153) Remaining Estimate: 0h Time Spent: 10m

> In-order deletion of snapshots: Diff lists must be updated only in the last snapshot
> ---
>
> Key: HDFS-15500
> URL: https://issues.apache.org/jira/browse/HDFS-15500
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: snapshots
> Reporter: Mukul Kumar Singh
> Assignee: Tsz-wo Sze
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> With ordered deletions, the diff lists of the snapshots should become immutable except for the latest one.
[jira] [Resolved] (HDFS-15541) Disallow making a snapshottable directory unsnapshottable if it has a non-empty snapshot trash directory
[ https://issues.apache.org/jira/browse/HDFS-15541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDFS-15541. Fix Version/s: 3.4.0 Resolution: Duplicate

> Disallow making a snapshottable directory unsnapshottable if it has a non-empty snapshot trash directory
> -
>
> Key: HDFS-15541
> URL: https://issues.apache.org/jira/browse/HDFS-15541
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: snapshots
> Reporter: Shashikant Banerjee
> Assignee: Siyao Meng
> Priority: Major
> Fix For: 3.4.0
>
> If snapshot trash is enabled, a snapshottable directory should not be allowed to be marked unsnapshottable if it has a non-empty snapshot trash directory.
[jira] [Created] (HDFS-15542) Add identified snapshot corruption tests for ordered snapshot deletion
Shashikant Banerjee created HDFS-15542: -- Summary: Add identified snapshot corruption tests for ordered snapshot deletion Key: HDFS-15542 URL: https://issues.apache.org/jira/browse/HDFS-15542 Project: Hadoop HDFS Issue Type: Sub-task Components: snapshots Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee

HDFS-13101, HDFS-15012 and HDFS-15313, along with HDFS-15470, have fsimage corruption sequences with snapshots. The idea here is to aggregate these unit tests and enable them for the ordered snapshot deletion feature.
[jira] [Created] (HDFS-15541) Disallow making a Snapshottable directory unsnapshottable if it has a non-empty snapshot trash directory
Shashikant Banerjee created HDFS-15541: -- Summary: Disallow making a Snapshottable directory unsnapshottable if it has a non-empty snapshot trash directory Key: HDFS-15541 URL: https://issues.apache.org/jira/browse/HDFS-15541 Project: Hadoop HDFS Issue Type: Sub-task Components: snapshots Reporter: Shashikant Banerjee Assignee: Siyao Meng

If the snapshot trash is enabled, a snapshottable directory should be disallowed to be marked unsnapshottable if it has a non-empty snapshot trash directory.
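The rule HDFS-15541 proposes — reject the un-snapshottable transition while the snapshot trash directory still holds entries — amounts to a simple precondition check. The sketch below is illustrative only: `DisallowSnapshotCheck` and `validateDisallowSnapshot` are hypothetical names, not part of the HDFS codebase.

```java
// Hypothetical sketch of the HDFS-15541 precondition: a snapshottable
// directory may be made un-snapshottable only once its snapshot trash
// directory is empty (or the trash feature is disabled).
public final class DisallowSnapshotCheck {

    /** Rejects the transition while the snapshot trash directory is non-empty. */
    public static void validateDisallowSnapshot(boolean trashEnabled,
                                                int trashEntryCount,
                                                String dir) {
        if (trashEnabled && trashEntryCount > 0) {
            throw new IllegalStateException(
                "Cannot make " + dir + " un-snapshottable: snapshot trash"
                + " directory is not empty (" + trashEntryCount + " entries)");
        }
    }

    public static void main(String[] args) {
        validateDisallowSnapshot(true, 0, "/dir");    // empty trash: allowed
        validateDisallowSnapshot(false, 5, "/dir");   // trash disabled: allowed
        try {
            validateDisallowSnapshot(true, 3, "/dir"); // non-empty trash: rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected");
        }
    }
}
```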
[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS
[ https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=475152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-475152 ]

ASF GitHub Bot logged work on HDFS-15025: - Author: ASF GitHub Bot Created on: 27/Aug/20 06:19 Start Date: 27/Aug/20 06:19 Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#issuecomment-681615978

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|::|--:|:|:|
| +0 :ok: | reexec | 0m 30s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +0 :ok: | buf | 0m 0s | buf was not available. |
| +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 16 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 21s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 25m 44s | trunk passed |
| +1 :green_heart: | compile | 19m 9s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 16m 45s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 3m 0s | trunk passed |
| +1 :green_heart: | mvnsite | 4m 3s | trunk passed |
| +1 :green_heart: | shadedclient | 21m 36s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 32s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 4m 4s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 2m 35s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 7m 58s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 26s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 44s | the patch passed |
| +1 :green_heart: | compile | 18m 39s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | cc | 18m 38s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 12 new + 151 unchanged - 12 fixed = 163 total (was 163) |
| +1 :green_heart: | javac | 18m 38s | the patch passed |
| +1 :green_heart: | compile | 17m 50s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | cc | 17m 50s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 10 new + 153 unchanged - 10 fixed = 163 total (was 163) |
| +1 :green_heart: | javac | 17m 50s | the patch passed |
| -0 :warning: | checkstyle | 2m 57s | root: The patch generated 3 new + 725 unchanged - 4 fixed = 728 total (was 729) |
| +1 :green_heart: | mvnsite | 4m 4s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 14m 14s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 15s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 3m 54s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 8m 39s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 9m 56s | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 18s | hadoop-hdfs-client in the patch passed. |
| -1 :x: | unit | 101m 28s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 57s | The patch does not generate ASF License warnings. |
| | | 297m 7s | |

| Reason | Tests |
|---:|:--|
| Failed junit tests | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.fs.contract.hdfs.TestHDFSContractMultipartUploader |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.server.balancer.TestBalancer |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.40
[jira] [Updated] (HDFS-15540) Directories protected from delete can still be moved to the trash
[ https://issues.apache.org/jira/browse/HDFS-15540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-15540: --- Fix Version/s: 3.4.0 3.3.1 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~sodonnell] for the patch and [~ferhui] for the review! > Directories protected from delete can still be moved to the trash > - > > Key: HDFS-15540 > URL: https://issues.apache.org/jira/browse/HDFS-15540 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.4.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Fix For: 3.3.1, 3.4.0 > > Attachments: HDFS-15540.001.patch > > > With HDFS-8983, HDFS-14802 and HDFS-15243 we are able to list protected > directories which cannot be deleted or renamed, provided the following is set: > fs.protected.directories: > dfs.protected.subdirectories.enable: true > Testing this feature out, I can see it mostly works fine, but protected > non-empty folders can still be moved to the trash. In this example > /dir/protected is set in fs.protected.directories, and > dfs.protected.subdirectories.enable is true. 
> {code}
> hadoop fs -ls -R /dir
> drwxr-xr-x - hdfs supergroup 0 2020-08-26 16:52 /dir/protected
> -rw-r--r-- 3 hdfs supergroup 174 2020-08-26 16:52 /dir/protected/file1
> drwxr-xr-x - hdfs supergroup 0 2020-08-26 16:52 /dir/protected/subdir1
> -rw-r--r-- 3 hdfs supergroup 174 2020-08-26 16:52 /dir/protected/subdir1/file1
> drwxr-xr-x - hdfs supergroup 0 2020-08-26 16:52 /dir/protected/subdir2
> -rw-r--r-- 3 hdfs supergroup 174 2020-08-26 16:52 /dir/protected/subdir2/file1
> [hdfs@7d67ed1af9b0 /]$ hadoop fs -rm -r -f -skipTrash /dir/protected/subdir1
> rm: Cannot delete/rename subdirectory under protected subdirectory /dir/protected
> [hdfs@7d67ed1af9b0 /]$ hadoop fs -mv /dir/protected/subdir1 /dir/protected/subdir1-moved
> mv: Cannot delete/rename subdirectory under protected subdirectory /dir/protected
> ** ALL GOOD SO FAR **
> [hdfs@7d67ed1af9b0 /]$ hadoop fs -rm -r -f /dir/protected/subdir1
> 2020-08-26 16:54:32,404 INFO fs.TrashPolicyDefault: Moved: 'hdfs://nn1/dir/protected/subdir1' to trash at: hdfs://nn1/user/hdfs/.Trash/Current/dir/protected/subdir1
> ** It moved the protected sub-dir to the trash, where it will be deleted **
> ** Checking the top level dir, it is the same **
> [hdfs@7d67ed1af9b0 /]$ hadoop fs -rm -r -f -skipTrash /dir/protected
> rm: Cannot delete/rename non-empty protected directory /dir/protected
> [hdfs@7d67ed1af9b0 /]$ hadoop fs -mv /dir/protected /dir/protected-new
> mv: Cannot delete/rename non-empty protected directory /dir/protected
> [hdfs@7d67ed1af9b0 /]$ hadoop fs -rm -r -f /dir/protected
> 2020-08-26 16:55:32,402 INFO fs.TrashPolicyDefault: Moved: 'hdfs://nn1/dir/protected' to trash at: hdfs://nn1/user/hdfs/.Trash/Current/dir/protected1598460932388
> {code}
> The reason for this seems to be that "move to trash" uses a different rename method in FSNamesystem and FSDirRenameOp which avoids the DFSUtil.checkProtectedDescendants(...) added in the earlier Jiras.
> I believe that "move to trash" should be protected in the same way as a -skipTrash delete.
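The fix direction described above — run the same protected-directory check on the trash rename path that `-skipTrash` deletes already go through — can be sketched with plain path strings. This is an assumption-laden illustration: `ProtectedDirCheck`, `isProtectedOrDescendant`, and `checkTrashMove` are hypothetical helpers, not the actual `DFSUtil.checkProtectedDescendants` signature.

```java
import java.util.Set;

// Illustrative sketch (not the real FSDirRenameOp code) of rejecting a
// move-to-trash when the source is a protected directory or sits under one.
public final class ProtectedDirCheck {

    /** True if src equals a protected directory or lies underneath one. */
    public static boolean isProtectedOrDescendant(Set<String> protectedDirs,
                                                  String src) {
        for (String dir : protectedDirs) {
            // "/a/b/c" is under "/a/b", but "/a/bc" is not.
            if (src.equals(dir) || src.startsWith(dir + "/")) {
                return true;
            }
        }
        return false;
    }

    /** A trash-aware rename path would call this before moving src to trash. */
    public static void checkTrashMove(Set<String> protectedDirs, String src) {
        if (isProtectedOrDescendant(protectedDirs, src)) {
            throw new SecurityException(
                "Cannot move protected path " + src + " to trash");
        }
    }

    public static void main(String[] args) {
        Set<String> prot = Set.of("/dir/protected");
        System.out.println(isProtectedOrDescendant(prot, "/dir/protected/subdir1")); // true
        System.out.println(isProtectedOrDescendant(prot, "/dir/other"));             // false
    }
}
```

With a check like this wired into the trash path, the `hadoop fs -rm -r /dir/protected/subdir1` case in the transcript above would fail the same way the `-skipTrash` variant does.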
[jira] [Commented] (HDFS-15540) Directories protected from delete can still be moved to the trash
[ https://issues.apache.org/jira/browse/HDFS-15540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185627#comment-17185627 ]

Wei-Chiu Chuang commented on HDFS-15540: Committed to trunk and branch-3.3. For branch-3.2 and below they miss HDFS-14802 and HDFS-15243 and won't cherry pick cleanly.
[jira] [Commented] (HDFS-15540) Directories protected from delete can still be moved to the trash
[ https://issues.apache.org/jira/browse/HDFS-15540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185620#comment-17185620 ]

Wei-Chiu Chuang commented on HDFS-15540: +1 nice catch. Committing the 001 patch.