[jira] [Created] (HDFS-13091) Remove tomcat from the Hadoop-auth test bundle
Xiao Chen created HDFS-13091:
--------------------------------

             Summary: Remove tomcat from the Hadoop-auth test bundle
                 Key: HDFS-13091
                 URL: https://issues.apache.org/jira/browse/HDFS-13091
             Project: Hadoop HDFS
          Issue Type: Task
            Reporter: Xiao Chen
            Assignee: Xiao Chen

We have switched KMS and HttpFS from tomcat to jetty in 3.0. There appear to be some leftover tests in Hadoop-auth that were used for KMS / HttpFS coverage. We should clean up those tests accordingly.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router
[ https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346291#comment-16346291 ]

Yiqun Lin commented on HDFS-13044:
----------------------------------

+1 on the patch for branch-3.0. Please go ahead.

> RBF: Add a safe mode for the Router
> ---
>
> Key: HDFS-13044
> URL: https://issues.apache.org/jira/browse/HDFS-13044
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Íñigo Goiri
> Assignee: Íñigo Goiri
> Priority: Major
> Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, HDFS-13044.004.patch, HDFS-13044.005.patch
>
> When a Router cannot communicate with the State Store, it should enter into a safe mode that disallows certain operations.
[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-12528: - Fix Version/s: 2.10.0 > Add an option to not disable short-circuit reads on failures > > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: Xiao Chen >Priority: Major > Fix For: 3.1.0, 2.10.0, 3.0.1 > > Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, > HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, > HDFS-12528.05.patch > > > We have scenarios where data ingestion makes use of the -appendToFile > operation to add new data to existing HDFS files. In these situations, we're > frequently running into the problem described below. > We're using Impala to query the HDFS data with short-circuit reads (SCR) > enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce > the memory footprint. In some cases, though, Impala still keeps the HDFS file > handle open for reuse. > The "unbuffer" call, however, causes the file's current block reader to be > closed, which makes the associated ShortCircuitReplica evictable from the > ShortCircuitCache. When the cluster is under load, this means that the > ShortCircuitReplica can be purged off the cache pretty fast, which closes the > file descriptor to the underlying storage file. > That means that when Impala re-reads the file it has to re-open the storage > files associated with the ShortCircuitReplica's that were evicted from the > cache. If there were no appends to those blocks, the re-open will succeed > without problems. 
> If one block was appended since the ShortCircuitReplica was created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client operations. In this particular case ("Meta file not found") it would suffice to return null without disabling SCR. This particular block read would fall back to the normal, non-short-circuited path, and other SCR requests would continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled for in the "unknown response" case. 10 minutes seems a bit too long, and not being able to change that is a problem.
> [1] https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97
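The behavior change proposed above can be sketched as a small state machine. This is a hedged illustration only, not Hadoop's actual BlockReaderFactory/DomainSocketFactory code; the class and method names (ScrPolicySketch, handleFailure) are hypothetical:

```java
// Hedged sketch of the proposal: a "Meta file ... not found" failure (e.g.
// after an append) should make only the current read fall back to the normal
// remote path, instead of disabling short-circuit reads (SCR) for the whole
// client for a hard-coded 10 minutes.
class ScrPolicySketch {
    static final long DISABLE_WINDOW_MS = 10L * 60 * 1000; // current hard-coded window
    private long scrDisabledUntil = 0; // 0 == SCR enabled

    /** Handle an "unknown response"; returns true if the read should fall back. */
    boolean handleFailure(String error, long nowMs) {
        if (error.startsWith("Meta file")) {
            // Proposed: benign miss -- fall back for this block only,
            // leave SCR enabled for all other reads.
            return true;
        }
        // Truly unknown responses still disable SCR for the full window.
        scrDisabledUntil = nowMs + DISABLE_WINDOW_MS;
        return true;
    }

    boolean scrEnabled(long nowMs) {
        return nowMs >= scrDisabledUntil;
    }
}
```

Making DISABLE_WINDOW_MS configurable would address the second point in the description (the fixed 10-minute window).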
[jira] [Commented] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346285#comment-16346285 ]

Xiao Chen commented on HDFS-12528:
----------------------------------

Thanks for the commit [~cheersyang]. Cherry-picked to branch-2 as well.

> Add an option to not disable short-circuit reads on failures
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client, performance
> Affects Versions: 2.6.0
> Reporter: Andre Araujo
> Assignee: Xiao Chen
> Priority: Major
> Fix For: 3.1.0, 2.10.0, 3.0.1
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, HDFS-12528.05.patch
[jira] [Commented] (HDFS-13085) getStoragePolicy varies b/w REST and CLI for unspecified path
[ https://issues.apache.org/jira/browse/HDFS-13085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346259#comment-16346259 ]

Surendra Singh Lilhore commented on HDFS-13085:
-----------------------------------------------

Thanks [~brahmareddy] for reporting this issue. One suggestion: currently the {{-getStoragePolicy}} command gives the current inode's policy; it does not inherit from the parent. We could add an optional parameter "-i" to the command to return the inherited policy when the current inode's policy is undefined. For example:
{noformat}
./hdfs storagepolicies -getStoragePolicy [-i] -path <path>
{noformat}

> getStoragePolicy varies b/w REST and CLI for unspecified path
> -
>
> Key: HDFS-13085
> URL: https://issues.apache.org/jira/browse/HDFS-13085
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Brahma Reddy Battula
> Assignee: Mukul Kumar Singh
> Priority: Major
>
> *CLI*
> {noformat}
> hdfs storagepolicies -getStoragePolicy -path /node-label-root
> The storage policy of /node-label-root is unspecified
> {noformat}
> *REST*
> {noformat}
> curl -i "http://1*.*.*.*:50070/webhdfs/v1/node-label-root?op=GETSTORAGEPOLICY"
> {"BlockStoragePolicy":{"copyOnCreateFile":false,"creationFallbacks":[],"id":7,"name":"HOT","replicationFallbacks":["ARCHIVE"],"storageTypes":["DISK"]}}
> {noformat}
> HDFS-11968 introduced this new behaviour for CLI.
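The "-i" inheritance semantics suggested above can be sketched as a walk up the parent directories. This is a toy model over a plain map of path to policy, not the NameNode's INode tree; all names here are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the suggested "-i" flag: when the inode itself has no storage
// policy, walk up the ancestors until an explicit policy is found.
class InheritedPolicySketch {
    static String resolve(Map<String, String> policies, String path) {
        String p = path;
        while (true) {
            String policy = policies.get(p);
            if (policy != null) {
                return policy; // found an explicit policy on this ancestor
            }
            if (p.equals("/")) {
                return "UNSPECIFIED"; // no policy anywhere on the path
            }
            int slash = p.lastIndexOf('/');
            p = (slash <= 0) ? "/" : p.substring(0, slash); // step to parent
        }
    }

    // Small fixed example: /a is HOT, /a/b/c is COLD.
    static String resolveDemo(String path) {
        Map<String, String> policies = new HashMap<>();
        policies.put("/a", "HOT");
        policies.put("/a/b/c", "COLD");
        return resolve(policies, path);
    }
}
```

With such a lookup, resolveDemo("/a/b") would report "HOT" (inherited from /a), matching the REST behavior described in this issue, while the current CLI reports "unspecified".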
[jira] [Commented] (HDFS-13083) RBF: Fix doc error setting up client
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346233#comment-16346233 ]

tartarus commented on HDFS-13083:
---------------------------------

[~elgoiri] [~hudson] It's my pleasure to contribute to the project.

> RBF: Fix doc error setting up client
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: federation
> Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
> Reporter: tartarus
> Assignee: tartarus
> Priority: Major
> Labels: Router
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1
> Attachments: HDFS-13083.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> HDFS Router Federation doc error
> Configuration instructions
> {quote}
> dfs.namenodes.ns-fed
> r1,r2
> {quote}
> then execute: *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with
> *{{Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct configuration is
> {quote}
> dfs.*ha*.namenodes.ns-fed
> r1,r2
> {quote}
[jira] [Commented] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory
[ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346231#comment-16346231 ]

genericqa commented on HDFS-12051:
----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 52s | trunk passed |
| +1 | compile | 1m 0s | trunk passed |
| +1 | checkstyle | 0m 56s | trunk passed |
| +1 | mvnsite | 1m 2s | trunk passed |
| +1 | shadedclient | 13m 35s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 49s | trunk passed |
| +1 | javadoc | 1m 9s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 29s | the patch passed |
| +1 | compile | 1m 21s | the patch passed |
| +1 | javac | 1m 21s | the patch passed |
| -0 | checkstyle | 1m 5s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 1235 unchanged - 18 fixed = 1236 total (was 1253) |
| +1 | mvnsite | 0m 58s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 9s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 2m 3s | hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 51s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 136m 58s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 193m 32s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
| | Increment of volatile field org.apache.hadoop.hdfs.server.namenode.NameCache.size in org.apache.hadoop.hdfs.server.namenode.NameCache.put(byte[]) At NameCache.java:[line 119] |
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
| | hadoop.hdfs.server.namenode.TestMetaSave |
| | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12051 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908462/HDFS-12051.09.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 3849f6cfea35 3.13.0-135-generic
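For context on the findbugs -1 in the report above: incrementing a volatile field is a non-atomic read-modify-write, so concurrent put() calls can lose updates. A common fix is java.util.concurrent.atomic.AtomicInteger. The sketch below is a standalone illustration of that pattern, not the actual NameCache code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// `volatile int size; size++;` compiles to read, add, write -- two threads can
// read the same value and one increment is lost. AtomicInteger makes the
// read-modify-write a single atomic operation.
class CounterSketch {
    private final AtomicInteger size = new AtomicInteger();

    int put() {
        return size.incrementAndGet(); // atomic, unlike volatile++
    }

    int size() {
        return size.get();
    }

    /** Runs `threads` threads doing `perThread` puts each; returns the final size. */
    static int raceTest(int threads, int perThread) {
        CounterSketch c = new CounterSketch();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.put();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        // With AtomicInteger this is always exactly threads * perThread.
        return c.size();
    }
}
```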
[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation
[ https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346223#comment-16346223 ]

genericqa commented on HDFS-13043:
----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 57s | trunk passed |
| +1 | compile | 0m 53s | trunk passed |
| +1 | checkstyle | 0m 38s | trunk passed |
| +1 | mvnsite | 0m 58s | trunk passed |
| +1 | shadedclient | 11m 35s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 56s | trunk passed |
| +1 | javadoc | 0m 53s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 57s | the patch passed |
| +1 | compile | 0m 50s | the patch passed |
| +1 | cc | 0m 50s | the patch passed |
| +1 | javac | 0m 50s | the patch passed |
| +1 | checkstyle | 0m 35s | the patch passed |
| +1 | mvnsite | 0m 55s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 47s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 2s | the patch passed |
| +1 | javadoc | 0m 50s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 152m 47s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 1m 2s | The patch does not generate ASF License warnings. |
| | | 204m 30s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSClientRetries |
| | hadoop.hdfs.TestPread |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
| | hadoop.hdfs.server.datanode.TestDataNodeUUID |
| | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.TestLocalDFS |
| | hadoop.hdfs.server.federation.metrics.TestFederationMetrics |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13043 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908459/HDFS-13043.007.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc |
| uname | Linux 0b729fdb416b 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage
[ https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346195#comment-16346195 ]

genericqa commented on HDFS-12997:
----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 18m 46s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 17m 46s | trunk passed |
| +1 | compile | 1m 4s | trunk passed |
| +1 | checkstyle | 0m 44s | trunk passed |
| +1 | mvnsite | 1m 7s | trunk passed |
| +1 | shadedclient | 12m 36s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 5s | trunk passed |
| +1 | javadoc | 0m 58s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 4s | the patch passed |
| +1 | compile | 0m 56s | the patch passed |
| +1 | javac | 0m 56s | hadoop-hdfs-project_hadoop-hdfs generated 0 new + 392 unchanged - 2 fixed = 392 total (was 394) |
| +1 | checkstyle | 0m 38s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 152 unchanged - 7 fixed = 152 total (was 159) |
| +1 | mvnsite | 1m 2s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 18s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 0m 59s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 114m 1s | hadoop-hdfs in the patch failed. |
| -1 | asflicense | 0m 25s | The patch generated 4 ASF License warnings. |
| | | 187m 24s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReconstructStripedFile |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
| | hadoop.hdfs.TestRollingUpgrade |
| | hadoop.hdfs.TestAbandonBlock |
| | hadoop.hdfs.TestDecommission |
| | hadoop.hdfs.TestPread |
| | hadoop.hdfs.server.datanode.TestRefreshNamenodes |
| | hadoop.hdfs.TestReplaceDatanodeFailureReplication |
| | hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy |
| | hadoop.hdfs.server.namenode.ha.TestHAAppend |
| | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12997 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908456/HDFS-12997.007.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux a4a183eca85c 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision |
[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation
[ https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346188#comment-16346188 ]

Yiqun Lin commented on HDFS-13043:
----------------------------------

The patch almost looks good to me. Would you fix the related test failure? It seems to fail at these lines:
{code}
+ assertEquals("", json.get("lastMembershipUpdate"));
+ assertEquals("", json.get("lastMountTableUpdate"));
{code}

> RBF: Expose the state of the Routers in the federation
> --
>
> Key: HDFS-13043
> URL: https://issues.apache.org/jira/browse/HDFS-13043
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Íñigo Goiri
> Assignee: Íñigo Goiri
> Priority: Major
> Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, HDFS-13043.005.patch, HDFS-13043.006.patch, HDFS-13043.007.patch, router-info.png
>
> The Router should expose the state of the other Routers in the federation through a user UI.
[jira] [Commented] (HDFS-13083) RBF: Fix doc error setting up client
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346183#comment-16346183 ]

Hudson commented on HDFS-13083:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13585 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13585/])
HDFS-13083. RBF: Fix doc error setting up client. Contributed by (inigoiri: rev 5206b2c7ca479dd53f614d75bab594043a9866e1)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSRouterFederation.md

> RBF: Fix doc error setting up client
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: federation
> Affects Versions: 2.9.0, 3.0.0
> Reporter: tartarus
> Assignee: tartarus
> Priority: Major
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1
> Attachments: HDFS-13083.patch
[jira] [Commented] (HDFS-13089) Add test to validate dfs used and no of blocks when blocks are moved across volumes
[ https://issues.apache.org/jira/browse/HDFS-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346172#comment-16346172 ]

genericqa commented on HDFS-13089:
----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 15m 11s | trunk passed |
| +1 | compile | 0m 51s | trunk passed |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 0m 54s | trunk passed |
| +1 | shadedclient | 9m 57s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 50s | trunk passed |
| +1 | javadoc | 0m 48s | trunk passed |
|| || || || Patch Compile Tests ||
| -1 | mvninstall | 1m 10s | hadoop-hdfs in the patch failed. |
| +1 | compile | 0m 50s | the patch passed |
| +1 | javac | 0m 50s | the patch passed |
| +1 | checkstyle | 0m 31s | the patch passed |
| +1 | mvnsite | 0m 52s | the patch passed |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 54s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 47s | the patch passed |
| +1 | javadoc | 0m 46s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 147m 22s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 193m 49s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
| | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | hadoop.tools.TestHdfsConfigFields |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.namenode.TestDecommissioningStatus |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13089 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908454/HDFS-13089.000.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 5568ef2dedcb 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f9dd5b6 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/22896/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit |
[jira] [Updated] (HDFS-13083) RBF: Fix doc error setting up client
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13083: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: (was: 3.0.0) 3.0.1 2.9.1 2.10.0 3.1.0 Status: Resolved (was: Patch Available) Thanks [~tartarus] for the contribution. Committed to {{trunk}}, {{branch-3.0}}, {{branch-2}}, and {{branch-2.9}}. > RBF: Fix doc error setting up client > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation >Affects Versions: 2.9.0, 3.0.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: Router > Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > HDFS Router Federation doc error > Configuration instructions > {quote} > dfs.namenodes.ns-fed > r1,r2 > > {quote} > then execute : *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with > *{{Couldn't create proxy provider class > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}* > The correct configuration is > {quote} > dfs.{color:#FF}ha{color}.namenodes.ns-fed > r1,r2 > > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
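The corrected client setting from the fix above, written out as the configuration fragment a client's hdfs-site.xml would carry (`r1,r2` is the example Router list from the doc fix):

```xml
<!-- Router-based federation client setting: note the "ha" segment
     in the key, which the original doc omitted. -->
<property>
  <name>dfs.ha.namenodes.ns-fed</name>
  <value>r1,r2</value>
</property>
```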
[jira] [Commented] (HDFS-13083) RBF: Fix doc error setting up client
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346125#comment-16346125 ] tartarus commented on HDFS-13083: - [~elgoiri] Thank you for your optimization. It's okay, no problem. > RBF: Fix doc error setting up client > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation >Affects Versions: 2.9.0, 3.0.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: Router > Fix For: 3.0.0 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12942) Synchronization issue in FSDataSetImpl#moveBlock
[ https://issues.apache.org/jira/browse/HDFS-12942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346120#comment-16346120 ] Ajay Kumar commented on HDFS-12942: --- Test failures are unrelated; all of them passed locally. > Synchronization issue in FSDataSetImpl#moveBlock > > > Key: HDFS-12942 > URL: https://issues.apache.org/jira/browse/HDFS-12942 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-12942.001.patch, HDFS-12942.002.patch, > HDFS-12942.003.patch, HDFS-12942.004.patch, HDFS-12942.005.patch, > HDFS-12942.006.patch > > > FSDataSetImpl#moveBlock works in the following 3 steps: > # first creates a new replicaInfo object > # calls finalizeReplica to finalize it. > # Calls removeOldReplica to remove oldReplica. > A client can potentially append to the old replica between step 1 and 2. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
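The race in the quoted steps can be sketched with a toy model (the names and the generation-stamp check here are made up for illustration; this is not the actual FsDatasetImpl code): if a client appends to the old replica between the copy in step 1 and finalization in step 2, the moved copy is stale unless the state is re-checked before committing.

```java
/**
 * Illustrative sketch of the moveBlock race (made-up names, not the actual
 * FsDatasetImpl code). An append between copying and finalizing leaves the
 * moved copy stale unless the generation stamp is re-checked.
 */
public class MoveBlockSketch {
    /** Generation stamp of the old replica; bumped by every append. */
    static long oldReplicaGenStamp = 1001L;

    /** Step 1: copy the replica, recording the gen stamp at copy time. */
    static long copyReplica() {
        return oldReplicaGenStamp;
    }

    /**
     * Steps 2 and 3: finalize the new copy and remove the old replica,
     * but only if no append happened since the copy was taken.
     */
    static boolean finalizeAndRemove(long copiedGenStamp) {
        return copiedGenStamp == oldReplicaGenStamp; // stale copy: abort
    }
}
```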
[jira] [Commented] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory
[ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346080#comment-16346080 ] Misha Dmitriev commented on HDFS-12051: --- I ran another test in a cluster with ~30M HDFS files, where NameNode uses ~20GB of memory. The measurements, before my change: *Types of duplicate objects:* || *Overhead* || *# objects* || *Unique objects* || *Class name* || | 745,343K (4.0%)| 28,746,601| 9,566,545|byte[]| | 48,014K (0.3%)| 1,533,733| 1,438|int[]| The same table after my change: || *Overhead* || *# objects* || *Unique objects* || *Class name* || | 48,014K (0.3%)| 1,533,702| 1,407|int[]| | 4,906K (< 0.1%)| 9,611,557| 9,553,953|byte[]| In other words, we have about the same number of unique byte[] arrays, but almost none of them have duplicates now. As a result, we saved ~0.6GB of memory. This is the removed overhead of duplicate byte[] arrays (approx. 740MB) minus the size of the cache (approx. 160MB) > Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly > those denoting file/directory names) to save memory > - > > Key: HDFS-12051 > URL: https://issues.apache.org/jira/browse/HDFS-12051 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, > HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, > HDFS-12051.06.patch, HDFS-12051.07.patch, HDFS-12051.08.patch, > HDFS-12051.09.patch > > > When snapshot diff operation is performed in a NameNode that manages several > million HDFS files/directories, NN needs a lot of memory. Analyzing one heap > dump with jxray (www.jxray.com), we observed that duplicate byte[] arrays > result in 6.5% memory overhead, and most of these arrays are referenced by > {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}} > and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}: > {code:java} > 19. 
DUPLICATE PRIMITIVE ARRAYS > Types of duplicate objects: > Ovhd Num objs Num unique objs Class name > 3,220,272K (6.5%) 104749528 25760871 byte[] > > 1,841,485K (3.7%), 53194037 dup arrays (13158094 unique) > 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 > of byte[8](48, 48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, > 116, 45, 109, 45, 48, 48, 48, ...), 237395 of byte[8](48, 48, 48, 48, 48, 49, > 95, 48), 227853 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), > 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 169487 > of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97, 114, > 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, > 95, 48), 108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...) > ... and 45902395 more arrays, of which 13158084 are unique > <-- > org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name > <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode > <-- {j.u.ArrayList} <-- > org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- > org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs > <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- > org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 > elements) ... 
<-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0 > <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java > Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER > 409,830K (0.8%), 13482787 dup arrays (13260241 unique) > 430 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 353 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 352 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 350 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 342 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 340 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 337 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 334 of > byte[32](116, 97, 115, 107, 95, 49,
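The interning approach this issue describes can be illustrated with a minimal sketch (this is not the actual NameCache implementation; class and method names are invented): equal byte[] contents map to one canonical array, so the duplicate copies become garbage-collectable.

```java
import java.util.Arrays;

/**
 * Illustrative sketch only (not the actual HDFS NameCache): a fixed-size,
 * hash-indexed table that interns byte[] contents so that equal
 * file/directory names share a single canonical array.
 */
public class ByteArrayInterner {
    private final byte[][] cache;

    public ByteArrayInterner(int slots) {
        cache = new byte[slots][];
    }

    /** Returns a canonical byte[] with the same contents as b. */
    public byte[] intern(byte[] b) {
        int idx = (Arrays.hashCode(b) & 0x7fffffff) % cache.length;
        byte[] cached = cache[idx];
        if (cached != null && Arrays.equals(cached, b)) {
            return cached; // duplicate eliminated: caller keeps the cached copy
        }
        cache[idx] = b;    // simple overwrite-on-collision policy
        return b;
    }
}
```

Callers store the returned array instead of their own copy; any array that still has duplicates after interning simply missed the cache, which matches the "almost none of them have duplicates now" result above.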
[jira] [Commented] (HDFS-13048) LowRedundancyReplicatedBlocks metric can be negative
[ https://issues.apache.org/jira/browse/HDFS-13048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346064#comment-16346064 ] Chen Liang commented on HDFS-13048: --- Thanks for the catch [~ajisakaa]! +1 on v002 patch. > LowRedundancyReplicatedBlocks metric can be negative > > > Key: HDFS-13048 > URL: https://issues.apache.org/jira/browse/HDFS-13048 > Project: Hadoop HDFS > Issue Type: Bug > Components: metrics >Affects Versions: 3.0.0 >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Major > Attachments: HDFS-13048-sample.patch, HDFS-13048.001.patch, > HDFS-13048.002.patch > > > I'm seeing {{LowRedundancyReplicatedBlocks}} become negative. This should be > 0 or positive. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory
[ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HDFS-12051: -- Status: Patch Available (was: In Progress) I've just updated this patch to make NameNode automatically set the cache size as a percentage of the total heap (by default 1/512th, or a little less than 0.25%). I found in my testing that this works better than the previous fixed-size cache. > Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly > those denoting file/directory names) to save memory > - > > Key: HDFS-12051 > URL: https://issues.apache.org/jira/browse/HDFS-12051 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, > HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, > HDFS-12051.06.patch, HDFS-12051.07.patch, HDFS-12051.08.patch, > HDFS-12051.09.patch
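The heap-fraction sizing described in the update above (1/512th of the total heap) is simple arithmetic; a hedged sketch, with an illustrative class name:

```java
/**
 * Sketch only (illustrative class name): derive the interning-cache budget
 * from the maximum heap size as 1/512th of the heap, i.e. just under 0.2%.
 */
public class NameCacheSizing {
    static long cacheBudgetBytes(long maxHeapBytes) {
        return maxHeapBytes >> 9; // divide by 512
    }
}
```

A 512 MB heap, for instance, yields a 1 MB budget under this rule.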
[jira] [Updated] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory
[ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HDFS-12051: -- Attachment: HDFS-12051.09.patch > Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly > those denoting file/directory names) to save memory > - > > Key: HDFS-12051 > URL: https://issues.apache.org/jira/browse/HDFS-12051 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, > HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, > HDFS-12051.06.patch, HDFS-12051.07.patch, HDFS-12051.08.patch, > HDFS-12051.09.patch
[jira] [Updated] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory
[ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HDFS-12051: -- Status: In Progress (was: Patch Available) > Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly > those denoting file/directory names) to save memory > - > > Key: HDFS-12051 > URL: https://issues.apache.org/jira/browse/HDFS-12051 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, > HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, > HDFS-12051.06.patch, HDFS-12051.07.patch, HDFS-12051.08.patch
[jira] [Updated] (HDFS-13043) RBF: Expose the state of the Routers in the federation
[ https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13043: --- Attachment: HDFS-13043.007.patch > RBF: Expose the state of the Routers in the federation > -- > > Key: HDFS-13043 > URL: https://issues.apache.org/jira/browse/HDFS-13043 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, > HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, > HDFS-13043.005.patch, HDFS-13043.006.patch, HDFS-13043.007.patch, > router-info.png > > > The Router should expose the state of the other Routers in the federation > through a user UI. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router
[ https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346038#comment-16346038 ] Íñigo Goiri commented on HDFS-13044: The patch for branch-3.0 [^HDFS-13044-branch-3.0.000.patch] seems to be running out of memory to run {{TestRouterSafemode}}; not sure why. > RBF: Add a safe mode for the Router > --- > > Key: HDFS-13044 > URL: https://issues.apache.org/jira/browse/HDFS-13044 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, > HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, > HDFS-13044.004.patch, HDFS-13044.005.patch > > > When a Router cannot communicate with the State Store, it should enter into a > safe mode that disallows certain operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12512) RBF: Add WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346035#comment-16346035 ] Íñigo Goiri commented on HDFS-12512: Tested and working fine; we had most of it already there and the refactoring doesn't break it :) Just the comments I left 19/Jan/18 18:11. > RBF: Add WebHDFS > > > Key: HDFS-12512 > URL: https://issues.apache.org/jira/browse/HDFS-12512 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Íñigo Goiri >Assignee: Wei Yan >Priority: Major > Labels: RBF > Attachments: HDFS-12512.000.patch, HDFS-12512.001.patch, > HDFS-12512.002.patch, HDFS-12512.003.patch > > > The Router currently does not support WebHDFS. It needs to implement > something similar to {{NamenodeWebHdfsMethods}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-13090) Support composite trusted channel resolver that supports both whitelist and blacklist
Ajay Kumar created HDFS-13090: - Summary: Support composite trusted channel resolver that supports both whitelist and blacklist Key: HDFS-13090 URL: https://issues.apache.org/jira/browse/HDFS-13090 Project: Hadoop HDFS Issue Type: New Feature Reporter: Ajay Kumar Assignee: Ajay Kumar Support a composite trusted channel resolver that honors both a whitelist and a blacklist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage
[ https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346028#comment-16346028 ] Ajay Kumar commented on HDFS-12997: --- [~xyao], Updated patch v7 to address suggestions. > Move logging to slf4j in BlockPoolSliceStorage and Storage > --- > > Key: HDFS-12997 > URL: https://issues.apache.org/jira/browse/HDFS-12997 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, > HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, > HDFS-12997.006.patch, HDFS-12997.007.patch > > > Move logging to slf4j in BlockPoolSliceStorage and Storage classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage
[ https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-12997: -- Attachment: HDFS-12997.007.patch > Move logging to slf4j in BlockPoolSliceStorage and Storage > --- > > Key: HDFS-12997 > URL: https://issues.apache.org/jira/browse/HDFS-12997 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, > HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, > HDFS-12997.006.patch, HDFS-12997.007.patch > > > Move logging to slf4j in BlockPoolSliceStorage and Storage classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-12528: --- Fix Version/s: 3.0.1 > Add an option to not disable short-circuit reads on failures > > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: Xiao Chen >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, > HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, > HDFS-12528.05.patch > > > We have scenarios where data ingestion makes use of the -appendToFile > operation to add new data to existing HDFS files. In these situations, we're > frequently running into the problem described below. > We're using Impala to query the HDFS data with short-circuit reads (SCR) > enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce > the memory footprint. In some cases, though, Impala still keeps the HDFS file > handle open for reuse. > The "unbuffer" call, however, causes the file's current block reader to be > closed, which makes the associated ShortCircuitReplica evictable from the > ShortCircuitCache. When the cluster is under load, this means that the > ShortCircuitReplica can be purged off the cache pretty fast, which closes the > file descriptor to the underlying storage file. > That means that when Impala re-reads the file it has to re-open the storage > files associated with the ShortCircuitReplica's that were evicted from the > cache. If there were no appends to those blocks, the re-open will succeed > without problems. 
If one block was appended since the ShortCircuitReplica was > created, the re-open will fail with the following error: > {code} > Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 > not found > {code} > This error is handled as an "unknown response" by the BlockReaderFactory [1], > which disables short-circuit reads for 10 minutes [2] for the client. > These 10 minutes without SCR can have a big performance impact for the client > operations. In this particular case ("Meta file not found") it would suffice > to return null without disabling SCR. This particular block read would fall > back to the normal, non-short-circuited, path and other SCR requests would > continue to work as expected. > It might also be interesting to be able to control how long SCR is disabled > for in the "unknown response" case. 10 minutes seems a bit too long and not > being able to change that is a problem. > [1] > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646 > [2] > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97
[jira] [Commented] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346009#comment-16346009 ] Hudson commented on HDFS-12528: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13584 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13584/]) HDFS-12528. Add an option to not disable short-circuit reads on (wwei: rev 2e7331ca264dd366b975f3c8e610cf84eb8cc155) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestBlockReaderFactory.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java > Add an option to not disable short-circuit reads on failures > > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: Xiao Chen >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, > HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, > HDFS-12528.05.patch > >
[jira] [Commented] (HDFS-13062) Provide support for JN to use separate journal disk per namespace
[ https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345996#comment-16345996 ] Hanisha Koneru commented on HDFS-13062: --- Thanks for updating the patch, [~bharatviswa]. * In {{setConf()}}, why are we getting the nameserviceIds from the config key {{DFS_INTERNAL_NAMESERVICES_KEY}} before {{DFS_NAMESERVICES}}? IIUC from HDFS-6376, which introduced the internal nameservices key, it is meant for datanodes to distinguish between which nameservices to connect to. JournalNodes should not be using this configuration to deduce the nameservice Ids. Please correct me if I am wrong. * In {{getLogDir}}: ** The if condition below should check that the {{dir}} path does not exist in {{localDir}}. The {{dir.exists}} check is not sufficient to ensure that the {{dir}} path has been validated by {{validateAndCreateJournalDir}}. {code:java} else if (!new File(dir.trim()).exists()){{code} ** The above check should be done irrespective of which config setting {{dir}} is configured from. ** In {{getLogDir}}, we can reuse and optimize the File objects. {{new File(dir.trim())}} is created multiple times with the same {{dir}} path. {code}if (dir == null) { dir = conf.get(DFSConfigKeys.DFS_JOURNALNODE_EDITS_DIR_KEY, DFSConfigKeys.DFS_JOURNALNODE_EDITS_DIR_DEFAULT); } File journalDir = new File(dir.trim()); if (!localDir.contains(journalDir)) { validateAndCreateJournalDir(journalDir); } {code} > Provide support for JN to use separate journal disk per namespace > - > > Key: HDFS-13062 > URL: https://issues.apache.org/jira/browse/HDFS-13062 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Major > Attachments: HDFS-13062.00.patch, HDFS-13062.01.patch, > HDFS-13062.02.patch, HDFS-13062.03.patch, HDFS-13062.04.patch > > > In Federated HA setup, provide support for separate journal disk for each > namespace. 
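The refactoring suggested in the review above (construct the {{File}} once, and validate only directories not already tracked) can be sketched as follows. {{JournalDirResolver}}, the default path, and the no-op validator are hypothetical stand-ins for the JournalNode members referenced in the comment, not the actual HDFS code:

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;

// Sketch of the review suggestion: build the File object once instead of
// calling new File(dir.trim()) repeatedly, and validate a journal dir only
// if it is not already in the tracked set (an exists() check alone does not
// prove the path was validated). Names are simplified stand-ins.
public class JournalDirResolver {
    // Stand-in for DFSConfigKeys.DFS_JOURNALNODE_EDITS_DIR_DEFAULT.
    private static final String DEFAULT_EDITS_DIR = "/tmp/hadoop/dfs/journalnode";

    // Stand-in for the JournalNode's collection of validated directories.
    private final Set<File> localDir = new HashSet<>();

    File getLogDir(String dir) {
        if (dir == null) {
            dir = DEFAULT_EDITS_DIR;             // fall back to the default
        }
        File journalDir = new File(dir.trim());  // constructed once, reused
        if (!localDir.contains(journalDir)) {    // validate only unseen dirs
            validateAndCreateJournalDir(journalDir);
            localDir.add(journalDir);
        }
        return journalDir;
    }

    // Placeholder: the real JournalNode checks the path is absolute and
    // creates it; omitted here to keep the sketch side-effect free.
    private void validateAndCreateJournalDir(File dir) {
    }

    public static void main(String[] args) {
        JournalDirResolver r = new JournalDirResolver();
        System.out.println(r.getLogDir(" /data/jn "));
        System.out.println(r.getLogDir(null));
    }
}
```

Tracking validated directories in a set also makes the validation independent of which config key {{dir}} came from, which is the second point raised in the review.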
[jira] [Commented] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345993#comment-16345993 ] Weiwei Yang commented on HDFS-12528: Thanks [~xiaochen] for providing the fix, thanks to [~jzhuge] for the UT and review, and also to [~GeLiXin] for the reviews. I just committed this to trunk and branch-3.0. > Add an option to not disable short-circuit reads on failures > > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: Xiao Chen >Priority: Major > Fix For: 3.1.0, 3.0.1 > > Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, > HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, > HDFS-12528.05.patch > >
[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-12528: --- Release Note: Add an option to not disable short-circuit reads on failures, by setting dfs.domain.socket.disable.interval.seconds to 0. > Add an option to not disable short-circuit reads on failures > > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: Xiao Chen >Priority: Major > Fix For: 3.1.0 > > Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, > HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, > HDFS-12528.05.patch > >
[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-12528: --- Release Note: Added an option to not disable short-circuit reads on failures, by setting dfs.domain.socket.disable.interval.seconds to 0. (was: Add an option to not disable short-circuit reads on failures, by setting dfs.domain.socket.disable.interval.seconds to 0.) > Add an option to not disable short-circuit reads on failures > > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: Xiao Chen >Priority: Major > Fix For: 3.1.0 > > Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, > HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, > HDFS-12528.05.patch > >
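The release note above names the new client-side property. As a minimal configuration sketch, the client would opt out of the disable behavior like this; the default of 600 seconds is inferred from the "10 minutes" in the issue description, not verified against the committed patch:

```xml
<!-- hdfs-site.xml on the client (e.g. the Impala host). Setting the
     interval to 0 asks the client to keep short-circuit reads enabled
     even after an "unknown response" failure. -->
<property>
  <name>dfs.domain.socket.disable.interval.seconds</name>
  <!-- assumed default: 600 (10 minutes); 0 = never disable SCR on failures -->
  <value>0</value>
</property>
```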
[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345983#comment-16345983 ] Daryn Sharp commented on HDFS-10285: *StoragePolicySatisifier* Should {{handleException}} use a double-checked lock to avoid synchronization? Unexpected exceptions should be a rarity, right? Speaking of which, it’s not safe to ignore all {{Throwable}} in the run loop! You have no idea if data structures are in a sane or consistent state. *FSTreeTraverser* I need to study this more but I have grave concerns that this will work correctly in a mutating namesystem. Ex. renames and deletes esp. in combination with snapshots. Looks like there's a chance it will go off in the weeds when backtracking out of a renamed directory. {{traverseDir}} may NPE if it's traversing a tree in a snapshot and one of the ancestors is deleted. Not sure why it's bothering to re-check permissions during the crawl. The storage policy is inherited by the entire tree, regardless of whether the sub-contents are accessible. The effect of this patch is that the storage policy is enforced for all readable files, existing non-readable files are left violating the new storage policy, and newly created non-readable files will conform to it. Very convoluted. Since new files will conform, should just process the entire tree. 
> Storage Policy Satisfier in Namenode > > > Key: HDFS-10285 > URL: https://issues.apache.org/jira/browse/HDFS-10285 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: HDFS-10285 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Attachments: HDFS-10285-consolidated-merge-patch-00.patch, > HDFS-10285-consolidated-merge-patch-01.patch, > HDFS-10285-consolidated-merge-patch-02.patch, > HDFS-10285-consolidated-merge-patch-03.patch, > HDFS-10285-consolidated-merge-patch-04.patch, > HDFS-10285-consolidated-merge-patch-05.patch, > HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, > Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, > Storage-Policy-Satisfier-in-HDFS-May10.pdf, > Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf > > > Heterogeneous storage in HDFS introduced the concept of storage policy. These > policies can be set on a directory/file to specify the user's preference for > where to store the physical blocks. When the user sets the storage policy before > writing data, the blocks can take advantage of the storage policy preference > and the physical blocks are stored accordingly. > If the user sets the storage policy after writing and completing the file, then > the blocks would already have been written with the default storage policy (nothing > but DISK). The user has to run the ‘Mover tool’ explicitly by specifying all such > file names as a list. In some distributed system scenarios (ex: HBase) it > would be difficult to collect all the files and run the tool, as different > nodes can write files separately and files can have different paths. > Another scenario is, when the user renames a file from a directory with one > effective storage policy (inherited from the parent directory) to a directory > with another effective storage policy, the rename will not copy the inherited > storage policy from the source. So the file will take the storage policy of > the destination file/dir's parent directory. 
This rename operation is just a metadata change in the Namenode. The > physical blocks still remain with the source storage policy. > So, tracking all such business-logic-based file names from distributed > nodes (ex: region servers) and running the Mover tool could be difficult for > admins. > Here the proposal is to provide an API from the Namenode itself to trigger > storage policy satisfaction. A daemon thread inside the Namenode would track > such calls and send movement commands to the DNs. > Will post the detailed design thoughts document soon.
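The double-checked locking suggested above for {{handleException}} could look like the following minimal sketch: the common case (no error seen yet, or already disabled) reads a volatile flag without taking the monitor, and the lock is only entered on the rare first failure. Class and method names here are hypothetical stand-ins, not code from the HDFS-10285 branch.

```java
// Minimal double-checked-lock sketch (hypothetical names, not the SPS code).
public class DoubleCheckedDisable {
    private volatile boolean disabled = false;
    private final Object lock = new Object();

    public void disableOnError() {
        if (!disabled) {              // first check: lock-free fast path
            synchronized (lock) {
                if (!disabled) {      // second check under the lock
                    // one-time cleanup / state transition would go here
                    disabled = true;
                }
            }
        }
    }

    public boolean isDisabled() {
        return disabled;              // volatile read, safe without the lock
    }

    public static void main(String[] args) {
        DoubleCheckedDisable d = new DoubleCheckedDisable();
        d.disableOnError();
        d.disableOnError();           // second call never enters the monitor
        System.out.println(d.isDisabled()); // prints true
    }
}
```

The volatile field is what makes the lock-free first check safe; without it the pattern is broken under the Java memory model.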
[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated HDFS-12528: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.1.0 Status: Resolved (was: Patch Available) > Add an option to not disable short-circuit reads on failures > > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: Xiao Chen >Priority: Major > Fix For: 3.1.0 > > Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, > HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, > HDFS-12528.05.patch > >
[jira] [Updated] (HDFS-13089) Add test to validate dfs used and no of blocks when blocks are moved across volumes
[ https://issues.apache.org/jira/browse/HDFS-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-13089: -- Status: Patch Available (was: Open) > Add test to validate dfs used and no of blocks when blocks are moved across > volumes > --- > > Key: HDFS-13089 > URL: https://issues.apache.org/jira/browse/HDFS-13089 > Project: Hadoop HDFS > Issue Type: Test >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-13089.000.patch > > > Add test to validate dfs used and no of blocks when blocks are moved across > volumes
[jira] [Updated] (HDFS-13089) Add test to validate dfs used and no of blocks when blocks are moved across volumes
[ https://issues.apache.org/jira/browse/HDFS-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-13089: -- Attachment: HDFS-13089.000.patch
[jira] [Created] (HDFS-13089) Add test to validate dfs used and no of blocks when blocks are moved across volumes
Ajay Kumar created HDFS-13089: - Summary: Add test to validate dfs used and no of blocks when blocks are moved across volumes Key: HDFS-13089 URL: https://issues.apache.org/jira/browse/HDFS-13089 Project: Hadoop HDFS Issue Type: Test Reporter: Ajay Kumar Assignee: Ajay Kumar Add test to validate dfs used and no of blocks when blocks are moved across volumes
[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
[ https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345954#comment-16345954 ] genericqa commented on HDFS-13060: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 4s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 39s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 32m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 26s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 14s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 46s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 21s{color} | {color:orange} root: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 30s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 58s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 15s{color} | {color:green} hadoop-common in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}142m 13s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 51s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}273m 18s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.server.datanode.TestDataNodeUUID | | | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA | | | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations | | | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.server.namenode.TestMetaSave | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | \\ \\ ||
[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel
[ https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345866#comment-16345866 ] genericqa commented on HDFS-13061: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 11s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 32s{color} | {color:green} hadoop-hdfs-project generated 0 new + 433 unchanged - 1 fixed = 433 total (was 434) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 21s{color} | {color:green} hadoop-hdfs-client in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}115m 4s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}176m 58s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency | | | hadoop.hdfs.TestHDFSFileSystemContract | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-13061 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908397/HDFS-13061.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 72ce7cf6f7de 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f9dd5b6 | | maven | version: Apache Maven 3.3.9 | | Default
[jira] [Updated] (HDFS-7240) Scaling HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Summary: Scaling HDFS (was: Object store in HDFS) > Scaling HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey >Priority: Major > Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, > HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, > HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, > HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, > Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf > > > [^HDFS Scalability-v2.pdf] describes areas where HDFS does well, its > scaling challenges, and how to address those challenges. Scaling HDFS requires > scaling both the namespace layer and the block layer. _This jira provides a > new block layer, the Hadoop Distributed Storage Layer (HDSL), that scales the > block layer by grouping blocks into containers, thereby shrinking the > block-to-location map and also reducing the number of block reports and their > processing._ > _A scalable namespace can be put on top of this scalable block layer:_ > * _HDFS-10419 describes how the existing NN can be modified to use the new > block layer._ > * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat > Key-Value namespace on top of the new block layer; while it does not provide > the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, > FileContext)._ > > Old Description > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer, i.e. the datanodes.
> In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
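The description above argues that grouping blocks into containers shrinks the block-to-location map (and the block-report volume) that the master must handle. The arithmetic behind that claim can be sketched as follows; the function name and the container size of 1000 blocks are illustrative assumptions for this sketch, not actual HDSL/Ozone code or constants.

```python
# Hypothetical sketch of the HDFS-7240 scaling argument: if the master
# tracks one location entry per *container* instead of per *block*, the
# in-memory map shrinks by roughly the container size.
import math

def location_map_entries(num_blocks: int, blocks_per_container: int = 1) -> int:
    """Entries the master must keep. blocks_per_container=1 models classic
    HDFS (one entry per block); a larger value models container grouping."""
    return math.ceil(num_blocks / blocks_per_container)

blocks = 1_000_000_000  # a cluster with a billion blocks
print(location_map_entries(blocks))        # per-block map: 1000000000 entries
print(location_map_entries(blocks, 1000))  # 1000-block containers: 1000000
```

The same ratio applies to block reports: a datanode reporting per container sends roughly a thousandth of the entries it would report per block, under the assumed container size.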
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Description: [^HDFS Scalability-v2.pdf] describes areas where HDFS does well and its scaling challenges and how to address those challenges. Scaling HDFS requires scaling the namespace layer and also the block layer. _This jira provides a new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by grouping blocks into containers thereby reducing the block-to-location map and also reducing the number of block reports and their processing_ _A scalable namespace can be put on top this scalable block layer:_ * _HDFS-10419 describes how the existing NN can be modified to use the new block layer._ * _HDFS-13074 also provides, as an_ *_interim_* _step; a scalable flat Key-Value namespace on top of the new block layer; while it does not provide the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._ Old Description This jira proposes to add object store capabilities into HDFS. As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer i.e. datanodes. In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata. I will soon update with a detailed design document. was: [Scaling HDFS| x] describes areas where HDFS does well and its scaling challenges and how to address those challenges. Scaling HDFS requires scaling the namespace layer and also the block layer. 
_This jira provides a new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by grouping blocks into containers thereby reducing the block-to-location map and also reducing the number of block reports and their processing_ _A scalable namespace can be put on top this scalable block layer:_ * _HDFS-10419 describes how the existing NN can be modified to use the new block layer._ * _HDFS-13074 also provides, as an_ *_interim_* _step; a scalable flat KV namespace on top of the new block layer; while it does not provide the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._ Old Description This jira proposes to add object store capabilities into HDFS. As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer i.e. datanodes. In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata. I will soon update with a detailed design document. > Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey >Priority: Major > Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, > HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, > HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, > HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, > Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf > > > [^HDFS Scalability-v2.pdf] describes areas where HDFS does well and its > scaling challenges and how to address those challenges. Scaling HDFS requires > scaling the namespace layer and also the block layer. 
_This jira provides a > new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the > block layer by grouping blocks into containers thereby reducing the > block-to-location map and also reducing the number of block reports and their > processing_ > _A scalable namespace can be put on top this scalable block layer:_ > * _HDFS-10419 describes how the existing NN can be modified to use the new > block layer._ > * _HDFS-13074 also provides, as an_ *_interim_* _step; a scalable flat > Key-Value namespace on top of the new block layer; while it does not provide > the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, > FileContext)._ > > Old Description > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. -- This message was sent by
[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation
[ https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345843#comment-16345843 ] genericqa commented on HDFS-13043: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 51s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 24s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}132m 9s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}177m 33s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA | | | hadoop.hdfs.server.federation.metrics.TestFederationMetrics | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.namenode.TestReencryptionWithKMS | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-13043 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908394/HDFS-13043.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 52f04c6127fb 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f9dd5b6 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Description: [Scaling HDFS| x] describes areas where HDFS does well and its scaling challenges and how to address those challenges. Scaling HDFS requires scaling the namespace layer and also the block layer. _This jira provides a new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by grouping blocks into containers thereby reducing the block-to-location map and also reducing the number of block reports and their processing_ _A scalable namespace can be put on top this scalable block layer:_ * _HDFS-10419 describes how the existing NN can be modified to use the new block layer._ * _HDFS-13074 also provides, as an_ *_interim_* _step; a scalable flat KV namespace on top of the new block layer; while it does not provide the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._ Old Description This jira proposes to add object store capabilities into HDFS. As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer i.e. datanodes. In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata. I will soon update with a detailed design document. was: [^HDFS Scalability-v2.pdf] describes areas where HDFS does well and its scaling challenges and how to address those challenges. Scaling HDFS requires scaling the namespace layer and also the block layer. 
_This jira provides a new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by grouping blocks into containers thereby reducing the block-to-location map and also reducing the number of block reports and their processing_ _A scalable namespace can be put on top this scalable block layer:_ * _HDFS-10419 describes how the existing NN can be modified to use the new block layer._ * _HDFS-13074 also provides, as an_ *_interim_* _step; a scalable flat KV namespace on top of the new block layer; while it does not provide the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._ Old Description This jira proposes to add object store capabilities into HDFS. As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer i.e. datanodes. In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata. I will soon update with a detailed design document. > Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey >Priority: Major > Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, > HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, > HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, > HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, > Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf > > > [Scaling HDFS| x] describes areas where HDFS does well and its scaling > challenges and how to address those challenges. Scaling HDFS requires scaling > the namespace layer and also the block layer. 
_This jira provides a new block > layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer > by grouping blocks into containers thereby reducing the block-to-location map > and also reducing the number of block reports and their processing_ > _A scalable namespace can be put on top this scalable block layer:_ > * _HDFS-10419 describes how the existing NN can be modified to use the new > block layer._ > * _HDFS-13074 also provides, as an_ *_interim_* _step; a scalable flat KV > namespace on top of the new block layer; while it does not provide the HDFS > API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._ > > > Old Description > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. -- This message was sent by Atlassian JIRA
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Description: [^HDFS Scalability-v2.pdf] describes areas where HDFS does well and its scaling challenges and how to address those challenges. Scaling HDFS requires scaling the namespace layer and also the block layer. _This jira provides a new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by grouping blocks into containers thereby reducing the block-to-location map and also reducing the number of block reports and their processing_ _A scalable namespace can be put on top this scalable block layer:_ * _HDFS-10419 describes how the existing NN can be modified to use the new block layer._ * _HDFS-13074 also provides, as an_ *_interim_* _step; a scalable flat KV namespace on top of the new block layer; while it does not provide the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._ Old Description This jira proposes to add object store capabilities into HDFS. As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer i.e. datanodes. In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata. I will soon update with a detailed design document. was: This jira proposes to add object store capabilities into HDFS. As part of the federation work (HDFS-1052) we separated block storage as a generic storage layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the storage layer i.e. datanodes. In this jira I will explore building an object store using the datanode storage, but independent of namespace metadata. I will soon update with a detailed design document. 
> Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey >Priority: Major > Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, > HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, > HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, > HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, > Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf > > > [^HDFS Scalability-v2.pdf] describes areas where HDFS does well and its > scaling challenges and how to address those challenges. Scaling HDFS requires > scaling the namespace layer and also the block layer. _This jira provides a > new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the > block layer by grouping blocks into containers thereby reducing the > block-to-location map and also reducing the number of block reports and their > processing_ > _A scalable namespace can be put on top this scalable block layer:_ > * _HDFS-10419 describes how the existing NN can be modified to use the new > block layer._ > * _HDFS-13074 also provides, as an_ *_interim_* _step; a scalable flat KV > namespace on top of the new block layer; while it does not provide the HDFS > API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._ > > > Old Description > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. 
[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage
[ https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345828#comment-16345828 ] genericqa commented on HDFS-12997: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 49s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s{color} | {color:green} hadoop-hdfs-project_hadoop-hdfs generated 0 new + 392 unchanged - 2 fixed = 392 total (was 394) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 153 unchanged - 7 fixed = 153 total (was 160) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 36s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}149m 0s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.TestReconstructStripedFile | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12997 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908395/HDFS-12997.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6378e0117699 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f9dd5b6 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/22894/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/22894/testReport/ | | Max.
[jira] [Updated] (HDFS-7240) Object store in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-7240: --- Attachment: HDFS Scalability-v2.pdf > Object store in HDFS > > > Key: HDFS-7240 > URL: https://issues.apache.org/jira/browse/HDFS-7240 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Jitendra Nath Pandey >Priority: Major > Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, > HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, > HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, > HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, > Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf > > > This jira proposes to add object store capabilities into HDFS. > As part of the federation work (HDFS-1052) we separated block storage as a > generic storage layer. Using the Block Pool abstraction, new kinds of > namespaces can be built on top of the storage layer i.e. datanodes. > In this jira I will explore building an object store using the datanode > storage, but independent of namespace metadata. > I will soon update with a detailed design document. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica
[ https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345722#comment-16345722 ] genericqa commented on HDFS-11187:
--
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 7s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 16s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-11187 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908379/HDFS-11187.005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 164716be0a39 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 901d15a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/22888/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/22888/testReport/ |
| Max. process+thread count | 4147 (vs. ulimit of 5000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/22888/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
> Optimize disk access for last partial chunk checksum of
[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage
[ https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345716#comment-16345716 ] Xiaoyu Yao commented on HDFS-12997:
---
Thanks [~ajayydv] for the update. The patch looks good to me. Two more minor issues; +1 after those are fixed.

BlockPoolSliceStorage.java
Line 169-170: "formatted for {}" is redundant. We can remove it along with the second "nsInfo.getBlockPoolID()".

NNStorage.java
Line 866-867: can we wrap these two calls inside if (LOG.isDebugEnabled()) to minimize the performance impact of logging?
Line 882-883: same as above.
> Move logging to slf4j in BlockPoolSliceStorage and Storage
> ---
>
> Key: HDFS-12997
> URL: https://issues.apache.org/jira/browse/HDFS-12997
> Project: Hadoop HDFS
> Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch,
> HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch,
> HDFS-12997.006.patch
>
>
> Move logging to slf4j in BlockPoolSliceStorage and Storage classes.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
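The guard suggested in the review above can be sketched as follows. This is an illustrative example only, not the actual NNStorage code; the patch itself uses slf4j's {{LOG.isDebugEnabled()}}, while this sketch uses java.util.logging from the JDK purely to stay self-contained. The point is the same: when the debug level is disabled, the expensive argument is never computed.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch (not Hadoop code) of guarding an expensive log argument.
class GuardedLogging {
  private static final Logger LOG = Logger.getLogger("GuardedLogging");

  // Counts how often the expensive argument is actually computed.
  static int expensiveCalls = 0;

  static String expensiveSummary() {
    expensiveCalls++;               // stand-in for a costly serialization
    return "storage-state-summary";
  }

  static void logStorageState() {
    // With the default level (INFO), FINE is disabled, so the guard
    // skips the expensiveSummary() call entirely.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("Storage state: " + expensiveSummary());
    }
  }
}
```

Without the guard, the argument would be built on every call regardless of the configured level, which is exactly the overhead the review comment is asking to avoid.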
[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router
[ https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345705#comment-16345705 ] genericqa commented on HDFS-13044:
--
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-3.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 49s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 44s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} branch-3.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 41s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}111m 2s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:7 |
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
| | hadoop.hdfs.server.blockmanagement.TestNodeCount |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
| | hadoop.hdfs.TestDistributedFileSystem |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
| | hadoop.hdfs.server.federation.router.TestRouterSafemode |
| | hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer |
| | hadoop.hdfs.TestReadStripedFileWithDecodingCorruptData |
| | hadoop.hdfs.server.datanode.TestDataNodeECN |
| | hadoop.hdfs.server.namenode.TestFSImageWithAcl |
| | hadoop.hdfs.TestDFSClientRetries |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
| | hadoop.hdfs.TestDFSPermission |
| | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerWithStripedBlocks |
| | hadoop.hdfs.TestGetBlocks |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
| | hadoop.hdfs.TestErasureCodingPolicies |
| | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
| |
[jira] [Comment Edited] (HDFS-13069) Enable HDFS to cache data read from external storage systems
[ https://issues.apache.org/jira/browse/HDFS-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345673#comment-16345673 ] Virajith Jalaparti edited comment on HDFS-13069 at 1/30/18 7:55 PM:

The design for caching PROVIDED files is as follows:
1) The onus of caching any file will be on the client. If a particular file has to be cached, a {{readThrough}} flag will be set to {{true}} as part of the {{CachingStrategy}}.
2) When a Datanode receives a {{readBlock}} request for a PROVIDED block with the {{readThrough}} flag set to {{true}}, it creates a {{TEMPORARY}} local replica. As data is streamed back to the client, it will be written to this temporary replica (synchronously or asynchronously).
3) When the stream to the client closes, the Datanode attempts to finalize the replica. If the block is not fully read by the client, the Datanode pages in the remaining data from the PROVIDED store. The local replica is then finalized and the NN is notified ({{FsDatasetSpi#notifyNamenodeForNewReplica}} is called).
4) In the above, if any failure occurs between the creation of the temporary replica and the replica being finalized, it will be cleaned up. If the Datanode goes down while this happens, the temporary replica will be handled as any other temporary replica.

Note that if the Namenode gets notified of the new replica in (4) above, it would be deleted as an excess replica. To prevent this, we will introduce a new kind of per-file replication factor called "over-replication" with the following semantics: the NN will allow (replication + over-replication) replicas to exist for any block. The current semantics of replication will continue to be enforced (i.e., if the number of replicas of a block is less than the replication factor, the NN will schedule replications), but no replications will be scheduled to meet the over-replication.

Thus, to enable read-through caching, the client has to first set the appropriate over-replication for the PROVIDED files and then read the data. If needed, the over-replication can also be set as part of the image generation process for the PROVIDED storage (HDFS-9806/HDFS-10706). As long as files in the PROVIDED store are read-only (and thus do not get modified), there will be no need to invalidate the cached replicas. If data in the PROVIDED store changes, the cached replicas will require invalidation.

was (Author: virajith):
The design for caching PROVIDED files is as follows: 1) The onus of caching any file will be on the client. If a particular file has to be cached, a {{readThrough}} flag will be set to {{true}} as part of the {{CachingStrategy}}. 2) When a Datanode receives a {{readBlock}} request, for a PROVIDED block, with the {{readThrough}} flag set to {{true}}, it creates a {{TEMPORARY}} local replica. As data is streamed back to the client, it will be written this temporary replica (can be synchronous or asynchronous). 3) When the stream to the client closes, the Datanode attempts to finalize the replica. If the block is not fully read by the client, the Datanode pages in the remaining data from the PROVIDED store. The local replica is then finalized and the NN is notified ({{FsDatasetSpi#notifyNamenodeForNewReplica}} is called). 4) In the above, if any failure occurs between the creation of the temporary replica and the replica being finalized, it will be cleaned up. If the Datanode goes down while this happens, the temporary replica will be handled as any other temporary replica. Note that if the Namenode gets notified of the new replica in (4) above, it would be deleted as an excess replica. To prevent this, we will introduce a new kind of per-file replication factor called "over-replication" with the following semantics: the NN will allow (replication + over-replication) number of replicas to exist for any block.
The current semantics of replication will continue to be enforced (i.e., if the number of replicas of a block are less than the replication factor, the NN will schedule replications) but no replications will be scheduled to meet the over-replication. So to enable read-through caching, the client has to first set the appropriate over-replication for the PROVIDED files and then read the data. If needed, the over-replication can also be set as part of the image generation process for the PROVIDED storage (HDFS-9806/HDFS-10706) > Enable HDFS to cache data read from external storage systems > > > Key: HDFS-13069 > URL: https://issues.apache.org/jira/browse/HDFS-13069 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Virajith Jalaparti >Priority: Major > > With {{PROVIDED}} storage (HDFS-9806), HDFS can address data stored in > external
[jira] [Created] (HDFS-13088) Allow HDFS files/blocks to be over-replicated.
Virajith Jalaparti created HDFS-13088:
-
Summary: Allow HDFS files/blocks to be over-replicated.
Key: HDFS-13088
URL: https://issues.apache.org/jira/browse/HDFS-13088
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Virajith Jalaparti

This JIRA is to add a per-file "over-replication" factor to HDFS. As mentioned in HDFS-13069, the over-replication factor will be the number of excess replicas that will be allowed to exist for a file or block. This is beneficial if the application deems additional replicas for a file are needed. In the case of HDFS-13069, it would allow copies of data in PROVIDED storage to be cached locally in HDFS in a read-through manner. The Namenode will not proactively meet the over-replication, i.e., it does not schedule replications when the number of replicas for a block is less than (replication factor + over-replication factor), as long as it is at least the replication factor of the file.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
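The proposed over-replication semantics can be sketched as two small predicates. The class and method names below are hypothetical, not actual NameNode code: replication is enforced exactly as today, while replicas beyond (replication + over-replication) count as excess.

```java
// Sketch of the proposed over-replication semantics (hypothetical names,
// not actual NameNode code).
class OverReplicationPolicy {

  // The NN schedules new replicas only while the block is under-replicated,
  // i.e. the over-replication target is never proactively met.
  static boolean needsReplication(int liveReplicas, int replication) {
    return liveReplicas < replication;
  }

  // Only replicas beyond (replication + overReplication) may be deleted
  // as excess; cached extra replicas within that bound are left alone.
  static int excessReplicas(int liveReplicas, int replication, int overReplication) {
    return Math.max(0, liveReplicas - (replication + overReplication));
  }
}
```

For example, with replication 3 and over-replication 1, a fourth (cached) replica is neither re-created by the NN when it is lost nor deleted as excess while it exists.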
[jira] [Commented] (HDFS-12512) RBF: Add WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345680#comment-16345680 ] Wei Yan commented on HDFS-12512: [~elgoiri] any comments from your test cluster? :) > RBF: Add WebHDFS > > > Key: HDFS-12512 > URL: https://issues.apache.org/jira/browse/HDFS-12512 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Íñigo Goiri >Assignee: Wei Yan >Priority: Major > Labels: RBF > Attachments: HDFS-12512.000.patch, HDFS-12512.001.patch, > HDFS-12512.002.patch, HDFS-12512.003.patch > > > The Router currently does not support WebHDFS. It needs to implement > something similar to {{NamenodeWebHdfsMethods}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13069) Enable HDFS to cache data read from external storage systems
[ https://issues.apache.org/jira/browse/HDFS-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345673#comment-16345673 ] Virajith Jalaparti commented on HDFS-13069:
---
The design for caching PROVIDED files is as follows:
1) The onus of caching any file will be on the client. If a particular file has to be cached, a {{readThrough}} flag will be set to {{true}} as part of the {{CachingStrategy}}.
2) When a Datanode receives a {{readBlock}} request for a PROVIDED block with the {{readThrough}} flag set to {{true}}, it creates a {{TEMPORARY}} local replica. As data is streamed back to the client, it will be written to this temporary replica (synchronously or asynchronously).
3) When the stream to the client closes, the Datanode attempts to finalize the replica. If the block is not fully read by the client, the Datanode pages in the remaining data from the PROVIDED store. The local replica is then finalized and the NN is notified ({{FsDatasetSpi#notifyNamenodeForNewReplica}} is called).
4) In the above, if any failure occurs between the creation of the temporary replica and the replica being finalized, it will be cleaned up. If the Datanode goes down while this happens, the temporary replica will be handled as any other temporary replica.

Note that if the Namenode gets notified of the new replica in (4) above, it would be deleted as an excess replica. To prevent this, we will introduce a new kind of per-file replication factor called "over-replication" with the following semantics: the NN will allow (replication + over-replication) replicas to exist for any block. The current semantics of replication will continue to be enforced (i.e., if the number of replicas of a block is less than the replication factor, the NN will schedule replications), but no replications will be scheduled to meet the over-replication.

So to enable read-through caching, the client has to first set the appropriate over-replication for the PROVIDED files and then read the data. If needed, the over-replication can also be set as part of the image generation process for the PROVIDED storage (HDFS-9806/HDFS-10706)
> Enable HDFS to cache data read from external storage systems
>
>
> Key: HDFS-13069
> URL: https://issues.apache.org/jira/browse/HDFS-13069
> Project: Hadoop HDFS
> Issue Type: Improvement
>Reporter: Virajith Jalaparti
>Priority: Major
>
> With {{PROVIDED}} storage (HDFS-9806), HDFS can address data stored in
> external storage systems. Caching this data, locally, in HDFS can speed up
> subsequent accesses to the data -- this feature is especially useful when
> accesses to the external store have limited bandwidth/higher latency. This
> JIRA is to add this feature in HDFS.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12522) Ozone: Remove the Priority Queues used in the Container State Manager
[ https://issues.apache.org/jira/browse/HDFS-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-12522: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7240 Status: Resolved (was: Patch Available) [~xyao] and [~nandakumar131] Thanks for the reviews, I have committed this to the feature branch. > Ozone: Remove the Priority Queues used in the Container State Manager > - > > Key: HDFS-12522 > URL: https://issues.apache.org/jira/browse/HDFS-12522 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Anu Engineer >Assignee: Anu Engineer >Priority: Major > Fix For: HDFS-7240 > > Attachments: HDFS-12522-HDFS-7240.001.patch, > HDFS-12522-HDFS-7240.002.patch, HDFS-12522-HDFS-7240.003.patch, > HDFS-12522-HDFS-7240.004.patch > > > During code review of HDFS-12387, it was suggested that we remove the > priority queues that were used in ContainerStateManager. This JIRA tracks that > issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures
[ https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345156#comment-16345156 ] Jitendra Nath Pandey edited comment on HDFS-11701 at 1/30/18 6:51 PM: -- +1. The patch looks good to me. Could you please evaluate if it is possible to add a unit test using Mockito? was (Author: jnp): +1. The patch looks good to me. > NPE from Unresolved Host causes permanent DFSInputStream failures > - > > Key: HDFS-11701 > URL: https://issues.apache.org/jira/browse/HDFS-11701 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 > Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH > 5.9.0 >Reporter: James Moore >Assignee: Lokesh Jain >Priority: Major > Attachments: HDFS-11701.001.patch, HDFS-11701.002.patch > > > We recently encountered the following NPE due to the DFSInputStream storing > old cached block locations from hosts which could no longer resolve. > {quote} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122) > at > org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148) > at > org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474) > at > org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354) > at > org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662) > at > org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613) > at > org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127) > ~HBase related stack frames trimmed~ > {quote} > After investigating, the DFSInputStream appears to have been open for upwards > of 3-4 weeks and had cached block locations from decommissioned nodes that no > longer resolve in DNS and had been shutdown and removed from the cluster 2 > weeks prior. 
If the DFSInputStream had refreshed its block locations from > the name node, it would have received alternative block locations which would > not contain the decommissioned data nodes. As the above NPE leaves the > non-resolving data node in the list of block locations the DFSInputStream > never refreshes the block locations and all attempts to open a BlockReader > for the given blocks will fail. > In our case, we resolved the NPE by closing and re-opening every > DFSInputStream in the cluster to force a purge of the block locations cache. > Ideally, the DFSInputStream would re-fetch all block locations for a host > which can't be resolved in DNS or at least the blocks requested. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
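The NPE described above follows from standard java.net behavior: constructing an {{InetSocketAddress}} for a hostname that no longer resolves yields an "unresolved" address whose {{getAddress()}} is null, which is the null that code like {{DFSClient#isLocalAddress}} then dereferences. A minimal, self-contained demonstration (not Hadoop code):

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Demonstrates the failure mode: for a hostname with no DNS entry, the
// InetSocketAddress is created "unresolved" and getAddress() returns null.
// Callers must null-check before dereferencing.
class UnresolvedHostDemo {
  static InetAddress lookup(String host, int port) {
    InetSocketAddress addr = new InetSocketAddress(host, port);
    // addr.isUnresolved() is true here; getAddress() is null.
    return addr.getAddress();
  }
}
```

The ".invalid" top-level domain is reserved (RFC 2606) and is guaranteed never to resolve, which makes it a convenient stand-in for a decommissioned host whose DNS entry has been removed.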
[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel
[ https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345595#comment-16345595 ] Ajay Kumar commented on HDFS-13061:
---
[~xyao], updated the patch to remove the {{socket.close()}} call.
> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
> Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch,
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is
> invoked on both the client and server side. If the resolver indicates that the
> channel is trusted, then the data transfer will not be encrypted even if
> dfs.encrypt.data.transfer is set to true.
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether the
> client and server addresses are trusted, respectively. It decides the channel
> is untrusted only if both the client and server are not trusted, to enforce
> encryption. *This ticket is opened to change it to not trust (and encrypt) if
> either the client or server address is not trusted.*
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
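The change this ticket describes boils down to how the two resolver answers are combined. A sketch with hypothetical names (the real logic lives in {{SaslDataTransferClient#checkTrustAndSend}}): before, the channel counted as trusted, and so skipped encryption, if either endpoint was trusted; the proposal is to require both.

```java
// Sketch of the trust-combination change (hypothetical names, not the
// actual SaslDataTransferClient code).
class ChannelTrust {
  // Old behavior: untrusted only if BOTH sides are untrusted,
  // i.e. trusted if either side is trusted.
  static boolean trustedOld(boolean clientTrusted, boolean serverTrusted) {
    return clientTrusted || serverTrusted;
  }

  // Proposed behavior: trusted (encryption skipped) only if BOTH sides
  // are trusted; a partially trusted channel is encrypted.
  static boolean trustedNew(boolean clientTrusted, boolean serverTrusted) {
    return clientTrusted && serverTrusted;
  }
}
```

Under the old rule a half-trusted channel would skip encryption; under the proposed rule it is encrypted, which is the safer default.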
[jira] [Updated] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel
[ https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-13061:
--
Attachment: HDFS-13061.003.patch
> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
> Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch,
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is
> invoked on both the client and server side. If the resolver indicates that the
> channel is trusted, then the data transfer will not be encrypted even if
> dfs.encrypt.data.transfer is set to true.
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether the
> client and server addresses are trusted, respectively. It decides the channel
> is untrusted only if both the client and server are not trusted, to enforce
> encryption. *This ticket is opened to change it to not trust (and encrypt) if
> either the client or server address is not trusted.*
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage
[ https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-12997: -- Attachment: HDFS-12997.006.patch > Move logging to slf4j in BlockPoolSliceStorage and Storage > --- > > Key: HDFS-12997 > URL: https://issues.apache.org/jira/browse/HDFS-12997 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, > HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, > HDFS-12997.006.patch > > > Move logging to slf4j in BlockPoolSliceStorage and Storage classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage
[ https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345592#comment-16345592 ] Ajay Kumar commented on HDFS-12997: --- Test failures are unrelated. Updated patch to fix 1 checkstyle issue. > Move logging to slf4j in BlockPoolSliceStorage and Storage > --- > > Key: HDFS-12997 > URL: https://issues.apache.org/jira/browse/HDFS-12997 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, > HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, > HDFS-12997.006.patch > > > Move logging to slf4j in BlockPoolSliceStorage and Storage classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12522) Ozone: Remove the Priority Queues used in the Container State Manager
[ https://issues.apache.org/jira/browse/HDFS-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345588#comment-16345588 ] Xiaoyu Yao commented on HDFS-12522: --- Thanks [~anu] for the update. Patch v4 LGTM, +1. > Ozone: Remove the Priority Queues used in the Container State Manager > - > > Key: HDFS-12522 > URL: https://issues.apache.org/jira/browse/HDFS-12522 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Anu Engineer >Assignee: Anu Engineer >Priority: Major > Attachments: HDFS-12522-HDFS-7240.001.patch, > HDFS-12522-HDFS-7240.002.patch, HDFS-12522-HDFS-7240.003.patch, > HDFS-12522-HDFS-7240.004.patch > > > During code review of HDFS-12387, it was suggested that we remove the > priority queues that was used in ContainerStateManager. This JIRA tracks that > issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13068) RBF: Add router admin option to manage safe mode
[ https://issues.apache.org/jira/browse/HDFS-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345583#comment-16345583 ] Íñigo Goiri commented on HDFS-13068: Thanks [~linyiqun] for [^HDFS-13068.001.patch]. Most of the patch is just the interface part, so it's big, but the changes are pretty contained. The regular {{dfsadmin}} exposes the option as {{[-safemode enter | leave | get | wait | forceExit]}}; we should do something similar here. We may want to add an option to get the state. > RBF: Add router admin option to manage safe mode > > > Key: HDFS-13068 > URL: https://issues.apache.org/jira/browse/HDFS-13068 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Yiqun Lin >Priority: Major > Attachments: HDFS-13068.001.patch > > > HDFS-13044 adds a safe mode to reject requests. We should have an option to > manually set the Router into safe mode. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12636) Ozone: OzoneFileSystem: Implement seek functionality for rpc client
[ https://issues.apache.org/jira/browse/HDFS-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-12636: Parent Issue: HDFS-13074 (was: HDFS-7240) > Ozone: OzoneFileSystem: Implement seek functionality for rpc client > --- > > Key: HDFS-12636 > URL: https://issues.apache.org/jira/browse/HDFS-12636 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Mukul Kumar Singh >Assignee: Lokesh Jain >Priority: Major > Fix For: HDFS-7240 > > Attachments: HDFS-12636-HDFS-7240.001.patch, > HDFS-12636-HDFS-7240.002.patch, HDFS-12636-HDFS-7240.003.patch, > HDFS-12636-HDFS-7240.004.patch > > > OzoneClient library provides a method to invoke both RPC as well as REST > based methods to ozone. This api will help in the improving both the > performance as well as the interface management in OzoneFileSystem. > This jira will be used to convert the REST based calls to use this new > unified client. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13083) RBF: Fix doc error setting up client
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345574#comment-16345574 ] Íñigo Goiri commented on HDFS-13083: I took the liberty of making this part of HDFS-12615 and renaming the title to be a little more specific. If you don't have any concerns, I'll commit this, labeling it as "Contributed by tartarus". [~tartarus], let me know if this is good with you or if you want some other change. > RBF: Fix doc error setting up client > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation >Affects Versions: 2.9.0, 3.0.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: Router > Fix For: 3.0.0 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > HDFS Router Federation doc error > Configuration instructions > {quote} > dfs.namenodes.ns-fed > r1,r2 > > {quote} > then execute : *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with > *{{Couldn't create proxy provider class > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}* > The correct configuration is > {quote} > dfs.{color:#FF}ha{color}.namenodes.ns-fed > r1,r2 > > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
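The doc fix discussed above comes down to a single missing "ha" component in the property name. As a sketch, the corrected client-side hdfs-site.xml entry would look something like this (the nameservice name `ns-fed` and router IDs `r1,r2` follow the example in the issue; the surrounding XML tags were stripped in the quoted description):

```xml
<!-- Sketch of the corrected client configuration from HDFS-13083. -->
<!-- The key is dfs.ha.namenodes.<nameservice>, NOT dfs.namenodes.<nameservice>; -->
<!-- without the "ha" component, ConfiguredFailoverProxyProvider cannot be created. -->
<property>
  <name>dfs.ha.namenodes.ns-fed</name>
  <value>r1,r2</value>
</property>
```

With that in place, `hdfs dfs -ls hdfs://ns-fed/tmp/` resolves the routers as HA "namenodes" of the ns-fed nameservice.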
[jira] [Updated] (HDFS-13083) RBF: Fix doc error setting up client
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13083: --- Summary: RBF: Fix doc error setting up client (was: Fix HDFS Router Federation doc error) > RBF: Fix doc error setting up client > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation >Affects Versions: 2.9.0, 3.0.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: Router > Fix For: 3.0.0 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > HDFS Router Federation doc error > Configuration instructions > {quote} > dfs.namenodes.ns-fed > r1,r2 > > {quote} > then execute : *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with > *{{Couldn't create proxy provider class > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}* > The correct configuration is > {quote} > dfs.{color:#FF}ha{color}.namenodes.ns-fed > r1,r2 > > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13043) RBF: Expose the state of the Routers in the federation
[ https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13043: --- Attachment: HDFS-13043.006.patch > RBF: Expose the state of the Routers in the federation > -- > > Key: HDFS-13043 > URL: https://issues.apache.org/jira/browse/HDFS-13043 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, > HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, > HDFS-13043.005.patch, HDFS-13043.006.patch, router-info.png > > > The Router should expose the state of the other Routers in the federation > through a user UI. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
[ https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDFS-13060: -- Attachment: HDFS-13060.001.patch > Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver > > > Key: HDFS-13060 > URL: https://issues.apache.org/jira/browse/HDFS-13060 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch > > > HDFS-5910 introduces encryption negotiation between client and server based > on a customizable TrustedChannelResolver class. The TrustedChannelResolver is > invoked on both client and server side. If the resolver indicates that the > channel is trusted, then the data transfer will not be encrypted even if > dfs.encrypt.data.transfer is set to true. > The default trust channel resolver implementation returns false indicating > that the channel is not trusted, which always enables encryption. HDFS-5910 > also added a build-int whitelist based trust channel resolver. It allows you > to put IP address/Network Mask of trusted client/server in whitelist files to > skip encryption for certain traffics. > This ticket is opened to add a blacklist based trust channel resolver for > cases only certain machines (IPs) are untrusted without adding each trusted > IP individually. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel
[ https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345557#comment-16345557 ] Xiaoyu Yao commented on HDFS-13061: --- Thanks [~ajayydv] for the update. Just one more NIT; +1 after that is fixed, pending Jenkins. TestSaslDataTransfer.java Line 303/353/398: {{socket.close()}} is not needed, as you have the following to handle that. {code:java} IOUtils.cleanupWithLogger(null, socket, serverSocket);{code} > SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted > channel > - > > Key: HDFS-13061 > URL: https://issues.apache.org/jira/browse/HDFS-13061 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, > HDFS-13061.002.patch > > > HDFS-5910 introduces encryption negotiation between client and server based > on a customizable TrustedChannelResolver class. The TrustedChannelResolver is > invoked on both client and server side. If the resolver indicates that the > channel is trusted, then the data transfer will not be encrypted even if > dfs.encrypt.data.transfer is set to true. > SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether the > client and server addresses are trusted, respectively. It decides the channel > is untrusted only if both client and server are not trusted to enforce > encryption. *This ticket is opened to change it to not trust (and encrypt) if > either the client or server address is not trusted.* -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
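The redundancy Xiaoyu points out is a general pattern: an explicit `close()` right before a catch-all cleanup helper closes the same resource twice. A minimal self-contained sketch (the `closeAll` helper here is a stand-in for Hadoop's `IOUtils.cleanupWithLogger`, which is assumed to close each argument and log rather than propagate IOExceptions):

```java
import java.io.Closeable;
import java.io.IOException;

public class CleanupSketch {
    // Stand-in for IOUtils.cleanupWithLogger: close everything, skip nulls,
    // never let a close failure escape (a real helper would log it).
    static void closeAll(Closeable... resources) {
        for (Closeable c : resources) {
            if (c == null) continue;
            try {
                c.close();
            } catch (IOException ignored) {
                // swallowed here for brevity; cleanupWithLogger logs instead
            }
        }
    }

    public static void main(String[] args) throws IOException {
        final int[] closes = {0};
        Closeable socket = () -> closes[0]++; // counts how often close() runs

        socket.close();     // the redundant explicit close from the review...
        closeAll(socket);   // ...because this already closes it

        System.out.println("close() called " + closes[0] + " times"); // twice
    }
}
```

Dropping the explicit `socket.close()` and relying solely on the cleanup call in a finally block leaves each resource closed exactly once.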
[jira] [Commented] (HDFS-12522) Ozone: Remove the Priority Queues used in the Container State Manager
[ https://issues.apache.org/jira/browse/HDFS-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345548#comment-16345548 ] Anu Engineer commented on HDFS-12522: - [~xyao] and [~nandakumar131] Can you please take a look at the latest patch? The test failures are fixed; there is one checkstyle warning about more than 80 chars that I will fix while committing. Thanks > Ozone: Remove the Priority Queues used in the Container State Manager > - > > Key: HDFS-12522 > URL: https://issues.apache.org/jira/browse/HDFS-12522 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Anu Engineer >Assignee: Anu Engineer >Priority: Major > Attachments: HDFS-12522-HDFS-7240.001.patch, > HDFS-12522-HDFS-7240.002.patch, HDFS-12522-HDFS-7240.003.patch, > HDFS-12522-HDFS-7240.004.patch > > > During code review of HDFS-12387, it was suggested that we remove the > priority queues that were used in ContainerStateManager. This JIRA tracks that > issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router
[ https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345543#comment-16345543 ] Íñigo Goiri commented on HDFS-13044: I added [^HDFS-13044.005.patch] with the actual commit to trunk and [^HDFS-13044-branch-3.0.000.patch] with the one that would go to branch-3.0 and others. > RBF: Add a safe mode for the Router > --- > > Key: HDFS-13044 > URL: https://issues.apache.org/jira/browse/HDFS-13044 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, > HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, > HDFS-13044.004.patch, HDFS-13044.005.patch > > > When a Router cannot communicate with the State Store, it should enter into a > safe mode that disallows certain operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
[ https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345546#comment-16345546 ] Ajay Kumar commented on HDFS-13060: --- Patch v2 to address checkstyle issues. > Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver > > > Key: HDFS-13060 > URL: https://issues.apache.org/jira/browse/HDFS-13060 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Ajay Kumar >Priority: Major > Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch > > > HDFS-5910 introduces encryption negotiation between client and server based > on a customizable TrustedChannelResolver class. The TrustedChannelResolver is > invoked on both client and server side. If the resolver indicates that the > channel is trusted, then the data transfer will not be encrypted even if > dfs.encrypt.data.transfer is set to true. > The default trust channel resolver implementation returns false indicating > that the channel is not trusted, which always enables encryption. HDFS-5910 > also added a built-in whitelist based trust channel resolver. It allows you > to put IP addresses/network masks of trusted clients/servers in whitelist files to > skip encryption for certain traffic. > This ticket is opened to add a blacklist based trust channel resolver for > cases where only certain machines (IPs) are untrusted, without adding each trusted > IP individually. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13044) RBF: Add a safe mode for the Router
[ https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13044: --- Attachment: HDFS-13044-branch-3.0.000.patch > RBF: Add a safe mode for the Router > --- > > Key: HDFS-13044 > URL: https://issues.apache.org/jira/browse/HDFS-13044 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, > HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, > HDFS-13044.004.patch, HDFS-13044.005.patch > > > When a Router cannot communicate with the State Store, it should enter into a > safe mode that disallows certain operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router
[ https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345528#comment-16345528 ] genericqa commented on HDFS-13044: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s{color} | {color:red} HDFS-13044 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-13044 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908390/HDFS-13044.005.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/22889/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > RBF: Add a safe mode for the Router > --- > > Key: HDFS-13044 > URL: https://issues.apache.org/jira/browse/HDFS-13044 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13004.000.patch, HDFS-13044.001.patch, > HDFS-13044.002.patch, HDFS-13044.003.patch, HDFS-13044.004.patch, > HDFS-13044.005.patch > > > When a Router cannot communicate with the State Store, it should enter into a > safe mode that disallows certain operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13044) RBF: Add a safe mode for the Router
[ https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13044: --- Attachment: HDFS-13044.005.patch > RBF: Add a safe mode for the Router > --- > > Key: HDFS-13044 > URL: https://issues.apache.org/jira/browse/HDFS-13044 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-13004.000.patch, HDFS-13044.001.patch, > HDFS-13044.002.patch, HDFS-13044.003.patch, HDFS-13044.004.patch, > HDFS-13044.005.patch > > > When a Router cannot communicate with the State Store, it should enter into a > safe mode that disallows certain operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13083) Fix HDFS Router Federation doc error
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13083: --- Issue Type: Sub-task (was: Bug) Parent: HDFS-12615 > Fix HDFS Router Federation doc error > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation >Affects Versions: 2.9.0, 3.0.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: Router > Fix For: 3.0.0 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > HDFS Router Federation doc error > Configuration instructions > {quote} > dfs.namenodes.ns-fed > r1,r2 > > {quote} > then execute : *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with > *{{Couldn't create proxy provider class > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}* > The correct configuration is > {quote} > dfs.{color:#FF}ha{color}.namenodes.ns-fed > r1,r2 > > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13083) Fix HDFS Router Federation doc error
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345513#comment-16345513 ] genericqa commented on HDFS-13083: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 27m 30s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 40m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-13083 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908292/HDFS-13083.patch | | Optional Tests | asflicense mvnsite | | uname | Linux 624eb2ecc69a 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 901d15a | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 301 (vs. ulimit of 5000) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/22887/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
> Fix HDFS Router Federation doc error > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Bug > Components: federation >Affects Versions: 2.9.0, 3.0.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: Router > Fix For: 3.0.0 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > HDFS Router Federation doc error > Configuration instructions > {quote} > dfs.namenodes.ns-fed > r1,r2 > > {quote} > then execute : *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with > *{{Couldn't create proxy provider class > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}* > The correct configuration is > {quote} > dfs.{color:#FF}ha{color}.namenodes.ns-fed > r1,r2 > > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica
[ https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345478#comment-16345478 ] Wei-Chiu Chuang edited comment on HDFS-11187 at 1/30/18 5:43 PM: - v005 to address checkstyle warning. Test errors are unrelated; they do not reproduce in my local tree. was (Author: jojochuang): v005 to address checkstyle warning. > Optimize disk access for last partial chunk checksum of Finalized replica > - > > Key: HDFS-11187 > URL: https://issues.apache.org/jira/browse/HDFS-11187 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > Attachments: HDFS-11187.001.patch, HDFS-11187.002.patch, > HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch > > > The patch at HDFS-11160 ensures BlockSender reads the correct version of > metafile when there are concurrent writers. > However, the implementation is not optimal, because it must always read the > last partial chunk checksum from disk while holding FsDatasetImpl lock for > every reader. It is possible to optimize this by keeping an up-to-date > version of last partial checksum in-memory and reduce disk access. > I am separating the optimization into a new jira, because maintaining the > state of in-memory checksum requires a lot more work. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345489#comment-16345489 ] Daryn Sharp commented on HDFS-10285: More comments, mostly focusing on the DN side. There appear to be more serious issues here. *FSDirSatisfyStoragePolicyOp* {{satisfyStoragePolicy}} errors if the xattr is already present. Should this be a no-op? A client re-requesting a storage policy correction probably shouldn't fail. {{unprotectedSatisfyStoragePolicy}} is called prior to xattr updates, which calls {{addSPSPathId}}. To avoid race conditions or inconsistent state if the xattr fails, should call {{addSPSPathId}} after xattrs are successfully updated. {{inodeHasSatisfyXAttr}} calls {{getXAttrFeature}} then immediately shorts out if the inode isn't a file. Should do file check first to avoid unnecessary computation. In general, not fond of unnecessarily guava. Instead of {{newArrayListWithCapacity}} + {{add}}, standard {{Arrays.asList(item)}} is more succinct. *FSDirStatAndListOp* Not sure why javadoc was changed to add needLocation. It's already present and now doubled up. {{BlockStorageMovementCommand/BlocksStorageMoveAttemptFinished}} Again, not sure that a new DN command is necessary, and why does it specifically report back successful moves instead of relying on IBRs? I would actually expect the DN to be completely ignorant of a SPS move vs any other move. *StoragePolicySatisfyWorker* bq. // TODO: Needs to manage the number of concurrent moves per DataNode. Shouldn't this be fixed now? {{DFS_MOVER_MOVERTHREADS_DEFAULT}} is *1000 per DN*? If the DN is concurrently doing 1000 moves, it's not in a good state, disk io is probably saturated, and this will only make it much worse. 10 is probably more than sufficient. The executor uses a {{CallerRunsPolicy}}? If the thread pool is exhausted, WHY should the heartbeat thread process the operation? That's a terrible idea. 
I was under the impression the DN did an optimized movement between its storages? Does it actually make a connection to itself? If yes, when the DN xceiver pool is very busy, the mover thread pool will exhaust and cause the heartbeat thread to block. *BlockStorageMovementTracker* Many data structures are riddled with non-threadsafe race conditions and risk of CMEs. Ex. The {{moverTaskFutures}} map. Adding new blocks and/or adding to a block's list of futures is synchronized. However, the run loop does an unsynchronized block get, unsynchronized future remove, unsynchronized isEmpty, possibly another unsynchronized get, only then does it do a synchronized remove of the block. The whole chunk of code should be synchronized. Is the problematic {{moverTaskFutures}} even needed? It's aggregating futures per-block for seemingly no reason. Why track all the futures at all instead of just relying on the completion service? As best I can tell: * It's only used to determine if a future from the completion service should be ignored during shutdown. Shutdown sets the running boolean to false and clears the entire data structure, so why not use the {{running}} boolean as the check just a little further down? * As synchronization to sleep up to 2 seconds before performing a blocking {{moverCompletionService.take}}, but only when it thinks there are no active futures. I'll ignore the missed notify race that the bounded wait masks, but the real question is why not just do the blocking take? Why all the complexity? Am I missing something? *BlocksMovementsStatusHandler* It suffers the same type of thread-safety issues as {{StoragePolicySatisfyWorker}}. Ex. {{blockIdVsMovementStatus}} is inconsistently synchronized. It does synchronize to return an unmodifiable list, which sadly does nothing to protect the caller from CMEs. {{handle}} is iterating over a non-thread-safe list. 
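The class of bug Daryn describes — map operations that are individually synchronized but composed into an unsynchronized check-then-act sequence — can be shown generically. This sketch is illustrative only (the field and method names are hypothetical, not the actual BlockStorageMovementTracker code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TrackerSketch {
    private final Map<Long, List<String>> futuresPerBlock = new HashMap<>();

    public void add(long blockId, String future) {
        synchronized (futuresPerBlock) {
            futuresPerBlock.computeIfAbsent(blockId, k -> new ArrayList<>()).add(future);
        }
    }

    // Racy pattern from the review: each map access holds the lock, but the
    // get / remove / isEmpty / remove sequence as a whole does not, so another
    // thread can add a future between the isEmpty check and the map removal.
    public void completeOneRacy(long blockId, String future) {
        List<String> futures;
        synchronized (futuresPerBlock) { futures = futuresPerBlock.get(blockId); }
        if (futures == null) return;
        futures.remove(future);            // unsynchronized list mutation
        if (futures.isEmpty()) {
            synchronized (futuresPerBlock) { futuresPerBlock.remove(blockId); }
        }
    }

    // Fixed: "the whole chunk of code should be synchronized" — one lock
    // acquisition covers the entire check-then-act sequence.
    public void completeOne(long blockId, String future) {
        synchronized (futuresPerBlock) {
            List<String> futures = futuresPerBlock.get(blockId);
            if (futures == null) return;
            futures.remove(future);
            if (futures.isEmpty()) {
                futuresPerBlock.remove(blockId);
            }
        }
    }

    public boolean tracks(long blockId) {
        synchronized (futuresPerBlock) { return futuresPerBlock.containsKey(blockId); }
    }
}
```

The single-threaded behavior of both variants is identical; the difference only shows up under concurrent add/complete, which is exactly why such bugs survive unit tests.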
> Storage Policy Satisfier in Namenode > > > Key: HDFS-10285 > URL: https://issues.apache.org/jira/browse/HDFS-10285 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: HDFS-10285 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Attachments: HDFS-10285-consolidated-merge-patch-00.patch, > HDFS-10285-consolidated-merge-patch-01.patch, > HDFS-10285-consolidated-merge-patch-02.patch, > HDFS-10285-consolidated-merge-patch-03.patch, > HDFS-10285-consolidated-merge-patch-04.patch, > HDFS-10285-consolidated-merge-patch-05.patch, > HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, > Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, > Storage-Policy-Satisfier-in-HDFS-May10.pdf, > Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf > > > Heterogeneous storage in HDFS introduced the concept of storage
[jira] [Commented] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica
[ https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345478#comment-16345478 ] Wei-Chiu Chuang commented on HDFS-11187: v005 to address checkstyle warning. > Optimize disk access for last partial chunk checksum of Finalized replica > - > > Key: HDFS-11187 > URL: https://issues.apache.org/jira/browse/HDFS-11187 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > Attachments: HDFS-11187.001.patch, HDFS-11187.002.patch, > HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch > > > The patch at HDFS-11160 ensures BlockSender reads the correct version of > metafile when there are concurrent writers. > However, the implementation is not optimal, because it must always read the > last partial chunk checksum from disk while holding FsDatasetImpl lock for > every reader. It is possible to optimize this by keeping an up-to-date > version of last partial checksum in-memory and reduce disk access. > I am separating the optimization into a new jira, because maintaining the > state of in-memory checksum requires a lot more work. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica
[ https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-11187: --- Attachment: HDFS-11187.005.patch > Optimize disk access for last partial chunk checksum of Finalized replica > - > > Key: HDFS-11187 > URL: https://issues.apache.org/jira/browse/HDFS-11187 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > Attachments: HDFS-11187.001.patch, HDFS-11187.002.patch, > HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch > > > The patch at HDFS-11160 ensures BlockSender reads the correct version of > metafile when there are concurrent writers. > However, the implementation is not optimal, because it must always read the > last partial chunk checksum from disk while holding FsDatasetImpl lock for > every reader. It is possible to optimize this by keeping an up-to-date > version of last partial checksum in-memory and reduce disk access. > I am separating the optimization into a new jira, because maintaining the > state of in-memory checksum requires a lot more work.
[jira] [Commented] (HDFS-12909) SSLConnectionConfigurator creation error should be printed only if security is enabled
[ https://issues.apache.org/jira/browse/HDFS-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345401#comment-16345401 ] Ajay Kumar commented on HDFS-12909: --- [~ljain], thanks for working on this. Do you mind adding a test case specific to this change (i.e. if we don't set any HTTP policy, {{URLConnectionFactory}} doesn't attempt to create an SSL connection)? I have one question: * Why move the HTTP policy related keys to {{HdfsClientConfigKeys}}? I can imagine a case where the client has a different policy than the server; in that case those keys may exist in both classes (i.e. {{HdfsClientConfigKeys}} and {{DFSConfigKeys}}), but having them only in the client config gives the impression that this config is not relevant on the server side. > SSLConnectionConfigurator creation error should be printed only if security > is enabled > -- > > Key: HDFS-12909 > URL: https://issues.apache.org/jira/browse/HDFS-12909 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Namit Maheshwari >Assignee: Lokesh Jain >Priority: Major > Attachments: HDFS-12909.002.patch, HDFS-12909.003.patch, > HDFS-12909.patch > > > Currently URLConnectionFactory#getSSLConnectionConfiguration attempts to > create an SSL connection configurator even if security is not enabled. This > raises the below false warning in the logs. > {code:java} > 17/12/08 10:12:03 WARN web.URLConnectionFactory: Cannot load customized ssl > related configuration. Fallback to system-generic settings. 
> java.io.FileNotFoundException: /etc/security/clientKeys/all.jks (No such file > or directory) > at java.io.FileInputStream.open0(Native Method) > at java.io.FileInputStream.open(FileInputStream.java:195) > at java.io.FileInputStream.<init>(FileInputStream.java:138) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:169) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.<init>(ReloadingX509TrustManager.java:87) > at > org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:219) > at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:176) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newSslConnConfigurator(URLConnectionFactory.java:164) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.getSSLConnectionConfiguration(URLConnectionFactory.java:106) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newDefaultURLConnectionFactory(URLConnectionFactory.java:85) > at org.apache.hadoop.hdfs.tools.DFSck.<init>(DFSck.java:136) > at org.apache.hadoop.hdfs.tools.DFSck.<init>(DFSck.java:128) > at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:396) > {code}
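The guard this issue proposes can be sketched in a few lines of plain Java. This is a hedged, self-contained stand-in: the enum is modeled loosely on Hadoop's HTTP policy values, and the method names mimic (but are not) the real URLConnectionFactory API. The point is simply that SSL setup, and therefore the keystore warning, is skipped entirely when HTTPS is not in play.

```java
/**
 * Sketch of the HDFS-12909 fix: only attempt to build an SSL connection
 * configurator when the HTTP policy actually requires HTTPS, so plain-HTTP
 * clients never log a spurious keystore warning. Names are hypothetical.
 */
class SslConfiguratorGuard {
    enum HttpPolicy { HTTP_ONLY, HTTPS_ONLY, HTTP_AND_HTTPS }

    /** Returns a non-null "configurator" only when HTTPS is configured. */
    static String getSslConnectionConfiguration(HttpPolicy policy) {
        if (policy == HttpPolicy.HTTP_ONLY) {
            return null; // security off: skip SSL setup entirely, no warning logged
        }
        try {
            return loadSslConfigurator();
        } catch (RuntimeException e) {
            // Only now is a warning justified: HTTPS was requested but misconfigured.
            System.err.println("Cannot load customized ssl configuration: " + e.getMessage());
            return null;
        }
    }

    private static String loadSslConfigurator() {
        // Stand-in for SSLFactory.init() reading keystores from disk.
        return "ssl-configurator";
    }
}
```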
[jira] [Comment Edited] (HDFS-13084) [SPS]: Fix the branch review comments
[ https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876 ] Rakesh R edited comment on HDFS-13084 at 1/30/18 4:57 PM: -- Thanks a lot [~daryn] for your time and review comments. My reply follows, please take a look. *Comment-1)* {quote}BlockManager Shouldn’t spsMode be volatile? Although I question why it’s here. {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-2)* {quote}Adding SPS methods to this class implies an unexpected coupling of the SPS service to the block manager. Please move them out to prove it’s not tightly coupled. {quote} [Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}} and keep all the related APIs over there. *Comment-3)* {quote}BPServiceActor Is it actually sending back the moved blocks? Aren’t IBRs sufficient? {quote} [Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending the block-move tasks to the DN and receiving the {{blksMovementsFinished}} list from the DN. Since these are grouped separately, each finished block movement can be easily and quickly recognized by the Satisfier on the NN side, which can easily update the tracking details. I think that with the IBR approach, for every newly reported block the Satisfier on the NN side would have to do an *additional* check on whether the new block resulted from an SPSBlockMove operation before updating the tracking details, and this *extra* check would fire quite often, considering how many other ops there are compared to SPSBlockMove ops. I mean, a new block can be reported due to an SPSBlockMove op, or as part of some other operation like re-replication, a newly written file block, the Balancer tool, the Mover tool, etc. With the current approach, it can simply get the {{blksMovementsFinished}} list from each heartbeat and do the logic. *Comment-4)* {quote}DataNode Why isn’t this just a block transfer? How is transferring between DNs any different than across storages? 
{quote} [Rakesh's reply] I could see the Mover also uses the {{REPLACE_BLOCK}} call, and we just followed the same approach in SPS. Am I missing anything here? *Comment-5)* {quote}DatanodeDescriptor Why use a synchronized linked list to offer/poll instead of BlockingQueue? {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-6)* {quote}DatanodeManager I know it’s configurable, but realistically, when would you ever want to give storage movement tasks equal footing with under-replication? Is there really a use case for not valuing durability? {quote} [Rakesh's reply] We don't have any particular use case, though. One scenario we thought of is a user who configured SSDs that filled up quickly; in that case, there could be situations where cleaning up is considered a high priority. If you feel this is not a real case, then I'm OK with removing this config, and SPS will use only the remaining slots. If needed, we can bring such configs back later. *Comment-7)* {quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport is already a very bad method that should be avoided for anything but jmx – even then it’s a concern. I eliminated calls to it years ago. All it takes is a nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of time. Beyond that, the response is going to be pretty large and tagging all the storage reports is not going to be cheap. verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its storageMap? Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached state of the world. Then it gets another datanode report to determine the number of live nodes to decide if it should sleep before processing the next path. The number of nodes from the prior cached view of the world should suffice. {quote} [Rakesh's reply] Good point. 
Some time back, Uma and I thought about the cache part. Actually, we depend on this API for the datanode storage types and remaining-space details. I think it requires two different mechanisms for internal and external SPS. For internal SPS, how about directly referring to {{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are suggesting a cache mechanism. How about getting the storage report once and caching it at ExternalContext? This local cache can be refreshed periodically: say, after every 5 mins (just an arbitrary number; if you have some period in mind, please suggest it), when getDatanodeStorageReport is called the cache can be treated as expired and fetched fresh; within the 5 mins it can be served from the cache. Does this make sense to you? Another point we thought of is that right now, to check whether a node has enough space, it goes to the NN. With the above caching, we could just check the space constraints
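The periodically refreshed storage-report cache sketched in the reply above might look like this in plain Java. This is a hedged illustration under stated assumptions: the class name is hypothetical, the 5-minute TTL mirrors the arbitrary period in the comment, and the `Supplier` stands in for the RPC to getDatanodeStorageReport.

```java
import java.util.function.Supplier;

/**
 * Sketch of a TTL-expiring cache for datanode storage reports, as proposed
 * for the external SPS context. Names and structure are illustrative.
 */
class CachedStorageReport<T> {
    private final Supplier<T> fetcher;        // e.g. an RPC to getDatanodeStorageReport
    private final long ttlNanos;
    private T cached;
    private long fetchedAtNanos;

    CachedStorageReport(Supplier<T> fetcher, long ttlMillis) {
        this.fetcher = fetcher;
        this.ttlNanos = ttlMillis * 1_000_000L;
    }

    /** Returns the cached report, refreshing it once the TTL has elapsed. */
    synchronized T get(long nowNanos) {
        if (cached == null || nowNanos - fetchedAtNanos >= ttlNanos) {
            cached = fetcher.get();           // expired: fetch fresh from the NN
            fetchedAtNanos = nowNanos;
        }
        return cached;                        // within TTL: serve from cache
    }
}
```

Callers would pass `System.nanoTime()` for `nowNanos`; taking the clock as a parameter keeps the sketch testable.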
[jira] [Commented] (HDFS-13083) Fix HDFS Router Federation doc error
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345367#comment-16345367 ] Íñigo Goiri commented on HDFS-13083: I missed this. Thanks [~tartarus] for the patch, I assigned it to you. Once Yetus comes back I'll commit. +1 > Fix HDFS Router Federation doc error > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Bug > Components: federation >Affects Versions: 2.9.0, 3.0.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: Router > Fix For: 3.0.0 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > HDFS Router Federation doc error > Configuration instructions > {quote} > dfs.namenodes.ns-fed > r1,r2 > > {quote} > then execute : *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with > *{{Couldn't create proxy provider class > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}* > The correct configuration is > {quote} > dfs.{color:#FF}ha{color}.namenodes.ns-fed > r1,r2 > > {quote}
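The doc fix boils down to one key name: the HA failover proxy provider looks up `dfs.ha.namenodes.<nameservice>`, not `dfs.namenodes.<nameservice>`. A tiny plain-Java sketch, with `Properties` standing in for Hadoop's `Configuration` (in practice this would go in hdfs-site.xml):

```java
import java.util.Properties;

/**
 * Sketch of the corrected Router-federation client configuration from
 * HDFS-13083. Properties is a stand-in for Hadoop's Configuration class.
 */
class RouterHaConfSketch {
    static Properties routerClientConf() {
        Properties conf = new Properties();
        // Wrong key from the original doc was dfs.namenodes.ns-fed, which the
        // ConfiguredFailoverProxyProvider never reads. Correct key:
        conf.setProperty("dfs.ha.namenodes.ns-fed", "r1,r2");
        return conf;
    }
}
```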
[jira] [Comment Edited] (HDFS-13084) [SPS]: Fix the branch review comments
[ https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876 ] Rakesh R edited comment on HDFS-13084 at 1/30/18 4:51 PM: -- Thanks a lot [~daryn] for your time and review comments. My reply follows, please take a look. *Comment-1)* {quote}BlockManager Shouldn’t spsMode be volatile? Although I question why it’s here. {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-2)* {quote}Adding SPS methods to this class implies an unexpected coupling of the SPS service to the block manager. Please move them out to prove it’s not tightly coupled. {quote} [Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}} and keep all the related APIs over there. *Comment-3)* {quote}BPServiceActor Is it actually sending back the moved blocks? Aren’t IBRs sufficient? {quote} [Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending the block-move tasks to the DN and receiving the {{blksMovementsFinished}} list from the DN. Since these are grouped separately, each finished block movement can be easily and quickly recognized by the Satisfier on the NN side, which can easily update the tracking details. I think that with the IBR approach, for every newly reported block the Satisfier on the NN side would have to do an *additional* check on whether the new block resulted from an SPSBlockMove operation before updating the tracking details, and this *extra* check would fire quite often, considering how many other ops there are compared to SPSBlockMove ops. I mean, a new block can be reported due to an SPSBlockMove op, or as part of some other operation like re-replication, a newly written file block, the Balancer tool, the Mover tool, etc. With the current approach, it can simply get the {{blksMovementsFinished}} list from each heartbeat and do the logic. *Comment-4)* {quote}DataNode Why isn’t this just a block transfer? How is transferring between DNs any different than across storages? 
{quote} [Rakesh's reply] I could see the Mover also uses the {{REPLACE_BLOCK}} call, and we just followed the same approach in SPS. Am I missing anything here? *Comment-5)* {quote}DatanodeDescriptor Why use a synchronized linked list to offer/poll instead of BlockingQueue? {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-6)* {quote}DatanodeManager I know it’s configurable, but realistically, when would you ever want to give storage movement tasks equal footing with under-replication? Is there really a use case for not valuing durability? {quote} [Rakesh's reply] We don't have any particular use case, though. One scenario we thought of is a user who configured SSDs that filled up quickly; in that case, there could be situations where cleaning up is considered a high priority. If you feel this is not a real case, then I'm OK with removing this config, and SPS will use only the remaining slots. If needed, we can bring such configs back later. *Comment-7)* {quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport is already a very bad method that should be avoided for anything but jmx – even then it’s a concern. I eliminated calls to it years ago. All it takes is a nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of time. Beyond that, the response is going to be pretty large and tagging all the storage reports is not going to be cheap. verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its storageMap? Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached state of the world. Then it gets another datanode report to determine the number of live nodes to decide if it should sleep before processing the next path. The number of nodes from the prior cached view of the world should suffice. {quote} [Rakesh's reply] Good point. 
Some time back, Uma and I thought about the cache part. Actually, we depend on this API for the datanode storage types and remaining-space details. I think it requires two different mechanisms for internal and external SPS. For internal SPS, how about directly referring to {{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are suggesting a cache mechanism. How about getting the storage report once and caching it at ExternalContext? This local cache can be refreshed periodically: say, after every 5 mins (just an arbitrary number; if you have some period in mind, please suggest it), when getDatanodeStorageReport is called the cache can be treated as expired and fetched fresh; within the 5 mins it can be served from the cache. Does this make sense to you? *Comment-8)* {quote}DFSUtil DFSUtil.removeOverlapBetweenStorageTypes and {{DFSUtil.getSPSWorkMultiplier}}. These aren’t generally useful methods so why are they
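The BlockingQueue change agreed to in Comment-5 above is straightforward to sketch with the JDK alone. The names here are illustrative, not the actual DatanodeDescriptor fields: a `LinkedBlockingQueue` replaces a synchronized `LinkedList`, so offer/poll need no external locking and the heartbeat handler can drain a batch atomically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Sketch of the Comment-5 suggestion: a thread-safe queue of pending
 * block-move tasks for a datanode, without a synchronized linked list.
 */
class BlockMoveTaskQueue {
    private final BlockingQueue<String> pendingMoves = new LinkedBlockingQueue<>();

    /** Producer side: the NN schedules a block-move task for this DN. */
    boolean offer(String blockMoveTask) {
        return pendingMoves.offer(blockMoveTask); // no synchronized block needed
    }

    /** Consumer side: the heartbeat handler drains up to maxTasks tasks. */
    List<String> poll(int maxTasks) {
        List<String> batch = new ArrayList<>();
        pendingMoves.drainTo(batch, maxTasks);    // atomic bulk removal
        return batch;
    }
}
```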
[jira] [Assigned] (HDFS-13083) Fix HDFS Router Federation doc error
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri reassigned HDFS-13083: -- Assignee: tartarus > Fix HDFS Router Federation doc error > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Bug > Components: federation >Affects Versions: 2.9.0, 3.0.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Assignee: tartarus >Priority: Major > Labels: Router > Fix For: 3.0.0 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > HDFS Router Federation doc error > Configuration instructions > {quote} > dfs.namenodes.ns-fed > r1,r2 > > {quote} > then execute : *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with > *{{Couldn't create proxy provider class > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}* > The correct configuration is > {quote} > dfs.{color:#FF}ha{color}.namenodes.ns-fed > r1,r2 > > {quote}
[jira] [Updated] (HDFS-13083) Fix HDFS Router Federation doc error
[ https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-13083: --- Status: Patch Available (was: Reopened) > Fix HDFS Router Federation doc error > > > Key: HDFS-13083 > URL: https://issues.apache.org/jira/browse/HDFS-13083 > Project: Hadoop HDFS > Issue Type: Bug > Components: federation >Affects Versions: 3.0.0, 2.9.0 > Environment: CentOS6.5 > Hadoop 3.0.0 >Reporter: tartarus >Priority: Major > Labels: Router > Fix For: 3.0.0 > > Attachments: HDFS-13083.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > HDFS Router Federation doc error > Configuration instructions > {quote} > dfs.namenodes.ns-fed > r1,r2 > > {quote} > then execute : *hdfs dfs -ls hdfs://ns-fed/tmp/* failed with > *{{Couldn't create proxy provider class > org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}* > The correct configuration is > {quote} > dfs.{color:#FF}ha{color}.namenodes.ns-fed > r1,r2 > > {quote}
[jira] [Commented] (HDFS-13085) getStoragePolicy varies b/w REST and CLI for unspecified path
[ https://issues.apache.org/jira/browse/HDFS-13085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345350#comment-16345350 ] Arpit Agarwal commented on HDFS-13085: -- Thanks for the analysis [~msingh]. [~brahmareddy], any change here will likely be incompatible and hence ruled out for Hadoop 3.x, so probably we should leave things as they are. What do you think? > getStoragePolicy varies b/w REST and CLI for unspecified path > - > > Key: HDFS-13085 > URL: https://issues.apache.org/jira/browse/HDFS-13085 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Mukul Kumar Singh >Priority: Major > > *CLI* > {noformat} > hdfs storagepolicies -getStoragePolicy -path /node-label-root > The storage policy of /node-label-root is unspecified > {noformat} > *REST* > {noformat} > curl -i "http://1*.*.*.*:50070/webhdfs/v1/node-label-root?op=GETSTORAGEPOLICY; > {"BlockStoragePolicy":\{"copyOnCreateFile":false,"creationFallbacks":[],"id":7,"name":"HOT","replicationFallbacks":["ARCHIVE"],"storageTypes":["DISK"]}} > {noformat} > -HDFS-11968- introduced this new behaviour for CLI.
[jira] [Updated] (HDFS-13085) getStoragePolicy varies b/w REST and CLI for unspecified path
[ https://issues.apache.org/jira/browse/HDFS-13085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-13085: - Hadoop Flags: Incompatible change > getStoragePolicy varies b/w REST and CLI for unspecified path > - > > Key: HDFS-13085 > URL: https://issues.apache.org/jira/browse/HDFS-13085 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Mukul Kumar Singh >Priority: Major > > *CLI* > {noformat} > hdfs storagepolicies -getStoragePolicy -path /node-label-root > The storage policy of /node-label-root is unspecified > {noformat} > *REST* > {noformat} > curl -i "http://1*.*.*.*:50070/webhdfs/v1/node-label-root?op=GETSTORAGEPOLICY; > {"BlockStoragePolicy":\{"copyOnCreateFile":false,"creationFallbacks":[],"id":7,"name":"HOT","replicationFallbacks":["ARCHIVE"],"storageTypes":["DISK"]}} > {noformat} > -HDFS-11968- introduced this new behaviour for CLI.
[jira] [Commented] (HDFS-12528) Add an option to not disable short-circuit reads on failures
[ https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345343#comment-16345343 ] Xiao Chen commented on HDFS-12528: -- Thanks all, and thanks John for writing the initial unit test! > Add an option to not disable short-circuit reads on failures > > > Key: HDFS-12528 > URL: https://issues.apache.org/jira/browse/HDFS-12528 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, performance >Affects Versions: 2.6.0 >Reporter: Andre Araujo >Assignee: Xiao Chen >Priority: Major > Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, > HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, > HDFS-12528.05.patch > > > We have scenarios where data ingestion makes use of the -appendToFile > operation to add new data to existing HDFS files. In these situations, we're > frequently running into the problem described below. > We're using Impala to query the HDFS data with short-circuit reads (SCR) > enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce > the memory footprint. In some cases, though, Impala still keeps the HDFS file > handle open for reuse. > The "unbuffer" call, however, causes the file's current block reader to be > closed, which makes the associated ShortCircuitReplica evictable from the > ShortCircuitCache. When the cluster is under load, this means that the > ShortCircuitReplica can be purged off the cache pretty fast, which closes the > file descriptor to the underlying storage file. > That means that when Impala re-reads the file it has to re-open the storage > files associated with the ShortCircuitReplica's that were evicted from the > cache. If there were no appends to those blocks, the re-open will succeed > without problems. 
If one block was appended since the ShortCircuitReplica was > created, the re-open will fail with the following error: > {code} > Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 > not found > {code} > This error is handled as an "unknown response" by the BlockReaderFactory [1], > which disables short-circuit reads for 10 minutes [2] for the client. > These 10 minutes without SCR can have a big performance impact for the client > operations. In this particular case ("Meta file not found") it would suffice > to return null without disabling SCR. This particular block read would fall > back to the normal, non-short-circuited, path and other SCR requests would > continue to work as expected. > It might also be interesting to be able to control how long SCR is disabled > for in the "unknown response" case. 10 minutes seems a bit too long and not > being able to change that is a problem. > [1] > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646 > [2] > https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97
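The configurable disable window discussed here can be sketched with a small, self-contained tracker. This is a hedged illustration, not the real DomainSocketFactory: the class name and fields are hypothetical, and the point is only that the "unknown response" penalty window becomes a parameter instead of a hard-coded 10 minutes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Sketch of a configurable short-circuit-read disable window (HDFS-12528).
 * The real client hard-codes roughly 10 minutes; here it is a parameter.
 */
class ScrDisableTracker {
    private final long disableWindowMillis;
    private final ConcurrentMap<String, Long> disabledUntil = new ConcurrentHashMap<>();

    ScrDisableTracker(long disableWindowMillis) {
        this.disableWindowMillis = disableWindowMillis;
    }

    /** Called when a short-circuit attempt fails with an unknown response. */
    void disableForPath(String socketPath, long nowMillis) {
        disabledUntil.put(socketPath, nowMillis + disableWindowMillis);
    }

    /** True while the disable window for this path has not yet elapsed. */
    boolean isDisabled(String socketPath, long nowMillis) {
        Long until = disabledUntil.get(socketPath);
        return until != null && nowMillis < until;
    }
}
```

A window of 0 would give the behavior the reporter asks for in the "Meta file not found" case: fall back to the remote read path for that one request without penalizing subsequent SCR attempts.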
[jira] [Commented] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures
[ https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345156#comment-16345156 ] Jitendra Nath Pandey commented on HDFS-11701: - +1. The patch looks good to me. > NPE from Unresolved Host causes permanent DFSInputStream failures > - > > Key: HDFS-11701 > URL: https://issues.apache.org/jira/browse/HDFS-11701 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.6.0 > Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH > 5.9.0 >Reporter: James Moore >Assignee: Lokesh Jain >Priority: Major > Attachments: HDFS-11701.001.patch, HDFS-11701.002.patch > > > We recently encountered the following NPE due to the DFSInputStream storing > old cached block locations from hosts which could no longer resolve. > {quote} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122) > at > org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148) > at > org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474) > at > org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354) > at > org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662) > at > org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613) > at > org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127) > ~HBase related stack frames trimmed~ > {quote} > After investigating, the DFSInputStream appears to have been open for upwards > of 3-4 weeks and had cached block locations from decommissioned nodes that no > longer resolve in DNS and had been shutdown and removed from the cluster 2 > weeks prior. If the DFSInputStream had refreshed its block locations from > the name node, it would have received alternative block locations which would > not contain the decommissioned data nodes. 
As the above NPE leaves the > non-resolving data node in the list of block locations, the DFSInputStream > never refreshes the block locations, and all attempts to open a BlockReader > for the given blocks will fail. > In our case, we resolved the NPE by closing and re-opening every > DFSInputStream in the cluster to force a purge of the block locations cache. > Ideally, the DFSInputStream would re-fetch all block locations for a host > which can't be resolved in DNS, or at least those of the blocks requested.
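The defensive check implied by this report can be sketched with the JDK's `InetSocketAddress.isUnresolved()`. This is a hedged illustration, not the patched DFSInputStream code: the class and method names are hypothetical, and the idea is simply to detect cached locations that no longer resolve and treat that as a signal to re-fetch block locations from the NameNode instead of passing a null `InetAddress` downstream.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of a sanity check on cached block locations (HDFS-11701): drop
 * unresolved addresses rather than NPE in the local-address check.
 */
class BlockLocationSanity {
    /** Returns only the locations that still resolve; caller refreshes if any were dropped. */
    static List<InetSocketAddress> filterResolvable(List<InetSocketAddress> cached) {
        List<InetSocketAddress> ok = new ArrayList<>();
        for (InetSocketAddress addr : cached) {
            if (!addr.isUnresolved()) {
                ok.add(addr); // safe for local/short-circuit checks
            }
            // unresolved: skip it; addr.getAddress() would be null, which is
            // what triggered the NPE in DFSClient.isLocalAddress
        }
        return ok;
    }

    /** True if any cached location no longer resolves, so locations should be re-fetched. */
    static boolean needsRefresh(List<InetSocketAddress> cached) {
        return cached.stream().anyMatch(InetSocketAddress::isUnresolved);
    }
}
```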
[jira] [Updated] (HDFS-13080) Ozone: Make finalhash in ContainerInfo of StorageContainerDatanodeProtocol.proto optional
[ https://issues.apache.org/jira/browse/HDFS-13080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar updated HDFS-13080: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Ozone: Make finalhash in ContainerInfo of > StorageContainerDatanodeProtocol.proto optional > - > > Key: HDFS-13080 > URL: https://issues.apache.org/jira/browse/HDFS-13080 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Nanda kumar >Assignee: Elek, Marton >Priority: Major > Attachments: HDFS-13080-HDFS-7240.000.patch, > HDFS-13080-HDFS-7240.001.patch > > > ContainerInfo in StorageContainerDatanodeProtocol.proto has a required field, > {{finalhash}} which will be null for an open container, this has to be made > as an optional field.
[jira] [Commented] (HDFS-13080) Ozone: Make finalhash in ContainerInfo of StorageContainerDatanodeProtocol.proto optional
[ https://issues.apache.org/jira/browse/HDFS-13080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345132#comment-16345132 ] Nanda kumar commented on HDFS-13080: Thanks [~elek] for the contribution. I have committed this to the feature branch. > Ozone: Make finalhash in ContainerInfo of > StorageContainerDatanodeProtocol.proto optional > - > > Key: HDFS-13080 > URL: https://issues.apache.org/jira/browse/HDFS-13080 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Nanda kumar >Assignee: Elek, Marton >Priority: Major > Attachments: HDFS-13080-HDFS-7240.000.patch, > HDFS-13080-HDFS-7240.001.patch > > > ContainerInfo in StorageContainerDatanodeProtocol.proto has a required field, > {{finalhash}} which will be null for an open container, this has to be made > as an optional field.
[jira] [Commented] (HDFS-13080) Ozone: Make finalhash in ContainerInfo of StorageContainerDatanodeProtocol.proto optional
[ https://issues.apache.org/jira/browse/HDFS-13080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345110#comment-16345110 ] Nanda kumar commented on HDFS-13080: +1, I will commit this shortly. > Ozone: Make finalhash in ContainerInfo of > StorageContainerDatanodeProtocol.proto optional > - > > Key: HDFS-13080 > URL: https://issues.apache.org/jira/browse/HDFS-13080 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Reporter: Nanda kumar >Assignee: Elek, Marton >Priority: Major > Attachments: HDFS-13080-HDFS-7240.000.patch, > HDFS-13080-HDFS-7240.001.patch > > > ContainerInfo in StorageContainerDatanodeProtocol.proto has a required field, > {{finalhash}} which will be null for an open container, this has to be made > as an optional field.
[jira] [Commented] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345089#comment-16345089 ] genericqa commented on HDFS-12116: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}116m 22s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 16s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HDFS-12116 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877421/HDFS-12116.03.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6ec659f0f99e 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6463e10 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/22886/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/22886/testReport/ | | Max. process+thread count | 3204 (vs. ulimit of 5000) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/22886/console | | Powered by | Apache
[jira] [Comment Edited] (HDFS-13084) [SPS]: Fix the branch review comments
[ https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876 ] Rakesh R edited comment on HDFS-13084 at 1/30/18 12:25 PM: --- Thanks a lot [~daryn] for your time and review comments. My replies follow; please take a look. *Comment-1)* {quote}BlockManager Shouldn’t spsMode be volatile? Although I question why it’s here. {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-2)* {quote}Adding SPS methods to this class implies an unexpected coupling of the SPS service to the block manager. Please move them out to prove it’s not tightly coupled. {quote} [Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}} and keep all the related APIs there. *Comment-3)* {quote}BPServiceActor Is it actually sending back the moved blocks? Aren’t IBRs sufficient? {quote} [Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending the block move tasks to the DN and receiving the {{blksMovementsFinished}} list from the DN. Since these are grouped separately, each finished block movement can be easily/quickly recognized by the Satisfier on the NN side, which can then update the tracking details. I think if we used the IBR approach, then for every newly reported block the Satisfier on the NN side would have to do an *additional extra* check of whether the new block resulted from an SPSBlockMove operation before updating the tracking details, and this *extra* check would run quite often, since other ops far outnumber SPSBlockMove ops. I mean, a new block can be reported due to an SPSBlockMove op or as part of some other operation, like re-replication, a newly written file block, the Balancer, the Mover, etc. With the current approach, it can simply get the {{blksMovementsFinished}} list from each heartbeat and do the logic. *Comment-4)* {quote}DataNode Why isn’t this just a block transfer? How is transferring between DNs any different than across storages?
{quote} [Rakesh's reply] I could see that the Mover is also using the {{REPLACE_BLOCK}} call, and we just followed the same approach in SPS. Am I missing anything here? *Comment-5)* {quote}DatanodeDescriptor Why use a synchronized linked list to offer/poll instead of BlockingQueue? {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-6)* {quote}DatanodeManager I know it’s configurable, but realistically, when would you ever want to give storage movement tasks equal footing with under-replication? Is there really a use case for not valuing durability? {quote} [Rakesh's reply] We don't have any particular use case, though. One scenario we considered: a user configures SSDs and they fill up quickly. In that case, cleaning up could be considered a high priority. If you feel this is not a real case, then I'm OK with removing this config, and SPS will use only the remaining slots. If needed, we can bring such configs back later. *Comment-7)* {quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport is already a very bad method that should be avoided for anything but jmx – even then it’s a concern. I eliminated calls to it years ago. All it takes is a nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of time. Beyond that, the response is going to be pretty large and tagging all the storage reports is not going to be cheap. verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its storageMap? Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached state of the world. Then it gets another datanode report to determine the number of live nodes to decide if it should sleep before processing the next path. The number of nodes from the prior cached view of the world should suffice. {quote} [Rakesh's reply] Good point.
Some time back, Uma and I thought about the cache part. Actually, we depend on this API for the datanode storage types and remaining-space details. I think it requires two different mechanisms for internal and external SPS. For internal SPS, how about directly referring to {{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are suggesting a caching mechanism. How about making a {{StorageGroup}} data structure similar to the Mover class and building a cache from it? This local cache can be refreshed per file or periodically. *Comment-8)* {quote}DFSUtil DFSUtil.removeOverlapBetweenStorageTypes and {{DFSUtil.getSPSWorkMultiplier}}. These aren’t generally useful methods so why are they in DFSUtil? Why aren’t they in the only calling class StoragePolicySatisfier? {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-9)* {quote}FSDirXAttrOp Not fond of the SPS queue updates being edge triggered by xattr
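Comment-1 above asks for the spsMode flag to be volatile. A minimal sketch of why, assuming a flag written by one thread (e.g. an admin RPC handler) and polled by another (the satisfier thread); the class and field names here are illustrative, not the actual BlockManager members:

```java
// Illustrative only: shows the visibility guarantee volatile provides
// for a mode flag shared between a writer and a reader thread.
public class SpsModeFlag {
    // Without volatile, the reader thread may never observe the write
    // and could cache 'false' indefinitely.
    private volatile boolean spsEnabled = false;

    void enable() { spsEnabled = true; }

    boolean isEnabled() { return spsEnabled; }

    public static void main(String[] args) throws InterruptedException {
        SpsModeFlag flag = new SpsModeFlag();
        Thread reader = new Thread(() -> {
            // Spins until the writer's update becomes visible.
            while (!flag.isEnabled()) { }
            System.out.println("observed enable");
        });
        reader.start();
        flag.enable();
        reader.join();   // terminates because the volatile write is visible
    }
}
```

Without the volatile keyword this spin loop may legitimately never terminate under the Java Memory Model, which is the substance of the review comment.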
[jira] [Created] (HDFS-13087) Fix: Snapshots On encryption zones get incorrect EZ settings when encryption zone changes
LiXin Ge created HDFS-13087: --- Summary: Fix: Snapshots On encryption zones get incorrect EZ settings when encryption zone changes Key: HDFS-13087 URL: https://issues.apache.org/jira/browse/HDFS-13087 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 3.1.0 Reporter: LiXin Ge Snapshots are supposed to be immutable and read-only, so the EZ settings within a snapshot path shouldn't change when the original encryption zone changes.
[jira] [Assigned] (HDFS-13087) Fix: Snapshots On encryption zones get incorrect EZ settings when encryption zone changes
[ https://issues.apache.org/jira/browse/HDFS-13087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LiXin Ge reassigned HDFS-13087: --- Assignee: LiXin Ge > Fix: Snapshots On encryption zones get incorrect EZ settings when encryption > zone changes > - > > Key: HDFS-13087 > URL: https://issues.apache.org/jira/browse/HDFS-13087 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption >Affects Versions: 3.1.0 >Reporter: LiXin Ge >Assignee: LiXin Ge >Priority: Major > > Snapshots are supposed to be immutable and read-only, so the EZ settings > within a snapshot path shouldn't change when the original encryption zone > changes.
[jira] [Created] (HDFS-13086) Add a field about exception type to BlockOpResponseProto
LiXin Ge created HDFS-13086: --- Summary: Add a field about exception type to BlockOpResponseProto Key: HDFS-13086 URL: https://issues.apache.org/jira/browse/HDFS-13086 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.1.0 Reporter: LiXin Ge Assignee: LiXin Ge When a user re-reads a file via short-circuit reads, they may come across unknown errors because the file has been appended after the first read (which changes its meta file), or because the file has been moved away by the Balancer. Such unknown errors will unnecessarily disable short-circuit reads for 10 minutes. HDFS-12528 made the {{expireAfterWrite}} of {{DomainSocketFactory$pathMap}} configurable, giving the user the choice to never disable the domain socket. We can go a step further and add a field about the exception type to BlockOpResponseProto, so that we can ignore the acceptable FNFE and set an appropriate disable time for the unacceptable exceptions when different types of exceptions occur in the same cluster.
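A hedged sketch of the client-side policy this proposal enables: once the response carries an error type, the reader can back off per type instead of always disabling the domain socket path. All names here (the ErrorType values, disableMillis, the one-minute default) are illustrative assumptions, not actual HDFS client code; only the 10-minute window comes from the issue text:

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch: maps a hypothetical error-type code from
// BlockOpResponseProto to a short-circuit-read disable duration.
public class ShortCircuitErrorPolicy {
    enum ErrorType { FILE_NOT_FOUND, ACCESS_DENIED, UNKNOWN }

    /** How long (ms) to disable short-circuit reads; 0 means don't disable. */
    static long disableMillis(ErrorType type) {
        switch (type) {
            case FILE_NOT_FOUND:
                // Acceptable FNFE: meta file changed by append, or block
                // moved by the Balancer. Retry without penalizing SCR.
                return 0L;
            case ACCESS_DENIED:
                // Genuine misconfiguration: keep the current long back-off.
                return TimeUnit.MINUTES.toMillis(10);
            default:
                // Unclassified errors: a shorter, configurable back-off.
                return TimeUnit.MINUTES.toMillis(1);
        }
    }

    public static void main(String[] args) {
        System.out.println(disableMillis(ErrorType.FILE_NOT_FOUND)); // 0
        System.out.println(disableMillis(ErrorType.ACCESS_DENIED));  // 600000
    }
}
```

The design point is that the decision moves from "any error disables SCR for a fixed window" to a per-type table the client can tune.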
[jira] [Assigned] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota reassigned HDFS-12116: - Assignee: (was: Gabor Bota) > BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail > -- > > Key: HDFS-12116 > URL: https://issues.apache.org/jira/browse/HDFS-12116 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.22.0 >Reporter: Xiao Chen >Priority: Major > Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch, > HDFS-12116.03.patch, > TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml > > > This seems to be long-standing, but the failure rate (~10%) is slightly > higher in dist-test runs using CDH. > In both the _08 and _09 tests: > # an attempt is made to make a replica in {{TEMPORARY}} > state, by {{waitForTempReplica}}. > # Once that's returned, the test goes on to verify that block reports show the > correct pending replication blocks. > But there's a race condition. If the replica is replicated between steps #1 > and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how > many replicas are replicated, hence failing the test. > Failures are seen on both {{TestNNHandlesBlockReportPerStorage}} and > {{TestNNHandlesCombinedBlockReport}}
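The usual fix direction for the race described above is to stop asserting the pending-replication count once and instead poll until the cluster reaches the expected state or a timeout expires. A minimal sketch; `waitFor` below is a stand-in for Hadoop's `GenericTestUtils.waitFor`, and the 200 ms condition simply mimics replication finishing at some point:

```java
import java.util.function.BooleanSupplier;

// Illustrative poll-until-true helper, in the spirit of
// org.apache.hadoop.test.GenericTestUtils.waitFor.
public class WaitForDemo {
    static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.getAsBoolean()) {
                return true;              // expected state reached
            }
            Thread.sleep(intervalMs);     // back off before re-checking
        }
        return check.getAsBoolean();      // last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~200 ms, standing in for the
        // replica leaving TEMPORARY state / replication completing.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start > 200, 50, 5000);
        System.out.println(ok);           // prints true
    }
}
```

Applied to the test, the assertion on `getPendingReplicationBlocks` would run inside such a loop, so a replica replicated between steps #1 and #2 no longer fails the run.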
[jira] [Commented] (HDFS-13084) [SPS]: Fix the branch review comments
[ https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876 ] Rakesh R commented on HDFS-13084: - Thanks a lot [~daryn] for your time and review comments. My replies follow; please take a look. *Comment-1)* {quote}BlockManager Shouldn’t spsMode be volatile? Although I question why it’s here. {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-2)* {quote}Adding SPS methods to this class implies an unexpected coupling of the SPS service to the block manager. Please move them out to prove it’s not tightly coupled. {quote} [Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}} and keep all the related APIs there. *Comment-3)* {quote}BPServiceActor Is it actually sending back the moved blocks? Aren’t IBRs sufficient? {quote} [Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending the block move tasks to the DN and receiving the {{blksMovementsFinished}} list from the DN. Since these are grouped separately, each finished block movement can be easily/quickly recognized by the Satisfier on the NN side, which can then update the tracking details. I think if we used the IBR approach, then for every newly reported block the Satisfier on the NN side would have to check whether the new block resulted from an SPSBlockMove operation before updating the tracking details. I mean, a new block can be reported due to an SPSBlockMove op or as part of some other operation, like re-replication, a newly written file block, the Balancer, the Mover, etc. With the current approach, it can simply get the {{blksMovementsFinished}} list from each heartbeat and do the logic. *Comment-4)* {quote}DataNode Why isn’t this just a block transfer? How is transferring between DNs any different than across storages? {quote} [Rakesh's reply] I could see that the Mover is also using the {{REPLACE_BLOCK}} call, and we just followed the same approach in SPS. Am I missing anything here?
*Comment-5)* {quote}DatanodeDescriptor Why use a synchronized linked list to offer/poll instead of BlockingQueue? {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-6)* {quote}DatanodeManager I know it’s configurable, but realistically, when would you ever want to give storage movement tasks equal footing with under-replication? Is there really a use case for not valuing durability? {quote} [Rakesh's reply] We don't have any particular use case, though. One scenario we considered: a user configures SSDs and they fill up quickly. In that case, cleaning up could be considered a high priority. If you feel this is not a real case, then I'm OK with removing this config, and SPS will use only the remaining slots. If needed, we can bring such configs back later. *Comment-7)* {quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport is already a very bad method that should be avoided for anything but jmx – even then it’s a concern. I eliminated calls to it years ago. All it takes is a nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of time. Beyond that, the response is going to be pretty large and tagging all the storage reports is not going to be cheap. verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its storageMap? Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached state of the world. Then it gets another datanode report to determine the number of live nodes to decide if it should sleep before processing the next path. The number of nodes from the prior cached view of the world should suffice. {quote} [Rakesh's reply] Good point. Some time back, Uma and I thought about the cache part. Actually, we depend on this API for the datanode storage types and remaining-space details.
I think it requires two different mechanisms for internal and external SPS. For internal SPS, how about directly referring to {{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are suggesting a caching mechanism. How about making a {{StorageGroup}} data structure similar to the Mover class and building a cache from it? This local cache can be refreshed per file or periodically. *Comment-8)* {quote}DFSUtil DFSUtil.removeOverlapBetweenStorageTypes and {{DFSUtil.getSPSWorkMultiplier}}. These aren’t generally useful methods so why are they in DFSUtil? Why aren’t they in the only calling class StoragePolicySatisfier? {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-9)* {quote}FSDirXAttrOp Not fond of the SPS queue updates being edge triggered by xattr changes. Food for thought: why add more rpc methods if the client can just twiddle the xattr? {quote} [Rakesh's reply] SPS is depending on the xattr. On each user satisfy
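The BlockingQueue suggestion above (Comment-5) can be sketched as follows: a `LinkedBlockingQueue` gives the same offer/poll semantics as a synchronized linked list, with thread safety built in and no external locking. The class and field names are illustrative, not the actual DatanodeDescriptor members:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative replacement for a Collections.synchronizedList used
// only for offer/poll: a BlockingQueue needs no external locking.
public class BlockMoveTaskQueue {
    private final BlockingQueue<String> pendingMoves = new LinkedBlockingQueue<>();

    void addTask(String blockMoveTask) {
        pendingMoves.offer(blockMoveTask);   // never blocks: queue is unbounded
    }

    String pollTask() {
        return pendingMoves.poll();          // returns null if empty, lock-free for callers
    }

    public static void main(String[] args) {
        BlockMoveTaskQueue q = new BlockMoveTaskQueue();
        q.addTask("blk_1073741825 -> SSD");
        System.out.println(q.pollTask());    // prints the queued task
        System.out.println(q.pollTask());    // prints null: queue drained
    }
}
```

If the DN-side consumer should block while waiting for work, `take()` (blocking) or `poll(timeout, unit)` can replace the non-blocking `poll()`.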
[jira] [Comment Edited] (HDFS-13084) [SPS]: Fix the branch review comments
[ https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876 ] Rakesh R edited comment on HDFS-13084 at 1/30/18 10:55 AM: --- Thanks a lot [~daryn] for your time and review comments. My replies follow; please take a look. *Comment-1)* {quote}BlockManager Shouldn’t spsMode be volatile? Although I question why it’s here. {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-2)* {quote}Adding SPS methods to this class implies an unexpected coupling of the SPS service to the block manager. Please move them out to prove it’s not tightly coupled. {quote} [Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}} and keep all the related APIs there. *Comment-3)* {quote}BPServiceActor Is it actually sending back the moved blocks? Aren’t IBRs sufficient? {quote} [Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending the block move tasks to the DN and receiving the {{blksMovementsFinished}} list from the DN. Since these are grouped separately, each finished block movement can be easily/quickly recognized by the Satisfier on the NN side, which can then update the tracking details. I think if we used the IBR approach, then for every newly reported block the Satisfier on the NN side would have to check whether the new block resulted from an SPSBlockMove operation before updating the tracking details. I mean, a new block can be reported due to an SPSBlockMove op or as part of some other operation, like re-replication, a newly written file block, the Balancer, the Mover, etc. With the current approach, it can simply get the {{blksMovementsFinished}} list from each heartbeat and do the logic. *Comment-4)* {quote}DataNode Why isn’t this just a block transfer? How is transferring between DNs any different than across storages? {quote} [Rakesh's reply] I could see that the Mover is also using the {{REPLACE_BLOCK}} call, and we just followed the same approach in SPS.
Am I missing anything here? *Comment-5)* {quote}DatanodeDescriptor Why use a synchronized linked list to offer/poll instead of BlockingQueue? {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-6)* {quote}DatanodeManager I know it’s configurable, but realistically, when would you ever want to give storage movement tasks equal footing with under-replication? Is there really a use case for not valuing durability? {quote} [Rakesh's reply] We don't have any particular use case, though. One scenario we considered: a user configures SSDs and they fill up quickly. In that case, cleaning up could be considered a high priority. If you feel this is not a real case, then I'm OK with removing this config, and SPS will use only the remaining slots. If needed, we can bring such configs back later. *Comment-7)* {quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport is already a very bad method that should be avoided for anything but jmx – even then it’s a concern. I eliminated calls to it years ago. All it takes is a nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of time. Beyond that, the response is going to be pretty large and tagging all the storage reports is not going to be cheap. verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its storageMap? Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached state of the world. Then it gets another datanode report to determine the number of live nodes to decide if it should sleep before processing the next path. The number of nodes from the prior cached view of the world should suffice. {quote} [Rakesh's reply] Good point. Some time back, Uma and I thought about the cache part. Actually, we depend on this API for the datanode storage types and remaining-space details.
I think it requires two different mechanisms for internal and external SPS. For internal SPS, how about directly referring to {{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are suggesting a caching mechanism. How about making a {{StorageGroup}} data structure similar to the Mover class and building a cache from it? This local cache can be refreshed per file or periodically. *Comment-8)* {quote}DFSUtil DFSUtil.removeOverlapBetweenStorageTypes and {{DFSUtil.getSPSWorkMultiplier}}. These aren’t generally useful methods so why are they in DFSUtil? Why aren’t they in the only calling class StoragePolicySatisfier? {quote} [Rakesh's reply] Agreed, will do the changes. *Comment-9)* {quote}FSDirXAttrOp Not fond of the SPS queue updates being edge triggered by xattr changes. Food for thought: why add more rpc methods if the client can just twiddle the xattr? {quote} [Rakesh's reply]