[ https://issues.apache.org/jira/browse/HDFS-16262?focusedWorklogId=669916&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-669916 ]
ASF GitHub Bot logged work on HDFS-16262:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 26/Oct/21 06:37
            Start Date: 26/Oct/21 06:37
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #3527:
URL: https://github.com/apache/hadoop/pull/3527#issuecomment-951607494

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 36s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 13m 10s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 21m 26s | | trunk passed |
| +1 :green_heart: | compile | 5m 7s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | compile | 4m 51s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | checkstyle | 1m 14s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 22s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 39s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 2m 7s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 5m 33s | | trunk passed |
| +1 :green_heart: | shadedclient | 21m 36s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 2m 2s | | the patch passed |
| +1 :green_heart: | compile | 5m 1s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javac | 5m 1s | | the patch passed |
| +1 :green_heart: | compile | 4m 45s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | javac | 4m 45s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 1m 5s | [/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3527/10/artifact/out/results-checkstyle-hadoop-hdfs-project.txt) | hadoop-hdfs-project: The patch generated 20 new + 105 unchanged - 0 fixed = 125 total (was 105) |
| +1 :green_heart: | mvnsite | 2m 7s | | the patch passed |
| +1 :green_heart: | xml | 0m 2s | | The patch has no ill-formed XML file. |
| +1 :green_heart: | javadoc | 1m 22s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 :green_heart: | javadoc | 1m 53s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 :green_heart: | spotbugs | 5m 40s | | the patch passed |
| +1 :green_heart: | shadedclient | 21m 14s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 21s | | hadoop-hdfs-client in the patch passed. |
| +1 :green_heart: | unit | 222m 16s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. |
| | | 348m 58s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3527/10/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/3527 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell xml |
| uname | Linux 90d8a338ecc6 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 2ceae4ca2a14e5a01b8f0236770148ed1f4e2088 |
| Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3527/10/testReport/ |
| Max. process+thread count | 3571 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3527/10/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 669916)
    Time Spent: 2h 20m  (was: 2h 10m)

> Async refresh of cached locations in DFSInputStream
> ---------------------------------------------------
>
>                 Key: HDFS-16262
>                 URL: https://issues.apache.org/jira/browse/HDFS-16262
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> HDFS-15119 added the ability to invalidate cached block locations in
> DFSInputStream. As written, the feature will affect all DFSInputStreams
> regardless of whether they need it or not. The invalidation also only applies
> on the next request, so the next request will pay the cost of calling
> openInfo before reading the data.
>
> I'm working on a feature for HBase which enables efficient healing of
> locality through Balancer-style low level block moves (HBASE-26250). I'd like
> to utilize the idea started in HDFS-15119 in order to update DFSInputStreams
> after blocks have been moved to local hosts.
>
> I was considering using the feature as is, but some of our clusters are quite
> large and I'm concerned about the impact on the namenode:
>
> * We have some clusters with over 350k StoreFiles, so that'd be 350k
>   DFSInputStreams. With such a large number and very active usage, having the
>   refresh be in-line makes it too hard to ensure we don't DDOS the NameNode.
> * Currently we need to pay the price of openInfo the next time a
>   DFSInputStream is invoked. Moving that async would minimize the latency hit.
>   Also, some StoreFiles might be far less frequently accessed, so they may live
>   on for a long time before ever refreshing. We'd like to be able to know that
>   all DFSInputStreams are refreshed by a given time.
> * We may have 350k files, but only a small percentage of them are ever
>   non-local at a given time. Refreshing only if necessary will save a lot of
>   work.
>
> In order to make this as painless to end users as possible, I'd like to:
>
> * Update the implementation to utilize an async thread for managing
>   refreshes. This will give more control over rate limiting across all
>   DFSInputStreams in a DFSClient, and also ensure that all DFSInputStreams are
>   refreshed.
> * Only refresh files which are lacking a local replica or have known
>   deadNodes to be cleaned up
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
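
The two goals listed in the issue description (a single background thread that rate-limits refreshes across all streams of a client, and refreshing only streams that lack a local replica or have known dead nodes) could be sketched roughly as below. This is an illustrative sketch only, not the code in pull request #3527 and not the Hadoop API: `RefreshableStream`, `LocationRefresher`, `FakeStream`, and `maxRefreshesPerPass` are hypothetical names, and the per-pass cap is just one assumed way to bound NameNode load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a DFSInputStream; not the real Hadoop class. */
interface RefreshableStream {
  boolean hasLocalReplica();
  boolean hasDeadNodes();
  void refreshLocations(); // in a real client this would ask the NameNode
}

/** Hypothetical per-client refresher: bounded work per pass, round-robin order. */
class LocationRefresher {
  private final List<RefreshableStream> streams = new CopyOnWriteArrayList<>();
  private final int maxRefreshesPerPass; // assumed rate limit: refreshes per pass
  private int cursor = 0;                // where the next pass starts scanning

  LocationRefresher(int maxRefreshesPerPass) {
    this.maxRefreshesPerPass = maxRefreshesPerPass;
  }

  void register(RefreshableStream s) { streams.add(s); }
  void unregister(RefreshableStream s) { streams.remove(s); }

  /** Runs one pass and returns how many streams were refreshed. */
  synchronized int runOnce() {
    List<RefreshableStream> snapshot = new ArrayList<>(streams);
    int n = snapshot.size();
    int refreshed = 0;
    int i = 0;
    for (; i < n && refreshed < maxRefreshesPerPass; i++) {
      RefreshableStream s = snapshot.get((cursor + i) % n);
      // Skip streams that are already local and have no dead nodes.
      if (!s.hasLocalReplica() || s.hasDeadNodes()) {
        s.refreshLocations();
        refreshed++;
      }
    }
    // Resume after the last examined stream so later streams are not starved.
    if (n > 0) {
      cursor = (cursor + i) % n;
    }
    return refreshed;
  }

  /** Starts a single daemon thread that calls runOnce() at a fixed delay. */
  ScheduledExecutorService start(long intervalMs) {
    ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "location-refresher");
      t.setDaemon(true);
      return t;
    });
    exec.scheduleWithFixedDelay(this::runOnce, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    return exec;
  }
}

/** In-memory fake used only to exercise the sketch. */
class FakeStream implements RefreshableStream {
  boolean local;
  boolean deadNodes;
  int refreshCount = 0;

  FakeStream(boolean local, boolean deadNodes) {
    this.local = local;
    this.deadNodes = deadNodes;
  }

  public boolean hasLocalReplica() { return local; }
  public boolean hasDeadNodes() { return deadNodes; }

  public void refreshLocations() {
    // Pretend the refresh found a local replica and cleared dead nodes.
    local = true;
    deadNodes = false;
    refreshCount++;
  }
}
```

Because the refresh runs off the read path, a reader would not pay the openInfo cost on its next request in this sketch, and the per-pass cap together with the interval passed to `start` bounds how many refresh calls one client can issue per unit of time.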