[jira] [Commented] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692552#comment-17692552 ]

Takanobu Asanuma commented on HDFS-15630:
-----------------------------------------

I added 3.3.5 to the fix versions.

> RBF: Fix wrong client IP info in CallerContext when requests mount points
> with multi-destinations.
> --------------------------------------------------------------------------
>
>                 Key: HDFS-15630
>                 URL: https://issues.apache.org/jira/browse/HDFS-15630
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: rbf
>            Reporter: Chengwei Wang
>            Assignee: Chengwei Wang
>            Priority: Major
>             Fix For: 3.4.0, 3.3.5
>
>         Attachments: HDFS-15630.001.patch, HDFS-15630.002.patch,
> HDFS-15630.003.patch, HDFS-15630.004.patch, HDFS-15630.005.patch,
> HDFS-15630.006.patch, HDFS-15630.test.patch
>
>
> There are two issues with the client IP info in CallerContext when a request
> targets a mount point with multiple destinations:
> # the clientIp is duplicated in CallerContext when
> RouterRpcClient#invokeSequential is used.
> # the clientIp is missing from CallerContext when
> RouterRpcClient#invokeConcurrent is used.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
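The fix pattern behind the two symptoms above can be illustrated with a small sketch (illustrative only; the helper name is hypothetical and the actual patch lives in RouterRpcClient): append the `clientIp` tag to the caller context only when it is not already present, so a sequential invocation that re-applies the tag does not duplicate it, and a concurrent invocation that builds the context fresh does not drop it.

```python
# Illustrative sketch (not the actual Hadoop code): keep exactly one
# "clientIp:<ip>" entry in a comma-separated caller-context string,
# no matter how many times an RPC layer re-applies it.

def with_client_ip(context: str, client_ip: str) -> str:
    """Return the context with a single clientIp tag appended."""
    tag = f"clientIp:{client_ip}"
    parts = [p for p in context.split(",") if p] if context else []
    if tag not in parts:
        parts.append(tag)
    return ",".join(parts)

# invokeSequential-style: applying the tag twice must not duplicate it.
ctx = with_client_ip("clientContext", "10.0.0.1")
ctx = with_client_ip(ctx, "10.0.0.1")
print(ctx)  # clientContext,clientIp:10.0.0.1

# invokeConcurrent-style: even an empty context must end up tagged.
print(with_client_ip("", "10.0.0.1"))  # clientIp:10.0.0.1
```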
[jira] [Updated] (HDFS-15630) RBF: Fix wrong client IP info in CallerContext when requests mount points with multi-destinations.
[ https://issues.apache.org/jira/browse/HDFS-15630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takanobu Asanuma updated HDFS-15630:
------------------------------------
    Fix Version/s: 3.3.5
[jira] [Updated] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer
[ https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takanobu Asanuma updated HDFS-13293:
------------------------------------
    Fix Version/s: 3.3.5

> RBF: The RouterRPCServer should transfer client IP via CallerContext to
> NamenodeRpcServer
> -----------------------------------------------------------------------
>
>                 Key: HDFS-13293
>                 URL: https://issues.apache.org/jira/browse/HDFS-13293
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: rbf
>            Reporter: Baolong Mao
>            Assignee: Hui Fei
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.5
>
>         Attachments: HDFS-13293.001.patch
>
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's CallerContext.
> This jira focuses on the audit log, which should record the real client IP.
> Locality is left to HDFS-13248.
[jira] [Commented] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer
[ https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692551#comment-17692551 ]

Takanobu Asanuma commented on HDFS-13293:
-----------------------------------------

I added 3.3.5 to the fix versions.
[jira] [Commented] (HDFS-16845) Add configuration flag to enable observer reads on routers without using ObserverReadProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692547#comment-17692547 ]

Takanobu Asanuma commented on HDFS-16845:
-----------------------------------------

It is only in trunk. I corrected the fix versions.

> Add configuration flag to enable observer reads on routers without using
> ObserverReadProxyProvider
> -------------------------------------------------------------------------
>
>                 Key: HDFS-16845
>                 URL: https://issues.apache.org/jira/browse/HDFS-16845
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Simbarashe Dzinamarira
>            Assignee: Simbarashe Dzinamarira
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>
> In order for clients to have routers forward their reads to observers, the
> clients must use a proxy with an alignment context. This is currently
> achieved by using the ObserverReadProxyProvider.
> Using ObserverReadProxyProvider keeps client configurations backward
> compatible.
> However, ObserverReadProxyProvider forces an msync on initialization, which
> is not required with routers.
> Performing msync calls is more expensive with routers because the router fans
> out the call to all namespaces, so we'd like to avoid this.
[jira] [Updated] (HDFS-16845) Add configuration flag to enable observer reads on routers without using ObserverReadProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takanobu Asanuma updated HDFS-16845:
------------------------------------
    Fix Version/s: (was: 3.3.5)
                   (was: 2.10.3)
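For context, the configuration flag HDFS-16845 introduces is a client-side switch; a hedged example of what enabling it looks like in hdfs-site.xml (the property name here is taken from the HDFS-16845 pull request and may differ in released versions):

```xml
<!-- Assumed property name from the HDFS-16845 PR; verify against your release. -->
<property>
  <name>dfs.client.rbf.observer.read.enable</name>
  <value>true</value>
</property>
```

With this flag the router attaches the alignment context itself, so clients get observer reads without configuring ObserverReadProxyProvider and without its msync on initialization.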
[jira] [Commented] (HDFS-16726) There is a memory-related problem about HDFS namenode
[ https://issues.apache.org/jira/browse/HDFS-16726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692544#comment-17692544 ]

zinx commented on HDFS-16726:
-----------------------------

[~yuyanlei] We use 2.6.x; we upgraded the JDK to 12 and removed the G1RSetRegionEntries param. We have not used 3.x before.

> There is a memory-related problem about HDFS namenode
> -----------------------------------------------------
>
>                 Key: HDFS-16726
>                 URL: https://issues.apache.org/jira/browse/HDFS-16726
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 2.7.2
>         Environment: -Xms280G -Xmx280G -XX:MaxDirectMemorySize=10G
> -XX:MetaspaceSize=128M -server \
> -XX:+UseG1GC -XX:+UseStringDeduplication
> -XX:MaxGCPauseMillis=250 -XX:+UnlockExperimentalVMOptions
> -XX:+PrintGCApplicationStoppedTime -XX:+PrintSafepointStatistics
> -XX:PrintSafepointStatisticsCount=1 \
> -XX:G1OldCSetRegionThresholdPercent=1
> -XX:G1MixedGCCountTarget=9 -XX:+SafepointTimeout
> -XX:SafepointTimeoutDelay=4000 \
> -XX:ParallelGCThreads=24 -XX:ConcGCThreads=6
> -XX:G1RSetRegionEntries=4096 -XX:+AggressiveOpts -XX:+DisableExplicitGC \
> -XX:G1HeapWastePercent=9
> -XX:G1MixedGCLiveThresholdPercent=85 -XX:InitiatingHeapOccupancyPercent=75 \
> -XX:+ParallelRefProcEnabled -XX:-ResizePLAB
> -XX:+PrintAdaptiveSizePolicy \
> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
> -Xloggc:$HADOOP_LOG_DIR/namenode.gc.log \
> -XX:+HeapDumpOnOutOfMemoryError
> -XX:ErrorFile=$HADOOP_LOG_DIR/hs_err_pid%p.log
> -XX:HeapDumpPath=$HADOOP_LOG_DIR \
> -Dcom.sun.management.jmxremote \
> -Dcom.sun.management.jmxremote.port=9009 \
> -Dcom.sun.management.jmxremote.ssl=false \
> -Dcom.sun.management.jmxremote.authenticate=false \
> $HADOOP_NAMENODE_OPTS
>            Reporter: Yanlei Yu
>            Priority: Critical
>
> In the cluster, the memory usage of the Namenode exceeds the -Xmx setting
> (-Xmx = 280GB).
> The actual memory usage of the Namenode is 479GB.
> Output via pmap:
> Address      Perm Offset Device Inode Size      Rss       Pss       Referenced Anonymous Swap Locked Mapping
> 2b42f000     rw-p 00:00  0            294174720 293756960 293756960 293756960  293756960 0    0
> 01e21000     rw-p 00:00  0            195245456 195240848 195240848 195240848  195240848 0    0      [heap]
> 2b897c00     rw-p 00:00  0            9246724   9246724   9246724   9246724    9246724   0    0
> 2b8bb0905000 rw-p 00:00  0            1781124   1754572   1754572   1754572    1754572   0    0
> 2b893600     rw-p 00:00  0            1146880   1002084   1002084   1002084    1002084   0    0
> 2b42db652000 rwxp 00:00  0            57792     55252     55252     55252      55252     0    0
> 2b42ec12a000 rw-p 00:00  0            25696     24700     24700     24700      24700     0    0
> 2b42ef25b000 rw-p 00:00  0            9988      8972      8972      8972       8972      0    0
> 2b8c1d467000 rw-p 00:00  0            9216      8204      8204      8204       8204      0    0
> 2b8d6f8db000 rw-p 00:00  0            7160      6228      6228      6228       6228      0    0
> The first mapping corresponds to the -Xmx heap, and [heap] is unusually
> large, so a memory leak is suspected!
>
> * [heap] is associated with malloc
> After enabling Native Memory Tracking and checking with jcmd in the test
> environment, we found that the malloc part of the Internal area increased
> significantly when the client was writing a gz file (-Xmx=40g in the test
> environment, and the Internal area was 900MB before the client wrote):
> Total: reserved=47276MB, committed=47070MB
> - Java Heap (reserved=40960MB, committed=40960MB)
>             (mmap: reserved=40960MB, committed=40960MB)
>
> -     Class (reserved=53MB, committed=52MB)
>             (classes #7423)
>             (malloc=1MB #17053)
>             (mmap: reserved=52MB, committed=52MB)
>
> -    Thread (reserved=2145MB, committed=2145MB)
>             (thread #2129)
>             (stack: reserved=2136MB, committed=2136MB)
>             (malloc=7MB #10673)
>             (arena=2MB #4256)
>
> -      Code (reserved=251MB, committed=45MB)
>             (malloc=7MB #10661)
>             (mmap: reserved=244MB, committed=38MB)
>
> - GC
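The NMT summary quoted above can be checked mechanically. A small illustrative sketch (not part of the issue) that parses the per-category committed sizes from `jcmd <pid> VM.native_memory summary` output, to compare the JVM-tracked total against the configured -Xmx:

```python
import re

# Illustrative sketch: sum the "committed" sizes per NMT category from
# `jcmd <pid> VM.native_memory summary` output. Real NMT output may use
# KB and extra indentation; the sample mirrors the MB figures in the issue.
NMT_LINE = re.compile(r"-\s+(.+?)\s+\(reserved=(\d+)MB, committed=(\d+)MB\)")

def committed_by_category(nmt_output: str) -> dict:
    """Map NMT category name -> committed size in MB."""
    return {m.group(1): int(m.group(3)) for m in NMT_LINE.finditer(nmt_output)}

sample = """
- Java Heap (reserved=40960MB, committed=40960MB)
- Class (reserved=53MB, committed=52MB)
- Thread (reserved=2145MB, committed=2145MB)
- Code (reserved=251MB, committed=45MB)
"""
sizes = committed_by_category(sample)
print(sizes["Java Heap"])   # 40960
print(sum(sizes.values()))  # 43202
```

Note that NMT only accounts for JVM-tracked memory; glibc malloc growth outside these categories (the suspected [heap] leak) shows up as the gap between the NMT total and the process RSS from pmap.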
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692522#comment-17692522 ] ASF GitHub Bot commented on HDFS-16917: --- hadoop-yetus commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1441280753 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 47s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 16m 50s | | Maven dependency ordering for branch | | -1 :x: | mvninstall | 30m 58s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/branch-mvninstall-root.txt) | root in trunk failed. | | -1 :x: | compile | 13m 22s | [/branch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/branch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | compile | 12m 0s | [/branch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in trunk failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. 
| | +1 :green_heart: | checkstyle | 3m 40s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 5s | | trunk passed | | +1 :green_heart: | javadoc | 2m 2s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 16s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 5m 41s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 38s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 36s | | the patch passed | | -1 :x: | compile | 13m 12s | [/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | javac | 13m 12s | [/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | compile | 11m 53s | [/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. 
| | -1 :x: | javac | 11m 53s | [/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. | | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/blanks-eol.txt) | The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -0 :warning: | checkstyle | 3m 20s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/14/artifact/out/results-checkstyle-root.txt) | root: The patch generated 1 new + 205 unchanged - 0 fixed = 206 total (was 205) | | +1 :green_heart: | mvnsite | 2m 55s | | the patch passed | | +1 :green_heart: | javadoc | 1m 49s | |
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692500#comment-17692500 ] ASF GitHub Bot commented on HDFS-16917: --- hadoop-yetus commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1441251613 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 37s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 15m 53s | | Maven dependency ordering for branch | | -1 :x: | mvninstall | 31m 14s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/branch-mvninstall-root.txt) | root in trunk failed. | | -1 :x: | compile | 13m 24s | [/branch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/branch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | compile | 11m 59s | [/branch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in trunk failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. 
| | +1 :green_heart: | checkstyle | 4m 19s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 6s | | trunk passed | | +1 :green_heart: | javadoc | 2m 0s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 13s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 5m 45s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 47s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 30s | | the patch passed | | -1 :x: | compile | 13m 16s | [/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | javac | 13m 16s | [/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | compile | 11m 51s | [/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. 
| | -1 :x: | javac | 11m 51s | [/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. | | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/blanks-eol.txt) | The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -0 :warning: | checkstyle | 3m 21s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/13/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 205 unchanged - 0 fixed = 207 total (was 205) | | +1 :green_heart: | mvnsite | 2m 50s | | the patch passed | | +1 :green_heart: | javadoc | 1m 46s | |
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692493#comment-17692493 ] ASF GitHub Bot commented on HDFS-16917: --- hadoop-yetus commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1441216333 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 39s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 15m 40s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 31m 11s | | trunk passed | | +1 :green_heart: | compile | 23m 5s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 20m 32s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 3m 46s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 26s | | trunk passed | | +1 :green_heart: | javadoc | 2m 27s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 40s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 8s | | trunk passed | | +1 :green_heart: | shadedclient | 26m 31s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 33s | | the patch passed | | +1 :green_heart: | compile | 22m 27s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 22m 27s | | the patch passed | | +1 :green_heart: | compile | 20m 33s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 20m 33s | | the patch passed | | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/11/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -0 :warning: | checkstyle | 4m 57s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/11/artifact/out/results-checkstyle-root.txt) | root: The patch generated 1 new + 140 unchanged - 0 fixed = 141 total (was 140) | | +1 :green_heart: | mvnsite | 3m 35s | | the patch passed | | +1 :green_heart: | javadoc | 2m 19s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 40s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 6m 22s | | the patch passed | | +1 :green_heart: | shadedclient | 26m 34s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 18m 15s | | hadoop-common in the patch passed. | | -1 :x: | unit | 210m 48s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/11/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 14s | | The patch does not generate ASF License warnings. 
| | | | 458m 33s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestObserverNode | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/11/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/5397 | | Optional Tests | dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle | | uname | Linux c1a9574555cb 4.15.0-200-generic #211-Ubuntu
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692486#comment-17692486 ] ASF GitHub Bot commented on HDFS-16917: --- hadoop-yetus commented on PR #5397: URL: https://github.com/apache/hadoop/pull/5397#issuecomment-1441210715 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 45s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 1s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 16m 3s | | Maven dependency ordering for branch | | -1 :x: | mvninstall | 30m 43s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/branch-mvninstall-root.txt) | root in trunk failed. | | -1 :x: | compile | 13m 29s | [/branch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/branch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in trunk failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | compile | 12m 4s | [/branch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in trunk failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. 
| | +1 :green_heart: | checkstyle | 3m 35s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 2s | | trunk passed | | +1 :green_heart: | javadoc | 2m 0s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 2m 11s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 5m 42s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 42s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 34s | | the patch passed | | -1 :x: | compile | 13m 17s | [/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | javac | 13m 17s | [/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/patch-compile-root-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | root in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | -1 :x: | compile | 11m 54s | [/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. 
| | -1 :x: | javac | 11m 54s | [/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_352-8u352-ga-1~20.04-b08.txt) | root in the patch failed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08. | | -1 :x: | blanks | 0m 0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/blanks-eol.txt) | The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | -0 :warning: | checkstyle | 4m 46s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5397/12/artifact/out/results-checkstyle-root.txt) | root: The patch generated 1 new + 205 unchanged - 0 fixed = 206 total (was 205) | | +1 :green_heart: | mvnsite | 2m 59s | | the patch passed | | +1 :green_heart: | javadoc | 1m 50s | |
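The metric HDFS-16917 adds is conceptually simple: bytes read divided by read duration, tracked as quantiles. A rough sketch of the idea (illustrative only; the DataNode uses Hadoop's MutableQuantiles with streaming estimation, not this naive sort-based percentile):

```python
# Illustrative sketch: per-read transfer rates (MB/s) and a naive
# percentile over the collected samples.

def transfer_rate_mb_s(bytes_read: int, duration_ms: int) -> float:
    """Transfer rate in MB/s for one read; 0.0 guards a zero duration."""
    if duration_ms <= 0:
        return 0.0
    return (bytes_read / (1024 * 1024)) / (duration_ms / 1000.0)

def percentile(samples, q):
    """Naive q-quantile (0 <= q <= 1) by sorting; fine for a sketch."""
    s = sorted(samples)
    idx = min(len(s) - 1, int(q * len(s)))
    return s[idx]

# (bytes read, duration in ms) for three hypothetical block reads
reads = [(64 * 1024 * 1024, 500), (128 * 1024 * 1024, 2000), (32 * 1024 * 1024, 250)]
rates = [transfer_rate_mb_s(b, ms) for b, ms in reads]
print(sorted(rates))  # [64.0, 128.0, 128.0]
```

Exposing the rate rather than raw bytes/latency lets slow-disk or slow-network DataNodes stand out even when read sizes vary.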
[jira] [Commented] (HDFS-16932) Mockito causing ClassCastException
[ https://issues.apache.org/jira/browse/HDFS-16932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692461#comment-17692461 ]

Steve Vaughan commented on HDFS-16932:
--------------------------------------

It looks like the problem is caused by a Mockito spy (a doAnswer on getBlocks within TestBalancer) whose answer ends up handling calls made through another Mockito spy.

> Mockito causing ClassCastException
> ----------------------------------
>
>                 Key: HDFS-16932
>                 URL: https://issues.apache.org/jira/browse/HDFS-16932
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.4.0
>         Environment: Running in the Hadoop development environment in docker,
> running mvn.
>            Reporter: Steve Vaughan
>            Priority: Major
>
> Running tests in TestBalancerRPCDelay fails because of ClassCastExceptions
> introduced by Mockito. As an example, in this stack trace note that the
> RedundancyMonitor is calling "isRunning" but incorrectly ends up being routed
> to getBlocks (which returns BlocksWithLocations) via TestBalancer and a
> Mockito spy. This is ultimately reported as a failure during the shutdown
> process.
> {{Exception in thread "RedundancyMonitor" java.lang.ClassCastException:
> java.lang.Boolean cannot be cast to
> org.apache.hadoop.hdfs.server.protocol.BlocksWithLocations
>     at org.apache.hadoop.hdfs.server.balancer.TestBalancer$2.answer(TestBalancer.java:1865)
>     at org.apache.hadoop.hdfs.server.balancer.TestBalancer$2.answer(TestBalancer.java:1858)
>     at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39)
>     at org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96)
>     at org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29)
>     at org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35)
>     at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61)
>     at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49)
>     at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$MockitoMock$1070381809.isRunning(Unknown Source)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:5155)
>     at java.lang.Thread.run(Thread.java:750)}}
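The failure mode can be modeled outside Mockito: an answer installed for one method is invoked for a different method on the same object, and the unrelated return value fails the cast (Boolean vs BlocksWithLocations above). A Python stand-in for the Java/Mockito machinery (illustrative; class and method names here are hypothetical), showing the fix of dispatching on the invoked method's name instead of answering unconditionally:

```python
# Illustrative stand-in for the Mockito-spy misrouting: if a stubbed
# answer is applied to every invocation, isRunning() would receive the
# getBlocks() answer (a block list, not a bool) and the cast fails.
# Dispatching on the invoked method name avoids that.

class SpyHandler:
    def __init__(self, answers):
        self.answers = answers  # method name -> answer callable

    def handle(self, method, real_call):
        # Only use the stubbed answer for the method it was stubbed on;
        # everything else falls through to the real implementation.
        answer = self.answers.get(method)
        return answer() if answer is not None else real_call()

handler = SpyHandler({"getBlocks": lambda: ["block-1", "block-2"]})
print(handler.handle("getBlocks", list))          # ['block-1', 'block-2']
print(handler.handle("isRunning", lambda: True))  # True, not a block list
```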
[jira] [Created] (HDFS-16932) Mockito causing ClassCastException
Steve Vaughan created HDFS-16932: Summary: Mockito causing ClassCastException Key: HDFS-16932 URL: https://issues.apache.org/jira/browse/HDFS-16932 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Affects Versions: 3.4.0 Environment: Running in the Hadoop development environment in docker, running mvn. Reporter: Steve Vaughan Running tests in TestBalancerRPCDelay fails because of ClassCastExceptions introduced by Mockito. As an example, in this stack trace note that the RedundancyMonitor is calling "isRunning" but incorrectly ends up being routed to getBlocks (which returns BlocksWithLocations) via TestBalancer and a Mockito Spy. This ultimately is reported as a failure during the shutdown process.
Exception in thread "RedundancyMonitor" java.lang.ClassCastException: java.lang.Boolean cannot be cast to org.apache.hadoop.hdfs.server.protocol.BlocksWithLocations
    at org.apache.hadoop.hdfs.server.balancer.TestBalancer$2.answer(TestBalancer.java:1865)
    at org.apache.hadoop.hdfs.server.balancer.TestBalancer$2.answer(TestBalancer.java:1858)
    at org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39)
    at org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96)
    at org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29)
    at org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35)
    at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61)
    at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49)
    at org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$MockitoMock$1070381809.isRunning(Unknown Source)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:5155)
    at java.lang.Thread.run(Thread.java:750)
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
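The failure mode described above (a boolean-returning method routed to an answer that returns a BlocksWithLocations) can be reproduced outside Mockito with a plain dynamic proxy. This is a minimal sketch, not the HDFS test code: the `NamesystemLike` interface and `SpyRoutingDemo` class are hypothetical stand-ins for `FSNamesystem#isRunning` and `NamenodeProtocol#getBlocks`. The JDK proxy raises the same `ClassCastException` when a handler's return value is incompatible with the declared return type.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical stand-ins for the two methods involved; not the real Hadoop interfaces.
interface NamesystemLike {
  boolean isRunning();
  Object getBlocks();
}

public class SpyRoutingDemo {
  // An Answer-style handler that assumes every intercepted call is getBlocks()
  // and unconditionally returns a BlocksWithLocations-like payload.
  static NamesystemLike answerEverythingWithBlocks() {
    InvocationHandler handler = (proxy, method, args) -> new Object();
    return (NamesystemLike) Proxy.newProxyInstance(
        SpyRoutingDemo.class.getClassLoader(),
        new Class<?>[] {NamesystemLike.class},
        handler);
  }

  static boolean triggersClassCastException() {
    NamesystemLike spy = answerEverythingWithBlocks();
    try {
      spy.isRunning(); // declared boolean, but the handler returns Object
      return false;
    } catch (ClassCastException e) {
      return true; // same failure mode as the RedundancyMonitor stack trace
    }
  }

  public static void main(String[] args) {
    System.out.println("ClassCastException raised: " + triggersClassCastException());
  }
}
```

The same mechanism applies when one spy's `doAnswer` intercepts a call that was meant for a different stubbed method: the answer's return value no longer matches the caller's expected type.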
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692410#comment-17692410 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1115082559 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java: ## @@ -1122,6 +1124,7 @@ public void copyBlock(final ExtendedBlock block, datanode.metrics.incrBytesRead((int) read); datanode.metrics.incrBlocksRead(); datanode.metrics.incrTotalReadTime(duration); + datanode.metrics.addReadTransferRateMBs(DFSUtil.transferRateMBs(read, duration)); Review Comment: updated to log for cases when we get unexpected data. 1) Bytes read is negative 2) Duration is negative or 0. Since we are logging these cases not emitting the data to metrics. > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context
[ https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-16901. -- Fix Version/s: 3.4.0 3.3.6 Resolution: Fixed Thanks, Simba! > RBF: Routers should propagate the real user in the UGI via the caller context > - > > Key: HDFS-16901 > URL: https://issues.apache.org/jira/browse/HDFS-16901 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.6 > > > If the router receives an operation from a proxyUser, it drops the realUser > in the UGI and makes the routerUser the realUser for the operation that goes > to the namenode. > In the namenode UGI logs, we'd like the ability to know the original realUser. > The router should propagate the realUser from the client call as part of the > callerContext. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692385#comment-17692385 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1114847865 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java: ## @@ -1122,6 +1124,7 @@ public void copyBlock(final ExtendedBlock block, datanode.metrics.incrBytesRead((int) read); datanode.metrics.incrBlocksRead(); datanode.metrics.incrTotalReadTime(duration); + datanode.metrics.addReadTransferRateMBs(DFSUtil.transferRateMBs(read, duration)); Review Comment: I was thinking we can publish -1 for the error scenarios which can be visualized from the graphs as well. For duration of 0, changing it to 1 will give us the number of bytes if transferred. > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692386#comment-17692386 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1115011334 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java: ## @@ -1936,4 +1936,20 @@ public static boolean isParentEntry(final String path, final String parent) { return path.charAt(parent.length()) == Path.SEPARATOR_CHAR || parent.equals(Path.SEPARATOR); } + + /** + * Calculate the transfer rate in megabytes/second. Return -1 for any negative input. + * @param bytes bytes + * @param durationMS duration in milliseconds + * @return the number of megabytes/second of the transfer rate + */ + public static long transferRateMBs(long bytes, long durationMS) { Review Comment: Added unit tests > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16901) RBF: Routers should propagate the real user in the UGI via the caller context
[ https://issues.apache.org/jira/browse/HDFS-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692379#comment-17692379 ] ASF GitHub Bot commented on HDFS-16901: --- omalley merged PR #5346: URL: https://github.com/apache/hadoop/pull/5346 > RBF: Routers should propagate the real user in the UGI via the caller context > - > > Key: HDFS-16901 > URL: https://issues.apache.org/jira/browse/HDFS-16901 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > > If the router receives an operation from a proxyUser, it drops the realUser > in the UGI and makes the routerUser the realUser for the operation that goes > to the namenode. > In the namenode UGI logs, we'd like the ability to know the original realUser. > The router should propagate the realUser from the client call as part of the > callerContext. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13522) HDFS-13522: Add federated nameservices states to client protocol and propagate it between routers and clients.
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692377#comment-17692377 ] Simbarashe Dzinamarira commented on HDFS-13522: --- Hi [~tasanuma] I agree porting back these changes to 2.7.x would be more work. Regarding Design A and Design B. Initially, Design A would msync only for the first call, while Design B would always msync. The decision to msync or not would just be governed by whether the federatedState is present in the RPC header. We then changed Design A to direct the first call to the active instead of msync. With this change, it is then more complicated to add the always msync option since more information is needed to distinguish between 1) a new client performing its first call. 2) An old client performing an n-th call. So right now we do not have design B implemented. One part of design B that is implemented is limiting the rpc size, through "dfs.federation.router.observer.federated.state.propagation.maxsize". However, instead of falling back to always-msync, it falls back to directing calls to the active. There is no active discussion of adding in the always-msync behavior. > HDFS-13522: Add federated nameservices states to client protocol and > propagate it between routers and clients. > -- > > Key: HDFS-13522 > URL: https://issues.apache.org/jira/browse/HDFS-13522 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: federation, namenode >Reporter: Erik Krogen >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: HDFS-13522.001.patch, HDFS-13522.002.patch, > HDFS-13522_WIP.patch, RBF_ Observer support.pdf, Router+Observer RPC > clogging.png, ShortTerm-Routers+Observer.png, > observer_reads_in_rbf_proposal_simbadzina_v1.pdf, > observer_reads_in_rbf_proposal_simbadzina_v2.pdf > > Time Spent: 20h 50m > Remaining Estimate: 0h > > Changes will need to occur to the router to support the new observer node. 
> One such change will be to make the router understand the observer state, > e.g. {{{}FederationNamenodeServiceState{}}}. > This patch captures the state of all namespaces in the routers and propagates > it to clients. A follow up patch will change router behavior to direct > requests to the observer. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16931) Observer nn delete blocks asynchronously when tail OP_DELETE editlog
[ https://issues.apache.org/jira/browse/HDFS-16931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692354#comment-17692354 ] ASF GitHub Bot commented on HDFS-16931: --- hadoop-yetus commented on PR #5424: URL: https://github.com/apache/hadoop/pull/5424#issuecomment-1440749446 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 21s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 2s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 2s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 52m 12s | | trunk passed | | +1 :green_heart: | compile | 1m 38s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | compile | 1m 22s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 1m 10s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 40s | | trunk passed | | +1 :green_heart: | javadoc | 1m 12s | | trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javadoc | 1m 34s | | trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 3m 42s | | trunk passed | | +1 :green_heart: | shadedclient | 29m 47s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 36s | | the patch passed | | +1 :green_heart: | compile | 1m 30s | | the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 | | +1 :green_heart: | javac | 1m 30s | | the patch passed | | +1 :green_heart: | compile | 1m 21s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | javac | 1m 21s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 55s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5424/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs-project/hadoop-hdfs: The patch generated 7 new + 118 unchanged - 0 fixed = 125 total (was 118) | | +1 :green_heart: | mvnsite | 1m 26s | | the patch passed | | -1 :x: | javadoc | 0m 58s | [/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5424/1/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04.txt) | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04. | | +1 :green_heart: | javadoc | 1m 28s | | the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 3m 45s | | the patch passed | | +1 :green_heart: | shadedclient | 30m 17s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 269m 3s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5424/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 0m 50s | | The patch does not generate ASF License warnings. 
| | | | 406m 21s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.server.namenode.TestNameEditsConfigs | | | hadoop.fs.TestHDFSFileContextMainOperations | | | hadoop.hdfs.TestAclsEndToEnd | | | hadoop.hdfs.server.namenode.snapshot.TestRandomOpsWithSnapshots | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshot | | | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5424/1/artifact/out/Dockerfile | | GITHUB PR |
[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read
[ https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692337#comment-17692337 ] ASF GitHub Bot commented on HDFS-16896: --- mkuchenbecker commented on code in PR #5322: URL: https://github.com/apache/hadoop/pull/5322#discussion_r1114855356 ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -197,6 +197,15 @@ private void clearLocalDeadNodes() { deadNodes.clear(); } + /** + * Clear list of ignored nodes used for hedged reads. + */ + private void clearIgnoredNodes(Collection ignoredNodes) { Review Comment: Nit. Param documentation. ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -1337,8 +1352,12 @@ private void hedgedFetchBlockByteRange(LocatedBlock block, long start, } catch (InterruptedException ie) { // Ignore and retry } -if (refetch) { - refetchLocations(block, ignored); +// If refetch is true, then all nodes are in deadNodes or ignoredNodes. +// We should loop through all futures and remove them, so we do not +// have concurrent requests to the same node. +// Once all futures are cleared, we can clear the ignoredNodes and retry. Review Comment: Ignored and dead nodes are cleared, correct? ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -224,7 +233,7 @@ boolean deadNodesContain(DatanodeInfo nodeInfo) { } /** - * Grab the open-file info from namenode + * Grab the open-file info from namenode. Review Comment: Is this change needed? ## hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java: ## @@ -197,6 +197,15 @@ private void clearLocalDeadNodes() { deadNodes.clear(); } + /** + * Clear list of ignored nodes used for hedged reads. 
+ */ + private void clearIgnoredNodes(Collection ignoredNodes) { Review Comment: Does it make more sense to clean up ignored / dead in the same function so we don't need to worry about calling both? ## hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java: ## @@ -603,7 +603,9 @@ public Void answer(InvocationOnMock invocation) throws Throwable { input.read(0, buffer, 0, 1024); Assert.fail("Reading the block should have thrown BlockMissingException"); } catch (BlockMissingException e) { - assertEquals(3, input.getHedgedReadOpsLoopNumForTesting()); + // The result of 9 is due to 2 blocks by 4 iterations plus one because + // hedgedReadOpsLoopNumForTesting is incremented at start of the loop. + assertEquals(9, input.getHedgedReadOpsLoopNumForTesting()); Review Comment: We are tripling the IO per hedged request? > HDFS Client hedged read has increased failure rate than without hedged read > --- > > Key: HDFS-16896 > URL: https://issues.apache.org/jira/browse/HDFS-16896 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Tom McCormick >Assignee: Tom McCormick >Priority: Major > Labels: pull-request-available > > When hedged read is enabled by HDFS client, we see an increased failure rate > on reads. 
> *stacktrace* > > {code:java} > Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain > block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 > file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc > at > org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060) > at > org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039) > at > org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365) > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535) > at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121) > at > org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112) > at > org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172) > at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137) > at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36) > at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136) > at >
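The retry path discussed in the review comments above can be sketched in miniature. This is an illustrative model, not `DFSInputStream` code: when every candidate node sits in `deadNodes` or `ignoredNodes`, the fix drains the outstanding hedged requests and clears `ignoredNodes` before retrying, so no node ends up serving two concurrent requests. The class and method names here are hypothetical, and leaving `deadNodes` untouched on refetch is one possible policy (the review thread raises the question of whether dead nodes should also be cleared).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HedgedRefetchSketch {
  final Set<String> deadNodes = new HashSet<>();
  final Set<String> ignoredNodes = new HashSet<>();
  final List<String> outstandingRequests = new ArrayList<>();

  // Pick the first node that is neither dead nor already serving a hedged request.
  String chooseNode(List<String> candidates) {
    for (String node : candidates) {
      if (!deadNodes.contains(node) && !ignoredNodes.contains(node)) {
        return node;
      }
    }
    return null; // everything dead or ignored: the caller must refetch
  }

  // One hedged attempt: record the in-flight request and mark the node
  // ignored so the next hedge is sent to a different node.
  boolean tryHedge(List<String> candidates) {
    String node = chooseNode(candidates);
    if (node == null) {
      return false;
    }
    outstandingRequests.add(node);
    ignoredNodes.add(node);
    return true;
  }

  // The fix under discussion: drop all outstanding requests, then clear
  // ignoredNodes so a retry can reach those nodes again. deadNodes is left
  // alone here so known-bad nodes are not immediately retried.
  void refetch() {
    outstandingRequests.clear();
    ignoredNodes.clear();
  }

  public static void main(String[] args) {
    HedgedRefetchSketch s = new HedgedRefetchSketch();
    List<String> nodes = Arrays.asList("dn1", "dn2");
    s.tryHedge(nodes);
    s.tryHedge(nodes);
    System.out.println("hedge with all nodes ignored: " + s.tryHedge(nodes));
    s.refetch();
    System.out.println("hedge after refetch: " + s.tryHedge(nodes));
  }
}
```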
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692331#comment-17692331 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1114852429 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java: ## @@ -1936,4 +1936,20 @@ public static boolean isParentEntry(final String path, final String parent) { return path.charAt(parent.length()) == Path.SEPARATOR_CHAR || parent.equals(Path.SEPARATOR); } + + /** + * Calculate the transfer rate in megabytes/second. Return -1 for any negative input. + * @param bytes bytes + * @param durationMS duration in milliseconds + * @return the number of megabytes/second of the transfer rate + */ + public static long transferRateMBs(long bytes, long durationMS) { +if (bytes < 0 || durationMS < 0) { + return -1; +} +if (durationMS == 0) { + durationMS = 1; +} +return bytes / (1024 * 1024) * 1000 / durationMS; Review Comment: If we want to have the consumer do the conversion, then we can just keep it to bytes/milliseconds. bytes and milliseconds are the units used for most of the metrics in HDFS. Thoughts @xinglin @mkuchenbecker ? > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. 
> Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692329#comment-17692329 ] ASF GitHub Bot commented on HDFS-16917: --- rdingankar commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1114847865 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java: ## @@ -1122,6 +1124,7 @@ public void copyBlock(final ExtendedBlock block, datanode.metrics.incrBytesRead((int) read); datanode.metrics.incrBlocksRead(); datanode.metrics.incrTotalReadTime(duration); + datanode.metrics.addReadTransferRateMBs(DFSUtil.transferRateMBs(read, duration)); Review Comment: I was thinking we can publish -1 for the error scenarios which can be visualized from the graphs as well. > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692324#comment-17692324 ] ASF GitHub Bot commented on HDFS-16917: --- mkuchenbecker commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1114838006 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java: ## @@ -1936,4 +1936,20 @@ public static boolean isParentEntry(final String path, final String parent) { return path.charAt(parent.length()) == Path.SEPARATOR_CHAR || parent.equals(Path.SEPARATOR); } + + /** + * Calculate the transfer rate in megabytes/second. Return -1 for any negative input. + * @param bytes bytes + * @param durationMS duration in milliseconds + * @return the number of megabytes/second of the transfer rate + */ + public static long transferRateMBs(long bytes, long durationMS) { Review Comment: This function needs unit tests. > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. 
> -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692322#comment-17692322 ] ASF GitHub Bot commented on HDFS-16917: --- mkuchenbecker commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1114836649 ## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java: ## @@ -1122,6 +1124,7 @@ public void copyBlock(final ExtendedBlock block, datanode.metrics.incrBytesRead((int) read); datanode.metrics.incrBlocksRead(); datanode.metrics.incrTotalReadTime(duration); + datanode.metrics.addReadTransferRateMBs(DFSUtil.transferRateMBs(read, duration)); Review Comment: You will be publishing the metrics, including the -1 here. Assuming duration is 0, I would expect bytes read to be 0 as well (only instance with sub millisecond transfers should be when there's an error prior to the transfer). Bytes Read 0, Duration 0: Return 0. Bytes Read N, duration 0: Exception which we log. Bytes Read N, duration K: Normal > Add transfer rate quantile metrics for DataNode reads > - > > Key: HDFS-16917 > URL: https://issues.apache.org/jira/browse/HDFS-16917 > Project: Hadoop HDFS > Issue Type: Task > Components: datanode >Reporter: Ravindra Dingankar >Priority: Minor > Labels: pull-request-available > > Currently we have the following metrics for datanode reads. > |BytesRead > BlocksRead > TotalReadTime|Total number of bytes read from DataNode > Total number of blocks read from DataNode > Total number of milliseconds spent on read operation| > We would like to add a new quantile metric calculating the transfer rate for > datanode reads. > This will give us a distribution across a window of the read transfer rate > for each datanode. > Quantiles for transfer rate per host will help in identifying issues like > hotspotting of datasets as well as finding repetitive slow datanodes. 
> -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-16917) Add transfer rate quantile metrics for DataNode reads
[ https://issues.apache.org/jira/browse/HDFS-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692320#comment-17692320 ]

ASF GitHub Bot commented on HDFS-16917:
---

mkuchenbecker commented on code in PR #5397: URL: https://github.com/apache/hadoop/pull/5397#discussion_r1114833478

hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java:

{code:java}
@@ -1936,4 +1936,20 @@ public static boolean isParentEntry(final String path, final String parent) {
     return path.charAt(parent.length()) == Path.SEPARATOR_CHAR
         || parent.equals(Path.SEPARATOR);
   }
+
+  /**
+   * Calculate the transfer rate in megabytes/second. Return -1 for any negative input.
+   * @param bytes bytes
+   * @param durationMS duration in milliseconds
+   * @return the number of megabytes/second of the transfer rate
+   */
+  public static long transferRateMBs(long bytes, long durationMS) {
+    if (bytes < 0 || durationMS < 0) {
+      return -1;
+    }
+    if (durationMS == 0) {
+      durationMS = 1;
+    }
+    return bytes / (1024 * 1024) * 1000 / durationMS;
+  }
{code}

Review Comment: I would make this bytes per second and let the consumer do the conversion.

> Add transfer rate quantile metrics for DataNode reads
> -
>
> Key: HDFS-16917
> URL: https://issues.apache.org/jira/browse/HDFS-16917
> Project: Hadoop HDFS
> Issue Type: Task
> Components: datanode
> Reporter: Ravindra Dingankar
> Priority: Minor
> Labels: pull-request-available
>
> Currently we have the following metrics for DataNode reads.
> || Metric || Description ||
> | BytesRead | Total number of bytes read from DataNode |
> | BlocksRead | Total number of blocks read from DataNode |
> | TotalReadTime | Total number of milliseconds spent on read operations |
> We would like to add a new quantile metric calculating the transfer rate for
> DataNode reads. This will give us a per-DataNode distribution of the read
> transfer rate across a time window.
> Quantiles for transfer rate per host will help identify issues such as
> hotspotting of datasets, as well as repeatedly slow DataNodes.
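The reviewer's suggestion above (report bytes per second and let the consumer convert) also sidesteps a precision pitfall in the quoted patch: dividing by 1024 * 1024 before multiplying truncates any transfer under one megabyte to a rate of 0. A minimal sketch of that alternative follows; the class and method names are hypothetical, not the code that was eventually committed:

```java
public class TransferRateSketch {
  /**
   * Transfer rate in bytes per second. Returns -1 for negative input,
   * and treats a 0 ms duration as 1 ms to avoid division by zero.
   * Multiplying before dividing keeps sub-megabyte transfers from
   * truncating to 0 (note: bytes * 1000 overflows above ~9.2 PB).
   */
  public static long transferRateBytesPerSecond(long bytes, long durationMs) {
    if (bytes < 0 || durationMs < 0) {
      return -1;
    }
    if (durationMs == 0) {
      durationMs = 1;
    }
    return bytes * 1000 / durationMs;
  }
}
```

A consumer wanting MB/s would then divide the result by 1024 * 1024 itself, keeping control over when the rounding happens.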
[jira] [Updated] (HDFS-16931) Observer nn delete blocks asynchronously when tail OP_DELETE editlog
[ https://issues.apache.org/jira/browse/HDFS-16931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-16931:
--
Labels: pull-request-available (was: )

> Observer nn delete blocks asynchronously when tail OP_DELETE editlog
> 
>
> Key: HDFS-16931
> URL: https://issues.apache.org/jira/browse/HDFS-16931
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: kinit
> Priority: Minor
> Labels: pull-request-available
[jira] [Commented] (HDFS-16931) Observer nn delete blocks asynchronously when tail OP_DELETE editlog
[ https://issues.apache.org/jira/browse/HDFS-16931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692196#comment-17692196 ]

ASF GitHub Bot commented on HDFS-16931:
---

liuaer opened a new pull request, #5424: URL: https://github.com/apache/hadoop/pull/5424

Our HDFS cluster holds hundreds of millions of metadata entries, all of them EC data. When the Observer NN replays an OP_DELETE operation that deletes a large directory involving tens of millions of metadata entries, the replay is synchronous and the Observer NN is stuck for nearly 50 minutes. [HDFS-16043](https://issues.apache.org/jira/browse/HDFS-16043) already made this deletion asynchronous on the active NN; this PR does the same for the Observer NN.
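The fix described in the PR, taking bulk block removal off the edit-log tailer's synchronous replay path, can be sketched as a simple producer/consumer handoff. This is a generic illustration with hypothetical names, not the actual NameNode code:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Minimal sketch of deferring bulk block deletion to a background thread,
 * so an edit-log tailer replaying a large OP_DELETE returns quickly
 * instead of blocking for the whole deletion. Hypothetical names.
 */
public class AsyncBlockDeleter {
  private final BlockingQueue<List<Long>> pending = new LinkedBlockingQueue<>();
  private final AtomicLong deleted = new AtomicLong();

  public AsyncBlockDeleter() {
    Thread worker = new Thread(() -> {
      try {
        while (true) {
          List<Long> batch = pending.take();  // blocks until work arrives
          deleted.addAndGet(batch.size());    // stand-in for real block removal
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();   // exit cleanly on shutdown
      }
    });
    worker.setDaemon(true);
    worker.start();
  }

  /** Called by the tailer thread: enqueues and returns immediately. */
  public void submit(List<Long> blockIds) {
    pending.add(blockIds);
  }

  public long deletedCount() {
    return deleted.get();
  }
}
```

The tailer only pays the cost of enqueuing a batch; the actual removal proceeds in the background, which is the essence of both HDFS-16043 (active NN) and this proposal (Observer NN).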
[jira] [Updated] (HDFS-16931) Observer nn delete blocks asynchronously when tail OP_DELETE editlog
[ https://issues.apache.org/jira/browse/HDFS-16931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kinit updated HDFS-16931:
-
Labels: (was: english)
[jira] [Created] (HDFS-16931) Observer nn delete blocks asynchronously when tail OP_DELETE editlog
kinit created HDFS-16931:

Summary: Observer nn delete blocks asynchronously when tail OP_DELETE editlog
Key: HDFS-16931
URL: https://issues.apache.org/jira/browse/HDFS-16931
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: kinit
[jira] [Commented] (HDFS-16930) Update the wrapper for fuse-dfs
[ https://issues.apache.org/jira/browse/HDFS-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692132#comment-17692132 ]

Ayush Saxena commented on HDFS-16930:
-

Added [~chaoheng] as an HDFS contributor so the ticket can be assigned. Welcome to Hadoop!

> Update the wrapper for fuse-dfs
> ---
>
> Key: HDFS-16930
> URL: https://issues.apache.org/jira/browse/HDFS-16930
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: fuse-dfs
> Reporter: Chao-Heng Lee
> Assignee: Chao-Heng Lee
> Priority: Minor
>
> The fuse_dfs_wrapper.sh hasn't been updated in quite a long time.
> Although the documentation mentions that the script may not work out of the
> box, it would be clearer to update any outdated paths. For example, from
> {code:java}
> export LIBHDFS_PATH="$HADOOP_HOME/hadoop-hdfs-project/hadoop-hdfs-native-client/target/usr/local/lib"{code}
> to
> {code:java}
> export LIBHDFS_PATH="$HADOOP_HOME/hadoop-hdfs-project/hadoop-hdfs-native-client/target/native/target/usr/local/lib"{code}
[jira] [Assigned] (HDFS-16930) Update the wrapper for fuse-dfs
[ https://issues.apache.org/jira/browse/HDFS-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayush Saxena reassigned HDFS-16930:
---
Assignee: Chao-Heng Lee
[jira] [Commented] (HDFS-16930) Update the wrapper for fuse-dfs
[ https://issues.apache.org/jira/browse/HDFS-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692115#comment-17692115 ]

Chao-Heng Lee commented on HDFS-16930:
--

I am a newbie and would like to contribute to this issue. Could someone assign it to me?
[jira] [Commented] (HDFS-16930) Update the wrapper for fuse-dfs
[ https://issues.apache.org/jira/browse/HDFS-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692107#comment-17692107 ]

Chao-Heng Lee commented on HDFS-16930:
--

[~chia7712]
[jira] [Created] (HDFS-16930) Update the wrapper for fuse-dfs
Chao-Heng Lee created HDFS-16930:

Summary: Update the wrapper for fuse-dfs
Key: HDFS-16930
URL: https://issues.apache.org/jira/browse/HDFS-16930
Project: Hadoop HDFS
Issue Type: Bug
Components: fuse-dfs
Reporter: Chao-Heng Lee
[jira] [Commented] (HDFS-16882) RBF: Add cache hit rate metric in MountTableResolver#getDestinationForPath
[ https://issues.apache.org/jira/browse/HDFS-16882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17692024#comment-17692024 ]

ASF GitHub Bot commented on HDFS-16882:
---

hadoop-yetus commented on PR #5423: URL: https://github.com/apache/hadoop/pull/5423#issuecomment-1439623278

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 1m 37s | | Docker mode activated. |
| _ Prechecks _ | | | | |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| _ branch-3.3 Compile Tests _ | | | | |
| +1 :green_heart: | mvninstall | 40m 44s | | branch-3.3 passed |
| +1 :green_heart: | compile | 0m 39s | | branch-3.3 passed |
| +1 :green_heart: | checkstyle | 0m 35s | | branch-3.3 passed |
| +1 :green_heart: | mvnsite | 0m 46s | | branch-3.3 passed |
| +1 :green_heart: | javadoc | 1m 8s | | branch-3.3 passed |
| +1 :green_heart: | spotbugs | 1m 29s | | branch-3.3 passed |
| +1 :green_heart: | shadedclient | 27m 12s | | branch has no errors when building and testing our client artifacts. |
| _ Patch Compile Tests _ | | | | |
| +1 :green_heart: | mvninstall | 0m 42s | | the patch passed |
| +1 :green_heart: | compile | 0m 30s | | the patch passed |
| +1 :green_heart: | javac | 0m 30s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 18s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 33s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 51s | | the patch passed |
| +1 :green_heart: | spotbugs | 1m 17s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 38s | | patch has no errors when building and testing our client artifacts. |
| _ Other Tests _ | | | | |
| +1 :green_heart: | unit | 17m 16s | | hadoop-hdfs-rbf in the patch passed. |
| +1 :green_heart: | asflicense | 0m 39s | | The patch does not generate ASF License warnings. |
| | | 124m 40s | | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5423/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/5423 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 36ffdcad9afc 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / b721f8de078a47f847f228ad15b92972739fc451 |
| Default Java | Private Build-1.8.0_352-8u352-ga-1~18.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5423/1/testReport/ |
| Max. process+thread count | 2736 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5423/1/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
> RBF: Add cache hit rate metric in MountTableResolver#getDestinationForPath
> --
>
> Key: HDFS-16882
> URL: https://issues.apache.org/jira/browse/HDFS-16882
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: rbf
> Affects Versions: 3.3.4
> Reporter: ZhangHB
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: locationCache.png
>
> Currently, the default value of
> "dfs.federation.router.mount-table.cache.enable" is true and the default
> value of "dfs.federation.router.mount-table.max-cache-size" is 1.
> But there is no metric that displays the cache hit rate. I think we can add a
> hit rate metric to watch the cache performance and better tune the parameters.
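A hit-rate metric of the kind proposed can be sketched as two counters wrapped around the cache lookup, with the rate computed as hits / (hits + misses). The names below are hypothetical, illustrating the general pattern rather than the actual MountTableResolver code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

/**
 * Minimal sketch of a cache with a hit-rate metric, in the spirit of the
 * location cache in MountTableResolver#getDestinationForPath.
 * Hypothetical names; not the Hadoop RBF implementation.
 */
public class HitRateCache<K, V> {
  private final Map<K, V> cache = new ConcurrentHashMap<>();
  private final LongAdder hits = new LongAdder();    // lookups served from cache
  private final LongAdder misses = new LongAdder();  // lookups that hit the loader

  /** Returns the cached value, loading and caching it on a miss. */
  public V get(K key, Function<K, V> loader) {
    V value = cache.get(key);
    if (value != null) {
      hits.increment();
      return value;
    }
    misses.increment();
    value = loader.apply(key);
    cache.put(key, value);
    return value;
  }

  /** Hit rate in [0, 1]; 0.0 before any lookup has happened. */
  public double hitRate() {
    long h = hits.sum();
    long total = h + misses.sum();
    return total == 0 ? 0.0 : (double) h / total;
  }
}
```

Exposing `hitRate()` through the router's metrics system would then let operators watch the cache's effectiveness and tune `max-cache-size` accordingly.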