[
https://issues.apache.org/jira/browse/HDFS-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051217#comment-18051217
]
ASF GitHub Bot commented on HDFS-17871:
---------------------------------------
hadoop-yetus commented on PR #8179:
URL: https://github.com/apache/hadoop/pull/8179#issuecomment-3738089455
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 16m 2s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files
found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available.
|
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain
any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include
any new or modified tests. Please justify why no new tests are needed for this
patch. Also please list what manual steps were performed to verify this patch.
|
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 34m 22s | | trunk passed |
| +1 :green_heart: | compile | 0m 46s | | trunk passed with JDK
Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | compile | 1m 10s | | trunk passed with JDK
Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | checkstyle | 0m 31s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 14s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 46s | | trunk passed with JDK
Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 0m 42s | | trunk passed with JDK
Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | spotbugs | 2m 5s | | trunk passed |
| +1 :green_heart: | shadedclient | 27m 23s | | branch has no errors
when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 3s | | the patch passed |
| +1 :green_heart: | compile | 0m 40s | | the patch passed with JDK
Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 0m 40s | | the patch passed |
| +1 :green_heart: | compile | 1m 1s | | the patch passed with JDK
Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javac | 1m 1s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks
issues. |
| +1 :green_heart: | checkstyle | 0m 22s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 7s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 33s | | the patch passed with JDK
Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | javadoc | 0m 33s | | the patch passed with JDK
Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| +1 :green_heart: | spotbugs | 2m 1s | | the patch passed |
| +1 :green_heart: | shadedclient | 26m 55s | | patch has no errors
when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 42m 36s |
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8179/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
| hadoop-hdfs-rbf in the patch passed. |
| +1 :green_heart: | asflicense | 0m 36s | | The patch does not
generate ASF License warnings. |
| | | 164m 8s | | |
| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterTrash |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.52 ServerAPI=1.52 base:
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8179/1/artifact/out/Dockerfile
|
| GITHUB PR | https://github.com/apache/hadoop/pull/8179 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 7a1635c75fa5 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14
20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / f5a6df6a005d4ddaf7e16b9d0ccca1a080dd00f2 |
| Default Java | Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| Multi-JDK versions |
/usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
/usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 |
| Test Results |
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8179/1/testReport/ |
| Max. process+thread count | 3777 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U:
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output |
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8179/1/console |
| versions | git=2.25.1 maven=3.9.11 spotbugs=4.9.7 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
This message was automatically generated.
> Performance Issue: `findDeepest()` in RBF traverses entire tree when parent
> path doesn't exist
> ----------------------------------------------------------------------------------------------
>
> Key: HDFS-17871
> URL: https://issues.apache.org/jira/browse/HDFS-17871
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: rbf
> Reporter: xorsum
> Priority: Minor
> Labels: pull-request-available
> Attachments: BenchmarkFindDeepest.java
>
>
> h2. Problem
> The current implementation of
> `org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver#findDeepest`
> method in RBF has a performance issue: when the tree doesn't contain a
> parent path for the queried path, the `while` loop iterates through *all
> entries* in the TreeMap, resulting in `O(N)` time complexity instead of the
> expected `O(log N)`.
> h2. Example
> Consider a mount table containing:
> - `/user/hive/warehouse/db1.db/table1`
> - `/user/hive/warehouse/db1.db/table2`
> - `/user/hive/warehouse/db2.db/table3`
> When querying a path like `/xyz/random/path/file.txt`, the algorithm will:
> - `floorEntry("/xyz/random/path/file.txt")` returns the largest entry in the
> tree
> - Check `isParentEntry()`: false
> - Call `lowerEntry()` repeatedly until reaching the first entry in the tree
> - Worst case: Iterates through ALL entries in `O(N)` time
> h2. Proposed Solution
> We can improve the `findDeepest` method by checking the parent path directly,
> instead of using `floorEntry`.
> I benchmarked the performance of these two implementations.
> The benchmark results show that the improved method is 50x faster than the
> original method.
> {quote}================================================================================
> BENCHMARK COMPARISON RESULTS
> ================================================================================
> 【Original Method - findDeepest】
> Total Execution Time: 1,924,288,819 ns (1924.29 ms)
> Total Lookup Count: 15,302,856
> Avg Time Per Query: 320,607.93 ns (0.32 ms)
> Avg Lookups Per Query: 2,549.63
> 【Improved Method - findDeepestImproved】
> Total Execution Time: 38,030,989 ns (38.03 ms)
> Total Lookup Count: 24,008
> Avg Time Per Query: 6,336.39 ns (0.01 ms)
> Avg Lookups Per Query: 4.00
> 【Performance Improvement】
> Time Saved: 98.02%
> Lookup Count Reduced: 99.84%
> Speedup Factor: 50.60x
> 【Correctness Verification】
> ✓ All results match! Both methods returned identical results.
> ================================================================================
> {quote}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]