zclllyybb commented on issue #64138:
URL: https://github.com/apache/doris/issues/64138#issuecomment-4628998988

   Breakwater-GitHub-Analysis-Slot: slot_5ce3712a7728
   
   Initial analysis from the current broker code:
   
   This looks like a valid broker-side resource lifetime bug, not just a heap 
sizing symptom. In `BrokerFileSystem.closeFileSystem()`, the underlying Hadoop 
`FileSystem.close()` call is still commented out, so the method only drops 
Doris' reference by setting `dfsFileSystem = null`. The HDFS broker path 
creates the instance through `FileSystem.get(...)` with 
`fs.hdfs.impl.disable.cache=true`, which means the broker's own cache 
effectively owns the lifecycle. When `openReader`, `openWriter`, or other file 
operations hit an `IOException`, they call `fileSystem.closeFileSystem()`. With 
the current implementation, that invalidates the cached wrapper but does not 
close the Hadoop/LiteFS resources. If the concrete filesystem owns Netty 
event-loop threads, repeated failing imports/exports can accumulate those 
resources, which matches the reported `DFS_OPEN` OOM stack and the suggested 
`jstack` symptom.
   
   I also agree that simply reverting the old close change is unsafe. The 
broker stores the same cached `BrokerFileSystem` behind active readers/writers, 
and normal operations use `getDFSFileSystem()` without holding 
`BrokerFileSystem.lock`. Re-enabling an unconditional close in an error or 
expiry path could close the shared `FileSystem` while another thread is still 
using streams opened from it, which explains the historical `Filesystem closed` 
failure mode.
   
   There is a separate real concurrency issue in `updateCachedFileSystem()`. It 
uses `containsKey()`, `get()`, and `put()` as separate operations on a 
`ConcurrentHashMap`. During concurrent first use, two callers can create 
different wrappers and the later `put()` can overwrite a wrapper that another 
thread may already have initialized with a Hadoop `FileSystem`. During expiry 
replacement, the code locks the wrapper read earlier, but then rechecks the map 
separately and can still replace the wrong generation. Both cases can leave a 
`FileSystem` outside the intended cache lifecycle or make close-on-eviction 
unsafe.
   
   Suggested fix direction:
   
   1. Make cache creation/replacement atomic per `FileSystemIdentity`, for 
example with a `compute`/generation-check pattern or an explicit per-identity 
lifecycle lock, so an initialized wrapper cannot be overwritten accidentally.
   2. Reintroduce real resource closing only for a wrapper that has first been 
removed from visibility to new users and is no longer referenced by active 
streams. A reference-count or lease-style wrapper would address the leak 
without reintroducing the old `Filesystem closed` regression.
   3. Add a concurrency test covering simultaneous first creation, open 
failure, and expiry replacement, plus a stress/repro check that repeated 
failing broker opens do not monotonically increase Hadoop/LiteFS/Netty worker 
threads.
   
   Useful confirmation data for the PR review:
   
   - the exact Doris commit/branch and the Hadoop/LiteFS dependency versions 
used by the broker;
   - jstack thread-count snapshots over time and one heap histogram or heap 
dump showing retained `FileSystem` / LiteFS / Netty event-loop objects;
   - a small repro config or script for the failing import/export path that 
repeatedly triggers `DFS_OPEN` failures.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to