ferdelyi commented on PR #8556:
URL: https://github.com/apache/hadoop/pull/8556#issuecomment-4788162391
Change in LogAggregationIndexedFileController.java explained by Claude Code:
1. Core bug fix — UUID validation in loadIndexedLogsMeta (line ~937)
Before:
if (this.uuid == null) {
this.uuid = createUUID(appId);
}
if (uuidReadLen != UUID_LENGTH || !Arrays.equals(this.uuid, uuidRead)) {
After:
byte[] expectedUuid = createUUID(appId);
if (uuidReadLen != UUID_LENGTH || !Arrays.equals(expectedUuid, uuidRead)) {
This is the root fix. The IFile format embeds a UUID derived from the
application ID to tie log files to a specific app. When reading logs, the
controller was checking the file's UUID against this.uuid — an instance field
set once and never reset.
After HADOOP-15984 made HsWebServices a singleton, the same controller
instance handles requests for multiple applications. The instance field
this.uuid would hold the UUID of the first application ever read. Every
subsequent app's log files would fail validation with "UUID mismatch"
and return no results — until JHS restarted.
The fix: derive the expected UUID directly from the appId argument on
every call, never relying on the cached instance field.
---
2. CachedIndexedLogsMeta class removed
The inner class CachedIndexedLogsMeta and its use in getApplicationOwner /
getApplicationAcls are deleted entirely. Both methods now call
loadIndexedLogsMeta() directly:
// Before:
if (this.cachedIndexedLogsMeta == null ||
!this.cachedIndexedLogsMeta.getRemoteLogPath().equals(aggregatedLogPath)) {
this.cachedIndexedLogsMeta = new
CachedIndexedLogsMeta(loadIndexedLogsMeta(...), aggregatedLogPath);
}
return this.cachedIndexedLogsMeta.getCachedIndexedLogsMeta().getUser();
// After:
return loadIndexedLogsMeta(aggregatedLogPath, appId).getUser();
The cache was stale for the same reason — it was an instance field that
persisted across application requests. Removing it also simplifies the code
with no meaningful performance impact (these paths are not hot).
---
3. Write-side state reset in initializeWriter (line ~162)
The cachedIndexedLogsMeta field is replaced by currentWriteAppId. When
initializeWriter is called, it now detects an app change and clears per-app
write state:
if (currentWriteAppId == null || !currentWriteAppId.equals(appId)) {
currentWriteAppId = appId;
indexedLogsMeta = null;
uuid = null;
}
This ensures the write path also starts clean when the controller is
reused for a different application, preventing stale uuid and indexedLogsMeta
from a previous write leaking into a new one.
---
4. Bug fix in loadUUIDFromLogFile (line ~1305)
A pre-existing logic inversion bug was fixed:
// Before (buggy — deletes matching files, keeps mismatches):
if (actual != uuid.length || Arrays.equals(b, uuid)) {
deleteFileWithRetries(fc, checkPath);
} else if (id == null) {
id = uuid; // also wrong: stores the input parameter, not what was
read
}
// After (correct):
if (actual != uuid.length || !Arrays.equals(b, uuid)) {
deleteFileWithRetries(fc, checkPath);
} else if (id == null) {
id = b; // stores the bytes actually read from the file
}
Two bugs fixed here: the missing ! meant valid UUID files were being
deleted while corrupt ones were kept, and id = uuid was storing the method's
input parameter rather than the bytes read from disk (b).
---
Summary: The commit fixes a singleton-controller bug where instance-level
UUID and metadata cache fields caused all applications except the first to fail
log reads. The fix makes UUID derivation stateless (computed from appId on
every call) and removes the per-instance cache entirely.
A logic inversion bug in the UUID file scan is also corrected as a side
fix.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]