[ https://issues.apache.org/jira/browse/HDFS-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ConfX updated HDFS-17860:
-------------------------
    Description: 
h2. Overview 

A NullPointerException occurs in `BlocksMap.numNodes()` when calling 
`getBlockLocations()`, after a NameNode restart, on a file that was assembled 
via the `concat()` operation. This is a critical production bug that could 
crash the NameNode during normal file read operations.
h2. Reproduction
h3. Quick Start with Bash Script

Use the attached `reproduce.sh` script to reproduce this bug automatically. It 
clones Hadoop 3.3.5, applies the test patch, builds the project, and runs the 
failing test case.
 
{code:bash}
# Make sure you download both reproduce.sh and restart.patch

$ ./reproduce.sh{code}
 
*What the script does:*
1. Clones the Hadoop repository (release 3.3.5 branch)
2. Applies the test patch (`restart.patch`) that adds the reproduction test
3. Builds the Hadoop HDFS module
4. Runs the test case `TestHDFSConcat#testConcatWithRestart`, which 
demonstrates the NullPointerException
 
The bug is confirmed if the test fails with a `NullPointerException` in 
`BlocksMap.numNodes()`.

*Manual Reproduction Steps:*
If you prefer to run the test manually:
{code:java}
mvn surefire:test 
-Dtest=TestHDFSConcat_RestartInjected#testConcat_AfterConcat_NN_Crash{code}
*Test Scenario* (a JUnit sketch of these steps follows the list)
1. Create a target file (`/trg`) with 3 blocks (512 bytes each)
2. Create 10 source files, each with 3 blocks
3. Call `dfs.concat(trgPath, files)` to concatenate all source files into target
4. Restart the NameNode
5. Call `nn.getBlockLocations(trg, 0, trgLen)` on the concatenated file
6. NPE occurs at `BlocksMap.numNodes()`
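
For reference, here is a minimal JUnit sketch of this scenario. The class and 
test names are illustrative; the attached `restart.patch` is the authoritative 
test.
{code:java}
// Minimal sketch of the scenario above, assuming JUnit 4 and MiniDFSCluster;
// names are illustrative, the attached restart.patch is authoritative.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.DFSTestUtil;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.Test;

public class TestConcatRestartRepro {
  @Test
  public void testConcatThenRestart() throws Exception {
    final int blockSize = 512;
    Configuration conf = new HdfsConfiguration();
    conf.setLong(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, blockSize);
    // Allow 512-byte blocks (the configured minimum defaults to 1 MB).
    conf.setLong(DFSConfigKeys.DFS_NAMENODE_MIN_BLOCK_SIZE_KEY, blockSize);
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
    try {
      DistributedFileSystem dfs = cluster.getFileSystem();
      Path trg = new Path("/trg");
      DFSTestUtil.createFile(dfs, trg, 3 * blockSize, (short) 1, 0L);
      Path[] srcs = new Path[10];
      for (int i = 0; i < srcs.length; i++) {
        srcs[i] = new Path("/src" + i);
        DFSTestUtil.createFile(dfs, srcs[i], 3 * blockSize, (short) 1, 0L);
      }
      dfs.concat(trg, srcs);          // target now references 3 + 30 = 33 blocks
      cluster.restartNameNode(true);  // reload FSImage/EditLog
      long trgLen = dfs.getFileStatus(trg).getLen();
      // The NPE surfaces inside this call, thrown from BlocksMap.numNodes()
      cluster.getNameNodeRpc().getBlockLocations(trg.toString(), 0, trgLen);
    } finally {
      cluster.shutdown();
    }
  }
}{code}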
  

*Stack Trace*
{code:java}
java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.numNodes(BlocksMap.java:172)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:1420)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:1382)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlockList(BlockManager.java:1353)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:1503)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:179)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2124)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:769){code}
h2. Root Cause Analysis

*Code Location*
File: `BlocksMap.java:172`
{code:java}
int numNodes(Block b) {
  BlockInfo info = blocks.get(b);  // LINE 172 - NPE HERE
  return info == null ? 0 : info.numNodes();
} {code}
*Why the NPE Occurs*
The NPE happens because a null `Block` parameter `b` is being passed to 
`numNodes()`. When `b` is null, calling `blocks.get(null)` throws a 
`NullPointerException` because:
1. The `blocks` map is likely a `LightWeightGSet` or similar hash map 
implementation
2. The `get()` method calls `b.hashCode()` for lookup
3. Calling `hashCode()` on null throws NPE
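
To illustrate, here is a toy model of the lookup (not the actual 
`LightWeightGSet` source):
{code:java}
// Toy model of a GSet-style hash lookup: computing the bucket index
// dereferences key.hashCode(), so a null key throws NullPointerException,
// matching the observed failure mode.
class TinyGSet<E> {
  private final Object[] buckets = new Object[64];
  private final int mask = buckets.length - 1;

  @SuppressWarnings("unchecked")
  E get(Object key) {
    int index = key.hashCode() & mask;  // NPE here when key == null
    return (E) buckets[index];          // (collision handling omitted)
  }
}{code}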
 

*How Null Blocks Enter the System*

*Debug Investigation*
I added debug logging to `BlockManager.createLocatedBlockList()` to inspect the 
blocks array:
{code:java}
LOG.info("RESTART_DEBUG: createLocatedBlockList called with blocks.length=" + 
blocks.length);
for (int i = 0; i < blocks.length; i++) {
  if (blocks[i] == null) {
    LOG.error("RESTART_DEBUG: blocks[" + i + "] is NULL!");
  }
}{code}
*Finding:* The blocks array itself does NOT contain null elements after 
restart. All 33 blocks in the concatenated file's blocks array are non-null 
BlockInfo objects.
 

*Hypothesis: Stale Block References*
 
After concat, the file's INode contains blocks from the original target file 
PLUS all blocks from the concatenated source files. The test logs show:
 - Before restart: Each source file has 3 blocks (10 files × 3 = 30 blocks)
 - After concat: Target file should have 3 + 30 = 33 blocks
 - After restart: `blocks.length=33` - all blocks present

However, based on the stack trace and the NPE location, the issue likely stems from the following sequence:
 
1. Concat operation moves blocks from source files to target file
2. Source files are deleted after concat
3. Block metadata for source files may be marked for deletion/invalidation
4. After restart, when NameNode reloads FSImage/EditLog:
   - The concatenated file's INode correctly references all 33 blocks
   - BUT some blocks may have been removed from the `BlocksMap` during fsimage 
load
   - Or blocks from deleted source files were never added to BlocksMap
5. When `createLocatedBlock()` is called with one of these "ghost" block 
references:
   - The BlockInfo object exists in the file's blocks array
   - But the actual Block/BlockInfo lookup in BlocksMap fails or returns 
inconsistent state
   - A null block reference propagates to `numNodes()`
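
One way to probe this hypothesis would be to cross-check, after restart, that 
every block in the file's block array still resolves through the BlocksMap. A 
fragment sketch (`inodeFile` and `blockManager` stand for whatever handles are 
in scope at the instrumentation point):
{code:java}
// Fragment sketch: flag "ghost" references, i.e. blocks present in the
// INode's block list but absent from the global BlocksMap.
// BlockManager.getStoredBlock() is the public BlocksMap lookup.
for (BlockInfo bi : inodeFile.getBlocks()) {
  if (blockManager.getStoredBlock(bi) == null) {
    LOG.warn("Ghost reference: block {} is in {}'s block list but missing"
        + " from the BlocksMap", bi, inodeFile.getFullPathName());
  }
}{code}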
h3. Evidence

*Test Log Analysis*
 
*Before Restart (all blocks non-null):*
{code}
2025-12-01 12:44:06,136 [Time-limited test] INFO  BlockManager -
  RESTART_DEBUG: createLocatedBlockList called with blocks.length=3, offset=0, length=1536
  blocks[0] = blk_1073741825_1001, numBytes=512
  blocks[1] = blk_1073741826_1002, numBytes=512
  blocks[2] = blk_1073741827_1003, numBytes=512{code}
 
*After Restart (file now has 33 blocks from concat, all non-null):*
{code}
2025-12-01 12:44:15,034 [IPC Server handler 0] INFO  BlockManager -
  RESTART_DEBUG: createLocatedBlockList called with blocks.length=33, offset=0, length=5120
  blocks[0] = blk_1073741825_1001, numBytes=512
  blocks[1] = blk_1073741826_1002, numBytes=512
  blocks[2] = blk_1073741827_1003, numBytes=512
  blocks[3] = blk_1073741855_1031, numBytes=512
  ... (all 33 blocks non-null)
  blocks[32] = blk_1073741830_1006, numBytes=512{code}
 
*Then the NPE occurs* - suggesting that the null block comes from a code path 
not covered by my debug logging, or that there is a race condition during 
BlocksMap access.
h3. Likely Bug Location

*Most likely: the test reads through a stale FSNamesystem reference*

After the NameNode restarts, the test code still holds a reference to the 
pre-restart FSNamesystem and reads through that stale reference, which causes 
the problem.

If this is the cause, we should either prevent the stale reference (e.g., 
obtain a fresh FSNamesystem after the restart) or fail with a clearer error 
message instead of a bare NPE.
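
A fragment sketch of the first option, assuming the test uses MiniDFSCluster 
(variable names are illustrative):
{code:java}
// After restarting the NameNode, re-fetch handles from the cluster rather
// than reusing references captured before the restart.
cluster.restartNameNode(true);
FSNamesystem freshNamesystem = cluster.getNamesystem();   // post-restart instance
NamenodeProtocols freshNn = cluster.getNameNodeRpc();     // fresh RPC proxy{code}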

*Suspect Area 2: Concat Implementation*
 
The concat operation (in `FSNamesystem` or `FSDirectory`) may not properly 
handle block ownership transfer during the transaction that gets persisted to 
edit log. On restart:
 - FSImage loading might not correctly restore all blocks to BlocksMap
 - Blocks from deleted source files might be in a transitional state

*Suspect Area 3: BlockInfo Reference vs BlocksMap Inconsistency*
 
There may be a race or ordering issue where:
1. INodeFile's blocks array references BlockInfo objects
2. These BlockInfo objects are not yet added to (or have been removed from) the 
global BlocksMap
3. When `createLocatedBlock()` tries to look up block locations, it accesses a 
BlockInfo that's not in the map
 

*Code Path to NPE*
{code}
getBlockLocations()
  → createLocatedBlocks()
    → createLocatedBlockList()
      → createLocatedBlock(blocks[curBlk], ...)  // blocks[curBlk] might be problematic
        → createLocatedBlock(blk, ...)
          → blocksMap.numNodes(blk)  // NPE if blk is somehow null or invalid{code}
h2. Impact

*Production Impact:*
1. NameNode Crash Risk: NPE in RPC handler can crash the NameNode during client 
`getBlockLocations()` calls
2. Data Availability: Files created via `concat()` become unreadable after 
NameNode restart
3. Silent Corruption: The concat operation appears to succeed, but the file is 
broken after restart
 

*Affected Operations*
 - Any `getBlockLocations()` call on concatenated files after restart
 - File reads (since clients call getBlockLocations)
 - MapReduce/Spark jobs reading concatenated files
 - Backup/replication tools accessing these files

h2. Recommended Fix

*Immediate Mitigation*
Add a null-safety check in `BlocksMap.numNodes()`:
{code:java}
int numNodes(Block b) {
  if (b == null) {
    LOG.error("Null block passed to numNodes()!", new Exception("Stack trace"));
    return 0;  // Or throw IOException
  }
  BlockInfo info = blocks.get(b);
  return info == null ? 0 : info.numNodes();
} {code}
And in `BlockManager.createLocatedBlock()`:
{code:java}
private LocatedBlock createLocatedBlock(LocatedBlockBuilder locatedBlocks,
    final BlockInfo blk, final long pos, final AccessMode mode)
        throws IOException {
  if (blk == null) {
    LOG.error("Null block in createLocatedBlock at pos=" + pos, new Exception());
    throw new IOException("Null block reference in file's block list");
  }
  // ... rest of method
} {code}
 

*Root Cause Fix*
 
Requires deeper investigation:
 
1. Audit concat implementation to ensure all blocks are properly:
   - Added to target file's INode
   - Registered in BlocksMap
   - Persisted correctly in edit log
   - Loaded correctly from FSImage/EditLog on restart
 
2. Check FSImage/EditLog loading for concat transactions:
   - Verify blocks from concatenated files are added to BlocksMap
   - Ensure proper ordering of operations during replay
   - Check for race conditions in block map population
 
3. Add consistency checks during NameNode startup:
   - Verify all blocks referenced by INodes exist in BlocksMap
   - Log warnings for orphaned block references
   - Option to auto-repair or fail-safe mode
 
 

*Related Code Files to Investigate*
1. `BlocksMap.java` - Block storage map
2. `BlockManager.java` - Block management and location services
3. `FSNamesystem.java` / `FSDirConcatOp.java` - Concat operation implementation
4. `FSImageFormat.java` / `FSEditLog.java` - Persistence and loading
5. `INodeFile.java` - File inode and block array management

 

I'm more than happy to discuss the potential root cause and fix!

> NPE in BlocksMap After NameNode Concat and Restart
> --------------------------------------------------
>
>                 Key: HDFS-17860
>                 URL: https://issues.apache.org/jira/browse/HDFS-17860
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.3.5
>            Reporter: ConfX
>            Priority: Critical
>              Labels: ai-tooling
>         Attachments: reproduce.sh, restart.patch
>


