[ https://issues.apache.org/jira/browse/HDFS-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eli Collins updated HDFS-3883:
------------------------------
Description:
I saw the following NPE in a client on branch-1. It looks like the client asked
for a block that is not in the volume map, probably because the block had
already been deleted (otherwise the primary should have a block file). It looks
like a create is racing with a delete.
DFSClient.java in updateBlockInfo..
{code}
Block newBlock = primary.getBlockInfo(last.getBlock());
long newBlockSize = newBlock.getNumBytes(); // <-------- NPE here, newBlock is null
{code}
From getBlockInfo to getStoredBlock..
{code}
public synchronized Block getStoredBlock(long blkid) throws IOException {
  File blockfile = findBlockFile(blkid);
  if (blockfile == null) {
    return null;
  }
{code}
Digging into findBlockFile..
{code}
public synchronized File findBlockFile(long blockId) {
  final Block b = new Block(blockId);
  File blockfile = null;
  ActiveFile activefile = ongoingCreates.get(b);
  if (activefile != null) {
    blockfile = activefile.file;
  }
  if (blockfile == null) {
    blockfile = getFile(b);
  }
  if (blockfile == null) {
    if (DataNode.LOG.isDebugEnabled()) {
      DataNode.LOG.debug("ongoingCreates=" + ongoingCreates);
      DataNode.LOG.debug("volumeMap=" + volumeMap);
    }
  }
  return blockfile;
}
{code}
Into getFile..
{code}
public synchronized File getFile(Block b) {
  DatanodeBlockInfo info = volumeMap.get(b);
  if (info != null) {
    return info.getFile();
  }
  return null;
}
{code}
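To make the chain above easier to follow, here is a minimal, self-contained sketch of how a block that is in neither map ends up as a null that the caller dereferences. It is not the real FSDataset or DFSClient code: BlockLookupSketch, StoredBlock, startCreate, addFinalized, delete and dereferenceThrowsNpe are hypothetical names, and the two maps only stand in for ongoingCreates and volumeMap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the datanode-side lookup; not the real FSDataset.
final class BlockLookupSketch {
  // Hypothetical minimal block type that only carries a length.
  static final class StoredBlock {
    private final long numBytes;
    StoredBlock(long numBytes) { this.numBytes = numBytes; }
    long getNumBytes() { return numBytes; }
  }

  // Stand-ins for ongoingCreates and volumeMap, keyed by block id.
  private final Map<Long, StoredBlock> ongoingCreates = new HashMap<>();
  private final Map<Long, StoredBlock> volumeMap = new HashMap<>();

  synchronized void startCreate(long blockId) {
    ongoingCreates.put(blockId, new StoredBlock(0L));
  }

  synchronized void addFinalized(long blockId, long numBytes) {
    ongoingCreates.remove(blockId);
    volumeMap.put(blockId, new StoredBlock(numBytes));
  }

  // A delete drops the block from both maps.
  synchronized void delete(long blockId) {
    ongoingCreates.remove(blockId);
    volumeMap.remove(blockId);
  }

  // Same shape as the lookup in the description: check ongoing creates first,
  // then the volume map, and return null when neither knows the block.
  synchronized StoredBlock getStoredBlock(long blockId) {
    StoredBlock b = ongoingCreates.get(blockId);
    if (b == null) {
      b = volumeMap.get(blockId);
    }
    return b;
  }

  // Models the unguarded client-side call: dereference the result without a null check.
  static boolean dereferenceThrowsNpe(BlockLookupSketch dataset, long blockId) {
    try {
      dataset.getStoredBlock(blockId).getNumBytes();
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }
}
```

In this model, a lookup that runs after delete() for the same block id returns null, and the unguarded dereference fails in the same way as the line marked in updateBlockInfo.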
was:
I saw the following NPE for a client on branch-1 that looks like it accessed a
block not in the volume map, probably because the block was already deleted
(otherwise the primary should have a block file). We should throw an IOE in
this case.
DFSClient.java..
{code}
Block newBlock = primary.getBlockInfo(last.getBlock());
long newBlockSize = newBlock.getNumBytes(); // <-------- NPE here, newBlock is null
{code}
From getBlockInfo to getStoredBlock..
{code}
public synchronized Block getStoredBlock(long blkid) throws IOException {
  File blockfile = findBlockFile(blkid);
  if (blockfile == null) {
    return null;
  }
{code}
Digging into findBlockFile..
{code}
public synchronized File findBlockFile(long blockId) {
  final Block b = new Block(blockId);
  File blockfile = null;
  ActiveFile activefile = ongoingCreates.get(b);
  if (activefile != null) {
    blockfile = activefile.file;
  }
  if (blockfile == null) {
    blockfile = getFile(b);
  }
  if (blockfile == null) {
    if (DataNode.LOG.isDebugEnabled()) {
      DataNode.LOG.debug("ongoingCreates=" + ongoingCreates);
      DataNode.LOG.debug("volumeMap=" + volumeMap);
    }
  }
  return blockfile;
}
{code}
Into getFile..
{code}
public synchronized File getFile(Block b) {
  DatanodeBlockInfo info = volumeMap.get(b);
  if (info != null) {
    return info.getFile();
  }
  return null;
}
{code}
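The earlier description suggested throwing an IOE when the primary has no information for the block. A minimal sketch of such a guard is below, under stated assumptions: BlockInfoGuardSketch, its nested Block type, the Primary interface and newBlockSize are hypothetical stand-ins for illustration, not the actual DFSClient or datanode protocol types, and this is not necessarily how the issue was resolved.

```java
import java.io.IOException;

// Hypothetical sketch of a client-side null guard; not the actual DFSClient code.
final class BlockInfoGuardSketch {
  // Hypothetical minimal block type that only carries a length.
  static final class Block {
    private final long numBytes;
    Block(long numBytes) { this.numBytes = numBytes; }
    long getNumBytes() { return numBytes; }
  }

  // Hypothetical stand-in for the call to the primary datanode.
  // Returns null when the datanode has no file for the block.
  interface Primary {
    Block getBlockInfo(long blockId) throws IOException;
  }

  // Turn a missing block into an IOException instead of letting the
  // caller hit a NullPointerException on getNumBytes().
  static long newBlockSize(Primary primary, long blockId) throws IOException {
    Block newBlock = primary.getBlockInfo(blockId);
    if (newBlock == null) {
      throw new IOException("Block " + blockId
          + " not found on primary datanode, it may already have been deleted");
    }
    return newBlock.getNumBytes();
  }
}
```

With a guard like this the caller sees a checked exception carrying the block id, which it can handle or report, rather than an NPE with no context.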
> DFSClient NPE due to missing block when opening a file
> ------------------------------------------------------
>
> Key: HDFS-3883
> URL: https://issues.apache.org/jira/browse/HDFS-3883
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 1.0.0
> Reporter: Eli Collins
> Assignee: Eli Collins
> Priority: Minor