[
https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405968#comment-15405968
]
Vinayakumar B commented on HDFS-8498:
-------------------------------------
We experienced this in one of our test clusters under high load.
*Scenario:*
The error occurred for the HBase RegionServer's WAL file.
1. In HBase, multiple threads perform write, sync and close on the same WAL
file.
2. The actual writer writes the entries, multiple syncer threads call hsync on
the same stream, and a roller thread rolls the WALs at regular intervals, i.e.
closes the current WAL file and opens another one for the next entries.
3. During file close() by the roller, the last block got committed with a
smaller size than is present on all DNs.
4. All IBRs reported by the DNs had a greater length than the length COMMITTED
by the client, so all those replicas were marked as CORRUPT.
5. We use IBR batching with {{dfs.namenode.file.close.num-committed-allowed=1}},
so the client (HBase RS) did not experience any problem, as the file was
closed successfully without waiting for the correct IBR for the last block.
*Current Analysis:*
HDFS-9289 safeguarded the re-assignment of {{DataStreamer#block}} during
pipeline update by making the reference volatile, but it did not actually
protect the contents of the {{block}}.
*Suspected problem is:*
1. ResponseProcessor updates the block size after receiving every ack by
calling {{ExtendedBlock.setNumBytes()}}, which internally updates the numBytes
of the internal {{block}}; this is not thread-safe.
2. LogRoller calls close(), passing {{DataStreamer#block}} as the last block.
At this point, the GUESS is that {{ExtendedBlock.getNumBytes()}} does not
return the latest value written by ResponseProcessor, but some earlier update,
because ExtendedBlock and its internal block are not thread-safe.
With this smaller size, the block gets COMMITTED at the NameNode and all IBRs
get marked as CORRUPT.
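The suspected race can be modeled with a minimal, self-contained sketch (an assumption for illustration only: {{SimplifiedBlock}}, the thread roles, and the sizes are stand-ins, not the real {{org.apache.hadoop.hdfs.protocol.ExtendedBlock}}). One thread keeps bumping a plain, unsynchronized length field as acks arrive, while another thread reads it with no happens-before edge, so the reader may observe a stale, smaller value:

```java
// Minimal model of the suspected race; SimplifiedBlock is a stand-in,
// NOT the real org.apache.hadoop.hdfs.protocol.ExtendedBlock.
class SimplifiedBlock {
    // Plain (non-volatile, unsynchronized) field: under the Java Memory
    // Model, a write by one thread is not guaranteed to be visible to a
    // read by another thread without a happens-before relationship.
    private long numBytes;

    void setNumBytes(long n) { numBytes = n; }
    long getNumBytes()       { return numBytes; }
}

public class CommitRaceSketch {
    public static void main(String[] args) throws InterruptedException {
        SimplifiedBlock block = new SimplifiedBlock();

        // Plays the ResponseProcessor role: bumps the length per ack.
        Thread responder = new Thread(() -> {
            for (long acked = 1; acked <= 1000; acked++) {
                block.setNumBytes(acked);
            }
        });
        responder.start();

        // Plays the closer (LogRoller) role: with no synchronization
        // there is no happens-before edge, so this read may lag behind
        // the latest setNumBytes() -- the block would then be committed
        // with a smaller size than the DNs actually hold.
        long observed = block.getNumBytes();
        System.out.println("closer observed numBytes=" + observed);

        // Thread.join() DOES establish happens-before, so only after
        // join() is the final value guaranteed to be visible.
        responder.join();
        System.out.println("final numBytes=" + block.getNumBytes());
    }
}
```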
*Possible solution:*
Make {{ExtendedBlock}} thread-safe for setNumBytes() and getNumBytes().
If the above analysis makes sense, we can raise a Jira and contribute
the fix.
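A minimal sketch of what that fix could look like (hedged: the real {{ExtendedBlock}} also carries the pool id, block id and genstamp; only the two accessors from the analysis are shown, and the class name is illustrative). Synchronizing both accessors provides mutual exclusion and a happens-before edge, so the closer always reads the latest length:

```java
// Sketch of the proposed fix; a stand-in for the real ExtendedBlock,
// not the actual Hadoop class.
class ThreadSafeBlockSketch {
    private long numBytes;

    // synchronized on both accessors gives mutual exclusion plus a
    // happens-before edge, so getNumBytes() always observes the most
    // recent completed setNumBytes().
    public synchronized void setNumBytes(long n) {
        numBytes = n;
    }

    public synchronized long getNumBytes() {
        return numBytes;
    }
}
```

Since the field here is only ever read and written whole, declaring {{numBytes}} volatile would also suffice for plain get/set visibility; synchronized is the more conservative choice if compound updates are ever added.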
Note:
We hit this issue thrice on a 40-core/380 GB RAM machine. We are trying to
reproduce it again with more logging, but no luck so far.
It was reproduced once with DEBUG logs as well; from those it is confirmed
that the complete() call is sent only after all ACKs are received. But the
DEBUG logs contained no information about the numBytes sent during complete(),
so we could not actually verify that this would be the fix.
> Blocks can be committed with wrong size
> ---------------------------------------
>
> Key: HDFS-8498
> URL: https://issues.apache.org/jira/browse/HDFS-8498
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.5.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Critical
>
> When an IBR for a UC block arrives, the NN updates the expected location's
> block and replica state _only_ if it's on an unexpected storage for an
> expected DN. If it's for an expected storage, only the genstamp is updated.
> When the block is committed, and the expected locations are verified, only
> the genstamp is checked. The size is not checked but it wasn't updated in
> the expected locations anyway.
> A faulty client may misreport the size when committing the block. The block
> is effectively corrupted. If the NN issues replications, the received IBR is
> considered corrupt, the NN invalidates the block, immediately issues another
> replication. The NN eventually realizes all the original replicas are
> corrupt after full BRs are received from the original DNs.