[ https://issues.apache.org/jira/browse/HADOOP-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487127 ]

Konstantin Shvachko commented on HADOOP-1134:
---------------------------------------------

> For this issue the client should continue to behave as it currently does when 
> a checksum file is missing.
> We might change that in a subsequent issue, perhaps urgently.

An urgent +1

Currently, once a missing crc is detected, the client stops checking crcs for 
all blocks of the file until it is reopened.
So yes, missing crcs are ignored, but in a bad way: further verification 
against the crcs that do exist is discarded as well.
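To make the difference concrete, here is a minimal sketch (names and structure 
are hypothetical, not the actual DFSClient code) contrasting the current sticky 
behavior with per-block handling:

{code}
import java.io.IOException;

// Sketch only: hypothetical names, not the real client code.
abstract class ChecksumedReader {
    private boolean verifyChecksum = true;  // current code flips this off for good

    // Current behavior: the first missing crc disables verification
    // for every later block until the file is reopened.
    int readBlockCurrent(long blockId, byte[] buf) throws IOException {
        if (verifyChecksum && !crcExists(blockId)) {
            verifyChecksum = false;         // sticky: later blocks go unverified too
        }
        int n = readRaw(blockId, buf);
        if (verifyChecksum) {
            verifyCrc(blockId, buf, n);
        }
        return n;
    }

    // Per-block handling: skip only the block whose crc is missing,
    // keep verifying every block that still has one.
    int readBlockPerBlock(long blockId, byte[] buf) throws IOException {
        int n = readRaw(blockId, buf);
        if (crcExists(blockId)) {
            verifyCrc(blockId, buf, n);
        }
        return n;
    }

    abstract boolean crcExists(long blockId);
    abstract int readRaw(long blockId, byte[] buf) throws IOException;
    abstract void verifyCrc(long blockId, byte[] buf, int len) throws IOException;
}
{code}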

I agree that for now checksum verification should remain on the client only, 
but in the long term we should also verify on the data-nodes. A good time to 
do that would be when we introduce periodic checksum verification.
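For illustration, something along these lines could run as a background thread 
on the data-node once it owns block-level CRCs (all names are hypothetical; 
nothing like this exists in the current data-node):

{code}
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Sketch of a periodic verification loop: scan local blocks, recompute
// checksums, and report mismatches to the name-node so new replicas are made.
class BlockScanner implements Runnable {
    interface BlockStore {
        List<Long> localBlockIds();
        boolean crcMatches(long blockId) throws IOException;  // recompute and compare
    }
    interface NamenodeReporter {
        void reportCorruptBlock(long blockId);
    }

    private final BlockStore store;
    private final NamenodeReporter namenode;
    private final long scanPeriodMillis;

    BlockScanner(BlockStore store, NamenodeReporter namenode, long scanPeriodMillis) {
        this.store = store;
        this.namenode = namenode;
        this.scanPeriodMillis = scanPeriodMillis;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            for (long blockId : store.localBlockIds()) {
                try {
                    if (!store.crcMatches(blockId)) {
                        namenode.reportCorruptBlock(blockId);  // trigger re-replication
                    }
                } catch (IOException e) {
                    namenode.reportCorruptBlock(blockId);      // unreadable counts as corrupt
                }
            }
            try {
                TimeUnit.MILLISECONDS.sleep(scanPeriodMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
{code}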

I'd like to underline some advantages of using a client program to perform the 
checksum upgrade (see the sketch after this list).
- This code will be used only once, and it should be as isolated as possible.
- The client approach requires very few, if any, changes to the existing 
communication protocols.
- It requires no changes to the name-node code and very few changes to the 
data-node.
- It has the potential to be transformed into a distributed fsck later on.
Everything that Raghu is proposing to run on the data-node during the checksum 
upgrade can be implemented on the client.
Regular upgrade procedures can be used to save the current file system state 
before starting the checksum upgrade, leaving open the possibility of a 
rollback.
We do not need to worry about old data-nodes coming up after the checksum 
upgrade is done: these data-nodes should fail and be reformatted, since the 
cluster has already collected/replicated all the necessary blocks.
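A rough outline of what such a one-shot client tool could look like (the 
interfaces are hypothetical placeholders for the existing read path plus the 
single new data-node call; this is a sketch, not a design):

{code}
import java.io.IOException;
import java.util.List;

// One-shot upgrade driver: walk the namespace, read every block through the
// old per-file checksum path, and ask the holding data-node to store a
// freshly computed block-level CRC. Deletable once the upgrade is done.
class CrcUpgradeClient {
    interface Namespace {
        List<String> allFilePaths() throws IOException;
        List<Long> blocksOf(String path) throws IOException;
    }
    interface DatanodeClient {
        // Read the block, verified against the old .crc data.
        byte[] readVerified(long blockId) throws IOException;
        // The one new call: persist a block-level CRC on the data-node.
        void writeBlockCrc(long blockId, int crc) throws IOException;
    }

    private final Namespace namespace;
    private final DatanodeClient datanodes;

    CrcUpgradeClient(Namespace namespace, DatanodeClient datanodes) {
        this.namespace = namespace;
        this.datanodes = datanodes;
    }

    void upgrade() throws IOException {
        for (String path : namespace.allFilePaths()) {
            for (long blockId : namespace.blocksOf(path)) {
                byte[] data = datanodes.readVerified(blockId); // old checksums still apply
                java.util.zip.CRC32 crc = new java.util.zip.CRC32();
                crc.update(data, 0, data.length);
                datanodes.writeBlockCrc(blockId, (int) crc.getValue());
            }
        }
    }
}
{code}

Note that the driver only uses the existing verified read path, so the upgrade 
itself is protected by the old checksums while the new ones are being written.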


> Block level CRCs in HDFS
> ------------------------
>
>                 Key: HADOOP-1134
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1134
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Raghu Angadi
>         Assigned To: Raghu Angadi
>
> Currently CRCs are handled at the FileSystem level and are transparent to core 
> HDFS. See the recent improvement HADOOP-928 (which can add checksums to a given 
> filesystem) for more about it. Though this has served us well, there are a few 
> disadvantages:
> 1) This doubles the namespace in HDFS (or other filesystem implementations). In 
> many cases, it nearly doubles the number of blocks. Taking the namenode out of 
> CRCs would nearly double namespace performance, both in terms of CPU and 
> memory.
> 2) Since CRCs are transparent to HDFS, it cannot actively detect corrupted 
> blocks. With block-level CRCs, the Datanode can periodically verify the 
> checksums and report corruptions to the namenode so that new replicas can be 
> created.
> We propose to have CRCs maintained for all HDFS data in much the same way as 
> in GFS. I will update the jira with detailed requirements and design. This 
> will preserve the same guarantees provided by the current implementation and 
> will include an upgrade of the current data.
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
