[
https://issues.apache.org/jira/browse/HADOOP-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628457#action_12628457
]
Doug Cutting commented on HADOOP-3941:
--------------------------------------
> We might need a method for getting the supported algorithms of a file system.
If we remove the "algorithm" parameter to getFileChecksum() then each
FileSystem would simply return checksums using its native algorithm. When
these match, cross-filesystem copies would be checksummed. Later, if we have
filesystems that implement multiple checksum algorithms, we might consider
something more elaborate, but that seems sufficient for now, no?
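The idea behind the proposed getFileChecksum() API can be illustrated with a minimal local sketch: compute a digest per file and declare contents equal when the digests match, so that only the digests (not the file bytes) need to be compared across locations. This is a simplification, assuming MD5 over the raw bytes; the actual HDFS implementation can build its file checksum from the per-block CRCs it already stores, avoiding a re-read of the data. The function names here are hypothetical, not part of the Hadoop API.

```python
import hashlib

def file_md5(path):
    """Stream a file in chunks and return its MD5 hex digest.

    Hypothetical helper standing in for FileSystem.getFileChecksum();
    a real filesystem could return a digest in its native algorithm
    without re-reading the data.
    """
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def same_content(path_a, path_b):
    """Compare two files by digest rather than byte-by-byte.

    If the two filesystems use the same checksum algorithm, matching
    digests mean the contents are (almost certainly) identical.
    """
    return file_md5(path_a) == file_md5(path_b)
```

Note that when the two filesystems return checksums computed with different native algorithms, the digests are simply incomparable and the copy cannot be verified this way, which is why the comment above suggests deferring multi-algorithm support until filesystems actually implement it.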
> Extend FileSystem API to return file-checksums/file-digests
> -----------------------------------------------------------
>
> Key: HADOOP-3941
> URL: https://issues.apache.org/jira/browse/HADOOP-3941
> Project: Hadoop Core
> Issue Type: New Feature
> Components: fs
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
> Attachments: 3941_20080818.patch, 3941_20080819.patch,
> 3941_20080819b.patch, 3941_20080820.patch, 3941_20080826.patch,
> 3941_20080827.patch
>
>
> Suppose we have two files in two locations (possibly two different clusters)
> and the two files have the same size. How can we tell whether their contents
> are the same?
> Currently, the only way is to read both files and compare them byte by byte,
> which is a very expensive operation if the files are huge.
> So, we would like to extend the FileSystem API to support returning
> file-checksums/file-digests.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.