[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977383#comment-13977383
 ] 

Jason Dere commented on MAPREDUCE-5853:
---------------------------------------

It looks like FileSystem's implementation of getContentSummary() is built 
entirely on getFileStatus() and listStatus().  If we remove the 
getContentSummary() override in FilterFileSystem and fall back to the 
FileSystem implementation, would that work correctly, given that 
FilterFileSystem already overrides getFileStatus() and listStatus()?
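
To make the question concrete, here is a toy model of the two behaviours. It 
is not Hadoop code: ToyFs, ToyFilterFs, ToyChecksumFs and the delegateSummary 
flag are hypothetical names invented for illustration, directories and 
recursion are left out, and the only detail taken from Hadoop is the 
".<name>.crc" naming of checksum files. It only shows how virtual dispatch 
would pick up the filtered listStatus() if the summary were computed in the 
base class.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for FileSystem: one flat directory of name -> length. */
class ToyFs {
    private final Map<String, Long> files = new LinkedHashMap<>();

    ToyFs add(String name, long length) {
        files.put(name, length);
        return this;
    }

    /** Analogous to listStatus(): returns every file name. */
    List<String> listStatus() {
        return new ArrayList<>(files.keySet());
    }

    /** Analogous to getFileStatus().getLen(). */
    long getLength(String name) {
        return files.get(name);
    }

    /** Base summary: computed only from listStatus() and getLength(). */
    long getContentLength() {
        long total = 0;
        for (String name : listStatus()) {
            total += getLength(name);
        }
        return total;
    }
}

/** Hypothetical stand-in for FilterFileSystem: wraps another ToyFs. */
class ToyFilterFs extends ToyFs {
    protected final ToyFs fs;
    private final boolean delegateSummary;

    ToyFilterFs(ToyFs fs, boolean delegateSummary) {
        this.fs = fs;
        this.delegateSummary = delegateSummary;
    }

    @Override
    List<String> listStatus() {
        return fs.listStatus();
    }

    @Override
    long getLength(String name) {
        return fs.getLength(name);
    }

    @Override
    long getContentLength() {
        // true:  models the existing override, which asks the wrapped fs directly.
        // false: models the proposal, falling back to the base implementation,
        //        which calls this.listStatus() and so sees subclass filtering.
        return delegateSummary ? fs.getContentLength() : super.getContentLength();
    }
}

/** Hypothetical stand-in for ChecksumFileSystem: hides ".<name>.crc" files. */
class ToyChecksumFs extends ToyFilterFs {
    ToyChecksumFs(ToyFs fs, boolean delegateSummary) {
        super(fs, delegateSummary);
    }

    @Override
    List<String> listStatus() {
        List<String> visible = new ArrayList<>();
        for (String name : fs.listStatus()) {
            boolean isChecksum = name.startsWith(".") && name.endsWith(".crc");
            if (!isChecksum) {
                visible.add(name);
            }
        }
        return visible;
    }
}

class ContentSummaryDemo {
    public static void main(String[] args) {
        ToyFs raw = new ToyFs().add("part-00000", 100L).add(".part-00000.crc", 8L);
        System.out.println(new ToyChecksumFs(raw, true).getContentLength());  // 108
        System.out.println(new ToyChecksumFs(raw, false).getContentLength()); // 100
    }
}
```

In this model the fallback variant excludes the CRC file from the total. 
Whether the real FileSystem.getContentSummary() behaves the same way for 
every FilterFileSystem subclass would still need to be checked against the 
Hadoop source.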

> ChecksumFileSystem.getContentSummary() including contents for crc files 
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5853
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5853
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jason Dere
>
> While trying to track down some differences in Hive statistics between 
> hadoop-1 and hadoop-2, it looks like although 
> ChecksumFileSystem.listStatus() filters out CRC files, getContentSummary() 
> falls back to the FilterFileSystem.getContentSummary() implementation, 
> which calls fs.getContentSummary().  The underlying fs may not apply the 
> same filters as ChecksumFileSystem, so the CRC files can be included in the 
> content summary.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
