Wangda Tan created MAPREDUCE-7101:
-------------------------------------

             Summary: Revisit behavior of JHS scan file behavior
                 Key: MAPREDUCE-7101
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7101
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Wangda Tan

Currently, the JHS scans a directory only if the modification time of the *directory* has changed:

{code}
public synchronized void scanIfNeeded(FileStatus fs) {
  long newModTime = fs.getModificationTime();
  if (modTime != newModTime) {
    <... omitted some logics ...>
    // reset scanTime before scanning happens
    scanTime = System.currentTimeMillis();
    Path p = fs.getPath();
    try {
      scanIntermediateDirectory(p);
{code}

This logic relies on the assumption that a directory's modification time is updated whenever a file is placed under that directory. However, the semantics of a directory's modification time are not consistent across FS implementations. For example, MAPREDUCE-6680 fixed some issues with truncated modification times, and HADOOP-12837 mentions that on S3 the directory's modification time is always 0.

I think we need to revisit this logic to make it work more robustly on different file systems.
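
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the JHS code: the class {{ModTimeScanSketch}}, the {{alwaysScan}} switch, and the initial {{modTime}} of 0 are all assumptions made for this example. It replays the modification-time check against a file system whose directory modification time changes when a file is added, and against one that always reports 0.

```java
/**
 * Simplified stand-in for the per-directory scan check quoted in this issue.
 * Hypothetical sketch only; this is not the Hadoop implementation.
 */
class ModTimeScanSketch {
    // Last directory modification time seen. Starting at 0 is an assumption
    // of this sketch.
    private long modTime = 0L;

    // Hypothetical switch: when true, the modification-time check is bypassed.
    private final boolean alwaysScan;

    // Number of scans actually performed.
    int scans = 0;

    ModTimeScanSketch(boolean alwaysScan) {
        this.alwaysScan = alwaysScan;
    }

    /** Scans only when the directory modification time changed, or when forced. */
    synchronized boolean scanIfNeeded(long dirModTime) {
        if (alwaysScan || modTime != dirModTime) {
            modTime = dirModTime;
            scans++; // a real implementation would list the directory here
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        long[] changingModTimes = {100L, 100L, 200L}; // updated when a file is added
        long[] constantModTimes = {0L, 0L, 0L};       // always 0, as reported for S3

        ModTimeScanSketch a = new ModTimeScanSketch(false);
        for (long t : changingModTimes) {
            a.scanIfNeeded(t);
        }

        ModTimeScanSketch b = new ModTimeScanSketch(false);
        for (long t : constantModTimes) {
            b.scanIfNeeded(t);
        }

        ModTimeScanSketch c = new ModTimeScanSketch(true);
        for (long t : constantModTimes) {
            c.scanIfNeeded(t);
        }

        System.out.println("changing mod times, mod-time check: " + a.scans + " scans");
        System.out.println("constant mod times, mod-time check: " + b.scans + " scans");
        System.out.println("constant mod times, always scan:    " + c.scans + " scans");
    }
}
```

With a constant modification time the check never fires, so files placed in the directory would go unnoticed. Bypassing the check is only one possible direction, and it trades extra directory listings for correctness. Whether that cost is acceptable would need to be evaluated per file system.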