Wangda Tan created MAPREDUCE-7101:
-------------------------------------

             Summary: Revisit behavior of JHS scan file behavior
                 Key: MAPREDUCE-7101
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7101
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Wangda Tan


Currently, the JHS scans a directory if the modification time of the *directory* has changed:

{code} 
    public synchronized void scanIfNeeded(FileStatus fs) {
      long newModTime = fs.getModificationTime();
      if (modTime != newModTime) {
        // <... some logic omitted ...>
        // reset scanTime before scanning happens
        scanTime = System.currentTimeMillis();
        Path p = fs.getPath();
        try {
          scanIntermediateDirectory(p);
{code}

This logic relies on the assumption that a directory's modification time is 
updated whenever a file is placed under the directory.

However, the semantics of a directory's modification time are not consistent 
across FS implementations. For example, MAPREDUCE-6680 fixed some issues with 
truncated modification times, and HADOOP-12837 mentions that on S3 a directory's 
modification time is always 0.
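
To make the failure mode concrete, here is a minimal sketch of the gate in 
isolation. It is a simplified model, not the actual JHS code: the class name 
ModTimeGatedScanner, the counter, and the initial value of 0 for the last-seen 
modification time are assumptions made for illustration, and the omitted logic 
from the snippet above is not modelled.

```java
// Simplified, hypothetical model of the mod-time gate; not the actual JHS class.
class ModTimeGatedScanner {
    // Assumption: the last-seen modification time starts at 0.
    private long modTime = 0;
    // Counts how often a scan would have been triggered.
    private int scanCount = 0;

    /** Triggers a "scan" only when the reported directory mod time differs from the last one seen. */
    synchronized void scanIfNeeded(long newModTime) {
        if (modTime != newModTime) {
            modTime = newModTime;
            scanCount++; // stands in for scanIntermediateDirectory(p)
        }
    }

    synchronized int getScanCount() {
        return scanCount;
    }

    public static void main(String[] args) {
        // A file system that bumps the directory mod time when files arrive.
        ModTimeGatedScanner changing = new ModTimeGatedScanner();
        changing.scanIfNeeded(1000L); // first file arrives
        changing.scanIfNeeded(1000L); // nothing new
        changing.scanIfNeeded(2000L); // second file arrives
        System.out.println("changing mod time: scans = " + changing.getScanCount());

        // A store that always reports 0 for directories.
        ModTimeGatedScanner constant = new ModTimeGatedScanner();
        for (int i = 0; i < 3; i++) {
            constant.scanIfNeeded(0L); // files may have arrived, but mod time stays 0
        }
        System.out.println("constant mod time: scans = " + constant.getScanCount());
    }
}
```

Under these assumptions the first scanner scans twice, while the second never 
scans at all, no matter how many files were placed under the directory.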

I think we need to revisit this logic to make it work more robustly on 
different file systems.
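
One possible direction, offered only as a sketch and not as a proposed patch, 
is to keep the modification-time check but also rescan once a maximum interval 
has passed since the last scan, so that a file system with unreliable directory 
modification times is still picked up eventually. The class RobustScanGate, the 
maxScanIntervalMs parameter, and the injected clock are all hypothetical names 
introduced for this example.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch: mod-time check plus a time-based fallback.
class RobustScanGate {
    private final long maxScanIntervalMs; // hypothetical setting, not an existing config key
    private final LongSupplier clock;     // injected so the logic can be exercised without sleeping
    private long modTime = 0;
    private long scanTime = 0;
    private boolean scannedOnce = false;
    private int scanCount = 0;

    RobustScanGate(long maxScanIntervalMs, LongSupplier clock) {
        this.maxScanIntervalMs = maxScanIntervalMs;
        this.clock = clock;
    }

    /** Returns true if a scan was triggered for the given directory mod time. */
    synchronized boolean scanIfNeeded(long newModTime) {
        long now = clock.getAsLong();
        boolean modTimeChanged = modTime != newModTime;
        // Fallback: never scanned yet, or the last scan is older than the allowed interval.
        boolean stale = !scannedOnce || now - scanTime >= maxScanIntervalMs;
        if (modTimeChanged || stale) {
            modTime = newModTime;
            scanTime = now; // reset scanTime before the scan, as in the original snippet
            scannedOnce = true;
            scanCount++;    // stands in for scanIntermediateDirectory(p)
            return true;
        }
        return false;
    }

    synchronized int getScanCount() {
        return scanCount;
    }

    public static void main(String[] args) {
        RobustScanGate gate = new RobustScanGate(60_000L, System::currentTimeMillis);
        // The directory always reports 0, yet the first call still triggers a scan.
        System.out.println("first call scans: " + gate.scanIfNeeded(0L));
        System.out.println("immediate second call scans: " + gate.scanIfNeeded(0L));
    }
}
```

The trade-off in this sketch is extra directory listings on file systems whose 
modification times are already reliable, so the interval would need to be 
chosen with that cost in mind.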



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
