Yongkun Wang created FLUME-1391:
-----------------------------------

             Summary: Use sync() instead of syncFs() in HDFS Sink to be compatible with hadoop 0.20.2
                 Key: FLUME-1391
                 URL: https://issues.apache.org/jira/browse/FLUME-1391
             Project: Flume
          Issue Type: Improvement
          Components: Sinks+Sources
    Affects Versions: v1.3.0
            Reporter: Yongkun Wang


The HDFS sink calls syncFs() in HDFSSequenceFile. However, syncFs() is not 
available in legacy Hadoop 0.20.2, which may still be widely used, whereas 
sync() is available in all Hadoop versions. In Hadoop's SequenceFile, syncFs() 
is itself implemented by calling sync() on the underlying output stream:
{code}
    /** create a sync point */
    public void sync() throws IOException {
      if (sync != null && lastSyncPos != out.getPos()) {
        out.writeInt(SYNC_ESCAPE);                // mark the start of the sync
        out.write(sync);                          // write sync
        lastSyncPos = out.getPos();               // update lastSyncPos
      }
    }

    /** flush all currently written data to the file system */
    public void syncFs() throws IOException {
      if (out != null) {
        out.sync();                               // flush contents to file system
      }
    }
{code}

Therefore, using sync() in HDFSSequenceFile may be better.
{code}
  @Override
  public void sync() throws IOException {
    //writer.syncFs(); //for hadoop 0.20.205.0+
    writer.sync(); //support hadoop 0.20.2+
  }
{code}
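
If dropping syncFs() entirely is not desirable, one alternative would be to pick the method at runtime. The following is only a rough sketch of that idea, not Flume or Hadoop code: SyncCompat, LegacyWriter, NewerWriter and flush() are hypothetical names, and the two writer classes are stand-ins for SequenceFile.Writer on old and new Hadoop versions. It looks up syncFs() by reflection and falls back to sync() when the method does not exist.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Hypothetical helper: prefer syncFs() when the writer has it, else use sync(). */
final class SyncCompat {

    /** Stand-in for a 0.20.2-style writer that only offers sync(). Not Hadoop code. */
    public static class LegacyWriter {
        public int syncCalls = 0;
        public void sync() throws IOException { syncCalls++; }
    }

    /** Stand-in for a newer writer that offers both methods. Not Hadoop code. */
    public static class NewerWriter extends LegacyWriter {
        public int syncFsCalls = 0;
        public void syncFs() throws IOException { syncFsCalls++; }
    }

    /** Invokes syncFs() if present, otherwise sync(); returns the name of the method used. */
    static String flush(Object writer) throws IOException {
        Method m;
        try {
            m = writer.getClass().getMethod("syncFs");
        } catch (NoSuchMethodException e) {
            try {
                // Fallback for writers that predate syncFs()
                m = writer.getClass().getMethod("sync");
            } catch (NoSuchMethodException e2) {
                throw new IllegalArgumentException("writer has neither syncFs() nor sync()", e2);
            }
        }
        try {
            m.invoke(writer);
        } catch (InvocationTargetException e) {
            // Unwrap so callers still see the IOException declared by sync()/syncFs()
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IllegalStateException(cause);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return m.getName();
    }
}
```

In a real sink the Method lookup would presumably be done once and cached rather than repeated on every call.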

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
