[ 
https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065300#comment-14065300
 ] 

stack commented on HDFS-6698:
-----------------------------

Makes sense, [~xieliang007]. Do you see this as a problem in production?

  public synchronized long getFileLength() {
    return locatedBlocks == null? 0:
        locatedBlocks.getFileLength() + lastBlockBeingWrittenLength;
  }

The last block length does not change after the FSDIS is constructed. Maybe it 
will once 'tail' starts to work, but for now it looks fixed after open. Block 
locations may change during the life of the stream, but the length of any block 
other than the last should not change (it would be a problem if it did; we could 
check each time the located blocks change?).  Could we not make the length a final 
data member rather than recalculate it inside a synchronized block on every call?
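
To make the idea concrete, here is a minimal sketch, not the real DFSInputStream. The class LengthSnapshotStream and the method updateLocatedBlocks are hypothetical names used only for illustration. It assumes the length is recomputed only when block locations are refreshed, and uses a volatile field rather than a final one so that a refresh can still republish it:

```java
// Hypothetical stand-in for DFSInputStream; names are illustrative only.
class LengthSnapshotStream {
    // Published length. volatile lets readers see the latest value
    // without acquiring the stream's monitor.
    private volatile long fileLength;

    // Assumed to be called under the stream lock whenever block
    // locations are fetched or refreshed.
    synchronized void updateLocatedBlocks(long completedBlocksLength,
                                          long lastBlockBeingWrittenLength) {
        fileLength = completedBlocksLength + lastBlockBeingWrittenLength;
    }

    // Lock-free read: a single volatile load, so it does not wait on a
    // thread that is inside a synchronized method of this object.
    long getFileLength() {
        return fileLength;
    }
}
```

With this shape the cost of the calculation moves to the (rare) refresh path, and callers such as pread() would not contend on the monitor just to learn the length.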

Or, maybe easier, change getFileLength to do something like the following:

  public long getFileLength() {
    if (!this.locatedBlocks.isUnderConstruction() &&
        this.locatedBlocks.isLastBlockComplete()) {
      return cachedFileLength;
    }
    cachedFileLength = calculateFileLength();
    return cachedFileLength;
  }

  private synchronized long calculateFileLength() {
    return locatedBlocks == null? 0:
        locatedBlocks.getFileLength() + lastBlockBeingWrittenLength;
  }
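
A self-contained sketch of that caching idea follows, under a few assumptions of mine. The snippet above leaves cachedFileLength undeclared, so this version declares it volatile and uses a sentinel so the fast path is taken only after a value has actually been stored. CachedLengthStream, setBlocks and UNSET are hypothetical names; the real class would derive these values from its LocatedBlocks.

```java
// Hypothetical stand-in for DFSInputStream; names are illustrative only.
class CachedLengthStream {
    private static final long UNSET = -1L;

    // Guarded by "this".
    private long completedBlocksLength;
    private long lastBlockBeingWrittenLength;

    // Holds the length only while the file is finalized; UNSET otherwise.
    private volatile long cachedFileLength = UNSET;

    // Assumed to be called whenever block locations are (re)fetched.
    synchronized void setBlocks(long completedBlocksLength,
                                long lastBlockBeingWrittenLength,
                                boolean underConstruction) {
        this.completedBlocksLength = completedBlocksLength;
        this.lastBlockBeingWrittenLength = lastBlockBeingWrittenLength;
        // Cache only when the length can no longer change.
        cachedFileLength = underConstruction
            ? UNSET
            : completedBlocksLength + lastBlockBeingWrittenLength;
    }

    long getFileLength() {
        long cached = cachedFileLength;   // single volatile read
        if (cached != UNSET) {
            return cached;                // fast path, no lock
        }
        return calculateFileLength();     // slow path, takes the lock
    }

    private synchronized long calculateFileLength() {
        return completedBlocksLength + lastBlockBeingWrittenLength;
    }
}
```

Files still under construction keep paying for the lock here; only finalized files get the lock-free path.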

> try to optimize DFSInputStream.getFileLength()
> ----------------------------------------------
>
>                 Key: HDFS-6698
>                 URL: https://issues.apache.org/jira/browse/HDFS-6698
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>
> HBase prefers to invoke read() when serving a scan request, and pread() 
> when serving a get request, because pread() holds almost no lock.
> Let's imagine there's a read() running. Because the definition is:
> {code}
> public synchronized int read
> {code}
> no other read() request can run concurrently. This is known, but pread() 
> cannot run either, because:
> {code}
>   public int read(long position, byte[] buffer, int offset, int length)
>     throws IOException {
>     // sanity checks
>     dfsClient.checkOpen();
>     if (closed) {
>       throw new IOException("Stream closed");
>     }
>     failures = 0;
>     long filelen = getFileLength();
> {code}
> getFileLength() also needs the lock, so we need to figure out a lock-free 
> implementation of getFileLength() before the HBase multi-stream feature is done. 
> [~saint....@gmail.com]



--
This message was sent by Atlassian JIRA
(v6.2#6252)
