[ 
https://issues.apache.org/jira/browse/HDFS-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072489#comment-14072489
 ] 

Colin Patrick McCabe commented on HDFS-6698:
--------------------------------------------

{code}
+  public long getFileLength() {
+    if (locatedBlocks == null) {
+      return 0;
+    }
+    if (locatedBlocks.isLastBlockComplete()
{code}

What happens if {{locatedBlocks}} is set to null in between the "if" statement 
and the call to {{isLastBlockComplete}}?
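
To make the concern concrete, here is a minimal sketch (not the patch) of one way to avoid the check-then-use race: read the shared field once into a local and work only with that snapshot. {{BlockList}} and {{LengthHolder}} are hypothetical stand-ins for {{LocatedBlocks}} and {{DFSInputStream}}, and the sketch assumes the field is volatile and the referenced object is immutable.

```java
// Hypothetical stand-in for LocatedBlocks (not the real HDFS class).
// Immutable, so any snapshot a reader obtains is internally consistent.
final class BlockList {
    private final long fileLength;

    BlockList(long fileLength) {
        this.fileLength = fileLength;
    }

    long getFileLength() {
        return fileLength;
    }
}

// Hypothetical stand-in for the stream that owns the block list.
final class LengthHolder {
    // Assumption: volatile, so readers see a fully constructed BlockList.
    private volatile BlockList locatedBlocks;

    void setLocatedBlocks(BlockList blocks) {
        this.locatedBlocks = blocks;
    }

    long getFileLength() {
        // Single read of the shared field. A concurrent writer may set the
        // field to null afterwards, but it cannot change this local.
        BlockList snapshot = locatedBlocks;
        if (snapshot == null) {
            return 0;
        }
        return snapshot.getFileLength();
    }
}
```

This only addresses the null dereference. It says nothing about whether the length and the block list stay consistent with each other, which is the larger design question.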

bq. Colin Patrick McCabe The patch attached here seems to respect the existing 
locking model and conditions that prefix length calculations. Or are you 
thinking that we close this issue and do all synchronization changes in 
HDFS-6735, all in the one go? Thanks.

I apologize for slowing things down, but I'd like to have a clear idea of what 
the design is before we start making stuff volatile or atomic.  Otherwise I'm 
afraid our code will get radiation poisoning :)

We should consider some alternate approaches.  For example, instead of 
volatiles, we could use a reader/writer lock.  Or we could use one lock to 
protect length and locatedBlocks and a separate lock for non-positional reads.
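
As a rough illustration of the two-lock idea, the sketch below guards the length state with a {{ReentrantReadWriteLock}} and leaves the object monitor for sequential reads. {{StreamInfo}}, {{infoLock}}, {{update}} and {{sequentialRead}} are hypothetical names, not anything in {{DFSInputStream}}; this is a sketch of one option, not a proposed design.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical class: separates "stream info" locking from the monitor
// that serializes sequential (non-positional) reads.
final class StreamInfo {
    private final ReentrantReadWriteLock infoLock = new ReentrantReadWriteLock();

    // Both fields are guarded by infoLock.
    private long fileLength;
    private boolean blocksFetched;

    // Writer path: called when block locations are (re)fetched.
    void update(long newLength) {
        infoLock.writeLock().lock();
        try {
            fileLength = newLength;
            blocksFetched = true;
        } finally {
            infoLock.writeLock().unlock();
        }
    }

    // Reader path: many threads may hold the read lock at once, and none
    // of them needs the object monitor held by sequentialRead().
    long getFileLength() {
        infoLock.readLock().lock();
        try {
            return blocksFetched ? fileLength : 0;
        } finally {
            infoLock.readLock().unlock();
        }
    }

    // Sequential reads still serialize on the monitor, but take infoLock
    // only briefly when they need the length. Returns bytes "available"
    // past the given position, as a placeholder for real read logic.
    synchronized long sequentialRead(long position) {
        long remaining = getFileLength() - position;
        return Math.max(remaining, 0);
    }
}
```

With this split, a positional read that only needs {{getFileLength()}} would not wait for a long-running sequential read, since the two paths take different locks.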

> try to optimize DFSInputStream.getFileLength()
> ----------------------------------------------
>
>                 Key: HDFS-6698
>                 URL: https://issues.apache.org/jira/browse/HDFS-6698
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-6698.txt, HDFS-6698.txt
>
>
> HBase prefers to invoke read() to serve scan requests and pread() to serve 
> get requests, because pread() holds almost no lock.
> Let's imagine there's a read() running; the definition is:
> {code}
> public synchronized int read
> {code}
> so no other read() request can run concurrently. This is known, but pread() 
> also cannot run, because:
> {code}
>   public int read(long position, byte[] buffer, int offset, int length)
>     throws IOException {
>     // sanity checks
>     dfsClient.checkOpen();
>     if (closed) {
>       throw new IOException("Stream closed");
>     }
>     failures = 0;
>     long filelen = getFileLength();
> {code}
> getFileLength() also needs the lock, so we need to figure out a lock-free 
> implementation of getFileLength() before the HBase multi-stream feature is done.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
