[ 
https://issues.apache.org/jira/browse/HDFS-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223847#comment-13223847
 ] 

jirapos...@reviews.apache.org commented on HDFS-2834:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4212/#review5665
-----------------------------------------------------------


There are some issues around the error cases here -- I think you'd run into 
these bugs if you hit a checksum error, since the "retry on a different node" 
path would trigger in DFSInputStream, but the target buffer's position would 
already have been advanced by the prior attempts.

But getting close!

Is it possible to split the test refactoring out from this change? It's hard 
to tell what has actually changed in the tests versus what was just 
refactored. If splitting it out is a pain in the butt, I'll look more closely.


hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReader.java
<https://reviews.apache.org/r/4212/#comment12328>

    unused import?



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
<https://reviews.apache.org/r/4212/#comment12329>

    Please use Javadoc-style comments for these, rather than //s.
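    For instance, instead of something like this (the field is just 
    illustrative, not taken from the patch):

        // offset of the first chunk in the block
        private long firstChunkOffset;

    use:

        /** Offset of the first chunk in the block. */
        private long firstChunkOffset;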



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
<https://reviews.apache.org/r/4212/#comment12330>

    I envision some silly user setting the buffer size smaller than 
    bytesPerChecksum and getting screwed here. Worth adding a check that the 
    value is in a valid range (throwing an IOE or IllegalArgumentException or 
    something).
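    Something along these lines, as a sketch (the variable names are 
    illustrative, not necessarily what the constructor uses):

        if (bufferSize < bytesPerChecksum) {
          throw new IllegalArgumentException("Buffer size " + bufferSize
              + " must be at least bytesPerChecksum (" + bytesPerChecksum + ")");
        }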



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
<https://reviews.apache.org/r/4212/#comment12331>

    This isn't your bug, but I just realized there's a potential leak here if 
    the skip() calls below fail. I think we need to add a try..finally{} that 
    returns these buffers to the pool if construction fails.
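    Roughly this shape, as a sketch (the pool and skip calls are stand-ins for 
    whatever the constructor actually does):

        ByteBuffer dataBuf = bufferPool.getBuffer(bufferSize);
        ByteBuffer checksumBuf = bufferPool.getBuffer(checksumSize);
        boolean success = false;
        try {
          // the skip() calls that position the streams; these may throw
          skipToOffset(dataIn, firstDataOffset);
          skipToOffset(checksumIn, firstChecksumOffset);
          success = true;
        } finally {
          if (!success) {
            // return the buffers rather than leaking them from the pool
            bufferPool.returnBuffer(dataBuf);
            bufferPool.returnBuffer(checksumBuf);
          }
        }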



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
<https://reviews.apache.org/r/4212/#comment12332>

    Should this line go in a finally{} clause? Otherwise, if 'to' doesn't have 
    enough space for the copy, we'd end up leaving 'from' with the modified 
    limit.
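    i.e. something like this (variable names illustrative; the same pattern 
    would also cover the read in the next comment):

        int oldLimit = from.limit();
        from.limit(from.position() + bytesToCopy);
        try {
          to.put(from);          // may throw BufferOverflowException
        } finally {
          from.limit(oldLimit);  // restore even if the copy fails
        }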



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
<https://reviews.apache.org/r/4212/#comment12333>

    In a finally{} clause, perhaps, so the limit isn't messed up by a failed 
    read.



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
<https://reviews.apache.org/r/4212/#comment12337>

    The error conditions here don't quite work right. For example, if a 
    checksum error occurs, the buffer's position will still be updated.
    
    Above, there's a similar problem if "phase 1" succeeds but "phase 2" 
    encounters an error -- the position will change but the method will throw.
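    One way to keep the caller's buffer untouched on failure, as a sketch (the 
    two "phases" and the method names just stand in for the actual read 
    logic):

        int startPos = buf.position();
        try {
          drainBufferedData(buf);     // "phase 1"
          readAndVerifyChunks(buf);   // "phase 2", may throw ChecksumException
        } catch (IOException e) {
          // roll the position back so a retry on another node starts clean
          buf.position(startPos);
          throw e;
        }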



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
<https://reviews.apache.org/r/4212/#comment12338>

    no '-'s needed in javadoc



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
<https://reviews.apache.org/r/4212/#comment12336>

    Do you really need synchronization here and below? It seems like the 
    strategy is being called from a synchronized context, so this is 
    redundant. Removing it also allows the inner classes to be static, which I 
    think is preferable.
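    e.g. something like this, sketched against an assumed ReaderStrategy 
    interface (the actual names and signatures in the patch may differ) -- no 
    synchronized blocks, and the nested class can be static:

        private static class ByteBufferStrategy implements ReaderStrategy {
          private final ByteBuffer buf;

          ByteBufferStrategy(ByteBuffer buf) {
            this.buf = buf;
          }

          @Override
          public int doRead(BlockReader blockReader, int off, int len)
              throws IOException {
            // no synchronization needed; DFSInputStream callers already hold the lock
            return blockReader.read(buf);
          }
        }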



hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.c
<https://reviews.apache.org/r/4212/#comment12342>

    Worth printing the exception message.



hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.c
<https://reviews.apache.org/r/4212/#comment12340>

    Do you need to do something here to clear the JVM exception state? You'll 
    probably have an OOME pending at this point. Probably good to use 
    errnoFromException so that you get a reasonable error code (assuming it 
    does something sensible with an OOME).



hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.c
<https://reviews.apache.org/r/4212/#comment12343>

    Is there any logging setting for libhdfs? Seems like you'd want to print 
    the stack trace if some debug flag is set.



hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.c
<https://reviews.apache.org/r/4212/#comment12344>

    Style nit: } else {


- Todd


On 2012-03-06 23:14:07, Henry Robinson wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4212/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-06 23:14:07)
bq.  
bq.  
bq.  Review request for hadoop-hdfs and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  New patch for HDFS-2834 (I can't update the old review request).
bq.  
bq.  
bq.  This addresses bug HDFS-2834.
bq.      http://issues.apache.org/jira/browse/HDFS-2834
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReader.java dfab730 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java cc61697 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java 4187f1c 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java 2b817ff 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java b7da8d4 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java ea24777 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.h 0ee29d5 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.c 7371caf 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java 9d4f4a2 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelLocalRead.java PRE-CREATION 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelRead.java bbd0012 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelReadUtil.java PRE-CREATION 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java eb2a1d8 
bq.  
bq.  Diff: https://reviews.apache.org/r/4212/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Henry
bq.  
bq.


                
> ByteBuffer-based read API for DFSInputStream
> --------------------------------------------
>
>                 Key: HDFS-2834
>                 URL: https://issues.apache.org/jira/browse/HDFS-2834
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Henry Robinson
>            Assignee: Henry Robinson
>         Attachments: HDFS-2834-no-common.patch, HDFS-2834.3.patch, 
> HDFS-2834.4.patch, HDFS-2834.5.patch, HDFS-2834.6.patch, HDFS-2834.7.patch, 
> HDFS-2834.8.patch, HDFS-2834.patch, HDFS-2834.patch, 
> hdfs-2834-libhdfs-benchmark.png
>
>
> The {{DFSInputStream}} read-path always copies bytes into a JVM-allocated 
> {{byte[]}}. Although for many clients this is desired behaviour, in certain 
> situations, such as native-reads through libhdfs, this imposes an extra copy 
> penalty since the {{byte[]}} needs to be copied out again into a natively 
> readable memory area. 
> For these cases, it would be preferable to allow the client to supply its own 
> buffer, wrapped in a {{ByteBuffer}}, to avoid that final copy overhead. 
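
For context, the sort of client code this enables would look roughly like the 
following (a sketch; fs and path are assumed to be an existing FileSystem and 
Path, and the exact read signature the patch ends up with may differ):

    // The client supplies its own (possibly direct) buffer; bytes are read
    // straight into it with no intermediate byte[] copy.
    ByteBuffer buf = ByteBuffer.allocateDirect(128 * 1024);
    FSDataInputStream in = fs.open(path);
    int nRead = in.read(buf);  // ByteBuffer-based read proposed here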
