Github user keith-turner commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/134#discussion_r72681960
  
    --- Diff: 
core/src/main/java/org/apache/accumulo/core/file/rfile/bcfile/BoundedRangeFileInputStream.java
 ---
    @@ -62,12 +62,15 @@ public BoundedRangeFileInputStream(FSDataInputStream 
in, long offset, long lengt
     
       @Override
       public int available() throws IOException {
    -    int avail = in.available();
    -    if (pos + avail > end) {
    -      avail = (int) (end - pos);
    -    }
    +    final FSDataInputStream inLocal = in;
    +    synchronized (inLocal) {
    +      int avail = inLocal.available();
    --- End diff --
    
    > I am considering this, but others had commented to me that adhering to 
the API as close as possible was a good thing. 
    
    Thats a hard call to make, I don't know what the right answer is.  We can 
make BoundedRangeInputStream work with the code that currently calls it.  To 
make it work with code that might call it in the future, need to follow the 
interface javadoc.  For example the way hadoop uses it could change in the 
future.
    
    Calling `in.seek()` then `in.available()` seems like the correct thing to 
do.  However, based on what @phrocker  said, `in.seek()` call may have a 
performance penalty, with no real benefit.  I don't really like that.   One 
thing I am thinking is that always returning `end-pos` seems more correct than 
only calling `in.available()` w/o calling `in.seek()`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to