[GitHub] trafodion pull request #1626: [TRAFODION-3126] Refactored HDFS client implem...

selvaganesang Mon, 02 Jul 2018 18:37:10 -0700

Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1626#discussion_r199663083
  
    --- Diff: core/sql/src/main/java/org/trafodion/sql/HDFSClient.java ---
    @@ -142,21 +154,32 @@
           HDFSRead() 
           {
           }
    - 
    +    
           public Object call() throws IOException 
           {
              int bytesRead;
              int totalBytesRead = 0;
              if (compressed_) {
                 bufArray_ = new byte[ioByteArraySizeInKB_ * 1024];
    -         } else 
    -         if (! buf_.hasArray()) {
    -            try {
    -              fsdis_.seek(pos_);
    -            } catch (EOFException e) {
    -              isEOF_ = 1;
    -              return new Integer(totalBytesRead);
    -            } 
    +         } 
    +         else  {
    +            // alluxio doesn't support direct ByteBuffer reads
    +            // Hence, create a non-direct ByteBuffer, read into
    +            // byteArray backing up this ByteBuffer and 
    +            // then copy the data read to direct ByteBuffer for the 
    +            // native layer to process the data
    +            if ((! alluxioNotInstalled_) && fs_ instanceof alluxio.hadoop.FileSystem) {
    +               savedBuf_ = buf_;
    +               buf_ = ByteBuffer.allocate(savedBuf_.capacity());
    --- End diff --
    
    An HdfsClient object is constructed for every range, or for every 
HdfsScanBuf-sized chunk within a range, in HdfsScan. The call method is invoked 
only once per object, so it is OK to allocate the buffer here.
    
    However, if the allocation becomes a bottleneck, it is possible to optimize 
this by pushing the file system detection up into HdfsScan, before the 
HdfsClient object is constructed.
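
    For readers following the diff, the staging approach described in the code 
comment (read into the byte array backing a non-direct ByteBuffer, then copy 
into the direct ByteBuffer) might look roughly like the sketch below. It is an 
illustration only, not the code in HDFSClient.java: the class and method names 
are hypothetical, and a plain InputStream stands in for the HDFS/Alluxio input 
stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; names are not taken from the Trafodion sources.
class StagedRead {
    // Fills a heap (non-direct) buffer from the stream, then copies the bytes
    // into the caller's direct buffer. Returns the number of bytes copied.
    static int readViaHeapBuffer(InputStream in, ByteBuffer direct) throws IOException {
        // One allocation per call, sized to the space left in the direct buffer.
        ByteBuffer heap = ByteBuffer.allocate(direct.remaining());
        byte[] backing = heap.array(); // valid because heap is non-direct
        int total = 0;
        while (total < backing.length) {
            int n = in.read(backing, total, backing.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        heap.limit(total);   // position is still 0, so remaining() == total
        direct.put(heap);    // bulk copy into the direct buffer
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "sample bytes".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer direct = ByteBuffer.allocateDirect(64);
        int n = readViaHeapBuffer(new ByteArrayInputStream(data), direct);
        direct.flip(); // make the copied bytes readable by a consumer
        System.out.println(n + " bytes staged, direct=" + direct.isDirect());
    }
}
```

    In this sketch the heap buffer is allocated on every call, which mirrors 
the point above: that is acceptable when the method runs once per object, and 
the allocation could be hoisted to the caller if it ever shows up in profiles.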
