GitHub user selvaganesang commented on a diff in the pull request:
https://github.com/apache/trafodion/pull/1626#discussion_r199663083
--- Diff: core/sql/src/main/java/org/trafodion/sql/HDFSClient.java ---
@@ -142,21 +154,32 @@
HDFSRead()
{
}
-
+
public Object call() throws IOException
{
int bytesRead;
int totalBytesRead = 0;
if (compressed_) {
bufArray_ = new byte[ioByteArraySizeInKB_ * 1024];
- } else
- if (! buf_.hasArray()) {
- try {
- fsdis_.seek(pos_);
- } catch (EOFException e) {
- isEOF_ = 1;
- return new Integer(totalBytesRead);
- }
+ }
+ else {
+ // alluxio doesn't support direct ByteBuffer reads
+ // Hence, create a non-direct ByteBuffer, read into
+ // byteArray backing up this ByteBuffer and
+ // then copy the data read to direct ByteBuffer for the
+ // native layer to process the data
+ if ((! alluxioNotInstalled_) && fs_ instanceof
alluxio.hadoop.FileSystem) {
+ savedBuf_ = buf_;
+ buf_ = ByteBuffer.allocate(savedBuf_.capacity());
--- End diff --
An HdfsClient object is constructed for every range, or for every
HdfsScanBuf-sized chunk within a range, in HdfsScan. The call method is
invoked only once per object, so it is OK to allocate here.
However, if the allocation becomes a bottleneck, it is possible to optimize
this by moving the file system detection into HdfsScan, before the
HdfsClient object is constructed.
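
For reference, the staging pattern described in the diff comment (read into
the byte array backing a non-direct ByteBuffer, then copy into the direct
ByteBuffer) can be sketched roughly as below. This is only an illustration
under assumptions, not the code in this PR: the class name HeapToDirectCopy
and the method readViaHeapBuffer are hypothetical, and a plain
java.io.InputStream stands in for the file system input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; names are not taken from HDFSClient.java.
public class HeapToDirectCopy {

    // Fills a heap (non-direct) staging buffer from the stream, then copies
    // the bytes read into the caller's direct buffer. Returns the byte count.
    public static int readViaHeapBuffer(InputStream in, ByteBuffer direct)
            throws IOException {
        // One staging allocation per call, sized to the space left in 'direct'.
        ByteBuffer heap = ByteBuffer.allocate(direct.remaining());
        byte[] backing = heap.array(); // heap buffers expose a backing array
        int total = 0;
        while (total < backing.length) {
            int n = in.read(backing, total, backing.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        heap.limit(total);  // expose only the bytes actually read
        direct.put(heap);   // bulk copy from the heap buffer to the direct one
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "sample bytes".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer direct = ByteBuffer.allocateDirect(64);
        int n = readViaHeapBuffer(new ByteArrayInputStream(data), direct);
        direct.flip(); // switch the direct buffer to reading mode
        System.out.println(n + " bytes copied, remaining=" + direct.remaining());
    }
}
```

The staging buffer here is allocated on every call, which mirrors the
allocate-per-call cost discussed above; reusing it across calls would
require the caller to own it.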
---