[ 
https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050538#comment-14050538
 ] 

Vladimir Rodionov commented on HBASE-7336:
------------------------------------------

I am looking into this stuff now, trying to figure out how to make parallel 
scan on a region more efficient. The code in AbstractFSReader looks dangerous 
and does not provide any benefits in terms of MT performance.
{code}
    protected int readAtOffset(FSDataInputStream istream,
        byte[] dest, int destOffset, int size,
        boolean peekIntoNextBlock, long fileOffset, boolean pread)
        throws IOException {
      if (peekIntoNextBlock &&
          destOffset + size + hdrSize > dest.length) {
        // We are asked to read the next block's header as well, but there is
        // not enough room in the array.
        throw new IOException("Attempted to read " + size + " bytes and " +
            hdrSize + " bytes of next header into a " + dest.length +
            "-byte array at offset " + destOffset);
      }

      if (!pread && streamLock.tryLock()) {
        // Seek + read. Better for scanning.
        try {
          istream.seek(fileOffset);

          long realOffset = istream.getPos();
          if (realOffset != fileOffset) {
            throw new IOException("Tried to seek to " + fileOffset + " to "
                + "read " + size + " bytes, but pos=" + realOffset
                + " after seek");
          }

          if (!peekIntoNextBlock) {
            IOUtils.readFully(istream, dest, destOffset, size);
            return -1;
          }

          // Try to read the next block header.
          if (!readWithExtra(istream, dest, destOffset, size, hdrSize))
            return -1;
        } finally {
          streamLock.unlock();
        }
      } else {
        // Positional read. Better for random reads; or when the streamLock is 
already locked.
        int extraSize = peekIntoNextBlock ? hdrSize : 0;

        int ret = istream.read(fileOffset, dest, destOffset, size + extraSize);
        if (ret < size) {
          throw new IOException("Positional read of " + size + " bytes " +
              "failed at offset " + fileOffset + " (returned " + ret + ")");
        }

        if (ret == size || ret < size + extraSize) {
          // Could not read the next block's header, or did not try.
          return -1;
        }
      }

      assert peekIntoNextBlock;
      return Bytes.toInt(dest, destOffset + size + BlockType.MAGIC_LENGTH) +
          hdrSize;
    }
{code}

Positional reads in FSInputStream (DFSInputStream) are heavily synchronized. It 
is lock on stream than seek and read, unlock. Here is the code for 
FSInputStream:
{code}
  @Override
  public int read(long position, byte[] buffer, int offset, int length)
    throws IOException {
    synchronized (this) {
      long oldPos = getPos();
      int nread = -1;
      try {
        seek(position);
        nread = read(buffer, offset, length);
      } finally {
        seek(oldPos);
      }
      return nread;
    }
  }
{code}

DFSInputStream extends FSInputStream but does not override the above method. 
Taking into account that code is synchronized, it is hard to explain observed 
performance improvement published in this JIRA.


  


> HFileBlock.readAtOffset does not work well with multiple threads
> ----------------------------------------------------------------
>
>                 Key: HBASE-7336
>                 URL: https://issues.apache.org/jira/browse/HBASE-7336
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Critical
>             Fix For: 0.94.4, 0.95.0
>
>         Attachments: 7336-0.94.txt, 7336-0.96.txt
>
>
> HBase grinds to a halt when many threads scan along the same set of blocks 
> and neither read short circuit is nor block caching is enabled for the dfs 
> client ... disabling the block cache makes sense on very large scans.
> It turns out that synchronizing in istream in HFileBlock.readAtOffset is the 
> culprit.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to