[
https://issues.apache.org/jira/browse/HBASE-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756778#action_12756778
]
stack commented on HBASE-1849:
------------------------------
See the thread dump in https://issues.apache.org/jira/browse/HBASE-1753 for an
example of how a client can get hung up on the synchronized batch put in particular.
> HTable doesn't work well at the core of a multi-threaded server; e.g.
> webserver
> -------------------------------------------------------------------------------
>
> Key: HBASE-1849
> URL: https://issues.apache.org/jira/browse/HBASE-1849
> Project: Hadoop HBase
> Issue Type: Improvement
> Reporter: stack
>
> HTable must do the following:
> + Sit in a shell or simple client -- e.g. a Map or Reduce task -- and feed and
> read from HBase single-threadedly. It does this job OK.
> + Sit at the core of a multithreaded server (100s of threads) -- a webserver or
> thrift gateway -- and keep the throughput high. It's currently not good at
> this job.
> In the way of our achieving the second job above are the following:
> + HTable must seek out and cache region locations. It keeps the cache down in
> HConnectionManager. One cache is shared by all HTable instances that were made
> with the same HBaseConfiguration instance. Region lookups happen inside a
> synchronized block; if the wanted region is in the cache, the lock is held
> only briefly. Otherwise, the caller must wait until a trip to the server
> completes (which may require retries). Meantime all other work is blocked,
> even if we're using HTablePool (see the sketch after this list).
> + Regardless of the identity of the HBaseConfiguration, Hadoop RPC keeps ONE
> Connection open to a server at a time; requests and responses are multiplexed
> over this single connection.
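> The following is a rough, hypothetical sketch of the lookup bottleneck in the
> first item -- not the actual HConnectionManager code; the class and helper
> names are made up. One lock guards the whole region-location cache, so a
> single cache miss holds the lock across a server round trip and stalls every
> other thread, even ones whose regions are already cached:
> {code:java}
> import java.util.HashMap;
> import java.util.Map;
>
> class RegionLocationCache {
>   private final Map<String, String> locations = new HashMap<String, String>();
>
>   // Every lookup funnels through this single monitor.
>   public synchronized String locateRegion(String row) {
>     String location = locations.get(row);
>     if (location == null) {
>       // Cache miss: the trip to .META. (possibly with retries) runs while
>       // the lock is held, blocking every other HTable that shares this
>       // connection -- including threads whose regions are already cached.
>       location = lookupFromMetaTable(row);  // slow RPC; hypothetical helper
>       locations.put(row, location);
>     }
>     return location;
>   }
>
>   private String lookupFromMetaTable(String row) {
>     // stand-in for the real meta lookup; may take seconds under retries
>     return "regionserver-for-" + row;
>   }
> }
> {code}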
> Broken stuff:
> + Puts are synchronized to protect the write buffer, so only one thread at a
> time appends, but flushCommits is open for any thread to call. Once the write
> buffer is full, all Puts block until it is freed again. This looks like a hang
> when there are hundreds of threads, each write goes to a random region in a
> big table, and each write has to have its region looked up. (There may be some
> other brokenness in here, because this bottleneck seems to last longer than it
> should even with hundreds of threads.) A rough sketch of this put/flush path
> follows below.
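> Here is a rough, hypothetical sketch of that put/flush path -- not the real
> HTable source; names are illustrative. Every put synchronizes on the same
> lock that guards the shared write buffer, so once the buffer fills, hundreds
> of writer threads queue up behind the single flush:
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
>
> class BufferedTable {
>   private final List<String> writeBuffer = new ArrayList<String>();
>   private final int writeBufferLimit = 1000;
>
>   public synchronized void put(String row, String value) {
>     writeBuffer.add(row + "=" + value);
>     if (writeBuffer.size() >= writeBufferLimit) {
>       // With hundreds of writers, they all park here while one thread
>       // pushes the whole buffer out (region lookups included).
>       flushCommits();
>     }
>   }
>
>   // Open for any thread to call, but it takes the same lock as put().
>   public synchronized void flushCommits() {
>     // stand-in for the batched RPC to the region servers
>     writeBuffer.clear();
>   }
> }
> {code}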
> Ideas:
> + Querying the cache should not block all access to the cache. We should only
> block access when the wanted region is being looked up, so reads and writes to
> regions whose locations we already know can go ahead (see the sketch below).
> + nio'd client and server
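> A rough sketch of the first idea (assumed names, not a proposed patch): cache
> reads take no global lock, and a miss only blocks the callers that want the
> same missing region, by parking them on a per-region FutureTask:
> {code:java}
> import java.util.concurrent.Callable;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ConcurrentMap;
> import java.util.concurrent.FutureTask;
>
> class NonBlockingRegionCache {
>   // keyed by region start key (row used here for brevity)
>   private final ConcurrentMap<String, FutureTask<String>> locations =
>       new ConcurrentHashMap<String, FutureTask<String>>();
>
>   public String locateRegion(final String row) throws Exception {
>     FutureTask<String> task = locations.get(row);
>     if (task == null) {
>       FutureTask<String> newTask = new FutureTask<String>(new Callable<String>() {
>         public String call() {
>           // the slow trip to .META. runs with no global lock held
>           return lookupFromMetaTable(row);
>         }
>       });
>       task = locations.putIfAbsent(row, newTask);
>       if (task == null) {
>         task = newTask;
>         task.run();  // only the installing thread performs the lookup
>       }
>     }
>     // Callers asking for an in-flight region wait here; everyone else
>     // returns immediately with their cached location.
>     return task.get();
>   }
>
>   private String lookupFromMetaTable(String row) {
>     return "regionserver-for-" + row;  // hypothetical stand-in for the meta RPC
>   }
> }
> {code}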