[
https://issues.apache.org/jira/browse/HBASE-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Purtell resolved HBASE-1849.
-----------------------------------
Resolution: Incomplete
Assignee: (was: Benoit Sigoure)
It's not perfect, but the client has come a long way. There has been no action
on this issue for a long time, so resolving as Incomplete.
> HTable doesn't work well at the core of a multi-threaded server; e.g.
> webserver
> -------------------------------------------------------------------------------
>
> Key: HBASE-1849
> URL: https://issues.apache.org/jira/browse/HBASE-1849
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Reporter: stack
>
> HTable must do the following:
> + Sit in a shell or simple client -- e.g. a Map or Reduce task -- and feed and
> read from HBase from a single thread. It does this job OK.
> + Sit at the core of a multithreaded server (100s of threads) -- a webserver
> or thrift gateway -- and keep the throughput high. It's currently not good at
> this job.
> In the way of our achieving the second in the list above are the following:
> + HTable must seek out and cache region locations. It keeps the cache down in
> HConnectionManager. One cache is shared by all HTable instances that were
> made with the same HBaseConfiguration instance. Region lookup happens inside
> a synchronized block; if the wanted region is in the cache, the lock is held
> only a short time. Otherwise, the caller must wait until the trip to the
> server completes (which may require retries). Meanwhile all other work is
> blocked, even if we're using HTablePool.
> + Regardless of the identity of the HBaseConfiguration, Hadoop RPC has ONE
> Connection open to a server at a time; requests and responses are multiplexed
> over this single connection.
> Broken stuff:
> + Puts are synchronized to protect the write buffer so only one thread at a
> time appends, but flushCommits is open for any thread to call. Once the
> write buffer is full, all Puts block until it's freed again. This looks like
> a hang if there are hundreds of threads, each write is to a random region in
> a big table, and each write has to have its region looked up. (There may be
> some other brokenness in here because this bottleneck seems to last longer
> than it should, even with hundreds of threads.)
> Ideas:
> + A query of the cache should not block all access to the cache. We should
> block only callers whose wanted region is currently being looked up, so other
> reads and writes to regions whose location we already know can go ahead.
> + NIO-based client and server
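
The first idea above (block only callers waiting on the region that is being
looked up) could be sketched roughly as follows. This is a minimal
illustration, not the HConnectionManager code: the class name
RegionLocationCache, the String region key and location, and the lookup
function standing in for the trip to the server are all assumptions made for
the example. Real region lookup maps a table and row key to a region range,
which is simplified away here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-region location cache; names and types are illustrative only. */
class RegionLocationCache {
    // One future per region key: either already completed (cached) or in flight.
    private final ConcurrentHashMap<String, CompletableFuture<String>> cache =
            new ConcurrentHashMap<>();
    // Stand-in for the (possibly slow, retrying) trip to the server.
    private final Function<String, String> lookup;

    RegionLocationCache(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    /** Returns the location for regionKey, blocking only on that key's lookup. */
    String locate(String regionKey) {
        CompletableFuture<String> mine = new CompletableFuture<>();
        CompletableFuture<String> existing = cache.putIfAbsent(regionKey, mine);
        if (existing != null) {
            // Cached, or another thread is already looking this region up:
            // wait for that result only. join() throws CompletionException
            // if the other thread's lookup failed.
            return existing.join();
        }
        try {
            // The slow lookup runs outside any map-wide lock, so callers
            // asking for other regions are not held up by it.
            String location = lookup.apply(regionKey);
            mine.complete(location);
            return location;
        } catch (RuntimeException e) {
            // Drop the failed entry so a later call can retry the lookup.
            cache.remove(regionKey, mine);
            mine.completeExceptionally(e);
            throw e;
        }
    }

    /** Forgets a cached location, e.g. after the region has moved. */
    void invalidate(String regionKey) {
        cache.remove(regionKey);
    }
}
```

The sketch stores a future with putIfAbsent rather than computing the value
inside the map, so the slow call never runs while a map-internal lock is held;
only threads that want the same region wait on it.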