[
https://issues.apache.org/jira/browse/HBASE-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799503#comment-13799503
]
Nicolas Liochon commented on HBASE-9775:
----------------------------------------
I've done various tests, on a much smaller cluster than Elliott's.
I observed better write performance on 0.96 than on 0.94, by about 20%, when
inserting 100 million rows into an empty cluster. There are around 18 regions
at this stage IIRC, so the cluster size should not matter that much when we
start from an empty table. I've inserted around 1 billion rows without issue
on 0.96.
I haven't compared the number of threads. What I see in 0.96 is that the actual
limit is the limit per region: there is one thread per client and per server.
Once one multi operation on this server is done, another starts. For this
reason, there are few operations on multiple servers: they are not
synchronized. In theory this gives better performance; Elliott's tests say the
opposite, at least at large scale. At least, I've seen that adding a YCSB
client increases the throughput. That would not be the case if the client were
maxing out the cluster's or the client's physical capacity. Increasing the max
per region helped as well (by about the same ratio: 50%). So there is for sure
room for improvement here.
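To illustrate the per-server behaviour described above, here is a minimal toy model in Python. It is only a sketch of the idea (group pending rows by server, then let each server start its next multi operation as soon as its previous one finishes, independently of the other servers). The names `group_by_server`, `PerServerLimiter` and `locate` are hypothetical and are not part of the HBase client.

```python
from collections import defaultdict


def group_by_server(rows, locate):
    """Group rows into one batch per server.

    `locate` is a hypothetical function mapping a row key to a server name.
    """
    batches = defaultdict(list)
    for row in rows:
        batches[locate(row)].append(row)
    return dict(batches)


class PerServerLimiter:
    """Toy model: at most `max_per_server` multi operations in flight per server."""

    def __init__(self, max_per_server=1):
        self.max_per_server = max_per_server
        self.in_flight = defaultdict(int)

    def try_start(self, server):
        """Return True and count the operation if the server is under its limit."""
        if self.in_flight[server] >= self.max_per_server:
            return False
        self.in_flight[server] += 1
        return True

    def finish(self, server):
        """Mark one operation on `server` as done, freeing a slot for the next one."""
        if self.in_flight[server] <= 0:
            raise ValueError("no operation in flight for " + server)
        self.in_flight[server] -= 1
```

In this model a slow server only delays its own next batch. The other servers keep going, which is why operations on different servers end up unsynchronized.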
I will do the comparison with 0.94 at the beginning of next week for these
points (number of threads, impact of more clients). I will also look at the
pure CPU performance of the client. From the tests so far it seems that we can
play with the limit parameters to increase / limit the throughput. This does
not explain the ITBLL failure at all.
BTW, I observed better performance when running 2 YCSB instances vs. a single
YCSB with 2 threads. I've seen this as well with 0.96 before the AsyncProcess
implementation. On a 10-node cluster the difference was 30%. I've never done
this test with 0.94.
For the ITBLL I would be interested to see the server and client logs. The
SocketTimeoutException was strange.
[~jmspaggi] It would be great if you could redo the same tests as the ones
you did a while ago on HBASE-6295: it could help to see if we have a
regression or if it's only a matter of medium / large clusters...
> Client write path perf issues
> -----------------------------
>
> Key: HBASE-9775
> URL: https://issues.apache.org/jira/browse/HBASE-9775
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 0.96.0
> Reporter: Elliott Clark
> Priority: Critical
> Attachments: Charts Search Cloudera Manager - ITBLL.png, Charts
> Search Cloudera Manager.png, hbase-9775.patch, job_run.log, short_ycsb.png,
> ycsb_insert_94_vs_96.png
>
>
> Testing on larger clusters has not had the desired throughput increases.