[ 
https://issues.apache.org/jira/browse/HBASE-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799503#comment-13799503
 ] 

Nicolas Liochon commented on HBASE-9775:
----------------------------------------

I've done various tests, on a much smaller cluster than Elliott's.
I observed better write performance on 0.96 than on 0.94, by about 20%, when 
inserting 100m rows into an empty cluster.  There are around 18 regions at 
this stage IIRC, so the cluster size should not matter that much when we start 
from an empty table. I've inserted around 1b rows w/o issue on 0.96.

I haven't compared the number of threads. What I see in .96 is that the actual 
limit is the limit per region: there is one thread per client and per server. 
Once one multi operation on this server is done, another starts. For this 
reason, there are few operations on multiple servers: they are not 
synchronized. In theory this gives better performance; Elliott's tests say the 
opposite, at least at large scale. At least, I've seen that adding a YCSB 
client increases the throughput. It would not be the case if the client was 
maxing out the cluster's or the client's physical capacity. As well, increasing 
the max per region helped (by about the same ratio: 50%). So there is for sure 
room for improvement here.
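
To make the "limit per region" point concrete, here is a minimal sketch of a 
per-region cap on in-flight tasks. It is only an illustration, not the actual 
AsyncProcess code: the class name PerRegionTaskLimiter, its methods and the 
region names are hypothetical. A submit is refused while the region already 
has the maximum number of tasks in flight, and raising the maximum lets more 
tasks run concurrently against the same region.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a per-region task cap; not HBase code. */
class PerRegionTaskLimiter {
    private final int maxPerRegion;
    // Number of tasks currently in flight, keyed by region name.
    private final ConcurrentHashMap<String, AtomicInteger> inFlight =
            new ConcurrentHashMap<>();

    PerRegionTaskLimiter(int maxPerRegion) {
        if (maxPerRegion < 1) {
            throw new IllegalArgumentException("maxPerRegion must be >= 1");
        }
        this.maxPerRegion = maxPerRegion;
    }

    /** Reserves a slot for the region; false means the caller has to wait. */
    boolean tryAcquire(String region) {
        AtomicInteger counter =
                inFlight.computeIfAbsent(region, r -> new AtomicInteger());
        while (true) {
            int current = counter.get();
            if (current >= maxPerRegion) {
                return false; // region is already at its limit
            }
            if (counter.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Frees a slot once a task on the region has completed. */
    void release(String region) {
        AtomicInteger counter = inFlight.get(region);
        while (true) {
            int current = (counter == null) ? 0 : counter.get();
            if (current == 0) {
                throw new IllegalStateException("no task in flight on " + region);
            }
            if (counter.compareAndSet(current, current - 1)) {
                return;
            }
        }
    }

    /** Returns how many tasks are currently in flight on the region. */
    int inFlight(String region) {
        AtomicInteger counter = inFlight.get(region);
        return counter == null ? 0 : counter.get();
    }
}
```

With a maximum of 1, a second task on the same region is refused until the 
first one is released, while tasks on other regions are unaffected. This is 
the kind of knob I mean when I say that increasing the max per region helped.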

I will do the comparison with 0.94 at the beginning of next week for these 
points (#threads, impact of more clients).  I will also look at the pure CPU 
performance of the client. From the tests so far it seems that we can play 
with the limit parameters to increase / limit the throughput. This does not 
explain the ITBLL failure at all.

BTW, I observed better performance when having 2 YCSB instances vs. a single 
YCSB with 2 threads. I've seen this as well with .96 before the 
AsyncProcess implementation. On a 10-node cluster the difference was 30%. I've 
never done this test w/ .94.

For the ITBLL I would be interested to see the server & client logs. The 
SocketTimeoutException was strange.

[~jmspaggi] It would be great if you could redo the same tests as the ones 
you did a while ago on HBASE-6295: it could help to see if we have a 
regression or if it's only a matter of medium / large clusters... 


> Client write path perf issues
> -----------------------------
>
>                 Key: HBASE-9775
>                 URL: https://issues.apache.org/jira/browse/HBASE-9775
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 0.96.0
>            Reporter: Elliott Clark
>            Priority: Critical
>         Attachments: Charts Search   Cloudera Manager - ITBLL.png, Charts 
> Search   Cloudera Manager.png, hbase-9775.patch, job_run.log, short_ycsb.png, 
> ycsb_insert_94_vs_96.png
>
>
> Testing on larger clusters has not had the desired throughput increases.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
