On Tue, Apr 26, 2011 at 3:34 AM, Eran Kutner <[email protected]> wrote:
> Hi J-D,
> I don't think it's a Thrift issue. First, I use the TBufferedTransport
> transport, second, I implemented my own connection pool so the same
> connections are reused over and over again,

Hey!  I'm using C#->HBase, and high on my list of things to do is
"Implement Thrift connection pooling in C#". Do you have any desire to
release that code?
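For reference, the pooling pattern I have in mind is roughly this (a minimal sketch, in Python for brevity rather than C#; the factory passed in is a hypothetical stand-in for whatever opens a Thrift transport and client, not a real Thrift API call):

```python
import queue
from contextlib import contextmanager

class ConnectionPool:
    """Minimal bounded connection pool: max_size connections are
    created up front and handed out for reuse, so no connection is
    opened or closed per request."""

    def __init__(self, make_connection, max_size=10):
        self._pool = queue.Queue(max_size)
        for _ in range(max_size):
            # make_connection is a placeholder factory, e.g. one that
            # opens a TBufferedTransport and returns a Thrift client
            self._pool.put(make_connection())

    @contextmanager
    def connection(self):
        conn = self._pool.get()   # blocks if every connection is in use
        try:
            yield conn
        finally:
            self._pool.put(conn)  # return to the pool for reuse

# usage with a trivial factory; with max_size=1 the same
# connection object is handed out on every acquisition
pool = ConnectionPool(lambda: object(), max_size=1)
with pool.connection() as first:
    pass
with pool.connection() as second:
    pass
```

The same shape carries over to C# with a `BlockingCollection<T>` holding the open Thrift clients.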


> so there is no overhead
> for opening and closing connections (I've verified that using
> Wireshark). Third, if it were a client capacity issue I would expect to
> see an increase in throughput as I add more threads or run the test on
> two servers in parallel, but that doesn't happen: the total
> capacity remains unchanged.
>
> As for metrics, I already have it configured and monitored using
> Zabbix, but it only monitors specific counters, so let me know what
> information you would like to see. The numbers I quoted before are
> based on client counters and correlated with server counters ("multi"
> for writes and "get" for reads).
>
> -eran
>
>
>
> On Thu, Apr 21, 2011 at 20:43, Jean-Daniel Cryans <[email protected]> wrote:
>>
>> Hey Eran,
>>
>> Glad you could go back to debugging performance :)
>>
>> The scalability issues you are seeing are unknown to me; it sounds
>> like the client isn't pushing hard enough. It reminded me of when we
>> switched to using the native Thrift PHP extension instead of the
>> "normal" one and we saw huge speedups. My limited knowledge of Thrift
>> may be blinding me, but I looked around for C# Thrift performance
>> issues and found threads like this one
>> http://www.mail-archive.com/[email protected]/msg00320.html
>>
>> As you didn't really debug the speed of Thrift itself in your setup,
>> this is one more variable in the problem.
>>
>> Also, you don't really provide metrics about your system apart from
>> requests/second. Would it be possible for you to set them up using this
>> guide? http://hbase.apache.org/metrics.html
>>
>> J-D
>>
>> On Thu, Apr 21, 2011 at 5:13 AM, Eran Kutner <eran@> wrote:
>> > Hi J-D,
>> > After stabilizing the configuration, with your great help, I was able
>> > to go back to the load tests. I tried using IRC, as you suggested,
>> > to continue this discussion but because of the time difference (I'm
>> > GMT+3) it is quite difficult to find a time when people are present
>> > and I am available to run long tests, so I'll give the mailing list
>> > one more try.
>> >
>> > I tested again on a clean table using 100 insert threads, each using
>> > a separate keyspace within the test table. Every row had just one
>> > column with 128 bytes of data.
>> > With one server and one region I got about 2,300 inserts per second.
>> > After manually splitting the region I got about 3,600 inserts per
>> > second (still on one machine). After a while the regions were
>> > balanced and one was moved to another server, which brought writes up
>> > to around 4,500 per second. Additional splits and moves to more
>> > servers didn't improve this number, and write performance stabilized
>> > at ~4,000 writes/sec per server. This seems pretty low, especially
>> > considering other numbers I've seen around here.
>> >
>> > Read performance is around 1,500 rows per second per server, which
>> > seems extremely low to me, especially considering that the entire
>> > working set I was querying could fit in the servers' memory. To make
>> > the test interesting I limited my client to fetch only one row
>> > (always the same one) from each keyspace; that yielded 10K reads per
>> > second per server. I then increased the range and read the same 10
>> > rows; performance dropped to 8,500 reads/sec per server. Increasing
>> > the range to 100 rows dropped it further, to around 3,500 reads per
>> > second per server.
>> > Do you have any idea what could explain this behavior and how I can
>> > get a decent number of reads from those servers?
>> >
>> > -eran
>
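(Aside, for anyone trying to reproduce numbers like Eran's: the multithreaded throughput test he describes can be sketched roughly as below. Python sketch only; `do_insert` is a hypothetical stand-in for one client write, e.g. a single Thrift call, and the thread count and duration are illustrative, not his actual harness.)

```python
import threading
import time

def run_load_test(do_insert, n_threads=100, duration_s=1.0):
    """Spawn n_threads workers that call do_insert in a loop for
    duration_s seconds and return aggregate inserts per second."""
    counts = [0] * n_threads
    stop = threading.Event()

    def worker(i):
        while not stop.is_set():
            do_insert(i)       # placeholder for one write to the server
            counts[i] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    start = time.time()
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    elapsed = time.time() - start
    return sum(counts) / elapsed   # aggregate inserts/sec across threads

# usage with a no-op insert, small scale
rate = run_load_test(lambda i: None, n_threads=4, duration_s=0.2)
print(f"{rate:.0f} inserts/sec")
```

If aggregate throughput stays flat as `n_threads` grows (as Eran reports), the bottleneck is past the client.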



-- 
josh
@schulz
http://schulzone.org
