Since the attachment didn't make it, here it is again: http://shortText.com/jp73moaesx

-eran
On Wed, Apr 27, 2011 at 16:51, Eran Kutner <[email protected]> wrote:
> Hi Josh,
>
> The connection pooling code is attached AS IS (with all the usual legal
> disclaimers). Note that you will have to modify it a bit to get it to
> compile, because it depends on some internal libraries we use. In
> particular, DynamicAppSettings and Log are two internal classes that do
> what their names imply :)
> Make sure you initialize "servers" in the NewConnection() method to an
> array with your Thrift servers and you should be good to go. You use
> GetConnection() to get a connection and ReturnConnection() to return it
> to the pool after you finish using it - make sure you don't close it in
> the application code.
>
> -eran
>
> On Wed, Apr 27, 2011 at 00:30, Josh <[email protected]> wrote:
>>
>> On Tue, Apr 26, 2011 at 3:34 AM, Eran Kutner <[email protected]> wrote:
>> > Hi J-D,
>> > I don't think it's a Thrift issue. First, I use the TBufferedTransport
>> > transport; second, I implemented my own connection pool, so the same
>> > connections are reused over and over again,
>>
>> Hey! I'm using C#->HBase, and high on my list of things to do is
>> 'Implement Thrift Connection Pooling in C#'. Do you have any desire to
>> release that code?
>>
>> > so there is no overhead
>> > for opening and closing connections (I've verified that using
>> > Wireshark); third, if it were a client capacity issue I would expect
>> > to see an increase in throughput as I added more threads or ran the
>> > test on two servers in parallel, but this doesn't seem to happen - the
>> > total capacity remains unchanged.
>> >
>> > As for metrics, I already have them configured and monitored using
>> > Zabbix, but it only monitors specific counters, so let me know what
>> > information you would like to see. The numbers I quoted before are
>> > based on client counters and correlated with server counters ("multi"
>> > for writes and "get" for reads).
>> >
>> > -eran
>> >
>> > On Thu, Apr 21, 2011 at 20:43, Jean-Daniel Cryans <[email protected]>
>> > wrote:
>> >>
>> >> Hey Eran,
>> >>
>> >> Glad you could go back to debugging performance :)
>> >>
>> >> The scalability issues you are seeing are unknown to me; it sounds
>> >> like the client isn't pushing hard enough. It reminded me of when we
>> >> switched to the native Thrift PHP extension instead of the "normal"
>> >> one and saw huge speedups. My limited knowledge of Thrift may be
>> >> blinding me, but I looked around for C# Thrift performance issues and
>> >> found threads like this one:
>> >> http://www.mail-archive.com/[email protected]/msg00320.html
>> >>
>> >> Since you haven't really measured the speed of Thrift itself in your
>> >> setup, that is one more variable in the problem.
>> >>
>> >> Also, you don't really provide metrics about your system apart from
>> >> requests/second. Would it be possible for you to set them up using
>> >> this guide? http://hbase.apache.org/metrics.html
>> >>
>> >> J-D
>> >>
>> >> On Thu, Apr 21, 2011 at 5:13 AM, Eran Kutner <eran@> wrote:
>> >> > Hi J-D,
>> >> > After stabilizing the configuration, with your great help, I was
>> >> > able to go back to the load tests. I tried using IRC, as you
>> >> > suggested, to continue this discussion, but because of the time
>> >> > difference (I'm GMT+3) it is quite difficult to find a time when
>> >> > people are present and I am available to run long tests, so I'll
>> >> > give the mailing list one more try.
>> >> >
>> >> > I tested again on a clean table using 100 insert threads, each
>> >> > using a separate keyspace within the test table. Every row had just
>> >> > one column with 128 bytes of data.
>> >> > With one server and one region I got about 2300 inserts per second.
>> >> > After manually splitting the region I got about 3600 inserts per
>> >> > second (still on one machine).
>> >> > After a while the regions were balanced and one was moved to
>> >> > another server, which got writes up to around 4500 per second.
>> >> > Additional splits and moves to more servers didn't improve this
>> >> > number, and the write performance stabilized at ~4000 writes/sec
>> >> > per server. This seems pretty low, especially considering other
>> >> > numbers I've seen around here.
>> >> >
>> >> > Read performance is at around 1500 rows per second per server,
>> >> > which seems extremely low to me, especially considering that the
>> >> > entire working set I was querying could fit in the servers' memory.
>> >> > To make the test interesting I limited my client to fetch only 1
>> >> > row (always the same one) from each keyspace; that yielded 10K
>> >> > reads per sec per server. So I tried increasing the range and read
>> >> > the same 10 rows; now the performance dropped to 8500 reads/sec per
>> >> > server. Increasing the range to 100 rows drops the performance to
>> >> > around 3500 reads per second per server.
>> >> > Do you have any idea what could explain this behavior, and how can
>> >> > I get a decent number of reads from those servers?
>> >> >
>> >> > -eran
>>
>> --
>> josh
>> @schulz
>> http://schulzone.org
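For anyone reading the archive without the attachment: the pool API Eran describes above (a fixed set of pre-opened connections, GetConnection() to borrow one, ReturnConnection() to give it back, and never closing the connection in application code) can be sketched roughly as follows. This is a minimal illustration in Java rather than the original C#; the class name, the generic connection type, and the factory parameter are all assumptions, not the actual attached code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Sketch of a fixed-size connection pool in the style described above.
// In the real code the factory would correspond to NewConnection(),
// opening a TSocket wrapped in a TBufferedTransport to one of the
// configured Thrift servers.
public class ConnectionPool<C> {
    private final BlockingQueue<C> idle;

    public ConnectionPool(int size, Supplier<C> factory) {
        this.idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            // Eagerly open all connections once, so they are reused
            // instead of being opened and closed per request.
            idle.add(factory.get());
        }
    }

    // Borrow a connection; blocks until one is free.
    public C getConnection() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(
                "interrupted while waiting for a pooled connection", e);
        }
    }

    // Hand the connection back to the pool instead of closing it;
    // application code must never close a pooled connection.
    public void returnConnection(C conn) {
        idle.offer(conn);
    }
}
```

Because borrowed connections are returned rather than closed, the same open transports cycle through all worker threads, which matches the no-reconnect behavior Eran verified with Wireshark.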
