Inline.

J-D
> Hi J-D,
>
> I can't paste the entire file because it's 126K. Trying to attach it
> now as zip, let's see if that has more luck.

In the jstack you posted, all the Gets were hitting HDFS, which is probably why it's slow. Until you can get something like HDFS-347 into your Hadoop, you'll have to make sure you can block-cache most of what you're going to read. You can tune the size of the block cache, since by default it's only 20% of the whole heap.

> I didn't pre-split and I guess that explains the behavior I saw in
> which the write performance started at 300 inserts/sec and then
> increased up to 3000 per server when the region was split and spread
> to two servers. It doesn't explain why the rate actually dropped after
> more splits and more servers were added to the table, until eventually
> it stabilized on around 2000 inserts/sec per server.

Yeah, that doesn't explain it, but for that part of the loading we basically have zero information about the regions' layout on the cluster and how the regions were used. 3k might just have been a spike that didn't last very long, and for all I know it shouldn't be cared about. Was the 2k/sec done by just one machine, or were they all participating equally? How many regions did you end up with at the end?

> I have 1 thrift server per slave. I'm using C# to access the thrift
> servers. My C# library manages its own connection pool; it does
> round-robin between the servers and re-uses open connections, so not
> every call will open a new connection. After a few seconds of running
> the test, all the connections are re-used and no new connections are
> being opened.

Sounds good.

> I'm inserting the rows one by one because that represents the kind of
> OLTP load that I have in mind for this system. Batching multiple rows,
> I believe, is more suitable for analytical processing.

Makes sense.
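For reference, the block-cache fraction mentioned above is controlled by `hfile.block.cache.size` in hbase-site.xml on the region servers; its default of 0.2 is the "20% of the whole heap" figure. A minimal sketch, where the 0.4 value is purely illustrative and should be sized against your own memstore and heap budget:

```xml
<!-- hbase-site.xml: fraction of the region server heap given to the
     block cache. Default is 0.2 (the 20% mentioned above);
     0.4 here is an illustrative value, not a recommendation. -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.4</value>
</property>
```

The region servers need a restart to pick up the new value, and the cache plus the memstore must still leave headroom in the heap.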
> The second client was using the same key space, but I tried the single
> client with a few thread configurations, from 1 to 100, where each
> thread was using a different key space. I didn't really see any
> difference between 50 threads and 100 threads, so I don't think it's a
> key space distribution issue.

That part doesn't make sense at all; there must be something you're not seeing that would explain it, like the number of regions and their layout. Also, maybe your assumptions about the key spaces are wrong (from experience I always assume the user is wrong, sorry).

> I agree that network latency can be causing the problem, but then I
> would expect to see more overall reads/writes as the client thread
> count increases. As I said, above 40-50 threads there was no
> improvement.

Indeed, something is off and we're not seeing it.
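On the pre-splitting point from earlier: one way to take region layout out of the equation is to create the table already split evenly over the key space, so writes fan out across servers from the first insert instead of waiting for splits. A hedged sketch, assuming row keys are fixed-width hex hashes over the unsigned 32-bit space (that key design is my assumption, not something from your mail); only the split-point computation is shown, and the resulting keys would be handed to the table-creation call of whatever client you use:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitKeys {
    // Compute numRegions - 1 evenly spaced split points over a
    // 2^32 key space, rendered as fixed-width hex so they sort
    // the same way lexicographically as numerically.
    public static List<String> evenSplits(int numRegions) {
        List<String> splits = new ArrayList<>();
        long range = 0x100000000L; // 2^32 possible hashed keys
        for (int i = 1; i < numRegions; i++) {
            long boundary = range * i / numRegions;
            splits.add(String.format("%08x", boundary));
        }
        return splits;
    }

    public static void main(String[] args) {
        // 8 regions need 7 boundaries:
        // 20000000, 40000000, ..., e0000000
        System.out.println(evenSplits(8));
    }
}
```

With the split points in hand, the table can be created pre-split (e.g. via the HBase admin API's createTable variant that accepts split keys), and you would then expect all participating servers to take writes immediately rather than ramping up as regions split.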
