Eric, would it be possible for you to publish your test setup so others
can try it on different hardware and/or tweak the config a bit?

Thanks.

-Moazam


On Fri, Mar 6, 2009 at 8:22 AM, Eric Lambert <[email protected]> wrote:
>
> Hey Adam:
>
> I've done some simple profiling of the spy client and noticed a drop off in
> performance when the number of threads exceeds about 40 (this was on a Sun
> Fire x2200 m2, dual core) which sounds similar to the issue you are seeing.
>  You can see the details of the benchmark on my blog
> http://blogs.sun.com/elambert. I've had it as a background task to root
> cause the issue and now that I see that it is happening in the real world
> and not just my benchmark I'll give this some attention in the very near
> future.
>
> I do have a couple of comments, although I don't know how useful they'll be:
>
> 1) In my benchmark I noticed that throughput plateaus at about 20 threads
> and that adding more threads (threads 21 - 39) after this point does not
> appear to significantly change throughput. Do you know if your clients have
> hit such a plateau? In which case, maybe dialing down the concurrency is the
> right call since adding more threads is not getting you anything.
>
> 2) I also noticed that when I run the benchmark on older, non-multi-core
> hardware (say a Sun Fire v20z with two CPUs), I don't see the performance drop
> (which I am sure is a big clue as to the cause).
>
> I'll try to spend some time looking into this in the next day or two and
> let you know what I find.
>
> Eric
>
>
> Adam Lee wrote:
>>
>> we recently made the switch from the whalin client to spy and seem to
>> be running into problems under heavy concurrency/load in our front-end
>> servers and i was wondering if anybody (dustin, perhaps?) had any
>> ideas for strategies to deal with it.
>>
>> the majority of our front-end servers are sun fire t1000s (8 cores, 4
>> threads per core) running solaris 10, so obviously the spy client
>> works a lot better for us in the vast majority of cases-- the
>> synchronized blocks in the whalin connection pool gave us a lot of
>> contention problems in particular. when the systems get busy, though,
>> it seems that i/o can't keep up and we start seeing a lot of timeouts,
>> which in turn has a domino effect and effectively brings down the
>> entire cluster.  the problem is that the machines aren't even reaching
>> 60% cpu when this happens.
>>
>> does my diagnosis of the problem seem right and, if so, any ideas for
>> the best way to deal with this?  obviously adjusting timeouts would
>> probably only exacerbate the problem, so i toyed with the idea of
>> having a pool of clients (though i haven't really delved into the code
>> to see if that's feasible or would help at all) or possibly hacking it
>> to change how its i/o threads work.  for now, we've just added a few
>> more machines to this cluster, but it seems like a waste of hardware
>> when i know that these things can operate above 90% cpu for a
>> sustained period with no problem.
>>
>> thanks... any help would be great and let me know if you have any more
>> questions about specifics.
>>
>> --
>> awl
>>
>
>
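
For anyone who wants to try the thread-count comparison on their own hardware before Eric's setup is published, here is a minimal sketch of a sweep harness. It is not Eric's benchmark: the class name `ThreadSweep`, its methods, and the choice of a fixed number of operations per thread are all assumptions made for illustration. The operation is passed in as a plain `Runnable`, so a real run would wrap a blocking get or set against the client under test.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

// Hypothetical harness: measures aggregate throughput at several thread counts.
final class ThreadSweep {

    // Runs `op` opsPerThread times on each of `threads` threads and
    // returns the aggregate rate in operations per second.
    static double opsPerSecond(int threads, int opsPerThread, Runnable op) {
        CountDownLatch start = new CountDownLatch(1);   // releases all workers at once
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                try {
                    start.await();
                    for (int i = 0; i < opsPerThread; i++) {
                        op.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        long begin = System.nanoTime();
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - begin);
        return (double) threads * opsPerThread * 1e9 / elapsedNanos;
    }

    // Repeats the measurement for each thread count, preserving the given order.
    static Map<Integer, Double> sweep(int[] threadCounts, int opsPerThread, Runnable op) {
        Map<Integer, Double> results = new LinkedHashMap<>();
        for (int n : threadCounts) {
            results.put(n, opsPerSecond(n, opsPerThread, op));
        }
        return results;
    }
}
```

Sweeping something like 1, 10, 20, 40 and 60 threads and printing the map should show whether a given machine has the plateau and drop-off described above. A serious run would also want a warm-up pass and several repetitions per thread count, which this sketch leaves out.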

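Adam's idea of a pool of clients could be sketched roughly as below. This is an illustration only: `RoundRobinPool` is a hypothetical helper, not part of spymemcached, and whether it actually helps depends on the bottleneck being the client's I/O thread, which nobody in this thread has confirmed. The pool is generic so the sketch stands alone. In practice `T` would be `net.spy.memcached.MemcachedClient`, with each instance constructed against the same server list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: hands out pool members in round-robin order.
// Safe to call from many threads at once.
final class RoundRobinPool<T> {
    private final List<T> members;
    private final AtomicInteger cursor = new AtomicInteger();

    RoundRobinPool(List<T> members) {
        if (members.isEmpty()) {
            throw new IllegalArgumentException("pool must not be empty");
        }
        // Defensive copy so later changes to the caller's list do not affect the pool.
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    // Returns the next member. floorMod keeps the index valid
    // even after the counter wraps past Integer.MAX_VALUE.
    T next() {
        int index = Math.floorMod(cursor.getAndIncrement(), members.size());
        return members.get(index);
    }

    int size() {
        return members.size();
    }
}
```

Each request thread would call `pool.next()` and issue its get or set on the returned client, which spreads the load across several client instances instead of funnelling everything through one. The cost is more open connections per memcached server, so the pool size should stay small.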