D H wrote:

I agree with this; they wanted proof of a fix before any changes would be
made, and my manager is still saying there must be proof before code changes,
even now after I showed him your email and the documentation from the site.

Nice way of thinking -- at a university where time and money are infinite resources. Wishful thinking anywhere else. Enter The Real World.

Anyway if this is so hard to reproduce and profiling didn't give you any
idea what's broken, why do you suspect something in this particular piece of
code is the cause of your problem?


That's a very good question. I was given the source code with no explanation
of their code, and was told HttpClient was the problem.

Why am I not convinced? Still, it's quite possible. The best advice I can give: upgrade, for God's sake, and fix obvious mistakes in the use of the API. Use monitoring tools like JConsole / JVisualVM, jmap, netstat, top and the like to see whether you have a garbage-collection problem or any obvious resource leaks, then take appropriate action.
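As a starting point before reaching for external tools, the JVM can report its own heap and GC statistics. This is only a minimal sketch using the standard java.lang.management API; the class and method names are mine, not anything from this thread:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class GcCheck {
    // Builds a one-line summary of current heap usage and cumulative GC
    // activity. A GC time that keeps climbing between calls while heap
    // "used" stays near "max" is a hint of a memory problem.
    static String summary() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long gcCount = 0, gcTimeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values may be -1 if the collector does not report them.
            gcCount += Math.max(0, gc.getCollectionCount());
            gcTimeMs += Math.max(0, gc.getCollectionTime());
        }
        return String.format("heap used=%d/%d bytes, gc runs=%d, gc time=%d ms",
                heap.getUsed(), heap.getMax(), gcCount, gcTimeMs);
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Logging such a line periodically from the production JVM itself is cheap and survives even when you cannot attach JConsole remotely.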

It is a rather large
codebase, so I'm taking their word for it. They've apparently worked on it
sporadically for a couple of months to isolate it to HttpClient, and my task
is to prove it is causing this problem with JMeter in a self-contained code
sample. This problem has only been seen in Production, and only after almost
a week of running 24/7, so it's hard to duplicate. I've sent over
two hundred thousand HttpClient requests without seeing the problem, so personally I'd
rather see this code fix go to Production and test that way.

Sounds like a typical production problem to me: it can take weeks to show up, there is no way to trigger it on purpose, and maybe it has only ever been encountered on the production system.

Face it: you will not reproduce it locally in reasonable time. It may depend on the workload you are running. Maybe it's even platform specific and may not trigger on your testing platform -- and by platform I don't mean just the OS. The processor type (single core vs. multi core) can also make a huge difference.

What can really help you is to be prepared for the situation in production. Instead of panicking and quickly restarting, take your time and have the right tools at hand to find out what's going on at that moment. Maybe even set up a "post mortem" job that gathers useful information in case this happens while everyone is sleeping. Questions to answer:
- Is it swapping?
- Has the VM run out of memory (stack, heap, perm gen, code) and is it constantly GCing?
- Has the OS run out of file descriptors?
- To which signals does it react?
- Is it creating threads at a high rate? Are there just too many runnable threads?
- Is it busy waiting? Is it looping endlessly?
- Is it I/O bound? Is it blocking on I/O or the network?
- Is it lock contended or even deadlocked?
- Is it waiting for the DB? What's going on on the DB? What's going on on the network?
- Is your logging detailed enough to give you the information you need?
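Some of that "post mortem" data can be gathered from inside the JVM itself, without shell access to the box. Here is a minimal sketch, again using only java.lang.management; the class name PostMortem and the exact set of metrics are my own illustration:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class PostMortem {
    // Collects a small snapshot of thread and heap state. In a real
    // deployment you would append this to a log file on a timer, or
    // trigger it from a watchdog when the app stops responding.
    static String report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        long[] deadlocked = threads.findDeadlockedThreads(); // null if none
        StringBuilder sb = new StringBuilder();
        sb.append("live threads: ").append(threads.getThreadCount()).append('\n');
        sb.append("peak threads: ").append(threads.getPeakThreadCount()).append('\n');
        sb.append("heap used: ").append(mem.getHeapMemoryUsage().getUsed())
          .append(" bytes\n");
        sb.append("deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

A full thread dump (jstack, or `kill -3` on Unix) and a heap dump via jmap complement this snapshot when the problem is actually happening.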

If all else fails, you may have to live with it and instead set up monitoring infrastructure that can reliably detect the situation and restart the process.
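The restart-on-failure idea boils down to a watchdog loop: poll a health probe, and fire a restart action when it fails. This is only an illustrative sketch; the Probe interface and monitor method are hypothetical names of mine, and a real watchdog would poll on a timer and restart an external process:

```java
public class Watchdog {
    interface Probe { boolean healthy(); }

    // Polls the probe up to maxPolls times. Runs the restart action and
    // returns the poll number at which the probe first failed, or -1 if
    // the probe stayed healthy throughout.
    static int monitor(Probe probe, int maxPolls, Runnable restart) {
        for (int i = 1; i <= maxPolls; i++) {
            if (!probe.healthy()) {
                restart.run();
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated probe that fails on its third check.
        int polled = monitor(() -> ++calls[0] < 3, 5,
                () -> System.out.println("restarting"));
        System.out.println("restart after poll " + polled);
    }
}
```

In practice the probe would be an HTTP ping or a log scan, and the restart action a call to the service manager, so that the 3 a.m. incident resolves itself while the post-mortem data is saved for the morning.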


I really appreciate you taking the time to answer my emails, thank you very
much.

Sincerely,
David Hamilton

Ortwin

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
