On 03/09/2012 10:04 AM, Fraser Adams wrote:
I don't suppose anyone has had any more thoughts on this?
Unfortunately, sorting out our network problem hasn't resolved this
issue. It takes a lot longer for the broker memory to grow, but it
still grows.
Have you tracked (or can you track) the queue depth, connection and
session stats for the broker exhibiting the problem? Is there anything
you can think of that might correlate with the rate of growth (e.g. does
it look like it's per message)?
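
One way to test the "per message" question is to sample the broker's
resident memory alongside a message counter and fit a line through the
samples. The following is only a minimal sketch under stated
assumptions: it assumes a Linux host (it reads VmRSS from
/proc/<pid>/status), and it assumes the message counts are recorded
separately from the broker's own statistics at the same moments. The
function names are hypothetical and are not part of qpid.

```python
import re
from typing import List, Tuple


def parse_vm_rss_kb(status_text: str) -> int:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_rss_kb(pid: int) -> int:
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())


def growth_per_unit(samples: List[Tuple[float, float]]) -> float:
    """Least-squares slope of y against x for a list of (x, y) samples.

    With x = cumulative messages and y = RSS in kB, the result is an
    estimate of kB of growth per message.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx
```

If the slope against message count stays roughly constant across runs
while the slope against wall-clock time varies with load, that would
point towards per-message growth; the reverse would suggest something
time- or connection-driven.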
As I say below we've got a 0.8 qpid::client producer delivering to
amq.match on a broker co-located on the same host which is federated to
another 0.8 broker (all brokers are c++) via a source queue route.
Is that all that's happening on the problem broker?
One weird thing: As an experiment we kept the general topology the same
but we moved the first broker on to its own host "just to see", so we've
got the producer on one host writing to amq.match on a broker now on a
different host with that broker federated to the core broker as before.
We've had that running for days now and the brokers all seem to be
stable!!!
Has anyone seen circumstances that could cause brokers to appear to leak
memory when co-located with a producer but be fine when run on a
separate host??
No.
I don't believe that there are any significant differences in the
dependent libraries on each host, but I couldn't swear to it. Is anyone
aware of stability issues with, say, particular versions of boost and
qpid, or indeed any other library?
Annoyingly, I've never noticed things like this in my setup at home,
just at work where it matters more and I've got deadlines to meet :-(
Can anyone think of a good way to "profile" our hosts to verify that
they should be able to run qpid with no issues? I always build from
source at home (that has its own issues on Ubuntu!!!!) but at work I
believe qpid was installed from RPMs. I'm not clear on the provenance
of the RPMs, though, and I'm a bit suspicious of them: they don't
appear to have many dependency checks (for example, the install didn't
barf when SASL wasn't present, but SASL seems to be necessary even with
--auth no).
So one possibility for the carnage I'm seeing is that some hosts might
have slightly different versions of dependent libraries, which is why
I'd like to know in a systematic manner what to check for.
rpm -qv <name> will give you the versioned name of the rpms which may
shed some light...
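
To compare hosts systematically rather than one package at a time, the
installed package lists can be dumped on each host and diffed. This is
a rough sketch, not a definitive tool: it assumes each list was
captured with
rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}.%{ARCH}\n'
and saved to a file per host. The function names are hypothetical.

```python
from typing import Dict, Set, Tuple


def parse_package_list(text: str) -> Dict[str, Set[str]]:
    """Parse lines of the form '<name> <version-release.arch>'.

    A package name can map to several versions (e.g. multilib installs),
    so versions are collected into a set. Malformed lines are skipped.
    """
    packages: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, version = parts
        packages.setdefault(name, set()).add(version)
    return packages


def diff_hosts(
    host_a: Dict[str, Set[str]], host_b: Dict[str, Set[str]]
) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """Return the packages whose installed versions differ between hosts.

    A package missing on one host shows up with an empty set on that side.
    """
    differences: Dict[str, Tuple[Set[str], Set[str]]] = {}
    for name in sorted(set(host_a) | set(host_b)):
        versions_a = host_a.get(name, set())
        versions_b = host_b.get(name, set())
        if versions_a != versions_b:
            differences[name] = (versions_a, versions_b)
    return differences
```

Running diff_hosts on the list from the stable host and the list from
the leaking host, and then looking first at qpid, boost and SASL
entries, would narrow down which library versions are worth
investigating.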
I can't think of any known bug that could explain this. It's just
possible that some combination of versions in the federation results
in mismatched expectations, I suppose.
On a related note, is anyone aware of any differences in behaviour
relating to hardware/chipset? All of our hosts are running RHEL but they
are a mix of hardware - all Intel but varying numbers of cores and
chipsets.
Nothing that should cause leaks... are you running any RHEL6?