Can you confirm that the memory size reported is RES (the resident set size)?
-Sam
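(For reference, a couple of standard ways to read the resident set size — the "RES" column in top — for the OSD processes, assuming a stock Linux /proc:
ps -o pid,rss,vsz,comm -C ceph-osd   # rss and vsz are in kB
grep VmRSS /proc/$(pidof -s ceph-osd)/status
)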
On Mon, Feb 18, 2013 at 8:46 AM, Christopher Kunz chrisl...@de-punkt.de wrote:
On 16.02.13 10:09, Wido den Hollander wrote:
On 02/16/2013 08:09 AM, Andrey Korolyov wrote:
Can anyone who hit this bug please confirm that your system contains
libc 2.15+?
Hello,
when we started a deep scrub on our 0.56.2 cluster today, we saw a
massive memleak about 1 hour into the
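(For the libc question above: on Debian/Ubuntu the installed glibc version can be checked with either of these one-liners:
ldd --version | head -n1
getconf GNU_LIBC_VERSION
)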
+1
--
Regards,
Sébastien Han.
On Sat, Feb 16, 2013 at 10:09 AM, Wido den Hollander w...@42on.com wrote:
On 02/16/2013 08:09 AM, Andrey Korolyov wrote:
Can anyone who hit this bug please confirm that your system contains libc
2.15+?
I've seen this with 0.56.2 as well on Ubuntu 12.04.
Can anyone who hit this bug please confirm that your system contains libc 2.15+?
On Tue, Feb 5, 2013 at 1:27 AM, Sébastien Han han.sebast...@gmail.com wrote:
oh nice, the pattern also matches path :D, didn't know that
thanks Greg
--
Regards,
Sébastien Han.
On Mon, Feb 4, 2013 at 10:22 PM,
Hum just tried several times on my test cluster and I can't get any
core dump. Does Ceph commit suicide or something? Is it expected
behavior?
--
Regards,
Sébastien Han.
On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han han.sebast...@gmail.com wrote:
Hi Loïc,
Thanks for bringing our discussion
On Mon, 4 Feb 2013, Sébastien Han wrote:
Hum just tried several times on my test cluster and I can't get any
core dump. Does Ceph commit suicide or something? Is it expected
behavior?
SIGSEGV should trigger the usual path that dumps a stack trace and then
dumps core. Was your ulimit -c set
...and/or do you have the corepath set interestingly, or one of the
core-trapping mechanisms turned on?
On 02/04/2013 11:29 AM, Sage Weil wrote:
On Mon, 4 Feb 2013, Sébastien Han wrote:
Hum just tried several times on my test cluster and I can't get any
core dump. Does Ceph commit suicide or
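(For anyone following along, the usual core-dump checklist — plain Linux knobs, nothing Ceph-specific:
ulimit -c                          # 0 means no core file will ever be written
ulimit -c unlimited                # enable, then restart the daemon from that shell
cat /proc/sys/kernel/core_pattern  # where (and through what) the kernel writes cores
)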
OK, I finally managed to get something on my test cluster;
unfortunately, the dump goes to /.
Any idea how to change the destination path?
My production / won't be big enough...
--
Regards,
Sébastien Han.
On Mon, Feb 4, 2013 at 10:03 PM, Dan Mick dan.m...@inktank.com wrote:
...and/or do you have
Set your /proc/sys/kernel/core_pattern file. :) http://linux.die.net/man/5/core
-Greg
On Mon, Feb 4, 2013 at 1:08 PM, Sébastien Han han.sebast...@gmail.com wrote:
OK, I finally managed to get something on my test cluster;
unfortunately, the dump goes to /.
Any idea how to change the destination
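(A minimal sketch of that suggestion; /var/crash is an arbitrary example directory, not anything from this thread:
mkdir -p /var/crash
# %e = executable name, %p = PID; see core(5) for the full list of specifiers
echo '/var/crash/core.%e.%p' > /proc/sys/kernel/core_pattern
# to persist across reboots, add to /etc/sysctl.conf:
#   kernel.core_pattern = /var/crash/core.%e.%p
)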
oh nice, the pattern also matches path :D, didn't know that
thanks Greg
--
Regards,
Sébastien Han.
On Mon, Feb 4, 2013 at 10:22 PM, Gregory Farnum g...@inktank.com wrote:
Set your /proc/sys/kernel/core_pattern file. :)
http://linux.die.net/man/5/core
-Greg
On Mon, Feb 4, 2013 at 1:08 PM,
Hi,
As discussed during FOSDEM, the script you wrote to kill the OSD when it grows
too much could be amended to dump core instead of just being killed and
restarted. The binary + core could probably be used to figure out where the
leak is.
You should make sure the OSD current working directory
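(The message is cut off above, but the amended watchdog might look roughly like this — a sketch with a hypothetical RSS threshold; the original script isn't shown in the thread:
#!/bin/sh
# Send SIGSEGV instead of a plain kill so the OSD leaves a stack trace and a core behind.
LIMIT_KB=2000000   # hypothetical 2 GB RSS limit
for PID in $(pidof ceph-osd); do
    RSS=$(awk '/VmRSS/ {print $2}' "/proc/$PID/status")
    [ "$RSS" -gt "$LIMIT_KB" ] && kill -SEGV "$PID"
done
)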
Hi Loïc,
Thanks for bringing our discussion on the ML. I'll check that tomorrow :-).
Cheers
--
Regards,
Sébastien Han.
Hi,
I disabled scrubbing using
ceph osd tell \* injectargs '--osd-scrub-min-interval 100'
ceph osd tell \* injectargs '--osd-scrub-max-interval 1000'
and the leak seems to be gone.
See the graph at http://i.imgur.com/A0KmVot.png with the OSD memory
for the 12 osd processes over the
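(A quick way to double-check that the injected values actually took effect, assuming the default admin-socket path:
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep scrub
)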
Hi,
I'm crossing my fingers, but I just noticed that since I upgraded to kernel
version 3.2.0-36-generic on Ubuntu 12.04 the other day, ceph-osd memory
usage has stayed stable.
Unfortunately for me, I'm already on 3.2.0-36-generic (Ubuntu 12.04 as well).
Cheers,
Sylvain
PS: Dave
On Wed, 30 Jan 2013, Sylvain Munaut wrote:
Just to keep you posted: upgraded our cluster yesterday to a custom-compiled
0.56.1, and it has now been more than 24h and there is no sign of the memory
leak anymore. Previously it would rise by ~100 MB every 24h, almost like
clockwork, and now it's
Hi,
Can you try disabling scrubbing and see if the leak stops?
ceph osd tell \* injectargs '--osd-scrub-load-threshold .01'
(that will work for 0.56.1, but is fixed in later versions, btw.) On
newer code,
ceph osd tell \* injectargs '--osd-scrub-min-interval 100'
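(Note that injectargs only changes the running process; to make a setting survive a restart it has to go into ceph.conf, e.g.:
[osd]
    osd scrub load threshold = 0.01
)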
Hi,
Just to keep you posted: upgraded our cluster yesterday to a custom-compiled
0.56.1, and it has now been more than 24h and there is no sign of the memory
leak anymore. Previously it would rise by ~100 MB every 24h, almost like
clockwork, and now it's been slightly more than 24h and memory is
Hi,
Could you provide those heaps? Is that possible?
--
Regards,
Sébastien Han.
On Tue, Jan 22, 2013 at 10:38 PM, Sébastien Han han.sebast...@gmail.com wrote:
Well ideally you want to run the profiler during the scrubbing process
when the memory leaks appear :-).
--
Regards,
Sébastien Han.
Could you provide those heaps? Is that possible?
We're updating this weekend to 0.56.1.
If it still happens after the update, I'll try to reproduce it on our
test infra and do the profiling there, because unfortunately running the
profiler seems to make it eat up a lot of CPU and RAM ...
I also need to
Hi,
Since I have had Ceph in prod, I have experienced a memory leak in the OSDs,
forcing me to restart them every 5 or 6 days. Without that, the OSD
process just grows infinitely and eventually gets killed by the OOM
killer. (To make sure it wasn't legitimate, I left one grow up to 4G
of RSS ...).
Here's for
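(For what it's worth, the kernel log records such kills, so the OOM killer can be confirmed with:
dmesg | grep -i -E 'out of memory|killed process'
)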
Hi,
I originally started a thread around these memory leaks problems here:
http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg11000.html
I'm happy to see that someone supports my theory about the scrubbing
process leaking memory. I only use RBD from Ceph, so your theory
makes sense as
Hi,
I don't really want to try the mem profiler, I had quite a bad
experience with it on a test cluster. While running the profiler some
OSD crashed...
The only way to fix this is to provide a heap dump. Could you provide one?
I just did:
ceph osd tell 0 heap start_profiler
ceph osd tell 0
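(The message is cut off above; for reference, the typical profiler sequence from that era looked roughly like this — the log directory and dump filename are assumptions based on default settings:
ceph osd tell 0 heap start_profiler
# ... reproduce the leak, e.g. let a scrub run ...
ceph osd tell 0 heap dump            # writes a .heap file into the OSD's log directory
ceph osd tell 0 heap stop_profiler
# analyze with google-perftools, e.g.:
#   pprof --text /usr/bin/ceph-osd /var/log/ceph/osd.0.profile.0001.heap
)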
Well ideally you want to run the profiler during the scrubbing process
when the memory leaks appear :-).
--
Regards,
Sébastien Han.
On Tue, Jan 22, 2013 at 10:32 PM, Sylvain Munaut
s.mun...@whatever-company.com wrote:
Hi,
I don't really want to try the mem profiler, I had quite a bad