Re: OSD memory leaks?

2013-03-14 Thread Dave Spano
Sage Weil", "Wido den Hollander", "Sylvain Munaut", "Samuel Just", "Vladislav Gorbunov" Sent: Wednesday, March 13, 2013 3:59:03 PM Subject: Re: OSD memory leaks? Dave, Just to be sure, did the log max recent=1 _completely_ stop the mem

Re: OSD memory leaks?

2013-03-13 Thread Josh Durgin
On 03/13/2013 05:05 PM, Dave Spano wrote: I renamed the old one from images to images-old, and the new one from images-new to images. This reminds me of a problem you might hit with this: RBD clones track the parent image pool by id, so they'll continue working after the pool is renamed. If y

Re: OSD memory leaks?

2013-03-13 Thread Dave Spano
"Sage Weil" <s...@inktank.com>, "Wido den Hollander" <w...@42on.com>, "Sylvain Munaut" <s.mun...@whatever-company.com>, "Samuel Just" <sam.j...@inktank.c

Re: OSD memory leaks?

2013-03-13 Thread Greg Farnum
> To: "Dave Spano" <dsp...@optogenics.com> > Cc: "Greg Farnum" <g...@inktank.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <s...@inktank.com>, "Wido den Hollander"

Re: OSD memory leaks?

2013-03-13 Thread Dave Spano
PermissionError: error creating image Dave Spano - Original Message - From: "Sébastien Han" To: "Dave Spano" Cc: "Greg Farnum", "ceph-devel", "Sage Weil", "Wido den Hollander", "Sylvain

Re: OSD memory leaks?

2013-03-13 Thread Sébastien Han
> - Original Message - > From: "Greg Farnum" > To: "Dave Spano" > Cc: "ceph-devel", "Sage Weil", "Wido den Hollander", "Sylvain Munaut", "Samuel Just", "Vladis

Re: OSD memory leaks?

2013-03-13 Thread Dave Spano
<dsp...@optogenics.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Sage Weil" <s...@inktank.com>, "Wido den Hollander" <w...@42on.com>, "Sylvain Munaut" <s.mun...@whatever-company.com>, "Samuel J

Re: OSD memory leaks?

2013-03-12 Thread Greg Farnum
<s...@inktank.com>, "Wido den Hollander" <w...@42on.com>, "Sylvain Munaut" <s.mun...@whatever-company.com>, "Samuel Just" <sam.j...@inktank.com>, "Vladislav Gorbunov" <vadi...@gmail.com> > Sent: Tuesd

Re: OSD memory leaks?

2013-03-12 Thread Dave Spano
com)>, "Gregory Farnum" <g...@inktank.com>, "Sylvain Munaut" <s.mun...@whatever-company.com>, "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" <sam.j...@inktank.com>

Re: OSD memory leaks?

2013-03-12 Thread Bryan K. Wright
han.sebast...@gmail.com said: > Well to avoid unnecessary data movement, there is also an _experimental_ > feature to change the number of PGs in a pool on the fly. > ceph osd pool set pg_num --allow-experimental-feature I've been following the instructions here: http://ceph.com/docs/master/rado

Re: OSD memory leaks?

2013-03-12 Thread Greg Farnum
<han.sebast...@gmail.com> > Cc: "Sage Weil" <s...@inktank.com>, "Wido den Hollander" <w...@42on.com>, "Gregory Farnum" <g...@inktank.com>, "Sylvain Munaut" <s.mun...@w

Re: OSD memory leaks?

2013-03-12 Thread Sébastien Han
> From: "Dave Spano" > To: "Sébastien Han" > Cc: "Sage Weil", "Wido den Hollander", "Gregory Farnum", "Sylvain Munaut", "ceph-devel", "Samuel Just", "Vladislav Gorbunov"

Re: OSD memory leaks?

2013-03-12 Thread Dave Spano
To: "Sébastien Han" Cc: "Sage Weil", "Wido den Hollander", "Gregory Farnum", "Sylvain Munaut", "ceph-devel", "Samuel Just", "Vladislav Gorbunov" Sent: Tuesday, March 12, 2013 1:41:21 PM Subject: Re: OSD m

Re: OSD memory leaks?

2013-03-12 Thread Sébastien Han
> Sorry, I mean pg_num and pgp_num on all pools. Shown by the "ceph osd dump | grep 'rep size'" Well it's still 450 each... > The default pg_num value 8 is NOT suitable for a big cluster. Thanks I know, I'm not new to Ceph. What's your point here? I already said that pg_num was 450... -- Regards,

Re: OSD memory leaks?

2013-03-12 Thread Vladislav Gorbunov
Sorry, I mean pg_num and pgp_num on all pools. Shown by the "ceph osd dump | grep 'rep size'" The default pg_num value 8 is NOT suitable for a big cluster. 2013/3/13 Sébastien Han: > Replica count has been set to 2. > > Why? > -- > Regards, > Sébastien Han. > > > On Tue, Mar 12, 2013 at 12:45 PM, V

Re: OSD memory leaks?

2013-03-12 Thread Sébastien Han
Replica count has been set to 2. Why? -- Regards, Sébastien Han. On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov wrote: >> FYI I'm using 450 pgs for my pools. > Please, can you show the number of object replicas? > > ceph osd dump | grep 'rep size' > > Vlad Gorbunov > > 2013/3/5 Sébastien

Re: OSD memory leaks?

2013-03-12 Thread Vladislav Gorbunov
> FYI I'm using 450 pgs for my pools. Please, can you show the number of object replicas? ceph osd dump | grep 'rep size' Vlad Gorbunov 2013/3/5 Sébastien Han : > FYI I'm using 450 pgs for my pools. > > -- > Regards, > Sébastien Han. > > > On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil wrote: >> >>
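The check requested above, reading each pool's replica count and PG counts out of "ceph osd dump", can be scripted. The following is a minimal sketch only: the pool line layout in SAMPLE is an assumption modelled on Ceph releases from the period of this thread (other releases print different fields, for example "replicated size"), and the pool names and numbers are hypothetical, not taken from the thread.

```python
import re

# Assumed line layout; adjust the pattern if your release prints other fields.
POOL_LINE = re.compile(
    r"^pool (?P<id>\d+) '(?P<name>[^']+)' rep size (?P<rep_size>\d+)"
    r".*?\bpg_num (?P<pg_num>\d+) pgp_num (?P<pgp_num>\d+)"
)


def parse_pools(dump_text):
    """Return one dict per pool line found in `ceph osd dump` output."""
    pools = []
    for line in dump_text.splitlines():
        match = POOL_LINE.match(line.strip())
        if match:
            pools.append({
                "id": int(match["id"]),
                "name": match["name"],
                "rep_size": int(match["rep_size"]),
                "pg_num": int(match["pg_num"]),
                "pgp_num": int(match["pgp_num"]),
            })
    return pools


# Hypothetical sample output, used only to exercise the parser.
SAMPLE = """\
epoch 120
pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 450 pgp_num 450 last_change 1 owner 0
pool 3 'images' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 57 owner 0
"""

for pool in parse_pools(SAMPLE):
    print(pool["name"], pool["rep_size"], pool["pg_num"], pool["pgp_num"])
```

On a real cluster the text would come from running "ceph osd dump" and passing its output to parse_pools; a pool still at the default pg_num of 8 then stands out immediately.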

Re: OSD memory leaks?

2013-03-11 Thread Sébastien Han
>>> 's keeping 100k lines of logs in memory, which can eat a lot of ram (but is great when debugging issues). > Dave Spano > - Original Message - > From: "Sébastien Han" > To: "Sage Weil" > Cc: "Wido den Hol

Re: OSD memory leaks?

2013-03-04 Thread Sébastien Han
FYI I'm using 450 pgs for my pools. -- Regards, Sébastien Han. On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil wrote: > > On Fri, 1 Mar 2013, Wido den Hollander wrote: > > On 02/23/2013 01:44 AM, Sage Weil wrote: > > > On Fri, 22 Feb 2013, Sébastien Han wrote: > > > > Hi all, > > > > > > > > I finall

Re: OSD memory leaks?

2013-03-01 Thread Sage Weil
On Fri, 1 Mar 2013, Wido den Hollander wrote: > On 02/23/2013 01:44 AM, Sage Weil wrote: > > On Fri, 22 Feb 2013, Sébastien Han wrote: > > > Hi all, > > > > > > I finally got a core dump. > > > > > > I did it with a kill -SEGV on the OSD process. > > > > > > https://www.dropbox.com/s/ahv6hm0ipna

Re: OSD memory leaks?

2013-03-01 Thread Samuel Just
That pattern would seem to support the log trimming theory of the leak. -Sam On Fri, Mar 1, 2013 at 7:51 AM, Wido den Hollander wrote: > On 02/23/2013 01:44 AM, Sage Weil wrote: >> >> On Fri, 22 Feb 2013, Sébastien Han wrote: >>> >>> Hi all, >>> >>> I finally got a core dump. >>> >>> I did it wit

Re: OSD memory leaks?

2013-03-01 Thread Wido den Hollander
On 02/23/2013 01:44 AM, Sage Weil wrote: On Fri, 22 Feb 2013, Sébastien Han wrote: Hi all, I finally got a core dump. I did it with a kill -SEGV on the OSD process. https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 Hope we will get something out of it :-). AHA

Re: OSD memory leaks?

2013-02-25 Thread Sébastien Han
Ok thanks guys. Hope we will find something :-). -- Regards, Sébastien Han. On Mon, Feb 25, 2013 at 8:51 AM, Wido den Hollander wrote: > On 02/25/2013 01:21 AM, Sage Weil wrote: >> >> On Mon, 25 Feb 2013, Sébastien Han wrote: >>> >>> Hi Sage, >>> >>> Sorry it's a production system, so I can't te

Re: OSD memory leaks?

2013-02-24 Thread Wido den Hollander
On 02/25/2013 01:21 AM, Sage Weil wrote: On Mon, 25 Feb 2013, Sébastien Han wrote: Hi Sage, Sorry it's a production system, so I can't test it. So at the end, you can't get anything out of the core dump? I saw a bunch of dup object names, which is what led us to the pg log theory. I can look

Re: OSD memory leaks?

2013-02-24 Thread Sage Weil
On Mon, 25 Feb 2013, Sébastien Han wrote: > Hi Sage, > > Sorry it's a production system, so I can't test it. > So at the end, you can't get anything out of the core dump? I saw a bunch of dup object names, which is what led us to the pg log theory. I can look a bit more carefully to confirm, bu

Re: OSD memory leaks?

2013-02-24 Thread Sébastien Han
Hi Sage, Sorry it's a production system, so I can't test it. So at the end, you can't get anything out of the core dump? -- Regards, Sébastien Han. On Sat, Feb 23, 2013 at 1:44 AM, Sage Weil wrote: > On Fri, 22 Feb 2013, Sébastien Han wrote: >> Hi all, >> >> I finally got a core dump. >> >> I

Re: OSD memory leaks?

2013-02-22 Thread Sage Weil
On Fri, 22 Feb 2013, Sébastien Han wrote: > Hi all, > > I finally got a core dump. > > I did it with a kill -SEGV on the OSD process. > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 > > Hope we will get something out of it :-). AHA! We have a theory. The p

Re: OSD memory leaks?

2013-02-22 Thread Sébastien Han
Hi all, I finally got a core dump. I did it with a kill -SEGV on the OSD process. https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 Hope we will get something out of it :-). -- Regards, Sébastien Han. On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum wrote: > On F

Re: OSD memory leaks?

2013-01-11 Thread Gregory Farnum
On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han wrote: >> Is osd.1 using the heap profiler as well? Keep in mind that active use >> of the memory profiler will itself cause memory usage to increase — >> this sounds a bit like that to me since it's staying stable at a large >> but finite portion of

Re: OSD memory leaks?

2013-01-11 Thread Sébastien Han
> Is osd.1 using the heap profiler as well? Keep in mind that active use > of the memory profiler will itself cause memory usage to increase — > this sounds a bit like that to me since it's staying stable at a large > but finite portion of total memory. Well, the memory consumption was already hig

Re: OSD memory leaks?

2013-01-10 Thread Gregory Farnum
On Wed, Jan 9, 2013 at 10:09 AM, Sylvain Munaut wrote: > Just fyi, I also have growing memory on OSD, and I have the same logs: > > "libceph: osd4 172.20.11.32:6801 socket closed" in the RBD clients That message is not an error; it just happens if the RBD client doesn't talk to that OSD for a whi

Re: OSD memory leaks?

2013-01-10 Thread Gregory Farnum
On Wed, Jan 9, 2013 at 8:10 AM, Dave Spano wrote: > Yes, I'm using argonaut. > > I've got 38 heap files from yesterday. Currently, the OSD in question is > using 91.2% of memory according to top, and staying there. I initially > thought it would go until the OOM killer started killing processes,

Re: OSD memory leaks?

2013-01-09 Thread Dave Spano
Thank you. I appreciate it! Dave Spano Optogenics Systems Administrator - Original Message - From: "Sébastien Han" To: "Dave Spano" Cc: "ceph-devel", "Samuel Just" Sent: Wednesday, January 9, 2013 5:12:12 PM Subject: Re: OSD mem

Re: OSD memory leaks?

2013-01-09 Thread Sébastien Han
t 10:42 PM, Dave Spano wrote: > That's very good to know. I'll be restarting ceph-osd right now! Thanks for > the heads up! > > Dave Spano > Optogenics > Systems Administrator > > > > - Original Message - > > From: "Sébastien Han" > T

Re: OSD memory leaks?

2013-01-09 Thread Dave Spano
Sent: Wednesday, January 9, 2013 11:35:13 AM Subject: Re: OSD memory leaks? If you wait too long, the system will trigger OOM killer :D, I already experienced that unfortunately... Sam? On Wed, Jan 9, 2013 at 5:10 PM, Dave Spano wrote: > OOM killer -- Regards, Sébastien

Re: OSD memory leaks?

2013-01-09 Thread Sébastien Han
Hi, Thanks for the input. I also have tons of "socket closed", I recall that this message is harmless. Anyway Cephx has been disabled on my platform from the beginning... Anyone to approve or disapprove my "scrub theory"? -- Regards, Sébastien Han. On Wed, Jan 9, 2013 at 7:09 PM, Sylvain Munaut wrote

Re: OSD memory leaks?

2013-01-09 Thread Sylvain Munaut
Just fyi, I also have growing memory on OSD, and I have the same logs: "libceph: osd4 172.20.11.32:6801 socket closed" in the RBD clients I traced that problem and correlated it to some cephx issue in the OSD some time ago in this thread http://www.mail-archive.com/ceph-devel@vger.kernel.org/ms

Re: OSD memory leaks?

2013-01-09 Thread Sébastien Han
If you wait too long, the system will trigger OOM killer :D, I already experienced that unfortunately... Sam? On Wed, Jan 9, 2013 at 5:10 PM, Dave Spano wrote: > OOM killer -- Regards, Sébastien Han.

Re: OSD memory leaks?

2013-01-09 Thread Dave Spano
Message - From: "Sébastien Han" To: "Samuel Just" Cc: "Dave Spano", "ceph-devel" Sent: Wednesday, January 9, 2013 10:20:43 AM Subject: Re: OSD memory leaks? I guess he runs Argonaut as well. More suggestions about this problem? Thanks! -- Regard

Re: OSD memory leaks?

2013-01-09 Thread Sébastien Han
pano" > > To: "Sébastien Han" > > Cc: "ceph-devel", "Samuel Just" > > Sent: Monday, January 7, 2013 12:40:06 PM > > Subject: Re: OSD memory leaks? > > Sam, > > Attached are some heaps tha

Re: OSD memory leaks?

2013-01-07 Thread Samuel Just
> - Original Message - > From: "Dave Spano" > To: "Sébastien Han" > Cc: "ceph-devel", "Samuel Just" > Sent: Monday, January 7, 2013 12:40:06 PM > Subject: Re: OSD memory leaks? > Sam, > Attached are some he

Re: OSD memory leaks?

2013-01-04 Thread Sébastien Han
Hi Sam, Thanks for your answer and sorry for the late reply. Unfortunately I can't get anything useful out of the profiler; actually I do get output, but I guess it doesn't show what it is supposed to show... I will keep on trying this. Anyway yesterday I just thought that the problem might be due to some overuse of

Re: OSD memory leaks?

2012-12-19 Thread Samuel Just
Sorry, it's been very busy. The next step would be to try to get a heap dump. You can start a heap profile on osd N by: ceph osd tell N heap start_profiler and you can get it to dump the collected profile using ceph osd tell N heap dump. The dumps should show up in the osd log directory. Assumi
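The two commands quoted above can be wrapped in a small helper so the sequence is repeatable for any OSD. This is a sketch, not an official tool: heap_command and heap_profile_plan are hypothetical helper names, and only the two actions named in the message (start_profiler and dump) are accepted. The script only prints the commands; nothing is executed.

```python
import shlex

# Only the two heap actions mentioned in the thread are allowed here.
ALLOWED_ACTIONS = ("start_profiler", "dump")


def heap_command(osd_id, action):
    """Build the argv for `ceph osd tell N heap <action>` as quoted in the thread."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError("unsupported heap action: %r" % (action,))
    return ["ceph", "osd", "tell", str(int(osd_id)), "heap", action]


def heap_profile_plan(osd_id, dumps=1):
    """Start the profiler once, then request `dumps` heap dumps."""
    plan = [heap_command(osd_id, "start_profiler")]
    plan += [heap_command(osd_id, "dump") for _ in range(dumps)]
    return plan


# Print the commands for osd.1 instead of running them.
for argv in heap_profile_plan(1, dumps=2):
    print(shlex.join(argv))
```

To actually run a step on a cluster node, each argv list can be passed to subprocess.run(argv, check=True). In practice the dumps should be spaced out over time so that successive heap files show the growth.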

Re: OSD memory leaks?

2012-12-19 Thread Sébastien Han
No more suggestions? :( -- Regards, Sébastien Han. On Tue, Dec 18, 2012 at 6:21 PM, Sébastien Han wrote: > Nothing terrific... > > Kernel logs from my clients are full of "libceph: osd4 > 172.20.11.32:6801 socket closed" > > I saw this somewhere on the tracker. > > Is this harmful? > > Thanks. >

Re: OSD memory leaks?

2012-12-18 Thread Sébastien Han
Nothing terrific... Kernel logs from my clients are full of "libceph: osd4 172.20.11.32:6801 socket closed" I saw this somewhere on the tracker. Is this harmful? Thanks. -- Regards, Sébastien Han. On Mon, Dec 17, 2012 at 11:55 PM, Samuel Just wrote: > > What is the workload like? > -Sam > >

Re: OSD memory leaks?

2012-12-17 Thread Samuel Just
What is the workload like? -Sam On Mon, Dec 17, 2012 at 2:41 PM, Sébastien Han wrote: > Hi, > > No, I don't see anything abnormal in the network stats. I don't see > anything in the logs... :( > The weird thing is that one node out of 4 seems to take way more memory > than the others... > > -- > Reg

Re: OSD memory leaks?

2012-12-17 Thread Sébastien Han
Hi, No, I don't see anything abnormal in the network stats. I don't see anything in the logs... :( The weird thing is that one node out of 4 seems to take way more memory than the others... -- Regards, Sébastien Han. On Mon, Dec 17, 2012 at 11:31 PM, Sébastien Han wrote: > > Hi, > > No, I don't s

Re: OSD memory leaks?

2012-12-17 Thread Samuel Just
Are you having network hiccups? There was a bug noticed recently that could cause a memory leak if nodes are being marked up and down. -Sam On Mon, Dec 17, 2012 at 12:28 AM, Sébastien Han wrote: > Hi guys, > > Today looking at my graphs I noticed that one of 4 ceph nodes used a > lot of memory