Sage Weil" , "Wido den
Hollander" , "Sylvain Munaut" ,
"Samuel Just" , "Vladislav Gorbunov"
Sent: Wednesday, March 13, 2013 3:59:03 PM
Subject: Re: OSD memory leaks?
Dave,
Just to be sure, did the log max recent=1 setting _completely_ stop the
memory growth?
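(For readers following along, a minimal sketch of where that setting lives; the
injectargs line assumes the Argonaut-era "ceph osd tell" syntax used elsewhere
in this thread:)

    # ceph.conf on the OSD hosts -- shrink the in-memory log buffer the
    # daemon keeps around for debugging (it is large by default, so it costs RAM)
    [osd]
        log max recent = 1

    # or push it into a running OSD (N = osd id) without a restart:
    ceph osd tell N injectargs '--log-max-recent 1'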
On 03/13/2013 05:05 PM, Dave Spano wrote:
I renamed the old one from images to images-old, and the new one from
images-new to images.
This reminds me of a problem you might hit with this:
RBD clones track the parent image pool by id, so they'll continue
working after the pool is renamed. If y
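(A rough sketch of the rename Dave describes, with "myimage" as a purely
hypothetical clone name:)

    # swap the copied pool in under the original name
    ceph osd pool rename images images-old
    ceph osd pool rename images-new images

    # a clone records its parent pool by numeric id, not by name, so renaming
    # the parent pool does not break the clone; "rbd info" shows the parent it
    # still resolves to
    rbd info images/myimage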
uot;Sage Weil" <s...@inktank.com
(mailto:s...@inktank.com)>, "Wido den Hollander" <w...@42on.com
(mailto:w...@42on.com)>, "Sylvain Munaut" <s.mun...@whatever-company.com
(mailto:s.mun...@whatever-company.com)>, "Samuel Just"
<sam.j...@inktank.c
> To: "Dave Spano" mailto:dsp...@optogenics.com)>
> Cc: "Greg Farnum" mailto:g...@inktank.com)>, "ceph-devel"
> mailto:ceph-devel@vger.kernel.org)>, "Sage Weil"
> mailto:s...@inktank.com)>, "Wido den Hollander"
> mailto
PermissionError: error creating image
Dave Spano
- Original Message -
From: "Sébastien Han"
To: "Dave Spano"
Cc: "Greg Farnum" , "ceph-devel"
, "Sage Weil" , "Wido den
Hollander" , "Sylvain
>
>
>
> - Original Message -
>
> From: "Greg Farnum"
> To: "Dave Spano"
> Cc: "ceph-devel" , "Sage Weil"
> , "Wido den Hollander" , "Sylvain Munaut"
> , "Samuel Just" ,
> "Vladis
"Vladislav Gorbunov" , "Sébastien Han"
Sent: Tuesday, March 12, 2013 5:37:37 PM
Subject: Re: OSD memory leaks?
Yeah. There's not anything intelligent about that cppool mechanism. :)
-Greg
On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote:
> I'd rathe
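(For context, the cppool mechanism Greg refers to is a plain object-by-object
copy; a rough sketch of the sequence, with the pool names and pg count taken
as examples from elsewhere in the thread:)

    # create the destination pool, then blindly copy every object into it
    ceph osd pool create images-new 450
    rados cppool images images-new

    # once the copy finishes, swap the names
    ceph osd pool rename images images-old
    ceph osd pool rename images-new images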
ailto:s...@inktank.com)>, "Wido den Hollander" (mailto:w...@42on.com)>, "Sylvain Munaut" (mailto:s.mun...@whatever-company.com)>, "Samuel Just" (mailto:sam.j...@inktank.com)>, "Vladislav Gorbunov" (mailto:vadi...@gmail.com)>
> Sent: Tuesd
Just" , "Vladislav Gorbunov"
Sent: Tuesday, March 12, 2013 4:20:13 PM
Subject: Re: OSD memory leaks?
On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote:
> Well, to avoid unnecessary data movement, there is also an
> _experimental_ feature to change the number of PGs in a pool on the fly.
han.sebast...@gmail.com said:
> Well, to avoid unnecessary data movement, there is also an _experimental_
> feature to change the number of PGs in a pool on the fly.
> ceph osd pool set pg_num --allow-experimental-feature
I've been following the instructions here:
http://ceph.com/docs/master/rado
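(The command quoted above is shown without its arguments; a sketch of what the
full invocation presumably looked like in that release, with "images" and 600
as placeholder pool name and target value:)

    # EXPERIMENTAL at the time: raise the placement-group count of a live pool
    ceph osd pool set images pg_num 600 --allow-experimental-feature
    # pgp_num normally has to be raised to match once the split completes
    ceph osd pool set images pgp_num 600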
> > To: "Sébastien Han" <han.sebast...@gmail.com>
> > Cc: "Sage Weil" <s...@inktank.com>, "Wido den Hollander" <w...@42on.com>, "Gregory Farnum" <g...@inktank.com>, "Sylvain Munaut" <s.mun...@w
> From: "Dave Spano"
> To: "Sébastien Han"
> Cc: "Sage Weil" , "Wido den Hollander" ,
> "Gregory Farnum" , "Sylvain Munaut"
> , "ceph-devel" ,
> "Samuel Just" , "Vladislav Gorbunov"
>
From: "Dave Spano"
To: "Sébastien Han"
Cc: "Sage Weil" , "Wido den Hollander" ,
"Gregory Farnum" , "Sylvain Munaut"
, "ceph-devel" ,
"Samuel Just" , "Vladislav Gorbunov"
Sent: Tuesday, March 12, 2013 1:41:21 PM
Subject: Re: OSD memory leaks?
>Sorry, I mean pg_num and pgp_num on all pools, as shown by "ceph osd
>dump | grep 'rep size'".
Well, it's still 450 each...
>The default pg_num value of 8 is NOT suitable for a big cluster.
Thanks, I know; I'm not new to Ceph. What's your point here? I
already said that pg_num was 450...
--
Regards,
Sorry, I mean pg_num and pgp_num on all pools, as shown by "ceph osd
dump | grep 'rep size'".
The default pg_num value of 8 is NOT suitable for a big cluster.
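(For anyone checking the same thing: the pool lines of the osd dump carry rep
size, pg_num and pgp_num together; a quick sketch, with the sample output line
below entirely made up for illustration:)

    ceph osd dump | grep 'rep size'
    # pool 3 'images' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 450 pgp_num 450 ...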
2013/3/13 Sébastien Han :
> Replica count has been set to 2.
>
> Why?
> --
> Regards,
> Sébastien Han.
>
>
> On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov wrote:
Replica count has been set to 2.
Why?
--
Regards,
Sébastien Han.
On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov wrote:
>> FYI I'm using 450 pgs for my pools.
> Please, can you show the number of object replicas?
>
> ceph osd dump | grep 'rep size'
>
> Vlad Gorbunov
>
> 2013/3/5 Sébastien
> FYI I'm using 450 pgs for my pools.
Please, can you show the number of object replicas?
ceph osd dump | grep 'rep size'
Vlad Gorbunov
2013/3/5 Sébastien Han :
> FYI I'm using 450 pgs for my pools.
>
> --
> Regards,
> Sébastien Han.
>
>
> On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil wrote:
>>
>>
>>> it's keeping 100k lines of logs in memory, which can eat a lot of
>>> ram (but is great when debugging issues).
>
> Dave Spano
>
>
>
>
> - Original Message -
> From: "Sébastien Han"
> To: "Sage Weil"
> Cc: "Wido den Hol
FYI I'm using 450 pgs for my pools.
--
Regards,
Sébastien Han.
On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil wrote:
>
> On Fri, 1 Mar 2013, Wido den Hollander wrote:
> > On 02/23/2013 01:44 AM, Sage Weil wrote:
> > > On Fri, 22 Feb 2013, Sébastien Han wrote:
> > > > Hi all,
> > > >
> > > > I finall
On Fri, 1 Mar 2013, Wido den Hollander wrote:
> On 02/23/2013 01:44 AM, Sage Weil wrote:
> > On Fri, 22 Feb 2013, Sébastien Han wrote:
> > > Hi all,
> > >
> > > I finally got a core dump.
> > >
> > > I did it with a kill -SEGV on the OSD process.
> > >
> > > https://www.dropbox.com/s/ahv6hm0ipna
That pattern would seem to support the log trimming theory of the leak.
-Sam
On Fri, Mar 1, 2013 at 7:51 AM, Wido den Hollander wrote:
> On 02/23/2013 01:44 AM, Sage Weil wrote:
>>
>> On Fri, 22 Feb 2013, Sébastien Han wrote:
>>>
>>> Hi all,
>>>
>>> I finally got a core dump.
>>>
>>> I did it wit
On 02/23/2013 01:44 AM, Sage Weil wrote:
On Fri, 22 Feb 2013, Sébastien Han wrote:
Hi all,
I finally got a core dump.
I did it with a kill -SEGV on the OSD process.
https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
Hope we will get something out of it :-).
AHA
Ok thanks guys. Hope we will find something :-).
--
Regards,
Sébastien Han.
On Mon, Feb 25, 2013 at 8:51 AM, Wido den Hollander wrote:
> On 02/25/2013 01:21 AM, Sage Weil wrote:
>>
>> On Mon, 25 Feb 2013, Sébastien Han wrote:
>>>
>>> Hi Sage,
>>>
>>> Sorry, it's a production system, so I can't te
On 02/25/2013 01:21 AM, Sage Weil wrote:
On Mon, 25 Feb 2013, Sébastien Han wrote:
Hi Sage,
Sorry, it's a production system, so I can't test it.
So in the end, you can't get anything out of the core dump?
I saw a bunch of dup object names, which is what led us to the pg log
theory. I can look
On Mon, 25 Feb 2013, Sébastien Han wrote:
> Hi Sage,
>
> Sorry, it's a production system, so I can't test it.
> So in the end, you can't get anything out of the core dump?
I saw a bunch of dup object names, which is what led us to the pg log
theory. I can look a bit more carefully to confirm, bu
Hi Sage,
Sorry, it's a production system, so I can't test it.
So in the end, you can't get anything out of the core dump?
--
Regards,
Sébastien Han.
On Sat, Feb 23, 2013 at 1:44 AM, Sage Weil wrote:
> On Fri, 22 Feb 2013, Sébastien Han wrote:
>> Hi all,
>>
>> I finally got a core dump.
>>
>> I
On Fri, 22 Feb 2013, Sébastien Han wrote:
> Hi all,
>
> I finally got a core dump.
>
> I did it with a kill -SEGV on the OSD process.
>
> https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
>
> Hope we will get something out of it :-).
AHA! We have a theory. The p
Hi all,
I finally got a core dump.
I did it with a kill -SEGV on the OSD process.
https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
Hope we will get something out of it :-).
--
Regards,
Sébastien Han.
On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum wrote:
> On F
On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han wrote:
>> Is osd.1 using the heap profiler as well? Keep in mind that active use
>> of the memory profiler will itself cause memory usage to increase —
>> this sounds a bit like that to me since it's staying stable at a large
>> but finite portion of
> Is osd.1 using the heap profiler as well? Keep in mind that active use
> of the memory profiler will itself cause memory usage to increase —
> this sounds a bit like that to me since it's staying stable at a large
> but finite portion of total memory.
Well, the memory consumption was already hig
On Wed, Jan 9, 2013 at 10:09 AM, Sylvain Munaut
wrote:
> Just fyi, I also have growing memory on OSD, and I have the same logs:
>
> "libceph: osd4 172.20.11.32:6801 socket closed" in the RBD clients
That message is not an error; it just happens if the RBD client
doesn't talk to that OSD for a while.
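(If you want to confirm you are seeing the same thing, the message shows up in
the kernel log on the RBD client hosts; a trivial check:)

    # on the RBD client machine
    dmesg | grep 'libceph.*socket closed'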
On Wed, Jan 9, 2013 at 8:10 AM, Dave Spano wrote:
> Yes, I'm using argonaut.
>
> I've got 38 heap files from yesterday. Currently, the OSD in question is
> using 91.2% of memory according to top, and staying there. I initially
> thought it would go until the OOM killer started killing processes,
Thank you. I appreciate it!
Dave Spano
Optogenics
Systems Administrator
- Original Message -
From: "Sébastien Han"
To: "Dave Spano"
Cc: "ceph-devel" , "Samuel Just"
Sent: Wednesday, January 9, 2013 5:12:12 PM
Subject: Re: OSD memory leaks?
t 10:42 PM, Dave Spano wrote:
> That's very good to know. I'll be restarting ceph-osd right now! Thanks for
> the heads up!
>
> Dave Spano
> Optogenics
> Systems Administrator
>
>
>
> - Original Message -
>
> From: "Sébastien Han"
> T
Sent: Wednesday, January 9, 2013 11:35:13 AM
Subject: Re: OSD memory leaks?
If you wait too long, the system will trigger the OOM killer :D. I already
experienced that, unfortunately...
Sam?
On Wed, Jan 9, 2013 at 5:10 PM, Dave Spano wrote:
> OOM killer
--
Regards,
Sébastien
Hi,
Thanks for the input.
I also have tons of "socket closed", I recall that this message is
harmless. Anyway Cephx is disable on my platform from the beginning...
Anyone to approve or disapprove my "scrub theory"?
--
Regards,
Sébastien Han.
On Wed, Jan 9, 2013 at 7:09 PM, Sylvain Munaut
wrote
Just fyi, I also have growing memory on OSD, and I have the same logs:
"libceph: osd4 172.20.11.32:6801 socket closed" in the RBD clients
I traced that problem and correlated it to some cephx issue in the OSD
some time ago in this thread
http://www.mail-archive.com/ceph-devel@vger.kernel.org/ms
If you wait too long, the system will trigger the OOM killer :D. I already
experienced that, unfortunately...
Sam?
On Wed, Jan 9, 2013 at 5:10 PM, Dave Spano wrote:
> OOM killer
--
Regards,
Sébastien Han.
- Original Message -
From: "Sébastien Han"
To: "Samuel Just"
Cc: "Dave Spano" , "ceph-devel"
Sent: Wednesday, January 9, 2013 10:20:43 AM
Subject: Re: OSD memory leaks?
I guess he runs Argonaut as well.
More suggestions about this problem?
Thanks!
--
Regards,
pano"
> > To: "Sébastien Han"
> > Cc: "ceph-devel" , "Samuel Just"
> >
> > Sent: Monday, January 7, 2013 12:40:06 PM
> > Subject: Re: OSD memory leaks?
> >
> >
> > Sam,
> >
> > Attached are some heaps tha
>
> - Original Message -
>
> From: "Dave Spano"
> To: "Sébastien Han"
> Cc: "ceph-devel" , "Samuel Just"
>
> Sent: Monday, January 7, 2013 12:40:06 PM
> Subject: Re: OSD memory leaks?
>
>
> Sam,
>
> Attached are some he
Hi Sam,
Thanks for your answer, and sorry for the late reply.
Unfortunately I can't get anything out of the profiler; actually I
do, but I guess it doesn't show what it's supposed to show... I will keep
on trying. Anyway, yesterday I started thinking that the problem might
be due to some overuse of
Sorry, it's been very busy. The next step would be to try to get a heap
dump. You can start a heap profile on osd N by:
ceph osd tell N heap start_profiler
and you can get it to dump the collected profile using
ceph osd tell N heap dump.
The dumps should show up in the osd log directory.
Assumi
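(Putting those steps together, with an analysis step added at the end; the
google-pprof invocation, binary path and dump file name are assumptions based
on how tcmalloc heap dumps are usually inspected, so adjust them to your setup:)

    # start collecting a heap profile on osd N
    ceph osd tell N heap start_profiler

    # let it run while memory grows, then write out what was collected
    ceph osd tell N heap dump

    # stop profiling once you have enough dumps
    ceph osd tell N heap stop_profiler

    # the dumps land in the osd log directory; inspect one with pprof
    google-pprof --text /usr/bin/ceph-osd /var/log/ceph/osd.N.profile.0001.heap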
No more suggestions? :(
--
Regards,
Sébastien Han.
On Tue, Dec 18, 2012 at 6:21 PM, Sébastien Han wrote:
> Nothing terrific...
>
> Kernel logs from my clients are full of "libceph: osd4
> 172.20.11.32:6801 socket closed"
>
> I saw this somewhere on the tracker.
>
> Does this do any harm?
>
> Thanks.
>
Nothing terrific...
Kernel logs from my clients are full of "libceph: osd4
172.20.11.32:6801 socket closed"
I saw this somewhere on the tracker.
Does this do any harm?
Thanks.
--
Regards,
Sébastien Han.
On Mon, Dec 17, 2012 at 11:55 PM, Samuel Just wrote:
>
> What is the workload like?
> -Sam
>
>
What is the workload like?
-Sam
On Mon, Dec 17, 2012 at 2:41 PM, Sébastien Han wrote:
> Hi,
>
> No, I don't see anything abnormal in the network stats. I don't see
> anything in the logs... :(
> The weird thing is that one node out of 4 seems to take way more memory
> than the others...
>
> --
> Reg
Hi,
No, I don't see anything abnormal in the network stats. I don't see
anything in the logs... :(
The weird thing is that one node out of 4 seems to take way more memory
than the others...
--
Regards,
Sébastien Han.
On Mon, Dec 17, 2012 at 11:31 PM, Sébastien Han wrote:
>
> Hi,
>
> No, I don't s
Are you having network hiccups? There was a bug noticed recently that
could cause a memory leak if nodes are being marked up and down.
-Sam
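(A cheap way to check for that kind of flapping is to watch the cluster log and
grep the OSD logs; the grep string below is the message OSDs typically emit when
they get marked down incorrectly, and the log path assumes a default install:)

    # watch cluster events live for OSDs being marked down/up
    ceph -w

    # look for OSDs complaining about being flapped, on each node
    grep -i 'wrongly marked me down' /var/log/ceph/*.log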
On Mon, Dec 17, 2012 at 12:28 AM, Sébastien Han wrote:
> Hi guys,
>
> Today, looking at my graphs, I noticed that one of my 4 ceph nodes uses a
> lot of memory
Hi guys,
Today, looking at my graphs, I noticed that one of my 4 ceph nodes uses a
lot of memory. It keeps growing and growing.
See the graph attached to this mail.
I run 0.48.2 on Ubuntu 12.04.
The other nodes also grow, but more slowly than the first one.
I'm not quite sure about the information that
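(For anyone trying to reproduce the observation, per-OSD resident memory is easy
to track from the shell on each node; a trivial sketch:)

    # resident set size (RSS, in KB) of every ceph-osd process on this node
    ps -C ceph-osd -o pid,rss,vsz,cmd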