Yes, our application also uses shared memory, and that is the /dev/shm usage we see when the cluster is down.
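
To monitor how much of /dev/shm belongs to the cluster and how much to our own application, something like the following might do. It is only a rough sketch: it relies on Ken's note that anything the cluster creates starts with "qb-", the function name summarize_shm and its parameters are my own choices, and it adds up apparent file sizes (st_size), which may not match exactly what df reports.

```python
import os


def summarize_shm(path="/dev/shm", prefix="qb-"):
    """Return total bytes of regular files directly under path, split by name prefix."""
    totals = {"cluster": 0, "other": 0}
    with os.scandir(path) as entries:
        for entry in entries:
            # Only top-level regular files are counted; subdirectories are skipped.
            if not entry.is_file(follow_symlinks=False):
                continue
            try:
                size = entry.stat(follow_symlinks=False).st_size
            except FileNotFoundError:
                # The file vanished between listing and stat; ignore it.
                continue
            # Assumption (from the thread): cluster/libqb files start with "qb-".
            key = "cluster" if entry.name.startswith(prefix) else "other"
            totals[key] += size
    return totals


if __name__ == "__main__":
    for key, size in summarize_shm().items():
        print(f"{key}: {size / (1024 * 1024):.1f} MiB")
```

Running it on each node with the cluster up and again with it stopped should show the split. If "cluster" is non-zero while everything using libqb is stopped, those would be the leftover qb-* files Ken mentions below.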

On Tue, May 17, 2016 at 10:53 PM, Ken Gaillot <[email protected]> wrote:

> On 05/17/2016 12:02 PM, Nikhil Utane wrote:
> > OK. Will do that.
> >
> > Actually I gave the /dev/shm usage when the cluster wasn't up.
> > When it is up, I see it occupies close to 300 MB (it's also the DC).
>
> Hmmm, there should be no usage if the cluster is stopped. Any memory
> used by the cluster will start with "qb-", so anything else is from
> something else.
>
> If all executables using libqb (including corosync and pacemaker) are
> stopped, it's safe to remove any /dev/shm/qb-* files that remain. That
> should be rare, probably only after a core dump or such.
>
> > tmpfs 500.0M 329.4M 170.6M 66% /dev/shm
> >
> > On another node the same is 115 MB.
> >
> > Anyways, I'll monitor the usage to know what size is needed.
> >
> > Thank you Ken and Ulrich.
> >
> > On Tue, May 17, 2016 at 8:23 PM, Ken Gaillot <[email protected]> wrote:
> >
> > On 05/17/2016 04:07 AM, Nikhil Utane wrote:
> > > What I would like to understand is how much total shared memory
> > > (approximately) would Pacemaker need so that accordingly I can define
> > > the partition size. Currently it is 300 MB in our system. I recently ran
> > > into insufficient shared memory issue because of improper clean-up. So
> > > would like to understand how much Pacemaker would need for a 6-node
> > > cluster so that accordingly I can increase it.
> >
> > I have no idea :-)
> >
> > I don't think there's any way to pre-calculate it. The libqb library is
> > the part of the software stack that actually manages the shared memory,
> > but it's used by everything -- corosync (including its cpg and
> > votequorum components) and each pacemaker daemon.
> >
> > The size depends directly on the amount of communication activity in the
> > cluster, which is only indirectly related to the number of
> > nodes/resources/etc., the size of the CIB, etc. A cluster with nodes
> > joining/leaving frequently and resources moving around a lot will use
> > more shared memory than a cluster of the same size that's quiet. Cluster
> > options such as cluster-recheck-interval would also matter.
> >
> > Practically, I think all you can do is simulate expected cluster
> > configurations and loads, and see what it comes out to be.
> >
> > > # df -kh
> > > tmpfs 300.0M 27.5M 272.5M 9% /dev/shm
> > >
> > > Thanks
> > > Nikhil
> > >
> > > On Tue, May 17, 2016 at 12:09 PM, Ulrich Windl <[email protected]> wrote:
> > >
> > > Hi!
> > >
> > > One of the main problems I identified with POSIX shared memory
> > > (/dev/shm) in Linux is that changes to the shared memory don't
> > > affect the i-node, so you cannot tell from a "ls -rtl" which
> > > segments are still active and which are not. You can only see the
> > > creation time.
> > >
> > > Maybe there should be a tool that identifies and cleans up obsolete
> > > shared memory.
> > > I don't understand the part talking about the size of /dev/shm: It's
> > > shared memory. See "kernel.shmmax" and "kernel.shmall" in your sysctl
> > > settings (/etc/sysctl.conf).
> > >
> > > Regards,
> > > Ulrich
> > >
> > > >>> Nikhil Utane <[email protected]> wrote on 16.05.2016 at 14:31
> > > in message
> > > <CAGNWmJVSye5PJgkdbFAi5AzO+Qq-j=2fs1c+0rgnqs994vv...@mail.gmail.com>:
> > > > Thanks Ken.
> > > >
> > > > Could you also respond on the second question?
> > > >
> > > >> Also, in /dev/shm I see that it created around 300+ files of around
> > > >> 250 MB.
> > > >>
> > > >> For e.g.
> > > >> -rw-rw---- 1 hacluste hacluste 8232 May 6 13:03 qb-cib_rw-response-25035-25038-10-header
> > > >> -rw-rw---- 1 hacluste hacluste 540672 May 6 13:03 qb-cib_rw-response-25035-25038-10-data
> > > >> -rw------- 1 hacluste hacluste 8232 May 6 13:03 qb-cib_rw-response-25035-25036-12-header
> > > >> -rw------- 1 hacluste hacluste 540672 May 6 13:03 qb-cib_rw-response-25035-25036-12-data
> > > >> And many more..
> > > >>
> > > >> We have limited space in /dev/shm and all these files are filling it
> > > >> up. Are these all needed? Any way to limit? Do we need to do any
> > > >> clean-up if pacemaker termination was not graceful?
> > > >
> > > > What's the recommended size for this folder for Pacemaker? Our cluster
> > > > will have maximum 6 nodes.
> > > >
> > > > -Regards
> > > > Nikhil
> > > >
> > > > On Sat, May 14, 2016 at 3:11 AM, Ken Gaillot <[email protected]> wrote:
> > > >
> > > >> On 05/08/2016 11:19 PM, Nikhil Utane wrote:
> > > >> > Moving these questions to a different thread.
> > > >> >
> > > >> > Hi,
> > > >> >
> > > >> > We have limited storage capacity in our system for different folders.
> > > >> > How can I configure to use a different folder for /var/lib/pacemaker?
> > > >>
> > > >> ./configure --localstatedir=/wherever (defaults to /var or ${prefix}/var)
> > > >>
> > > >> That will change everything that normally is placed or looked for under
> > > >> /var (/var/lib/pacemaker, /var/lib/heartbeat, /var/run, etc.).
> > > >>
> > > >> Note that while ./configure lets you change the location of nearly
> > > >> everything, /usr/lib/ocf/resource.d is an exception, because it is
> > > >> specified in the OCF standard.
> > > >>
> > > >> > Also, in /dev/shm I see that it created around 300+ files of around
> > > >> > 250 MB.
> > > >> >
> > > >> > For e.g.
> > > >> > -rw-rw---- 1 hacluste hacluste 8232 May 6 13:03 qb-cib_rw-response-25035-25038-10-header
> > > >> > -rw-rw---- 1 hacluste hacluste 540672 May 6 13:03 qb-cib_rw-response-25035-25038-10-data
> > > >> > -rw------- 1 hacluste hacluste 8232 May 6 13:03 qb-cib_rw-response-25035-25036-12-header
> > > >> > -rw------- 1 hacluste hacluste 540672 May 6 13:03 qb-cib_rw-response-25035-25036-12-data
> > > >> > And many more..
> > > >> >
> > > >> > We have limited space in /dev/shm and all these files are filling it
> > > >> > up. Are these all needed? Any way to limit? Do we need to do any
> > > >> > clean-up if pacemaker termination was not graceful?
> > > >> >
> > > >> > -Thanks
> > > >> > Nikhil
