Hi,
Looking at lib/common/ipc.c, Pacemaker recommends setting
PCMK_ipc_buffer to 4 times the *uncompressed* size of the biggest
message seen:
error: Could not compress the message (2309508 bytes) into less than the
configured ipc limit (131072 bytes). Set PCMK_ipc_buffer to a higher value
(9238
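In this case the fix is simple arithmetic: four times the 2309508-byte message, well above the 131072-byte default. A sketch (the resulting value would go into Pacemaker's environment file, e.g. /etc/sysconfig/pacemaker or /etc/default/pacemaker, depending on the distribution):

```shell
# sketch: size PCMK_ipc_buffer from the largest uncompressed message
# seen (2309508 bytes, taken from the error above), times four
msg=2309508
buf=$((msg * 4))
echo "PCMK_ipc_buffer=$buf"
```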
"Lentes, Bernd" writes:
> 2018-12-03T16:03:02.836145+01:00 ha-idg-2 libvirtd[3117]: 2018-12-03
> 15:03:02.835+: 4515: error : qemuMigrationCheckJobStatus:1456 : operation
> failed: migration job: unexpectedly failed
The above message is a hint at the real problem. It comes from
libvirtd,
Patrick Whitney writes:
> I have a two node (test) cluster running corosync/pacemaker with DLM
> and CLVM.
>
> I was running into an issue where when one node failed, the remaining node
> would appear to do the right thing, from the pcmk perspective, that is.
> It would create a new cluster (of
Christine Caulfield writes:
> I'm also looking into high-res timestamps for logfiles too.
Wouldn't that be a useful option for the syslog output as well? I'm
sometimes concerned by the batching effect added by the transport
between the application and the (local) log server (rsyslog or systemd)
Ken Gaillot writes:
> libqb would simply provide the API for reopening the log, and clients
> such as pacemaker would intercept the signal and call the API.
Just for posterity: you needn't restrict yourself to signals. Logrotate
has nothing to do with signals. Signals are a rather limited form
Ken Gaillot writes:
> On Thu, 2018-09-27 at 09:36 +0200, Ulrich Windl wrote:
>
>> Obviously you violated the most important cluster rule that is "be
>> patient". Maybe the next important is "Don't change the
>> configuration while the cluster is not in IDLE state" ;-)
>
> Agreed -- although eve
Christine Caulfield writes:
> TBH I would be quite happy to leave this to logrotate but the message I
> was getting here is that we need additional help from libqb. I'm willing
> to go with a consensus on this though
Yes, to do a proper job logrotate has to have a way to get the log files
reopen
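Until such an API exists, a logrotate stanza has to rely either on copytruncate or on a daemon-specific reopen mechanism. A sketch of the latter, with path, pidfile and signal all illustrative:

```
/var/log/pacemaker.log {
    weekly
    rotate 4
    compress
    postrotate
        # hypothetical: ask the daemon to reopen its log file
        /bin/kill -HUP "$(cat /var/run/daemon.pid)" 2>/dev/null || true
    endscript
}
```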
Christine Caulfield writes:
> I'm looking into new features for libqb and the option in
> https://github.com/ClusterLabs/libqb/issues/142#issuecomment-76206425
> looks like a good option to me.
It feels backwards to me: traditionally, increasing numbers signify
older rotated logs, while this pro
Hi,
The current behavior of cancelled migration with Pacemaker 1.1.16 with a
resource implementing push migration:
# /usr/sbin/crm_resource --ban -r vm-conv-4
vhbl03 crmd[10017]: notice: State transition S_IDLE -> S_POLICY_ENGINE
vhbl03 pengine[10016]: notice: Migrate vm-conv-4#011(Started v
Jan Friesse writes:
> wagner.fer...@kifu.gov.hu writes:
>
>> triggered by your favourite IPC mechanism (SIGHUP and SIGUSRx are common
>> choices, but logging.* cmap keys probably fit Corosync better). That
>> would enable proper log rotation.
>
> What is the reason that you find "copytruncate" a
Jan Friesse writes:
> Default example config should be definitively ported to newer style of
> nodelist without interface section. example.udpu can probably be
> deleted as well as example.xml (whole idea of having XML was because
> of cluster config tools like pcs, but these tools never used
> c
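For reference, the newer nodelist style without an interface section looks roughly like this (names, addresses and ids are illustrative):

```
nodelist {
    node {
        ring0_addr: 192.168.1.1
        name: node1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.1.2
        name: node2
        nodeid: 2
    }
}
```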
Jan Friesse writes:
> Have you had a time to play with packaging current alpha to find out
> if there are no issues? I had no problems with Fedora, but Debian has
> a lot of patches, and I would be really grateful if we could reduce
> them a lot - so please let me know if there is patch which you
Jan Friesse writes:
> Currently I'm pretty happy with current Corosync alpha stability so it
> would be possible to release final right now, but because I want to
> give us some room to break protocol/abi (only if needed and right now
> I don't see any strong reason for such breakage), I didn't r
Jan Friesse writes:
> try corosync 3.x (current Alpha4 is pretty stable [...]
Hi Honza,
Can you provide an estimate for the Corosync 3 release timeline? We
have to plan the ABI transition in Debian and the freeze date is drawing
closer.
--
Thanks,
Feri
wf...@niif.hu (Ferenc Wágner) writes:
> David Tolosa writes:
>
>> I tried to install corosync 3.x and it works pretty well.
>> But when I install pacemaker, it installs previous version of corosync as
>> dependency and breaks all the setup.
>> Any suggestions?
>
David Tolosa writes:
> I tried to install corosync 3.x and it works pretty well.
> But when I install pacemaker, it installs previous version of corosync as
> dependency and breaks all the setup.
> Any suggestions?
Install the equivs package to create a dummy corosync package
representing your l
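A minimal equivs control file for that dummy package might look like this (package version and description are illustrative); feed it to equivs-build and install the resulting .deb:

```
# corosync-dummy.ctl, input for equivs-build
Section: misc
Priority: optional
Standards-Version: 3.9.2
Package: corosync
Version: 3.0.0-0~dummy1
Description: dummy package satisfying the corosync dependency
 Placeholder for a locally built Corosync 3.
```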
Jan Friesse writes:
> Is that system VM or physical machine? Because " Corosync main process
> was not scheduled for..." is usually happening on VMs where hosts are
> highly overloaded.
Or when physical hosts use BMC watchdogs. But Prasad didn't encounter
such logs in the setup at hand, as far
FeldHost™ Admin writes:
> rule of thumb is use separate dedicated network for corosync traffic.
> For ex. we use two corosync rings, first and active one on separate
> network card and switch, second passive one on team (bond) device vlan.
Hi,
That's fine in principle, but this is a bladecenter
David Teigland writes:
> On Thu, Aug 09, 2018 at 06:11:48PM +0200, Ferenc Wágner wrote:
>
>> Almost ten years ago you requested more info in a similar case, let's
>> see if we can get further now!
>
> Hi, the usual cause is that a network message from the dlm h
wf...@niif.hu (Ferenc Wágner) writes:
> For a start I attached the dump output from another node.
I meant to...
146 dlm_controld 4.0.5 started
146 our_nodeid 167773708
146 found /dev/misc/dlm-control minor 58
146 found /dev/misc/dlm-monitor minor 57
146 found /dev/misc/dlm_plock minor 56
Hi David,
Almost ten years ago you requested more info in a similar case, let's
see if we can get further now!
We're running a 6-node Corosync cluster. DLM is started by systemd:
● dlm.service - dlm control daemon
Loaded: loaded (/lib/systemd/system/dlm.service; enabled)
Active: active (r
Jan Pokorný writes:
> 1. [X] Do you edit CIB by hand (as opposed to relying on crm/pcs or
> their UI counterparts)?
For debugging one has to understand the CIB anyway, so why learn
additional syntaxes? :) Most of our configuration changes are scripted
via a home-grown domain-specific CL
Jan Friesse writes:
> Ferenc Wágner napsal(a):
>
>> I wonder if c139255 (totemsrp: Implement sanity checks of received
>> msgs) has direct security relevance as well.
>
> Not entirely direct, but quite similar.
>
>> Should I include that too in the Debian secur
Jan Pokorný writes:
> On 12/04/18 14:33 +0200, Jan Friesse wrote:
>
>> This release contains a lot of fixes, including fix for
>> CVE-2018-1084.
>
> Security related updates would preferably provide more context
Absolutely, thanks for providing that! Looking at the git log, I wonder
if c139255
Ken Gaillot writes:
> A couple of regressions have been found in the recent Pacemaker 1.1.18
> release.
>
> Fixes for these, plus one finishing an incomplete fix in 1.1.18, are in
> the master branch, and have been backported to the 1.1 branch for ease
> of patching. It is recommended that anyone
Andrei Borzenkov writes:
> 25.11.2017 10:05, Andrei Borzenkov пишет:
>
>> In one of guides suggested procedure to simulate split brain was to kill
>> corosync process. It actually worked on one cluster, but on another
>> corosync process was restarted after being killed without cluster
>> noticin
Ken Gaillot writes:
> This will also be of interest to distribution packagers ...
Hi Ken,
Do you mean that this warrants a prominent package changelog entry?
Or what else could packagers do about this?
--
Thanks,
Feri
___
Users mailing list: Users@c
Ken Gaillot writes:
> When an operation completes, a history entry () is added to
> the pe-input file. If the agent supports reload, the entry will include
> op-force-restart and op-restart-digest fields. Now I see those are
> present in the vm-alder_last_0 entry, so agent support isn't the issue
Ken Gaillot writes:
> The pe-input is indeed entirely sufficient.
>
> I forgot to check why the reload was not possible in this case. It
> turns out it is this:
>
> trace: check_action_definition: Resource vm-alder doesn't know
> how to reload
>
> Does the resource agent implement the "re
Dennis Jacobfeuerborn writes:
> if I create a new unit file for the new file the services would not
> depend on it so it wouldn't get automatically mounted when they start.
Put the new unit file under /etc/systemd/system/x.service.requires to
have x.service require it. I don't get the full pict
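A sketch of the mechanism with hypothetical unit names, done in a scratch root so it is safe to run anywhere (on a real system you would operate on /etc directly and follow up with systemctl daemon-reload):

```shell
# make x.service pull in a mount unit by dropping a symlink into its
# .requires directory (srv-data.mount is a hypothetical unit name)
root=$(mktemp -d)
reqdir=$root/etc/systemd/system/x.service.requires
mkdir -p "$reqdir"
ln -s /etc/systemd/system/srv-data.mount "$reqdir/srv-data.mount"
ls "$reqdir"
```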
Ken Gaillot writes:
> On Fri, 2017-10-20 at 15:52 +0200, Ferenc Wágner wrote:
>
>> Ken Gaillot writes:
>>
>>> On Fri, 2017-09-22 at 18:30 +0200, Ferenc Wágner wrote:
>>>
>>>> Ken Gaillot writes:
>>>>
>>>>> Hm
Norberto Lopes writes:
> On Fri, 27 Oct 2017 at 06:41 Ferenc Wágner wrote:
>
>> Norberto Lopes writes:
>>
>>> colocation backup-vip-not-with-master -inf: backupVIP postgresMS:Master
>>> colocation backup-vip-not-with-master inf: backupVIP postgresMS:Slave
Norberto Lopes writes:
> colocation backup-vip-not-with-master -inf: backupVIP postgresMS:Master
> colocation backup-vip-not-with-master inf: backupVIP postgresMS:Slave
>
> Basically what's occurring in my cluster is that the first rule stops the
> Sync node from being promoted if the Master ever
Ken Gaillot writes:
> On Fri, 2017-09-22 at 18:30 +0200, Ferenc Wágner wrote:
>> Ken Gaillot writes:
>>
>>> Hmm, stop+reload is definitely a bug. Can you attach (or email it to
>>> me privately, or file a bz with it attached) the above pe-input file
Václav Mach writes:
> On 10/11/2017 09:00 AM, Ferenc Wágner wrote:
>
>> Václav Mach writes:
>>
>>> allow-hotplug eth0
>>> iface eth0 inet dhcp
>>
>> Try replacing allow-hotplug with auto. Ifupdown simply runs ifup -a
>> before network-
Donat Zenichev writes:
> then resource is stopped, but nothing occurred on e-mail destination.
> Where I did wrong actions?
Please note that ClusterMon notifications are becoming deprecated (they
should still work, but I've got no experience with them). Try using
alerts instead, as documented a
Václav Mach writes:
> allow-hotplug eth0
> iface eth0 inet dhcp
Try replacing allow-hotplug with auto. Ifupdown simply runs ifup -a
before network-online.target, which excludes allow-hotplug interfaces.
That means allow-hotplug interfaces are not waited for before corosync
is started during boo
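The suggested /etc/network/interfaces stanza would thus be:

```
# "auto" makes ifup -a (and thus network-online.target) wait for
# the interface at boot, unlike "allow-hotplug"
auto eth0
iface eth0 inet dhcp
```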
Dennis Jacobfeuerborn writes:
> I see the following messages repeated every 15 minutes in
> /var/log/messages:
>
> Sep 25 20:49:52 nfs2-storage2 pengine[2640]: warning: Processing failed op
> promote for drbd:0 on nfs2-storage2: unknown error (1)
>
> The status still shows an error but this seem
Ken Gaillot writes:
> Hmm, stop+reload is definitely a bug. Can you attach (or email it to me
> privately, or file a bz with it attached) the above pe-input file with
> any sensitive info removed?
I sent you the pe-input file privately. It indeed shows the issue:
$ /usr/sbin/crm_simulate -x pe
Hi,
I'm running a custom resource agent under Pacemaker 1.1.16, which has
several reloadable parameters:
$ /usr/sbin/crm_resource --show-metadata=ocf:niif:TransientDomain | fgrep
unique=
I used to routinely change the unique="0" parameters without having the
corresponding resources re
Ken Gaillot writes:
> * undocumented LRMD_MAX_CHILDREN environment variable
> (PCMK_node_action_limit is the current syntax)
By the way, is the current syntax documented somewhere? Looking at
crmd/throttle.c, throttle_update_job_max() is only ever invoked with a
NULL argument, so "Global prefer
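For reference, the current-syntax variable goes into Pacemaker's environment file (location varies by distribution; the value is illustrative):

```
# /etc/sysconfig/pacemaker (RHEL) or /etc/default/pacemaker (Debian)
PCMK_node_action_limit=4
```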
Jan Friesse writes:
> Back to problem you have. It's definitively HW issue but I'm thinking
> how to solve it in software. Right now, I can see two ways:
> 1. Set dog FD to be non blocking right at the end of setup_watchdog -
>This is proffered but I'm not sure if it's really going to work.
Klaus Wenninger writes:
> Just for my understanding: You are using watchdog-handling in corosync?
Yes, I was.
--
Feri
Valentin Vidic writes:
> On Sun, Sep 10, 2017 at 08:27:47AM +0200, Ferenc Wágner wrote:
>
>> Confirmed: setting watchdog_device: off cluster wide got rid of the
>> above warnings.
>
> Interesting, what brand or version of IPMI has this problem?
It's a Fujitsu PR
wf...@niif.hu (Ferenc Wágner) writes:
> Jan Friesse writes:
>
>> wf...@niif.hu writes:
>>
>>> In a 6-node cluster (vhbl03-08) the following happens 1-5 times a day
>>> (in August; in May, it happened 0-2 times a day only, it's slowly
>>> rampin
Jan Friesse writes:
> wf...@niif.hu writes:
>
>> In a 6-node cluster (vhbl03-08) the following happens 1-5 times a day
>> (in August; in May, it happened 0-2 times a day only, it's slowly
>> ramping up):
>>
>> vhbl08 corosync[3687]: [TOTEM ] A processor failed, forming new
>> configuration.
>>
Digimer writes:
> On 2017-08-29 10:45 AM, Ferenc Wágner wrote:
>
>> Digimer writes:
>>
>>> On 2017-08-28 12:07 PM, Ferenc Wágner wrote:
>>>
>>>> [...]
>>>> While dlm_tool status reports (similar on all nodes):
>>>>
Jan Friesse writes:
> wf...@niif.hu writes:
>
>> Jan Friesse writes:
>>
>>> wf...@niif.hu writes:
>>>
In a 6-node cluster (vhbl03-08) the following happens 1-5 times a day
(in August; in May, it happened 0-2 times a day only, it's slowly
ramping up):
vhbl08 corosync[3687
Jan Friesse writes:
> wf...@niif.hu writes:
>
>> In a 6-node cluster (vhbl03-08) the following happens 1-5 times a day
>> (in August; in May, it happened 0-2 times a day only, it's slowly
>> ramping up):
>>
>> vhbl08 corosync[3687]: [TOTEM ] A processor failed, forming new
>> configuration.
>>
Klaus Wenninger writes:
> Just seen that you are hosting VMs which might make you use KSM ...
> Don't fully remember at the moment but I have some memory of
> issues with KSM and page-locking.
> iirc it was some bug in the kernel memory-management that should
> be fixed a long time ago but ...
H
"Ulrich Windl" writes:
>>>> Ferenc Wágner schrieb am 28.08.2017 um 18:07 in Nachricht
> <87mv6jk75r@lant.ki.iif.hu>:
>
> cLVM under I/O load can be really slow (I'm talking about delays in the range
> of a few seconds).
Yes, I know, and it'
Jan Friesse writes:
> wf...@niif.hu writes:
>
>> Jan Friesse writes:
>>
>>> wf...@niif.hu writes:
>>>
In a 6-node cluster (vhbl03-08) the following happens 1-5 times a day
(in August; in May, it happened 0-2 times a day only, it's slowly
ramping up):
vhbl08 corosync[3687
Digimer writes:
> On 2017-08-28 12:07 PM, Ferenc Wágner wrote:
>
>> [...]
>> While dlm_tool status reports (similar on all nodes):
>>
>> cluster nodeid 167773705 quorate 1 ring seq 3088 3088
>> daemon now 2941405 fence_pid 0
>> node 167773705 M a
Jan Friesse writes:
> wf...@niif.hu writes:
>
>> In a 6-node cluster (vhbl03-08) the following happens 1-5 times a day
>> (in August; in May, it happened 0-2 times a day only, it's slowly
>> ramping up):
>>
>> vhbl08 corosync[3687]: [TOTEM ] A processor failed, forming new
>> configuration.
>>
Hi,
In a 6-node cluster (vhbl03-08) the following happens 1-5 times a day
(in August; in May, it happened 0-2 times a day only, it's slowly
ramping up):
vhbl08 corosync[3687]: [TOTEM ] A processor failed, forming new configuration.
vhbl03 corosync[3890]: [TOTEM ] A processor failed, forming n
Ken Gaillot writes:
> The most significant change in this release is a new cluster option to
> improve scalability.
>
> As users start to create clusters with hundreds of resources and many
> nodes, one bottleneck is a complete reprobe of all resources (for
> example, after a cleanup of all resou
Digimer writes:
> On 19/06/17 11:40 PM, Andrei Borzenkov wrote:
>
>> 20.06.2017 02:15, Digimer пишет:
>>
>>> On 19/06/17 06:59 PM, Ferenc Wágner wrote:
>>>
>>>> Digimer writes:
>>>>
>>>>> So we have a tool that watches
Digimer writes:
> So we have a tool that watches for changes to clvmd by running
> pvscan/vgscan/lvscan, but this seems to be expensive and occassionally
> cause trouble.
What kind of trouble did you experience?
> Is there any other way to be notified or to check when something
> changes?
LV (
James Booth writes:
> Sorry for the repeat mails, but I had issues subscribing list time
> (Looks like it has worked successfully now!).
>
> Anywho, I'm really desperate for some help on my issue in
> http://lists.clusterlabs.org/pipermail/users/2017-April/005495.html -
> I can recap the info in
Ken Gaillot writes:
> On 04/13/2017 11:11 AM, Ferenc Wágner wrote:
>
>> I encountered several (old) statements on various forums along the lines
>> of: "the CIB is not a transactional database and shouldn't be used as
>> one" or "resource paramet
Hi,
I encountered several (old) statements on various forums along the lines
of: "the CIB is not a transactional database and shouldn't be used as
one" or "resource parameters should only uniquely identify a resource,
not configure it" and "the CIB was not designed to be a configuration
database b
kgronl...@suse.com (Kristoffer Grönlund) writes:
> I discovered today that a location constraint with score=INFINITY
> doesn't actually restrict resources to running only on particular
> nodes.
Yeah, I made the same "discovery" some time ago. Since then I've been
using something like the followi
Jeffrey Westgate writes:
> We use Nagios to monitor, and once every 20 to 40 hours - sometimes
> longer, and we cannot set a clock by it - while the machine is 95%
> idle (or more according to 'top'), the host load shoots up to 50 or
> 60%. It takes about 20 minutes to peak, and another 30 to 45
Oscar Segarra writes:
> In my environment I have 5 guestes that have to be started up in a
> specified order starting for the MySQL database server.
We use a somewhat redesigned resource agent, which connects to the guest
using a virtio channel and waits for a signal before exiting from the
star
Jehan-Guillaume de Rorthais writes:
> PAF use private attribute to give informations between actions. We
> detect the failure during the notify as well, but raise the error
> during the promotion itself. See how I dealt with this in PAF:
>
> https://github.com/ioguix/PAF/commit/6123025ff7cd9929b5
Ken Gaillot writes:
> On 02/07/2017 01:11 AM, Ulrich Windl wrote:
>
>> Ken Gaillot writes:
>>
>>> On 02/06/2017 03:28 AM, Ulrich Windl wrote:
>>>
Isn't the question: Is crmd a process that is expected to die (and
thus need restarting)? Or wouldn't one prefer to debug this
situatio
Ken Gaillot writes:
> On 02/03/2017 07:00 AM, RaSca wrote:
>>
>> On 03/02/2017 11:06, Ferenc Wágner wrote:
>>> Ken Gaillot writes:
>>>
>>>> On 01/10/2017 04:24 AM, Stefan Schloesser wrote:
>>>>
>>>>> I am currently
Hi,
There was an interesting discussion on this list about "Doing reload
right" last July (which I still haven't digested entirely). Now I've
got a related question about the current and intended behavior: what
happens if a reload operation fails? I found some suggestions in
http://ocf.community
Ken Gaillot writes:
> On 01/10/2017 04:24 AM, Stefan Schloesser wrote:
>
>> I am currently testing a 2 node cluster under Ubuntu 16.04. The setup
>> seems to be working ok including the STONITH.
>> For test purposes I issued a "pkill -f pace" killing all pacemaker
>> processes on one node.
>>
>
Marco Marino writes:
> Ferenc, regarding the flag use_lvmetad in
> /usr/lib/ocf/resource.d/heartbeat/LVM I read:
>
>> lvmetad is a daemon that caches lvm metadata to improve the
>> performance of LVM commands. This daemon should never be used when
>> volume groups exist that are being managed by
Marco Marino writes:
> I agree with you for
> use_lvmetad = 0 (setting it = 1 in a clustered environment is an error)
Where does this information come from? AFAIK, if locking_type=3 (LVM
uses internal clustered locking, that is, clvmd), lvmetad is not used
anyway, even if it's running. So it's
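The relevant lvm.conf settings for a clvmd cluster would be, as a sketch:

```
# /etc/lvm/lvm.conf excerpt for a clvmd-managed cluster
global {
    locking_type = 3    # clustered locking via clvmd
    use_lvmetad = 0     # ignored with locking_type=3 anyway
}
```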
Ken Gaillot writes:
> * When you move the VM, the cluster detects that it is not running on
> the node you told it to keep it running on. Because there is no
> "Stopped" monitor, the cluster doesn't immediately realize that a new
> rogue instance is running on another node. So, the cluster thinks
Jan Friesse writes:
> Ferenc Wágner napsal(a):
>
>> Have you got any plans/timeline for 2.4.2 yet?
>
> Yep, I'm going to release it in few minutes/hours.
Man, that was quick. I've got a bunch of typo fixes queued..:) Please
consider announcing upcoming releases a c
Jan Friesse writes:
>> Jan Friesse writes:
>>
>>> Please note that because of required changes in votequorum,
>>> libvotequorum is no longer binary compatible. This is reason for
>>> version bump.
>>
>> Er, what version bump? Corosync 2.4.1 still produces
>> libvotequorum.so.7.0.0 for me, just
Ken Gaillot writes:
> This spurred me to complete a long-planned overhaul of Pacemaker
> Explained's "Upgrading" appendix:
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/_upgrading.html
>
> Feedback is welcome.
Since you asked for it..:)
1. Table D.1.: why does
Jan Friesse writes:
> Please note that because of required changes in votequorum,
> libvotequorum is no longer binary compatible. This is reason for
> version bump.
Er, what version bump? Corosync 2.4.1 still produces
libvotequorum.so.7.0.0 for me, just like Corosync 2.3.6.
--
Thanks,
Feri
"Carlos Xavier" writes:
> 1467918891 Is dlm missing from kernel? No misc devices found.
> 1467918891 /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
> 1467918891 /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
> 1467918891 No /sys/kernel/config, is configfs loaded?
> 1467918891 s
Ken Gaillot writes:
> Does anyone know of an RA that uses reload correctly?
My resource agents advertise a no-op reload action for handling their
"private" meta attributes. Meta in the sense that they are used by the
resource agent when performing certain operations, not by the managed
resource
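A minimal sketch of such a no-op reload in an OCF-style action dispatch (the agent name is hypothetical; a real agent would source ocf-shellfuncs instead of defining the return codes inline):

```shell
# OCF return codes, normally provided by ocf-shellfuncs
OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3

transient_domain_action() {
    case "$1" in
        reload)
            # nothing to do: meta attributes are read afresh by the
            # next start/stop/monitor operation anyway
            return $OCF_SUCCESS ;;
        start|stop|monitor)
            return $OCF_SUCCESS ;;  # elided in this sketch
        *)
            return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

transient_domain_action reload
echo "reload rc=$?"
```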
"Lentes, Bernd" writes:
> i don't have neither an init-script nor a systemd service file.
> The only packages i find in the repositories concerning dlm are:
> libdlm3-3.00.01-0.31.87
> libdlm-3.00.01-0.31.87
> And i have a kernel module for dlm.
> Nothing else.
Sorry, my experience is limited to
"Lentes, Bernd" writes:
> wf...@niif.hu writes:
>
>> "Lentes, Bernd" writes:
>>
>>> is it possible to have a DLM running without CRM?
>>
>> Yes. You'll need to configure fencing, though, since by default DLM
>> will try to use stonithd (from Pacemaker). But DLM fencing didn't
>> handle fencing f
"Lentes, Bernd" writes:
> is it possible to have a DLM running without CRM?
Yes. You'll need to configure fencing, though, since by default DLM
will try to use stonithd (from Pacemaker). But DLM fencing didn't
handle fencing failures correctly for me, resulting in more nodes being
fenced until
Hi,
Could somebody please elaborate a little why the pacemaker systemd
service file contains "Restart=on-failure"? I mean that a failed node
gets fenced anyway, so most of the time this would be a futile effort.
On the other hand, one could argue that restarting failed services
should be the defa
Klaus Wenninger writes:
> On 06/16/2016 11:05 AM, Ferenc Wágner wrote:
>
>> Klaus Wenninger writes:
>>
>>> On 06/15/2016 06:11 PM, Ferenc Wágner wrote:
>>>
>>>> I think the default timestamp should contain date and time zone
>>>> spe
Klaus Wenninger writes:
> On 06/15/2016 06:11 PM, Ferenc Wágner wrote:
>
>> Please find some random notes about my adventures testing the new alert
>> system.
>>
>> The first alert example in the documentation has no recipient:
>>
>>
>>
Hi,
Please find some random notes about my adventures testing the new alert
system.
The first alert example in the documentation has no recipient:
In the example above, the cluster will call my-script.sh for each
event.
while the next section starts as:
Each alert may be conf
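For context, a minimal alert definition in the CIB looks like this (ids, path and recipient value are illustrative; the recipient element is exactly what the two documentation passages disagree about being required):

```
<alerts>
  <alert id="my-alert" path="/path/to/my-script.sh">
    <recipient id="my-alert-recipient" value="/var/log/cluster-alerts.log"/>
  </alert>
</alerts>
```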
Ken Gaillot writes:
> With this release candidate, we now provide three sample alert scripts
> to use with the new alerts feature, installed in the
> /usr/share/pacemaker/alerts directory.
Hi,
Is there a real reason to name these scripts *.sample? Sure, they are
samples, but they are also usab
Ilia Sokolinski writes:
> We have a custom Master-Slave resource running on a 3-node pcs cluster on
> CentOS 7.1
>
> As part of what is supposed to be an NDU we do update some properties of the
> resource.
> For some reason this causes both Master and Slave instances of the resource
> to be r
Nikhil Utane writes:
> Would like to know the best and easiest way to add a new node to an already
> running cluster.
>
> Our limitation:
> 1) pcsd cannot be used since (as per my understanding) it communicates over
> ssh which is prevented.
> 2) No manual editing of corosync.conf
If you use IPv
"Lentes, Bernd" writes:
> - On Jun 7, 2016, at 3:53 PM, Ferenc Wágner wf...@niif.hu wrote:
>
>> "Lentes, Bernd" writes:
>>
>>> Ok. Does DLM takes care that a LV just can be used on one host ?
>>
>> No. Even plain LVM uses loc
"Lentes, Bernd" writes:
> Ok. Does DLM takes care that a LV just can be used on one host ?
No. Even plain LVM uses locks to serialize access to its metadata
(avoid concurrent writes corrupting it). These locks are provided by
the host kernel (locking_type=1). DLM extends the locking concept t
"Stephano-Shachter, Dylan" writes:
> I can not figure out why version 4 is not supported.
Have you got fsid=root (or fsid=0) on your root export?
See man exports.
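A sketch of an /etc/exports line marking the NFSv4 pseudo-root (path and options illustrative):

```
# fsid=0 designates this export as the NFSv4 pseudo-root
/srv/nfs  *(rw,sync,fsid=0)
```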
--
Feri
Andrey Rogovsky writes:
> I have deb rules, comes from 1.12 and try apply it to current release.
1.1.14 is available in sid, stretch and jessie-backports, any reason you
can't use those packages?
> In the building I get an error:
> dh_testroot -a
> rm -rf `pwd`/debian/tmp/usr/lib/service_crm.so
David Teigland writes:
> On Tue, Apr 26, 2016 at 09:57:06PM +0200, Valentin Vidic wrote:
>
>> The bug is caused by the missing braces in the expanded if
>> statement.
>>
>> Do you think we can get a new version out with this patch as the
>> fencing in 4.0.4 does not work properly due to this iss
Hi,
Are recurring monitor operations constrained by the batch-limit cluster
option? I ask because I'd like to limit the number of parallel start
and stop operations (because they are resource hungry and potentially
take long) without starving other operations, especially monitors.
--
Thanks,
Fer
Ken Gaillot writes:
> Each alert may have any number of recipients configured. These values
> will simply be passed to the script as arguments. The first recipient
> will also be passed as the CRM_alert_recipient environment variable,
> for compatibility with existing scripts that only support on
"Ulrich Windl" writes:
> Ferenc Wágner schrieb am 19.04.2016 um 13:42 in Nachricht
>
>> "Ulrich Windl" writes:
>>
>>> Ferenc Wágner schrieb am 18.04.2016 um 17:07 in Nachricht
>>>
>>>> I'm using the "balance
"Ulrich Windl" writes:
> Ferenc Wágner schrieb am 18.04.2016 um 17:07 in Nachricht
>
>> I'm using the "balanced" placement strategy with good success. It
>> distributes our VM resources according to memory size perfectly.
>> However, I
Hi,
I'm using the "balanced" placement strategy with good success. It
distributes our VM resources according to memory size perfectly.
However, I'd like to take the NUMA topology into account. That means
each host should have several capacity pools (of each capacity type) to
arrange the resource
Hi,
On a freshly rebooted cluster node (after crm_mon reports it as
'online'), I get the following:
wferi@vhbl08:~$ sudo crm_resource -r vm-cedar --cleanup
Cleaning up vm-cedar on vhbl03, removing fail-count-vm-cedar
Cleaning up vm-cedar on vhbl04, removing fail-count-vm-cedar
Cleaning up vm-ceda
"Ulrich Windl" writes:
> Actually form my SLES11 SP[1-4] experience, the cluster always
> distributes resources across all available nodes, and only if don't
> want that, I'll have to add constraints. I wonder why that does not
> seem to work for you.
Because I'd like to spread small subsets of