- On Oct 14, 2019, at 6:27 AM, Roger Zhou zz...@suse.com wrote:
> A stop failure is very bad, and handling it is crucial for an HA system.
Yes, that's true.
> You can try o2locktop cli to find the potential INODE to be blamed[1].
>
> `o2locktop --help` gives you more usage details
I will try that.
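As a reference, invoking o2locktop could look like this sketch (the node names and mount point are placeholders, not from the thread):

```shell
# Watch the hottest OCFS2/DLM locks to find the inode to blame.
# Node names and mount point below are examples only.
o2locktop -n ha-idg-1 -n ha-idg-2 /mnt/ocfs2
```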
Hi,
occasionally a Filesystem resource for an OCFS2 partition fails to stop.
I'm currently tracing this RA hoping to find the culprit.
I'm putting one of the two nodes into standby, hoping the error appears.
Afterwards setting it online again and doing the same procedure with the other
Hi,
i have a two node cluster running on SLES 12 SP4.
I did some testing on it.
I put one into standby (ha-idg-2), the other (ha-idg-1) got fenced a few
minutes later because i made a mistake.
ha-idg-2 was DC. ha-idg-1 made a fresh boot and i started corosync/pacemaker on
it.
It seems ha-idg-1 d
Hi,
i finally managed to find out how i can simulate configuration changes and see
their results before committing them.
OMG. That makes life much more relaxed. I need to change the configuration of a
resource which is part of a group, the group is
running as a clone on all nodes.
Unfortunately
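The workflow hinted at here is crmsh's shadow CIB. A sketch of the commands (the shadow name "sandbox" is an example):

```shell
# Work against a shadow copy of the CIB instead of the live configuration.
crm(live)# cib new sandbox        # create a shadow CIB and switch to it
crm(sandbox)# configure edit      # change the configuration without risk
crm(sandbox)# configure ptest     # simulate what the cluster would do
crm(sandbox)# cib commit sandbox  # apply the changes to the live cluster
```

`cib use live` switches back to the live CIB without committing anything.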
- On Oct 1, 2019, at 2:29 PM, Yan Gao y...@suse.com wrote:
> On 10/1/19 1:37 PM, Lentes, Bernd wrote:
>>
>> - On Oct 1, 2019, at 12:26 PM, Yan Gao y...@suse.com wrote:
>>
>> Currently i'm running SLES 12 SP4. Is it worth thinking about updating t
- On Oct 1, 2019, at 12:26 PM, Yan Gao y...@suse.com wrote:
> On 9/30/19 6:45 PM, Lentes, Bernd wrote:
> The behavior/idea about cleanup makes more sense in pacemaker-2.0
> (SLE-HA 15 releases). It does *real* cleanup only if a resource has any
> failures.
Currently i'
>>
>> Hi Yan,
>> I had a look in the logs and what happened when i issued a "resource
>> cleanup" of
>> the GFS2 resource is
>> that the cluster deleted an entry in the status section:
>>
>> Sep 26 14:52:52 [9317] ha-idg-2 cib: info: cib_process_request:
>> Completed cib_delete operat
- On Sep 26, 2019, at 5:19 PM, Yan Gao y...@suse.com wrote:
> Hi,
>
> On 9/26/19 3:25 PM, Lentes, Bernd wrote:
>> Hi,
>>
>> i had two errors with a GFS2 partition several days ago:
>> gfs2_share_monitor_3 on ha-idg-2 'unknown error' (1): ca
Hi,
i had two errors with a GFS2 partition several days ago:
gfs2_share_monitor_3 on ha-idg-2 'unknown error' (1): call=103,
status=Timed Out, exitreason='',
last-rc-change='Thu Sep 19 13:44:22 2019', queued=0ms, exec=0ms
gfs2_share_monitor_3 on ha-idg-1 'unknown error' (1): call=103
- On Aug 14, 2019, at 7:07 PM, kgaillot kgail...@redhat.com wrote:
>> That's my setting:
>>
>> expected_votes: 2
>> two_node: 1
>> wait_for_all: 0
>>
>> no-quorum-policy=ignore
>>
>> I did that because i want to be able to start the cluster although one
>> node has e.g. a hardware pr
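For context, the settings quoted above live in two places: the quorum section of /etc/corosync/corosync.conf and a Pacemaker cluster property. A sketch with the values from the thread (the provider line is an assumption about a standard votequorum setup):

```shell
# /etc/corosync/corosync.conf (fragment):
#   quorum {
#       provider: corosync_votequorum
#       expected_votes: 2
#       two_node: 1
#       wait_for_all: 0
#   }
# Matching Pacemaker property:
crm configure property no-quorum-policy=ignore
```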
- On Aug 13, 2019, at 1:19 AM, kgaillot kgail...@redhat.com wrote:
>
> The key messages are:
>
> Aug 09 17:43:27 [6326] ha-idg-1 crmd: info: crm_timer_popped:
> Election
> Trigger (I_DC_TIMEOUT) just popped (2ms)
> Aug 09 17:43:27 [6326] ha-idg-1 crmd: warning: do_l
- On Aug 14, 2019, at 8:25 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
> But why do the eth interfaces on both nodes come up the same second
> (2019-08-09T17:42:19)?
>
The respective eth's of the two bonds of the two hosts are connected directly
to each other.
Just a wir
- On Aug 13, 2019, at 3:14 PM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
> You said you booted the hosts sequentially. From the logs they were starting
> in
> parallel.
>
No. last says:
ha-idg-1:
reboot system boot 4.12.14-95.29-de Fri Aug 9 17:42 - 15:56 (3+22:14)
ha-i
- On Aug 13, 2019, at 3:34 PM, Matthias Ferdinand m...@14v.de wrote:
>> 17:26:35 crm node standby ha-idg1-
>
> if that is not a copy&paste error (ha-idg1- vs. ha-idg-1), then ha-idg-1
> was not set to standby, and installing updates may have done some
> meddling with corosync/pacemaker (li
- On Aug 12, 2019, at 7:47 PM, Chris Walker cwal...@cray.com wrote:
> When ha-idg-1 started Pacemaker around 17:43, it did not see ha-idg-2, for
> example,
>
> Aug 09 17:43:05 [6318] ha-idg-1 pacemakerd: info:
> pcmk_quorum_notification:
> Quorum retained | membership=1320 members=1
>
>
- On Aug 13, 2019, at 9:00 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
> Personally I feel safer with updates when the whole cluster node is
> offline, not standby. When you are going to boot anyway, it won't make much of
> a difference. Also you don't have to remember to
Hi,
last Friday (9th of August) i had to install patches on my two-node cluster.
I put one of the nodes (ha-idg-2) into standby (crm node standby ha-idg-2),
patched it, rebooted,
started the cluster (systemctl start pacemaker) again, put the node again
online, everything fine.
Then i wanted to
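The patching procedure described above, as a command sketch (one node at a time; zypper as the SLES update tool is an assumption):

```shell
crm node standby ha-idg-2     # move resources off the node
zypper patch                  # install the updates
reboot
# after the node is back up:
systemctl start pacemaker     # rejoin the cluster
crm node online ha-idg-2      # allow resources to run here again
```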
- On Jul 30, 2019, at 9:07 PM, kgaillot kgail...@redhat.com wrote:
>
> There was a regression in 1.1.20 and 2.0.0 (fixed in the next versions)
> where cleanups of multiple errors would miss some of them. Any chance
> you're using one of those?
I'm afraid not. Is there another way to get rid o
Hi,
i always have on one of my cluster nodes "crm_mon -nfrALm 3" running in a ssh
session,
which gives a good and short overview of the status of the cluster.
I just had some problems in live migrating some VirtualDomains.
These are the errors i see:
Failed Resource Actions:
* vm_genetrap_migrate
Hi,
sometimes my clvm resource does not stop cleanly so the respective node is
fenced. To investigate
that further i set a "trace stop" on that resource:
ha-idg-2:~ # cibadmin -Q |grep -i trace
Is that correct ?
But i have now already set both nodes several times into standby mod
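For reference, a sketch of how such a trace is set and checked with crmsh (the trace output location is an assumption and varies by version):

```shell
crm resource trace clvm stop        # sets trace_ra=1 on the stop operation
cibadmin -Q | grep -i trace_ra      # confirm the attribute is in the CIB
# trace output usually lands under /var/lib/heartbeat/trace_ra/<agent>/
```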
- On Jul 4, 2019, at 1:25 AM, kgaillot kgail...@redhat.com wrote:
> On Wed, 2019-06-19 at 18:46 +0200, Lentes, Bernd wrote:
>> - On Jun 15, 2019, at 4:30 PM, Bernd Lentes
>> bernd.len...@helmholtz-muenchen.de wrote:
>>
>> > - On Jun 14, 2019, at
- On Jun 23, 2019, at 1:40 PM, Somanath Jeeva somanath.je...@ericsson.com
wrote:
> Hi All,
> I have a two node cluster with multicast (udp) transport . The multicast IP
> used
> in 224.1.1.1 .
> Whenever there is a CPU intensive task the pcs cluster goes into split brain
> scenario and doe
- On Jun 15, 2019, at 4:30 PM, Bernd Lentes
bernd.len...@helmholtz-muenchen.de wrote:
> - On Jun 14, 2019, at 9:20 PM, kgaillot kgail...@redhat.com wrote:
>
>> On Fri, 2019-06-14 at 18:27 +0200, Lentes, Bernd wrote:
>>> Hi,
>>>
>>> i had that
- On Jun 14, 2019, at 9:20 PM, kgaillot kgail...@redhat.com wrote:
> On Fri, 2019-06-14 at 18:27 +0200, Lentes, Bernd wrote:
>> Hi,
>>
>> i had that problem already once but still it's not clear for me what
>> really happens.
>> I had this problem som
Hi,
i had that problem already once but still it's not clear for me what really
happens.
I had this problem some days ago:
I have a 2-node cluster with several virtual domains as resources. I put one
node (ha-idg-2) into standby, and two running virtual domains were migrated to
the other node (
- On May 20, 2019, at 8:28 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
>>>> "Lentes, Bernd" wrote on May 16, 2019 at 5:10 PM in message
> <1151882511.6631123.1558019430655.javamail.zim...@helmholtz-muenchen.de>:
>> Hi,
Hi,
my HA-Cluster with two nodes fenced one on 14th of may.
ha-idg-1 has been the DC, ha-idg-2 was fenced.
It happened around 11:30 am.
The log from the fenced one isn't really informative:
==
2019-05-14T11:22:09.948980+02:00 ha-idg-2 liblogging-stdlog: -- MARK --
- On May 3, 2019, at 10:32 PM, Bernd Lentes
bernd.len...@helmholtz-muenchen.de wrote:
>>
>> For now, I guess you'll have to post-process it with sed or something.
>
> I don't know much about cgi-scripts. With -w i can write the output from
> crm_mon
> to a cgi-script.
> Wouldn't it be possib
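Post-processing the generated page with sed, as suggested above, could look like the sketch below. The yellow font markup is an assumption about what crm_mon emits, and the demo operates on a temporary stand-in file rather than the real crm_mon.html:

```shell
# Rewrite the unreadable yellow to a darker color after crm_mon
# regenerates the page. Check your real crm_mon.html for the exact markup.
html=$(mktemp)                                       # stand-in for crm_mon.html
printf '<font color="yellow">inactive: vm_foo</font>\n' > "$html"
sed -i 's/color="yellow"/color="darkorange"/g' "$html"
cat "$html"
```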
- On May 3, 2019, at 7:37 PM, Christopher Lumens clum...@redhat.com wrote:
>> The output of the webpage is quite nice, you get a short and quick overview.
>> Only some inactive resources are displayed in yellow which is impossible to
>> read.
>>
>> Is there a way to configure the output of th
Hi,
on my cluster nodes i established a systemd service which starts crm_mon which
writes cluster information into a html-file so i can see the state
of my cluster in a webbrowser.
crm_mon is started that way:
/usr/sbin/crm_mon -d -i 10 -h /srv/www/hawk/public/crm_mon.html -m3 -nrfotAL -p
/va
- On Apr 21, 2019, at 6:51 AM, Andrei Borzenkov arvidj...@gmail.com wrote:
> On 04/20/2019 10:29 PM, Lentes, Bernd wrote:
>>
>>
>> - On Apr 18, 2019, at 4:21 PM, kgaillot kgail...@redhat.com wrote:
>>
>>>
>>> Simply stopping pacemaker and corosync by
- On Apr 18, 2019, at 4:21 PM, kgaillot kgail...@redhat.com wrote:
>
> Simply stopping pacemaker and corosync by whatever mechanism your
> distribution uses (e.g. systemctl) should be sufficient.
That works. But strangely, after a reboot both nodes are
shown as UNCLEAN. Does the clus
Hi,
i have a two-node cluster, both servers are buffered by an UPS.
If power is gone the UPS sends after a configurable time a signal via network
to shutdown the servers.
The UPS-Software (APC Power Chute Network Shutdown) gives me on the host the
possibility to run scripts
before it shuts down.
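A pre-shutdown hook for the UPS software could look like this sketch (the command choice and ordering are assumptions, not from the thread):

```shell
#!/bin/sh
# Called by PowerChute before the host powers off:
# stop the cluster cleanly so the surviving node does not fence us.
crm -w node standby "$(uname -n)"   # -w: wait until resources have moved/stopped
systemctl stop pacemaker            # then take down the cluster stack
```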
- On Jan 23, 2019, at 3:20 PM, Klaus Wenninger kwenn...@redhat.com wrote:
>> I have corosync-2.3.6-9.13.1.x86_64.
>> Where can i configure this value ?
>
> speaking of two_node & wait_for_all?
> That is configured in the quorum-section of corosync.conf:
>
> quorum {
> ...
> wait_for_all: 1
- On Jan 22, 2019, at 6:35 PM, Andrei Borzenkov arvidj...@gmail.com wrote:
> This is another problem - if the cluster requires stonith, it won't start
> resources with another node UNCLEAN and fencing attempt apparently failed.
Let's assume a running two-node cluster. Now node1 needs to be fe
- On Jan 22, 2019, at 9:24 PM, kgaillot kgail...@redhat.com wrote:
>> > Good plan, though perhaps there should be some allowance for the
>> > case
>> > in which only node1 is running when the power dies.
Yes, i will take care of that.
> Good point, I missed that. If you're sure the target
- On Jan 22, 2019, at 6:00 PM, kgaillot kgail...@redhat.com wrote:
> On Tue, 2019-01-22 at 16:52 +0100, Lentes, Bernd wrote:
>> Now the restart, which gives me trouble.
>> Currently i want to restart the cluster manually, because i'm not
>> completely familia
Hi,
we have a new UPS which has enough charge to provide our 2-node cluster with
the periphery (SAN, switches ...) for a reasonable time.
I'm currently thinking of the shutdown- and restart-procedure of the complete
cluster when the power is lost and does not come back soon.
Then cluster is provi
Hi,
i have a two node cluster with several VirtualDomains as resources. Normally
live migration is no problem. But rarely it fails, without giving any
reasonable
message in the logs. I tried to migrate several VirtualDomains concurrently from
ha-idg-2 to ha-idg-1. One VirtualDomain failed, the
- On Oct 11, 2018, at 4:26 PM, Kristoffer Grönlund kgronl...@suse.de wrote:
> On Thu, 2018-10-11 at 13:59 +0200, Lentes, Bernd wrote:
>> Hi,
>>
>> i'm trying to write a script which shutdown my VirtualDomains in the
>> night for a short period to take a cl
Hi,
i'm trying to write a script which shutdown my VirtualDomains in the night for
a short period to take a clean snapshot with libvirt.
To shut them down i can use "crm resource stop VirtualDomain".
But when i do a "crm resource stop VirtualDomain" in my script, the command
returns immediately
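"crm resource stop" only updates the CIB and returns at once, so the script has to poll until the resource is really down. A runnable sketch of such a wait loop; `is_stopped` here is a stub standing in for a real check such as parsing the output of "crm resource status":

```shell
# Poll until a resource reports stopped, giving up after a timeout.
tries=0
is_stopped() { [ "$tries" -ge 3 ]; }   # stub: reports "stopped" on the 3rd poll

wait_for_stop() {
    limit=$1
    while ! is_stopped; do
        tries=$((tries + 1))
        [ "$tries" -gt "$limit" ] && return 1
        sleep 0                        # would be e.g. "sleep 5" on a real cluster
    done
    return 0
}

wait_for_stop 10 && echo "resource stopped"
```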
- On Sep 20, 2018, at 6:58 PM, kgaillot kgail...@redhat.com wrote:
> OK, drop "default-" and it should work. The names in rsc_defaults are
> identical to what you'd use in the resource meta-data.
Now it's working.
Thanks Ken.
Bernd
Helmholtz Zentrum München
Ken wrote:
>
> I think you meant default-resource-stickiness ... and even that's
> deprecated in 1.1 and gone in 2.0. :-)
>
> The proper way is to set resource-stickiness in the rsc_defaults
> section (however that's done using your tools of choice).
> --
> Ken Gaillot
Hi,
i did it that way:
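(The command is cut off here; a sketch of setting stickiness via rsc_defaults, with an assumed value of 100:)

```shell
crm configure rsc_defaults resource-stickiness=100
crm configure show | grep -A1 rsc_defaults    # verify the setting
```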
Hi,
i have a two-node cluster with several VirtualDomain resources. Scenario:
Two VirtualDomains, running on different nodes.
Migrating one VirtualDomain from node 1 to node 2 migrates the other
VirtualDomain from node 2 to node 1.
These are the scores AFTER the migration:
...
vm_mausdb
- On Sep 11, 2018, at 8:54 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
>
> Hi Bernd,
>
> the disappointing answer is this: With cLVM you cannot make snapshots of the
> LVs (easily), and in SLES11 SP4 at least the tool to make snapshots of OCFS2
> also isn't provided. So the m
- On Sep 11, 2018, at 4:29 AM, Gang He g...@suse.com wrote:
> Hello Lentes,
>
> It does not look like an OCFS2 or pacemaker problem, more like a virtualization
> problem.
> From OCFS2/LVM2 perspective, if you use one LV for one VirtualDomain, that
> means
> the guest VMs on that VirtualDomai
Hi,
i'm establishing a cluster with virtual guests as resources which should reside
in raw files on OCFS2-formatted logical volumes.
My first idea was to create for each VirtualDomain its own logical volume, i
thought that would be well-structured.
But now i realize that my cluster configurat
- On Sep 5, 2018, at 6:58 PM, FeldHost™ Admin ad...@feldhost.cz wrote:
> Why you use FS for raw image, when you can use directly LV as block device for
> your VM
>
Because i want to make snapshots with virsh or qemu-img. I think i can't do
that with a naked block device.
Bernd
- On Sep 5, 2018, at 6:28 PM, FeldHost™ Admin ad...@feldhost.cz wrote:
> hello, yes, you need ocfs2 or gfs2, but in your case (raw image) probably
> better
> to use lvm
I use cLVM. The fs for the raw image resides on a clustered VG/LV.
But nevertheless i still need a cluster fs because of
Hi guys,
just to be sure. I thought (maybe i'm wrong) that having a VM on a shared
storage (FC SAN), e.g. in a raw file on an ext3 fs on that SAN allows
live-migration because pacemaker takes care that the ext3 fs is at any time
only mounted on one node. I tried it, but "live"-migration wasn't
- On Mar 15, 2018, at 3:47 AM, Gang He g...@suse.com wrote:
> Hello Lentes,
>
>
>> Hi,
>>
>> i have a 2-node-cluster with my services (web, db) running in VirtualDomain
>> resources.
>> I have a SAN with cLVM, each guest lies in a dedicated logical volume with
>> an ext3 fs.
>>
>> C
- On Mar 15, 2018, at 3:47 AM, Gang He g...@suse.com wrote:
> Just one comments, you have to make sure the VM file integrity before calling
> reflink.
>
Hi Gang,
how could i achieve that ? sync ? The disks of the VM's are configured without
cache,
otherwise they can't be live migrated.
- On Mar 14, 2018, at 11:54 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
> Hi!
>
> IMHO the only clean solution would be this procedure:
> 1) pause the VMs and cause them to flush their disk buffers, or at least make
> sure the writes of the VM guest arrived at the VM host's buf
Hi,
i have a 2-node-cluster with my services (web, db) running in VirtualDomain
resources.
I have a SAN with cLVM, each guest lies in a dedicated logical volume with an
ext3 fs.
Currently i'm thinking about snapshoting the guests to make a backup in the
background. With cLVM that's not possibl
- On Oct 16, 2017, at 10:57 PM, kgaillot kgail...@redhat.com wrote:
>> from the Changelog:
>>
>> Changes since Pacemaker-1.1.15
>> ...
>> + pengine: do not fence a node in maintenance mode if it shuts down
>> cleanly
>> ...
>>
>> just saying ... may or may not be what you are seeing.
- On Oct 16, 2017, at 9:27 PM, Digimer li...@alteeve.ca wrote:
>
> I understood what you meant about it getting fenced after stopping
> corosync. What I am not clear on is if you are stopping corosync on the
> normal node, or the node that is in maintenance mode.
>
> In either case, as I
- On Oct 16, 2017, at 7:37 PM, emmanuel segura emi2f...@gmail.com wrote:
> I put a node in maintenance mode?
> do you mean you put the cluster in maintenance mode
I did "crm node maintenance ". From my understanding that means i
put the node into maintenance mode.
Bernd
- On Oct 16, 2017, at 7:38 PM, Digimer li...@alteeve.ca wrote:
> On 2017-10-16 01:24 PM, Lentes, Bernd wrote:
>> Hi,
>>
>> i have the following behavior: I put a node in maintenance mode, afterwards
>> stop
>> corosync on that node with /etc/ini
Hi,
i have the following behavior: I put a node in maintenance mode, afterwards
stop corosync on that node with /etc/init.d/openais stop.
This node is immediately fenced. Is that expected behavior ? I thought putting
a node into maintenance does mean the cluster does not care anymore about that
Hi,
i just want to be sure. I created a DRBD partition in a dual primary setup. I
have a VirtualDomain (KVM) resource which resides in the naked DRBD (without
FS), and i can live migrate.
Are there situations where in this setup a cluster fs is necessary/recommended
? I'd like to avoid it, it c
> Hi,
>
>
> What happened:
> I tried to configure a simple drbd resource following
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html#idm140457860751296
> I used this simple snip from the doc:
> configure primitive WebData ocf:linbit:drbd param
- On Aug 10, 2017, at 2:11 PM, Lars Ellenberg lars.ellenb...@linbit.com
wrote:
>
> if you use crmsh "interactively",
> crmsh does implicitly use a shadow cib,
> and will only commit changes once you "commit",
> see "crm configure help commit"
>
Hi,
i tested it:
First try:
crm(live)# con
- On Aug 8, 2017, at 3:36 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote:
> crm shell in "auto-commit"?
> never seen that.
i googled for "crmsh autocommit pacemaker" and found that:
https://github.com/ClusterLabs/crmsh/blob/master/ChangeLog
See line 650. Don't know what that means.
>
>
- On Aug 8, 2017, at 9:42 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
>
> Maybe just be concrete with your questions, so it's much easier to provide
> useful answers.
>
Which question is not concrete ?
Bernd
- On Aug 7, 2017, at 10:43 PM, kgaillot kgail...@redhat.com wrote:
>
> The logs are very useful, but not particularly easy to follow. It takes
> some practice and experience, but I think it's worth it if you have to
> troubleshoot cluster events often.
I will give my best.
>
> It's on the
- On Aug 7, 2017, at 10:26 PM, kgaillot kgail...@redhat.com wrote:
>
> Unmanaging doesn't stop monitoring a resource, it only prevents starting
> and stopping of the resource. That lets you see the current status, even
> if you're in the middle of maintenance or what not. You can disable
>
- On Aug 4, 2017, at 10:19 PM, kgaillot kgail...@redhat.com wrote:
> The cluster reacted promptly:
> crm(live)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd params
> drbd_resource=idcc-devel \
>> op monitor interval=60
> WARNING: prim_drbd_idcc_devel: default timeout 20s for s
- On Aug 7, 2017, at 3:43 PM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
>>
>>>
>>> The "ERROR" message is coming from the DRBD resource agent itself, not
>>> pacemaker. Between that message and the two separate monitor operations,
>>> it looks like the agent will only run as
- On Aug 4, 2017, at 10:19 PM, kgaillot kgail...@redhat.com wrote:
>
> The "ERROR" message is coming from the DRBD resource agent itself, not
> pacemaker. Between that message and the two separate monitor operations,
> it looks like the agent will only run as a master/slave clone.
Btw:
Does
- On Aug 6, 2017, at 12:05 PM, Kristoffer Grönlund kgronl...@suse.com wrote:
>> What happened:
>> I tried to configure a simple drbd resource following
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html#idm140457860751296
>> I used this simpl
- On Aug 4, 2017, at 10:19 PM, Ken Gaillot kgail...@redhat.com wrote:
>
> Unfortunately no -- logging, and troubleshooting in general, is an area
> we are continually striving to improve, but there are more to-do's than
> time to do them.
sad but comprehensible. Is it worth trying to unders
Hi,
first: is there a tutorial or something else which helps in understanding what
pacemaker logs in syslog and /var/log/cluster/corosync.log ?
I try hard to find out what's going wrong, but they are difficult to
understand, also because of the amount of information.
Or should i deal more with "crm
- On Aug 2, 2017, at 10:42 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
>
> I thought the cluster does not perform actions that are not defined in the
> configuration (e.g. "monitor").
I think the cluster performs and configures automatically start/stop operations
if not d
Hi,
i'm wondering where the default values for operations of a resource come
from.
I tried to configure:
crm(live)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd params
drbd_resource=idcc-devel \
> op monitor interval=60
WARNING: prim_drbd_idcc_devel: default timeout 20s for
- On Aug 1, 2017, at 8:06 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
>>>> "Lentes, Bernd" wrote on Jul 31, 2017 at 6:51 PM in message
> <641329685.12981098.1501519915026.javamail.zim...@helmholtz-muenchen.de>:
>> Hi,
>>
Hi,
i'm currently a bit confused. I have several resources running as
VirtualDomains, the vm reside on plain logical volumes without fs, these lv's
reside themself on a FC SAN.
In that scenario i need cLVM to distribute the lvm metadata between the nodes.
For playing around a bit and getting us
Hi,
just to be sure:
i have a VirtualDomain resource (called prim_vm_servers_alive) running on one
node (ha-idg-2). For reasons i don't remember i have a location constraint:
location cli-prefer-prim_vm_servers_alive prim_vm_servers_alive role=Started
inf: ha-idg-2
Now i try to set this node i
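Such cli-prefer-* constraints are usually left behind by "crm resource migrate"; a sketch of clearing the one above before moving the node:

```shell
crm resource unmigrate prim_vm_servers_alive            # removes the cli-prefer constraint
# equivalently, delete it directly:
crm configure delete cli-prefer-prim_vm_servers_alive
```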
Hi,
i have a VirtualDomian resource running a Windows 7 client. This is the
respective configuration:
primitive prim_vm_servers_alive VirtualDomain \
params config="/var/lib/libvirt/images/xml/Server_Monitoring.xml" \
params hypervisor="qemu:///system" \
params migration_
>
> If you have DRBD (PV) -> Clustered VG -> LV per VM, and you have
> dual-primary DRBD, you can already do a live migration.
>
What about PV -> clustered VG -> LV -> DRBD ?
Bernd
>>
>> On SLES 11 SP4 HAE DRBD 8.4 is available. Do i need a cluster fs on top of a
>> dual primary DRBD ?
>> I assume.
>>
>>
>> Bernd
>
> Depends.
>
> If you want to have a shared FS, yes. If you want to back VMs though, we
> use clustered LVM to manage the DRBD space, creating per-VM LVs, an
>>>
>>
>> Is with DRBD and Virtual Machines live migration possible ?
>>
>>
>> Bernd
>
> yes, but dual-primary is needed (this is how the Anvil! does live
> migration). With DRBD 9, you can set it up to momentarily do
> dual-primary to support live migration, though I have not used this
> my
- On Jul 17, 2017, at 11:51 AM, Bernd Lentes
bernd.len...@helmholtz-muenchen.de wrote:
> Hi,
>
> i established a two node cluster with two HP servers and SLES 11 SP4. I'd like
> to start now with a test period. Resources are virtual machines. The vm's
> reside on a FC SAN. The SAN has two
- On Jul 11, 2017, at 7:25 PM, Bernd Lentes
bernd.len...@helmholtz-muenchen.de wrote:
> Hi,
>
> i established a two node cluster and i'd like to start now a test period with
> some not very important resources.
> I'd like to monitor the cluster via SNMP, so i realize if he's e.g. migrating
Hi,
i established a two node cluster with two HP servers and SLES 11 SP4. I'd like
to start now with a test period. Resources are virtual machines. The vm's
reside on a FC SAN. The SAN has two power supplies, two storage controller, two
network interfaces for configuration. Each storage control
Hi,
i established a two node cluster and i'd like to start now a test period with
some not very important resources.
I'd like to monitor the cluster via SNMP, so i notice when it's e.g. migrating.
I followed
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Pacemaker_Explained/ind
Hi,
i configured an order so that a simple virtual machine is started after some
other resources are started.
That was how i configured it:
configure order order_clone_group_prim_dlm_prim_vm_idcc-devel
clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml prim_vm_idcc_devel
I did this twice
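For reference, older crmsh versions expect an explicit score or kind in an order constraint; with the ids from the post that could look like the sketch below (the inf: score is an assumption):

```shell
crm configure order order_clone_group_prim_dlm_prim_vm_idcc-devel \
    inf: clone_group_prim_dlm_clvmd_vg_cluster_01_ocfs2_fs_lv_xml prim_vm_idcc_devel
```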
Hi,
what would you consider to be the best way for removing a node temporary from
the cluster, e.g. for installing updates ?
I thought "crm node maintenance node" would be the right way, but i was
astonished that the resources keep running on it. I would have expected that
the resources stop.
I
- On May 17, 2017, at 4:24 PM, Dmitri Maziuk dmitri.maz...@gmail.com wrote:
> On 2017-05-17 06:24, Lentes, Bernd wrote:
>>
> ...
>> I'd like to know what the software is use is doing. Am i the only one having
>> that opinion ?
>
> No.
>
>> How
- On May 17, 2017, at 2:58 PM, Klaus Wenninger kwenn...@redhat.com wrote:
>> I don't see that.
>
> fence_* are the RHCS-style fence-agents coming mainly from
> https://github.com/ClusterLabs/fence-agents.
>
Ah. Ok, i see that.
Do you know if they cooperate with a SuSE HAE ? I found rpm'
- On May 17, 2017, at 2:11 PM, Vladislav Bogdanov bub...@hoster-ok.com
wrote:
> 08.05.2017 22:20, Lentes, Bernd wrote:
>> Hi,
>>
>> i remember that digimer often campaigns for a fence delay in a 2-node
>> cluster.
>> E.g. here:
>> http://oss.cluste
- On May 10, 2017, at 9:15 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
> On 05/10/2017 01:54 PM, Ken Gaillot wrote:
>> On 05/10/2017 12:26 PM, Dimitri Maziuk wrote:
>
>>> - fencing in 2-node clusters does not work reliably without fixed delay
>>
>> Not quite. Fixed delay allows a parti
- On May 8, 2017, at 9:20 PM, Bernd Lentes
bernd.len...@helmholtz-muenchen.de wrote:
> Hi,
>
> i remember that digimer often campaigns for a fence delay in a 2-node
> cluster.
> E.g. here:
> http://oss.clusterlabs.org/pipermail/pacemaker/2013-July/019228.html
> In my eyes it makes sense
Hi,
i remember that digimer often campaigns for a fence delay in a 2-node cluster.
E.g. here: http://oss.clusterlabs.org/pipermail/pacemaker/2013-July/019228.html
In my eyes it makes sense, so i try to establish that. I have two HP servers,
each with an ILO card.
I have to use the stonith:extern
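A hedged sketch of that setup with two ILO/IPMI stonith resources, giving one device a head start so a fence race cannot kill both nodes. The external/ipmi parameter names and the pcmk_delay_base attribute are assumptions to verify against your agent and Pacemaker version ("stonith -t external/ipmi -n" lists the agent's parameters; older Pacemaker only knows pcmk_delay_max):

```shell
crm configure primitive fence_ha-idg-1 stonith:external/ipmi \
    params hostname=ha-idg-1 ipaddr=10.0.0.1 userid=admin passwd=secret \
    pcmk_delay_base=15s
crm configure primitive fence_ha-idg-2 stonith:external/ipmi \
    params hostname=ha-idg-2 ipaddr=10.0.0.2 userid=admin passwd=secret
```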
- On May 8, 2017, at 6:44 PM, Ken Gaillot kgail...@redhat.com wrote:
>>
>> This is the file without -d:
>>
>> ha-idg-2:/srv/www/hawk/public # stat crm_mon.html
>> File: `crm_mon.html'
>> Size: 1963Blocks: 8 IO Block: 4096 regular file
>> Device: 1fh/31d Inode: 7
Hi,
playing around with my cluster i always have a shell with crm_mon running
because it provides me a lot of useful and current information concerning
cluster, nodes, resources ...
Normally i have a "crm_mon -nrfRAL" running.
I'd like to have that output as a web page too.
So i tried the option
- On Apr 25, 2017, at 1:37 PM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
>>>> "Lentes, Bernd" wrote on Apr 25, 2017 at 11:02 AM in message
> <406563603.26964612.1493110931994.javamail.zim...@helmholtz-muenchen.de>:
>
- On Apr 24, 2017, at 11:11 PM, Ken Gaillot kgail...@redhat.com wrote:
> On 04/24/2017 02:33 PM, Lentes, Bernd wrote:
>>
>> - On Apr 24, 2017, at 9:11 PM, Ken Gaillot kgail...@redhat.com wrote:
>>
>>>>> primitive prim_vnc_ip_mausdb IPaddr \
- On Apr 25, 2017, at 8:08 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
> Bernd
>
> you are long enough on this list to know that the reason for your failure is
> most likely to be found in the logs which you did not provide. Couldn't you
> find out yourself from the logs?
>
- On Apr 24, 2017, at 9:11 PM, Ken Gaillot kgail...@redhat.com wrote:
>>>
>>>
>>> primitive prim_vnc_ip_mausdb IPaddr \
>>>params ip=146.107.235.161 nic=br0 cidr_netmask=24 \
>>>meta is-managed=true
>
> I don't see allow-migrate on the IP. Is this a modified IPaddr? The
> st
- On Apr 24, 2017, at 8:26 PM, Bernd Lentes
bernd.len...@helmholtz-muenchen.de wrote:
> Hi,
>
> i have a primitive VirtualDomain resource which i can live migrate without any
> problem.
> Additionally i have an IP as a resource which i can live mirgate easily too.
> If i combine them in a