Re: [DRBD-user] drbd 8.3 - 6 nodes
Hi,

now you've cleared things up for me, thanks.

On 03/06/2012 04:27 AM, Umarzuki Mochlis wrote:
> without drbd, A3 would automatically mount A1's LUN and run as A1 while
> resuming A1's role via rgmanager

Without DRBD? How does this work? Are those LUNs on a SAN?

With DRBD, you would need to set things up so that A3 has 2 DRBD volumes. One is shared
with A1, another with A2. Your cluster manager would take care to normally have A1 and A2
be primary, and A3 in respective failover conditions. I'm not sure how you separate A1's
from A2's services on A3, but your cluster configuration probably takes care of that
already.

> A1 in case of failure, will failover to A3
> A2 in case of failure, will failover to A3

OK, fine.

> i hope i explained it correctly this time. forgive me for my terrible
> command of the english language.

Trust me, I've seen far, far worse :-)

Regards,
Felix
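(For illustration only: a minimal DRBD 8.3 sketch of the two-resource layout described
above, with A3 acting as the peer for both A1 and A2. All resource names, device paths,
backing disks and IP addresses here are invented, not taken from the original posts.)

resource r_a1 {
  protocol C;
  device    /dev/drbd0;
  disk      /dev/mapper/lun_a1;    # backing device normally used by A1
  meta-disk internal;

  on a1 { address 192.168.10.1:7788; }
  on a3 { address 192.168.10.3:7788; }
}

resource r_a2 {
  protocol C;
  device    /dev/drbd1;
  disk      /dev/mapper/lun_a2;    # backing device normally used by A2
  meta-disk internal;

  on a2 { address 192.168.10.2:7789; }
  on a3 { address 192.168.10.3:7789; }   # the second resource needs its own port on A3
}

The cluster manager would then normally keep r_a1 Primary on A1 and r_a2 Primary on A2,
and promote either one on A3 only during failover.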
Re: [DRBD-user] Kernel hung on DRBD / MD RAID
Hi,

----- Original Message -----
From: Andreas Bauer a...@voltage.de
To: drbd-user@lists.linbit.com
Sent: Tuesday, 6 March 2012 00:27:33
Subject: Re: [DRBD-user] Kernel hung on DRBD / MD RAID

From: Florian Haas flor...@hastexo.com
Sent: Mon 05-03-2012 23:59

> On Mon, Mar 5, 2012 at 11:45 PM, Andreas Bauer a...@voltage.de wrote:
>> [snip]
>
> If you're telling your system to use a sync/verify rate that you _know_ to be higher
> than what the disk can handle, then kicking off a verify (drbdadm verify) or full sync
> (drbdadm invalidate-remote) will badly beat up your I/O stack. The documentation tells
> you to use a sync rate that doesn't exceed about one third of your available bandwidth.
> You can also use variable-rate synchronization, which should take care of properly
> throttling the syncer rate for you. But by deliberately setting a sync rate that exceeds
> disk bandwidth, you're begging for trouble. Why would you want to do this?

Because I want to badly beat up my I/O stack? The point of this exercise is to reproduce
the kernel crash. So, to stay with the image, the stack should be able to take a beating
without dying in the process.

In reality it happens often when an MD resync is running (even when throttled to e.g.
1000 K/s) and DRBD is also resyncing. When you then look at /proc/mdstat you'll find the
sync rate slowing down to zero.

We do use DRBD on top of raid1 on several other servers without problems. We do use Xen
on top of raid1 on several other servers without problems.

Greets,

Micha Kersloot
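(Side note, as a hedged sketch only: this is roughly how the fixed and variable-rate
settings mentioned above look in a DRBD 8.3 configuration. The resource name, the assumed
~1 Gbit/s replication link and all numbers are examples, not values from this thread; the
c-* options require DRBD 8.3.9 or later.)

resource r0 {
  syncer {
    rate 33M;              # fixed sync rate: roughly 1/3 of an assumed GigE link (~110 MB/s)

    # Alternatively, variable-rate synchronization (DRBD 8.3.9+):
    # c-plan-ahead 20;     # enable dynamic sync-rate control (value in tenths of a second)
    # c-fill-target 2M;
    # c-min-rate 10M;
    # c-max-rate 100M;
  }
}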
Re: [DRBD-user] drbd 8.3 - 6 nodes
On 6 March 2012, 4:45 PM, Felix Frank f...@mpexnet.de wrote:
> On 03/06/2012 04:27 AM, Umarzuki Mochlis wrote:
>> without drbd, A3 would automatically mount A1's LUN and run as A1 while
>> resuming A1's role via rgmanager
>
> Without DRBD? How does this work? Are those LUNs on a SAN?
>
> With DRBD, you would need to set things up so that A3 has 2 DRBD volumes. One is shared
> with A1, another with A2. Your cluster manager would take care to normally have A1 and
> A2 be primary, and A3 in respective failover conditions. I'm not sure how you separate
> A1's from A2's services on A3, but your cluster configuration probably takes care of
> that already.
>
>> A1 in case of failure, will failover to A3
>> A2 in case of failure, will failover to A3
>
> OK, fine.

yes, the mailbox storages are LUNs on a SAN. it manages to do clustering with rgmanager +
cman, so A3 would take over where A1 or A2 left off by mounting A1's or A2's mailbox
storage on itself, since it is already running a standby mailbox service (zimbra-cluster
standby server). i believe this is called 2+1 clustering.

but now i have to set up another cluster for disaster recovery, as i described before,
using drbd. is there any way, with this setup, that i could achieve what i had intended?

FYI, all hardware had been bought and storage had been calculated beforehand, which i had
no say in or part of, so this is a bit of a problem for me.

-- 
Regards,

Umarzuki Mochlis
http://debmal.my
Re: [DRBD-user] drbd 8.3 - 6 nodes
On 03/06/2012 11:18 AM, Umarzuki Mochlis wrote:
> yes, the mailbox storages are LUNs on a SAN. it manages to do clustering with rgmanager
> + cman, so A3 would take over where A1 or A2 left off by mounting A1's or A2's mailbox
> storage on itself, since it is already running a standby mailbox service (zimbra-cluster
> standby server). i believe this is called 2+1 clustering.
>
> but now i have to set up another cluster for disaster recovery, as i described before,
> using drbd. is there any way, with this setup, that i could achieve what i had intended?
>
> FYI, all hardware had been bought and storage had been calculated beforehand, which i
> had no say in or part of, so this is a bit of a problem for me.

Ah, I see now.

Technically, you'd want to establish DRBD synchronisation between the SAN at your A site
and the SAN at the B site. Manual failover would include making SAN B Primary.

Now, if said SANs are proprietary all-in-one products, your possibilities for adding DRBD
to them may be severely limited. Your SAN vendor may or may not offer cross-site
synchronisation of their own.

HTH,
Felix
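(Purely as an illustration of what such a manual failover could involve on a B-site node;
the resource name, device and mount point are invented, and the exact promotion command
depends on the state DRBD is in at that moment.)

# On the B-site node, once site A is confirmed down:
drbdadm disconnect r_mail         # drop the (now dead) replication link
drbdadm primary r_mail            # promote; a force option may be needed depending on disk/peer state
mount /dev/drbd0 /srv/mailstore   # hypothetical mount point for the mailbox storage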
Re: [DRBD-user] drbd 8.3 - 6 nodes
Hi,

On 03/06/2012 11:34 AM, Kaloyan Kovachev wrote:
> Yes. You may use a floating IP for DRBD and have one instance (IP) in site A and another
> in site B for each service. Do not use the service IP as the floating IP, as you will
> have problems moving the service from A to B.
>
> If A1 is active, you have the DRBD_A1 IP on that node, which will move to A3 in case of
> failure before the service ... now you have DRBD_A1 and the service running on A3 over
> DRBD_A1, while DRBD_B1 will run independently on B1 or B3. Now your A site goes down -
> you promote DRBD_B1 to primary and start the A1 service on B1 over DRBD_B1.

interesting. So you suggest that A1 should DRBD-sync with B1 at all times etc.?

Keep in mind that this is shared storage we're talking about here, no local disks in
either A1 *or* B1. I believe DRBD could be made to operate thus, but there might be
performance issues.

Cheers,
Felix
Re: [DRBD-user] drbd 8.3 - 6 nodes
On Tue, 06 Mar 2012 11:40:22 +0100, Felix Frank f...@mpexnet.de wrote:
> Hi,
>
> On 03/06/2012 11:34 AM, Kaloyan Kovachev wrote:
>> Yes. You may use a floating IP for DRBD and have one instance (IP) in site A and
>> another in site B for each service. Do not use the service IP as the floating IP, as
>> you will have problems moving the service from A to B.
>>
>> If A1 is active, you have the DRBD_A1 IP on that node, which will move to A3 in case of
>> failure before the service ... now you have DRBD_A1 and the service running on A3 over
>> DRBD_A1, while DRBD_B1 will run independently on B1 or B3. Now your A site goes down -
>> you promote DRBD_B1 to primary and start the A1 service on B1 over DRBD_B1.
>
> interesting. So you suggest that A1 should DRBD-sync with B1 at all times etc.?

Yes. That's the only option to sync both SANs if they do not provide such functionality
themselves.

> Keep in mind that this is shared storage we're talking about here, no local disks in
> either A1 *or* B1. I believe DRBD could be made to operate thus, but there might be
> performance issues.

Again yes. Performance will be lower and protocol A with DRBD Proxy may be required, but
again, if the SAN does not provide native cross-site replication, there is not much else
to do ... rsync from a snapshot is one option i can think of, but even from the currently
inactive node (A3) the performance will suffer, and additionally there will be a delay in
synchronization and possible data loss, so DRBD is still better.

> Cheers,
> Felix
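(A hedged sketch of the floating-IP idea in DRBD 8.3 configuration terms: DRBD allows
peers to be addressed by a "floating" IP instead of an "on <hostname>" section, which is
what makes the DRBD_A1 / DRBD_B1 scheme described above possible. Every name and address
below is invented.)

resource r_a1 {
  protocol A;                 # asynchronous replication for the inter-site link
  device    /dev/drbd0;
  disk      /dev/mapper/lun_a1;
  meta-disk internal;

  floating 10.1.0.10:7788;    # "DRBD_A1" IP, lives on A1 or moves to A3 on failover
  floating 10.2.0.10:7788;    # "DRBD_B1" IP, lives on B1 or B3 at the remote site
}

If DRBD Proxy were used for the long-haul link, it would additionally need a proxy section
(and a license from LINBIT); that part is not shown here.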
Re: [DRBD-user] drbd 8.3 - 6 nodes
On 6 March 2012, 6:57 PM, Kaloyan Kovachev kkovac...@varna.net wrote:
>> Keep in mind that this is shared storage we're talking about here, no local disks in
>> either A1 *or* B1. I believe DRBD could be made to operate thus, but there might be
>> performance issues.
>
> Again yes. Performance will be lower and protocol A with DRBD Proxy may be required, but
> again, if the SAN does not provide native cross-site replication, there is not much else
> to do ... rsync from a snapshot is one option i can think of, but even from the currently
> inactive node (A3) the performance will suffer, and additionally there will be a delay in
> synchronization and possible data loss, so DRBD is still better.

addendum: what i want to do is storage failover, since the email service's clustering is
handled by zimbra-cluster and rgmanager + cman on centos.

the reason we're using drbd is because we have no budget for remote mirroring, which would
require 2 SAN routers on the current network setup.

-- 
Regards,

Umarzuki Mochlis
http://debmal.my
[DRBD-user] DRBD uses a wrong interface.
Hi ALL,

I've found a rather strange thing. It looks like the server tries to use the wrong network
interface for the DRBD connection. Could you explain to me how this is possible? See
details below:

Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] entering GATHER state from 11.
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] Creating commit token because I am the rep.
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] Storing new sequence id for ring 1518
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] entering COMMIT state.
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] entering RECOVERY state.
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] position [0] member 10.102.1.55:
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] previous ring seq 5396 rep 10.10.24.10
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] aru 63 high delivered 63 received flag 1
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] Did not need to originate any messages in recovery.
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] Sending initial ORF token
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] CLM CONFIGURATION CHANGE
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] New Configuration:
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] r(0) ip(10.102.1.55)
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] Members Left:
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] r(0) ip(10.10.24.10)
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] Members Joined:
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] CLM CONFIGURATION CHANGE
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] New Configuration:
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] r(0) ip(10.102.1.55)
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] Members Left:
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] Members Joined:
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [SYNC ] This node is within the primary component and will provide service.
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [TOTEM] entering OPERATIONAL state.
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CLM ] got nodejoin message 10.102.1.55
Mar 7 11:30:35 infplsm018 daemon.notice openais[3142]: [CPG ] got joinlist message from node 2
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] entering GATHER state from 11.
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] Storing new sequence id for ring 151c
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] entering COMMIT state.
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] entering RECOVERY state.
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] position [0] member 10.10.24.10:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] previous ring seq 5400 rep 10.10.24.10
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] aru c high delivered c received flag 1
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] position [1] member 10.102.1.55:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] previous ring seq 5400 rep 10.102.1.55
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] aru d high delivered d received flag 1
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [TOTEM] Did not need to originate any messages in recovery.
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] CLM CONFIGURATION CHANGE
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] New Configuration:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] r(0) ip(10.102.1.55)
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] Members Left:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] Members Joined:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] CLM CONFIGURATION CHANGE
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] New Configuration:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] r(0) ip(10.10.24.10)
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] r(0) ip(10.102.1.55)
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] Members Left:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] Members Joined:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] CLM CONFIGURATION CHANGE
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] New Configuration:
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] r(0) ip(10.10.24.10)
Mar 7 11:32:03 infplsm018 daemon.notice openais[3142]: [CLM ] r(0) ip(10.102.1.55)
Mar 7 11:32:03 infplsm018
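(The post does not show the DRBD resource configuration. As a general, hedged checklist
only: DRBD connects to whatever endpoints are listed in the resource's address/floating
lines, so comparing those against the local interfaces and routing table usually shows
which interface ends up being used. The resource name below is a placeholder.)

# Show the endpoints DRBD is configured with for this resource
drbdadm dump r0 | grep -E 'address|floating'

# Check which local interface owns that address and which route reaches the peer
ip addr show
ip route get <peer-address-from-the-dump>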
Re: [DRBD-user] DRBD uses a wrong interface.
On 03/06/2012 05:39 PM, Ivan Pavlenko wrote:
> Hi ALL,
>
> I've found a rather strange thing. It looks like the server tries to use the wrong
> network interface for the DRBD connection. Could you explain to me how this is possible?
> See details below:
>
> [...]
Re: [DRBD-user] Rebooting the servers
On 03/06/2012 11:53 AM, Marcelo Pereira wrote:
> Hello guys,
>
> Is there a way to reboot a pair of DRBD servers without having to do a resync
> afterwards? I would like to know what the correct procedure is for doing that.
>
> For example, assuming that both are in sync and running properly, and that I have to
> move the servers from one rack to another. How can I do that?
>
> Thanks,
> Marcelo

Unmount any filesystems, then stop the drbd service. That's it.

DRBD only syncs blocks that have changed since the two nodes last saw one another. With
nothing being written, there is nothing to sync when you restart.

-- 
Digimer
E-Mail: digi...@alteeve.com
Papers and Projects: https://alteeve.com
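(As an illustration only; the mount point is made up and the init-script path assumes a
SysV-style setup.)

# On each node, before powering down:
umount /srv/data            # unmount whatever sits on top of the DRBD device
/etc/init.d/drbd stop       # cleanly demotes and detaches the resources

# After the move, on both nodes:
/etc/init.d/drbd start
cat /proc/drbd              # both nodes should come back Connected, UpToDate/UpToDate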