Re: [PVE-User] pve-csync version of pve-zsync?
Hi. Is there any news regarding two-cluster replication? I would like to use it in a DR scenario, with a big cluster that has some VMs replicated to a small one (separate, and possibly in a different datacenter). Something included in the Proxmox GUI would be really great and add a lot of value to Proxmox itself.

Regards, Fabrizio Cuseo

On 13-Mar-18, at 19:32, Alexandre DERUMIER aderum...@odiso.com wrote:
> Hi,
>
> I have plans to implement storage replication for rbd in Proxmox,
> like for zfs export|import (with rbd export-diff | rbd import-diff).
>
> I'll try to work on it next month.
>
> I'm not sure whether a plugin infrastructure is currently in place in the code,
> or whether it can manage storages with different names.
>
> Can't tell if it'll be hard to implement, but the workflow is almost the same.
>
> I'll also try to look at rbd mirror, but it only works with librbd in qemu, not
> with krbd, so it can't be implemented for containers.
>
>
> - Original message -
> From: "Mark Adams"
> To: "proxmoxve"
> Sent: Tuesday, 13 March 2018 18:52:21
> Subject: Re: [PVE-User] pve-csync version of pve-zsync?
>
> Hi Alwin,
>
> I might have to take another look at it, but have you actually done this
> with 2 Proxmox clusters? I can't remember the exact part I got stuck on, as
> it was quite a while ago, but it wasn't as straightforward as you suggest.
> I think you couldn't use the same cluster name, which in turn created
> issues trying to use the "remote" (backup/DR/whatever you want to call it)
> cluster with Proxmox, because it needed to be called ceph.
>
> The docs I was referring to were the Ceph ones, yes. Some of the options
> listed in that doc do not work in the current Proxmox version (I think the
> doc hasn't been updated for newer versions...)
>
> Regards,
> Mark
>
> On 13 March 2018 at 17:19, Alwin Antreich wrote:
>
>> On Mon, Mar 12, 2018 at 04:51:32PM +, Mark Adams wrote:
>> > Hi Alwin,
>> >
>> > The last I looked at it, rbd mirror only worked if you had different
>> > cluster names. Tried to get it working with Proxmox but to no avail;
>> > without really messing with how Proxmox uses Ceph I'm not sure it's
>> > feasible, as Proxmox assumes the default cluster name for everything...
>> That isn't mentioned anywhere in the Ceph docs; they use two different
>> cluster names just for ease of explanation.
>>
>> If you have a config file named after the cluster, then you can specify
>> it on the command line.
>> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/#running-multiple-clusters
>>
>> > Also the documentation was a bit poor for it IMO.
>> Which documentation do you mean?
>> -> http://docs.ceph.com/docs/master/rbd/rbd-mirroring/
>>
>> > Would also be nice to choose specifically which VMs you want to be
>> > mirroring, rather than the whole cluster.
>> It is done either per pool or per image separately. See the link above.
>>
>> > I've manually done rbd export-diff and rbd import-diff between 2 separate
>> > Proxmox clusters over ssh, and it seems to work really well... It would
>> > just be nice to have a tool like pve-zsync so I don't have to write some
>> > script myself. Seems to me like something that would be desirable as part
>> > of Proxmox as well?
>> That would basically implement the Ceph rbd mirror feature.
>>
>> > Cheers,
>> > Mark
>> >
>> > On 12 March 2018 at 16:37, Alwin Antreich wrote:
>> >
>> > > Hi Mark,
>> > >
>> > > On Mon, Mar 12, 2018 at 03:49:42PM +, Mark Adams wrote:
>> > > > Hi All,
>> > > >
>> > > > Has anyone looked at or thought of making a version of pve-zsync for
>> > > > Ceph?
>> > > >
>> > > > This would be great for DR scenarios...
>> > > >
>> > > > How easy do you think this would be to do?
>> > > > I imagine it would be quite similar to pve-zsync, but using
>> > > > rbd export-diff and rbd import-diff instead of zfs send and
>> > > > zfs receive? So could the existing script be relatively easily
>> > > > modified? (I know nothing about perl)
>> > > >
>> > > > Cheers,
>> > > > Mark
>> > > > ___
>> > > > pve-user mailing list
>> > > > pve-user@pve.proxmox.com
>> > > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>> > > Isn't ceph mirror already what you want? It can mirror an image or a
>> > > whole pool. It keeps track of changes and serves remote image deletes
>> > > (adjustable delay).
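The export-diff | import-diff workflow discussed above can be sketched as follows. This is a minimal illustrative sketch, not pve-zsync or any existing tool: the pool, image, snapshot, and host names are invented, and the `rbd_sync_commands` helper is hypothetical. It only builds the shell commands for one incremental sync cycle, mirroring the zfs send | zfs receive pattern.

```python
# Hypothetical sketch of one incremental rbd replication cycle as discussed
# in the thread: snapshot the image, ship only the delta since the previous
# snapshot to a remote cluster over ssh, then retire the old snapshot.
# All names (pool, image, snapshots, remote host) are made up for illustration.

def rbd_sync_commands(pool: str, image: str, prev_snap: str, new_snap: str,
                      remote: str) -> list:
    """Build the shell commands for one incremental rbd sync cycle."""
    spec = f"{pool}/{image}"
    return [
        # 1. freeze a new point-in-time snapshot on the source cluster
        f"rbd snap create {spec}@{new_snap}",
        # 2. export only the changes between the two snapshots and apply
        #    them on the remote side ("-" makes rbd use stdout/stdin)
        f"rbd export-diff --from-snap {prev_snap} {spec}@{new_snap} - "
        f"| ssh {remote} 'rbd import-diff - {spec}'",
        # 3. drop the old snapshot; new_snap becomes the next baseline
        f"rbd snap rm {spec}@{prev_snap}",
    ]

for cmd in rbd_sync_commands("rbd", "vm-100-disk-1", "sync1", "sync2",
                             "root@backup-cluster"):
    print(cmd)
```

A real tool would additionally bootstrap the first full copy with a plain `rbd export-diff` (no `--from-snap`) and verify each step's exit status before removing the baseline snapshot.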
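Alwin's two points (a second cluster only needs a config file named after it, selected with `rbd --cluster`, and mirroring is enabled per pool or per image) can be sketched like this. The cluster, pool, and image names are invented, and the helper itself is hypothetical; only the `--cluster` flag and the `rbd mirror pool enable` / `rbd mirror image enable` subcommands are from the Ceph docs linked above.

```python
# Sketch: build the rbd commands that enable mirroring on a named cluster.
# "backup" here stands for a hypothetical /etc/ceph/backup.conf; pool and
# image names are illustrative only.

def mirror_setup_commands(cluster: str, pool: str, images=None) -> list:
    """Enable mirroring either for a whole pool or for selected images."""
    base = f"rbd --cluster {cluster}"  # reads /etc/ceph/{cluster}.conf
    if images:
        # "image" mode: mirroring is opted into per image, which addresses
        # Mark's wish to mirror specific VMs rather than the whole pool
        cmds = [f"{base} mirror pool enable {pool} image"]
        cmds += [f"{base} mirror image enable {pool}/{img}" for img in images]
        return cmds
    # "pool" mode: every journaled image in the pool is mirrored
    return [f"{base} mirror pool enable {pool} pool"]

for cmd in mirror_setup_commands("backup", "rbd", ["vm-100-disk-1"]):
    print(cmd)
```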
Re: [PVE-User] pve-firewall, clustering and HA gone bad
np!

--
Mark Schouten
Tuxis, Ede, https://www.tuxis.nl
T: +31 318 200208

- Original message -
From: Thomas Lamprecht (t.lampre...@proxmox.com)
Date: 25-06-2019 10:31
To: PVE User List (pve-user@pve.proxmox.com), Mark Schouten (m...@tuxis.nl)
Subject: Re: [PVE-User] pve-firewall, clustering and HA gone bad

On 6/25/19 9:44 AM, Thomas Lamprecht wrote:
> And as also said (see quote below), for more specific hints I need the raw
> logs, unmerged and as untouched as possible.

It may just be that I did not see the mail in my inbox, so it looks like you
already sent it to me; sorry about missing it.
Re: [PVE-User] pve-firewall, clustering and HA gone bad
On 6/25/19 9:10 AM, Mark Schouten wrote:
> On Thu, Jun 13, 2019 at 12:34:28PM +0200, Thomas Lamprecht wrote:
>>> 2: ha-manager should not be able to start the VMs when they are running
>>> elsewhere
>>
>> This can only happen if fencing fails, and that fencing works is always
>> a base assumption we must make (as otherwise no HA is possible at all).
>> So it would be interesting why fencing did not work here (see below for
>> the reason I could not determine that yet, as I did not have your logs
>> at hand)
>
> Reading the emails from that specific night, I saw this message:
>
> The node 'proxmox01' failed and needs manual intervention.
>
> The PVE HA manager tries to fence it and recover the
> configured HA resources to a healthy node if possible.
>
> Current fence status: SUCCEED
> fencing: acknowledged - got agent lock for node 'proxmox01'
>
> This seems to suggest that the cluster is confident that the fencing
> succeeded. How does it determine that?

It got the other node's LRM agent lock through pmxcfs. The normal LRM cycle is:

0. startup
1. (re-)acquire the agent lock; if OK go to 2, else go to 4
2. do work (start, stop, migrate resources)
3. go to 1
4. no lock: if we held the lock once, stop watchdog updates, stop doing
   anything, and wait for either quorum again (<60s) or the watchdog to
   trigger (>=60s); if we never had the lock, just poll for it continuously

Locks can be held by only one node. If the CRM sees a node offline for >120
seconds (IIRC), it tries to acquire that node's lock; once it has it, it knows
that the HA stack on the other side cannot start any actions anymore - and if
your "unfreeze before watchdog enable" had not happened, the node would have
been fenced by the watchdog.

The lock and recovery action itself was not the direct root cause; as said,
the most I could take from the logs you sent was:

> ...
> So, the "unfreeze before the respective LRM got active+online with watchdog"
> seems to be the cause of the real wrong behavior here in your log. It allows
> the recovery to happen, as otherwise frozen services would not have been
> recovered (that mechanism exists exactly to avoid such issues during an
> upgrade, where one does not want to stop or migrate all HA VMs/CTs)

And as also said (see quote below), for more specific hints I need the raw
logs, unmerged and as untouched as possible.

On 6/13/19 6:29 PM, Thomas Lamprecht wrote:
> While you interpolated the different logs into a single time-line, it does
> not seem to match everywhere. For my better understanding, could you please
> send me:
>
> * corosync.conf
> * the journal or syslog of proxmox01 and proxmox03 around "Jun 12 01:38:16"
>   plus/minus ~5 minutes, please in separate files, no interpolation and as
>   unredacted as possible
> * information on whether you have a HW watchdog or use the Linux soft-dog
>
> that would be appreciated.
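The LRM cycle Thomas describes (steps 0-4 above) can be sketched as a small state machine. This is an illustrative Python sketch, not the actual ha-manager code (which is Perl); the `FakeCluster` and `Lrm` classes and all names are invented, and real timing and quorum handling are omitted.

```python
# Toy model of the LRM agent-lock cycle from the mail above. pmxcfs is
# replaced by a dict of lock holders; the watchdog is just a flag.

class FakeCluster:
    """Stand-in for pmxcfs: each agent lock has at most one holder."""
    def __init__(self):
        self._locks = {}

    def try_acquire(self, lock: str, who: str) -> bool:
        holder = self._locks.setdefault(lock, who)
        return holder == who

class Lrm:
    """One node's LRM, following steps 1-4 described above."""
    def __init__(self, cluster: FakeCluster, node: str):
        self.cluster, self.node = cluster, node
        self.had_lock = False
        self.watchdog_active = False

    def tick(self) -> str:
        if self.cluster.try_acquire(f"agent-{self.node}", self.node):
            self.had_lock = True
            self.watchdog_active = True
            return "do-work"              # step 2: start/stop/migrate
        if self.had_lock:
            # step 4: lock lost after we once held it: stop watchdog
            # updates and wait for quorum (<60s) or fencing (>=60s)
            self.watchdog_active = False
            return "wait-for-quorum-or-fence"
        return "poll-for-lock"            # never held it: keep polling

cluster = FakeCluster()
lrm = Lrm(cluster, "proxmox01")
print(lrm.tick())                         # "do-work": lock acquired

# Simulate the CRM taking over the agent lock after seeing the node
# offline for >120s; once it holds the lock, this LRM can no longer act.
cluster._locks["agent-proxmox01"] = "crm-master"
print(lrm.tick())                         # "wait-for-quorum-or-fence"
```

The point the model makes is the one in the mail: because a lock has exactly one holder, the CRM owning a node's agent lock proves the remote LRM cannot start any further actions, which is why "got agent lock for node 'proxmox01'" is reported as a successful fence.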
Re: [PVE-User] pve-firewall, clustering and HA gone bad
On Thu, Jun 13, 2019 at 12:34:28PM +0200, Thomas Lamprecht wrote:
>> 2: ha-manager should not be able to start the VMs when they are running
>> elsewhere
>
> This can only happen if fencing fails, and that fencing works is always
> a base assumption we must make (as otherwise no HA is possible at all).
> So it would be interesting why fencing did not work here (see below for
> the reason I could not determine that yet, as I did not have your logs
> at hand)

Reading the emails from that specific night, I saw this message:

The node 'proxmox01' failed and needs manual intervention.

The PVE HA manager tries to fence it and recover the
configured HA resources to a healthy node if possible.

Current fence status: SUCCEED
fencing: acknowledged - got agent lock for node 'proxmox01'

This seems to suggest that the cluster is confident that the fencing
succeeded. How does it determine that?

--
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl