Re: [PVE-User] pve-csync version of pve-zsync?

2019-06-25 Thread Fabrizio Cuseo
Hi.
Is there any news regarding two-cluster replication? I would like to use it
in a DR scenario, where a big cluster can have some VMs replicated to a
small one (separate, and possibly in a different datacenter).

Something included in the Proxmox GUI would be really great and would add a
lot of value to Proxmox itself.

Regards, Fabrizio Cuseo


- On 13 Mar 2018, at 19:32, Alexandre DERUMIER aderum...@odiso.com wrote:

> Hi,
> 
> I have plans to implement storage replication for rbd in Proxmox,
> like for zfs export|import (with rbd export-diff | rbd import-diff).
> 
> I'll try to work on it next month.
> 
> I'm not sure whether a plugin infrastructure currently exists in the code,
> or whether it can manage storages with different names.
> 
> Can't tell if it'll be hard to implement, but the workflow is almost the same.
> 
> I'll also try to look at rbd mirror, but it only works with librbd in qemu,
> not with krbd, so it can't be implemented for containers.
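A minimal dry-run sketch of one such incremental cycle with rbd export-diff | rbd import-diff (pool, image, remote host, and snapshot names are all hypothetical, and the commands are only printed, not executed):

```shell
#!/bin/sh
# Dry-run sketch of one incremental rbd replication cycle over ssh.
# All names below (pool, image, remote host, snapshots) are made up.
run() { echo "+ $*"; }   # print the command instead of executing it

POOL=rbd
IMAGE=vm-100-disk-0
REMOTE=root@dr-cluster
PREV=sync-1              # snapshot shipped in the previous cycle
NEXT=sync-2              # snapshot for this cycle

# 1. snapshot the source image
run rbd snap create "$POOL/$IMAGE@$NEXT"

# 2. ship only the delta between the two snapshots to the remote cluster
run "rbd export-diff --from-snap $PREV $POOL/$IMAGE@$NEXT - | ssh $REMOTE rbd import-diff - $POOL/$IMAGE"

# 3. drop the old snapshot once the delta is applied remotely
run rbd snap rm "$POOL/$IMAGE@$PREV"
```

Removing `run` (and the quotes around the pipeline) would execute the cycle for real; a tool like pve-zsync would essentially loop this per image on a schedule.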
> 
> 
> - Original message -
> From: "Mark Adams" 
> To: "proxmoxve" 
> Sent: Tuesday, 13 March 2018 18:52:21
> Subject: Re: [PVE-User] pve-csync version of pve-zsync?
> 
> Hi Alwin,
> 
> I might have to take another look at it, but have you actually done this
> with 2 Proxmox clusters? I can't remember the exact part I got stuck on, as
> it was quite a while ago, but it wasn't as straightforward as you suggest.
> I think you couldn't use the same cluster name, which in turn created
> issues using the "remote" (backup/DR/whatever you want to call it)
> cluster with Proxmox, because it needed to be called ceph.
> 
> The docs I was referring to were the Ceph ones, yes. Some of the options
> listed in that doc do not work in the current Proxmox version (I think the
> doc hasn't been updated for newer versions...)
> 
> Regards,
> Mark
> 
> On 13 March 2018 at 17:19, Alwin Antreich  wrote:
> 
>> On Mon, Mar 12, 2018 at 04:51:32PM +, Mark Adams wrote:
>> > Hi Alwin,
>> > 
>> > The last I looked at it, rbd mirror only worked if you had different
>> > cluster names. Tried to get it working with proxmox but to no avail,
>> > without really messing with how proxmox uses ceph I'm not sure it's
>> > feasible, as proxmox assumes the default cluster name for everything...
>> That isn't mentioned anywhere in the Ceph docs; they use two different
>> cluster names only for ease of explanation.
>> 
>> If you have a config file named after the cluster, then you can specify
>> it on the command line.
>> http://docs.ceph.com/docs/master/rados/configuration/
>> ceph-conf/#running-multiple-clusters
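For reference, with a second config file such as /etc/ceph/backup.conf, commands can be pointed at that cluster via --cluster. The cluster name "backup" below is hypothetical, and the commands are only printed, not executed:

```shell
#!/bin/sh
# Dry-run sketch: addressing a second cluster by its config file name.
# Assumes a hypothetical /etc/ceph/backup.conf; commands are printed, not run.
run() { echo "+ $*"; }

run ceph --cluster backup -s        # status of the "backup" cluster
run rbd --cluster backup ls rbd     # list images in its "rbd" pool
```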
>> 
>> > 
>> > Also the documentation was a bit poor for it IMO.
>> Which documentation do you mean?
>> ? -> http://docs.ceph.com/docs/master/rbd/rbd-mirroring/
>> 
>> > 
>> > Would also be nice to choose specifically which VM's you want to be
>> > mirroring, rather than the whole cluster.
>> It is done either per pool or image separately. See the link above.
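The pool-versus-image choice maps to commands roughly like this (dry-run sketch; pool and image names are hypothetical, and the commands are only printed):

```shell
#!/bin/sh
# Dry-run sketch of enabling rbd mirroring per pool or per image.
# Pool/image names are hypothetical; commands are printed, not executed.
run() { echo "+ $*"; }

# Mirror every image in the pool...
run rbd mirror pool enable rbd pool

# ...or switch the pool to image mode and pick images individually
# (journal-based mirroring needs the journaling feature on the image)
run rbd mirror pool enable rbd image
run rbd feature enable rbd/vm-100-disk-0 journaling
run rbd mirror image enable rbd/vm-100-disk-0
```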
>> 
>> > 
>> > I've manually done rbd export-diff and rbd import-diff between 2 separate
>> > proxmox clusters over ssh, and it seems to work really well... It would
>> > just be nice to have a tool like pve-zsync so I don't have to write some
>> > script myself. Seems to me like something that would be desirable as part
>> > of proxmox as well?
>> That would basically implement the ceph rbd mirror feature.
>> 
>> > 
>> > Cheers,
>> > Mark
>> > 
>> > On 12 March 2018 at 16:37, Alwin Antreich 
>> wrote:
>> > 
>> > > Hi Mark,
>> > > 
>> > > On Mon, Mar 12, 2018 at 03:49:42PM +, Mark Adams wrote:
>> > > > Hi All,
>> > > > 
>> > > > Has anyone looked at or thought of making a version of pve-zsync for
>> > > ceph?
>> > > > 
>> > > > This would be great for DR scenarios...
>> > > > 
>> > > > How easy do you think this would be to do? I imagine it would be
>> > > > quite similar to pve-zsync, but using rbd export-diff and rbd
>> > > > import-diff instead of zfs send and zfs receive? So could the
>> > > > existing script be relatively easily modified? (I know nothing
>> > > > about Perl.)
>> > > > 
>> > > > Cheers,
>> > > > Mark
>> > > > ___
>> > > > pve-user mailing list
>> > > > pve-user@pve.proxmox.com
>> > > > https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>> > > Isn't Ceph mirror already what you want? It can mirror an image or a
>> > > whole pool. It keeps track of changes and handles remote image deletes
>> > > (with an adjustable delay).
>> > > 
>> 


Re: [PVE-User] pve-firewall, clustering and HA gone bad

2019-06-25 Thread Mark Schouten

np!

--

Mark Schouten 

Tuxis, Ede, https://www.tuxis.nl

T: +31 318 200208 
 



- Original message -


From: Thomas Lamprecht (t.lampre...@proxmox.com)
Date: 25-06-2019 10:31
To: PVE User List (pve-user@pve.proxmox.com), Mark Schouten (m...@tuxis.nl)
Subject: Re: [PVE-User] pve-firewall, clustering and HA gone bad


On 6/25/19 9:44 AM, Thomas Lamprecht wrote:
> And as also said (see the quote below), for more specific hints I need the
> raw logs, unmerged and as untouched as possible.

It may just be that I did not see the mail in my inbox, so it looks like
you already sent it to me; sorry about missing it.





Re: [PVE-User] pve-firewall, clustering and HA gone bad

2019-06-25 Thread Thomas Lamprecht
On 6/25/19 9:44 AM, Thomas Lamprecht wrote:
> And as also said (see the quote below), for more specific hints I need the
> raw logs, unmerged and as untouched as possible.

It may just be that I did not see the mail in my inbox, so it looks like
you already sent it to me; sorry about missing it.



Re: [PVE-User] pve-firewall, clustering and HA gone bad

2019-06-25 Thread Thomas Lamprecht
On 6/25/19 9:10 AM, Mark Schouten wrote:
> On Thu, Jun 13, 2019 at 12:34:28PM +0200, Thomas Lamprecht wrote:
>>> 2: ha-manager should not be able to start the VM's when they are running
>>> elsewhere
>>
>> This can only happen if fencing fails, and that fencing works is always
>> a base assumption we must make (as otherwise no HA is possible at all).
>> So it would be interesting why fencing did not work here (see below
>> for the reason I could not determine that yet, as I did not have your
>> logs at hand)
> 
> Reading the emails from that specific night, I saw this message:
> 
>  The node 'proxmox01' failed and needs manual intervention.
> 
>  The PVE HA manager tries to fence it and recover the
>  configured HA resources to a healthy node if possible.
> 
>  Current fence status: SUCCEED
>  fencing: acknowledged - got agent lock for node 'proxmox01'
> 
> This seems to suggest that the cluster is confident that the fencing
> succeeded. How does it determine that?
> 

It got the other node's LRM agent lock through pmxcfs.

Normal LRM cycle is

0. startup
1. (re-)acquire agent lock, if OK go to 2, else to 4
2. do work (start, stop, migrate resources)
3. go to 1
4. no lock: if we ever had the lock, stop watchdog updates, stop doing
   anything, and wait either for quorum again (<60s) or for the watchdog to
   trigger (>=60s);
   if we never had the lock, just poll for it continuously

Locks can only be held by one node. If the CRM sees a node offline for >120
seconds (IIRC), it tries to acquire that node's lock; once it has it, it
knows that the HA stack on the other side cannot start any actions anymore,
and if your "unfreeze before watchdog enable" had not happened, the node
would have been fenced by the watchdog.

The lock and recovery action itself was not the direct root cause; as said,
the most I could take from the logs you sent was:
> ...
> So, the "unfreeze before the respective LRM got active+online with watchdog"
> seems to be the cause of the real wrong behavior here in your log: it allows
> the recovery to happen, as otherwise frozen services would not have been
> recovered (that mechanism exists exactly to avoid such issues during an
> upgrade, where one does not want to stop or migrate all HA VMs/CTs)

And as also said (see quote below), for more specific hinters I need the raw
logs, unmerged and as untouched as possible.

On 6/13/19 6:29 PM, Thomas Lamprecht wrote:
> While you interpolated the different logs into a single timeline, it does
> not seem to match everywhere. For my better understanding, could you please
> send me:
> 
> * corosync.conf
> * the journal or syslog of proxmox01 and proxmox03 around "Jun 12 01:38:16"
>   plus/minus ~ 5 minutes, please in separated files, no interpolation and as
>   unredacted as possible
> * information if you have a HW watchdog or use the Linux soft-dog
> 
> that would be appreciated.



Re: [PVE-User] pve-firewall, clustering and HA gone bad

2019-06-25 Thread Mark Schouten
On Thu, Jun 13, 2019 at 12:34:28PM +0200, Thomas Lamprecht wrote:
> > 2: ha-manager should not be able to start the VM's when they are running
> > elsewhere
> 
> This can only happen if fencing fails, and that fencing works is always
> a base assumption we must make (as otherwise no HA is possible at all).
> So it would be interesting why fencing did not work here (see below
> for the reason I could not determine that yet, as I did not have your
> logs at hand)

Reading the emails from that specific night, I saw this message:

 The node 'proxmox01' failed and needs manual intervention.

 The PVE HA manager tries to fence it and recover the
 configured HA resources to a healthy node if possible.

 Current fence status: SUCCEED
 fencing: acknowledged - got agent lock for node 'proxmox01'

This seems to suggest that the cluster is confident that the fencing
succeeded. How does it determine that?

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl