Re: [libvirt-users] domains paused without any obvious reason
On May 14, 2019, at 11:08, Daniel P. Berrangé berra...@redhat.com wrote:

> 'virsh domstate --reason $GUEST'
>
> will tell you what event caused the guest to pause in the first place.
>
> If you can resume successfully, this indicates the event was a transient
> problem. Given the domblkerror message 'no space', it looks like
> you had a problem running out of disk space temporarily, which then
> resolved itself.
>
> Regards,
> Daniel

Hi,

I have a clue what happened. The script shuts down the domains, snapshots them, restarts them, and then copies the backing files to a CIFS server. After the copy is done (which takes several hours), the domains are blockcommitted. Finally, the script deletes the local snapshot files.

I think the snapshot files got too big: the logical volume that holds them has just 20 GB, and I am currently snapshotting 8 domains, so the limit of the LV was reached. Because the script deleted the snapshot files at the end, I didn't see that.

I will now monitor the LV for the snapshot files in my script to see how big they grow.

Bernd

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Deputy Chairman of the Supervisory Board: MinDirig. Dr. Manfred Wolter
Management: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin Guenther
Register court: Amtsgericht Muenchen HRB 6466
VAT ID: DE 129521671
___
libvirt-users mailing list
libvirt-users@redhat.com
https://www.redhat.com/mailman/listinfo/libvirt-users
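The monitoring step mentioned in the message above could be sketched roughly as follows. This is only an illustration, not the actual backup script: it assumes the snapshot files live on a mounted filesystem on the 20 GB logical volume, and the directory `/var/lib/libvirt/snapshots`, the 80 % warning threshold, and all function names are hypothetical.

```python
import os
import shutil


def usage_percent(total, used):
    """Return used space as a percentage of total (0.0 if total is 0)."""
    return 100.0 * used / total if total else 0.0


def snapshot_sizes(directory):
    """Map each regular file in `directory` to its apparent size in bytes."""
    sizes = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
    return sizes


def check_snapshot_volume(directory, warn_percent=80.0):
    """Return (percent used, warning flag, per-file sizes) for the filesystem
    that holds `directory`. The threshold is an assumed value, not one taken
    from the thread."""
    usage = shutil.disk_usage(directory)
    percent = usage_percent(usage.total, usage.used)
    return percent, percent >= warn_percent, snapshot_sizes(directory)
```

Called periodically while the copy to the CIFS server is running, for example as `check_snapshot_volume("/var/lib/libvirt/snapshots")`, this would show how quickly the snapshot files grow before the blockcommit and whether the volume is about to fill up.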
Re: [libvirt-users] domains paused without any obvious reason
On Mon, May 13, 2019 at 06:19:05PM +0200, Lentes, Bernd wrote:
> On May 13, 2019, at 3:34 PM, Bernd Lentes
> bernd.len...@helmholtz-muenchen.de wrote:
>
> > Hi,
> >
> > I have a two-node HA cluster with several domains as resources.
> > Currently it's running in test mode.
> > Some domains (all on the same host) stopped running; virsh list shows
> > them as "paused".
> > All stopped at the same time (11th of May, 7:00 am), and my monitoring
> > system began to yell.
> > I don't have any clue why this happened.
> > virsh domblkerror says "no space" for all the domains (5). In the days
> > before, the domains were running fine, and I know that all disks inside
> > the domains should have enough space.
> > Also, the host is not running out of space.
> > The logs don't say anything useful. Unfortunately I didn't have a log
> > for the libvirtd daemon; I just configured that now.
> > The domains are stopped each day by cron at 10:30 pm for a short moment,
> > a snapshot is taken, the domains are started again, the backing file is
> > copied to a CIFS server, and once that is finished the snapshot is
> > blockcommitted into the backing file.
> > That has been working fine for several days. This cron job creates a
> > log, and it looks fine.
> > The domains reside in naked Logical Volumes; the respective Volume Group
> > has enough space.
>
> I resumed one of the guests and it continued without any problem.
> The log doesn't indicate any problem, and df -h shows enough space on
> all partitions.

'virsh domstate --reason $GUEST' will tell you what event caused the guest to pause in the first place.

If you can resume successfully, this indicates the event was a transient problem. Given the domblkerror message 'no space', it looks like you had a problem running out of disk space temporarily, which then resolved itself.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
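The check suggested in the message above can be scripted for all paused guests at once. The following is a minimal sketch, assuming `virsh` is on the PATH and the script runs with enough privileges to talk to libvirtd. The function names are hypothetical, and the parser only assumes that `virsh domstate --reason` prints the state followed by the reason in parentheses (something like `paused (I/O error)`); the exact reason strings may differ between libvirt versions.

```python
import re
import subprocess


def parse_domstate(output):
    """Split 'state (reason)' into a (state, reason) pair.

    If no parenthesised reason is present, the reason is an empty string.
    """
    text = output.strip()
    match = re.fullmatch(r"(.+?)\s*\((.*)\)", text)
    if match:
        return match.group(1), match.group(2)
    return text, ""


def run_virsh(*args):
    """Run virsh with the given arguments and return its standard output."""
    result = subprocess.run(
        ["virsh", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def paused_domains():
    """Return the names of all domains that libvirt reports as paused."""
    out = run_virsh("list", "--state-paused", "--name")
    return [line.strip() for line in out.splitlines() if line.strip()]


def report():
    """Print the pause reason and any block device errors per paused domain."""
    for name in paused_domains():
        state, reason = parse_domstate(run_virsh("domstate", "--reason", name))
        print(f"{name}: {state} ({reason})")
        print(run_virsh("domblkerror", name).strip())
```

Calling `report()` on the affected host right after the guests pause, before resuming them, records both the pause reason and the `domblkerror` output while the condition still exists.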
Re: [libvirt-users] domains paused without any obvious reason
On May 13, 2019, at 3:34 PM, Bernd Lentes bernd.len...@helmholtz-muenchen.de wrote:

> Hi,
>
> I have a two-node HA cluster with several domains as resources.
> Currently it's running in test mode.
> Some domains (all on the same host) stopped running; virsh list shows
> them as "paused".
> All stopped at the same time (11th of May, 7:00 am), and my monitoring
> system began to yell.
> I don't have any clue why this happened.
> virsh domblkerror says "no space" for all the domains (5). In the days
> before, the domains were running fine, and I know that all disks inside
> the domains should have enough space.
> Also, the host is not running out of space.
> The logs don't say anything useful. Unfortunately I didn't have a log
> for the libvirtd daemon; I just configured that now.
> The domains are stopped each day by cron at 10:30 pm for a short moment,
> a snapshot is taken, the domains are started again, the backing file is
> copied to a CIFS server, and once that is finished the snapshot is
> blockcommitted into the backing file.
> That has been working fine for several days. This cron job creates a
> log, and it looks fine.
> The domains reside in naked Logical Volumes; the respective Volume Group
> has enough space.

I resumed one of the guests and it continued without any problem. The log doesn't indicate any problem, and df -h shows enough space on all partitions.

Bernd
[libvirt-users] domains paused without any obvious reason
Hi,

I have a two-node HA cluster with several domains as resources. Currently it's running in test mode.

Some domains (all on the same host) stopped running; virsh list shows them as "paused". All stopped at the same time (11th of May, 7:00 am), and my monitoring system began to yell. I don't have any clue why this happened.

virsh domblkerror says "no space" for all the domains (5). In the days before, the domains were running fine, and I know that all disks inside the domains should have enough space. Also, the host is not running out of space.

The logs don't say anything useful. Unfortunately I didn't have a log for the libvirtd daemon; I just configured that now.

The domains are stopped each day by cron at 10:30 pm for a short moment, a snapshot is taken, the domains are started again, the backing file is copied to a CIFS server, and once that is finished the snapshot is blockcommitted into the backing file. That has been working fine for several days. This cron job creates a log, and it looks fine.

The domains reside in naked Logical Volumes; the respective Volume Group has enough space.

Bernd

--
Bernd Lentes
System Administration
Institut für Entwicklungsgenetik
Building 35.34 - Room 208
HelmholtzZentrum münchen
bernd.len...@helmholtz-muenchen.de
phone: +49 89 3187 1241
phone: +49 89 3187 3827
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/idg

Those who make mistakes can learn something;
those who do nothing can learn nothing.