Re: [libvirt-users] domains paused without any obvious reason

2019-05-14 Thread Lentes, Bernd


- On May 14, 2019, at 11:08 AM, Daniel P. Berrangé berra...@redhat.com wrote:

> 
> 'virsh domstate --reason $GUEST'
> 
> will tell you what event caused the guest to pause in the first place.
> 
> If you can resume successfully, this indicates the event was a transient
> problem. Given the domblkerror message 'no space', it looks like
> you had a problem running out of disk space temporarily which then
> resolved itself.
> 
> Regards,
> Daniel


Hi,

I have a clue what happened.
The script shuts down the domains, snapshots them, restarts them, and then
copies the backing files to a CIFS server. After the copy is done (which
takes several hours), the domains are blockcommitted.
Finally the script deletes the local snapshot files. I think the snapshot
files got too big, because the logical volume for them has just 20 GB and
I'm currently snapshotting 8 domains.
The limit of the LV was reached, and because the snapshot files were deleted
at the end, I didn't see that.
From now on my script will monitor the LV holding the snapshot files to see
how big they grow.
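
A minimal sketch of such a check, assuming the snapshot LV carries a mounted
filesystem and the backup script can call a small Python helper. The function
names and the 80 percent threshold are made up for illustration; they are not
taken from the actual script. The arithmetic helper just spells out how little
room each overlay has: 20 GB shared by 8 domains is 2.5 GB per domain on
average for all writes made during the hours-long copy.

```python
import shutil


def per_domain_budget_gb(lv_size_gb, domain_count):
    """Average space available to each domain's snapshot overlay."""
    return lv_size_gb / domain_count


def usage_percent(path):
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo filesystems report a size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total


def check_snapshot_space(path, warn_at=80.0):
    """Return (percent_used, warn) for the filesystem holding `path`.

    `warn_at` is an arbitrary example threshold, not a recommended value.
    """
    pct = usage_percent(path)
    return pct, pct >= warn_at
```

Called periodically while the copy to the CIFS server runs, for example as
`check_snapshot_space("/path/to/snapshot/dir")` with the real mount point
filled in, this would show the overlays growing before the LV fills up.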

Bernd
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Stellv. Aufsichtsratsvorsitzender: MinDirig. Dr. Manfred Wolter
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, 
Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671


___
libvirt-users mailing list
libvirt-users@redhat.com
https://www.redhat.com/mailman/listinfo/libvirt-users

Re: [libvirt-users] domains paused without any obvious reason

2019-05-14 Thread Daniel P. Berrangé
On Mon, May 13, 2019 at 06:19:05PM +0200, Lentes, Bernd wrote:
> I resumed one of the guests and it continued without any problem.
> The log doesn't indicate any problem, and df -h shows enough space on
> all partitions.

'virsh domstate --reason $GUEST'

will tell you what event caused the guest to pause in the first place.

If you can resume successfully, this indicates the event was a transient
problem. Given the domblkerror message 'no space', it looks like
you had a problem running out of disk space temporarily which then
resolved itself.
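
For scripting this check across several guests, a rough sketch follows. It
assumes virsh prints the state followed by the reason in parentheses, and that
the reason for an I/O error is worded in English as something like
"I/O error" or "ioerror"; the exact wording may differ by libvirt version or
locale, so treat that string as an assumption. The helper names are
hypothetical.

```python
import re
import subprocess


def parse_domstate(output):
    """Split a 'state (reason)' line into a (state, reason) tuple.

    If no parenthesised reason is present, reason is None.
    """
    text = output.strip()
    match = re.fullmatch(r"(?P<state>.+?)\s*\((?P<reason>.*)\)", text)
    if match:
        return match.group("state"), match.group("reason")
    return text, None


def paused_on_io_error(output):
    """True if the parsed line says the guest is paused due to an I/O error.

    Assumption: the reason text reduces to 'ioerror' once lowercased and
    stripped of everything except letters.
    """
    state, reason = parse_domstate(output)
    if state != "paused" or reason is None:
        return False
    return re.sub(r"[^a-z]", "", reason.lower()) == "ioerror"


def domstate_reason(guest):
    """Run `virsh domstate --reason GUEST` and return the parsed tuple."""
    result = subprocess.run(
        ["virsh", "domstate", "--reason", guest],
        capture_output=True, text=True, check=True,
    )
    return parse_domstate(result.stdout)
```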

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



Re: [libvirt-users] domains paused without any obvious reason

2019-05-13 Thread Lentes, Bernd




I resumed one of the guests and it continued without any problem.
The log doesn't indicate any problem, and df -h shows enough space on
all partitions.


Bernd
 



[libvirt-users] domains paused without any obvious reason

2019-05-13 Thread Lentes, Bernd
Hi,

I have a two-node HA cluster with several domains as resources.
It is currently running in test mode.
Some domains (all on the same host) stopped running; virsh list shows them
as "paused".
All of them stopped at the same time (May 11, 7:00 am), and my monitoring
system began to yell.
I don't have any clue why this happened.
virsh domblkerror reports "no space" for all of the affected domains (5).
In the days before, the domains were running fine, and I know that all
disks inside the domains should have enough space.
The host is not running out of space either.
The logs don't say anything useful. Unfortunately I didn't have a log for
the libvirtd daemon; I have only just configured that.
Each day at 10:30 pm a cron job stops the domains for a short moment, takes
a snapshot, and starts the domains again. The backing file is then copied to
a CIFS server, and once that has finished the snapshot is blockcommitted
into the backing file.
That has been working fine for several days. The cron job writes a log, and
it looks fine.
The domains reside on bare logical volumes, and the respective volume group
has enough space.
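
For readers who want to picture the nightly job, here is one plausible command
sequence for a single domain, written as a dry-run plan that only builds the
argument lists and executes nothing. The actual script was not posted, so
everything here is an assumption: the function name, the snapshot name
"backup", the use of dd for the copy, and all paths and disk targets passed in
by the caller. The virsh subcommands and options used do exist, but whether
the real job uses them this way is unknown.

```python
def nightly_backup_plan(domain, disk, backing_lv, snap_file, backup_target):
    """Return the per-domain command sequence as a list of argv lists.

    Nothing is executed. A real script would also have to wait for the
    domain to reach the shut off state after `virsh shutdown`.
    """
    return [
        # 1. Stop the guest so the backing volume is consistent.
        ["virsh", "shutdown", domain],
        # 2. Create an external disk-only snapshot; new writes go to snap_file.
        ["virsh", "snapshot-create-as", domain, "backup",
         "--disk-only", "--no-metadata",
         "--diskspec", f"{disk},snapshot=external,file={snap_file}"],
        # 3. Start the guest again; it now runs on top of the overlay.
        ["virsh", "start", domain],
        # 4. Copy the now read-only backing volume to the CIFS mount.
        ["dd", f"if={backing_lv}", f"of={backup_target}", "bs=4M"],
        # 5. Merge the overlay back into the backing volume and pivot to it.
        ["virsh", "blockcommit", domain, disk,
         "--active", "--pivot", "--wait"],
        # 6. Remove the overlay file that is no longer in use.
        ["rm", "-f", snap_file],
    ]
```

Every write the guest makes between step 2 and step 5 lands in the overlay
file, so the overlay keeps growing for as long as the copy in step 4 takes.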


Bernd


-- 

Bernd Lentes
System Administration
Institut für Entwicklungsgenetik
Building 35.34 - Room 208
HelmholtzZentrum münchen
bernd.len...@helmholtz-muenchen.de
phone: +49 89 3187 1241
phone: +49 89 3187 3827
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/idg

Those who make mistakes can learn something;
those who do nothing can learn nothing either.
 
