On Sat, Mar 28, 2020 at 7:34 PM Nir Soffer <nsof...@redhat.com> wrote:
> On Sat, Mar 28, 2020 at 5:00 AM Gianluca Cecchi
> <gianluca.cec...@gmail.com> wrote:
> ...
> > Further information.
> > What I see around that time frame in the gluster brick log file
> > gluster_bricks-vmstore-vmstore.log (the timestamp is 1 hour behind in the log file):
> >
> > [2020-03-27 23:30:38.575808] I [MSGID: 101055]
> > [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection
> > CTX_ID:6e8f70b8-1946-4505-860f-be90e5807cb3-GRAPH_ID:0-PID:223418-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0
> >
> > [2020-03-27 23:35:15.281449] E [MSGID: 113072]
> > [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed:
> > offset 0, [Invalid argument]
> >
> > [2020-03-27 23:35:15.281545] E [MSGID: 115067]
> > [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 34139378:
> > WRITEV 10 (00d9fe81-8a31-498e-8401-7b9d1477378e), client:
> > CTX_ID:d04437ba-ef98-43df-864f-5e9d3738620a-GRAPH_ID:0-PID:27687-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0,
> > error-xlator: vmstore-posix [Invalid argument]
> >
> > [2020-03-27 23:40:15.415794] E [MSGID: 113072]
> > [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed:
> > offset 0, [Invalid argument]
>
> Invalid arguments are expected when activating a storage domain, and
> every 5 minutes when storage domains are refreshed. The writes are
> performed to a temporary file at
> /rhev/data-center/mnt/server:_path/.prob-random-uuid
>
> These logs do not show the path, so we don't know if the writes are
> related to block size probing.
>
> But in the vdsm log we see:
>
> 2020-03-27 00:40:08,979+0100 INFO (monitor/665ff83)
> [storage.StorageDomain] Removing remnants of deleted images []
> (fileSD:726)
>
> This call happens when vdsm is refreshing the storage domain. Right before
> this log, vdsm tries to detect the underlying storage block size.
>
> So it looks like the gluster logs are related to block size probing,
> and are not related to the I/O error that caused the VM to pause.
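For context, the kind of block size probing described above can be sketched roughly as follows. This is a simplified, hypothetical illustration (not vdsm's actual code): it opens a temporary probe file with O_DIRECT and tries aligned writes of increasing size; an EINVAL ("Invalid argument", as in the brick log above) means that write size is not supported. Linux-only, since it relies on O_DIRECT.

```python
import mmap
import os
import uuid


def probe_block_size(mountpoint):
    """Guess the minimal direct-I/O block size of a filesystem by
    attempting aligned O_DIRECT writes, smallest first.

    Returns the first size that succeeds, or None if probing failed
    (e.g. the filesystem does not support O_DIRECT at all).
    Illustrative sketch only -- not vdsm's real implementation.
    """
    # Hypothetical probe file name, mimicking the .prob-<uuid> pattern
    # mentioned in the mail above.
    path = os.path.join(mountpoint, ".prob-%s" % uuid.uuid4())
    try:
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_DIRECT, 0o600)
    except OSError:
        return None  # open with O_DIRECT itself may fail (e.g. on tmpfs)
    try:
        for size in (512, 4096):
            # O_DIRECT requires an aligned buffer; an anonymous mmap
            # is page-aligned, which satisfies both sizes.
            buf = mmap.mmap(-1, size)
            try:
                os.write(fd, buf)
                return size
            except OSError:
                continue  # EINVAL: this size is not supported, try next
            finally:
                buf.close()
        return None
    finally:
        os.close(fd)
        os.unlink(path)
```

The point of the sketch is only that failed writes with "Invalid argument" are a normal side effect of this kind of probing, which is why they appear in the brick log every 5 minutes without indicating real I/O trouble.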
> [snip]
> Looking at both "abnormal vm stop" and storage refresh events:
> [snip]
> I don't see any relation between refreshes and the abnormal vm stop events.
>
> I think the key to understanding this is to enable more verbose logs
> in gluster, to understand what the failure that caused the vm to stop was.

Ah, ok. Thanks. It seems the default gluster log level is INFO; I can make
the logs more verbose for a limited amount of time and see if more
information is provided. Can I do that with the VMs running, only doing a
sort of reload of the service, or do I have to stop everything first?

> It would also help if we had detailed error logs in the qemu log in
> /var/log/libvirt/qemu/vm-name.log

I will look for them. The system is not available to check right now.

> Did you enable libvirt logs? We may have more information about the error
> there.
>
> You can enable logs by modifying these lines in /etc/libvirt/libvirtd.conf:
>
> log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
> log_outputs="1:file:/var/log/libvirt/libvirtd.log"
>
> And restart libvirt.
>
> Note that the libvirt log may be huge, so you need to watch it and change
> the log level or filter after you collect what you need.
>
> To log only warnings and errors use:
>
> log_outputs="3:file:/var/log/libvirt/libvirtd.log"
>
> Someone from gluster should help debugging this.
>
> Nir

Ok, I could also try this if enabling more verbose gluster logs is not
sufficient. Thanks for all the detailed explanations in the meantime.

Gianluca
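On the question of raising gluster verbosity without stopping the VMs: gluster volume options can normally be changed online with `gluster volume set`, so no restart of the volume or the VMs should be needed. A hedged sketch, assuming the volume is named `vmstore` as in the logs above (these are the standard gluster diagnostics options; verify the exact names on your version with `gluster volume set help`):

```shell
# Raise brick and client log verbosity online; takes effect without
# stopping the volume or the VMs using it.
gluster volume set vmstore diagnostics.brick-log-level DEBUG
gluster volume set vmstore diagnostics.client-log-level DEBUG

# ... reproduce the problem and collect the logs ...

# Revert to the default level afterwards, since DEBUG logs grow quickly:
gluster volume set vmstore diagnostics.brick-log-level INFO
gluster volume set vmstore diagnostics.client-log-level INFO
```

As with the libvirt logs mentioned below, it is worth keeping the window of DEBUG logging short and reverting as soon as the failure has been captured.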
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/7MOOUIVWUABS2IMM4BVP4BR7JZL2DTWW/