Nir, See attached the repoplot output. On Thu, Apr 14, 2016 at 12:18 PM, Nir Soffer <[email protected]> wrote:
> On Thu, Apr 14, 2016 at 12:02 PM, Fred Rolland <[email protected]> > wrote: > > From the log, we can see that the lvextend command took 18 sec, which is > > quite long. > > Fred, can you run repoplot on this log file? it will may explain why this > lvm > call took 18 seconds. > > Nir > > > > > 60decf0c-6d9a-4c3b-bee6-de9d2ff05e85::DEBUG::2016-04-13 > > 10:52:06,759::lvm::290::Storage.Misc.excCmd::(cmd) /usr/bin/taskset > > --cpu-list 0-23 /usr/bin/sudo -n /usr/sbin/lvm lvextend --config ' > devices { > > preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 > > write_cache_state=0 disable_after_error_count=3 filter = [ > > '\''a|/dev/mapper/36000eb3a4f1acbc20000000000000043|'\'', '\''r|.*|'\'' > ] } > > global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 > > use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' > --autobackup > > n --size 6016m > > 5de4a000-a9c4-489c-8eee-10368647c413/721d09bc-60e7-4310-9ba2-522d2a4b03d0 > > (cwd None) > > .... > > 60decf0c-6d9a-4c3b-bee6-de9d2ff05e85::DEBUG::2016-04-13 > > 10:52:22,217::lvm::290::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = ' > > WARNING: lvmetad is running but disabled. Restart lvmetad before enabling > > it!\n WARNING: This metadata update is NOT backed up\n'; <rc> = 0 > > > > > > The watermark can be configured by the following value: > > > > 'volume_utilization_percent', '50', > > 'Together with volume_utilization_chunk_mb, set the minimal free ' > > 'space before a thin provisioned block volume is extended. Use ' > > 'lower values to extend earlier.') > > > > On Thu, Apr 14, 2016 at 11:42 AM, Michal Skrivanek > > <[email protected]> wrote: > >> > >> > >> > On 14 Apr 2016, at 09:57, [email protected] wrote: > >> > > >> > Ok, that makes sense, thanks for the insight both Alex and Fred. I'm > >> > attaching the VDSM log of the SPM node at the time of the pause. I > couldn't > >> > find anything that would clearly identify the problem, but maybe > you'll be > >> > able to. > >> > >> In extreme conditions it will happen. When your storage is slow to > respond > >> to extension request, and when your write rate is very high then it may > >> happen, as it is happening to you, that you run out space sooner than > the > >> extension finishes. You can change the watermark value I guess(right, > >> Fred?), but better would be to plan a bit more ahead and either use > >> preallocated or create thin and then allocate expected size in advance > >> before the operation causing it (typically it only happens during > untarring > >> gigabytes of data, or huge database dump/restore) > >> Even then, the VM should always be automatially resumed once the disk > >> space is allocated > >> > >> Thanks, > >> michal > >> > >> > > >> > Thanks. > >> > > >> > Regards. > >> > > >> > El 2016-04-13 13:09, Fred Rolland escribió: > >> >> Hi, > >> >> Yes, just as Alex explained, if the disk has been created as thin > >> >> provisioning, the vdsm will extends once a watermark is reached. > >> >> Usually it should not get to the state the Vm is paused. > >> >> From the log, you can see that the request for extension has been > sent > >> >> before the VM got to the No Space Error. > >> >> Later, we can see the VM resuming. > >> >> INFO::2016-04-13 > >> >> 10:52:04,182::vm::1026::virt.vm::(extendDrivesIfNeeded) > >> >> vmId=`f9cd282e-110a-4896-98d3-6d320662744d`::Requesting extension for > >> >> volume > >> >> .... > >> >> INFO::2016-04-13 10:52:29,360::vm::3728::virt.vm::(onIOError) > >> >> vmId=`f9cd282e-110a-4896-98d3-6d320662744d`::abnormal vm stop device > >> >> virtio-disk0 error enospc > >> >> .... > >> >> INFO::2016-04-13 > 10:52:54,317::vm::5084::virt.vm::(_logGuestCpuStatus) > >> >> vmId=`f9cd282e-110a-4896-98d3-6d320662744d`::CPU running: onResume > >> >> Note that the extension is done on the SPM host, so it would be > >> >> interesting to see the vdsm log from the host that was in SPM role at > >> >> this timeframe. > >> >> Regards, > >> >> Fred > >> >> On Wed, Apr 13, 2016 at 2:43 PM, Alex Crow <[email protected]> > >> >> wrote: > >> >>> Hi, > >> >>> If you have set up VM disks as Thin Provisioned, the VM has to > >> >>> pause when the disk image needs to expand. You won't see this on VMs > >> >>> with preallocated storage. > >> >>> It's not the SAN that's running out of space, it's the VM image > >> >>> needing to be expanded incrementally each time. > >> >>> Cheers > >> >>> Alex > >> >>> On 13/04/16 12:04, [email protected] wrote: > >> >>> Hi Fred, > >> >>> This is an iSCSI storage. I'm attaching the VDSM logs from the host > >> >>> where this machine has been running. Should you need any further > >> >>> info, don't hesitate to ask. > >> >>> Thanks. > >> >>> Regards. > >> >>> El 2016-04-13 11:54, Fred Rolland escribió: > >> >>> Hi, > >> >>> What kind of storage do you have ? (ISCSI,FC,NFS...) > >> >>> Can you provide the vdsm logs from the host where this VM runs ? > >> >>> Thanks, > >> >>> Freddy > >> >>> On Wed, Apr 13, 2016 at 1:02 PM, <[email protected]> wrote: > >> >>> Hi, > >> >>> We're running oVirt 3.6.4.1-1. Lately we're seeing a bunch of > >> >>> events like these: > >> >>> 2016-04-13 10:52:30,735 INFO > >> >>> [org.ovirt.engine.core.vdsbroker.VmAnalyzer] > >> >>> (DefaultQuartzScheduler_Worker-86) [60dea18f] VM > >> >>> 'f9cd282e-110a-4896-98d3-6d320662744d'(vm.domain.com [1] [1]) moved > >> >>> from > >> >>> 'Up' --> 'Paused' > >> >>> 2016-04-13 10:52:30,815 INFO > >> >> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] > >> >>> (DefaultQuartzScheduler_Worker-86) [60dea18f] Correlation ID: null, > >> >>> Call Stack: null, Custom Event ID: -1, Message: VM vm.domain.com > >> >>> [1] [1] > >> >>> has been paused. > >> >>> 2016-04-13 10:52:30,898 ERROR > >> >> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] > >> >>> (DefaultQuartzScheduler_Worker-86) [60dea18f] Correlation ID: null, > >> >>> Call Stack: null, Custom Event ID: -1, Message: VM vm.domain.com > >> >>> [1] [1] > >> >>> has been paused due to no Storage space error. > >> >>> 2016-04-13 10:52:52,320 WARN > >> >>> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxyData] > >> >>> (org.ovirt.thread.pool-8-thread-38) [] domain > >> >>> '5de4a000-a9c4-489c-8eee-10368647c413:iscsi01' in problem. vds: > >> >>> 'host6.domain.com [2] [2]' > >> >>> 2016-04-13 10:52:55,183 INFO > >> >>> [org.ovirt.engine.core.vdsbroker.VmAnalyzer] > >> >>> (DefaultQuartzScheduler_Worker-70) [3da0f3d4] VM > >> >>> 'f9cd282e-110a-4896-98d3-6d320662744d'(vm.domain.com [1] [1]) moved > >> >>> from > >> >>> 'Paused' --> 'Up' > >> >>> 2016-04-13 10:52:55,318 INFO > >> >> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] > >> >>> (DefaultQuartzScheduler_Worker-70) [3da0f3d4] Correlation ID: null, > >> >>> Call Stack: null, Custom Event ID: -1, Message: VM vm.domain.com > >> >>> [1] [1] > >> >>> has recovered from paused back to up. > >> >>> The storage domain is far from being full, though (400+ G available > >> >>> right now). Could this be related to this other issue [1]? If not, > >> >>> how could I debug what's going on? > >> >>> Thanks. > >> >>> [1]: https://www.mail-archive.com/[email protected]/msg32079.html > >> >>> [3] > >> >>> [3] > >> >>> _______________________________________________ > >> >>> Users mailing list > >> >>> [email protected] > >> >>> http://lists.ovirt.org/mailman/listinfo/users [4] [4] > >> >>> Links: > >> >>> ------ > >> >>> [1] http://vm.domain.com [1] > >> >>> [2] http://host6.domain.com [2] > >> >>> [3] https://www.mail-archive.com/[email protected]/msg32079.html [3] > >> >>> [4] http://lists.ovirt.org/mailman/listinfo/users [4] > >> >> _______________________________________________ > >> >> Users mailing list > >> >> [email protected] > >> >> http://lists.ovirt.org/mailman/listinfo/users [4] > >> >> -- > >> >> This message is intended only for the addressee and may contain > >> >> confidential information. Unless you are that person, you may not > >> >> disclose its contents or use it in any way and are requested to > delete > >> >> the message along with any attachments and notify us immediately. > >> >> This email is not intended to, nor should it be taken to, constitute > >> >> advice. > >> >> The information provided is correct to our knowledge & belief and > must > >> >> not > >> >> be used as a substitute for obtaining tax, regulatory, investment, > >> >> legal or > >> >> any other appropriate advice. > >> >> "Transact" is operated by Integrated Financial Arrangements Ltd. > >> >> 29 Clement's Lane, London EC4N 7AE. Tel: (020) 7608 4900 Fax: (020) > >> >> 7608 5300. > >> >> (Registered office: as above; Registered in England and Wales under > >> >> number: 3727592). Authorised and regulated by the Financial Conduct > >> >> Authority (entered on the Financial Services Register; no. 190856). > >> >> _______________________________________________ > >> >> Users mailing list > >> >> [email protected] > >> >> http://lists.ovirt.org/mailman/listinfo/users [4] > >> >> Links: > >> >> ------ > >> >> [1] http://vm.domain.com > >> >> [2] http://host6.domain.com > >> >> [3] https://www.mail-archive.com/[email protected]/msg32079.html > >> >> [4] http://lists.ovirt.org/mailman/listinfo/users > >> >> _______________________________________________ > >> >> Users mailing list > >> >> [email protected] > >> >> http://lists.ovirt.org/mailman/listinfo/users > >> > <vdsm.log.gz>_______________________________________________ > >> > Users mailing list > >> > [email protected] > >> > http://lists.ovirt.org/mailman/listinfo/users > >> > > > > > > _______________________________________________ > > Users mailing list > > [email protected] > > http://lists.ovirt.org/mailman/listinfo/users > > >
vdsm.log.pdf
Description: Adobe PDF document
_______________________________________________ Users mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/users

