Hello,

The need to monitor cumulative VM network usage has come up several times in the past. While this should be handled as part of https://bugzilla.redhat.com/show_bug.cgi?id=1063343, in the meantime I've written a small Python script that monitors those statistics, attached here.

The script polls the engine via RESTful API periodically and dumps the up-to-date total usage into a file. The output is a multi-level map/dictionary in JSON format, where:

* The top level keys are VM names.
* Under each VM, the next level keys are vNIC names.
* Under each vNIC, there are keys for total 'rx' (received) and 'tx' (transmitted), where the values are in Bytes.

The script is built to run forever. It may be stopped at any time, but while it's not running VM network usage data will "be lost". When it's re-run, it'll go back to accumulating data on top of its previous data.

A few disclaimers:

* I haven't tested this with any edge cases (engine service dies, etc.).
* Tested this with tens of VMs, not sure it'll work fine with hundreds.
* The PERIOD_TIME (polling interval) should be set so that it matches both the engine's and vdsm's polling interval (see comments inside the script), otherwise data will be either lost or counted multiple times. From 3.4 onwards, default configuration should be fine with 15 seconds.
* The precision of traffic measurement on a NIC is 0.1% of the interface's speed over each PERIOD_TIME interval. For example, on a 1Gbps vNIC, when PERIOD_TIME = 15s, data will only be measured in 15Mb (~2MB) quanta. Specifically what this means is, that in this example, any traffic smaller than 2MB over a 15-second period would be negligible and wouldn't be recorded.

Knock yourselves out :)
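
To make the precision disclaimer concrete, here is a minimal sketch of the arithmetic behind the "15Mb (~2MB)" figure. The function name and its parameters are hypothetical and are not part of the attached script; the 0.1% figure is taken from the disclaimer above.

```python
def measurement_quantum_bytes(link_speed_bps, period_seconds, precision=0.001):
    """Smallest amount of traffic (in bytes) that registers in one polling period.

    link_speed_bps: interface speed in bits per second
    period_seconds: polling interval (PERIOD_TIME in the script)
    precision:      measurement precision as a fraction of link speed (0.1%)
    """
    quantum_bits = link_speed_bps * precision * period_seconds
    return quantum_bits / 8.0  # 8 bits per byte


# 1 Gbps vNIC polled every 15 seconds: 15 Mb, i.e. a little under 2 MB
quantum = measurement_quantum_bytes(1e9, 15)
print("quantum: about %d bytes" % round(quantum))
```

Shortening the period shrinks the quantum proportionally, but the period still has to match the engine and vdsm polling intervals as described above.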

from json import load, dump
from os.path import exists
from ovirtsdk.api import API
from time import sleep, time

# Time in seconds between measurements. For data integrity's sake, it should
# match both the engine polling interval (determined by configuration values:
# NumberVmRefreshesBeforeSave * VdsRefreshRate) and the vdsm polling interval
# (vm_sample_net_interval in /etc/vdsm/vdsm.conf).
PERIOD_TIME = 15
ENGINE_URL = 'http://localhost:8080/api'
USERNAME = 'admin@internal'
PASSWORD = 'foo'
PATHNAME = 'traffic.txt'
RX_ENTRY = 'rx'
TX_ENTRY = 'tx'

api = API(url=ENGINE_URL, username=USERNAME, password=PASSWORD)


# loads the previously accumulated data, or starts empty if there is none
def deserialize():
    if not exists(PATHNAME):
        return {}
    with open(PATHNAME, 'r') as f:
        return load(f)


def serialize(traffic):
    with open(PATHNAME, 'w') as f:
        dump(traffic, f)


# updates nicEntry in place with the up-to-date cumulative NIC network usage
# nicEntry := {'rx' : totalRxInBytes, 'tx' : totalTxInBytes}
def updateNic(nic, nicEntry):
    rx = nicEntry.get(RX_ENTRY, 0)
    tx = nicEntry.get(TX_ENTRY, 0)
    for statistic in nic.statistics.list():
        # the reported value is scaled by the period to get this interval's total
        if statistic.get_name() == 'data.current.rx':
            rx += PERIOD_TIME * statistic.get_values().get_value().pop().get_datum()
        elif statistic.get_name() == 'data.current.tx':
            tx += PERIOD_TIME * statistic.get_values().get_value().pop().get_datum()
    nicEntry[RX_ENTRY] = rx
    nicEntry[TX_ENTRY] = tx


# updates vmEntry in place with the cumulative network usage for all the NICs of a VM
# vmEntry := {nicName1 : nicEntry, nicName2 : nicEntry, ...}
# see nicEntry format in updateNic()
def updateVm(vm, vmEntry):
    for nic in vm.nics.list():
        nicName = nic.get_name()
        nicEntry = vmEntry.get(nicName, {})
        updateNic(nic, nicEntry)
        vmEntry[nicName] = nicEntry


# updates the file with the cumulative network usage for all the NICs of all the VMs in the deployment
# traffic := {vmName1 : vmEntry, vmName2 : vmEntry, ...}
# see vmEntry format in updateVm()
def updateAllVms():
    traffic = deserialize()
    for vm in api.vms.list():
        vmName = vm.get_name()
        vmEntry = traffic.get(vmName, {})
        updateVm(vm, vmEntry)
        traffic[vmName] = vmEntry
    serialize(traffic)


try:
    while True:
        reference = time()
        # called directly: Thread.run() would execute in this thread anyway
        updateAllVms()
        overhead = time() - reference
        # polling takes a non-negligible amount of time, correct for it;
        # clamp at zero so sleep() never receives a negative value
        sleep(max(0, PERIOD_TIME - overhead))
finally:
    # reached when the script is interrupted or an error escapes the loop
    api.disconnect()
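
For anyone who wants to consume the output file, here is a minimal sketch of reading the JSON map described above and summing the usage per VM. The helper names and the sample VM/vNIC names are made up for illustration; only the file layout (VM name, then vNIC name, then 'rx'/'tx' in bytes) and the default 'traffic.txt' path come from the script.

```python
import json


def load_traffic(pathname='traffic.txt'):
    """Read the map written by the monitoring script."""
    with open(pathname) as f:
        return json.load(f)


def per_vm_totals(traffic):
    """Sum 'rx' and 'tx' bytes over all vNICs of each VM."""
    totals = {}
    for vm_name, vm_entry in traffic.items():
        totals[vm_name] = {
            'rx': sum(nic.get('rx', 0) for nic in vm_entry.values()),
            'tx': sum(nic.get('tx', 0) for nic in vm_entry.values()),
        }
    return totals


# Hypothetical sample in the same shape as the script's output file
sample = {
    'vm1': {'nic1': {'rx': 3000000, 'tx': 1000000},
            'nic2': {'rx': 500000, 'tx': 250000}},
    'vm2': {'nic1': {'rx': 0, 'tx': 0}},
}

for name, usage in sorted(per_vm_totals(sample).items()):
    print("%s: rx=%d bytes, tx=%d bytes" % (name, usage['rx'], usage['tx']))
```

With a real deployment, replace `sample` with `load_traffic()`. Keep in mind that the monitoring script rewrites the file every period, so a reader may occasionally catch it mid-write.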

