----- Original Message -----
> I have left this for a while without continuing because I had to focus
> on other things. However this is still in progress :-)
Are you writing patches? (If so, which solution are you pursuing?)

> On 03/13/2013 10:55 PM, Ayal Baron wrote:
> >
> > ----- Original Message -----
> >>
> >> ----- Original Message -----
> >>> From: "Ayal Baron" <[email protected]>
> >>> To: "Saggi Mizrahi" <[email protected]>
> >>> Cc: [email protected], [email protected],
> >>> "Vinzenz Feenstra" <[email protected]>
> >>> Sent: Wednesday, March 13, 2013 5:39:24 PM
> >>> Subject: Re: [vdsm] [Engine-devel] Proposal VDSM <=> Engine Data
> >>> Statistics Retrieval Optimization
> >>>
> >>> ----- Original Message -----
> >>>> I am completely against this.
> >>>> It makes the return value differ according to the input, which
> >>>> is a big no-no when talking about type-safe APIs.
> >>>>
> >>>> The only reason we have this problem is because there is this
> >>>> thing against making multiple calls.
> Which is totally counterproductive, because multiple calls, if properly
> split up, will actually lead to less data being sent for the frequently
> needed calls. The others can be triggered only when necessary.
> >>>>
> >>>> Just split it up.
> >>>> getVmRuntimeStats() - transient things like mem and cpu%
> >>>> getVmInformation() - (semi)static things like disk\networking
> >>>> layout etc.
> >>>> Each updated at different intervals.
> >>> +1 on splitting the data up into 2 separate API calls.
> >>> You could potentially add a checksum (md5, or any other way) of the
> >>> "static" data to getVmRuntimeStats and not even bother polling
> >>> VmInformation if it hasn't changed. Then you could poll the stats as
> >>> often as you'd like and immediately see whether you also need
> >>> to retrieve VmInfo or not (you rarely would).
> >> +1 to Ayal's suggestion,
> >> except that instead of the engine hashing the data, VDSM sends a
> >> key which is opaque to the engine.
> >> This can be a local timestamp or a generation number.
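The opaque-key idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not VDSM code: FakeVdsm, EnginePoller, static_key() and the "infoKey" field name are hypothetical, and SHA-256 stands in for the "md5, or any other way" checksum mentioned in the thread. Only the call names getVmRuntimeStats() and getVmInformation() come from the discussion.

```python
import hashlib
import json


def static_key(static_info):
    """Return an opaque key for the (semi)static part of a VM's data.

    Hypothetical helper: the engine never interprets the key, it only
    compares it with the key it saw on the previous poll.
    """
    payload = json.dumps(static_info, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class FakeVdsm:
    """Stand-in for VDSM exposing the two calls suggested in the thread."""

    def __init__(self, runtime, static_info):
        self.runtime = runtime
        self.static_info = static_info
        self.info_calls = 0  # counts how often the expensive call is made

    def getVmRuntimeStats(self):
        stats = dict(self.runtime)
        # "infoKey" is an assumed field name for the opaque key.
        stats["infoKey"] = static_key(self.static_info)
        return stats

    def getVmInformation(self):
        self.info_calls += 1
        return dict(self.static_info)


class EnginePoller:
    """Polls runtime stats often; fetches VM information only on change."""

    def __init__(self, vdsm):
        self.vdsm = vdsm
        self.last_key = None
        self.info = None

    def poll(self):
        stats = self.vdsm.getVmRuntimeStats()
        if stats["infoKey"] != self.last_key:
            # The key differs from the last one seen: refresh static data.
            self.info = self.vdsm.getVmInformation()
            self.last_key = stats["infoKey"]
        return stats
```

With this shape the engine can poll the runtime stats as often as it likes, and the second call is issued only when the key changes, which should be rare.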
> > Of course vdsm does the hash, otherwise you'd need to pass all the data to
> > engine, which would defeat the purpose.
> We need the hash if we can't have dynamic content. Generation numbers
> aren't really helpful, as every call aggregates the statistics data
> anew, at the moment at least.
> >> But we might want to consider that when we add events, polling
> >> becomes (much) less frequent, so maybe it'll be overkill.
> > You'd still need to compare versions of the data in vdsm and send only if
> > it changed. If you don't persist what was received last, then potentially
> > you could have a Monday morning effect where upon system startup you'd
> > be sending everything. So I still think you'd want to have the hash.
> We already hash the XML and include it in the getStats response.
> Hashes should show enough difference.
>
> Now to the non-dynamic responses and 'type-safe' API: if we went for
> non-dynamic responses, we would for sure need 5 new API calls to achieve
> some gain in the amount of data sent.
>
> getAllVmRuntimeStats()
>    "Returns a map of vmId/data pairs for all VMs"
>    # Constantly changing data which is needed by the oVirt Engine, or
>    # data that changes so often that it does not make sense to place
>    # it anywhere else
>    {
>        vmId: {
>            cpuSys  --> Could potentially be summarized
>            cpuUser -/
>            memUsage,
>            elapsedTime,
>            status,
>            statsAge,
>
>            hashes = {
>                conf,          # Hash of the XML (this one is called
>                               # "hash" in getStats())
>                info,          # Hash of the semi-static items
>                statusHash,    # Hash of items which are likely to
>                               # change, however not that often
>                guestDetails,  # Hash of the guest details
>                               # (applicationList, network information)
>            }
>        }
>    }
>
> getVmStatuses([vmId1, vmId2, ...])
>    "Returns a vmId/data pair for each VM requested"
>    # This data does not change that often and can be retrieved on
>    # demand once the hash changes
>    return {
>        vmId: {
>            timeOffset,
>            monitorResponse,
>            clientIp,
>            lastLogin,
>            username,
>            session,
>            guestIPs,
>        }
>    }
>
> getAllVmDeviceStatistics()
>    "Returns a vmId/data pair for all VMs"
>    # This data has to be requested all the time, however at longer
>    # intervals (e.g. every 5 minutes), and it is usually needed for
>    # all the VMs anyway
>    return {
>        vmId: {
>            network,
>            disksUsage,  # Might be improved by summarizing?
>            disks,
>            balloonInfo,
>            memoryStats
>        }
>    }
>
> getVmInfo([vmId1, vmId2, ...])
>    "Returns a vmId/data pair for each VM requested"
>    # Basically this should be almost constant, except if there have
>    # been changes like migrations, pausing, errors etc.
>    return {
>        vmId: {
>            acpiEnable,
>            vmType,
>            guestName,
>            guestOS,
>            kvmEnable,
>            pauseCode,
>            displayIp,
>            displayPort,
>            displaySecurePort,
>            pid,
>        }
>    }
>
> getVmGuestDetails([vmId1, vmId2, ...])
>    # Data which seldom changes; such changes are reflected in the
>    # hash, which signals when this needs to be requested.
>    # This data is only necessary when it has actually changed or
>    # needs to be refreshed for whatever reason.
>    return {
>        vmId: {
>            appsList,
>            netIfaces,
>        }
>    }
>
> >>>> ----- Original Message -----
> >>>>> From: "Vinzenz Feenstra" <[email protected]>
> >>>>> To: [email protected], [email protected]
> >>>>> Sent: Thursday, March 7, 2013 6:25:54 AM
> >>>>> Subject: [Engine-devel] Proposal VDSM <=> Engine Data Statistics
> >>>>> Retrieval Optimization
> >>>>>
> >>>>> Please find the prettier version on the wiki:
> >>>>> http://www.ovirt.org/Proposal_VDSM_-_Engine_Data_Statistics_Retrieval
> >>>>>
> >>>>> Proposal VDSM - Engine Data Statistics Retrieval
> >>>>> VDSM <=> Engine data retrieval optimization
> >>>>>
> >>>>> Motivation:
> >>>>>
> >>>>> Currently the RHEVM engine is polling a lot of data from VDSM
> >>>>> every 15 seconds. This should be optimized and the amount of data
> >>>>> requested should be more specific.
> >>>>>
> >>>>> For each VM the data currently contains much more information than
> >>>>> actually needed, which considerably inflates the size of the XML
> >>>>> content. We could optimize this by splitting the getVmStats reply
> >>>>> into sections, based on what the engine requests. For this reason
> >>>>> Omer Frenkel and I have split up the data into parts based on
> >>>>> their usage.
> >>>>>
> >>>>> This data can and usually does change during the lifetime of the
> >>>>> VM.
> >>>>>
> >>>>> Rarely Changed:
> >>>>>
> >>>>> This data does not change very frequently and it should be enough
> >>>>> to update it only once in a while. Most commonly this data changes
> >>>>> after changes made in the UI or after a migration of the VM to
> >>>>> another host.
> >>>>> Status = Running
> >>>>> acpiEnable = true
> >>>>> vmType = kvm
> >>>>> guestName = W864GUESTAGENTT
> >>>>> displayType = qxl
> >>>>> guestOs = Win 8
> >>>>> kvmEnable = true # this should be constant and never changed
> >>>>> pauseCode = NOERR
> >>>>> monitorResponse = 0
> >>>>> session = Locked # unused
> >>>>> netIfaces = [{'name': 'Realtek RTL8139C+ Fast Ethernet NIC',
> >>>>>     'inet6': ['fe80::490c:92bb:bbcc:9f87'],
> >>>>>     'inet': ['10.34.60.148'], 'hw': '00:1a:4a:22:3c:db'}]
> >>>>> appsList = ['RHEV-Tools 3.2.4', 'RHEV-Agent64 3.2.3',
> >>>>>     'RHEV-Serial64 3.2.3', 'RHEV-Network64 3.2.2',
> >>>>>     'RHEV-Network64 3.2.3', 'RHEV-Block64 3.2.3',
> >>>>>     'RHEV-Balloon64 3.2.3', 'RHEV-Balloon64 3.2.2',
> >>>>>     'RHEV-Agent64 3.2.2', 'RHEV-USB 3.2.3', 'RHEV-Block64 3.2.2',
> >>>>>     'RHEV-Serial64 3.2.2']
> >>>>> pid = 11314
> >>>>> guestIPs = 10.34.60.148 # duplicated info
> >>>>> displayIp = 0
> >>>>> displayPort = 5902
> >>>>> displaySecurePort = 5903
> >>>>> username = user@W864GUESTAGENTT
> >>>>> clientIp =
> >>>>> lastLogin = 1361976900.67
> >>>>>
> >>>>> Often Changed:
> >>>>>
> >>>>> This data changes quite often; however, it is not necessary to
> >>>>> update it every 15 seconds. As this is cumulative data that
> >>>>> reflects the current status, it does not need to be snapshotted
> >>>>> every 15 seconds to retrieve statistics. The data can be retrieved
> >>>>> in much more generous time slices (e.g. every 5 minutes).
> >>>>>
> >>>>> network = {'vnet1': {'macAddr': '00:1a:4a:22:3c:db',
> >>>>>     'rxDropped': '0', 'txDropped': '0', 'rxErrors': '0',
> >>>>>     'txRate': '0.0', 'rxRate': '0.0', 'txErrors': '0',
> >>>>>     'state': 'unknown', 'speed': '100', 'name': 'vnet1'}}
> >>>>> disksUsage = [{'path': 'c:\\', 'total': '64055406592',
> >>>>>     'fs': 'NTFS', 'used': '19223846912'},
> >>>>>     {'path': 'd:\\', 'total': '3490912256', 'fs': 'UDF',
> >>>>>     'used': '3490912256'}]
> >>>>> timeOffset = 14422
> >>>>> elapsedTime = 68591
> >>>>> hash = 2335461227228498964
> >>>>> statsAge = 0.09 # unused
> >>>>>
> >>>>> Often Changed but unused:
> >>>>>
> >>>>> This data does not seem to be used in the engine at all. It is not
> >>>>> even used in the data warehouse.
> >>>>>
> >>>>> memoryStats = {'swap_out': '0', 'majflt': '0',
> >>>>>     'mem_free': '1466884', 'swap_in': '0', 'pageflt': '0',
> >>>>>     'mem_total': '2096736', 'mem_unused': '1466884'}
> >>>>> balloonInfo = {'balloon_max': 2097152, 'balloon_cur': 2097152}
> >>>>> disks = {'vda': {'readLatency': '0',
> >>>>>     'apparentsize': '64424509440', 'writeLatency': '1754496',
> >>>>>     'imageID': '28abb923-7b89-4638-84f8-1700f0b76482',
> >>>>>     'flushLatency': '156549', 'readRate': '0.00',
> >>>>>     'truesize': '18855059456', 'writeRate': '952.05'},
> >>>>>     'hdc': {'readLatency': '0', 'apparentsize': '0',
> >>>>>     'writeLatency': '0', 'flushLatency': '0', 'readRate': '0.00',
> >>>>>     'truesize': '0', 'writeRate': '0.00'}}
> >>>>>
> >>>>> Very frequent updates needed by webadmin portal:
> >>>>>
> >>>>> This data is mostly needed for the webadmin portal and might be
> >>>>> required to be updated quite often. An exception here is the
> >>>>> statsAge field, which seems to be unused by the Engine. This data
> >>>>> could be requested every 15 seconds to keep things as they are
> >>>>> now.
> >>>>> cpuSys = 2.32
> >>>>> cpuUser = 1.34
> >>>>> memUsage = 30
> >>>>>
> >>>>> Proposed Solution for VDSM & Engine:
> >>>>>
> >>>>> We will introduce new optional parameters to getVmStats,
> >>>>> getAllVmStats and list to allow a finer-grained specification of
> >>>>> the data which should be included.
> >>>>>
> >>>>> Parameter: statsType = <string> (getVmStats, getAllVmStats only)
> >>>>> Allowed values:
> >>>>>
> >>>>> * full (default to keep backwards compatibility)
> >>>>> * app-list (Just send the application list)
> >>>>> * rare (include everything from rarely changed to very frequent)
> >>>>> * often (include everything from often changed to very frequent)
> >>>>> * frequent (only send the very frequently changed items)
> >>>>>
> >>>>> Parameter: clientId = <string>
> >>>>> The client id is specified by the client and should be unique, but
> >>>>> used consistently across requests.
> >>>>>
> >>>>> Parameter: diff = <boolean>
> >>>>> In combination with the clientId, VDSM will send only the
> >>>>> differences to the previous request from the named clientId
> >>>>> (if diff=true).
> >>>>>
> >>>>> Additional Change:
> >>>>>
> >>>>> Besides the introduction of the new parameters for list,
> >>>>> getVmStats and getAllVmStats, it might make sense to include a
> >>>>> hash for the appList in the rarely changed section of the
> >>>>> response. This would allow the client to identify changes, so that
> >>>>> the complete appList has to be sent only if the hash known to the
> >>>>> client is outdated.
> >>>>>
> >>>>> Note: The appList (Application List) reported by the guest agent
> >>>>> could be fully implemented on request only, as long as the guest
> >>>>> agent installed supports this. As there seems to be a request to
> >>>>> have the complete list of installed applications on all guests,
> >>>>> this data could be quite extensive and a huge list.
> >>>>> On the other hand, this data is only rarely visible and therefore
> >>>>> it should not be requested all the time, but only on demand.
> >>>>>
> >>>>> Improvement of the Guest Agent:
> >>>>>
> >>>>> As part of the proposed solution it is necessary to improve the
> >>>>> guest agent as well. For the full application list a caching
> >>>>> system should be implemented which is fully reactive and does not,
> >>>>> for example, poll the application list all the time. The guest
> >>>>> can create a prepared data file containing all the data in the
> >>>>> JSON format (as used for the communication with VDSM via VIO), so
> >>>>> that it just has to read that file from disk and send it directly
> >>>>> to VDSM. However, it is quite possible that this list is too big
> >>>>> and it might have to be chunked into pieces (multiple messages,
> >>>>> which would then have to be supported by VDSM as well). The
> >>>>> solution for this is to make VDSM request this data, so that it
> >>>>> retrieves the necessary data on request only.
> >>>>>
> >>>>> --
> >>>>> Regards,
> >>>>>
> >>>>> Vinzenz Feenstra | Senior Software Engineer
> >>>>> RedHat Engineering Virtualization R & D
> >>>>> Phone: +420 532 294 625
> >>>>> IRC: vfeenstr or evilissimo
> >>>>>
> >>>>> Better technology. Faster innovation. Powered by community
> >>>>> collaboration.
> >>>>> See how it works at redhat.com
> >>>>> _______________________________________________
> >>>>> Engine-devel mailing list
> >>>>> [email protected]
> >>>>> http://lists.ovirt.org/mailman/listinfo/engine-devel
> >>>> _______________________________________________
> >>>> vdsm-devel mailing list
> >>>> [email protected]
> >>>> https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
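For reference, the clientId/diff parameters from the original proposal quoted in this thread could behave roughly as sketched below. This is a minimal illustration under assumptions, not the VDSM implementation: DiffingStatsServer and get_vm_stats() are hypothetical names, the stats are modelled as a flat dict, and keys that disappear between two requests are not reported.

```python
_MISSING = object()  # sentinel for "key was not in the previous reply"


class DiffingStatsServer:
    """Remembers the last stats seen per clientId, returns only changes."""

    def __init__(self):
        self._last_sent = {}  # clientId -> stats dict from the last request

    def get_vm_stats(self, current, client_id=None, diff=False):
        # Without diff=True (or without a known clientId) there is nothing
        # to compare against, so every key counts as changed (full reply).
        previous = self._last_sent.get(client_id, {}) if diff else {}
        if client_id is not None:
            self._last_sent[client_id] = dict(current)
        return {
            key: value
            for key, value in current.items()
            if previous.get(key, _MISSING) != value
        }
```

Note that this makes the shape of the reply depend on the request parameters, which is exactly the objection raised in the thread against this variant.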
