...
> > In aeolus, condor is what's monitoring running instances and I thought
> > it did that be GETing each instance individually every 90 seconds
>
> No, condor does a batch update of every provider every 30 seconds. We can
> certainly tune that down, but the problem is that it is a tradeoff; the longer
> the timeout, the worse the user experience becomes in terms of updating the
> UI with the current state. Even with the 30 second timeout, we are getting
> complaints that we take "way longer" to show an instance going to running
> than the EC2 front-end.

We use a 'sliding window' approach in some places in rhev-m, from the user's perspective: if the user performed an action (or clicked refresh), we refresh more frequently for a few cycles. But that's UI --> Backend. Your use case is closer to backend --> vdsm, where we actually poll every 2 seconds. That 2-second poll is very light, though: only the list of VMs and their status. The heavier poll for all details/stats is not as frequent. So you should probably define the minimal data you need to poll more frequently and run the lighter query for it (the api has a 'detail' level, though probably not one that light yet).
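
To make the idea concrete, here is a rough sketch of how the two pieces could fit together: a short burst of fast polls after a user action, plus a cheap status-only query on most cycles with the full query only every Nth cycle. This is not how rhev-m or condor is actually implemented. The class name, `fast_cycles` and `heavy_every` are made up for illustration; only the 2 second and 30 second intervals come from the numbers mentioned in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class SlidingWindowPoller:
    """Hypothetical scheduler: poll fast for a few cycles after a user
    action, otherwise fall back to the slow interval."""

    fast_interval: float = 2.0    # seconds between polls while "hot"
    slow_interval: float = 30.0   # seconds between polls when idle
    fast_cycles: int = 5          # assumed: hot cycles granted per user action
    heavy_every: int = 10         # assumed: full details/stats every Nth cycle
    _hot_left: int = field(default=0, init=False)
    _cycle: int = field(default=0, init=False)

    def user_action(self) -> None:
        """Call when the user starts an instance, clicks refresh, etc."""
        self._hot_left = self.fast_cycles

    def next_poll(self) -> tuple[str, float]:
        """Return (kind, delay): which query to run now ("light" is the
        status-only list, "heavy" is full details), and how many seconds
        to wait before the next poll."""
        self._cycle += 1
        kind = "heavy" if self._cycle % self.heavy_every == 0 else "light"
        if self._hot_left > 0:
            self._hot_left -= 1
            return kind, self.fast_interval
        return kind, self.slow_interval
```

The caller would loop on `next_poll()`, run the light or heavy query it names, then sleep for the returned delay. The point of keeping the two decisions separate is that the fast window only ever multiplies the cost of the light query, so shortening it does not hammer the provider with full detail requests.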
