This came up while looking at an issue where starting 200 VMs at the same time takes hours, for a reason we have not yet identified. We thought about moving the collection of statistics outside vdsm.

It could help because stats collection runs in internal vdsm threads that can consume a significant amount of time. I'm not sure it would help with starting many VMs simultaneously, but it might improve vdsm's responsiveness.

Currently we start a thread for each VM and collect its stats at constant intervals. Having 200 such threads, each of which can take some time, must affect vdsm. For example, if we have connection errors to storage and cannot get a response, all 200 threads can get stuck and block other threads (GIL issue).
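To make the pattern concrete, here is a rough sketch of the thread-per-VM sampling described above. It is not the actual vdsm code: `VmStatsThread`, `start_stats_threads` and `sample_fn` are hypothetical names used only for illustration.

```python
import threading
import time


class VmStatsThread(threading.Thread):
    """One sampling thread per VM (hypothetical, not the real vdsm class)."""

    def __init__(self, vm_id, sample_fn, interval):
        super().__init__(name="stats-%s" % vm_id, daemon=True)
        self.vm_id = vm_id
        self._sample_fn = sample_fn
        self._interval = interval
        self._stop_event = threading.Event()
        self.samples = []

    def run(self):
        # Event.wait() returns True once stop() was called, which ends the loop.
        while not self._stop_event.wait(self._interval):
            # If sample_fn blocks (for example, storage is not responding),
            # this thread stays stuck here until the call returns.
            self.samples.append((time.time(), self._sample_fn(self.vm_id)))

    def stop(self):
        self._stop_event.set()


def start_stats_threads(vm_ids, sample_fn, interval=2.0):
    """Start one sampling thread per VM: 200 VMs means 200 threads."""
    threads = [VmStatsThread(vm_id, sample_fn, interval) for vm_id in vm_ids]
    for t in threads:
        t.start()
    return threads
```

With 200 VMs this creates 200 long-lived threads, and each of them can sit inside `sample_fn` for as long as the underlying call takes.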

I would like to know what you think about it, and whether you have a better solution that avoids starting so many threads. Is splitting vdsm a good idea here? At first glance, I think it can help, and it would be nice to have a vmStatisticService that runs separately and writes the VMs' status to a separate log.
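As one possible shape for such a service, here is a minimal sketch that samples all VMs through a bounded worker pool and writes to its own log file. Everything in it is an assumption for discussion: `make_stats_logger`, `collect_once`, `sample_fn` and the timeout handling are hypothetical and not existing vdsm code.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, wait


def make_stats_logger(path):
    """Logger that writes only to its own file, separate from the main log."""
    logger = logging.getLogger("vmStatisticService")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:
        logger.addHandler(logging.FileHandler(path))
    return logger


def collect_once(vm_ids, sample_fn, pool, logger, timeout):
    """Sample every VM once; return the ids that did not answer in time."""
    futures = {pool.submit(sample_fn, vm_id): vm_id for vm_id in vm_ids}
    done, not_done = wait(list(futures), timeout=timeout)
    for fut in done:
        vm_id = futures[fut]
        try:
            logger.info("%s %s", vm_id, fut.result())
        except Exception as e:
            # A failed sample is logged and does not stop the other VMs.
            logger.warning("%s sampling failed: %s", vm_id, e)
    return sorted(futures[fut] for fut in not_done)


def run_once(vm_ids, sample_fn, log_path, max_workers=10, timeout=5.0):
    """One collection cycle with at most max_workers sampling threads."""
    logger = make_stats_logger(log_path)
    pool = ThreadPoolExecutor(max_workers=max_workers)
    try:
        return collect_once(vm_ids, sample_fn, pool, logger, timeout)
    finally:
        # Do not wait here for samples that are still blocked.
        pool.shutdown(wait=False)
```

This bounds the number of threads, but a sample that never returns still occupies a worker, so it does not remove the blocking problem by itself. Running it as a separate process would be what actually takes it out of vdsm.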

The problem with this solution is that if those interval functions need to communicate with internal parts of vdsm, to set values or start internal processes when something has changed, then those flows depend on the stats function, and I'm not sure the stats function should control internal flows. Today we rely on this method to detect connectivity errors. We could add a polling mechanism for those cases instead (which could raise the same problems we are trying to deal with).

I would like to hear your ideas and comments. Thanks,

Yaniv Bronhaim.
Red Hat, Israel

vdsm-devel mailing list
