Hi,

Nir has begun reviewing my draft patches about the thread pool and sampling 
refactoring (thanks!),
and already suggested quite some improvements which I'd like to summarize

Quick links to the ongoing discussion:
http://gerrit.ovirt.org/#/c/29191/8/lib/threadpool/worker.py,cm
http://gerrit.ovirt.org/#/c/29190/4/lib/threadpool/README.rst,cm

Quick summary of the discussion on gerrit so far:
1. extract the scheduling logic from the thread pool. Either add a separate 
scheduler class
   or let the sampling task reschedule themselves after a succesfull completion.
   In any way the concept of  'periodic task', and the added complexity, isn't 
needed.

2. drop all the *queue classes I've added, thus making the package simpler.
   They are no longer needed since we remove the concept of periodic task.

3. have per-task timeout, move the stuck task detection elsewhere, like in the 
worker thread, ot
   maybe better in the aforementioned scheduler.
   If the scheduler finds that any task started in the former pass (or even 
before!)
   has not yet completed, there is no point in keeping this task alive and it 
should be cancelled.

4. the sampling task (or maybe the scheduler) can be smarter and halting the 
sample in presence of
   not responding calls for a given VM, granted the VM reports its 
'health'/responsiveness.

(Hopefully I haven't forgot anything big)

In the draft currently published, I reluctantly added the *queue classes and I 
agree the periodic
task implementation is messy, so I'll be very happy to drop them.

However, a core question still holds: what to do in presence of the stuck task?

I think it is worth to discuss this topic on a medium friendlier than gerrit, 
as it is the single
most important decision to make in the sampling refactoring.

It all boils down to:
Should we just keep somewhere stuck threads and wait? Should we cancel stuck 
tasks?

A. Let's cancel the stuck tasks.
If we move toward a libvirt connection pool, and we give each worker thread in 
the sampling pool
a separate libvirt connection, hopefully read-only, then we should be able to 
cancel stuck task by
killing the worker's libvirt connection. We'll still need a (probably much 
simpler) watchman/supervisor,
but no big deal here. 
Libvirt allows to close a connection from a different thread.
I haven't actually tried to unstuck a blocked thread this way, but I have no 
reason to believe it
will not work.

B. Let's keep around blocked threads
The code as it is just leaves a blocked libvirt call and the worker thread that 
carried it frozen.
The stuck worker thread can be replaced up to a cap of frozen threads.
In this worst case scenario, we end up with one (blocked!) thread per VM, as it 
is today, and with
no sampling data.

I believe that #A has some drawbacks which we risk to overlook, and on the same 
time #B has some merits.

Let me explain:
The hardest case is a call blocked in the kernel in D state. Libvirt has no 
more room than VDSM
to unblock it; and libvirt itself *has* a pool of resources (threads in this 
case) which can be depleted
by stuck calls. Actually, retrying to do a failed task may deplete their pool 
even faster[1].

I'm not happy to just push this problem down the stack, as it looks to me that 
we gain
very little by doing so. VDSM itself surely stays cleaner, but the 
VDS/hypervisor hosts on the whole
improves just a bit: libvirt scales better, and that gives us some more room.

On the other hand, by avoiding to reissue dangerous calls, I believe we make 
better use of
the host resources in general. Actually, the point of keeping blocked thread 
around is a side effect
of not reattempting blocked calls. Moreover, to keep the blocked thread around 
has a significant
benefit: we can discover at the earliest moment when it is safe again to do the 
blocked call,
because the blocked call itself returns and we can track this event! (and of 
course drop the
now stale result). Otherwise, if we drop the connection, we'll lose this event 
and we have no
more option that trying again and hoping for the best[2]

I know the #B approach is not the cleanest, but I think it has slightly more 
appeal, especially
on the libvirt depletion front.

Thoughts and comments very welcome!

+++

[1] They have extensions to management API to dinamically adjust their thread 
pool and/or to cancel
tasks, but it is in the RHEL7.2 timeframe.
[2] A crazy idea would be to do something like 
http://en.wikipedia.org/wiki/Exponential_backoff
which I'm not sure would be beneficial

Bests and thanks,

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani
_______________________________________________
Devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/devel

Reply via email to