Thanks for the answer and the effort, but, without meaning to be rude, how does
that answer my question? :) So, let's start over: for personal reasons of my own
;), I need it done exactly the way I showed in the example, but it doesn't seem to work
and I don't know how to make it work. I'd appreciate a working example, or
a clear answer that such an approach isn't possible (which is also fine, btw). If
my suggested approach is not possible, I'd like to see a working
example of how to squeeze out the maximum (ideally using ALL available
resources) from the hardware.