I'm a bit closer to solving the nbd overload problem. I have replicated
the problem, varying the conditions, and I'm confident, but not
certain ...

The problem seems to be that a window buffer on the host computer (the
one receiving the data and storing it in a remote swap file) fills up,
and from that point on, problems develop.

In the nbd protocol, the sending computer sends a command packet (write),
then sends the data (1024-4096 bytes), and then receives an
acknowledgement packet from the host computer.

The problem is that the sending computer does not wait for the
acknowledgement of one nbd packet before sending the next. It is possible
to send (say) ten nbd packets and only get the acknowledgement packets for
all ten after the last one has been received (if you artificially slow
down the server, that is).
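
To make the write-ahead behaviour concrete, here is a toy simulation of a
sender that is allowed a capped number of unacknowledged requests. It is
only a sketch: the function name, the tick model and the rates
(sends_per_tick, acks_per_tick) are made up for illustration and are not
taken from the nbd driver.

```python
from collections import deque


def simulate_sender(num_requests, max_in_flight, sends_per_tick=4, acks_per_tick=1):
    """Return the peak number of unacknowledged requests (hypothetical model)."""
    outstanding = deque()  # requests sent but not yet acknowledged
    next_id = 0
    acked = 0
    peak = 0
    while acked < num_requests:
        # The sender pushes out requests until it hits the in-flight cap.
        sent_this_tick = 0
        while (next_id < num_requests
               and sent_this_tick < sends_per_tick
               and len(outstanding) < max_in_flight):
            outstanding.append(next_id)
            next_id += 1
            sent_this_tick += 1
        peak = max(peak, len(outstanding))
        # The (slow) server acknowledges a few requests, oldest first.
        for _ in range(min(acks_per_tick, len(outstanding))):
            outstanding.popleft()
            acked += 1
    return peak
```

With a cap of 1 the sender behaves as a strict send-then-wait protocol;
with a very large cap the backlog grows for as long as the sender is
faster than the server, which is the situation that seems to be filling
the window buffer.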

At a finer level, ethernet packets are transmitted and acknowledged. Even
here, some "writing ahead" takes place. It's possible for up to six
ethernet packets to be transmitted before an (ethernet) acknowledgement
packet is returned.

The problem is that the receiving window fills up. I don't know if at the
nbd level, there is any fixed amount of "writing ahead"; if this could
be reduced, that would solve the problem. 

If, within the ethernet protocol, there is feedback based on the remaining
size of the window buffer, then the output rate could be throttled.

I suspect that this is what is _supposed_ to happen, and at some level
something is going wrong. Given it is quite an exotic, stressful
situation, this would not be surprising. The nbd driver could be
playing fast and loose with it too. Indeed, the nbd driver may not be
even worrying about the acknowledge packets ...

Anyway, quite apart from it being supposed to work, there may be a way
of limiting the write-ahead at the ethernet level or the tcp-ip level,
or delaying the ethernet acknowledgement of received packets to slow
the system down. Clearly I'd want to apply this judiciously, lest I
slow down the whole system.
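
One knob that does exist at the tcp-ip level is the socket receive buffer:
TCP advertises a receive window in its acknowledgements, and that window is
bounded by the receiving socket's buffer. A minimal sketch, assuming the
receiving program could be changed to create its listening socket this way
(the function name and the 8192-byte figure are only illustrative, and I
have not checked how the nbd server sets up its socket):

```python
import socket


def small_window_socket(rcvbuf_bytes=8192):
    """Create a TCP socket with a deliberately small receive buffer.

    Call this before bind()/listen(); on Linux the buffer size has to be
    set before listen() for accepted connections to pick it up.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request a small receive buffer. The kernel may round or adjust the
    # value (Linux doubles it and enforces a minimum).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return sock


sock = small_window_socket()
# Report what the kernel actually granted, which may differ from the request.
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.close()
```

Because this is set per socket, it would only slow down the one connection
rather than the whole system. Whether a smaller window actually cures the
overload is something I would still have to test.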

How would you do this?

Another option would be (I suspect) to use a DMA card - but even then,
this may still feed into a window buffer which would fill up.

Thanks for the suggestion, Chesty, I tried it but was not able to 
change the packet size. Oh well :(

Cheers,

-- 
John August

Some of us are paying for sins we have committed.
Others are paying for sins we still have to commit.

