Thierry wrote:
On Thu, 04 Sep 2003 20:23:08 +0200, Peter Graf wrote:
> Hi,
>
> I made an experimental boost of QLwIP speed to the Ethernet maximum of 10
> Mbit/sec, which results in a massive number of calls to MT.SUSJB, MT.RELJB
> and MT.PRIOR, typically several thousand per second.
[snip]
Well, I guess the problem is that all three calls exit via the
scheduler (they are not atomic traps). My guess is that calling them in
rapid succession (more than once every 1/50th of a second) makes the job
re-enter the scheduler recursively and fill up the supervisor stack...
There can indeed be more than 20 calls per 1/50th of a second. I have no
idea how the recursion could arise, but your scenario would fit the
picture.
It might work under SMSQ/E (bigger stack, much better and faster scheduler),
but this is definitely not recommended under QDOS...
I'll have a look.
Plus, I'm a bit surprised that you are apparently using jobs to fetch the
data from the Ethernet card... It should be done via an interrupt handler
instead...
At first sight it looks that way, of course. QDOS/SMS reality is different,
though.
Actually, the best design would be to have the Q60 fast interrupt
handler fill a buffer, and a frame interrupt task move the data from
that buffer into a bigger one for your job to fetch it in big chunks...
Wrong.
1. TCP is not a linear flow of data in one direction, even if the purpose
is file transfer. QDOS (and likely SMSQ/E, too) is so primitive that an
interrupt service routine can _not_ trigger immediate rescheduling of jobs
after it has completed. The time until the next rescheduling can be 20 ms
(worst case), so the user job has to wait that long before it can process
the data. The effect is that the other TCP endpoint in the network has to
wait 20 ms + processing + transfer time until it can react to the response
packet. Given an MTU of 1460 bytes (about 1.5 KB), your interrupt-driven
approach cannot guarantee more than a throughput of 1.5 KB / 20 ms =
75 KB/s with TCP, even if the other endpoint needs zero time to process its
packets. (75 KB/s is not quite what I want.)
Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need
to poll the NIC all the time; a clever approach can give full TCP throughput
during network activity, but zero polling waste (except for a few tens of
instructions per 50 Hz tick) when the network is inactive. The details are
somewhat complex, but as long as the OS isn't changed, I have no better
choice.
2. Your second level of copying wastes response (and processor) time.
Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying
about 1 MB every second _does_ matter.
3. The idea of collecting fragments into larger buffers is not feasible,
unless you implement the TCP/IP stack itself within ISRs. (There are good
reasons not to do that!)
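
To make point 1 concrete, here is a rough Python sketch. It is not QLwIP
code and not the actual polling scheme (whose details are more complex); it
only reproduces the 75 KB/s arithmetic and simulates one possible "poll
fast while busy, sleep until the 50 Hz tick while idle" job. All names
(throughput_bound, simulate_polling_job, idle_limit) and the 1 ms time step
are assumptions made for illustration, and KB is taken as 1000 bytes.

```python
FRAME_MS = 20  # one 50 Hz frame tick, the worst-case rescheduling delay


def throughput_bound(segment_bytes, resched_delay_ms=FRAME_MS):
    """Bytes/s if every segment has to wait one full rescheduling delay."""
    return segment_bytes * 1000 / resched_delay_ms


def simulate_polling_job(arrivals_ms, duration_ms, idle_limit=5):
    """Toy model of a polling job, stepped in 1 ms slots (an assumption).

    While 'active' the job polls in every slot (it reschedules itself
    immediately). After idle_limit empty polls in a row it goes idle and
    only polls once per frame tick. Returns (poll_count, latencies_ms).
    """
    pending = sorted(arrivals_ms)
    next_packet = 0
    active = False
    empty_polls = 0
    polls = 0
    latencies = []
    for now in range(duration_ms):
        if not active and now % FRAME_MS != 0:
            continue  # idle: sleep until the next frame tick
        polls += 1
        got_data = False
        while next_packet < len(pending) and pending[next_packet] <= now:
            latencies.append(now - pending[next_packet])
            next_packet += 1
            got_data = True
        if got_data:
            active = True      # traffic seen: keep polling every slot
            empty_polls = 0
        else:
            empty_polls += 1
            if empty_polls >= idle_limit:
                active = False  # quiet again: fall back to tick polling
    return polls, latencies


if __name__ == "__main__":
    print(throughput_bound(1500))  # the 1.5 KB / 20 ms figure, in bytes/s
    print(throughput_bound(1460))  # same bound with exactly 1460 bytes
    print(simulate_polling_job([], 1000))
    print(simulate_polling_job([5, 21, 23], 100))
```

In this toy model an idle network costs one poll per frame tick, the first
packet of a burst can still wait up to 20 ms, and the packets that follow
are picked up in the next slot. That is the behaviour a purely
tick-scheduled consumer cannot give.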
All the best
Peter