Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread Richard Zidlicky

On Sun, Sep 07, 2003 at 10:48:50PM +0200, BRANE wrote:
 
 
 - Original Message - 
 From: Peter Graf [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Sent: Sunday, September 07, 2003 9:53 PM
 Subject: Re: [ql-developers] Massive amount of job state transitions and
 re-scheduling
 

  Simple example: A M$ or Unix machine sends a file to the QDOS machine via
  TCP. It will send one or two packets, then stop and wait for ACK. Further
  packets will only be sent after further ACKs. Your ACKs can only be
  generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz
  rhythm. (Or two-by-two, if you're lucky.)
 
 AFAIK with TCP/IP this is negotiable. There is no need for such a small
 window...

don't forget this is a rather simple TCP/IP implementation and apparently
it is already hard enough to make the simplest variant work reliably
with the garden variety of TCP/IP implementations out there.

Richard


Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread BRANE


 don't forget this is a rather simple TCP/IP implementation and apparently
 it is already hard enough to make the simplest variant work reliably
 with the garden variety of TCP/IP implementations out there.

 Richard

O.K. I haven't followed this thread from the beginning, so I don't know exactly
what hardware we are talking about, but for this detail it probably doesn't
matter much...

So, what is the solution? Using an external interrupt? Maybe a somewhat bulkier
controller with built-in Ethernet?






Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread pgraf

On 12 Sep 2003 at 13:00, BRANE wrote:

 O.K. I haven't followed this thread from the beginning, so I don't know exactly
 what hardware we are talking about, but for this detail it probably doesn't
 matter much...
 
 So, what is the solution? Using an external interrupt? Maybe a somewhat bulkier
 controller with built-in Ethernet?

1. My recent problem is a timing-related bug when the full 
datarate is used. It is yet unclear where it comes from, so the 
solution is also unknown. The lazy workaround is to poll the 
controller only every 20 ms. Using ISRs brings no advantage 
compared to this (due to a QDOS-specific shortcoming).

2. Unrelated to this problem, QDOS could use an improvement so 
ISRs can trigger immediate rescheduling of jobs. Given this 
improvement, it makes sense to implement the driver with ISRs in order to 
gain a cleaner driver structure. Whether or not this would have 
the side effect of curing the aforementioned bug is unclear.

A hardware change is not required.

All the best
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread P Witte

Peter Graf writes:

 Hi Per,

 And Peter, did you try out the suggestions that were made at that time?

 Can you be a bit more specific? I remember only one applicable suggestion,
 which was to set a system variable before leaving the ISR. Didn't work, at
 least not under QDOS.

# By exiting the interrupt handler through the sms.rte function the
# requested re-schedule will be done immediately if possible (i.e. no
# supervisor code was running at that time). Example:
#
# include dev8_smsq_smsq_basekeys
# include dev8_keys_psf
#
# int_handler
#         movem.l psf.reg,-(sp)
#
#         [blah]
#
#         st      sys_rshd(a6)    ; Request re-schedule
#         move.l  sms.rte,a5      ; ...now would be convenient
#         jmp     (a5)

etc, as per my mail to this list on 18/01/03

Per



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread P Witte

Richard Zidlicky writes:


  Plus, I'm a bit surprised that you are apparently using jobs to fetch the
  data from the ethernet card... It should be done via an interrupt handler
  instead... Actually, the best design would be to have the Q60 fast interrupt
  handler to fill a buffer, and a frame interrupt task to move the data from
  that buffer into a bigger one for your job to fetch it in big chunks...).

 this was discussed a while ago here, the big problem is that
 neither QDOS nor SMSQ will attempt to reschedule after interrupt
 handling and there is no way to deal with the complexities of the
 TCP/IP protocol inside the interrupt handler.
 That means sending of protocol replies would be very often delayed
 by 1/50s which would make especially TCP crawl..

The last words you wrote the last time we discussed this topic were:

 Otoh checking for sys_rschd after isr processing looks really trivial 
 and top priority now.

Did you ever get round to it?

And Peter, did you try out the suggestions that were made at that time?

Could the effects Peter mentions have anything to do with the cache?

Per




Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread Thierry Godefroy

On Sat, 06 Sep 2003 00:24:18 +0200, Peter Graf wrote:

 
 Thierry wrote:
 
 .../...

 Plus, I'm a bit surprised that you are apparently using jobs to fetch the
 data from the ethernet card... It should be done via an interrupt handler
 instead...
 
 At first sight it looks like that of course. QDOS/SMS reality is different 
 though.
 
 Actually, the best design would be to have the Q60 fast interrupt
 handler to fill a buffer, and a frame interrupt task to move the data from
 that buffer into a bigger one for your job to fetch it in big chunks...).
 
 Wrong.
 
 1. TCP is not a linear flow of data into one direction, even if the purpose 
 is file transfer.

Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
out of order receipt of TCP packets... That doesn't change the fact you could
use the fast interrupt to store as many TCP packets as needed (i.e. when they
come in), into a buffer (organized as a linked list of received packets),
then to transfer the whole lot of packets to the higher level layers of the
TCP/IP stack at once and every 1/50th of second...
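
A minimal sketch of the buffering scheme described above, written in Python purely as a model. A real driver would be 68k code, with interrupts masked while the buffer is drained. The names `ReceiveQueue`, `isr_store` and `tick_drain` are hypothetical and do not come from any QDOS or SMSQ/E source:

```python
from collections import deque


class ReceiveQueue:
    """Model of a packet buffer filled by an ISR and emptied once per 50 Hz tick."""

    def __init__(self):
        # A deque stands in for the linked list of received packets.
        self._packets = deque()

    def isr_store(self, frame: bytes) -> None:
        """Called (conceptually) from the fast interrupt for every frame that arrives."""
        self._packets.append(frame)

    def tick_drain(self) -> list:
        """Called from the frame-interrupt task: hand over everything at once."""
        batch = list(self._packets)
        self._packets.clear()
        return batch


queue = ReceiveQueue()
for frame in (b"pkt-1", b"pkt-2", b"pkt-3"):
    queue.isr_store(frame)

batch = queue.tick_drain()  # frames in arrival order; the queue is now empty
```

The sketch only covers storage. It says nothing about how soon the protocol layer gets to run after a frame arrives, which is the point the rest of the thread argues about.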

 QDOS (and likely SMSQ/E, too) is so primitive that an 
 interrupt service routine can _not_ trigger immediate rescheduling of jobs 
 after it has completed. The time until the next rescheduling can be 20 ms 
 (worst case) so the user job has to wait that time until it can process the 
 data. The effect is that the other TCP endpoint in the network has to wait 
 20 ms + processing + transfer time until it can react to the response 
 packet. Given MTU=1460=1.5KB your interrupt driven approach can not 
 guarantee more than a throughput of 1.5 KB / 20 ms = 75 KB/s with TCP, even 
 if the other endpoint needs zero time to process its packets. (75 KB/s is 
 not quite what I want.)
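
The arithmetic in the quoted paragraph can be checked with a few lines of Python. This is only a sketch of the bound as stated there: it assumes the sender releases a fixed number of segments per ACK and that every ACK is delayed by the full 20 ms. The function name and its parameters are illustrative:

```python
def ack_clocked_bound(segment_bytes: int, latency_ms: int, segments_per_ack: int = 1) -> int:
    """Upper bound in bytes per second when each ACK releases a fixed number of segments."""
    return segments_per_ack * segment_bytes * 1000 // latency_ms


print(ack_clocked_bound(1500, 20))  # 75000, the rounded 1.5 KB figure from the text
print(ack_clocked_bound(1460, 20))  # 73000, using the 1460-byte figure directly
```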

Wrong... With my method, you simply get a 20ms penalty (at worst) on the
acknowledgment of all the packets that were buffered... I.e. you'll have
a (worst case) 20ms penalty when pinging a Q60 on a network, compared to
another computer...

 Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need 
 to always poll the NIC, a clever approach can lead to full TCP throughput 
 during network activity, but zero polling waste (except for a few tens of 
 instructions per 50 Hz) when the network is inactive.

You don't need to poll the hardware as long as you can use an interrupt
to signal the arrival of each new packet. Is the Q60 able to trigger an
external interrupt on such conditions ?  If yes, then the lowest layer
of the TCP/IP stack (actually of the Ethernet driver) could be implemented
as the external interrupt handler...

 The details are 
 somewhat complex, but as long as the OS isn't changed, I have no better choice.
 
 2. You waste response (and processor) time by your second copying level. 
 Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying 
 about 1 MB every second _does_ matter.

Well, aren't we speaking about the Q60 (or Q40) here ?  I mean, there's not
even an Ethernet I/F on (S)GCs...

 3. The idea of collecting fragments into larger buffers is not feasible, 
 unless you implement the TCP/IP stack itself within ISRs. (There are good 
 reasons not to do that!)

This is wrong... The low level part is only responsible for moving the data
from the hardware into an area of the memory where it can wait until it's
processed... I see no problem at all...

Thierry.


Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread Peter Graf

Thierry wrote:

Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
out of order receipt of TCP packets... That doesn't change the fact you could
use the fast interrupt to store as many TCP packets as needed (i.e. when they
come in), into a buffer (organized as a linked list of received packets),
then to transfer the whole lot of packets to the higher level layers of the
TCP/IP stack at once and every 1/50th of second...

Obviously correct but useless. Try to understand that the problem in your 
approach is latency and can not be solved by buffering, no matter how 
efficiently buffering is implemented.

Simple example: A M$ or Unix machine sends a file to the QDOS machine via 
TCP. It will send one or two packets, then stop and wait for ACK. Further 
packets will only be sent after further ACKs. Your ACKs can only be 
generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz 
rhythm. (Or two-by-two, if you're lucky.)
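
To put a number on "crawl", here is a rough Python model of the example above. It assumes the worst case throughout: the sender emits a fixed burst, then stalls until an ACK that is only produced at the next 50 Hz tick. Wire time and processing time are ignored, and the 1460-byte segment size and the 1,000,000-byte file are illustrative assumptions:

```python
def transfer_time_ms(file_bytes: int, segment_bytes: int = 1460,
                     burst: int = 1, tick_ms: int = 20) -> int:
    """Time to move a file when each burst must wait one full tick for its ACK."""
    remaining = file_bytes
    elapsed_ms = 0
    while remaining > 0:
        remaining -= burst * segment_bytes  # sender emits one burst, then stalls
        elapsed_ms += tick_ms               # ACK is only generated on the next tick
    return elapsed_ms


print(transfer_time_ms(1_000_000, burst=1))  # 13700 ms, one packet per tick
print(transfer_time_ms(1_000_000, burst=2))  # 6860 ms, two packets per tick
```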

 QDOS (and likely SMSQ/E, too) is so primitive that an
 interrupt service routine can _not_ trigger immediate rescheduling of jobs
 after it has completed. The time until the next rescheduling can be 20 ms
 (worst case) so the user job has to wait that time until it can process the
 data. The effect is that the other TCP endpoint in the network has to wait
 20 ms + processing + transfer time until it can react to the response
 packet. Given MTU=1460=1.5KB your interrupt driven approach can not
 guarantee more than a throughput of 1.5 KB / 20 ms = 75 KB/s with TCP, even
 if the other endpoint needs zero time to process its packets. (75 KB/s is
 not quite what I want.)

Wrong... With my method, you simply get a 20ms penalty (at worst) on the
acknowledgment of all the packets that were buffered... I.e. you'll have
a (worst case) 20ms penalty when pinging a Q60 on a network, compared to
another computer...

Obviously correct, it only supports what I explained. You seem to have the 
(unrealistic) idea that the other endpoint will be sending (much) more than 
one packet per ACK.

 Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need
 to always poll the NIC, a clever approach can lead to full TCP throughput
 during network activity, but zero polling waste (except for a few tens of
 instructions per 50 Hz) when the network is inactive.

You don't need to poll the hardware as long as you can use an interrupt
to signal the arrival of each new packet. Is the Q60 able to trigger an
external interrupt on such conditions ? If yes, then the lowest layer
of the TCP/IP stack (actually of the Ethernet driver) could be implemented
as the external interrupt handler...

Yes the Q60 can trigger those interrupts, yes driver implementation is 
possible, yes it replaces polling. Irrelevant altogether in a QDOS system, 
unless I want TCP to crawl.

 2. You waste response (and processor) time by your second copying level.
 Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying
 about 1 MB every second _does_ matter.

Well, aren't we speaking about the Q60 (or Q40) here ?  I mean, there's not
even an Ethernet I/F on (S)GCs...

Compared to the overhead of the scheduler calls, such a large number of 
external memory accesses seems relevant to me. Plus your approach doesn't 
even eliminate the scheduler calls if you want similar TCP response time 
(and TCP is implemented as a job).

BTW Nasta had the Ethernet design for (S)GC, even the PCB, ready.

 3. The idea of collecting fragments into larger buffers is not feasible,
 unless you implement the TCP/IP stack itself within ISRs. (There are good
 reasons not to do that!)

This is wrong... The low level part is only responsible for moving the data
from the hardware into an area of the memory where it can wait until it's
processed... I see no problem at all...

You proposed collection of fragments so the job can fetch them in big 
chunks, i.e. combining the payload. I referred to that.

All the best
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread Peter Graf

BRANE wrote:

 Simple example: A M$ or Unix machine sends a file to the QDOS machine via
 TCP. It will send one or two packets, then stop and wait for ACK. Further
 packets will only be sent after further ACKs. Your ACKs can only be
 generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz
 rhythm. (Or two-by-two, if you're lucky.)

AFAIK with TCP/IP this is negotiable. There is no need for such a small
window...

QLwIP offers a window of 8760 bytes, but I have found that neither Windows nor 
Linux will exploit that in their standard configuration, at least not as 
long as their counterpart has 20 ms latency. Depending on the application, 
they send one, maximum two packets before they wait for ACK.
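
The gap between the offered window and what the senders actually use can be sketched as follows. This is a back-of-the-envelope model, not a measurement: it assumes 1460-byte segments and a 20 ms round per ACK, and the helper names are made up for illustration:

```python
SEGMENT_BYTES = 1460  # assumed segment size
ROUND_MS = 20         # assumed latency per ACK round (one 50 Hz tick)


def segments_in_window(window_bytes: int, segment_bytes: int = SEGMENT_BYTES) -> int:
    """How many full segments fit into the advertised window."""
    return window_bytes // segment_bytes


def rate_bytes_per_s(bytes_per_round: int, round_ms: int = ROUND_MS) -> int:
    """Throughput if a fixed number of bytes is delivered once per round."""
    return bytes_per_round * 1000 // round_ms


print(segments_in_window(8760))             # 6 segments fit the offered window
print(rate_bytes_per_s(8760))               # 438000 B/s if the window were filled
print(rate_bytes_per_s(1 * SEGMENT_BYTES))  # 73000 B/s with one packet per ACK
print(rate_bytes_per_s(2 * SEGMENT_BYTES))  # 146000 B/s with two packets per ACK
```

Under these assumptions, a sender that stops after one or two packets uses between a sixth and a third of what the offered window would allow.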

Somehow QLwIP must live with standard behaviour of other machines, you can 
not expect people to tune their TCP stacks.

All the best
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread BRANE


- Original Message - 
From: BRANE [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, September 08, 2003 1:03 AM
Subject: Re: [ql-developers] Massive amount of job state transitions and
re-scheduling


 Besides, I find it a bit hard to believe that the average PC acknowledges
 every packet on 100 Mbit Ethernet.  This would mean something like
 interrupts with 100 kHz rate. Not very likely on modern machines...

Ooops. Divide bug ;o) 10 kHz is more like it, but still very high...
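
The corrected estimate can be reproduced with a short calculation. It assumes a saturated 100 Mbit/s link carrying only full-size 1500-byte frames, and one interrupt per frame. The second figure adds the 38 bytes of per-frame Ethernet overhead on the wire (preamble, MAC header, FCS and inter-frame gap):

```python
LINK_BITS_PER_S = 100_000_000  # 100 Mbit/s Ethernet


def frames_per_second(payload_bytes: int, overhead_bytes: int = 0) -> int:
    """Full-size frames per second on a saturated link."""
    return LINK_BITS_PER_S // ((payload_bytes + overhead_bytes) * 8)


print(frames_per_second(1500))      # 8333, payload only
print(frames_per_second(1500, 38))  # 8127, including on-wire overhead
```

Either way the result is a little under 10 kHz, in line with the corrected figure.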