Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2004-03-03 Thread Peter Graf
Hi,

This is a late continuation of this old thread. Two issues were discussed back then:

(1) A fundamental shortcoming of QDOS/SMS that does not allow a high-speed
multitasking TCP/IP implementation with an interrupt-driven structure.

(2) A problem in QLwIP that occurred only when an extremely high number of
scheduler calls were made.

The good news: Problem (1) has been solved, thanks to an improvement for
QDOS Classic written by Richard Zidlicky. Only a few lines of code, but with
great effect. It has allowed me to make the lowest driver level
interrupt-triggered, without the latency problem when interacting with
jobs. I plan more intensive testing. Later on, the modification should be
discussed with Mark Swift.

Problem (2) also seems to be solved, as a side effect of the other
modification. Since the cause of the problem was never pinned down exactly, I'm
not sure whether this is just luck. But I have transferred several hundred MB of
data over the network this evening and the well-known "hanging" of the HTTPD
job at full speed has not yet appeared again, so there's hope.

Kudos to Richard!

All the best
Peter

Just by the way, I was never a big fan of the Qubide IDE driver, but I must
admit that it is more intelligent than the SMSQ/E driver. While the latter
needed a restricted buffer size for acceptable speed, Qubide can deal
with huge buffers, without the "crawling slaveblock" effect. (Helpful for
the HTTP server.)



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-10 Thread Peter Graf
Per wrote:

> Peter writes:
>
> <>
> > > Aren't you trying to make the OS do something it was never designed to
> > > do? Writing drivers is a programming challenge. The OS is there to help
> > > where it can, but no OS author can anticipate any and every piece of
> > > hardware that is going to be attached to the machine in the future. That
> > > is the job of the driver. (Preferably without each driver author altering
> > > the OS to suit their own needs ;)
> >
> > Somehow I doubt that you need to teach me that writing drivers
> > is a programming challenge or more trivialities and generalities
> > about OS and driver structure ;)
>
> I would not presume to teach you anything. I know that would be futile. I
> was not debating detail with you - I know nothing of the detail. I was
> discussing principle, and that I, and other well-meaning souls on this
> list trying to help, happen to know a little about. The detail normally
> follows from the principle.
>
> Please spare me from ridicule for voicing a legitimate point of
> view. It does neither of us any favours, and tends to sour the atmosphere.

Calm down. If you publicly explain to someone who has just built a car
(and asks for help improving the transmission) what a car is, don't be
surprised when he concludes you weren't going to help him with the
transmission. He might get the impression that you were trying to spread a
little doubt about his ability to build a good car.

Peter




Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-10 Thread P Witte

Peter writes:

<>
> > Aren't you trying to make the OS do something it was never designed to
> > do? Writing drivers is a programming challenge. The OS is there to help
> > where it can, but no OS author can anticipate any and every piece of
> > hardware that is going to be attached to the machine in the future. That
> > is the job of the driver. (Preferably without each driver author altering
> > the OS to suit their own needs ;)
>
> Somehow I doubt that you need to teach me that writing drivers
> is a programming challenge or more trivialities and generalities
> about OS and driver structure ;)

I would not presume to teach you anything. I know that would be futile. I
was not debating detail with you - I know nothing of the detail. I was
discussing principle, and that I, and other well-meaning souls on this
list trying to help, happen to know a little about. The detail normally
follows from the principle.

Please spare me from ridicule for voicing a legitimate point of
view. It does neither of us any favours, and tends to sour the atmosphere.

> I guess you have to accept that QDOS (SMS?) has a principal
> shortcoming, not an author dependant need, and should be
> improved.

As to my little jibe, it was a kindly nudge-wink in the direction of
Richard, relating to an upstream tributary of this discussion.
"Improvements" to Qdos are a very serious matter and should not be
undertaken lightly, as they affect everyone.

Per












Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-09 Thread pgraf

On 9 Sep 2003 at 14:42, Thierry Godefroy wrote:

> 
> On Sun, 07 Sep 2003 21:53:34 +0200, Peter Graf wrote:
> 
> > Thierry wrote:
> > 
> > >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
> > >out of order receipt of TCP packets... That doesn't change the fact you could
> > >use the fast interrupt to store as many TCP packet as needed (i.e. when they
> > >come in), into a buffer (organized as a linked list of received packets),
> > >then to transfer the whole lot of packets to the higher level layers of the
> > >TCP/IP stack at once and every 1/50th of second...
> > 
> > Obviously correct but useless. Try to understand that the problem in your 
> > approach is latency and can not be solved by buffering, no matter how 
> > efficient buffering is implemented.
> 
> From the computer sending packets to a Qx0, the latency would just be seen
> as a longer route... When you ping a computer on Internet, you have to
> wait a variable amount of time for the reply, depending on how many routers
> must be crossed, on how long are the wires (for trans-oceanic links, or worst
> for satellite links, it's far than neglectable), and how fast and/or busy is
> the receiver... 
> 
> If the topographic design of the network between a PC and a Q60 should lead
> to, say, a 200ms delay in the reply, then the TCP/IP implementation on the
> Q60 would simply add a 20ms latency to this number, but in the end, the
> sender should still receive its ACKs between 200 and 220ms after the packets
> are sent...

Sure. That's all obvious but still unrelated to the given LAN data
rate challenge.

> Of course, this supposes that the acknowledgement is -actually- done every
> 20ms in the Q60, which is -NOT- the case if it's done at the job level (jobs
> are elected or not, depending on their cumulated priority and are therefore
> -NOT- running each 20ms unless they are alone in memory)...

This is still only the trivial view. It is not applicable to the way I've 
implemented things. Let me give you a (much) simplified example:

A top-priority job (let's call it a 50 Hz ISR replacement) that suspends
itself long before the 20 ms interval is over (and thereby gives the
lower-priority jobs their share of the same interval) will practically always
be elected to run again once the suspension time (20 ms) is over. You can
have 10 other jobs running (most of them will usually block for I/O most
of the time) and even a benchmark that consumes all the rest of the CPU
power, and it still works. There are rare circumstances where the
high-priority job will indeed not be elected now and then, but under these
circumstances that is the best that can happen, because some of the user
interface should remain in working condition. It's better to slow down the
network or even drop packets under these rare circumstances (let TCP deal
with the dropped packets :).

Just by the way, the remark about "alone in memory" was completely wrong,
because it's irrelevant whether a job is loaded and activated. What matters
is at which intervals jobs block for I/O (or something else). You can
easily have the case where several jobs are each and all executed every 20
ms and all do real work. This is not even unusual if the jobs process data
from IRQ-driven I/O, and the I/O is slower than the CPU.

> The code for reassembling the fragmented packets and acknowledging them
> must be implemented either as a frame interrupt, or (if frame interrupts
> are still too slow for your taste),

We are not talking about a special taste, but about the normal 10 Mbit/s
Ethernet data rate under TCP.

> by using a polled routine triggered
> by the Q60 fast interrupt (the one used by the sound system).
> 
> The structure for the whole TCP/IP stack would then be:
> 
> 1.- IP packets fetching from the I/F and buffering:
> - External interrupt handler (best), or fast interrupt polling loop.
> 2.- IP packets reassembling and acknowledging:
> - polled task: fast interrupt handler (best) or frame interrupt.
> 3.- TCP/IP high level protocols:
> - High priority (127) job.
> 
> How does this sound ?

I remember that I thought about a similar structure when I was still at
the very beginning of this project and had not dealt with the details.

First, your structure would not allow 10 Mbit/s TCP streams in a normal
LAN. (Note that level 1 won't generate TCP ACKs.)

Second, I wonder what makes you think there are packet acknowledgements
at the IP level.

Third, it is dangerous to let TCP processing run at highest priority. If
there is a lot of TCP traffic (let's for a moment take into account a
wider range of machines than the Q60, or malicious traffic) your user
interface may stop working.

...

The list could go on. I for one have 3 different philosophies: One in case
QDOS is fixed and I have a lot of time, the second in case QDOS is fixed
and I'm short of time, the third in case QDOS isn't fixed :-)

Thierry, I don't search for theoretical advice on OS and drivers 
str

Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-09 Thread Richard Zidlicky

On Tue, Sep 09, 2003 at 01:37:53AM +0100, P Witte wrote:
> 
> Peter Graf writes:
> 
> > >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
> > >out of order receipt of TCP packets... That doesn't change the fact you could
> > >use the fast interrupt to store as many TCP packet as needed (i.e. when they
> > >come in), into a buffer (organized as a linked list of received packets),
> > >then to transfer the whole lot of packets to the higher level layers of the
> > >TCP/IP stack at once and every 1/50th of second...
> >
> > Obviously correct but useless. Try to understand that the problem in your
> > approach is latency and can not be solved by buffering, no matter how
> > efficient buffering is implemented.
> >
> > Simple example: A M$ or Unix machine sends a file to the QDOS machine via
> > TCP. It will send one or two packets, then stop and wait for ACK. Further
> > packets will only be sent after further ACKs. Your ACKs can only be
> > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz
> > rhythm. (Or two-by-two, if you're lucky.)
> 
> But does the incoming data need to be processed in any way before
> acknowledgement? Why cant the ISR simply receive and buffer the data and
> then send the ACK before exiting, leaving any processing to the higher
> levels?

My impression is that to do that for TCP you would have to move all of
the protocol implementation into the ISR.

> In our January discussion you mentioned the case of echo. There is nothing
> to stop you from implementing time-critical routines, like echo, in the
> 'physical layer'.

Well, there is, as long as QDOS won't allow a reschedule after interrupts.
Echo is supposed to be a normal application and you would not move it
into the ISR layer.

As I said, it isn't hard to change ISR handling in QDOS and I will certainly
do it, but right now I have other things that need to be done.

Richard


Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-09 Thread Thierry Godefroy

On Sun, 07 Sep 2003 21:53:34 +0200, Peter Graf wrote:

> Thierry wrote:
> 
> >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
> >out of order receipt of TCP packets... That doesn't change the fact you could
> >use the fast interrupt to store as many TCP packet as needed (i.e. when they
> >come in), into a buffer (organized as a linked list of received packets),
> >then to transfer the whole lot of packets to the higher level layers of the
> >TCP/IP stack at once and every 1/50th of second...
> 
> Obviously correct but useless. Try to understand that the problem in your 
> approach is latency and can not be solved by buffering, no matter how 
> efficient buffering is implemented.

From the computer sending packets to a Qx0, the latency would just be seen
as a longer route... When you ping a computer on the Internet, you have to
wait a variable amount of time for the reply, depending on how many routers
must be crossed, on how long the wires are (for trans-oceanic links, or worse,
for satellite links, it's far from negligible), and how fast and/or busy
the receiver is...

If the topographic design of the network between a PC and a Q60 should lead
to, say, a 200ms delay in the reply, then the TCP/IP implementation on the
Q60 would simply add a 20ms latency to this number, but in the end, the
sender should still receive its ACKs between 200 and 220ms after the packets
are sent...

Of course, this supposes that the acknowledgement is -actually- done every
20ms in the Q60, which is -NOT- the case if it's done at the job level (jobs
are elected or not, depending on their cumulated priority and are therefore
-NOT- running each 20ms unless they are alone in memory)...

The code for reassembling the fragmented packets and acknowledging them
must be implemented either as a frame interrupt, or (if frame interrupts
are still too slow for your taste), by using a polled routine triggered
by the Q60 fast interrupt (the one used by the sound system).

The structure for the whole TCP/IP stack would then be:

1.- IP packets fetching from the I/F and buffering:
- External interrupt handler (best), or fast interrupt polling loop.
2.- IP packets reassembling and acknowledging:
- polled task: fast interrupt handler (best) or frame interrupt.
3.- TCP/IP high level protocols:
- High priority (127) job.

How does this sound ?

Thierry.


Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-09 Thread pgraf

On 8 Sep 2003 at 0:53, P Witte wrote:

> 
> Peter Graf writes:
> 
> > Hi Per,
> >
> > >And Peter, did you try out the suggestions that were made at that time?
> >
> > Can you be a bit more specific? I remember only one applicable suggestion,
> > which was to set a system variable before leaving the ISR. Didn't work, at
> > least not under QDOS.
> 
> # By exiting the interrupt handler through the sms.rte function the
> # requested re-schedule will be done immediately if possible (i.e. no
> # supervisor code was running at that time). Example:
> #
> # include dev8_smsq_smsq_basekeys
> # include dev8_keys_psf
> #
> # int_handler
> # movem.l psf.reg,-(sp)
> #
> # [blah]
> #
> # st  sys_rshd(a6); Request re-schedule
> # move.l  sms.rte,a5  ; ...now would be convenient
> # jmp (a5)
> 
> etc, as per my mail to this list on 18/01/03

See above.

Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-09 Thread pgraf



On 9 Sep 2003 at 1:37, P Witte wrote:

[snip]


> But does the incoming data need to be processed in any way before
> acknowledgement? Why cant the ISR simply receive and buffer the data and
> then send the ACK before exiting, leaving any processing to the higher
> levels?

The reason is that it's part of the TCP processing and can not
be done at the Ethernet packet level.

> In our January discussion you mentioned the case of echo. There is nothing
> to stop you from implementing time-critical routines, like echo, in the
> 'physical layer'. In fact you can take over the whole machine and do as you
> please.

Not a task where speed is relevant for the user. No point in 
speeding up ICMP echo only.


> The important thing is to split the driver correctly: Time critical, ie
> usually hardware related stuff, and in this case it appears also certain
> demands of the TCP/IP protocol (if I understand correctly) are rightly the
> province of the ISR. If this sort of thing is not clearcut in TCP/IP, then
> a messy solution is called for ;)
> 
> Arent you trying to make the OS do something it was never designed to
> do? Writing drivers is a programming challenge. The OS is there to help
> where it can, but no OS author can anticipate any and every piece of
> hardware that is going to be attached to the machine in the future. That is
> the job of the driver. (Preferably without each driver author altering the
> OS to suit their own needs ;)


Somehow I doubt that you need to teach me that writing drivers
is a programming challenge or more trivialities and generalities
about OS and driver structure ;)


I guess you have to accept that QDOS (SMS?) has a fundamental
shortcoming, not an author-dependent need, and should be
improved.

> Afterall, someone did implement TCP/IP on the Spectrum, which neither
> multitasks

A single-tasking TCP/IP implementation is easier, not harder.


Just in case you didn't notice: My TCP/IP package for QDOS
works. I was talking about the obstacles to higher data rates.

> nor (for all I know) understands interrupts.

Of course the Spectrum uses interrupts, and even singletasking 
TCP/IP needs some sort of timers.


All the best
Peter





Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread P Witte

Peter Graf writes:

> >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
> >out of order receipt of TCP packets... That doesn't change the fact you could
> >use the fast interrupt to store as many TCP packet as needed (i.e. when they
> >come in), into a buffer (organized as a linked list of received packets),
> >then to transfer the whole lot of packets to the higher level layers of the
> >TCP/IP stack at once and every 1/50th of second...
>
> Obviously correct but useless. Try to understand that the problem in your
> approach is latency and can not be solved by buffering, no matter how
> efficient buffering is implemented.
>
> Simple example: A M$ or Unix machine sends a file to the QDOS machine via
> TCP. It will send one or two packets, then stop and wait for ACK. Further
> packets will only be sent after further ACKs. Your ACKs can only be
> generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz
> rhythm. (Or two-by-two, if you're lucky.)

But does the incoming data need to be processed in any way before
acknowledgement? Why can't the ISR simply receive and buffer the data and
then send the ACK before exiting, leaving any processing to the higher
levels?

In our January discussion you mentioned the case of echo. There is nothing
to stop you from implementing time-critical routines, like echo, in the
'physical layer'. In fact you can take over the whole machine and do as you
please.

The important thing is to split the driver correctly: Time critical, ie
usually hardware related stuff, and in this case it appears also certain
demands of the TCP/IP protocol (if I understand correctly) are rightly the
province of the ISR. If this sort of thing is not clearcut in TCP/IP, then
a messy solution is called for ;)

Aren't you trying to make the OS do something it was never designed to
do? Writing drivers is a programming challenge. The OS is there to help
where it can, but no OS author can anticipate any and every piece of
hardware that is going to be attached to the machine in the future. That is
the job of the driver. (Preferably without each driver author altering the
OS to suit their own needs ;)

After all, someone did implement TCP/IP on the Spectrum, which neither
multitasks nor (for all I know) understands interrupts.

Good luck!

Per








Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread P Witte

Peter Graf writes:

> Hi Per,
>
> >And Peter, did you try out the suggestions that were made at that time?
>
> Can you be a bit more specific? I remember only one applicable suggestion,
> which was to set a system variable before leaving the ISR. Didn't work, at
> least not under QDOS.

# By exiting the interrupt handler through the sms.rte function the
# requested re-schedule will be done immediately if possible (i.e. no
# supervisor code was running at that time). Example:
#
# include dev8_smsq_smsq_basekeys
# include dev8_keys_psf
#
# int_handler
# movem.l psf.reg,-(sp)
#
# [blah]
#
# st  sys_rshd(a6)    ; Request re-schedule
# move.l  sms.rte,a5  ; ...now would be convenient
# jmp (a5)

etc, as per my mail to this list on 18/01/03

Per



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread pgraf

On 12 Sep 2003 at 13:00, BRANE wrote:

> O.K. I'm not following this thread from beginning, so I don't know exactly
> what hardware are we talking about, but for this detail, it probably doesn't
> matter much...
> 
> So, what is a solution ? Using external interrupt ? Maybe a bit bulkier
> controller with built-in Ethernet ?

1. My current problem is a timing-related bug when the full
data rate is used. It is not yet clear where it comes from, so the
solution is also unknown. The lazy workaround is to poll the
controller only every 20 ms. Using ISRs brings no advantage
compared to this (due to a QDOS-specific shortcoming).

2. Unrelated to this problem, QDOS could use an improvement so that
ISRs can trigger immediate rescheduling of jobs. Given this
improvement, it makes sense to implement the driver with an ISR in order
to gain a cleaner driver structure. Whether or not this would have
the side effect of curing the aforementioned bug is unclear.

A hardware change is not required.

All the best
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread BRANE

>
> don't forget this is a rather simple TCP/IP implementation and apparently
> it is already hard enough to make the simplest variant working reliably
> with the garden variety of TCP/IP implementations out there.
>
> Richard

O.K. I haven't been following this thread from the beginning, so I don't know
exactly what hardware we are talking about, but for this detail, it probably
doesn't matter much...

So, what is a solution? Using an external interrupt? Maybe a somewhat bulkier
controller with built-in Ethernet?






Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread Richard Zidlicky

On Sun, Sep 07, 2003 at 10:48:50PM +0200, BRANE wrote:
> 
> 
> - Original Message - 
> From: "Peter Graf" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Sunday, September 07, 2003 9:53 PM
> Subject: Re: [ql-developers] Massive amount of job state transitions and
> re-scheduling
> 

> > Simple example: A M$ or Unix machine sends a file to the QDOS machine via
> > TCP. It will send one or two packets, then stop and wait for ACK. Further
> > packets will only be sent after further ACKs. Your ACKs can only be
> > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz
> > rhythm. (Or two-by-two, if you're lucky.)
> 
> AFAIK with TCP/IP this is negotiable. There is no need for such small
> window...

Don't forget this is a rather simple TCP/IP implementation, and apparently
it is already hard enough to get the simplest variant working reliably
with the garden variety of TCP/IP implementations out there.

Richard


Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-08 Thread pgraf

On 8 Sep 2003 at 1:03, BRANE wrote:

> > QLwIP offers a window of 8760, but I have found that neither Windows nor
> > Linux will exploit that in their standard configuration, at least not as
> > long as their counterpart has 20 ms latency. Depending on the application,
> > they send one, maximum two packets before they wait for ACK.
> >
> > Somehow QLwIP must live with standard behaviour of other machines, you can
> > not expect people to tune their TCP stacks.
> >
> > All the best
> > Peter
> 
> Hmm. Something doesn't sound right here. Internet is full of "speed up your
> Internet connection" programs that do amongst other things exactly this, so
> I presume this has to be negotiable.

Well, for Windows I tried such programs, without effect in
the given situation. It is possible that the long latency
(absolutely unusual in Ethernet networks) leads the counterpart to
decide not to send more than one or two TCP packets at once. I
have not yet invested much in tuning the other machines, since
this can not be the general solution.

For the IRQ approach to work with the same performance as
polling, you'd need at least 14 TCP packets to be transferred
back to back without ACK. Decide for yourself whether this is
realistic.
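
One way a figure like 14 can come about is sketched below in Python. This is only an illustration under stated assumptions: the target rate of about 1 MB/s is borrowed from the "about 1 MB every second" remark elsewhere in this thread, the segment size of 1460 bytes and the 20 ms interval are the numbers used in the discussion, and the function name is made up.

```python
import math

def segments_per_tick(target_bytes_per_s, tick_s, segment_bytes):
    """Number of full-size segments that must arrive back to back within
    one 20 ms interval to sustain `target_bytes_per_s`."""
    return math.ceil(target_bytes_per_s * tick_s / segment_bytes)

# Assumed values: ~1 MB/s target, 20 ms interval, 1460-byte segments.
needed = segments_per_tick(1_000_000, 0.020, 1460)
print(needed)
```

With these assumed inputs the sender would have to put 14 segments on the wire per interval without waiting for an ACK, which is far more than the one or two observed.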

> Besides, I find it a bit hard to believe that average PC does acknowledge
> every packet on 100 Mbit Ethernet.  This would mean something like
> interrupts with 100 kHz rate. Not very likely on modern machines...

The rate would be about 7 kHz. It may surprise you, but IRQs
are actually triggered at this rate, although not every single
packet is acknowledged. Still, the number of packets per ACK is
usually very small.
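
For orientation, the theoretical ceiling on the frame rate can be estimated as follows. This Python sketch is not from the original mail; it assumes full-size frames with a 1500-byte payload and the usual per-frame overhead on the wire, and the function name is hypothetical.

```python
def max_frame_rate(link_bits_per_s, payload_bytes=1500):
    """Theoretical maximum number of full-size Ethernet frames per second.
    Assumed per-frame overhead on the wire: 14 bytes header + 4 bytes FCS
    + 8 bytes preamble/SFD + 12 bytes inter-frame gap = 38 bytes."""
    wire_bytes = payload_bytes + 38
    return link_bits_per_s / (wire_bytes * 8)

rate = max_frame_rate(100_000_000)   # 100 Mbit/s link
print(f"{rate:.0f} frames/s")
```

The result is a little over 8000 frames per second for a fully saturated link, so a rate in the region of 7 kHz is plausible for a link that is busy but not quite saturated, and far below 100 kHz.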

> This would also make TCP/IP on 1Gbit Ethernet useless...

On a CPU that can handle TCP at this rate, it's also no problem 
to deal with the exception handling. I have not looked into 1Gb, 
no idea how it's usually implemented.

All the best
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread BRANE


- Original Message - 
From: "BRANE" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, September 08, 2003 1:03 AM
Subject: Re: [ql-developers] Massive amount of job state transitions and
re-scheduling


> Besides, I find it a bit hard to believe that average PC does acknowledge
> every packet on 100 Mbit Ethernet.  This would mean something like
> interrupts with 100 kHz rate. Not very likely on modern machines...

Ooops. Divide bug ;o) 10 kHz is more like it, but still very high...





Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread BRANE


- Original Message - 
From: "Peter Graf" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, September 08, 2003 12:05 AM
Subject: Re: [ql-developers] Massive amount of job state transitions and
re-scheduling


>
> BRANE wrote:
>
> > > Simple example: A M$ or Unix machine sends a file to the QDOS machine via
> > > TCP. It will send one or two packets, then stop and wait for ACK. Further
> > > packets will only be sent after further ACKs. Your ACKs can only be
> > > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz
> > > rhythm. (Or two-by-two, if you're lucky.)
> >
> >AFAIK with TCP/IP this is negotiable. There is no need for such small
> >window...
>
> QLwIP offers a window of 8760, but I have found that neither Windows nor
> Linux will exploit that in their standard configuration, at least not as
> long as their counterpart has 20 ms latency. Depending on the application,
> they send one, maximum two packets before they wait for ACK.
>
> Somehow QLwIP must live with standard behaviour of other machines, you can
> not expect people to tune their TCP stacks.
>
> All the best
> Peter

Hmm. Something doesn't sound right here. Internet is full of "speed up your
Internet connection" programs that do amongst other things exactly this, so
I presume this has to be negotiable.

Besides, I find it a bit hard to believe that average PC does acknowledge
every packet on 100 Mbit Ethernet.  This would mean something like
interrupts with 100 kHz rate. Not very likely on modern machines...

This would also make TCP/IP on 1Gbit Ethernet useless...

 Regards,


Branko



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread Peter Graf
BRANE wrote:

> > Simple example: A M$ or Unix machine sends a file to the QDOS machine via
> > TCP. It will send one or two packets, then stop and wait for ACK. Further
> > packets will only be sent after further ACKs. Your ACKs can only be
> > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz
> > rhythm. (Or two-by-two, if you're lucky.)
>
> AFAIK with TCP/IP this is negotiable. There is no need for such small
> window...

QLwIP offers a window of 8760, but I have found that neither Windows nor
Linux will exploit that in their standard configuration, at least not as
long as their counterpart has 20 ms latency. Depending on the application,
they send one, at most two, packets before they wait for an ACK.

Somehow QLwIP must live with standard behaviour of other machines, you can 
not expect people to tune their TCP stacks.
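
For scale, the offered window can be expressed in full-size segments. A small Python sketch, assuming the 1460-byte segment size that appears elsewhere in this thread (the constant names are made up for the illustration):

```python
WINDOW_BYTES = 8760    # window offered by QLwIP, as stated in this mail
SEGMENT_BYTES = 1460   # assumed full-size TCP segment

# How many whole segments fit into the offered window, and what is left over.
segments_in_window, remainder = divmod(WINDOW_BYTES, SEGMENT_BYTES)
print(segments_in_window, remainder)
```

Under that assumption the window corresponds to exactly six segments, so a sender willing to fill it could have up to six segments in flight before waiting for an ACK, rather than the one or two observed.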

All the best
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread BRANE


- Original Message - 
From: "Peter Graf" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, September 07, 2003 9:53 PM
Subject: Re: [ql-developers] Massive amount of job state transitions and
re-scheduling


>
> Thierry wrote:
>
> >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
> >out of order receipt of TCP packets... That doesn't change the fact you could
> >use the fast interrupt to store as many TCP packet as needed (i.e. when they
> >come in), into a buffer (organized as a linked list of received packets),
> >then to transfer the whole lot of packets to the higher level layers of the
> >TCP/IP stack at once and every 1/50th of second...
>
> Obviously correct but useless. Try to understand that the problem in your
> approach is latency and can not be solved by buffering, no matter how
> efficient buffering is implemented.
>
> Simple example: A M$ or Unix machine sends a file to the QDOS machine via
> TCP. It will send one or two packets, then stop and wait for ACK. Further
> packets will only be sent after further ACKs. Your ACKs can only be
> generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz
> rhythm. (Or two-by-two, if you're lucky.)

AFAIK with TCP/IP this is negotiable. There is no need for such small
window...




Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread Peter Graf
Hi Per,

> And Peter, did you try out the suggestions that were made at that time?

Can you be a bit more specific? I remember only one applicable suggestion,
which was to set a system variable before leaving the ISR. Didn't work, at
least not under QDOS.

> Could the effects Peter mentions have anything to do with the cache?

Same with caches off. The effect only happens in realtime and is hard to
debug, it may have a completely different cause.

Bye,
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread Peter Graf
Thierry wrote:

> Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
> out of order receipt of TCP packets... That doesn't change the fact you could
> use the fast interrupt to store as many TCP packet as needed (i.e. when they
> come in), into a buffer (organized as a linked list of received packets),
> then to transfer the whole lot of packets to the higher level layers of the
> TCP/IP stack at once and every 1/50th of second...

Obviously correct but useless. Try to understand that the problem in your
approach is latency and can not be solved by buffering, no matter how
efficiently the buffering is implemented.

Simple example: A M$ or Unix machine sends a file to the QDOS machine via 
TCP. It will send one or two packets, then stop and wait for ACK. Further 
packets will only be sent after further ACKs. Your ACKs can only be 
generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz 
rhythm. (Or two-by-two, if you're lucky.)

> > QDOS (and likely SMSQ/E, too) is so primitive that an
> > interrupt service routine can _not_ trigger immediate rescheduling of jobs
> > after it has completed. The time until the next rescheduling can be 20 ms
> > (worst case) so the user job has to wait that time until it can process the
> > data. The effect is that the other TCP endpoint in the network has to wait
> > 20 ms + processing + transfer time until it can react to the response
> > packet. Given MTU=1460=1.5KB your interrupt driven approach can not
> > guarantee more than a throughput of 1.5 KB / 20 ms = 75 KB/s with TCP, even
> > if the other endpoint needs zero time to process its packets. (75 KB/s is
> > not quite what I want.)

> Wrong... With my method, you simply get a 20ms penalty (at worst) on the
> acknowledgment of all the packets that were buffered... I.e. you'll have
> a (worst case) 20ms penalty when pinging a Q60 on a network, compared to
> another computer...

Obviously correct, it only supports what I explained. You seem to have the
(unrealistic) idea that the other endpoint will be sending (much) more than
one packet per ACK.

> > Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need
> > to always poll the NIC, a clever approach can lead to full TCP throughput
> > during network activity, but zero polling waste (except for a few tens of
> > instructions per 50 Hz) when the network is inactive.
>
> You don't need to poll the hardware as long as you can use an interrupt
> to signal the arrival of each new packet. Is the Q60 able to trigger an
> external interrupt on such conditions? If yes, then the lowest layer
> of the TCP/IP stack (actually of the Ethernet driver) could be implemented
> as the external interrupt handler...

Yes the Q60 can trigger those interrupts, yes driver implementation is
possible, yes it replaces polling. Irrelevant altogether in a QDOS system,
unless I want TCP to crawl.

> > 2. You waste response (and processor) time by your second copying level.
> > Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying
> > about 1 MB every second _does_ matter.
>
> Well, aren't we speaking about the Q60 (or Q40) here?  I mean, there's not
> even an Ethernet I/F on (S)GCs...

Compared to the overhead of the scheduler calls, such a large number of
external memory accesses seems relevant to me. Plus your approach doesn't
even eliminate the scheduler calls if you want similar TCP response time
(and TCP is implemented as a job).

BTW Nasta had the Ethernet design for (S)GC, even the PCB, ready.

> > 3. The idea of collecting fragments into larger buffers is not feasible,
> > unless you implement the TCP/IP stack itself within ISRs. (There are good
> > reasons not to do that!)
>
> This is wrong... The low level part is only responsible for moving the data
> from the hardware into an area of the memory where it can wait until it's
> processed... I see no problem at all...

You proposed collection of fragments so the job can fetch them in big
chunks, i.e. combining the payload. I referred to that.

All the best
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread Thierry Godefroy

On Sat, 06 Sep 2003 00:24:18 +0200, Peter Graf wrote:

> 
> Thierry wrote:
> 
> .../...
>
> >Plus, I'm a bit surprised that you are apparently using jobs to fetch the
> >data from the ethernet card... It should be done via an interrupt handler
> >instead...
> 
> At first sight it looks like that of course. QDOS/SMS reality is different 
> though.
> 
> >Actually, the best design would be to have the Q60 fast interrupt
> >handler to fill a buffer, and a frame interrupt task to move the data from
> >that buffer into a bigger one for your job to fetch it in big chunks...).
> 
> Wrong.
> 
> 1. TCP is not a linear flow of data into one direction, even if the purpose 
> is file transfer.

Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
out of order receipt of TCP packets... That doesn't change the fact you could
use the fast interrupt to store as many TCP packets as needed (i.e. when they
come in), into a buffer (organized as a linked list of received packets),
then to transfer the whole lot of packets to the higher level layers of the
TCP/IP stack at once and every 1/50th of second...
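
A minimal sketch of the buffering scheme described above, written as a plain Python model rather than real QDOS/SMS driver code. The class and method names are hypothetical, and a deque stands in for the linked list of received packets.

```python
from collections import deque

class PacketQueue:
    """Hypothetical stand-in for the linked list of received packets."""

    def __init__(self):
        self._pending = deque()

    def on_packet_interrupt(self, packet):
        # Fast interrupt handler: only store the packet, no protocol work.
        self._pending.append(packet)

    def on_frame_tick(self):
        # 50 Hz frame task: hand over everything received since the last tick.
        batch = list(self._pending)
        self._pending.clear()
        return batch

queue = PacketQueue()
for seq in range(3):
    queue.on_packet_interrupt(f"segment-{seq}".encode())

batch = queue.on_frame_tick()
print(len(batch))
```

The model only shows the data path. It says nothing about how soon the upper layers can answer, since the batch is handed over at most once per 1/50th of a second.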

> QDOS (and likely SMSQ/E, too) is so primitive that an 
> interrupt service routine can _not_ trigger immediate rescheduling of jobs 
> after it has completed. The time until the next rescheduling can be 20 ms 
> (worst case) so the user job has to wait that time until it can process the 
> data. The effect is that the other TCP endpoint in the network has to wait 
> 20 ms + processing + transfer time until it can react to the response 
> packet. Given MTU=1460=1.5KB your interrupt driven approach can not 
> guarantee more than a throughput of 1.5 KB / 20 ms = 75 KB/s with TCP, even 
> if the other endpoint needs zero time to process its packets. (75 KB/s is 
> not quite what I want.)

Wrong... With my method, you simply get a 20ms penalty (at worst) on the
acknowledgment of all the packets that were buffered... I.e. you'll have
a (worst case) 20ms penalty when pinging a Q60 on a network, compared to
another computer...

> Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need 
> to always poll the NIC, a clever approach can lead to full TCP throughput 
> during network activity, but zero polling waste (except for a few tens of 
> instructions per 50 Hz) when the network is inactive.

You don't need to poll the hardware as long as you can use an interrupt
to signal the arrival of each new packet. Is the Q60 able to trigger an
external interrupt on such conditions?  If yes, then the lowest layer
of the TCP/IP stack (actually of the Ethernet driver) could be implemented
as the external interrupt handler...

> The details are 
> somewhat complex, but as long as the OS isn't changed, I have no better choice.
> 
> 2. You waste response (and processor) time by your second copying level. 
> Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying 
> about 1 MB every second _does_ matter.

Well, aren't we speaking about the Q60 (or Q40) here ?  I mean, there's not
even an Ethernet I/F on (S)GCs...

> 3. The idea of collecting fragments into larger buffers is not feasible, 
> unless you implement the TCP/IP stack itself within ISRs. (There are good 
> reasons not to do that!)

This is wrong... The low level part is only responsible for moving the data
from the hardware into an area of the memory where it can wait until it's
processed... I see no problem at all...

Thierry.


Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-07 Thread P Witte

Richard Zidlicky writes:

<>
> > Plus, I'm a bit surprised that you are apparently using jobs to fetch the
> > data from the ethernet card... It should be done via an interrupt handler
> > instead... Actually, the best design would be to have the Q60 fast interrupt
> > handler to fill a buffer, and a frame interrupt task to move the data from
> > that buffer into a bigger one for your job to fetch it in big chunks...).
>
> this was discussed a while ago here, the big problem is that
> neither QDOS nor SMSQ will attempt to reschedule after interrupt
> handling and there is no way to deal with the complexities of the
> TCP/IP protocol inside the interrupt handler.
> That means sending of protocol replies would be very often delayed
> by 1/50s which would make especially TCP crawl..

The last words you wrote the last time we discussed this topic were:

> Otoh checking for sys_rschd after isr processing looks really trivial 
> and top priority now.

Did you ever get round to it?

And Peter, did you try out the suggestions that were made at that time?

Could the effects Peter mentions have anything to do with the cache?

Per




Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-06 Thread Richard Zidlicky

On Fri, Sep 05, 2003 at 09:18:52PM +0200, Thierry Godefroy wrote:

> > Is it even possible that the number of SUSJB/RELJB/PRIOR calls per frame 
> > interrupt is limited?
> 
> Well, I guess the problem is that all three calls are exiting via the
> scheduler (they are not atomic traps). My guess is that calling them in
> rapid succession (more than once every 1/50th of second) makes the job
> to reenter recursively the scheduler and to fill up the supervisor stack...

I don't see how this could happen. The calls will exit through the
scheduler, which will clean up the supervisor stack before returning
to user mode. The problem could only arise when you call any
of these traps from supervisor mode, which is a very bad idea anyway.


> Plus, I'm a bit surprised that you are apparently using jobs to fetch the
> data from the ethernet card... It should be done via an interrupt handler
> instead... Actually, the best design would be to have the Q60 fast interrupt
> handler to fill a buffer, and a frame interrupt task to move the data from
> that buffer into a bigger one for your job to fetch it in big chunks...).

this was discussed a while ago here, the big problem is that
neither QDOS nor SMSQ will attempt to reschedule after interrupt
handling and there is no way to deal with the complexities of the
TCP/IP protocol inside the interrupt handler.
That means sending of protocol replies would be very often delayed
by 1/50s which would make especially TCP crawl..
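
To put a number on that delay, here is a toy model. It assumes the job can only run at the next 50 Hz tick unless the OS reschedules right after the interrupt handler; the function name and the millisecond granularity are illustrative assumptions, and this does not model the real QDOS or SMSQ scheduler.

```python
TICK_MS = 20  # 50 Hz frame interrupt

def reply_delay_ms(arrival_ms, reschedule_after_isr):
    """Wait between packet arrival and the moment the job can start replying."""
    if reschedule_after_isr:
        return 0                      # job is scheduled as soon as the ISR returns
    return (-arrival_ms) % TICK_MS    # otherwise wait for the next tick boundary

arrivals = range(100)                 # one packet per millisecond over five ticks
waits = [reply_delay_ms(t, False) for t in arrivals]
print(max(waits), sum(waits) / len(waits))
print(max(reply_delay_ms(t, True) for t in arrivals))
```

In this model the wait without rescheduling is up to 19 ms and 9.5 ms on average, and every protocol reply pays it again. With rescheduling after the ISR the wait disappears, which is what a check for sys_rschd after interrupt processing would be meant to achieve.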

Richard


Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-05 Thread Peter Graf
Thierry wrote:

> On Thu, 04 Sep 2003 20:23:08 +0200, Peter Graf wrote:
>
> > Hi,
> >
> > I made an experimental boost of QLwIP speed to the Ethernet maximum of 10
> > Mbit/sec, which results in a massive amount of calls to MT.SUSJB, MT.RELJB
> > and MT.PRIOR, typically several thousands per second.

[snip]

> Well, I guess the problem is that all three calls are exiting via the
> scheduler (they are not atomic traps). My guess is that calling them in
> rapid succession (more than once every 1/50th of second) makes the job
> to reenter recursively the scheduler and to fill up the supervisor stack...

Calls can indeed be more than 20 times per 1/50th of a second. I have no
idea how the recursion could emerge, but your scenario would fit into the
picture.

> It might work under SMSQ/E (bigger stack, much better and faster scheduler),
> but this is definitely not recommended under QDOS...

I'll have a look.

> Plus, I'm a bit surprised that you are apparently using jobs to fetch the
> data from the ethernet card... It should be done via an interrupt handler
> instead...

At first sight it looks like that of course. QDOS/SMS reality is different
though.

> Actually, the best design would be to have the Q60 fast interrupt
> handler to fill a buffer, and a frame interrupt task to move the data from
> that buffer into a bigger one for your job to fetch it in big chunks...).

Wrong.

1. TCP is not a linear flow of data into one direction, even if the purpose 
is file transfer. QDOS (and likely SMSQ/E, too) is so primitive that an 
interrupt service routine can _not_ trigger immediate rescheduling of jobs 
after it has completed. The time until the next rescheduling can be 20 ms 
(worst case) so the user job has to wait that time until it can process the 
data. The effect is that the other TCP endpoint in the network has to wait 
20 ms + processing + transfer time until it can react to the response 
packet. Given MTU=1460=1.5KB your interrupt driven approach can not 
guarantee more than a throughput of 1.5 KB / 20 ms = 75 KB/s with TCP, even 
if the other endpoint needs zero time to process its packets. (75 KB/s is 
not quite what I want.)

Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need 
to always poll the NIC, a clever approach can lead to full TCP throughput 
during network activity, but zero polling waste (except for a few tens of 
instructions per 50 Hz) when the network is inactive. The details are 
somewhat complex, but as long as the OS isn't changed, I have no better choice.

2. You waste response (and processor) time by your second copying level. 
Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying 
about 1 MB every second _does_ matter.

3. The idea of collecting fragments into larger buffers is not feasible, 
unless you implement the TCP/IP stack itself within ISRs. (There are good 
reasons not to do that!)

All the best
Peter



Re: [ql-developers] Massive amount of job state transitions and re-scheduling

2003-09-05 Thread Thierry Godefroy

On Thu, 04 Sep 2003 20:23:08 +0200, Peter Graf wrote:

> Hi,
> 
> I made an experimental boost of QLwIP speed to the Ethernet maximum of 10 
> Mbit/sec, which results in a massive amount of calls to MT.SUSJB, MT.RELJB 
> and MT.PRIOR, typically several thousands per second.
> 
> After days of debugging attempts, I still have strange effects that lead 
> me to a slight concern about the scheduler. The problem is hard to 
> reproduce and even harder to debug for lack of appropriate tools. It seems like 
> under rare timing conditions a job is not released after a call to 
> MT.RELJB. The problem no longer occurs when I reduce the amount of the 
> three mentioned OS calls to several hundreds per second.
> 
> It can still be a side effect of a bug in my code, nevertheless I'd like to 
> know if such a massive use of job state transition and rescheduling has 
> ever been tested before. Any ideas?
> 
> Is it even possible that the number of SUSJB/RELJB/PRIOR calls per frame 
> interrupt is limited?

Well, I guess the problem is that all three calls are exiting via the
scheduler (they are not atomic traps). My guess is that calling them in
rapid succession (more than once every 1/50th of second) makes the job
to reenter recursively the scheduler and to fill up the supervisor stack...

It might work under SMSQ/E (bigger stack, much better and faster scheduler),
but this is definitely not recommended under QDOS...

Plus, I'm a bit surprised that you are apparently using jobs to fetch the
data from the ethernet card... It should be done via an interrupt handler
instead... Actually, the best design would be to have the Q60 fast interrupt
handler to fill a buffer, and a frame interrupt task to move the data from
that buffer into a bigger one for your job to fetch it in big chunks...).

Thierry.