Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Hi,

late continuation of this old thread. Two issues were discussed back then:

(1) A fundamental shortcoming of QDOS/SMS, which does not allow a high-speed multitasking TCP/IP implementation with an interrupt-driven structure.
(2) A problem in QLwIP that occurred only when an extremely high number of scheduler calls was made.

The good news: Problem (1) has been solved, thanks to an improvement for QDOS Classic written by Richard Zidlicky. Only a few lines of code, but with great effect. It has allowed me to make the lowest driver level interrupt-triggered, without the latency problem when interacting with jobs. I plan more intensive testing. Later on, the modification should be discussed with Mark Swift.

Problem (2) also seems to be solved, as a side effect of the other modification. Since the cause of the problem was never exactly found, I'm not sure whether this is just luck. But I have transferred several hundred MB of data over the network this evening and the well-known "hanging" of the HTTPD job at full speed has not yet appeared again, so there's hope.

Kudos to Richard!

All the best
Peter

Just by the way, I never was a big fan of the Qubide IDE driver, but I must admit that it is more intelligent than the SMSQ/E driver. While the latter needed a restricted buffer size for acceptable speed, Qubide can deal with huge buffers, without the "crawling slaveblock" effect. (Helpful for the HTTP server.)
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Per wrote: Peter writes: <> > > Arent you trying to make the OS do something it was never designed to > > do? Writing drivers is a programming challenge. The OS is there to help > > where it can, but no OS author can anticipate any and every piece of > > hardware that is going to be attached to the machine in the future. That is > > the job of the driver. (Preferably without each driver author altering the > > OS to suit their own needs ;) > > Somehow I doubt that you need to teach me that writing drivers > is a programming challenge or more trivialities and generalities > about OS and driver structure ;) I would not presume to teach you anything. I know that would be futile. I was not debating detail with you - I know nothing of the detail. I was discussing principle, and that I, and other well-meaning souls on this list trying to help, happen to know a little about. The detail normally follows from the priciple. Please spare me from ridicule for voicing a legitimate point of view. It does neither of us any favours, and tends to sour the atmosphere. Calm down. If you publicly explain someone who has just constructed a car (and asks for help improving the transmission) what a car is, don't be surprised that he recognizes you weren't gonna help him with the transmission. He might have the impression that you were trying to spread a little doubt about his ability to build a good car. Peter
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Peter writes: <> > > Arent you trying to make the OS do something it was never designed to > > do? Writing drivers is a programming challenge. The OS is there to help > > where it can, but no OS author can anticipate any and every piece of > > hardware that is going to be attached to the machine in the future. That is > > the job of the driver. (Preferably without each driver author altering the > > OS to suit their own needs ;) > > Somehow I doubt that you need to teach me that writing drivers > is a programming challenge or more trivialities and generalities > about OS and driver structure ;) I would not presume to teach you anything. I know that would be futile. I was not debating detail with you - I know nothing of the detail. I was discussing principle, and that I, and other well-meaning souls on this list trying to help, happen to know a little about. The detail normally follows from the priciple. Please spare me from ridicule for voicing a legitimate point of view. It does neither of us any favours, and tends to sour the atmosphere. > I guess you have to accept that QDOS (SMS?) has a principal > shortcoming, not an author dependant need, and should be > improved. As to my little jibe, it was a kindly nudge-wink in the direction of Richard, relating to an upstream tributary of this discussion. "Improvements" to Qdos is a very serious matter and should not be undertaken lightly as it affects everyone. Per
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On 9 Sep 2003 at 14:42, Thierry Godefroy wrote:

> On Sun, 07 Sep 2003 21:53:34 +0200, Peter Graf wrote:
>
> > Thierry wrote:
> >
> > >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
> > >out of order receipt of TCP packets... That doesn't change the fact you could
> > >use the fast interrupt to store as many TCP packet as needed (i.e. when they
> > >come in), into a buffer (organized as a linked list of received packets),
> > >then to transfer the whole lot of packets to the higher level layers of the
> > >TCP/IP stack at once and every 1/50th of second...
> >
> > Obviously correct but useless. Try to understand that the problem in your
> > approach is latency and can not be solved by buffering, no matter how
> > efficient buffering is implemented.
>
> From the computer sending packets to a Qx0, the latency would just be seen
> as a longer route... When you ping a computer on Internet, you have to
> wait a variable amount of time for the reply, depending on how many routers
> must be crossed, on how long are the wires (for trans-oceanic links, or worst
> for satellite links, it's far from negligible), and how fast and/or busy is
> the receiver...
>
> If the topographic design of the network between a PC and a Q60 should lead
> to, say, a 200ms delay in the reply, then the TCP/IP implementation on the
> Q60 would simply add a 20ms latency to this number, but in the end, the
> sender should still receive its ACKs between 200 and 220ms after the packets
> are sent...

Sure. That's all obvious but still unrelated to the given LAN data rate challenge.

> Of course, this supposes that the acknowledgement is -actually- done every
> 20ms in the Q60, which is -NOT- the case if it's done at the job level (jobs
> are elected or not, depending on their cumulated priority and are therefore
> -NOT- running each 20ms unless they are alone in memory)...

This is still only the trivial view. It is not applicable to the way I've implemented things.
Let me give you a (much) simplified example: A top priority job (let's call it the 50Hz ISR replacement) that suspends itself long before the 20 ms interval is over (and thereby gives the lower priority jobs their share in the same interval) will practically always be elected to run again after the suspension time (20 ms) is over. You can have 10 other jobs running (most of them will usually block for I/O most of the time) and even a benchmark that consumes all the rest of the CPU power, and it still works. There are rare circumstances where the high prio job will indeed not be elected now and then, but under these circumstances that is the best that can happen, because some of the user interface should remain in working condition. It's better to slow down the network or even drop packets under these rare circumstances (let TCP deal with the dropped packets :).

Just by the way, the remark about "alone in memory" was completely wrong, because it's irrelevant whether a job is loaded and activated. The point is at which intervals jobs block for I/O (or something else). You can easily have the case where several jobs are each executed every 20 ms and all do real work. This is not even unusual if the jobs process data from IRQ-driven I/O, and the I/O is slower than the CPU.

> The code for reassembling the fragmented packets and acknowledging them
> must be implemented either as a frame interrupt, or (if frame interrupts
> are still too slow for your taste),

We are not talking about a special taste, but about the normal 10 Mbit/s Ethernet data rate under TCP.

> by using a polled routine triggered
> by the Q60 fast interrupt (the one used by the sound system).
>
> The structure for the whole TCP/IP stack would then be:
>
> 1.- IP packets fetching from the I/F and buffering:
>     - External interrupt handler (best), or fast interrupt polling loop.
> 2.- IP packets reassembling and acknowledging:
>     - polled task: fast interrupt handler (best) or frame interrupt.
> 3.- TCP/IP high level protocols:
>     - High priority (127) job.
>
> How does this sound ?

I remember that I thought about a similar structure when I was still at the very beginning of this project and had not dealt with the details.

Firstly, your structure would not allow 10 Mbit/s TCP streams in a normal LAN. (Note that level 1 won't generate TCP ACKs.) Second, I wonder what makes you think there are packet acknowledgements at the IP level. Third, it is dangerous to let TCP processing run at highest priority. If there is very much TCP traffic (let's for a moment take into account a wider range of machines than the Q60, or malicious traffic) your user interface may stop working. ... The list could go on.

I for one have 3 different philosophies: One in case QDOS is fixed and I have a lot of time, the second in case QDOS is fixed and I'm short of time, the third in case QDOS isn't fixed :-)

Thierry, I don't search for theoretical advice on OS and drivers str
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On Tue, Sep 09, 2003 at 01:37:53AM +0100, P Witte wrote: > > Peter Graf writes: > > > >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and > of > > >out of order receipt of TCP packets... That doesn't change the fact you > could > > >use the fast interrupt to store as many TCP packet as needed (i.e. when > they > > >come in), into a buffer (organized as a linked list of recieved packets), > > >then to transfer the whole lot of packets to the higher level layers of > the > > >TCP/IP stack at once and every 1/50th of second... > > > > Obviously correct but useless. Try to understand that the problem in your > > approach is latency and can not be solved by buffering, no matter how > > efficient buffering is implemented. > > > > Simple example: A M$ or Unix machine sends a file to the QDOS machine via > > TCP. It will send one or two packets, then stop and wait for ACK. Further > > packets will only be sent after further ACKs. Your ACKs can only be > > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz > > rhythm. (Or two-by-two, if you're lucky.) > > But does the incoming data need to be processed in any way before > acknowledgement? Why cant the ISR simply receive and buffer the data and > then send the ACK before exiting, leaving any processing to the higher > levels? my impression is that to do that for TCP you would have to do all of the protocol implementation into the ISR > In our January discussion you mentioned the case of echo. There is nothing > to stop you from implementing time-critical routines, like echo, in the > 'physical layer'. well there is, as long as QDOS won't allow reschedule after interrupts. Echo is supposed to be a normal application and you would not move it into the ISR layer. As I said it isn't hard to change ISR handling in QDOS and I will certainly do it, right now I have other things that need to be done. Richard
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On Sun, 07 Sep 2003 21:53:34 +0200, Peter Graf wrote:

> Thierry wrote:
>
> >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
> >out of order receipt of TCP packets... That doesn't change the fact you could
> >use the fast interrupt to store as many TCP packet as needed (i.e. when they
> >come in), into a buffer (organized as a linked list of received packets),
> >then to transfer the whole lot of packets to the higher level layers of the
> >TCP/IP stack at once and every 1/50th of second...
>
> Obviously correct but useless. Try to understand that the problem in your
> approach is latency and can not be solved by buffering, no matter how
> efficient buffering is implemented.

From the computer sending packets to a Qx0, the latency would just be seen as a longer route... When you ping a computer on Internet, you have to wait a variable amount of time for the reply, depending on how many routers must be crossed, on how long are the wires (for trans-oceanic links, or worst for satellite links, it's far from negligible), and how fast and/or busy is the receiver...

If the topographic design of the network between a PC and a Q60 should lead to, say, a 200ms delay in the reply, then the TCP/IP implementation on the Q60 would simply add a 20ms latency to this number, but in the end, the sender should still receive its ACKs between 200 and 220ms after the packets are sent...

Of course, this supposes that the acknowledgement is -actually- done every 20ms in the Q60, which is -NOT- the case if it's done at the job level (jobs are elected or not, depending on their cumulated priority and are therefore -NOT- running each 20ms unless they are alone in memory)...

The code for reassembling the fragmented packets and acknowledging them must be implemented either as a frame interrupt, or (if frame interrupts are still too slow for your taste), by using a polled routine triggered by the Q60 fast interrupt (the one used by the sound system).
The structure for the whole TCP/IP stack would then be:

1.- IP packets fetching from the I/F and buffering:
    - External interrupt handler (best), or fast interrupt polling loop.
2.- IP packets reassembling and acknowledging:
    - polled task: fast interrupt handler (best) or frame interrupt.
3.- TCP/IP high level protocols:
    - High priority (127) job.

How does this sound ?

Thierry.
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On 8 Sep 2003 at 0:53, P Witte wrote:

> Peter Graf writes:
>
> > Hi Per,
> >
> > >And Peter, did you try out the suggestions that were made at that time?
> >
> > Can you be a bit more specific? I remember only one applicable suggestion,
> > which was to set a system variable before leaving the ISR. Didn't work, at
> > least not under QDOS.
>
> # By exiting the interrupt handler through the sms.rte function the
> # requested re-schedule will be done immediately if possible (i.e. no
> # supervisor code was running at that time). Example:
> #
> #         include dev8_smsq_smsq_basekeys
> #         include dev8_keys_psf
> #
> # int_handler
> #         movem.l psf.reg,-(sp)
> #
> #         [blah]
> #
> #         st      sys_rshd(a6)    ; Request re-schedule
> #         move.l  sms.rte,a5      ; ...now would be convenient
> #         jmp     (a5)
>
> etc, as per my mail to this list on 18/01/03

See above.

Peter
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On 9 Sep 2003 at 1:37, P Witte wrote: [snip] > But does the incoming data need to be processed in any way before > acknowledgement? Why cant the ISR simply receive and buffer the data and > then send the ACK before exiting, leaving any processing to the higher > levels? The reason is that it's part of the TCP processing and can not be done on ethernet packet level. > In our January discussion you mentioned the case of echo. There is nothing > to stop you from implementing time-critical routines, like echo, in the > 'physical layer'. In fact you can take over the whole machine and do as you > please. Not a task where speed is relevant for the user. No point in speeding up ICMP echo only. > he important thing is to split the driver correctly: Time > critical, ie > usually hardware related stuff, and in this case it appears also certain > demands of the TCP/IP protocol (if I understand correctly) are rightly the > provinance of the ISR. If this sort of thing is not clearcut in TCP/IP, then > a messy solution is called for ;) > > Arent you trying to make the OS do something it was never designed to > do? Writing drivers is a programming challenge. The OS is there to help > where it can, but no OS author can anticipate any and every piece of > hardware that is going to be attached to the machine in the future. That is > the job of the driver. (Preferably without each driver author altering the > OS to suit their own needs ;) Somehow I doubt that you need to teach me that writing drivers is a programming challenge or more trivialities and generalities about OS and driver structure ;) I guess you have to accept that QDOS (SMS?) has a principal shortcoming, not an author dependant need, and should be improved. > Afterall, someone did implement TCP/IP on the Spectrum, which neither > multitasks A singletasking TCP/IP implementation is easier not harder. Just in case you didn't notice: My TCP/IP package for QDOS works. I was talking the obstacles of higher data rates. 
> nor (for all I know) understands interrupts. Of course the Spectrum uses interrupts, and even singletasking TCP/IP needs some sort of timers. All the best Peter
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Peter Graf writes: > >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of > >out of order receipt of TCP packets... That doesn't change the fact you could > >use the fast interrupt to store as many TCP packet as needed (i.e. when they > >come in), into a buffer (organized as a linked list of recieved packets), > >then to transfer the whole lot of packets to the higher level layers of the > >TCP/IP stack at once and every 1/50th of second... > > Obviously correct but useless. Try to understand that the problem in your > approach is latency and can not be solved by buffering, no matter how > efficient buffering is implemented. > > Simple example: A M$ or Unix machine sends a file to the QDOS machine via > TCP. It will send one or two packets, then stop and wait for ACK. Further > packets will only be sent after further ACKs. Your ACKs can only be > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz > rhythm. (Or two-by-two, if you're lucky.) But does the incoming data need to be processed in any way before acknowledgement? Why cant the ISR simply receive and buffer the data and then send the ACK before exiting, leaving any processing to the higher levels? In our January discussion you mentioned the case of echo. There is nothing to stop you from implementing time-critical routines, like echo, in the 'physical layer'. In fact you can take over the whole machine and do as you please. The important thing is to split the driver correctly: Time critical, ie usually hardware related stuff, and in this case it appears also certain demands of the TCP/IP protocol (if I understand correctly) are rightly the provinance of the ISR. If this sort of thing is not clearcut in TCP/IP, then a messy solution is called for ;) Arent you trying to make the OS do something it was never designed to do? Writing drivers is a programming challenge. 
The OS is there to help where it can, but no OS author can anticipate any and every piece of hardware that is going to be attached to the machine in the future. That is the job of the driver. (Preferably without each driver author altering the OS to suit their own needs ;) Afterall, someone did implement TCP/IP on the Spectrum, which neither multitasks nor (for all I know) understands interrupts. Good luck! Per
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Peter Graf writes:

> Hi Per,
>
> >And Peter, did you try out the suggestions that were made at that time?
>
> Can you be a bit more specific? I remember only one applicable suggestion,
> which was to set a system variable before leaving the ISR. Didn't work, at
> least not under QDOS.

# By exiting the interrupt handler through the sms.rte function the
# requested re-schedule will be done immediately if possible (i.e. no
# supervisor code was running at that time). Example:
#
#         include dev8_smsq_smsq_basekeys
#         include dev8_keys_psf
#
# int_handler
#         movem.l psf.reg,-(sp)
#
#         [blah]
#
#         st      sys_rshd(a6)    ; Request re-schedule
#         move.l  sms.rte,a5      ; ...now would be convenient
#         jmp     (a5)

etc, as per my mail to this list on 18/01/03

Per
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On 12 Sep 2003 at 13:00, BRANE wrote:

> O.K. I'm not following this thread from beginning, so I don't know exactly
> what hardware are we talking about, but for this detail, it probably doesn't
> matter much...
>
> So, what is a solution ? Using external interrupt ? Maybe a bit bulkier
> controller with built-in Ethernet ?

1. My recent problem is a timing-related bug when the full data rate is used. It is not yet clear where it comes from, so the solution is also unknown. The lazy workaround is to poll the controller only every 20 ms. Using ISRs brings no advantage compared to this (due to a QDOS-specific shortcoming).

2. Unrelated to this problem, QDOS could use an improvement so ISRs can trigger immediate rescheduling of jobs. Given this improvement, it makes sense to implement the driver with ISRs in order to gain a cleaner driver structure. Whether or not this would have the side effect of curing the aforementioned bug is unclear.

A hardware change is not required.

All the best
Peter
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
> > don't forget this is a rather simple TCP/IP implementation and apparently > it is already hard enough to make the simplest variant working reliably > with the garden variety of TCP/IP implementations out there. > > Richard O.K. I'm not following this thread from beginning, so I don't know exactly what hardware are we talking about, but for this detail, it probably doesn't matter much... So, what is a solution ? Using external interrupt ? Maybe a bit bulkier controller with built-in Ethernet ?
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On Sun, Sep 07, 2003 at 10:48:50PM +0200, BRANE wrote: > > > - Original Message - > From: "Peter Graf" <[EMAIL PROTECTED]> > To: <[EMAIL PROTECTED]> > Sent: Sunday, September 07, 2003 9:53 PM > Subject: Re: [ql-developers] Massive amount of job state transitions and > re-scheduling > > > Simple example: A M$ or Unix machine sends a file to the QDOS machine via > > TCP. It will send one or two packets, then stop and wait for ACK. Further > > packets will only be sent after further ACKs. Your ACKs can only be > > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz > > rhythm. (Or two-by-two, if you're lucky.) > > AFAIK with TCP/IP this is negotiable. There is no need for such small > window... don't forget this is a rather simple TCP/IP implementation and apparently it is already hard enough to make the simplest variant working reliably with the garden variety of TCP/IP implementations out there. Richard
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On 8 Sep 2003 at 1:03, BRANE wrote:

> > QLwIP offers a window of 8760, but I have found that neither Windows nor
> > Linux will exploit that in their standard configuration, at least not as
> > long as their counterpart has 20 ms latency. Depending on the application,
> > they send one, maximum two packets before they wait for ACK.
> >
> > Somehow QLwIP must live with standard behaviour of other machines, you can
> > not expect people to tune their TCP stacks.
> >
> > All the best
> > Peter
>
> Hmm. Something doesn't sound right here. Internet is full of "speed up your
> Internet connection" programs that do amongst other things exactly this, so
> I presume this has to be negotiable.

Well, for Windows I tried such programs, without effect in the given situation. It is possible that the long latency (absolutely unusual in Ethernet networks) leads the counterpart to decide not to send more than one or two TCP packets at once. I have not yet invested much in tuning the other machines, since this cannot be the general solution.

For the IRQ approach to work with the same performance as polling, you'd need at least 14 TCP packets to be transferred back to back without ACK. Decide for yourself whether this is realistic.

> Besides, I find it a bit hard to believe that average PC does acknowledge
> every packet on 100 Mbit Ethernet. This would mean something like
> interrupts with 100 kHz rate. Not very likely on modern machines...

The rate would be about 7 kHz. It may surprise you, but IRQs are actually triggered at this rate, although not every single packet is acknowledged. Still, the number of packets per ACK is usually very small.

> This would also make TCP/IP on 1Gbit Ethernet useless...

On a CPU that can handle TCP at this rate, it's also no problem to deal with the exception handling. I have not looked into 1Gb, no idea how it's usually implemented.

All the best
Peter
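Editorial note: the two figures in this reply can be reproduced with simple arithmetic. The sketch below is an illustration only, not code from the thread. The function names are hypothetical, the 1 MB/s target rate is an assumption (borrowed from the "about 1 MB every second" remark elsewhere in the thread), and the frame rate ignores Ethernet framing overhead.

```python
def segments_per_interval(target_bytes_per_s, interval_ms, mss_bytes=1460):
    """Segments that must arrive back to back within one interval
    to sustain the target rate (rounded up)."""
    scaled_bytes = target_bytes_per_s * interval_ms   # bytes * 1000
    return -(-scaled_bytes // (1000 * mss_bytes))     # ceiling division


def frames_per_second(line_rate_bits_per_s, frame_bytes=1500):
    """Upper bound on full-size frames per second, ignoring preamble,
    headers and inter-frame gap (a simplifying assumption)."""
    return line_rate_bits_per_s / (frame_bytes * 8)


# Assumed target: about 1 MB/s, with ACKs only possible every 20 ms.
print(segments_per_interval(1_000_000, 20))    # 14
# Full-size frames per second on 100 Mbit/s Ethernet.
print(round(frames_per_second(100_000_000)))   # 8333
```

Under these assumptions the first result matches the 14 packets mentioned above, and the second lands in the same range as the 7 kHz and 10 kHz estimates traded in the thread.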
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
- Original Message - From: "BRANE" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Monday, September 08, 2003 1:03 AM Subject: Re: [ql-developers] Massive amount of job state transitions and re-scheduling > Besides, I find it a bit hard to believe that average PC does acknowledge > every packet on 100 Mbit Ethernet. This would mean something like > interrupts with 100 kHz rate. Not very likely on modern machines... Ooops. Divide bug ;o) 10 kHz is more like it, but still very high...
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
- Original Message - From: "Peter Graf" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Monday, September 08, 2003 12:05 AM Subject: Re: [ql-developers] Massive amount of job state transitions and re-scheduling > > BRANE wrote: > > >> Simple example: A M$ or Unix machine sends a file to the QDOS machine via > > > TCP. It will send one or two packets, then stop and wait for ACK. Further > > > packets will only be sent after further ACKs. Your ACKs can only be > > > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz > > > rhythm. (Or two-by-two, if you're lucky.) > > > >AFAIK with TCP/IP this is negotiable. There is no need for such small > >window... > > QLwIP offers a window of 8760, but I have found that neither Windows nor > Linux will exploit that in their standard configuration, at least not as > long as their counterpart has 20 ms latency. Depending on the application, > they send one, maximum two packets before they wait for ACK. > > Somehow QLwIP must live with standard behaviour of other machines, you can > not expect people to tune their TCP stacks. > > All the best > Peter Hmm. Something doesn't sound right here. Internet is full of "speed up your Internet connection" programs that do amongst other things exactly this, so I presume this has to be negotiable. Besides, I find it a bit hard to believe that average PC does acknowledge every packet on 100 Mbit Ethernet. This would mean something like interrupts with 100 kHz rate. Not very likely on modern machines... This would also make TCP/IP on 1Gbit Ethernet useless... Regards, Branko
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
BRANE wrote: >> Simple example: A M$ or Unix machine sends a file to the QDOS machine via > TCP. It will send one or two packets, then stop and wait for ACK. Further > packets will only be sent after further ACKs. Your ACKs can only be > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz > rhythm. (Or two-by-two, if you're lucky.) AFAIK with TCP/IP this is negotiable. There is no need for such small window... QLwIP offers a window of 8760, but I have found that neither Windows nor Linux will exploit that in their standard configuration, at least not as long as their counterpart has 20 ms latency. Depending on the application, they send one, maximum two packets before they wait for ACK. Somehow QLwIP must live with standard behaviour of other machines, you can not expect people to tune their TCP stacks. All the best Peter
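Editorial note: the window discussion can be put into numbers. A sender can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by the bytes in flight divided by the round-trip time. The sketch below is an illustration with hypothetical names; it assumes a 1460-byte segment size and treats the 20 ms latency as the entire round-trip time.

```python
def window_limited_throughput(bytes_in_flight, rtt_ms):
    """Upper bound in bytes/s when at most `bytes_in_flight` bytes can be
    unacknowledged during one round trip of `rtt_ms` milliseconds."""
    return bytes_in_flight * 1000 // rtt_ms


MSS = 1460      # assumed segment size in bytes
WINDOW = 8760   # window offered by QLwIP, as stated above
RTT_MS = 20     # latency discussed in the thread

print(WINDOW // MSS)                                  # 6 full segments
print(window_limited_throughput(WINDOW, RTT_MS))      # 438000
print(window_limited_throughput(2 * MSS, RTT_MS))     # 146000
```

The gap between the last two lines is the point of the message above: the offered window would permit six segments per round trip, but a sender that stops after one or two segments stays far below that bound.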
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
- Original Message - From: "Peter Graf" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Sunday, September 07, 2003 9:53 PM Subject: Re: [ql-developers] Massive amount of job state transitions and re-scheduling > > Thierry wrote: > > >Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of > >out of order receipt of TCP packets... That doesn't change the fact you could > >use the fast interrupt to store as many TCP packet as needed (i.e. when they > >come in), into a buffer (organized as a linked list of recieved packets), > >then to transfer the whole lot of packets to the higher level layers of the > >TCP/IP stack at once and every 1/50th of second... > > Obviously correct but useless. Try to understand that the problem in your > approach is latency and can not be solved by buffering, no matter how > efficient buffering is implemented. > > Simple example: A M$ or Unix machine sends a file to the QDOS machine via > TCP. It will send one or two packets, then stop and wait for ACK. Further > packets will only be sent after further ACKs. Your ACKs can only be > generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz > rhythm. (Or two-by-two, if you're lucky.) AFAIK with TCP/IP this is negotiable. There is no need for such small window...
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Hi Per,

>And Peter, did you try out the suggestions that were made at that time?

Can you be a bit more specific? I remember only one applicable suggestion, which was to set a system variable before leaving the ISR. Didn't work, at least not under QDOS.

>Could the effects Peter mentions have anything to do with the cache?

Same with caches off. The effect only happens in real time and is hard to debug; it may have a completely different cause.

Bye, Peter
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Thierry wrote:

>Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of
>out of order receipt of TCP packets... That doesn't change the fact you could
>use the fast interrupt to store as many TCP packet as needed (i.e. when they
>come in), into a buffer (organized as a linked list of received packets),
>then to transfer the whole lot of packets to the higher level layers of the
>TCP/IP stack at once and every 1/50th of second...

Obviously correct but useless. Try to understand that the problem in your approach is latency and can not be solved by buffering, no matter how efficient buffering is implemented.

Simple example: A M$ or Unix machine sends a file to the QDOS machine via TCP. It will send one or two packets, then stop and wait for ACK. Further packets will only be sent after further ACKs. Your ACKs can only be generated in 50 Hz rhythm, so packets will crawl one-by-one in 50 Hz rhythm. (Or two-by-two, if you're lucky.)

> > QDOS (and likely SMSQ/E, too) is so primitive that an
> > interrupt service routine can _not_ trigger immediate rescheduling of jobs
> > after it has completed. The time until the next rescheduling can be 20 ms
> > (worst case) so the user job has to wait that time until it can process the
> > data. The effect is that the other TCP endpoint in the network has to wait
> > 20 ms + processing + transfer time until it can react to the response
> > packet. Given MTU=1460=1.5KB your interrupt driven approach can not
> > guarantee more than a throughput of 1.5 KB / 20 ms = 75 KB/s with TCP, even
> > if the other endpoint needs zero time to process its packets. (75 KB/s is
> > not quite what I want.)
>
>Wrong... With my method, you simply get a 20ms penalty (at worst) on the
>acknowledgment of all the packets that were buffered... I.e. you'll have a
>(worst case) 20ms penalty when pinging a Q60 on a network, compared to
>another computer...

Obviously correct, it only supports what I explained.
You seem to have the (unrealistic) idea that the other endpoint will be sending (much) more than one packet per ACK. > Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need > to always poll the NIC, a clever approach can lead to full TCP throughput > during network activity, but zero polling waste (except for a a few tens of > instructions per 50 Hz) when the network is inactive. You don't need to poll the hardware as long as you can use an interrupt to signal the arrival of each new packet. Is the Q60 able to trigger an extrenal interrupt on such conditions ? If yes, then the lowest layer of the TCP/IP stack (actually of the Ethernet driver) could be implemented as the external interrupt handler... Yes the Q60 can trigger those interrupts, yes driver implementation is possible, yes it replaces polling. Irrelevant altogether in a QDOS system, unless I want TCP to crawl. > 2. You waste response (and processor) time by your second copying level. > Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying > about 1 MB every second _does_ matter. Well, aren't we speaking about the Q60 (or Q40) here ? I mean, there's not even an Ethernet I/F on (S)GCs... Compared to the overhead of the scheduler calls, such a large number of external memory accesses seems relevant to me. Plus your approach doesn't even eliminate the scheduler calls if you want similar TCP response time (and TCP is implemented as a job). BTW Nasta had the Ethernet design for (S)GC, even the PCB, ready. > 3. The idea of collecting fragments into larger buffers is not feasible, > unless you implement the TCP/IP stack itself within ISRs. (There are good > reasons not to do that!) This is wrong... The low level part is only responsible for moving the data from the hardware into an area of the memory wher it can wait until it's processed... I see no problem at all... You proposed collection of fragments so the job can fetch them in big chunks, i.e. combining the payload. 
I referred to that. All the best Peter
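The ACK-clocking argument in this message can be illustrated with a toy model. It is only a sketch under stated assumptions, not QLwIP code: the sender pauses after every burst until it receives an ACK, the wire adds no delay, and the names transfer_time_ms, burst and proc_ms are made up for the example.

```python
TICK_MS = 20  # 50 Hz scheduler tick, the figure used in the thread


def transfer_time_ms(total_segments, burst, proc_ms, ack_on_tick):
    """Time until the last ACK for a sender that waits after every burst.

    total_segments: number of segments to transfer
    burst:          segments the sender emits per ACK (hypothetical parameter)
    proc_ms:        receiver processing time per segment, in whole ms
    ack_on_tick:    True  -> the receiving job only runs at the next 50 Hz tick
                    False -> the job runs as soon as the data has arrived
    """
    now = 0
    sent = 0
    while sent < total_segments:
        n = min(burst, total_segments - sent)
        if ack_on_tick:
            # Data arrived at `now`, but the job sleeps until the next tick.
            now = (now // TICK_MS + 1) * TICK_MS
        now += n * proc_ms  # job processes the burst and sends the ACK
        sent += n
    return now


if __name__ == "__main__":
    for on_tick in (True, False):
        t = transfer_time_ms(100, burst=1, proc_ms=1, ack_on_tick=on_tick)
        print("tick-bound" if on_tick else "immediate ", t, "ms")
```

With 100 segments, one segment per ACK and 1 ms of processing each, the tick-bound variant needs 2001 ms against 100 ms for the immediate one. Raising burst helps the tick-bound case only in proportion to the burst size, which is the "two-by-two, if you're lucky" remark in numbers.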
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On Sat, 06 Sep 2003 00:24:18 +0200, Peter Graf wrote:

> Thierry wrote:
>
> .../...
>
> > Plus, I'm a bit surprised that you are apparently using jobs to fetch the data from the ethernet card... It should be done via an interrupt handler instead...
>
> At first sight it looks like that of course. QDOS/SMS reality is different though.
>
> > Actually, the best design would be to have the Q60 fast interrupt handler to fill a buffer, and a frame interrupt task to move the data from that buffer into a bigger one for your job to fetch it in big chunks...
>
> Wrong.
>
> 1. TCP is not a linear flow of data into one direction, even if the purpose is file transfer.

Yes, this I know, thanks... I'm perfectly aware of the fragmentation and of out of order receipt of TCP packets... That doesn't change the fact you could use the fast interrupt to store as many TCP packets as needed (i.e. when they come in), into a buffer (organized as a linked list of received packets), then to transfer the whole lot of packets to the higher level layers of the TCP/IP stack at once and every 1/50th of a second...

> QDOS (and likely SMSQ/E, too) is so primitive that an interrupt service routine can _not_ trigger immediate rescheduling of jobs after it has completed. The time until the next rescheduling can be 20 ms (worst case) so the user job has to wait that time until it can process the data. The effect is that the other TCP endpoint in the network has to wait 20 ms + processing + transfer time until it can react to the response packet. Given MTU=1460=1.5KB your interrupt driven approach can not guarantee more than a throughput of 1.5 KB / 20 ms = 75 KB/s with TCP, even if the other endpoint needs zero time to process its packets. (75 KB/s is not quite what I want.)

Wrong... With my method, you simply get a 20ms penalty (at worst) on the acknowledgment of all the packets that were buffered... I.e. you'll have a (worst case) 20ms penalty when pinging a Q60 on a network, compared to another computer...

> Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need to always poll the NIC, a clever approach can lead to full TCP throughput during network activity, but zero polling waste (except for a few tens of instructions per 50 Hz) when the network is inactive.

You don't need to poll the hardware as long as you can use an interrupt to signal the arrival of each new packet. Is the Q60 able to trigger an external interrupt on such conditions? If yes, then the lowest layer of the TCP/IP stack (actually of the Ethernet driver) could be implemented as the external interrupt handler...

> The details are somewhat complex, but as long as the OS isn't changed, I have no better choice.
>
> 2. You waste response (and processor) time by your second copying level. Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying about 1 MB every second _does_ matter.

Well, aren't we speaking about the Q60 (or Q40) here? I mean, there's not even an Ethernet I/F on (S)GCs...

> 3. The idea of collecting fragments into larger buffers is not feasible, unless you implement the TCP/IP stack itself within ISRs. (There are good reasons not to do that!)

This is wrong... The low level part is only responsible for moving the data from the hardware into an area of the memory where it can wait until it's processed... I see no problem at all...

Thierry.
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Richard Zidlicky writes:

<>

> > Plus, I'm a bit surprised that you are apparently using jobs to fetch the data from the ethernet card... It should be done via an interrupt handler instead... Actually, the best design would be to have the Q60 fast interrupt handler to fill a buffer, and a frame interrupt task to move the data from that buffer into a bigger one for your job to fetch it in big chunks...
>
> this was discussed a while ago here, the big problem is that neither QDOS nor SMSQ will attempt to reschedule after interrupt handling and there is no way to deal with the complexities of the TCP/IP protocol inside the interrupt handler. That means sending of protocol replies would be very often delayed by 1/50s which would make especially TCP crawl..

The last words you wrote the last time we discussed this topic were:

> Otoh checking for sys_rschd after isr processing looks really trivial and top priority now.

Did you ever get round to it? And Peter, did you try out the suggestions that were made at that time?

Could the effects Peter mentions have anything to do with the cache?

Per
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On Fri, Sep 05, 2003 at 09:18:52PM +0200, Thierry Godefroy wrote:

> > Is it even possible that the number of SUSJB/RELJB/PRIOR calls per frame interrupt is limited?
>
> Well, I guess the problem is that all three calls are exiting via the scheduler (they are not atomic traps). My guess is that calling them in rapid succession (more than once every 1/50th of a second) makes the job recursively reenter the scheduler and fill up the supervisor stack...

I don't see how this could happen. The calls will exit through the scheduler, which will clean up the supervisor stack before returning to user mode. The problem could only arise when you call either of these traps from supervisor mode, which is a very bad idea anyway.

> Plus, I'm a bit surprised that you are apparently using jobs to fetch the data from the ethernet card... It should be done via an interrupt handler instead... Actually, the best design would be to have the Q60 fast interrupt handler to fill a buffer, and a frame interrupt task to move the data from that buffer into a bigger one for your job to fetch it in big chunks...

this was discussed a while ago here, the big problem is that neither QDOS nor SMSQ will attempt to reschedule after interrupt handling and there is no way to deal with the complexities of the TCP/IP protocol inside the interrupt handler. That means sending of protocol replies would be very often delayed by 1/50s which would make especially TCP crawl..

Richard
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
Thierry wrote:

> On Thu, 04 Sep 2003 20:23:08 +0200, Peter Graf wrote:
>
> > Hi,
> >
> > I made an experimental boost of QLwIP speed to the Ethernet maximum of 10 Mbit/sec, which results in a massive amount of calls to MT.SUSJB, MT.RELJB and MT.PRIOR, typically several thousands per second.

[snip]

> Well, I guess the problem is that all three calls are exiting via the scheduler (they are not atomic traps). My guess is that calling them in rapid succession (more than once every 1/50th of a second) makes the job recursively reenter the scheduler and fill up the supervisor stack...

Calls can indeed be more than 20 times per 1/50th of a second. I have no idea how the recursion could emerge, but your scenario would fit into the picture.

> It might work under SMSQ/E (bigger stack, much better and faster scheduler), but this is definitely not recommended under QDOS...

I'll have a look.

> Plus, I'm a bit surprised that you are apparently using jobs to fetch the data from the ethernet card... It should be done via an interrupt handler instead...

At first sight it looks like that of course. QDOS/SMS reality is different though.

> Actually, the best design would be to have the Q60 fast interrupt handler to fill a buffer, and a frame interrupt task to move the data from that buffer into a bigger one for your job to fetch it in big chunks...

Wrong.

1. TCP is not a linear flow of data into one direction, even if the purpose is file transfer. QDOS (and likely SMSQ/E, too) is so primitive that an interrupt service routine can _not_ trigger immediate rescheduling of jobs after it has completed. The time until the next rescheduling can be 20 ms (worst case) so the user job has to wait that time until it can process the data. The effect is that the other TCP endpoint in the network has to wait 20 ms + processing + transfer time until it can react to the response packet. Given MTU=1460=1.5KB your interrupt driven approach can not guarantee more than a throughput of 1.5 KB / 20 ms = 75 KB/s with TCP, even if the other endpoint needs zero time to process its packets. (75 KB/s is not quite what I want.)

Unlike an ISR, a job _can_ trigger immediate rescheduling! You don't need to always poll the NIC, a clever approach can lead to full TCP throughput during network activity, but zero polling waste (except for a few tens of instructions per 50 Hz) when the network is inactive. The details are somewhat complex, but as long as the OS isn't changed, I have no better choice.

2. You waste response (and processor) time by your second copying level. Imagine running the TCP/IP stack on a SuperGoldCard. Copying or not copying about 1 MB every second _does_ matter.

3. The idea of collecting fragments into larger buffers is not feasible, unless you implement the TCP/IP stack itself within ISRs. (There are good reasons not to do that!)

All the best
Peter
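The 75 KB/s figure in point 1 is plain arithmetic, and can be written out as a small helper. This is a sketch only: the function name and its parameters are invented for illustration, and it assumes the sender transmits a fixed number of segments and then waits for the ACK. (A side note rather than a correction: 1460 bytes is the usual TCP payload on Ethernet, where the MTU itself is 1500 bytes; the result is nearly the same either way.)

```python
def latency_bound_throughput(segment_bytes, segments_per_ack, ack_delay_ms):
    """Upper bound in bytes per second for an ACK-clocked sender.

    segment_bytes:    payload carried by one segment
    segments_per_ack: segments sent before the sender waits for an ACK
    ack_delay_ms:     worst-case delay before the receiver can answer
    All three parameter names are hypothetical, chosen for this example.
    """
    return segment_bytes * segments_per_ack * 1000 / ack_delay_ms


if __name__ == "__main__":
    # The thread's numbers: 1.5 KB per segment, one ACK per 20 ms tick.
    print(latency_bound_throughput(1500, 1, 20))   # 75000.0
    # Same bound with a 1460-byte payload per segment.
    print(latency_bound_throughput(1460, 1, 20))   # 73000.0
    # "Two-by-two, if you're lucky" only doubles it.
    print(latency_bound_throughput(1500, 2, 20))   # 150000.0
```

For comparison, 10 Mbit/s Ethernet is 1,250,000 bytes per second on the wire, so the tick-bound figure stays far below what the link could carry.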
Re: [ql-developers] Massive amount of job state transitions and re-scheduling
On Thu, 04 Sep 2003 20:23:08 +0200, Peter Graf wrote:

> Hi,
>
> I made an experimental boost of QLwIP speed to the Ethernet maximum of 10 Mbit/sec, which results in a massive amount of calls to MT.SUSJB, MT.RELJB and MT.PRIOR, typically several thousands per second.
>
> After days of debugging attempts, I still have strange effects that lead me to a slight concern about the scheduler. The problem is hard to reproduce and even harder to debug for lack of appropriate tools. It seems like under rare timing conditions a job is not released after a call to MT.RELJB. The problem no longer occurs when I reduce the amount of the three mentioned OS calls to several hundreds per second.
>
> It can still be a side effect of a bug in my code, nevertheless I'd like to know if such a massive use of job state transition and rescheduling has ever been tested before. Any ideas?
>
> Is it even possible that the number of SUSJB/RELJB/PRIOR calls per frame interrupt is limited?

Well, I guess the problem is that all three calls are exiting via the scheduler (they are not atomic traps). My guess is that calling them in rapid succession (more than once every 1/50th of a second) makes the job recursively reenter the scheduler and fill up the supervisor stack... It might work under SMSQ/E (bigger stack, much better and faster scheduler), but this is definitely not recommended under QDOS...

Plus, I'm a bit surprised that you are apparently using jobs to fetch the data from the ethernet card... It should be done via an interrupt handler instead... Actually, the best design would be to have the Q60 fast interrupt handler to fill a buffer, and a frame interrupt task to move the data from that buffer into a bigger one for your job to fetch it in big chunks...

Thierry.
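The buffering scheme proposed at the end of this message (a fast interrupt handler queues frames, a 50 Hz task hands them on in one batch) might look roughly like the sketch below. It is Python standing in for what would really be driver code, the class and method names are invented, and it says nothing about how QLwIP actually works. It also only shows the data path; it does not address ACK latency.

```python
from collections import deque


class RxQueue:
    """Bounded FIFO between an interrupt handler and a periodic task."""

    def __init__(self, capacity):
        self._frames = deque()
        self._capacity = capacity
        self.dropped = 0  # frames lost because the queue was full

    def on_interrupt(self, frame):
        """Interrupt side: store one received frame, or count it as dropped."""
        if len(self._frames) >= self._capacity:
            self.dropped += 1
            return False
        self._frames.append(frame)
        return True

    def drain(self):
        """Frame-task side: return everything queued so far, oldest first."""
        batch = list(self._frames)
        self._frames.clear()
        return batch


if __name__ == "__main__":
    q = RxQueue(capacity=2)
    for frame in (b"one", b"two", b"three"):
        q.on_interrupt(frame)
    print(q.drain(), "dropped:", q.dropped)
```

The capacity limit is there because an interrupt handler cannot wait for space; a real driver would have to pick the size against the amount of data that can arrive in one 20 ms tick.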