On Thursday 16 October 2008 04:47:37 Denys Vlasenko wrote:
> On Thursday 16 October 2008 08:38:41 am Rob Landley wrote:
> > On Wednesday 15 October 2008 11:36:19 Rob Landley wrote:
> > > I think I can work out a test for this (write a script suspending the
> > > qemu process with "while true; do kill -STOP $PID; sleep 1; kill -CONT
> > > $PID; sleep 1; done" and then hold down "cursor left" in vi for a
> > > couple minutes and see if it zaps the line it's on.
> >
> > I finally got a chance to do this, and qemu did _not_ eat the line.  So
> > the serial interrupt takes precedence over the timer interrupt in Linux,
> > which means all we have to worry about is the actual serial delay.
>
> I tend to agree with your previous mail, not this one :)
>
> If you got an ESC, the next char is delayed 8ms by the serial line, and
> if you are scheduled away in poll(), there is no guarantee you come back
> 25ms and not 25s later.

Sure, but the data will have come into the kernel and been queued, so when you 
_do_ get scheduled again the poll() will return because there's data pending, 
not because of the timeout.

> Kernel simply gives no such promises. 

It checks for pending data before it checks for the timeout.  The kernel 
accepting data from the serial port and queuing it to the tty happens in 
interrupt context, and that interrupt handling takes priority over handling 
the timer interrupt that causes the poll to expire.  (That was the test I did 
by repeatedly suspending qemu for more than the timeout period.)

> In-kernel stuff (IRQs from gigabit ethernet which is flooded
> with UDP packets) also can delay processing of next character
> from serial line.

Yes, but it would delay the timer expiration more.  They still get done in 
sequence; the pending interrupt that provides data to poll() is handled 
before the pending interrupt that causes poll() to time out.  Even if you 
stop the processor so it can't handle either interrupt until both have been 
asserted (killall -STOP qemu), when it resumes it handles them in the right 
order.

> It's actually quite easy to artificially create 
> such conditions for testing. With enough IRQs coming in, you
> can even make the serial line lose characters. :)

If you actually lose characters and corrupt the escape sequence, we're going 
to interpret it as individual characters no matter what happens.  No poll 
delay will fix that one.

(As an aside, I note that a 16550a has a 16 byte hardware buffer.  That's been 
PC standard since the late '80s, and it's cheap and plastic enough that it gets 
used even on a lot of embedded systems.)

> > I trimmed it
> > down to 25 milliseconds, which should be plenty for the next character to
> > come in at 1200 bps, and is way the heck below any human perceptual
> > threshold.
>
> Couldn't resist, joined bikeshed painting :| and bumped it to 50.

*shrug*  I note that 25 milliseconds is already about 3 times what a 1200 bps 
connection needs, but the perceptual threshold of "sluggish UI" is something 
like 1/8th of a second according to the human factors people (I'd look up the 
actual reference, but don't actually care), so we're still good.

> It's 5 scheduling intervals on a 100 Hz system; I consider a system
> which does not deprive vi of scheduling for 5 timer tick intervals
> to be a "system not yet too overloaded". 25ms is only 2 intervals.

It's not scheduling.  It has nothing to do with scheduling; that's what I ran 
a test to prove.  The point is that poll() is a syscall, and the kernel 
flags it to return to userspace either when it gets data from the device or 
when the timer it requested expires, neither of which has anything to do with 
the process scheduler.  From the time the process gets blocked waiting for 
poll() to return, the process isn't eligible to be scheduled; poll() has 
to complete and change the process's state back to runnable before the 
process can _be_ scheduled again.  (I believe the process isn't even in a CPU 
queue during any of this; it gets moved to a timer queue.)

It has nothing to do with timer ticks.  (And the comment that 50ms is somehow 
5 timer ticks is actively wrong; if you look at the stupid chunk of Perl 
Peter Anvin added to the kernel build in 2.6.25, there's apparently a 
platform out there using 24 ticks/second.  Also, the current scheduler doesn't 
switch _every_ timer tick; it's more complicated than that.  Plus a higher 
priority task becoming runnable doesn't wait until the next tick to interrupt 
a lower priority task, and with dynamic priorities something that blocks 
waiting for input a lot, like vi, goes up in priority.  All this happened 
before the kernel went to "dynamic ticks", where it can switch the jiffy timer 
interrupt off entirely while idle...)

> I also increased a count in read(). There is a non-obvious reason
> why bumping it past 3 would require more surgery, or
> "hold down Page Down" starts editing the file on its own :)

I don't understand what you're talking about here.

> (hint: assigning "n = 0;" is wrong in this case).

I changed the code to only pre-read one character at a time, and only when 
another character was needed to match a potential escape sequence.  I also 
sorted the escape sequences so that we never look at a _shorter_ escape 
sequence after looking at a long one.  (There's a comment on that.)

That means that if we matched the entire escape sequence, we read _exactly_ as 
much data as we needed to do so, and have thus consumed exactly the contents 
of the buffer.  That's why I set n=0.  Why is this wrong?

Rob
_______________________________________________
busybox mailing list
[email protected]
http://busybox.net/cgi-bin/mailman/listinfo/busybox