Fri, 18 Feb 2000 Randy MacLeod wrote:
>    Did this make it to the mailing list? I got no responses...
> and there have been problems with the listserver.

Not sure, although I don't think I've seen this...

> Now I want to backup one of my files while I collect the next one.
> In tests this should put the fs at ~80 % of max load.

Normally, I don't think you can expect more than about 70% of the maximum
theoretical bandwidth when reading/writing multiple files - and that's with an
optimized streaming engine, or an fs optimized for high-bandwidth streaming. In
a normal fs, the head seeking kills the transfer rate in favour of latency if
there are multiple tasks doing I/O simultaneously...

> Even though I have >~10 sec of buffering in place, eventually
> backing up the 2GB file causes the deadtime of the DAQ system
> to skyrocket.

Buffering doesn't solve the problem. As long as multiple tasks are doing I/O
simultaneously, the head seeking still kills the transfer rate in favour of
latency, no matter how much you buffer.

>   Solutions:
> 0. Buy better faster fs/disk hardware. This just changes the limit
>    unless the fs is RT. I'd like better software control.

Yeah, faster hardware is a huge waste of money if the problem is of this kind.
Almost any HD these days manages a real-life transfer rate of 10 MB/s or
better.

> 1. Hack the backup program to spy on the buffer pool and go to
>    sleep when there aren't "many" buffers available.

This is close to the "Right Thing" in theory (it does the right thing from the
hardware POV), but doing it that way isn't all that nice...

> 2. Create a simple shell that spies on the buffer pool and sends
>      SIGTSTP         20      /* Keyboard stop (POSIX).  */ 
>    to all children and then resumes the children once
>    enough buffers are available. Not sure if this will work.
>    I've yet to write a prototype.

Sounds like something like that might work, but IMO it's a bit brutal to pause
the whole process when what you need is actually just to get it off the fs in a
periodic manner, to reduce the access interleaving frequency. (Doesn't matter
much in this case, but it would be terrible if that task wasn't 95% I/O
bound...)

> Can you say cooperative multi-tasking? Yuck.

Well, timesharing should work very well here actually - it works fine as
long as the task switching overhead doesn't eat too much of your performance.

The problem here is that the kernel is tuned for CPU task switching, while a
HD, or even worse a CD-ROM (*aaargh!*), has many orders of magnitude higher
costs when switching between file I/O jobs. A head seek takes some 1 - 20 ms...

(BTW, this is the main distinction between accelerated 3D graphics subsystems
for games and the professional ones - pro 3D chips can switch command pipeline
in the middle of rendering a polygon, while the game cards cannot. The result
is that even though some game cards are *very* fast at rendering, they still
suck in X environments with multiple tasks or threads...)

> Yes, I tried running the data writer as a Solaris "RT" process
> but this property is not inherited by the data passed to the fs.

Doesn't help. In fact, it might even make it worse, as allowing real time
priority in the fs would entirely wipe out the fs' ability to optimize for
average speed rather than maximum access latency.


If you don't want to hack the fs (to use larger blocks, or something like
that), I'd suggest that you lower the fs "access frequency" in some other way.

One way:

Move the fs access out of the "RT" thread (you'll have to do that anyway to get
reliable real time performance), and into another thread. Use a circular buffer
or something like that to stream the data from the RT thread to the I/O thread.
Then, the actual solution lies in the I/O thread; use enough buffering for at
least 100 ms or so + margin (to avoid buffer overrun) in between the RT thread
and the I/O thread. Start a disk write operation only when you have those 100
ms worth of data in the buffer.

When the I/O thread is waiting for data, it should either do the copying work
itself, or tell some other task that it's ok to access the disk for a while.
Using another task and some form of IPC has the advantage that the copying task
can assume the I/O + RT task has died if the token stops arriving, and just
copy all existing data. It could even kill + respawn the I/O + RT task if it
goes silent unexpectedly, thus making the system more fault tolerant.

This last token ring style trick is required, as you'd otherwise end up in the
same situation you're in now: multiple tasks "sharing" the fs. Not sexy, but
the alternatives (like doing it in the fs, and getting huge latencies for all
file operations) are probably worse... HDs have access times, and we just have
to accept that.


Regards,

//David


     P r o f e s s i o n a l   L i n u x   A u d i o
· ··-------------------------------------------------·· ·
         MuCoS - http://www.linuxdj.com/mucos
     Audiality - http://www.angelfire.com/or/audiality
 David Olofson - [EMAIL PROTECTED]
-- [rtl] ---
To unsubscribe:
echo "unsubscribe rtl" | mail [EMAIL PROTECTED] OR
echo "unsubscribe rtl <Your_email>" | mail [EMAIL PROTECTED]
---
For more information on Real-Time Linux see:
http://www.rtlinux.org/~rtlinux/
