Hello,
I have been working on enhancing the linux sg driver for a while now
and you may like to try it. It has the following features:
- interface binary compatible with original version
- adds scatter-gather support
- adds command queuing
- memory allocation more "dynamic"
- supports polling (so did original) and SIGPOLL (SIGIO) for
asynchronous notification
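
For the polling side, a user program can wait on the sg file descriptor with poll() before reading the response. A minimal sketch, assuming the descriptor reports POLLIN once a response is ready; wait_for_response is a name made up for this example, and any readable fd (a pipe, say) can stand in for /dev/sgX while trying it out:

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not part of the sg driver.
 * Waits up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_for_response(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n;

    assert(fd >= 0);
    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;              /* poll() itself failed, errno is set */
    if (n == 0)
        return 0;               /* nothing arrived within timeout_ms */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A caller would write() its command, call wait_for_response(fd, timeout) and only then read() the reply, so the thread is free to service other descriptors in the meantime.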
It can be found at www.netwinder.org/~dougg and is only suitable for
kernels >= 2.1.118 up to and including 2.2.0-pre6. It should be
considered to be in alpha test. In my testing it works with 2
Advansys adapters (940UA and the 940UW), and both sane (scanning package)
and xcdroast (cd writer package) work with it without recompilation.
Memory usage within the kernel is the most difficult issue and may
well need some more tuning. As it stands each device (i.e. sga, sgb,
sgc) grabs its own 32KB on the 1st open on that device and releases
it on the last close. The original version grabbed one 32KB buffer
at initialisation and all users had to fight over it. This may explain
your poor performance when you tried to use sg on 2 devices at the
same time.
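
The per-device policy described above can be modelled in user space roughly as follows. This is only a sketch of the idea, not the driver code: struct dev_model, dev_open and dev_close are invented names, and malloc() stands in for whatever the kernel allocator does.

```c
#include <assert.h>
#include <stdlib.h>

#define DEV_BUF_SIZE (32 * 1024)   /* the 32KB per-device buffer */

/* Hypothetical model of one sg device (sga, sgb, ...). */
struct dev_model {
    int   open_count;   /* how many opens are currently outstanding */
    void *buf;          /* NULL while nobody has the device open */
};

/* First open grabs the buffer; later opens just share it.
 * Returns 0 on success, -1 if the buffer could not be allocated. */
int dev_open(struct dev_model *d)
{
    if (d->open_count == 0) {
        d->buf = malloc(DEV_BUF_SIZE);
        if (d->buf == NULL)
            return -1;
    }
    d->open_count++;
    return 0;
}

/* Last close releases the buffer again. */
void dev_close(struct dev_model *d)
{
    assert(d->open_count > 0);
    if (--d->open_count == 0) {
        free(d->buf);
        d->buf = NULL;
    }
}
```

With one such structure per device, two devices never share a buffer, which is the difference from the single 32KB buffer of the original driver.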
>I posted a mail about an experiment on the linux generic SCSI interface:
>
>From: GOTO Masanori <[EMAIL PROTECTED]>
>Subject: linux generic SCSI interface 'READ' speed.
>> I ran an experiment with a simple buffering system on HDDs (Cheetah)
>> controlled through the linux generic SCSI interface.
>> But, the SCSI command which is such a 'READ6' for /dev/sgX with the generic
>> interface is slower than the standard C function 'read()' for /dev/sdX.
>>
>> Measuring difference between 'READ6' and 'read()' is below.
>> This test reads 200MByte from /dev/sda sequentially from the head of HDD.
>> The machine using this test is PPro with 128MByte DRAM.
>(snip)
>> The result taught me that the generic SCSI interface takes 6 msec
>> per 'READ6' SCSI command.
The original sg had locks enforcing a strict write/read/write/read ...
sequence, which meant that you paid the kernel+driver+scsi_command overhead
on every scsi command issued. 6 milliseconds is the figure I had in my
head from a previous project (embedded raid controller).
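
As a back-of-envelope check on what that overhead costs, here is a small calculation. It assumes the fixed per-command overhead is not overlapped with the data transfer (which is what a strict write/read sequence implies) and plugs in figures from this thread: 6 ms per command, 32KB per transfer, and the 14 MB/s quoted for the disk. The function name is made up for the example.

```c
#include <assert.h>

/* Effective throughput in MB/s when every command moves xfer_kb of data
 * and pays a fixed overhead_ms on top of the media transfer time.
 * Assumption: overhead and transfer are strictly serialised. */
double effective_mb_per_s(double xfer_kb, double overhead_ms, double disk_mb_s)
{
    double xfer_mb, media_s, total_s;

    assert(xfer_kb > 0 && disk_mb_s > 0 && overhead_ms >= 0);
    xfer_mb = xfer_kb / 1024.0;              /* size of one transfer in MB  */
    media_s = xfer_mb / disk_mb_s;           /* time the disk needs for it  */
    total_s = overhead_ms / 1000.0 + media_s;
    return xfer_mb / total_s;
}
```

Under those assumptions effective_mb_per_s(32, 6, 14) comes out at roughly 3.8 MB/s, and quadrupling the transfer to 128KB lifts it to roughly 8.4 MB/s, so larger transfers and queued commands are the two obvious ways to hide the overhead.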
>> It seems that the kernel blocks when kernel gets the generic SCSI
>> interface command.
>>
>> Now question:
>> * How can the generic SCSI interface be made faster?
>> * Can a program using the generic SCSI interface run with pthreads?
>> * Is there any way to make a simple buffering system?
>
>and have received some replies.
>I made some additional experiments, and derived good results of
>READ6 equivalent to read() system call with the scsi generic interface
>in case of using only one disk.
>In my experiments, however, I plan to use 4 disks controlled by pthreads
>under the following conditions:
>
>* The first disk does ONLY READ sequentially.
>* The second disk does ONLY WRITE sequentially.
>* The other disks (the remaining 2 disks) do READ/WRITE randomly.
>
>I naturally derived good results of ONLY read or write operations on both
>first and second disks.
>But, the situation in which the first disk ONLY READ and the second disk
>ONLY WRITE using read()/write() system call, results in enough write
>performance but insufficient read.
>I have checked access granularity of the file /proc/scsi/aic7xxx/x recording
>read/write statistics. They indicate the access patterns as follows:
>
> [histogram of write (W) request counts by size; buckets 1K 2K 4K 8K 16K 32K 64K; column alignment lost]
>
> [histogram of read (R) request counts by size; buckets 1K 2K 4K 8K 16K 32K 64K; column alignment lost]
>
>While the write operations are skewed toward large sizes,
>the read operations are scattered from small to large sizes.
>
>I do not know how to prevent scattered access. I tried to use the scsi
>generic interface, which provided enough performance for only one disk,
>but with more than one disk the performance was poor.
>The access lamps of the two disks were never lit at the same time;
>only one lamp was lit at a time. It's just like a seesaw.
>Total access time is twice that of one disk.
>Of course, I use the scsi generic interface and pthreads.
Hopefully the "scattered access" you are referring to is in the
above graphs and not that you wish to preclude scatter-gather
from the sg driver. Scatter-gather within a kernel driver should
be transparent to user space. If supported by the scsi adapter,
it allows single scsi operations to move more data than the
original driver's SG_BIG_BUFF limit.
>
>Linux has the Multiple Disk Driver, which seems to achieve parallel
>disk accesses with read/write...
>Why don't user level programs obtain good results of accessing multiple
>disks?
>
>My goal is the parallel disk accesses by user level programming.
>Are there any methods to handle parallel read/write in user mode without
>scattered access?
Try this driver and inform me about your results...
>Otherwise, should I just rewrite kernel code or use ioctl() to implement
>my buffer system?
If you wish to go that path then I may be able to help.
>
>From: Marc SCHAEFER <[EMAIL PROTECTED]>
>Subject: Re: linux generic SCSI interface 'READ' speed.
>> What about trying the new direct IO on block devices (see kernel
>> announcement) with a multithreaded program ?
>
>I have tried to use RAW disk devices. But AFAIK, it handles not
>raw but ext2fs disks. The direct I/O is not useful for my goal, unfortunately.
>I also know Linus dislikes implementing a raw I/O device driver.
It would be nice if the great man could set aside a little (real) memory
for device drivers to fight over, rather than have them compete with
userland and apps like X and Netscape.
>
>From: Kurt Garloff <[EMAIL PROTECTED]>
>Subject: Re: linux generic SCSI interface 'READ' speed.
>> What comes into my mind are two things:
>> a) The sd devices do buffering, readahead; adjacent reads are merged etc.
>> This gives you a major speed improvement. I think you saw a speed
>> improvement with using larger blocks.
>> b) The sg high-level scsi code might be inefficient, i.e. locking the I/O
>> for too long a time. Maybe this prevents the driver from issuing more than
>> one command at a time and thus using Tagged Command queuing, when you use
>> sg ?
>
>I don't know how to read larger blocks referred to a) without scattering.
>I have obtained good results by using tagged command queuing, but the queuing
>has not worked well for multiple disks in my program referred to b).
Now you can; just make sure Netscape is not running _or_ that your app
backs off gracefully when it asks to do a 512KB scsi read and gets an
ENOMEM (e.g. asks for 256KB instead and, if that fails, 128KB etc.).
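
A minimal sketch of that back-off, kept independent of the actual sg calls: the request itself is passed in as a callback (a hypothetical sg_read_fn that returns 0 on success or -1 with errno set), so read_with_backoff only contains the halving logic. fake_sg_read is a stand-in used for trying the logic without a device; none of these names come from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: attempt one SCSI read of len bytes.
 * Returns 0 on success, -1 with errno set on failure. */
typedef int (*sg_read_fn)(void *ctx, size_t len);

/* Try want bytes first; on ENOMEM halve the request (512KB, 256KB, ...)
 * until it succeeds or would drop below min_len.
 * Returns the size that worked, or 0 if nothing did (errno left set). */
size_t read_with_backoff(sg_read_fn try_read, void *ctx,
                         size_t want, size_t min_len)
{
    size_t len;

    assert(try_read != NULL && min_len > 0);
    for (len = want; len >= min_len; len /= 2) {
        if (try_read(ctx, len) == 0)
            return len;
        if (errno != ENOMEM)
            break;              /* some other error: do not keep retrying */
    }
    return 0;
}

/* Stand-in for the driver: succeeds only if the request fits in *ctx bytes. */
int fake_sg_read(void *ctx, size_t len)
{
    size_t limit = *(size_t *)ctx;

    if (len > limit) {
        errno = ENOMEM;
        return -1;
    }
    return 0;
}
```

In a real program the callback would wrap the write()/read() pair on the sg descriptor, and the caller would then loop using the size that read_with_backoff settled on.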
>From: Gerard Roudier <[EMAIL PROTECTED]>
>Subject: Re: linux generic SCSI interface 'READ' speed.
>> The read_block() interface prefetches up to 120 sectors = 60KB at a time
>> by default for sd devices and stores the prefetched data into the buffer
>> cache. So, using read() resulted in the same actual physical IO pattern
>> for your 3 different logical IO patterns.
>
>Umm... Do you know any methods for preventing read scattering?
>
>> Despite the fact that the read_block() interface read-ahead is not
>> asynchronous (no anticipated read IO that makes the C code execution
>> latency overlap the IO latency), the Cheetah probably has been able
>> to sustain the read since it has a large cache and performs large
>> prefetching.
>>
>> You got about 14 MB/s. BTW, the Cheetah2 is able to sustain about 18 MB/s.
>
>I am using Cheetah 4LP whose data sheet specifies that the max throughput
>is 14MB/sec, and I checked it out.
Regards,
--
GOTO Masanori
Tokyo Institute of Technology, Department of Computer Science
-----------------------------------------------------------------
Douglas Gilbert [[EMAIL PROTECTED] or [EMAIL PROTECTED]]
48 Windsor Court Road Web: www.interlog.com/~dgilbert
Thornhill, Ontario L3T 4Y5, Canada Tel: +1 905 771 6151