New disk schedulers available for FreeBSD

2009-01-12 Thread Luigi Rizzo
Hi,
Fabio Checconi and myself have developed a GEOM-based disk scheduler
for FreeBSD. The scheduler is made of a GEOM kernel module, the
corresponding userland class library, and other loadable kernel
modules that implement the actual scheduling algorithm.

At the URL below you can find a tarball with full sources and
also a set of pre-built modules/libraries for RELENG_7, to ease testing.

http://feanor.sssup.it/~fabio/freebsd/io_sched/fc_sched.tar.gz

Below you can find the README file that comes with the distribution.

I would encourage people to try it and submit feedback, because the
initial results are extremely interesting. While I just tried the
code under RELENG_7/i386, it should build and work on all versions
that have GEOM (but read below).

Also the code is quite robust, because most of the difficult tasks
(data moving, synchronization, etc.) are handled by GEOM, and the
scheduler is only deciding which requests to serve and when.

NOTE: The scheduler is designed to be distributed as a port, but
it needs an extra field in 'struct bio' and a small change in
function g_io_request() to work. Both changes are trivial but need
a kernel rebuild.

To try this code on AMD64 you do need to patch and rebuild the kernel.

On i386, and purely to ease evaluation, we avoid the need for a kernel
rebuild by patching one function in-memory (and patching it back
when the module is unloaded).

cheers
luigi and fabio

A copy of the README file follows:

--- GEOM BASED DISK SCHEDULERS FOR FREEBSD ---

This code contains a framework for GEOM-based disk schedulers and a
couple of sample scheduling algorithms that use the framework and
implement two forms of anticipatory scheduling (see below for more
details).

As a quick example of what this code can give you, try to run dd,
or tar, or some other program with highly SEQUENTIAL access patterns,
together with cvs or cvsup or other programs with highly RANDOM access
patterns. This is not a made-up example: it is pretty common for
developers to have one or more apps doing random accesses, and others
that do sequential accesses, e.g., loading large binaries from disk,
checking the integrity of tarballs, watching media streams and so on.

These are the results we get on a local machine (AMD BE2400 dual
core CPU, SATA 250GB disk):

/mnt is a partition mounted on /dev/ad0s1f
(or /dev/ad0-sched-s1f when used with the scheduler)

cvs:        cvs -d /mnt/home/ncvs-local update -Pd /mnt/ports
dd-read:    dd bs=128k of=/dev/null if=/dev/ad0 (or ad0-sched-)
dd-write:   dd bs=128k if=/dev/zero of=/mnt/largefile

                  NO SCHEDULER          RR SCHEDULER
                  dd        cvs         dd        cvs

dd-read only      72 MB/s   ---         72 MB/s   ---
dd-write only     55 MB/s   ---         55 MB/s   ---
dd-read+cvs        6 MB/s   ok          30 MB/s   ok
dd-write+cvs      55 MB/s   slooow      14 MB/s   ok

As you can see, when a cvs is running concurrently with dd, the
performance drops dramatically, and depending on read or write mode,
one of the two is severely penalized. The use of the RR scheduler
in this example makes the dd-reader go much faster when competing
with cvs, and lets cvs progress when competing with a writer.

To try it out:

1. PLEASE READ CAREFULLY THE FOLLOWING:

To avoid the need to rebuild a kernel, and just for testing
purposes, we implemented a hack which consists in patching one
kernel function (g_io_request) so that it executes the marking
of bio's (I/O requests). Also, the classification info is
stored in a rarely used field of struct bio. See details in the
file g_sched.c .

At the moment the 'patch' hack is only for i386 kernels built
with standard flags. For other configurations, you need to
manually patch sys/geom/geom_io.c as indicated by the error
message that you will get.

If you don't like the above, don't run this code.

Also note that these hacks are only for testing purposes.  If
this code ever goes in the tree, it will use the correct approach,
which is adding a field to 'struct bio' to store the classification
info, and modifying g_io_request() to call a function to initialize
that field.

2. PLEASE MAKE SURE THAT THE DISK THAT YOU WILL BE USING FOR TESTS
   DOES NOT CONTAIN PRECIOUS DATA.
   This is experimental code and may fail, especially at this stage.

3. EXTRACT AND BUILD THE PROGRAMS
   A 'make install' in the directory should work (with root privs),
   or you can even try the binary modules.
   If you want to build the modules yourself, look at the Makefile.

4. LOAD THE MODULE, CREATE A GEOM NODE, RUN TESTS

kldload gsched_rr
# --- configure the scheduler on device ad0
geom sched create -a rr ad0
# -- now you will have entries /dev/ad0-sched-

   For tests you can do the same as I did above, i.e. run concurrent
   programs that access the disk

Re: New disk schedulers available for FreeBSD

2009-01-12 Thread Garrett Cooper
On Mon, Jan 12, 2009 at 2:00 PM, Luigi Rizzo ri...@icir.org wrote:
> Hi,
> Fabio Checconi and myself have developed a GEOM-based disk scheduler
> for FreeBSD. The scheduler is made of a GEOM kernel module, the
> corresponding userland class library, and other loadable kernel
> modules that implement the actual scheduling algorithm.
>
> At the URL below you can find a tarball with full sources and
> also a set of pre-built modules/libraries for RELENG_7, to ease testing.
>
> http://feanor.sssup.it/~fabio/freebsd/io_sched/fc_sched.tar.gz
>
> Below you can find the README file that comes with the distribution.
>
> I would encourage people to try it and submit feedback, because the
> initial results are extremely interesting. While I just tried the
> code under RELENG_7/i386, it should build and work on all versions
> that have GEOM (but read below).

Hi Luigi!
Is this changeset already available in CURRENT?
Thanks,
-Garrett
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: New disk schedulers available for FreeBSD

2009-01-12 Thread Xin LI
Garrett Cooper wrote:
> On Mon, Jan 12, 2009 at 2:00 PM, Luigi Rizzo ri...@icir.org wrote:
> > Hi,
> > Fabio Checconi and myself have developed a GEOM-based disk scheduler
> > for FreeBSD. The scheduler is made of a GEOM kernel module, the
> > corresponding userland class library, and other loadable kernel
> > modules that implement the actual scheduling algorithm.
> >
> > At the URL below you can find a tarball with full sources and
> > also a set of pre-built modules/libraries for RELENG_7, to ease testing.
> >
> > http://feanor.sssup.it/~fabio/freebsd/io_sched/fc_sched.tar.gz
> >
> > Below you can find the README file that comes with the distribution.
> >
> > I would encourage people to try it and submit feedback, because the
> > initial results are extremely interesting. While I just tried the
> > code under RELENG_7/i386, it should build and work on all versions
> > that have GEOM (but read below).
>
> Hi Luigi!
> Is this changeset already available in CURRENT?

Not (yet).

--
Xin LI delp...@delphij.net http://www.delphij.net/
FreeBSD - The Power to Serve!


Re: New disk schedulers available for FreeBSD

2009-01-12 Thread Luigi Rizzo
On Mon, Jan 12, 2009 at 06:45:13PM -0800, Garrett Cooper wrote:
> On Mon, Jan 12, 2009 at 2:00 PM, Luigi Rizzo ri...@icir.org wrote:
> > Hi,
> > Fabio Checconi and myself have developed a GEOM-based disk scheduler
> > for FreeBSD. The scheduler is made of a GEOM kernel module, the
> > corresponding userland class library, and other loadable kernel
> > modules that implement the actual scheduling algorithm.
> >
> > At the URL below you can find a tarball with full sources and
> > also a set of pre-built modules/libraries for RELENG_7, to ease testing.
> >
> > http://feanor.sssup.it/~fabio/freebsd/io_sched/fc_sched.tar.gz
> >
> > Below you can find the README file that comes with the distribution.
> >
> > I would encourage people to try it and submit feedback, because the
> > initial results are extremely interesting. While I just tried the
> > code under RELENG_7/i386, it should build and work on all versions
> > that have GEOM (but read below).
>
> Hi Luigi!
> Is this changeset already available in CURRENT?

No, but the port above should hopefully build under -current as well,
unless there are changes in the GEOM ABI.

I built it on RELENG_7 and RELENG_6, will try HEAD today.

cheers
luigi