[ ... ]
> I thought I would do a real measurement to have some numbers.
> On my raid-1 ext3, extracting a kernel archive:
> benjamin@metis ~/software $ time tar xfj
> /usr/portage/distfiles/linux-2.6.38.tar.bz2
> real 0m21.769s
> user 0m13.905s
> sys 0m1.751s
That's a "real measurement" of *something*, and it does give
"some numbers", but to me the numbers are not that interesting
as it is far from clear what they are about.
So I happen to have an otherwise totally unused fastish
contemporary 500GB disk and laptop for a measurement of
something that might be better defined, a bit simplemindedly,
but taking care about a few details (see also appended setup
details), so that the numbers be about as good as possible
(YMMV).
First with 'ext3':
% mount -t ext3 -o relatime /dev/sdb /mnt/sdb
% df -BM /mnt/sdb
Filesystem 1M-blocks Used Available Use% Mounted on
/dev/sdb 469455M 687M 444922M 1% /mnt/sdb
% df -i /mnt/sdb
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdb 30531584 38100 30493484 1% /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar; cd /;
umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 12m49.610s
user 0m0.990s
sys 0m8.610s
That's like 570KB/s and 50 files/s, in more or less optimal
conditions. Not so good for 'ext3', which indeed is well known
for appalling small-file/metadata write performance, but the
order of magnitude of the result is the plausible one.
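For the record, those rates are just the byte total reported by
'star' and the ~38100 inodes shown by 'df -i' divided by the
elapsed time of 12m49.610s; a quick check with 'bc':
% echo '440483840/769.61/1000' | bc -l   # ~572 KB/s
% echo '38100/769.61' | bc -l            # ~49.5 files/s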
XFS with 'delaylog' does worse, but then it has a different
tradeoff envelope:
% mount -t xfs -o relatime,delaylog /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar; cd /;
umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 24m4.282s
user 0m1.260s
sys 0m14.030s
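For comparison, the same arithmetic on the XFS run gives
roughly 305KB/s and 26 files/s:
% echo '440483840/1444.282/1000' | bc -l   # ~305 KB/s
% echo '38100/1444.282' | bc -l            # ~26.4 files/s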
I also tried JFS, and it is faster, at 1MB/s and 90 files/s,
which is pretty good (I suspect that JFS may be cheating
slightly on the semantics, but knowing its on-disk structure,
twice as fast as 'ext3' is plausible):
% mount -t jfs -o relatime /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar; cd /;
umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 6m56.508s
user 0m1.000s
sys 0m7.130s
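The claimed rates check out against the elapsed time of
6m56.508s:
% echo '440483840/416.508/1000' | bc -l   # ~1058 KB/s
% echo '38100/416.508' | bc -l            # ~91.5 files/s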
Consolation notes :-)
=====================
Naturally the real (and arguably rather more meaningful than
most) measurements above will baffle the people described
here:
[ ... ] many people (some with decades of "experience") just
don't understand IOPS and metadata and commits and caching and
who think "performance" is whatever number they can get with
their clever "benchmarks".
So, as a consolation prize to them, let's rerun with entirely
different semantics, but still taking a bit of care:
% mount -t ext3 -o relatime /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar
-no-fsync; cd /; umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 0m27.414s
user 0m0.270s
sys 0m2.430s
Oh gosh, it looks like much better "performance"! 'ext3' really
shines with large contiguous IOs! :-)
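Note that GNU 'tar' (as used in the measurement quoted at the
top) does not, as far as I know, fsync() extracted files at
all, so its semantics match these '-no-fsync' runs; the
comparable "careful" version of that quoted test would be
something like:
% time sh -c 'cd /mnt/sdb; tar -xf /tmp/linux-2.6.38.tar; cd /;
umount /mnt/sdb'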
And similarly for XFS:
% mount -t xfs -o relatime,delaylog /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar
-no-fsync; cd /; umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 0m33.849s
user 0m0.310s
sys 0m2.960s
And JFS is quite similar too:
% mount -t jfs -o relatime /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar
-no-fsync; cd /; umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 0m35.191s
user 0m0.380s
sys 0m2.920s
Journaling notes
================
So there. I apologize to the readers who "understand IOPS and
metadata and commits and caching" (and who may have read the
man page for 'star'), as they will be bored by the
beginner-level nature of the points made above.
But I am actually a bit surprised and disappointed by the
"really" numbers above, because I would have expected something
more like a 2-3 minute duration, or 2-4 files per IOP, but I
guess such are the horrors of seeking crazily between the
journal, the metadata and the data areas; so let's try without
a journal, with 'ext2':
% mount -t ext2 -o relatime /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar; cd /;
umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 8m12.196s
user 0m1.120s
sys 0m6.030s
Sure, it is better: that's about 50% faster than 'ext3'.
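More precisely, the ratio of the two elapsed times is about
1.56:
% echo '(12*60+49.610)/(8*60+12.196)' | bc -l   # ext3/ext2 ~= 1.56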
Let's also try, as a special case, 'ext4' (yes, 'ext4' with its
many improvements) without a journal:
% mkfs.ext4 -O ^has_journal /dev/sdb
mke2fs 1.41.11 (14-Mar-2010)
/dev/sdb is entire device, not just one partition!
Proceed anyway? (y,n) y
[ ... ]
% mount -t ext4 -o relatime /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar; cd /;
umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 0m31.119s
user 0m0.870s
sys 0m6.190s
Well, I don't believe that. That looks like a feature or bug in
'ext4' where without a journal it won't honor commits. The same
appears to be the case for JFS, but then its manual explicitly
says that 'nointegrity' is aptly named, and so it is believable
that switching off journaling is not its only effect:
% mount -t jfs -o relatime,nointegrity /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar; cd /;
umount /mnt/sdb'
star: 420 blocks + 81920 bytes (total of 440483840 bytes = 430160.00k).
real 0m35.820s
user 0m0.610s
sys 0m5.740s
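One way to confirm that 'star' still issues the fsync()s, and
that it is the filesystem treating them cheaply, would be to
count them with 'strace' (a check I did not run, just a
sketch; '-c' prints a per-syscall count summary and '-f'
follows any child process):
% cd /mnt/sdb
% strace -f -c -e trace=fsync star -x -b 2048 -f /tmp/linux-2.6.38.tar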
Setup details
=============
ULTS10 64b, 2.6.35 kernel, 4GiB RAM, Core i3 M370 CPU. The
system was quiet except for the measurements. Every 'tar'
extraction is preceded by a fresh 'mkfs'. Note the details
below (e.g. the archive is uncompressed and stored in an
in-memory 'tmpfs', and the disk is a fairly fast 500GB drive
on eSATA).
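Schematically, each timed run therefore followed the same
pattern (a sketch; the actual 'mkfs' invocations and their
output are in the transcripts below), with the 'umount' kept
inside the timed command so that any cached dirty data is
counted:
% mkfs.$FSTYPE /dev/sdb
% mount -t $FSTYPE -o relatime /dev/sdb /mnt/sdb
% time sh -c 'cd /mnt/sdb; star -x -b 2048 -f /tmp/linux-2.6.38.tar; cd /;
umount /mnt/sdb'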
----------------------------------------------------------------
% dd bs=1M if=/tmp/linux-2.6.38.tar of=/dev/null
420+1 records in
420+1 records out
440483840 bytes (440 MB) copied, 0.159935 s, 2.8 GB/s
----------------------------------------------------------------
% hdparm -t /dev/sdb
/dev/sdb:
Timing buffered disk reads: 388 MB in 3.01 seconds = 128.98 MB/sec
----------------------------------------------------------------
% lsscsi | grep sdb
[4:0:0:0] disk ATA ST3500418AS CC44 /dev/sdb
----------------------------------------------------------------
% mkfs.ext3 /dev/sdb
mke2fs 1.41.11 (14-Mar-2010)
/dev/sdb is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
30531584 inodes, 122096646 blocks
6104832 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
3727 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
	2654208, 4096000, 7962624, 11239424, 20480000, 23887872,
	71663616, 78675968, 102400000
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 32 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
----------------------------------------------------------------
% mkfs.xfs -f /dev/sdb
meta-data=/dev/sdb               isize=256    agcount=4, agsize=30524162 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=122096646, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=59617, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
----------------------------------------------------------------