David Rees wrote:
> On Tue, Feb 26, 2008 at 2:23 PM, Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
>> > Can you give us more details on your disk array? Controller, disks,
>> > RAID layout, ext3 fs creation options, etc...
>>
>> I said some of that already - but here are some missing parts.
>> 5x 400 GB HDD (WDC WD4000YR)
>> The controller uses sata_mv module (Marvell).
>>
>> Linux software RAID over it, LVM on top of that. The backup
>> filesystem is 1.2 TB.
>> The fs has these features: has_journal filetype needs_recovery
>> sparse_super large_file.
>
> Let me see if I can summarize this properly:
>
> iSCSI to a Thecus n5200 with 5x 400 GB HDD (WDC WD4000YR). The box
> runs a custom Debian kernel.
> The box uses software RAID 5 + LVM.
>
> You didn't mention stripe size of the raid 5. You also didn't mention
> whether you used the stride option when creating the ext3 filesystem.
Stripe size is 64k.
Also, the filesystem was created with just "mkfs.ext3 -j /dev/sdX",
i.e. without the stride option (or other useful options, like online
resizing support, which is enabled by default only in recent releases
of e2fsprogs).
On the other hand, how exactly to use "stride" is a bit unclear to me.
You can at least calculate it when the fs sits directly on a RAID
array - the mke2fs man page says:

    stride=stripe-size
           Configure the filesystem for a RAID array with stripe-size
           filesystem blocks per stripe.

It is a bit harder if you have LVM on top of the RAID, I guess (a
sketch of the calculation follows below).
I can only wish the "filesystem - volume manager - RAID" trio were
better integrated in Linux.
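For what it's worth, this is how I understand the calculation for an
array like mine - just a sketch, the LV name is made up, and
-E stripe-width needs a fairly recent e2fsprogs:

    # 64 KiB chunk / 4 KiB filesystem block = stride of 16
    # 5-disk RAID-5 = 4 data disks, so stripe-width = 16 * 4 = 64
    mkfs.ext3 -j -b 4096 -E stride=16,stripe-width=64 /dev/vg0/backup

With LVM in between, this only lines up if the logical volume itself
starts on a full-stripe boundary.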
But judging from the dumpe2fs output (and from the HDD LEDs blinking),
the data is scattered fairly evenly among all the disks.
>> On an empty filesystem on that NAS I can write with ~25 MB/s.
>> I guess the backup filesystem is just very fragmented?
>> On the other hand, it is only 60-70% full, so the drop from ~25 MB/s
>> (empty filesystem) to ~1.3 MB/s is indeed odd.
>
> 25 MB/s seems abysmal for an empty filesystem. I really would expect
> at least twice that on a GigE iSCSI network, and probably closer to 75
> MB/s.
The SAN device has only a 600 MHz CPU.
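To rule out the network itself, it would be worth checking raw TCP
throughput first - something along these lines (iperf on both ends;
the hostname is made up):

    # on the NAS
    iperf -s
    # on the backup server
    iperf -c nas.example.com -t 30

If that already tops out well below GigE, the 600 MHz CPU is the
likely ceiling.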
Hey - I just disabled the internal bitmap on the RAID-5, and things
seem much faster now.
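For reference, the bitmap can be toggled at runtime with mdadm
(assuming the array is /dev/md0 - adjust as needed):

    # drop the internal write-intent bitmap
    mdadm --grow --bitmap=none /dev/md0
    # put it back later
    mdadm --grow --bitmap=internal /dev/md0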
Below is "iostat sda -d 10" output without the internal bitmap. The
system mostly reads (Blk_read/s), and once in a while it does a big
commit (Blk_wrtn/s):
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             164,67      2088,62         0,00      20928          0
sda             180,12      1999,60         0,00      20016          0
sda             172,63      2587,01         0,00      25896          0
sda             156,53      2054,64         0,00      20608          0
sda             170,20      3013,60         0,00      30136          0
sda             119,46      1377,25      5264,67      13800      52752
sda             154,05      1897,10         0,00      18952          0
sda             197,70      2177,02         0,00      21792          0
sda             166,47      1805,19         0,00      18088          0
sda             150,95      1552,05         0,00      15536          0
sda             158,44      1792,61         0,00      17944          0
sda             132,47      1399,40      3781,82      14008      37856
With the bitmap enabled it sometimes behaves similarly, but mostly I
can see reads competing with writes, like below:
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             112,57       946,11      5837,13       9480      58488
sda             157,24      1858,94         0,00      18608          0
sda             116,90      1173,60        44,00      11736        440
sda              24,05        85,43       172,46        856       1728
sda              25,60        90,40       165,60        904       1656
sda              25,05       276,25       180,44       2768       1808
sda              22,70        65,60       229,60        656       2296
sda              21,66       202,79       786,43       2032       7880
sda              20,90        83,20      1800,00        832      18000
sda              51,75       237,36       479,52       2376       4800
sda              35,43       129,34       245,91       1296       2464
sda              34,50        88,00       270,40        880       2704
Now, let's disable the bitmap in the RAID-5 array:
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             110,59       536,26       973,43       5368       9744
sda             119,68       533,07      1574,43       5336      15760
sda             123,78       368,43      2335,26       3688      23376
sda             122,48       315,68      1990,01       3160      19920
sda             117,08       580,22      1009,39       5808      10104
sda             119,50       324,00      1080,80       3240      10808
sda             118,36       353,69      1926,55       3544      19304
And let's enable it again - after a while things degrade again, and I
can see the "rm -rf" stall for longer periods:
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             162,70      2213,60         0,00      22136          0
sda             165,73      1639,16         0,00      16408          0
sda             119,76      1192,81      3722,16      11952      37296
sda             178,70      1855,20         0,00      18552          0
sda             162,64      1528,07         0,80      15296          8
sda             182,87      2082,07         0,00      20904          0
sda             168,93      1692,71         0,00      16944          0
sda             177,45      1572,06         0,00      15752          0
sda             123,10      1436,00      4941,60      14360      49416
sda             201,30      1984,03         0,00      19880          0
sda             165,50      1555,20        22,40      15552        224
sda              25,35       273,05       189,22       2736       1896
sda              22,58        63,94       165,43        640       1656
sda              69,40       435,20       262,40       4352       2624
It could also be that I'm simply seeing what I want to see, as I ran
too few tests.
Anyway, I should hopefully have all the old "trash" directories
removed by today or tomorrow:
[EMAIL PROTECTED] trash]# ls|wc -l
68
Down from over 100 a few hours ago.
I didn't test whether I get more than ~25 MB/s with the bitmap disabled.
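When I do, it will probably be something simple like this (the path is
made up; conv=fsync so the result isn't just the page cache):

    dd if=/dev/zero of=/mnt/backup/ddtest bs=1M count=1024 conv=fsync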
> Anyway, looking at your setup, you have a number of things which are
> contributing to your poor performance.
>
> * Software RAID5 - RAID5 is HORRIBLE (yes, it needs to be
> capitalized!) for small reads and writes. Especially for small random
> writes, performance can degrade to speeds well below the speed of a
> single disk. Small random reads will also perform poorly and at best
> will match the speed of a single disk. Hardware RAID5 with a large
> enough battery backed cache can negate some of the performance
> drawbacks of small random reads/writes.
Yes - you can have big storage, or fast storage, or you can pay lots
of money and have both.
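For anyone wondering just how bad it gets, here is the usual
back-of-the-envelope arithmetic for a sub-stripe write on RAID-5 (the
classic read-modify-write cycle):

    new_parity = old_parity XOR old_data XOR new_data

    read old data block      1 I/O
    read old parity block    1 I/O
    write new data block     1 I/O
    write new parity block   1 I/O
                            ------
                             4 I/Os per small random write

So every small random write costs roughly four disk operations instead
of one, before even counting the seeks.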
> * LVM - Often LVM can slow things down as well since it adds another
> layer of abstraction.
Still, I can't imagine running this setup without LVM.
> * iScsi - Network based filesystem is never going to be as fast as
> directly attached storage.
> * Unlikely that the proper ext3 stride option was used to create the
> filesystem which can result in poor performance on a striped RAID
> array. I'm not sure how LVM would further affect this.
True. Too late now.
> * Small amount of memory on NAS - It appears that your Thecus 5200
> only has 256MB of RAM. I would expect that having significantly more
> memory could help IO performance.
It did originally come with 256 MB indeed, but it runs with 512 MB here.
I don't think memory is the problem, though:
# cat /proc/meminfo
(...)
Dirty: 0 kB
Writeback: 0 kB
(...)
There is never any writeback pressure there.
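An easy way to keep an eye on it during a backup run (plain watch +
grep):

    watch -n 5 'grep -E "^(Dirty|Writeback):" /proc/meminfo'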
Surely more RAM could help *if* I used a write cache in the iSCSI
target, but I just don't trust this box that much (it is probably
haunted).
--
Tomasz Chmielewski
http://wpkg.org