So if one OST gets 200 MiB/s and another OST gets 200 MiB/s, does that add up to 400 MiB/s, or is that not how throughput is calculated? I will eventually plug the right sequence into iozone to measure it.
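A rough way to frame that arithmetic: per-OST rates add up only until something else, typically the client's usable network links, becomes the ceiling. The sketch below assumes every NIC runs at 1 Gbit/s and ignores protocol overhead; the function names are made up for illustration and are not part of any Lustre tool.

```shell
#!/bin/sh
# Rough ceiling on what one client can move to or from several OSTs at once.
# Assumption (not stated in the thread): every NIC is 1 Gbit/s = 10^9 bit/s.

# Convert a link speed in Mbit/s to MiB/s, ignoring protocol overhead.
link_mib_per_s() {
    awk -v mbit="$1" 'BEGIN { printf "%.1f\n", mbit * 1000000 / 8 / 1048576 }'
}

# Sum the per-OST rates, then cap the result at what the usable links carry.
# usage: aggregate_mib_per_s <usable_links> <link_mbit> <ost_rate_mib>...
aggregate_mib_per_s() {
    links=$1; mbit=$2; shift 2
    awk -v links="$links" -v mbit="$mbit" -v rates="$*" 'BEGIN {
        n = split(rates, r, " ")
        sum = 0
        for (i = 1; i <= n; i++) sum += r[i]
        cap = links * mbit * 1000000 / 8 / 1048576
        printf "%.1f\n", (sum < cap) ? sum : cap
    }'
}

link_mib_per_s 1000                   # one gigabit link            -> 119.2
aggregate_mib_per_s 2 1000 200 200    # two OSTs, two usable links  -> 238.4
aggregate_mib_per_s 6 1000 200 200    # two OSTs, six usable links  -> 400.0
```

So 200 + 200 gives 400 MiB/s only when the links in between can carry it. For comparison, the 209 MiB/s plateau reported below is under the two-link figure.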
From my perspective it looks like
ioio.ca/ioio.jpg
ioio.ca/lustreone.png
ioio.ca/lustretwo.png
ioio.ca/lustrethree.png
ioio.ca/lustrefour.png

--- On Sat, 1/24/09, Arden Wiebe <[email protected]> wrote:

From: Arden Wiebe <[email protected]>
Subject: [Lustre-discuss] Plateau around 200MiB/s bond0
To: [email protected]
Date: Saturday, January 24, 2009, 6:04 PM

1-2948-SFP Plus Baseline 3Com Switch
1-MGS bond0(eth0,eth1,eth2,eth3,eth4,eth5) raid1
1-MDT bond0(eth0,eth1,eth2,eth3,eth4,eth5) raid1
2-OSS bond0(eth0,eth1,eth2,eth3,eth4,eth5) raid6
1-MGS-CLIENT bond0(eth0,eth1,eth2,eth3,eth4,eth5)
1-CLIENT bond0(eth0,eth1)
1-CLIENT eth0
1-CLIENT eth0

So far I have failed to create an external journal for the MDT, the MGS and the two OSSs. How do I add the external journal to /etc/fstab: specifically, the output of e2label /dev/sdb followed by which fstab options?

[r...@lustreone ~]# cat /proc/fs/lustre/devices
  0 UP mgs MGS MGS 17
  1 UP mgc mgc192.168....@tcp 876c20af-aaec-1da0-5486-1fc61ec8cd15 5
  2 UP lov ioio-clilov-ffff810209363c00 7307490a-4a12-4e8c-56ea-448e030a82e4 4
  3 UP mdc ioio-MDT0000-mdc-ffff810209363c00 7307490a-4a12-4e8c-56ea-448e030a82e4 5
  4 UP osc ioio-OST0000-osc-ffff810209363c00 7307490a-4a12-4e8c-56ea-448e030a82e4 5
  5 UP osc ioio-OST0001-osc-ffff810209363c00 7307490a-4a12-4e8c-56ea-448e030a82e4 5

[r...@lustreone ~]# lfs df -h
UUID                  bytes    Used  Available  Use%  Mounted on
ioio-MDT0000_UUID    815.0G  534.0M     767.9G    0%  /mnt/ioio[MDT:0]
ioio-OST0000_UUID      3.6T   28.4G       3.4T    0%  /mnt/ioio[OST:0]
ioio-OST0001_UUID      3.6T   18.0G       3.4T    0%  /mnt/ioio[OST:1]

filesystem summary:    7.2T   46.4G       6.8T    0%  /mnt/ioio

[r...@lustreone ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.2.4 (January 28, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 1
        Actor Key: 17
        Partner Key: 1
        Partner Mac Address: 00:00:00:00:00:00

Slave Interface: eth0
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:21:28:77:db
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:21:28:77:6c
Aggregator ID: 2

Slave Interface: eth3
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:15:06:3a:94
Aggregator ID: 3

Slave Interface: eth2
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:15:06:3a:93
Aggregator ID: 4

Slave Interface: eth4
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:15:06:3a:95
Aggregator ID: 5

Slave Interface: eth5
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:15:06:3a:96
Aggregator ID: 6

[r...@lustreone ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb[0] sdc[1]
      976762496 blocks [2/2] [UU]

unused devices: <none>

[r...@lustreone ~]# cat /etc/fstab
LABEL=/                 /          ext3    defaults                 1 1
tmpfs                   /dev/shm   tmpfs   defaults                 0 0
devpts                  /dev/pts   devpts  gid=5,mode=620           0 0
sysfs                   /sys       sysfs   defaults                 0 0
proc                    /proc      proc    defaults                 0 0
LABEL=MGS               /mnt/mgs   lustre  defaults,_netdev         0 0
192.168....@tcp0:/ioio  /mnt/ioio  lustre  defaults,_netdev,noauto  0 0

[r...@lustreone ~]# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:1B:21:28:77:DB
          inet addr:192.168.0.7  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::21b:21ff:fe28:77db/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:9000  Metric:1
          RX packets:5457486 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4665580 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:12376680079 (11.5 GiB)  TX bytes:34438742885 (32.0 GiB)

eth0      Link encap:Ethernet  HWaddr 00:1B:21:28:77:DB
          inet6 addr: fe80::21b:21ff:fe28:77db/64 Scope:Link
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:9000  Metric:1
          RX packets:3808615 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4664270 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:12290700380 (11.4 GiB)  TX bytes:34438581771 (32.0 GiB)
          Base address:0xec00 Memory:febe0000-fec00000

From what I have read, not having an external journal configured for the OSTs is a sure recipe for slowness, which I would rather not have considering the goal is around 350MiB/s or more, which should be obtainable.

Here is how I formatted the raid6 device on both OSSs that have identical

[r...@lustrefour ~]# fdisk -l

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1      121601   976760001   83  Linux

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdg doesn't contain a valid partition table

Disk /dev/sdh: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdh doesn't contain a valid partition table

Disk /dev/md0: 4000.8 GB, 4000819183616 bytes
2 heads, 4 sectors/track, 976762496 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table
[r...@lustrefour ~]#

[r...@lustrefour ~]# mdadm --create --assume-clean /dev/md0 --level=6 --chunk=128 --raid-devices=6 /dev/sd[cdefgh]
[r...@lustrefour ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdc[0] sdh[5] sdg[4] sdf[3] sde[2] sdd[1]
      3907049984 blocks level 6, 128k chunk, algorithm 2 [6/6] [UUUUUU]
                in: 16674 reads, 16217479 writes; out: 3022788 reads, 32865192 writes
                7712698 in raid5d, 8264 out of stripes, 25661224 handle called
                reads: 0 for rmw, 1710975 for rcw. zcopy writes: 4864584, copied writes: 16115932
                0 delayed, 0 bit delayed, 0 active, queues: 0 in, 0 out
                0 expanding overlap

unused devices: <none>

Followed with:

[r...@lustrefour ~]# mkfs.lustre --ost --fsname=ioio --mgsnode=192.168....@tcp0 --mkfsoptions="-J device=/dev/sdb1" --reformat /dev/md0
[r...@lustrefour ~]# mke2fs -b 4096 -O journal_dev /dev/sdb1

But that is hard to reassemble on reboot, or at least it was before I used e2label and labelled things right. Question: how do I label the external journal in fstab, if at all?

Right now I am only running

[r...@lustrefour ~]# mkfs.lustre --fsname=ioio --ost --mgsnode=192.168....@tcp0 --reformat /dev/md0

So just raid6, no external journal.
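For reference, the geometry implied by the mdadm command above can be worked out with a few lines of shell arithmetic. This is only a sketch: the 4 KiB filesystem block size is an assumption borrowed from the mke2fs -b 4096 call, the 1 MiB figure assumes Lustre's default bulk I/O size, and the variable names are illustrative.

```shell
#!/bin/sh
# Geometry of the RAID6 array created with the mdadm command above.
raid_devices=6          # --raid-devices=6
parity_devices=2        # RAID6 stores two parity chunks per stripe
chunk_kib=128           # --chunk=128
array_kib=3907049984    # "blocks" figure from /proc/mdstat (1 KiB units)
fs_block_kib=4          # assumption: 4 KiB filesystem blocks

data_devices=$((raid_devices - parity_devices))
full_stripe_kib=$((data_devices * chunk_kib))     # data held by one full stripe
stride_blocks=$((chunk_kib / fs_block_kib))       # fs blocks per chunk
stripes_per_mib=$((1024 / full_stripe_kib))       # full stripes in a 1 MiB write
per_member_kib=$((array_kib / data_devices))      # usable space per member disk

echo "data devices:        $data_devices"
echo "full stripe (KiB):   $full_stripe_kib"
echo "stride (fs blocks):  $stride_blocks"
echo "stripes per 1 MiB:   $stripes_per_mib"
echo "per-member KiB:      $per_member_kib"
```

Under these assumptions a 1 MiB write covers exactly two full 512 KiB stripes, and a stride of 32 blocks is the kind of value the mke2fs -E stride= option expects. The per-member figure of 976762496 KiB matches the block count /proc/mdstat reports for the raid1 on lustreone, which is built from disks of the same size.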
[r...@lustrefour ~]# cat /etc/fstab
LABEL=/                 /           ext3    defaults                 1 1
tmpfs                   /dev/shm    tmpfs   defaults                 0 0
devpts                  /dev/pts    devpts  gid=5,mode=620           0 0
sysfs                   /sys        sysfs   defaults                 0 0
proc                    /proc       proc    defaults                 0 0
LABEL=ioio-OST0001      /mnt/ost00  lustre  defaults,_netdev         0 0
192.168....@tcp0:/ioio  /mnt/ioio   lustre  defaults,_netdev,noauto  0 0
[r...@lustrefour ~]#

[r...@lustreone bin]# ./ost-survey -s 4096 /mnt/ioio
./ost-survey: 01/24/09 OST speed survey on /mnt/ioio from 192.168....@tcp
Number of Active OST devices : 2
Worst  Read OST indx: 0 speed: 38.789337
Best   Read OST indx: 1 speed: 40.017201
Read Average: 39.403269 +/- 0.613932 MB/s
Worst  Write OST indx: 0 speed: 49.227064
Best   Write OST indx: 1 speed: 78.673564
Write Average: 63.950314 +/- 14.723250 MB/s
Ost#  Read(MB/s)  Write(MB/s)  Read-time  Write-time
----------------------------------------------------
0     38.789      49.227       105.596    83.206
1     40.017      78.674       102.356    52.063

[r...@lustreone bin]# ./ost-survey -s 1024 /mnt/ioio
./ost-survey: 01/24/09 OST speed survey on /mnt/ioio from 192.168....@tcp
Number of Active OST devices : 2
Worst  Read OST indx: 0 speed: 38.559620
Best   Read OST indx: 1 speed: 40.053787
Read Average: 39.306704 +/- 0.747083 MB/s
Worst  Write OST indx: 0 speed: 71.623744
Best   Write OST indx: 1 speed: 82.764897
Write Average: 77.194320 +/- 5.570577 MB/s
Ost#  Read(MB/s)  Write(MB/s)  Read-time  Write-time
----------------------------------------------------
0     38.560      71.624       26.556     14.297
1     40.054      82.765       25.566     12.372

[r...@lustreone bin]# dd of=/mnt/ioio/bigfileMGS if=/dev/zero bs=1048576
3536+0 records in
3536+0 records out
3707764736 bytes (3.7 GB) copied, 38.4775 seconds, 96.4 MB/s

lustreone, lustretwo, lustrethree and lustrefour all have the same modprobe.conf:

[r...@lustrefour ~]# cat /etc/modprobe.conf
alias eth0 e1000
alias eth1 e1000
alias scsi_hostadapter pata_marvell
alias scsi_hostadapter1 ata_piix
options lnet networks=tcp
alias eth2 sky2
alias eth3 sky2
alias eth4 sky2
alias eth5 sky2
alias bond0 bonding
options bonding miimon=100 mode=4
[r...@lustrefour ~]#

When I do the same from all clients I can watch ./usr/bin/gnome-system-monitor, and the send and receive from the various nodes reach a 209 MiB/s plateau? Uggh

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
