On 14-Oct-09, at 14:15, James Robnett wrote:
> After reading through my first post I felt some clarification was
> probably warranted.
>
> In this test setup there are two OSS, call them OSS-1 and OSS-2,
> each has an OST, call them OSS-1-A, OSS-1-B and OSS-2-A, OSS-2-B.
>
> The MDS, OSSes and client all have 1Gbit ethernet connections.
>
> The following table illustrates the data rates I see in MB/s.
>
> OST(s)                                 Read   Write
> OSS-1-A                                 113      95
> OSS-1-B                                 112      93
> OSS-1-A OSS-1-B                         112      98
> OSS-2-A                                 105      93
> OSS-2-B                                 115      94
> OSS-2-A OSS-2-B                         115      98
> OSS-1-B OSS-2-A                   --->   42     113
> OSS-1-A OSS-2-B                   --->   42     114
> OSS-1-A OSS-1-B OSS-2-A OSS-2-B   --->   46     114
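
For scale, here is a rough sketch comparing the read rates quoted above with the nominal 1 Gbit line rate. It is not part of the original measurements. It assumes 1 Gbit/s corresponds to 125 MB/s (decimal megabytes) and ignores Ethernet/TCP overhead; the names LINE_RATE_MB_S and link_utilisation are purely illustrative.

```python
# Nominal ceiling of a 1 Gbit link, ignoring protocol overhead (assumption).
LINE_RATE_MB_S = 1000 / 8  # 125 MB/s, decimal megabytes

# Read rates in MB/s, copied from the quoted table.
read_rates = {
    "OSS-1-A": 113,
    "OSS-1-A OSS-1-B": 112,
    "OSS-2-A OSS-2-B": 115,
    "OSS-1-B OSS-2-A": 42,
    "OSS-1-A OSS-2-B": 42,
    "all four OSTs": 46,
}


def link_utilisation(rate_mb_s, line_rate=LINE_RATE_MB_S):
    """Return the fraction of the nominal line rate that a data rate uses."""
    return rate_mb_s / line_rate


for osts, rate in read_rates.items():
    print(f"{osts:<18} {rate:>4} MB/s  {link_utilisation(rate):.0%} of line rate")
```

Under these assumptions the single-OSS reads sit at roughly 90% of what the client's link can carry, while reads striped across both OSSes reach only about a third of it.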

You're sure that there isn't some other strange effect here, like you
are only measuring the speed of a single iozone thread or similar?

> I can envision that there would be more re-assembly overhead on
> the client in the case of 2 OSSes(1) but I'm surprised it's that high.
>
> Is this an expected result ?
>
> If it's unexpected is there a common misconfiguration or client
> short coming that causes it to be slower when reading from multiple
> OSSes?

This is definitely NOT expected, and I'm puzzled as to why this might be.

> Is there some command I could run or data I could provide that would
> help identify the issue ? I'm fairly new to Lustre so I'm just as
> likely to flood noise as signal if I just randomly appended data
> beyond raw rates.

You could check /proc/fs/lustre/obdfilter/*/brw_stats on the respective
OSTs to see if the client is not assembling the RPCs very well for some
reason. Alternately, it might be that you have configured the disk
storage of OSS-1 and OSS-2 to compete (e.g. different partitions sharing
the same disks).

> 1) I'm assuming in the case of a single OSS with 2 OSTs the OSS
> presents the client with a single stream. If assembly of two data
> streams is required on the client in both the single and dual OSS
> (both with 2 OSTs) cases then I'm even more confused about those
> results.

No, the client needs to assemble the OST objects itself, regardless of
whether the OSTs are on the same OSS or not. The file should be striped
over all of the OSTs involved in the test.

> James Robnett wrote:
>> The nodes are a bit cobbled together from what I had handy.
>>
>> One MDS: Dual quad-core 2.5GHz nehalem 8GB RAM E1000 gigabit NIC
>>          MDT is just a partition on a 1TB SAS Seagate
>> Two OSS: Single dual core 2.8GHz Xeon, 4GB RAM single gigabit NIC
>>          Dual 3ware 9550SX cards with 7+1 RAID 5 across 400GB WD SATA
>>          drives.
>> Two OST/OSS: 2TB. Configured as LVM. 1 and 4MB stripe size tried.
>> Client: Dual quad-core 2.5 GHz Xeon, 8GB RAM single gigabit NIC
>> Network: Dedicated Cisco 2960g Gigabit switch

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

_______________________________________________
Lustre-discuss mailing list
http://lists.lustre.org/mailman/listinfo/lustre-discuss
