I've closely followed the metadata posts on the mailing list over the last year. We've been running our small filesystem for a couple of months in semi-production mode. We don't have a traditional HPC workload (it's big image files with 5-10 small XML files), and we knew going in that Lustre doesn't excel at small files.

I ordered the beefiest MDS I could (quad-processor dual-core Opterons with 16GB RAM) and put it on the fastest array I could (14-drive RAID 10 with 15k rpm disks). Still, as always, I'm wondering if I can do better.

Everything runs over TCP/IP with no jumbo frames. My standard test is simply to track how many opens we do every 10 seconds: I run the command shown at the end of this message and keep track of the results. We never really exceed ~2000 opens/second. Our workload often involves downloading ~50000 small (4-10k) XML files as fast as possible.
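
As a rough sanity check on those numbers, the sketch below estimates the best-case time to fetch ~50000 files if the MDS open rate were the only limit. It assumes one open per file, which is an assumption and not something measured here; the function and parameter names are made up for illustration.

```python
def min_transfer_seconds(file_count, opens_per_second, opens_per_file=1):
    """Lower bound on batch time if MDS opens are the only bottleneck.

    opens_per_file is a hypothetical knob: the real number of opens a
    client issues per file is not known from the post.
    """
    if opens_per_second <= 0:
        raise ValueError("opens_per_second must be positive")
    return file_count * opens_per_file / opens_per_second


if __name__ == "__main__":
    # ~50000 files at the ~2000 opens/second ceiling mentioned above.
    print(min_transfer_seconds(50000, 2000))  # 25.0
```

Under that assumption a 50000-file batch cannot finish in much less than 25 seconds, no matter how fast the data path is.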

I'm just interested in what other Lustre gurus have to say about my results. I've tested different lru_size values (it makes little difference), and portals debug is off. My understanding is that the biggest performance increase I would see would come from moving to InfiniBand instead of TCP interconnects.

Thanks,

[EMAIL PROTECTED] lustre]# echo 0 > /proc/fs/lustre/mds/lustre1-MDT0000/stats; sleep 10; cat /proc/fs/lustre/mds/lustre1-MDT0000/stats
snapshot_time             1179180513.905326 secs.usecs
open                      14948 samples [reqs]
close                     7456 samples [reqs]
mknod                     0 samples [reqs]
link                      0 samples [reqs]
unlink                    110 samples [reqs]
mkdir                     0 samples [reqs]
rmdir                     0 samples [reqs]
rename                    99 samples [reqs]
getxattr                  0 samples [reqs]
setxattr                  0 samples [reqs]
iocontrol                 0 samples [reqs]
get_info                  0 samples [reqs]
set_info_async            0 samples [reqs]
attach                    0 samples [reqs]
detach                    0 samples [reqs]
setup                     0 samples [reqs]
precleanup                0 samples [reqs]
cleanup                   0 samples [reqs]
process_config            0 samples [reqs]
postrecov                 0 samples [reqs]
add_conn                  0 samples [reqs]
del_conn                  0 samples [reqs]
connect                   0 samples [reqs]
reconnect                 0 samples [reqs]
disconnect                0 samples [reqs]
statfs                    27 samples [reqs]
statfs_async              0 samples [reqs]
packmd                    0 samples [reqs]
unpackmd                  0 samples [reqs]
checkmd                   0 samples [reqs]
preallocate               0 samples [reqs]
create                    0 samples [reqs]
destroy                   0 samples [reqs]
setattr                   389 samples [reqs]
setattr_async             0 samples [reqs]
getattr                   3467 samples [reqs]
getattr_async             0 samples [reqs]
brw                       0 samples [reqs]
brw_async                 0 samples [reqs]
prep_async_page           0 samples [reqs]
queue_async_io            0 samples [reqs]
queue_group_io            0 samples [reqs]
trigger_group_io          0 samples [reqs]
set_async_flags           0 samples [reqs]
teardown_async_page       0 samples [reqs]
merge_lvb                 0 samples [reqs]
adjust_kms                0 samples [reqs]
punch                     0 samples [reqs]
sync                      0 samples [reqs]
migrate                   0 samples [reqs]
copy                      0 samples [reqs]
iterate                   0 samples [reqs]
preprw                    0 samples [reqs]
commitrw                  0 samples [reqs]
enqueue                   0 samples [reqs]
match                     0 samples [reqs]
change_cbdata             0 samples [reqs]
cancel                    0 samples [reqs]
cancel_unused             0 samples [reqs]
join_lru                  0 samples [reqs]
init_export               0 samples [reqs]
destroy_export            0 samples [reqs]
extent_calc               0 samples [reqs]
llog_init                 0 samples [reqs]
llog_finish               0 samples [reqs]
pin                       0 samples [reqs]
unpin                     0 samples [reqs]
import_event              0 samples [reqs]
notify                    0 samples [reqs]
health_check              0 samples [reqs]
quotacheck                0 samples [reqs]
quotactl                  0 samples [reqs]
ping                      0 samples [reqs]
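
For anyone who wants to turn a dump like the one above into per-second rates, here is a minimal sketch. It assumes the `name  count samples [reqs]` layout shown in this output and a 10-second sampling window; the function names are hypothetical, and the sample text is a subset of the lines above.

```python
import re

# Matches lines such as "open                      14948 samples [reqs]".
STAT_LINE = re.compile(r"^(\S+)\s+(\d+)\s+samples\s+\[(\w+)\]")


def parse_mds_stats(text):
    """Return {counter_name: sample_count} from an MDS stats dump."""
    counts = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:  # snapshot_time and blank lines do not match
            counts[match.group(1)] = int(match.group(2))
    return counts


def rates_per_second(counts, interval_seconds):
    """Convert raw counts to per-second rates, skipping idle counters."""
    return {
        name: count / interval_seconds
        for name, count in counts.items()
        if count > 0
    }


SAMPLE = """\
snapshot_time             1179180513.905326 secs.usecs
open                      14948 samples [reqs]
close                     7456 samples [reqs]
unlink                    110 samples [reqs]
rename                    99 samples [reqs]
statfs                    27 samples [reqs]
setattr                   389 samples [reqs]
getattr                   3467 samples [reqs]
ping                      0 samples [reqs]
"""

if __name__ == "__main__":
    rates = rates_per_second(parse_mds_stats(SAMPLE), interval_seconds=10)
    # Busiest counters first.
    for name, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
        print(f"{name:<10} {rate:8.1f} /s")
```

For this sample, the 14948 opens over 10 seconds work out to about 1495 opens/second, in line with the ~2000 opens/second ceiling described earlier.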


--
Daniel Leaberry
Systems Administrator
iArchives
Tel: 801-494-6528
Cell: 801-376-6411

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
