I've closely followed the metadata mailing list posts over the last
year. We've been running our small filesystem for a couple of months in
semi-production mode. We don't have a traditional HPC workload (it's
mostly big image files, each with 5-10 small XML files alongside), and
we knew going in that Lustre doesn't excel at small files.
I ordered the beefiest MDS I could (a quad-processor, dual-core Opteron
box with 16GB of RAM) and put it on the fastest array I could (a
14-drive RAID 10 of 15k RPM disks). Still, as always, I'm wondering if
I can do better.
Everything runs over TCP/IP with no jumbo frames. My standard test is
simply to track how many opens we do every 10 seconds: I run the
command shown below and keep track of the results. We never really
exceed ~2000 opens/second. Our workload often involves downloading
~50000 small (4-10KB) XML files as fast as possible.
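When I want more than a single 10-second window, I wrap the same
reset-and-read into a short loop like the sketch below (it assumes the
per-MDT stats file used in the command further down and just pulls out
the 'open' counter):

# Sample the MDS open counter every 10 seconds and report opens/second.
# STATS path matches our single MDT; adjust for your filesystem name.
STATS=/proc/fs/lustre/mds/lustre1-MDT0000/stats
while true; do
    echo 0 > $STATS                                # reset the counters
    sleep 10
    opens=$(awk '$1 == "open" {print $2}' $STATS)  # read the 'open' sample count
    echo "$(date '+%H:%M:%S')  $((opens / 10)) opens/sec"
done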
I'm just interested in what other Lustre gurus have to say about my
results. I've tested different lru_size values (they make little
difference), and Portals debugging is off. My understanding is that the
biggest performance increase I'd see would come from moving to
InfiniBand instead of TCP interconnects.
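For reference, the lru_size changes were made on the clients roughly
like this (a sketch; it assumes the usual
/proc/fs/lustre/ldlm/namespaces/*/lru_size entries, and 4000 is just an
example value):

# On each client, raise the DLM lock LRU size for every namespace.
for ns in /proc/fs/lustre/ldlm/namespaces/*; do
    echo 4000 > $ns/lru_size
done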
Thanks,
[EMAIL PROTECTED] lustre]# echo 0 > /proc/fs/lustre/mds/lustre1-MDT0000/stats; sleep 10; cat /proc/fs/lustre/mds/lustre1-MDT0000/stats
snapshot_time 1179180513.905326 secs.usecs
open 14948 samples [reqs]
close 7456 samples [reqs]
mknod 0 samples [reqs]
link 0 samples [reqs]
unlink 110 samples [reqs]
mkdir 0 samples [reqs]
rmdir 0 samples [reqs]
rename 99 samples [reqs]
getxattr 0 samples [reqs]
setxattr 0 samples [reqs]
iocontrol 0 samples [reqs]
get_info 0 samples [reqs]
set_info_async 0 samples [reqs]
attach 0 samples [reqs]
detach 0 samples [reqs]
setup 0 samples [reqs]
precleanup 0 samples [reqs]
cleanup 0 samples [reqs]
process_config 0 samples [reqs]
postrecov 0 samples [reqs]
add_conn 0 samples [reqs]
del_conn 0 samples [reqs]
connect 0 samples [reqs]
reconnect 0 samples [reqs]
disconnect 0 samples [reqs]
statfs 27 samples [reqs]
statfs_async 0 samples [reqs]
packmd 0 samples [reqs]
unpackmd 0 samples [reqs]
checkmd 0 samples [reqs]
preallocate 0 samples [reqs]
create 0 samples [reqs]
destroy 0 samples [reqs]
setattr 389 samples [reqs]
setattr_async 0 samples [reqs]
getattr 3467 samples [reqs]
getattr_async 0 samples [reqs]
brw 0 samples [reqs]
brw_async 0 samples [reqs]
prep_async_page 0 samples [reqs]
queue_async_io 0 samples [reqs]
queue_group_io 0 samples [reqs]
trigger_group_io 0 samples [reqs]
set_async_flags 0 samples [reqs]
teardown_async_page 0 samples [reqs]
merge_lvb 0 samples [reqs]
adjust_kms 0 samples [reqs]
punch 0 samples [reqs]
sync 0 samples [reqs]
migrate 0 samples [reqs]
copy 0 samples [reqs]
iterate 0 samples [reqs]
preprw 0 samples [reqs]
commitrw 0 samples [reqs]
enqueue 0 samples [reqs]
match 0 samples [reqs]
change_cbdata 0 samples [reqs]
cancel 0 samples [reqs]
cancel_unused 0 samples [reqs]
join_lru 0 samples [reqs]
init_export 0 samples [reqs]
destroy_export 0 samples [reqs]
extent_calc 0 samples [reqs]
llog_init 0 samples [reqs]
llog_finish 0 samples [reqs]
pin 0 samples [reqs]
unpin 0 samples [reqs]
import_event 0 samples [reqs]
notify 0 samples [reqs]
health_check 0 samples [reqs]
quotacheck 0 samples [reqs]
quotactl 0 samples [reqs]
ping 0 samples [reqs]
--
Daniel Leaberry
Systems Administrator
iArchives
Tel: 801-494-6528
Cell: 801-376-6411