Re: [Lustre-discuss] Lustre on WAN
> [ ... ] Lustre is originally designed to target HPC clusters, i.e.,
> systems in a single LAN environment.

It is not so much a single LAN that Lustre targets as streaming
workloads and low latency.

> On the other hand, the cloud we are building is physically distributed
> across different cities in the province of Alberta. [ ... ]
> performance is impressively good, partly due to the fast network we
> are running in the province.

The question here is whether the *clients* and/or the *servers* are
physically distributed. If the servers are physically distributed, then
what are the resilience requirements? That largely comes down to the
redundancy strategy for the underlying storage, and to whether files
are striped across OSSes at different sites. In the example below the
servers are centralized, but maybe that is not what you mean by a cloud.

> This is very similar to the environment that is being used at Indiana
> University. They have the Lustre servers at a central site, but
> several labs/campuses in other cities mount the filesystem, and they
> can saturate 10GigE links between the sites.

That is quite plausible, but the performance that matters depends on
whether the access patterns are streaming or not, and on doing some
decent TCP setup to maximize link utilization.

> [ ... ] one application we've tried over the WAN didn't fare very well
> out of the box - lots of small random reads (read ~4k, seek a bunch,
> read ~4k, etc. ad nauseam).

That looks more like unrealistic expectations about latency and
synchronous I/O than anything specific to Lustre. Lustre is mostly
targeted at streaming workloads on low-latency networks, even if it is
not too bad in other circumstances.
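As a concrete illustration of that "decent TCP setup", here is a
minimal sketch for a long-fat 10GigE WAN path. The buffer sizes and RPC
count are illustrative assumptions (sized roughly for bandwidth x RTT),
not values from this thread, and the osc /proc path follows
contemporary 1.x releases:

    # Widen the kernel TCP windows so a single stream can fill the pipe.
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"

    # On Lustre clients, allowing more concurrent RPCs per OSC helps
    # hide WAN latency for streaming I/O (small random reads will still
    # hurt regardless).
    for f in /proc/fs/lustre/osc/*/max_rpcs_in_flight; do
        echo 32 > "$f"
    done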
Re: [Lustre-discuss] Client evictions and RDMA failures
Thanks for your help, Brian. We've resolved the problem by upgrading
the firmware on the HCAs from 1.0.7 to 1.2.0; mounts have stabilized.
We also upgraded to OFED 1.4 (minus the kernel-ib patches).

On Tue, Mar 31, 2009 at 4:29 PM, Brian J. Murrell
brian.murr...@sun.com wrote:
> On Tue, 2009-03-31 at 16:02 -0400, syed haider wrote:
> > What would cause this? Could this be because of the fabric also?
>
> Sure. When the fabric is flaky, all sorts of unexpected things (can)
> happen. Really, your primary task should be making your network stable
> rather than continuing to muck with Lustre on it.
>
> b.
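For anyone chasing similar symptoms, a first pass at "making your
network stable" usually starts with the stock OFED fabric diagnostics.
A sketch (tool availability varies by OFED release):

    ibstat           # HCA state, firmware level, link width and speed
    ibcheckerrors    # sweep the fabric for ports with error counters
    perfquery -a     # per-port performance/error counters on the local HCA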
Re: [Lustre-discuss] e2scan question
We have been using e2scan for a few days and have noticed that the date
specification is not processed correctly by e2scan:

    # date
    Fri Apr 3 15:56:49 BST 2009
    # /usr/sbin/e2scan -C /ROOT -l -N "2009-03-29 19:44:00" /dev/dm-0 > file_list
    generating list of files with
        mtime newer than Sun Mar 29 15:56:54 2009
        ctime newer than Sun Mar 29 15:56:54 2009
    inode bitmap is read, 0 seconds
    visible root: /ROOT
    scanning inode tables ..

The solution for us, however, is to use the epoch time format. Let me
know if it works for you.

Cheers,
Wojciech

Chad Kerner wrote:
> We have been trying to use e2scan to generate a list of files that
> were changed since a specific date. We are seeing that it misses some
> of the modified files, and were wondering if anyone else has seen this
> behavior. I executed:
>
>     e2scan -N "Mar 27 08:00:00 2009" /dev/mapper/mdt_scratch > chad.out
>     [r...@abe-mds04 projects_sync]# grep ILRocstar chad.out
>     ./ROOT/.projects/roc/fnajjar/ILRocstar
>     [r...@abe-mds04 projects_sync]#
>
> The following list is a very small subset of the files within this
> directory that were changed after the scan date:
>
>     [r...@abe-bk1 ILRocstar]# pwd
>     /cfs/scratch/.projects/roc/fnajjar/ILRocstar
>     [r...@abe-bk1 ILRocstar]# ls -lR | grep "Mar 28"
>     drwxr-x--- 4 fnajjar gac 4096 Mar 28 00:56 RSRM105_3D_HG
>     drwxr-x--- 2 fnajjar gac 4096 Mar 28 00:56 Turb_LES
>     -rw------- 1 fnajjar gac 4424 Mar 28 01:05 IVHG_j01.o863001
>     -rw------- 1 fnajjar gac 4412 Mar 28 02:00 IVHG_j02.o863005
>     -rw------- 1 fnajjar gac 4427 Mar 28 05:53 IVHG_j03.o863017
>     -rw------- 1 fnajjar gac 5616 Mar 28 06:54 IVHG_j04.o863026
>     -rw------- 1 fnajjar gac 5084 Mar 28 11:47 IVHG_j05.o863591
>     -rw------- 1 fnajjar gac 4489 Mar 28 11:58 IVHG_j06.o863592
>     -rw------- 1 fnajjar gac 4428 Mar 28 20:55 IVHG_j07.o863637
>     [r...@abe-bk1 ILRocstar]# ls -lR | grep "Mar 28" | wc -l
>     164
>     [r...@abe-bk1 ILRocstar]#
>
> We want to use e2scan to generate a list of files to back up with
> Tivoli, because it is much faster to pass Tivoli a list than to have
> it traverse the entire tree. However, it looks like e2scan can't be
> trusted to generate a valid list. The version of e2scan we are running
> is:
>
>     [r...@abe-mds04 projects_sync]# rpm -qilf `which e2scan`
>     Name        : e2fsprogs                 Relocations: (not relocatable)
>     Version     : 1.40.11.sun1              Vendor: (none)
>     Release     : 0redhat                   Build Date: Thu 10 Jul 2008 11:21:23 AM CDT
>     Install Date: Sat 20 Dec 2008 04:07:31 AM CST
>     Build Host  : lts-build-x86-64-0.co.cfs
>     Group       : System Environment/Base
>     Source RPM  : e2fsprogs-1.40.11.sun1-0redhat.src.rpm
>     Size        : 2085263                   License: GPLv2
>     Signature   : (none)
>     URL         : ftp://ftp.lustre.org/pub/lustre/other/e2fsprogs/
>     Summary     : Utilities for managing the second and third extended
>                   (ext2/ext3) filesystems
>     Description :
>
> Any thoughts or suggestions would be greatly appreciated.
>
> Thanks,
> Chad
>
> --
> Chad Kerner - cker...@ncsa.uiuc.edu
> Systems Engineer, Storage Enabling Technologies
> National Center for Supercomputing Applications
> http://www.ncsa.uiuc.edu/~ckerner
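To make the epoch workaround concrete, here is a sketch. It assumes, as
reported above, that e2scan handles a numeric seconds-since-the-epoch
argument to -N correctly; the GNU date invocation is an illustration,
not taken from the thread:

    # Convert the cutoff date to seconds since the epoch, then scan.
    CUTOFF=$(date -d "2009-03-27 08:00:00" +%s)
    e2scan -l -N "$CUTOFF" /dev/mapper/mdt_scratch > chad.out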
Re: [Lustre-discuss] OSS Cache Size for read optimization
Jordan Mendler wrote:
> Hi all,
>
> I deployed Lustre on some legacy hardware, and as a result my (4)
> OSSes each have 32GB of RAM. Our workflow is such that the same 15GB
> indexes are frequently reread over and over again from Lustre (they
> are striped across all OSSes) by all nodes on our cluster. As such, is
> there any way to increase the amount of memory that either Lustre or
> the Linux kernel uses to cache files read from disk by the OSSes? That
> would allow much of the indexes to be served from memory on the OSSes
> rather than from disk. I see a lustre.memused_max = 48140176
> parameter, but am not sure what it does.
>
> If it matters, my setup is such that each of the 4 OSSes serves 1 OST
> consisting of a software RAID10 across 4 SATA disks internal to that
> OSS. Any other suggestions for tuning for fast reads of large files
> would also be greatly appreciated.

Current Lustre does not cache on OSTs at all; all I/O is direct. Future
Lustre releases will provide an OST cache. For now, you can increase
the amount of data cached on clients, which might help a little. Client
caching is set with /proc/fs/lustre/osc/*/max_dirty_mb.

cliffw

> Thanks so much,
> Jordan
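A minimal sketch of the client-side knob Cliff mentions. Note that the
follow-ups below clarify that this parameter limits dirty (write)
cache, while reads are governed by max_cached_mb; the value 256 is an
illustrative assumption, not a recommendation:

    # Raise the per-OSC dirty (write-back) cache limit, in MB, on a client.
    for f in /proc/fs/lustre/osc/*/max_dirty_mb; do
        echo 256 > "$f"
    done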
Re: [Lustre-discuss] OSS Cache Size for read optimization
The parameter is called "dirty"; is that write cache, or is it
read-write?

> Current Lustre does not cache on OSTs at all; all I/O is direct.
> Future Lustre releases will provide an OST cache. For now, you can
> increase the amount of data cached on clients, which might help a
> little. Client caching is set with
> /proc/fs/lustre/osc/*/max_dirty_mb.
Re: [Lustre-discuss] OSS Cache Size for read optimization
Yes, it is for dirty cache limiting on a per-OSC basis. There is also
/proc/fs/lustre/llite/*/max_cached_mb, which regulates how much cached
data you can have per client (the default is 3/4 of RAM).

On Apr 3, 2009, at 2:52 PM, Lundgren, Andrew wrote:
> The parameter is called "dirty"; is that write cache, or is it
> read-write?
>
> > Current Lustre does not cache on OSTs at all; all I/O is direct.
> > Future Lustre releases will provide an OST cache. For now, you can
> > increase the amount of data cached on clients, which might help a
> > little. Client caching is set with
> > /proc/fs/lustre/osc/*/max_dirty_mb.
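For a reread-heavy workload like Jordan's, the read-side limit is the
one that matters. A sketch, assuming the /proc path above; the 20480 MB
value is illustrative and assumes the client actually has that much RAM
to spare:

    # Inspect the current per-client read cache ceiling (default 3/4 of RAM).
    cat /proc/fs/lustre/llite/*/max_cached_mb

    # Raise it so a 15GB index can stay resident in client memory.
    for f in /proc/fs/lustre/llite/*/max_cached_mb; do
        echo 20480 > "$f"
    done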