[Lustre-discuss] external journal raid1 vs. single disk ext journal + hot spare on raid6
Hi All,

With the upgrade from 1.6.x to 1.8.x we are planning to reconfigure our RAID systems. The OST RAID hardware consists of Sun 6140 arrays with 16x500GB SATA disks. Each 6140 tray has one OSS node (Sun X2200 M2). We have redundant paths and ultimately plan a failover strategy. The MDT will be a RAID 1+0 Sun 2540 with 12x73GB SAS disks.

Each 6140 tray will be configured as either 1 or 2 RAID6 volumes. The Lustre manual recommends several smaller OSTs over one large OST, and other docs I've seen seem to indicate that the optimal number of drives is ~(6+2). For these 16-disk trays, the choice would be one (12+2 RAID6) + external journal and/or hot spares, or two (5+2 RAID6)s + external journal and/or hot spares. So my questions are:

1.) What are the trade-offs of a RAID1 external journal with no hot spare vs. a single-disk external journal with a hot spare (the spare being for the RAID6 volume)? Specifically:
- If a single-disk external journal is lost, can we run fsck and only lose the transactions that have not been committed to disk? If so, then the loss of the disk hosting the external journal would not be catastrophic for the file system as a whole.
- How comfortable are RAID6 users with no hot spares? (We'll have cold spares handy, but prefer to get through weekends without service.)

2.) The external journal only takes up ~400MB. If we create 2 RAID6 volumes, can we put 2 external journals on one disk or RAID1 set (suitably partitioned), or do we need to dedicate an entire disk to one external journal?

3.) In planning for segment size (chunk size in the Lustre manual) we'd have to go to 128kB or lower. However, in single-disk tests (SATA), it seems that larger is better, so perhaps this argues for small RAID6 sets as mentioned in the manual. Just wondering what other folks have found here also.

We have the opportunity to test several scenarios with 2 6140 trays that are not part of the 1.6.x production system, so I expect we will test performance as a function of the number of drives in the RAID6 volume (e.g. 12+2 vs 5+2) along with array write segment sizes via sgpdd-survey. I'll report back with test results once we sort out which knobs seem to make the most difference.

Any advice or comments welcome,
Stuart

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
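On question 2, ldiskfs (ext3) does allow the journal to live on its own block device, so in principle one suitably partitioned disk can host a journal per OST volume. A minimal sketch, assuming hypothetical partition and target names (`/dev/sdq1`, `/dev/sdq2`, `/dev/mapper/ost0`, and the fsname/MGS NID are all placeholders, not from the original post):

```shell
# Hypothetical layout: one disk /dev/sdq split into two ~400MB slices,
# one external journal per RAID6 OST volume.
mke2fs -O journal_dev -b 4096 /dev/sdq1   # journal device for OST volume 1
mke2fs -O journal_dev -b 4096 /dev/sdq2   # journal device for OST volume 2

# Point each OST at its journal when formatting (all names are placeholders):
mkfs.lustre --ost --fsname=testfs --mgsnode=mgs@tcp0 \
    --mkfsoptions="-J device=/dev/sdq1" /dev/mapper/ost0
```

The journal device must be created with the same block size as the target file system, hence the explicit `-b 4096`.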
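On question 3, one common line of reasoning (not from the original post) is that a full-stripe write should line up with the client RPC size, and with a fixed segment size the full-stripe size grows with the number of data disks, which is one argument for the smaller sets. A quick back-of-the-envelope check:

```shell
# Full-stripe write size = (number of data disks) x segment size.
seg_kb=128
for layout in "12+2:12" "5+2:5"; do
  name=${layout%%:*}
  data=${layout##*:}
  echo "$name RAID6, ${seg_kb}kB segments -> full stripe $((data * seg_kb))kB"
done
```

So at 128kB segments, a 5+2 set gives a 640kB full stripe while a 12+2 set gives 1536kB, which a 1MB RPC cannot fill in one go.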
Re: [Lustre-discuss] compact lustre system advice needed
Thanks for the reply. I'm interested in the sense that I thought the Lustre client-server I/O has less overhead and latency than NFS. If that is true, then this may be better than NFS even with one server. In particular, with a single patchless Lustre client using ethernet, I can get much better performance from my existing (larger) Lustre system than from NFS. Since each file is actually only on one server, it seems the same might be true in the case I described. So I'm game to test this if I can figure out the details of the configuration. I'm quite familiar with the regular configuration with distinct MDS/MGS and OSSs.

Stuart

On Thu, Feb 26, 2009 at 5:19 PM, Kevin Van Maren kevin.vanma...@sun.com wrote:

Yes, it can be done (use bonded Ethernet), but with only a single server NFS is likely a better fit - Lustre's advantage is scaling with many servers.

Kevin

On Feb 26, 2009, at 5:15 PM, Stuart Marshall stuart.l.marsh...@gmail.com wrote:

Hi, Is it possible to set up a Lustre service on a single machine with multiple ethernet ports, multiple CPU cores and multiple disks? The idea would be that the machine would either use L4 link aggregation or multiple IPs, and a modest number of clients (10) would access the file system. The MDT and >=1 OSTs would not share physical disks. The motivation for this is to get the best performance possible for such a small collection of hosts. In my case, there would be 1 writer and multiple (asynchronous) readers. The clients would be separate machines. The MGS/MDS/OSSs would be in the same machine. Has this been done or discussed before?

Any comments welcome, thanks,
Stuart

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
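For what it's worth, formatting the single-server layout described above would look the same as the distributed case, just with every target on one node. A rough sketch, assuming made-up device names, fsname, and NID (`/dev/sdb` etc., `testfs`, `server1@tcp0` are all invented for illustration), with the MDT and OSTs on separate disks:

```shell
# All targets on one host; clients reach it over ethernet (tcp0).
mkfs.lustre --fsname=testfs --mgs --mdt /dev/sdb                  # combined MGS/MDT
mkfs.lustre --fsname=testfs --ost --mgsnode=server1@tcp0 /dev/sdc # OST 0
mkfs.lustre --fsname=testfs --ost --mgsnode=server1@tcp0 /dev/sdd # OST 1

# Mounting a target starts its service:
mkdir -p /mnt/mdt /mnt/ost0 /mnt/ost1
mount -t lustre /dev/sdb /mnt/mdt
mount -t lustre /dev/sdc /mnt/ost0
mount -t lustre /dev/sdd /mnt/ost1

# On each client:
mount -t lustre server1@tcp0:/testfs /mnt/testfs
```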
Re: [Lustre-discuss] Can't build sun rdac driver against lustre source.
Hi, I have compiled and used the Sun rdac driver, and my modified makefile is attached. The sequence I've used (perhaps not the best) is:

- cd /lib/modules/2.6.9-67.0.7.EL_lustre.1.6.5.1smp/source/
- cp /boot/config-2.6.9-67.0.7.EL_lustre.1.6.5.1smp .config
- make clean
- make mrproper
- make prepare-all
- cd /tmp
- tar xf path_to_rdac_tarfile/rdac-LINUX-09.01.B2.74-source.tar
- cd linuxrdac-09.01.B2.74/
- cp path_to_my_makefile/Makefile_linuxrdac-09.01.B2.74 Makefile
- make clean
- make uninstall
- make
- make install
- vim /boot/grub/menu.lst (initrd - mpp)
- reboot

The changes in the Makefile may fix your problem. I'm using Sun 6140 arrays and also plan to use a 2540 as the MDT soon.

Stuart

On Fri, Jul 25, 2008 at 11:03 AM, Brock Palen [EMAIL PROTECTED] wrote:

Hi, I ran into two problems. The first was easy to resolve:

/bin/sh: scripts/genksyms/genksyms: No such file or directory
/bin/sh: scripts/mod/modpost: No such file or directory

I just had to copy genksyms and mod from linux-2.6.9-67.0.7.EL_lustre.1.6.5.1 to linux-2.6.9-67.0.7.EL_lustre.1.6.5.1-obj. I figured you should be aware of this, in case it's a problem with Sun's build system for their multipath driver or the Lustre source package. This is on RHEL4, using the Lustre RPMs from Sun's website.

The next problem I am stuck on is:

In file included from mppLnx26_spinlock_size.c:51:
/usr/include/linux/autoconf.h:1:2: #error Invalid kernel header included in userspace
mppLnx26_spinlock_size.c: In function `main':
mppLnx26_spinlock_size.c:102: error: `spinlock_t' undeclared (first use in this function)
mppLnx26_spinlock_size.c:102: error: (Each undeclared identifier is reported only once
mppLnx26_spinlock_size.c:102: error: for each function it appears in.)
make: *** [mppLnx_Spinlock_Size] Error 1

I guess what I should really ask is: has anyone ever made multipath work with a Sun 2540 array for use as the MDS/MGS file system?

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985

Attachment: Makefile_linuxrdac-09.01.B2.74

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
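For the record, Brock's first problem (the missing genksyms/modpost helpers) was fixed by copying them from the kernel source tree into the matching -obj tree. A sketch of that workaround, with the exact source-tree locations assumed from the error messages rather than verified:

```shell
# Assumed RHEL4 layout, per the error output in the message above.
SRC=/usr/src/linux-2.6.9-67.0.7.EL_lustre.1.6.5.1
OBJ=/usr/src/linux-2.6.9-67.0.7.EL_lustre.1.6.5.1-obj

cp -a "$SRC/scripts/genksyms" "$OBJ/scripts/"   # provides scripts/genksyms/genksyms
cp -a "$SRC/scripts/mod"      "$OBJ/scripts/"   # provides scripts/mod/modpost
```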
[Lustre-discuss] problem with updating to e2fsprogs-1.40.7.sun3-0redhat.i386.rpm on x86_64 RHEL4U6 machine
I downloaded the file from the Sun download site with the (site) title:

Download Lustre(TM) 1.6.5.1 General Availability for Red Hat Enterprise Linux 4, i686, English

and name:

e2fsprogs-1.40.7.sun3-0redhat.i386.rpm

When I install this on a fully patched RHEL4U6 x86_64 machine (to satisfy dependencies on libcom_err.so.2), I get unmet dependencies on GLIBC_2.4, which is not installed in RHEL4 (I don't think). So the Sun download center does not seem to have the right RHEL4 .i386.rpm in the i686 section.

If instead I download from http://downloads.lustre.org/public/tools/e2fsprogs/1.40.7.sun3/ the file named e2fsprogs-1.40.7.sun3-0redhat.rhel4.i386.rpm, I see no problem.

Which file is supposed to be correct? Should the Sun download site be updated? Should I file a bug, or is there one already? (I looked but did not find this exact problem.)

thanks in advance,
Stuart

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
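A quick way to compare the two packages without installing anything is to query their declared dependencies directly; a GLIBC_2.4 requirement would indicate the package was built against a newer glibc than RHEL4 ships. A sketch, using the filenames as downloaded:

```shell
# Inspect each RPM's requirements; RHEL4 glibc does not provide GLIBC_2.4.
rpm -qp --requires e2fsprogs-1.40.7.sun3-0redhat.i386.rpm       | grep GLIBC
rpm -qp --requires e2fsprogs-1.40.7.sun3-0redhat.rhel4.i386.rpm | grep GLIBC
```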
Re: [Lustre-discuss] lustre and multi path
Hi Brock, We have Sun OSSs, MDSs, and FC-attached arrays all connected via an FC switch. We use Sun's rdac driver (rdac-LINUX-09.01.B2.74-source.tar). I think we tried to get the native RHEL4 multipath working but did not succeed with our configuration.

Stuart

On Thu, Jun 5, 2008 at 3:57 PM, Brock Palen [EMAIL PROTECTED] wrote:

Our new Lustre hardware arrived from Sun today. Looking at the dual MDS and FC disk array for it, we will need multipath. Has anyone ever used multipath with Lustre? Are there any issues? If we set up regular multipath via LVM, Lustre won't care as far as I can tell from browsing the archives. What about multipath without LVM? Our StorageTek array has dual controllers with dual ports going to dual-port FC cards in the MDSs. Each MDS has a connection to both controllers, so we will need multipath to get any advantage from this. Comments?

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
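On the "multipath without LVM" question: if the native dm-multipath route does work, there is no need for an LVM layer in between, since Lustre can be formatted directly on the multipath device. A hypothetical sketch (the device name `/dev/mapper/mpath0` and fsname are invented, and this is the generic dm-multipath flow rather than anything we verified on the 2540):

```shell
# Load dm-multipath and build the path maps (RHEL4-era tooling).
modprobe dm-multipath
multipath -v2          # create the multipath devices
multipath -ll          # verify both paths to each LUN are visible

# Format the MDT directly on the dm device, no LVM in between.
mkfs.lustre --fsname=testfs --mgs --mdt /dev/mapper/mpath0
```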