Re: [Lustre-discuss] Lustre over two TCP interfaces
That will probably be slow - the machine you use to proxy the IPVS address would be a bottleneck. Out of curiosity, what problem are you trying to solve here? Do you anticipate whole-subnet outages to be an issue (and if so, why)?

--
Mike Shuey

On Wed, Jun 26, 2013 at 4:53 AM, Alfonso Pardo <alfonso.pa...@ciemat.es> wrote:

Oooh! Thanks for your reply!

Maybe another way would be a floating IP between the two interfaces, with IPVS (corosync).

-----Original Message-----
From: Brian O'Connor
Sent: Wednesday, June 26, 2013 10:15 AM
To: Alfonso Pardo
Cc: 'Michael Shuey'; 'WC-Discuss'; lustre-discuss@lists.lustre.org
Subject: Re: [Lustre-discuss] Lustre over two TCP interfaces

On 06/26/2013 04:16 PM, Alfonso Pardo wrote:
> But what if I configure the OST assigning it to the first interface of the OSS (bond0), and declare the second interface of the OSS as the failover? If the bond0 network goes down, the client will try to connect to the failover - that is, the second interface of the OSS. Is that possible?

I stand to be corrected, but no, I don't think so. As I understand it, the failover code looks for a different server instance, rather than a different nid. See:

http://lists.opensfs.org/pipermail/lustre-devel-opensfs.org/2012-August/28.html

From: Brian O'Connor <bri...@sgi.com>
Sent: Wednesday, June 26, 2013 1:09 AM
To: 'Alfonso Pardo' <alfonso.pardo@ciemat.es>; 'Michael Shuey' <sh...@purdue.edu>
Cc: 'WC-Discuss' <WC-Discuss.Migration@intel.com>; lustre-discuss@lists.lustre.org
Subject: RE: [Lustre-discuss] Lustre over two TCP interfaces

Unless something has changed in the newer versions of Lustre, I don't think Lustre can do failover between nids on the same machine. It can choose an available nid at mount time, but if an active nid goes away after you are mounted, then the client chooses the failover nid - and this must be on a different server. Check the archives for more discussion of this topic :)

-----Original Message-----
From: Alfonso Pardo <alfonso.pa...@ciemat.es>
Sent: Tuesday, June 25, 2013 07:23 AM Central Standard Time
To: Michael Shuey
Cc: WC-Discuss; lustre-discuss@lists.lustre.org
Subject: Re: [Lustre-discuss] Lustre over two TCP interfaces

Thanks, Michael.

That is my second step: I will change the lnet options to "options lnet networks=tcp0(bond0,bond1)", because my machines have 4 NICs - I have bond0 and bond1, each using LACP. I need the clients to reach the OSS over two networks, for network HA: if the bond0 network is down, the clients can still reach the OSS via the second network, bond1.

If I change the modprobe options to "options lnet networks=tcp0(bond0),tcp1(bond1)", how do the clients mount the filesystem so they can reach the OSS over both networks?

From: Michael Shuey <sh...@purdue.edu>
Sent: Tuesday, June 25, 2013 2:14 PM
To: Alfonso Pardo <alfonso.pardo@ciemat.es>
Cc: lustre-discuss@lists.lustre.org; WC-Discuss <WC-Discuss.Migration@intel.com>
Subject: Re: [Lustre-discuss] Lustre over two TCP interfaces

Different interfaces need to be declared with different LNET networks - something like networks=tcp0(eth0),tcp1(eth1).
Of course, that assumes your clients are configured to use a mix of tcp0 and tcp1 for connections (with each client only using one of the two). This is really only useful in corner cases, when you're doing something strange; if eth0 and eth1 are on the same subnet (as in your example), this is almost certainly not productive.

A better bet might be to use a single LNET network and bond the two interfaces together - either as an active/passive pair, or active/active (e.g., LACP). Then you'd declare networks=tcp0(bond0), give the bond a single IP address, and client traffic would be split across the two members of the bond more like you probably expect (given the limits of the bonding protocol you're using).

--
Mike Shuey
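[Editor's note: as a concrete illustration of Brian's point above - failover partners are declared as different servers, not as extra nids on one server - here is a hedged sketch of how an OST is typically formatted for failover. The host names, fsname, index, and device are hypothetical, not taken from this thread.]

    # Format the OST naming a *second server* (oss2, hypothetical) as
    # the failover partner. Clients that lose the active server retry
    # oss2's NID; there is no failover between two NIDs of one server.
    mkfs.lustre --fsname=testfs --ost --index=0 \
        --mgsnode=mgs@tcp0 \
        --failnode=oss2@tcp0 \
        /dev/sdb
    # The OST is then mounted on oss1; after a failure it would be
    # failed over to (and mounted on) oss2.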
Re: [Lustre-discuss] Lustre over two TCP interfaces
Different interfaces need to be declared with different LNET networks - something like networks=tcp0(eth0),tcp1(eth1). Of course, that assumes your clients are configured to use a mix of tcp0 and tcp1 for connections (with each client only using one of the two). This is really only useful in corner cases, when you're doing something strange; if eth0 and eth1 are on the same subnet (as in your example), this is almost certainly not productive.

A better bet might be to use a single LNET network and bond the two interfaces together - either as an active/passive pair, or active/active (e.g., LACP). Then you'd declare networks=tcp0(bond0), give the bond a single IP address, and client traffic would be split across the two members of the bond more like you probably expect (given the limits of the bonding protocol you're using).

--
Mike Shuey

On Tue, Jun 25, 2013 at 8:06 AM, Alfonso Pardo <alfonso.pa...@ciemat.es> wrote:

Hello friends,

I need to communicate with my OSS over two ethernet TCP interfaces, eth0 and eth1. I have configured this in my modprobe.d with:

    options lnet networks=tcp0(eth0,eth1)

And I can see two interfaces with:

    lctl --net tcp interface_list
    sa-d4-01.ceta-ciemat.es: (192.168.11.15/255.255.255.0) npeer 0 nroute 2
    sa-d4-01.ceta-ciemat.es: (192.168.11.35/255.255.255.0) npeer 0 nroute 0

But the clients can only communicate with the first interface:

    lctl ping 192.168.11.15
    12345-0@lo
    12345-192.168.11.15@tcp

    lctl ping 192.168.11.35
    failed to ping 192.168.11.35@tcp: Input/output error

Any suggestions on how to "enable" the second interface?

Thanks in advance,

Alfonso Pardo Diaz
System Administrator / Researcher
c/ Sola nº 1; 10200 TRUJILLO, SPAIN
Tel: +34 927 65 93 17  Fax: +34 927 32 32 37
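[Editor's note: to make Mike's single-LNET suggestion concrete, a minimal sketch follows, assuming the bond itself is already configured. The config path, fsname, and mount point are placeholders; only the 192.168.11.15 address comes from the thread.]

    # /etc/modprobe.d/lustre.conf (hypothetical path), on servers and
    # clients alike: one LNET network riding on the bond's single IP
    options lnet networks=tcp0(bond0)

    # On a client, mount via the MGS's bonded address; fsname and
    # mount point are placeholders
    mount -t lustre 192.168.11.15@tcp0:/testfs /mnt/testfs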
Re: [Lustre-discuss] [HPDD-discuss] Disk array setup opinions
If you can, I'd advocate the route you suggest - multiple RAID groups, each group maps to a unique LUN, and each LUN is an OST. Note that you'll likely want the number of data disks in each RAID to be a power of 2 (e.g., a 6- or 10-disk raid6, or a 5- or 9-disk raid5). Obviously, you'll be wasting more spindles on overhead (RAID parity), but performance is more predictable.

The other way (single RAID, multiple LUNs, LUN == OST) means the performance of the OSTs isn't independent - they all bottleneck on the same RAID array. If you have enough RAID controller bandwidth, this can (theoretically) work, but it makes hunting down and fixing performance problems more complex. In Lustre, if a file isn't writing fast enough, you can just stripe it over more OSTs. However, if your OSTs aren't really independent, that may or may not help - you'll get different bandwidth depending on how many OSTs are sharing the same pool of physical disks. I'd expect two OSTs that don't share drives to write faster than two that do, and so on.

BTW, if you have multiple controllers, and the platform has a sense of controller affinity (i.e., a LUN uses one controller as primary and another as secondary or backup), try to balance your RAIDs across the two controllers in your array - for instance, stick even-numbered LUNs on one and odd-numbered LUNs on the other. Also, if you're doing multi-pathing into your OSSes, make sure your multipath drivers are aware of this arrangement, and respect it.

Most midrange disk trays will do multipath, and cache mirroring between controllers - but if you read the fine print, you often find that access through the secondary controller is MUCH slower. It's usually implemented as a write-through to the primary, or has its cache disabled while the primary is active, etc. Cache mirroring at high speed is hard, complicated, and expensive, so vendors often implement only what's minimally necessary to do failover - even if that means the secondary controller doesn't cache a LUN unless the primary dies. If you have one of these (and I've no idea if Dell's MD3200 does this, but the behavior is common enough that I'd think about it), you'll want to split LUNs evenly between controllers to maximize cache use. You'll also want to make sure the OSS knows which path is primary for which LUN, so it doesn't send traffic down the wrong path (or worse, down both - round-robin balancing is a bad idea when the paths are asymmetric) unless there's been a hardware failure. And if you implement a single RAID group exporting multiple LUNs, any multi-controller effects get far more complicated - and highly implementation-dependent.

TL;DR - Multi-RAID, with RAID group == LUN == OST. Keep OSTs as independent as you can, and watch your controller and OSS multipath settings (if used).

--
Mike Shuey

On Sat, Mar 9, 2013 at 10:19 AM, Jerome, Ron <ron.jer...@ssc-spc.gc.ca> wrote:

I am currently having a debate about the best way to carve up Dell MD3200s to be used as OSTs in a Lustre file system, and I invite this community to weigh in... I am of the opinion that it should be set up as multiple RAID groups, each having a single LUN, with each RAID group representing an OST, while my colleague feels that it should be set up as a single RAID group across the whole array with multiple LUNs, with each LUN representing an OST. Does anyone in this group have an opinion (one way or another)?
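[Editor's note: to illustrate the power-of-2 sizing Mike recommends, here is the arithmetic for a 10-disk RAID6 (8 data + 2 parity), expressed as hedged mkfs options. The 128 KiB segment size, fsname, index, and device name are assumptions, and the extended-option spelling may vary with e2fsprogs version.]

    # 8 data disks x 128 KiB segments = 1 MiB full stripe, matching
    # Lustre's 1 MiB RPCs, so full-stripe writes avoid RAID6
    # read-modify-write cycles. With 4 KiB blocks:
    #   stride       = 128 KiB / 4 KiB      = 32
    #   stripe width = 32 blocks x 8 disks  = 256
    mkfs.lustre --fsname=testfs --ost --index=0 --mgsnode=mgs@tcp0 \
        --mkfsoptions="-E stride=32,stripe_width=256" \
        /dev/mapper/ost0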
Regards,

Ron Jerome
Re: [Lustre-discuss] problem with installing lustre and ofed
RedHat's OFED tends to lag Mellanox's. They're pretty current on bugfixes, but support for the latest hardware is usually 3-6 months behind - it took about 4 months to bring in drivers for our most recent FDR system. Also, support for Mellanox's advanced features (e.g., MXM, FCA) is often missing.

--
Mike Shuey

On Mon, Dec 31, 2012 at 11:32 AM, Brian J. Murrell <brian.murr...@linux.intel.com> wrote:

On Fri, 2012-12-28 at 15:54 -0800, Jason Brooks wrote:
> Hello,

Hi,

> I am having trouble installing the server modules for lustre 2.1.4 and use mellanox's OFED distribution

Is there a particular need for the Mellanox OFED distribution? The RedHat EL6 kernel comes stock with the InfiniBand drivers and stack already baked in, and we leverage that and build our lustre-modules RPM against it. So unless there is something particular that you need that is only in the Mellanox OFED distribution and not already in EL6's kernels, you should be able to just use the binary kernel and lustre-modules RPMs that we supply and have working InfiniBand support.

Cheers,
b.
Re: [Lustre-discuss] ofed with FDR14 support Lustre
We're using 1.8.7.80-wc1 here. It's basically 1.8.7-wc1, but with a few fixes pulled in from git a few months back so it builds on RHEL 6.2. It's built on top of Mellanox's OFED 1.5.3-3.0.0, and is working just fine on our FDR14 cluster.

--
Mike Shuey

On Wed, May 30, 2012 at 3:41 PM, John White <jwh...@lbl.gov> wrote:

Does anyone know of a Lustre version that can build against an OFED that supports FDR14 (1.5.4+, by my understanding)? Or is this still in the pipes? The compat matrix on the Whamcloud site only mentions support up to OFED 1.5.3.1 (confirmed to build, but it doesn't support FDR14).

John White
HPC Systems Engineer
(510) 486-7307
One Cyclotron Rd, MS: 50C-3209C
Lawrence Berkeley National Lab
Berkeley, CA 94720
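[Editor's note: for reference, a hedged sketch of pointing a Lustre build at an external OFED tree rather than the in-kernel stack. The kernel and OFED paths are hypothetical, not a record of the exact build described above.]

    # From a Lustre source tree; --with-o2ib selects the OFED headers
    # the o2iblnd driver is compiled against (paths are hypothetical)
    ./configure --with-linux=/usr/src/kernels/2.6.32-220.el6.x86_64 \
                --with-o2ib=/usr/src/ofa_kernel
    make rpms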
Re: [Lustre-discuss] Could Client MGS or Client OSS running at one node together?
I've had success with LACP bonding, with an LACP-aware switch. You may also want to check your xmit_hash_policy (it's a kernel option to the Linux bonding driver). I've had the best luck with layer3+4 bonding, using several OSSes with sequential IPs, and striping files at a multiple of the number of links in the bond.

With most bonding modes, packets can get sent across different links in the bond. This results in out-of-order packets, and can slow down a TCP stream (like your Lustre connection) unnecessarily. LACP will route packets to a given destination across exactly one link in the bond - but that limits each TCP stream to the link speed of a single bond member.

You can improve upon single-link speeds with Lustre, because Lustre will let you stripe large files across multiple OSSes. A client will build a separate TCP connection to each OSS, so as long as the traffic passes over different links, you can use all the available bandwidth. The way traffic is scattered across the bond members is controlled by the xmit_hash_policy option; layer3+4 uses an XOR of the source and destination addresses, combined with the source and destination TCP ports (modulo the number of links in the bond), to pick the specific link for each stream. If you're using sequential IPs for your OSSes, you should get a good scattering effect (since your source address and port won't change, but your destination address will vary across the OSSes).

A few years ago, I was using Lustre 1.4 and 1.6 and saw 200+ MB/sec across two gigE links bonded together on the client (using striped files). Your mileage may vary, of course. Caveats include:

- Small files may perform worse than usual, due to transaction overhead to multiple OSSes (if your bond is on the client). Similarly, non-striped files will only see the speed of a single link.
- If your OSTs start to fill, Lustre's load balancing may not give you an ideal distribution of stripes across OSSes - causing multiple TCP streams to land on the same bond member on the client. Unfortunately, this will present as slowdowns for certain files on certain clients (because the number of links that can be used is a function of both which OSSes hold the file and the client's IP in the hash policy).
- All metadata accesses are limited to the speed of a single bond member on the client.

If your bonds are on the server, then (as long as you have a number of clients) you should see a nice increase in overall IO throughput. It won't be as marked a boost as 10gigE or InfiniBand, but bonds are inexpensive and generally better than a single link (to multiple clients).

Hope this helps - good luck!

--
Mike Shuey

On Wed, Mar 7, 2012 at 9:27 PM, zhengfeng <zf5984...@gmail.com> wrote:

Dear all, thanks a lot for your answers ;)

Now I have another problem, with the network between the nodes. Since there is no InfiniBand or 10G NIC, but I still want to increase the bandwidth by adding more 1G NICs, I plan to use Linux bonding. But when bonding 4 NICs together at one node, there is NO performance gain, no matter which bonding mode (as described in the kernel docs) is used. Instead, the performance of the 4-NIC bond is lower than a single NIC's. With 2 NICs bonded, the performance is better than a single NIC's. The result is:

    bonding 2 NICs: 1 + 1 > 1
    bonding 4 NICs: 1 + 1 + 1 + 1 < 1

So confused... The benchmark we used is netperf. And when I used tcpdump to dump the packets, I found a great number of TCP segments out of order.
My questions are:

a) The TCP segments are out of order, which degraded the 4-NIC bonding performance - is this the root cause?

b) We now doubt the feasibility of this method (using a 4-NIC bond to increase bandwidth). Any proposals about that? If so, maybe I should use some other method instead.

Thanks again, all.

Best Regards,
Zheng

From: Peter Grandi
Date: 2012-03-07 20:54
To: Lustre discussion
Subject: Re: [Lustre-discuss] Could Client MGS or Client OSS running at one node together?

> Since there is no more node in our project when using Lustre, I want to confirm that: 1) Could the Client and MGS run at one node together? Or could the Client and OSS run at one node together? 2) Suppose I had deployed them at one node, what potential shortcomings or harm are there?

Running the MGS and MDS on the same node is customary, see:

http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_24122

Running the MGS, MDS and OSS services on the same node is possible and fairly common in very small setups, usually those with only 1-2 nodes. It is possible to use the client code on all types of Lustre servers, but at least in the case of using the client code on an OSS there is the non-negligible possibility of a resource deadlock: if the client uses the OSS on the same node, the client and OSS codes compete for memory, so in the past this has been discouraged.
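[Editor's note: returning to the bonding question - a hedged sketch of the layer3+4 LACP configuration Mike describes earlier in this thread, plus the striping side of the trick. The config path, stripe count, and directory are assumptions.]

    # /etc/modprobe.d/bonding.conf (hypothetical path): hash on
    # IP + TCP port, so each client<->OSS TCP stream sticks to
    # exactly one bond member instead of reordering across links
    options bonding mode=802.3ad miimon=100 xmit_hash_policy=layer3+4

    # Stripe large files over as many OSTs as the bond has links
    # (here 2), so a client opens enough separate TCP streams to
    # exercise every member of the bond
    lfs setstripe -c 2 /mnt/lustre/striped-dir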
Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib
Is your ethernet FS in tcp1, or tcp0? Your config bits indicate the client is in tcp1 - do the servers agree?

--
Mike Shuey

On Tue, Jun 14, 2011 at 12:23 PM, Thomas Roth <t.r...@gsi.de> wrote:

Hi all,

I'd like to mount two Lustre filesystems on one client. Issues with more than one MGS set aside, the point here is that one of them is an InfiniBand cluster and the other is ethernet-based - and my client is on the ethernet. I have managed to mount the o2ib FS by setting up an LNET router, but now this client's LNET doesn't know how to reach the ethernet FS. The basic modprobe.conf reads:

    options lnet networks=tcp1(eth0) routes="o2ib LNET-Router-IP@tcp1"

This mounts the MGS on the o2ib network. What do I have to add to get to the MGS on the tcp network? Meanwhile I have studied more posts here and came up with:

    options lnet networks=tcp1(eth0),tcp2(eth0:0) routes="o2ib LNET-Router-IP@tcp1; tcp Default-Gateway-IP@tcp2"

That doesn't work either, but I see in the log of the (tcp) MGS:

    LustreError: 120-3: Refusing connection from Client-IP for MGS-IP@tcp2: No matching NI

Something's getting through... Any ideas?

Regards,
Thomas
Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib
That may be because your gateway doesn't have an interface on tcp (aka tcp0). I suspect you want to keep your ethernet clients in tcp0, your IB clients in o2ib0, and your router in both.

Personally, I find it easiest to just give different module options on each system (rather than try the ip2nets stuff). On the ethernet clients, I'd try:

    options lnet networks=tcp0(eth0) routes="o2ib0 LNET-router-eth_IP@tcp0" dead_router_check_interval=300

On the IB clients:

    options lnet networks=o2ib0(ib0) routes="tcp0 LNET-router-IB_IP@o2ib0" dead_router_check_interval=300

Then on the router:

    options lnet networks=tcp0(eth0),o2ib0(ib0) forwarding=enabled accept_timeout=15

Obviously, your file servers will need lnet options similar to the clients' - the ethernet-side servers like the ethernet clients, and the IB-side servers like the IB clients:

    options lnet networks=tcp0(eth0) routes="o2ib0 LNET-router-eth_IP@tcp0" dead_router_check_interval=300
    options lnet networks=o2ib0(ib0) routes="tcp0 LNET-router-IB_IP@o2ib0" dead_router_check_interval=300

That's just a guess, your mileage may vary, etc., but I think it's close to what you want. Note that you really want dead_router_check_interval if you're using LNET routers. Without that parameter, the Lustre client will automatically mark a router as failed when it's unavailable, but will never check whether it comes back. With this parameter, it checks every 300 seconds (and re-enables the router if found).

Hope this helps.

--
Mike Shuey

On Tue, Jun 14, 2011 at 1:26 PM, Thomas Roth <t.r...@gsi.de> wrote:

Hm, the ethernet FS is in tcp0 - the MGS says its nids are MGS-IP@tcp. So it's not surprising that it refuses the connection. On the other hand,

    options lnet networks=tcp1(eth0),tcp(eth0:0) routes="o2ib LNET-Router-IP@tcp1; tcp Default-Gateway-IP@tcp"

results in:

    Can't create route to tcp via Gateway-IP@tcp

Cheers,
Thomas
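[Editor's note: to sanity-check a routed LNET setup like the one sketched above, something along the following lines should work; the NID is a placeholder.]

    # On an ethernet client: show local NIDs, and the routes LNET
    # currently knows about (including whether the gateway is up)
    lctl list_nids
    lctl route_list

    # Then ping a server NID on the far side of the router
    lctl ping 10.0.0.5@o2ib0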
Re: [Lustre-discuss] Can one node mount more than one lustre cluster?
FYI, I'm using Lustre 1.8.5 to mount two filesystems from separate domains (one in Lafayette, IN, and one in Bloomington, IN, run by two different institutions) on 900+ nodes, and things just work.

--
Mike Shuey

On Tue, Mar 15, 2011 at 6:24 PM, Andreas Dilger <adil...@whamcloud.com> wrote:

On 2011-03-15, at 4:19 PM, Malcolm Cowe wrote:

> On 15/03/2011 17:15, Brian O'Connor wrote:
>> What are the constraints on a client mounting more than one lustre file system? I realise that a lustre cluster can have more than one file system configured, but can a client mount different file systems from different lustre clusters on the same network? I.e., assume a single IB fabric and two Lustre clusters with separate MGS/MDS/OSS. One lustre is Lincoln and the other is Washington:
>>
>>     list_nids
>>     192.168.1.10@o2ib
>>
>>     mount -t lustre lincoln:/lustre /lincoln
>>     mount -t lustre washington:/lustre /washington
>>
>> Is this doable, or should they be on separate IB fabrics?
>
> From my understanding, there should only be one MGS for the entire environment (although as many MDTs and OSTs as are required). This is because clients will only communicate with exactly one MGS (and will communicate with the MGS of the last FS mounted) and will only receive updates from the MGS with which they are registered. So, in the above example, if there is a change to the lincoln file system (e.g. a failover event, some configuration changes), clients will not receive notification.

There's not an issue with having multiple MGSes on a site, only with mounting Lustre file systems from multiple domains on the same client, IIRC. That discussion was had on the list a few months ago, and the correct answer is that it should just work. The "only use the last-mounted MGS" problem was fixed at some point, though I don't have the exact version handy.

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
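[Editor's note: for completeness, a hedged sketch of what the two mounts look like when each filesystem has its own MGS, in the spirit of Brian's example. The MGS hostnames are placeholders; each must resolve to the NID of that filesystem's MGS.]

    # Each mount names its own MGS NID, so the client registers with
    # both MGSes (hostnames here are hypothetical)
    mount -t lustre mgs-lincoln@o2ib:/lustre    /lincoln
    mount -t lustre mgs-washington@o2ib:/lustre /washington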
Re: [Lustre-discuss] status of lustre 2.0 on 2.6.18-194.17.1.0.1.el5 kernels
What does that imply for sites migrating from 1.8 to 2.1? Presumably some sites will have both 1.8 and 2.1 filesystems; will those sites need to run 2.0 on the clients to mount both FS versions concurrently?

--
Mike Shuey

On Tue, Jan 11, 2011 at 3:07 PM, Andreas Dilger <adil...@whamcloud.com> wrote:

While 2.0 was submitted to quite heavy testing at Oracle before its release, it has not been widely deployed for production at this point. All of the development and maintenance effort has gone into the next release (2.1), which is not released yet. I think that 2.1 will represent a much more sustainable target for production usage, when it is released. Until that happens, I would only recommend 2.0 for evaluation usage - and I would especially recommend that sites new to Lustre stay on the tried-and-true 1.8 code base.

Cheers, Andreas

On 2011-01-11, at 12:56, Samuel Aparicio <sapari...@bccrc.ca> wrote:

Thanks for this note. Is Lustre 2.0 regarded as stable for production?

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
675 West 10th, Vancouver V5Z 1L3, Canada.
office: +1 604 675 8200  lab website: http://molonc.bccrc.ca

On Jan 7, 2011, at 5:11 PM, Colin Faber wrote:

Hi,

I've built several against 2.6.18-194.17.1.el5 kernels without problem, so I would think you can probably get away with the 0.1 variant as well.

-cf

On 01/07/2011 06:05 PM, Samuel Aparicio wrote:

Is it known whether Lustre 2.0 GA will run with 2.6.18-194.17.1.0.1.el5 kernels? The test matrix has only the 164 kernel as the latest tested.