Re: [Lustre-discuss] Lustre over two TCP interfaces

2013-06-26 Thread Michael Shuey
That will probably be slow - the machine you use to proxy the IPVS address
would be a bottleneck.  Out of curiosity, what problem are you trying to
solve here?  Do you anticipate whole-subnet outages to be an issue (and if
so, why)?

--
Mike Shuey


On Wed, Jun 26, 2013 at 4:53 AM, Alfonso Pardo alfonso.pardo@ciemat.es wrote:

 oooh!


 Thanks for your reply! Maybe another way is a floating IP between the two
 interfaces with IPVS (corosync).

 -Original Message- From: Brian O'Connor
 Sent: Wednesday, June 26, 2013 10:15 AM
 To: Alfonso Pardo
 Cc: 'Michael Shuey'; 'WC-Discuss'; lustre-discuss@lists.lustre.org
 Subject: Re: [Lustre-discuss] Lustre over two TCP interfaces





 On 06/26/2013 04:16 PM, Alfonso Pardo wrote:

 But what if I configure the OST with the first interface of the OSS
 (bond0) as its primary NID, and the second interface of the OSS as its
 failover NID? If the bond0 network goes down, the client would then try to
 connect to the failover, i.e. the second interface of the same OSS.
 Is that possible?



 I stand to be corrected, but no, I don't think so. As I understand it
 the failover code looks for a different server instance, rather than a
 different nid.

 See

 http://lists.opensfs.org/pipermail/lustre-devel-opensfs.org/2012-August/28.html


  *From:* Brian O'Connor bri...@sgi.com
 *Sent:* Wednesday, June 26, 2013 1:09 AM
 *To:* 'Alfonso Pardo' alfonso.pardo@ciemat.es; 'Michael Shuey' sh...@purdue.edu
 *Cc:* 'WC-Discuss' WC-Discuss.Migration@intel.com; lustre-discuss@lists.lustre.org
 *Subject:* RE: [Lustre-discuss] Lustre over two TCP interfaces
 Unless something has changed in the new versions of lustre, I don't
 think lustre can do failover between nids on the same machine.

 It can choose the available nid at mount time, but if an active nid goes
 away after you are mounted then the client chooses the failover nid, and
 this must be on a different server.
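
 For illustration only (a minimal sketch with hypothetical hostnames, not the
 poster's setup): failover NIDs are expressed at mount time as alternative
 servers for the same target, e.g.

   mount -t lustre mds1@tcp0:mds2@tcp0:/fsname /mnt/fsname

 where mds1 and mds2 are two different machines able to serve the same
 MGS/MDT - which is why two NICs on a single OSS don't fit that model.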

 Check the archives for more discussion on this topic :)



 -Original Message-
 *From: *Alfonso Pardo [alfonso.pardo@ciemat.es]
 *Sent: *Tuesday, June 25, 2013 07:23 AM Central Standard Time
 *To: *Michael Shuey
 *Cc: *WC-Discuss; lustre-discuss@lists.lustre.org
 *Subject: *Re: [Lustre-discuss] Lustre over two TCP interfaces

 Thanks, Michael.
 This is my second step: I will change the LNET configuration to “options lnet
 networks=tcp0(bond0,bond1)” because my machines have 4 NICs. I have a bond0
 and a bond1, both with LACP. I need the clients to communicate with the
 servers over two networks, for network HA.
 If the bond0 network is down, the clients can still reach the OSS over the
 second network, bond1.
 If I instead change the modprobe options to “options lnet
 networks=tcp0(bond0),tcp1(bond1)”, how do the clients mount the filesystem so
 that they can reach the OSS over both networks?
 *From:* Michael Shuey sh...@purdue.edu
 *Sent:* Tuesday, June 25, 2013 2:14 PM
 *To:* Alfonso Pardo alfonso.pardo@ciemat.es
 *Cc:* lustre-discuss@lists.lustre.org; WC-Discuss WC-Discuss.Migration@intel.com
 *Subject:* Re: [Lustre-discuss] Lustre over two TCP interfaces
 Different interfaces need to be declared with different LNET networks -
 something like networks=tcp0(eth0),tcp1(eth1).  Of course, that
 assumes your clients are configured to use a mix of tcp0 and tcp1 for
 connections (with each client only using one of the two).  This is
 really only useful in corner cases, when you're doing something strange;
 if eth0 and eth1 are in the same subnet (as in your example), this is
 almost certainly not productive.
 A better bet might be to use a single LNET, and bond the two interfaces
 together - either as an active/passive pair, or active/active (e.g.,
 LACP).  Then you'd declare networks=tcp0(bond0), give the bond a single
 IP address, and client traffic would be split across the two members in
 the bond more like you probably expect (given the limits of the bond
 protocol you're using).
 --
 Mike Shuey


 On Tue, Jun 25, 2013 at 8:06 AM, Alfonso Pardo alfonso.pardo@ciemat.es wrote:

 Hello friends,
 I need my OSS to communicate over two Ethernet TCP interfaces: eth0 and
 eth1.
 I have configured this feature in my modprobe.d with:
 “options lnet networks=tcp0(eth0,eth1)”
 And I can see two interfaces with:
 lctl --net tcp interface_list
 sa-d4-01.ceta-ciemat.es: (192.168.11.15/255.255.255.0) npeer 0 nroute 2

Re: [Lustre-discuss] Lustre over two TCP interfaces

2013-06-25 Thread Michael Shuey
Different interfaces need to be declared with different LNET networks -
something like networks=tcp0(eth0),tcp1(eth1).  Of course, that assumes
your clients are configured to use a mix of tcp0 and tcp1 for connections
(with each client only using one of the two).  This is really only useful
in corner cases, when you're doing something strange; if eth0 and eth1 are
in the same subnet (as in your example), this is almost certainly not
productive.

A better bet might be to use a single LNET, and bond the two interfaces
together - either as an active/passive pair, or active/active (e.g., LACP).
 Then you'd declare networks=tcp0(bond0), give the bond a single IP
address, and client traffic would be split across the two members in the
bond more like you probably expect (given the limits of the bond protocol
you're using).
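
As a minimal sketch (hostnames and the filesystem name are invented for
illustration, not taken from your setup), the single-LNET-over-a-bond layout
would look roughly like:

  # /etc/modprobe.d/lustre.conf on the servers and on the clients
  options lnet networks=tcp0(bond0)

  # client mount, addressing the MGS through the bond's single IP
  mount -t lustre MGS-IP@tcp0:/fsname /mnt/fsname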

--
Mike Shuey


On Tue, Jun 25, 2013 at 8:06 AM, Alfonso Pardo alfonso.pardo@ciemat.es wrote:

   Hello friends,

 I need my OSS to communicate over two Ethernet TCP interfaces: eth0 and eth1.

 I have configured this feature in my modprobe.d with:

 “options lnet networks=tcp0(eth0,eth1)”

 And I can see two interfaces with:

 lctl --net tcp interface_list
 sa-d4-01.ceta-ciemat.es: (192.168.11.15/255.255.255.0) npeer 0 nroute 2
 sa-d4-01.ceta-ciemat.es: (192.168.11.35/255.255.255.0) npeer 0 nroute 0

 But the clients can only communicate with the first interface:

 lctl ping 192.168.11.15
 12345-0@lo
 12345-192.168.11.15@tcp
 lctl ping 192.168.11.35
 failed to ping 192.168.11.35@tcp: Input/output error


 Any suggestions on how to “enable” the second interface?


 Thanks in advance,

 *Alfonso Pardo Diaz *
 *System Administrator / Researcher *
 *c/ Sola nº 1; 10200 TRUJILLO, SPAIN *
 *Tel: +34 927 65 93 17 Fax: +34 927 32 32 37 *

 http://www.ceta-ciemat.es/


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss


___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] [HPDD-discuss] Disk array setup opinions

2013-03-09 Thread Michael Shuey
If you can, I'd advocate the route you suggest - multiple RAID groups,
each group maps to a unique LUN, and each LUN is an OST.  Note that
you'll likely want the number of data disks in each RAID to be a power
of 2 (e.g., 6- or 10-disk raid6, 5- or 9-disk raid5).  Obviously,
you'll be wasting more spindles on overhead (RAID parity), but
performance is more predictable.

The other way (single RAID, multiple LUNs, LUN == OST) means the
performance of the OSTs isn't independent - they all bottleneck on the
same RAID array.  If you have enough RAID controller bandwidth, this
can (theoretically) work, but makes hunting/fixing performance
problems more complex.  In Lustre, if it's not writing fast enough,
you can just stripe over more OSTs.  However, if your OSTs aren't
really independent, that may or may not help - you'll get different
bandwidth depending on how many OSTs are sharing the same pool of
physical disks.  I'd expect two OSTs that don't share drives to write
faster than two that do, and so on.

BTW, if you have multiple controllers, and the LUN platform has a
sense of controller affinity (i.e., a LUN uses one controller as
primary and another as secondary or backup), try and balance
your RAIDs across the two controllers in your array.  For instance,
stick even-numbered LUNs on one, odd-numbered LUNs on another.  Also,
if you're doing multi-pathing into your OSSes, make sure your
multipath drivers are aware of this arrangement, and respect it.

Most midrange disk trays will do multipath, and cache mirroring
between controllers - but if you read the fine print, you often find
that access through the secondary controller is MUCH slower.  It's
usually implemented as a write-through to the primary, or has its
cache disabled while the primary is active, etc.  Cache mirroring at
high speed is hard, complicated, and expensive, so vendors often only
implement what's minimally necessary to do failover - even if it means
the secondary controller doesn't cache a LUN unless the primary dies.
If you have one of these (and I've no idea if Dell's 3200 does this,
but this behavior is common enough I'd think about it), you'll want to
split LUNs evenly between controllers to maximize the cache use.
You'll also want to make sure the OSS knows which path is primary for
which LUN, so it doesn't send traffic down the wrong path (or worse,
down both - round-robin balancing is a bad idea when the paths are
asymmetric) unless there's been a hardware failure.
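
As a rough sketch only (the vendor/product strings and values here are
assumptions - I haven't verified them against the MD3200, so check the
array vendor's recommended stanza), a device-mapper-multipath config that
groups paths by controller priority looks something like:

  # /etc/multipath.conf fragment (illustrative values)
  devices {
    device {
      vendor               "DELL"
      product              "MD32xx"
      path_grouping_policy group_by_prio
      prio                 rdac
      hardware_handler     "1 rdac"
      failback             immediate
      no_path_retry        30
    }
  }

With group_by_prio, I/O stays on the primary controller's paths and only
fails over to the secondary group when the primary paths are gone.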

BTW, if you implemented a single RAID group and exported multiple
LUNs, any multi-controller effects can get way more complicated - and
are highly implementation-dependent.

TL;DR - Multi-raid, RAID group == LUN == OST.  Keep OSTs as
independent as you can, and watch your controller and OSS multipath
settings (if used).
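
To make that concrete, a hedged sketch (device paths, fsname, and the MGS
NID are invented for illustration) of formatting one OST per RAID-group LUN:

  # one OST per LUN, one LUN per RAID group
  mkfs.lustre --ost --fsname=testfs --mgsnode=mgs@tcp0 --index=0 /dev/mapper/raid6_lun0
  mkfs.lustre --ost --fsname=testfs --mgsnode=mgs@tcp0 --index=1 /dev/mapper/raid6_lun1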

--
Mike Shuey


On Sat, Mar 9, 2013 at 10:19 AM, Jerome, Ron ron.jer...@ssc-spc.gc.ca wrote:
 I am currently having a debate about the best way to carve up Dell MD3200s
 to be used as OSTs in a Lustre file system, and I invite this community to
 weigh in...

 I am of the opinion that it should be setup as multiple raid groups each 
 having a single LUN, with each raid group representing an OST, while my 
 colleague feels that it should be setup as a single raid group across the 
 whole array with multiple LUNS, with each LUN representing an OST.

 Does anyone in this group have an opinion (one way or another)?

 Regards,

 Ron Jerome
 ___
 HPDD-discuss mailing list
 hpdd-disc...@lists.01.org
 https://lists.01.org/mailman/listinfo/hpdd-discuss
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] problem with installing lustre and ofed

2012-12-31 Thread Michael Shuey
RedHat's OFED tends to lag Mellanox's.  They're pretty current on
bugfixes, but support for the latest hardware is usually 3-6 months
behind - it took about 4 months to bring in drivers for our most
recent FDR system.  Also, support for Mellanox's advanced features
(e.g., MXM, FCA) is often missing.

--
Mike Shuey


On Mon, Dec 31, 2012 at 11:32 AM, Brian J. Murrell
brian.murr...@linux.intel.com wrote:
 On Fri, 2012-12-28 at 15:54 -0800, Jason Brooks wrote:
 Hello,

 Hi,

 I am having trouble installing the server modules for Lustre 2.1.4
 and using Mellanox's OFED distribution.

 Is there a particular need for the Mellanox OFED distribution?  The
 Red Hat EL 6 kernel comes stock with the InfiniBand drivers and stack
 already baked in, and we leverage that and build our Lustre modules RPMs
 against it.

 So unless there is something particular that you need that is only in
 the Mellanox OFED distribution and is not already in EL6's kernels, you
 should be able to just use the binary kernel and lustre-modules RPMs
 that we supply and have working InfiniBand support.

 Cheers,
 b.



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] ofed with FDR14 support Lustre

2012-05-30 Thread Michael Shuey
We're using 1.8.7.80-wc1 here.  It's basically 1.8.7-wc1, but with a
few fixes pulled in from git a few months back to build on rhel6.2.
It's built on top of Mellanox's OFED 1.5.3-3.0.0, and is working just
fine on our FDR14 cluster.

--
Mike Shuey


On Wed, May 30, 2012 at 3:41 PM, John White jwh...@lbl.gov wrote:
 Does anyone know of a Lustre version that can build against an OFED that
 supports FDR14 (1.5.4+, by my understanding)?  Or is this still in the pipeline?

 The compat matrix on the Whamcloud site only talks of support up to 1.5.3.1 
 (confirmed to build but doesn't support FDR14).
 
 John White
 HPC Systems Engineer
 (510) 486-7307
 One Cyclotron Rd, MS: 50C-3209C
 Lawrence Berkeley National Lab
 Berkeley, CA 94720

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Could Client & MGS or Client & OSS running at one node together?

2012-03-07 Thread Michael Shuey
I've had success with LACP bonding, with an LACP-aware switch.  You
may also want to check your xmit_hash_policy (it's a kernel option to
the linux bonding driver).  I've had the best luck with layer3+4
bonding, using several OSSes with sequential IPs, and striping files
at a multiple of the number of links in the bond.
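
For example (the path and stripe count are invented for illustration), with
two links in the client's bond you might stripe files in a directory across
a multiple of two OSTs:

  # new files under this directory get 4 stripes (2 links x 2)
  lfs setstripe -c 4 /mnt/lustre/bonded_io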

With most bonding modes, packets can get sent across different links
in the bond.  This results in out-of-order packets, and can slow down
a TCP stream (like your Lustre connection) unnecessarily.  LACP will
route packets to a given destination across exactly one link in the
bond - but that will limit each TCP stream to the link speed of a
single bond member.

You can improve upon single link speeds with Lustre, because Lustre
will let you stripe large files across multiple OSSes.  A client will
build a separate TCP connection to each OSS, so as long as traffic
passes over different links you can use all the available bandwidth.
The way traffic is scattered across the bond members is controlled by
the xmit_hash_policy option; layer3+4 uses an XOR of the source and
destination addresses, combined with the source and destination TCP
ports (modulo number of links in the bond) to pick the specific link
for that stream.  If you're using sequential IPs for your OSSes, you
should be able to get a good scattering effect (since your source
address and port won't change, but your destination address will vary
across the OSSes).
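
As a concrete sketch (the file location and values are assumptions; adapt
them to your distribution), the relevant bonding driver settings would be
along the lines of:

  # e.g. /etc/modprobe.d/bonding.conf
  options bonding mode=802.3ad miimon=100 xmit_hash_policy=layer3+4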

A few years ago, I was using Lustre 1.4 and 1.6 and saw 200+ MB/sec
across two gigE links bonded together on the client (using striped
files).  Your mileage may vary, of course.  Caveats include:

Small files may have poorer performance than usual, due to transaction
overhead to multiple OSSes (if your bond is on the client).
Similarly, non-striped files will only see the speed of a single link.

If your OSTs start to fill, Lustre's load balancing may not give you
an ideal distribution of stripes across OSSes - causing multiple TCP
streams to land on the same bond member on the client.  Unfortunately,
this will present as slowdowns for certain files on certain clients
(because the number of bonds that can be used is a function of both
which OSSes are used in the file and the client's IP in the hash
policy).

All metadata accesses are limited to the speed of a single bond member
on the client.

If your bonds are on the server, then (as long as you have a number of
clients) you should see a nice increase in overall IO throughput.  It
won't be as marked a boost as 10gigE or Infiniband, but bonds are
inexpensive and generally better than a single link (to multiple
clients).

Hope this helps - good luck!

--
Mike Shuey



On Wed, Mar 7, 2012 at 9:27 PM, zhengfeng zf5984...@gmail.com wrote:
 Dear all, thanks a lot for your answers ;)

 Now I have another problem, about the network between nodes.
 Since there is no InfiniBand or 10G NIC, but I still want to
 increase the bandwidth by adding more 1G NICs, I plan to use Linux bonding.

 However, bonding 4 NICs together at one node gives NO performance
 improvement, no matter which bonding mode (as described in the kernel docs) is used.
 Instead, the performance of the 4-NIC bond is lower than a single NIC's.
 With a 2-NIC bond, the performance is better than a single NIC's.
 The result is:
 bonding 2 NICs: 1 + 1 > 1
 bonding 4 NICs: 1 + 1 + 1 + 1 < 1

 So confusing...
 The benchmark we used is netperf.
 I also used tcpdump to capture the packets, and found a great number
 of TCP segments arriving out of order.

 My questions are:
 a) The TCP segments are out of order, which degrades the 4-NIC bond's
  performance; is this the root cause?
 b) We doubt the feasibility of this method (using a 4-NIC bond
 to increase bandwidth). Any proposals?  If so, maybe I should
 use some other method instead.

 Thanks again, all

 
 Best Regards
 Zheng

 From: Peter Grandi
 Date: 2012-03-07 20:54
 To: Lustre discussion
 Subject: Re: [Lustre-discuss] Could Client & MGS or Client & OSS running
 at one node together?
 Since there is no more node in our project when using Lustre,
 I want to confirm that:

 1) Could the Client and MGS run at one node together? or
 could Client and OSS run at one node together? 2) Suppose
 I had deployed them at one node, what potential shortcomings
 or harm are there?

 Running MGS and MDS on the same nodes is customary, see:
    http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_24122

 Running the MGS, MDS and OSS service on the same node is
 possible and fairly common in very small setups, usually those
 in which there are only 1-2 nodes.

 It is possible to use the client code on all types of Lustre
 servers, but at least in the case of using the client code on an
 OSS there is the non-negligible possibility of a resource
 deadlock, if the client uses the OSS on the same node, as the
 client and OSS codes compete for memory, so in the past this has
 been 

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Michael Shuey
Is your ethernet FS in tcp1, or tcp0?  Your config bits indicate the
client is in tcp1 - do the servers agree?

--
Mike Shuey



On Tue, Jun 14, 2011 at 12:23 PM, Thomas Roth t.r...@gsi.de wrote:
 Hi all,

 I'd like to mount two Lustre filesystems on one client. Issues with more than 
 one MGS set aside,
 the point here is that one of them is an Infiniband-cluster, the other is 
 ethernet-based.
 And my client is on the ethernet.
 I have managed to mount the o2ib-fs by setting up an LNET router, but now 
 this client's LNET doesn't
 know how to reach the ethernet-fs.

 So the basic modprobe.conf reads
   options lnet networks=tcp1(eth0) routes=o2ib LNET-Router-IP@tcp1
 This mounts the MGS on the o2ib network.

 What do I have to add to get to the MGS on the tcp network?

 Meanwhile I have studied more posts here and came up with
   options lnet networks=tcp1(eth0),tcp2(eth0:0) routes=o2ib 
 LNET-Router-IP@tcp1; tcp
 Default-Gateway-IP@tcp2

 Doesn't work either, but I see in the log of the (tcp-)MGS:
   LustreError: 120-3: Refusing connection from Client-IP for MGS-IP@tcp2: No 
 matching NI

 Something's getting through ...

 Any ideas?

 Regards,
 Thomas
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Michael Shuey
That may be because your gateway doesn't have an interface on tcp (aka
tcp0).  I suspect you want to keep your ethernet clients in tcp0, your
IB clients in o2ib0, and your router in both.  Personally, I find it
easiest to just give different module options on each system (rather
than try ip2nets stuff).

On the ether clients, I'd try:

options lnet networks=tcp0(eth0) routes=o2ib0
LNET-router-eth_IP@tcp0 dead_router_check_interval=300

On IB clients:

options lnet networks=o2ib0(ib0) routes=tcp0 LNET-router-IB_IP@o2ib0
dead_router_check_interval=300

then on the router:

options lnet networks=tcp0(eth0),o2ib0(ib0) forwarding=enabled accept_timeout=15

Obviously, your file servers will need to have lnet options similar to
the clients:

options lnet networks=tcp0(eth0) routes=o2ib0
LNET-router-eth_IP@tcp0 dead_router_check_interval=300
options lnet networks=o2ib0(ib0) routes=tcp0 LNET-router-IB_IP@o2ib0
dead_router_check_interval=300

That's just a guess, your mileage may vary, etc., but I think it's
close to what you want.  Note that you really want the
dead_router_check_interval if you're using lnet routers.  Without that
parameter, the lustre client will automatically mark a router as
failed when it's unavailable but will not check to see if it ever
comes back.  With this param, it checks every 300 seconds (and
re-enables it if found).
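
Once the modules are loaded, a quick sanity check (a sketch; the exact
output varies by Lustre version) would be something like:

  lctl list_nids                # confirm which LNET networks this node is on
  lctl show_route               # verify the configured routes on clients/servers
  lctl ping Router-IB_IP@o2ib0  # from an ethernet client, ping across the router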

Hope this helps.

--
Mike Shuey



On Tue, Jun 14, 2011 at 1:26 PM, Thomas Roth t.r...@gsi.de wrote:
 Hm, the ethernet FS is in tcp0 - MGS says its nids are MGS-IP@tcp.
 So not surprising it refuses that connection.
 On the other hand,
 options lnet networks=tcp1(eth0),tcp(eth0:0) routes=o2ib
 LNET-Router-IP@tcp1; tcp Default-Gateway-IP@tcp

 results in
 Can't create route to tcp via Gateway-IP@tcp

 Cheers,
 Thomas


 On 06/14/2011 07:00 PM, Michael Shuey wrote:

 Is your ethernet FS in tcp1, or tcp0? Your config bits indicate the
 client is in tcp1 - do the servers agree?

 --
 Mike Shuey



 On Tue, Jun 14, 2011 at 12:23 PM, Thomas Roth t.r...@gsi.de wrote:
   Hi all,
  
   I'd like to mount two Lustre filesystems on one client. Issues with
 more than one MGS set aside,
   the point here is that one of them is an Infiniband-cluster, the other
 is ethernet-based.
   And my client is on the ethernet.
   I have managed to mount the o2ib-fs by setting up an LNET router, but
 now this client's LNET doesn;t
   known how to reach the ethernet-fs.
  
   So the basic modprobe.conf reads
    options lnet networks=tcp1(eth0) routes=o2ib LNET-Router-IP@tcp1
   This mounts the MGS on the o2ib network.
  
   What do I have to add to get to the MGS on the tpc network?
  
   Meanwhile I have studied more posts here and came up with
    options lnet networks=tcp1(eth0),tcp2(eth0:0) routes=o2ib
 LNET-Router-IP@tcp1; tcp
   Default-Gateway-IP@tcp2
  
   Doesn't work either, but I see in the log of the (tcp-)MGS:
    LustreError: 120-3: Refusing connection from Client-IP for
 MGS-IP@tcp2: No matching NI
  
   Somethings getting through ...
  
   Any ideas?
  
   Regards,
   Thomas
   ___
   Lustre-discuss mailing list
   Lustre-discuss@lists.lustre.org
   http://lists.lustre.org/mailman/listinfo/lustre-discuss
  




___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Can one node mount more than one lustre cluster?

2011-03-15 Thread Michael Shuey
FYI, I'm using Lustre 1.8.5 to mount two filesystems from separate
domains (one in Lafayette, IN, and one in Bloomington, IN, run by two
different institutions) on 900+ nodes, and things just work.

--
Mike Shuey



On Tue, Mar 15, 2011 at 6:24 PM, Andreas Dilger adil...@whamcloud.com wrote:
 On 2011-03-15, at 4:19 PM, Malcolm Cowe wrote:
 On 15/03/2011 17:15, Brian O'Connor wrote:
 What are the constraints on a client mounting more than one lustre
 file system?

 I realise that a lustre cluster can have more than one file system
 configured, but can a client mount different file systems from different
 lustre clusters on the same network? I.e.:

 Assume a Single IB fabric and two Lustre clusters with separate
 MGS/MDS/OSS. One lustre is Lincoln and the other is Washington

 list_nids
 192.168.1.10@o2ib

 mount -t lustre lincoln:/lustre  /lincoln
 mount -t lustre washington:/lustre /washington

 Is this doable, or should they be on separate IB fabrics?

 From my understanding, there should only be one MGS for the entire
 environment (although as many MDTs and OSTs as are required). This is
 because clients will only communicate with exactly one MGS (and will
 communicate with the MGS of the last FS mounted) and will only receive
 updates from the MGS with which it is registered. So, in the above
 example if there is a change to the lincoln file system (e.g. a
 failover event, some configuration changes), clients will not receive
 notification.

 There's not an issue with having multiple MGS's on a site, only with
 mounting lustre file systems from multiple domains on the same client,
 IIRC.

 That discussion was had on the list a few months ago, and the correct answer 
 is that it should just work.  The only "use last mounted MGS" problem was 
 fixed at some point, though I don't have the exact version handy.


 Cheers, Andreas
 --
 Andreas Dilger
 Principal Engineer
 Whamcloud, Inc.



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] status of lustre 2.0 on 2.6.18-194.17.1.0.1.el5 kernels

2011-01-11 Thread Michael Shuey
What does that imply for sites migrating from 1.8 to 2.1?  Presumably
some sites will have both 1.8 and 2.1 filesystems; will those sites
need to run 2.0 on the clients to mount both FS versions concurrently?

--
Mike Shuey



On Tue, Jan 11, 2011 at 3:07 PM, Andreas Dilger adil...@whamcloud.com wrote:
 While 2.0 was subjected to quite heavy testing at Oracle before its
 release, it has not been widely deployed for production at this point. All
 of the development and maintenance effort has gone into the next release
 (2.1) which is not released yet. I think that 2.1 will represent a much more
 sustainable target for production usage, when it is released. Until that
 happens, I would only recommend 2.0 for evaluation usage, and especially for
 sites new to Lustre that they stay on the tried-and-true 1.8 code base.

 Cheers, Andreas
 On 2011-01-11, at 12:56, Samuel Aparicio sapari...@bccrc.ca wrote:

 Thanks for this note.
 Is Lustre 2.0 regarded as stable for production?
 Professor Samuel Aparicio BM BCh PhD FRCPath
 Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
 675 West 10th, Vancouver V5Z 1L3, Canada.
 office: +1 604 675 8200 cellphone: +1 604 762 5178: lab
 website http://molonc.bccrc.ca




 On Jan 7, 2011, at 5:11 PM, Colin Faber wrote:

 Hi,

 I've built several against 2.6.18-194.17.1.el5 kernels without problem,
 so I would think you can probably get away with the .0.1 kernel as well.

 -cf


 On 01/07/2011 06:05 PM, Samuel Aparicio wrote:

 Is it known if Lustre 2.0 GA will run with 2.6.18-194.17.1.0.1.el5

 kernels? The test matrix has only the 164 kernel as the latest tested.

 Professor Samuel Aparicio BM BCh PhD FRCPath

 Nan and Lorraine Robertson Chair UBC/BC Cancer Agency

 675 West 10th, Vancouver V5Z 1L3, Canada.

 office: +1 604 675 8200 cellphone: +1 604 762 5178: lab website

 http://molonc.bccrc.ca/






 ___

 Lustre-discuss mailing list

 Lustre-discuss@lists.lustre.org

 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss


___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss