Re: [Lustre-discuss] Recommended failover software for Lustre

2012-07-16 Thread Cliff White
Thanks, we've created http://jira.whamcloud.com/browse/LUDOC-69 to track
the fixes to the manual.
cliffw


On Mon, Jul 16, 2012 at 4:23 AM, Christopher J.Walker c.j.wal...@qmul.ac.uk
 wrote:

 The configuring failover section in the Whamcloud release of the
 Lustre manual seems rather out of date:


 http://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.html#configuringfailover

 The Oracle release says much the same thing:

 http://wiki.lustre.org/manual/LustreManual20_HTML/ConfiguringFailover.html#50540588_50628

 In section 11.1.1 Power management software, it says:

 "For more information about PowerMan, go to:
 https://computing.llnl.gov/linux/powerman.html"

 Which no longer exists. It should probably point at
 http://code.google.com/p/powerman/


 Then in section 11.2, "Setting up High-Availability (HA) Software with
 Lustre", it mentions Red Hat Cluster Manager and Pacemaker.

 Red Hat Cluster Manager points to
 http://wiki.lustre.org/index.php/Using_Red_Hat_Cluster_Manager_with_Lustre

 which says "In comparison with other HA solutions, RedHat Cluster as in
 RHEL 5.5 is an old HA solution. We recommend using other HA solutions
 like Pacemaker, if possible."

 The pacemaker link:
 http://wiki.lustre.org/index.php/Using_Pacemaker_with_Lustre

 Although the title of this is "Using Pacemaker with Lustre", it starts
 off by saying "In modern clusters, OpenAIS, or more specifically, its
 communication stack corosync, is used for this task."


 In summary:

 1) The manual could do with some updating here.

 2) I suspect I should be using corosync.

 Chris



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Seeking LNET router recommendations

2012-03-30 Thread Cliff White
I don't think so, because any time two NIDs map to the same node, LNET will
pick a 'best' interface (based on hop count and other things) and always
use that interface.  In the normal case, you would subnet to separate the
interfaces and use both, but you can't in this case as ib0 is the same IP
for both.  It really would be best to have a second IP for TCP.
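For reference, the usual working layout keeps each LND on its own interface
and subnet; a minimal sketch (interface names here are assumptions, not taken
from your setup) would be:

options lnet networks=o2ib0(ib0),tcp0(eth0)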
cliffw


On Wed, Mar 28, 2012 at 9:01 AM, Hayes, Bob bob.ha...@intel.com wrote:

  When we added ‘options lnet networks=o2ib(ib0),tcp1(ib0)’ to the MDS and
 the OSS’s, communications from the nodes using tcp1 would not be returned
 by the OSS’s. We need to use two protocols over the same interface. Is this
 possible?

 Bob Hayes
 HPC Sys. Admin.
 Intel Corp  Software & Services Group/DRD/CRT-DC
 DP3-307-H7        Tel: (253) 371-3040
 2800 N Center Dr  Fax: (253) 371-4647
 DuPont WA 98327   bob.ha...@intel.com

 From: Cliff White [mailto:cli...@whamcloud.com]
 Sent: Wednesday, March 21, 2012 9:28 AM
 To: Hayes, Bob
 Cc: lustre-discuss@lists.lustre.org
 Subject: Re: [Lustre-discuss] Seeking LNET router recommendations

 Or to put it another way, if your OSS systems can already 'see' both IB
 and IPoIB networks, the most cost-effective, high performance solution
 would be to add the necessary interface and put your MDS/MGS on both
 networks also.

 No need for routers, no performance impact.

 cliffw

 On Tue, Mar 13, 2012 at 11:25 AM, Hayes, Bob bob.ha...@intel.com wrote:

 Are there any recommendations or guidelines for sizing a LNET routing
 facility.

 ~400 nodes, 8 OSS (dual socket E5 w/48GB RAM), 24 OST (10spindle RAID6
 over SRP), 1 MGS/MDT

 How much load does LNET routing put on a system?

 If I make the 8 OSS systems do double duty as IB to IPoIB routers, will it
 have much impact on performance?

  

 Bob Hayes
 HPC Sys. Admin.
 Intel Corp  Software & Services Group/DRD/CRT-DC
 DP3-307-H7        Tel: (253) 371-3040
 2800 N Center Dr  Fax: (253) 371-4647
 DuPont WA 98327   bob.ha...@intel.com


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss



 

 --
 cliffw
 Support Guy
 WhamCloud, Inc.
 www.whamcloud.com




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Seeking LNET router recommendations

2012-03-21 Thread Cliff White
- An OSS really can't be a router, an OSS is an endpoint.  Topologically,
it shouldn't work, you should re-think network layout.
- routing does place a load on the system, nodes doing routing should be
dedicated to routing.
- Load depends on traffic, basically you would have two hardware network
interfaces, and ideally would
be sending max traffic through both. Impact would depend on hardware types,
etc.

With ~400 nodes, you would want a 'pool' of routers; the size of the pool would
depend on your usage.
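As a rough sketch of what the client side of a router pool looks like (NIDs
and interface names below are invented, not from your site):

options lnet networks=tcp0(eth0) routes="o2ib0 192.168.1.10@tcp0; o2ib0 192.168.1.11@tcp0"

Each additional entry in 'routes' adds another router for LNET to spread
traffic across.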
cliffw


On Tue, Mar 13, 2012 at 11:25 AM, Hayes, Bob bob.ha...@intel.com wrote:

  Are there any recommendations or guidelines for sizing a LNET routing
 facility.

 ~400 nodes, 8 OSS (dual socket E5 w/48GB RAM), 24 OST (10spindle RAID6
 over SRP), 1 MGS/MDT

 How much load does LNET routing put on a system?

 If I make the 8 OSS systems do double duty as IB to IPoIB routers, will it
 have much impact on performance?

 Bob Hayes
 HPC Sys. Admin.
 Intel Corp  Software & Services Group/DRD/CRT-DC
 DP3-307-H7        Tel: (253) 371-3040
 2800 N Center Dr  Fax: (253) 371-4647
 DuPont WA 98327   bob.ha...@intel.com

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Seeking LNET router recommendations

2012-03-21 Thread Cliff White
Or to put it another way, if your OSS systems can already 'see' both IB and
IPoIB networks, the most cost-effective, high performance solution would be
to add the necessary interface and put your MDS/MGS on both networks also.
No need for routers, no performance impact.
cliffw


On Tue, Mar 13, 2012 at 11:25 AM, Hayes, Bob bob.ha...@intel.com wrote:

  Are there any recommendations or guidelines for sizing a LNET routing
 facility.

 ~400 nodes, 8 OSS (dual socket E5 w/48GB RAM), 24 OST (10spindle RAID6
 over SRP), 1 MGS/MDT

 How much load does LNET routing put on a system?

 If I make the 8 OSS systems do double duty as IB to IPoIB routers, will it
 have much impact on performance?

 Bob Hayes
 HPC Sys. Admin.
 Intel Corp  Software & Services Group/DRD/CRT-DC
 DP3-307-H7        Tel: (253) 371-3040
 2800 N Center Dr  Fax: (253) 371-4647
 DuPont WA 98327   bob.ha...@intel.com

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre 1.8.7 - Setup prototype in Research field - STUCK !

2012-02-03 Thread Cliff White
You should download from the Whamcloud download site, for a start:
http://downloads.whamcloud.com/public/lustre/
Typically, the Lustre server does nothing but run Lustre. For that reason
there is generally little risk
from using our current version on the server platforms. If your clients
require a particular kernel version, you
would have to build the lustre-client and lustre-client-modules packages only.
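Roughly, a client-only build against your running kernel looks like the
following (the git URL/branch, configure flags and kernel-devel path below are
assumptions drawn from the usual 1.8 procedure; check them against the
walk-thru linked below):

git clone git://git.whamcloud.com/fs/lustre-release.git
cd lustre-release && git checkout b1_8
sh ./autogen.sh
./configure --disable-server --with-linux=/usr/src/kernels/$(uname -r)-$(uname -m)
make rpms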

I would recommend
http://wiki.whamcloud.com/display/PUB/Getting+started+with+Lustre
as a useful resource.
cliffw


On Fri, Feb 3, 2012 at 2:30 PM, Charles Cummings ccummi...@harthosp.org wrote:

  Hello Everyone,

 Being the local crafty busy admin for a neuroscience research branch,
 Lustre seems the only way to go; however, I'm a bit stuck and need some
 thoughtful guidance.

 My goal  is to setup a virtual OS environment which is a replica of our
 Direct attached storage head node running SLES 11.0 x86 64   Kernel:
 2.6.27.19-5 default #1 SMP
 and our (2) Dell blade clusters running CentOS 5.3 x86 64   Kernel:
 2.6.18-128.el5 #1 SMP
 which I now have running as a) SLES 11 same kernel MDS, b) SLES 11 same
 kernel OSS, and c) CentOS 5.3 x86 64 same kernel,
 and then get Lustre running across it.

 The trouble began when I was informed that the Lustre rpm kernel numbers
 MUST match the OS kernel number EXACTLY due to modprobe errors and mount
 errors on the client,
 and some known messages on the servers after the rpm installs.

 My only direct access to Oracle Lustre downloads is through another person
 with an Oracle ID who's not very willing to help - i.e. this route is
 painful

 So to explain why I'm stuck:

 a) access to oracle downloads is not easy
 b) there is so much risk with altering kernels, given all the applications
 and stability of the environment you could literally trash the server and
 spend days recovering - in addition to it being the main storage / resource
 for research
 c) I can't seem to find after looking Lustre RPMs that match my kernel
 environment specifically, i.e. the SLES 11 AND CENTOS 5.3
 d) I've never created rpms to a specific kernel version and that would be
 a deep dive into new territory and frankly another gamble

 What's the least painful and least risky way to get Lustre working in this
 prototype, which will then lend itself to production (equally least painful),
 given these statements - Help!
 Cliff, I could use some details on how specifically Whamcloud can fit this
 scenario - and thanks for all the enlightenment.


 thanks for your help
 Charles




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] two multi-homed cluster

2011-12-16 Thread Cliff White
You can do this, simply define networks for both devices.
Assuming ib0, and eth0, you would have
options lnet networks=tcp0(eth0),o2ib0(ib0)

The IB clients will mount using a @o2ib0 NID, and the ethernet clients will
mount using @tcp0 NIDs. Since you are explicitly specifying the network,
the hop rule doesn't apply.
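For example, the client mount commands would differ only in the NID used
(the server address and fsname below are placeholders):

mount -t lustre 192.168.5.1@o2ib0:/lustre /mnt/lustre    # IB clients
mount -t lustre 10.0.0.1@tcp0:/lustre /mnt/lustre        # ethernet clients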
cliffw


On Fri, Dec 16, 2011 at 9:49 AM, Patrice Hamelin
patrice.hame...@ec.gc.ca wrote:

 Hi,

   I have two Infiniband clusters, each in a separate location, with solid
 ethernet connectivity between them.  Say they are named cluster A
 and cluster B.  All members of each cluster have both IB and eth networks
 available to them, and the IB network is not routed between cluster A and
 B, but ethernet is.  On each cluster, I have 4 OSS's serving FC disks.
 Clients on cluster A mount Lustre disks from their local cluster, and the
 same goes for cluster B, both on Infiniband NIDs.

   What I would like to achieve is for clients from cluster A to mount disks
 from OSS's on cluster B over the ethernet connection.  The same goes for
 clients in cluster B mounting disks from OSS's on cluster A.

   From my reading of the Lustre 1.8.7 manual, I got:

 7.1.1 Modprobe.conf
 Options under modprobe.conf are used to specify the networks available to
 a node.
 You have the choice of two different options – the networks option, which
 explicitly
 lists the networks available and the ip2nets option, which provides a
 list-matching
 lookup. Only one option can be used at any one time. The order of LNET
 lines in
 modprobe.conf is important when configuring multi-homed servers. *If a
 server
 node can be reached using more than one network, the first network
 specified in
 modprobe.conf will be used.*

 Does the last sentence mean that I cannot do that?

 Thanks.

 --
 Patrice Hamelin
 Specialiste sénior en systèmes d'exploitation | Senior OS specialist
 Environnement Canada | Environment Canada
 2121, route Transcanadienne | 2121 Transcanada Highway
 Dorval, QC H9P 1J3
 Gouvernement du Canada | Government of Canada


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Log Files Skipped nn previous similar messages

2011-11-23 Thread Cliff White
It means errors have occurred which are duplicates of the displayed
message; we limit messages in that case to reduce system log traffic. If a
different error occurs, that message will be displayed.
cliffw

On Wed, Nov 23, 2011 at 11:41 AM, Lucia M. Walle lucia.wa...@cornell.edu wrote:

 Hello,
 I'm curious about this message:
 LustreError: Skipped some_number previous similar messages.
 I see multiple instances of this message with various errors.
 Does this mean that no other errors have occurred but this one since the
 previous LustreError?
 Or ?
 Thanks in advance for your help.
 Lucy


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] EXTERNAL: Re: Unable to write to the Lustre File System as any user except root

2011-10-12 Thread Cliff White
If you are not using LDAP, etc, then the user's information must be in the
MDS's password files.
Users must be known to the MDS.
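A quick sanity check on the MDS (the user and group names below are just
examples):

getent passwd someuser
getent group somegroup

If those return nothing on the MDS, that is the problem.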
cliffw


On Wed, Oct 12, 2011 at 6:37 AM, Barberi, Carl E carl.e.barb...@lmco.com wrote:

 Thank you Kilian.  However, I was just able to perform Lustre operations as
 another user.  So now it seems that only one of my users exhibits the
 problem I saw.  I don't have NIS or LDAP configured on this MDS, but I have
 another Lustre FS setup elsewhere that does not exhibit any of the problems
 I've seen.  Any thoughts as to why now only one user cannot access the
 Lustre FS?

 Thanks,
 Carl

 -Original Message-
 From: Kilian CAVALOTTI [mailto:kilian.cavalotti.w...@gmail.com]
 Sent: Wednesday, October 12, 2011 2:24 AM
 To: Barberi, Carl E
 Cc: lustre-discuss@lists.lustre.org
 Subject: EXTERNAL: Re: [Lustre-discuss] Unable to write to the Lustre File
 System as any user except root

 Hi Carl,

 On Tue, Oct 11, 2011 at 9:07 PM, Barberi, Carl E
 carl.e.barb...@lmco.com wrote:
  "LustreError: 11-0: an error occurred while communicating with
  192.168.10.2@o2ib.  The mds_getxattr operation failed with -13."

 You likely miss authentication information on your MDS about the user
 you're trying to write as.
 Just configure NIS, LDAP or whatever you're using on your MDS, and you
 should be good to go.

 Cheers,
 --
 Kilian
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] quilt messages on CentOS 5.7 x86_64

2011-09-28 Thread Cliff White
Sadly, it is not so much unsafe to continue as impossible - the patches that
failed were reverted,
so you don't have a properly patched source.
At this point, you would have to walk through the Lustre patches and fix
each place where there is a FAIL.
The fixes themselves are usually trivial, but the process is a bit of work.
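Roughly, the per-patch workflow looks like this (the file name is taken from
your log; the commands are standard quilt usage):

quilt push -f                          # force-apply, leaving .rej files behind
cat include/linux/backing-dev.h.rej    # inspect the rejected hunk
vi include/linux/backing-dev.h         # apply the change by hand
quilt refresh                          # fold the fix back into the patch
quilt push -av                         # continue with the remaining patches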

Have you tried running our 5.5/5.6 kernel with your 5.7 install? I don't
know what changes there are in 5.7 however.
cliffw



On Wed, Sep 28, 2011 at 9:02 AM, Kristen J. Webb kw...@teradactyl.com wrote:

 Hi All,
 I've moved to CentOS 5.7 (5.6 seems to have disappeared
 from the download wiki).  I'm following along with the
 instructions from:


 http://wiki.whamcloud.com/display/PUB/Walk-thru-+Build+Lustre+1.8+on+CentOS+5.5+or+5.6+from+Whamcloud+git

 NOTE: lots of changes in 5.7 in where things live and what they're called;
 I'll submit them if I can get it working.

 I get to the step for:

 quilt push -av

 And get a few messages that don't look good:

 patching file include/linux/backing-dev.h
 Hunk #1 FAILED at 48.
 Hunk #2 succeeded at 100 with fuzz 2 (offset 5 lines).
 1 out of 2 hunks FAILED -- rejects in file include/linux/backing-dev.h

 and maybe:

 Patch patches/raid5-zerocopy-rhel5.patch does not apply (enforce with -f)

 along with a lot of fuzz.

 Is it safe to continue?  I'm concerned about all of the Restoring messages
 at the end of the output.

 Thanks!
 Kris

 Here is the entire log just in case:

 Applying patch patches/lustre_version.patch
 patching file include/linux/lustre_version.h

 Applying patch patches/vfs_races-2.6-rhel5.patch
 patching file fs/dcache.c
 Hunk #2 succeeded at 1599 (offset 192 lines).
 patching file include/linux/dcache.h
 Hunk #1 succeeded at 174 (offset -3 lines).
 Hunk #2 succeeded at 258 (offset 3 lines).

 Applying patch patches/jbd-jcberr-2.6.18-vanilla.patch
 patching file include/linux/jbd.h
 Hunk #1 succeeded at 359 (offset 3 lines).
 Hunk #3 succeeded at 414 (offset 3 lines).
 Hunk #5 succeeded at 593 (offset 3 lines).
 patching file fs/jbd/checkpoint.c
 Hunk #1 succeeded at 713 (offset 25 lines).
 patching file fs/jbd/commit.c
 Hunk #1 succeeded at 765 (offset 57 lines).
 patching file fs/jbd/journal.c
 patching file fs/jbd/transaction.c
 Hunk #3 succeeded at 1335 (offset 41 lines).

 Applying patch patches/export_symbols-2.6.12.patch
 patching file fs/dcache.c
 Hunk #1 succeeded at 2157 (offset 576 lines).

 Applying patch patches/dev_read_only-2.6.18-vanilla.patch
 patching file block/ll_rw_blk.c
 Hunk #1 succeeded at 3144 (offset 77 lines).
 Hunk #2 succeeded at 3153 with fuzz 2.
 Hunk #3 succeeded at 3833 (offset 60 lines).
 patching file fs/block_dev.c
 Hunk #1 succeeded at 1144 (offset 85 lines).
 patching file include/linux/fs.h
 Hunk #1 succeeded at 1882 (offset 197 lines).

 Applying patch patches/export-2.6.18-vanilla.patch
 patching file fs/jbd/journal.c

 Applying patch patches/sd_iostats-2.6-rhel5.patch
 patching file drivers/scsi/Kconfig
 Hunk #1 succeeded at 84 (offset 6 lines).
 patching file drivers/scsi/scsi_proc.c
 patching file drivers/scsi/sd.c
 Hunk #1 succeeded at 62 (offset -1 lines).
 Hunk #2 succeeded at 185 (offset 1 line).
 Hunk #3 succeeded at 617 (offset 37 lines).
 Hunk #4 succeeded at 1072 (offset -4 lines).
 Hunk #5 succeeded at 1805 (offset 29 lines).
 Hunk #6 succeeded at 1845 (offset -4 lines).
 Hunk #7 succeeded at 2256 (offset 29 lines).
 Hunk #8 succeeded at 2340 (offset 37 lines).
 Hunk #9 succeeded at 2343 (offset 29 lines).
 Hunk #10 succeeded at 2376 (offset 37 lines).

 Applying patch patches/export_symbol_numa-2.6-fc5.patch
 patching file arch/i386/kernel/smpboot.c
 Hunk #1 succeeded at 627 (offset 48 lines).

 Applying patch patches/blkdev_tunables-2.6-rhel5.patch
 patching file include/linux/blkdev.h
 Hunk #1 succeeded at 808 (offset 20 lines).
 patching file include/scsi/scsi_host.h
 patching file drivers/scsi/lpfc/lpfc.h

 Applying patch patches/jbd-stats-2.6-rhel5.patch
 patching file include/linux/jbd.h
 patching file fs/jbd/transaction.c
 patching file fs/jbd/journal.c
 Hunk #2 succeeded at 641 (offset 2 lines).
 Hunk #4 succeeded at 1023 (offset 2 lines).
 Hunk #6 succeeded at 1470 (offset 2 lines).
 Hunk #7 succeeded at 2323 (offset 6 lines).
 Hunk #8 succeeded at 2404 (offset 2 lines).
 Hunk #9 succeeded at 2420 (offset 6 lines).
 patching file fs/jbd/checkpoint.c
 patching file fs/jbd/commit.c
 Hunk #3 succeeded at 303 (offset 13 lines).
 Hunk #5 succeeded at 425 (offset 13 lines).
 Hunk #6 succeeded at 493 (offset -2 lines).
 Hunk #7 succeeded at 662 (offset 13 lines).
 Hunk #8 succeeded at 847 (offset -2 lines).
 Hunk #9 succeeded at 939 (offset 13 lines).

 Applying patch patches/raid5-stats-rhel5.patch
 patching file drivers/md/raid5.c
 Hunk #1 succeeded at 149 (offset 34 lines).
 Hunk #3 succeeded at 348 (offset 34 lines).
 Hunk #5 succeeded at 684 (offset 33 lines).
 Hunk #7 succeeded at 1730 (offset 33 lines).
 Hunk #9 succeeded at 1920 (offset 35 lines).
 Hunk #11 succeeded at 1973 

Re: [Lustre-discuss] Hotspots

2011-09-21 Thread Cliff White
Well, this is why Lustre uses striping - however if your file is very small,
it will be located on
one stripe only, and at that point it's limited by hardware.
In current Lustre (1.8.6-wc1) you can enable caching on the OSS which may
help.
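A couple of things worth trying (parameter names are as in 1.8.x and the paths
are placeholders; verify on your version):

lfs getstripe /lustre/path/to/hotfile      # see which OST the hot file lives on
lfs setstripe -c -1 /lustre/shared_dir     # stripe new files in that directory across all OSTs
lctl set_param obdfilter.*.read_cache_enable=1          # OSS read cache, 1.8.6-wc1 and later
lctl set_param obdfilter.*.writethrough_cache_enable=1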
cliffw


On Wed, Sep 21, 2011 at 1:16 PM, Michael Di Domenico mdidomeni...@gmail.com
 wrote:

 is there an easy way to identify such hotspots?

 On Fri, Sep 16, 2011 at 2:24 AM, Mark Day mark@rsp.com.au wrote:
  Hi all,
 
  Does anyone have tips on dealing with 'hotspots' in a Lustre
   filesystem? We've recently noticed some large loads caused by a large
  number of clients hitting a single 'small' file at the same time.
  Something like a NetApp FlexCache would seem to be a solution for this
  but that's obviously proprietary.
 
  We're currently using 1.8.3
 
  tia, Mark.
  --
  mark day
  ___
  Lustre-discuss mailing list
  Lustre-discuss@lists.lustre.org
  http://lists.lustre.org/mailman/listinfo/lustre-discuss
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Cliff White
--writeconf will erase parameters set via lctl conf_param, and will erase
pool definitions.
It will also allow you to set rather silly parameters that can prevent your
filesystem from starting, such as incorrect server NIDs or incorrect failover
NIDs. For this reason (and from a history of customer support) we caveat its
use in the manual.

The --writeconf option never touches data, only server configs, so it will
not mess up your data.

So, given sensible precautions as mentioned above, it's safe to do.
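The usual sequence is (device paths below are placeholders):

1. Unmount all clients, then the OSTs, then the MDT.
2. Regenerate the configuration logs:
   tunefs.lustre --writeconf /dev/<mdt-device>    # MDT first
   tunefs.lustre --writeconf /dev/<ost-device>    # then each OST
3. Remount the MGS/MDT first, then the OSTs, then the clients, and re-apply
   any lctl conf_param settings and pool definitions afterwards.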
cliffw


On Thu, Jul 14, 2011 at 11:03 AM, Theodore Omtzigt
t...@stillwater-sc.com wrote:

 Andreas:

   Thanks for taking a look at this. Unfortunately, I don't quite
 understand the guidance you present: If you are seeing 'this'
 problem. I haven't seen 'any' problems pertaining to 8tb yet, so I
 cannot place your guidance in the context of the question I posted.

 My question was whether or not I need this parameter on the MDS and if
 so, how to apply it retroactively.  The Lustre environment I installed
 was the 1.8.5 set. Any insight in the issues would be appreciated.

 Theo

 On 7/14/2011 1:41 PM, Andreas Dilger wrote:
  If you are seeing this problem it means you are using the ext3-based
 ldiskfs. Go back to the download site and get the lustre-ldiskfs and
 lustre-modules RPMs with ext4 in the name.
 
  That is the code that was tested with LUNs over 8TB. We kept these
 separate for some time to reduce risk for users that did not need larger LUN
 sizes.  This is the default for the recent Whamcloud 1.8.6 release.
 
  Cheers, Andreas
 
  On 2011-07-14, at 11:15 AM, Theodore Omtzigt t...@stillwater-sc.com
  wrote:
 
  I configured a Lustre file system on a collection of storage servers
  that have 12TB raw devices. I configured a combined MGS/MDS with the
  default configuration. On the OSTs however I added the force_over_8tb to
  the mountfsoptions.
 
  Two part question:
  1- do I need to set that parameter on the MGS/MDS server as well
  2- if yes, how do I properly add this parameter on this running Lustre
  file system (100TB on 9 storage servers)
 
  I can't resolve the ambiguity in the documentation as I can't find a
  good explanation of the configuration log mechanism that is being
  referenced in the man pages. Given that the doc for --writeconf
  states "This is very dangerous", I am hesitant to pull the trigger as
  there is 60TB of data on this file system that I'd rather not lose.
  ___
  Lustre-discuss mailing list
  Lustre-discuss@lists.lustre.org
  http://lists.lustre.org/mailman/listinfo/lustre-discuss
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] how to add force_over_8tb to MDS

2011-07-14 Thread Cliff White
This error message you are seeing is what Andreas was talking about - you
must use the ext4-based version, and then you will not need any option with
your size LUNs. The 'must use force_over_8tb' error is the key here; you most
certainly want/need the *.ext4.rpm versions of everything.
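A quick way to see which flavour you have installed (on the 1.8 download site
the ext4 builds should carry 'ext4' in the package release string):

rpm -qa | grep -E 'lustre-ldiskfs|lustre-modules'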
cliffw


On Thu, Jul 14, 2011 at 11:10 AM, Theodore Omtzigt
t...@stillwater-sc.com wrote:

 Michael:

The reason I had to do it on the OST's is because when issuing the
 mkfs.lustre command to build the OST it would error out with the message
 that I should use the force_over_8tb mount option. I was not able to
 create an OST on that device without the force_over_8tb option.

 Your insights on the writeconf are excellent: good to know that
 writeconf is solid. Thank you.

 Theo

 On 7/14/2011 1:29 PM, Michael Barnes wrote:
  On Jul 14, 2011, at 1:15 PM, Theodore Omtzigt wrote:
 
  Two part question:
  1- do I need to set that parameter on the MGS/MDS server as well
  No, they are different filesystems.  You shouldn't need to do this on the
 OSTs either.  You must be using an older lustre release.
 
  2- if yes, how do I properly add this parameter on this running Lustre
  file system (100TB on 9 storage servers)
  covered
 
  I can't resolve the ambiguity in the documentation as I can't find a
  good explanation of the configuration log mechanism that is being
  referenced in the man pages. Given that the doc for --writeconf
  states "This is very dangerous", I am hesitant to pull the trigger as
  there is 60TB of data on this file system that I'd rather not lose.
   I've had no issues with writeconf.  It's nice because it shows you the old
  and new parameters.  Make sure that the changes that you made are what
  you want, and that the old parameters that you want to keep are still
  intact.  I don't remember the exact circumstances, but I've found settings
  were lost when doing a writeconf, and I had to explicitly put these settings
  in the tunefs.lustre command to preserve them.
 
  -mb
 
  --
  +---
  | Michael Barnes
  |
  | Thomas Jefferson National Accelerator Facility
  | Scientific Computing Group
  | 12000 Jefferson Ave.
  | Newport News, VA 23606
  | (757) 269-7634
  +---
 
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Fwd: Lustre performance issue (obdfilter_survey

2011-07-06 Thread Cliff White
The case=network part of obdfilter_survey has really been replaced by
lnet_selftest.
I don't think it's been maintained in a while.

It would be best to repeat the network-only test with lnet_selftest, this is
likely an issue with
the script.
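A minimal lnet_selftest run looks roughly like this (the two NIDs are
placeholders for one client and one OSS):

modprobe lnet_selftest
export LST_SESSION=$$
lst new_session rw_test
lst add_group servers 192.168.5.10@o2ib
lst add_group clients 192.168.5.20@o2ib
lst add_batch bulk
lst add_test --batch bulk --concurrency 1 --from clients --to servers brw write size=1M
lst run bulk
lst stat servers clients       # watch the bandwidth for a while, then Ctrl-C
lst stop bulk
lst end_session

Raising --concurrency should show the same scaling with thread count that you
saw in your survey output.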
cliffw

On Wed, Jul 6, 2011 at 1:04 PM, lior amar lioror...@gmail.com wrote:

 Hi,

 I am installing a Lustre system and I wanted to measure the OSS
 performance.
 I used the obdfilter_survey and got very low performance for low
 thread numbers when using the case=network option


 System Configuration:
 * Lustre 1.8.6-wc (compiled from the whamcloud git)
 * Centos 5.6
 * Infiniband (mellanox cards) open ib from centos 5.6
 * OSS - 2 quad core  E5620 CPUS
 * OSS - memory 48GB
 * LSI 2965 raid card with 18 disks in raid 6 (16 data + 2). Raw
 performance is good both when testing the block device and over a file
 system with Bonnie++

 * OSS uses ext4 and mkfs parameters were set to reflect the stripe
 size .. -E stride =...

 The performance test I did:


 1) obdfilter_survey case=disk -
OSS performance is ok (similar to raw disk performance) -
In the case of 1  thread and one object getting 966MB/sec

 2) obdfilter_survey case=network -
 OSS performance is bad for low thread numbers and get better as
 the  number of  threads increases.
 For the 1 thread one object getting 88MB/sec

 3) obdfilter_survey case=netdisk -- Same as network case

 4) When running ost_survey I am getting also low performance:
Read = 156 MB/sec Write = ~350MB/sec

 5) Running the lnet_self test I get much higher numbers
  Numbers obtained with concurrency = 1

  [LNet Rates of servers]
  [R] Avg: 3556 RPC/s Min: 3556 RPC/s Max: 3556 RPC/s
  [W] Avg: 4742 RPC/s Min: 4742 RPC/s Max: 4742 RPC/s
  [LNet Bandwidth of servers]
  [R] Avg: 1185.72  MB/s  Min: 1185.72  MB/s  Max: 1185.72  MB/s
  [W] Avg: 1185.72  MB/s  Min: 1185.72  MB/s  Max: 1185.72  MB/s




 Any ideas why a single thread over the network obtains 88MB/sec while the
 same test conducted locally obtained 966MB/sec?

 What else should I test/read/try ??

 10x

 Below are the actual numbers:

 = obdfilter_survey case = disk ==
 Wed Jul  6 13:24:57 IDT 2011 Obdfilter-survey for case=disk from oss1
 ost  1 sz 16777216K rsz 1024K obj1 thr1 write  966.90
 [ 644.40,1030.02] rewrite 1286.23 [1300.78,1315.77] read
 8474.33 SHORT
 ost  1 sz 16777216K rsz 1024K obj1 thr2 write 1577.95
 [1533.57,1681.43] rewrite 1548.29 [1244.83,1718.42] read
 11003.26 SHORT
 ost  1 sz 16777216K rsz 1024K obj1 thr4 write 1465.68
 [1354.73,1600.50] rewrite 1484.98 [1271.54,1584.52] read
 16464.13 SHORT
 ost  1 sz 16777216K rsz 1024K obj1 thr8 write 1267.39
 [ 797.25,1476.48] rewrite 1350.28 [1283.80,1387.70] read
 15353.69 SHORT
 ost  1 sz 16777216K rsz 1024K obj1 thr   16 write 1295.35
 [1266.82,1408.70] rewrite 1332.59 [1315.61,1429.66] read
 15001.67 SHORT
 ost  1 sz 16777216K rsz 1024K obj2 thr2 write 1467.80
 [1472.62,1691.42] rewrite 1218.88 [ 821.23,1338.74] read
 13538.41 SHORT
 ost  1 sz 16777216K rsz 1024K obj2 thr4 write 1561.09
 [1521.57,1682.75] rewrite 1183.31 [ 959.10,1372.52] read
 15955.31 SHORT
 ost  1 sz 16777216K rsz 1024K obj2 thr8 write 1498.74
 [1543.58,1704.41] rewrite 1116.19 [1001.06,1163.91] read
 15523.22 SHORT
 ost  1 sz 16777216K rsz 1024K obj2 thr   16 write 1462.54
 [ 985.08,1615.48] rewrite 1244.29 [1100.97,1444.80] read
 15174.56 SHORT
 ost  1 sz 16777216K rsz 1024K obj4 thr4 write 1483.42
 [1497.88,1648.45] rewrite 1042.92 [ 801.25,1192.69] read
 15997.30 SHORT
 ost  1 sz 16777216K rsz 1024K obj4 thr8 write 1494.63
 [1458.85,1624.13] rewrite 1041.81 [ 806.25,1183.89] read
 15450.18 SHORT
 ost  1 sz 16777216K rsz 1024K obj4 thr   16 write 1469.96
 [1450.65,1647.45] rewrite 1027.06 [ 645.50,1215.86] read
 15543.46 SHORT
 ost  1 sz 16777216K rsz 1024K obj8 thr8 write 1417.93
 [1250.85,1520.58] rewrite 1007.45 [ 905.15,1130.82] read
 15789.66 SHORT
 ost  1 sz 16777216K rsz 1024K obj8 thr   16 write 1324.28
 [ 951.87,1518.26] rewrite  986.48 [ 855.21,1079.99] read
 15510.70 SHORT
 ost  1 sz 16777216K rsz 1024K obj   16 thr   16 write 1237.22
 [ 989.07,1345.17] rewrite  915.56 [ 749.08,1033.03] read
 15415.75 SHORT

 ==

 == obdfilter_survey case = network 
 Wed Jul  6 16:29:38 IDT 2011 Obdfilter-survey for case=network from
 oss6
 ost  1 sz 16777216K rsz 1024K obj1 thr1 write   87.99
 [  86.92,  88.92] rewrite   87.98 [  86.83,  88.92] read   88.09
 [  86.92,  88.92]
 ost  1 sz 16777216K rsz 1024K obj1 thr2 write  175.76
 [ 173.84, 176.83] rewrite  175.75 [ 174.84, 176.83] read  172.76
 [ 171.67, 174.84]
 ost  1 sz 16777216K rsz 1024K obj1 thr

Re: [Lustre-discuss] Need help

2011-07-01 Thread Cliff White
Did you also install the correct e2fsprogs?
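Worth a quick check, e.g.:

rpm -q e2fsprogs

The version installed should be the Lustre-patched e2fsprogs from the same
download site, not the stock distro package.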
cliffw


On Fri, Jul 1, 2011 at 5:45 PM, Mervini, Joseph A jame...@sandia.gov wrote:

 Hi,

 I just upgraded our servers from RHEL 5.4 - RHEL 5.5 and went from lustre
 1.8.3 to 1.8.5.

 Now when I try to mount the OSTs I'm getting:

 [root@aoss1 ~]# mount -t lustre /dev/disk/by-label/scratch2-OST0001
 /mnt/lustre/local/scratch2-OST0001
 mount.lustre: mount /dev/disk/by-label/scratch2-OST0001 at
 /mnt/lustre/local/scratch2-OST0001 failed: No such file or directory
 Is the MGS specification correct?
 Is the filesystem name correct?
 If upgrading, is the copied client log valid? (see upgrade docs)

 tunefs.lustre looks okay on both the MDT (which is mounted) and the OSTs:

 [root@amds1 ~]# tunefs.lustre /dev/disk/by-label/scratch2-MDT
 checking for existing Lustre data: found CONFIGS/mountdata
 Reading CONFIGS/mountdata

   Read previous values:
 Target: scratch2-MDT
 Index:  0
 Lustre FS:  scratch2
 Mount type: ldiskfs
 Flags:  0x5
  (MDT MGS )
 Persistent mount opts:
 errors=panic,iopen_nopriv,user_xattr,maxdirsize=2000
 Parameters: lov.stripecount=4 failover.node=failnode@tcp1
 failover.node=failnode@o2ib1 mdt.group_upcall=/usr/sbin/l_getgroups


   Permanent disk data:
 Target: scratch2-MDT
 Index:  0
 Lustre FS:  scratch2
 Mount type: ldiskfs
 Flags:  0x5
  (MDT MGS )
 Persistent mount opts:
 errors=panic,iopen_nopriv,user_xattr,maxdirsize=2000
 Parameters: lov.stripecount=4 failover.node=failnode@tcp1
 failover.node=failnode@o2ib1 mdt.group_upcall=/usr/sbin/l_getgroups

 exiting before disk write.


 [root@aoss1 ~]# tunefs.lustre /dev/disk/by-label/scratch2-OST0001
 checking for existing Lustre data: found CONFIGS/mountdata
 Reading CONFIGS/mountdata

   Read previous values:
 Target: scratch2-OST0001
 Index:  1
 Lustre FS:  scratch2
 Mount type: ldiskfs
 Flags:  0x2
  (OST )
 Persistent mount opts: errors=panic,extents,mballoc
 Parameters: mgsnode=mds-server1@tcp1 mgsnode=mds-server1@o2ib1
 mgsnode=mds-server2@tcp1 mgsnode=mds-server2@o2ib1
 failover.node=failnode@tcp1 failover.node=failnode@o2ib1


   Permanent disk data:
 Target: scratch2-OST0001
 Index:  1
 Lustre FS:  scratch2
 Mount type: ldiskfs
 Flags:  0x2
  (OST )
 Persistent mount opts: errors=panic,extents,mballoc
 Parameters: mgsnode=mds-server1@tcp1 mgsnode=mds-server1@o2ib1
 mgsnode=mds-server2@tcp1 mgsnode=mds-server2@o2ib1
 failover.node=falnode@tcp1 failover.node=failnode@o2ib1

 exiting before disk write.


 I am really stuck and could really use some help.

 Thanks.

 ==

 Joe Mervini
 Sandia National Laboratories
 Dept 09326
 PO Box 5800 MS-0823
 Albuquerque NM 87185-0823



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] What exactly is punch statistic?

2011-06-16 Thread Cliff White
It is called when truncating a file - afaik it is showing you the number of
truncates, more or less.
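You can look at the raw counter the tool is reading, e.g. on an OSS (the
parameter path is the 1.8.x one, so treat the exact name as an assumption):

lctl get_param obdfilter.*.stats | grep punch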
cliffw



On Thu, Jun 16, 2011 at 10:52 AM, Mervini, Joseph A jame...@sandia.gov wrote:

 Hi,

 I have been covertly trying for a long time to find out what punch means as
 far as Lustre llobdstat output goes, but have not really found anything definitive.

 Can someone answer that for me? (BTW: I am not alone in my ignorance... :)
 )

 Thanks.
 

 Joe Mervini
 Sandia National Laboratories
 High Performance Computing
 505.844.6770
 jame...@sandia.gov




 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Enabling mds failover after filesystem creation

2011-06-14 Thread Cliff White
It depends - are you using a combined MGS/MDS?
If so, you will have to update the mgsnid on all servers to reflect the
failover node,
plus change the client mount string to show the failover node.
otherwise, it's the same procedure as with an OST.
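For the combined MGS/MDS case, a rough sketch (device path and fsname are
placeholders; the NIDs are the ones from your mail):

tunefs.lustre --mgsnode=10.0.1.3@o2ib /dev/<ost-device>             # add the failover MGS NID on each OST
mount -t lustre 10.0.1.2@o2ib:10.0.1.3@o2ib:/<fsname> /mnt/lustre   # clients list both MGS NIDs

plus the failover.node parameter on the MDT itself, as in your tunefs example.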
cliffw


On Tue, Jun 14, 2011 at 12:06 PM, Jeff Johnson 
jeff.john...@aeoncomputing.com wrote:

 Greetings,

 I am attempting to add mds failover operation to an existing v1.8.4
 filesystem. I have heartbeat/stonith configured on the mds nodes. What
 is unclear is what to change in the lustre parameters. I have read over
 the 1.8.x and 2.0 manuals and they are unclear as exactly how to enable
 failover mds operation on an existing filesystem.

 Do I simply run the following on the primary mds node and specify the
 NID of the secondary mds node?

 tunefs.lustre --param=failover.node=10.0.1.3@o2ib /dev/<mdt device>

 where: 10.0.1.2=primary mds, 10.0.1.3=secondary mds

 All of the examples for enabling failover via tunefs.lustre are for OSTs
 and I want to be sure that there isn't a different procedure for the MDS
 since it can only be active/passive.

 Thanks,

 --Jeff

 --
 Jeff Johnson
 Aeon Computing

 www.aeoncomputing.com
 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Enabling mds failover after filesystem creation

2011-06-14 Thread Cliff White
Then it should be the same as the OST case. The only difference between the
two
is that we never allow two active MDSs on the same filesystem, so MDT is
always active/passive.
cliffw

On Tue, Jun 14, 2011 at 12:18 PM, Jeff Johnson 
jeff.john...@aeoncomputing.com wrote:

  Apologies, I should have been more descriptive.

 I am running a dedicated MGS node and MGT device. The MDT is a standalone
 RAID-10 shared via SAS between two nodes, one being the current MDS and the
 second being the planned secondary MDS. Heartbeat and stonith w/ ipmi
 control is currently configured but not started between the two nodes.



 On 6/14/11 12:12 PM, Cliff White wrote:

 It depends - are you using a combined MGS/MDS?
 If so, you will have to update the mgsnid on all servers to reflect the
 failover node,
 plus change the client mount string to show the failover node.
  otherwise, it's the same procedure as with an OST.
 cliffw


 On Tue, Jun 14, 2011 at 12:06 PM, Jeff Johnson 
 jeff.john...@aeoncomputing.com wrote:

 Greetings,

 I am attempting to add mds failover operation to an existing v1.8.4
 filesystem. I have heartbeat/stonith configured on the mds nodes. What
 is unclear is what to change in the lustre parameters. I have read over
 the 1.8.x and 2.0 manuals and they are unclear as to exactly how to enable
 failover mds operation on an existing filesystem.

 Do I simply run the following on the primary mds node and specify the
 NID of the secondary mds node?

 tunefs.lustre --param=failover.node=10.0.1.3@o2ib /dev/<mdt device>

 where: 10.0.1.2=primary mds, 10.0.1.3=secondary mds

 All of the examples for enabling failover via tunefs.lustre are for OSTs
 and I want to be sure that there isn't a different procedure for the MDS
 since it can only be active/passive.

 Thanks,

 --Jeff

 --
 Jeff Johnson
 Aeon Computing

 www.aeoncomputing.com
 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




 --
 cliffw
 Support Guy
 WhamCloud, Inc.
 www.whamcloud.com



 --

 Jeff Johnson
 Aeon Computing
 www.aeoncomputing.com
 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Has anyone built 1.8.5 on Centos 5.6?

2011-06-01 Thread Cliff White
Actually, we have 2.6.18-238 in testing for the Lustre 1.8.6 release; it builds
fine.
You can get RPMs/SRPMs
here: http://newbuild.whamcloud.com/job/lustre-b1_8/lastSuccessfulBuild/
including the server kernels.

or source from our git repo
cliffw



On Wed, Jun 1, 2011 at 1:16 AM, Joe Landman land...@scalableinformatics.com
 wrote:

 On 06/01/2011 04:13 AM, Götz Waschk wrote:
  On Tue, May 31, 2011 at 10:56 PM, Joe Landman
  land...@scalableinformatics.com  wrote:
  Are there any gotchas?  Or is it worth staying with the older Centos
  5.4/5.5 based kernels from the download site?
  Hi Joseph,
 
  are you talking about the client or the server? The client works fine
  with the 2.6.18-238.9.1.el5 kernel, on Scientific Linux 5, that is.

 Server side.  I finished the build, and put the results here (if this is
 useful for anyone else)
 http://download.scalableinformatics.com/lustre/1.8git_build/

 Thanks to all for the pointer to the documents.



 --
 Joseph Landman, Ph.D
 Founder and CEO
 Scalable Informatics, Inc.
 email: land...@scalableinformatics.com
 web  : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
 phone: +1 734 786 8423 x121
 fax  : +1 866 888 3112
 cell : +1 734 612 4615
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] LNET routing question

2011-04-04 Thread Cliff White
On Mon, Apr 4, 2011 at 1:32 PM, David Noriega tsk...@my.utsa.edu wrote:

 Reading up on LNET routing and have a question. Currently have nothing
 special going on, simply specified tcp0(bond0) on the OSSs and MDS.
 Same for all the clients as well, we have an internal network for our
 cluster, 192.168.x.x.  How would I go about doing the following?

 Data1,Data2 = OSS, Meta1,Meta2 = MDS.

 Internally its 192.168.1.x for cluster nodes, 192.168.5.x for lustre nodes.

 But I would like a 1) a 'forwarding' sever, which would be our file
 server which exports lustre via samba/nfs to also be the outside
 world's access point to lustre(outside world being the rest of the
 campus). 2) a second internal network simply connecting the OSSs and
 MDS to the backup client to do backups outside of the cluster network.


Slightly confused am I.
 1) is just a samba/nfs exporter; while you might
have two networks in the one box, you wouldn't be doing any routing -
the Lustre client is re-exporting the FS.
The Lustre client has to find the Lustre servers, the samba/NFS clients only
have to find the Lustre client.

2) if the second internal net connects backup clients directly to OSS/MDS
you  again need no routing.

Lustre routing is really for connecting disparate network hardware for
Lustre traffic, for example Infiniband routed to TCP/IP, or Quadrics to
IB.

Also, file servers are never routers, since they have direct connections to
all clients. Routers are dedicated nodes that have both hardware interfaces
and
sit between a client and server.
Typical setup are things like a cluster with server and clients on IB, you
wish to add a second client pool on TCP/IP, you have to build nodes that
have both TCP/IP and IB interfaces, and those are Lustre Routers.
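For reference, such a router node would carry something like this (a sketch
only; interface names and NIDs are invented):

router:         options lnet networks=o2ib0(ib0),tcp0(eth0) forwarding=enabled
IB-side nodes:  options lnet networks=o2ib0(ib0) routes="tcp0 192.168.5.30@o2ib0"
TCP clients:    options lnet networks=tcp0(eth0) routes="o2ib0 10.0.0.30@tcp0"

But again, you don't need any of that here.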

Since all your traffic is TCP/IP, it sounds like normal TCP/IP network
manipulation is all you need. You would need the 'lnet networks' stuff to
align nets with interfaces, and that part looks correct.
cliffw



 So would I do the following?

 OSS/MDS
 options lnet networks=tcp0(bond0),tcp1(eth3) routes=tcp2 192.168.2.1

 Backup client
 options lnet networks=tcp1(eth1)

 Cluster clients
 options lnet networks=tcp0(eth0)

 File Server
 options lnet networks=tcp0(eth1),tcp2(eth2) forwarding=enabled

 And for any outside clients I would do the following?
 options lnet networks=tcp2(eth0)

 And when mounting from the outside I would use in /etc/fstab the external
 ip?
 x.x.x.x@tcp2:/lustre /lustre lustre defaults,_netdev 0 0

 Is this how it would work? Also can I do this piece-meal or does it
 have to be done all at once?

 Thanks
 David

 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] MDT extremely slow after restart

2011-04-03 Thread Cliff White
What is the underlying disk? Did that hardware/RAID config change
when you switched hardware?
The 'still busy' message is a bug, which may be fixed in 1.8.5.
cliffw


On Sat, Apr 2, 2011 at 1:01 AM, Thomas Roth t.r...@gsi.de wrote:

 Hi all,

 we are suffering from a severe metadata performance degradation on our 1.8.4
 cluster and are pretty clueless.
 - We moved the MDT to a new hardware, since the old one was failing
 - We increased the size of the MDT with 'resize2fs' (+ mounted it and saw
 all the files)
 - We found the performance of the new mds dreadful
 - We restarted the MDT on the old hardware with the failed RAID controller
 replaced, but without doing anything with OSS or clients
 The machine crashed three minutes after recovery was over
 - Moved back to the new hardware, but the system was now pretty messed up:
 persistent "still busy with N RPCs" and some "going back to sleep"
 messages (by the way, is there no way to find out what these RPCs are, and
 how to kill them? Of course I wouldn't mind switching off some clients or
 even rebooting some OSS if I only knew which ones...)
 - Shut down the entire cluster, writeconf, restart without any client
 mounts - worked fine
 - Mounted Lustre and tried to ls a directory with 100 files:   takes
 several minutes(!)
 - Being patient and then trying the same on a second client: takes
 msecs.

 I have done complete shutdowns before, lastly to upgrade from 1.6 to 1.8,
 then without writeconf and without performance loss. Before to change the
 IPs of all servers (moving into a subnet), with writeconf, but without
 recollection of the metadata behavior afterwards.
 It is clear that after writeconf some information has to be regenerated,
 but this is really extreme - also normal?

 The MDT now behaves more like an xrootd master which makes first contact to
 its file servers and has to read in the entire database (would be nice to
 have in Lustre to regenerate the MDT in case of disaster ;-) ).
 Which caches are being filled now when I ls through the cluster? May I
 expect the MDT to explode once it has learned about a certain percentage of
 the
 system? ;-) I mean, we have 100 million files now and the current MDT hardware
 has just 32GB memory...
 In any case this is not the Lustre behavior we are used to.

 Thanks for any hints,
 Thomas

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Migrating so all MGS/MDTs on same node

2011-03-31 Thread Cliff White
You can't mount two MGS on the same node. The MGS NID has to be unique.
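Once a single MGS (plus the three MDTs) lives on one node, clients mount each
filesystem through that one MGS NID, e.g. (NID and fs names invented):

mount -t lustre 192.168.1.5@tcp0:/fs1 /mnt/fs1
mount -t lustre 192.168.1.5@tcp0:/fs2 /mnt/fs2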

On Thu, Mar 31, 2011 at 2:22 AM, Andrus, Brian Contractor
bdand...@nps.edu wrote:

 All,



 We have a system that was grown from 3 separate lustre filesystems and I
 would like to set it up so they share mgs/mdt services from a single node
 (so the mount points on clients all use the same ‘server’).



 I have tried doing a writeconf and changing the mgs nid on the osts and
 mdts, but when I try mounting the mdt on the new system (which already has
 one mgs/mdt mounted) I get:



 mount.lustre: mount /dev/VG_hamming/work_mdt at /mnt/lustre/work/mdt
 failed: Operation already in progress

 The target service is already running. (/dev/VG_hamming/work_mdt)



 But if I immediately try again, it works.

 Anyone know why this is?



 Running lustre_1.8.5 using ext4 rpms



 Brian Andrus

 ITACS/Research Computing

 Naval Postgraduate School

 Monterey, California

 voice: 831-656-6238



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Optimal stratgy for OST distribution

2011-03-31 Thread Cliff White
No, the algorithm is not purely random, it is weighted on QOS, space and a
few other things.
When a stripe is chosen on one OSS, we add a penalty to the other OSTs on
that OSS to prevent
IO bunching on one OSS.
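The weighting is visible and tunable from the client side; in 1.8.x the knobs
live under the lov (names as in 1.8, so verify locally):

lctl get_param lov.*.qos_prio_free        # how heavily free space is weighted
lctl get_param lov.*.qos_threshold_rr     # imbalance below which plain round-robin is used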
cliffw


On Thu, Mar 31, 2011 at 1:59 PM, Jeremy Filizetti 
jeremy.filize...@gmail.com wrote:

 Is this a feature implemented after 1.8.5?  In the past, default striping
 without an offset resulted in sequential stripe allocation according to
 client device order for a striped file.  Basically, the order OSTs were
 mounted after the last --writeconf is the order the targets are added to
 the client llog and allocated.

 It's probably not a big deal for lots of clients but for a small number of
 clients doing large sequential IO or working over the WAN it is.  So
 regardless of an A or B configuration a file with a stripe count of 3 could
 end up issuing IO to a single OSS instead of using round-robin between the
 socket/queue pair to each OSS.

 Jeremy


 On Thu, Mar 31, 2011 at 11:06 AM, Kevin Van Maren 
 kevin.van.ma...@oracle.com wrote:

 It used to be that multi-stripe files were created with sequential OST
 indexes.  It also used to be that OST indexes were sequentially assigned
 to newly-created files.
 As Lustre now adds greater randomization, the strategy for assigning
 OSTs to OSS nodes (and storage hardware, which often limits the
 aggregate performance of multiple OSTs) is less important.

 While I have normally gone with (a), (b) can make it easier to remember
 where OSTs are located, and also keeps a uniform convention if the
 storage system is later grown.

 Kevin


 Heckes, Frank wrote:
  Hi all,
 
  sorry if this question has been answered before.
 
  What is the optimal 'strategy' assigning OSTs to OSS nodes:
 
  -a- Assign OST via round-robin to the OSS
  -b- Assign in consecutive order (as long as the backend storage provides
  enought capacity for iops and bandwidth)
  -c- Something 'in-between' the 'extremes' of -a- and -b-
 
  E.g.:
 
  -a- OSS_1   OSS_2   OSS_3
        |       |       |
      OST_1   OST_2   OST_3
      OST_4   OST_5   OST_6
      OST_7   OST_8   OST_9
 
  -b- OSS_1   OSS_2   OSS_3
        |       |       |
      OST_1   OST_4   OST_7
      OST_2   OST_5   OST_8
      OST_3   OST_6   OST_9
 
  I thought -a- would be best for task-local (each task writes to its own
  file) and single file (all tasks write to a single file) I/O since it's like
  a raid-0 approach used in disk I/O (and Sun created our first FS this way).
  Has someone made any systematic investigations into which approach is best,
  or does someone have an educated opinion?
  Many thanks in advance.
  BR
 
  -Frank Heckes
 
 
 
 
 
  Forschungszentrum Juelich GmbH
  52425 Juelich
  Sitz der Gesellschaft: Juelich
  Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
  Vorsitzender des Aufsichtsrats: MinDirig Dr. Karl Eugen Huthmacher
  Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender),
  Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
  Prof. Dr. Sebastian M. Schmidt
 
 
 
 
 
  Besuchen Sie uns auf unserem neuen Webauftritt unter www.fz-juelich.de
  ___
  Lustre-discuss mailing list
  Lustre-discuss@lists.lustre.org
  http://lists.lustre.org/mailman/listinfo/lustre-discuss
 

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] software raid

2011-03-24 Thread Cliff White
Historically, Linux software RAID had multiple issues, so we did not advise
using it.
Those issues afaik were fixed long ago, and we changed the advice.
Sun/Oracle sold a product that was based on software RAID - there are no
unique issues
using soft RAID with Lustre.

Performance/reliability is a whole 'nother set of topics - there are
reasons why people
buy the expensive flavors.
cliffw

On Thu, Mar 24, 2011 at 3:34 AM, Stuart Midgley sdm...@gmail.com wrote:

 Hi Brian

 Long time no speak.

 Anyway, we used to use software raid exclusively but have slowly stopped.
  Using 3ware cards in Rackable nodes now.  All going well so far.  Though,
 for our MDS we are running a 3 way mirror on sas disks.

 md has a few issues... all of them tend to end at the same place... losing
 data.  We have had situations where md returns crap data because it's getting
 it from a disk, but doesn't actually verify it against other disks (the disk
 hasn't actually thrown hardware errors)... you manually fail the disk and
 all of a sudden the file is no longer corrupt.

 We have also had situations where md says the write occurred successfully,
 but really it has just hit the cache on the disk and hasn't been committed
 to platter... and a short time later, the disk reports the error to md but
 for a much earlier read/write.  The data is now corrupt on disk and flushed
 from all of Lustre's caches.

 With all our software raid we now do /sbin/hdparm -W 0 $dev  to disable
 write caching on the disk.  This has helped, but obviously hurts
 performance.





 --
 Dr Stuart Midgley
 sdm...@gmail.com



 On 24/03/2011, at 10:54 AM, Brian O'Connor wrote:

 
  This has probably been asked and answered.
 
  Is software raid(md) still considered bad practice?
 
  I would like to use ssd drives for an mdt, but using fast ssd drives
  behind a raid controller seems to defeat the purpose.
 
  There was some thought that the decision not to support
  software raid was mostly about Sun/Oracle trying to sell hardware
  raid.
 
  thoughts?
 
  --
  Brian O'Connor
  ---
  SGI Consulting
  Email: bri...@sgi.com, Mobile +61 417 746 452
  Phone: +61 3 9963 1900, Fax:  +61 3 9963 1902
  357 Camberwell Road, Camberwell, Victoria, 3124
  AUSTRALIA
  http://www.sgi.com/support/services
  ---
 
  ___
  Lustre-discuss mailing list
  Lustre-discuss@lists.lustre.org
  http://lists.lustre.org/mailman/listinfo/lustre-discuss

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Details on the LNET-Selftests

2011-03-23 Thread Cliff White
Sadly, as far as I am aware, no.

cliffw


On Wed, Mar 23, 2011 at 10:56 AM, Alvaro Aguilera 
s2506...@inf.tu-dresden.de wrote:

 Hello,

 I've read the manual section about the selftest-Module and wonder if
 someone here can point me to more detailed information about it.
 For example some kind of diagram showing the packet exchange for the
 BRW-test, the key factors influencing/limiting its performance, etc. would
 be very helpful. Does something like that exist?

 Thanks,
 Alvaro.





 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] MDS hangs with OFED

2011-03-17 Thread Cliff White
Unfortunately, we've had lots of reports of IB instability.  It does appear
to happen
quite a bit, and generally is not a Lustre problem at all.
- Check all mechanical connections, cables, etc. - replace if need be - many
issues have been cable-related.
- Check firmware versions of all IB cards, find the best version for yours.
- Make sure your IB cards are in the proper (best performing) slots in your
backplane.
- If you have an IB switch with monitoring/error reporting you may be able
to get more data.
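If you have the standard OFED diagnostic tools installed, a few quick checks
along these lines can help (exact tool names and output vary with your OFED
version):
# ibstat          (port state and firmware version of each HCA)
# ibv_devinfo     (similar information via the verbs layer)
# perfquery       (port counters - rising symbol/link errors usually point at cables or connectors)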
cliffw


On Thu, Mar 17, 2011 at 10:54 AM, Kevin Hildebrand ke...@umd.edu wrote:


 We've been seeing occasional hangs on our MDS and I'd like to see if
 anyone else is seeing this or can provide suggestions on where to look.
 This might not even be a Lustre problem at all.

 We're running Lustre 1.8.4 with OFED 1.5.2, and kernel version
 2.6.18-194.3.1.el5_lustre.1.8.4.

 The problem is that at some point it appears that something in the IB
 stack is going out to lunch- pings to the IPoIB interface time out, and
 anything that touches IB (perfquery, etc) goes into a hard hang and cannot
 be killed.

 The only solution to the problem once it occurs is to power-cycle the
 machine, as shutdown/reboot hang as well.

 From what I can see, the first abnormal entries in the system logs on
 the MDS are messages showing that connections to the OSSes are timing out.

 Any insight would be appreciated.

 Thanks,

 Kevin
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] OST problem

2011-03-04 Thread Cliff White
For clarity, Lustre does not replicate data. If you add an OST, it is
unique.
If you wish to do failover, this requires shared storage between two nodes.
We do not replicate storage.

If you wish to increase the size of your filesystem, you can add OSTs.
cliffw


On Fri, Mar 4, 2011 at 7:16 AM, Lucius lucius...@hotmail.com wrote:

 Hi Larry,

 thank you for your answer, but I do not have the chance to use infiniband.
 This description also starts with formatting the fs. I don't want to format
 the node that is already in use; I would like to extend it online.
 Is it possible to achieve a full sync of the data of the two nodes (the one
 existing already, and the new one on the server about
 to be attached)? The old node has 50% of its capacity used, the new node
 is completely empty.

 So the question is: how do I add a failnode to an online system, and how do
 I manage to get the data in sync?
 Hope someone can help

 thank you,
 Lucius

 - Eredeti üzenet -
 From: Larry
 Sent: Tuesday, March 01, 2011 6:25 AM
 To: Lucius
 Cc: lustre-discuss@lists.lustre.org
 Subject: Re: [Lustre-discuss] OST problem

 Hi Lucius,
 lustre manual  chapter 15 tells you how to do it


 On Tue, Mar 1, 2011 at 1:05 PM, Lucius lucius...@hotmail.com wrote:
  Hello everyone,
 
  I would like to extend a OSS, which is still in current use. I would like
  to
  extend it with a server which has exactly the same HW configuration, and
 I’d
  like to extend it in an active/active mode.
  I couldn’t find any documentation about this, as most of the examples
 show
  how to use failnode during formatting. However, I need to extend the
  currently working system without losing data.
  Also, tunefs.lustre examples show only the parameter configuration, but
  they
  won’t tell if you need to synchronize the file system before setting the
  parameters. How
  would the system know that on the given server identified by its unique
  IP,
  which OST mirrors should run?
 
  Thank you in advance,
  Viktor
  ___
  Lustre-discuss mailing list
  Lustre-discuss@lists.lustre.org
  http://lists.lustre.org/mailman/listinfo/lustre-discuss
 
 

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Help! Newbie trying to set up Lustre network

2011-02-22 Thread Cliff White
Run 'lctl list_nids' on the client also.
Then you can
# lctl ping <other nid>
from both server and client to verify your LNET is functioning.

Also, use tunefs.lustre --print on your MDS/MGT and OST devices to verify
that mgsnid
is set correctly there.
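For example (the device name below is just a placeholder for your MDT/OST
device):
On the client:
# lctl list_nids
# lctl ping 192.168.0.2@tcp0     (a successful ping prints the server's NIDs)
On the server:
# tunefs.lustre --print /dev/sdX    (check the mgsnode= entry in the parameters)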
cliffw


On Tue, Feb 22, 2011 at 11:10 AM, Xiang, Yang yang.xi...@teradata.comwrote:

 Dmesg and syslog are clean and have no entries about the lustre client. And
 on the server side, there is no entry on the client side connection
 attempt either. Tcpdump shows no trace of incoming client connection
 request. I think either the syntax of the MGSNID in the mount.lustre command is
 not interpreted by the client correctly, or I am missing some
 configuration on the client side to turn 192.168.0.2@tcp0 into a
 valid NID, i.e. a lustre device for the client to make an outgoing
 connection request to the server.

 Thanks,

 Yang


 -Original Message-
 From: lustre-discuss-boun...@lists.lustre.org
 [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Brian J.
 Murrell
 Sent: Tuesday, February 22, 2011 11:04 AM
 To: lustre-discuss@lists.lustre.org
 Subject: Re: [Lustre-discuss] Help! Newbie trying to set up Lustre
 network

 On 11-02-22 02:00 PM, Xiang, Yang wrote:
  It fails and complains about:
 
  mount.lustre: mount 192.168.0.2@tcp0:/temp at /lustre failed: No such
  device
 
  Are the lustre modules loaded?
 
  Check /etc/modprobe.conf and /proc/filesystems
 
  Note 'alias lustre llite' should be removed from modprobe.conf

 The first thing to check in such situations is the client's syslog
 and/or dmesg.

 b.

 --
 Brian J. Murrell
 Senior Software Engineer
 Whamcloud, Inc.

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
cliffw
Support Guy
WhamCloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Is this setup possible

2011-02-17 Thread Cliff White
All the nodes have to run the same network type, so they can talk to one
another. If the client is running
Infiniband, the server must also run Infiniband, in most cases. See the Lustre
Manual for information on
Lustre Routing.
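As a rough sketch of what LNET routing looks like in modprobe.conf (interface
names and NIDs here are only placeholders):
On a router node that sits on both networks:
options lnet networks="tcp0(eth0),o2ib0(ib0)" forwarding="enabled"
On a TCP-only client that needs to reach o2ib servers:
options lnet networks="tcp0(eth0)" routes="o2ib0 192.168.1.10@tcp0"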
Clients and server can run different versions of Lustre.
You need to run a version of the kernel supported by Lustre, even with
patchless clients.
See Lustre release notes for kernel versions.
cliffw


On Thu, Feb 17, 2011 at 2:54 PM, vilobh meshram meshram.vil...@gmail.comwrote:

 Hi,

 Is this setup possible.I want to install a patchless client on the client
 nodes.

 Following are details :-

 1) Server has no OFED ; client has OFED 1.5.2.

 2) Client / Server has Lustre 1.8.3.

 3) Client Linux version   : 2.6.30

 Server Linux Version : 2.6.18-164.11.1

 Is it possible that if I install different Linux versions on the client and server
 but the same Lustre version on both, things will work fine?
 Also, since I want to install a patchless client on the client side, do I need
 to restrict the Lustre setup to some specific version of the Linux kernel?

 Thanks,
 Vilobh

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss


___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Recovery from Hardware Failure

2011-02-07 Thread Cliff White
You should not have to do the lfsck if the initial fsck's come back clean.
cliffw


On Mon, Feb 7, 2011 at 1:16 PM, Joe Digilio jgd-lus...@metajoe.com wrote:

 Last week we experienced a major hardware failure (disk controller)
 that brought down our system hard.  Now that I have the replacement
 controller, I want to make sure I recover correctly.  Below is the
 procedure I plan to follow based on what I've gathered from the
 Operations Manual.

 Any comments?
 Do I need to create the mds/ost DBs AFTER ll_recover_lost_found_objs?

 Thanks!
 -Joe


 ###MDT Recovery
 # Capture fs state before doing anything
 e2fsck -vfn /dev/$MDTDEV
 # safe repair
 e2fsck -vfp /dev/$MDTDEV
 # Verify no more problems and generate mdsdb
 e2fsck -vfn --mdsdb /tmp/mdsdb /dev/$MDTDEV

 ###OST Recovery
 foreach OST
# Capture fs state before doing anything
e2fsck -vfn /dev/$OSTDEV
# safe repair
e2fsck -vfp /dev/$OSTDEV
# Verify no more problems
e2fsck -vfn --mdsdb /tmp/mdsdb --ostdb /tmp/ostXdb /dev/$OSTDEV

 ### Recover lost+found Objects
 foreach OST
mount -t ldiskfs /dev/$OSTDEV /mnt/ost
ll_recover_lost_found_objs -v -d /mnt/ost/lost+found

 ### Coherency Check
 lfsck -n -v --mdsdb /tmp/mdsdb --ostdb
 /tmp/ost1db,/tmp/ost2db,...,/tmp/ostNdb /lustre
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Fwd: question about routing between subnets

2011-01-25 Thread Cliff White
The MGS can be behind an lnet router. So, provided you have LNET routing set
up between
A and B, the MGS should be okay with only a subnet A address; clients on B
with a proper
routing configuration should be fine.

The address on net C is thus immaterial.
cliffw


On Tue, Jan 25, 2011 at 6:51 AM, Bob Ball b...@umich.edu wrote:

  Hi, no response on this the first time I sent it around.  Can anyone help
 me on this?

 Thanks,
 bob


  Original Message   Subject: [Lustre-discuss] question
 about routing between subnets  Date: Fri, 21 Jan 2011 15:48:25 -0500  From:
 Bob Ball b...@umich.edu b...@umich.edu  To: Lustre discussion
 lustre-discuss@lists.Lustre.org lustre-discuss@lists.Lustre.org

 Our lustre 1.8.4 system sits primarily on subnet A.  However, we also
 have a small number of clients that sit on subnet B.  In setting up the
 subnet B clients, we provided lnet router machines that have addresses
 on both subnet A and on subnet B, the MGS machine has addresses on both
 subnet A and subnet B, and all the OSS have lnet routing that lets this
 work.  Our world is a happy place.

 Now, due to other factors, we have to change the MGS subnet B address to
 instead be on subnet C.  Subnet C will be set up to have the usual IP
 routing to find subnet B, but what will happen to the clients that exist
 only on subnet B?  Is there a way for them to find the MGS at boot time,
 or are they going to stop working once this network change is effected?

 Thanks much,
 bob
 ___
 Lustre-discuss mailing 
 listLustre-discuss@lists.lustre.orghttp://lists.lustre.org/mailman/listinfo/lustre-discuss


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss


___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] MDT raid parameters, multiple MGSes

2011-01-21 Thread Cliff White
On Fri, Jan 21, 2011 at 3:43 AM, Thomas Roth t.r...@gsi.de wrote:

 Hi all,

 we have gotten new MDS hardware, and I've got two questions:

 What are the recommendations for the RAID configuration and formatting
 options?
 I was following the recent discussion about these aspects on an OST:
 chunk size, strip size, stride-size, stripe-width etc. in the light of
 the 1MB chunks of Lustre ... So what about the MDT? I will have a RAID
 10 that consists of 11 RAID-1 pairs striped over, giving me roughly 3TB
 of space. What would be the correct value for <insert your favorite
 term> - the amount of data written to one disk before proceeding to the
 next disk?


The MDS does very small random IO - inodes and directories.  Afaik, the
largest chunk
of data read/written would be 4.5K - and you would see that only with large
OST stripe
counts.   RAID 10 is fine. You will not
be doing IO that spans more than one spindle, so I'm not sure if there's a
real need to tune here.
Also, the size of the data on the MDS is determined by the number of files
in the
filesystem (~4k per file is good);
unless you are buried in petabytes, 3TB is likely way oversized for an MDT.
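As a quick sanity check using that ~4k-per-file figure: 3TB / 4KB works out
to roughly 750 million files, far more than most installations will ever
create.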
cliffw



 Secondly, it is not yet decided whether we wouldn't use this hardware to
 set up a second Lustre cluster. The manual recommends to have only one
 MGS per site, but doesn't elaborate: what would be the drawback of
 having two MGSes, two different network addresses the clients have to
 connect to to mount the Lustres?
 I know that it didn't work in Lustre 1.6.3 ;-) and there are no apparent
 issues when connecting a Lustre client to a test cluster now (version
 1.8.4), but what about production?


 Cheers,
 Thomas
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] finding performance issues

2010-12-10 Thread Cliff White
On 12/10/2010 11:42 AM, Brock Palen wrote:
 We have a Lustre 1.6.x filesystem,

1.6 has been dead for well over a year. End Of Life.

 4 OSS,  3 x4500 and 1 ddn s2a6620

 Each oss has 4 1gig interfaces bonded, or 1 10gig interface.

 I have a user who is running a few hundred serial jobs that are all accessing 
 the same 16GB file, we striped the file over all the osts, and are tapped at 
 500-600MB/s no matter the number of hosts running.   IO per OST is around 
 15-20MB/s  (31 total ost's)

 This set of jobs keeps reading in the same data set, and has been running for 
 about 24 hours (the group of about 900 total jobs).

 *  Is there a recommendation of a better way to do these sorts of jobs?

Upgrade to the latest release of Lustre.

  The compute nodes have 48GB of ram, he does not use much ram for the 
job just all the IO.

 * Is there a better way to tune?

Yes, you upgrade to the code that has all the tuning fixes/enhancements 
- Lustre 1.8

  What should I be looking for to tune?
You are wasting your time tuning here.
1.8 supports many things, including cache on OSTs which would likely 
help bunches in your case.

cliffw


 Thanks!

 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 bro...@umich.edu
 (734)936-1985



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Determining /proc/fs/lustre/llite subdirectoy

2010-12-08 Thread Cliff White
On 12/08/2010 10:42 AM, James Robnett wrote:

 Our clients have 2 or 3 different lustre filesystems mounted.
 We're using Lustre 1.8.4 on RHEL 5.5.

 On clients I'd like to be able to toggle via a script the extents
 monitoring in /proc/fs/lustre/llite/lustre-XX/extents_stats

When you have multiple file systems, you really should use the --fsname
parameter and give them different names. You apparently kept the default 
name of 'lustre'. The first field is actually the fsname, so for
example on a test system I have two filesystems named 'test1' and 
'test2' so on a typical client I have 
/proc/fs/lustre/llite/test1-XXX, and 
/proc/fs/lustre/llite/test2-XX.

Using the fsname makes identifying the fs bits on the clients trivial, 
since all the IDs start with the fsname. (Which is why we added it.)
I think it may be possible to change the name with tunefs.lustre after 
creation.

Using fsname is the 'simple more direct way' you seek.
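For example, at format time it is just this (device names are placeholders,
and note the fsname is limited to 8 characters):
# mkfs.lustre --fsname=test1 --mgs --mdt /dev/sda
# mkfs.lustre --fsname=test1 --ost --mgsnode=10.0.0.1@tcp0 /dev/sdb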
cliffw


 Given a Lustre directory X how do I determine which lustre-X /proc
 directory that equates to.  I know I've seen the question before
 but after searching the docs and googling old mailing lists I'm
 still stumped.

 I can cheat a bit since I know which MDS goes to which filesystem,
 do 'lctl list_nids', see which MDS is the right one, then see which
 OSSes are grouped with that MDS and pull the trailing -osc--XXX
 suffix off and use that as the suffix for /proc/fs/lustre/llite
 but I have a vague recollection of a simpler more direct way.

 [lustre]# df .
 Filesystem   1K-blocks  Used Available Use% Mounted on
 10.64.1...@tcp:/lustre  


 [lustre]# /usr/sbin/lctl device_list
 0 UP mgc mgc10.64.1...@tcp f7f9beb4-2f44-9219-f491-01dde044e308 5
 1 UP lov lustre-clilov-81021c910400
 3de485df-1dfa-31a1-64d3-c54e948e6f22 4
 2 UP mdc lustre-MDT-mdc-81021c910400
 3de485df-1dfa-31a1-64d3-c54e948e6f22 5
 3 UP osc lustre-OST-osc-81021c910400
 3de485df-1dfa-31a1-64d3-c54e948e6f22 5
 ...remaining OSTS removed
11 UP mgc mgc10.64.2...@tcp d9cd36e8-0748-60bd-3b0f-327efce98418 5
12 UP lov lustre-clilov-810220907c00
 c2e6c647-c01f-f841-f06d-1a97571a6aea 4
13 UP mdc lustre-MDT-mdc-810220907c00
 c2e6c647-c01f-f841-f06d-1a97571a6aea 5
14 UP osc lustre-OST-osc-810220907c00
 c2e6c647-c01f-f841-f06d-1a97571a6aea 5
 ...remaining OSTS removed

 Since I want the filesystem on the first MDS (10.64.1.11) that equates to
/proc/fs/lustre/llite/lustre-81021c910400, this works but it's a bit
 cumbersome to script.  Is there a simpler way like 'lfs list-llite dir'

 James Robnett
 NRAO/NM



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Getting around a Catch-22

2010-12-07 Thread Cliff White
On 12/07/2010 06:51 AM, Bob Ball wrote:
 We have 6 OSS, each with at least 8 OST.  It sometimes happens that I
 need to do maintenance on an OST, so to avoid hanging processes on the
 client machines, I use lctl to disable access to that OST on active
 client machines.

 So, now, it may happen during this maintenance that a client machine is
 rebooted.  So far so good, until it comes time for the Lustre mount.  At
 this point, the reboot will hang, as the under-maintenance OST that is
 expected to be found by this rebooting client, is not found.

 Is there some way around this Catch-22?

This is covered in the Lustre Manual, see the 'exclude' option to mount:
http://wiki.lustre.org/manual/LustreManual18_HTML/ConfiguringLustre.html#50651184_pgfId-1298889
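The syntax is along these lines (the OST name and NIDs are placeholders for
your own):
# mount -t lustre -o exclude=lustre-OST0002 192.168.0.10@tcp0:/lustre /mnt/lustre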

cliffw

 Thanks,
 bob

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] manual OST failover for maintenance work?

2010-12-07 Thread Cliff White
On 12/06/2010 09:57 AM, Adeyemi Adesanya wrote:

 Hi.

 We have pairs of OSS nodes hooked up to shared storage arrays
 containing OSTs but we have not enabled any failover settings yet. Now
 we need to perform maintenance work on an OSS and we would like to
 minimize Lustre downtime. Can I use tunefs.lustre to specify the OSS
 failover NID for an existing OST? I assume i'll have to take the OST
 offline to make this change. Will clients that have Lustre mounted
 pick up this change or will all clients have to remount? I should
 mention that we are running Lustre 1.8.2.


Yes, see the Lustre Manual for details.
cliffw


 ---
 Yemi
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] client modules not loading during boot

2010-09-08 Thread Cliff White
The mount command will automatically load the modules on the client.
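So rather than rc.local, an /etc/fstab entry is usually enough - something
like this (NID and paths are placeholders; _netdev makes sure the network is
up before the mount is attempted):
192.168.0.10@tcp0:/lustre  /lustre  lustre  defaults,_netdev  0 0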
cliffw


On 09/03/2010 11:56 AM, Ronald K Long wrote:

 We have installed lustre 1.8.2 and 1.8.4 client on Red hat 5. The lustre
 modules are not loading during boot. In order to get the lustre file
 system to mount we have to add

 modprobe lustre
 mount /lustre

 to our /etc/rc.local file.

 Here is a list of rpms loaded

 kernel-2.6.18-164.11.1.el5_lustre.1.8.2
 lustre-modules-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
 lustre-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
 lustre-ldiskfs-3.0.9-2.6.18_164.11.1.el5_lustre.1.8.2
 kernel-devel-2.6.18-164.11.1.el5_lustre.1.8.2

 Is there a way to take care of this or is the way we are handling it the
 way to go?

 Thank you

 Rocky



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Configuration question

2010-08-19 Thread Cliff White
On 08/19/2010 10:59 AM, David Noriega wrote:
 I'm curious about the underlying framework of lustre in regards to failover.

 When creating the filesystems, one can provide --failnode=x.x@tcp0
 and even for the OSTs you can provide two nids for the MDS/MGS. What
 do these options tell lustre and the clients? Are these required for
 use with heartbeat? If so why doesn't that second of the manual
 reference this? Also I think there is a typo in 4.5 Operational
 Scenarios, where it says one can use 'mkfs.lustre --ost --mgs
 --fsname='  That of course returns an error.

 David


- The --failnode= parameter gives a list of LNET addresses that will be
   tried by a 'client' (a client in this case can be a client process on a
   Lustre server, as in an OSS talking to a failover MDS) in the event of a
   dropped connection to the primary address. This is actually unrelated
   to the heartbeat setup (which governs services) but critical if
   you wish clients to connect to the new service after failover. So
   it is a necessary part of _any_ Lustre failover whether performed
   by heartbeat or another service.
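A sketch of how that looks at format/mount time (all NIDs and devices below
are placeholders):
# mkfs.lustre --ost --fsname=lustre --mgsnode=10.0.0.1@tcp0 \
    --failnode=10.0.0.3@tcp0 /dev/sdX
Clients can likewise list both MGS NIDs at mount time, separated by a colon:
# mount -t lustre 10.0.0.1@tcp0:10.0.0.2@tcp0:/lustre /mnt/lustre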

It is described in the manual, in various places, including section 8.2
Failover Functionality in Lustre and section 4.4.1, please ask further 
if that's not clear.

Thanks for the manual typo catch - have filed a bug.

cliffw
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Virtualization and Lustre

2010-06-10 Thread Cliff White
On 05/20/2010 08:03 PM, Tyler Hawes wrote:
 Has there been any testing or conclusions regarding the use of virtualization 
 and Lustre, or is this even possible considering how Lustre is coded? I've 
 gotten used to the idea of virtualization for all our other servers, where it 
 is great to know we can mount the image on another host very quickly if a 
 hardware problem brings down a machine, and it seems the same would be nice 
 with Lustre...

 Tyler Hawes
 Lit Post
 www.litpost.com


We do virtualization routinely. Should be no issues. In fact, use of 
KVM/VMware/VirtualBox for local testing is quite common among Lustre devs.

However, we really like to be close to the metal for performance 
reasons.  I would not use virtualization in a production environment 
where performance was crucial.  I am also not certain how well 
high-performance networking does in a virtual server - we make heavy use 
of RDMA when available.

cliffw



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Newbie w/issues

2010-04-28 Thread Cliff White
Brian Andrus wrote:
 Ok, I inherited a lustre filesystem used on a cluster. 
 
 I am seeing an issue where on the frontend, I see all of /work
 On nodes, however, I only see SOME of the user's directories.

That's rather odd. The directory structure is all on the MDS, so
it's usually either all there, or not there. Are any of the user errors
permission-related? That's the only thing I can think of that would change 
what directories one node sees vs. another.
 
 Work consists of one MDT/MGS and 3 osts
 The osts are LVMs served from a DDN via infiniband
 
 Running the kernel modules/client one the nodes/frontend
 lustre-client-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
 lustre-client-modules-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
 
 on the ost/mdt
 lustre-modules-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
 kernel-2.6.18-164.11.1.el5_lustre.1.8.2
 lustre-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
 lustre-ldiskfs-3.0.9-2.6.18_164.11.1.el5_lustre.1.8.2
 
 I have so many error messages in the logs, I am not sure which to sift 
 through for this issue.
 A quick tail on the MDT:
 =
 Apr 27 16:15:19 nas-0-1 kernel: LustreError: 
 4133:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error 
 (-107)  r...@810669d35c50 x1334203739385128/t0 o400-?@?:0/0 lens 
 192/0 e 0 to 0 dl 1272410135 ref 1 fl Interpret:H/0/0 rc -107/0
 Apr 27 16:15:19 nas-0-1 kernel: LustreError: 
 4133:0:(ldlm_lib.c:1848:target_send_reply_msg()) Skipped 419 previous 
 similar messages
 Apr 27 16:16:38 nas-0-1 kernel: LustreError: 
 4155:0:(handler.c:1518:mds_handle()) operation 400 on unconnected MDS 
 from 12345-10.1.255...@tcp
 Apr 27 16:16:38 nas-0-1 kernel: LustreError: 
 4155:0:(handler.c:1518:mds_handle()) Skipped 177 previous similar messages
 Apr 27 16:25:21 nas-0-1 kernel: LustreError: 
 6789:0:(mgs_handler.c:573:mgs_handle()) lustre_mgs: operation 400 on 
 unconnected MGS
 Apr 27 16:25:21 nas-0-1 kernel: LustreError: 
 6789:0:(mgs_handler.c:573:mgs_handle()) Skipped 229 previous similar 
 messages
 Apr 27 16:25:21 nas-0-1 kernel: LustreError: 
 6789:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error 
 (-107)  r...@810673a78050 x1334009404220652/t0 o400-?@?:0/0 lens 
 192/0 e 0 to 0 dl 1272410737 ref 1 fl Interpret:H/0/0 rc -107/0
 Apr 27 16:25:21 nas-0-1 kernel: LustreError: 
 6789:0:(ldlm_lib.c:1848:target_send_reply_msg()) Skipped 404 previous 
 similar messages
 Apr 27 16:26:41 nas-0-1 kernel: LustreError: 
 4173:0:(handler.c:1518:mds_handle()) operation 400 on unconnected MDS 
 from 12345-10.1.255...@tcp
 Apr 27 16:26:41 nas-0-1 kernel: LustreError: 
 4173:0:(handler.c:1518:mds_handle()) Skipped 181 previous similar messages
 =
 

The ENOTCONN (-107) points at server/network health. I would umount the 
clients and verify server health, then verify LNET connectivity. 
However, this would not relate to missing directories - in the absence 
of other explanations, check the MDT with fsck - that's more of a 
generic useful thing to do rather than something indicated by your data.

I would also look through older logs if available, and see if you can
find a point in time where things go bad. The first error is always the 
most useful.
 Any direction/insigt would be most helpful.

Hope this helps
cliffw

 
 Brian Andrus
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] mgt backup

2010-04-01 Thread Cliff White
John White wrote:
 I just wanted to confirm that the backup/restore procedure for MDTs apply 
 equally to MGTs.  Can someone please confirm?

Actually, the only things kept on a dedicated MGT are config logs.
Should be a trivial task to back up. Mount it as ldiskfs and make a quick 
tarball. You don't have to stop the filesystem to umount the MGT.

# ls -Rl /mnt/mgs
/mnt/mgs:
total 20
drwxrwxrwx 2 root root  4096 Mar 18 10:46 CONFIGS
drwx-- 2 root root 16384 Mar 17 11:08 lost+found

/mnt/mgs/CONFIGS:
total 60
-rw-r--r-- 1 root root 12288 Mar 17 11:08 mountdata
-rw-r--r-- 1 root root 11688 Mar 17 11:10 test1-client
-rw-r--r-- 1 root root 12144 Mar 17 11:10 test1-MDT
-rw-r--r-- 1 root root  8880 Mar 18 10:46 test1-OST
-rw-r--r-- 1 root root  8880 Mar 18 10:46 test1-OST0001
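
In practice that amounts to something like the following (the device name is
a placeholder):
# umount /mnt/mgs                       (if it is currently mounted as type lustre)
# mount -t ldiskfs /dev/sdX /mnt/mgs
# tar czf /root/mgt-configs.tgz -C /mnt/mgs CONFIGS
# umount /mnt/mgs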


If it's at combined MGT/MDT, then use the MDT procedure.
cliffw


 
 John White
 High Performance Computing Services (HPCS)
 (510) 486-7307
 One Cyclotron Rd, MS: 50B-3209C
 Lawrence Berkeley National Lab
 Berkeley, CA 94720
 
 
 
 
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] filter_grant_incoming()) LBUG in 1.8.1.1

2010-03-26 Thread Cliff White
Scott Barber wrote:
 Background:
 MDS and OSTs are all running CentOS 5.4 / x86_64 /
 2.6.18-128.7.1.el5_lustre.1.8.1.1
 2 types of clients
  - CentOS 5.4 / x86_64 / 2.6.18-128.7.1.el5_lustre.1.8.1.1
  - Ubuntu 8.04.1 / i686 / 2.6.22.19 patchless
 
 A few days ago one of the OSSs hit an LBUG. The syslog looked like:
 http://pastie.org/887643
 
 I brought it back up by unmounting the OSTs, restarting the machine
 and remounting the OSTs. The OST was just fine after that, but this
 seemed to start a chain-reaction with other OSSs. I'd run into the
 same LBUG and same call trace in the syslog on other OSSs. I kept
 bringing them back up again and an hour later it would happen again -
 interestingly never on the same OSS twice. It finally stopped when I
 unmounted the MDS/MGS, rebooted the MDS server and them remounted it
 again. We had no issues after that until this afternoon :(
 
 In researching the issue it looks as though it is bug #19338 which in
 turn is a duplicate of #20278. It looks as though that bug isn't
 slated for 1.8 at all. Am I reading that right? There's been no
 testing that I could tell of the patch on 1.8.x so I'm leery of trying
 to patch my servers. Is there something else that I can do? Any more
 info you need?
 

Hmm. Not sure why that fix was not landed for 1.8. Looks like we may 
have just missed it. :(  The correct fix is in 20278.
bugzilla.lustre.org/attachment.cgi?id=25139

We'll see about getting it tested/landed. It applies mostly okay to 
b1_8, further news when available.
cliffw

 
 Thanks for your help,
 Scott Barber
 Senior Systems Admin
 iMemories.com
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] filter_grant_incoming()) LBUG in 1.8.1.1

2010-03-26 Thread Cliff White
Scott Barber wrote:
 Background:
 MDS and OSTs are all running CentOS 5.4 / x86_64 /
 2.6.18-128.7.1.el5_lustre.1.8.1.1
 2 types of clients
  - CentOS 5.4 / x86_64 / 2.6.18-128.7.1.el5_lustre.1.8.1.1
  - Ubuntu 8.04.1 / i686 / 2.6.22.19 patchless
 
 A few days ago one of the OSSs hit an LBUG. The syslog looked like:
 http://pastie.org/887643
 
 I brought it back up by unmounting the OSTs, restarting the machine
 and remounting the OSTs. The OST was just fine after that, but this
 seemed to start a chain-reaction with other OSSs. I'd run into the
 same LBUG and same call trace in the syslog on other OSSs. I kept
 bringing them back up again and an hour later it would happen again -
 interestingly never on the same OSS twice. It finally stopped when I
 unmounted the MDS/MGS, rebooted the MDS server and them remounted it
 again. We had no issues after that until this afternoon :(
 
 In researching the issue it looks as though it is bug #19338 which in
 turn is a duplicate of #20278. It looks as though that bug isn't
 slated for 1.8 at all. Am I reading that right? There's been no
 testing that I could tell of the patch on 1.8.x so I'm leery of trying
 to patch my servers. Is there something else that I can do? Any more
 info you need?

I've attached a 1.8.x version of the patch to 20278. Builds fine on 
rhel5.  Further tests are in the queue, but likely to be awhile running.
I've also asked for landings/further inspection, you can follow progress 
in the bug.
cliffw

 
 
 Thanks for your help,
 Scott Barber
 Senior Systems Admin
 iMemories.com
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] How to force client-oss communication over IB when the MDS has only ethernet?

2010-03-26 Thread Cliff White
Tero Hakala wrote:
 Hi,
 our MDS is temporarily missing IB connection and has only eth available. 
 However, OSS and clients have both IB and eth.
 
 At the moment, it seems that all the traffic between clients-OSS goes 
 also through the slow eth connection.  Is it possible to force them to 
 use faster IB interfaces when communication with each other, and only 
 use eth to communicate with the MDS?
 
 Both clients and OSS have interfaces configured as
 options lnet networks=o2ib0(ib0),tcp0(eth0)  (in modprobe.conf) also 
 lctl ping works fine over IB.   The documents seem to suggests that the 
 first interface is preferred, but apparently it is not when MDS is only 
 available through the other.
 

I believe in this case since you are using the tcp NID for the client 
mount, lnet assumes that you are using tcp0, since tcp0 can reach the 
OSS. I think you might be able to use options lnet ip2nets on the 
clients to force the client-OSS connection to use ib0.
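The ip2nets syntax is roughly like this (the address patterns are
placeholders for your own subnets):
options lnet ip2nets="o2ib0(ib0) 10.2.*.*; tcp0(eth0) 192.168.*.*"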

cliffw

   -t
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] programmatic access to parameters

2010-03-25 Thread Cliff White
burlen wrote:
 System limits are sometimes provided in a header, I wasn't sure if 
 Lustre adopted that approach. The llapi_* functions are great, I see how 
 to set the stripe count and size. I wasn't sure if there was also a 
 function to query about the configuration, eg number of OST's deployed?
 
 This would be for use in a global hybrid magnetospheric simulation that 
 runs on a large scale (1E4-1E5 cores). The good striping parameters 
 depend on the run, and could be calculated at run time. It can make a 
 significant difference in our run times to have these set correctly. I 
 am not sure if we always want a stripe count of the maximum. I think 
 this depends on how many files we are synchronously writing, and the 
 number of available OST's total. Eg if there are 256 OST's on some 
 system and we have 2 files to write would it not make sense to set the 
 stripe count to 128?
 
 We can't rely on our user to set the Lustre parameter correctly. We 
 can't rely on the system defaults either, they typically aren't set 
 optimally for our use case. MPI hints look promising but the ADIO Lustre 
 optimization are fairly new,  as far as I understand not publically 
 available in MPICH until next release (maybe in may?). We run on a 
 variety of systems some with variety of MPI implementation (eg Cray, 
 SGI). The MPI hints will only be useful on implementation that support 
 the particular hint. From a consistency point of view we need to both 
 make use of MPI hints and direct access via the llapi so that we run 
 well on all those systems, regardless of which MPI implementation is 
 deployed.  
 

I don't know what your constraints are, but should note that this sort
of information (number of OSTs) can be obtained rather trivially from 
any lustre client via shell prompt, to wit:
# lctl dl |grep OST |wc -l
2
or:
# ls /proc/fs/lustre/osc | grep OST |wc -l
2

probably a few other ways to do that. Not as stylish as llapi_*..

cliffw

 Thanks
 Burlen
 
 
 Andreas Dilger wrote:
 On 2010-03-23, at 14:25, burlen wrote:
 How can one programmatically probe the lustre system an application is
 running on?
 Lustre-specific interfaces are generally llapi_* functions, from 
 liblustreapi.

 At compile time I'd like access to the various lustre system limits ,
 for example those listed in ch.32 of operations manual.
 There are no llapi_* functions for this today.  Can you explain a bit 
 better what you are trying to use this for?

 statfs(2) will tell you a number of limits, as will pathconf(3), and 
 those are standard POSIX APIs.

 Incidentally one I didn't see listed in that chapter is the maximum 
 number of OST's a single file can be striped across.
 That is the first thing listed:

 32.1Maximum Stripe Count
 The maximum number of stripe count is 160. This limit is hard-coded, 
 but is near the upper limit imposed by the underlying ext3 file 
 system. It may be increased in future releases. Under normal 
 circumstances, the stripe count is not affected by ACLs.

 At run time I'd like to be able to probe the size (number of OSS, OST
 etc...) of the system the application is running on.

 One shortcut is to specify -1 for the stripe count will stripe a 
 file across all available OSTs, which is what most applications want, 
 if they are not being striped over only 1 or 2 OSTs.

 If you are using MPIIO, the Lustre ADIO layer can optimize these 
 things for you, based on application hints.

 If you could elaborate on your needs, there may not be any need to 
 make your application more Lustre-aware.

 Cheers, Andreas
 -- 
 Andreas Dilger
 Sr. Staff Engineer, Lustre Group
 Sun Microsystems of Canada, Inc.

 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] add new network links to existing OSS

2010-03-24 Thread Cliff White
Jake Maul wrote:
 Greetings,
 
 We've got a small Lustre network set up, and have rather suddenly run
 into a bottleneck with a single gigabit link to certain OSS's.
 
 http://wiki.lustre.org/manual/LustreManual18_HTML/Bonding.html#50638966_pgfId-1289000
 
 Based on that page in the manual, it sounds like setting up Lustre to
 handle multiple links will give us better performance than setting up
 a traditional bonded ethernet link. My question is... how do we do
 that in an existing setup? Clearly we need to edit modprobe.conf and
 add the new network interface, and reload all the lnet-related
 modules. Does anything need to happen besides that?

That's not generally true. In almost all cases, you are better off using 
the bonded interface.
 
 My concern is that the new link would also have it's own IP. Does
 anything need to be informed of this additional IP (like the
 MGS/MDS)? Any other pitfalls we need to be aware of?

Bunches. Basically you would have to start with the second interface on 
a different subnet, establishing routing, etc. All nodes need to know 
about it. You're really better off with bonding.
cliffw

 
 Thanks,
 Jake
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Monitoring Tools

2010-01-06 Thread Cliff White
Jagga Soorma wrote:
 Hi Guys,
 
 I would like to monitor the performance and usage of my Lustre 
 filesystem and was wondering what are the commonly used monitoring tools 
 for this?  Cacti? Nagios?  Any input would be greatly appreciated.
 
 Regards,
 -Simran
 

LLNL's LMT tool is very good. It's available on Sourceforge, afaik.
cliffw

 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Monitoring Tools

2010-01-06 Thread Cliff White
Jeffrey Bennett wrote:
 Last time I checked, LMT was designed for Lustre 1.4. LLNL stopped 
 development of LMT some time ago. Not sure if LMT will work with Lustre 1.8. 
 If somebody has tried, please let everyone know.
 

Ah, it has moved to Google:
http://code.google.com/p/lmt/

The current release has been tested with Lustre 1.6.6.
So, yup, seems a bit old. But might be worth looking into.
cliffw

 jab
 
 
 -Original Message-
 From: lustre-discuss-boun...@lists.lustre.org 
 [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Cliff White
 Sent: Wednesday, January 06, 2010 11:12 AM
 To: Jagga Soorma
 Cc: lustre-discuss@lists.lustre.org
 Subject: Re: [Lustre-discuss] Lustre Monitoring Tools
 
 Jagga Soorma wrote:
 Hi Guys,

 I would like to monitor the performance and usage of my Lustre 
 filesystem and was wondering what are the commonly used monitoring tools 
 for this?  Cacti? Nagios?  Any input would be greatly appreciated.

 Regards,
 -Simran

 
 LLNL's LMT tool is very good. It's available on Sourceforge, afaik.
 cliffw
 
 

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre and iSCSI

2009-07-31 Thread Cliff White
David Pratt wrote:
 Hi. I am exploring possibilities for pooled storage for virtual 
 machines. Lustre looks quite interesting for both tolerance and speed. I 
 have a couple of basic questions:
 
 1) Can Lustre present an iSCSI target

Lustre doesn't present a target; we use targets, and we should work fine 
with iSCSI. We don't have a lot of iSCSI users, due to performance 
concerns.

 2) I am looking at physical machines with 4 1TB 24x7 drives in each. How 
 many machines will I need to cluster to create a solution with provide a 
 good level of speed and fault tolerance.
 
'It depends' - what is a 'good level of speed' for your app?

Lustre IO scales as you add servers. Basically, if the IO is big enough, 
the client 'sees' the bandwidth of multiple servers.  So, if you know 
the  bandwidth of 1 server (sgp_dd or other raw IO tools helps) then 
your total bandwidth is going to be that figure, times the number of 
servers. This assumes whatever network you have is capable of sinking 
this bandwidth.

So, if you know the IO you need, and you know the IO one server can 
drive, you just divide the one by the other.
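For example, with purely illustrative numbers: if sgp_dd shows one OSS
sustaining ~400 MB/s and the application needs ~3 GB/s aggregate, you would
plan for roughly 3000/400, i.e. about 8 OSS nodes - network permitting.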

Fault tolerance at the disk level == RAID.
Fault tolerance at the server level is done with shared storage 
failover, using linux-ha or other packages.
hope this helps,
cliffw

 Many thanks.
 
 Regards,
 David
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] failover software - heartbeat

2009-07-14 Thread Cliff White
Lundgren, Andrew wrote:
 It is very difficult to find relevant documentation for heartbeat 1/2. I just 
 finished configuring a heartbeat system and would not recommend it because of 
 the documentation.  (They seem to have removed portions the heartbeat 
 documentation from the site.)  
 
 Pacemaker is not a simple solution to configure either. I played briefly with 
 the RH clustering software.  It does not directly support any FS type other 
 than the basic ext2/ext3, and wasn't happy with a lustre type.  
 

That might be simple to fix, if it is script-based. We submitted a patch 
aeons ago to the heartbeat guys to add 'ldiskfs' as a supported FS. As I 
recall, it was a one-line change.
cliffw

 --
 Andrew
 
 -Original Message-
 From: lustre-discuss-boun...@lists.lustre.org [mailto:lustre-discuss-
 boun...@lists.lustre.org] On Behalf Of Carlos Santana
 Sent: Monday, July 13, 2009 11:42 AM
 To: lustre-discuss@lists.lustre.org
 Subject: [Lustre-discuss] failover software - heartbeat

 Howdy,

 The lustre manual recommends heartbeat for handling failover. The
 pacemaker is successor of hearbeat version 2. So whats recommended -
 should we be using pacemaker or stick to hearbeat?

 -
 CS.
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] failover software - heartbeat

2009-07-14 Thread Cliff White
Jim Garlick wrote:
 Hi,
 
 OK I have posted it to https://bugzilla.lustre.org/show_bug.cgi?id=20165
 
   20165: scripts for heartbeat v1 integration
 
 I added example config files from our test cluster.  Probably best to
 redirect questions/comments/criticisms to the bug and I'll respond there.

Looks very good, thanks bunches. I've added a few extras from the 
discussion. Did you guys try ipfail, or only pingd?
cliffw

 
 Jim
 
 
 On Tue, Jul 14, 2009 at 12:26:24PM +1000, Atul Vidwansa wrote:
 Hi Jim,

 It would be great if you can attach the scripts to a Lustre bugzilla bug.

 Cheers,
 _Atul

 Jim Garlick wrote:
 We recently put heartbeat v1 in production and along the way
 developed some admin scripts including heartbeat resource agent compliant
 lustre init scripts, a script to initiate failover/failback and get 
 detailed
 status, a powerman stonith interface, and various safeguards to ensure MMP
 is on, devices are present and usable, etc. before starting lustre.

 If this is of general interest I could post it to a bug for review.

 Jim

 On Mon, Jul 13, 2009 at 01:45:02PM -0600, Lundgren, Andrew wrote:
  
 It is very difficult to find relevant documentation for heartbeat 1/2. I 
 just finished configuring a heartbeat system and would not recommend it 
 because of the documentation.  (They seem to have removed portions the 
 heartbeat documentation from the site.)  
 Pacemaker is not a simple solution to configure either. I played briefly 
 with the RH clustering software.  It does not directly support any FS 
 type other than the basic ext2/ext3, and wasn't happy with a lustre type. 

 --
 Andrew


 -Original Message-
 From: lustre-discuss-boun...@lists.lustre.org [mailto:lustre-discuss-
 boun...@lists.lustre.org] On Behalf Of Carlos Santana
 Sent: Monday, July 13, 2009 11:42 AM
 To: lustre-discuss@lists.lustre.org
 Subject: [Lustre-discuss] failover software - heartbeat

 Howdy,

 The lustre manual recommends heartbeat for handling failover. The
 pacemaker is successor of hearbeat version 2. So whats recommended -
 should we be using pacemaker or stick to hearbeat?

 -
 CS.
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
  
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
  

 -- 
 ==
 Atul Vidwansa
 Sun Microsystems Australia Pty Ltd
 Web: http://blogs.sun.com/atulvid
 Email: atul.vidwa...@sun.com

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre DRBD failover time

2009-07-14 Thread Cliff White
tao.a...@nokia.com wrote:
  
 Hi, all,
  
 I am evaluating Lustre with DRBD failover, and experiencing about 2 
 minutes of OSS failover time to switch to the secondary node.  Has 
 anyone had a similar observation (so that we can conclude this should 
 be expected), or are there some parameters that I should tune to 
 reduce that time?
  
 I have a simple setup: the MDS and OSS0 are hosted on server1, and OSS1 
 are hosted on server2.  OSS0 and OSS1 are the primary nodes for OST0 and 
 OST1, respectively, and the OSTs are replicated using DRBD (protocol C) 
 to the other machine.  The two OSTs are about 73GB each.  I am running 
 Lustre 1.6 + DRBD 8 + Heartbeat v2 (but using v1 configuration).
  
  From the HA logs, it looks like Heartbeat noticed a node was down within 10 
 seconds (which is consistent with the deadtime of 6 seconds).  Where does 
 the secondary node spend the remaining 100-110 seconds?  There was a 
 post 
 (http://groups.google.com/group/lustre-discuss-list/msg/bbbeac047df678ca?dmode=source)
  
 attributing MDS failover time to fsck.  Does it also cause my problem?

As Brian mentioned, Lustre servers go through a recovery process.
You need to examine system logs on the OSS - if Lustre is in recovery, 
there will be messages in the logs explaining this.

cliffw



 Thanks,
  
 -Tao
  
  
  
  
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Out of Memory on MDS

2009-06-23 Thread Cliff White
Roger Spellman wrote:
 I have an MDS that is crashing with out-of-memory.
 
  
 
 Prior to the crash, I started collecting /proc/slabinfo.  I see that 
 ldlm_locks is up to 4,500,000, and each one is 512 bytes, for a total of 
 2.2GB, which is more than half my RAM.
 
  
 
 Is there a way to limit this?

You don't mention the version of Lustre - lru_size might have an impact, 
I am not certain. I believe it is the only lock tunable of note (and 
it is auto-sized in recent Lustre).

cliffw

 
  
 
 Other heavy memory users are ldisk_inode_cache (421 MB) and 
 ldlm_resources (137 MB).  Is there a way to limit these too?
 
  
 
 Thanks.
 
  
 
 Roger Spellman
 
 Staff Engineer
 
 Terascala, Inc.
 
 508-588-1501
 
 www.terascala.com http://www.terascala.com/
 
  
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] a simple question

2009-06-19 Thread Cliff White
Onane wrote:
 Hello,
 After installing lustre, how can I test it quickliy if it is installed 
 correctly ?
 

# modprobe -v lustre

If the modules load without error this is good.

# lctl network up
# lctl list_nids

This shows you that LNET can run.

Beyond that, create a filesystem. Examples at manual.lustre.org.

One thing we do a lot is create a quick scratch filesystem using loopback 
devices for OST and MDS/MGS. You can do that on one node. See llmount.sh 
in the lustre-source package for an example.
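Roughly (the path depends on where you unpacked the source; this is the usual
layout):
# cd lustre/tests
# sh llmount.sh          (sets up loopback MDS/MGS + OSTs and mounts a client)
# sh llmountcleanup.sh   (tears it all down again)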

cliffw

 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-17 Thread Cliff White
Carlos Santana wrote:
 Thanks Cliff.
 
 The depmod -a was successful before as well. I am using CentOS 5.2
 box. Following are the packages installed:
 [r...@localhost tmp]# rpm -qa | grep -i lustre
 lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
 lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp

Those are server modules. You would need to add lustre-kernel-smp for 
that to work

For a client, you install the matching vendor kernel, then:
lustre-client-modules
lustre-client

For a server, you need
lustre-kernel-smp
lustre-modules
lustre-
ldiskfs-

And as others have mentioned in this thread, kernel version must match 
exactly. Check /lib/modules - if you have a mis-match, there will be an 
extra directory there.
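A quick way to spot a mismatch:
# uname -r
# ls /lib/modules/
# rpm -qa | grep -i lustre
The running kernel from uname -r should match the kernel version embedded in
the lustre-client-modules package name.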

cliffw

 
 [r...@localhost tmp]# uname -a
 Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
 EDT 2008 i686 i686 i386 GNU/Linux
 
 And here is a output from strace for mount: http://www.heypasteit.com/clip/8WT
 
 Any further debugging hints?
 
 Thanks,
 CS.
 
 On 6/16/09, Cliff White cliff.wh...@sun.com wrote:
 Carlos Santana wrote:
 The '$ modprobe -l lustre*' did not show any module on a patchless
 client. modprobe -v returns 'FATAL: Module lustre not found'.

 How do I install a patchless client?
 I have tried lustre-client-modules and lustre-client-ver rpm packages in
 both sequences. Am I missing anything?

 Make sure the lustre-client-modules package matches your running kernel.
 Run depmod -a to be sure
 cliffw

 Thanks,
 CS.



 On Tue, Jun 16, 2009 at 2:28 PM, Cliff White cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com wrote:

 Carlos Santana wrote:

 The lctlt ping and 'net up' failed with the following messages:
 --- ---
 [r...@localhost ~]# lctl ping 10.0.0.42
 opening /dev/lnet failed: No such device
 hint: the kernel modules may not be loaded
 failed to ping 10.0.0...@tcp: No such device

 [r...@localhost ~]# lctl network up
 opening /dev/lnet failed: No such device
 hint: the kernel modules may not be loaded
 LNET configure error 19: No such device


 Make sure modules are unloaded, then try modprobe -v.
 Looks like you have lnet mis-configured, if your module options are
 wrong, you will see an error during the modprobe.
 cliffw

 --- ---


 I tried lustre_rmmod and depmod commands and it did not return
 any error messages. Any further clues? Reinstall patchless
 client again?

 -
 CS.


 On Tue, Jun 16, 2009 at 1:32 PM, Cliff White
 cliff.wh...@sun.com mailto:cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com wrote:

Carlos Santana wrote:

I was able to run lustre_rmmod and depmod successfully. The
'$lctl list_nids' returned the server ip address and
 interface
(tcp0).

I tried to mount the file system on a remote client, but it
failed with the following message.
--- ---
[r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre
/mnt/lustre
mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre
failed: No such device
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
Note 'alias lustre llite' should be removed from
 modprobe.conf
--- ---

However, the mounting is successful on a single node
configuration - with client on the same machine as MDS
 and OST.
Any clues? Where to look for logs and debug messages?


Syslog || /var/log/messages is the normal place.

You can use 'lctl ping' to verify that the client can reach
 the server.
Usually in these cases, it's a network/name misconfiguration.

Run 'tunefs.lustre --print' on your servers, and verify that
 mgsnode=
is correct.

cliffw


Thanks,
CS.





On Tue, Jun 16, 2009 at 12:16 PM, Cliff White
cliff.wh...@sun.com mailto:cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com
mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com wrote:

   Carlos Santana wrote:

   Thanks Kevin..

   Please read:


 http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529

   Those instructions are identical for 1.6 and 1.8.

   For current lustre, only two commands are used for
 configuration.
   mkfs.lustre and mount.


   Usually when lustre_rmmod returns that error, you run

Re: [Lustre-discuss] missing ost's?

2009-06-17 Thread Cliff White
Michael Di Domenico wrote:
 On Tue, Jun 16, 2009 at 8:25 PM, Michael Di
 Domenicomdidomeni...@gmail.com wrote:
 I have a small lustre test cluster with eight OST's running.  The
 servers were shut off over the weekend; upon turning them back on and
 trying to start up lustre, I seem to have lost my OST's.

 [r...@node1 ~]$ lctl dl
  0 UP mgs MGS MGS 19
  1 UP mgc mgc192.168.1@tcp 8acd9bf1-d1ca-8e26-1fad-bd2cf88a2957 5
  2 UP mdt MDS MDS_uuid 3
  3 UP lov lustre-mdtlov lustre-mdtlov_UUID 4
  4 UP mds lustre-MDT lustre-MDT_UUID 3
  5 UP ost OSS OSS_uuid 3
  6 UP obdfilter lustre-OST lustre-OST_UUID 3

 Everything in the messages log appears to be fine as if it was just a
 normal startup of lustre, except for the below message.  I'm not sure
 what logfile the error is referring to, and the message gives little
 detail on where i should start looking for an error.

 Jun 16 20:13:55 node1-eth0 kernel: LustreError:
 3106:0:(llog_lvfs.c:577:llog_filp_open()) logfile creation
 CONFIGS/lustre-MDTT: -28
 Jun 16 20:13:55 node1-eth0 kernel: LustreError:
 3106:0:(mgc_request.c:1086:mgc_copy_llog()) Failed to copy remote log
 lustre-MDT (-28)
 
 Apparently from the lustre manual the -28 at the end of the line is an
 error code, which points to
 
 -28 -ENOSPC The file system is out-of-space or out of inodes. Use lfs df
 (query the amount of file system space) or lfs df -i
 (query the number of inodes).
 
 verified by
 
 [r...@node1 ~]$ df -i
 FilesystemInodes   IUsed   IFree IUse% Mounted on
 /dev/md2 128   42132 12378684% /
 /dev/md0  255232  45  2551871% /boot
 tmpfs 124645   1  1246441% /dev/shm
 /dev/md3   63872  24   638481% /mgs
 /dev/md4  255040  255040   0  100% /mdt
 /dev/md5 29892608   28726 298638821% /ost
 
 I only put 500k files in the filesystem i would not have thought the
 mdt would have used up the inodes that fast

The MDT will consume one inode for each file in the global Lustre file 
system. You have plenty of OST space, but no free MDT inodes.

You have 255K inodes on the MDS, but you are trying to create 500k files.

cliffw
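
A minimal sketch of checking and fixing this; the device name and mount point come from the df output above, the fsname comes from the lctl dl output, and the MGS NID and inode ratio are illustrative placeholders. Reformatting the MDT destroys its contents, so this only makes sense on a test filesystem (otherwise, use a larger MDT device):

# lfs df -i                                  (from a client: per-target inode usage)
# umount /mdt                                (on the MDS)
# mkfs.lustre --reformat --fsname=lustre --mdt --mgsnode=<mgs-nid> \
    --mkfsoptions="-i 4096" /dev/md4         (denser inode ratio: one inode per 4 KB)
# mount -t lustre /dev/md4 /mdt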

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-17 Thread Cliff White
Arden Wiebe wrote:
 Cliff:
 
 I have some questions about the client packages.  I am not sure why the 
 roadmap or lustre users require separate client packages but stating the 
 obvious some people must need separate client packages is that correct?  

The key here is the 'patchless client'. Yes, any machine with Lustre server 
bits installed can be a client. Not long ago, there was only one 
installation for Lustre; everybody got the same bits.

And the Lustre design re-uses things. Note that any Lustre node 
connecting to a service has a 'client' - for example the OSS is a 
'client' of the MDS, and the MDS a 'client' of the OSS.

The 'patchless client' was created to allow users to run Lustre with a 
stock vendor/distro kernel. This removes a lot of support/installation 
issues - servers can be considered 'Lustre-only' devices, but clients 
typically have other goop installed. Allowing users to use a stock 
distro kernel simplifies their support relationship with their other 
vendors.
 
 Otherwise the server packages contain the client anyhow correct?  If the 
 later are the client packages for linux somewhat redundant?  

Yes, the client packages are somewhat redundant, if you don't mind a 
Lustre-patched kernel on your clients.


When will the real client .exe for windows become available?

No idea, see the roadmap.
cliffw
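
For completeness, a rough sketch of what a patchless client install looks like; the package names are abbreviated placeholders, and the exact versions must match your stock distro kernel:

# uname -r                                   (note the stock distro kernel version)
# rpm -ivh lustre-client-modules-<ver> lustre-client-<ver>
# depmod -a
# modprobe lustre
# lctl list_nids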

 
 Arden
 
 --- On Wed, 6/17/09, Sheila Barthel sheila.bart...@sun.com wrote:
 
 From: Sheila Barthel sheila.bart...@sun.com
 Subject: Re: [Lustre-discuss] Lustre installation and configuration problems
 To: Carlos Santana neu...@gmail.com
 Cc: Cliff White cliff.wh...@sun.com, lustre-discuss@lists.lustre.org
 Date: Wednesday, June 17, 2009, 1:08 PM
 Carlos -

 The installation procedures for Lustre 1.6 and 1.8 are the
 same. The manual's installation procedure includes a table
 that shows which packages to install on servers and clients
 (I've attached a PDF of the table). The procedure also
 describes the installation order for packages (kernel,
 modules, ldiskfs, then utilities/userspace, then
 e2fsprogs).

 http://manual.lustre.org/manual/LustreManual16_HTML/LustreInstallation.html#50401389_pgfId-1291574

 Sheila

 Cliff White wrote:
 Carlos Santana wrote:

 Huh... :( Sorry to bug you guys again...

 I am planning to make a fresh start now as nothing
 seems to have worked for me. If you have any
 comments/feedback please share them.
 I would like to confirm installation order before
 I make a fresh start.  From Arden's experience: 
 http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html
 , the lusre-module is installed last. As I was installing
 Lustre 1.8, I was referring 1.8 operations manual 
 http://manual.lustre.org/index.php?title=Main_Page .
 The installation order in the manual is different than what
 Arden has suggested.
 Will it make a difference in configuration at
 later stage? Which one should I follow now?
 Any comments?
  
 RPM installation order really doesn't matter. If you
 install in the 'wrong' order you will get a lot of warnings
 from RPM due to the relationship of the various RPMs. But
 these are harmless - whatever order you install in, it
 should work fine.
 cliffw

 Thanks,
 CS.


 On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana
 neu...@gmail.com
 mailto:neu...@gmail.com
 wrote:
  Thanks Cliff.

  The depmod -a was
 successful before as well. I am using CentOS 5.2
  box. Following are the
 packages installed:
  [r...@localhost tmp]# rpm
 -qa | grep -i lustre
  
lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
  
lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
  [r...@localhost tmp]#
 uname -a
  Linux
 localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10
 18:49:47
  EDT 2008 i686 i686 i386
 GNU/Linux
  And here is a output from
 strace for mount:
  http://www.heypasteit.com/clip/8WT

  Any further debugging
 hints?
  Thanks,
  CS.

  On 6/16/09, Cliff White
 cliff.wh...@sun.com
  mailto:cliff.wh...@sun.com
 wrote:
Carlos Santana wrote:
The '$ modprobe -l
 lustre*' did not show any module on a patchless
client. modprobe -v
 returns 'FATAL: Module lustre not found'.
   
How do I install a
 patchless client?
I have tried
 lustre-client-modules and lustre-client-ver rpm
  packages in
both sequences. Am I
 missing anything?
   
   
Make sure the
 lustre-client-modules package matches your running
  kernel.
Run depmod -a to be sure
cliffw
   
Thanks,
CS.
   
   
   
On Tue, Jun 16, 2009
 at 2:28 PM, Cliff White
  cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com
mailto:cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com
 wrote:
   

Carlos Santana wrote:
   

The lctlt ping and 'net up' failed with
 the following
  messages:

--- ---

[r...@localhost

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-17 Thread Cliff White
 -a was successful before as well. I am using CentOS 5.2
 box. Following are the packages installed:
 [r...@localhost tmp]# rpm -qa | grep -i lustre
 lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp

 lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp

 [r...@localhost tmp]# uname -a

 Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
 EDT 2008 i686 i686 i386 GNU/Linux

 And here is a output from strace for mount:
 http://www.heypasteit.com/clip/8WT

 Any further debugging hints?

 Thanks,
 CS.

 On 6/16/09, Cliff White cliff.wh...@sun.com wrote:
 Carlos Santana wrote:
 The '$ modprobe -l lustre*' did not show any module on a patchless
 client. modprobe -v returns 'FATAL: Module lustre not found'.

 How do I install a patchless client?
 I have tried lustre-client-modules and lustre-client-ver rpm packages in
 both sequences. Am I missing anything?

 Make sure the lustre-client-modules package matches your running kernel.
 Run depmod -a to be sure
 cliffw

 Thanks,
 CS.



 On Tue, Jun 16, 2009 at 2:28 PM, Cliff White cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com wrote:

 Carlos Santana wrote:

 The lctlt ping and 'net up' failed with the following messages:
 --- ---
 [r...@localhost ~]# lctl ping 10.0.0.42
 opening /dev/lnet failed: No such device
 hint: the kernel modules may not be loaded
 failed to ping 10.0.0...@tcp: No such device

 [r...@localhost ~]# lctl network up
 opening /dev/lnet failed: No such device
 hint: the kernel modules may not be loaded
 LNET configure error 19: No such device


 Make sure modules are unloaded, then try modprobe -v.
 Looks like you have lnet mis-configured, if your module options are
 wrong, you will see an error during the modprobe.
 cliffw

 --- ---


 I tried lustre_rmmod and depmod commands and it did not return
 any error messages. Any further clues? Reinstall patchless
 client again?

 -
 CS.


 On Tue, Jun 16, 2009 at 1:32 PM, Cliff White
 cliff.wh...@sun.com mailto:cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com wrote:

Carlos Santana wrote:

I was able to run lustre_rmmod and depmod successfully.
 The
'$lctl list_nids' returned the server ip address and
 interface
(tcp0).

I tried to mount the file system on a remote client, but
 it
failed with the following message.
--- ---
[r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre
/mnt/lustre
mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre
failed: No such device
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
Note 'alias lustre llite' should be removed from
 modprobe.conf
--- ---

However, the mounting is successful on a single node
configuration - with client on the same machine as MDS
 and OST.
Any clues? Where to look for logs and debug messages?


Syslog || /var/log/messages is the normal place.

You can use 'lctl ping' to verify that the client can reach
 the server.
Usually in these cases, it's a network/name misconfiguration.

Run 'tunefs.lustre --print' on your servers, and verify that
 mgsnode=
is correct.

cliffw


Thanks,
CS.





On Tue, Jun 16, 2009 at 12:16 PM, Cliff White
cliff.wh...@sun.com mailto:cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com
mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com
 wrote:

   Carlos Santana wrote:

   Thanks Kevin..

   Please read:



 http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529

   Those instructions are identical for 1.6 and 1.8.

   For current lustre, only two commands are used for
 configuration.
   mkfs.lustre and mount.


   Usually when lustre_rmmod returns that error, you run
 it a second
   time, and it will clear things. Unless you have live
 mounts or
   network connections.

   cliffw


   I am referring to 1.8 manual, but I was also
 referring to
HowTo
   page on wiki which seems to be for 1.6. The HowTo
 page



 http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
   mentions

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Cliff White
Carlos Santana wrote:
 Thanks Kevin..
 
Please read:
http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529

Those instructions are identical for 1.6 and 1.8.

For current Lustre, only two commands are used for configuration: 
mkfs.lustre and mount.


Usually when lustre_rmmod returns that error, you run it a second time, 
and it will clear things. Unless you have live mounts or network 
connections.

cliffw
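
As a rough illustration of those two commands; the fsname, device names and MGS NID below are placeholders, not taken from this thread:

# mkfs.lustre --fsname=testfs --mgs --mdt /dev/sda        (on the MDS/MGS node)
# mount -t lustre /dev/sda /mnt/mdt
# mkfs.lustre --fsname=testfs --ost --mgsnode=<mgs-nid> /dev/sdb   (on each OSS)
# mount -t lustre /dev/sdb /mnt/ost0
# mount -t lustre <mgs-nid>:/testfs /mnt/testfs           (on a client)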


 I am referring to 1.8 manual, but I was also referring to HowTo page on 
 wiki which seems to be for 1.6. The HowTo page 
 http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
  
 mentions abt lmc, lconf, and lctl.
 
 The modules are installed in the right place. The '$ lustre_rmmod' 
 resulted in following o/p:
 [r...@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
 ERROR: Module obdfilter is in use
 ERROR: Module ost is in use
 ERROR: Module mds is in use
 ERROR: Module fsfilt_ldiskfs is in use
 ERROR: Module mgs is in use
 ERROR: Module mgc is in use by mgs
 ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
 ERROR: Module lov is in use
 ERROR: Module lquota is in use by obdfilter,mds
 ERROR: Module osc is in use
 ERROR: Module ksocklnd is in use
 ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
 ERROR: Module obdclass is in use by 
 obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
 ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
 ERROR: Module lvfs is in use by 
 obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
 ERROR: Module libcfs is in use by 
 obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
 
 Do I need to shutdown these services? How can I do that?
 
 Thanks,
 CS.
 
 
 On Tue, Jun 16, 2009 at 11:36 AM, Kevin Van Maren 
 kevin.vanma...@sun.com mailto:kevin.vanma...@sun.com wrote:
 
 I think lconf and lmc went away with Lustre 1.6.  Are you sure you
 are looking at the 1.8 manual, and not directions for 1.4?
 
 /usr/sbin/lctl should be in the lustre-version RPM.  Do a:
 # rpm -q -l lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
 
 
 Do make sure the modules are installed in the right place:
 # cd /lib/modules/`uname -r`
 # find . | grep lustre.ko
 
 If it shows up, then do:
 # lustre_rmmod
 # depmod
 and try again.
 
 Otherwise, figure out where your modules are installed:
 # uname -r
 # cd /lib/modules
 # find . | grep lustre.ko
 
 
 You can also double-check the NID.  On the MSD server, do
 # lctl list_nids
 
 Should show 10.0.0...@tcp0
 
 Kevin
 
 
 Carlos Santana wrote:
 
 Thanks for the update Sheila. I am using manual for Lustre 1.8
 (May-09).
 
 Arden, as per the 1.8 manual:
 --- ---
 Install the kernel, modules and ldiskfs packages.
 Use the rpm -ivh command to install the kernel, module and ldiskfs
 packages. For example:
 $ rpm -ivh kernel-lustre-smp-ver \
 kernel-ib-ver \
 lustre-modules-ver \
 lustre-ldiskfs-ver
 c. Install the utilities/userspace packages.
 Use the rpm -ivh command to install the utilities packages. For
 example:
 $ rpm -ivh lustre-ver
 d. Install the e2fsprogs package.
 Use the rpm -i command to install the e2fsprogs package. For
 example:
 $ rpm -i e2fsprogs-ver
 If you want to add any optional packages to your Lustre file
 system, install them
 now.
 4. Verify that the boot loader (grub.conf or lilo.conf) has
 --- ---
 I followed the same order.
 
 
 The lconf and lmc are not available on my system. I am not sure
 what are they and when will I need it. I continued to explore
 other things in lustre and have created MDS and OST mount points
 on the same system. I have installed lustre client on a separate
 machine and when I tried to mount lustre MGS on it, I received
 following error:
 
 --- ---
 [r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre
 /mnt/lustre
 mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre
 failed: No such device
 Are the lustre modules loaded?
 Check /etc/modprobe.conf and /proc/filesystems
 Note 'alias lustre llite' should be removed from modprobe.conf
 --- ---
 
 
 The modprobe on client says, 'module lustre not found'. Any clues?
 
 Client: Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun
 10 18:49:47 EDT 2008 i686 i686 i386 GNU/Linux
 MDS/OST: Linux localhost.localdomain
 2.6.18-92.1.17.el5_lustre.1.8.0smp #1 SMP Wed Feb 18 18:40:54
 MST 2009 i686 i686 i386 GNU/Linux
 
 Thanks,
 CS.
 
 
 
 On Mon, Jun 15, 2009 at 5:16 PM, Arden Wiebe
 

Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Cliff White
Carlos Santana wrote:
 I was able to run lustre_rmmod and depmod successfully. The '$lctl 
 list_nids' returned the server ip address and interface (tcp0).
 
 I tried to mount the file system on a remote client, but it failed with 
 the following message.
 --- ---
 [r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre /mnt/lustre
 mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre failed: No 
 such device
 Are the lustre modules loaded?
 Check /etc/modprobe.conf and /proc/filesystems
 Note 'alias lustre llite' should be removed from modprobe.conf
 --- ---
 
 However, the mounting is successful on a single node configuration - 
 with client on the same machine as MDS and OST.
 Any clues? Where to look for logs and debug messages?

Syslog || /var/log/messages is the normal place.

You can use 'lctl ping' to verify that the client can reach the server.
Usually in these cases, it's a network/name misconfiguration.

Run 'tunefs.lustre --print' on your servers, and verify that mgsnode=
is correct.

cliffw
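
For example, using the NID from this thread (the device name is a placeholder):

# lctl ping 10.0.0.42@tcp0                   (from the client)
# tunefs.lustre --print /dev/<mdt-or-ost-device>   (on each server; check the mgsnode= parameter)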

 
 Thanks,
 CS.
 
 
 
 
 On Tue, Jun 16, 2009 at 12:16 PM, Cliff White cliff.wh...@sun.com 
 mailto:cliff.wh...@sun.com wrote:
 
 Carlos Santana wrote:
 
 Thanks Kevin..
 
 Please read:
 
 http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
 
 Those instructions are identical for 1.6 and 1.8.
 
 For current lustre, only two commands are used for configuration.
 mkfs.lustre and mount.
 
 
 Usually when lustre_rmmod returns that error, you run it a second
 time, and it will clear things. Unless you have live mounts or
 network connections.
 
 cliffw
 
 
 I am referring to 1.8 manual, but I was also referring to HowTo
 page on wiki which seems to be for 1.6. The HowTo page
 
 http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
 mentions abt lmc, lconf, and lctl.
 
 The modules are installed in the right place. The '$
 lustre_rmmod' resulted in following o/p:
 [r...@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
 ERROR: Module obdfilter is in use
 ERROR: Module ost is in use
 ERROR: Module mds is in use
 ERROR: Module fsfilt_ldiskfs is in use
 ERROR: Module mgs is in use
 ERROR: Module mgc is in use by mgs
 ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
 ERROR: Module lov is in use
 ERROR: Module lquota is in use by obdfilter,mds
 ERROR: Module osc is in use
 ERROR: Module ksocklnd is in use
 ERROR: Module ptlrpc is in use by
 obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
 ERROR: Module obdclass is in use by
 obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
 ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
 ERROR: Module lvfs is in use by
 
 obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
 ERROR: Module libcfs is in use by
 
 obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
 
 Do I need to shutdown these services? How can I do that?
 
 Thanks,
 CS.
 
 
 On Tue, Jun 16, 2009 at 11:36 AM, Kevin Van Maren
 kevin.vanma...@sun.com mailto:kevin.vanma...@sun.com
 mailto:kevin.vanma...@sun.com mailto:kevin.vanma...@sun.com
 wrote:
 
I think lconf and lmc went away with Lustre 1.6.  Are you
 sure you
are looking at the 1.8 manual, and not directions for 1.4?
 
/usr/sbin/lctl should be in the lustre-version RPM.  Do a:
# rpm -q -l lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
 
 
Do make sure the modules are installed in the right place:
# cd /lib/modules/`uname -r`
# find . | grep lustre.ko
 
If it shows up, then do:
# lustre_rmmod
# depmod
and try again.
 
Otherwise, figure out where your modules are installed:
# uname -r
# cd /lib/modules
# find . | grep lustre.ko
 
 
You can also double-check the NID.  On the MSD server, do
# lctl list_nids
 
Should show 10.0.0...@tcp0
 
Kevin
 
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre installation and configuration problems

2009-06-16 Thread Cliff White
Carlos Santana wrote:
 The lctlt ping and 'net up' failed with the following messages:
 --- ---
 [r...@localhost ~]# lctl ping 10.0.0.42
 opening /dev/lnet failed: No such device
 hint: the kernel modules may not be loaded
 failed to ping 10.0.0...@tcp: No such device
 
 [r...@localhost ~]# lctl network up
 opening /dev/lnet failed: No such device
 hint: the kernel modules may not be loaded
 LNET configure error 19: No such device

Make sure the modules are unloaded, then try modprobe -v. It looks like 
you have LNET mis-configured; if your module options are wrong, you will 
see an error during the modprobe.
cliffw
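
A minimal check sequence along those lines, assuming a tcp0 setup like the one in this thread:

# lustre_rmmod                               (unload any half-loaded modules)
# modprobe -v lustre                         (watch for option or symbol errors)
# lctl network up
# lctl list_nids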

 --- ---
 
 I tried lustre_rmmod and depmod commands and it did not return any error 
 messages. Any further clues? Reinstall patchless client again?
 
 -
 CS.
 
 
 On Tue, Jun 16, 2009 at 1:32 PM, Cliff White cliff.wh...@sun.com 
 mailto:cliff.wh...@sun.com wrote:
 
 Carlos Santana wrote:
 
 I was able to run lustre_rmmod and depmod successfully. The
 '$lctl list_nids' returned the server ip address and interface
 (tcp0).
 
 I tried to mount the file system on a remote client, but it
 failed with the following message.
 --- ---
 [r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre
 /mnt/lustre
 mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre
 failed: No such device
 Are the lustre modules loaded?
 Check /etc/modprobe.conf and /proc/filesystems
 Note 'alias lustre llite' should be removed from modprobe.conf
 --- ---
 
 However, the mounting is successful on a single node
 configuration - with client on the same machine as MDS and OST.
 Any clues? Where to look for logs and debug messages?
 
 
 Syslog || /var/log/messages is the normal place.
 
 You can use 'lctl ping' to verify that the client can reach the server.
 Usually in these cases, it's a network/name misconfiguration.
 
 Run 'tunefs.lustre --print' on your servers, and verify that mgsnode=
 is correct.
 
 cliffw
 
 
 Thanks,
 CS.
 
 
 
 
 
 On Tue, Jun 16, 2009 at 12:16 PM, Cliff White
 cliff.wh...@sun.com mailto:cliff.wh...@sun.com
 mailto:cliff.wh...@sun.com mailto:cliff.wh...@sun.com wrote:
 
Carlos Santana wrote:
 
Thanks Kevin..
 
Please read:
  
  
 http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
 
Those instructions are identical for 1.6 and 1.8.
 
For current lustre, only two commands are used for configuration.
mkfs.lustre and mount.
 
 
Usually when lustre_rmmod returns that error, you run it a second
time, and it will clear things. Unless you have live mounts or
network connections.
 
cliffw
 
 
I am referring to 1.8 manual, but I was also referring to
 HowTo
page on wiki which seems to be for 1.6. The HowTo page
  
  
 http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
mentions abt lmc, lconf, and lctl.
 
The modules are installed in the right place. The '$
lustre_rmmod' resulted in following o/p:
[r...@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]#
 lustre_rmmod
ERROR: Module obdfilter is in use
ERROR: Module ost is in use
ERROR: Module mds is in use
ERROR: Module fsfilt_ldiskfs is in use
ERROR: Module mgs is in use
ERROR: Module mgc is in use by mgs
ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
ERROR: Module lov is in use
ERROR: Module lquota is in use by obdfilter,mds
ERROR: Module osc is in use
ERROR: Module ksocklnd is in use
ERROR: Module ptlrpc is in use by
obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
ERROR: Module obdclass is in use by
  
  obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
ERROR: Module lvfs is in use by
  
  
 obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
ERROR: Module libcfs is in use by
  
  
 obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
 
Do I need to shutdown these services? How can I do that?
 
Thanks,
CS.
 
 
On Tue, Jun 16, 2009 at 11:36 AM, Kevin Van Maren
kevin.vanma...@sun.com mailto:kevin.vanma...@sun.com
 mailto:kevin.vanma...@sun.com mailto:kevin.vanma...@sun.com

Re: [Lustre-discuss] Lustre 2.0* CMD doc/info?

2009-06-08 Thread Cliff White
Tom.Wang wrote:
 Hi
 
 CMD evaluation will be available on lustre 2.0 alpha-5.0.
 There are no more information yet except Wiki.

And btw, we always welcome any contributions to lustre documentation.
Use the wiki, and/or submit a bug against the Documentation product on 
bugzilla.lustre.org.

cliffw

 
 Thanks
 WangDi
 
 Then
 
 Josephine Palencia wrote:
 Hi,

 I am looking for more information for setting up Clustered Metadata 
 (CMD) for lustre-2.0.* aside from what's on the wiki.

 Ps direct me to the proper link/contact?
 If there's no documentation, I could help with that.

 Thank you,
 josephin
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
   
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] ext3 tuning on a lustre filesystem

2009-04-28 Thread Cliff White
Nick Jennings wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 Hello Everyone,
 
  I was wondering if certain ext3 tweaks can be applied to a lustre
 filesystem? Things like:
 
 - - reclaiming some of the reserved 5% space for non-root filesystems

This one gets done quite often - 5% is a lot when partitions are terabytes.
 
 - - disabling automatic boot-time checks after X number of boots and
 having a quick check done during each boot)

Again, with very large disks this may be desirable. Depends on your 
desire for a quick boot.
 
 - - enabling full journaling

Not sure what you mean here. We've seen some hardware configs benefit 
from external journaling; this eliminates a lot of head movement in some 
cases.
 
 I've done these things with straight up ext3 filesystems and been
 generally happy with the tweaks. Wondering if I should even go there
 with Lustre, is it worth the trouble it might cause?

I doubt any of the above would cause you 'trouble', especially reclaiming 
the root reservation.
cliffw
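
For reference, a sketch of how these are typically applied to an unmounted ldiskfs target; the device name is a placeholder, and the tune2fs options are the same ones used on plain ext3:

# tune2fs -m 1 /dev/<ost-device>             (shrink the root reservation to 1%)
# tune2fs -c 0 -i 0 /dev/<ost-device>        (disable mount-count and interval-based fsck)
# tune2fs -l /dev/<ost-device>               (verify the new settings)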

 
 Thanks!
 - -Nick
 
 - --
 Nick Jennings
 Technical Director
 Creative Motion Design
 www.creativemotiondesign.com
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v2.0.9 (GNU/Linux)
 Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org
 
 iEYEARECAAYFAkn3NecACgkQbqosUH1Nr8dczACgsP4eA0nHWRD69tRQwJzWYiiE
 H1AAoOHbJJjYwC/5gN4fY86zD0ygjbYc
 =jfmH
 -END PGP SIGNATURE-
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Random access is not improving

2009-04-06 Thread Cliff White
set...@gmail.com wrote:
 Does Lustre increase random access performance?  I would like to know
 this becauseI have a large random access file (a hash table).  I have
 striped this file across multiple OSTs.  The file is 24 gigabytes, and
 the stripe size was 1gig across 10 OSTs.  I also tried a stripe size
 of 100megabytes.  Both stripe sizes did not seem to improve random
 access performance.  Am I doing something wrong?
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

I may be wrong, but I would think performance of a single random access
would still be mostly limited by disk seek times, etc.

Lustre should do better with multiple random queries, since they should 
be spread across multiple disk spindles. Less chance of two queries 
contending for the same spindle.

But there is nothing we do that will make a single disk access any 
faster, afaik. If you are jumping about randomly, it's going to be up to 
the disk heads.

Changing the stripe size won't do anything here. If you are doing 
multiple random queries, increasing the number of OSTs would spread the 
load out.

This is a case where a future feature (OST cache) might help.

cliffw
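
If you do go the route of spreading multiple random readers over more spindles, a hedged example follows; the directory and file names are illustrative, and the positional lfs setstripe syntax is the older 1.6-era form (newer releases also accept option flags):

# lfs setstripe /mnt/lustre/hashdir 0 -1 -1  (default size, any start OST, stripe over all OSTs)
# lfs getstripe /mnt/lustre/hashdir/table.dat
(newer lfs versions: lfs setstripe -c -1 /mnt/lustre/hashdir)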
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] OSS Cache Size for read optimization

2009-04-03 Thread Cliff White
Jordan Mendler wrote:
 Hi all,
 
 I deployed Lustre on some legacy hardware and as a result my (4) OSS's 
 each have 32GB of RAM. Our workflow is such that we are frequently 
 rereading the same 15GB indexes over and over again from Lustre (they 
 are striped across all OSS's) by all nodes on our cluster. As such, is 
 there any way to increase the amount of memory that either Lustre or the 
 Linux kernel uses to cache files read from disk by the OSS's? This would 
 allow much of the indexes to be served from memory on the OSS's rather 
 than disk.
 
 I see a /lustre.memused_max = 48140176/ parameter, but not sure what 
 that does. If it matters, my setup is such that each of the 4 OSS's 
 serves 1 OST that consists of a software RAID10 across 4 SATA disks 
 internal to that OSS.
 
 Any other suggestions for tuning for fast reads of large files would 
 also be greatly appreciated.
 

Current Lustre does not cache on OSTs at all. All IO is direct.
Future Lustre releases will provide an OST cache.

For now, you can increase the amount of data cached on clients, which
might help a little. Client caching is set with 
/proc/fs/lustre/osc/*/max_dirty_mb.

cliffw
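
For example, on a client (the value here is only illustrative):

# cat /proc/fs/lustre/osc/*/max_dirty_mb
# for f in /proc/fs/lustre/osc/*/max_dirty_mb; do echo 256 > $f; done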

 Thanks so much,
 Jordan
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Adding OSTs problem

2009-03-16 Thread Cliff White
Mag Gam wrote:
 I have added 2 volumes onto my existing filesystem.
 
 mkfs.lustre --fsname lfs002 --ost --mgsnode=mg...@tcp /dev/lustrevg/lv03
 mkfs.lustre --fsname lfs002 --ost --mgsnode=mg...@tcp /dev/lustrevg/lv04
 
 I even managed to mount up the OSTS (each are 2TB)
 
 However, on the clients we don't see the extra 4TB.
 
 I am not sure whats going on.
 
 
 On the OST:
 I can see the new OSTs mounted up properly
 
 /proc/fs/lustre/obdfilter/lfs002* I see the mounted filesystem device
 on mntdev in the /proc/filesystem
 
 I even managed to reboot the MDS and OSS and the clients still no go.
 I **think** on my MDS I see the new OSS too, just not on the clients.
 
 Is there anything I should be looking at?

First, run tunefs.lustre --print on the OST to be certain you have the 
fsname set correctly (I think it needs to be --fsname=).
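For example, on the new OSTs (device names taken from your mkfs.lustre commands above); check that the filesystem name shown is lfs002 and that the mgsnode parameter points at your MGS:

# tunefs.lustre --print /dev/lustrevg/lv03
# tunefs.lustre --print /dev/lustrevg/lv04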

Second, 'lctl dl' on the MDS should show you a connection to the new 
OSTs. If not, the MGS hasn't registered them with this filesystem.
# lctl dl

   3 UP mds test1-MDT test1-MDT_UUID 5
...
   5 UP osc test1-OST0001-osc test1-mdtlov_UUID 5


'lctl dl' on the OST should show an MGC connection
# lctl dl
   0 UP mgc mgc10.67.73@tcp 92778f8e-f7e6-0327-e433-63d230eea1a9 5
   1 UP ost OSS OSS_uuid 3
   2 UP obdfilter test1-OST0001 test1-OST0001_UUID 7


Check the OST log; there should be a 'received MDS connection from'
message if you were registered correctly.
/var/log/messages.2:Mar  2 11:17:24 d1c2 kernel: Lustre: test1-OST0001: 
received MDS connection from 10.67.73@tcp


New OSTs should be automatically recognized by the clients, and 'lctl 
dl' on the clients should show the new OSTs.

cliffw

 
 TIA
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Error using mkfs.lustre

2009-03-03 Thread Cliff White
Rayentray Tappa wrote:
 On Tue, 2009-03-03 at 08:46 -0800, Evan Felix wrote:
 Yes, while you are installing lustre you should probably have /sbin/ and
 /usr/sbin in your path.

 
 ok, so i'll add them
 
 Check to see if /usr/sbin/mkfs.ext2 exists.

 
 i checked and there's no such thing as mkfs.ext2 there. how should i
 procceed? 
 
 thanks,
 
   ra
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

Make sure you have e2fsprogs installed. A lustre-specific version is 
available on the Sun download site.
cliffw
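
A quick way to check, assuming a standard RHEL/CentOS layout:

# rpm -q e2fsprogs
# ls -l /sbin/mke2fs /sbin/mkfs.ext2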

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] About MDS failover

2009-01-15 Thread Cliff White
Jeffrey Alan Bennett wrote:
 Hi,
 
 What software are people using for MDS failover? 
 
 I have been using Heartbeat from Linux-HA but I am not absolutely happy with 
 its performance.
 
 Is there anything better out there?

Are you using heartbeat V1 or V2?

I would like to hear more about the issues you are experiencing.
We have had some people use the Red Hat cluster tools.

cliffw

 
 Thanks,
 
 Jeffrey Bennett
 HPC Data Engineer
 San Diego Supercomputer Center
 858.822.0936 http://users.sdsc.edu/~jab
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] LBUG ASSERTION(lock-l_resource != NULL) failed

2009-01-14 Thread Cliff White
Brock Palen wrote:
 I am having servers LBUG on a regular basis, Clients are running  
 1.6.6 patchless on RHEL4,  servers are running RHEL4 with 1.6.5.1  
 RPM's from the download page.  All connection is over Ethernet,   
 Servers are x4600's.

This looks like bug 16496, which is fixed in 1.6.6. You should upgrade
your servers to 1.6.6
cliffw

 
 The OSS that BUG'd has in its log:
 
 Jan 13 16:35:39 oss2 kernel: LustreError: 10243:0:(ldlm_lock.c: 
 430:__ldlm_handle2lock()) ASSERTION(lock-l_resource != NULL) failed
 Jan 13 16:35:39 oss2 kernel: LustreError: 10243:0:(tracefile.c: 
 432:libcfs_assertion_failed()) LBUG
 Jan 13 16:35:39 oss2 kernel: Lustre: 10243:0:(linux-debug.c: 
 167:libcfs_debug_dumpstack()) showing stack for process 10243
 Jan 13 16:35:39 oss2 kernel: ldlm_cn_08R  running task   0  
 10243  1 10244  7776 (L-TLB)
 Jan 13 16:35:39 oss2 kernel:  a0414629  
 0103d83c7e00 
 Jan 13 16:35:39 oss2 kernel:0101f8c88d40 a021445e  
 0103e315dd98 0001
 Jan 13 16:35:39 oss2 kernel:0101f3993ea0 
 Jan 13 16:35:39 oss2 kernel: Call Trace:a0414629 
 {:ptlrpc:ptlrpc_server_handle_request+2457}
 Jan 13 16:35:39 oss2 kernel:a021445e 
 {:libcfs:lcw_update_time+30} 80133855{__wake_up_common+67}
 Jan 13 16:35:39 oss2 kernel:a0416d05 
 {:ptlrpc:ptlrpc_main+3989} a0415270 
 {:ptlrpc:ptlrpc_retry_rqbds+0}
 Jan 13 16:35:39 oss2 kernel:a0415270 
 {:ptlrpc:ptlrpc_retry_rqbds+0} a0415270 
 {:ptlrpc:ptlrpc_retry_rqbds+0}
 Jan 13 16:35:39 oss2 kernel:80110de3{child_rip+8}  
 a0415d70{:ptlrpc:ptlrpc_main+0}
 Jan 13 16:35:39 oss2 kernel:80110ddb{child_rip+0}
 Jan 13 16:35:40 oss2 kernel: LustreError: dumping log to /tmp/lustre- 
 log.1231882539.10243
 
 
 At the same time a client (nyx346) lost contact with that oss, and is  
 never allowed to reconnect.
 Client /var/log/message:
 
 Jan 13 16:37:20 nyx346 kernel: Lustre: nobackup-OST000d- 
 osc-01022c2a7800: Connection to service nobackup-OST000d via nid  
 10.164.3@tcp was lost; in progress operations using this service  
 will wait for recovery to complete.Jan 13 16:37:20 nyx346 kernel:  
 Lustre: Skipped 6 previous similar messagesJan 13 16:37:20 nyx346  
 kernel: LustreError: 3889:0:(ldlm_request.c:996:ldlm_cli_cancel_req 
 ()) Got rc -11 from cancel RPC: canceling anywayJan 13 16:37:20  
 nyx346 kernel: LustreError: 3889:0:(ldlm_request.c: 
 1605:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11Jan 13 16:37:20  
 nyx346 kernel: LustreError: 11-0: an error occurred while  
 communicating with 10.164.3@tcp. The ost_connect operation failed  
 with -16Jan 13 16:37:20 nyx346 kernel: LustreError: Skipped 10  
 previous similar messages
 Jan 13 16:37:45 nyx346 kernel: Lustre: 3849:0:(import.c: 
 410:import_select_connection()) nobackup-OST000d- 
 osc-01022c2a7800: tried all connections, increasing latency to 7s
 
 Even now the server(OSS) is refusing connection to OST00d,  with the  
 message:
 
 Lustre: 9631:0:(ldlm_lib.c:760:target_handle_connect()) nobackup- 
 OST000d: refuse reconnection from 145a1ec5-07ef- 
 f7eb-0ca9-2a2b6503e...@10.164.1.90@tcp to 0x0103d5ce7000; still  
 busy with 2 active RPCs
 
 
 If I reboot the OSS, the OST's on it go though recovery like normal,  
 and then the client is fine.
 
 Network looks clean, found one machine with lots of dropped packets  
 between the servers, but that is not the client in question.
 
 Thank you!  If it happens again, and I find any other data I will let  
 you know.
 
 
 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 bro...@umich.edu
 (734)936-1985
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] LBUG ASSERTION(lock-l_resource != NULL) failed

2009-01-14 Thread Cliff White
Brock Palen wrote:
 Gah!  Ok no problem.
 
 No risk of data loss right?  
Umm...I can't say that. No idea really.

And is there anyway to 'limp along' till an
 outage without rebooting OST's?

Nope. An LBUG always requires a reboot. We freeze the LBUG'd thread
for debugging purposes. The frozen thread may stall other threads, which 
can eventually wedge the server. Best not to try to run with an LBUG.

However, 1.6.5.1 - 1.6.6 is a minor bug fix upgrade, and you can run
a mix of 1.6.5.1 and 1.6.6 quite well. So if you cannot take a big 
downtime, you could do a 'rolling upgrade'
- Make the MGS/MDS 1.6.6
- Each time an OSS LBUGs, reboot, upgrade to 1.6.6, remount Lustre. 
Installing the new rpms should be quite quick. You don't have to change 
any configuration, just throw the new RPMS on there.

Given a random distribution, eventually all your OSSs will be 1.6.6 :)
cliffw
 
 Thanks for the insight!
 
 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 bro...@umich.edu
 (734)936-1985
 
 
 
 On Jan 14, 2009, at 7:27 PM, Cliff White wrote:
 
 Brock Palen wrote:
 I am having servers LBUG on a regular basis, Clients are running  
 1.6.6 patchless on RHEL4,  servers are running RHEL4 with 1.6.5.1  
 RPM's from the download page.  All connection is over Ethernet,   
 Servers are x4600's.

 This looks like bug 16496, which is fixed in 1.6.6. You should upgrade
 your servers to 1.6.6
 cliffw

 The OSS that BUG'd has in its log:
 Jan 13 16:35:39 oss2 kernel: LustreError: 10243:0:(ldlm_lock.c: 
 430:__ldlm_handle2lock()) ASSERTION(lock-l_resource != NULL) failed
 Jan 13 16:35:39 oss2 kernel: LustreError: 10243:0:(tracefile.c: 
 432:libcfs_assertion_failed()) LBUG
 Jan 13 16:35:39 oss2 kernel: Lustre: 10243:0:(linux-debug.c: 
 167:libcfs_debug_dumpstack()) showing stack for process 10243
 Jan 13 16:35:39 oss2 kernel: ldlm_cn_08R  running task   0  
 10243  1 10244  7776 (L-TLB)
 Jan 13 16:35:39 oss2 kernel:  a0414629  
 0103d83c7e00 
 Jan 13 16:35:39 oss2 kernel:0101f8c88d40 
 a021445e  0103e315dd98 0001
 Jan 13 16:35:39 oss2 kernel:0101f3993ea0 
 Jan 13 16:35:39 oss2 kernel: Call Trace:a0414629 
 {:ptlrpc:ptlrpc_server_handle_request+2457}
 Jan 13 16:35:39 oss2 kernel:a021445e 
 {:libcfs:lcw_update_time+30} 80133855{__wake_up_common+67}
 Jan 13 16:35:39 oss2 kernel:a0416d05 
 {:ptlrpc:ptlrpc_main+3989} a0415270 
 {:ptlrpc:ptlrpc_retry_rqbds+0}
 Jan 13 16:35:39 oss2 kernel:a0415270 
 {:ptlrpc:ptlrpc_retry_rqbds+0} a0415270 
 {:ptlrpc:ptlrpc_retry_rqbds+0}
 Jan 13 16:35:39 oss2 kernel:80110de3{child_rip+8}  
 a0415d70{:ptlrpc:ptlrpc_main+0}
 Jan 13 16:35:39 oss2 kernel:80110ddb{child_rip+0}
 Jan 13 16:35:40 oss2 kernel: LustreError: dumping log to /tmp/lustre- 
 log.1231882539.10243
 At the same time a client (nyx346) lost contact with that oss, and 
 is  never allowed to reconnect.
 Client /var/log/message:
 Jan 13 16:37:20 nyx346 kernel: Lustre: nobackup-OST000d- 
 osc-01022c2a7800: Connection to service nobackup-OST000d via nid  
 10.164.3@tcp was lost; in progress operations using this service  
 will wait for recovery to complete.Jan 13 16:37:20 nyx346 kernel:  
 Lustre: Skipped 6 previous similar messagesJan 13 16:37:20 nyx346  
 kernel: LustreError: 3889:0:(ldlm_request.c:996:ldlm_cli_cancel_req 
 ()) Got rc -11 from cancel RPC: canceling anywayJan 13 16:37:20  
 nyx346 kernel: LustreError: 3889:0:(ldlm_request.c: 
 1605:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -11Jan 13 
 16:37:20  nyx346 kernel: LustreError: 11-0: an error occurred while  
 communicating with 10.164.3@tcp. The ost_connect operation 
 failed  with -16Jan 13 16:37:20 nyx346 kernel: LustreError: Skipped 
 10  previous similar messages
 Jan 13 16:37:45 nyx346 kernel: Lustre: 3849:0:(import.c: 
 410:import_select_connection()) nobackup-OST000d- 
 osc-01022c2a7800: tried all connections, increasing latency to 7s
 Even now the server(OSS) is refusing connection to OST00d,  with the  
 message:
 Lustre: 9631:0:(ldlm_lib.c:760:target_handle_connect()) nobackup- 
 OST000d: refuse reconnection from 145a1ec5-07ef- 
 f7eb-0ca9-2a2b6503e...@10.164.1.90@tcp to 0x0103d5ce7000; still  
 busy with 2 active RPCs
 If I reboot the OSS, the OST's on it go though recovery like normal,  
 and then the client is fine.
 Network looks clean, found one machine with lots of dropped packets  
 between the servers, but that is not the client in question.
 Thank you!  If it happens again, and I find any other data I will 
 let  you know.
 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 bro...@umich.edu
 (734)936-1985
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo

Re: [Lustre-discuss] Lustre NOT HEALTHY

2009-01-13 Thread Cliff White
Brock Palen wrote:
 How common is it for servers to go NOT HEALTHY?  I feel it is  
 happening much more often than it should be with us.  A few times a  
 month.
 
It should not happen at all, in the normal case. It indicates a problem.

 If this happens, we reboot the servers.  Should we do something  
 else?  Maybe it depends on what the problem was?

Well, determining the actual problem that caused the NOT HEALTHY state 
would be quite useful, yes. I would not just reboot.

-Examine consoles of _all_ servers for any error indications
- Examine syslogs of _all_ servers for any LustreErrors or LBUG
- Check network and hardware health. Are your disks happy?
Is your network dropping packets?

Try to figure out what was happening on the cluster. Does this relate to
a specific user workload or system load condition? Can you reproduce
the situation? Does it happen at a specific time of day, time of month?
 
 If we should not be getting NOT HEALTHY that often, what information  
 should I collect to report to CFS?

The lustre-diagnostics package is a good start for general system config.
Beyond that, most of what we would need is listed above.
cliffw
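
A couple of quick checks that can be scripted across all servers (proc path as in the 1.6.x/1.8.x releases):

# cat /proc/fs/lustre/health_check           (reports healthy or NOT HEALTHY)
# grep -i 'LustreError\|LBUG' /var/log/messages | tail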

 
 
 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 bro...@umich.edu
 (734)936-1985
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] writeconf needed for 1.6.6?

2009-01-12 Thread Cliff White
Roger Spellman wrote:
 Is writeconf needed to upgrade a filesystem from 1.6.5 to 1.6.6?
 
 If so, is this run just on the MGS and MDT, or also on the OSTs?

If you are not changing any configuration or NIDS, no writeconf
is needed when upgrading Lustre from one point release to the next.

Major releases are different, but a minor point release should never
require you to re-do the configuration.
cliffw
 
 Thanks.
 
 Roger Spellman
 Staff Engineer
 Terascala, Inc.
 508-588-1501
 www.terascala.com http://www.terascala.com/
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] What do clients run on?

2009-01-12 Thread Cliff White
Arden Wiebe wrote:
 I've read it a zillion times but can't seem to find it again.  Can a client 
 run on the same server as a MGS, MDT or OSS?  Is a dedicated client machines 
 necessary?
 

You can run all of Lustre (clients and all servers) on one node, but 
this is not supported for production use.

MDS and client on same machine can have recovery/deadlock issues.

OSS and client on same machine will have issues with low memory/memory 
pressure. The client consumes all memory and tries to flush pages to disk;
the OSS needs to allocate pages to receive data from the client and can't, 
due to low memory - this can result in an OOM kill and other issues.

But, for testing, non-production work, quick sanity checks, etc, you can 
certainly run everything on one node, provided you are not doing much work.

cliffw

 
   
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] What do clients run on?

2009-01-12 Thread Cliff White
Arden Wiebe wrote:
 Okay I'll rephrase the question?  Given a limited deployment can I mount 
 the client on the MDT, MGS or OSS?  Is the best choice to build a 
 dedicated client?

If you care about performance at all, a dedicated client is always best.
While you can run client/MDS somewhat safely (modulo recovery issues), a 
busy client will steal resources from the MDS/MGS, so other clients may 
suffer.

Again, this is something that may be okay for a testing situation but 
should really be avoided for any kind of production system.

cliffw

 
 --- On *Sat, 1/10/09, Arden Wiebe /albert...@yahoo.com/* wrote:
 
 
 From: Arden Wiebe albert...@yahoo.com
 Subject: [Lustre-discuss] What do clients run on?
 To: lustre-discuss@lists.lustre.org
 Date: Saturday, January 10, 2009, 12:51 PM
 
 I've read it a zillion times but can't seem to find it again.  Can a
 client run on the same server as a MGS, MDT or OSS?  Is a dedicated
 client machines necessary?
 
 
  
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 /mc/compose?to=lustre-disc...@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
 
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] what are the meanings of collectl output for lustre

2009-01-12 Thread Cliff White
xiangyong ouyang wrote:
 hi all,
 
 I'm running collectl to get profiling information about lustre.  I'm
 using lustre   2.6.18-53.1.13.el5_lustre.1.6.4.3smp, and
 collectl V3.1.1-5 (zlib:1.42,HiRes:1.86)
 
 Basically I want to see the metadata information on both client and MDS.
 
 On client side, I run:   collectl -sL  --lustopts M:
 # LUSTRE CLIENT DETAIL (/sec): METADATA
 #Filsy   KBRead  Reads SizeKB  KBWrite Writes SizeKB  Open Close GAttr
 SAttr  Seek Fsync DrtHit DrtMis
 datafs0  0  00  0  0 3 0 0
 0 0 0  0  0
 datafs0  0  050169   3915 12 0 3 0
 0 0 0   3912  12543
 datafs0  0  00  0  0 0 0 0
 0 0 0  0  0
 
 What are the fileds DrtHit and DrtMis here?
 
 But On MDS side, when I run
 [wci70-oib:~/tools]collectl -sL
 Error: -sL only applies to MDS services when used with --lustopts D
 type 'collectl -h' for help
 
 Then I run:
 [wci70-oib:~/tools]collectl -sL --lustopts D
 Error: --lustopts D only applies to HP-SFS
 type 'collectl -h' for help
 
 I want to see more detailed metadata information at MDS side, such as
 locks, cache hit/miss.  What are the options to do that on MDS?
 Thanks very much!

I would consult the Lustre manual. Some of this information is available 
under /proc on the MDS; you can also enable lock tracing through the 
Lustre debug flags.
cliffw
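
For example, to turn on lock tracing on the MDS and dump the debug log; the flag and proc path are as in 1.6.x, and the output filename is arbitrary:

# cat /proc/sys/lnet/debug                   (current debug mask)
# echo +dlmtrace > /proc/sys/lnet/debug      (add DLM/lock tracing)
# lctl dk /tmp/lustre-debug.txt              (dump and clear the kernel debug buffer)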

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] MGS and MDT on Failover Pair

2008-09-12 Thread Cliff White
Brian J. Murrell wrote:
 On Wed, 2008-09-10 at 16:23 -0400, Roger Spellman wrote:
 I am building a system with a redundant MDS, that is two MDS sharing a
 set of disks, one being Active, the other Standby.
  
 If I put the MGS and MDS on the same system, it appears that they must
 be on the same partition as well.
 
 No.
 
 Otherwise, when there is a failover, the MGS will not fail over.  Is
 that true?

If the MDT and MGT are separate partitions, then you will have to fail 
them over as separate services, as each partition will be mounted 
separately. The separate partitions can be on one system. Of course any 
decent HA tool will allow you to failover multiple services with one action.

I should note that the MGS is very small, and only used for configuration 
changes and mount information. If all your clients are already mounted, 
an MGS failure is quite transparent - you can run for quite some time 
with a dead MGS.

For a very robust system, I would suggest moving the MGS to a small 
machine (heck, a cheap laptop would work for all but the biggest sites), 
replicating the MGT disk, and putting your failover dollars on the MGS.

You could build a very robust failover MGS for the cost of two cheap
whitebox PCs (modulo network hardware cost). Also, a separate MGS is 
recommended when you have more than one filesystem.

cliffw
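
A sketch of that layout; the hostnames, NIDs and device names below are placeholders:

# mkfs.lustre --mgs /dev/sda1                                  (on the small MGS box)
# mount -t lustre /dev/sda1 /mnt/mgs
# mkfs.lustre --fsname=testfs --mdt --mgsnode=<mgs-nid> \
    --failnode=<standby-mds-nid> /dev/sdb1                     (on the primary MDS)
# mount -t lustre /dev/sdb1 /mnt/mdt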


 
 Not true.
 
 b.
 
 
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] beta lustre

2008-09-09 Thread Cliff White
Papp Tamas wrote:
 hi All,
 
 Where can I download beta version of lustre?

Depends on what you mean.
Current Lustre is always available from the Sun Download site.
http://www.sun.com/software/products/lustre/get.jsp
(free, of course)

We have pre-released versions availble via CVS.
http://wiki.lustre.org/index.php?title=Open_CVS

I don't think anything is actually in a 'beta' program right now,
we probably will have 'beta' releases as we get closer to 1.8, but we
always encourage people to test latest CVS.
 
 Which lustre version will support FC8 kernels (2.6.23+)?

We don't much build on FC kernels, due to lack of commercial demand, but
any recent Lustre should work. We support patchless client builds on
vanilla linux kernels up to 2.6.22, so you should be able to build any 
recent Lustre clients against FC8 (maybe :)

cliffw

 
 Thank you,
 
 tamas
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Typical IB timeout? Or something more?

2008-09-09 Thread Cliff White
Alex Lee wrote:
 I been seeing something that looks like IB timeout errors lately after 
 upgrading to 1.6.5.1 using the supplied ofed kernel drivers.
 
  From what I can tell there hasnt been any real network issues that was 
 apparent. Are these errors just typical if the network is busy?

Could be. If you are having actual IB issues, there is generally an 
error from IB prior to any Lustre errors (and generally lots of 
LustreErrors - we really get unhappy when your network breaks). What we 
see here are two requests timing out, one with a 6 sec limit and one with 
a 20 sec limit. There is no indication of a related error, just a timeout 
exceeded.

These timeouts can happen on a busy network. If they are frequent, you 
should increase obd_timeout.
cliffw
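
For example (the value is illustrative; this proc setting is not persistent and needs to be applied on every server and client):

# cat /proc/sys/lustre/timeout               (current obd_timeout, in seconds)
# echo 300 > /proc/sys/lustre/timeout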

 
 
 Heres from MDS:
 Sep  8 16:04:19 lustre-mds-0-0 kernel: LustreError: 
 10001:0:(events.c:55:request_out_callback()) @@@ ty
 pe 4, status -5  [EMAIL PROTECTED] x15310157/t0 
 o104-F0'8488[EMAIL PROTECTED]
 c1d08_UUID:15/16 lens 232/256 e 0 to 6 dl 1220857463 ref 2 fl Rpc:N/0/0 
 rc 0/0
 Sep  8 16:04:19 lustre-mds-0-0 kernel: Lustre: Request x15310157 sent 
 from lfs-MDT to NID 10.12.29.
 [EMAIL PROTECTED] 2s ago has timed out (limit 6s).
 Sep  8 16:04:23 lustre-mds-0-0 kernel: Lustre: Request x15310047 sent 
 from lfs-MDT to NID 10.12.28.
 [EMAIL PROTECTED] 6s ago has timed out (limit 6s).
 Sep  8 16:04:43 lustre-mds-0-0 kernel: LustreError: 
 10003:0:(events.c:55:request_out_callback()) @@@ ty
 pe 4, status -5  [EMAIL PROTECTED] x15310096/t0 
 o104-F0'8488[EMAIL PROTECTED]
 c1d0e_UUID:15/16 lens 232/256 e 0 to 6 dl 1220857463 ref 1 fl 
 Complete:XN/0/0 rc 0/0
 Sep  8 16:05:07 lustre-mds-0-0 kernel: LustreError: 
 3930:0:(events.c:55:request_out_callback()) @@@ typ
 e 4, status -113  [EMAIL PROTECTED] x15310047/t0 
 o104-F0'8488[EMAIL PROTECTED]
 0c1ceb_UUID:15/16 lens 232/256 e 0 to 6 dl 1220857463 ref 1 fl 
 Complete:XN/0/0 rc 0/0
 Sep  8 16:08:44 lustre-mds-0-0 kernel: Lustre: Skipped 1 previous 
 similar message
 Sep  8 16:13:24 lustre-mds-0-0 kernel: Lustre: Skipped 4 previous 
 similar messages
 
 On the OSS:
 Sep  9 00:24:55 lustre-oss-4-1 kernel: Lustre: Skipped 3 previous 
 similar messages
 Sep  9 00:25:01 lustre-oss-4-1 kernel: Lustre: Request x784766 sent from 
 lfs-OST0039 to NID 10.12.29.7@
 o2ib 20s ago has timed out (limit 20s).
 Sep  9 00:25:31 lustre-oss-4-1 kernel: LustreError: 
 13228:0:(o2iblnd_cb.c:2874:kiblnd_check_conns()) Ti
 med out RDMA with [EMAIL PROTECTED]
 Sep  9 00:25:31 lustre-oss-4-1 kernel: LustreError: 
 13228:0:(events.c:55:request_out_callback()) @@@ ty
 pe 4, status -103  [EMAIL PROTECTED] x784766/t0 o104-@:15/16 lens 
 232/256 e 0 to 20 dl 1220887501 r
 ef 1 fl Complete:XN/0/0 rc 0/0
 
 -Alex
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] simulations

2008-08-08 Thread Cliff White
Mag Gam wrote:
 CliffW:
 
 This helps out a lot!
 
 We still have problems determining devices. We don't know what their
 numbers are (I been using lctl dl), but I don't know how to activate
 or deactivate them.
 
 
 Do you have an example?
 
Yup
http://manual.lustre.org/manual/LustreManual16_HTML/KnowledgeBase.html#50544717_84403

The .pdf version I think has more details.
cliffw
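
For example, to deactivate and later reactivate a device by the number shown in lctl dl (the device number 7 here is just an example):

# lctl dl                                    (first column is the device number)
# lctl --device 7 deactivate
# lctl --device 7 activate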

 
 TIA
 
 On Thu, Aug 7, 2008 at 10:59 AM, Cliff White [EMAIL PROTECTED] wrote:
 Mag Gam wrote:
 We do a lot of fluid simulations at my university, but on a similar
 note I would like to know what the Lustre experts will do in
 particular simulated scenarios...

 The environment is this:
 30 Servers (All Linux)
 1000+ Clients (All Linux)

 30 Servers
 1 MDS
 30 OSTs each with 2TB of storage

 No fail over capabilities.


 Scenario 1:
 Your client is trying to mount lustre filesystem using lustre module,
 and it hung. Do what?
 Answer 0 to all questions:
 Read the Lustre Manual. File doc bugs in Lustre Bugzilla if there's a part
 you don't understand, or a part missing

 Answer 1 for all your questions.
 Check syslogs/consoles on the impacted clients.
 Check syslogs/consoles on _all lustre servers.
 Pay careful attention to timestamps.
 Work backwards to the first error.

 Is the problem restricted to one client or seen by multiple clients?
 If multiple clients, start with the network, use lctl ping to check lustre
 connectivity.
 If a single client, it's generally a client config/network config issue.
 Scenario 2:
 Your MDS won't mount up. Its saying, The server is already running.
 You try to mount it up couple of times and still its not
 Be certain the server is not already running.
 Be certain no hung mount processes exist.
 Unload all lustre modules (lustre_rmmod script will do this)
 Retry and - answer 1

 Scenario 3:
 OST/OSS reboots due to a power outage. Some files are striped on this,
 and some aren't What happens? What to do for minimal outage?
 - Clients can be mounted with a dead OST using the exclude options to the
 mount command. lfs getstripe can be run from clients to find files
 on the bad OST. See answer 0 for detailed process.
 Scenario 4:
 lctl dl shows some devices in ST state. What does that mean, and how
 do I clear it?
 ST = stopped.
 Clear this by cleaning up all devices (answer 0)
 or restarting the stopped devices.
 Usually indicates an error/issue with the stopped device, so see
 answer 1.

 I know some of these scenarios may be ambiguous, but please let me
 know which so I can further elaborate. I am eventually planning to
 wiki this for future reference and other lustre newbies.
 Please contribute to wiki.lustre.org - there is considerable information
 there already, and a decent existing structure.
 If anyone else has any other scenarios, please don't be shy and ask
 away. We can create a good trouble shooting doc similar to the
 operations manual.
 Again, please file doc bugs at bugzilla.lustre.org and contribute to
 wiki.lustre.org, hope this helps!
 cliffw


 TIA
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] MDS

2008-08-07 Thread Cliff White
Cliff White wrote:
 Mag Gam wrote:
 Also, what is the best way to test the backup? Other than really
 remove my MGS and restore it. Is there a better way to test this?
 
 If you really care about the backups, you need to be brave. If you can't
 remove the MDS and restore it, then something is wrong with your backup
 process. Many people seem to focus on the backup part and ignore the 
 'restore' bit, so I definitely recommend a live test.
 
 That said, if you can bring up your backup MDT image on a separate node, 
 you could configure that node as a failover MDS - this would require you 
 to tunefs.lustre all the servers, and remount all the clients. Then you 
 can test restore using a 'manual failover' - and once you made the mount 
 changes, you could repeat this test at will, without even halting the 
 filesystem. Also, you would not have to 'remove' your primary MDS, just 
 stop that node.
 
 If your MDS _does_ die, the failover config will cause a slightly longer 
 timeout (everybody will retry the alternate) but otherwise won't impact 
 you.

Just to be clear, there is a potential data loss issue due to the time 
delta between the backup and the live system. Any transactions in play
that miss the snapshot could result in lost data, as the MDS will replay 
transaction logs and delete orphans on startup. So testing on your live 
system definitely is for the brave.
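For reference, a rough sketch of the failover setup quoted above, assuming
the MGS lives on the MDS node - the hostnames, NIDs and device paths here
are made up, so adjust them for your site:

  # on the MDT, record the backup node as a failover NID
  tunefs.lustre --failnode=mds-backup@tcp0 /dev/mdt_device
  # on each OST, list both possible MGS locations
  tunefs.lustre --mgsnode=mds-primary@tcp0 --mgsnode=mds-backup@tcp0 /dev/ost_device
  # clients mount with both NIDs, so they retry the alternate on a failure
  mount -t lustre mds-primary@tcp0:mds-backup@tcp0:/testfs /mnt/testfs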
cliffw

 
 cliffw

 TIA


 On Tue, Aug 5, 2008 at 6:37 PM, Mag Gam [EMAIL PROTECTED] wrote:
 Brian:

 Thanks for the response. I actually seen this response before and was
 wondering if my technique would simply work. I guess not.

 I guess another question would be: if I take a snapshot every 10 minutes
 and back it up, and I have a failure at the 15th minute, can I just
 restore my MDS to the previous snapshot and be done with it? Of course I
 will lose my 5 minutes of data, correct?

 TIA


 On Tue, Aug 5, 2008 at 12:08 PM, Brian J. Murrell 
 [EMAIL PROTECTED] wrote:
 On Tue, 2008-08-05 at 01:12 -0400, Mag Gam wrote:
 What is a good MGS/MDT backup strategy if there is one?

 I was thinking of  mounting the MGS/MDT partition on the MDS as ext3
 and rsync it every 10 mins to another server. Would this work? What
 would happen if I lose my MDS in the 9th minute? Would I still be able to
 have a good copy? Any thoughts or ideas?
 Peter Braam answered a similar question and of course, the answer is in
 the archives.  It was the second google hit on a search for lustre mds
 backup.  The answer is at:

 http://lists.lustre.org/pipermail/lustre-discuss/2006-June/001655.html

 Backup of the MDT is also covered in the manual in section 15 at

 http://manual.lustre.org/manual/LustreManual16_HTML/BackupAndRestore.html#50544703_pgfId-5529
  


 Now, as for mounting the MDT as ext3 (you should actually use ldiskfs,
 not ext3) every 10 minutes, that means you are going to make your
 filesystem unavailable every 10 minutes as you CANNOT mount the MDT
 partition on more than one machine and we have not tested multiple
 mounting on a single machine with any degree of confidence.

 Of course Peter's LVM snapshotting technique will allow you to mount
 snapshots which you can backup as you describe.

 But if you are going to have a whole separate machine with enough
 storage to mirror your MDT why not use something more active like DRBD
 and have a fully functional active/passive MDT failover strategy?  
 While
 nobody in the Lustre Group has done any extensive testing of Lustre on
 DRBD, there have been a number of reports of success with it here on
 this list.

 b.


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
 
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] simulations

2008-08-07 Thread Cliff White
Mag Gam wrote:
 We do a lot of fluid simulations at my university, but on a similar
 note I would like to know what the Lustre experts will do in
 particular simulated scenarios...
 
 The environment is this:
 30 Servers (All Linux)
 1000+ Clients (All Linux)
 
 30 Servers
 1 MDS
 30 OSTs each with 2TB of storage
 
 No fail over capabilities.
 
 
 Scenario 1:
 Your client is trying to mount a Lustre filesystem using the lustre module,
 and it hung. Do what?
Answer 0 to all questions:
Read the Lustre Manual. File doc bugs in Lustre Bugzilla if there's a 
part you don't understand, or a part missing

Answer 1 for all your questions.
Check syslogs/consoles on the impacted clients.
Check syslogs/consoles on _all_ lustre servers.
Pay careful attention to timestamps.
Work backwards to the first error.

Is the problem restricted to one client or seen by multiple clients?
If multiple clients, start with the network, use lctl ping to check 
lustre connectivity.
If a single client, it's generally a client config/network config issue.
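For example, the lctl ping check mentioned above looks something like this
(the NID is made up - use one reported by lctl list_nids on the server):

  lctl list_nids                  # on the server: show its LNET NIDs
  lctl ping 192.168.10.1@tcp0     # on the client: check LNET reachability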
 
 Scenario 2:
 Your MDS won't mount up. It's saying "The server is already running."
 You try to mount it up a couple of times and still it's not

Be certain the server is not already running.
Be certain no hung mount processes exist.
Unload all lustre modules (lustre_rmmod script will do this)
Retry and - answer 1
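Something along these lines, for example (purely illustrative):

  grep lustre /proc/mounts     # is the target really mounted already?
  ps ax | grep mount.lustre    # any hung mount helpers still around?
  lustre_rmmod                 # unload all Lustre modules
  lsmod | grep lustre          # should print nothing before you retry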

 
 Scenario 3:
 OST/OSS reboots due to a power outage. Some files are striped on this,
 and some aren't. What happens? What to do for minimal outage?

- Clients can be mounted with a dead OST using the exclude options to 
the mount command. lfs getstripe can be run from clients to find files
on the bad OST. See answer 0 for detailed process.
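A rough example - the filesystem name, OST index, MGS NID and mount point
below are all made up:

  # mount the client while skipping the dead OST
  mount -t lustre -o exclude=testfs-OST0003 mgs@tcp0:/testfs /mnt/testfs
  # list files that have objects on that OST
  lfs find --obd testfs-OST0003_UUID /mnt/testfs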
 
 Scenario 4:
 lctl dl shows some devices in ST state. What does that mean, and how
 do I clear it?

ST = stopped.
Clear this by cleaning up all devices (answer 0)
or restarting the stopped devices.
Usually indicates an error/issue with the stopped device, so see
answer 1.
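For example (the mount point is illustrative):

  lctl dl | grep ' ST '    # show only devices in the stopped state
  umount /mnt/lustre_ost   # unmount the affected target on that node
  lustre_rmmod             # then unload modules to clear the stopped devices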
 
 
 I know some of these scenarios may be ambiguous, but please let me
 know which so I can further elaborate. I am eventually planning to
 wiki this for future reference and other lustre newbies.

Please contribute to wiki.lustre.org - there is considerable information 
there already, and a decent existing structure.
 
 If anyone else has any other scenarios, please don't be shy and ask
 away. We can create a good trouble shooting doc similar to the
 operations manual.

Again, please file doc bugs at bugzilla.lustre.org and contribute to 
wiki.lustre.org, hope this helps!
cliffw

 
 
 TIA
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre health check

2008-07-15 Thread Cliff White
Mag Gam wrote:
 We are planning to deploy lustre on a large scale at my university,
 and we were wondering if there are any health check utilities
 available for OST, OSS, MDS and MDT. I know there is an SNMP module
 available, but I prefer a solid front end with SNMP as the backend. So
 what tools are people using to monitor their Lustre infrastructure?
 
 
 TIA
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

Also look at the Lustre Monitoring Tool (LMT), available on SourceForge.
cliffw
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Gluster then DRBD now Lustre?

2008-06-16 Thread Cliff White
[EMAIL PROTECTED] wrote:
 On Mon, 16 Jun 2008, Kilian CAVALOTTI wrote:
 
 On Monday 16 June 2008 11:40:40 am Andreas Dilger wrote:
 NYC == New York City?  What
 is SJC?
 SJC == San Jose, California
 That's what I thought, but if so, the following part loses me:

 This is working in a test setup; however, there are some downsides.
 The first is that DRBD only supports IP, so we have to run IPoIB over
 our InfiniBand adapters, not an ideal solution.
 Nathan, you won't be able to use InfiniBand between New York City and
 San Jose, CA, anyway, right? Even without considering IB cables' length
 limitation, and unless you can use some kind of dedicated,
 special-purpose link between your sites, the public Internet is not
 really able to provide bandwidth nor latencies compatible with
 Infiniband standards.
 
 Ok, so in the original email, east-to-west was what we originally wanted to
 do, but we realized that would not be possible because of round-trip delay,
 even over GigE. Instead of mirroring our traffic east-west, we are starting
 with 2 servers in each location tied together with InfiniBand. The
 InfiniBand cables are only 5 m. :) Currently we are mirroring traffic with
 DRBD between the two local systems in each datacenter, but we are looking
 at the tradeoffs of switching to Lustre since DRBD does not support
 InfiniBand.
 

Umm... Lustre is not a replacement for DRBD, so we're very confused over
here. Lustre is a way of making a big distributed filesystem out of a
bunch of storage nodes. We don't do replication; it's basically RAID 0.

So, you could use Lustre to make one big filesystem out of two local 
servers. You could even make one big filesystem out of your multiple 
locations over the WAN (it's been done).

But, you can't use Lustre to mirror data. (yet, wait a year)

So I think your Gluster expedition might have confused you. Gluster and
Lustre are only words that sound somewhat the same; there is _no_
relationship between the two (except the fact that there is some
filesystem goop involved). You're comparing apples to knee socks if you
are attempting to map Gluster experience to a Lustre setup.

cliffw

 -Nathan
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Size of MDT, used space

2008-05-13 Thread Cliff White
Thomas Roth wrote:
 Hi all,
 
 I'm still in trouble with numbers: the available, used and necessary 
 space on my MDT:
 According to lfs df, I have now filled my file system with 115.3 TB.
 All of these files are sized 5 MB. That should be roughly 24 million files.
 For the MDT, lfs df reports 28.2 GB used.
 
 Now I believed that creating a file on Lustre means using one inode on 
 the MDT. Since all of my Lustre partitions were formatted with the 
 default options (all of this is running Lustre v. 1.6.4.3, btw), an 
 inode should eat up 4kB on the MDT partition. Of course, 24 million 
 files times 4 kB gives you 91 GB rather than 28GB.
 Obviously, there is something I missed completely. Perhaps somebody 
 could illuminate me here?
 
 This issue could also be phrased as: How large should my MDT be to
 accommodate n TB of storage space? The manual's answer boils down to
 MDT size = number of files * 4 kB (*2 per the recommendation). That's how I
 calculated above - maybe my test system is broken? I can't check on the
 content of these files; they're just 5 MB test files created with the
 'stress' utility.
 
   Thanks and regards,
 Thomas

The size of the MDS inode depends on the number of stripes: 4.5 kB is the
maximum, 512 bytes the minimum. The actual size varies with the number of
stripes in the file. So we advise using 4 kB as an estimate, as that will
cover the vast majority of cases; actual use in almost all situations will
be smaller than 4 kB.
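If you want to sanity-check this on the live system, the per-file cost can
be read straight off lfs df (the mount point below is illustrative):

  lfs df /mnt/lustre | grep MDT      # kB actually used on the MDT
  lfs df -i /mnt/lustre | grep MDT   # inodes actually used
  # bytes per file ~= (kB used * 1024) / files
  # in your case: 28.2 GB / 24 million files ~= 1.2 kB per file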
cliffw

 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Downloads

2008-02-14 Thread Cliff White
Canon, Richard Shane wrote:
  
 
 I see that the download site has been moved and integrated into the Sun 
 site.  It looks like this broke a few things.  For one, I can’t get to 
 any of the 1.4 releases.  Can this get fixed?
 
I'll see what can be done.
cliffw

  
 
 Thanks,
 
  
 
 --Shane
 
  
 
  
 
 --
 
 R. Shane Canon
 
 National Center for Computational Science
 
 Oak Ridge National Laboratory
 
 [EMAIL PROTECTED]
 
  
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Downloads

2008-02-14 Thread Cliff White
Canon, Richard Shane wrote:
  
 
 I see that the download site has been moved and integrated into the Sun 
 site.  It looks like this broke a few things.  For one, I can’t get to 
 any of the 1.4 releases.  Can this get fixed?
 

It looks like some links were recently mis-moved. Should be fixed shortly.
cliffw

  
 
 Thanks,
 
  
 
 --Shane
 
  
 
  
 
 --
 
 R. Shane Canon
 
 National Center for Computational Science
 
 Oak Ridge National Laboratory
 
 [EMAIL PROTECTED]
 
  
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Downloads

2008-02-14 Thread Cliff White
Cliff White wrote:
 Canon, Richard Shane wrote:
  

 I see that the download site has been moved and integrated into the Sun 
 site.  It looks like this broke a few things.  For one, I can’t get to 
 any of the 1.4 releases.  Can this get fixed?

 
 It looks like some links were recently mis-moved. Should be fixed shortly.
 cliffw
http://downloads.clusterfs.com/
should be working now. Please let us know if there are further issues.
cliffw


 
  

 Thanks,

  

 --Shane

  

  

 --

 R. Shane Canon

 National Center for Computational Science

 Oak Ridge National Laboratory

 [EMAIL PROTECTED]

  


 

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Help with lustre 1.6.4

2008-02-12 Thread Cliff White
Ali Algarrous wrote:
 My name is Ali Algarrous and I'm doing research on Lustre file
 systems.
 
 I'm running Debian on my machine and after a long time I was able to
 install Lustre on my machine. I was able to format devices and mount
 clients and servers easily but I had to reinstall the OS. So I started
 the process of installing Lustre again and this time it was really
 easy.
 
 However, I came across another problem.  When I'm running this
 command:  mkfs.lustre --fsname datafs --mdt --mgs /dev/sda3
 
 I get the following error messgae:
 __
 gog:~# mkfs.lustre --fsname spfs --mdt --mgs /dev/sda3
 
  Permanent disk data:
 Target: spfs-MDT
 Index:  unassigned
 Lustre FS:  spfs
 Mount type: ldiskfs
 Flags:  0x75
 (MDT MGS needs_index first_time update )
 Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
 Parameters: mdt.group_upcall=/usr/sbin/l_getgroups
 
 
 mkfs.lustre FATAL: loop device requires a --device-size= param
 
 mkfs.lustre FATAL: Loop device setup for /dev/sda3 failed: Invalid
 argument
 mkfs.lustre: exiting with 22 (Invalid argument)
 ___
 And then when I specified the size of the disk by putting --device-
 size=40371345, I started to get the following error:
 
 gog:/home/algarra# mkfs.lustre --fsname lustrefs --mdt --mgs /dev/sda3
 
 Permanent disk data:
 Target: lustrefs-MDT
 Index:  unassigned
 Lustre FS:  lustrefs
 Mount type: ldiskfs
 Flags:  0x75
(MDT MGS needs_index first_time update )
 Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
 Parameters: mdt.group_upcall=/usr/sbin/l_getgroups
 
 checking for existing Lustre data: not found
 mkfs.lustre: size ioctl failed: Inappropriate ioctl for device
 
 mkfs.lustre FATAL: mkfs failed 19
 ___
 
 I'm running kernel 2.6.18.
 
 Would you please help me in solving this problem?

Something is wrong with your /dev/sda3. Lustre is not finding a proper
block device there, so it assumes you are using a loopback device, but
since you are not using a loopback, it fails (size is only used with
loopback). Verify that sda exists and is partitioned correctly (sda3
exists).
As a test, make an ext2 FS on sda3 with mkfs. If that works, Lustre
should work fine.
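For example (device names as in your mail):

  fdisk -l /dev/sda    # confirm the disk is seen and that sda3 exists
  ls -l /dev/sda3      # should show a block device ('b'), not a regular file
  mkfs.ext2 /dev/sda3  # quick sanity test; if this works, mkfs.lustre should too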
cliffw

 
 Regards,
 Ali
 
 -- 
 Never stop smiling, not even when you're sad, someone might fall in love 
 with your smile.
 :)
 
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss