Re: [Lustre-discuss] Luster clients getting evicted

2008-02-06 Thread Brock Palen
Palen Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Feb 4, 2008, at 2:47 PM, Brock Palen wrote: Which version of lustre do you use? Server and clients same version and same os? which one? lustre-1.6.4.1 The servers (oss and mds/mgs) use the RHEL4 rpm from lustre.org

[Lustre-discuss] lustre dstat plugin

2008-03-05 Thread Brock Palen
16M| 0 0 | 3523 424 | 24M 14M 69 30 0 0 01| 0 8192B|1029k 18M| 0 0 | 3029 88 | 0 0 Patches/comments, Brock Palen Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 ___ Lustre-discuss

[Lustre-discuss] socknal_sd00 100% lower?

2008-03-06 Thread Brock Palen
If our IO servers are seeing extended periods of socknal_sd00 at 100% CPU, would this cause a bottleneck? If so, and it's a single-homed host, would adding another interface to the host help? Is there threading anyplace? Or is a faster CPU the only way out? Brock Palen Center for Advanced

Re: [Lustre-discuss] socknal_sd00 100% lower?

2008-03-07 Thread Brock Palen
On Mar 7, 2008, at 8:51 AM, Maxim V. Patlasov wrote: Brock, If our IO servers are seeing extended periods of socknal_sd00 at 100% CPU, would this cause a bottleneck? Yes, I think so. If so, and it's a single-homed host, would adding another interface to the host help? Probably, no.

[Lustre-discuss] yet another lustre error

2008-03-07 Thread Brock Palen
someplace ? Brock Palen Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org http://lists.lustre.org/mailman/listinfo/lustre-discuss

Re: [Lustre-discuss] lustre dstat plugin

2008-03-10 Thread Brock Palen
). And how you can pull our results, like I use the following on our lustre OSS with two OST's sda and sdb. dstat -D sda,sdb,total That gives me per disk stats and a total. Similar tools could be made for collectl I'm sure. Brock -Aaron On Mar 7, 2008, at 7:03 PM, Brock Palen wrote: On Mar 7

Re: [Lustre-discuss] yet another lustre error

2008-03-10 Thread Brock Palen
the weekend the MDS/MGS went into a unhealthy state forced a reboot+fsck and when it came back up the directory was accessible again and jobs started working again. -Aaron On Mar 7, 2008, at 6:45 PM, Brock Palen wrote: On a file system thats been up for only 57 days, I have: 505 lustre

[Lustre-discuss] more problems with lustre,

2008-03-24 Thread Brock Palen
right now with patchless clients). Thanks for all the help you have given us while we have been evaluating it! Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

[Lustre-discuss] filesystem UID' GID's

2008-04-11 Thread Brock Palen
Is an /etc/passwd with all the filesystem users' UIDs required only on the MDS? Or do the OSTs need it also? My testing shows only the MDS, but I could be wrong. We don't use LDAP or anything like that at the moment for UID/GID mapping. Brock Palen www.umich.edu/~brockp Center

Re: [Lustre-discuss] MGS and loop devices

2008-04-14 Thread Brock Palen
in memory. Might want to verify this, just don't get caught with stuff in ram. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Apr 14, 2008, at 3:12 PM, Jakob Goldbach wrote: On Mon, 2008-04-14 at 17:40 +0200, Fereyre Jerome wrote: Has anybody

Re: [Lustre-discuss] lfs setstripe

2008-04-17 Thread Brock Palen
would need to copy them, and move over the old one to change to the new stripe settings. Check the lustre manual they have something about this. You can use 'getstripe' to see what a file/directory use for their settings. Brock Palen www.umich.edu/~brockp Center for Advanced Computing

Re: [Lustre-discuss] lfs setstripe

2008-04-18 Thread Brock Palen
On Apr 17, 2008, at 10:48 PM, Kaizaad Bilimorya wrote: On Thu, 17 Apr 2008, Brock Palen wrote: I don't think you need to do this. If I understand right, you can set the stripe size of the mount, and everything inside that directory inherits it, unless they themselves were explicitly set

Re: [Lustre-discuss] state of sun x4500 drivers

2008-04-23 Thread Brock Palen
for us, we only plan on bonding the 4 thumper 1Gig-e interfaces. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Apr 23, 2008, at 1:00 PM, Brian Behlendorf wrote: Recently I have also been doing some linux work with the x4500 and I have been

[Lustre-discuss] MDS Fail-Over planning.

2008-05-06 Thread Brock Palen
that are to be killed. The plan on our table right now is two thumpers as the OSS's. Then two x4100s or x4200s with mirrored SAS drives, then shared across with DRBD with Heartbeat. Any comments? Any issues to be aware of? Anyone running something similar? Brock Palen www.umich.edu/~brockp Center

[Lustre-discuss] Luster access locking up login nodes

2008-05-16 Thread Brock Palen
times when lustre screws up it recovers but more and more it does not. and we see these bulk errors followed by mds errors. We are using lustre 1.6.x Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

Re: [Lustre-discuss] Luster access locking up login nodes

2008-05-16 Thread Brock Palen
pdsh to all the clients, but machines to get rebooted some times. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On May 16, 2008, at 4:13 PM, Brian J. Murrell wrote: On Fri, 2008-05-16 at 15:48 -0400, Brock Palen wrote: I have seen

[Lustre-discuss] external journals

2008-05-29 Thread Brock Palen
see the most help with? Or should we just devote these disks to being another OST? Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

[Lustre-discuss] lustre and multi path

2008-06-05 Thread Brock Palen
. What about multipath without LVM? Our StorageTek array has dual controllers with dual ports going to dual port FC cards in the MDS's. Each MDS has a connection to both controllers so we will need multipath to get any advantage to this. Comments? Brock Palen www.umich.edu/~brockp Center

Re: [Lustre-discuss] Lustre delete efficency

2008-06-26 Thread Brock Palen
On Jun 26, 2008, at 1:57 PM, Stew Paddaso wrote: We are considering using Lustre as our backend file platform. The specific application involves storing a high volume of sequential data writes, with a moderate amount of reads (mostly sequential, with some random seeks). Our concern is with

[Lustre-discuss] OSS load in the roof

2008-06-27 Thread Brock Palen
: SUCCESS (sc=010038904c40) and: Lustre: 6698:0:(lustre_fsfilt.h:306:fsfilt_setattr()) nobackup- OST0001: slow setattr 100s Lustre: 6698:0:(watchdog.c:312:lcw_update_time()) Expired watchdog for pid 6698 disabled after 103.1261s Thanks Brock Palen www.umich.edu/~brockp Center for Advanced

Re: [Lustre-discuss] OSS load in the roof

2008-06-27 Thread Brock Palen
On Jun 27, 2008, at 1:39 PM, Bernd Schubert wrote: On Fri, Jun 27, 2008 at 01:07:32PM -0400, Brian J. Murrell wrote: On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote: All of them are stuck in un-interruptible sleep. Has anyone seen this happen before? Is this caused by a pending disk

Re: [Lustre-discuss] OSS load in the roof

2008-06-27 Thread Brock Palen
On Jun 27, 2008, at 1:07 PM, Brian J. Murrell wrote: On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote: All of them are stuck in un-interruptible sleep. Has anyone seen this happen before? Is this caused by a pending disk failure? Well

[Lustre-discuss] Lustre locking up on login/interactive nodes

2008-07-21 Thread Brock Palen
Every so often lustre locks up. It will recover eventually. The processes show themselves in 'D' (uninterruptible IO wait). In this case it was 'ar' making an archive. Dmesg then shows: Lustre: nobackup-MDT-mdc-0101fc467800: Connection to service nobackup-MDT via nid [EMAIL

Re: [Lustre-discuss] Lustre locking up on login/interactive nodes

2008-07-21 Thread Brock Palen
On Jul 21, 2008, at 11:51 AM, Brian J. Murrell wrote: On Mon, 2008-07-21 at 11:43 -0400, Brock Palen wrote: Every so often lustre locks up. It will recover eventually. The processes show themselves in 'D' (uninterruptible IO wait). This case

[Lustre-discuss] rpm kernel-devel package

2008-07-25 Thread Brock Palen
-1.6.5.1-2.6.9_67.0.7.EL_lustre.1.6.5.1smp.x86_64.rpm Or: kernel-lustre-source-2.6.9-67.0.7.EL_lustre.1.6.5.1.x86_64.rpm Is there a reason why there is not just a normal: kernel-lustre-smp-devel Just like RedHat/SLES provides? Thanks! Brock Palen www.umich.edu/~brockp Center for Advanced

[Lustre-discuss] Can't build sun rdac driver against lustre source.

2008-07-25 Thread Brock Palen
only once mppLnx26_spinlock_size.c:102: error: for each function it appears in.) make: *** [mppLnx_Spinlock_Size] Error 1 I guess what I should really ask is, Has anyone ever made multipath work with a sun 2540 array for use as the MDS/MGS file system? Brock Palen www.umich.edu/~brockp Center

Re: [Lustre-discuss] Can't build sun rdac driver against lustre source.

2008-07-25 Thread Brock Palen
Yes that worked! Thank you very much. Hint to Sun: the 2540 is a very nice array for lustre, and it would be good if all the tools with it were checked to work out of the box with lustre. Just 2 cents. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936

Re: [Lustre-discuss] Can't build sun rdac driver against lustre source.

2008-07-25 Thread Brock Palen
Stuart, It looks like you have a newer rdac package than sun has on their website. So while your make file builds everything, it tries to install a bit of code that does not exist. FYI. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

Re: [Lustre-discuss] MGS failover

2008-07-30 Thread Brock Palen
]:[EMAIL PROTECTED] Would that be valid? Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Jul 30, 2008, at 10:29 AM, Brian J. Murrell wrote: On Wed, 2008-07-30 at 09:48 -0400, Brock Palen wrote: The manual does not make much sense when it comes

[Lustre-discuss] Luster recovery when clients go away

2008-07-31 Thread Brock Palen
replayed_requests: 0/?? queued_requests: 0 next_transno: 193097794 Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

[Lustre-discuss] lustre 1.6.5.1 panic on failover

2008-07-31 Thread Brock Palen
have not tried yanking power yet, but I want to simulate a MDS in a semi dead state and ran into this. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

Re: [Lustre-discuss] lustre 1.6.5.1 panic on failover

2008-07-31 Thread Brock Palen
What's a good tool to grab this? It's more than one page long, and the machine does not have serial ports. Links are ok. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Jul 31, 2008, at 5:14 PM

[Lustre-discuss] stata_mv mv_stata which is better?

2008-08-06 Thread Brock Palen
kernel source packaging. If it is worth all the pain, if others have already figured it out. Any help would be appreciated. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

Re: [Lustre-discuss] stata_mv mv_stata which is better?

2008-08-07 Thread Brock Palen
reference. Regards, Mike Berg Sr. Lustre Solutions Engineer Sun Microsystems, Inc. Office/Fax: (303) 547-3491 E-mail: [EMAIL PROTECTED] X4500-preparation.pdf On Aug 6, 2008, at 1:48 PM, Brock Palen wrote: Is it still worth the effort to try and build mv_stata? when working

[Lustre-discuss] operation 400 on unconnected MGS

2008-08-07 Thread Brock Palen
the OST's start booting the client also. Servers are 1.6.5.1 clients are patch-less 1.6.4.1 on RHEL4. Any insight would be great. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

[Lustre-discuss] mv_sata patch

2008-08-13 Thread Brock Palen
Is the cache patch for mv_sata noted in the sun paper on the x4500 available? Or has it been rolled into the source distributed by sun? Trying to avoid data loss. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

[Lustre-discuss] Bug 15912

2008-08-14 Thread Brock Palen
work around this? Do I just need to build the mkfs out of CVS for 1.6.6 ? Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

Re: [Lustre-discuss] Bug 15912

2008-08-18 Thread Brock Palen
1.6.5.1. Can I change the MGSSPEC for the OST's after the fact? And will that work? How would this be done? Thanks ahead of time. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Aug 14, 2008, at 11:15 AM, Brock Palen wrote: I see it is fixed now

Re: [Lustre-discuss] Bug 15912

2008-08-18 Thread Brock Palen
my work around. mkfs.lustre --reformat --ost --fsname=nobackup --mgsnode=mds1 --mgsnode=mds2 --mkfsoptions -j -J device=/dev/md27 /dev/md17 Thanks, Though I am scared about the behavior of tunefs.lustre if we ever needed to re-ip the nodes. Reformatting is not really an option. Brock Palen

Re: [Lustre-discuss] It gives error no space left while lustre still have spaces left.

2008-08-20 Thread Brock Palen
. Lustre should not ignore this (and doesn't). I don't know how you would work around this; I don't think a 'use every stripe you can till it's out of space' mode exists. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Aug 21, 2008, at 12:13 AM

Re: [Lustre-discuss] HLRN lustre breakdown

2008-08-21 Thread Brock Palen
On Aug 21, 2008, at 10:22 AM, Troy Benjegerdes wrote: This is a big nasty issue, particularly for HPC applications where performance is a big issue. How does one even begin to benchmark the performance overhead of a parallel filesystem with checksumming? I am having nightmares over the ways

Re: [Lustre-discuss] HLRN lustre breakdown

2008-08-21 Thread Brock Palen
were cpu bound. (two x4500's) Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Aug 21, 2008, at 2:59 PM, Andreas Dilger wrote: On Aug 21, 2008 10:55 -0400, Brock Palen wrote: On Aug 21, 2008, at 10:22 AM, Troy Benjegerdes wrote

[Lustre-discuss] New lustre message

2008-08-21 Thread Brock Palen
it with our old setup) would be great. Thanks, New install is working great, nice product. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

Re: [Lustre-discuss] New lustre message

2008-08-21 Thread Brock Palen
On Aug 21, 2008, at 11:17 PM, Brian J. Murrell wrote: On Thu, 2008-08-21 at 22:23 -0400, Brock Palen wrote: I don't know if this is a bad thing, I was doing a stress of our new lustre install and managed to have a client kicked out with the following message on the OST that kicked it out

[Lustre-discuss] lru_size very small

2008-08-21 Thread Brock Palen
the feeling the cache will not function at all because of the lack of available locks. I don't want to end up on the wrong end of can speed up Lustre dramatically. Thanks. 633 clients, 16 GB MDS/MGS 2x16GB OSS's. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL

Re: [Lustre-discuss] lru_size very small

2008-08-23 Thread Brock Palen
Great! So I read this as being lru_size no longer needs to be manually adjusted. That's great! Thanks! Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Aug 23, 2008, at 7:22 AM, Andreas Dilger wrote: On Aug 22, 2008 15:39 -0400, Brock

[Lustre-discuss] Lustre clients failing, and cant reconnect

2008-09-04 Thread Brock Palen
. Clients and servers are all using TCP. Is this enough information? Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

Re: [Lustre-discuss] Lustre clients failing, and cant reconnect

2008-09-04 Thread Brock Palen
it out. Very strange. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Sep 4, 2008, at 11:34 PM, Brock Palen wrote: Is this enough information? Probably. If you are running 1.6.5, try disabling statahead on all of your clients... # echo 0

Re: [Lustre-discuss] Lustre clients failing, and cant reconnect

2008-09-05 Thread Brock Palen
I had to reboot the MDS to get the problem to go away. I will watch and see if it reappears. I screwed up and deleted the wrong /var/log/messages, so I don't have the messages. I am watching this issue. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936

Re: [Lustre-discuss] lustre-ldiskfs

2008-09-26 Thread Brock Palen
to the cluster, no cmd line download was possible. If anyone knows how to get around this let me know. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Sep 26, 2008, at 6:39 AM, Andreas Dilger wrote: On Sep 26, 2008 10:26 +0530, Chirag

[Lustre-discuss] l_getgroups: no such user

2008-09-26 Thread Brock Palen
users on the filesystem find . -uid # Finds nothing, Does lustre check if a user just cd's to that directory? Or is it for any user that logs in? Is it safe to ignore these messages for non cluster users? Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734

Re: [Lustre-discuss] Adding IB to tcp only cluster

2008-10-10 Thread Brock Palen
On Oct 10, 2008, at 2:45 PM, Brian J. Murrell wrote: On Fri, 2008-10-10 at 11:08 -0400, Brock Palen wrote: We have added a few IB nodes to our cluster (about 70 our of 600 nodes). What would it take to have lustre go over IB as well as tcp for the rest of the hosts? So I'm assuming

Re: [Lustre-discuss] Getting random No space left on device (28)

2008-10-12 Thread Brock Palen
On any client, lfs df -h shows you the usage for all your OSTs in one command. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Oct 12, 2008, at 3:24 PM, Kevin Van Maren wrote: Sounds like one (or more) of your existing OSTs are out

Re: [Lustre-discuss] Lustre 1.6.5.1 on X4200 and STK 6140 Issues

2008-10-13 Thread Brock Palen
working on the load this would happen. Just FYI it was unrelated to lustre (using provided rpm's no kernel build) this solved my problem on the x4500 Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Oct 13, 2008, at 4:41 AM, Malcolm Cowe

Re: [Lustre-discuss] Lustre 1.6.5.1 on X4200 and STK 6140 Issues

2008-10-13 Thread Brock Palen
I never uninstalled it (i still use some of the tools in it) Faultmond is a service, just chkconfig it off. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Oct 13, 2008, at 11:03 AM, Malcolm Cowe wrote: Brock Palen wrote: I know you

[Lustre-discuss] unexpectedly long timeout

2008-11-05 Thread Brock Palen
seconds. I think it's dead, and I am evicting it. Any thoughts? Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

Re: [Lustre-discuss] Is patchless ok for EL4 now?

2008-11-06 Thread Brock Palen
We have been running this for a while. Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Nov 6, 2008, at 10:54 AM, Peter Kjellstrom wrote: After reading http://wiki.lustre.org/index.php? title=Patchless_Client it is my understanding

Re: [Lustre-discuss] Is patchless ok for EL4 now?

2008-11-06 Thread Brock Palen
2.6.9-78.0.1.ELsmp Lustre-1.6.5.1 Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Nov 6, 2008, at 11:18 AM, Peter Kjellstrom wrote: On Thursday 06 November 2008, Brock Palen wrote: We have been running this for a while. Brock Palen

Re: [Lustre-discuss] Clients fail every now and again,

2008-11-18 Thread Brock Palen
Thanks, Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985 On Nov 18, 2008, at 4:47 PM, Andreas Dilger wrote: On Nov 18, 2008 12:14 -0500, Brock Palen wrote: if that is the bug causing this, is the fix till we upgrade to the newer lustre

[Lustre-discuss] lustre/abaqus tweaks for lustre?

2008-11-26 Thread Brock Palen
have twisted on their own systems for this that I can be informed on? Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

[Lustre-discuss] Lustre Intelligence?

2008-12-10 Thread Brock Palen
wanted to do 8 bytes at a time, lustre cleaned it up? Or did Linux do this some place? Brock Palen www.umich.edu/~brockp Center for Advanced Computing [EMAIL PROTECTED] (734)936-1985

[Lustre-discuss] Lustre NOT HEALTHY

2009-01-13 Thread Brock Palen
that often, what information should I collect to report to CFS? Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

Re: [Lustre-discuss] Lustre NOT HEALTHY

2009-01-14 Thread Brock Palen
Ok thanks, It happened again last night, sooner than normal. I will send a new message with the details. Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985 On Jan 13, 2009, at 11:09 PM, Cliff White wrote: Brock Palen wrote: How common

[Lustre-discuss] LBUG ASSERTION(lock-l_resource != NULL) failed

2009-01-14 Thread Brock Palen
, found one machine with lots of dropped packets between the servers, but that is not the client in question. Thank you! If it happens again, and I find any other data I will let you know. Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

Re: [Lustre-discuss] Recovery without end

2009-02-25 Thread Brock Palen
We used to do something similar, and still had issues, Upgrading all servers (2 OSS's 7 OSTs each) and clients (800) to 1.6.6 fixed all our issues, we run default timeout's and default everything really, no issues. Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro

Re: [Lustre-discuss] rdac configuration, please help

2009-02-27 Thread Brock Palen
in the future will push their stuff into DM-Multipath, or just package it with lustre. Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985 On Feb 27, 2009, at 6:34 PM, Adint, Eric (CIV) wrote: ok at this point im desparate i have a rocks cluster

[Lustre-discuss] e2scan for cleaning scratch space

2009-03-04 Thread Brock Palen
e2scan? Is there a way to have e2scan not only list the file but also the mtime/ctime in the log file, so that we can sort oldest to newest? Thank you! Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985
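The oldest-to-newest ordering asked about above can be sketched outside of e2scan. This is a minimal Python illustration, not an e2scan feature: it assumes you already have a list of candidate paths (for example, one path per line from a scan log) and are willing to stat each one from a client, which can be slow on a large Lustre filesystem. The function name `oldest_first` is hypothetical.

```python
import os

def oldest_first(paths):
    """Return (mtime, path) pairs sorted from oldest to newest.

    Paths that have vanished or cannot be stat'ed are skipped, since
    scratch files often disappear between the scan and the cleanup pass.
    """
    stamped = []
    for path in paths:
        try:
            st = os.lstat(path)  # lstat: do not follow symlinks
        except OSError:
            continue
        stamped.append((st.st_mtime, path))
    stamped.sort()  # tuples sort by mtime first, then by path
    return stamped
```

The result can be truncated to the N oldest entries before anything is purged; sorting by `st_ctime` instead is a one-line change.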

Re: [Lustre-discuss] e2scan for cleaning scratch space

2009-03-04 Thread Brock Palen
The e2scan shipped from sun's rpms does not support sqlite3 out of the box: rpm -qf /usr/sbin/e2scan e2fsprogs-1.40.7.sun3-0redhat e2scan: sqlite3 was not detected on configure, database creation is not supported Should I just rebuild only e2scan? Brock Palen www.umich.edu/~brockp Center

[Lustre-discuss] RHEL4 build of lustre patched e2fsprogs

2009-03-16 Thread Brock Palen
just silly. Does anyone have a working patched e2fsprogs from rhel4? Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

[Lustre-discuss] SELinux and lustre clients

2009-03-17 Thread Brock Palen
it is a lustre problem, after working on it a few months with them: https://bugzilla.redhat.com/show_bug.cgi?id=489583 Is this the case? Has anyone managed to run lustre clients on systems with SELinux enabled? Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu

[Lustre-discuss] checking lustre health

2009-05-06 Thread Brock Palen
on? Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

[Lustre-discuss] OpenMX

2009-05-29 Thread Brock Palen
I had the dev of OpenMX on my podcast (www.rce-cast.com) this got me thinking, has anyone ever tried OpenMX with Lustre? In theory it should work, but it wasn't the case with some other tools when asking around. Note we have not tried OpenMX yet, but will evaluate it soon. Brock Palen

[Lustre-discuss] Lustre on Podcast?

2009-06-10 Thread Brock Palen
I host an HPC podcast along with Jeff Squyres at www.rce-cast.com We would like to invite Lustre to be the next guest on the show. Please contact me on or off list if you would like to do this, and if so who should be the point of contact from the Lustre group. Thanks! Brock Palen

Re: [Lustre-discuss] x4540 (thor) panic

2009-06-15 Thread Brock Palen
On Jun 15, 2009, at 11:44 AM, Nirmal Seenu wrote: We have been running the Lustre servers on a machine with Nvidia chipset(nVidia Corporation MCP55 Ethernet (rev a3)) for well over a year now, the following two options seems to work the best on these servers: options forcedeth

[Lustre-discuss] Lustre featured on podcast (HT: Andreas Dilger)

2009-08-03 Thread Brock Palen
Thanks to Andreas for taking an hour out to talk with Jeff Squyres and myself (Brock Palen) about the Lustre cluster filesystem on our podcast www.rce-cast.com, You can find the whole show at: http://www.rce-cast.com/index.php/Podcast/rce-14-lustre-cluster-filesystem.html Thanks again

Re: [Lustre-discuss] Lustre featured on podcast (HT: Andreas Dilger)

2009-08-03 Thread Brock Palen
http://en.wikipedia.org/wiki/Nagle%27s_algorithm Looks like you intentionally hold up data to try to make fatter payloads in packets so they are not 99% header/crc data. Sounds like a way to make latency bad. Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu
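For readers unfamiliar with the trade-off described above, here is a minimal sketch of how an ordinary TCP application opts out of Nagle's algorithm. It uses the standard `TCP_NODELAY` socket option from Python; it says nothing about how Lustre's own socket layer is configured, and the helper names are hypothetical.

```python
import socket

def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 sends small writes immediately instead of
    # holding them back to coalesce into fuller packets.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

def nagle_disabled(sock):
    """Report whether TCP_NODELAY is set on the given socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

Leaving the option unset favors throughput for many tiny writes; setting it favors latency, which is the concern raised in the message.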

[Lustre-discuss] recover borked mds

2009-08-19 Thread Brock Palen
: 176:llog_cat_id2handle()) error opening log id 0xf150010:80d24629: rc -2 Aug 19 12:37:43 mds2 kernel: LustreError: 7525:0:(llog_obd.c: 262:cat_cancel_cb()) Cannot find handle for log 0xf150010 Catch my attention, Thanks, we are running 1.6.6 Brock Palen www.umich.edu/~brockp Center for Advanced

Re: [Lustre-discuss] recover borked mds

2009-08-20 Thread Brock Palen
trying to start up. Is there a way to get lustre to stop trying to open 0xf150010:80d24629: ? And not go though recovery? If not, can I format a new mds, and just untar ROOTS/ and apply the extended attributes to ROOTS from the old mds filesystem? Brock Palen www.umich.edu/~brockp Center

[Lustre-discuss] repquota for lustre

2009-10-23 Thread Brock Palen
I see the bug in bugzilla from version 1.4 that is put on hold, I just want to bump interest for such a tool. If anyone has made something that does quota reports for lustre I would be interested. Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

Re: [Lustre-discuss] repquota for lustre

2009-10-23 Thread Brock Palen
Thanks I am checking it out, Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985 On Oct 23, 2009, at 3:38 PM, Jim Garlick wrote: I wrote a 'repquota' tool that groks lustre: http://sourceforge.net/projects/rquota/ I think LBL has a lustre quota

[Lustre-discuss] mixing server versions

2010-09-15 Thread Brock Palen
to move everything to 1.8 soon but we are in a bind for the moment. Is our only (safe) option to load 1.6.x on the new server also and wait till we can shutdown the filesystem? Thanks! Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

[Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Brock Palen
the 10Gb interface, do I have so much traffic over the 1Gb interface? There is some traffic on the 10Gb interface, but I would like to tell lustre 'don't use the 1Gb interface'. Thanks! Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Brock Palen
On Oct 21, 2010, at 9:48 AM, Joe Landman wrote: On 10/21/2010 09:37 AM, Brock Palen wrote: We recently added a new oss, it has 1 1Gb interface and 1 10Gb interface, The 10Gb interface is eth4 10.164.0.166 The 1Gb interface is eth0 10.164.0.10 They look like they are on the same subnet
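Whether the two addresses quoted above share a subnet depends on the netmask, which the message does not give. The sketch below uses Python's standard `ipaddress` module to check this for a prefix length you supply; the prefix lengths in the comments are assumptions for illustration, not the site's actual configuration.

```python
import ipaddress

def same_subnet(addr_a, addr_b, prefix_len):
    """Return True if both addresses fall in the same network
    for the given prefix length (e.g. 24 for 255.255.255.0)."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a == net_b

# Assumed prefix lengths, for illustration only:
# same_subnet("10.164.0.166", "10.164.0.10", 24)  # eth4 vs eth0 under a /24
# same_subnet("10.164.0.166", "10.164.0.10", 25)  # the same pair under a /25
```

When both interfaces do sit on one subnet, the kernel's routing table, not the faster link, decides which interface carries outbound traffic, which is consistent with the symptom described in this thread.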

Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Brock Palen
a 'back door' management network to get to the box should we have issues with the 10Gb driver. Oddly I ran: ifconfig eth0 down and I could no longer ping the box over the eth4 interface, I had to power cycle it from management. Very odd. bob On 10/21/2010 9:51 AM, Brock Palen wrote: On Oct

Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Brock Palen
On Oct 21, 2010, at 10:35 AM, Brian J. Murrell wrote: On Thu, 2010-10-21 at 10:29 -0400, Brock Palen wrote: We could do this, the 10Gb drivers have been such a pain for us we wanted to have a 'back door' management network to get to the box should we have issues with the 10Gb driver

Re: [Lustre-discuss] finding clients that is opening/closing files

2010-10-26 Thread Brock Palen
This was very helpful, I found the culprit. Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985 On Oct 26, 2010, at 3:42 PM, Wojciech Turek wrote: One way is to check the /proc/fs/lustre/mds/*/exports/*/stats files, which contains per-client
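The per-client stats approach mentioned above can be sketched as a small ranking script. This is an illustration under assumptions: it presumes each stats file contains lines of the form `<operation> <count> samples [unit]`, and the function names are hypothetical. Check the layout on your own MDS before relying on it.

```python
def parse_export_stats(text):
    """Parse one stats file into {operation: count}.

    Assumes lines shaped like 'open 1234 samples [reqs]'; any line
    that does not match that shape (e.g. a timestamp line) is skipped.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "samples" and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return counts

def busiest_clients(stats_by_client, ops=("open", "close")):
    """Rank clients by the summed counts of the given operations.

    stats_by_client maps a client identifier (e.g. the export
    directory name) to the text of that client's stats file.
    """
    totals = {
        client: sum(parse_export_stats(text).get(op, 0) for op in ops)
        for client, text in stats_by_client.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

To use it, read each file matching /proc/fs/lustre/mds/*/exports/*/stats into a dictionary keyed by the export directory name and pass that to `busiest_clients`; taking two readings a few seconds apart and subtracting them gives a rate rather than a lifetime total.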

[Lustre-discuss] mv_sata module for rhel5 and write through patch

2011-05-26 Thread Brock Palen
. Can we upgrade directly from 1.6 to 2.0 if we did this? Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

[Lustre-discuss] Line rate performance for clients

2011-07-29 Thread Brock Palen
ideas why I cannot do 1Gig-e full duplex? Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

Re: [Lustre-discuss] Line rate performance for clients

2011-07-29 Thread Brock Palen
Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985 On Jul 29, 2011, at 2:01 PM, Andreas Dilger wrote: On 2011-07-29, at 11:33 AM, Brock Palen wrote: I think this is a networking question. We have lustre 1.8 clients with 1gig-e interfaces

[Lustre-discuss] Setting lustre directory and content immutable but keep permissions

2011-12-14 Thread Brock Palen
? Thanks! Brock Palen www.umich.edu/~brockp Center for Advanced Computing bro...@umich.edu (734)936-1985

[Lustre-discuss] killing lfs_migrate

2012-02-27 Thread Brock Palen
I will have a limited window to migrate files to a new OST. I would like to get as far as I can in the window I have. Is it safe to kill lfs_migrate while it is still running? If so will it leave any 'partial copies' around? Brock Palen www.umich.edu/~brockp CAEN Advanced Computing bro

Re: [Lustre-discuss] killing lfs_migrate

2012-02-27 Thread Brock Palen
On Feb 27, 2012, at 2:49 PM, Ashley Pittman wrote: On 27 Feb 2012, at 19:30, Brock Palen wrote: I will have a limited window to migrate files to a new OST. I would like to get as far as I can in the window I have. Is it safe to kill lfs_migrate while it is still running? If so

Re: [Lustre-discuss] How to efficiently get sizes of all files stored in Lustre?

2014-09-17 Thread Brock Palen
version. Or if you are using change logs and can have it run all the time, new versions should be fast enough to keep up with changes. Brock Palen www.umich.edu/~brockp CAEN Advanced Computing XSEDE Campus Champion bro...@umich.edu (734)936-1985 On Sep 17, 2014, at 8:44 AM, Alexander Oltu

Re: [lustre-discuss] Lustre on Ceph Block Devices

2017-02-22 Thread Brock Palen
is required. I have dedicated Lustre today for larger systems and they will stay that way. Was just curious if anyone tried this. Brock Palen www.umich.edu/~brockp Director Advanced Research Computing - TS XSEDE Campus Champion bro...@umich.edu (734)936-1985 On Wed, Feb 22, 2017 at 4:54 AM

[lustre-discuss] Lustre on Ceph Block Devices

2017-02-21 Thread Brock Palen
? This probably isn't that different than the Cloud Formation script that uses EBS volumes if it works as intended. Thanks Brock Palen www.umich.edu/~brockp Director Advanced Research Computing - TS XSEDE Campus Champion bro...@umich.edu (734)936-1985