is required. I have dedicated
Lustre today for larger systems and they will stay that way. Was just
curious if anyone tried this.
Brock Palen
www.umich.edu/~brockp
Director Advanced Research Computing - TS
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985
On Wed, Feb 22, 2017 at 4:54 AM
? This probably isn't that different from the CloudFormation
script that uses EBS volumes, if it works as intended.
Thanks
Brock Palen
www.umich.edu/~brockp
Director Advanced Research Computing - TS
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985
version. Or if you are using change logs
and can have it run all the time, new versions should be fast enough to keep up
with changes.
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985
On Sep 17, 2014, at 8:44 AM, Alexander Oltu
I will have a limited window to migrate files to a new OST. I would like to get
as far as I can in the window I have.
Is it safe to kill lfs_migrate while it is still running?
If so will it leave any 'partial copies' around?
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
bro
On Feb 27, 2012, at 2:49 PM, Ashley Pittman wrote:
On 27 Feb 2012, at 19:30, Brock Palen wrote:
I will have a limited window to migrate files to a new OST. I would like to
get as far as I can in the window I have.
Is it safe to kill lfs_migrate while it is still running?
If so
?
Thanks!
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
ideas why I cannot do 1Gig-e full duplex?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
On Jul 29, 2011, at 2:01 PM, Andreas Dilger wrote:
On 2011-07-29, at 11:33 AM, Brock Palen wrote:
I think this is a networking question.
We have lustre 1.8 clients with 1gig-e interfaces
.
Can we upgrade directly from 1.6 to 2.0 if we did this?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
This was very helpful, I found the culprit.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
On Oct 26, 2010, at 3:42 PM, Wojciech Turek wrote:
One way is to check the /proc/fs/lustre/mds/*/exports/*/stats files, which
contain per-client
the 10Gb
interface, do I have so much traffic over the 1Gb interface? There is some
traffic on the 10Gb interface, but I would like to tell lustre 'don't use the
1Gb interface'.
Thanks!
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
On Oct 21, 2010, at 9:48 AM, Joe Landman wrote:
On 10/21/2010 09:37 AM, Brock Palen wrote:
We recently added a new oss, it has 1 1Gb interface and 1 10Gb
interface,
The 10Gb interface is eth4 10.164.0.166 The 1Gb interface is eth0
10.164.0.10
They look like they are on the same subnet
a 'back door' management network to get to the box should we have issues
with the 10Gb driver.
Oddly I ran:
ifconfig eth0 down
and I could no longer ping the box over the eth4 interface; I had to power cycle
it from management. Very odd.
bob
On 10/21/2010 9:51 AM, Brock Palen wrote:
On Oct
On Oct 21, 2010, at 10:35 AM, Brian J. Murrell wrote:
On Thu, 2010-10-21 at 10:29 -0400, Brock Palen wrote:
We could do this, the 10Gb drivers have been such a pain for us we wanted to
have a 'back door' management network to get to the box should we have
issues with the 10Gb driver
to move everything to 1.8 soon but we are in a bind for the moment.
Is our only (safe) option to load 1.6.x on the new server also and wait till we
can shutdown the filesystem?
Thanks!
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
I see the bug in Bugzilla from version 1.4 that was put on hold; I just
want to bump interest for such a tool.
If anyone has made something that does quota reports for lustre I
would be interested.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
Thanks I am checking it out,
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
On Oct 23, 2009, at 3:38 PM, Jim Garlick wrote:
I wrote a 'repquota' tool that groks lustre:
http://sourceforge.net/projects/rquota/
I think LBL has a lustre quota
trying to start up.
Is there a way to get Lustre to stop trying to open
0xf150010:80d24629? And not go through recovery?
If not, can I format a new MDS, and just untar ROOTS/ and apply
the extended attributes to ROOTS from the old MDS filesystem?
Brock Palen
www.umich.edu/~brockp
Center
:
176:llog_cat_id2handle()) error opening log id 0xf150010:80d24629: rc -2
Aug 19 12:37:43 mds2 kernel: LustreError: 7525:0:(llog_obd.c:
262:cat_cancel_cb()) Cannot find handle for log 0xf150010
Catch my attention,
Thanks, we are running 1.6.6
Brock Palen
www.umich.edu/~brockp
Center for Advanced
Thanks to Andreas for taking an hour out to talk with Jeff Squyres and
myself (Brock Palen) about the Lustre cluster filesystem on our
podcast www.rce-cast.com,
You can find the whole show at:
http://www.rce-cast.com/index.php/Podcast/rce-14-lustre-cluster-filesystem.html
Thanks again
http://en.wikipedia.org/wiki/Nagle%27s_algorithm
Looks like you intentionally hold up data to try to make fatter
payloads in packets so they are not 99% header/crc data. Sounds like
a way to make latency bad.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
On Jun 15, 2009, at 11:44 AM, Nirmal Seenu wrote:
We have been running the Lustre servers on a machine with Nvidia
chipset(nVidia Corporation MCP55 Ethernet (rev a3)) for well over a
year
now, the following two options seem to work best on these
servers:
options forcedeth
I host an HPC podcast along with Jeff Squyres at www.rce-cast.com
We would like to invite Lustre to be the next guest on the show.
Please contact me on or off list if you would like to do this, and if
so who should be the point of contact from the Lustre group.
Thanks!
Brock Palen
I had the dev of OpenMX on my podcast (www.rce-cast.com); this got me
thinking: has anyone ever tried OpenMX with Lustre? In theory it
should work, but it wasn't the case with some other tools when asking
around.
Note we have not tried OpenMX yet, but will evaluate it soon.
Brock Palen
on?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
it is a lustre problem, after working on it a few months
with them:
https://bugzilla.redhat.com/show_bug.cgi?id=489583
Is this the case? Has anyone managed to run lustre clients on
systems with SELinux enabled?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
just silly.
Does anyone have a working patched e2fsprogs from rhel4?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
e2scan?
Is there a way to have e2scan not only list the file but also the
mtime/ctime in the log file, so that we can sort oldest to newest?
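Lacking that support in e2scan itself, plain GNU find can emit the mtime alongside each path so the list sorts oldest-to-newest. This is only a sketch against a scratch directory (the directory and file names are made up), not a real OST:

```shell
# Sketch: list files oldest-to-newest by mtime with GNU find's -printf.
# The scratch directory and file names are illustrative only.
d=$(mktemp -d)
touch -d '2008-01-01' "$d/old.dat"
touch -d '2008-06-01' "$d/new.dat"
# %T@ prints the epoch mtime, so a numeric sort orders oldest first
find "$d" -type f -printf '%T@ %p\n' | sort -n
oldest=$(find "$d" -type f -printf '%T@ %p\n' | sort -n | head -n1)
rm -rf "$d"
```

On a real filesystem you would point find at the mount point, or take e2scan's raw file list and stat the paths afterwards.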
Thank you!
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
The e2scan shipped in Sun's RPMs does not support sqlite3 out of
the box:
rpm -qf /usr/sbin/e2scan
e2fsprogs-1.40.7.sun3-0redhat
e2scan: sqlite3 was not detected on configure, database creation is
not supported
Should I just rebuild only e2scan?
Brock Palen
www.umich.edu/~brockp
Center
in the future will push their stuff into DM-Multipath, or
just package it with lustre.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
On Feb 27, 2009, at 6:34 PM, Adint, Eric (CIV) wrote:
OK, at this point I'm desperate.
I have a Rocks cluster
We used to do something similar, and still had issues,
Upgrading all servers (2 OSSes, 7 OSTs each) and clients (800) to
1.6.6 fixed all our issues; we run default timeouts and default
everything, really. No issues.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro
Ok thanks,
It happened again last night, sooner than normal. I will send a new
message with the details.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
On Jan 13, 2009, at 11:09 PM, Cliff White wrote:
Brock Palen wrote:
How common
, found one machine with lots of dropped packets
between the servers, but that is not the client in question.
Thank you! If it happens again, and I find any other data I will let
you know.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
that often, what information
should I collect to report to CFS?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
wanted to do 8 bytes
at a time, Lustre cleaned it up? Or did Linux do this someplace?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
have twisted on their own systems for this that I can be
informed on?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
Thanks,
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Nov 18, 2008, at 4:47 PM, Andreas Dilger wrote:
On Nov 18, 2008 12:14 -0500, Brock Palen wrote:
if that is the bug causing this, is the fix till we upgrade to the
newer lustre
We have been running this for a while.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Nov 6, 2008, at 10:54 AM, Peter Kjellstrom wrote:
After reading http://wiki.lustre.org/index.php?title=Patchless_Client it is my
understanding
2.6.9-78.0.1.ELsmp
Lustre-1.6.5.1
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Nov 6, 2008, at 11:18 AM, Peter Kjellstrom wrote:
On Thursday 06 November 2008, Brock Palen wrote:
We have been running this for a while.
Brock Palen
seconds. I think it's dead, and I am
evicting it.
Any thoughts?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
working on
the load this would happen. Just FYI, it was unrelated to Lustre
(using provided RPMs, no kernel build); this solved my problem on the
x4500
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Oct 13, 2008, at 4:41 AM, Malcolm Cowe
I never uninstalled it (I still use some of the tools in it).
Faultmond is a service; just chkconfig it off.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Oct 13, 2008, at 11:03 AM, Malcolm Cowe wrote:
Brock Palen wrote:
I know you
On any client
lfs df -h
Shows you usage for all your OSTs in one command.
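For scripting, the same per-OST usage can be pulled apart with awk. A sketch, assuming output shaped roughly like what lfs df -h prints (the sample text below is made up; check the columns on your version):

```shell
# Sketch: flag OSTs at or above 90% usage from lfs-df-shaped output.
# The sample only mimics the general column layout of `lfs df -h`.
sample='UUID                  bytes  Used  Available Use% Mounted on
nobackup-OST0000_UUID   7.0T  6.5T       0.5T  92% /nobackup[OST:0]
nobackup-OST0001_UUID   7.0T  3.1T       3.9T  44% /nobackup[OST:1]'
full=$(printf '%s\n' "$sample" |
  awk '$1 ~ /OST/ { gsub(/%/, "", $5); if ($5 + 0 >= 90) print $1 }')
echo "$full"
```

On a live client you would replace the sample with the output of `lfs df -h` itself.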
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Oct 12, 2008, at 3:24 PM, Kevin Van Maren wrote:
Sounds like one (or more) of your existing OSTs are out
On Oct 10, 2008, at 2:45 PM, Brian J. Murrell wrote:
On Fri, 2008-10-10 at 11:08 -0400, Brock Palen wrote:
We have added a few IB nodes to our cluster (about 70 our of 600
nodes).
What would it take to have lustre go over IB as well as tcp for the
rest of the hosts?
So I'm assuming
to the cluster,
no cmd line download was possible. If anyone knows how to get around
this let me know.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Sep 26, 2008, at 6:39 AM, Andreas Dilger wrote:
On Sep 26, 2008 10:26 +0530, Chirag
users on the filesystem
find . -uid #
Finds nothing,
Does Lustre check if a user just cd's to that directory? Or is it
for any user that logs in?
Is it safe to ignore these messages for non-cluster users?
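One way to see whether a flagged UID actually owns anything is to have find print the numeric owner of each file. A sketch against a scratch directory (on Lustre you would point it at the mount and grep for the UID from the message):

```shell
# Sketch: collect the distinct numeric UIDs owning files under a directory.
# A scratch directory stands in for the Lustre mount here.
d=$(mktemp -d)
touch "$d/a" "$d/b"
uids=$(find "$d" -type f -printf '%U\n' | sort -u)
me=$(id -u)
rm -rf "$d"
echo "$uids"
```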
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734
I had to reboot the MDS to get the problem to go away.
I will watch and see if it reappears. I screwed up and deleted the
wrong /var/log/messages, so I don't have the messages.
I am watching this issue.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936
.
Clients and servers are all using TCP.
Is this enough information?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
it out.
Very strange.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Sep 4, 2008, at 11:34 PM, Brock Palen wrote:
Is this enough information?
Probably. If you are running 1.6.5, try disabling statahead on
all of
your clients...
# echo 0
Great!
So I read this as: lru_size no longer needs to be manually
adjusted. That's great!
Thanks!
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Aug 23, 2008, at 7:22 AM, Andreas Dilger wrote:
On Aug 22, 2008 15:39 -0400, Brock
On Aug 21, 2008, at 10:22 AM, Troy Benjegerdes wrote:
This is a big nasty issue, particularly for HPC applications where
performance is a big issue.
How does one even begin to benchmark the performance overhead of a
parallel filesystem with checksumming? I am having nightmares over the
ways
were cpu
bound. (two x4500's)
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Aug 21, 2008, at 2:59 PM, Andreas Dilger wrote:
On Aug 21, 2008 10:55 -0400, Brock Palen wrote:
On Aug 21, 2008, at 10:22 AM, Troy Benjegerdes wrote
it with
our old setup) would be great.
Thanks,
New install is working great, nice product.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Aug 21, 2008, at 11:17 PM, Brian J. Murrell wrote:
On Thu, 2008-08-21 at 22:23 -0400, Brock Palen wrote:
I don't know if this is a bad thing; I was doing a stress test of our new
Lustre install and managed to have a client kicked out, with the
following message on the OST that kicked it out
the feeling the cache will
not function at all because of the lack of available locks. I don't
want to end up on the wrong end of can speed up Lustre dramatically.
Thanks.
633 clients,
16 GB MDS/MGS
2x16GB OSS's.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL
. Lustre should not ignore this (and
doesn't).
I don't know how you would work around this; a "use every stripe
you can until it's out of space" mode doesn't exist, I think.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Aug 21, 2008, at 12:13 AM
1.6.5.1.
Can I change the MGSSPEC for the OST's after the fact? And will that
work?
How would this be done?
Thanks ahead of time.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Aug 14, 2008, at 11:15 AM, Brock Palen wrote:
I see it is fixed now
my work around.
mkfs.lustre --reformat --ost --fsname=nobackup --mgsnode=mds1 --mgsnode=mds2 --mkfsoptions -j -J device=/dev/md27 /dev/md17
Thanks,
Though I am scared about the behavior of tunefs.lustre if we ever needed
to re-IP the nodes. Reformatting is not really an option.
Brock Palen
work around this? Do I just need to build the
mkfs out of CVS for 1.6.6 ?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
Is the cache patch for mv_sata noted in the Sun paper on the x4500
available? Or has it been rolled into the source distributed by Sun?
Trying to avoid data loss.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
reference.
Regards,
Mike Berg
Sr. Lustre Solutions Engineer
Sun Microsystems, Inc.
Office/Fax: (303) 547-3491
E-mail: [EMAIL PROTECTED]
X4500-preparation.pdf
On Aug 6, 2008, at 1:48 PM, Brock Palen wrote:
Is it still worth the effort to try and build mv_sata? When working
the OST's start booting the client also.
Servers are 1.6.5.1; clients are patchless 1.6.4.1 on RHEL4.
Any insight would be great.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
kernel source packaging.
If it is worth all the pain, if others have already figured it out.
Any help would be appreciated.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
replayed_requests: 0/??
queued_requests: 0
next_transno: 193097794
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
have not tried yanking power yet, but I want to simulate an MDS in a
semi-dead state and ran into this.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
What's a good tool to grab this? It's more than one page long, and the
machine does not have serial ports.
Links are ok.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Jul 31, 2008, at 5:14 PM
]:[EMAIL PROTECTED]
Would that be valid?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Jul 30, 2008, at 10:29 AM, Brian J. Murrell wrote:
On Wed, 2008-07-30 at 09:48 -0400, Brock Palen wrote:
The manual does not make much sense when it comes
-1.6.5.1-2.6.9_67.0.7.EL_lustre.1.6.5.1smp.x86_64.rpm
Or:
kernel-lustre-source-2.6.9-67.0.7.EL_lustre.1.6.5.1.x86_64.rpm
Is there a reason why there is not just a normal:
kernel-lustre-smp-devel
Just like RedHat/SLES provides?
Thanks!
Brock Palen
www.umich.edu/~brockp
Center for Advanced
only once
mppLnx26_spinlock_size.c:102: error: for each function it appears in.)
make: *** [mppLnx_Spinlock_Size] Error 1
I guess what I should really ask is,
Has anyone ever made multipath work with a Sun 2540 array for use as
the MDS/MGS file system?
Brock Palen
www.umich.edu/~brockp
Center
Yes that worked! Thank you very much.
Hint to Sun: the 2540 is a very nice array for Lustre; it would be
good if all the tools with it were checked to work out of the box with
Lustre. Just my 2 cents.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936
Stuart,
It looks like you have a newer RDAC package than Sun has on their
website. So while your makefile builds everything, it tries to
install a bit of code that does not exist. FYI.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
Every so often Lustre locks up. It will recover eventually. The
processes show themselves in 'D' uninterruptible IO wait. In this case
it was 'ar' making an archive.
Dmesg then shows:
Lustre: nobackup-MDT-mdc-0101fc467800: Connection to service
nobackup-MDT via nid [EMAIL
On Jul 21, 2008, at 11:51 AM, Brian J. Murrell wrote:
On Mon, 2008-07-21 at 11:43 -0400, Brock Palen wrote:
Every so often Lustre locks up. It will recover eventually. The
processes show themselves in 'D' uninterruptible IO wait. In this case
: SUCCESS (sc=010038904c40)
and:
Lustre: 6698:0:(lustre_fsfilt.h:306:fsfilt_setattr()) nobackup-
OST0001: slow setattr 100s
Lustre: 6698:0:(watchdog.c:312:lcw_update_time()) Expired watchdog
for pid 6698 disabled after 103.1261s
Thanks
Brock Palen
www.umich.edu/~brockp
Center for Advanced
On Jun 27, 2008, at 1:39 PM, Bernd Schubert wrote:
On Fri, Jun 27, 2008 at 01:07:32PM -0400, Brian J. Murrell wrote:
On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote:
All of them are stuck in un-interruptible sleep.
Has anyone seen this happen before? Is this caused by a pending
disk
On Jun 27, 2008, at 1:07 PM, Brian J. Murrell wrote:
On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote:
All of them are stuck in un-interruptible sleep.
Has anyone seen this happen before? Is this caused by a pending disk
failure?
Well
On Jun 26, 2008, at 1:57 PM, Stew Paddaso wrote:
We are considering using Lustre as our backend file platform. The
specific application involves storing a high-volume of sequential data
writes, with a moderate amount of reads (mostly sequential, with some
random seeks). Our concern is with
.
What about multipath without LVM? Our StorageTek array has dual
controllers with dual ports going to dual port FC cards in the
MDS's. Each MDS has a connection to both controllers so we will need
multipath to get any advantage from this.
Comments?
Brock Palen
www.umich.edu/~brockp
Center
see the most help with? Or should
we just devote these disks to being another OST?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
times when Lustre screws up it recovers, but more and
more it does not, and we see these bulk errors followed by MDS errors.
We are using Lustre 1.6.x.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
pdsh to all the clients, but machines do get rebooted sometimes.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On May 16, 2008, at 4:13 PM, Brian J. Murrell wrote:
On Fri, 2008-05-16 at 15:48 -0400, Brock Palen wrote:
I have seen
that are to be killed.
The plan on our table right now is two Thumpers as the OSSes,
then two x4100s or x4200s with mirrored SAS drives, shared across
with DRBD and Heartbeat.
Any comments? Any issues to be aware of? Anyone running something
similar?
Brock Palen
www.umich.edu/~brockp
Center
for us, we only plan on bonding the 4 Thumper
1Gig-e interfaces.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Apr 23, 2008, at 1:00 PM, Brian Behlendorf wrote:
Recently I have also been doing some linux work with the x4500 and
I have been
On Apr 17, 2008, at 10:48 PM, Kaizaad Bilimorya wrote:
On Thu, 17 Apr 2008, Brock Palen wrote:
I don't think you need to do this.
If I understand right, you can set the stripe size of the mount,
and everything inside that directory inherits it, unless they
themselves were explicitly set
would need to copy them, and move the copy over
the old one, to change to the new stripe settings. Check the Lustre
manual; they have something about this.
You can use 'getstripe' to see what a file/directory use for their
settings.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
in memory.
Might want to verify this, just don't get caught with stuff in ram.
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Apr 14, 2008, at 3:12 PM, Jakob Goldbach wrote:
On Mon, 2008-04-14 at 17:40 +0200, Fereyre Jerome wrote:
Has anybody
Is an /etc/passwd with all the filesystem users' UIDs required only
on the MDS? Or do the OSTs need it also?
Testing for me shows only the MDS, but I could be wrong.
We don't use LDAP or anything like that at the moment for UID GID
mapping.
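A quick way to check what a given node can resolve is getent, which walks the same sources (/etc/passwd, LDAP, NIS, ...) the MDS would. A sketch using UID 0, since that exists everywhere:

```shell
# Sketch: resolve a numeric UID to a name through the name service switch.
# UID 0 is used because it resolves on any sane system.
name=$(getent passwd 0 | cut -d: -f1)
echo "$name"
```

If this comes back empty for a user's UID on the MDS, that UID is unknown there.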
Brock Palen
www.umich.edu/~brockp
Center
right now with patchless clients). Thanks for
all the help you have given us while we have been evaluating it!
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
).
And how you can pull our results, like I use the following on our
Lustre OSS with two OSTs, sda and sdb.
dstat -D sda,sdb,total
That gives me per disk stats and a total.
Similar tools could be made for collectl I'm sure.
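dstat ultimately reads /proc/diskstats, so the same numbers can be pulled with awk. A sketch over a made-up sample line (field positions follow the kernel's iostats documentation: field 3 is the device, 6 is sectors read, 10 is sectors written):

```shell
# Sketch: extract device name, sectors read, and sectors written
# from a /proc/diskstats-shaped line. The sample values are invented.
line='   8       0 sda 120 30 4000 500 80 10 2400 900 0 700 1400'
stats=$(printf '%s\n' "$line" | awk '{ print $3, $6, $10 }')
echo "$stats"
```

On a live OSS you would loop over /proc/diskstats itself instead of a sample line.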
Brock
-Aaron
On Mar 7, 2008, at 7:03 PM, Brock Palen wrote:
On Mar 7
the weekend the MDS/MGS went into an unhealthy state, forcing a
reboot+fsck, and when it came back up the directory was accessible
again and jobs started working again.
-Aaron
On Mar 7, 2008, at 6:45 PM, Brock Palen wrote:
On a file system thats been up for only 57 days, I have:
505 lustre
On Mar 7, 2008, at 8:51 AM, Maxim V. Patlasov wrote:
Brock,
If our IO servers are seeing extended periods of socknal_sd00 at
100% CPU, would this cause a bottleneck?
Yes, I think so.
If so (it's a single-homed host), would adding another interface to
the host help?
Probably not.
someplace ?
Brock Palen
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
If our IO servers are seeing extended periods of socknal_sd00 at 100%
CPU, would this cause a bottleneck? If so (it's a single-homed
host), would adding another interface to the host help?
Is there threading anyplace? Or is a faster CPU the only way out?
Brock Palen
Center for Advanced
16M| 0 0 |
3523 424 | 24M 14M
69 30 0 0 01| 0 8192B|1029k 18M| 0 0 |
3029 88 | 0 0
Patches/comments,
Brock Palen
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
Palen
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985
On Feb 4, 2008, at 2:47 PM, Brock Palen wrote:
Which version of Lustre do you use?
Server and clients same version and same OS? Which one?
lustre-1.6.4.1
The servers (oss and mds/mgs) use the RHEL4 rpm from lustre.org