Re: [Lustre-discuss] OST threads

2011-02-24 Thread Andreas Dilger
Yes, this can be set at startup time to limit the number of started threads. There is a patch I wrote to also reduce the number of running threads, but it hasn't landed yet. Cheers, Andreas On 2011-02-24, at 14:04, Mervini, Joseph A jame...@sandia.gov wrote: I'm inclined to agree. So

Re: [Lustre-discuss] Disabling RDMA on an IB interface

2011-02-23 Thread Andreas Dilger
hardware, so that probably won't help you, but maybe they already measured what you are looking at :-). Cheers, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc. ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org http

[Lustre-discuss] Update of PDSI filesystem stats data

2011-02-23 Thread Andreas Dilger
. Thanks in advance for any contributions. Cheers, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc. ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org http://lists.lustre.org/mailman/listinfo/lustre-discuss

Re: [Lustre-discuss] How to detect process owner on client

2011-02-11 Thread Andreas Dilger
and possibly a rank # to the Lustre RPC messages is definitely something that we've thought about, but it would need help from userspace (MPI, job scheduler, etc) in order to be useful, so it hasn't been done yet. On 11.02.2011 17:34, Andreas Dilger wrote: On the OSS and MDS nodes there are per

Re: [Lustre-discuss] question about size on MDS (MDT) for lustre-1.8

2011-02-03 Thread Andreas Dilger
On 2011-02-01, at 11:38, Jason Rappleye jason.rappl...@nasa.gov wrote: I heard through the grapevine that you suggest not using too few flex_bgs on an ext4 filesystem. Can you elaborate on what might be a reasonable number, and why? My gut feeling is that a flex_bg factor of 256 may give

Re: [Lustre-discuss] question about size on MDS (MDT) for lustre-1.8

2011-01-28 Thread Andreas Dilger
directory blocks in the first group of a flex_bg, so if that entire group is on SSD it would potentially avoid this problem. Cheers, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc.

Re: [Lustre-discuss] MDT raid parameters, multiple MGSes

2011-01-27 Thread Andreas Dilger
On 2011-01-25, at 17:05, Jeremy Filizetti wrote: On Fri, Jan 21, 2011 at 1:02 PM, Andreas Dilger adil...@whamcloud.com wrote: While this runs, it is definitely not correct. The problem is that the client will only connect to a single MGS for configuration updates (in particular, the MGS

Re: [Lustre-discuss] MDT raid parameters, multiple MGSes

2011-01-27 Thread Andreas Dilger
On 2011-01-27, at 08:26, Jason Rappleye wrote: On Jan 27, 2011, at 3:15 AM, Andreas Dilger wrote: The problem is that the client will only connect to a single MGS for configuration updates (in particular, the MGS for the last filesystem that was mounted). If there is a configuration

Re: [Lustre-discuss] llverfs outcome

2011-01-27 Thread Andreas Dilger
, if unspecified, and then use that for the rest of the test. Cheers, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc.

Re: [Lustre-discuss] lustre and software RAID

2011-01-22 Thread Andreas Dilger
Presumably, unlike the order shown below, you run the mkfs.lustre AFTER the mdadm command? Cheers, Andreas On 2011-01-21, at 14:55, Samuel Aparicio sapari...@bccrc.ca wrote: e2fsprogs-1.41.10.sun2-0redhat.rhel5.x86_64 mkfs.lustre --ost --fsname=lustre --reformat --mgsnode=11.1.254.3@tcp0

Re: [Lustre-discuss] MDT raid parameters, multiple MGSes

2011-01-21 Thread Andreas Dilger
Cheers, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc.

Re: [Lustre-discuss] lustre and software RAID

2011-01-21 Thread Andreas Dilger
^^ author: Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others srcversion: D5D8992C8B3E6FCA6ED4FF2 depends: vermagic:2.6.32.20 SMP mod_unload modversions Cheers, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc

Re: [Lustre-discuss] lustre and software RAID

2011-01-21 Thread Andreas Dilger
in it. If that doesn't work then the format failed for some reason. Providing the output of mkfs.lustre -v {options} would help diagnose it. On Jan 21, 2011, at 12:59 PM, Andreas Dilger wrote: On 2011-01-21, at 13:36, Samuel Aparicio wrote: trying to create an ext4 lustre filesystem attached to an OSS

Re: [Lustre-discuss] Problem on OST data recovery

2011-01-14 Thread Andreas Dilger
with the -l option. On Thu, Jan 13, 2011 at 11:09:55AM -0700, Andreas Dilger wrote: On 2011-01-13, at 01:41, thhsieh wrote: I am wondering whether it is possible to recover the OST data ? We have faced the following problem. We installed a new OST server which is intended to replace an old one

Re: [Lustre-discuss] Problem on OST data recovery

2011-01-13 Thread Andreas Dilger
, Andreas -- Andreas Dilger Principal Engineer Whamcloud, Inc.

Re: [Lustre-discuss] Serious problem with OSTs

2010-12-29 Thread Andreas Dilger
On 2010-12-29, at 20:22, Mervini, Joseph A jame...@sandia.gov wrote: And examining the LUN with tunefs.lustre produces the following: [r...@rio37 ~]# tunefs.lustre /dev/sdf checking for existing Lustre data: found last_rcvd tunefs.lustre: Unable to read 1.6 config /tmp/dirUvdBcz/mountdata.

Re: [Lustre-discuss] how to reuse OST indices (EADDRINUSE)

2010-12-21 Thread Andreas Dilger
On 2010-12-21, at 8:58, Charles Taylor tay...@hpc.ufl.edu wrote: So we are evacuating all the OSTs, replacing the Areca 1680ix cards with Adaptec 51645s, re-initializing the LUNs, reformatting the LUNs as OSTs (using the same OST index as before) and remounting them. That is the plan

Re: [Lustre-discuss] howto make a lvm, or virtual lvm?

2010-12-16 Thread Andreas Dilger
On 2010-12-16, at 7:49, Eudes PHILIPPE eu...@cisneo.fr wrote: Now I add a new OSS server, with one OST (1 GB) - On OSS 1, it uses 750 MB of 1000 - On OSS 2, it uses 750 MB of 1000 - On OSS 3, it uses 0 MB of 1000 lfs setstripe -c3 /home on client. I upload a big file, 1.3 GB. It writes on

Re: [Lustre-discuss] howto make a lvm, or virtual lvm?

2010-12-16 Thread Andreas Dilger
index is 2). Regards -Original Message- From: Andreas Dilger [mailto:andreas.dil...@oracle.com] Sent: Wednesday, 15 December 2010 22:39 To: Eudes PHILIPPE Cc: lustre-discuss@lists.lustre.org Subject: Re: [Lustre-discuss] howto make a lvm, or virtual lvm? On 2010-12-15

Re: [Lustre-discuss] Unable to mount OSTs

2010-12-15 Thread Andreas Dilger
for nscratch-OST0001: -38 Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] Renaming an OSS

2010-12-13 Thread Andreas Dilger
(procedure in manual) I don't _think_ the UUIDs or filesystem name are stored elsewhere, but like I said, this has never been tested, so you should probably do this on a test filesystem first and verify it can be mounted properly afterward. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle

Re: [Lustre-discuss] Optimal block size for database ...

2010-12-12 Thread Andreas Dilger
Lustre only supports 4096-byte blocksize for ldiskfs, so that makes your study very easy. I suspect that Oracle database uses 8192-byte blocks, but to be honest I don't know anything about it. That said, Lustre is probably not very well suited for Oracle databases, but I'd be very happy to

Re: [Lustre-discuss] finding performance issues

2010-12-10 Thread Andreas Dilger
busy the disks are. They may be imbalanced due to being different hardware, and will only go as fast as the slowest OSTs. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] sync_journal

2010-12-09 Thread Andreas Dilger
.ost.sync_journal=0 lctl conf_param {fsname}-OST0001.ost.sync_journal=0 : : lctl conf_param {fsname}-OST000N.ost.sync_journal=0 Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.
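For illustration, the per-OST command list in the sync_journal reply above could be generated with a small shell loop. This is only a sketch: the function name and its two arguments are hypothetical, the OST count has to be supplied by the administrator, and the loop only prints the commands so they can be reviewed before being run by hand (conf_param settings are normally applied on the MGS node).

```shell
#!/bin/sh
# Print one "lctl conf_param" line per OST index; nothing is executed.
# gen_sync_journal_cmds is a hypothetical helper, not part of Lustre.
gen_sync_journal_cmds() {
    fsname=$1      # filesystem name, e.g. the {fsname} placeholder above
    ost_count=$2   # number of OSTs, supplied by the caller
    i=0
    while [ "$i" -lt "$ost_count" ]; do
        # OST indices appear in target names as four hex digits.
        printf 'lctl conf_param %s-OST%04x.ost.sync_journal=0\n' "$fsname" "$i"
        i=$((i + 1))
    done
}

# Example: print the commands for a three-OST filesystem named "testfs".
gen_sync_journal_cmds testfs 3
```

Printing rather than executing keeps the sketch safe to try on any machine; the output can be pasted into a shell on the appropriate server once it has been checked.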

Re: [Lustre-discuss] fsck.ext4 for device ... exited with signal 11.

2010-12-02 Thread Andreas Dilger
Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] execute-only Like Ref: Bug 22376

2010-12-02 Thread Andreas Dilger
/MDT /var/log/messages file. Any tips? Settings?? Thank you, Megan Larko Cheers, Andreas -- Andreas Dilger Lustre

Re: [Lustre-discuss] Help

2010-11-29 Thread Andreas Dilger
any idea what the bug numbers were. You could look in the lustre/ChangeLog file and/or search this up in bugzilla to find which Lustre version they were fixed in, but your best bet is simply to upgrade to a newer version of Lustre. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle

Re: [Lustre-discuss] Problems with lfs find

2010-11-29 Thread Andreas Dilger
but the MDS file was not. It is possible to just delete this file using the unlink command - it does not contain any data in any case. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] Reclaiming Reserved Disk Space

2010-11-26 Thread Andreas Dilger
is only about $200 per OST, so you have to balance, say, 25% performance loss from 95% - 99% full vs the $200 saved. On 25 November 2010 23:57, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-11-25, at 13:12, Wojciech Turek wrote: I forgot to ask, is setting reserved % to 0 is safe

Re: [Lustre-discuss] Lustre and SSD hard disk

2010-11-23 Thread Andreas Dilger
of the discard mount option (by other ext4 filesystem developers, not related to Lustre) has shown that it can significantly hurt the io performance due to the slowness of the TRIM commands, and the kernel implements TRIM like a barrier so it slows down all of the IO. Cheers, Andreas -- Andreas Dilger

Re: [Lustre-discuss] Bad lmm_size during open replay for inode

2010-11-23 Thread Andreas Dilger
, I've never seen these messages before. So far we've not noticed any ill effect but would like to know what that message is and if we can safely ignore it. It would only affect the listed inodes, if at all. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc

Re: [Lustre-discuss] Resizing OSTs?

2010-11-14 Thread Andreas Dilger
On 2010-11-09, at 06:16, Brian J. Murrell wrote: On Tue, 2010-11-09 at 14:06 +0100, Roy Dragseth wrote: It would be nice to make use of the extra volume and I'm wondering if we can just take down the OST, extend the logical partition on the array, run resize2fs on the filesystem and be

Re: [Lustre-discuss] Resizing OSTs?

2010-11-14 Thread Andreas Dilger
On 2010-11-14, at 04:28, Andreas Dilger wrote: Speaking with my non-Oracle hat on - I have done offline resizing of OSTs on top of LVM many times w/o problems (subject to other OST size limitations of course). As suggested elsewhere, using the latest Lustre e2fsprogs is important. Also

Re: [Lustre-discuss] non-consecutive OST ordering

2010-11-12 Thread Andreas Dilger
that are making up the bulk of the orphan space usage, you could mount those OSTs as type ldiskfs and delete the objects by hand to free up the space. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] questions about an OST content

2010-11-10 Thread Andreas Dilger
here, did not put that file back in place. Back up and (so far) operating cleanly. Thanks, bob On 11/8/2010 3:04 PM, Andreas Dilger wrote: On 2010-11-08, at 11:39, Bob Ball wrote: Don't know if I sent to the whole list. One of those days. remade the raid device, remade the lustre fs

Re: [Lustre-discuss] questions about an OST content

2010-11-10 Thread Andreas Dilger
option evicted all of the clients, so any of their in-progress operations would have failed. They have all since reconnected and no action is needed. On 11/10/2010 3:00 PM, Andreas Dilger wrote: On 2010-11-10, at 11:01, Bob Ball wrote: Well, we ran 2 days, migrating files off OST

Re: [Lustre-discuss] questions about an OST content

2010-11-08 Thread Andreas Dilger
much, bob On 11/6/2010 11:09 AM, Andreas Dilger wrote: On 2010-11-06, at 8:24, Bob Ballb...@umich.edu wrote: I am emptying a set of OST so that I can reformat the underlying RAID-6 more efficiently. Two questions: 1. Is there a quick way to tell if the OST is really empty? lfs_find takes

Re: [Lustre-discuss] questions about an OST content

2010-11-08 Thread Andreas Dilger
may as well specify the right index from the beginning. On 11/6/2010 11:09 AM, Andreas Dilger wrote: On 2010-11-06, at 8:24, Bob Ballb...@umich.edu wrote: I am emptying a set of OST so that I can reformat the underlying RAID-6 more efficiently. Two questions: 1. Is there a quick way

Re: [Lustre-discuss] questions about an OST content

2010-11-08 Thread Andreas Dilger
definitely be worthwhile for someone to look at this. I filed bug 24128 on this, in case anyone wants to work on it. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] mkfs.lustre fails, ldiskfs: ext4 or ext3 ?

2010-11-03 Thread Andreas Dilger
with Lustre called llverdev and llverfs that can be used to do partial or full data integrity testing of large filesystems (and the underlying block/SCSI/ATA drivers and h/w or s/w RAID). Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc

Re: [Lustre-discuss] Inode xxxxxx has a extra size (144) which is invalid

2010-11-02 Thread Andreas Dilger
, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] system disk with external journals for OSTs formatted

2010-10-27 Thread Andreas Dilger
? The rest of the filesystem errors are very minor. You should probably delete the journal device via tune2fs -O ^has_journal, run a full e2fsck -f and then recreate the journal with tune2fs -J size=400. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc

Re: [Lustre-discuss] recovering formatted OST

2010-10-26 Thread Andreas Dilger
, Wojciech On 22 October 2010 17:15, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-10-22, at 5:42, Bernd Schubert bernd.schub...@fastmail.fm wrote: Hmm, e2fsck didn't catch

Re: [Lustre-discuss] recovering formatted OST

2010-10-22 Thread Andreas Dilger
/0x12db RSP 8101c6481d90 CR2: 0Kernel panic - not syncing: Fatal exception On 22 October 2010 03:09, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-10-21, at 18:44, Wojciech Turek wj...@cam.ac.uk wrote: fsck has finished and does not find any more errors

Re: [Lustre-discuss] recovering formatted OST

2010-10-22 Thread Andreas Dilger
files in the filesystem? Best regards, Wojciech On 22 October 2010 17:15, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-10-22, at 5:42, Bernd Schubert bernd.schub...@fastmail.fm wrote: Hmm, e2fsck didn't catch that? rec_len is the length

Re: [Lustre-discuss] recovering formatted OST

2010-10-21 Thread Andreas Dilger
-0redhat.x86_64 Any idea what may have happened? Cheers Wojciech On 21 October 2010 03:32, Andreas Dilger andreas.dil...@oracle.com wrote: Probably LVM will refuse to create a whole-device PV if there is a partition table. Cheers, Andreas On 2010-10-20, at 18:31, Wojciech Turek wj

Re: [Lustre-discuss] recovering formatted OST

2010-10-21 Thread Andreas Dilger
On 2010-10-21, at 18:44, Wojciech Turek wj...@cam.ac.uk wrote: fsck has finished and does not find any more errors to correct. However when I try to mount the device as ldiskfs kernel panics with following message: Assertion failure in cleanup_journal_tail() at fs/jbd/checkpoint.c:459:

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
significantly. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
. In that case i guess firstly I need to try to recover the LVM information otherwise fsck will not be able to find anything is that right? Best regards, Wojciech On 20 October 2010 08:46, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-10-19, at 17:01, Wojciech Turek wrote

Re: [Lustre-discuss] vanilla kernel with 2.0 version

2010-10-20 Thread Andreas Dilger
I assume your question is related to the server, since clients generally work with vanilla kernels. We are working on the RHEL6 2.6.32 for Lustre 2.1 (available in bugzilla), and I'd hope that this will also work fairly well with the vanilla 2.6.32 kernel. There are no plans to add support

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Andreas Dilger
Is this client CPU or server CPU? If you are using Ethernet it will definitely be CPU hungry and can easily saturate a single core. Cheers, Andreas On 2010-10-20, at 8:41, Michael Kluge michael.kl...@tu-dresden.de wrote: Hi list, is it normal, that a 'dd' or an 'IOR' pushing 10MB blocks

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
On 2010-10-20, at 10:15, Wojciech Turek wj...@cam.ac.uk wrote: On 20 October 2010 16:32, Andreas Dilger andreas.dil...@oracle.com wrote: Right - you need to recreate the LV exactly as it was before. If you created it all at once on the whole LUN then it is likely to be allocated in a linear

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Andreas Dilger
to reduce this overhead, but still not as fast as no checksum at all. Cheers, Andreas On 20.10.2010 18:15, Andreas Dilger wrote: Is this client CPU or server CPU? If you are using Ethernet it will definitely be CPU hungry and can easily saturate a single core. Cheers, Andreas On 2010-10

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
. Also, the original device is much more likely to run e2fsck faster, which will help you get any remaining data back more quickly. On 20 October 2010 17:41, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-10-20, at 10:15, Wojciech Turek wj...@cam.ac.uk wrote: On 20 October 2010 16:32

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
with zeros prior that? I guess creation of the LVM will overwrite it but I am asking just to make sure. Wojciech On 20 October 2010 18:40, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-10-20, at 11:36, Wojciech Turek wrote: Your help is mostly appreciated Andreas. May I ask one

Re: [Lustre-discuss] Lustre-discuss Digest, Vol 57, Issue 22

2010-10-19 Thread Andreas Dilger
truly make a difference. With Best Regards, Norm Morse President and CEO, OpenSFS Cheers, Andreas -- Andreas Dilger

Re: [Lustre-discuss] Maximum OST Size

2010-10-19 Thread Andreas Dilger
to the same value, say 512? No, this will cause mke2fs to fail. There needs to be some free space in the filesystem for the above filesystem/Lustre metadata. In any case, since the maximum number of inodes is 2^32 the total filesystem size is not the limiting factor. Andreas Dilger wrote

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-19 Thread Andreas Dilger
LUNs for a single OST. That is really the best configuration, and will probably double your write performance. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] ldiskfs performance vs. XFS performance

2010-10-18 Thread Andreas Dilger
, and then obdfilter-survey to test the local Lustre IO submission path. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] Maximum OST Size

2010-10-18 Thread Andreas Dilger
will consume 4096 bytes of space and also be slower to access. Can we get more inodes if we use zfs? Definitely yes. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] steps to take to replace a failed ost with permanent data loss

2010-10-15 Thread Andreas Dilger
If the old OST is still accessible, you can copy the last_rcvd and O/0/LAST_ID files over to the reformatted OST, and it should take on the identity of the old OST. The only other thing that identifies the filesystem is the label, which should be set by mkfs.lustre if the

Re: [Lustre-discuss] New lustre-community mailing list

2010-10-14 Thread Andreas Dilger
On 2010-10-13, at 13:08, Andreas Dilger wrote: I'd like to announce the creation of a new lustre-commun...@lists.lustre.org mailing list. After discussions with the various Lustre parties in the community, we thought there should be a new list to focus the meta discussion related to Lustre

[Lustre-discuss] New lustre-community mailing list

2010-10-13 Thread Andreas Dilger
discussions on lustre-devel, and the usage questions and general discussion on lustre-discuss. I think we all agree that maintaining the quality of the Lustre codebase going forward is important to everyone, so I hope we can have a productive discussion. Cheers, Andreas -- Andreas Dilger Lustre

Re: [Lustre-discuss] need help debuggin an access permission problem

2010-09-24 Thread Andreas Dilger
I think there is a bit of confusion here. The MDS is doing the initial authorization for the file, using l_getgroups to access the group information from LDAP (or whatever database is used). Daniel's point was that after the client has gotten access to the file, it will cache this file locally

[Lustre-discuss] List of Lustre Projects

2010-09-24 Thread Andreas Dilger
developers on how best to implement some of these features. Some projects that I think are of particular importance to the community at large are the cleanup and/or removal of the server-side kernel patches, and cleanup/reduction of the ldiskfs patches. Cheers, Andreas -- Andreas Dilger Lustre

Re: [Lustre-discuss] write RPC congestion

2010-09-24 Thread Andreas Dilger
On 2010-09-24, at 18:20, burlen wrote: Andreas Dilger wrote: When one of the server threads is ready to process a read/write request it will get or put the data from/to the buffers that the client already prepared. The number of currently active IO requests is exactly the number of active

Re: [Lustre-discuss] write RPC congestion

2010-09-24 Thread Andreas Dilger
On 2010-09-24, at 19:10, Andreas Dilger wrote: On 2010-09-24, at 18:20, burlen wrote: To be sure I understand this, is it correct that each OST has its own pool of service threads? So system wide number of service threads is bound by oss_max_threads*num_osts? Actually, the current

Re: [Lustre-discuss] need help debuggin an access permission problem

2010-09-23 Thread Andreas Dilger
database/LDAP) from the command-line. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] error e2fsck run for lfsck

2010-09-18 Thread Andreas Dilger
I attached a patch for fixing this problem to the bug. I haven't yet had a chance to test it fully, but it did not fail on my home filesystem which is using pools. Cheers, Andreas On 2010-09-18, at 3:47, Thomas Roth t.r...@gsi.de wrote: Thanks, Daniel. I have tried on another test system

Re: [Lustre-discuss] Multi-Role/Tasking MDS/OSS Hosts

2010-09-17 Thread Andreas Dilger
this and expect recovery to work in a robust manner. The reason is that the MDS is a client of the OSS, and if they are both on the same node that crashes, the OSS will wait for the MDS client to reconnect and will time out recovery of the real clients. Cheers, Andreas -- Andreas Dilger Lustre

Re: [Lustre-discuss] Multi-Role/Tasking MDS/OSS Hosts

2010-09-17 Thread Andreas Dilger
currently build for RHEL5, OEL5, SLES10 and SLES11) kernels or patch your own kernel with the patches. The patches are needed on the Lustre server kernel, but are not needed on the client. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc

Re: [Lustre-discuss] mixing server versions

2010-09-15 Thread Andreas Dilger
with the same 1.6.x release as your other OSTs, and then upgrade them later when you are ready to do them all. That said, I wouldn't _expect_ problems if you run some OSTs with 1.8 for a while, but depending on which two releases you have there could be some issues. Cheers, Andreas -- Andreas

Re: [Lustre-discuss] lnet router tuning

2010-09-10 Thread Andreas Dilger
the expected performance from the components before trying them both together. Often it is necessary to tune the ethernet send/receive buffers. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] Cannot get an OST to activate

2010-09-10 Thread Andreas Dilger
Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation

Re: [Lustre-discuss] Large directory performance

2010-09-10 Thread Andreas Dilger
. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] Large directory performance

2010-09-10 Thread Andreas Dilger
On 2010-09-10, at 17:32, Bernd Schubert wrote: On Saturday, September 11, 2010, Andreas Dilger wrote: Are you using the DDN 9550s for the MDT? That would be a bad configuration, because they can only be configured with RAID-6, and would explain why you are seeing such bad performance

Re: [Lustre-discuss] Announce: Lustre 2.0.0 is available!

2010-09-09 Thread Andreas Dilger
On a related note, there are already patches for SLES11SP1 in bugzilla (for 1.8) and for RHEL 6 (for 2.0) that should be used as the basis for this. Cheers, Andreas On 2010-09-09, at 5:11, Brian J. Murrell brian.murr...@oracle.com wrote: On Thu, 2010-09-09 at 11:54 +0200, Patrick Winnertz

Re: [Lustre-discuss] Announce: Lustre 2.0.0 is available!

2010-09-09 Thread Andreas Dilger
On 2010-09-09, at 4:53, Mag Gam magaw...@gmail.com wrote: For the future releases, will the client ever be part of the stock kernel? There are no current plans to do this, due to the huge amount of work needed, and the fact that it probably wouldn't be accepted without removing all of the

Re: [Lustre-discuss] Problem with LNET and openibd on Lustre 1.8.4 while rebooting

2010-09-09 Thread Andreas Dilger
] system_call+0x7e/0x83 Nirmal Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc

Re: [Lustre-discuss] MDT backup (using tar) taking very long

2010-09-08 Thread Andreas Dilger
On 2010-09-08, at 3:35, Frederik Ferner frederik.fer...@diamond.ac.uk wrote: I can now report that the backup was faster using this new version of tar. It is still not really fast, though. The backup still takes nearly 24h (that is running getfattr followed by tar...). That surprises

Re: [Lustre-discuss] brief 'hangs' on file operations

2010-09-03 Thread Andreas Dilger
It could be both. We had good success with performance improvements up to 1GB journals, and AFAIK some customers had reduced their journal size to 128 MB without significant performance impact to reduce RAM consumption on their OSS. We haven't really done any testing with reduced journal size

Re: [Lustre-discuss] Lustre 1.8.4 Patched Kernel Build

2010-09-03 Thread Andreas Dilger
. That is done automatically during the lustre build to create the ldiskfs module (which is the patched and renamed ext3 or ext4 code). Please see the how to build a lustre kernel page: http://wiki.lustre.org/index.php/Applying_Lustre_Patches_to_a_Kernel Cheers, Andreas -- Andreas Dilger Lustre

Re: [Lustre-discuss] Windows client for lustre 2.0

2010-09-03 Thread Andreas Dilger
to an acceptable level. Unfortunately, there is no release date planned yet. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] brief 'hangs' on file operations

2010-09-02 Thread Andreas Dilger
under a lock that other clients are waiting on. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] lfs 1.8.3 print ost_conn_uuid

2010-09-01 Thread Andreas Dilger
?: lctl get_param osc.*.ost_conn_uuid but it makes sense to make this into lfs osts (as I'd also suggested in that thread). Submitting the patch to Bugzilla is the first step to getting this landed. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc

Re: [Lustre-discuss] Is 1.6.x.x still available

2010-08-30 Thread Andreas Dilger
, and 1.8 is already in the latter mode - it will not be getting any major feature additions and is focussed on stability. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] would patches in kernel_patches/patches change under the same name?

2010-08-26 Thread Andreas Dilger
are sure to get a matching kernel. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] write RPC congestion

2010-08-23 Thread Andreas Dilger
On 2010-08-22, at 11:58, burlen wrote: Andreas Dilger wrote: Currently, 1MB is the largest bulk IO size, and is the typical size used by clients for all IO. Is my understanding correct? A single RPC request will initiate an RDMA transfer of at most max_pages_per_rpc. where the page unit
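As a rough check of the 1MB bulk IO figure quoted in the snippet above, the size of a single bulk RPC can be worked out from the page size and max_pages_per_rpc. The values below are assumptions for illustration (4096-byte pages and max_pages_per_rpc=256), not numbers stated in the excerpt; a system with a different page size or tunable setting would give a different result.

```shell
#!/bin/sh
# Assumed values, not taken from the thread excerpt:
page_size=4096          # bytes per page (typical x86 page size)
max_pages_per_rpc=256   # assumed client-side tunable value

# Upper bound on the data carried by one bulk RPC, in bytes.
rpc_bytes=$((page_size * max_pages_per_rpc))

echo "$rpc_bytes bytes per bulk RPC"            # prints: 1048576 bytes per bulk RPC
echo "$((rpc_bytes / 1024 / 1024)) MiB per bulk RPC"
```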

Re: [Lustre-discuss] Fwd: Lustre and Large Pages

2010-08-20 Thread Andreas Dilger
slocate across multiple filesystems will fill all of RAM with inodes/dentries, and if you pin some of these in memory (e.g. start a shell with some deep directory as CWD), you should quickly be able to fragment your memory with unfreeable inode/dentry allocations. Cheers, Andreas -- Andreas Dilger

Re: [Lustre-discuss] SSD caching of MDT

2010-08-19 Thread Andreas Dilger
On 2010-08-19, at 7:27, LaoTsao 老曹 laot...@gmail.com wrote: IMHO, SSD has more IOPS than disks and has larger capacity than raid/nvram, so it seems that SSD should help in MDS. Do you want SSD in a dual-host env to support failover? regards On 8/19/2010 8:29 AM, Gregory Matthews wrote:

Re: [Lustre-discuss] More detail regarding soft lockup error

2010-08-19 Thread Andreas Dilger
is stored on the MDT and the metadata is small. -Original Message- From: lustre-discuss-boun...@lists.lustre.org [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Fourie Joubert Sent: Wednesday, August 18, 2010 2:01 PM To: Andreas Dilger; lustre-discuss@lists.lustre.org

Re: [Lustre-discuss] More detail regarding soft lockup error

2010-08-19 Thread Andreas Dilger
On 2010-08-19, at 10:49, Brian J. Murrell wrote: On Thu, 2010-08-19 at 10:09 -0600, Andreas Dilger wrote: If you increase the size of the MDT (via resize2fs) it will increase the number of inodes as well. Andreas: what is [y]our confidence level with resize2fs and our MDT? Given that I

Re: [Lustre-discuss] Fwd: Lustre and Large Pages

2010-08-19 Thread Andreas Dilger
-- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.

Re: [Lustre-discuss] More detail regarding soft lockup error

2010-08-18 Thread Andreas Dilger
On 2010-08-18, at 4:14, Fourie Joubert fourie.joub...@up.ac.za wrote: Just reporting some more detail about the soft lockup error I have been getting: I am running Lustre 1.8.1, kernels are from the Lustre distro. Firstly, there is a known corruption bug in 1.8.1, you should at minimum

Re: [Lustre-discuss] write RPC congestion

2010-08-18 Thread Andreas Dilger
are distributed denial-of-service engines it is always possible to overwhelm the server under some conditions. In case of a client RPC timeout (hundreds of seconds under load) the client will resend the request and/or try to contact the backup server until one responds. Cheers, Andreas -- Andreas

Re: [Lustre-discuss] ACLs?

2010-08-17 Thread Andreas Dilger
In very old versions of Lustre it was necessary to enable ACLs explicitly, because of compatibility issues. In newer releases (1.8) they are always available. Cheers, Andreas On 2010-08-17, at 5:50, Andreas Davour dav...@pdc.kth.se wrote: I just read the fine manual about ACL usage. It

Re: [Lustre-discuss] needs_recovery flag?

2010-08-16 Thread Andreas Dilger
The main issue is that tune2fs changing the superblock while the journal is not recovered means any changes will be lost. It is hard to get this 100% correct, since it is possible to set some tunable on the mounted superblock, and replaying the journal in that case would be bad. Running

Re: [Lustre-discuss] ll_ost_creat_* goes bersek (100% cpu used - OST disabled)

2010-08-14 Thread Andreas Dilger
On 2010-08-14, at 2:28, Adrian Ulrich adr...@blinkenlights.ch wrote: - the on-disk structure of the object directory for this OST is corrupted. Run e2fsck -fp /dev/{ostdev} on the unmounted OST filesystem. e2fsck fixed it: The OST is now running since 40 minutes without problems: But

Re: [Lustre-discuss] O_DIRECT

2010-08-14 Thread Andreas Dilger
On 2010-08-14, at 1:32, Michael Kluge michael.kl...@tu-dresden.de wrote: how does Lustre handle write() requests to files opened with O_DIRECT. Does the OSS enforce that the OST has physically written the data to the OST before the op is completed or does the write() call return on the

Re: [Lustre-discuss] ll_ost_creat_* goes bersek (100% cpu used - OST disabled)

2010-08-13 Thread Andreas Dilger
OST filesystem. Cheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.
