Re: [Lustre-discuss] OST problem

2008-02-22 Thread Wojciech Turek
Hi, Lustre stripes in a round-robin manner. Say you have the stripe count set to 2 and the stripe size set to 1MB. When you start writing a 3GB file from a client to Lustre, it will send a 1MB piece to OST1 and then a 1MB piece to OST2, and it will keep doing that until it has sent 3GB or until
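The round-robin layout described in this message can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Lustre code: the names `ost_for_offset` and `chunks_per_ost` are hypothetical, stripe indexes are zero-based (0 stands for the first OST of the file, 1 for the second), and the sketch assumes plain striping with a fixed stripe size.

```python
from collections import Counter

MIB = 1024 * 1024

def ost_for_offset(offset, stripe_size=MIB, stripe_count=2):
    """Return the zero-based stripe index that holds the byte at `offset`
    under plain round-robin striping (hypothetical helper, not a Lustre API)."""
    return (offset // stripe_size) % stripe_count

def chunks_per_ost(file_size, stripe_size=MIB, stripe_count=2):
    """Count how many stripe-size chunks of a file land on each stripe index."""
    n_chunks = -(-file_size // stripe_size)  # ceiling division
    return Counter(i % stripe_count for i in range(n_chunks))

if __name__ == "__main__":
    # 3GB file, stripe count 2, stripe size 1MB, as in the message
    print(chunks_per_ost(3 * 1024 * MIB))
```

For the 3GB example with a stripe count of 2 and a 1MB stripe size, the file is cut into 3072 chunks and each of the two OSTs receives 1536 of them.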

Re: [Lustre-discuss] [Lustre-announce] Lustre 1.6.4.3 Released!

2008-04-07 Thread Wojciech Turek
lustre-1.6.4.3 doesn't come with support for OFED-1.3. However, there is a solution to that problem which is working for us: https://bugzilla.lustre.org/show_bug.cgi?id=14309 Cheers Wojciech Turek On 4 Apr 2008, at 16:59, Steve Byrnes (stbyrnes) wrote: Is there a version of Lustre available

Re: [Lustre-discuss] add failover node to an existing one

2008-04-08 Thread Wojciech Turek
If you have arranged your nodes to see each other's disk devices then yes, it is a simple configuration change with the tunefs.lustre command. Details are in the Lustre operations manual. Cheers Wojciech On 8 Apr 2008, at 15:00, Papp Tamás wrote: Dear All, Is this possible? The cluster exists and working, but

[Lustre-discuss] lustre and small files overhead lru_size question

2008-06-10 Thread Wojciech Turek
/2005-December/001040.html Thank You, Wojciech Turek ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org http://lists.lustre.org/mailman/listinfo/lustre-discuss

Re: [Lustre-discuss] lustre and small files overhead lru_size question

2008-06-10 Thread Wojciech Turek
On 10 Jun 2008, at 19:18, Jakob Goldbach wrote: My question is why not bigger values? What determines the max lru_size. The number of clients and ram on servers. Is there a recommendation for lustre servers how much max ram one can spend on locks? Locks are held by server and

Re: [Lustre-discuss] specifying OST

2008-07-16 Thread Wojciech Turek
Hi, Type 'lfs help setstripe' on lustre client node lfs help setstripe setstripe: Create a new file with a specific striping pattern or set the default striping pattern on an existing directory or delete the default striping pattern from an existing directory usage: setstripe filename|dirname

Re: [Lustre-discuss] Lustre 1.6.4.3 install/upgrade problem

2008-09-25 Thread Wojciech Turek
-- Wojciech Turek Assistant System Manager High Performance

Re: [Lustre-discuss] recovery_status inactive

2008-09-30 Thread Wojciech Turek
-- Wojciech Turek Assistant System Manager High Performance Computing Service University of Cambridge Email: [EMAIL PROTECTED] Tel: (+)44 1223 763517

Re: [Lustre-discuss] MGS/MDS error: operation 101 on unconnected MGS

2008-09-30 Thread Wojciech Turek

Re: [Lustre-discuss] Kernel panic on RHEL 4 WS

2008-10-01 Thread Wojciech Turek
Hi, This will cure your problem. https://bugzilla.lustre.org/show_bug.cgi?id=16404#c22 Cheers Wojciech Sridhar Gullapalli wrote: Folks, Are there any gotchas that I am missing, I am trying to install Lustre 1.6.5.1 on a RHEL 4 Workstation machine, and getting a kernel panic. Does anyone

Re: [Lustre-discuss] Lustre file system and DNS

2008-10-02 Thread Wojciech Turek
Hi, As far as I know this is correct. To make sure, just run lctl ping from each client to the Lustre servers using names instead of IP addresses. If lctl ping resolves the names correctly, then everything should work fine with names. Cheers Wojciech Minh Hien wrote: Dear all, We had

Re: [Lustre-discuss] recovery_status inactive

2008-10-02 Thread Wojciech Turek
is pretty normal status for lustre targets. I don't think that you can force a lustre target device into recovery without unmounting it. Regards, Wojciech Papp Tamas wrote: Wojciech Turek wrote: Hi, COMPLETE means that this particular OST was in recovery and recovery is now finished. To force

Re: [Lustre-discuss] lustre/drbd/heartbeat setup [was: drbd async mode]

2008-10-02 Thread Wojciech Turek
Hi, Thanks for that. I was thinking about trying drbd on my MDSs, so I find your PDF very useful. Heiko Schroeter wrote: Hello, at last a first version of our setup scenario is ready. Please consider this as a general guideline. It may contain errors. We know that some things are done

Re: [Lustre-discuss] Getting random No space left on device (28)

2008-10-12 Thread Wojciech Turek
Hi, I don't have a script, but if you run the command given below on the client it will produce a list of files that are striped to a particular OST. lfs find --recursive --obd lustrefs-OST_UUID /mnt/lustre Substitute lustrefs-OST_UUID with your full OST and /mnt/lustre with your

Re: [Lustre-discuss] LBUG mds_reint.c, questions about recovery time

2008-10-13 Thread Wojciech Turek
Lustre recovery time is 2.5 x timeout. You can find the timeout by running this command on the MDS: cat /proc/sys/lustre/timeout Thomas Roth wrote: Hi all, I just ran into a LBUG on an MDS still running Lustre Version 1.6.3 with kernel 2.6.18, Debian Etch. kern.log c.f. below. You will probably
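As a worked example of the rule of thumb in this message, the sketch below reads the timeout from the /proc path quoted above and multiplies it by 2.5. The function names are hypothetical, and the 2.5 factor is taken from this message only; it is an assumption that it applies to the Lustre version you are running.

```python
from pathlib import Path

# Factor quoted in the message above; assumed, not read from Lustre itself.
RECOVERY_FACTOR = 2.5

def read_timeout(path="/proc/sys/lustre/timeout"):
    """Read the Lustre timeout (in seconds) from the /proc file named in the message."""
    return int(Path(path).read_text().strip())

def recovery_window(timeout_seconds, factor=RECOVERY_FACTOR):
    """Estimate the recovery time in seconds as factor x timeout."""
    return factor * timeout_seconds
```

With a timeout of 100 seconds, `recovery_window(100)` gives 250.0 seconds; on an MDS you would call `recovery_window(read_timeout())`.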

Re: [Lustre-discuss] 1.6.5.1 patches

2008-11-03 Thread Wojciech Turek
I have some spare hardware (20TB of storage and several PE2950 servers) and I would like to use it as a test platform for the new Lustre version. I noticed that some people talk about testing a beta version of 1.6.6. Can someone tell me where I could obtain an rc version of 1.6.6? Many thanks, Mag

Re: [Lustre-discuss] 1.6.5.1 patches

2008-11-03 Thread Wojciech Turek
Indeed you are right! I used to get messages from the lustre-announce list about such events but it seems it didn't work this time. Thanks JD Neumann wrote: 1.6.6 is released and should be available from the download center. J.D. Wojciech Turek wrote: I have some spare hardware (20TB of storage

Re: [Lustre-discuss] Replacing An Existing OST with a bigger one (unsupported optional features)?!

2008-11-04 Thread Wojciech Turek
Alex wrote: [EMAIL PROTECTED] ~]# lfs df -h
UUID                 bytes    Used  Available  Use%  Mounted on
testfs-MDT_UUID     130.4G  460.1M     122.5G    0%  /mnt/lustre[MDT:0]
testfs-OST_UUID      18.3G   17.4G       2.0M   94%  /mnt/lustre[OST:0]
testfs-OST0001_UUID

Re: [Lustre-discuss] redistribute used space to other OSTs (free space)

2008-11-04 Thread Wojciech Turek
Alex wrote: On Tuesday 04 November 2008 16:37, Brian J. Murrell wrote: On Tue, 2008-11-04 at 15:51 +0200, Alex wrote: [EMAIL PROTECTED] ~]# lfs df -h UUID bytes Used Available Use% Mounted on testfs-MDT_UUID 130.4G460.1M

Re: [Lustre-discuss] redistribute used space to other OSTs (free space)

2008-11-05 Thread Wojciech Turek
Alex wrote: On Tuesday 04 November 2008 18:52, Brian J. Murrell wrote: On Tue, 2008-11-04 at 16:23 +, Wojciech Turek wrote: I don't know how to move a particular object but you could move a whole file to another OST and that would release some space from the full OST. mkdir

Re: [Lustre-discuss] Connection losses to MGS/MDS

2008-12-18 Thread Wojciech Turek
Hi, It doesn't look healthy. I assume that those messages and the numbers are from the client side; what do you see on the MDS server itself? It seems to me that your network connection to the MDS is flaky, hence so many disconnection messages. It maybe doesn't noticeably hurt your bandwidth

Re: [Lustre-discuss] 1.6.5.1 - 1.6.6

2009-01-07 Thread Wojciech Turek

Re: [Lustre-discuss] 1.6.5.1 - 1.6.6

2009-01-08 Thread Wojciech Turek
ays afraid to manipulate the MDT: might go wrong and I end up with 100's of TB of junk (as a restore of backups never works once you need it). But if it's harmless to run writeconf as you said, I will try... Regards, Thomas Wojciech Turek wrote: writeconf forces all the lustre targets (OSTs and MDTs

Re: [Lustre-discuss] Singlehomed to multihomed upgrade

2009-01-08 Thread Wojciech Turek

Re: [Lustre-discuss] How to read from a different client

2009-01-09 Thread Wojciech Turek
Hi, I am a bit confused with your description and the question. Can you please answer my questions below? By OSC do you mean Lustre client node ? Is mount point on the Lustre client /mnt/lfs ? Are sample1 and sample2 files located on Lustre filesystem mounted at /mnt/lfs ? If answer for all

Re: [Lustre-discuss] writeconf needed for 1.6.6?

2009-01-09 Thread Wojciech Turek

Re: [Lustre-discuss] What do clients run on?

2009-01-10 Thread Wojciech Turek
MGS is not the same as MDS, but usually they run on one machine. If you have a dedicated server for the MGS, it can serve as a client. In most cases the MDS can be used as a client too, but not the OSS(s). Arden Wiebe wrote: So if MGS is trivial load then I can safely mount client there? --- On *Sat,

Re: [Lustre-discuss] Singlehomed to multihomed upgrade

2009-01-12 Thread Wojciech Turek
configuration. Mount MDT, OSTs and the client and let me know how it works for you. I also recommend adding the following modprobe.conf line on the clients; although it is not necessary in your case, it will make the configuration more sane. options lnet networks=tcp(eth0) Cheers Wojciech Turek Lukas Hejtmanek wrote

Re: [Lustre-discuss] About MDS failover

2009-01-14 Thread Wojciech Turek
I like HA-linux, however if you are looking for alternatives have a look at RedHat Cluster Suite http://www.redhat.com/docs/manuals/csgfs/browse/rh-cs-en/ Jeffrey Alan Bennett wrote: Hi, What software are people using for MDS failover? I have been using Heartbeat from Linux-HA but I am not

[Lustre-discuss] mds reports that OST is full (ENOSPC error -28 ) but df tells different

2009-01-22 Thread Wojciech Turek
Hello, RHEL4 Kernel 2.6.9-67.0.22smp Lustre-1.6.6 The Lustre MDS reports the following error: Jan 22 15:20:40 mds01.beowulf.cluster kernel: LustreError: 24680:0:(lov_request.c:692:lov_update_create_set()) error creating fid 0xeb79c9d sub-object on OST idx 4/1: rc = -28 Which I translate as that

Re: [Lustre-discuss] mds reports that OST is full (ENOSPC error -28 ) but df tells different

2009-01-22 Thread Wojciech Turek
Hi Brian, Brian J. Murrell wrote: On Thu, 2009-01-22 at 15:44 +, Wojciech Turek wrote: Hello, Hi, Lustre MDS report following error: Jan 22 15:20:40 mds01.beowulf.cluster kernel: LustreError: 24680:0:(lov_request.c:692:lov_update_create_set()) error

Re: [Lustre-discuss] mds reports that OST is full (ENOSPC error -28 ) but df tells different

2009-01-22 Thread Wojciech Turek
-only state of the OST0004? Regards, Wojciech Wojciech Turek wrote: Hi Brian, Brian J. Murrell wrote: On Thu, 2009-01-22 at 15:44 +, Wojciech Turek wrote: Hello, Hi, Lustre MDS report following error: Jan 22 15:20:40 mds01.beowulf.cluster

Re: [Lustre-discuss] mds reports that OST is full (ENOSPC error -28 ) but df tells different

2009-01-22 Thread Wojciech Turek
I am sorry, I should have looked in there before spamming here. Thank you Brian, Wojciech Brian J. Murrell wrote: On Thu, 2009-01-22 at 18:19 +, Wojciech Turek wrote: Hi Brian, I have tried to umount the OST(idx4) and the server LBUGed I attache LBUG below

[Lustre-discuss] OSS Service Thread Count

2009-01-25 Thread Wojciech Turek
Hi, My lustre system specs: Lustre-1.6.6 RHEL4 2 lustre file systems: one consists of 4 OSTs and other consists of 20 OSTs 4 x OSS/6OSTs Storage: S2A9500 Clients: 600 Interconnect: Ethernet I noticed that my OSSs sometimes report very high load (around 500). I read that increasing number

Re: [Lustre-discuss] Performance Expectations of Lustre

2009-01-26 Thread Wojciech Turek
LUNs - Added RAID6 support - Enhanced IPv6 support for all ports - Included Smart Battery (Smart BBU) management - Enabled SNTP on management port - Increased number of snapshots and volume copies per volume from 4 to 8 (an additional Premium Feature Key required) -- Best Regards, Wojciech Turek

[Lustre-discuss] mds device unhealthy - clients got stuck

2009-02-06 Thread Wojciech Turek
Hi, Today our MDS started to behave unstably. The /proc/fs/lustre/health_check file reported that the mds device is not healthy. All clients connected to the ddn_home file system got stuck, the MDS server started to refuse client connections, and after some time it started to evict clients. Can someone

[Lustre-discuss] separating MGS from MDT

2009-03-25 Thread Wojciech Turek
Hello, I would like to move MGS service to separate device. Would it work if I backup my current MDT/MGS device and then create new MGS on separate device and new MDT and then restore the backup to new MDT? I will be grateful for any thought on this subject. Cheers Wojciech

Re: [Lustre-discuss] e2scan question

2009-04-03 Thread Wojciech Turek
We have been using e2scan for a few days and we have noticed that the date specification is not being processed correctly by e2scan. date Fri Apr 3 15:56:49 BST 2009 /usr/sbin/e2scan -C /ROOT -l -N 2009-03-29 19:44:00 /dev/dm-0 file_list generating list of files with mtime newer than Sun

Re: [Lustre-discuss] Bad kernel-lustre-sources? impossible to compile OFED infiniband drivers

2009-07-28 Thread Wojciech Turek

[Lustre-discuss] o2ib and tcp(IPoIB) on the same IB interface.

2009-09-07 Thread Wojciech Turek
) Many thanks for all answers. Best regards, Wojciech

Re: [Lustre-discuss] Client complaining about duplicate inode entry after luster recovery

2009-10-09 Thread Wojciech Turek

Re: [Lustre-discuss] MGS disk size and activity

2009-10-09 Thread Wojciech Turek

Re: [Lustre-discuss] Client complaining about duplicate inode entry after luster recovery

2009-10-10 Thread Wojciech Turek
as the affected ones. I cannot see any problems on the third file system. Wojciech 2009/10/10 Bernd Schubert bs_li...@aakef.fastmail.fm ASSERTION(old_inode->i_state & I_FREEING) is the infamous bug17485. You will need to run lfsck to fix it. On Saturday 10 October 2009, Wojciech Turek wrote: Hi

[Lustre-discuss] lfsck

2009-10-13 Thread Wojciech Turek
I am running lfsck on my file systems right now. Once the runs have finished I would like to rerun lfsck to make sure that all problems were fixed. Do you know if I need to rebuild the mdsdb and ostdbs for the second lfsck run? I can see that lfsck changes timestamps on the db files, so maybe I don't have to

[Lustre-discuss] How to find a file by object id

2009-10-21 Thread Wojciech Turek
I apologize if this question was answered earlier but I can not find it in the mailing list. I have an object ID and I would like to find file that this object is part of. I tried to use lfs find but I can not seem to find right combination of options. Also is there a simple way to list all the

Re: [Lustre-discuss] How to find a file by object id

2009-10-21 Thread Wojciech Turek
Many Thanks Daniel, these hints are very helpful. Wojciech 2009/10/21 Daniel Kobras kob...@linux.de Hi! On Wed, Oct 21, 2009 at 11:58:43AM +0100, Wojciech Turek wrote: I apologize if this question was answered earlier but I can not find it in the mailing list. I have an object ID

Re: [Lustre-discuss] How to find a file by object id

2009-10-21 Thread Wojciech Turek
got into Lustre-1.8 manual. I guess it didn't make its way into the 1.6 manual because the manual has not been updated since May and the bug was resolved in July. Cheers Wojciech 2009/10/21 Brian J. Murrell brian.murr...@sun.com On Wed, 2009-10-21 at 11:58 +0100, Wojciech Turek wrote: I apologize

Re: [Lustre-discuss] file system instability after fsck and lfsck

2009-10-26 Thread Wojciech Turek

Re: [Lustre-discuss] file system instability after fsck and lfsck

2009-10-27 Thread Wojciech Turek
after every fsck run and also dont use the same lustre DB for more than one operation using lfsck. Hope this will help On Tue, Oct 27, 2009 at 12:00 AM, Wojciech Turek wj...@cam.ac.uk wrote: Hi, I had similar problem just three weeks ago on our Lustre 1.6.6 RHEL4. It all started

[Lustre-discuss] Slow file open close on RHEL5

2009-11-11 Thread Wojciech Turek
)) { printf("close error\n"); } return 0; } int main(void) { int i; for (i = 0; i < 30; i++) { openClose(); } return 0; }

Re: [Lustre-discuss] Lustre slow file open close on RHEL5

2009-11-15 Thread Wojciech Turek
/LustreProc.html#50557055_78950 This could probably be your cause. On Thu, Nov 12, 2009 at 3:03 PM, Wojciech Turek wj...@cam.ac.uk wrote: Hi, Cluster running Lustre 1.6.6 Opening and closing files takes longer on RHEL5 than on RHEL4. This only happens with files located on Lustre file

Re: [Lustre-discuss] MD1000 woes and OSS migration suggestions

2009-12-30 Thread Wojciech Turek

[Lustre-discuss] The client profile could not be read from the MGS

2010-01-05 Thread Wojciech Turek
and then run writeconf on them and mount them back, would this recreate the missing files? Also, can I do the above without unmounting the clients (letting them wait until the lustre targets come back), and would this kill any jobs running on them? Many thanks for your input Cheers Wojciech

Re: [Lustre-discuss] Fw: Re: Unable to activate OST

2010-01-15 Thread Wojciech Turek
with that?

Re: [Lustre-discuss] Fw: Re: Unable to activate OST

2010-01-15 Thread Wojciech Turek
On Fri, Jan 15, 2010 at 5:39 PM, Wojciech Turek wj...@cam.ac.uk wrote: Hi, Could you please post output of the 'lctl list_nids' command on OSS system and on MDS system. This will show us which network was configured to work with lustre. Regarding entries in the modprobe.conf, they tell lnet

Re: [Lustre-discuss] Fw: Re: Unable to activate OST

2010-01-15 Thread Wojciech Turek
Could you also post here syslog messages from the OSS? 2010/1/16 Wojciech Turek wj...@cam.ac.uk: Can you check if you can ping MDS and OSS using normal ping command? 2010/1/16 Dusty Marks dustynma...@gmail.com: the output of lctl list_nids on the oss is [r...@oss ~]# lctl list_nids

[Lustre-discuss] Kernel Panic on MDS

2010-01-18 Thread Wojciech Turek
{vfs_read+207} 80179008{sys_read+69} 80110236{system_call+126} Code: 8b 14 90 31 c0 e8 9c d8 03 00 48 98 49 01 c4 8b 13 b8 20 00 RIP 801af8f0{proc_pid_status+534} RSP 010416fc9e48 0Kernel panic - not syncing: Oops -- -- Wojciech Turek Assistant System Manager

Re: [Lustre-discuss] Kernel Panic on MDS

2010-01-18 Thread Wojciech Turek
Thanks Andreas for the quick answer. So upgrading to a newer version of collectl should fix it? Cheers Wojciech 2010/1/18 Andreas Dilger adil...@sun.com: On 2010-01-18, at 19:59, Wojciech Turek wrote: RHEL4 Lustre-1.6.6 Does the kernel panic below ring a bell to anyone? RIP: 0010

Re: [Lustre-discuss] Heartbeat Failover Issues on MDS

2010-01-19 Thread Wojciech Turek

Re: [Lustre-discuss] Lustre failover on IDE disks

2010-01-20 Thread Wojciech Turek

Re: [Lustre-discuss] e2fsprogs conflicts with previously installed packages

2010-01-25 Thread Wojciech Turek

[Lustre-discuss] lustre 1.8.2 patchless client on suse 2.6.31 kernel

2010-03-18 Thread Wojciech Turek
I am trying to compile lustre modules for Open suse 2.6.31 kernel Unfortunately make fails with following error /usr/src/lustre-1.8.2/lustre/llite/lloop.c: In function 'loop_set_fd' /usr/src/lustre-1.8.2/lustre/llite/lloop.c:506: error: implicit declaration of function 'blk_queue_hardsect_size'

Re: [Lustre-discuss] lustre 1.8.2 patchless client on suse 2.6.31 kernel

2010-03-19 Thread Wojciech Turek
number? Best regards, Wojciech On 18 March 2010 22:55, Andreas Dilger adil...@sun.com wrote: On 2010-03-18, at 13:23, Wojciech Turek wrote: I am trying to compile lustre modules for Open suse 2.6.31 kernel Unfortunately make fails with following error /usr/src/lustre-1.8.2/lustre/llite

Re: [Lustre-discuss] lustre 1.8.2 patchless client on suse 2.6.31 kernel

2010-03-22 Thread Wojciech Turek
Thanks Andreas, my mistake was not to search the attachments. Next time I won't bother you. Again, many thanks. Cheers, Wojciech On 20 March 2010 05:46, Andreas Dilger adil...@sun.com wrote: On 2010-03-19, at 08:56, Wojciech Turek wrote: Thanks for a quick answer. I have tried to compile

Re: [Lustre-discuss] lustre 1.8.2 patchless client on suse 2.6.31 kernel

2010-03-22 Thread Wojciech Turek
FYI I have working OpenSUSE Lustre patchless client using kernel 2.6.31.12-0.1-xen and lustre-1.8.2 source. I used info from bug 21500 and http://www.mail-archive.com/lustre-discuss@lists.lustre.org/msg05655.html On 22 March 2010 07:40, Wojciech Turek wj...@cam.ac.uk wrote: Thanks Andreas, my

Re: [Lustre-discuss] Installing Lustre 1.8.2 on CentOS 5.4 with OFED 1.4.2

2010-03-27 Thread Wojciech Turek

Re: [Lustre-discuss] Kernel oops after cat on /proc/fs/lustre/mgs/MGS/exports/*/stats

2010-04-23 Thread Wojciech Turek

Re: [Lustre-discuss] 1.8.1.1 - 1.8.3 upgrade questions

2010-05-25 Thread Wojciech Turek

[Lustre-discuss] umounting server with or without failover

2010-05-31 Thread Wojciech Turek
: umount -f MDS|OSS mount point This stops the server and preserves client export information. When the server restarts, the clients reconnect and resume in-progress transactions.

[Lustre-discuss] uninit_bg

2010-06-01 Thread Wojciech Turek
e2fsprogs-1.41.10.sun2-0redhat.x86_64 Best regards

Re: [Lustre-discuss] I/O error on clients

2010-07-06 Thread Wojciech Turek

Re: [Lustre-discuss] No space left on device on not full filesystem

2010-07-08 Thread Wojciech Turek

[Lustre-discuss] How to determine which lustre clients are loading filesystem.

2010-07-08 Thread Wojciech Turek
that there is a lot of I/O going on (mainly read). I would like to find a good method of finding out which Lustre clients are generating the I/O so I could pinpoint the high load to particular jobs. I hope that some Lustre users can share their experience in this matter. Best regards,

Re: [Lustre-discuss] How to determine which lustre clients are loading filesystem.

2010-07-09 Thread Wojciech Turek

Re: [Lustre-discuss] 1.6.6 to 1.8.3 upgrade, OSS with wrong Target value

2010-07-14 Thread Wojciech Turek

Re: [Lustre-discuss] 1.6.6 to 1.8.3 upgrade, OSS with wrong Target value

2010-07-15 Thread Wojciech Turek
/dev/XXX fsck -pf /dev/XXX Is the above correct? I'd like to move our systems to ext4. I didn't know those steps were necessary. Other answers listed below. Wojciech Turek wrote: Hi Roger, Sorry for the delay. From the ldiskfs messages it seems to me that you are using ext4

Re: [Lustre-discuss] 1.6.6 to 1.8.3 upgrade, OSS with wrong Target value

2010-07-15 Thread Wojciech Turek
. Thanks for your time on this. Roger S. Wojciech Turek wrote: Hi Roger, Lustre 1.8.3 for RHEL5 has two sets of RPMs: one set for the old-style ext3-based ldiskfs and one set for the ext4-based ldiskfs. When upgrading from 1.6.6 to 1.8.3 I think you should not try to use the ext4-based packages

Re: [Lustre-discuss] 1.6.6 to 1.8.3 upgrade, OSS with wrong Target value

2010-07-15 Thread Wojciech Turek
reformatting the MDS and OSSes. Roger S.

Re: [Lustre-discuss] panic on jbd:journal_dirty_metadata

2010-07-22 Thread Wojciech Turek

Re: [Lustre-discuss] I/O errors with NAMD

2010-07-22 Thread Wojciech Turek
Hi Richard, If the cause of the I/O errors is Lustre, there will be some message in the logs. I am seeing a similar problem with some applications that run on our cluster. The symptoms are always the same: just before the application crashes with an I/O error, the node gets evicted with a message like this:

Re: [Lustre-discuss] I/O errors with NAMD

2010-07-23 Thread Wojciech Turek
On 23 July 2010 10:02, Larry tsr...@gmail.com wrote: we have the same problem when running namd in lustre sometimes, the console log suggest file lock expired, but I don't know why. On Fri, Jul 23, 2010 at 8:12 AM, Wojciech Turek wj...@cam.ac.uk wrote: Hi Richard, If the cause of the I/O

Re: [Lustre-discuss] panic on jbd:journal_dirty_metadata

2010-07-25 Thread Wojciech Turek
the fact that the OSTs are nearly full contributes(?). I also see higher usage. In any case, I'll attempt compilation with the patch applied. With best regards, Michael On Jul 22, 2010, at 9:16 , Wojciech Turek wrote: Hi Michael, This looks like the problem we had some time ago after

Re: [Lustre-discuss] Getting weird disk errors, no apparent impact

2010-08-13 Thread Wojciech Turek

Re: [Lustre-discuss] Getting weird disk errors, no apparent impact

2010-08-13 Thread Wojciech Turek
rr_weight priorities
failback immediate
no_path_retry fail
user_friendly_names yes
}
Comment out from multipath.conf file:
blacklist { devnode * }
On Fri, Aug 13, 2010 at 4:31 AM, Wojciech Turek wj...@cam.ac.uk

Re: [Lustre-discuss] Question on setting up fail-over

2010-08-17 Thread Wojciech Turek
-- Wojciech Turek Senior System Architect High Performance Computing Service University of Cambridge Email: wj...@cam.ac.uk Tel: (+)44 1223 763517

Re: [Lustre-discuss] help tracking down extremely high loads on OSSs

2010-10-18 Thread Wojciech Turek

[Lustre-discuss] recovering formatted OST

2010-10-19 Thread Wojciech Turek
Hi, Due to a local disk failure in an OSS, one of our /scratch OSTs was formatted by an automatic installation script. This script created 5 small partitions and a 6th partition consisting of the remaining space on that OST. Nothing else was written to that device since then. Is there a way to recover

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Wojciech Turek
8 33 10484719 sdc1
8 34 4193280 sdc2
8 35 4193280 sdc3
8 36 8387584 sdc4
8 37 7782640640 sdc5
Cheers, Andreas On 2010-10-20, at 9:06, Wojciech Turek wj...@cam.ac.uk wrote: Thank you for quick reply. Unfortunately all partitions were formatted

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Wojciech Turek
Hi Edward, As Andreas mentioned earlier, the max OST size is 16TB if one uses ext4-based ldiskfs. So creating a RAID group bigger than that will definitely hurt your performance, because you would have to split the large array into smaller logical disks, and that randomises IOs on the raid

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Wojciech Turek
as if it were a physical device? Best regards, Wojciech On 20 October 2010 17:41, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-10-20, at 10:15, Wojciech Turek wj...@cam.ac.uk wrote: On 20 October 2010 16:32, Andreas Dilger andreas.dil...@oracle.com wrote

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Wojciech Turek
...@oracle.com wrote: On 2010-10-20, at 11:36, Wojciech Turek wrote: Your help is much appreciated, Andreas. May I ask one more question? I would like to perform the recovery procedure on an image of the disk (I am making it using dd) rather than the physical device. In order to do

Re: [Lustre-discuss] recovering formatted OST

2010-10-21 Thread Wojciech Turek
On 21 October 2010 03:32, Andreas Dilger andreas.dil...@oracle.com wrote: Probably LVM will refuse to create a whole-device PV if there is a partition table. Cheers, Andreas On 2010-10-20, at 18:31, Wojciech Turek wj...@cam.ac.uk wrote: Hi Andreas, If I am going to recreate LVM on the whole

Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Wojciech Turek
Maybe I am missing the point here, but can you explain to me why you would need to have two NICs in one host on the same subnet? If you need an additional access route to your host, why not configure eth0 on a different subnet? On 21 October 2010 15:29, Brock Palen bro...@umich.edu wrote: Why do you

Re: [Lustre-discuss] recovering formatted OST

2010-10-21 Thread Wojciech Turek
for Lustre xattrs, or dump to look at the contents. If none of this shows any results you may just have to give it up as lost. Cheers, Andreas On 2010-10-21, at 6:26, Wojciech Turek wj...@cam.ac.uk wrote: I ran e2fsck -fy on recreated LVM but it segfaulted after running for some time: ... Block

Re: [Lustre-discuss] recovering formatted OST

2010-10-21 Thread Wojciech Turek
regards, Wojciech On 21 October 2010 17:45, Bernd Schubert bs_li...@aakef.fastmail.fm wrote: Hello Wojciech Turek, On Thursday, October 21, 2010, Wojciech Turek wrote: Hi Andreas, I have restarted fsck after the segfault and it ran for several hours and it segfaulted again. Pass

Re: [Lustre-discuss] recovering formatted OST

2010-10-21 Thread Wojciech Turek
Thanks Ken, that worked. On 21 October 2010 17:39, Ken Hornstein k...@cmf.nrl.navy.mil wrote: Now I have another problem. After last segfault I can not restart the fsck due to MMP. [...] Also when I try to access filesystem via debugfs it fails: debugfs -c -R 'ls'

Re: [Lustre-discuss] recovering formatted OST

2010-10-21 Thread Wojciech Turek
44 39 a3 58 01 00 00 75 0e c7 RIP [88034a95] :jbd:cleanup_journal_tail+0x9d/0x118 RSP 81016f00da68 Kernel panic - not syncing: Fatal exception Any idea how to fix this? Many thanks Wojciech On 21 October 2010 17:54, Wojciech Turek wj...@cam.ac.uk wrote: Thanks Ken, that worked

Re: [Lustre-discuss] recovering formatted OST

2010-10-22 Thread Wojciech Turek
panic - not syncing: Fatal exception On 22 October 2010 03:09, Andreas Dilger andreas.dil...@oracle.com wrote: On 2010-10-21, at 18:44, Wojciech Turek wj...@cam.ac.uk wrote: fsck has finished and does not find any more errors to correct. However when I try to mount the device as ldiskfs

Re: [Lustre-discuss] recovering formatted OST

2010-10-22 Thread Wojciech Turek
file (if you can find it) into a new ldiskfs filesystem and then run ll_recover_lost_found_objs on that. On Friday, October 22, 2010, Wojciech Turek wrote: Ok, removing and recreating the journal fixed that problem and I am able to mount device as ldiskfs filesystem. Now I hit another wall

Re: [Lustre-discuss] recovering formatted OST

2010-10-22 Thread Wojciech Turek
a small fake device on a ramdisk and copy files over, run tunefs --writeconf /mdt and then start everything (including all OSTs) again. Cheers, On Friday, October 22, 2010, Wojciech Turek wrote: I have tried Bernd's suggestion and it seems to have worked, after running e2fsck -D
