Hi,
Lustre stripes in a round-robin manner. So let's say you have the stripe
count set to 2 and the stripe size set to 1MB. When you start writing a
3GB file from a client to Lustre, it will send a 1MB piece to OST1, then
a 1MB piece to OST2, and it will keep alternating like that until it has
sent the whole 3GB.
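For illustration, a minimal sketch of setting up that layout with lfs (the
mount point and directory are just examples):
lfs setstripe -s 1m -c 2 /mnt/lustre/dir
dd if=/dev/zero of=/mnt/lustre/dir/bigfile bs=1M count=3072
The 3GB written by dd would then alternate, 1MB at a time, across the two
OSTs chosen for the directory's default layout.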
lustre-1.6.4.3 doesn't come with support for OFED-1.3.
However, there is a solution for that problem which is working for us:
https://bugzilla.lustre.org/show_bug.cgi?id=14309
Cheers
Wojciech Turek
On 4 Apr 2008, at 16:59, Steve Byrnes (stbyrnes) wrote:
Is there a version of Lustre available
If you have arranged your nodes so that they can see each other's disk
devices, then yes, it is a simple configuration with the tunefs.lustre
command. Details are in the Lustre operations manual.
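As a hedged sketch of the kind of thing tunefs.lustre does here (the device
and NID below are made up for illustration), adding a failover server to an
existing target looks like:
tunefs.lustre --failnode=10.0.0.2@tcp /dev/sdb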
Cheers
Wojciech
On 8 Apr 2008, at 15:00, Papp Tamás wrote:
Dear All,
Is this possible?
The cluster exists and is working, but
/2005-December/001040.html
Thank You,
Wojciech Turek
On 10 Jun 2008, at 19:18, Jakob Goldbach wrote:
My question is: why not bigger values? What determines the max
lru_size?
The number of clients and the RAM on the servers.
Is there a recommendation for Lustre servers on how much RAM one can
spend on locks?
Locks are held by the server and
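For reference, a hedged example of inspecting and pinning lru_size on a
client (the namespace glob and value are illustrative):
lctl get_param ldlm.namespaces.*.lru_size
lctl set_param ldlm.namespaces.*.lru_size=2000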
Hi,
Type 'lfs help setstripe' on a Lustre client node:
lfs help setstripe
setstripe: Create a new file with a specific striping pattern or
set the default striping pattern on an existing directory or
delete the default striping pattern from an existing directory
usage: setstripe filename|dirname
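A quick hedged example of the second use, setting a directory default and
then verifying it with lfs getstripe (paths are illustrative):
lfs setstripe -s 1m -c 2 /mnt/lustre/mydir
lfs getstripe /mnt/lustre/mydir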
Hi,
This will cure your problem.
https://bugzilla.lustre.org/show_bug.cgi?id=16404#c22
Cheers
Wojciech
Sridhar Gullapalli wrote:
Folks,
Are there any gotchas that I am missing? I am trying to install Lustre
1.6.5.1 on a RHEL 4 Workstation machine and getting a kernel panic.
Does anyone
Hi,
As far as I know this is correct. To make sure, just run lctl ping from
each client to the Lustre servers using names instead of IP addresses.
If lctl ping resolves the names correctly, then everything should work
fine when using names.
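For example (the server name below is hypothetical):
lctl ping oss01@tcp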
Cheers
Wojciech
Minh Hien wrote:
Dear all,
We had
is a pretty normal status for Lustre targets. I don't think that you can
force a Lustre target device into recovery without unmounting it.
Regards,
Wojciech
Papp Tamas wrote:
Wojciech Turek wrote:
Hi,
COMPLETE means that this particular OST was in recovery and recovery
is now finished.
To force
Hi,
Thanks for that. I was thinking about trying DRBD on my MDSs, so I find
your PDF very useful.
Heiko Schroeter wrote:
Hello,
at last a first version of our setup scenario is ready.
Please consider this as a general guideline. It may contain errors.
We know that some things are done
Hi,
I don't have a script, but if you run the command given below on the client,
it will produce a list of files that are striped onto a particular OST.
lfs find --recursive --obd lustrefs-OST_UUID /mnt/lustre
Substitute lustrefs-OST_UUID with your full OST UUID and /mnt/lustre
with your
Lustre recovery time is 2.5 x timeout.
You can find the timeout by running this command on the MDS:
cat /proc/sys/lustre/timeout
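For example, with the default timeout of 100 seconds that gives a recovery
window of 2.5 x 100 = 250 seconds.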
Thomas Roth wrote:
Hi all,
I just ran into an LBUG on an MDS still running Lustre version 1.6.3 with
kernel 2.6.18, Debian Etch.
kern.log c.f. below. You will probably
I have some spare hardware (20TB of storage and several PE2950 servers)
and I would like to use it as a test platform for new Lustre versions. I
noticed that some people talk about testing a beta version of 1.6.6. Can
someone tell me where I could obtain an rc version of 1.6.6?
Many thanks,
Mag
Indeed you are right! I used to get messages from the lustre-announce list
about such events, but it seems it didn't work this time.
Thanks
JD Neumann wrote:
1.6.6 is released and should be available from the download center.
J.D.
Wojciech Turek wrote:
I have some spare hardware (20TB of storage
Alex wrote:
[EMAIL PROTECTED] ~]# lfs df -h
UUID                  bytes    Used  Available  Use%  Mounted on
testfs-MDT_UUID      130.4G  460.1M     122.5G    0%  /mnt/lustre[MDT:0]
testfs-OST_UUID       18.3G   17.4G       2.0M   94%  /mnt/lustre[OST:0]
testfs-OST0001_UUID
Alex wrote:
On Tuesday 04 November 2008 16:37, Brian J. Murrell wrote:
On Tue, 2008-11-04 at 15:51 +0200, Alex wrote:
[EMAIL PROTECTED] ~]# lfs df -h
UUID                  bytes    Used  Available  Use%  Mounted on
testfs-MDT_UUID      130.4G  460.1M
Alex wrote:
On Tuesday 04 November 2008 18:52, Brian J. Murrell wrote:
On Tue, 2008-11-04 at 16:23 +, Wojciech Turek wrote:
I don't know how to move a particular object but you could move a
whole file to another OST and that would release some space from the
full OST.
mkdir
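A hedged sketch of that approach (the OST index, paths, and file name are
illustrative): create a directory striped onto an emptier OST, copy the file
so its objects are re-allocated there, then rename it over the original.
mkdir /mnt/lustre/tmpdir
lfs setstripe -i 5 -c 1 /mnt/lustre/tmpdir
cp /mnt/lustre/bigfile /mnt/lustre/tmpdir/bigfile
mv /mnt/lustre/tmpdir/bigfile /mnt/lustre/bigfile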
Hi,
It doesn't look healthy. I assume that those messages and the numbers
are from the client side; what do you see on the MDS server itself?
It seems to me that your network connection to the MDS is flaky, hence
so many disconnection messages. It maybe doesn't noticeably hurt your
bandwidth
always afraid to manipulate the MDT: it might go wrong and I end up
with 100's of TB of junk (as a restore of backups never works once you
need it).
But if it's harmless to run writeconf as you said, I will try...
Regards,
Thomas
Wojciech Turek wrote:
writeconf forces all the lustre targets (OSTs and MDTs
Hi,
I am a bit confused by your description and the question. Can you
please answer my questions below?
By OSC do you mean a Lustre client node?
Is the mount point on the Lustre client /mnt/lfs?
Are the sample1 and sample2 files located on the Lustre filesystem mounted at
/mnt/lfs?
If the answer to all
The MGS is not the same as the MDS, but usually they run on one machine.
If you have a dedicated server for the MGS, it can serve as a client. In most
cases the MDS can be used as a client too, but not the OSS(s).
Arden Wiebe wrote:
So if MGS is trivial load then I can safely mount client there?
--- On *Sat,
configuration.
Mount the MDT, the OSTs, and the client, and let me know how it works for you.
I also recommend adding a modprobe.conf line on the clients; although
this is not necessary in your case, it will make the configuration more
sane:
options lnet networks=tcp(eth0)
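If the clients ever grow a second fabric, the same option takes a
comma-separated list; a hedged example (interface names are illustrative):
options lnet networks=tcp0(eth0),o2ib0(ib0)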
Cheers
Wojciech Turek
Lukas Hejtmanek wrote
I like Linux-HA; however, if you are looking for alternatives, have a look
at the Red Hat Cluster Suite:
http://www.redhat.com/docs/manuals/csgfs/browse/rh-cs-en/
Jeffrey Alan Bennett wrote:
Hi,
What software are people using for MDS failover?
I have been using Heartbeat from Linux-HA but I am not
Hello,
RHEL4
Kernel 2.6.9-67.0.22smp
Lustre-1.6.6
The Lustre MDS reported the following error:
Jan 22 15:20:40 mds01.beowulf.cluster kernel: LustreError:
24680:0:(lov_request.c:692:lov_update_create_set()) error creating fid
0xeb79c9d sub-object on OST idx 4/1: rc = -28
Which I translate as OST idx 4 being out of space (rc = -28 is -ENOSPC).
Hi Brian,
Brian J. Murrell wrote:
On Thu, 2009-01-22 at 15:44 +, Wojciech Turek wrote:
Hello,
Hi,
The Lustre MDS reported the following error:
Jan 22 15:20:40 mds01.beowulf.cluster kernel: LustreError:
24680:0:(lov_request.c:692:lov_update_create_set()) error
-only state of the OST0004?
Regards,
Wojciech
Wojciech Turek wrote:
Hi Brian,
Brian J. Murrell wrote:
On Thu, 2009-01-22 at 15:44 +, Wojciech Turek wrote:
Hello,
Hi,
The Lustre MDS reported the following error:
Jan 22 15:20:40 mds01.beowulf.cluster
I am sorry, I should have looked in there before spamming here.
Thank you Brian,
Wojciech
Brian J. Murrell wrote:
On Thu, 2009-01-22 at 18:19 +, Wojciech Turek wrote:
Hi Brian,
I have tried to unmount the OST (idx 4) and the server LBUGed.
I attach the LBUG below.
Hi,
My lustre system specs:
Lustre-1.6.6
RHEL4
2 Lustre file systems: one consists of 4 OSTs and the other consists of
20 OSTs
4 x OSS, 6 OSTs each
Storage: S2A9500
Clients: 600
Interconnect: Ethernet
I noticed that my OSSs sometimes report a very high load (around 500). I
read that increasing the number
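If the number being referred to is the count of OSS service threads (my
assumption), a hedged example of setting it via a module option in
modprobe.conf, with an illustrative value, would be:
options ost oss_num_threads=256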
LUNs
- Added RAID6 support
- Enhanced IPv6 support for all ports
- Included Smart Battery (Smart BBU) management
- Enabled SNTP on management port
- Increased number of snapshots and volume copies per volume from 4 to
8 (an additional Premium Feature Key required)
--
Best Regards,
Wojciech Turek
Hi,
Today our MDS started to behave unstably. The /proc/fs/lustre/health_check
file reported that the MDS device is not healthy. All clients connected to
the ddn_home file system got stuck, and the MDS server started to refuse
client connections; after some time it started to evict clients. Can
someone
Hello,
I would like to move the MGS service to a separate device. Would it work if I
back up my current MDT/MGS device, then create a new MGS on a separate
device and a new MDT, and then restore the backup to the new MDT?
I will be grateful for any thought on this subject.
Cheers
Wojciech
We have been using e2scan for a few days and we have noticed that the date
specification is not being processed correctly by e2scan.
date
Fri Apr 3 15:56:49 BST 2009
/usr/sbin/e2scan -C /ROOT -l -N "2009-03-29 19:44:00" /dev/dm-0 > file_list
generating list of files with
mtime newer than Sun
Many thanks for all answers.
Best regards,
Wojciech
as the affected ones. I can not see any
problems on the third file system.
Wojciech
2009/10/10 Bernd Schubert bs_li...@aakef.fastmail.fm
ASSERTION(old_inode->i_state & I_FREEING) is the infamous bug 17485. You
will need to run lfsck to fix it.
On Saturday 10 October 2009, Wojciech Turek wrote:
Hi
I am running lfsck on my file systems right now. Once they have finished I
would like to re-run lfsck to make sure that all problems were cleaned up. Do
you know if I need to rebuild the mdsdb and ostdbs for the second lfsck run?
I can see that lfsck changes the timestamps on the db files, so maybe I don't
have to.
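For context, a hedged sketch of how those databases get built in the first
place (device paths and db file names are illustrative):
e2fsck -n -v --mdsdb /tmp/mdsdb /dev/mdtdev                        # on the MDS
e2fsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb0 /dev/ostdev   # on each OST
lfsck -n --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb0 /mnt/lustre       # on a client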
I apologize if this question was answered earlier, but I cannot find it in
the mailing list.
I have an object ID and I would like to find the file that this object is
part of. I tried to use lfs find, but I cannot seem to find the right
combination of options.
Also, is there a simple way to list all the
Many Thanks Daniel, these hints are very helpful.
Wojciech
2009/10/21 Daniel Kobras kob...@linux.de
Hi!
On Wed, Oct 21, 2009 at 11:58:43AM +0100, Wojciech Turek wrote:
I apologize if this question was answered earlier, but I cannot find it in
the mailing list.
I have an object ID
got into the
Lustre 1.8 manual. I guess it didn't make its way into the 1.6 manual because
that manual had not been updated since May and the bug was resolved in July.
Cheers
Wojciech
2009/10/21 Brian J. Murrell brian.murr...@sun.com
On Wed, 2009-10-21 at 11:58 +0100, Wojciech Turek wrote:
I apologize
after every fsck run, and also don't use the
same Lustre DB for more than one operation using lfsck.
Hope this will help.
On Tue, Oct 27, 2009 at 12:00 AM, Wojciech Turek wj...@cam.ac.uk wrote:
Hi,
I had a similar problem just three weeks ago on our Lustre 1.6.6 RHEL4 system.
It all started
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
/* the start of openClose() was truncated in the archive; the open() call
   and path below are reconstructed guesses, not the original code */
int openClose(void) {
    int fd = open("/mnt/lustre/testfile", O_CREAT | O_RDWR, 0644);
    if (fd < 0) {
        printf("open error\n");
        return -1;
    }
    if (close(fd)) {
        printf("close error\n");
    }
    return 0;
}
int main(void) {
    int i;
    for (i = 0; i < 30; i++) {
        openClose();
    }
    return 0;
}
/LustreProc.html#50557055_78950
This could probably be your cause.
On Thu, Nov 12, 2009 at 3:03 PM, Wojciech Turek wj...@cam.ac.uk wrote:
Hi,
Cluster running Lustre 1.6.6
Opening and closing files takes longer on RHEL5 than on RHEL4. This only
happens with files located on a Lustre file
and then run writeconf on them and
mount them back, would this recreate the missing files?
Also, can I do the above without unmounting the clients (letting them wait
until the Lustre targets come back), and would this kill any jobs running on
them?
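For reference, a hedged sketch of the writeconf procedure itself (the device
name is a placeholder): unmount the clients and all targets, then on each
server run
tunefs.lustre --writeconf /dev/<target>
and remount the MGS/MDT first, then the OSTs, then the clients.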
Many thanks for your input
Cheers
Wojciech
with that?
On Fri, Jan 15, 2010 at 5:39 PM, Wojciech Turek wj...@cam.ac.uk wrote:
Hi,
Could you please post the output of the 'lctl list_nids' command on the OSS
system and on the MDS system? This will show us which network was
configured to work with Lustre.
Regarding the entries in modprobe.conf, they tell LNET
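For anyone unfamiliar with it, the output is just the node's NIDs, one per
line; a made-up example:
lctl list_nids
10.143.0.11@tcp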
Could you also post here syslog messages from the OSS ?
2010/1/16 Wojciech Turek wj...@cam.ac.uk:
Can you check if you can ping MDS and OSS using normal ping command?
2010/1/16 Dusty Marks dustynma...@gmail.com:
the output of lctl list_nids on the oss is
[r...@oss ~]# lctl list_nids
{vfs_read+207}
80179008{sys_read+69} 80110236{system_call+126}
Code: 8b 14 90 31 c0 e8 9c d8 03 00 48 98 49 01 c4 8b 13 b8 20 00
RIP 801af8f0{proc_pid_status+534} RSP 010416fc9e48
<0>Kernel panic - not syncing: Oops
Thanks Andreas for the quick answer. So upgrading to a newer version of
collectl should fix it?
Cheers
Wojciech
2010/1/18 Andreas Dilger adil...@sun.com:
On 2010-01-18, at 19:59, Wojciech Turek wrote:
RHEL4 Lustre-1.6.6
Does the kernel panic below ring a bell for anyone?
RIP: 0010
I am trying to compile the Lustre modules for an OpenSUSE 2.6.31 kernel.
Unfortunately make fails with the following error:
/usr/src/lustre-1.8.2/lustre/llite/lloop.c: In function 'loop_set_fd'
/usr/src/lustre-1.8.2/lustre/llite/lloop.c:506: error: implicit declaration
of function 'blk_queue_hardsect_size'
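For what it's worth, in mainline kernels around 2.6.31
blk_queue_hardsect_size() was replaced by blk_queue_logical_block_size(), so
a compat shim along these lines (my own sketch, not the official patch) lets
that call compile:
/* map the removed helper onto its 2.6.31+ replacement */
#define blk_queue_hardsect_size(q, size) blk_queue_logical_block_size((q), (size))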
number?
Best regards,
Wojciech
On 18 March 2010 22:55, Andreas Dilger adil...@sun.com wrote:
On 2010-03-18, at 13:23, Wojciech Turek wrote:
I am trying to compile lustre modules for Open suse 2.6.31 kernel
Unfortunately make fails with following error
/usr/src/lustre-1.8.2/lustre/llite
Thanks Andreas, my mistake was not searching the attachments. Next time I
won't bother you. Again, many thanks.
Cheers,
Wojciech
On 20 March 2010 05:46, Andreas Dilger adil...@sun.com wrote:
On 2010-03-19, at 08:56, Wojciech Turek wrote:
Thanks for a quick answer. I have tried to compile
FYI, I have a working OpenSUSE Lustre patchless client using kernel
2.6.31.12-0.1-xen and the lustre-1.8.2 source. I used info from bug 21500 and
http://www.mail-archive.com/lustre-discuss@lists.lustre.org/msg05655.html
On 22 March 2010 07:40, Wojciech Turek wj...@cam.ac.uk wrote:
Thanks Andreas, my
umount -f <MDS|OSS mount point>
This stops the server and preserves client export information. When the server
restarts, the clients reconnect and resume in-progress transactions.
e2fsprogs-1.41.10.sun2-0redhat.x86_64
Best regards
that there
is a lot of I/O going on (mainly reads). I would like to find a good method
of finding out which Lustre clients are generating the I/O, so I could
pinpoint the high load to particular jobs. I hope that some Lustre users
can share their experience in that matter.
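One approach I can sketch (hedged; the proc paths follow the 1.x layout and
your target names will differ): each OSS keeps per-client export stats, so
something like
for f in /proc/fs/lustre/obdfilter/*/exports/*/stats; do echo $f; grep -E 'read_bytes|write_bytes' $f; done
run on the OSSs shows which client NIDs are doing the reads and writes.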
Best regards,
Wojciech
/dev/XXX
fsck -pf /dev/XXX
Is the above correct? I'd like to move our systems to ext4. I didn't know
those steps were necessary.
Other answers listed below.
Wojciech Turek wrote:
Hi Roger,
Sorry for the delay. From the ldiskfs messages it seems to me that you are
using ext4
Thanks for your time on this.
Roger S.
Wojciech Turek wrote:
Hi Roger,
Lustre 1.8.3 for RHEL5 has two sets of RPMs: one set for the old-style
ext3-based ldiskfs and one set for the ext4-based ldiskfs. When upgrading
from 1.6.6 to 1.8.3, I think you should not try to use the ext4-based packages
reformatting the MDS and OSSes.
Roger S.
Hi Richard,
If the cause of the I/O errors is Lustre, there will be some messages in the
logs. I am seeing a similar problem with some applications that run on our
cluster. The symptoms are always the same: just before the application
crashes with an I/O error, the node gets evicted with a message like this:
On 23 July 2010 10:02, Larry tsr...@gmail.com wrote:
We have the same problem when running NAMD on Lustre sometimes; the
console log suggests a file lock expired, but I don't know why.
On Fri, Jul 23, 2010 at 8:12 AM, Wojciech Turek wj...@cam.ac.uk wrote:
Hi Richard,
If the cause of the I/O
the fact that the OSTs are
nearly full contributes(?). I also see higher usage.
In any case, I'll attempt compilation with the patch applied.
With best regards,
Michael
On Jul 22, 2010, at 9:16 , Wojciech Turek wrote:
Hi Michael,
This looks like the problem we had some time ago after
rr_weight priorities
failback immediate
no_path_retry fail
user_friendly_names yes
}
Comment out from the multipath.conf file:
blacklist {
    devnode "*"
}
On Fri, Aug 13, 2010 at 4:31 AM, Wojciech Turek wj...@cam.ac.uk
Hi
Due to a local disk failure in an OSS, one of our /scratch OSTs was
formatted by an automatic installation script. This script created 5 small
partitions and a 6th partition consisting of the remaining space on that OST.
Nothing else has been written to that device since then. Is there a way to
recover
8 33   10484719 sdc1
8 34    4193280 sdc2
8 35    4193280 sdc3
8 36    8387584 sdc4
8 37 7782640640 sdc5
Cheers, Andreas
On 2010-10-20, at 9:06, Wojciech Turek wj...@cam.ac.uk wrote:
Thank you for quick reply.
Unfortunately all partitions were formatted
Hi Edward,
As Andreas mentioned earlier, the max OST size is 16TB if one uses the
ext4-based ldiskfs. So creating a RAID group bigger than that will definitely
hurt your performance, because you would have to split the large array into
smaller logical disks, and that randomises I/Os on the RAID
as if it was a physical device?
Best regards,
Wojciech
On 20 October 2010 17:41, Andreas Dilger andreas.dil...@oracle.com wrote:
On 2010-10-20, at 10:15, Wojciech Turek wj...@cam.ac.uk wrote:
On 20 October 2010 16:32, Andreas Dilger andreas.dil...@oracle.com
andreas.dil...@oracle.com wrote
...@oracle.com wrote:
On 2010-10-20, at 11:36, Wojciech Turek wrote:
Your help is most appreciated, Andreas. May I ask one more question?
I would like to perform the recovery procedure on an image of the disk
(I am making it using dd) rather than the physical device. In order to do
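A hedged sketch of what that usually looks like (device and paths are
illustrative): image the disk, attach the image to a loop device, and point
the recovery tools at the loop device.
dd if=/dev/sdc of=/backup/sdc.img bs=1M
losetup /dev/loop0 /backup/sdc.img
e2fsck -f /dev/loop0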
On 21 October 2010 03:32, Andreas Dilger andreas.dil...@oracle.com wrote:
Probably LVM will refuse to create a whole-device PV if there is a
partition table.
Cheers, Andreas
On 2010-10-20, at 18:31, Wojciech Turek wj...@cam.ac.uk wrote:
Hi Andres,
If I am going to recreate LVM on the whole
Maybe I am missing a point here, but can you explain to me why you would need
two NICs in one host on the same subnet?
If you need an additional access route to your host, why not configure eth0
on a different subnet?
On 21 October 2010 15:29, Brock Palen bro...@umich.edu wrote:
Why do you
for Lustre
xattrs, or dump to look at the contents. If none of this shows any results,
you may just have to give it up as lost.
Cheers, Andreas
On 2010-10-21, at 6:26, Wojciech Turek wj...@cam.ac.uk wrote:
I ran e2fsck -fy on the recreated LVM but it segfaulted after running for
some time:
...
Block
regards,
Wojciech
On 21 October 2010 17:45, Bernd Schubert bs_li...@aakef.fastmail.fm wrote:
Hello Wojciech Turek,
On Thursday, October 21, 2010, Wojciech Turek wrote:
Hi Andreas,
I have restarted fsck after the segfault and it ran for several hours and
it segfaulted again.
Pass
Thanks Ken, that worked.
On 21 October 2010 17:39, Ken Hornstein k...@cmf.nrl.navy.mil wrote:
Now I have another problem. After the last segfault I cannot restart the fsck
due to MMP.
[...]
Also when I try to access filesystem via debugfs it fails:
debugfs -c -R 'ls'
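The fix Ken suggested did not survive the truncation; one plausible approach
(an assumption on my part, device name illustrative) is clearing the stale
MMP state by disabling and re-enabling the feature:
tune2fs -O ^mmp /dev/sdc6
tune2fs -O mmp /dev/sdc6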
44 39 a3 58 01 00 00 75 0e c7
RIP [88034a95] :jbd:cleanup_journal_tail+0x9d/0x118
RSP 81016f00da68
<0>Kernel panic - not syncing: Fatal exception
Any idea how to fix this?
Many thanks
Wojciech
On 21 October 2010 17:54, Wojciech Turek wj...@cam.ac.uk wrote:
Thanks Ken, that worked
panic - not syncing: Fatal exception
On 22 October 2010 03:09, Andreas Dilger andreas.dil...@oracle.com wrote:
On 2010-10-21, at 18:44, Wojciech Turek wj...@cam.ac.uk wrote:
fsck has finished and does not find any more errors to correct. However
when I try to mount the device as ldiskfs
file (if you can find it) into a new ldiskfs filesystem
and then run ll_recover_lost_found_objs on that.
On Friday, October 22, 2010, Wojciech Turek wrote:
OK, removing and recreating the journal fixed that problem and I am able to
mount the device as an ldiskfs filesystem. Now I hit another wall
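For the record, the journal dance referred to is presumably the standard ext3
sequence (my reconstruction, device name illustrative):
tune2fs -O ^has_journal /dev/sdc6   # remove the journal
e2fsck -f /dev/sdc6                 # clean up
tune2fs -j /dev/sdc6                # recreate the journal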
a small fake device on a ramdisk and copy files
over, run tunefs --writeconf /mdt and then start everything (including all
OSTs) again.
Cheers,
On Friday, October 22, 2010, Wojciech Turek wrote:
I have tried Bernd's suggestion and it seems to have worked after running
e2fsck -D