Re: [Lustre-discuss] Fwd: Simple servers as storage nodes

2009-03-30 Thread Brian J. Murrell
On Mon, 2009-03-30 at 13:43 +0300, Stas Oskin wrote: What I'm actually looking for is a solution that can take plain Linux boxes and unify their space into a single volume, with optional replication of every file. You will need to look elsewhere then. So my question is, whether Lustre able

Re: [Lustre-discuss] Fwd: Simple servers as storage nodes

2009-03-30 Thread Stas Oskin
Hi. Are you aware of any other solutions that might achieve this? Regards, 2009/3/30 Brian J. Murrell brian.murr...@sun.com On Mon, 2009-03-30 at 13:43 +0300, Stas Oskin wrote: What I'm actually looking for is a solution that can take plain Linux boxes and unify their space into a single

Re: [Lustre-discuss] LustreError: lock callback timer expired after

2009-03-30 Thread Simon Latapie
Simon Latapie wrote: Greetings, I currently have a lustre system with 1 MDS, 2 OSS with 2 OSTs each, and 37 lustre clients (1 login and 36 compute nodes), all using infiniband as lustre network (o2ib). All nodes are on 1.6.5.1 patched kernel. There is network error (no packet loss

Re: [Lustre-discuss] LustreError: lock callback timer expired after

2009-03-30 Thread Oleg Drokin
Hello! On Mar 30, 2009, at 7:06 AM, Simon Latapie wrote: I currently have a lustre system with 1 MDS, 2 OSS with 2 OSTs each, and 37 lustre clients (1 login and 36 compute nodes), all using infiniband as lustre network (o2ib). All nodes are on 1.6.5.1 patched kernel. For the past two

Re: [Lustre-discuss] Fwd: Simple servers as storage nodes

2009-03-30 Thread Brian J. Murrell
Hi. It was pointed out that perhaps I was misunderstanding your question/situation. On Mon, 2009-03-30 at 09:16 -0400, Brian J. Murrell wrote: On Mon, 2009-03-30 at 13:43 +0300, Stas Oskin wrote: What I'm actually looking for is a solution that can take plain Linux boxes and unify their

Re: [Lustre-discuss] Fwd: Simple servers as storage nodes

2009-03-30 Thread Stas Oskin
Hi. So these Linux boxes you have, do they have a task already or are you going to dedicate them to the job of serving their disk space out in a uniform volume? Actually I'd prefer an approach where these boxes do other tasks, but depends on the solution. Do you have any expectations that these

Re: [Lustre-discuss] Fwd: Simple servers as storage nodes

2009-03-30 Thread Brian J. Murrell
On Mon, 2009-03-30 at 19:15 +0300, Stas Oskin wrote: Hi. Hello, Actually I'd prefer an approach where these boxes do other tasks, but depends on the solution. For a Lustre solution, you should dedicate the Linux boxes to sharing their storage out. For stability and performance, these

Re: [Lustre-discuss] Building 1.6.7 from source - errors

2009-03-30 Thread Andrew Perepechko
Jeremy, is the running kernel built from the source (and with the configuration) you specified with --kernel configure option when building lustre? I built 1.6.7 from source against a 2.6.22.14 kernel from kernel.org. The build finished and I installed them, however when I try to make the

Re: [Lustre-discuss] Building 1.6.7 from source - errors

2009-03-30 Thread Jeremy Mann
Andrew Perepechko wrote: Jeremy, is the running kernel built from the source (and with the configuration) you specified with --kernel configure option when building lustre? I found my mistake: I never patched the kernel. It's working perfectly now. -- Jeremy Mann

Re: [Lustre-discuss] finding files belonging to dead/offline ost

2009-03-30 Thread Andreas Dilger
On Mar 23, 2009 13:19 -0500, Hendelman, Rob wrote: We recently had an OST become badly corrupted (850G or so in lost+found). I deactivated this OST on the MDT server and the clients. The OST is not mounted on the OSS (not possible). You can likely recover many of these files by running the

[Lustre-discuss] If I use a lustre kernel do I get the patched raid5?

2009-03-30 Thread Paul Nowoczynski
To spare myself from doing any diffs, can someone tell me if Andreas' raid5 patches are integrated with the lustre kernels? I'm interested in 2.6.16-27 and up. thanks, paul ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org

Re: [Lustre-discuss] Failover recovery issues / questions

2009-03-30 Thread Jeffrey Bennett
Hi, I am not familiar with using heartbeat with the OSS; I have only used it on the MDS for failover, since you can't have an active/active configuration on the MDS. However, you can have active/active on the OSS, so I can't understand why you would want to use heartbeat to unmount the OSTs on one

Re: [Lustre-discuss] Failover recovery issues / questions

2009-03-30 Thread Kevin Van Maren
You can NOT have an OST mounted on both. You can use heartbeat to mount different OSTs on each, and to mount them all on one node when the other node goes down. Kevin On Mar 30, 2009, at 6:16 PM, Jeffrey Bennett j...@sdsc.edu wrote: Hi, I am not familiar with using heartbeat with the OSS,

Re: [Lustre-discuss] Failover recovery issues / questions

2009-03-30 Thread Adam Gandelman
Jab- Hi, I am not familiar with using heartbeat with the OSS; I have only used it on the MDS for failover, since you can't have an active/active configuration on the MDS. However, you can have active/active on the OSS, so I can't understand why you would want to use heartbeat to unmount the

Re: [Lustre-discuss] Failover recovery issues / questions

2009-03-30 Thread Brian J. Murrell
On Mon, 2009-03-30 at 17:52 -0700, Jeffrey Bennett wrote: Thanks Kevin for clearing this up. So when the manual mentions Load-balanced Active/Active configuration, what does that mean? It simply means out of all of the OSTs that both machines can see/use, you mount 50% of them on one

[Lustre-discuss] Fwd: Additional RPMs for older kernels

2009-03-30 Thread Jordan Mendler
Hi all, Are there any additional repositories that provide RPMs for older kernels? In particular I am looking for lustre client modules for 1.6.7 for a 2.6.9-55.0.2 (centos/rhel 4.5) kernel. If not, is there a way to build RPMs for just the kernel modules? It is my impression that 'make RPMs'

[Lustre-discuss] lustre knowledge base

2009-03-30 Thread Mag Gam
Does anyone know if the KB is still being maintained? if so, where is the URL for it? TIA

Re: [Lustre-discuss] lustre knowledge base

2009-03-30 Thread Sheila Barthel
The KB has been migrated to the Lustre manual: http://manual.lustre.org/manual/LustreManual16_HTML/KnowledgeBase.html#50548891_pgfId-1288153 The KB is not actively maintained. Sheila Mag Gam wrote: Does anyone know if the KB is still being maintained? if so, where is the URL for it? TIA

[Lustre-discuss] File Content change without Error log

2009-03-30 Thread Lu Wang
Dear all, More than 100 files were damaged recently without any error logs on the OSS. The damaged files have the same size as their original copies in our backup system. However, the checksum changed. For example, #ll run_0008126_All_file015_SFO-1.raw.353645 -rw-r--r-- 1 chyd u07 2108082156