[Lustre-discuss] Performance Question

2013-04-17 Thread David Noriega
We have a small Lustre setup with two OSTs on each of two OSS servers, and
I'm curious whether moving to one OST per OSS, with four OSS servers, would
increase performance.
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Upgrade from 1.8.4-oracle to 1.8.8-whamcloud question

2013-02-19 Thread David Noriega
We are still running 1.8.4, from back when Lustre was still hosted by Oracle,
and it's been mostly stable except for a few bugs here and there that I see
have been fixed in the latest 1.8.8 release from Whamcloud. I'm wondering: can
I update the server side with Whamcloud's RPMs without updating the client
side right away (which would require a full shutdown)?
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] How smart is Lustre?

2012-12-20 Thread David Noriega
In my experience, if there is a vendor-specific multipathing driver for your
arrays, go with that. In our setup we have Oracle/Sun disk arrays, and with
the standard Linux multipathing daemon I would get lots of strange I/O
errors. It turned out the disk arrays had picked their preferred path, but
Linux was trying to talk to the LUNs on both paths and would only receive a
response on the preferred one.

There is an RDAC driver that can be installed. Simply disable the
multipathing daemon, or configure it to ignore the disk arrays, and use the
vendor solution. I had no more I/O errors (which had only served to slow down
the boot-up process).
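If multipathd has to stay enabled for other devices, one way to make it ignore
the arrays is a blacklist entry in /etc/multipath.conf. This is only a sketch:
the vendor/product strings below are assumptions and should be taken from the
output of multipath -ll on your own hosts.

    blacklist {
        device {
            vendor  "SUN"
            product "LCSM100_F"
        }
    }

After editing, reload the multipath maps (or restart multipathd) so the
blacklist takes effect.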


On Wed, Dec 19, 2012 at 11:36 AM, Jason Brooks brook...@ohsu.edu wrote:

 Hello,

 I am building a 2.3.x filesystem right now, and I am looking at setting up
 some active-active failover abilities for my OSSs.  I have been looking at
 Dell's MD3xxx arrays, as they have redundant controllers and allow up to
 four hosts to connect to each controller.

 I can see how Linux multipath can be used with redundant disk
 controllers.  I can even (slightly) understand how Lustre fails over when
 an OSS goes down.


1. Is Lustre smart enough to use redundant paths, or to fail over OSSs if
an OSS is congested?  (it would be cool, no?)
2. Does the Linux multipath module slow performance?
3. How much does a RAID array such as the one listed above act as a
bottleneck, say if I have as many volumes available on the RAID controllers
as there are OSS hosts?
4. Are there arrays similar to Dell's model that would work?

 Thanks!

 --jason

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Monitoring program io usage

2012-11-12 Thread David Noriega
We run a small cluster with a two-node Lustre setup, so it's easy to
see when some program thrashes the file system. Not being a
programmer, what tools or methods could I use to monitor and log data
to help the developer understand their I/O usage on Lustre?
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Service thread count parameter

2012-10-15 Thread David Noriega
How does one estimate a good number of service threads? I'm not sure I
understand the following: 1 thread / 128MB * number of cpus
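One hedged reading of that formula is one thread per 128 MB of RAM, scaled by
the number of CPUs. Using the 8 GB, 8-core OSS described elsewhere in these
threads purely as an illustration:

    8192 MB / 128 MB = 64
    64 * 8 CPUs      = 512 threads

which is in line with the few hundred started threads reported later in these
threads. If that automatic value comes out too high, a cap can be pinned via
the module options instead, assuming the 1.8-era parameter name:

    # /etc/modprobe.conf on each OSS; takes effect at module load
    options ost oss_num_threads=128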

On Wed, Oct 10, 2012 at 9:17 AM, Jean-Francois Le Fillatre
jean-francois.lefilla...@clumeq.ca wrote:

 Hi David,

 It needs to be specified as a module parameter at boot time, in
 /etc/modprobe.conf. Check the Lustre tuning page:
 http://wiki.lustre.org/manual/LustreManual18_HTML/LustreTuning.html
 http://wiki.lustre.org/manual/LustreManual20_HTML/LustreTuning.html

 Note that once created, the threads won't be destroyed, so if you want to
 lower your thread count you'll need to reboot your system.

 Thanks,
 JF


 On Tue, Oct 9, 2012 at 6:00 PM, David Noriega tsk...@my.utsa.edu wrote:

 Will the parameter ost.OSS.ost_io.threads_max, when set via lctl
 conf_param, persist between reboots/remounts?
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




 --
 Jean-François Le Fillâtre
 Calcul Québec / Université Laval, Québec, Canada
 jean-francois.lefilla...@clumeq.ca




-- 
David Noriega
CSBC/CBI System Administrator
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.114
Phone: 210-458-7100
http://www.cbi.utsa.edu

Please remember to acknowledge the RCMI grant; wording should be as
stated below: This project was supported by a grant from the National
Institute on Minority Health and Health Disparities (G12MD007591) from
the National Institutes of Health. Also, remember to register all
publications with PubMed Central.
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Service thread count parameter

2012-10-09 Thread David Noriega
Will the parameter ost.OSS.ost_io.threads_max, when set via lctl
conf_param, persist between reboots/remounts?
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre missing physical volume

2012-07-02 Thread David Noriega
What an adventure this turned into. It turns out that when I had to relabel
the physical volumes, I got two of them backwards (I realized this when I
checked /proc/fs/lustre/devices), and somehow this was tripping things
up. I swapped them back using pvremove and pvcreate, remounted, and
after a few minutes the clients reconnected and the system is happy
again.
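For reference, a minimal sketch of that kind of label check, assuming
ldiskfs-backed OSTs; the device path is only an example:

    cat /proc/fs/lustre/devices                 # local Lustre devices and their state
    lctl dl                                     # the same list via lctl
    tunefs.lustre --dryrun /dev/mapper/ost0002  # print the target name stored on the backing device (makes no changes)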

On Mon, Jul 2, 2012 at 12:42 AM, David Noriega tsk...@my.utsa.edu wrote:
 Sorry for the rushed email. For some reason the LVM metadata got
 screwed up, managed to restore it, though now running into another
 issue. I've mounted the OSTs yet it seems they are not all
 cooperating. One of the OSTs will stay listed as Resource Unavailable
 and this seems to be the main message on the OSS node:

 LustreError: 137-5: UUID 'lustre-OST0002_UUID' is not available  for
 connect (no target)
 LustreError: Skipped 470 previous similar messages
 LustreError: 5214:0:(ldlm_lib.c:1914:target_send_reply_msg()) @@@
 processing error (-19)  req@8103ffc73400 x1404513746630678/t0
 o8-?@?:0/0 lens 368/0 e 0 to 0 dl 1341207057 ref 1 fl
 Interpret:/0/0 rc -19/0
 LustreError: 5214:0:(ldlm_lib.c:1914:target_send_reply_msg()) Skipped
 470 previous similar messages

 I've tried remounting this ost on the other data node but still won't
 connect from the client side. I've even rebooted the mds and still no
 go. I've run e2fsck to check the OSTs and no issues and the disk
 arrays report no problems on their end and fibre connections are good
 and the multipath driver doesnt report anything(These are Sun disk
 arrays so using the rdac driver instead of the basic multpath daemon).

 On the client side I'll see this:
 Lustre: 3289:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
 x1404591888147958 sent from lustre-OST0002-osc-8104104ad800 to NID
 192.168.5.101@tcp 0s ago has failed due to network error (30s prior to
 deadline).
   req@81015113b400 x1404591888147958/t0
 o8-lustre-OST0002_UUID@192.168.5.101@tcp:28/4 lens 368/584 e 0 to 1
 dl 1341187631 ref 1 fl Rpc:N/0/0 rc 0/0

 Lustre: 3290:0:(import.c:517:import_select_connection())
 lustre-OST0002-osc-8104104ad800: tried all connections, increasing
 latency to 22s
 Lustre: 3290:0:(import.c:517:import_select_connection()) Skipped 39
 previous similar messages


 On Sun, Jul 1, 2012 at 8:10 PM, Mark Day mark@rsp.com.au wrote:
 Does the device show up in /dev ?
 Have you physically checked for Fibre/SAS connectivity, RAID controller
 errors etc?

 You may need to supply more information about your setup. It sounds more
 like a RAID/disk issue than a Lustre issue.

 
 From: David Noriega tsk...@my.utsa.edu
 To: lustre-discuss@lists.lustre.org
 Sent: Monday, 2 July, 2012 8:51:18 AM
 Subject: [Lustre-discuss] Lustre missing physical volume


 Just recently used heartbeat to failover resources so that I could
 power down a lustre node to add more ram and failed back to do the
 same to our second lustre node. Only then do I find that now our
 lustre install is missing a physical volume out of lvm. pvscan only
 shows three out of four partitions.

 Any hints? I've tried some recovery steps in lvm with pvcreate using
 the archived config for the missing pv but no luck, says no device
 with such uuid. I'm lost on what to do now. This is lustre 1.8.4
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




 --
 David Noriega
 CSBC/CBI System Administrator
 University of Texas at San Antonio
 One UTSA Circle
 San Antonio, TX 78249
 Office: BSE 3.112
 Phone: 210-458-7100
 http://www.cbi.utsa.edu



-- 
David Noriega
CSBC/CBI System Administrator
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre missing physical volume

2012-07-01 Thread David Noriega
I recently used Heartbeat to fail over resources so that I could
power down a Lustre node to add more RAM, then failed back to do the
same to our second Lustre node. Only then did I find that our
Lustre install is missing a physical volume from LVM: pvscan only
shows three out of four partitions.

Any hints? I've tried some LVM recovery steps with pvcreate, using
the archived config for the missing PV, but no luck; it says there is no
device with that UUID. I'm lost on what to do now. This is Lustre 1.8.4.
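For anyone hitting the same thing, a minimal sketch of the archive-based
recovery path; the VG name, archive file name, and device path below are all
hypothetical placeholders:

    # recreate the missing PV with its old UUID, taken from the archived metadata
    pvcreate --uuid <UUID-from-archive> \
             --restorefile /etc/lvm/archive/<vgname>_NNNNN.vg /dev/mapper/ost2
    # restore the VG metadata and reactivate
    vgcfgrestore -f /etc/lvm/archive/<vgname>_NNNNN.vg <vgname>
    vgchange -ay <vgname>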
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre missing physical volume

2012-07-01 Thread David Noriega
Sorry for the rushed email. For some reason the LVM metadata got
corrupted; I managed to restore it, though now I'm running into another
issue. I've mounted the OSTs, yet it seems they are not all
cooperating. One of the OSTs stays listed as Resource Unavailable,
and this seems to be the main message on the OSS node:

LustreError: 137-5: UUID 'lustre-OST0002_UUID' is not available  for
connect (no target)
LustreError: Skipped 470 previous similar messages
LustreError: 5214:0:(ldlm_lib.c:1914:target_send_reply_msg()) @@@
processing error (-19)  req@8103ffc73400 x1404513746630678/t0
o8-?@?:0/0 lens 368/0 e 0 to 0 dl 1341207057 ref 1 fl
Interpret:/0/0 rc -19/0
LustreError: 5214:0:(ldlm_lib.c:1914:target_send_reply_msg()) Skipped
470 previous similar messages

I've tried remounting this OST on the other data node, but it still won't
connect from the client side. I've even rebooted the MDS and still no
go. I've run e2fsck to check the OSTs with no issues, the disk
arrays report no problems on their end, the fibre connections are good,
and the multipath driver doesn't report anything (these are Sun disk
arrays, so we use the RDAC driver instead of the basic multipath daemon).

On the client side I'll see this:
Lustre: 3289:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1404591888147958 sent from lustre-OST0002-osc-8104104ad800 to NID
192.168.5.101@tcp 0s ago has failed due to network error (30s prior to
deadline).
  req@81015113b400 x1404591888147958/t0
o8-lustre-OST0002_UUID@192.168.5.101@tcp:28/4 lens 368/584 e 0 to 1
dl 1341187631 ref 1 fl Rpc:N/0/0 rc 0/0

Lustre: 3290:0:(import.c:517:import_select_connection())
lustre-OST0002-osc-8104104ad800: tried all connections, increasing
latency to 22s
Lustre: 3290:0:(import.c:517:import_select_connection()) Skipped 39
previous similar messages


On Sun, Jul 1, 2012 at 8:10 PM, Mark Day mark@rsp.com.au wrote:
 Does the device show up in /dev ?
 Have you physically checked for Fibre/SAS connectivity, RAID controller
 errors etc?

 You may need to supply more information about your setup. It sounds more
 like a RAID/disk issue than a Lustre issue.

 
 From: David Noriega tsk...@my.utsa.edu
 To: lustre-discuss@lists.lustre.org
 Sent: Monday, 2 July, 2012 8:51:18 AM
 Subject: [Lustre-discuss] Lustre missing physical volume


 Just recently used heartbeat to failover resources so that I could
 power down a lustre node to add more ram and failed back to do the
 same to our second lustre node. Only then do I find that now our
 lustre install is missing a physical volume out of lvm. pvscan only
 shows three out of four partitions.

 Any hints? I've tried some recovery steps in lvm with pvcreate using
 the archived config for the missing pv but no luck, says no device
 with such uuid. I'm lost on what to do now. This is lustre 1.8.4
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
David Noriega
CSBC/CBI System Administrator
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Client kernel panic

2012-03-05 Thread David Noriega
I've seen this happen every once in a while on nodes in our cluster.
Since they crash hard, I'm unable to get much in the way of logs, and this
is all I can see via the remote console from their ILOM:

Code: 8b 17 85 d2 74 73 8b 47 28 85 c0 74 f6 05 d1 58 d1 ff 01
RIP [88781ce1] :lustre:ll_intent_drop_lock+0x11/0xb0
RSP 810c3608d388
0Kernel panic -not syncing : Fatal exception

I don't see anything on the OSS or metadata nodes except for the
"I think it's dead, I'm evicting it" message.



-- 
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre Read Tuning

2012-02-09 Thread David Noriega
On our system we typically have more reading than writing going on,
and I was wondering which parameters are best to tune.

I have set lnet.debug to 0, and have increased max RPCs in flight as
well as dirty MB. I left lru_size dynamic, as setting it didn't seem to
have any effect.
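For context, the tunables referred to above, written as a minimal 1.8
client-side sketch; the numbers are examples, not recommendations:

    lctl set_param osc.*.max_rpcs_in_flight=32    # more concurrent RPCs per OST
    lctl set_param osc.*.max_dirty_mb=64          # per-OST dirty cache on the client
    lctl set_param llite.*.max_read_ahead_mb=40   # client readahead window
    lctl get_param llite.*.read_ahead_stats       # check whether readahead is actually hitting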

-- 
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Thread might be hung, Heavy IO Load messages

2012-02-02 Thread David Noriega
We have two OSSs, each with two quad-core AMD Opterons and 8 GB of RAM
and two OSTs each (4.4 TB and 3.5 TB). Backend storage is a pair of Sun
StorageTek 2540s connected with 8 Gb fiber.

What about tweaking max_dirty_mb on the client side?

On Wed, Feb 1, 2012 at 6:33 PM, Carlos Thomaz ctho...@ddn.com wrote:
 David,

 The oss service threads is a function of your RAM size and CPUs. It's
 difficult to say what would be a good upper limit without knowing the size
 of your OSS, # clients, storage back-end and workload. But the good thing
 you can give a try on the fly via lctl set_param command.

 Assuming you are running lustre 1.8, here is a good explanation on how to
 do it:
 http://wiki.lustre.org/manual/LustreManual18_HTML/LustreProc.html#50651263_
 87260

 Some remarks:
 - reducing the number of OSS threads may impact the performance depending
 on how is your workload.
 - unfortunately I guess you will need to try and see what happens. I would
 go for 128 and analyze the behavior of your OSSs (via log files) and also
 keeping an eye on your workload. Seems to me that 300 is a bit too high
 (but again, I don't know what you have on your storage back-end or OSS
 configuration).


 I can't tell you much about the lru_size, but as far as I understand the
 values are dynamic and there's not much to do rather than clear the last
 recently used queue or disable the lru sizing. I can't help much on this
 other than pointing you out the explanation for it (see 31.2.11):

 http://wiki.lustre.org/manual/LustreManual20_HTML/LustreProc.html


 Regards,
 Carlos




 --
 Carlos Thomaz | HPC Systems Architect
 Mobile: +1 (303) 519-0578
 ctho...@ddn.com | Skype ID: carlosthomaz
 DataDirect Networks, Inc.
 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
 ddn.com http://www.ddn.com/ | Twitter: @ddn_limitless
 http://twitter.com/ddn_limitless | 1.800.TERABYTE





 On 2/1/12 2:11 PM, David Noriega tsk...@my.utsa.edu wrote:

zone_reclaim_mode is 0 on all clients/servers

When changing number of service threads or the lru_size, can these be
done on the fly or do they require a reboot of either client or
server?
For my two OSTs, cat /proc/fs/lustre/ost/OSS/ost_io/threads_started
give about 300(300, 359) so I'm thinking try half of that and see how
it goes?

Also checking lru_size, I get different numbers from the clients. cat
/proc/fs/lustre/ldlm/namespaces/*/lru_size

Client: MDT0 OST0 OST1 OST2 OST3 MGC
head node: 0 22 22 22 22 400 (only a few users logged in)
busy node: 1 501 504 503 505 400 (Fully loaded with jobs)
samba/nfs server: 4 440070 44370 44348 26282 1600

So my understanding is the lru_size is set to auto by default thus the
varying values, but setting it manually is effectively setting a max
value? Also what does it mean to have a lower value(especially in the
case of the samba/nfs server)?

On Wed, Feb 1, 2012 at 1:27 PM, Charles Taylor tay...@hpc.ufl.edu wrote:

 You may also want to check and, if necessary, limit the lru_size on
your clients.   I believe there are guidelines in the ops manual.
We have ~750 clients and limit ours to 600 per OST.   That, combined
with the setting zone_reclaim_mode=0 should make a big difference.

 Regards,

 Charlie Taylor
 UF HPC Center


 On Feb 1, 2012, at 2:04 PM, Carlos Thomaz wrote:

 Hi David,

 You may be facing the same issue discussed on previous threads, which
is
 the issue regarding the zone_reclaim_mode.

 Take a look on the previous thread where myself and Kevin replied to
 Vijesh Ek.

 If you don't have access to the previous emails, look at your kernel
 settings for the zone reclaim:

 cat /proc/sys/vm/zone_reclaim_mode

 It should be set to 0.

 Also, look at the number of Lustre OSS service threads. It may be set
to
 high...

 Rgds.
 Carlos.


 --
 Carlos Thomaz | HPC Systems Architect
 Mobile: +1 (303) 519-0578
 ctho...@ddn.com | Skype ID: carlosthomaz
 DataDirect Networks, Inc.
 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
 ddn.com http://www.ddn.com/ | Twitter: @ddn_limitless
 http://twitter.com/ddn_limitless | 1.800.TERABYTE





 On 2/1/12 11:57 AM, David Noriega tsk...@my.utsa.edu wrote:

 indicates the system was overloaded (too many service threads, or


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 Charles A. Taylor, Ph.D.
 Associate Director,
 UF HPC Center
 (352) 392-4036






--
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One

Re: [Lustre-discuss] Thread might be hung, Heavy IO Load messages

2012-02-02 Thread David Noriega
On a side note, what about increasing the MDS service threads?
Checking that, it's running at its max of 128.
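A minimal sketch of how that ceiling is usually raised on 1.8, assuming the
mds module parameter name; the value is an example only and takes effect at
module load, i.e. after the MDS is restarted:

    # /etc/modprobe.conf on the MDS
    options mds mds_num_threads=256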

On Thu, Feb 2, 2012 at 9:54 AM, David Noriega tsk...@my.utsa.edu wrote:
 We have two OSSs, each with two quad core AMD Opterons and 8GB of ram
 and two OSTs each(4.4T and 3.5T). Backend storage is a pair of Sun
 StorageTek 2540 connected with 8Gb fiber.

 What about tweaking max_dirty_mb on the client side?

 On Wed, Feb 1, 2012 at 6:33 PM, Carlos Thomaz ctho...@ddn.com wrote:
 David,

 The oss service threads is a function of your RAM size and CPUs. It's
 difficult to say what would be a good upper limit without knowing the size
 of your OSS, # clients, storage back-end and workload. But the good thing
 you can give a try on the fly via lctl set_param command.

 Assuming you are running lustre 1.8, here is a good explanation on how to
 do it:
 http://wiki.lustre.org/manual/LustreManual18_HTML/LustreProc.html#50651263_
 87260

 Some remarks:
 - reducing the number of OSS threads may impact the performance depending
 on how is your workload.
 - unfortunately I guess you will need to try and see what happens. I would
 go for 128 and analyze the behavior of your OSSs (via log files) and also
 keeping an eye on your workload. Seems to me that 300 is a bit too high
 (but again, I don't know what you have on your storage back-end or OSS
 configuration).


 I can't tell you much about the lru_size, but as far as I understand the
 values are dynamic and there's not much to do rather than clear the last
 recently used queue or disable the lru sizing. I can't help much on this
 other than pointing you out the explanation for it (see 31.2.11):

 http://wiki.lustre.org/manual/LustreManual20_HTML/LustreProc.html


 Regards,
 Carlos




 --
 Carlos Thomaz | HPC Systems Architect
 Mobile: +1 (303) 519-0578
 ctho...@ddn.com | Skype ID: carlosthomaz
 DataDirect Networks, Inc.
 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
 ddn.com http://www.ddn.com/ | Twitter: @ddn_limitless
 http://twitter.com/ddn_limitless | 1.800.TERABYTE





 On 2/1/12 2:11 PM, David Noriega tsk...@my.utsa.edu wrote:

zone_reclaim_mode is 0 on all clients/servers

When changing number of service threads or the lru_size, can these be
done on the fly or do they require a reboot of either client or
server?
For my two OSTs, cat /proc/fs/lustre/ost/OSS/ost_io/threads_started
give about 300(300, 359) so I'm thinking try half of that and see how
it goes?

Also checking lru_size, I get different numbers from the clients. cat
/proc/fs/lustre/ldlm/namespaces/*/lru_size

Client: MDT0 OST0 OST1 OST2 OST3 MGC
head node: 0 22 22 22 22 400 (only a few users logged in)
busy node: 1 501 504 503 505 400 (Fully loaded with jobs)
samba/nfs server: 4 440070 44370 44348 26282 1600

So my understanding is the lru_size is set to auto by default thus the
varying values, but setting it manually is effectively setting a max
value? Also what does it mean to have a lower value(especially in the
case of the samba/nfs server)?

On Wed, Feb 1, 2012 at 1:27 PM, Charles Taylor tay...@hpc.ufl.edu wrote:

 You may also want to check and, if necessary, limit the lru_size on
your clients.   I believe there are guidelines in the ops manual.
We have ~750 clients and limit ours to 600 per OST.   That, combined
with the setting zone_reclaim_mode=0 should make a big difference.

 Regards,

 Charlie Taylor
 UF HPC Center


 On Feb 1, 2012, at 2:04 PM, Carlos Thomaz wrote:

 Hi David,

 You may be facing the same issue discussed on previous threads, which
is
 the issue regarding the zone_reclaim_mode.

 Take a look on the previous thread where myself and Kevin replied to
 Vijesh Ek.

 If you don't have access to the previous emails, look at your kernel
 settings for the zone reclaim:

 cat /proc/sys/vm/zone_reclaim_mode

 It should be set to 0.

 Also, look at the number of Lustre OSS service threads. It may be set
to
 high...

 Rgds.
 Carlos.


 --
 Carlos Thomaz | HPC Systems Architect
 Mobile: +1 (303) 519-0578
 ctho...@ddn.com | Skype ID: carlosthomaz
 DataDirect Networks, Inc.
 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
 ddn.com http://www.ddn.com/ | Twitter: @ddn_limitless
 http://twitter.com/ddn_limitless | 1.800.TERABYTE





 On 2/1/12 11:57 AM, David Noriega tsk...@my.utsa.edu wrote:

 indicates the system was overloaded (too many service threads, or


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 Charles A. Taylor, Ph.D.
 Associate Director,
 UF HPC Center
 (352) 392-4036






--
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http

Re: [Lustre-discuss] Thread might be hung, Heavy IO Load messages

2012-02-02 Thread David Noriega
I found the thread "Lustre clients getting evicted", as I've also seen
the "ost_connect operation failed with -16" message, and there they
recommend increasing the timeout, though that was for 1.6, and as I've
read, 1.8 has a different timeout system. Given that, would
increasing at_min (currently 0) or at_max (currently 600) be best?
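For reference, a minimal sketch for inspecting and nudging the
adaptive-timeout knobs on a 1.8 node; the value written below is an example,
not a recommendation:

    cat /proc/sys/lustre/at_min /proc/sys/lustre/at_max
    echo 40 > /proc/sys/lustre/at_min    # raise the floor of the adaptive timeout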

On Thu, Feb 2, 2012 at 12:07 PM, Andreas Dilger adil...@whamcloud.com wrote:
 On 2012-02-02, at 8:54 AM, David Noriega wrote:
 We have two OSSs, each with two quad core AMD Opterons and 8GB of ram
 and two OSTs each(4.4T and 3.5T). Backend storage is a pair of Sun
 StorageTek 2540 connected with 8Gb fiber.

 Running 32-64 threads per OST is the optimum number, based on previous
 experience.

 What about tweaking max_dirty_mb on the client side?

 Probably unrelated.

 On Wed, Feb 1, 2012 at 6:33 PM, Carlos Thomaz ctho...@ddn.com wrote:
 David,

 The oss service threads is a function of your RAM size and CPUs. It's
 difficult to say what would be a good upper limit without knowing the size
 of your OSS, # clients, storage back-end and workload. But the good thing
 you can give a try on the fly via lctl set_param command.

 Assuming you are running lustre 1.8, here is a good explanation on how to
 do it:
 http://wiki.lustre.org/manual/LustreManual18_HTML/LustreProc.html#50651263_
 87260

 Some remarks:
 - reducing the number of OSS threads may impact the performance depending
 on how is your workload.
 - unfortunately I guess you will need to try and see what happens. I would
 go for 128 and analyze the behavior of your OSSs (via log files) and also
 keeping an eye on your workload. Seems to me that 300 is a bit too high
 (but again, I don't know what you have on your storage back-end or OSS
 configuration).


 I can't tell you much about the lru_size, but as far as I understand the
 values are dynamic and there's not much to do rather than clear the last
 recently used queue or disable the lru sizing. I can't help much on this
 other than pointing you out the explanation for it (see 31.2.11):

 http://wiki.lustre.org/manual/LustreManual20_HTML/LustreProc.html


 Regards,
 Carlos




 --
 Carlos Thomaz | HPC Systems Architect
 Mobile: +1 (303) 519-0578
 ctho...@ddn.com | Skype ID: carlosthomaz
 DataDirect Networks, Inc.
 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
 ddn.com http://www.ddn.com/ | Twitter: @ddn_limitless
 http://twitter.com/ddn_limitless | 1.800.TERABYTE





 On 2/1/12 2:11 PM, David Noriega tsk...@my.utsa.edu wrote:

 zone_reclaim_mode is 0 on all clients/servers

 When changing number of service threads or the lru_size, can these be
 done on the fly or do they require a reboot of either client or
 server?
 For my two OSTs, cat /proc/fs/lustre/ost/OSS/ost_io/threads_started
 give about 300(300, 359) so I'm thinking try half of that and see how
 it goes?

 Also checking lru_size, I get different numbers from the clients. cat
 /proc/fs/lustre/ldlm/namespaces/*/lru_size

 Client: MDT0 OST0 OST1 OST2 OST3 MGC
 head node: 0 22 22 22 22 400 (only a few users logged in)
 busy node: 1 501 504 503 505 400 (Fully loaded with jobs)
 samba/nfs server: 4 440070 44370 44348 26282 1600

 So my understanding is the lru_size is set to auto by default thus the
 varying values, but setting it manually is effectively setting a max
 value? Also what does it mean to have a lower value(especially in the
 case of the samba/nfs server)?

 On Wed, Feb 1, 2012 at 1:27 PM, Charles Taylor tay...@hpc.ufl.edu wrote:

 You may also want to check and, if necessary, limit the lru_size on
 your clients.   I believe there are guidelines in the ops manual.
 We have ~750 clients and limit ours to 600 per OST.   That, combined
 with the setting zone_reclaim_mode=0 should make a big difference.

 Regards,

 Charlie Taylor
 UF HPC Center


 On Feb 1, 2012, at 2:04 PM, Carlos Thomaz wrote:

 Hi David,

 You may be facing the same issue discussed on previous threads, which
 is
 the issue regarding the zone_reclaim_mode.

 Take a look on the previous thread where myself and Kevin replied to
 Vijesh Ek.

 If you don't have access to the previous emails, look at your kernel
 settings for the zone reclaim:

 cat /proc/sys/vm/zone_reclaim_mode

 It should be set to 0.

 Also, look at the number of Lustre OSS service threads. It may be set
 to
 high...

 Rgds.
 Carlos.


 --
 Carlos Thomaz | HPC Systems Architect
 Mobile: +1 (303) 519-0578
 ctho...@ddn.com | Skype ID: carlosthomaz
 DataDirect Networks, Inc.
 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
 ddn.com http://www.ddn.com/ | Twitter: @ddn_limitless
 http://twitter.com/ddn_limitless | 1.800.TERABYTE





 On 2/1/12 11:57 AM, David Noriega tsk...@my.utsa.edu wrote:

 indicates the system was overloaded (too many service threads, or


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 Charles A. Taylor, Ph.D.
 Associate

Re: [Lustre-discuss] Thread might be hung, Heavy IO Load messages

2012-02-01 Thread David Noriega
zone_reclaim_mode is 0 on all clients/servers

When changing the number of service threads or the lru_size, can these be
done on the fly, or do they require a reboot of either client or
server?
For my two OSTs, cat /proc/fs/lustre/ost/OSS/ost_io/threads_started
gives about 300 (300, 359), so I'm thinking of trying half of that and
seeing how it goes.

Also checking lru_size, I get different numbers from the clients. cat
/proc/fs/lustre/ldlm/namespaces/*/lru_size

Client: MDT0 OST0 OST1 OST2 OST3 MGC
head node: 0 22 22 22 22 400 (only a few users logged in)
busy node: 1 501 504 503 505 400 (Fully loaded with jobs)
samba/nfs server: 4 440070 44370 44348 26282 1600

So my understanding is that lru_size is set to auto by default, hence the
varying values, but setting it manually effectively sets a max
value? Also, what does it mean to have a lower value (especially in the
case of the samba/nfs server)?
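For what it's worth, both can be changed on the fly with lctl set_param
(these do not persist across a remount); 128 and 600 are just the example
values floated in these threads:

    lctl set_param ost.OSS.ost_io.threads_max=128        # on each OSS
    lctl set_param ldlm.namespaces.*osc*.lru_size=600    # on each client, per-OST lock LRU cap

Note that already-started OSS threads are not destroyed when the maximum is
lowered, so the started count only drops after a restart.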

On Wed, Feb 1, 2012 at 1:27 PM, Charles Taylor tay...@hpc.ufl.edu wrote:

 You may also want to check and, if necessary, limit the lru_size on your 
 clients.   I believe there are guidelines in the ops manual.      We have 
 ~750 clients and limit ours to 600 per OST.   That, combined with the setting 
 zone_reclaim_mode=0 should make a big difference.

 Regards,

 Charlie Taylor
 UF HPC Center


 On Feb 1, 2012, at 2:04 PM, Carlos Thomaz wrote:

 Hi David,

 You may be facing the same issue discussed on previous threads, which is
 the issue regarding the zone_reclaim_mode.

 Take a look on the previous thread where myself and Kevin replied to
 Vijesh Ek.

 If you don't have access to the previous emails, look at your kernel
 settings for the zone reclaim:

 cat /proc/sys/vm/zone_reclaim_mode

 It should be set to 0.

 Also, look at the number of Lustre OSS service threads. It may be set to
 high...

 Rgds.
 Carlos.


 --
 Carlos Thomaz | HPC Systems Architect
 Mobile: +1 (303) 519-0578
 ctho...@ddn.com | Skype ID: carlosthomaz
 DataDirect Networks, Inc.
 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
 ddn.com http://www.ddn.com/ | Twitter: @ddn_limitless
 http://twitter.com/ddn_limitless | 1.800.TERABYTE





 On 2/1/12 11:57 AM, David Noriega tsk...@my.utsa.edu wrote:

 indicates the system was overloaded (too many service threads, or


 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 Charles A. Taylor, Ph.D.
 Associate Director,
 UF HPC Center
 (352) 392-4036






-- 
David Noriega
System Administrator
Computational Biology Initiative
High Performance Computing Center
University of Texas at San Antonio
One UTSA Circle
San Antonio, TX 78249
Office: BSE 3.112
Phone: 210-458-7100
http://www.cbi.utsa.edu
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre error with nfs?

2011-10-27 Thread David Noriega
I get these errors; any ideas? Running Lustre 1.8.4. This client is
also the server where we NFS-export the filesystem.

LustreError: 4994:0:(dir.c:384:ll_readdir_18()) error reading dir
575283686/935610515 page 0: rc -110
LustreError: 11-0: an error occurred while communicating with
192.168.5.104@tcp. The mds_readpage operation failed with -107
LustreError: 28410:0:(dir.c:384:ll_readdir_18()) error reading dir
579577179/4015460576 page 0: rc -110
LustreError: Skipped 12 previous similar messages
Lustre: lustre-MDT-mdc-810338e81400: Connection to service
lustre-MDT via nid 192.168.5.104@tcp was lost; in progress
operations using this service will wait for recovery to complete.
LustreError: 167-0: This client was evicted by lustre-MDT; in
progress operations using this service will fail.
LustreError: 25118:0:(client.c:858:ptlrpc_import_delay_req()) @@@
IMP_INVALID  req@8101f87d8c00 x1383759180968916/t0
o35-lustre-MDT_UUID@192.168.5.104@tcp:23/10 lens 408/1128 e 0 to
1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 25118:0:(file.c:116:ll_close_inode_openhandle()) inode
17928860 mdc close failed: rc = -108
LustreError: 25118:0:(mdc_locks.c:646:mdc_enqueue()) ldlm_cli_enqueue: -108
LustreError: 9199:0:(file.c:116:ll_close_inode_openhandle()) inode
579577179 mdc close failed: rc = -108
LustreError: 9199:0:(file.c:116:ll_close_inode_openhandle()) Skipped 1
previous similar message
Lustre: lustre-MDT-mdc-810338e81400: Connection restored to
service lustre-MDT using nid 192.168.5.104@tcp.
nfsd: non-standard errno: -43
nfsd: non-standard errno: -43
LustreError: 4994:0:(dir.c:384:ll_readdir_18()) error reading dir
575283686/935610515 page 0: rc -110
LustreError: 4994:0:(dir.c:384:ll_readdir_18()) Skipped 29 previous
similar messages
LustreError: 11-0: an error occurred while communicating with
192.168.5.104@tcp. The mds_readpage operation failed with -107
Lustre: lustre-MDT-mdc-810338e81400: Connection to service
lustre-MDT via nid 192.168.5.104@tcp was lost; in progress
operations using this service will wait for recovery to complete.
LustreError: 167-0: This client was evicted by lustre-MDT; in
progress operations using this service will fail.
LustreError: 4994:0:(client.c:858:ptlrpc_import_delay_req()) @@@
IMP_INVALID  req@8102a576c000 x1383759180969003/t0
o37-lustre-MDT_UUID@192.168.5.104@tcp:23/10 lens 408/600 e 0 to 1
dl 0 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 4994:0:(client.c:858:ptlrpc_import_delay_req()) Skipped
34 previous similar messages
nfsd: non-standard errno: -108
nfsd: non-standard errno: -4
nfsd: non-standard errno: -4
nfsd: non-standard errno: -108
LustreError: 25118:0:(file.c:116:ll_close_inode_openhandle()) inode
17928860 mdc close failed: rc = -4
LustreError: 25118:0:(file.c:116:ll_close_inode_openhandle()) Skipped
1 previous similar message
LustreError: 25118:0:(mdc_locks.c:646:mdc_enqueue()) ldlm_cli_enqueue: -108
LustreError: 25118:0:(mdc_locks.c:646:mdc_enqueue()) Skipped 4
previous similar messages
LustreError: 28407:0:(file.c:3280:ll_inode_revalidate_fini()) failure
-108 inode 558497795
LustreError: 28407:0:(file.c:3280:ll_inode_revalidate_fini()) Skipped
3 previous similar messages
nfsd: non-standard errno: -108
Lustre: lustre-MDT-mdc-810338e81400: Connection restored to
service lustre-MDT using nid 192.168.5.104@tcp.
LustreError: 11-0: an error occurred while communicating with
192.168.5.104@tcp. The mds_close operation failed with -116
LustreError: Skipped 1 previous similar message
LustreError: 28407:0:(file.c:116:ll_close_inode_openhandle()) inode
558497794 mdc close failed: rc = -116
LustreError: 28407:0:(file.c:116:ll_close_inode_openhandle()) Skipped
4 previous similar messages
LustreError: 11-0: an error occurred while communicating with
192.168.5.104@tcp. The mds_close operation failed with -116


-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre error with nfs?

2011-10-27 Thread David Noriega
Overloaded on the client or the MDS? All the Lustre nodes use NIC bonding,
so I suppose, since we have a lot of I/O traffic on this client, we should
bump up the number of NICs in use?

On Thu, Oct 27, 2011 at 3:28 PM, Colin Faber colin_fa...@xyratex.com wrote:
 Hi,
 Just quickly looking at the log you've posted, it looks like you're
 timing out with overloaded network.

 -cf


 On 10/27/2011 10:08 AM, David Noriega wrote:
 I get these errors, any ideas? Running Lustre 1.8.4. This client is
 also the server where we nfs export the filesystem.

 LustreError: 4994:0:(dir.c:384:ll_readdir_18()) error reading dir
 575283686/935610515 page 0: rc -110
 LustreError: 11-0: an error occurred while communicating with
 192.168.5.104@tcp. The mds_readpage operation failed with -107
 LustreError: 28410:0:(dir.c:384:ll_readdir_18()) error reading dir
 579577179/4015460576 page 0: rc -110
 LustreError: Skipped 12 previous similar messages
 Lustre: lustre-MDT-mdc-810338e81400: Connection to service
 lustre-MDT via nid 192.168.5.104@tcp was lost; in progress
 operations using this service will wait for recovery to complete.
 LustreError: 167-0: This client was evicted by lustre-MDT; in
 progress operations using this service will fail.
 LustreError: 25118:0:(client.c:858:ptlrpc_import_delay_req()) @@@
 IMP_INVALID  req@8101f87d8c00 x1383759180968916/t0
 o35-lustre-MDT_UUID@192.168.5.104@tcp:23/10 lens 408/1128 e 0 to
 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
 LustreError: 25118:0:(file.c:116:ll_close_inode_openhandle()) inode
 17928860 mdc close failed: rc = -108
 LustreError: 25118:0:(mdc_locks.c:646:mdc_enqueue()) ldlm_cli_enqueue: -108
 LustreError: 9199:0:(file.c:116:ll_close_inode_openhandle()) inode
 579577179 mdc close failed: rc = -108
 LustreError: 9199:0:(file.c:116:ll_close_inode_openhandle()) Skipped 1
 previous similar message
 Lustre: lustre-MDT-mdc-810338e81400: Connection restored to
 service lustre-MDT using nid 192.168.5.104@tcp.
 nfsd: non-standard errno: -43
 nfsd: non-standard errno: -43
 LustreError: 4994:0:(dir.c:384:ll_readdir_18()) error reading dir
 575283686/935610515 page 0: rc -110
 LustreError: 4994:0:(dir.c:384:ll_readdir_18()) Skipped 29 previous
 similar messages
 LustreError: 11-0: an error occurred while communicating with
 192.168.5.104@tcp. The mds_readpage operation failed with -107
 Lustre: lustre-MDT-mdc-810338e81400: Connection to service
 lustre-MDT via nid 192.168.5.104@tcp was lost; in progress
 operations using this service will wait for recovery to complete.
 LustreError: 167-0: This client was evicted by lustre-MDT; in
 progress operations using this service will fail.
 LustreError: 4994:0:(client.c:858:ptlrpc_import_delay_req()) @@@
 IMP_INVALID  req@8102a576c000 x1383759180969003/t0
 o37-lustre-MDT_UUID@192.168.5.104@tcp:23/10 lens 408/600 e 0 to 1
 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
 LustreError: 4994:0:(client.c:858:ptlrpc_import_delay_req()) Skipped
 34 previous similar messages
 nfsd: non-standard errno: -108
 nfsd: non-standard errno: -4
 nfsd: non-standard errno: -4
 nfsd: non-standard errno: -108
 LustreError: 25118:0:(file.c:116:ll_close_inode_openhandle()) inode
 17928860 mdc close failed: rc = -4
 LustreError: 25118:0:(file.c:116:ll_close_inode_openhandle()) Skipped
 1 previous similar message
 LustreError: 25118:0:(mdc_locks.c:646:mdc_enqueue()) ldlm_cli_enqueue: -108
 LustreError: 25118:0:(mdc_locks.c:646:mdc_enqueue()) Skipped 4
 previous similar messages
 LustreError: 28407:0:(file.c:3280:ll_inode_revalidate_fini()) failure
 -108 inode 558497795
 LustreError: 28407:0:(file.c:3280:ll_inode_revalidate_fini()) Skipped
 3 previous similar messages
 nfsd: non-standard errno: -108
 Lustre: lustre-MDT-mdc-810338e81400: Connection restored to
 service lustre-MDT using nid 192.168.5.104@tcp.
 LustreError: 11-0: an error occurred while communicating with
 192.168.5.104@tcp. The mds_close operation failed with -116
 LustreError: Skipped 1 previous similar message
 LustreError: 28407:0:(file.c:116:ll_close_inode_openhandle()) inode
 558497794 mdc close failed: rc = -116
 LustreError: 28407:0:(file.c:116:ll_close_inode_openhandle()) Skipped
 4 previous similar messages
 LustreError: 11-0: an error occurred while communicating with
 192.168.5.104@tcp. The mds_close operation failed with -116


[Lustre-discuss] Upgrade from 1.8.6 to 2.1?

2011-10-21 Thread David Noriega
How easy would it be to upgrade from 1.8.6 to 2.1? Would simply
dropping in the new packages be enough? Would it require downtime of
the whole system? Also, could I move the servers to 2.1 while
still keeping the clients at 1.8.6?

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Client unable to connect after reboot: Unable to process log 108

2011-08-31 Thread David Noriega
I think I'll add the lctl ping to a startup script as a workaround,
but any ideas why this is happening?
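For anyone else working around this, a minimal sketch of what that startup
snippet could look like (e.g. appended to /etc/rc.local); the mount point
/lustre is an assumption, while the NIDs and the fsname come from the logs
below:

    lctl ping 192.168.5.104@tcp     # wake up the LNET route to the primary MGS
    lctl ping 192.168.5.105@tcp     # and to the failover MGS
    mount -t lustre 192.168.5.104@tcp:192.168.5.105@tcp:/lustre /lustre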

On Mon, Aug 29, 2011 at 10:26 AM, David Noriega tsk...@my.utsa.edu wrote:
 I've begun to notice this behavor in my clients. Not sure whats going
 on, but when a client reboots, its unable to mount lustre. I have to
 use 'lctrl ping' to ping any of the lustre nodes before I'm able to
 mount the lustre filesystem. Any ideas?

 Lustre: OBD class driver, http://www.lustre.org/
 Lustre:     Lustre Version: 1.8.4
 Lustre:     Build Version:
 1.8.4-20100726215630-PRISTINE-2.6.18-194.3.1.el5_lustre.1.8.4
 Lustre: Added LNI 192.168.1.2@tcp [8/256/0/180]
 Lustre: Accept secure, port 988
 Lustre: Lustre Client File System; http://www.lustre.org/
 Lustre: 3977:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
 x1378464080855041 sent from MGC192.168.5.104@tcp to NID
 192.168.5.104@tcp 5s ago has timed out (5s prior to deadline).
  req@81032d28dc00 x1378464080855041/t0
 o250-MGS@MGC192.168.5.104@tcp_0:26/25 lens 368/584 e 0 to 1 dl
 1314605796 ref 1 fl Rpc:N/0/0 rc 0/0
 Lustre: 3977:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
 x1378464080855043 sent from MGC192.168.5.104@tcp to NID
 192.168.5.105@tcp 5s ago has timed out (5s prior to deadline).
  req@81033f410c00 x1378464080855043/t0
 o250-MGS@MGC192.168.5.104@tcp_1:26/25 lens 368/584 e 0 to 1 dl
 1314605821 ref 1 fl Rpc:N/0/0 rc 0/0
 LustreError: 3839:0:(client.c:858:ptlrpc_import_delay_req()) @@@
 IMP_INVALID  req@81032d28d800 x1378464080855044/t0
 o501-MGS@MGC192.168.5.104@tcp_1:26/25 lens 264/432 e 0 to 1 dl 0 ref
 1 fl Rpc:/0/0 rc 0/0
 LustreError: 15c-8: MGC192.168.5.104@tcp: The configuration from log
 'lustre-client' failed (-108). This may be the result of communication
 errors between this node and the MGS, a bad configuration, or other
 errors. See the syslog for more information.
 LustreError: 3839:0:(llite_lib.c:1086:ll_fill_super()) Unable to
 process log: -108
 Lustre: client 81033887dc00 umount complete
 LustreError: 3839:0:(obd_mount.c:2050:lustre_fill_super()) Unable to
 mount  (-108)
 Installing knfsd (copyright (C) 1996 o...@monad.swb.de).
 NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
 NFSD: starting 90-second grace period
 FS-Cache: Loaded
 Lustre: 3977:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
 x1378464080855045 sent from MGC192.168.5.104@tcp to NID
 192.168.5.104@tcp 0s ago has failed due to network error (5s prior to
 deadline).
  req@810324d67400 x1378464080855045/t0
 o250-MGS@MGC192.168.5.104@tcp_0:26/25 lens 368/584 e 0 to 1 dl
 1314605832 ref 1 fl Rpc:N/0/0 rc 0/0
 Lustre: 3977:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
 x1378464080855047 sent from MGC192.168.5.104@tcp to NID
 192.168.5.105@tcp 0s ago has failed due to network error (5s prior to
 deadline).
  req@810330d9c800 x1378464080855047/t0
 o250-MGS@MGC192.168.5.104@tcp_1:26/25 lens 368/584 e 0 to 1 dl
 1314605857 ref 1 fl Rpc:N/0/0 rc 0/0
 LustreError: 5178:0:(client.c:858:ptlrpc_import_delay_req()) @@@
 IMP_INVALID  req@810324d67000 x1378464080855048/t0
 o501-MGS@MGC192.168.5.104@tcp_1:26/25 lens 264/432 e 0 to 1 dl 0 ref
 1 fl Rpc:/0/0 rc 0/0
 LustreError: 15c-8: MGC192.168.5.104@tcp: The configuration from log
 'lustre-client' failed (-108). This may be the result of communication
 errors between this node and the MGS, a bad configuration, or other
 errors. See the syslog for more information.
 LustreError: 5178:0:(llite_lib.c:1086:ll_fill_super()) Unable to
 process log: -108
 Lustre: client 81032f4a3400 umount complete
 LustreError: 5178:0:(obd_mount.c:2050:lustre_fill_super()) Unable to
 mount  (-108)

 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Client unable to connect after reboot: Unable to process log 108

2011-08-29 Thread David Noriega
I've begun to notice this behavior in my clients. I'm not sure what's going
on, but when a client reboots, it's unable to mount Lustre. I have to
use 'lctl ping' to ping one of the Lustre nodes before I'm able to
mount the Lustre filesystem. Any ideas?

Lustre: OBD class driver, http://www.lustre.org/
Lustre: Lustre Version: 1.8.4
Lustre: Build Version:
1.8.4-20100726215630-PRISTINE-2.6.18-194.3.1.el5_lustre.1.8.4
Lustre: Added LNI 192.168.1.2@tcp [8/256/0/180]
Lustre: Accept secure, port 988
Lustre: Lustre Client File System; http://www.lustre.org/
Lustre: 3977:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1378464080855041 sent from MGC192.168.5.104@tcp to NID
192.168.5.104@tcp 5s ago has timed out (5s prior to deadline).
  req@81032d28dc00 x1378464080855041/t0
o250-MGS@MGC192.168.5.104@tcp_0:26/25 lens 368/584 e 0 to 1 dl
1314605796 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 3977:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1378464080855043 sent from MGC192.168.5.104@tcp to NID
192.168.5.105@tcp 5s ago has timed out (5s prior to deadline).
  req@81033f410c00 x1378464080855043/t0
o250-MGS@MGC192.168.5.104@tcp_1:26/25 lens 368/584 e 0 to 1 dl
1314605821 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 3839:0:(client.c:858:ptlrpc_import_delay_req()) @@@
IMP_INVALID  req@81032d28d800 x1378464080855044/t0
o501-MGS@MGC192.168.5.104@tcp_1:26/25 lens 264/432 e 0 to 1 dl 0 ref
1 fl Rpc:/0/0 rc 0/0
LustreError: 15c-8: MGC192.168.5.104@tcp: The configuration from log
'lustre-client' failed (-108). This may be the result of communication
errors between this node and the MGS, a bad configuration, or other
errors. See the syslog for more information.
LustreError: 3839:0:(llite_lib.c:1086:ll_fill_super()) Unable to
process log: -108
Lustre: client 81033887dc00 umount complete
LustreError: 3839:0:(obd_mount.c:2050:lustre_fill_super()) Unable to
mount  (-108)
Installing knfsd (copyright (C) 1996 o...@monad.swb.de).
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
FS-Cache: Loaded
Lustre: 3977:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1378464080855045 sent from MGC192.168.5.104@tcp to NID
192.168.5.104@tcp 0s ago has failed due to network error (5s prior to
deadline).
  req@810324d67400 x1378464080855045/t0
o250-MGS@MGC192.168.5.104@tcp_0:26/25 lens 368/584 e 0 to 1 dl
1314605832 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: 3977:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1378464080855047 sent from MGC192.168.5.104@tcp to NID
192.168.5.105@tcp 0s ago has failed due to network error (5s prior to
deadline).
  req@810330d9c800 x1378464080855047/t0
o250-MGS@MGC192.168.5.104@tcp_1:26/25 lens 368/584 e 0 to 1 dl
1314605857 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 5178:0:(client.c:858:ptlrpc_import_delay_req()) @@@
IMP_INVALID  req@810324d67000 x1378464080855048/t0
o501-MGS@MGC192.168.5.104@tcp_1:26/25 lens 264/432 e 0 to 1 dl 0 ref
1 fl Rpc:/0/0 rc 0/0
LustreError: 15c-8: MGC192.168.5.104@tcp: The configuration from log
'lustre-client' failed (-108). This may be the result of communication
errors between this node and the MGS, a bad configuration, or other
errors. See the syslog for more information.
LustreError: 5178:0:(llite_lib.c:1086:ll_fill_super()) Unable to
process log: -108
Lustre: client 81032f4a3400 umount complete
LustreError: 5178:0:(obd_mount.c:2050:lustre_fill_super()) Unable to
mount  (-108)

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] multipathd or sun rdac driver?

2011-07-20 Thread David Noriega
We already use multipathd in our install, but this was
something I wondered about. We use Sun disk arrays, and they mention
the use of their RDAC driver for multipathing on Linux. Since it's from
the vendor, one would think it would be better. What does the collective
think?

Sun StorageTek RDAC Multipath Failover Driver for Linux
http://download.oracle.com/docs/cd/E19373-01/820-4738-13/chapsing.html

David
-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] multipathd or sun rdac driver?

2011-07-20 Thread David Noriega
They are 2540s, and I'm running EL5 (CentOS).

Well, the thought came around since I had to rebuild a node after a
hardware problem, so I went ahead and gave it a shot. I think I posted
about this problem before somewhere on the mailing list: getting
stray I/O errors for /dev/sdX devices that were the other
path to the same device (well, that's the idea we came to). After
installing the Sun RDAC module and disabling multipathd, I can happily
say those messages are gone, so I suppose Sun's module is able to talk
to the disk array in a better manner than multipathd. I haven't
failed the Lustre OSTs back to this particular node just yet (I will
wait till the weekend). I'll post again if anything goes wrong, but I
think going with this RDAC module might be better.

PS: One thing that has nagged me since Lustre was installed and set up
by a vendor is that the disk arrays were never set up with initiators or
hosts in the configuration (using CAM). We have another similar disk
array (a 6140) we set up for another filesystem, and I know
initiators/hosts were set up on that array. I can't say that this has
caused any problems, but it's something in the back of my mind.

Thanks,
David

On Wed, Jul 20, 2011 at 4:15 PM, Kevin Van Maren
kevin.van.ma...@oracle.com wrote:
 David Noriega wrote:

 We already use multipathd in our install already, but this was
 something I wondered about. We use Sun disk arrays and they mention
 the use of their RDAC driver to multipathing on Linux. Since its from
 the vendor, one would think it be better. What does the collective
 think?

 Sun StorageTek RDAC Multipath Failover Driver for Linux
 http://download.oracle.com/docs/cd/E19373-01/820-4738-13/chapsing.html

 David


 I assume you are using the ST25xx or ST6xxx storage with Lustre?  Exactly
 which arrays?

 I've been happy with RDAC, but I don't think Oracle has released RHEL6
 support yet
 (but Oracle also does not support Lustre servers on RHEL6 yet).

 If your multupath config is working (ie, you've tested it by
 unplugging/replugging cables
 under load and were happy with the behavior), I'm not going to tell you to
 change.

 Kevin





-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Client doesn't mount at boot

2011-06-30 Thread David Noriega
I just installed a new node on the cluster, imaged just like the rest,
but it was unable to mount Lustre at boot. I tried to mount it manually
but got the following from dmesg:

Lustre: OBD class driver, http://www.lustre.org/
Lustre: Lustre Version: 1.8.4
Lustre: Build Version:
1.8.4-20100726215630-PRISTINE-2.6.18-194.3.1.el5_lustre.1.8.4
Lustre: Added LNI 192.168.255.194@tcp [8/256/0/180]
Lustre: Accept secure, port 988
Lustre: Lustre Client File System; http://www.lustre.org/
Lustre: 4872:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1373071042674689 sent from MGC192.168.5.104@tcp to NID
192.168.5.104@tcp 5s ago has timed out (5s prior to deadline).
  req@811070397800 x1373071042674689/t0
o250-MGS@MGC192.168.5.104@tcp_0:26/25 lens 368/584 e 0 to 1 dl
1309462593 ref 1 fl Rpc:N/0/0 rc 0/0
eth0: no IPv6 routers present
Lustre: 4872:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1373071042674691 sent from MGC192.168.5.104@tcp to NID
192.168.5.105@tcp 5s ago has timed out (5s prior to deadline).
  req@81107dc57000 x1373071042674691/t0
o250-MGS@MGC192.168.5.104@tcp_1:26/25 lens 368/584 e 0 to 1 dl
1309462618 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 4735:0:(client.c:858:ptlrpc_import_delay_req()) @@@
IMP_INVALID  req@81107039b800 x1373071042674692/t0
o501-MGS@MGC192.168.5.104@tcp_1:26/25 lens 264/432 e 0 to 1 dl 0 ref
1 fl Rpc:/0/0 rc 0/0
LustreError: 15c-8: MGC192.168.5.104@tcp: The configuration from log
'lustre-client' failed (-108). This may be the result of communication
errors between this node and the MGS, a bad configuration, or other
errors. See the syslog for more information.
LustreError: 4735:0:(llite_lib.c:1086:ll_fill_super()) Unable to
process log: -108
Lustre: client 81106881fc00 umount complete
LustreError: 4735:0:(obd_mount.c:2050:lustre_fill_super()) Unable to
mount  (-108)

and from /var/log/messages:

Jun 30 14:52:18 compute-6-3 kernel: LustreError:
4395:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID
req@81106f017c00 x1373072007364612/t0
o501-MGS@MGC192.168.5.104@tcp_1:26/25 lens 264/432 e 0 to 1 dl 0 ref
1 fl Rpc:/0/0 rc 0/0
Jun 30 14:52:18 compute-6-3 kernel: LustreError: 15c-8:
MGC192.168.5.104@tcp: The configuration from log 'lustre-client'
failed (-108). This may be the result of communication errors between
this node and the MGS, a bad configuration, or other errors. See the
syslog for more information.
Jun 30 14:52:18 compute-6-3 kernel: LustreError:
4395:0:(llite_lib.c:1086:ll_fill_super()) Unable to process log: -108
Jun 30 14:52:18 compute-6-3 kernel: LustreError:
4395:0:(obd_mount.c:2050:lustre_fill_super()) Unable to mount  (-108)

Only after I ran lctl ping x.x.x.x to the MDS/MGS was I able to
manually mount lustre.

I got the idea to run lctl ping from a post by someone with the same
problem, but that was over InfiniBand; we are using Ethernet here.
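
For reference, this is roughly the manual sequence that ended up working
here (NIDs and mount point are from our setup; treat it as a sketch):

modprobe lustre
lctl ping 192.168.5.104@tcp
lctl ping 192.168.5.105@tcp
mount -t lustre 192.168.5.104@tcp:192.168.5.105@tcp:/lustre /lustre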

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] ZFS question: HW raid5 vs raidz?

2011-06-10 Thread David Noriega
I was checking out zfsonlinux.org to see how things have been going
lately and I had a question. What's the difference, or what's better:
using a hardware RAID5 (or 6), or using ZFS to create a raidz pool? In
terms of Lustre, is one preferred over the other?
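
Just to make the comparison concrete, the two layouts I'm weighing would
look something like this (device names and the pool name are made up):

# hardware RAID5/6: the controller presents one LUN and ZFS sees a single vdev
zpool create ostpool /dev/sdb

# raidz: hand ZFS the raw disks and let it do the parity itself
zpool create ostpool raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg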

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] ost_write operation failed with -28 in 1.8.5 lustre client

2011-05-31 Thread David Noriega
We are running lustre 1.8.4 and I can confirm that I see this message
on one of our clients, the 'file server.' It serves up the lustre fs
to machines outside our network via Samba and NFS. On other
clients (nodes in our compute cluster), I see the same message a few
times, though it says -19 or in one case -107 as the error number.
Just as they reported, we've had a few users say they have
gotten a message saying the filesystem is full, even though it's not.
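
For what it's worth, -28 is ENOSPC, and these are the sort of checks that
seem relevant on our side (paths are just examples):

lfs df -h /lustre                    # per-OST block usage
lfs df -i /lustre                    # per-OST inode usage; an OST can run out of inodes first
lfs getstripe /lustre/path/to/file   # see which OSTs a given file's objects live on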

On Fri, Apr 29, 2011 at 10:04 AM, Rajendra prasad rajendra...@gmail.com wrote:
 Hi All,

 I am running lustre servers on 1.8.5 (recently upgraded from 1.8.2).
 Clients are still on 1.8.2 .

 I am getting the error ost_write operation failed with -28 in the clients.
 Due to this i am getting error message as No space left on the device
 oftenly. As per lfs df -h output all the OSTs are occupied around 55% only.

 lfs df -h
 UUID   bytes    Used   Available Use% Mounted on
 lustre-MDT_UUID    52.3G    4.2G   48.1G   8%
 /opt/lustre[MDT:0]
 lustre-OST_UUID   442.9G  245.6G  197.3G  55%
 /opt/lustre[OST:0]
 lustre-OST0001_UUID   442.9G  238.7G  204.3G  53%
 /opt/lustre[OST:1]
 lustre-OST0002_UUID   442.9G  243.2G  199.7G  54%
 /opt/lustre[OST:2]
 lustre-OST0003_UUID   442.9G  236.5G  206.5G  53%
 /opt/lustre[OST:3]
 lustre-OST0004_UUID   442.9G  234.8G  208.1G  53%
 /opt/lustre[OST:4]
 lustre-OST0005_UUID   442.9G  239.7G  203.3G  54%
 /opt/lustre[OST:5]
 lustre-OST0006_UUID   442.9G  237.2G  205.7G  53%
 /opt/lustre[OST:6]
 lustre-OST0007_UUID   442.9G  227.9G  215.0G  51%
 /opt/lustre[OST:7]
 filesystem summary: 3.5T    1.9T    1.6T  53% /opt/lustre
 As per the below bugzilla, i have upgraded one of the lustre client verstion
 to 1.8.5 but still the issue persist in that client.

     https://bugzilla.lustre.org/show_bug.cgi?id=22755

 Lustre clients are on Suse linux 10.1 . In order to install lustre client
 packages of 1.8.5, i have upgraded the Suse kernel also.

 I have also checked and found that no quota are enabled in the clients.

 lfs quota -u 36401 /opt/lustre
 Disk quotas for user 36401 (uid 36401):
  Filesystem  kbytes   quota   limit   grace   files   quota   limit
 grace
     /opt/lustre 127315748   0   0   - 1001083   0
 0   -
 Below are the lustre client packages i have installed.


 lustre-client-modules-1.8.5-2.6.16_60_0.69.1_lustre.1.8.5_smp

 lustre-client-1.8.5-2.6.16_60_0.69.1_lustre.1.8.5_smp



 Suse kernel packages installed:



 kernel-default-2.6.16.60-0.69.1

 kernel-source-2.6.16.60-0.69.1

 kernel-smp-2.6.16.60-0.69.1

 kernel-syms-2.6.16.60-0.69.1



 Error:

 Apr 29 15:35:55 hostname kernel: LustreError: 11-0: an error occurred while
 communicating with 172.16.x.x@tcp. The ost_write operation failed with -28

 Apr 29 15:35:55 hostname kernel: LustreError: Skipped 9657 previous similar
 messages

 Apr 29 15:38:03 hostname kernel: LustreError: 11-0: an error occurred while
 communicating with 172.16.x.x@tcp. The ost_write operation failed with -28



 Kindly suggest.



 Regards,

 Prasad

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss





-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] aacraid kernel panic caused failover

2011-04-06 Thread David Noriega
It is Adaptec-based, just branded by Sun and built by Intel. Anyway, I
reseated the card and will wait and see. If it still goes wonky, is
there a card anyone recommends? It has to be a low-profile PCIe x8 card
with two internal x4 SAS connectors.

On Wed, Apr 6, 2011 at 10:38 AM, Thomas Roth t.r...@gsi.de wrote:
 Provided your card is actually a Adaptec Raid controller (it says
 Adaptec ASR 5405 on our cards, not Intel or Sun), this is definitely
 not the problem. We have had a number of broken or aged batteries amongs
 our 60 or so controller cards, but never any relation with the kernel
 panic and the controller complaining about its BBU.

 Cheers,
 Thomas

 On 04/06/2011 04:58 PM, David Noriega wrote:
 Our adaptec raid card is a Sun StorageTek RAID INT card, made by intel
 of all people. So I installed the raid manager software, which of
 course doesn't say anything is wrong, but it does come with a
 monitoring daemon and it printed this message after the last aacraid
 kernel panic:

 Sun StorageTek RAID Manager Agent: [203] The battery-backup cache
 device needs a new battery: controller 1.

 So could that be the problem?

 On Wed, Apr 6, 2011 at 7:52 AM, Jeff Johnson
 jeff.john...@aeoncomputing.com  wrote:
 I have seen similar behavior on these controllers. On dissimilar configs 
 and different aged systems. These happened to be non-Lustre standalone nfs 
 and iscsi target boxes.

 Went through controller and drive firmware upgrades, low-level fw dumps  
 and analysis from dev engineers.

 In the end it was never really explained or resolved. It appears that these 
 controllers, like small children, have tantrums and fall apart. A power 
 cycle clears the condition.

 Not the best controller for an OSS.

 --Jeff

 ---mobile signature---
 Jeff Johnson - Aeon Computing
 jeff.john...@aeoncomputing.com


 On Apr 6, 2011, at 1:05, Thomas Rotht.r...@gsi.de  wrote:

 We have ~ 60 servers with these Adaptec controllers, and found this 
 problem just to happen from time to time.
 Upgrade of the aacraid module wouldn't help. We had contacts to Adaptec, 
 but they had no clue either.
 Only good thing is it seems that this adapter panic happens in an instant, 
 halting the machine, but has no prior phase of degradation: the controller
 doesn't start leaving out every second bit or just writing the '1's and 
 not the '0's or ... - so whatever data has made it to the disks before the
 crash seems to be quite sensible. Reboot and never buy Adaptec again.

 Cheers,
 Thomas

 On 04/06/2011 07:03 AM, David Noriega wrote:
 Ok I updated the aacraid driver and the raid firmware, yet I still had
 the problem happen, so I did more research and applied the following
 tweaks:

 1) Rebuilt mkinitrd with the following options:
 a) edit /etc/sysconfig/mkinitrid/multipath to contain MULTIPATH=yes
 b) mkinitrid initrd-2.6.18-194.3.1.el5_lustre.1.8.4.img
 2.6.18-194.3.1.el5_lustre.1.8.4 --preload=scsi_dh_rdac
 2) Added the local hard disk to the multipath black list
 3) Edited modprobe.conf to have the following aacraid options:
 options aacraid firmware_debug=2 startup_timeout=60 #the debug doesn't
 seem to print anything to dmesg
 4) Added pcie_aspm=off to the kernel boot options

 So things looked good for a while. I did have a problem mounting the
 lustre partitions but this was my fault in misconfiguring some lnet
 options I was experimenting with. I fixed that and just as a test, I
 ran 'modprobe lustre' since I wasn't ready to fail back the partitions
 just yet(wanted to wait till when activity was the lowest). That was
 earlier today. I was about to fail back tonight, yet when I checked
 the server again I saw in dmesg the same aacraid problems from before.
 Is it possible lustre is interfering with aacraid? Its weird since I
 do have a duplicate machine and its not having any of thise problems.

 On Fri, Mar 25, 2011 at 9:55 AM, Temple  Jasonjtem...@cscs.ch  wrote:
 Adaptec should have the firmware and drivers on their site for your 
 card.  If not adaptec, then SOracle will have it available somewhere.

 The firmware and system drivers usually have a utility that will check 
 the current version and upgrade it for you.

 Hope this helps (I use different cards, so I can't tell you exactly).

 -Jason

 -Original Message-
 From: David Noriega [mailto:tsk...@my.utsa.edu]
 Sent: venerdì, 25. marzo 2011 15:47
 To: Temple Jason
 Subject: Re: [Lustre-discuss] aacraid kernel panic caused failover

 Hmm not sure, whats the best way to find out?

 On Fri, Mar 25, 2011 at 9:46 AM, Temple  Jasonjtem...@cscs.ch  wrote:
 Hi,

 Are you using the latest firmware?  This sort of thing used to happen 
 to me, but with different raid cards.

 -Jason

 -Original Message-
 From: lustre-discuss-boun...@lists.lustre.org 
 [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of David 
 Noriega
 Sent: venerdì, 25. marzo 2011 15:38
 To: lustre-discuss@lists.lustre.org
 Subject: [Lustre-discuss] aacraid kernel panic caused failover

Re: [Lustre-discuss] LNET routing question

2011-04-05 Thread David Noriega
What about this example?
http://comments.gmane.org/gmane.comp.file-systems.lustre.user/6687

Also, to my second question: would these changes have to be done all at
once, or could I edit one modprobe.conf at a time and fail over and then
back as I make changes to each OSS/MDS?

Thanks
David

On Tue, Apr 5, 2011 at 11:52 AM, Cliff White cli...@whamcloud.com wrote:
 Lustre routing is to connect different types of network.
 If all your networks are TCP, you should be able to use standard
 TCP routing/addressing without needing Lustre routers.
 Again, if the Linux workstations in your lab are TCP, you should
 be able to create a TCP route to the Lustre servers without needing
 a Lustre router in the middle, unless you have some barrier, and
 you need a lustre router to cross that barrier.
 Generally, people do not use routers as clients, there is nothing
 stopping your from doing this, but a) the router will take resources
 away from the clients, impacting performance of both.
 b) again,clients are typically endpoints, and routers sit in the middle,
 so from a network design perspective it's usually silly.
 Also, lustre routers function as a pool, failed routers are bypassed,
 so a pool of dedicated routers can tolerate individual machine outages.
 That's not good for clients.
 But the main reason people do dedicated boxes for Lustre routing, is that
 Lustre routing is designed to bridge different network hardware.
 If all your nets are TCP, I think using standard networking methods will be
 better for you, simpler and easier to maintain.
 cliffw

 On Mon, Apr 4, 2011 at 6:50 PM, David Noriega tsk...@my.utsa.edu wrote:

 The file server does sit on both networks, internal and external. I
 would just like to have a thrid option beyond nfs/samba, such as
 making the linux workstations up in our lab, lustre clients. But you
 are saying either 1) I do some sort of regular tcp routing? or 2) an
 existing client cannot also work as a router?

 On Mon, Apr 4, 2011 at 3:43 PM, Cliff White cli...@whamcloud.com wrote:
 
 
  On Mon, Apr 4, 2011 at 1:32 PM, David Noriega tsk...@my.utsa.edu
  wrote:
 
  Reading up on LNET routing and have a question. Currently have nothing
  special going on, simply specified tcp0(bond0) on the OSSs and MDS.
  Same for all the clients as well, we have an internal network for our
  cluster, 192.168.x.x.  How would I go about doing the following?
 
  Data1,Data2 = OSS, Meta1,Meta2 = MDS.
 
  Internally its 192.168.1.x for cluster nodes, 192.168.5.x for lustre
  nodes.
 
  But I would like a 1) a 'forwarding' sever, which would be our file
  server which exports lustre via samba/nfs to also be the outside
  world's access point to lustre(outside world being the rest of the
  campus). 2) a second internal network simply connecting the OSSs and
  MDS to the backup client to do backups outside of the cluster network.
 
  Slightly confused am I.
   1) is just a samba/nfs exporter, while you might
  have two networks in the one box, you wouldn't be doing any routing,
  the Lustre client is re-exporting the FS.
  The Lustre client has to find the Lustre servers, the samba/NFS clients
  only
  have to find the Lustre client.
  2) if the second internal net connects backup clients directly to
  OSS/MDS
  you  again need no routing.
  Lustre Routing is really to connect disparte network hardware for
  Lustre traffic, for example Infiniband routed to TCP/IP, or Quadratics
  to
  IB.
  Also, file servers are never routers, since they have direct connections
  to
  all clients. Routers are dedicated nodes that have both hardware
  interfaces
  and
  sit between a client and server.
  Typical setup are things like a cluster with server and clients on IB,
  you
  wish to add a second client pool on TCP/IP, you have to build nodes that
  have both TCP/IP and IB interfaces, and those are Lustre Routers.
  Since all your traffic is TCP/IP, sounds like normal TCP/IP network
  manipulation
  is all you are needing. You would need the 'lnet networks' stuff to
  align nets with interfaces, and that part looks correct.
  cliffw
 
 
  So would I do the following?
 
  OSS/MDS
  options lnet networks=tcp0(bond0),tcp1(eth3) routes=tcp2 192.168.2.1
 
  Backup client
  options lnet networks=tcp1(eth1)
 
  Cluster clients
  options lnet networks=tcp0(eth0)
 
  File Server
  options lnet networks=tcp0(eth1),tcp2(eth2) forwarding=enabled
 
  And for any outside clients I would do the following?
  options lnet networks=tcp2(eth0)
 
  And when mounting from the outside I would use in /etc/fstab the
  external
  ip?
  x.x.x.x@tcp2:/lustre /lustre lustre defaults,_netdev 0 0
 
  Is this how it would work? Also can I do this piece-meal or does it
  have to be done all at once?
 
  Thanks
  David
 
  --
  Personally, I liked the university. They gave us money and facilities,
  we didn't have to produce anything! You've never been out of college!
  You don't know what it's like out there! I've worked in the private
  sector

Re: [Lustre-discuss] LNET routing question

2011-04-05 Thread David Noriega
Well, I would call our setup a barrier case. The internal 192.168.x.x
network is completely internal to the cluster, inaccessible from the
outside. So following this, I can set up a router machine to allow
access from the external network to Lustre, correct?

On part two, I would simply be adding tcp1 to the OSS/MDS nodes; tcp0,
which everything already connects to, would still be there. So my guess,
and it looks like you agree, is that as long as the clients continue to
use tcp0, which they will, they will still be able to connect just
fine. tcp1 would be just for the backup client.

Thanks
David

On Tue, Apr 5, 2011 at 1:48 PM, Cliff White cli...@whamcloud.com wrote:
 That's the 'barrier' case i was talking about - using routers to separate
 public/private networks - basically using a Lustre router as a hole through
 a firewall.
 Second question - depends on your world. Obviously, machines with
 mis-matched network configs may not be able to comunicate, so depends on
 whether you can tolerate some clients not reaching some OST while the
 changes are rolling through.
 I would think adding the second net for the backup would be transparent to
 existing
 clients, modulo the OST restart needed.
 cliffw

 On Tue, Apr 5, 2011 at 11:36 AM, David Noriega tsk...@my.utsa.edu wrote:

 What about this example?
 http://comments.gmane.org/gmane.comp.file-systems.lustre.user/6687

 Also to my second question, would these changes have to be done all at
 once? or could I edit one modprobe.conf at a time and fail over then
 back as I make changes to each oss/mds?

 Thanks
 David

 On Tue, Apr 5, 2011 at 11:52 AM, Cliff White cli...@whamcloud.com wrote:
  Lustre routing is to connect different types of network.
  If all your networks are TCP, you should be able to use standard
  TCP routing/addressing without needing Lustre routers.
  Again, if the Linux workstations in your lab are TCP, you should
  be able to create a TCP route to the Lustre servers without needing
  a Lustre router in the middle, unless you have some barrier, and
  you need a lustre router to cross that barrier.
  Generally, people do not use routers as clients, there is nothing
  stopping your from doing this, but a) the router will take resources
  away from the clients, impacting performance of both.
  b) again,clients are typically endpoints, and routers sit in the middle,
  so from a network design perspective it's usually silly.
  Also, lustre routers function as a pool, failed routers are bypassed,
  so a pool of dedicated routers can tolerate individual machine outages.
  That's not good for clients.
  But the main reason people do dedicated boxes for Lustre routing, is
  that
  Lustre routing is designed to bridge different network hardware.
  If all your nets are TCP, I think using standard networking methods will
  be
  better for you, simpler and easier to maintain.
  cliffw
 
  On Mon, Apr 4, 2011 at 6:50 PM, David Noriega tsk...@my.utsa.edu
  wrote:
 
  The file server does sit on both networks, internal and external. I
  would just like to have a thrid option beyond nfs/samba, such as
  making the linux workstations up in our lab, lustre clients. But you
  are saying either 1) I do some sort of regular tcp routing? or 2) an
  existing client cannot also work as a router?
 
  On Mon, Apr 4, 2011 at 3:43 PM, Cliff White cli...@whamcloud.com
  wrote:
  
  
   On Mon, Apr 4, 2011 at 1:32 PM, David Noriega tsk...@my.utsa.edu
   wrote:
  
   Reading up on LNET routing and have a question. Currently have
   nothing
   special going on, simply specified tcp0(bond0) on the OSSs and MDS.
   Same for all the clients as well, we have an internal network for
   our
   cluster, 192.168.x.x.  How would I go about doing the following?
  
   Data1,Data2 = OSS, Meta1,Meta2 = MDS.
  
   Internally its 192.168.1.x for cluster nodes, 192.168.5.x for lustre
   nodes.
  
   But I would like a 1) a 'forwarding' sever, which would be our file
   server which exports lustre via samba/nfs to also be the outside
   world's access point to lustre(outside world being the rest of the
   campus). 2) a second internal network simply connecting the OSSs and
   MDS to the backup client to do backups outside of the cluster
   network.
  
   Slightly confused am I.
    1) is just a samba/nfs exporter, while you might
   have two networks in the one box, you wouldn't be doing any routing,
   the Lustre client is re-exporting the FS.
   The Lustre client has to find the Lustre servers, the samba/NFS
   clients
   only
   have to find the Lustre client.
   2) if the second internal net connects backup clients directly to
   OSS/MDS
   you  again need no routing.
   Lustre Routing is really to connect disparte network hardware for
   Lustre traffic, for example Infiniband routed to TCP/IP, or
   Quadratics
   to
   IB.
   Also, file servers are never routers, since they have direct
   connections
   to
   all clients. Routers are dedicated nodes that have both hardware
   interfaces

Re: [Lustre-discuss] aacraid kernel panic caused failover

2011-04-05 Thread David Noriega
OK, I updated the aacraid driver and the RAID firmware, yet the problem
still happened, so I did more research and applied the following
tweaks:

1) Rebuilt the initrd with the following options:
a) edited /etc/sysconfig/mkinitrd/multipath to contain MULTIPATH=yes
b) mkinitrd initrd-2.6.18-194.3.1.el5_lustre.1.8.4.img
2.6.18-194.3.1.el5_lustre.1.8.4 --preload=scsi_dh_rdac
2) Added the local hard disk to the multipath blacklist (see the sketch below)
3) Edited modprobe.conf to have the following aacraid options:
options aacraid firmware_debug=2 startup_timeout=60  # the debug doesn't
seem to print anything to dmesg
4) Added pcie_aspm=off to the kernel boot options
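
For completeness, the blacklist from step 2 is just the standard
multipath.conf mechanism; roughly this (the wwid below is a placeholder,
not our real one):

# /etc/multipath.conf
blacklist {
    wwid 3600508b1001c1234567890abcdef1234
    devnode "^sda$"
}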

So things looked good for a while. I did have a problem mounting the
lustre partitions, but that was my fault for misconfiguring some lnet
options I was experimenting with. I fixed that and, just as a test,
ran 'modprobe lustre', since I wasn't ready to fail back the partitions
just yet (I wanted to wait until activity was lowest). That was
earlier today. I was about to fail back tonight, yet when I checked
the server again I saw in dmesg the same aacraid problems from before.
Is it possible Lustre is interfering with aacraid? It's weird, since I
have a duplicate machine and it's not having any of these problems.

On Fri, Mar 25, 2011 at 9:55 AM, Temple  Jason jtem...@cscs.ch wrote:
 Adaptec should have the firmware and drivers on their site for your card.  If 
 not adaptec, then SOracle will have it available somewhere.

 The firmware and system drivers usually have a utility that will check the 
 current version and upgrade it for you.

 Hope this helps (I use different cards, so I can't tell you exactly).

 -Jason

 -Original Message-
 From: David Noriega [mailto:tsk...@my.utsa.edu]
 Sent: venerdì, 25. marzo 2011 15:47
 To: Temple Jason
 Subject: Re: [Lustre-discuss] aacraid kernel panic caused failover

 Hmm not sure, whats the best way to find out?

 On Fri, Mar 25, 2011 at 9:46 AM, Temple  Jason jtem...@cscs.ch wrote:
 Hi,

 Are you using the latest firmware?  This sort of thing used to happen to me, 
 but with different raid cards.

 -Jason

 -Original Message-
 From: lustre-discuss-boun...@lists.lustre.org 
 [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of David Noriega
 Sent: venerdì, 25. marzo 2011 15:38
 To: lustre-discuss@lists.lustre.org
 Subject: [Lustre-discuss] aacraid kernel panic caused failover

 Had some crazyness happen to our lustre system. We have two OSSs, both
 identical sun x4140 servers and on only one of them have I've seen
 this pop up in the kernel messages and then a kernel panic. The panic
 seemed to then spread and caused the network to go down and the second
 OSS to try to failover(or failback?). Anyways 'splitbrain' occurred
 and I was able to get in and set them straight. I researched this
 aacraid module messages and so far all I can find says to increase the
 timeout, but these are old messages and currently they are set to 60.
 Anyone else have any ideas?

 aacraid: Host adapter abort request (0,0,0,0)
 aacraid: Host adapter reset request. SCSI hang ?
 AAC: Host adapter BLINK LED 0xef
 AAC0: adapter kernel panic'd ef.

 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] LNET routing question

2011-04-04 Thread David Noriega
Reading up on LNET routing and I have a question. Currently we have nothing
special going on; we simply specified tcp0(bond0) on the OSSs and MDS, and
the same for all the clients. We have an internal network for our
cluster, 192.168.x.x.  How would I go about doing the following?

Data1,Data2 = OSS, Meta1,Meta2 = MDS.

Internally its 192.168.1.x for cluster nodes, 192.168.5.x for lustre nodes.

But I would like 1) a 'forwarding' server, which would be our file
server that exports Lustre via Samba/NFS, to also be the outside
world's access point to Lustre (the outside world being the rest of the
campus), and 2) a second internal network simply connecting the OSSs and
MDS to the backup client, to do backups outside of the cluster network.

So would I do the following?

OSS/MDS
options lnet networks=tcp0(bond0),tcp1(eth3) routes=tcp2 192.168.2.1

Backup client
options lnet networks=tcp1(eth1)

Cluster clients
options lnet networks=tcp0(eth0)

File Server
options lnet networks=tcp0(eth1),tcp2(eth2) forwarding=enabled

And for any outside clients I would do the following?
options lnet networks=tcp2(eth0)

And when mounting from the outside, I would use the external IP in /etc/fstab?
x.x.x.x@tcp2:/lustre /lustre lustre defaults,_netdev 0 0

Is this how it would work? Also, can I do this piecemeal, or does it
have to be done all at once?

Thanks
David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] aacraid kernel panic caused failover

2011-03-25 Thread David Noriega
Had some craziness happen to our Lustre system. We have two OSSs, both
identical Sun X4140 servers, and on only one of them have I seen
this pop up in the kernel messages, followed by a kernel panic. The panic
seemed to spread and caused the network to go down and the second
OSS to try to fail over (or fail back?). Anyway, 'split brain' occurred
and I was able to get in and set them straight. I researched these
aacraid module messages and so far all I can find says to increase the
timeout, but those are old reports and the timeout is currently set to 60.
Anyone else have any ideas?

aacraid: Host adapter abort request (0,0,0,0)
aacraid: Host adapter reset request. SCSI hang ?
AAC: Host adapter BLINK LED 0xef
AAC0: adapter kernel panic'd ef.

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Help debugging a client

2011-03-11 Thread David Noriega
Kernel version 2.6.18-194.3.1.el5_lustre.1.8.4, downloaded from the Lustre
site and recompiled. How can I check the stack size, and how would I
increase it?

On Fri, Mar 11, 2011 at 1:17 PM, Michael Barnes michael.bar...@jlab.org wrote:
 David,

 What kernel are you running on the file server?  I've heard on the list
 that the stock RedHat kernels are compiled with too small of a stack
 size option and that running NFS and lustre on the same node will not
 behave well together.  A minimum of a 8k stack size is needed for this
 configuration.

 -mb

 On Mar 11, 2011, at 12:37 PM, David Noriega wrote:

 We've been running Lustre happily for a few months now, but we have
 one client that can be troublesome at times and it happens to be the
 most important client. Its our file server client as it runs NFS and
 Samba. I'm not sure where to start. I've seen this client disconnect
 from lustre nodes, but then recover and reconnect. There are hundreds
 of messages in dmesg about a few inodes. The big problem happened a
 few weeks ago when this client was booted and never could reconnect.
 The client and the lustre nodes simply kept saying HELLO to each
 other.

 Anyways as of right now this is what I see in dmesg:

 nfsd: non-standard errno: -108
 LustreError: 30558:0:(mdc_locks.c:646:mdc_enqueue()) ldlm_cli_enqueue: -108
 LustreError: 30558:0:(mdc_locks.c:646:mdc_enqueue()) Skipped 2114
 previous similar messages
 LustreError: 30558:0:(file.c:3280:ll_inode_revalidate_fini()) failure
 -108 inode 561619132
 LustreError: 30558:0:(file.c:3280:ll_inode_revalidate_fini()) Skipped
 777 previous similar messages
 LustreError: 29282:0:(file.c:116:ll_close_inode_openhandle()) inode
 18382976 mdc close failed: rc = -108
 nfsd: non-standard errno: -108
 LustreError: 29282:0:(file.c:116:ll_close_inode_openhandle()) Skipped
 17238 previous similar messages
 nfsd: non-standard errno: -108
 nfsd: non-standard errno: -108
 nfsd: non-standard errno: -108
 nfsd: non-standard errno: -108
 nfsd: non-standard errno: -108
 LustreError: 29282:0:(client.c:858:ptlrpc_import_delay_req()) @@@
 IMP_INVALID  req@81032da81800 x1360479978792199/t0
 o35-lustre-MDT_UUID@192.168.5.104@tcp:23/10 lens 408/1128 e 0 to
 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
 LustreError: 29282:0:(client.c:858:ptlrpc_import_delay_req()) Skipped
 19011 previous similar messages
 nfsd: non-standard errno: -108

 LustreError: 11-0: an error occurred while communicating with
 192.168.5.104@tcp. The mds_close operation failed with -116
 LustreError: 520:0:(file.c:116:ll_close_inode_openhandle()) inode
 12094041 mdc close failed: rc = -116
 LustreError: 30271:0:(llite_nfs.c:96:search_inode_for_lustre())
 failure -2 inode 560111661


 Any ideas?

 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

 --
 +---
 | Michael Barnes
 |
 | Thomas Jefferson National Accelerator Facility
 | Scientific Computing Group
 | 12000 Jefferson Ave.
 | Newport News, VA 23606
 | (757) 269-7634
 +---




 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Setting up quotas after the fact

2011-03-10 Thread David Noriega
I've been reading up on setting up quotas, and it looks like Lustre needs
to be shut down for that since it scans the entire filesystem. The thing
is, we already have ours up and running with quite a bit of data on
it. So, any idea how to estimate how long it would take to set up
quotas on Lustre?
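
In case it helps, the scan I'm looking at running (with our mount point)
would just be:

lfs quotacheck -ug /lustre    # build usage accounting for users (-u) and groups (-g)

and the question is really how long that takes on a mostly full filesystem.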

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Setting up quotas after the fact

2011-03-10 Thread David Noriega
Well, we are running Lustre 1.8.4, so that's great to hear. Thanks.

On Thu, Mar 10, 2011 at 12:15 PM, Johann Lombardi joh...@whamcloud.com wrote:
 On Thu, Mar 10, 2011 at 11:51:44AM -0600, David Noriega wrote:
 I've been reading up on setting up quotas and looks like luster needs
 to be shut down for that as it scans the entire filesystem. The thing

 The problem is that accounting can be wrong if files/blocks are 
 allocated/freed during the scan.

 is we already have ours up and running and with quite a bit of data on
 it. So any idea on how to estimate how long it would be to setup
 quotas on lustre?

 quotacheck has been greatly improved in 1.8.2 (see bugzilla ticket 19763 for 
 more information). As an example, quotacheck takes approximately 5min to 
 complete when run against a 3.4TB filesystem (2 OSTs) which is 87% full.

 Cheers,
 Johann




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Metadata performance question

2010-10-05 Thread David Noriega
If I'm wrong please let me know, but my understanding of how Lustre
1.8 works is that metadata is only served from a single host. So should
there be a lot of activity, the metadata server becomes a bottleneck.
But I've heard that in version 2.x we'll be able to set up multiple
machines for metadata, just like for the OSSs, and that should cut down
on the bottleneck when accessing metadata information.

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Setting up quotas

2010-10-05 Thread David Noriega
Can I set up quotas after Lustre is active, or does that require taking
everything offline? Or could I just run 'lfs quotaon' and then start
setting quotas for every user? Will running this command on one client
then affect all of them, or do I have to run it everywhere? And is
there a way to notify users, or at least the admins, via email? Or is it
simply something that is returned on the shell?
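
To make that concrete, what I have in mind is something like the following,
run from one client (the username and limits are made up; block limits are
in kB, inode limits are file counts):

lfs quotaon -ug /lustre
lfs setquota -u someuser -b 950000000 -B 1000000000 -i 900000 -I 1000000 /lustre
lfs quota -u someuser /lustre

assuming that is indeed how it propagates to the other clients.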

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Strange messages from samba

2010-10-05 Thread David Noriega
So then Samba isn't Lustre-aware, in the sense that it checks and respects Lustre quotas?

On Tue, Oct 5, 2010 at 7:18 AM, Johann Lombardi
johann.lomba...@oracle.com wrote:
 Hi David,

 On Mon, Oct 04, 2010 at 12:09:21PM -0500, David Noriega wrote:
 Moved our samba server to use Lustre as its backend file system and
 things look like they are working, but I'm seeing the following
 message repeat over and over

 [2010/10/04 11:09:40, 0] lib/sysquotas.c:sys_get_quota(421)
  sys_path_to_bdev() failed for path [.]!
 [...]

 Any ideas?

 A quick google search shows that others - who don't export a lustre fs - get 
 the same error message, so i would recommend to try the samba mailing list 
 instead.
 That being said, if samba tries to access quota information through standard 
 quotactl(2) calls, it cannot work since Lustre has its own quota 
 administrative interface (see llapi_quotactl(3)).

 HTH

 Cheers,
 Johann




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Strange messages from samba

2010-10-04 Thread David Noriega
We moved our Samba server to use Lustre as its backend file system, and
things look like they are working, but I'm seeing the following
message repeated over and over:

[2010/10/04 11:09:40, 0] lib/sysquotas.c:sys_get_quota(421)
 sys_path_to_bdev() failed for path [.]!
[2010/10/04 11:09:40, 0] lib/sysquotas.c:sys_get_quota(421)
 sys_path_to_bdev() failed for path [.]!
[2010/10/04 11:09:45, 0] lib/sysquotas.c:sys_get_quota(421)
 sys_path_to_bdev() failed for path [.]!
[2010/10/04 11:09:45, 0] lib/sysquotas.c:sys_get_quota(421)
 sys_path_to_bdev() failed for path [.]!

Any ideas?

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Profiling data

2010-09-28 Thread David Noriega
This question isn't really about Lustre but about file system
administration. I was wondering what tools exist, particularly
anything free/open source, that can scan for old files and report
to the admin or user that said files are, say, a year old and should
be archived or deleted. Also, are there any tools that can profile file
types, for example to check whether someone is keeping their mp3 library
on our server?
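
Nothing Lustre-specific, but the crude version I can picture is just GNU
find run against the client mount (paths are ours, the thresholds are
made up):

find /lustre/home -type f -mtime +365 -printf '%u %TY-%Tm-%Td %p\n' > old-files.txt
find /lustre/home -type f -iname '*.mp3' -printf '%u %s %p\n' > mp3-files.txt

though walking the whole filesystem this way is slow, hence the question
about better tools.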

Thanks
David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Exporting lustre over nfs

2010-09-02 Thread David Noriega
I've read you can export Lustre via NFS, but I'm running into some
trouble. I tried NFSv3, but when I would check a directory, all the
files were labeled red and 'ls -al' showed no username or permissions,
just question marks. This is what I saw on the server:

nfsd: non-standard errno: -43
LustreError: 11-0: an error occurred while communicating with
192.168.5@tcp. The mds_getxattr operation failed with -43
nfsd: non-standard errno: -43
LustreError: 11-0: an error occurred while communicating with
192.168.5@tcp. The mds_getxattr operation failed with -43
nfsd: non-standard errno: -43

So then I tried out NFSv4, and trying to navigate or ls into the NFS
mount would hang, and I would only get the mds_getxattr error.

Something I'm doing wrong?
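
For reference, the export setup itself is nothing exotic; roughly this
(the network range and options are my best guess at something sane, not a
known-good recipe):

# /etc/exports on the Lustre client doing the exporting
/lustre  192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)

# then re-export and mount from an NFS client
exportfs -ra
mount -t nfs -o vers=3 fileserver:/lustre /mnt/lustre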
-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Samba and file locking

2010-08-30 Thread David Noriega
No, we will only have a single Samba server sharing out Lustre-backed
files. What do you mean by 'in a way similar to Samba'? What does Samba do
that is different? We are using Lustre to replace our old NFS server
for serving up home directories to our cluster and the rest of our
systems.

On Fri, Aug 27, 2010 at 6:15 PM, Oleg Drokin oleg.dro...@oracle.com wrote:
 Hello!

 On Aug 27, 2010, at 6:41 PM, David Noriega wrote:
 But I also found out about the flock option for lustre. Should I set
 flock on all clients? or can I just use localflock option on the
 fileserver?

 It depends.
 If you are 100% sure none of your other clients use flocks in a way similar 
 to samba to
 guard their file accesses AND you don't export (same fs with) samba from more 
 than one node, you
 can mount with localflock on samba-exporting node.

 Otherwise you need to mount with flock, but please be aware that flock is not 
 exactly cheap in lustre,
 every flock operation is a synchronous RPC plus it puts even more load on MDS 
 and some applications
 start to use flock once they see it as available resulting in possible 
 unexpected slowdowns
 (MPI apps in some IO modes without lustre ADIO driver tend to do this, I 
 think)

 Bye,
    Oleg



-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Samba and file locking

2010-08-30 Thread David Noriega
Well, the Samba server will be just for that, but we only have the
single filesystem '/lustre'. So because of that, am I going to have to
put the flock option on all of the clients? This was my original
question.
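
To spell out the two variants as I understand them, the client fstab
entries would differ only in the lock option (NID and paths are from our
setup, but treat this as a sketch):

# only on the Samba-exporting client, if nothing else ever uses flock:
192.168.5.104@tcp0:/lustre  /lustre  lustre  defaults,_netdev,localflock  0 0

# or consistently on every client, if cluster-wide flock semantics are needed:
192.168.5.104@tcp0:/lustre  /lustre  lustre  defaults,_netdev,flock  0 0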

On Mon, Aug 30, 2010 at 10:52 AM, Mark Hahn h...@mcmaster.ca wrote:
 No, we will only have a single samba server sharing out lustre-backed
 files. What do you mean in a way similar to samba? What does samba do
 that is different? We are using lustre to replace our old nfs server
 for serving up home directories in our cluster and the rest of our
 systems.

 what he meant is that if lustre is backing a single samba server,
 and the shared filesystem is only available via samba, you can turn
 optimize from flock to localflock.  that is, since flock is relatively
 expensive, localflock provides the behavior within a single client, such as
 the machine running samba.  if you have other lustre clients
 also mounting that filesystem, you'll need flock not localflock to provide
 consistency.

 -mark

 On Fri, Aug 27, 2010 at 6:15 PM, Oleg Drokin oleg.dro...@oracle.com
 wrote:

 Hello!

 On Aug 27, 2010, at 6:41 PM, David Noriega wrote:

 But I also found out about the flock option for lustre. Should I set
 flock on all clients? or can I just use localflock option on the
 fileserver?

 It depends.
 If you are 100% sure none of your other clients use flocks in a way
 similar to samba to
 guard their file accesses AND you don't export (same fs with) samba from
 more than one node, you
 can mount with localflock on samba-exporting node.

 Otherwise you need to mount with flock, but please be aware that flock is
 not exactly cheap in lustre,
 every flock operation is a synchronous RPC plus it puts even more load on
 MDS and some applications
 start to use flock once they see it as available resulting in possible
 unexpected slowdowns
 (MPI apps in some IO modes without lustre ADIO driver tend to do this, I
 think)

 Bye,
    Oleg



-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Samba and file locking

2010-08-27 Thread David Noriega
Are there issues with Samba and Lustre working together? I remember
something about turning oplocks off in Samba, and while testing Samba
I noticed this:

[2010/08/27 17:30:59, 3] lib/util.c:fcntl_getlock(2064)
  fcntl_getlock: lock request failed at offset 75694080 count 65536
type 1 (Function not implemented)

But I also found out about the flock option for Lustre. Should I set
flock on all clients, or can I just use the localflock option on the
file server?
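
For what it's worth, this is roughly the sort of share stanza I was
testing with; the oplock lines are the part I'm unsure about (share name
and path are just examples):

[lustre]
    path = /lustre
    read only = no
    oplocks = no
    level2 oplocks = no
    kernel oplocks = no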

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] LNET internal/external question

2010-08-25 Thread David Noriega
OK, our Lustre system is up and running, but currently it's hooked into
our internal network. How do we go about accessing it from the
external (university) network?

It's the basic setup: two OSSs and two MDS/MGS nodes, all set up with
failover, and all mount options are currently set using their internal
IPs (192.168.x.x). When these machines are given a public IP, do I have
to change anything to allow access from external clients (i.e. not from
the 192.168.x.x space)?
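
My guess is that the change on the servers is just the lnet line in
modprobe.conf, listing the public interface as a second tcp network,
something like (the public interface name here is assumed):

options lnet networks=tcp0(bond0),tcp1(eth2)

with external clients then mounting via the servers' public NIDs on tcp1,
but that is exactly the part I'd like confirmed.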

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Configuration question

2010-08-19 Thread David Noriega
I'm curious about the underlying framework of Lustre with regard to failover.

When creating the filesystems, one can provide --failnode=x.x@tcp0,
and even for the OSTs you can provide two NIDs for the MDS/MGS. What
do these options tell Lustre and the clients? Are they required for
use with Heartbeat? If so, why doesn't that section of the manual
reference this? Also, I think there is a typo in 4.5 Operational
Scenarios, where it says one can use 'mkfs.lustre --ost --mgs
--fsname='. That of course returns an error.

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] mkfs.lustre and failover question

2010-08-18 Thread David Noriega
I've read through the 'More Complicated Configurations' section in the
manual, and as part of setting up failover with two (active/passive)
MDS/MGS nodes and two OSSs (active/active) it says to use the
following:

mkfs.lustre --fsname=lustre --ost --failnode=192.168.5@tcp0
--mgsnode=192.168.5@tcp0,192.168.5@tcp0
/dev/lustre-ost1-dg1/lv1

Yet it fails when I try to mount:

kjournald starting.  Commit interval 5 seconds
LDISKFS FS on dm-9, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
LDISKFS FS on dm-9, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
Lustre: 5984:0:(client.c:1463:ptlrpc_expire_one_request()) @@@ Request
x1343800163193112 sent from mgc192.168.5@tcp to NID
192.168.5@tcp 5s ago has timed out (5s prior to deadline).
  r...@810118a53400 x1343800163193112/t0
o250-m...@mgc192.168.5.104@tcp_0:26/25 lens 368/584 e 0 to 1 dl
1282144240 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 4854:0:(obd_mount.c:1095:server_start_targets()) Required
registration failed for lustre-OST: -4
LustreError: 4854:0:(obd_mount.c:1653:server_fill_super()) Unable to
start targets: -4
LustreError: 4854:0:(obd_mount.c:1436:server_put_super()) no obd lustre-OST
LustreError: 4854:0:(obd_mount.c:147:server_deregister_mount())
lustre-OST not registered
LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0
breaks, 0 lost
LDISKFS-fs: mballoc: 0 generated and it took 0
LDISKFS-fs: mballoc: 0 preallocated, 0 discarded

Reading that makes me think it's looking for 192.168.5.105 to be an
active MGS/MDS as well as 192.168.5.104 (which is the primary).

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Splitting lustre space

2010-08-18 Thread David Noriega
OK, hooray! Lustre is set up with failover on all nodes, but now we have
this one huge Lustre mount point. How can I, say, create /lustre/home and
/lustre/groups and mount them on the client?

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Splitting lustre space

2010-08-18 Thread David Noriega
Ok, so I could do
mount --bind /lustre/home /home
mount --bind /lustre/groups /groups

Is this a generally accepted practice with Lustre? It just seems so
much like a nifty trick, but if it's what the community uses, then OK.
But ultimately, if I wanted two separate filesystems, I would need more
hardware? An OST can't be put into a general 'pool' for use between
the two?
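
If the bind-mount route is the accepted one, I assume the client fstab
would end up looking something like this (NID and paths from our setup,
the exact syntax is my guess):

192.168.5.104@tcp0:/lustre  /lustre   lustre  defaults,_netdev  0 0
/lustre/home                /home     none    bind              0 0
/lustre/groups              /groups   none    bind              0 0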

David

On Wed, Aug 18, 2010 at 12:33 PM, Kevin Van Maren
kevin.van.ma...@oracle.com wrote:
 David Noriega wrote:

 OK hooray! Lustre setup with failover of all nodes, but now we have
 this huge lustre mount point. How can I say create /lustre/home and
 /lustre/groups and mount on the client?

 David


 Two choices:

 1) create two Lustre file systems (separate MDT and OSTs for each)
 2) use mount --bind on the client to make one filesystem's directories
 show up in different places





-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Question on setting up fail-over

2010-08-17 Thread David Noriega
Some info:
MDS/MGS 192.168.5.104
Passive failover MDS/MGS 192.168.5.105
OSS1 192.168.5.100
OSS2 192.168.5.101

I've got some more questions about setting up failover. Besides having
Heartbeat set up, what about using tunefs.lustre to set options?

On the MDS/MGS I set the following options
tunefs.lustre --failnode=192.168.5.105 /dev/lustre-mdt-dg/lv1
Heartbeat works just fine, can mount on the primary node and then
failover to the other and back.

Now on the OSSs things get a bit more confusing. Reading these two blog posts:
http://mergingbusinessandit.blogspot.com/2008/12/implementing-lustre-failover.html
http://jermen.posterous.com/lustre-mds-failover

From these I tried these options:
tunefs.lustre --erase-params --mgsnode=192.168.5@tcp0
--mgsnode=192.168.5@tcp0 --failover=192.168.5@tcp0
-write-params /dev/lustre-ost1-dg1/lv1

I ran that for all four OSTs, changing the failover option on the last
two OSTs to point to OSS1 while the first two point to OSS2.

My understanding is that you mount the OSTs first, then the MDS, but
the OSTs are failing to mount. Are all these options needed? Or is
simply specifying the primary MDS enough for it to find out about
the second MDS?

David

On Mon, Aug 16, 2010 at 2:14 PM, Kevin Van Maren
kevin.van.ma...@oracle.com wrote:
 David Noriega wrote:

 Ok I've gotten heartbeat setup with the two OSSs, but I do have a
 question that isn't stated in the documentation. Shouldn't the lustre
 mounts be removed from fstab once they are given to heartbeat since
 when it comes online, it will mount the resources, correct?

 David



 Yes: on the servers, they must be not there or noauto.  Once you start
 running heartbeat,
 you have given control of the resource away, and must not mount/umount it
 yourself
 (unless you stop heartbeat on both nodes in the HA pair to get control
 back).

 Kevin





-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Question on setting up fail-over

2010-08-17 Thread David Noriega
Oops, somehow I changed the target name of all OSTs to lustre-OST
and trying to mount any other OST fails. I've gone and found the 'More
Complicated Configurations' section, which details the usage of
--mgsnode=nid1,nid2, so using this I think I'll just reformat.

On Tue, Aug 17, 2010 at 11:26 AM, David Noriega tsk...@my.utsa.edu wrote:
 Some info:
 MDS/MGS 192.168.5.104
 Passive failover MDS/MGS 192.168.5.105
 OSS1 192.168.5.100
 OSS2 192.168.5.101

 I've got some more questions about setting up failover. Besides having
 heartbeat setup, what about using tunefs.lustre to set options?

 On the MDS/MGS I set the following options
 tunefs.lustre --failnode=192.168.5.105 /dev/lustre-mdt-dg/lv1
 Heartbeat works just fine, can mount on the primary node and then
 failover to the other and back.

 Now on the OSSs things get a bit more confusing. Reading these two blog posts:
 http://mergingbusinessandit.blogspot.com/2008/12/implementing-lustre-failover.html
 http://jermen.posterous.com/lustre-mds-failover

 From these I tried these options:
 tunefs.lustre --erase-params --mgsnode=192.168.5@tcp0
 --mgsnode=192.168.5@tcp0 --failover=192.168.5@tcp0
 -write-params /dev/lustre-ost1-dg1/lv1

 I ran that for all for OSTs, changing the failover option on the last
 two OSTs to point OSS1 while the first two point to OST2.

 My understanding is that you mount the OSTs first, then the MDS, but
 the OSTs are failing to mount. Are all these options needed? Or is
 simply specifying the primary MDS is enough for it to find out about
 the second MDS?

 David

 On Mon, Aug 16, 2010 at 2:14 PM, Kevin Van Maren
 kevin.van.ma...@oracle.com wrote:
 David Noriega wrote:

 Ok I've gotten heartbeat setup with the two OSSs, but I do have a
 question that isn't stated in the documentation. Shouldn't the lustre
 mounts be removed from fstab once they are given to heartbeat since
 when it comes online, it will mount the resources, correct?

 David



 Yes: on the servers, they must be not there or noauto.  Once you start
 running heartbeat,
 you have given control of the resource away, and must not mount/umount it
 yourself
 (unless you stop heartbeat on both nodes in the HA pair to get control
 back).

 Kevin





 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Question on setting up fail-over

2010-08-17 Thread David Noriega
That is good to know, but I'd already started formatting. No issue, as it
hasn't been put into production; I'm just playing with it and working
out kinks like this. Though formatting the OSTs was rather quick while
the MDT is taking some time. Is this normal?

192.168.5.105 is the other (standby) MDS node.
[r...@meta1 ~]# mkfs.lustre --reformat --fsname=lustre --mgs --mdt
--failnode=192.168.5@tcp0 /dev/lustre-mdt-dg/lv1

   Permanent disk data:
Target: lustre-MDT
Index:  unassigned
Lustre FS:  lustre
Mount type: ldiskfs
Flags:  0x75
  (MDT MGS needs_index first_time update )
Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
Parameters: failover.node=192.168.5@tcp
mdt.group_upcall=/usr/sbin/l_getgroups

device size = 2323456MB
2 6 18
formatting backing filesystem ldiskfs on /dev/lustre-mdt-dg/lv1
target name  lustre-MDT
4k blocks 594804736
options-J size=400 -i 4096 -I 512 -q -O
dir_index,extents,uninit_groups,mmp -F
mkfs_cmd = mke2fs -j -b 4096 -L lustre-MDT  -J size=400 -i 4096 -I
512 -q -O dir_index,extents,uninit_groups,mmp -F
/dev/lustre-mdt-dg/lv1 594804736

David

On Tue, Aug 17, 2010 at 12:27 PM, Wojciech Turek wj...@cam.ac.uk wrote:
 Hi David,

 You need to umount your OSTs and MDTs and run tunefs.lustre --writeconf
 /dev/<lustre device> on all Lustre OSTs and MDTs. This will force the
 Lustre targets to fetch a new configuration the next time they are
 mounted. The order of mounting is: MGT, then MDT, then OSTs.
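
 In practice that is roughly the following (device names and mount points
 here are only examples, and the targets must be unmounted first):

 umount /mnt/mdt                                     # on the MDS, and each OST on its OSS
 tunefs.lustre --writeconf /dev/lustre-mdt-dg/lv1    # MGS/MDT
 tunefs.lustre --writeconf /dev/lustre-ost1-dg1/lv1  # repeat for every OST
 # then remount: MGS/MDT first, OSTs after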

 Best regards,

 Wojciech



 On 17 August 2010 18:19, David Noriega tsk...@my.utsa.edu wrote:

 Oppps some how I changed the target name of all OSTs to lustre-OST
 and trying to mount any other ost fails. I've gone and found the 'More
 Complicated Configuration' section which details the usage of
 --mgsnode=nid1,nid2 and so using this I think I'll just reformat.

 On Tue, Aug 17, 2010 at 11:26 AM, David Noriega tsk...@my.utsa.edu
 wrote:
  Some info:
  MDS/MGS 192.168.5.104
  Passive failover MDS/MGS 192.168.5.105
  OSS1 192.168.5.100
  OSS2 192.168.5.101
 
  I've got some more questions about setting up failover. Besides having
  heartbeat setup, what about using tunefs.lustre to set options?
 
  On the MDS/MGS I set the following options
  tunefs.lustre --failnode=192.168.5.105 /dev/lustre-mdt-dg/lv1
  Heartbeat works just fine, can mount on the primary node and then
  failover to the other and back.
 
  Now on the OSSs things get a bit more confusing. Reading these two blog
  posts:
 
  http://mergingbusinessandit.blogspot.com/2008/12/implementing-lustre-failover.html
  http://jermen.posterous.com/lustre-mds-failover
 
  From these I tried these options:
  tunefs.lustre --erase-params --mgsnode=192.168.5@tcp0
  --mgsnode=192.168.5@tcp0 --failover=192.168.5@tcp0
  -write-params /dev/lustre-ost1-dg1/lv1
 
  I ran that for all for OSTs, changing the failover option on the last
  two OSTs to point OSS1 while the first two point to OST2.
 
  My understanding is that you mount the OSTs first, then the MDS, but
  the OSTs are failing to mount. Are all these options needed? Or is
  simply specifying the primary MDS is enough for it to find out about
  the second MDS?
 
  David
 
  On Mon, Aug 16, 2010 at 2:14 PM, Kevin Van Maren
  kevin.van.ma...@oracle.com wrote:
  David Noriega wrote:
 
  Ok I've gotten heartbeat setup with the two OSSs, but I do have a
  question that isn't stated in the documentation. Shouldn't the lustre
  mounts be removed from fstab once they are given to heartbeat since
  when it comes online, it will mount the resources, correct?
 
  David
 
 
 
  Yes: on the servers, they must be not there or noauto.  Once you
  start
  running heartbeat,
  you have given control of the resource away, and must not mount/umount
  it
  yourself
  (unless you stop heartbeat on both nodes in the HA pair to get control
  back).
 
  Kevin
 
 
 
 
 
  --
  Personally, I liked the university. They gave us money and facilities,
  we didn't have to produce anything! You've never been out of college!
  You don't know what it's like out there! I've worked in the private
  sector. They expect results. -Ray Ghostbusters
 



 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss



 --
 Wojciech Turek

 Senior System Architect

 High Performance Computing Service
 University of Cambridge
 Email: wj...@cam.ac.uk
 Tel: (+)44 1223 763517




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters

[Lustre-discuss] needs_recovery flag?

2010-08-16 Thread David Noriega
Still very new to Lustre, and now I'm going over the failover part. I
used tune2fs to set MMP, but I got a warning about needs_recovery: do a
journal replay or else the setting will be lost. With dumpe2fs I could
see the needs_recovery flag was set on all of the OSTs and the MDT.
Reading over the recovery section, nothing really matched what was going
on here, so I elected to run e2fsck -fn and then e2fsck -fp on all of the
OSTs and the MDT; now the needs_recovery flag is gone and I was able to
turn MMP on. So my question is, what did I do to cause this sort of
thing, and how do I avoid it?
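
(For reference, the sort of commands involved -- the device path is just
an example:

dumpe2fs -h /dev/lustre-ost1-dg1/lv1 | grep -i features   # shows needs_recovery / mmp flags
e2fsck -fp /dev/lustre-ost1-dg1/lv1                       # replays the journal, clearing needs_recovery
tune2fs -O mmp /dev/lustre-ost1-dg1/lv1                   # then MMP can be enabled
)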

David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Getting weird disk errors, no apparent impact

2010-08-13 Thread David Noriega
We have three Sun StorageTek 2150 arrays: one connected to the metadata
server and two cross-connected to the two data storage nodes. They are
connected via fiber using the qla2xxx driver that comes with CentOS
5.5.  The multipath daemon has the following config:

defaults {
        udev_dir                /dev
        polling_interval        10
        selector                "round-robin 0"
        path_grouping_policy    multibus
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout            "/sbin/mpath_prio_rdac /dev/%n"
        path_checker            rdac
        rr_min_io               100
        max_fds                 8192
        rr_weight               priorities
        failback                immediate
        no_path_retry           fail
        user_friendly_names     yes
}

Commented out in the multipath.conf file:

blacklist {
devnode *
}
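
If you end up relying on the vendor RDAC driver instead of multipathd, the
alternative to disabling the daemon outright is to blacklist the arrays so
multipathd ignores them; roughly something like the following, where the
vendor and product strings are placeholders and must match what the arrays
actually report in /proc/scsi/scsi:

blacklist {
        device {
                vendor  "SUN"
                product ".*"
        }
}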


On Fri, Aug 13, 2010 at 4:31 AM, Wojciech Turek wj...@cam.ac.uk wrote:
 Hi David,

 I have seen similar errors given out by some storage arrays. These were
 caused by arrays exporting volumes via more than a single path without a
 multipath driver installed or configured properly. Sometimes the array
 controllers require a special driver to be installed on the Linux host
 (for example the RDAC mpp driver) to properly present and handle the
 configured volumes in the OS. What sort of disk RAID array are you using?

 Best regards,

 Wojciech

 On 12 August 2010 17:58, David Noriega tsk...@my.utsa.edu wrote:

 We just setup a lustre system, and all looks good, but there is this
 nagging error thats floating about. When I reboot any of the nodes, be
 it a OSS or MDS, I will get this:

 [r...@meta1 ~]# dmesg | grep sdc
 sdc : very big device. try to use READ CAPACITY(16).
 SCSI device sdc: 4878622720 512-byte hdwr sectors (2497855 MB)
 sdc: Write Protect is off
 sdc: Mode Sense: 77 00 10 08
 SCSI device sdc: drive cache: write back w/ FUA
 sdc : very big device. try to use READ CAPACITY(16).
 SCSI device sdc: 4878622720 512-byte hdwr sectors (2497855 MB)
 sdc: Write Protect is off
 sdc: Mode Sense: 77 00 10 08
 SCSI device sdc: drive cache: write back w/ FUA
  sdc:end_request: I/O error, dev sdc, sector 0
 Buffer I/O error on device sdc, logical block 0
 end_request: I/O error, dev sdc, sector 0

 This doesn't seem to affect anything. fdisk -l doesn't even report the
 device. The same(thought of course different block device sdd, sde, on
 the OSSs), happens on all the nodes.

 If I run pvdisplay or lvdisplay, I'll get this:
 /dev/sdc: read failed after 0 of 4096 at 0: Input/output error

 Any ideas?
 David
 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss



 --
 Wojciech Turek

 Senior System Architect

 High Performance Computing Service
 University of Cambridge
 Email: wj...@cam.ac.uk
 Tel: (+)44 1223 763517




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Getting weird disk errors, no apparent impact

2010-08-12 Thread David Noriega
We just set up a Lustre system, and all looks good, but there is this
nagging error that's floating about. When I reboot any of the nodes, be
it an OSS or the MDS, I get this:

[r...@meta1 ~]# dmesg | grep sdc
sdc : very big device. try to use READ CAPACITY(16).
SCSI device sdc: 4878622720 512-byte hdwr sectors (2497855 MB)
sdc: Write Protect is off
sdc: Mode Sense: 77 00 10 08
SCSI device sdc: drive cache: write back w/ FUA
sdc : very big device. try to use READ CAPACITY(16).
SCSI device sdc: 4878622720 512-byte hdwr sectors (2497855 MB)
sdc: Write Protect is off
sdc: Mode Sense: 77 00 10 08
SCSI device sdc: drive cache: write back w/ FUA
 sdc:end_request: I/O error, dev sdc, sector 0
Buffer I/O error on device sdc, logical block 0
end_request: I/O error, dev sdc, sector 0

This doesn't seem to affect anything. fdisk -l doesn't even report the
device. The same thing (though of course with a different block device,
sdd or sde, on the OSSs) happens on all the nodes.

If I run pvdisplay or lvdisplay, I'll get this:
/dev/sdc: read failed after 0 of 4096 at 0: Input/output error

Any ideas?
David
-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Question on setting up fail-over

2010-08-10 Thread David Noriega
Could you describe this resource fencing in more detail? As regards
STONITH, the PDU already has the grubby hands of IT plugged into it, and
I doubt they would be happy if I unplugged them.  What about the network
management port or the ILOM?

On Mon, Aug 9, 2010 at 1:08 PM, Kevin Van Maren
kevin.van.ma...@oracle.com wrote:
 On Aug 9, 2010, at 11:45 AM, David Noriega tsk...@my.utsa.edu wrote:

 My understanding of setting up fail-over is you need some control over
 the power so with a script it can turn off a machine by cutting its
 power? Is this correct?

 It is the recommended configuration because it is simple to understand and
 implement.

 But the only _hard_ requirement is that both nodes can access the storage.


 Is there a way to do fail-over without having
 access to the pdu(power strips)?

 If you have IPMI support, that can be used for power control, instead of a
 switched PDU.  Depending on the storage, you may be able to do resource
 fencing of the disks instead of STONITH.  Or you can run fast-and-loose,
 without any way to ensure the dead node is really dead and not accessing
 storage (at your risk).  While Lustre has MMP, it is really more to protect
 against a mount typo than to guarantee resource fencing.


 Thanks
 David

 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss




-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Question on setting up fail-over

2010-08-10 Thread David Noriega
I think I'll go the IPMI route. So, reading up on STONITH, it's just a
script, so all I would need is a script that uses IPMI to tell the
server to power off, right?
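
The core of it would presumably be something like the following ipmitool
calls (the ILOM address and credentials are placeholders), but as pointed
out elsewhere in the thread, a proper STONITH script must also verify that
the node really is off before it reports success:

ipmitool -I lanplus -H oss1-ilom -U admin -P secret chassis power off
ipmitool -I lanplus -H oss1-ilom -U admin -P secret chassis power status
# only report success if the status comes back "Chassis Power is off"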

Also, while reading through the Lustre manual, it seems some things are
being deleted from the wiki:
http://wiki.lustre.org/index.php?title=Clu_Manager no longer exists, and
I noticed the same when I found that the Lustre quick guide is no longer
available.

Thanks
David

On Tue, Aug 10, 2010 at 10:57 AM, Kevin Van Maren
kevin.van.ma...@oracle.com wrote:
 David Noriega wrote:

 Could you describe this resource fencing in more detail? As for
 regards to STONITH, the pdu already has the grubby hands of IT plugged
 into it and doubt they would be happy if I unplugged them.  What about
 the network management port or ILOM?


 Resource fencing is needed to ensure that a node does not take over a
 resource (ie, OST)
 while the other node is still accessing it (as could happen if the node only
 partly crashes,
 where it is not responding to the HA package but still writing to the disk).

 STONITH is a pretty common way to ensure the other node is dead and can no
 longer
 access the resource.  If you can't use your switched PDU, then using the
 ILOM for IPMI-based
 power control works.  The other common way to do resource fencing is to use
 scsi reserve
 commands (if supported by the hardware and the HA package) to ensure
 exclusive access.

 Kevin

 On Mon, Aug 9, 2010 at 1:08 PM, Kevin Van Maren
 kevin.van.ma...@oracle.com wrote:


 On Aug 9, 2010, at 11:45 AM, David Noriega tsk...@my.utsa.edu wrote:



 My understanding of setting up fail-over is you need some control over
 the power so with a script it can turn off a machine by cutting its
 power? Is this correct?


 It is the recommended configuration because it is simple to understand
 and
 implement.

 But the only _hard_ requirement is that both nodes can access the
 storage.




 Is there a way to do fail-over without having
 access to the pdu(power strips)?


 If you have IPMI support, that can be used for power control, instead of
 a
 switched PDU.  Depending on the storage, you may be able to do resource
 fencing of the disks instead of STONITH.  Or you can run fast-and-loose,
 without any way to ensure the dead node is really dead and not
 accessing
 storage (at your risk).  While Lustre has MMP, it is really more to
 protect
 against a mount typo than to guarantee resource fencing.




 Thanks
 David

 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss










-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Question on setting up fail-over

2010-08-10 Thread David Noriega
Another question: is it possible to use CentOS/Red Hat's clustering
software? The manual mentions using that for metadata
failover (since having more than one metadata server online isn't
possible right now), so why not use it for all of Lustre? But since
the information is missing, can someone fill in the blanks on setting
up metadata failover?

David

On Tue, Aug 10, 2010 at 11:11 AM, Kevin Van Maren
kevin.van.ma...@oracle.com wrote:
 Depends on the HA package you are using.  Heartbeat comes with a script that
 supports IPMI.

 The important thing is that stonith NOT succeed if you don't _know_ that the
 node is off.
 So it is absolutely not a 1-line script.

 Kevin


 David Noriega wrote:

 I think I'll go the ipmi route. So reading on STONITH, its just a
 script, so all I would need is a script to run ipmi that tells the
 server to power off, right?

 Also while reading through the lustre manual, seems some things are
 being deleted from the wiki,
 http://wiki.lustre.org/index.php?title=Clu_Manager no longer exists,
 and noticed this too when I found the lustre quick guide is no longer
 available.

 Thanks
 David

 On Tue, Aug 10, 2010 at 10:57 AM, Kevin Van Maren
 kevin.van.ma...@oracle.com wrote:


 David Noriega wrote:


 Could you describe this resource fencing in more detail? As for
 regards to STONITH, the pdu already has the grubby hands of IT plugged
 into it and doubt they would be happy if I unplugged them.  What about
 the network management port or ILOM?



 Resource fencing is needed to ensure that a node does not take over a
 resource (ie, OST)
 while the other node is still accessing it (as could happen if the node
 only
 partly crashes,
 where it is not responding to the HA package but still writing to the
 disk).

 STONITH is a pretty common way to ensure the other node is dead and can
 no
 longer
 access the resource.  If you can't use your switched PDU, then using the
 ILOM for IPMI-based
 power control works.  The other common way to do resource fencing is to
 use
 scsi reserve
 commands (if supported by the hardware and the HA package) to ensure
 exclusive access.

 Kevin



 On Mon, Aug 9, 2010 at 1:08 PM, Kevin Van Maren
 kevin.van.ma...@oracle.com wrote:



 On Aug 9, 2010, at 11:45 AM, David Noriega tsk...@my.utsa.edu wrote:




 My understanding of setting up fail-over is you need some control over
 the power so with a script it can turn off a machine by cutting its
 power? Is this correct?



 It is the recommended configuration because it is simple to understand
 and
 implement.

 But the only _hard_ requirement is that both nodes can access the
 storage.





 Is there a way to do fail-over without having
 access to the pdu(power strips)?



 If you have IPMI support, that can be used for power control, instead
 of
 a
 switched PDU.  Depending on the storage, you may be able to do resource
 fencing of the disks instead of STONITH.  Or you can run
 fast-and-loose,
 without any way to ensure the dead node is really dead and not
 accessing
 storage (at your risk).  While Lustre has MMP, it is really more to
 protect
 against a mount typo than to guarantee resource fencing.





 Thanks
 David

 --
 Personally, I liked the university. They gave us money and facilities,
 we didn't have to produce anything! You've never been out of college!
 You don't know what it's like out there! I've worked in the private
 sector. They expect results. -Ray Ghostbusters
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
















-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Two file servers question.

2010-08-10 Thread David Noriega
We just got our Lustre system online, and as we continue to play with
it, I need some help supporting my argument that we should have two
file servers: one NFS server to host users' home directories, and the
Lustre file server to provide space for their jobs to run. My
manager's concern is confusing the users, which I don't think is
entirely valid for anyone using a cluster, but any technical details
supporting a two-file-server solution would be helpful.

Thanks
David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Question on setting up fail-over

2010-08-09 Thread David Noriega
My understanding of setting up failover is that you need some control
over the power, so that a script can turn off a machine by cutting its
power. Is this correct? Is there a way to do failover without having
access to the PDUs (power strips)?

Thanks
David

-- 
Personally, I liked the university. They gave us money and facilities,
we didn't have to produce anything! You've never been out of college!
You don't know what it's like out there! I've worked in the private
sector. They expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre and Automount

2010-05-27 Thread David Noriega
We are pre-Lustre right now and have some questions. Currently our cluster
uses LDAP+automount to mount users' home directories from our file server.
Once we go Lustre, is any modification to LDAP or automount (besides the
installation of the Lustre client programs) needed?

-- 
Personally, I liked the university. They gave us money and facilities, we
didn't have to produce anything! You've never been out of college! You don't
know what it's like out there! I've worked in the private sector. They
expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Monitoring filesystem usage

2010-05-27 Thread David Noriega
What tools do you use to keep track of who is using the filesystem and how
much? Are there any free tools to keep track of old files, temp files,
large files, etc.? Basically, how do you keep things running in an orderly
fashion and keep users in line, besides adding more space?
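
For the per-user numbers, the basics are covered by Lustre's own tools,
assuming quotas are enabled, plus ordinary find for housekeeping (the paths
and user name below are only examples, and a full find over a large tree
can be slow):

lfs df -h /lustre                          # space used per OST/MDT
lfs quota -u someuser /lustre              # per-user usage, if quotas are on
find /lustre/scratch -type f -mtime +180   # candidate old files to clean up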

-- 
Personally, I liked the university. They gave us money and facilities, we
didn't have to produce anything! You've never been out of college! You don't
know what it's like out there! I've worked in the private sector. They
expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] What do you think of this idea?

2010-05-19 Thread David Noriega
My supervisor has this idea and I would like the input of the
Lustre community as we are still very new to Lustre.

We have 7 workstations, and the idea was to put three 2TB drives into each,
for a total of 42TB, and set them up as object storage servers, with another
workstation as a metadata server. How feasible is this idea? I know of the
downfalls (what if a student reboots a machine, etc.), but barring those
events, would this setup work?

David
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] What do you think of this idea?

2010-05-19 Thread David Noriega
I just had another idea, but again, since I know very little about how Lustre
works, I'll need some input.

What if I take a single workstation and attach two Drobo disk arrays to it
via iSCSI? Would it be possible to run both the metadata and the object
storage off a single machine? Two maxed-out Drobo arrays give 32TB of
space. It costs more, but would this be better than adding disks to existing
workstations where we can't control the environment (i.e., users)?

On Wed, May 19, 2010 at 1:01 PM, hungsheng Tsao
hungsheng.t...@oracle.com wrote:

  As a playground it is fine.
 Even in this type of environment, IMHO, you will need to do some RAID for
 the 3x2TB OSTs and mirroring for the MDS.
 I hope you have enough memory, CPU power and network for the MDS and OSTs.
 These are dedicated to Lustre; you will need other compute nodes.
 Regards



 On 5/19/2010 1:31 PM, David Noriega wrote:

 My supervisor has this idea and I would like the input of the
 Lustre community as we are still very new to Lustre.

  We have 7 workstations, and the idea was to put into them 3 2TB drives,
 for a total of 42TB, and set them up as object servers, and another
 workstation as a meta data server. How feasible is this idea? And I know of
 the downfalls, what if a student reboots a machine, etc, but baring those
 events, would this setup work?

 David

 --

 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss


 --
  Hung-Sheng Tsao, Ph.D. | Principal Sales Consultant
 Higher Education | +1.973.495.0840
 Oracle - North America Commercial Hardware
  400 Atrium Dr. Somerset, NJ 08873
 Email hungsheng.t...@oracle.com




-- 
Personally, I liked the university. They gave us money and facilities, we
didn't have to produce anything! You've never been out of college! You don't
know what it's like out there! I've worked in the private sector. They
expect results. -Ray Ghostbusters
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss