Re: [lustre-discuss] Restrict who can assign OST pools to directories

2022-11-07 Thread Raj via lustre-discuss
Marco, one other idea is to give the pool an unfriendly name that users
can't guess, e.g. "myfs.mkpilaxluia" instead of myfs.flash or myfs.ssd,
so that it becomes difficult (though not impossible) for users to use it
:). Users don't have access to the MDS to get the full list of pools
defined.
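
For example, a rough sketch of setting that up (the pool name is the one
above; the OST indices and directory are made-up examples):

# on the MGS, create the pool with a hard-to-guess name
lctl pool_new myfs.mkpilaxluia
lctl pool_add myfs.mkpilaxluia myfs-OST[0-3]
# root, who knows the name, then assigns it where needed
lfs setstripe --pool myfs.mkpilaxluia /myfs/fast_dir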
Thanks,
Raj

On Mon, Nov 7, 2022 at 4:28 AM Andreas Dilger via lustre-discuss
 wrote:
>
> Unfortunately, this is not possible today, though I don't think it would be 
> too hard for someone to implement this by copying "enable_remote_dir_gid" and 
> similar checks on the MDS.
>
> In Lustre 2.14 and later, it is possible to set an OST pool quota that can 
> restrict users from creating too many files in a pool.  This doesn't directly 
> prevent them from setting the pool on a directory (though I guess this 
> _could_ be checked), but they would get an EDQUOT error when trying to create 
> in that directory, and quickly tire of trying to use it.
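
A rough sketch of that pool-quota approach (assuming the Lustre 2.14+
"lfs setquota --pool" syntax; user name, pool name and limit are made up):

# limit how much a user may write into OSTs of the "flash" pool
lfs setquota -u someuser --pool flash -B 100G /mnt/myfs
# check it
lfs quota -u someuser --pool flash /mnt/myfs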
>
> Cheers, Andreas
>
> On Nov 4, 2022, at 05:57, Passerini Marco  wrote:
>
> Hi,
>
> Is there a way in Lustre to restrict who can assign OST pools to directories? 
> Specifically, can we limit the following command so that it can be run by root 
> only?
>
> lfs setstripe --pool myfs.mypool test_dir
>
> I would need something similar to what can be done for remote directories:
> lctl set_param mdt.*.enable_remote_dir_gid=1
>
> Regards,
> Marco Passerini
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud
>
>
>
>
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Disk usage for users per project

2022-05-18 Thread Raj via lustre-discuss
I do not think this exists yet.
But if every user has an individual area (sub-folder) inside the main project
folder, could you create a ‘lustre project’ per project-user?
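
A minimal sketch of that idea (project IDs and paths are made up):

# give each user's sub-folder its own project ID (inherit + recursive)
lfs project -p 1001 -s -r /lustre/projA/userX
lfs project -p 1002 -s -r /lustre/projA/userY
# per-user usage within the project is then just that project's usage
lfs quota -p 1001 /lustre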

-Raj
On Tue, May 17, 2022 at 7:21 AM Kenneth Waegeman via lustre-discuss <
lustre-discuss@lists.lustre.org> wrote:

> Hi all,
>
> We have a lustre file system with project quota enabled. We are able to
> list the project quota usage with lfs quota -p, and the total usage of a
> user on the file system with lfs quota -u, but is there any way to find out
> the per user usage within a project ?
>
> Thank you!
>
> Kind regards,
> Kenneth
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Essential tools for Lustre

2022-04-22 Thread Raj via lustre-discuss
Andreas, is there any I/O penalty in enabling project quota? Will I see
the same throughput from the FS?
Thanks
-Raj

On Fri, Apr 15, 2022 at 1:32 PM Andreas Dilger via lustre-discuss <
lustre-discuss@lists.lustre.org> wrote:

> Note that in newer Lustre releases, if you have project IDs enabled (you
> don't need to *enforce* project quotas, just have quota accounting
> enabled), that "df" (statfs()) will return the quota for the project ID on
> that directory tree.  It isn't _quite_ "fast du" for arbitrary directory
> trees, but quite useful in any case.
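
A minimal sketch of that "fast du"-style usage (project ID and path are
made up):

# tag a directory tree with a project ID (inherit + recursive)
lfs project -p 42 -s -r /mnt/lustre/mydir
# df/statfs on that tree then reports the project's usage
df -h /mnt/lustre/mydir
lfs quota -p 42 /mnt/lustre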
>
>
> On Apr 15, 2022, at 11:17, Steve L via lustre-discuss <
> lustre-discuss@lists.lustre.org> wrote:
>
> Hi Thomas,
>
> Thanks for the tips.
>
> I have a client that is looking into implementing their own "fast
> du". I will look into Robinhood to see if there is any duplicated
> functionality.  Not sure if they are looking into advanced functionality like
> policy management, though.
>
> Does Robinhood offer APIs, or command-line tools to provide "du" like
> data? If not, how about direct database accesses?
>
> /steve
> --
> *From:* thomas.leibov...@cea.fr 
> *Sent:* Friday, April 15, 2022 4:03 AM
> *To:* 'Steve L' 
> *Cc:* lustre-discuss@lists.lustre.org 
> *Subject:* RE: Essential tools for Lustre
>
> Dear Steve,
>
>
> >What are the essential tools to make life easier in the Lustre ecosystem?
>
>
> We use and maintain these two open-source tools:
> -  Shine, for Lustre administration:
> https://github.com/cea-hpc/shine
> -  RobinHood policy engine for managing data in filesystems:
> https://github.com/cea-hpc/robinhood/wiki
>
>
> Best regards,
> Thomas
>
>
> *From:* lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org
> ] *On behalf of* Steve L via
> lustre-discuss
> *Sent:* Saturday, 9 April 2022 00:52
> *To:* lustre-discuss@lists.lustre.org
> *Subject:* [lustre-discuss] Essential tools for Lustre
>
>
> Hi All,
>
>
> I am new to Lustre. I'd like to know if anyone out there is actively using
> IML (Integrated Manager for Lustre) to administer and/or monitor Lustre?
>
>
> If yes, I plan to run Lustre LTS (e.g., 2.12.8). Do you know if IML can
> run against 2.12.8? IML official website seems to be very quiet for the
> last year. Is the project still alive?
>
>
> If no, do you run any other GUI or non-GUI tools to help
> administer/monitor? What are the essential tools to make life easier in the
> Lustre ecosystem?
>
>
> Any pointers are deeply appreciated.
>
>
> If this is not the right forum to ask such questions, please accept my
> apology. I searched the archive, but could not discover any items related
> to IML.
>
>
> Thanks ahead for your assistance.
>
>
> Steve
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud
>
>
>
>
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-20 Thread Raj via lustre-discuss
Ellis, I would also check the peer_credits setting between the server and the
client. They should be the same.
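
A quick way to compare them on both sides (a sketch; output details may vary
by version):

# show the configured LNet tunables, including peer_credits
lnetctl net show -v
# or, for o2ib interfaces, check the module parameter directly
cat /sys/module/ko2iblnd/parameters/peer_credits   # ksocklnd has the same parameter for TCP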

On Wed, Jan 19, 2022 at 9:27 AM Patrick Farrell via lustre-discuss <
lustre-discuss@lists.lustre.org> wrote:

> Ellis,
>
> As you may have guessed, that function set just looks like a node which is
> doing buffered I/O and thrashing for memory.  No particular insight is
> available from the count of functions there.
>
> Would you consider opening a bug report in the Whamcloud JIRA?  You should
> have enough for a good report, here's a few things that would be helpful as
> well:
>
> It sounds like you can hang the node on demand.  If you could collect
> stack traces with:
>
> echo t > /proc/sysrq-trigger
>
> after creating the hang, that would be useful.  (It will print to dmesg.)
>
> You've also collected debug logs - Could you include, say, the last 100
> MiB of that log set?  That should be reasonable to attach if compressed.
>
> Regards,
> Patrick
>
> --
> *From:* lustre-discuss  on
> behalf of Ellis Wilson via lustre-discuss  >
> *Sent:* Wednesday, January 19, 2022 8:32 AM
> *To:* Andreas Dilger 
> *Cc:* lustre-discuss@lists.lustre.org 
> *Subject:* Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O
> (2.14/2.15)
>
>
> Hi Andreas,
>
>
>
> Apologies in advance for the top-post.  I’m required to use Outlook for
> work, and it doesn’t handle in-line or bottom-posting well.
>
>
>
> Client-side defaults prior to any tuning of mine (this is a very minimal
> 1-client, 1-MDS/MGS, 2-OSS cluster):
>
>
> ~# lctl get_param llite.*.max_cached_mb
>
> llite.lustrefs-8d52a9c52800.max_cached_mb=
>
> users: 5
>
> max_cached_mb: 7748
>
> used_mb: 0
>
> unused_mb: 7748
>
> reclaim_count: 0
>
> ~# lctl get_param osc.*.max_dirty_mb
>
> osc.lustrefs-OST-osc-8d52a9c52800.max_dirty_mb=1938
>
> osc.lustrefs-OST0001-osc-8d52a9c52800.max_dirty_mb=1938
>
> ~# lctl get_param osc.*.max_rpcs_in_flight
>
> osc.lustrefs-OST-osc-8d52a9c52800.max_rpcs_in_flight=8
>
> osc.lustrefs-OST0001-osc-8d52a9c52800.max_rpcs_in_flight=8
>
> ~# lctl get_param osc.*.max_pages_per_rpc
>
> osc.lustrefs-OST-osc-8d52a9c52800.max_pages_per_rpc=1024
>
> osc.lustrefs-OST0001-osc-8d52a9c52800.max_pages_per_rpc=1024
>
>
>
> Thus far I’ve reduced the following to what I felt were really
> conservative values for a 16GB RAM machine:
>
>
>
> ~# lctl set_param llite.*.max_cached_mb=1024
>
> llite.lustrefs-8d52a9c52800.max_cached_mb=1024
>
> ~# lctl set_param osc.*.max_dirty_mb=512
>
> osc.lustrefs-OST-osc-8d52a9c52800.max_dirty_mb=512
>
> osc.lustrefs-OST0001-osc-8d52a9c52800.max_dirty_mb=512
>
> ~# lctl set_param osc.*.max_pages_per_rpc=128
>
> osc.lustrefs-OST-osc-8d52a9c52800.max_pages_per_rpc=128
>
> osc.lustrefs-OST0001-osc-8d52a9c52800.max_pages_per_rpc=128
>
> ~# lctl set_param osc.*.max_rpcs_in_flight=2
>
> osc.lustrefs-OST-osc-8d52a9c52800.max_rpcs_in_flight=2
>
> osc.lustrefs-OST0001-osc-8d52a9c52800.max_rpcs_in_flight=2
>
>
>
> This slows down how fast I get to basically OOM from <10 seconds to more
> like 25 seconds, but the trend is identical.
>
>
>
> As an example of what I’m seeing on the client, you can see below we start
> with most free, and then iozone rapidly (within ~10 seconds) causes all
> memory to be marked used, and that stabilizes at about 140MB free until at
> some point it stalls for 20 or more seconds and then some has been synced
> out:
>
>
> ~# dstat --mem
>
> --memory-usage-
>
> used  free  buff  cach
>
> 1029M 13.9G 2756k  215M
>
> 1028M 13.9G 2756k  215M
>
> 1028M 13.9G 2756k  215M
>
> 1088M 13.9G 2756k  215M
>
> 2550M 11.5G 2764k 1238M
>
> 3989M 10.1G 2764k 1236M
>
> 5404M 8881M 2764k 1239M
>
> 6831M 7453M 2772k 1240M
>
> 8254M 6033M 2772k 1237M
>
> 9672M 4613M 2772k 1239M
>
> 10.6G 3462M 2772k 1240M
>
> 12.1G 1902M 2772k 1240M
>
> 13.4G  582M 2772k 1240M
>
> 13.9G  139M 2488k 1161M
>
> 13.9G  139M 1528k 1174M
>
> 13.9G  140M  896k 1175M
>
> 13.9G  139M  676k 1176M
>
> 13.9G  142M  528k 1177M
>
> 13.9G  140M  484k 1188M
>
> 13.9G  139M  492k 1188M
>
> 13.9G  139M  488k 1188M
>
> 13.9G  141M  488k 1186M
>
> 13.9G  141M  480k 1187M
>
> 13.9G  139M  492k 1188M
>
> 13.9G  141M  600k 1188M
>
> 13.9G  139M  580k 1187M
>
> 13.9G  140M  536k 1186M
>
> 13.9G  141M  668k 1186M
>
> 13.9G  139M  580k 1188M
>
> 13.9G  140M  568k 1187M
>
> 12.7G 1299M 2064k 1197M missed 20 ticks <-- client is totally unresponsive
> during this time
>
> 11.0G 2972M 5404k 1238M^C
>
>
>
> Additionally, I’ve messed with sysctl settings.  Defaults:
>
> vm.dirty_background_bytes = 0
>
> vm.dirty_background_ratio = 10
>
> vm.dirty_bytes = 0
>
> vm.dirty_expire_centisecs = 3000
>
> vm.dirty_ratio = 20
>
> vm.dirty_writeback_centisecs = 500
>
>
>
> Revised to conservative values:
>
> vm.dirty_background_bytes = 1073741824
>
> vm.dirty_background_ratio = 0
>
> vm.dirty_bytes = 2147483648
>
> vm.dirty_expire_centisecs = 200
>
> vm.dirty_ratio = 0
>
> 

Re: [lustre-discuss] [EXTERNAL] good ways to identify clients causing problems?

2021-05-29 Thread Raj via lustre-discuss
One other way is to install xltop (https://github.com/jhammond/xltop)
and use the xltop client (an ncurses-based, top-like tool) to watch for
the top clients by requests per second (xltop -k q h).
You can also use it to track jobs, but you might have to write your own
node-to-job mapping script (xltop-clusterd).

On Fri, May 28, 2021 at 4:21 PM Mohr, Rick via lustre-discuss
 wrote:
>
> Bill,
>
> One option I have used in the past is to look at the rpc request history.  
> For example, on an oss server, you can run:
>
> lctl get_param ost.OSS.ost_io.req_history
>
> and then extract the client nid for each request.   Based on that, you can 
> calculate the number of requests coming into the server and look for any 
> clients that are significantly higher than the others.  Maybe something like:
>
> lctl get_param ost.OSS.ost_io.req_history | cut -d: -f3 | sort | uniq -c | 
> sort -n
>
> I have used that approach in the past to identify misbehaving clients (the 
> number of requests from such clients was usually one or two orders of 
> magnitude higher than the others).  If multiple clients are unusually high, 
> you may be able to correlate the nodes with currently running jobs to 
> identify a particular job (assuming you don't already have lustre job stats 
> enabled).
>
> -Rick
>
>
> On 5/4/21, 2:41 PM, "lustre-discuss on behalf of Bill Anderson via 
> lustre-discuss"  lustre-discuss@lists.lustre.org> wrote:
>
>
>Hi All,
>
>Can you recommend good ways to identify Lustre client hosts that might 
> be causing stability or performance problems for the entire filesystem?
>
>For example, if a user is inadvertently doing something that's 
> creating an RPC storm, what are good ways to identify the client host that 
> has triggered the storm?
>
>Thank you!
>
>Bill
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Multiple IB interfaces

2021-03-11 Thread Raj via lustre-discuss
Alastair,
A few scenarios you may consider:
1) Define two LNets, one per IB interface (say o2ib1 and o2ib2), and serve
one OST through o2ib1 and the other through o2ib2. You can map HBA and disk
locality so that each is attached to the same CPU (see the LNet sketch below).

2) Same as above, but serve the OST(s) from both LNets, and configure odd
clients (clients with odd IPs) to use o2ib1 and even clients to use o2ib2.
This may not be exactly what you are looking for, but it can efficiently
utilize both interfaces.
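
A minimal sketch of the LNet side of scenario 1 (the interface names ib0/ib1
are assumptions; adjust to your hardware):

# on the OSS, one LNet per IB port, e.g. in /etc/modprobe.d/lustre.conf
options lnet networks="o2ib1(ib0),o2ib2(ib1)"
# a client in scenario 2 would then be configured for just one of them
options lnet networks="o2ib1(ib0)"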

-Raj

On Tue, Mar 9, 2021 at 9:18 AM Alastair Basden via lustre-discuss <
lustre-discuss@lists.lustre.org> wrote:

> Hi,
>
> We are installing some new Lustre servers with 2 InfiniBand cards, 1
> attached to each CPU socket.  Storage is nvme, again, some drives attached
> to each socket.
>
> We want to ensure that data to/from each drive uses the appropriate IB
> card, and doesn't need to travel through the inter-cpu link.  Data being
> written is fairly easy I think, we just set that OST to the appropriate IP
> address.  However, data being read may well go out the other NIC, if I
> understand correctly.
>
> What setup do we need for this?
>
> I think probably not bonding, as that will presumably not tie
> NIC interfaces to cpus.  But I also see a note in the Lustre manual:
>
> """If the server has multiple interfaces on the same subnet, the Linux
> kernel will send all traffic using the first configured interface. This is
> a limitation of Linux, not Lustre. In this case, network interface bonding
> should be used. For more information about network interface bonding, see
> Chapter 7, Setting Up Network Interface Bonding."""
>
> (plus, no idea if bonding is supported on InfiniBand).
>
> Thanks,
> Alastair.
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Repeatable ldlm_enqueue error

2019-10-31 Thread Raj Ayyampalayam
I had the same thought and I checked all the nodes, and they all had
exactly the same time.

Raj

On Wed, Oct 30, 2019, 10:19 PM Raj  wrote:

> Raj,
> Just eyeballing your logs from the server and the client, it looks like they
> have different times. Are they out of sync? It is important for both the
> clients and the servers to have the same time.
>
> On Wed, Oct 30, 2019 at 3:37 PM Raj Ayyampalayam  wrote:
>
>> Hello,
>>
>> A particular job (MPI Maker genome annotation) on our cluster produces
>> the following error and the job errors out with a "Could not open file
>> error."
>> Server: The server is running lustre-2.10.4
>> Client: I've tried it with 2.10.5, 2.10.8 and 2.12.3 with the same result.
>> I don't see any other servers (Other MDS and OSS server nodes) reporting
>> communication loss to the client. The IB fabric is stable. The job runs to
>> completion when using a local storage on the node or a NFS mounted storage.
>> The job creates a lot of I/O, but it does not increase the load on the
>> lustre servers.
>>
>> Client:
>> Oct 22 14:56:39 n305 kernel: LustreError: 11-0:
>> lustre2-MDT-mdc-8c3f222c4800: operation ldlm_enqueue to node
>> 10.55.49.215@o2ib failed: rc = -107
>> Oct 22 14:56:39 n305 kernel: Lustre:
>> lustre2-MDT-mdc-8c3f222c4800: Connection to lustre2-MDT (at
>> 10.55.49.215@o2ib) was lost; in progress operations using this service
>> will wait for recovery to complete
>> Oct 22 14:56:39 n305 kernel: Lustre: Skipped 2 previous similar messages
>> Oct 22 14:56:39 n305 kernel: LustreError: 167-0:
>> lustre2-MDT-mdc-8c3f222c4800: This client was evicted by
>> lustre2-MDT; in progress operations using this service will fail.
>> Oct 22 14:56:39 n305 kernel: LustreError:
>> 125851:0:(file.c:172:ll_close_inode_openhandle())
>> lustre2-clilmv-8c3f222c4800: inode [0x2ef38:0xffd6:0x0] mdc close
>> failed: rc = -108
>> Oct 22 14:56:39 n305 kernel: LustreError: Skipped 1 previous similar
>> message
>> Oct 22 14:56:40 n305 kernel: LustreError:
>> 125959:0:(file.c:3644:ll_inode_revalidate_fini()) lustre2: revalidate FID
>> [0x2eedf:0xed9d:0x0] error: rc = -108
>> Oct 22 14:56:40 n305 kernel: LustreError:
>> 125665:0:(vvp_io.c:1474:vvp_io_init()) lustre2: refresh file layout
>> [0x2ef38:0xff55:0x0] error -108.
>> Oct 22 14:56:40 n305 kernel: LustreError:
>> 125883:0:(ldlm_resource.c:1100:ldlm_resource_complain())
>> lustre2-MDT-mdc-8c3f222c4800: namespace resource
>> [0x2ef38:0xff55:0x0].0x0 (8bdc6823c9c0) refcount nonzero (1) after
>> lock cleanup; forcing cleanup.
>> Oct 22 14:56:40 n305 kernel: LustreError:
>> 125883:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource:
>> [0x2ef38:0xff55:0x0].0x0 (8bdc6823c9c0) refcount = 1
>> Oct 22 14:56:40 n305 kernel: Lustre:
>> lustre2-MDT-mdc-8c3f222c4800: Connection restored to
>> 10.55.49.215@o2ib (at 10.55.49.215@o2ib)
>> Oct 22 14:56:40 n305 kernel: Lustre: Skipped 1 previous similar message
>> Oct 22 14:56:40 n305 kernel: LustreError:
>> 125959:0:(file.c:3644:ll_inode_revalidate_fini()) Skipped 2 previous
>> similar messages
>>
>> Server:
>> mds2-eno1: Oct 22 14:59:36 mds2 kernel: LustreError:
>> 7182:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid
>> 10.55.14.49@o2ib) failed to reply to blocking AST (req@881b0e68b900
>> x1635734905828112 status 0 rc -110), evict it ns: mdt-lustre2-MDT_UUID
>> lock: 88187ec45e00/0x121438a5db957b5 lrc: 4/0,0 mode: PR/PR res:
>> [0x2ef38:0xffec:0x0].0x0 bits 0x20 rrc: 4 type: IBT flags:
>> 0x6020040020 nid: 10.55.14.49@o2ib remote: 0x3154abaef2786884
>> expref: 72083 pid: 7182 timeout: 16143455124 lvb_type: 0
>> mds2-eno1: Oct 22 14:59:36 mds2 kernel: LustreError: 138-a:
>> lustre2-MDT: A client on nid 10.55.14.49@o2ib was evicted due to a
>> lock blocking callback time out: rc -110
>> mds2-eno1: Oct 22 14:59:36 mds2 kernel: Lustre: lustre2-MDT:
>> Connection restored to 3b42ec33-0885-6b7f-6575-9b200c4b6f55 (at
>> 10.55.14.49@o2ib)
>> mds2-eno1: Oct 22 14:59:37 mds2 kernel: LustreError:
>> 8936:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED
>> req@881b0e68b900 x1635734905828176/t0(0)
>> o104->lustre2-MDT@10.55.14.49@o2ib:15/16 lens 296/224 e 0 to 0 dl 0
>> ref 1 fl Rpc:/0/ rc 0/-1
>>
>>
>> Can anyone point me in the right direction on how to debug this issue?
>>
>> Thanks,
>> -Raj
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Repeatable ldlm_enqueue error

2019-10-30 Thread Raj Ayyampalayam
Hello,

A particular job (MPI Maker genome annotation) on our cluster produces the
following error, and the job errors out with a "Could not open file" error.
Server: The server is running lustre-2.10.4
Client: I've tried it with 2.10.5, 2.10.8 and 2.12.3 with the same result.
I don't see any other servers (Other MDS and OSS server nodes) reporting
communication loss to the client. The IB fabric is stable. The job runs to
completion when using a local storage on the node or a NFS mounted storage.
The job creates a lot of I/O, but it does not increase the load on the lustre
servers.

Client:
Oct 22 14:56:39 n305 kernel: LustreError: 11-0:
lustre2-MDT-mdc-8c3f222c4800: operation ldlm_enqueue to node
10.55.49.215@o2ib failed: rc = -107
Oct 22 14:56:39 n305 kernel: Lustre: lustre2-MDT-mdc-8c3f222c4800:
Connection to lustre2-MDT (at 10.55.49.215@o2ib) was lost; in progress
operations using this service will wait for recovery to complete
Oct 22 14:56:39 n305 kernel: Lustre: Skipped 2 previous similar messages
Oct 22 14:56:39 n305 kernel: LustreError: 167-0:
lustre2-MDT-mdc-8c3f222c4800: This client was evicted by
lustre2-MDT; in progress operations using this service will fail.
Oct 22 14:56:39 n305 kernel: LustreError:
125851:0:(file.c:172:ll_close_inode_openhandle())
lustre2-clilmv-8c3f222c4800: inode [0x2ef38:0xffd6:0x0] mdc close
failed: rc = -108
Oct 22 14:56:39 n305 kernel: LustreError: Skipped 1 previous similar message
Oct 22 14:56:40 n305 kernel: LustreError:
125959:0:(file.c:3644:ll_inode_revalidate_fini()) lustre2: revalidate FID
[0x2eedf:0xed9d:0x0] error: rc = -108
Oct 22 14:56:40 n305 kernel: LustreError:
125665:0:(vvp_io.c:1474:vvp_io_init()) lustre2: refresh file layout
[0x2ef38:0xff55:0x0] error -108.
Oct 22 14:56:40 n305 kernel: LustreError:
125883:0:(ldlm_resource.c:1100:ldlm_resource_complain())
lustre2-MDT-mdc-8c3f222c4800: namespace resource
[0x2ef38:0xff55:0x0].0x0 (8bdc6823c9c0) refcount nonzero (1) after
lock cleanup; forcing cleanup.
Oct 22 14:56:40 n305 kernel: LustreError:
125883:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource:
[0x2ef38:0xff55:0x0].0x0 (8bdc6823c9c0) refcount = 1
Oct 22 14:56:40 n305 kernel: Lustre: lustre2-MDT-mdc-8c3f222c4800:
Connection restored to 10.55.49.215@o2ib (at 10.55.49.215@o2ib)
Oct 22 14:56:40 n305 kernel: Lustre: Skipped 1 previous similar message
Oct 22 14:56:40 n305 kernel: LustreError:
125959:0:(file.c:3644:ll_inode_revalidate_fini()) Skipped 2 previous
similar messages

Server:
mds2-eno1: Oct 22 14:59:36 mds2 kernel: LustreError:
7182:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid
10.55.14.49@o2ib) failed to reply to blocking AST (req@881b0e68b900
x1635734905828112 status 0 rc -110), evict it ns: mdt-lustre2-MDT_UUID
lock: 88187ec45e00/0x121438a5db957b5 lrc: 4/0,0 mode: PR/PR res:
[0x2ef38:0xffec:0x0].0x0 bits 0x20 rrc: 4 type: IBT flags:
0x6020040020 nid: 10.55.14.49@o2ib remote: 0x3154abaef2786884 expref:
72083 pid: 7182 timeout: 16143455124 lvb_type: 0
mds2-eno1: Oct 22 14:59:36 mds2 kernel: LustreError: 138-a:
lustre2-MDT: A client on nid 10.55.14.49@o2ib was evicted due to a lock
blocking callback time out: rc -110
mds2-eno1: Oct 22 14:59:36 mds2 kernel: Lustre: lustre2-MDT: Connection
restored to 3b42ec33-0885-6b7f-6575-9b200c4b6f55 (at 10.55.14.49@o2ib)
mds2-eno1: Oct 22 14:59:37 mds2 kernel: LustreError:
8936:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED
req@881b0e68b900 x1635734905828176/t0(0)
o104->lustre2-MDT@10.55.14.49@o2ib:15/16 lens 296/224 e 0 to 0 dl 0 ref
1 fl Rpc:/0/ rc 0/-1


Can anyone point me in the right direction on how to debug this issue?

Thanks,
-Raj
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Running an older Lustre server (2.5) with a newer client (2.11)

2019-08-31 Thread Raj
Kirill,
We are running 2.11.x clients on 2.5.x servers (Neo 2.0) in a TDS environment
and haven’t seen any issues yet. The plan is to put it in production in a month
or so.
We also have 2.11.x servers (Neo 3.1-022). I will be very keen to test
PFL in a mixed environment of 2.7 and 2.11 clients. If anybody has tested this,
please let me know.

Thanks
-Raj
On Friday, August 30, 2019, Leonardo Saavedra  wrote:

> On 8/29/19 5:00 PM, Kirill Lozinskiy wrote:
>
> Hey Lustre fans!
>
> Is there anyone out there running Lustre server version 2.5.x with a
> Lustre client version 2.11.x? I'm curious if you are running this
> combination and whether or not you saw any gains or losses when you went to
> the newer Lustre client.
>
> We were running Lustre 2.7 client before and just recently went to 2.11.
> I'm trying to get a better understanding as to how common it is to have the
> version drift between the server and the client, as well as any positive or
> negative experiences.
>
> It works, no problem at all. In your client's logs you'll see a message
> like:
>
> ---
> [60010.976717] Lustre: Lustre: Build Version: 2.10.4
> [60011.029258] LNet: Added LNI 10.64.102.152@tcp [8/256/0/180]
> [60011.029297] LNet: Accept secure, port 988
> [60011.042970] Lustre: Server MGS version (2.5.5.0) is much older than
> client. Consider upgrading server (2.10.4)
> ---
>
>
> --
> Leo Saavedra
> National Radio Astronomy Observatorywww.nrao.edu
> +1-575-8357033
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Very bad lnet ethernet read performance

2019-08-13 Thread Raj
Louis,
I would also try (commands sketched below):
- turning on selective ACK (net.ipv4.tcp_sack=1) on all nodes. This helps,
although there is a CVE out there for older kernels.
- turning off checksums (osc.ostid*.checksums). This can be turned off per
OST/FS on the clients.
- increasing max_pages_per_rpc to 16M, although this may not help with your
reads.
- increasing max_rpcs_in_flight, and setting max_dirty_mb to 2 x
max_rpcs_in_flight.
- increasing llite.ostid*.max_read_ahead_mb up to 1024 on the clients. Again,
this can be set per OST/FS.
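
A sketch of the corresponding commands (the values are examples following the
rules of thumb above; parameter paths may need adjusting to your FS/OST names):

# on all nodes
sysctl -w net.ipv4.tcp_sack=1
# on the clients
lctl set_param osc.*.checksums=0
lctl set_param osc.*.max_pages_per_rpc=4096   # 16M with 4K pages
lctl set_param osc.*.max_rpcs_in_flight=16
lctl set_param osc.*.max_dirty_mb=32          # 2 x max_rpcs_in_flight
lctl set_param llite.*.max_read_ahead_mb=1024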

_Raj

On Mon, Aug 12, 2019 at 12:12 PM Shawn Hall  wrote:

> Do you have Ethernet flow control configured on all ports (especially the
> uplink ports)?  We’ve found that flow control is critical when there are
> mismatched uplink/client port speeds.
>
>
>
> Shawn
>
>
>
> *From:* lustre-discuss  *On
> Behalf Of *Louis Bailleul
> *Sent:* Monday, August 12, 2019 1:08 PM
> *To:* lustre-discuss@lists.lustre.org
> *Subject:* [lustre-discuss] Very bad lnet ethernet read performance
>
>
>
> Hi all,
>
> I am trying to understand what I am doing wrong here.
> I have a Lustre 2.12.1 system backed by NVME drives under zfs for which
> obdfilter-survey gives decent values
>
> ost  2 sz 536870912K rsz 1024K obj2 thr  256 write 15267.49 [6580.36,
> 8664.20] rewrite 15225.24 [6559.05, 8900.54] read 19739.86 [9062.25,
> 10429.04]
>
> But my actual Lustre performance is pretty poor in comparison (can't top
> 8GB/s write and 13.5GB/s read)
> So I started to question my lnet tuning but playing with peer_credits and
> max_rpc_per_pages didn't help.
>
> My test setup consist of 133x10G Ethernet clients (uplinks between end
> devices and OSS are 2x100G for every 20 nodes).
> The single OSS is fitted with a bonding of 2x100G Ethernet.
>
> I have tried to understand the problem using lnet_selftest but I'll need
> some help/doco as this doesn't make sense to me.
>
> Testing a single 10G client
>
> [LNet Rates of lfrom]
> [R] Avg: 2231 RPC/s Min: 2231 RPC/s Max: 2231 RPC/s
> [W] Avg: 1156 RPC/s Min: 1156 RPC/s Max: 1156 RPC/s
> [LNet Bandwidth of lfrom]
> [R] Avg: 1075.16  MiB/s Min: 1075.16  MiB/s Max: 1075.16  MiB/s
> [W] Avg: 0.18 MiB/s Min: 0.18 MiB/s Max: 0.18 MiB/s
> [LNet Rates of lto]
> [R] Avg: 1179 RPC/s Min: 1179 RPC/s Max: 1179 RPC/s
> [W] Avg: 2254 RPC/s Min: 2254 RPC/s Max: 2254 RPC/s
> [LNet Bandwidth of lto]
> [R] Avg: 0.19 MiB/s Min: 0.19 MiB/s Max: 0.19 MiB/s
> [W] Avg: 1075.17  MiB/s Min: 1075.17  MiB/s Max: 1075.17  MiB/s
>
> With 10x10G clients :
>
> [LNet Rates of lfrom]
> [R] Avg: 1416 RPC/s Min: 1102 RPC/s Max: 1642 RPC/s
> [W] Avg: 708  RPC/s Min: 551  RPC/s Max: 821  RPC/s
> [LNet Bandwidth of lfrom]
> [R] Avg: 708.20   MiB/s Min: 550.77   MiB/s Max: 820.96   MiB/s
> [W] Avg: 0.11 MiB/s Min: 0.08 MiB/s Max: 0.13 MiB/s
> [LNet Rates of lto]
> [R] Avg: 7084 RPC/s Min: 7084 RPC/s Max: 7084 RPC/s
> [W] Avg: 14165RPC/s Min: 14165RPC/s Max: 14165RPC/s
> [LNet Bandwidth of lto]
> [R] Avg: 1.08 MiB/s Min: 1.08 MiB/s Max: 1.08 MiB/s
> [W] Avg: 7081.86  MiB/s Min: 7081.86  MiB/s Max: 7081.86  MiB/s
>
>
> With all 133x10G clients:
>
> [LNet Rates of lfrom]
> [R] Avg: 510  RPC/s Min: 98   RPC/s Max: 23457RPC/s
> [W] Avg: 510  RPC/s Min: 49   RPC/s Max: 45863RPC/s
> [LNet Bandwidth of lfrom]
> [R] Avg: 169.87   MiB/s Min: 48.77MiB/s Max: 341.26   MiB/s
> [W] Avg: 169.86   MiB/s Min: 0.01 MiB/s Max: 22757.92 MiB/s
> [LNet Rates of lto]
> [R] Avg: 23458RPC/s Min: 23458RPC/s Max: 23458RPC/s
> [W] Avg: 45876RPC/s Min: 45876RPC/s Max: 45876RPC/s
> [LNet Bandwidth of lto]
> [R] Avg: 341.12   MiB/s Min: 341.12   MiB/s Max: 341.12   MiB/s
> [W] Avg: 22758.42 MiB/s Min: 22758.42 MiB/s Max: 22758.42 MiB/s
>
>
> So if I add clients, the aggregate write bandwidth somewhat stacks, but the
> read bandwidth decreases???
> When throwing all the nodes at the system, I am pretty happy with the
> ~22GB/s on write, as this is within 90% of the 2x100G, but the
> 341MB/s read sounds very weird considering that this is a third of the
> performance of a single client.
>
> This are my ksocklnd tuning :
>
> # for i in /sys/module/ksocklnd/parameters/*; do echo "$i : $(cat $i)";
> done
> /sys/module/ksocklnd/parameters/credits : 1024
> /sys/module/ksocklnd/parameters/eager_ack : 0
> /sys/module/ksocklnd/parameters/enable_csum : 0
> /sys/module/ksocklnd/parameters/enable_irq_affinity : 0
> /sys/module/ksocklnd/parameters/inject_csum_error : 0
> /sys/module/ksocklnd/parameters/keepalive : 30
> /sys/module/ksocklnd/parameters/keepalive_count : 5
> /sys/module/ksocklnd/parameters/keepalive_idle : 30
> /sys/module/ksocklnd/parameters/keepalive_intvl : 5
> /sys/module/ksocklnd/parameters/max_reconnectms : 6
> /sys/module/ksocklnd/parameters/min_bulk : 1024
> /sys/module/ksocklnd/parameters/min_reconnectms : 1000
> 

Re: [lustre-discuss] lustre filesystem in hung state

2019-02-21 Thread Raj
Anil,
Your error message shows
o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib
which means it is trying to connect (opcode o8 is OST_CONNECT) to the OST0003
OST of the scratch file system, which is hosted on the OSS node with NID
192.168.1.5@o2ib, but the client has lost its connection to that OSS node.
This looks to me like a network issue. You can ping the server NID from the
client and troubleshoot the network from there:

client# lctl ping 192.168.1.5@o2ib


On Tue, Feb 19, 2019 at 11:44 PM Anilkumar Naik 
wrote:

> Dear All,
>
> The Lustre file system goes into a hung state and we are unable to determine
> the exact issue with Lustre. Kindly find the information below and help us
> find a fix for the file system kernel hang issue.
>
> Cluster Details:
>
> The OSS node/server is mounted with the mount targets below. We are able to
> mount the client with the home mounts and it works for some time. After
> 10-15 minutes all the clients hang and the OSS node gets rebooted. Kindly help.
>
>  /dev/mapper/mdt-mgt19G  446M   17G   3% /mdt-mgt
> /dev/mapper/mdt-home  140G  2.8G  128G   3% /mdt-home
> /dev/mapper/mdt-scratch   140G  759M  130G   1% /mdt-scratch
> /dev/mapper/ost-home  3.7T  2.4T  1.1T  69% /ost-home
>
> Below Lustre packages has been installed at oss node.
> ==
> kernel-devel-2.6.32-431.23.3.el6_lustre.x86_64
> lustre-debuginfo-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> kernel-firmware-2.6.32-431.23.3.el6_lustre.x86_64
> lustre-iokit-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> kernel-2.6.32-431.23.3.el6_lustre.x86_64
> lustre-modules-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> lustre-tests-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> kernel-debuginfo-common-x86_64-2.6.32-431.23.3.el6_lustre.x86_64
> lustre-osd-ldiskfs-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> kernel-debuginfo-2.6.32-431.23.3.el6_lustre.x86_64
> =
>
> Lustre errors:
> =
> Feb 20 06:22:06 oss1 kernel: Lustre:
> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
> similar messages
> Feb 20 06:29:11 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID: not
> available for connect from 0@lo (no target). If you are running an HA
> pair check that the target is mounted on the other server.
> Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar
> messages
> Feb 20 06:29:11 oss1 kernel: LustreError: 11-0:
> scratch-OST0001-osc-MDT: Communicating with 0@lo, operation
> ost_connect failed with -19.
> Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar
> messages
> Feb 20 06:32:42 oss1 kernel: Lustre:
> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
> timed out for sent delay: [sent 1550624551/real 0]  req@880800be1000
> x1625913123994836/t0(0) o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib:28/4
> lens 400/544 e 0 to 1 dl 1550624562 ref 2 fl Rpc:XN/0/ rc 0/-1
> Feb 20 06:32:42 oss1 kernel: Lustre:
> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
> similar messages
> Feb 20 06:39:36 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not
> available for connect from 0@lo (no target). If you are running an HA
> pair check that the target is mounted on the other server.
> Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar
> messages
> Feb 20 06:39:36 oss1 kernel: LustreError: 11-0:
> scratch-OST0003-osc-MDT: Communicating with 0@lo, operation
> ost_connect failed with -19.
> Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar
> messages
> Feb 20 06:43:12 oss1 kernel: Lustre:
> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
> timed out for sent delay: [sent 1550625151/real 0]  req@880800dcd000
> x1625913123996040/t0(0) o8->scratch-OST0001-osc-MDT@192.168.1.5@o2ib:28/4
> lens 400/544 e 0 to 1 dl 1550625192 ref 2 fl Rpc:XN/0/ rc 0/-1
> Feb 20 06:43:12 oss1 kernel: Lustre:
> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
> similar messages
> Feb 20 06:50:01 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not
> available for connect from 0@lo (no target). If you are running an HA
> pair check that the target is mounted on the other server.
> Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar
> messages
> Feb 20 06:50:01 oss1 kernel: LustreError: 11-0:
> scratch-OST0003-osc-MDT: Communicating with 0@lo, operation
> ost_connect failed with -19.
> Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar
> messages
> Feb 20 06:53:57 oss1 kernel: Lustre:
> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
> timed out for sent delay: [sent 1550625826/real 0]  req@881005e88800
> x1625913123997352/t0(0) o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib:28/4
> lens 400/544 e 0 to 1 dl 1550625837 ref 2 fl Rpc:XN/0/ rc 0/-1
> Feb 20 06:53:57 oss1 kernel: Lustre:
> 

Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-21 Thread Raj Ayyampalayam
Got it. I'd rather be safe than sorry. This is my first time doing a Lustre
configuration change.

Raj

On Thu, Feb 21, 2019, 11:55 PM Raj  wrote:

> I also agree with Colin's comment.
> If the current OSTs are not touched, and you are only adding new OSTs to
> existing OSS nodes and adding new ost-mount resources in your existing
> (already running) Pacemaker configuration, you can achieve the upgrade with
> no downtime. If your Corosync-Pacemaker configuration is working correctly,
> you can failover and failback and take turn to reboot each OSS nodes. But,
> chances of human error is too high in doing this.
>
> On Thu, Feb 21, 2019 at 10:30 PM Raj Ayyampalayam 
> wrote:
>
>> Hi Raj,
>>
>> Thanks for the explanation. We will have to rethink our upgrade process.
>>
>> Thanks again.
>> Raj
>>
>> On Thu, Feb 21, 2019, 10:23 PM Raj  wrote:
>>
>>> Hello Raj,
>>> It’s best and safe to unmount from all the clients and then do the
>>> upgrade. Your FS is getting more OSTs and changing conf in the existing
>>> ones, your client needs to get the new layout by remounting it.
>>> Also you mentioned about client eviction, during eviction the client has
>>> to drop it’s dirty pages and all the open file descriptors in the FS will
>>> be gone.
>>>
>>> On Thu, Feb 21, 2019 at 12:25 PM Raj Ayyampalayam 
>>> wrote:
>>>
>>>> What can I expect to happen to the jobs that are suspended during the
>>>> file system restart?
>>>> Will the processes holding an open file handle die when I unsuspend
>>>> them after the filesystem restart?
>>>>
>>>> Thanks!
>>>> -Raj
>>>>
>>>>
>>>> On Thu, Feb 21, 2019 at 12:52 PM Colin Faber  wrote:
>>>>
>>>>> Ah yes,
>>>>>
>>>>> If you're adding to an existing OSS, then you will need to reconfigure
>>>>> the file system which requires writeconf event.
>>>>>
>>>>
>>>>> On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam 
>>>>> wrote:
>>>>>
>>>>>> The new OST's will be added to the existing file system (the OSS
>>>>>> nodes are already part of the filesystem), I will have to re-configure 
>>>>>> the
>>>>>> current HA resource configuration to tell it about the 4 new OST's.
>>>>>> Our exascaler's HA monitors the individual OST and I need to
>>>>>> re-configure the HA on the existing filesystem.
>>>>>>
>>>>>> Our vendor support has confirmed that we would have to restart the
>>>>>> filesystem if we want to regenerate the HA configs to include the new 
>>>>>> OST's.
>>>>>>
>>>>>> Thanks,
>>>>>> -Raj
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 21, 2019 at 11:23 AM Colin Faber 
>>>>>> wrote:
>>>>>>
>>>>>>> It seems to me that steps may still be missing?
>>>>>>>
>>>>>>> You're going to rack/stack and provision the OSS nodes with new
>>>>>>> OSTs'.
>>>>>>>
>>>>>>> Then you're going to introduce failover options somewhere? new osts?
>>>>>>> existing system? etc?
>>>>>>>
>>>>>>> If you're introducing failover with the new OST's and leaving the
>>>>>>> existing system in place, you should be able to accomplish this without
>>>>>>> bringing the system offline.
>>>>>>>
>>>>>>> If you're going to be introducing failover to your existing system
>>>>>>> then you will need to reconfigure the file system to accommodate the new
>>>>>>> failover settings (failover nides, etc.)
>>>>>>>
>>>>>>> -cf
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Our upgrade strategy is as follows:
>>>>>>>>
>>>>>>>> 1) Load all disks into the storage array.
>>>>>>>> 2) Create RAID pools and virtual disks.
>>>>>>>> 3) Create lustre file system using mkfs.lustre command. (I still
>>>>>>>> have to figure out all the parameters used on the

Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-21 Thread Raj
I also agree with Colin's comment.
If the current OSTs are not touched, and you are only adding new OSTs to
existing OSS nodes and adding new ost-mount resources in your existing
(already running) Pacemaker configuration, you can achieve the upgrade with
no downtime. If your Corosync-Pacemaker configuration is working correctly,
you can fail over and fail back, and take turns rebooting each OSS node. But
the chance of human error is high in doing this.
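
A minimal sketch of adding one new OST mount as a Pacemaker resource, if your
HA stack is plain Corosync/Pacemaker (the device, mount point and node names
are made up):

pcs resource create lustre-OST0004 ocf:heartbeat:Filesystem \
    device=/dev/mapper/ost0004 directory=/lustre/ost0004 fstype=lustre \
    op monitor interval=60s
pcs constraint location lustre-OST0004 prefers oss01=100
pcs constraint location lustre-OST0004 prefers oss02=50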

On Thu, Feb 21, 2019 at 10:30 PM Raj Ayyampalayam  wrote:

> Hi Raj,
>
> Thanks for the explanation. We will have to rethink our upgrade process.
>
> Thanks again.
> Raj
>
> On Thu, Feb 21, 2019, 10:23 PM Raj  wrote:
>
>> Hello Raj,
>> It’s best and safe to unmount from all the clients and then do the
>> upgrade. Your FS is getting more OSTs and changing conf in the existing
>> ones, your client needs to get the new layout by remounting it.
>> Also you mentioned about client eviction, during eviction the client has
>> to drop it’s dirty pages and all the open file descriptors in the FS will
>> be gone.
>>
>> On Thu, Feb 21, 2019 at 12:25 PM Raj Ayyampalayam 
>> wrote:
>>
>>> What can I expect to happen to the jobs that are suspended during the
>>> file system restart?
>>> Will the processes holding an open file handle die when I unsuspend them
>>> after the filesystem restart?
>>>
>>> Thanks!
>>> -Raj
>>>
>>>
>>> On Thu, Feb 21, 2019 at 12:52 PM Colin Faber  wrote:
>>>
>>>> Ah yes,
>>>>
>>>> If you're adding to an existing OSS, then you will need to reconfigure
>>>> the file system which requires writeconf event.
>>>>
>>>
>>>> On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam 
>>>> wrote:
>>>>
>>>>> The new OST's will be added to the existing file system (the OSS nodes
>>>>> are already part of the filesystem), I will have to re-configure the
>>>>> current HA resource configuration to tell it about the 4 new OST's.
>>>>> Our exascaler's HA monitors the individual OST and I need to
>>>>> re-configure the HA on the existing filesystem.
>>>>>
>>>>> Our vendor support has confirmed that we would have to restart the
>>>>> filesystem if we want to regenerate the HA configs to include the new 
>>>>> OST's.
>>>>>
>>>>> Thanks,
>>>>> -Raj
>>>>>
>>>>>
>>>>> On Thu, Feb 21, 2019 at 11:23 AM Colin Faber  wrote:
>>>>>
>>>>>> It seems to me that steps may still be missing?
>>>>>>
>>>>>> You're going to rack/stack and provision the OSS nodes with new
>>>>>> OSTs'.
>>>>>>
>>>>>> Then you're going to introduce failover options somewhere? new osts?
>>>>>> existing system? etc?
>>>>>>
>>>>>> If you're introducing failover with the new OST's and leaving the
>>>>>> existing system in place, you should be able to accomplish this without
>>>>>> bringing the system offline.
>>>>>>
>>>>>> If you're going to be introducing failover to your existing system
>>>>>> then you will need to reconfigure the file system to accommodate the new
>>>>>> failover settings (failover nides, etc.)
>>>>>>
>>>>>> -cf
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam 
>>>>>> wrote:
>>>>>>
>>>>>>> Our upgrade strategy is as follows:
>>>>>>>
>>>>>>> 1) Load all disks into the storage array.
>>>>>>> 2) Create RAID pools and virtual disks.
>>>>>>> 3) Create lustre file system using mkfs.lustre command. (I still
>>>>>>> have to figure out all the parameters used on the existing OSTs).
>>>>>>> 4) Create mount points on all OSSs.
>>>>>>> 5) Mount the lustre OSTs.
>>>>>>> 6) Maybe rebalance the filesystem.
>>>>>>> My understanding is that the above can be done without bringing the
>>>>>>> filesystem down. I want to create the HA configuration (corosync and
>>>>>>> pacemaker) for the new OSTs. This step requires the filesystem to be 
>>>>>>> down.
>>>>>>> I want to know what w

Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-21 Thread Raj Ayyampalayam
Hi Raj,

Thanks for the explanation. We will have to rethink our upgrade process.

Thanks again.
Raj

On Thu, Feb 21, 2019, 10:23 PM Raj  wrote:

> Hello Raj,
> It’s best and safe to unmount from all the clients and then do the
> upgrade. Your FS is getting more OSTs and changing conf in the existing
> ones, your client needs to get the new layout by remounting it.
> Also you mentioned about client eviction, during eviction the client has
> to drop it’s dirty pages and all the open file descriptors in the FS will
> be gone.
>
> On Thu, Feb 21, 2019 at 12:25 PM Raj Ayyampalayam 
> wrote:
>
>> What can I expect to happen to the jobs that are suspended during the
>> file system restart?
>> Will the processes holding an open file handle die when I unsuspend them
>> after the filesystem restart?
>>
>> Thanks!
>> -Raj
>>
>>
>> On Thu, Feb 21, 2019 at 12:52 PM Colin Faber  wrote:
>>
>>> Ah yes,
>>>
>>> If you're adding to an existing OSS, then you will need to reconfigure
>>> the file system which requires writeconf event.
>>>
>>
>>> On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam 
>>> wrote:
>>>
>>>> The new OST's will be added to the existing file system (the OSS nodes
>>>> are already part of the filesystem), I will have to re-configure the
>>>> current HA resource configuration to tell it about the 4 new OST's.
>>>> Our exascaler's HA monitors the individual OST and I need to
>>>> re-configure the HA on the existing filesystem.
>>>>
>>>> Our vendor support has confirmed that we would have to restart the
>>>> filesystem if we want to regenerate the HA configs to include the new 
>>>> OST's.
>>>>
>>>> Thanks,
>>>> -Raj
>>>>
>>>>
>>>> On Thu, Feb 21, 2019 at 11:23 AM Colin Faber  wrote:
>>>>
>>>>> It seems to me that steps may still be missing?
>>>>>
>>>>> You're going to rack/stack and provision the OSS nodes with new OSTs'.
>>>>>
>>>>> Then you're going to introduce failover options somewhere? new osts?
>>>>> existing system? etc?
>>>>>
>>>>> If you're introducing failover with the new OST's and leaving the
>>>>> existing system in place, you should be able to accomplish this without
>>>>> bringing the system offline.
>>>>>
>>>>> If you're going to be introducing failover to your existing system
>>>>> then you will need to reconfigure the file system to accommodate the new
>>>>> failover settings (failover nides, etc.)
>>>>>
>>>>> -cf
>>>>>
>>>>>
>>>>> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam 
>>>>> wrote:
>>>>>
>>>>>> Our upgrade strategy is as follows:
>>>>>>
>>>>>> 1) Load all disks into the storage array.
>>>>>> 2) Create RAID pools and virtual disks.
>>>>>> 3) Create lustre file system using mkfs.lustre command. (I still have
>>>>>> to figure out all the parameters used on the existing OSTs).
>>>>>> 4) Create mount points on all OSSs.
>>>>>> 5) Mount the lustre OSTs.
>>>>>> 6) Maybe rebalance the filesystem.
>>>>>> My understanding is that the above can be done without bringing the
>>>>>> filesystem down. I want to create the HA configuration (corosync and
>>>>>> pacemaker) for the new OSTs. This step requires the filesystem to be 
>>>>>> down.
>>>>>> I want to know what would happen to the suspended processes across the
>>>>>> cluster when I bring the filesystem down to re-generate the HA configs.
>>>>>>
>>>>>> Thanks,
>>>>>> -Raj
>>>>>>
>>>>>> On Thu, Feb 21, 2019 at 12:59 AM Colin Faber 
>>>>>> wrote:
>>>>>>
>>>>>>> Can you provide more details on your upgrade strategy? In some cases
>>>>>>> expanding your storage shouldn't impact client / job activity at all.
>>>>>>>
>>>>>>> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> We are planning on expanding our storage by adding more OST

Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-21 Thread Raj
Hello Raj,
It’s best and safest to unmount from all the clients and then do the upgrade.
Your FS is getting more OSTs and the configuration is changing on the existing
ones, so your clients need to get the new layout by remounting it.
Also, you mentioned client eviction: during eviction the client has to
drop its dirty pages, and all the open file descriptors in the FS will be
gone.
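
The client-side part is just a remount, e.g. (the mount point and MGS NID are
made up):

umount /mnt/lustre
mount -t lustre 192.168.1.10@o2ib:/lustrefs /mnt/lustre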

On Thu, Feb 21, 2019 at 12:25 PM Raj Ayyampalayam  wrote:

> What can I expect to happen to the jobs that are suspended during the file
> system restart?
> Will the processes holding an open file handle die when I unsuspend them
> after the filesystem restart?
>
> Thanks!
> -Raj
>
>
> On Thu, Feb 21, 2019 at 12:52 PM Colin Faber  wrote:
>
>> Ah yes,
>>
>> If you're adding to an existing OSS, then you will need to reconfigure
>> the file system which requires writeconf event.
>>
>
>> On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam 
>> wrote:
>>
>>> The new OST's will be added to the existing file system (the OSS nodes
>>> are already part of the filesystem), I will have to re-configure the
>>> current HA resource configuration to tell it about the 4 new OST's.
>>> Our exascaler's HA monitors the individual OST and I need to
>>> re-configure the HA on the existing filesystem.
>>>
>>> Our vendor support has confirmed that we would have to restart the
>>> filesystem if we want to regenerate the HA configs to include the new OST's.
>>>
>>> Thanks,
>>> -Raj
>>>
>>>
>>> On Thu, Feb 21, 2019 at 11:23 AM Colin Faber  wrote:
>>>
>>>> It seems to me that steps may still be missing?
>>>>
>>>> You're going to rack/stack and provision the OSS nodes with new OSTs'.
>>>>
>>>> Then you're going to introduce failover options somewhere? new osts?
>>>> existing system? etc?
>>>>
>>>> If you're introducing failover with the new OST's and leaving the
>>>> existing system in place, you should be able to accomplish this without
>>>> bringing the system offline.
>>>>
>>>> If you're going to be introducing failover to your existing system then
>>>> you will need to reconfigure the file system to accommodate the new
>>>> failover settings (failover nides, etc.)
>>>>
>>>> -cf
>>>>
>>>>
>>>> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam 
>>>> wrote:
>>>>
>>>>> Our upgrade strategy is as follows:
>>>>>
>>>>> 1) Load all disks into the storage array.
>>>>> 2) Create RAID pools and virtual disks.
>>>>> 3) Create lustre file system using mkfs.lustre command. (I still have
>>>>> to figure out all the parameters used on the existing OSTs).
>>>>> 4) Create mount points on all OSSs.
>>>>> 5) Mount the lustre OSTs.
>>>>> 6) Maybe rebalance the filesystem.
>>>>> My understanding is that the above can be done without bringing the
>>>>> filesystem down. I want to create the HA configuration (corosync and
>>>>> pacemaker) for the new OSTs. This step requires the filesystem to be down.
>>>>> I want to know what would happen to the suspended processes across the
>>>>> cluster when I bring the filesystem down to re-generate the HA configs.
>>>>>
>>>>> Thanks,
>>>>> -Raj
>>>>>
>>>>> On Thu, Feb 21, 2019 at 12:59 AM Colin Faber  wrote:
>>>>>
>>>>>> Can you provide more details on your upgrade strategy? In some cases
>>>>>> expanding your storage shouldn't impact client / job activity at all.
>>>>>>
>>>>>> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam 
>>>>>> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> We are planning on expanding our storage by adding more OSTs to our
>>>>>>> lustre file system. It looks like it would be easier to expand if we 
>>>>>>> bring
>>>>>>> the filesystem down and perform the necessary operations. We are 
>>>>>>> planning
>>>>>>> to suspend all the jobs running on the cluster. We originally planned to
>>>>>>> add new OSTs to the live filesystem.
>>>>>>>
>>>>>>> We are trying to determine the potential impact to the suspended
>>>>>>> jobs if we bring down the filesystem for the upgrade.
>>>>>>> One of the questions we have is what would happen to the suspended
>>>>>>> processes that hold an open file handle in the lustre file system when 
>>>>>>> the
>>>>>>> filesystem is brought down for the upgrade?
>>>>>>> Will they recover from the client eviction?
>>>>>>>
>>>>>>> We do have vendor support and have engaged them. I wanted to ask the
>>>>>>> community and get some feedback.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Raj
>>>>>>>
>>>>>> ___
>>>>>>> lustre-discuss mailing list
>>>>>>> lustre-discuss@lists.lustre.org
>>>>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>>>>>
>>>>>> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-21 Thread Raj Ayyampalayam
What can I expect to happen to the jobs that are suspended during the file
system restart?
Will the processes holding an open file handle die when I unsuspend them
after the filesystem restart?

Thanks!
-Raj


On Thu, Feb 21, 2019 at 12:52 PM Colin Faber  wrote:

> Ah yes,
>
> If you're adding to an existing OSS, then you will need to reconfigure the
> file system which requires writeconf event.
>
> On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam 
> wrote:
>
>> The new OST's will be added to the existing file system (the OSS nodes
>> are already part of the filesystem), I will have to re-configure the
>> current HA resource configuration to tell it about the 4 new OST's.
>> Our exascaler's HA monitors the individual OST and I need to re-configure
>> the HA on the existing filesystem.
>>
>> Our vendor support has confirmed that we would have to restart the
>> filesystem if we want to regenerate the HA configs to include the new OST's.
>>
>> Thanks,
>> -Raj
>>
>>
>> On Thu, Feb 21, 2019 at 11:23 AM Colin Faber  wrote:
>>
>>> It seems to me that steps may still be missing?
>>>
>>> You're going to rack/stack and provision the OSS nodes with new OSTs'.
>>>
>>> Then you're going to introduce failover options somewhere? new osts?
>>> existing system? etc?
>>>
>>> If you're introducing failover with the new OST's and leaving the
>>> existing system in place, you should be able to accomplish this without
>>> bringing the system offline.
>>>
>>> If you're going to be introducing failover to your existing system then
>>> you will need to reconfigure the file system to accommodate the new
>>> failover settings (failover nides, etc.)
>>>
>>> -cf
>>>
>>>
>>> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam 
>>> wrote:
>>>
>>>> Our upgrade strategy is as follows:
>>>>
>>>> 1) Load all disks into the storage array.
>>>> 2) Create RAID pools and virtual disks.
>>>> 3) Create lustre file system using mkfs.lustre command. (I still have
>>>> to figure out all the parameters used on the existing OSTs).
>>>> 4) Create mount points on all OSSs.
>>>> 5) Mount the lustre OSTs.
>>>> 6) Maybe rebalance the filesystem.
>>>> My understanding is that the above can be done without bringing the
>>>> filesystem down. I want to create the HA configuration (corosync and
>>>> pacemaker) for the new OSTs. This step requires the filesystem to be down.
>>>> I want to know what would happen to the suspended processes across the
>>>> cluster when I bring the filesystem down to re-generate the HA configs.
>>>>
>>>> Thanks,
>>>> -Raj
>>>>
>>>> On Thu, Feb 21, 2019 at 12:59 AM Colin Faber  wrote:
>>>>
>>>>> Can you provide more details on your upgrade strategy? In some cases
>>>>> expanding your storage shouldn't impact client / job activity at all.
>>>>>
>>>>> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam 
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We are planning on expanding our storage by adding more OSTs to our
>>>>>> lustre file system. It looks like it would be easier to expand if we 
>>>>>> bring
>>>>>> the filesystem down and perform the necessary operations. We are planning
>>>>>> to suspend all the jobs running on the cluster. We originally planned to
>>>>>> add new OSTs to the live filesystem.
>>>>>>
>>>>>> We are trying to determine the potential impact to the suspended jobs
>>>>>> if we bring down the filesystem for the upgrade.
>>>>>> One of the questions we have is what would happen to the suspended
>>>>>> processes that hold an open file handle in the lustre file system when 
>>>>>> the
>>>>>> filesystem is brought down for the upgrade?
>>>>>> Will they recover from the client eviction?
>>>>>>
>>>>>> We do have vendor support and have engaged them. I wanted to ask the
>>>>>> community and get some feedback.
>>>>>>
>>>>>> Thanks,
>>>>>> -Raj
>>>>>>
>>>>> ___
>>>>>> lustre-discuss mailing list
>>>>>> lustre-discuss@lists.lustre.org
>>>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>>>>
>>>>>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-21 Thread Raj Ayyampalayam
The new OST's will be added to the existing file system (the OSS nodes are
already part of the filesystem), I will have to re-configure the current HA
resource configuration to tell it about the 4 new OST's.
Our exascaler's HA monitors the individual OST and I need to re-configure
the HA on the existing filesystem.

Our vendor support has confirmed that we would have to restart the
filesystem if we want to regenerate the HA configs to include the new OST's.

Thanks,
-Raj


On Thu, Feb 21, 2019 at 11:23 AM Colin Faber  wrote:

> It seems to me that steps may still be missing?
>
> You're going to rack/stack and provision the OSS nodes with new OSTs'.
>
> Then you're going to introduce failover options somewhere? new osts?
> existing system? etc?
>
> If you're introducing failover with the new OST's and leaving the existing
> system in place, you should be able to accomplish this without bringing the
> system offline.
>
> If you're going to be introducing failover to your existing system then
> you will need to reconfigure the file system to accommodate the new
> failover settings (failover nides, etc.)
>
> -cf
>
>
> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam  wrote:
>
>> Our upgrade strategy is as follows:
>>
>> 1) Load all disks into the storage array.
>> 2) Create RAID pools and virtual disks.
>> 3) Create lustre file system using mkfs.lustre command. (I still have to
>> figure out all the parameters used on the existing OSTs).
>> 4) Create mount points on all OSSs.
>> 5) Mount the lustre OSTs.
>> 6) Maybe rebalance the filesystem.
>> My understanding is that the above can be done without bringing the
>> filesystem down. I want to create the HA configuration (corosync and
>> pacemaker) for the new OSTs. This step requires the filesystem to be down.
>> I want to know what would happen to the suspended processes across the
>> cluster when I bring the filesystem down to re-generate the HA configs.
>>
>> Thanks,
>> -Raj
>>
>> On Thu, Feb 21, 2019 at 12:59 AM Colin Faber  wrote:
>>
>>> Can you provide more details on your upgrade strategy? In some cases
>>> expanding your storage shouldn't impact client / job activity at all.
>>>
>>> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam 
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> We are planning on expanding our storage by adding more OSTs to our
>>>> lustre file system. It looks like it would be easier to expand if we bring
>>>> the filesystem down and perform the necessary operations. We are planning
>>>> to suspend all the jobs running on the cluster. We originally planned to
>>>> add new OSTs to the live filesystem.
>>>>
>>>> We are trying to determine the potential impact to the suspended jobs
>>>> if we bring down the filesystem for the upgrade.
>>>> One of the questions we have is what would happen to the suspended
>>>> processes that hold an open file handle in the lustre file system when the
>>>> filesystem is brought down for the upgrade?
>>>> Will they recover from the client eviction?
>>>>
>>>> We do have vendor support and have engaged them. I wanted to ask the
>>>> community and get some feedback.
>>>>
>>>> Thanks,
>>>> -Raj
>>>>
>>> ___
>>>> lustre-discuss mailing list
>>>> lustre-discuss@lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>>
>>>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-21 Thread Raj Ayyampalayam
Our upgrade strategy is as follows:

1) Load all disks into the storage array.
2) Create RAID pools and virtual disks.
3) Create lustre file system using mkfs.lustre command. (I still have to
figure out all the parameters used on the existing OSTs).
4) Create mount points on all OSSs.
5) Mount the lustre OSTs.
6) Maybe rebalance the filesystem.
My understanding is that the above can be done without bringing the
filesystem down. I want to create the HA configuration (corosync and
pacemaker) for the new OSTs. This step requires the filesystem to be down.
I want to know what would happen to the suspended processes across the
cluster when I bring the filesystem down to re-generate the HA configs.
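
Regarding step 3, this is roughly what I had in mind for pulling the
parameters off an existing OST and reusing them (a sketch only; device paths,
index and NIDs below are made-up examples, not our real values):

# print the current config of an existing OST without changing anything
tunefs.lustre --dryrun /dev/mapper/existing_ost00

# format a new OST reusing the same fsname / MGS / failover settings
mkfs.lustre --ost --fsname=lustre1 --index=24 \
    --mgsnode=10.0.0.1@o2ib --mgsnode=10.0.0.2@o2ib \
    --servicenode=10.0.0.11@o2ib --servicenode=10.0.0.12@o2ib \
    /dev/mapper/new_ost24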

Thanks,
-Raj

On Thu, Feb 21, 2019 at 12:59 AM Colin Faber  wrote:

> Can you provide more details on your upgrade strategy? In some cases
> expanding your storage shouldn't impact client / job activity at all.
>
> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam  wrote:
>
>> Hello,
>>
>> We are planning on expanding our storage by adding more OSTs to our
>> lustre file system. It looks like it would be easier to expand if we bring
>> the filesystem down and perform the necessary operations. We are planning
>> to suspend all the jobs running on the cluster. We originally planned to
>> add new OSTs to the live filesystem.
>>
>> We are trying to determine the potential impact to the suspended jobs if
>> we bring down the filesystem for the upgrade.
>> One of the questions we have is what would happen to the suspended
>> processes that hold an open file handle in the lustre file system when the
>> filesystem is brought down for the upgrade?
>> Will they recover from the client eviction?
>>
>> We do have vendor support and have engaged them. I wanted to ask the
>> community and get some feedback.
>>
>> Thanks,
>> -Raj
>>
> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-20 Thread Raj Ayyampalayam
Hello,

We are planning on expanding our storage by adding more OSTs to our lustre
file system. It looks like it would be easier to expand if we bring the
filesystem down and perform the necessary operations. We are planning to
suspend all the jobs running on the cluster. We originally planned to add
new OSTs to the live filesystem.

We are trying to determine the potential impact to the suspended jobs if we
bring down the filesystem for the upgrade.
One of the questions we have is what would happen to the suspended
processes that hold an open file handle in the lustre file system when the
filesystem is brought down for the upgrade?
Will they recover from the client eviction?

We do have vendor support and have engaged them. I wanted to ask the
community and get some feedback.

Thanks,
-Raj
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] new mounted client shows lower disk space

2018-11-14 Thread Raj
I would check if the LNET address gets set up properly before mounting the
lustre FS from the client. You can try manually loading the lustre module,
pinging (lctl ping oss-nid) all the OSS nodes, and checking dmesg for any
abnormalities before mounting the FS.
It could be as simple as a duplicate IP address on your ib interface or an
unstable IB fabric.
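
A rough sketch of what I mean, before mounting (the NID below is only an
example):

modprobe lustre            # load the lnet/lustre modules without mounting anything
lctl network up            # bring LNet up on the configured interface
lctl list_nids             # confirm the local NID (watch for duplicates)
lctl ping 10.0.0.5@o2ib    # repeat for every OSS and the MDS
dmesg | tail -n 50         # look for o2iblnd / LNet errors before mounting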

On Wed, Nov 14, 2018 at 8:08 AM Thomas Roth  wrote:

> Hi,
>
> your error messages are all well known - the one on the OSS will show up
> as soon as the Lustre modules
> are loaded, provided you have some clients asking for the OSTs (and your
> MDT, which should be up by
> then, is also looking for the OSTs).
> The kiblnd_check_conns message I have also seen quite often, never with
> any OST problems.
>
> Rather seems your OST take a lot of time to mount or to recover - did you
> check
> /proc/fs/lustre/obdfilter/lustre-OST00*/recovery_status
> ?
>
> Regards
> Thomas
>
> On 11/12/18 9:46 AM, fırat yılmaz wrote:
> > Hi All
> > OS=Redhat 7.4
> > Lustre Version: Intel® Manager for Lustre* software 4.0.3.0
> >
> > I have 72 osts over 6 oss with HA and 1 mdt serving to 195 clients over
> > infiniband EDR.
> >
> > After a reboot on client, lustre filesystem mounts on startup. It should
> > be 2.1 PB area but it starts with 350TB.
> >
> > lfs osts command shows 90 percent of even numbered osts are ACTIVE and no
> > information about other OSTs, as time passes like 1 hour or so, all OSTs
> > become active and lustre area can be seen as 2.1 PB
> >
> >
> > dmesg on lustre oss server:
> > LustreError: 137-5: lustre-OST0009_UUID: not available for connect from
> > 10.0.0.130@o2ib (no target). If you are running an HA pair check that
> the
> > target is mounted on the other server.
> >
> > dmesg on client:
> > LNet: 5419:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for
> > 10.0.0.5@o2ib: 15 seconds
> > Lustre: 5546:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request
> sent
> > has failed due to network error: [sent 1542009416/real 1542009426]
> > req@885f4761 x1616909446641136/t0(0)
> > o8->lustre-OST0030-osc-885f75219800@10.0.0.8@o2ib:28/4 lens 520/544
> e 0
> > to 1 dl 1542009696 ref 1 fl Rpc:eXN/0/ rc 0/-1
> >
> > I tested infiniband with ib_send_lat, ib_read_lat and no error occured
> > I tested lnet ping with lctl ping 10.0.0.8@o2ib , no error occured
> > 12345-0@lo
> > 12345-10.51.22.8@o2ib
> >
> > Why some OST's  can be accesible while some are not in the same server?
> > Best Regards.
> >
> >
> > ___
> > lustre-discuss mailing list
> > lustre-discuss@lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> >
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] server_bulk_callback errors until server reboots

2018-06-07 Thread Raj
I have seen this error when we had a mix of FDR (using mlx4) and EDR (using
mlx5) devices in the lustre network. A server_bulk_callback should have a
corresponding client_bulk_callback on the client.

http://wiki.lustre.org/Infiniband_Configuration_Howto
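
A sketch of what to compare on the FDR and EDR nodes (these o2iblnd module
parameters are the usual suspects; see the howto above for the full list):

cat /sys/module/ko2iblnd/parameters/peer_credits
cat /sys/module/ko2iblnd/parameters/map_on_demand
cat /sys/module/ko2iblnd/parameters/concurrent_sends
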
On Thu, Jun 7, 2018 at 11:24 AM Hebenstreit, Michael <
michael.hebenstr...@intel.com> wrote:

> No, clients do not show any issues.
>
> -Original Message-
> From: White, Cliff
> Sent: Thursday, June 07, 2018 9:26 AM
> To: Hebenstreit, Michael ; lustre-discuss <
> lustre-discuss@lists.lustre.org>
> Subject: Re: [lustre-discuss] server_bulk_callback errors until server
> reboots
>
>
> On 6/7/18, 7:00 AM, "lustre-discuss on behalf of Hebenstreit, Michael" <
> lustre-discuss-boun...@lists.lustre.org on behalf of
> michael.hebenstr...@intel.com> wrote:
>
> Hello
>
> I have now 2 Lustre systems that suddenly show this error - on a
> single OST the kernel log is filling with messages
>
> [58858.365663] LustreError:
> 123642:0:(events.c:447:server_bulk_callback()) event type 3, status -61,
> desc 880524f7e000
> [58865.328317] LustreError:
> 123640:0:(events.c:447:server_bulk_callback()) event type 5, status -61,
> desc 880cab4ec800
> [58865.340792] LustreError:
> 123641:0:(events.c:447:server_bulk_callback()) event type 5, status -61,
> desc 880524f7c600
> [58865.353167] LustreError:
> 123640:0:(events.c:447:server_bulk_callback()) event type 3, status -61,
> desc 880cab4ec800
> [58865.365503] LustreError:
> 123641:0:(events.c:447:server_bulk_callback()) event type 3, status -61,
> desc 880524f7c600
>
> until the server reboots. Clients are on 2.11/RH7.5, servers are on
> 2.7.19.10/RH7.4 . Has anyone experienced this before?
>
> There should be some corresponding error messages on your clients, have
> you checked there?
> cliffw
>
> Thanks
> Michael
>
>
> 
> Michael Hebenstreit Senior Cluster Architect
> Intel Corporation, MS: RR1-105/H14  Core and Visual Compute Group (DCE)
> 4100 Sara Road
> 
> Tel.:   +1 505-794-3144
> Rio Rancho, NM 87124
> UNITED STATES   E-mail:
> michael.hebenstr...@intel.com
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre client 2.10.3 with 2.1 server

2018-02-28 Thread Raj Ayyampalayam
Yes, this is a CStor 1500 unit originally supplied by Xyratek.
Thanks for your recommendation.

-Raj

On Tue, Feb 27, 2018 at 8:05 PM Dilger, Andreas <andreas.dil...@intel.com>
wrote:

> On Feb 27, 2018, at 16:19, Raj Ayyampalayam <ans...@gmail.com> wrote:
> >
> > We are using a lustre 2.1 server with 2.5 client.
> >
> > Can the latest 2.10.3 client be used with the 2.1 server?
> > I figured I would ask the list before I start installing the client on a
> test node.
>
> I don't believe this is possible, due to changes in the protocol.  In any
> case, we haven't tested the 2.1 code in many years.
>
> Very likely your "2.1" server is really a vendor port with thousands of
> patches, so you might consider to ask the vendor, in case they've tested
> this.  If not, then I'd strongly recommend to upgrade to a newer release on
> the server.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
>
>
>
>
>
>
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Lustre client 2.10.3 with 2.1 server

2018-02-27 Thread Raj Ayyampalayam
Hello,

We are using a lustre 2.1 server with 2.5 client.

Can the latest 2.10.3 client be used with the 2.1 server?
I figured I would ask the list before I start installing the client on a
test node.

Thanks!
Raj
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] how to get statistics from lustre clients ?

2018-01-02 Thread Raj
To add to the list, I like ‘xltop’ which works like a linux ‘top’ command
and integrates with your scheduler.
On Tue, Jan 2, 2018 at 9:25 AM Ben Evans  wrote:

> The OSTs and MDTs have client-specific information in their proc
> hierarchies, so there's no need for client-side monitoring if that's all
> you're looking for.
>
> -Ben Evans
>
> From: Christopher Johnston 
> Date: Tuesday, January 2, 2018 at 10:21 AM
> To: Ben Evans 
> Cc: "Black.S" , "lustre-discuss@lists.lustre.org" <
> lustre-discuss@lists.lustre.org>
> Subject: Re: [lustre-discuss] how to get statistics from lustre clients ?
>
> I use the telgraf plugin for MDS/OSS/OST stats.  I know its not the client
> side that you were after but I find the metrics to be very useful and
> display nicely in grafana.
>
> https://github.com/influxdata/telegraf/tree/master/plugins/inputs/lustre2
>
> On Tue, Jan 2, 2018 at 8:45 AM, Ben Evans  wrote:
>
>> You can get breakdowns of client->OST reads and writes, and combine them
>> into OSS-level info.
>>
>> There is currently no timestamp on it, all the stats files are cumulative
>> since you last cleared.  You can get around this by reading the file
>> regularly and noting the time, and doing the diffs since the last read.
>>
>> There are a few programs out there that do this sort of thing for you
>> already, collectl and collectd come to mind.
>>
>> -Ben Evans
>>
>> From: lustre-discuss  on behalf
>> of "Black.S" 
>> Date: Saturday, December 30, 2017 at 9:18 AM
>> To: "lustre-discuss@lists.lustre.org" 
>> Subject: [lustre-discuss] how to get statistics from lustre clients ?
>>
>> May be anybody know it or get the target where I can search
>>
>> I want to get from each lustre client:
>>
>>-
>>
>>how much lustre client read/write ?
>>-
>>
>>which size blocks ?
>>-
>>
>>with which OSS communicate ?
>>-
>>
>>timestemp of operation ( I mean time then start read/write from
>>lustre client) ?
>>
>> Is I can get that from (or by) linux node with lustre client?
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre causing dropped packets

2017-12-05 Thread Raj
Brian,
I would check the following:
- MTU size must be same across all the nodes (servers + client)
- peer_credit and credit must be same across all the nodes
- /proc/sys/lnet/peers can show if you are constantly seeing negative
credits
- Buffer overflow counters on the switches, if they provide them. If the buffer
size is too low to handle the IO stream, you may want to reduce credits.
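
A quick sketch of where I would look (interface name is an example, and the
module parameter paths assume the socklnd/TCP LND):

ip link show eth40 | grep mtu                      # must match on every server and client
cat /sys/module/ksocklnd/parameters/credits        # compare across all nodes
cat /sys/module/ksocklnd/parameters/peer_credits
cat /proc/sys/lnet/peers                           # watch the min credit column for negative values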

-Raj


On Tue, Dec 5, 2017 at 11:56 AM Brian Andrus <toomuc...@gmail.com> wrote:

> Shawn,
>
> Flow control is configured and these connections are all on the same 40g
> subnet and all directly connected to the same switch.
>
> I'm a little new with using lnet_selftest, but as I run it 1:1, I do see
> the dropped packets go up on the client node pretty significantly when I
> run it. The node I set for server does not drop any packets.
>
> Brian Andrus
>
> On 12/5/2017 9:20 AM, Shawn Hall wrote:
>
> Hi Brian,
>
> Do you have flow control configured on all ports that are on the network
> path? Lustre has a tendency to cause packet losses in ways that performance
> testing tools don’t because of the N to 1 packet flows, so flow control is
> often necessary. Lnet_selftest should replicate this behavior.
>
> Is there a point in the network path where the link bandwidth changes
> (e.g. 40 GbE down to 10 GbE, or 2x40 GbE down to 1x40 GbE)? That will
> commonly be the biggest point of loss if flow control isn’t doing its job.
>
> Shawn
>
> On 12/5/17, 11:49 AM, "lustre-discuss on behalf of jongwoo...@naver.com"
> <lustre-discuss-boun...@lists.lustre.org on behalf of jongwoo...@naver.com>
> wrote:
>
> Did you check your connection with iperf and iperf3 in TCP bandwidth? in
> that case, these tools cannot find out packet drops.
>
> Try checking out your block device backend responsibility with benchmark
> tools like vdbench or bonnie++. Sometimes bad block device causes incorrect
> data transfer.
>
> -Original Message-
> From: Brian Andrus <toomuc...@gmail.com>
> To: "lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org>;
> Cc:
> Sent: 2017-12-06 (수) 01:38:04
> Subject: [lustre-discuss] lustre causing dropped packets
>
> All,
>
> I have a small setup I am testing (1 MGS, 2 OSS) that is connected via
> 40G ethernet.
>
> I notice that when I run anything that writes to the lustre filesystem
> causes dropped packets. Reads do not seem to cause this. I have also
> tested the network (iperf, iperf3, general traffic) with no dropped
> packets.
>
> Is there something with writes that can cause dropped packets?
>
>
> Brian Andrus
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
>
> *Disclaimer*
>
> This e-mail has been scanned for all viruses and malware, and may have
> been automatically archived by Mimecast Ltd, an innovator in Software as a
> Service (SaaS) for business.
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] weird issue w. lnet routers

2017-11-28 Thread Raj
John, increasing the MTU size on the Ethernet side should increase the b/w. I
also have a feeling that some lnet routers and/or intermediate
switches/routers do not have jumbo frames turned on (some switches need to be
set at 9212 bytes).
How many LNet routers are you using? I believe you are routing between EDR
IB and 100GbE.
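
For what it's worth, a minimal way to verify the Ethernet side end to end
(interface name, IP and NID below are placeholders, not your real values):

ip link set dev enp94s0 mtu 9000     # on the 100GbE clients and the router Ethernet ports
ip link show enp94s0 | grep mtu
ping -M do -s 8972 10.10.1.1         # check a full jumbo frame passes unfragmented
lctl ping 10.10.1.1@tcp              # then rerun the transfer / lnet_selftest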


On Tue, Nov 28, 2017 at 7:21 PM John Casu  wrote:

> just built a system w. lnet routers that bridge Infiniband & 100GbE, using
> Centos built in Infiniband support
> servers are Infiniband, clients are 100GbE (connectx-4 cards)
>
> my direct write performance from clients over Infiniband is around 15GB/s
>
> When I introduced the lnet routers, performance dropped to 10GB/s
>
> Thought the problem was an MTU of 1500, but when I changed the MTUs to 9000
> performance dropped to 3GB/s.
>
> When I tuned according to John Fragella's LUG slides, things went even
> slower (1.5GB/s write)
>
> does anyone have any ideas on what I'm doing wrong??
>
> thanks,
> -john c.
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] mdt mounting error

2017-11-01 Thread Raj
Parag, I have not tested two FS using a common MGT and I don’t know whether
it is supported.

On Wed, Nov 1, 2017 at 6:37 AM Parag Khuraswar <para...@citilindia.com>
wrote:

> Hi Raj,
>
> But I have two file systems,
> And I think I can use one mgt for two filesystems. Please correct me if
> I am wrong.
>
> Regards,
> Parag
>
>
> On 2017-11-01 16:56, Raj wrote:
> > The following can contribute to this issue:
> > - Missing FS name in mgt creation (it must be <=9 character long):
> > --fsname=
> > mkfs.lustre --servicenode=10.2.1.204@o2ib
> > --servicenode=10.2.1.205@o2ib --FSNAME=HOME --mgs /dev/mapper/mpathc
> >
> > - verify if /mdt directory exists
> >
> > On Wed, Nov 1, 2017 at 6:16 AM Raj <rajgau...@gmail.com> wrote:
> >
> >> What options in mkfs.lustre did you use to format with lustre?
> >>
> >> On Wed, Nov 1, 2017 at 6:14 AM Parag Khuraswar
> >> <para...@citilindia.com> wrote:
> >>
> >> Hi Raj,
> >>
> >> Yes, /dev/mapper/mpatha available.
> >>
> >> I could format and mount using ext4.
>
> >>
> >> Regards,
> >>
> >> Parag
> >>
> >> FROM: Raj [mailto:rajgau...@gmail.com]
> >> SENT: Wednesday, November 1, 2017 4:39 PM
> >> TO: Parag Khuraswar; Lustre discussion
> >> SUBJECT: Re: [lustre-discuss] mdt mounting error
> >>
> >> Parag,
> >> Is the device /dev/mapper/mpatha available?
> >> If not, the multipathd may not have started or the multipath
> >> configuration may not be correct.
> >>
> >> On Wed, Nov 1, 2017 at 5:18 AM Parag Khuraswar
> >> <para...@citilindia.com> wrote:
> >>
> >> Hi,
> >>
> >> I am getting below error while mounting mdt. Mgt is mounted.
> >>
> >> Please suggest
> >>
> >> [root@mds2 ~]# mount -t lustre /dev/mapper/mpatha /mdt
> >>
> >> mount.lustre: mount /dev/mapper/mpatha at /mdt failed: No such file
> >> or directory
> >>
> >> Is the MGS specification correct?
> >>
> >> Is the filesystem name correct?
> >>
> >> If upgrading, is the copied client log valid? (see upgrade docs)
> >>
> >> Regards,
> >>
> >> Parag
> >>
> >> ___
> >> lustre-discuss mailing list
> >> lustre-discuss@lists.lustre.org
> >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] mdt mounting error

2017-11-01 Thread Raj
The following can contribute to this issue:
- Missing FS name in mgt creation (it must be <=9 character long): --fsname=
mkfs.lustre --servicenode=10.2.1.204@o2ib --servicenode=10.2.1.205@o2ib
*--fsname=home* --mgs /dev/mapper/mpathc

- verify if /mdt directory exists



On Wed, Nov 1, 2017 at 6:16 AM Raj <rajgau...@gmail.com> wrote:

> What options in mkfs.lustre did you use to format with lustre?
> On Wed, Nov 1, 2017 at 6:14 AM Parag Khuraswar <para...@citilindia.com>
> wrote:
>
>> Hi Raj,
>>
>>
>>
>> Yes, /dev/mapper/mpatha available.
>>
>> I could format and mount using ext4.
>>
>>
>>
>> Regards,
>>
>> Parag
>>
>>
>>
>>
>>
>> *From:* Raj [mailto:rajgau...@gmail.com]
> >> *Sent:* Wednesday, November 1, 2017 4:39 PM
>> *To:* Parag Khuraswar; Lustre discussion
>> *Subject:* Re: [lustre-discuss] mdt mounting error
>>
>>
>>
>> Parag,
>> Is the device /dev/mapper/mpatha available?
>> If not, the multipathd may not have started or the multipath
>> configuration may not be correct.
>>
>> On Wed, Nov 1, 2017 at 5:18 AM Parag Khuraswar <para...@citilindia.com>
>> wrote:
>>
>> Hi,
>>
>>
>>
>> I am getting below error while mounting mdt. Mgt is mounted.
>>
>>
>>
>> Please suggest
>>
>>
>>
>> [root@mds2 ~]# mount -t lustre /dev/mapper/mpatha /mdt
>>
>> mount.lustre: mount /dev/mapper/mpatha at /mdt failed: No such file or
>> directory
>>
>> Is the MGS specification correct?
>>
>> Is the filesystem name correct?
>>
>> If upgrading, is the copied client log valid? (see upgrade docs)
>>
>>
>>
>> Regards,
>>
>> Parag
>>
>>
>>
>>
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] mdt mounting error

2017-11-01 Thread Raj
Parag,
Is the device /dev/mapper/mpatha available?
If not, the multipathd may not have started or the multipath configuration
may not be correct.

On Wed, Nov 1, 2017 at 5:18 AM Parag Khuraswar 
wrote:

> Hi,
>
>
>
> I am getting below error while mounting mdt. Mgt is mounted.
>
>
>
> Please suggest
>
>
>
> [root@mds2 ~]# mount -t lustre /dev/mapper/mpatha /mdt
>
> mount.lustre: mount /dev/mapper/mpatha at /mdt failed: No such file or
> directory
>
> Is the MGS specification correct?
>
> Is the filesystem name correct?
>
> If upgrading, is the copied client log valid? (see upgrade docs)
>
>
>
> Regards,
>
> Parag
>
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] backup zfs MDT or migrate from ZFS back to ldiskfs

2017-07-22 Thread Raj
Stu,
Is there a reason why you picked raidz3 rather than a 4-way mirror across 4
disks?
The raidz3 parity calculation might take more CPU resources than mirroring
across disks, but the latency may also be higher with mirroring, since writes
have to sync across all the disks. Wondering if you did some testing before
deciding.
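
Just to be clear about the two layouts I am comparing (pool and disk names are
made up):

# 4-way mirror: survives 3 disk failures, reads can be served from any copy
zpool create mdt0pool mirror sda sdb sdc sdd

# raidz3 across the same 4 disks: also survives 3 failures, but every write
# computes triple parity across the stripe
zpool create mdt0pool raidz3 sda sdb sdc sdd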

On Fri, Jul 21, 2017 at 12:27 AM Stu Midgley  wrote:

> we have been happily using 2.9.52+0.7.0-rc3 for a while now.
>
> The MDT is a raidz3 across 4 disks.
>
> On Fri, Jul 21, 2017 at 1:19 PM, Isaac Huang  wrote:
>
>> On Fri, Jul 21, 2017 at 12:54:15PM +0800, Stu Midgley wrote:
>> > Afternoon
>> >
>> > I have an MDS running on spinning media and wish to migrate it to SSD's.
>> >
>> > Lustre 2.9.52
>> > ZFS 0.7.0-rc3
>>
>> This may not be a stable combination - I don't think Lustre officially
>> supports 0.7.0-rc yet. Plus, there's a recent Lustre osd-zfs bug and
>> its fix hasn't been back ported to 2.9 yet (to the best of my knowledge):
>> https://jira.hpdd.intel.com/browse/LU-9305
>>
>> > How do I do it?
>>
>> Depends on how you've configured the MDT pool. If the disks are
>> mirrored or just plan disks without any redundancy (i.e. not RAIDz),
>> you can simply attach the SSDs to the hard drives to form or extend
>> mirrors and then detach the hard drives - see zpool attach/detach.
>>
>> -Isaac
>>
>
>
>
> --
> Dr Stuart Midgley
> sdm...@gmail.com
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] set OSTs read only ?

2017-07-16 Thread Raj
I observed the same thing with 2.5. I believe it was the unlink object list
not being pushed to OST once it came back active. But, I didn't see this
issue with version 2.8. It released the space almost immediately after OSTs
were activated.

On Sun, Jul 16, 2017, 10:16 AM Bob Ball <b...@umich.edu> wrote:

> I agree with Raj.  Also, I have noted with Lustre 2.7, that the space is
> not actually freed after re-activation of the OST, until the mgs is
> restarted.  I don't recall the reason for this, or know if this was fixed
> in later Lustre versions.
>
> Remember, this is done on the mgs, not on the clients.  If you do it on a
> client, the behavior is as you thought.
>
>
> bob
>
>
> On 7/16/2017 11:10 AM, Raj wrote:
>
> No. Deactivating an OST will not allow to create new objects(file). But,
> client can read AND modify an existing objects(append the file). Also, it
> will not free any space from deleted objects until the OST is activated
> again.
>
> On Sun, Jul 16, 2017, 9:29 AM E.S. Rosenberg <esr+lus...@mail.hebrew.edu>
> wrote:
>
>> On Thu, Jul 13, 2017 at 5:49 AM, Bob Ball <b...@umich.edu> wrote:
>>
>>> On the mgs/mgt do something like:
>>> lctl --device -OST0019-osc-MDT deactivate
>>>
>>> No further files will be assigned to that OST.  Reverse with
>>> "activate".  Or reboot the mgs/mdt as this is not persistent.  "lctl dl"
>>> will tell you exactly what that device name should be for you.
>>>
>> Doesn't that also disable reads from the OST though?
>>
>>>
>>> bob
>>>
>>>
>>> On 7/12/2017 6:04 PM, Alexander I Kulyavtsev wrote:
>>>
>>> You may find advise from Andreas on this list (also attached below). I
>>> did not try setting fail_loc myself.
>>>
>>> In 2.9 there is setting  osp.*.max_create_count=0 described
>>> at LUDOC-305.
>>>
>>> We used to set OST degraded as described in lustre manual.
>>> It works most of the time but at some point I saw lustre errors in logs
>>> for some ops. Sorry, I do not recall details.
>>>
>>> I still not sure either of these approaches will work for you: setting
>>> OST degraded or fail_loc will makes some osts selected instead of others.
>>> You may want to verify if these settings will trigger clean error on
>>> user side (instead of blocking) when all OSTs are degraded.
>>>
>>> The other and also simpler approach would be to enable lustre quota and
>>> set quota below used space for all users (or groups).
>>>
>>> Alex.
>>>
>>> *From: *"Dilger, Andreas" <andreas.dil...@intel.com>
>>> *Subject: **Re: [lustre-discuss] lustre 2.5.3 ost not draining*
>>> *Date: *July 28, 2015 at 11:51:38 PM CDT
>>> *Cc: *"lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org
>>> >
>>>
>>> Setting it degraded means the MDS will avoid allocations on that OST
>>> unless there aren't enough OSTs to meet the request (e.g. stripe_count =
>>> -1), so it should work.
>>>
>>> That is actually a very interesting workaround for this problem, and it
>>> will work for older versions of Lustre as well.  It doesn't disable the
>>> OST completely, which is fine if you are doing space balancing (and may
>>> even be desirable to allow apps that need more bandwidth for a widely
>>> striped file), but it isn't good if you are trying to empty the OST
>>> completely to remove it.
>>>
>>> It looks like another approach would be to mark the OST as having no free
>>> space using OBD_FAIL_OST_ENOINO (0x229) fault injection on that OST:
>>>
>>>   lctl set_param fail_loc=0x229 fail_val=
>>>
>>> This would cause the OST to return 0 free inodes from OST_STATFS for the
>>> specified OST index, and the MDT would skip this OST completely.  To
>>> disable all of the OSTs on an OSS use  = -1.  It isn't
>>> possible
>>> to selectively disable a subset of OSTs using this method.  The
>>> OBD_FAIL_OST_ENOINO fail_loc has been available since Lustre 2.2, which
>>> covers all of the 2.4+ versions that are affected by this issue.
>>>
>>> If this mechanism works for you (it should, as this fail_loc is used
>>> during regular testing) I'd be obliged if someone could file an LUDOC bug
>>> so the manual can be updated.
>>>
>>> Cheers, Andreas
>>>
>>>
>>>
>>> On Jul 12, 2017, at 4:20 PM, Riccardo Veraldi <

Re: [lustre-discuss] set OSTs read only ?

2017-07-16 Thread Raj
No. Deactivating an OST will not allow new objects (files) to be created on
it. But a client can read AND modify existing objects (append to a file).
Also, it will not free any space from deleted objects until the OST is
activated again.

On Sun, Jul 16, 2017, 9:29 AM E.S. Rosenberg 
wrote:

> On Thu, Jul 13, 2017 at 5:49 AM, Bob Ball  wrote:
>
>> On the mgs/mgt do something like:
>> lctl --device -OST0019-osc-MDT deactivate
>>
>> No further files will be assigned to that OST.  Reverse with "activate".
>> Or reboot the mgs/mdt as this is not persistent.  "lctl dl" will tell you
>> exactly what that device name should be for you.
>>
> Doesn't that also disable reads from the OST though?
>
>>
>> bob
>>
>>
>> On 7/12/2017 6:04 PM, Alexander I Kulyavtsev wrote:
>>
>> You may find advise from Andreas on this list (also attached below). I
>> did not try setting fail_loc myself.
>>
>> In 2.9 there is setting  osp.*.max_create_count=0 described at LUDOC-305.
>>
>> We used to set OST degraded as described in lustre manual.
>> It works most of the time but at some point I saw lustre errors in logs
>> for some ops. Sorry, I do not recall details.
>>
>> I still not sure either of these approaches will work for you: setting
>> OST degraded or fail_loc will makes some osts selected instead of others.
>> You may want to verify if these settings will trigger clean error on user
>> side (instead of blocking) when all OSTs are degraded.
>>
>> The other and also simpler approach would be to enable lustre quota and
>> set quota below used space for all users (or groups).
>>
>> Alex.
>>
>> *From: *"Dilger, Andreas" 
>> *Subject: **Re: [lustre-discuss] lustre 2.5.3 ost not draining*
>> *Date: *July 28, 2015 at 11:51:38 PM CDT
>> *Cc: *"lustre-discuss@lists.lustre.org" 
>>
>> Setting it degraded means the MDS will avoid allocations on that OST
>> unless there aren't enough OSTs to meet the request (e.g. stripe_count =
>> -1), so it should work.
>>
>> That is actually a very interesting workaround for this problem, and it
>> will work for older versions of Lustre as well.  It doesn't disable the
>> OST completely, which is fine if you are doing space balancing (and may
>> even be desirable to allow apps that need more bandwidth for a widely
>> striped file), but it isn't good if you are trying to empty the OST
>> completely to remove it.
>>
>> It looks like another approach would be to mark the OST as having no free
>> space using OBD_FAIL_OST_ENOINO (0x229) fault injection on that OST:
>>
>>   lctl set_param fail_loc=0x229 fail_val=
>>
>> This would cause the OST to return 0 free inodes from OST_STATFS for the
>> specified OST index, and the MDT would skip this OST completely.  To
>> disable all of the OSTs on an OSS use  = -1.  It isn't possible
>> to selectively disable a subset of OSTs using this method.  The
>> OBD_FAIL_OST_ENOINO fail_loc has been available since Lustre 2.2, which
>> covers all of the 2.4+ versions that are affected by this issue.
>>
>> If this mechanism works for you (it should, as this fail_loc is used
>> during regular testing) I'd be obliged if someone could file an LUDOC bug
>> so the manual can be updated.
>>
>> Cheers, Andreas
>>
>>
>>
>> On Jul 12, 2017, at 4:20 PM, Riccardo Veraldi <
>> riccardo.vera...@cnaf.infn.it> wrote:
>>
>> Hello,
>>
>> on one of my lustre FS I need to find a solution so that users can still
>> access data on the FS but cannot write new files on it.
>> I have hundreds of clients accessing the FS so remounting it ro is not
>> really easily feasible.
>> Is there an option on the OSS side to allow OSTs to be accessed just to
>> read data and not to store new data ?
>> tunefs.lustre could do that ?
>> thank you
>>
>> Rick
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>>
>>
>>
>> ___
>> lustre-discuss mailing 
>> listlustre-discuss@lists.lustre.orghttp://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>>
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] How to added new interface in lustre server?

2017-07-08 Thread Raj
Lu,
I will add ib0 just so that it becomes more clear to understand in the
future:
options lnet networks=o2ib*(ib0)*,tcp0(em1),tcp1(p5p2)

Reloading the lnet module or restarting lustre will disconnect all the clients.
But don't worry, the clients will reconnect once it comes back online.
But if you have HA setup, I would recommend you to failover any OSTs that
this node is hosting to its partner. Once failed over, you can stop lnet by:
# lustre_rmmod
And reload lnet by
#modprobe lustre
Check whether you have new nids available locally by:
# lctl list_nids
If everything looks good, you can failback the OSTs to its original OSS
node.

Also, since you are adding a new lnet network (tcp1) or new NID to the
server, I believe you must change the lustre configuration information as
mentioned in the manual, unless somebody here says it's not necessary.
http://wiki.old.lustre.org/manual/LustreManual20_HTML/LustreMaintenance.html#50438199_31353

Thanks
-Raj



On Fri, Jul 7, 2017 at 1:28 AM Wei-Zhao Lu <w...@gate.sinica.edu.tw> wrote:

> Hi ALL,
>
> My lustre server is 2.5.3, there are 2 interface(ib0, tcp).
> module parameter is "options lnet networks=o2ib,tcp"
>
> Now, I changed module parameter as "options lnet
> networks=o2ib,tcp0(em1),tcp1(p5p2)"
> How to reload lnet module or restart lustre service?
> There are many lustre client running jobs, I wish no any bad effect to
> these client.
>
> Thanks a lot.
>
> Best Regards,
> Lu
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] client fails to mount

2017-04-24 Thread Raj
Yes, this is strange. Normally, I have seen that a credits mismatch results in
this scenario, but it doesn't look like this is the case.

You wouldn't want to put the MGS into full debug message capture, as there
will be a lot of data.

I guess you already tried removing the lustre drivers and adding it again ?
lustre_rmmod
modprobe -v lustre

And check dmesg for any errors...
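
If you do want more LNet-level detail on the client side, a hedged sketch
(the NID is the one from your log; restore the debug mask afterwards):

lctl set_param debug=+net              # add net tracing to the debug mask
lctl clear                             # empty the debug buffer
lctl ping 172.23.55.211@o2ib           # reproduce the failing ping
lctl dk > /tmp/lnet-debug.log          # dump the kernel debug buffer
lctl set_param debug=-net              # remove net tracing again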


On Mon, Apr 24, 2017 at 12:43 PM Strikwerda, Ger <g.j.c.strikwe...@rug.nl>
wrote:

> Hi Raj,
>
> When i do a lctl ping on a MGS server i do not see any logs at all. Also
> not when i do a sucessfull ping from a working node. Is there a way to
> verbose the Lustre logging to see more detail on the LNET level?
>
> It is very strange that a rebooted node is able to lctl ping compute
> nodes, but fails to lctl ping metadata and storage nodes.
>
>
>
>
> On Mon, Apr 24, 2017 at 7:35 PM, Raj <rajgau...@gmail.com> wrote:
>
>> Ger,
>> It looks like default configuration of lustre.
>>
>> Do you see any error message on the MGS side while you are doing lctl
>> ping from the rebooted clients?
>> On Mon, Apr 24, 2017 at 12:27 PM Strikwerda, Ger <g.j.c.strikwe...@rug.nl>
>> wrote:
>>
>>> Hi Eli,
>>>
>>> Nothing can be mounted on the Lustre filesystems so the output is:
>>>
>>> [root@pg-gpu01 ~]# lfs df /home/ger/
>>> [root@pg-gpu01 ~]#
>>>
>>> Empty..
>>>
>>>
>>>
>>> On Mon, Apr 24, 2017 at 7:24 PM, E.S. Rosenberg <e...@cs.huji.ac.il>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Apr 24, 2017 at 8:19 PM, Strikwerda, Ger <
>>>> g.j.c.strikwe...@rug.nl> wrote:
>>>>
>>>>> Hallo Eli,
>>>>>
>>>>> Logfile/syslog on the client-side:
>>>>>
>>>>> Lustre: Lustre: Build Version:
>>>>> 2.5.3-RC1--PRISTINE-2.6.32-573.el6.x86_64
>>>>> LNet: Added LNI 172.23.54.51@o2ib [8/256/0/180]
>>>>> LNetError: 2878:0:(o2iblnd_cb.c:2587:kiblnd_rejected())
>>>>> 172.23.55.211@o2ib rejected: consumer defined fatal error
>>>>>
>>>>
>>>> lctl df /path/to/some/file
>>>>
>>>> gives nothing useful? (the second one will dump *a lot*)
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 24, 2017 at 7:16 PM, E.S. Rosenberg <
>>>>> esr+lus...@mail.hebrew.edu> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 24, 2017 at 8:13 PM, Strikwerda, Ger <
>>>>>> g.j.c.strikwe...@rug.nl> wrote:
>>>>>>
>>>>>>> Hi Raj (and others),
>>>>>>>
>>>>>>> In which file should i state the credits/peer_credits stuff?
>>>>>>>
>>>>>>> Perhaps relevant config-files:
>>>>>>>
>>>>>>> [root@pg-gpu01 ~]# cd /etc/modprobe.d/
>>>>>>>
>>>>>>> [root@pg-gpu01 modprobe.d]# ls
>>>>>>> anaconda.conf   blacklist-kvm.conf  dist-alsa.conf
>>>>>>> dist-oss.conf   ib_ipoib.conf  lustre.conf  openfwwf.conf
>>>>>>> blacklist.conf  blacklist-nouveau.conf  dist.conf
>>>>>>> freeipmi-modalias.conf  ib_sdp.confmlnx.conftruescale.conf
>>>>>>>
>>>>>>> [root@pg-gpu01 modprobe.d]# cat ./ib_ipoib.conf
>>>>>>> alias netdev-ib* ib_ipoib
>>>>>>>
>>>>>>> [root@pg-gpu01 modprobe.d]# cat ./mlnx.conf
>>>>>>> # Module parameters for MLNX_OFED kernel modules
>>>>>>>
>>>>>>> [root@pg-gpu01 modprobe.d]# cat ./lustre.conf
>>>>>>> options lnet networks=o2ib(ib0)
>>>>>>>
>>>>>>> Are there more Lustre/LNET options that could help in this situation?
>>>>>>>
>>>>>>
>>>>>> What about the logfiles?
>>>>>> Any error messages in syslog? lctl debug options?
>>>>>> Veel geluk,
>>>>>> Eli
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Apr 24, 2017 at 7:02 PM, Raj <rajgau...@gmail.com> wrote:
>>>>>>>
>>>>>>>> May be worth checking your lnet credits and peer_cred

Re: [lustre-discuss] client fails to mount

2017-04-24 Thread Raj
Ger,
It looks like default configuration of lustre.

Do you see any error message on the MGS side while you are doing lctl ping
from the rebooted clients?
On Mon, Apr 24, 2017 at 12:27 PM Strikwerda, Ger <g.j.c.strikwe...@rug.nl>
wrote:

> Hi Eli,
>
> Nothing can be mounted on the Lustre filesystems so the output is:
>
> [root@pg-gpu01 ~]# lfs df /home/ger/
> [root@pg-gpu01 ~]#
>
> Empty..
>
>
>
> On Mon, Apr 24, 2017 at 7:24 PM, E.S. Rosenberg <e...@cs.huji.ac.il> wrote:
>
>>
>>
>> On Mon, Apr 24, 2017 at 8:19 PM, Strikwerda, Ger <g.j.c.strikwe...@rug.nl
>> > wrote:
>>
>>> Hallo Eli,
>>>
>>> Logfile/syslog on the client-side:
>>>
>>> Lustre: Lustre: Build Version: 2.5.3-RC1--PRISTINE-2.6.32-573.el6.x86_64
>>> LNet: Added LNI 172.23.54.51@o2ib [8/256/0/180]
>>> LNetError: 2878:0:(o2iblnd_cb.c:2587:kiblnd_rejected())
>>> 172.23.55.211@o2ib rejected: consumer defined fatal error
>>>
>>
>> lctl df /path/to/some/file
>>
>> gives nothing useful? (the second one will dump *a lot*)
>>
>>>
>>>
>>>
>>>
>>> On Mon, Apr 24, 2017 at 7:16 PM, E.S. Rosenberg <
>>> esr+lus...@mail.hebrew.edu> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Apr 24, 2017 at 8:13 PM, Strikwerda, Ger <
>>>> g.j.c.strikwe...@rug.nl> wrote:
>>>>
>>>>> Hi Raj (and others),
>>>>>
>>>>> In which file should i state the credits/peer_credits stuff?
>>>>>
>>>>> Perhaps relevant config-files:
>>>>>
>>>>> [root@pg-gpu01 ~]# cd /etc/modprobe.d/
>>>>>
>>>>> [root@pg-gpu01 modprobe.d]# ls
>>>>> anaconda.conf   blacklist-kvm.conf  dist-alsa.conf
>>>>> dist-oss.conf   ib_ipoib.conf  lustre.conf  openfwwf.conf
>>>>> blacklist.conf  blacklist-nouveau.conf  dist.conf
>>>>> freeipmi-modalias.conf  ib_sdp.confmlnx.conftruescale.conf
>>>>>
>>>>> [root@pg-gpu01 modprobe.d]# cat ./ib_ipoib.conf
>>>>> alias netdev-ib* ib_ipoib
>>>>>
>>>>> [root@pg-gpu01 modprobe.d]# cat ./mlnx.conf
>>>>> # Module parameters for MLNX_OFED kernel modules
>>>>>
>>>>> [root@pg-gpu01 modprobe.d]# cat ./lustre.conf
>>>>> options lnet networks=o2ib(ib0)
>>>>>
>>>>> Are there more Lustre/LNET options that could help in this situation?
>>>>>
>>>>
>>>> What about the logfiles?
>>>> Any error messages in syslog? lctl debug options?
>>>> Veel geluk,
>>>> Eli
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 24, 2017 at 7:02 PM, Raj <rajgau...@gmail.com> wrote:
>>>>>
>>>>>> May be worth checking your lnet credits and peer_credits in
>>>>>> /etc/modprobe.d ?
>>>>>> You can compare between working hosts and non working hosts.
>>>>>> Thanks
>>>>>> _Raj
>>>>>>
>>>>>> On Mon, Apr 24, 2017 at 10:10 AM Strikwerda, Ger <
>>>>>> g.j.c.strikwe...@rug.nl> wrote:
>>>>>>
>>>>>>> Hi Rick,
>>>>>>>
>>>>>>> Even without iptables rules and loading the correct modules
>>>>>>> afterwards, we get the same results:
>>>>>>>
>>>>>>> [root@pg-gpu01 sysconfig]# iptables --list
>>>>>>> Chain INPUT (policy ACCEPT)
>>>>>>> target prot opt source   destination
>>>>>>>
>>>>>>> Chain FORWARD (policy ACCEPT)
>>>>>>> target prot opt source   destination
>>>>>>>
>>>>>>> Chain OUTPUT (policy ACCEPT)
>>>>>>> target prot opt source   destination
>>>>>>>
>>>>>>> Chain LOGDROP (0 references)
>>>>>>> target prot opt source   destination
>>>>>>> LOGall  --  anywhere anywhereLOG
>>>>>>> level warning
>>>>>>> DROP   all  --  anywhere anywhere
>>>>>>>
>>>>>>> [root@pg-gpu01 sysconfig]# modprobe lnet
>>>>>>>
>

Re: [lustre-discuss] How to find out what the OSS interacts with the Lustre client at a given time ?

2017-04-15 Thread Raj
I prefer to pull them from server side by:

On OSS:
 /proc/fs/lustre/obdfilter/*/exports/CLIENT_IP@FABRIC/stat
snapshot_time 1492265908.551519 secs.usecs
read_bytes21575 samples [bytes] 458752 1048576 22621818880
statfs1 samples [reqs]
preprw21575 samples [reqs]
commitrw  21575 samples [reqs]
ping  1 samples [reqs]

You can clear the stat (echo clear > stat) and then read at sample
intervals.
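
For example, a tiny sampling sketch (OST name, client NID and interval are
made-up examples; the stat file name follows the path above):

OST=lustre-OST0000
NID=10.0.0.21@o2ib
for f in /proc/fs/lustre/obdfilter/$OST/exports/$NID/stat*; do
    echo clear > "$f"          # reset the counters
done
sleep 10                       # sample interval
cat /proc/fs/lustre/obdfilter/$OST/exports/$NID/stat*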

Same on MDS node:
/proc/fs/lustre/mdt/*/exports/CLIENT_IP@FABRIC/stat
snapshot_time 1492266510.531256 secs.usecs
open  3 samples [reqs]
close 1 samples [reqs]
unlink1 samples [reqs]
getattr   3 samples [reqs]

Thanks
_Raj

On Sat, Apr 15, 2017 at 5:46 AM Black.S  wrote:

> How to find out what the OSS interacts with the Lustre client at a given
> time?
> Maybe it know from /proc/fs/lustre the lustre client or something else ?
>
> Is it possible to obtain such statistics from MDS, OSS or from a lustre
> client host ?
>
> thanks for you time
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] LNET Self-test

2017-02-05 Thread Raj
You should be able to do concurrent streams using --concurrency option. I
would try with 2/4/8.
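
Something along these lines, from memory, so please double-check against the
lnet_selftest docs (NIDs are placeholders):

export LST_SESSION=$$
lst new_session bwtest
lst add_group clients 192.168.1.10@tcp
lst add_group servers 192.168.1.20@tcp
lst add_batch bulk
lst add_test --batch bulk --concurrency 8 --from clients --to servers \
    brw write size=1M
lst run bulk
lst stat servers & sleep 30; kill %1      # watch the bandwidth for ~30 s
lst stop bulk
lst end_session
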
-RG

On Sun, Feb 5, 2017 at 1:30 PM Jon Tegner  wrote:

> Hi,
>
> I'm trying to use lnet selftest to evaluate network performance on a
> test setup (only two machines). Using e.g., iperf or Netpipe I've
> managed to demonstrate the bandwidth of the underlying 10 Gbits/s
> network (and typically you reach the expected bandwidth as the packet
> size increases).
>
> How can I do the same using lnet selftest (i.e., verifying the bandwidth
> of the underlying hardware)? My initial thought was to increase the I/O
> size, but it seems the maximum size one can use is "--size=1M".
>
> Thanks,
>
> /jon
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Client eviction during OST failover

2017-02-02 Thread Raj
Hello, I am testing 2.8.0 on the servers, and clients are getting evicted when
an OST failover occurs, resulting in IO errors. Failover happens within 2 mins
and the lustre timeouts are very comfortable (lustre timeout is set to 300,
at_min is 40, at_max is 400, ldlm_enqueue_min=260).
Do you see anything abnormal in the debug below? Any help is appreciated.
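
For reference, this is how I verify those settings and watch recovery while
the OST is remounted (a sketch, assuming the usual lctl parameter names):

lctl get_param timeout at_min at_max ldlm_enqueue_min     # on clients and servers
lctl get_param obdfilter.*.recovery_status                # on the OSS taking over the OST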

Client side debug when OST000e is unmounted and mounted on an OSS node (it
was mounted again at 7:21):
Feb  2 07:20:25 hcrddm005 kernel: LustreError: 11-0:
hsi003-OST000e-osc-8815c65f4400: Communicating with 10.10.150.21@o2ib,
operation ost_read failed with -107.
Feb  2 07:20:25 hcrddm005 kernel: Lustre:
hsi003-OST000e-osc-8815c65f4400: Connection to hsi003-OST000e (at
10.10.150.21@o2ib) was lost; in progress operations using this service will
wait for recovery to complete
Feb  2 07:20:25 hcrddm005 kernel: LustreError: dumping log to
*/tmp/lustre-log.1486041625.111200*
Feb  2 07:20:25 hcrddm005 kernel: LustreError: Skipped 1 previous similar
message
Feb  2 07:21:27 hcrddm005 kernel: LustreError: 167-0:
hsi003-OST000e-osc-8815c65f4400: This client was evicted by
hsi003-OST000e; in progress operations using this service will fail.
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(osc_lock.c:832:osc_ldlm_completion_ast()) lock@8819fd5eca98[3
3 0 1 1 ] R(1):[0, 18446744073709551615]@[0x1000e:0x26442:0x0] {
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(osc_lock.c:832:osc_ldlm_completion_ast())
lovsub@881be723c720: [0 881fb55c26e0 R(1):[0,
18446744073709551615]@[0x20bd1:0x1e:0x0]]
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(osc_lock.c:832:osc_ldlm_completion_ast()) osc@881ab01f5978:
881d5cb4ebc00x2001001 0x3e482ccc3f43b14c 3 880b6711a5d0
size: 61045997568 mtime: 1485886851 atime: 0 ctime: 1485886851 blocks:
113896795
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(osc_lock.c:832:osc_ldlm_completion_ast()) } lock@8819fd5eca98
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(osc_lock.c:832:osc_ldlm_completion_ast()) dlmlock returned -5
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(ldlm_resource.c:809:ldlm_resource_complain())
hsi003-OST000e-osc-8815c65f4400: namespace resource [0x26442:0x0:0x0].0
(8812426c35c0) refcount nonzero (1) after lock cleanup; forcing cleanup.
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(ldlm_resource.c:1448:ldlm_resource_dump()) --- Resource:
[0x26442:0x0:0x0].0 (8812426c35c0) refcount = 2
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(ldlm_resource.c:1451:ldlm_resource_dump()) Granted locks (in
reverse order):
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(ldlm_resource.c:1454:ldlm_resource_dump()) ### ### ns:
hsi003-OST000e-osc-8815c65f4400 lock:
881d5cb4ebc0/0x3e482ccc3f43b14c lrc: 3/0,0 mode: PR/PR res:
[0x26442:0x0:0x0].0 rrc: 2 type: EXT [0->18446744073709551615] (req
0->18446744073709551615) flags: 0x1264 nid: local remote:
0xa7242e5b8ddab5d0 expref: -99 pid: 232385 timeout: 0 lvb_type: 1
Feb  2 07:21:28 hcrddm005 kernel: LustreError:
78611:0:(import.c:1296:ptlrpc_invalidate_import_thread()) dump the log upon
eviction
Feb  2 07:21:28 hcrddm005 kernel: LustreError: dumping log to
*/tmp/lustre-log.1486041688.78611*
Feb  2 07:21:28 hcrddm005 kernel: Lustre:
hsi003-OST000e-osc-8815c65f4400: Connection restored to hsi003-OST000e
(at 10.10.150.21@o2ib)
Feb  2 07:21:34 hcrddm005 kernel: LustreError:
78524:0:(cl_lock.c:1422:cl_unuse_try()) result = -5, this is unlikely!
Feb  2 07:21:34 hcrddm005 kernel: LustreError:
78524:0:(cl_lock.c:1437:cl_unuse_locked()) lock@8817a078[2 0 0 1 0
] R(1):[0, 18446744073709551615]@[0x20bd1:0x1e:0x0] {
Feb  2 07:21:34 hcrddm005 kernel: LustreError:
78524:0:(cl_lock.c:1437:cl_unuse_locked()) vvp@881c7a3823f8:
Feb  2 07:21:34 hcrddm005 kernel: LustreError:
78524:0:(cl_lock.c:1437:cl_unuse_locked()) lov@881fb55c26e0: 1
Feb  2 07:21:34 hcrddm005 kernel: LustreError:
78524:0:(cl_lock.c:1437:cl_unuse_locked()) 0 0: ---
Feb  2 07:21:34 hcrddm005 kernel: LustreError:
78524:0:(cl_lock.c:1437:cl_unuse_locked())
Feb  2 07:21:34 hcrddm005 kernel: LustreError:
78524:0:(cl_lock.c:1437:cl_unuse_locked()) } lock@8817a078
Feb  2 07:21:34 hcrddm005 kernel: LustreError:
78524:0:(cl_lock.c:1437:cl_unuse_locked()) unuse return -5

[client]$ dd if=raj_14 of=/dev/null bs=1M
*dd: reading `raj_14': Input/output error*
8307+0 records in
8307+0 records out
8710520832 bytes (8.7 GB) copied, 72.793 s, 120 MB/s
[client]$

/tmp/lustre-log.1486041625.111200 says:
0100:0008:12.0:1486041625.090841:0:111200:0:(recover.c:249:ptlrpc_request_handle_notconn())
import hsi003-OST000e-osc-8815c65f4400 of
hsi003-OST000e_UUID@10.10.150.21@o2ib abruptly disconnected: reconnecting
0100:02000400:12.0:1486041625.090846:0:111200:0:(import.c:167:ptlrpc_set_import_discon())
hsi003-OST000e-osc-8815c65f4400: Connection 

Re: [Lustre-discuss] Fwd: Reg /// OSS rebooted automatically

2010-12-20 Thread Daniel Raj
Hi Jeff,


Thanks for your reply

*Storage information *:


DL380G5   == OSS + 16GB Ram
OS== SFS G3.2-2 + centos 5.3 + lustre 1.8.3
MSA60 box   == OST
RAID 6


Regards,

Daniel A

On Tue, Dec 21, 2010 at 11:45 AM, Jeff Johnson 
jeff.john...@aeoncomputing.com wrote:

 Daniel,

 It looks like your OST backend storage device may be having an issue. I
 would check the health and stability of the backend storage device or raid
 you are using for an OST device. It wouldn't likely cause a system reboot of
 your OSS system. There may be more problems, hardware and/or OS related that
 are causing the system to reboot in addition to Lustre complaining that it
 can't find the OST storage device.

 Others here on the list will likely give you a more detailed answer. The
 storage device is the place i would look first.

 --Jeff

 --
 --
 Jeff Johnson
 Manager
 Aeon Computing

 jeff.john...@aeoncomputing.com
 www.aeoncomputing.com
 t: 858-412-3810 x101   f: 858-412-3845
 m: 619-204-9061

 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117


 On Mon, Dec 20, 2010 at 9:43 PM, Daniel Raj danielraj2...@gmail.com wrote:




 Hi Genius,


 Good Day  !!


 I am Daniel. My OSS is getting automatically rebooted again and again.
 Kindly help me.

 It's showing the below error messages


  *kernel: LustreError: 23351:0:(ldlm_lib.c:1892:target_send_reply_msg())
 @@@ processing error (-19)  r...@810400e24400 x1353488904620274/t0
 o8-?@?:0/0 lens 368/0 e 0 to 0 dl 1292738958 ref 1 fl Interpret:/0/0 rc
 -19/0
 kernel: LustreError: 137-5: UUID 'south-ost7_UUID' is not available  for
 connect (no target)
 kernel: LustreError: 23284:0:(ldlm_lib.c:1892:target_send_reply_msg()) @@@
 processing error (-19)  r...@8101124c7c00 x1353488904620359/t0
 o8-?@?:0/0 lens 368/0 e 0 to 0 dl 1292739025 ref 1 fl Interpret:/0/0 rc
 -19/0
 *

 Regards,

 Daniel A



___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss