[Lustre-discuss] Lustre inode cache tunables
Hi,

I run 'find /lustre' daily on a filesystem with many files. This consumes a lot of memory, and /proc/slabinfo reveals that lustre_inode_cache has ~900 objects. I've seen the system swapping sometimes, causing slow responses and evictions. Are there any tunables for reclaiming pages from the lustre_inode_cache slab?

/Jakob
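(For reference, the generic Linux VM knobs that influence dentry/inode reclaim are sketched below. These are kernel-wide settings rather than Lustre-specific tunables, and how aggressively they shrink the lustre_inode_cache slab depends on the kernel in use; drop_caches requires a 2.6.16 or newer kernel.)

# favour reclaiming dentry/inode caches over page cache (default is 100)
sysctl -w vm.vfs_cache_pressure=200

# one-shot: ask the kernel to drop unreferenced dentries and inodes now
sync
echo 2 > /proc/sys/vm/drop_caches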
Re: [Lustre-discuss] Lustre MDS Errors 1-7 and operation 101
Thank you for this clarification on the operation X message! Running Lustre without the MGS or even the MDT is something I have tested already - involuntarily ;-) But I was confused because in this case there were new mounts coming in all the time, so the MGS was there and answering, and at the same time Lustre talks about an unconnected MGS.

Thomas

Cliff White wrote:

Thomas Roth wrote:
Hi all, on our production cluster we have had, for a surprisingly long time (1 day), only the following two error messages (and no visible problems), although the system is under heavy load right now:

Jan 14 10:44:33 server1 kernel: LustreError: 5118:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ processing error (-107) r...@8107fd6c4c50 x2077599/t0 o101-?@?:0/0 lens 232/0 e 0 to 0 dl 1231927273 ref 1 fl Interpret:/0/0 rc -107/0

and:

Jan 14 10:46:42 server1 kernel: LustreError: 6766:0:(mgs_handler.c:557:mgs_handle()) lustre_mgs: operation 101 on unconnected MGS

error (-107) is /* Transport endpoint is not connected */ - I have seen this before on clients which had lost the connection to the cluster. But this is on the MGS/MDS - one server with one partition for the MGS and one for the MDT.

Remember, this is a distributed client/server system. When any node needs to connect to a service, there will be a client process. So an OSS (which needs to talk to the MDS) will have a metadata client (mdc) running on it.

The second error suggests of course that the MGS is actually not connected - but how can a Lustre system run when its MGS isn't there? Makes no sense, does it?

Ah, that's the beauty of Lustre. The MGS is needed for two things:
- New clients get the mount from the MGS.
- Configuration changes are propagated from the MGS.
So, if you are not actively mounting clients and not changing the configuration, Lustre can in fact run just fine without the MGS. Filesystem users will not even notice it's gone, unless they are attempting a mount. Likewise, the MDS is used for metadata transactions. If a client is not actively touching metadata (for example, a client that already has an open file and is doing IO only), you can fail the MDS without the clients noticing.

Those two errors are quite harmless in this case - 'operation x on unconnected MGS' means a client was evicted, the client is attempting to replay an RPC, but the server has destroyed the import (due to the eviction) and it has not been re-established.
cliffw

O.k., the cluster is running Debian Etch 64bit, Kernel 2.6.22, Lustre 1.6.5.1. The operation 101 thing is supposed to have been solved in the 1.6.4 - 1.6.5 upgrade, according to the change logs. Either it hasn't, or I have a real problem where this error message really applies. It is also remarkable that nobody seems to know the meaning of 'operation X on unconnected MGS' - via Google one will find many questions but no answers, at least that's my impression (and I didn't search Bugzilla).

Many thanks,
Thomas

-- Thomas Roth, Department: Informationstechnologie, Location: SB3 1.262, Phone: +49-6159-71 1453, Fax: +49-6159-71 2986, GSI Helmholtzzentrum für Schwerionenforschung GmbH, Planckstraße 1, D-64291 Darmstadt, www.gsi.de
[Lustre-discuss] o2ib cant ping/mount Infiniband NID
Problem is similar to http://lists.lustre.org/pipermail/lustre-discuss/2008-May/007498.html but by looking at that thread I could not really work out the solution to the problem.

I have two RHEL5 Linux servers installed with the following packages:

kernel-lustre-smp-2.6.18-53.1.14.el5_lustre.1.6.5.1
kernel-ib-1.3-2.6.18_53.1.14.el5_lustre.1.6.5.1smp
lustre-ldiskfs-3.0.4-2.6.18_53.1.14.el5_lustre.1.6.5.1smp
lustre-1.6.5.1-2.6.18_53.1.14.el5_lustre.1.6.5.1smp
lustre-modules-1.6.5.1-2.6.18_53.1.14.el5_lustre.1.6.5.1smp
e2fsprogs-1.40.7.sun3-0redhat

machine 1: with ib0 IP address: 172.24.198.111
machine 2: with ib0 IP address: 172.24.198.112

/etc/modprobe.conf contains:

options lnet networks=o2ib

TCP networking worked fine, and now I am trying the Infiniband network and finding it difficult to communicate with the IB nodes. The mount attempt throws the following error:

[r...@p186 ~]# mount -t lustre -o loop /tmp/lustre-ost1 /mnt/ost1
mount.lustre: mount /dev/loop0 at /mnt/ost1 failed: Input/output error
Is the MGS running?

/var/log/messages:

Jan 15 16:55:25 p186 kernel: kjournald starting. Commit interval 5 seconds
Jan 15 16:55:25 p186 kernel: LDISKFS FS on loop0, internal journal
Jan 15 16:55:25 p186 kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
Jan 15 16:55:25 p186 kernel: kjournald starting. Commit interval 5 seconds
Jan 15 16:55:25 p186 kernel: LDISKFS FS on loop0, internal journal
Jan 15 16:55:25 p186 kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
Jan 15 16:55:25 p186 kernel: LDISKFS-fs: file extents enabled
Jan 15 16:55:25 p186 kernel: LDISKFS-fs: mballoc enabled
Jan 15 16:55:30 p186 kernel: Lustre: Request x7 sent from mgc172.24.198@o2ib to NID 172.24.198@o2ib 5s ago has timed out (limit 5s).
Jan 15 16:55:30 p186 kernel: LustreError: 7193:0:(obd_mount.c:1062:server_start_targets()) Required registration failed for lustre-OST: -5
Jan 15 16:55:30 p186 kernel: LustreError: 15f-b: Communication error with the MGS. Is the MGS running?
Jan 15 16:55:30 p186 kernel: LustreError: 7193:0:(obd_mount.c:1597:server_fill_super()) Unable to start targets: -5
Jan 15 16:55:30 p186 kernel: LustreError: 7193:0:(obd_mount.c:1382:server_put_super()) no obd lustre-OST
Jan 15 16:55:30 p186 kernel: LustreError: 7193:0:(obd_mount.c:119:server_deregister_mount()) lustre-OST not registered
Jan 15 16:55:30 p186 kernel: LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
Jan 15 16:55:30 p186 kernel: LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
Jan 15 16:55:30 p186 kernel: LDISKFS-fs: mballoc: 0 generated and it took 0
Jan 15 16:55:30 p186 kernel: LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
Jan 15 16:55:30 p186 kernel: Lustre: server umount lustre-OST complete
Jan 15 16:55:30 p186 kernel: LustreError: 7193:0:(obd_mount.c:1951:lustre_fill_super()) Unable to mount (-5)

All pinging efforts to the IB NIDs also failed. Local/remote can ping the IP addresses:

[r...@p186 ~]# ping 172.24.198.112
PING 172.24.198.112 (172.24.198.112) 56(84) bytes of data.
64 bytes from 172.24.198.112: icmp_seq=1 ttl=64 time=0.052 ms
64 bytes from 172.24.198.112: icmp_seq=2 ttl=64 time=0.024 ms
--- 172.24.198.112 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.024/0.038/0.052/0.014 ms

[r...@p186 ~]# ping 172.24.198.111
PING 172.24.198.111 (172.24.198.111) 56(84) bytes of data.
64 bytes from 172.24.198.111: icmp_seq=1 ttl=64 time=2.16 ms
64 bytes from 172.24.198.111: icmp_seq=2 ttl=64 time=0.296 ms
--- 172.24.198.111 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.296/1.231/2.166/0.935 ms

But I can't ping the NIDs:

[r...@p186 ~]# lctl ping 172.24.198@o2ib
failed to ping 172.24.198@o2ib: Input/output error
[r...@p186 ~]# lctl ping 172.24.198@o2ib
failed to ping 172.24.198@o2ib: Input/output error

Any idea why LNET can't ping the NIDs?

Some more configuration:

[r...@p186 ~]# ibstat
CA 'mthca0'
CA type: MT23108
Number of ports: 2
Firmware version: 3.5.0
Hardware version: a1
Node GUID: 0x0002c9020021550c

Machines are connected via an IB switch. Looking forward to your help.

~subbu
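(One thing worth double-checking in the configuration above: the modprobe.conf line can name the IPoIB interface explicitly, and LNET needs that interface to already have an IP address when the Lustre modules load. A minimal sketch, using the interface name from this thread:)

# /etc/modprobe.conf - bind the o2ib network to a specific IPoIB interface
options lnet networks=o2ib0(ib0)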
Re: [Lustre-discuss] About MDS failover
Jeffrey Alan Bennett wrote:

Hi, What software are people using for MDS failover? I have been using Heartbeat from Linux-HA but I am not absolutely happy with its performance. Is there anything better out there?

Are you using heartbeat V1 or V2? I would like to hear more about the issues you are experiencing. We have had some people use the Red Hat cluster tools.
cliffw

Thanks, Jeffrey Bennett, HPC Data Engineer, San Diego Supercomputer Center, 858.822.0936, http://users.sdsc.edu/~jab
Re: [Lustre-discuss] o2ib cant ping/mount Infiniband NID
Subbu, I'd suggest:

1) make sure ko2iblnd has been brought up (please check whether there is any error message when starting up ko2iblnd)
2) echo +neterror > /proc/sys/lnet/printk, then try lctl ping again; if it still doesn't work, please post the error messages

Regards
Liang
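(A quick way to confirm that the o2ib LND actually came up on each node is sketched below; if ko2iblnd failed to load, lctl ping will always fail regardless of the fabric state. The /proc paths are the 1.6.x-era interface.)

# did the o2ib module load, and did LNET bring up an o2ib NID?
lsmod | grep ko2iblnd
lctl list_nids
cat /proc/sys/lnet/nis
dmesg | grep -i o2ib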
Re: [Lustre-discuss] Lustre MDS Errors 1-7 and operation 101
On Jan 14, 2009 11:34 +0100, Thomas Roth wrote:

Jan 14 10:44:33 server1 kernel: LustreError: 5118:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ processing error (-107) r...@8107fd6c4c50 x2077599/t0 o101-?@?:0/0 lens 232/0 e 0 to 0 dl 1231927273 ref 1 fl Interpret:/0/0 rc -107/0

and:

Jan 14 10:46:42 server1 kernel: LustreError: 6766:0:(mgs_handler.c:557:mgs_handle()) lustre_mgs: operation 101 on unconnected MGS

error (-107) is /* Transport endpoint is not connected */ - I have seen this before on clients which had lost the connection to the cluster. But this is on the MGS/MDS - one server with one partition for the MGS and one for the MDT. The second error suggests of course that the MGS is actually not connected - but how can a Lustre system run when its MGS isn't there? Makes no sense, does it?

It means some client is trying to perform operations on the MGS before it is connected.

O.k., the cluster is running Debian Etch 64bit, Kernel 2.6.22, Lustre 1.6.5.1. The operation 101 thing is supposed to have been solved in the 1.6.4 - 1.6.5 upgrade, according to the change logs.

There are a million things that might cause operation 101 problems. 101 = LDLM_ENQUEUE, so this is just a lock enqueue.

Cheers, Andreas
-- Andreas Dilger, Sr. Staff Engineer, Lustre Group, Sun Microsystems of Canada, Inc.
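(If you want to decode other opcode numbers from these messages yourself, the RPC opcodes are defined in the Lustre headers; a minimal sketch follows. The header path shown is from a 1.6-era source tree and may differ between releases.)

# look up an RPC opcode name in the Lustre source (path may vary by version)
grep -n "LDLM_ENQUEUE" lustre/include/lustre/lustre_idl.h
# or search the whole tree if the header has moved
grep -rn "LDLM_ENQUEUE" lustre/include/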
Re: [Lustre-discuss] Lustre MDS Errors 1-7 and operation 101
Thanks, Andreas.

Andreas Dilger wrote:

error (-107) is /* Transport endpoint is not connected */ - I have seen this before on clients which had lost the connection to the cluster. But this is on the MGS/MDS - one server with one partition for the MGS and one for the MDT. The second error suggests of course that the MGS is actually not connected - but how can a Lustre system run when its MGS isn't there? Makes no sense, does it?

It means some client is trying to perform operations on the MGS before it is connected.

Who? Before the client is connected, or before the MGS is connected? Of course the client can't do anything before it is connected. But the MGS is connected in the sense that it is mounted and responsive - I can do a fresh client mount of this system any time. Maybe that's more semantics than Lustre. In any case, I am reassured by your comments, in particular since the cluster is doing fine in this situation.

Regards,
Thomas

-- Thomas Roth, Department: Informationstechnologie, Location: SB3 1.262, Phone: +49-6159-71 1453, Fax: +49-6159-71 2986, GSI Helmholtzzentrum für Schwerionenforschung GmbH, Planckstraße 1, D-64291 Darmstadt, www.gsi.de
Re: [Lustre-discuss] About MDS failover
Hi Cliff,

Are you using heartbeat V1 or V2?

I am using heartbeat V2. It works as expected, I just had to tune some timeouts, but it still takes around 3 minutes to completely move the MGS/MDS services to the other system. I guess having the MGS and MDS on separate systems would help reduce this time. MMP also adds somewhat to this time, but MMP is necessary for failover.

My biggest concern is that I can't control the situation in which the HBA connectivity with the storage system is damaged, i.e. I pull the cables from the HBAs on the MGS/MDS and nothing happens: the MDS and MGS services keep running, they are still mounted, and therefore heartbeat does nothing. From the heartbeat documentation it does not seem that this can be done, at least not easily. I read something about HBA ping, and it seems it requires HBAAPI, which does not work with Brocade HBAs... Any help will be greatly appreciated.

I would like to hear more about the issues you are experiencing. We have had some people use the Red Hat cluster tools.

I will try the Red Hat cluster tools.

Thanks, Jeff
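(For what it's worth, you can check whether MMP is actually enabled on the MDT device, and what its update interval is, with the Sun-patched e2fsprogs mentioned elsewhere in this digest; a sketch follows. The device path is illustrative, and the exact field names may differ between e2fsprogs releases.)

# check whether the multi-mount protection feature is on and its update interval
dumpe2fs -h /dev/mdt_device 2>/dev/null | grep -i mmp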
Re: [Lustre-discuss] Query
Hi Ravi,

On Tue, 2009-01-13 at 11:59 +0530, Ravi Rattihalli wrote:

Here are my questions:

1. Integrating ZFS with Lustre: Does this mean that some features of ZFS are going to be integrated with Lustre? (like optimized checksums of ZFS etc.)

Yes, we plan to integrate some ZFS features with Lustre. One such feature is checksumming, as you mentioned. We are planning to make the Lustre clients compute and provide the checksums over the wire to the servers, and use them as the block checksums in ZFS. That achieves two goals: 1) offload checksum computation to the clients, which in total have more CPU available than the servers, and 2) achieve true end-to-end data integrity. There may also be some integration in terms of quotas, or some other features. We will also be developing some features in ZFS, either to achieve better performance in some cases (e.g., a zero-copy API) or to achieve new functionality (e.g., multi-mount protection, but this is not our highest priority right now).

2. Which version of Lustre will have end-to-end data integrity, and which checksum algorithm will be used (if not CRC32)? (I read in one document written by Peter Bojanic which said Lustre+ZFS = End-to-End Data Integrity.) So is it ver. 3.0 and above?

I believe so.

3. I read in Wikipedia under ZFS integration: "Lustre 3.0 will allow users to choose between ZFS and ldiskfs as back-end storage". Why are ZFS and ldiskfs treated separately here even after integration? Once integrated it is just Lustre 3.0, isn't it?

Yes, but I'm not sure what your confusion is here. With Lustre 3.0, you will be free to choose whether you wish to create ldiskfs- or ZFS-formatted backend devices - both options should be available.

I would be glad to hear the answers from you which may solve my queries and confusion :)

I hope my answers clarify things a bit.

Cheers, Ricardo
Re: [Lustre-discuss] About MDS failover
On Jan 15, 2009 11:38 -0800, Jeffrey Alan Bennett wrote:

I am using heartbeat V2. It works as expected, I just had to tune some timeouts, but it still takes around 3 minutes to totally move the MGS/MDS services to the other system.

This is largely an issue of the Lustre failover itself, and not the HA software. The problem today is that under heavy load the clients may have to wait a long time for any requests sent to the server to complete (100s of seconds in some cases), so it is difficult for the clients to distinguish between server death (unlikely) and heavy server load (common).

In the case where a server dies and fails over, the clients have to wait for their requests to time out, then they resend and wait again (in the common case the server is just overloaded), and then finally they try to contact any other server listed as failover for that node.

What we are looking to do to improve failover speed is to have the backup server broadcast to the clients that it has taken over the OST/MDT when it has started. Then the clients will be able to fail over to the new server as soon as it is ready, instead of waiting for the original requests to time out.

My biggest concern is that I can't control the situation in which the HBA connectivity with the storage system is damaged, i.e. I pull the cables from the HBAs on the MGS/MDS and nothing happens: the MDS and MGS services keep running, they are still mounted, and therefore heartbeat does nothing. From the heartbeat documentation it does not seem that this can be done, at least not easily. I read something about HBA ping and it seems it requires HBAAPI, which does not work with Brocade HBAs...

You can use HBA multi-pathing to avoid this problem, if your hardware supports it. You can also use /proc/fs/lustre/health_check to check if the filesystems have encountered errors and are marked unhealthy.

Cheers, Andreas
-- Andreas Dilger, Sr. Staff Engineer, Lustre Group, Sun Microsystems of Canada, Inc.
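(A minimal sketch of how an HA resource agent could poll that health file is shown below; the exact output format of /proc/fs/lustre/health_check can vary between Lustre releases, so treat the string comparison as an assumption to verify on your version.)

#!/bin/sh
# minimal health probe for heartbeat/RHCS - illustrative only
status=$(cat /proc/fs/lustre/health_check 2>/dev/null)
if [ "$status" != "healthy" ]; then
    logger -t lustre-monitor "server reports: ${status:-health_check unreadable}"
    exit 1   # non-zero tells the cluster manager to fail the resource over
fi
exit 0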
[Lustre-discuss] Log creation/deletion of files?
Is there a way to enable logging of UID and host for creation/deletion of files/directories within the cluster?

-- Andrew
Re: [Lustre-discuss] About MDS failover
Thanks Andreas,

I understand that this is a common issue with failover, as mentioned in the Lustre documentation.

You can use HBA multi-pathing to avoid this problem, if your hardware supports it. You can also use /proc/fs/lustre/health_check to check if the filesystems have encountered errors and are marked unhealthy.

We use multipath in all our configurations. However, will Lustre be able to detect when connectivity to the storage has been lost entirely (i.e. no available path) and report that accordingly in /proc/fs/lustre/health_check?

Thanks, Jeff
Re: [Lustre-discuss] Log creation/deletion of files?
On Thu, 2009-01-15 at 18:41 -0700, Lundgren, Andrew wrote:

Is there a way to enable logging of UID and host for creation/deletion of files/directories within the cluster?

The feature you are looking for is called audit logs. It used to exist on a code branch for the Hendrix project, but I don't see it on our current roadmap. Likely, given the age of the Hendrix code, it would take some significant work to port it forward to current Lustre.

That said, we do have a server changelogs feature coming in 2.0, and while that will likely log the filesystem changes, I'm not sure whether it will log the uid responsible for the change.

b.
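(For reference, once the server changelogs feature did ship in the 2.x releases, consuming it looks roughly like the sketch below; the MDT name is illustrative, and this interface does not exist in the 1.6.x versions discussed in this thread.)

# register a changelog consumer on the MDS, then read records from a client
lctl --device lustre-MDT0000 changelog_register
lfs changelog lustre-MDT0000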
Re: [Lustre-discuss] Log creation/deletion of files?
Thank you.
Re: [Lustre-discuss] o2ib cant ping/mount Infiniband NID
Liang, after executing the following echo:

echo +neterror > /proc/sys/lnet/printk

lctl ping now shows the following error:

# lctl ping 172.24.198@o2ib
failed to ping 172.24.198@o2ib: Input/output error

Jan 16 10:24:14 p128 kernel: Lustre: 2750:0:(o2iblnd_cb.c:2687:kiblnd_cm_callback()) 172.24.198@o2ib: ROUTE ERROR -22
Jan 16 10:24:14 p128 kernel: Lustre: 2750:0:(o2iblnd_cb.c:2101:kiblnd_peer_connect_failed()) Deleting messages for 172.24.198@o2ib: connection failed

Looks like some problem with the IB connection manager!

1. Do we have any help docs to set up IPoIB and Lustre? The Lustre operations manual has very minimal info about this. I think I am missing some IPoIB setup part here.
2. Or is the manual assignment of IP addresses to ib0 creating some problem?

Some more supporting info: a subnet manager is also running, version OpenSM 3.1.8.

Initially I got this error for the MDS mount:

Jan 16 09:45:20 p128 kernel: LustreError: 4991:0:(linux-tcpip.c:124:libcfs_ipif_query()) Can't get IP address for interface ib0
Jan 16 09:45:20 p128 kernel: LustreError: 4991:0:(o2iblnd.c:1563:kiblnd_startup()) Can't query IPoIB interface ib0: -99
Jan 16 09:45:21 p128 kernel: LustreError: 105-4: Error -100 starting up LNI o2ib
Jan 16 09:45:21 p128 kernel: LustreError: 4991:0:(events.c:707:ptlrpc_init_portals()) network initialisation failed
Jan 16 09:45:21 p128 modprobe: WARNING: Error inserting ptlrpc (/lib/modules/2.6.18-53.1.14.el5_lustre.1.6.5.1smp/kernel/fs/lustre/ptlrpc.ko): Input/output error
Jan 16 09:45:21 p128 modprobe: WARNING: Error inserting osc (/lib/modules/2.6.18-53.1.14.el5_lustre.1.6.5.1smp/kernel/fs/lustre/osc.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Jan 16 09:45:21 p128 kernel: osc: Unknown symbol ldlm_prep_enqueue_req
Jan 16 09:45:21 p128 kernel: osc: Unknown symbol ldlm_resource_get
Jan 16 09:45:21 p128 kernel: osc: Unknown symbol ptlrpc_lprocfs_register_obd
...

Then I manually set the IP address for ib0 as follows:

ifconfig ib0 172.24.198.111

[r...@p186 ~]# ifconfig ib0
ib0 Link encap:InfiniBand HWaddr 80:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:172.24.198.112 Bcast:172.24.255.255 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:65520 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

Then it mounted successfully:

Jan 16 09:47:09 p128 kernel: Lustre: Added LNI 172.24.198@o2ib [8/64]
Jan 16 09:47:09 p128 kernel: Lustre: MGS MGS started
Jan 16 09:47:09 p128 kernel: Lustre: Setting parameter lustre-MDT.mdt.group_upcall in log lustre-MDT
Jan 16 09:47:09 p128 kernel: Lustre: Enabling user_xattr
Jan 16 09:47:09 p128 kernel: Lustre: lustre-MDT: new disk, initializing
Jan 16 09:47:09 p128 kernel: Lustre: MDT lustre-MDT now serving dev (lustre-MDT/64db1fc7-03ba-9803-4d20-ab0d2aa66116) with recovery enabled
Jan 16 09:47:09 p128 kernel: Lustre: 5274:0:(lproc_mds.c:262:lprocfs_wr_group_upcall()) lustre-MDT: group upcall set to /usr/sbin/l_getgroups
Jan 16 09:47:09 p128 kernel: Lustre: lustre-MDT.mdt: set parameter group_upcall=/usr/sbin/l_getgroups
Jan 16 09:47:09 p128 kernel: Lustre: Server lustre-MDT on device /dev/loop0 has started
...
~subbu
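(The failure mode above - LNET refusing to start because ib0 had no IPoIB address yet - is usually avoided by making the address persistent so the interface is up before the Lustre modules load. A minimal RHEL5-style sketch follows, with the addresses from this thread; adjust per node.)

# /etc/sysconfig/network-scripts/ifcfg-ib0 (one per node; use .111 / .112 as above)
DEVICE=ib0
BOOTPROTO=static
IPADDR=172.24.198.111
NETMASK=255.255.0.0
ONBOOT=yes

# bring it up and confirm LNET sees the o2ib NID
service network restart
lctl list_nids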
[Lustre-discuss] Lustre locking
At our university many of our students and professors use SQLite and Berkeley DB for their projects - probably BDB more than SQLite. Would we need to mount Lustre in a particular way to avoid corruption via file locking? Any thoughts about this? TIA
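(Both SQLite and BDB rely on POSIX fcntl/flock locking. A minimal sketch of the relevant client mount options is below; on 1.6.x clients these semantics are off unless requested, and the option names are worth confirming against your release's mount.lustre man page. The MGS NID and mount point are illustrative.)

# cluster-coherent locking across all clients (some locking overhead):
mount -t lustre mgsnode@tcp0:/lustre /mnt/lustre -o flock
# or node-local locking only - cheaper, but unsafe if several nodes share one DB file:
mount -t lustre mgsnode@tcp0:/lustre /mnt/lustre -o localflock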
Re: [Lustre-discuss] Lustre inode cache tunables
On Jan 15, 2009 11:27 +0100, Jakob Goldbach wrote:

I daily run 'find /lustre' on a filesystem with many files. This consumes a lot of memory and /proc/slabinfo reveals that lustre_inode_cache has ~900 objects. I've seen the system swapping sometimes, causing slow responses and evictions. Any tunables for reclaiming pages from the lustre_inode_cache slab?

This is a problem with the Linux VFS more than Lustre itself. A find even on a local filesystem would generate this many inodes.

Depending on what you are doing with find, you could instead use the lfs find command. This avoids instantiating inodes or requesting any data from the OSTs unless it is absolutely required. In many cases lfs find can do its work with only information from the MDS, and it does not need to instantiate the inode.

Cheers, Andreas
-- Andreas Dilger, Sr. Staff Engineer, Lustre Group, Sun Microsystems of Canada, Inc.
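(A minimal sketch of replacing the nightly scan with lfs find follows; the predicates shown are illustrative, so check 'lfs find --help' on your release for which ones are supported, since 1.6.x supports fewer options than GNU find.)

# MDS-only scan: list regular files not modified in the last 30 days
lfs find /lustre -type f -mtime +30

# restrict the scan to files with objects on a particular OST
lfs find /lustre --obd lustre-OST0000_UUID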