Re: [Lustre-discuss] client modules not loading during boot
The mount command will automatically load the modules on the client.
cliffw

On 09/03/2010 11:56 AM, Ronald K Long wrote:
> We have installed the Lustre 1.8.2 and 1.8.4 clients on Red Hat 5. The
> Lustre modules are not loading during boot. In order to get the Lustre
> file system to mount, we have to add
>
>     modprobe lustre
>     mount /lustre
>
> to our /etc/rc.local file. Here is a list of the installed RPMs:
>
>     kernel-2.6.18-164.11.1.el5_lustre.1.8.2
>     lustre-modules-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
>     lustre-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
>     lustre-ldiskfs-3.0.9-2.6.18_164.11.1.el5_lustre.1.8.2
>     kernel-devel-2.6.18-164.11.1.el5_lustre.1.8.2
>
> Is there a way to take care of this, or is the way we are handling it
> the way to go?
>
> Thank you
> Rocky
Re: [Lustre-discuss] MDT backup (using tar) taking very long
Hi Bernd,

Frederik Ferner wrote:
> Bernd Schubert wrote:
>> On Thursday, September 02, 2010, Frederik Ferner wrote:
>>> we are currently reviewing our backup policy for our Lustre file
>>> system as backups of the MDT are taking longer and longer.
>>
>> Yes, that is due to the size-on-mds feature, which was introduced in
>> 1.6.7.2. See bug https://bugzilla.lustre.org/show_bug.cgi?id=21376
>> It has a patch, which was also accepted into upstream tar last week.
>> You may find updated RHEL5 tar packages on my home page:
>
> Thanks, I'll give that a go.

I can now report that the backup was faster using this new version of
tar. It is still not really fast, though: the backup still takes nearly
24h (that is, running getfattr followed by tar...). (Any chance of
adding the SRPM to your download page?) Thanks for the tar file.

I seem to remember that the separate getfattr call during the backup is
no longer required with the new version of tar. A quick look at the
code seems to confirm this; at least it appears to be able to store the
trusted extended attributes. Can you confirm whether 'tar --xattr --acl'
on the ldiskfs-mounted MDT is sufficient to store all required extended
attributes (and ACLs)?

Thanks,
Frederik
--
Frederik Ferner
Computer Systems Administrator    phone: +44 1235 77 8624
Diamond Light Source Ltd.         mob:   +44 7917 08 5110
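[Editorial note: for anyone following along, a minimal sketch of the two
backup variants being compared; the device, mount point, and archive
paths are placeholders, and the --xattr/--acl spellings follow the
patched tar quoted above (they may differ between tar builds):

    # classic procedure: dump the EAs separately, then tar the ldiskfs tree
    mount -t ldiskfs -o ro /dev/mdtdev /mnt/mdt
    cd /mnt/mdt
    getfattr -R -d -m '.*' -e hex -P . > /backup/ea.bak
    tar czf /backup/mdt.tgz --sparse .

    # with the patched tar, the separate getfattr pass should be unnecessary
    tar czf /backup/mdt.tgz --xattr --acl --sparse .
]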
Re: [Lustre-discuss] Virtual machines
On Wed, 2010-09-08 at 05:50 -0500, Brian O'Connor wrote:
> Does lustre work in a VM?

Yes, of course, given that a VM provides an entire virtual computer.

> what about in a VM over Infiniband?

I don't know of any VMs which expose the host's Infiniband hardware for
the VM to use directly. Xen might. libvirt/kvm might. But those are
just WAGs.

b.
Re: [Lustre-discuss] Virtual machines
For training purposes we use VirtualBox. We have a patchless client in
production on a Xen virtual machine with 10GbE, without problems.

Bye

On 09/08/2010 02:09 PM, Brian J. Murrell wrote:
> On Wed, 2010-09-08 at 05:50 -0500, Brian O'Connor wrote:
>> Does lustre work in a VM?
>
> Yes, of course, given that a VM provides an entire virtual computer.
>
>> what about in a VM over Infiniband?
>
> I don't know of any VMs which expose the host's Infiniband hardware
> for the VM to use directly. Xen might. libvirt/kvm might. But those
> are just WAGs.
>
> b.

--
_Gabriele Paciucci_ http://www.linkedin.com/in/paciucci
Re: [Lustre-discuss] Virtual machines
On 9/8/10 8:09 AM, Brian J. Murrell wrote:
> On Wed, 2010-09-08 at 05:50 -0500, Brian O'Connor wrote:
>> what about in a VM over Infiniband?
>
> I don't know of any VMs which expose the host's Infiniband hardware
> for the VM to use directly. Xen might. libvirt/kvm might. But those
> are just WAGs.

AFAIK, Xen can only expose IB HCAs to guests via PCI passthrough:
http://wiki.xensource.com/xenwiki/XenPCIpassthrough

Mellanox has a modified OFED 1.3.1 for VMware that provides Virtual-IQ;
I've never tried it myself, though:
http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=36&menu_section=34

You can bridge IPoIB interfaces into guests just like any other network
device (and thus use socklnd). You can export SRP block devices to
guests just like any other block device (e.g. for an OST/MDT).

Ideally, someone would implement SR-IOV support in OFED:
http://www.pcisig.com/specifications/iov/
That already works nicely for, e.g., igb. There was a lot of talk about
IB virtualization a few years ago; unfortunately there doesn't appear
to be much recent work in this area that I can find.

- Dardo
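[Editorial note: a hedged illustration of the PCI passthrough route; the
PCI address is a placeholder and the syntax is for the classic
xm/pciback toolchain of that era, so treat it as a sketch only:

    # dom0 kernel command line: hide the HCA from dom0 so pciback claims it
    pciback.hide=(0000:0b:00.0)

    # guest (domU) config file: hand the hidden HCA to the guest
    pci = [ '0000:0b:00.0' ]
]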
Re: [Lustre-discuss] Virtual machines
I seem to recall Mellanox presenting a paper on IB support for virtual
machines at SC two years ago. I think it was just a proof of concept,
and I'm unaware of the current status.

Kevin

On Sep 8, 2010, at 6:09 AM, Brian J. Murrell brian.murr...@oracle.com wrote:
> On Wed, 2010-09-08 at 05:50 -0500, Brian O'Connor wrote:
>> Does lustre work in a VM?
>
> Yes, of course, given that a VM provides an entire virtual computer.
>
>> what about in a VM over Infiniband?
>
> I don't know of any VMs which expose the host's Infiniband hardware
> for the VM to use directly. Xen might. libvirt/kvm might. But those
> are just WAGs.
>
> b.
Re: [Lustre-discuss] client modules not loading during boot
Try adding _netdev as a mount option.

bob

Cliff White wrote:
> The mount command will automatically load the modules on the client.
> cliffw
>
> On 09/03/2010 11:56 AM, Ronald K Long wrote:
>> We have installed the Lustre 1.8.2 and 1.8.4 clients on Red Hat 5.
>> The Lustre modules are not loading during boot. In order to get the
>> Lustre file system to mount, we have to add
>>
>>     modprobe lustre
>>     mount /lustre
>>
>> to our /etc/rc.local file. Here is a list of the installed RPMs:
>>
>>     kernel-2.6.18-164.11.1.el5_lustre.1.8.2
>>     lustre-modules-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
>>     lustre-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
>>     lustre-ldiskfs-3.0.9-2.6.18_164.11.1.el5_lustre.1.8.2
>>     kernel-devel-2.6.18-164.11.1.el5_lustre.1.8.2
>>
>> Is there a way to take care of this, or is the way we are handling
>> it the way to go?
>>
>> Thank you
>> Rocky
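[Editorial note: a hedged example of the corresponding /etc/fstab entry;
the MGS NID, fsname, and mount point are placeholders. With _netdev the
mount is deferred until networking is up (the netfs init script handles
it on RHEL 5), and mount.lustre pulls in the Lustre modules itself:

    # device           mount point   type     options   dump fsck
    mgs@tcp0:/lfs01    /lustre       lustre   _netdev   0    0
]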
Re: [Lustre-discuss] Virtual machines
On 08.09.10 14:09, Brian J. Murrell wrote:
> On Wed, 2010-09-08 at 05:50 -0500, Brian O'Connor wrote:
>> what about in a VM over Infiniband?
>
> I don't know of any VMs which expose the host's Infiniband hardware
> for the VM to use directly. Xen might. libvirt/kvm might. But those
> are just WAGs.
>
> b.

At the last International Supercomputing Conference, Mellanox claimed
to have a solution ready to support Infiniband in virtual machines with
nearly no performance impact. It is based on SR-IOV technology. Some
information can be found in their blog at:
http://www.mellanox.com/blog/2009/04/io-virtualization/

If someone already uses this technology, it would be nice to get some
feedback on how well it is working.

Cheers,
Florian Feldhaus
--
Florian Feldhaus          | voice:  +49-231-755 5324
ITMC/LS Service Computing | fax:    +49-231-755 2731
TU Dortmund               | office: GB 5, 347
D-44227 Dortmund          | email:  florian.feldh...@tu-dortmund.de
Germany                   | http://www.itmc.tu-dortmund.de/
Re: [Lustre-discuss] Virtual machines
Mellanox sold us ConnectX-2 IB cards earlier this year which claim not
only to work in a single VM, passed through via PCI passthrough, but
also to be shareable between multiple VMs. VMware support was supposed
to be there at the time, with KVM and Xen to follow.

We are already running Lustre under KVM on our cloud cluster, but we
have not yet had time to try the Infiniband feature to see if we can
see the Infiniband cards in the VM. The vanilla Xen kernel on the bare
metal does recognize the Infiniband card, so that is a good sign.

(PS--the Lustre-under-KVM is a proof of concept only at this stage;
obviously you do not get the kind of throughput that you get on bare
metal.)

Steve

On Wed, 8 Sep 2010, Kevin Van Maren wrote:
> I seem to recall Mellanox presenting a paper on IB support for
> virtual machines at SC two years ago. I think it was just a proof of
> concept, and I'm unaware of the current status.
>
> Kevin
>
> On Sep 8, 2010, at 6:09 AM, Brian J. Murrell brian.murr...@oracle.com wrote:
>> On Wed, 2010-09-08 at 05:50 -0500, Brian O'Connor wrote:
>>> Does lustre work in a VM?
>>
>> Yes, of course, given that a VM provides an entire virtual computer.
>>
>>> what about in a VM over Infiniband?
>>
>> I don't know of any VMs which expose the host's Infiniband hardware
>> for the VM to use directly. Xen might. libvirt/kvm might. But those
>> are just WAGs.
>>
>> b.

--
Steven C. Timm, Ph.D  (630) 840-8525
t...@fnal.gov  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.
[Lustre-discuss] Oss Error and 0 byte files
Hello everyone,

I have an installation with Lustre 1.8.2, CentOS 5, x86_64, and I have
encountered this problem: after several months of smooth operation,
clients began to write empty files without any log error; from their
point of view the writes were successful. The OSSes wrote several lines
like this in their logs:

Sep 8 12:40:31 tgoss-0200 kernel: LustreError: 5816:0:(filter_io.c:183:filter_grant_space_left()) lfs01-OST: cli 20d94382-3300-f12e-65d1-c0f1743e1e20/8106a4e30a00 grant 39956230144 available 39956226048 and pending 0

I checked the availability of space and inodes, but that is not the
problem. The problem goes away after rebooting the OST. This is the
second time it has happened: first in July 2010, then in September
2010. Any ideas? Is it a bug?

Thanks
--
Gianluca Tresoldi
***SysAdmin***
***Demon's Trainer***
Tuttogratis Italia Spa
E-mail: gianluca.treso...@tuttogratis.com
http://www.tuttogratis.it
Tel Centralino 02-57313101
Tel Diretto 02-57313136
Be open...
[Lustre-discuss] Lustre requirements and tuning tricks
Hello all!

We are planning an upgrade to our current storage infrastructure, and
we intend to deploy Lustre to serve some 150 clients. We intend to use
5 OSSes with the following configuration:

- 2 x Intel 5520 (quad core) processors (or equivalent).
- 24GB RAM.
- 20 x 2TB SAS2 (7,200 rpm) disks.
- 1 x 16-port and 1 x 8-port Adaptec RAID controllers, with 20 SAS and
  4 SSD drives.
- 2 x 10Gb Ethernet ports.

And then 2 MDSes like these:

- 2 x Intel 5520 (quad core) processors (or equivalent).
- 36GB RAM.
- 2 x 64GB SSD disks.
- 2 x 10Gb Ethernet ports.

Having read the documentation, this seems to be a sensible
configuration, especially regarding the OSSes. However, we are not so
sure about the MDS. We have seen recommendations to reserve 5% of the
total file system space on the MDS. Is this true, and should we then go
for 2 x 2TB SAS disks for the MDS instead? Is SSD really worth it
there?

We have also read about having separate storage for the OSTs' journals.
Is it really useful to get a pair of extra small (16GB) SSD disks for
each OST to keep the journals and bitmaps?

Finally, we have also read that it is important to have different OSTs
on different physical drives to avoid bottlenecks. Is that so even if
we make one big RAID volume and then several logical volumes (done with
the hardware RAID card, so the operating system would just see
different block devices)?

Thank you very much in advance,

Joan
--
Joan Josep Piles Contreras - Analista de sistemas
I3A - Instituto de Investigación en Ingeniería de Aragón
Tel: 976 76 10 00 (ext. 5454)
http://i3a.unizar.es -- jpi...@unizar.es
Re: [Lustre-discuss] Oss Error and 0 byte files
Hi Luca,
the error means that the OST has granted more space than is actually
available. You could check cur_grant_bytes in the proc filesystem on
your clients to see how much grant space those clients are holding.

On 09/08/2010 05:02 PM, Gianluca Tresoldi wrote:
> Hello everyone,
>
> I have an installation with Lustre 1.8.2, CentOS 5, x86_64, and I
> have encountered this problem: after several months of smooth
> operation, clients began to write empty files without any log error;
> from their point of view the writes were successful. The OSSes wrote
> several lines like this in their logs:
>
> Sep 8 12:40:31 tgoss-0200 kernel: LustreError: 5816:0:(filter_io.c:183:filter_grant_space_left()) lfs01-OST: cli 20d94382-3300-f12e-65d1-c0f1743e1e20/8106a4e30a00 grant 39956230144 available 39956226048 and pending 0
>
> I checked the availability of space and inodes, but that is not the
> problem. The problem goes away after rebooting the OST. This is the
> second time it has happened: first in July 2010, then in September
> 2010. Any ideas? Is it a bug?
>
> Thanks

--
_Gabriele Paciucci_ http://www.linkedin.com/in/paciucci
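[Editorial note: a hedged sketch of that check; the parameter names
below are the Lustre 1.8 ones, run as root:

    # on each client: how much grant the client holds for every OST
    lctl get_param osc.*.cur_grant_bytes

    # on the OSS: total grant handed out by each OST vs. free space
    lctl get_param obdfilter.*.tot_granted
    lctl get_param obdfilter.*.kbytesavail

Comparing tot_granted against the free space is a quick way to spot the
grant-versus-available mismatch shown in the log line above.]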
Re: [Lustre-discuss] Lustre requirements and tuning tricks
Joan J. Piles wrote:
> And then 2 MDSes like these:
>
> - 2 x Intel 5520 (quad core) processors (or equivalent).
> - 36GB RAM.
> - 2 x 64GB SSD disks.
> - 2 x 10Gb Ethernet ports.

Hmmm...

> Having read the documentation, this seems to be a sensible
> configuration, especially regarding the OSSes. However, we are not so
> sure about the MDS. We have seen recommendations to reserve 5% of the
> total file system space on the MDS. Is this true, and should we then
> go for 2 x 2TB SAS disks for the MDS instead? Is SSD really worth it
> there?

There is a nice formula for approximating your MDS needs on the wiki.
Basically it is something to the effect of

    number-of-inodes-planned * 1kB = storage space required

So, for 10 million inodes, you need ~10 GB of space. I am not sure if
this helps, but you might be able to estimate your likely usage
scenario. Updating MDSes isn't easy (i.e. you have to pre-plan).

> Finally, we have also read that it is important to have different
> OSTs on different physical drives to avoid bottlenecks. Is that so
> even if we make one big RAID volume and then several logical volumes
> (done with the hardware RAID card, so the operating system would just
> see different block devices)?

Yes, though this will be suboptimal in performance. You want traffic to
different LUNs not sharing the same physical disks. Build smaller RAID
containers, and single LUNs atop those.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
Re: [Lustre-discuss] Oss Error and 0 byte files
It might be related to bug 22755, but there the client gets ENOSPC.

On Sep 8, 2010, at 8:02 AM, Gianluca Tresoldi
gianluca.treso...@tuttogratis.com wrote:
> Hello everyone,
>
> I have an installation with Lustre 1.8.2, CentOS 5, x86_64, and I
> have encountered this problem: after several months of smooth
> operation, clients began to write empty files without any log error;
> from their point of view the writes were successful. The OSSes wrote
> several lines like this in their logs:
>
> Sep 8 12:40:31 tgoss-0200 kernel: LustreError: 5816:0:(filter_io.c:183:filter_grant_space_left()) lfs01-OST: cli 20d94382-3300-f12e-65d1-c0f1743e1e20/8106a4e30a00 grant 39956230144 available 39956226048 and pending 0
>
> I checked the availability of space and inodes, but that is not the
> problem. The problem goes away after rebooting the OST. This is the
> second time it has happened: first in July 2010, then in September
> 2010. Any ideas? Is it a bug?
>
> Thanks
> --
> Gianluca Tresoldi
Re: [Lustre-discuss] Lustre requirements and tuning tricks
On Sep 8, 2010, at 8:25 AM, Joe Landman land...@scalableinformatics.com wrote:
> Joan J. Piles wrote:
>> And then 2 MDSes like these:
>>
>> - 2 x Intel 5520 (quad core) processors (or equivalent).
>> - 36GB RAM.
>> - 2 x 64GB SSD disks.
>> - 2 x 10Gb Ethernet ports.
>
> Hmmm...

In general there is not much gain from using SSDs for the MDT, and
depending on the SSD, it could do much _worse_ than spinning rust. Many
SSD controllers degrade horribly under a small-random-write workload
(SSDs are best for sequential writes and random reads). Journals may
receive some benefit, as their sequential write pattern suits SSDs much
better, although SSDs are not normally needed there.

>> Having read the documentation, this seems to be a sensible
>> configuration, especially regarding the OSSes. However, we are not
>> so sure about the MDS. We have seen recommendations to reserve 5% of
>> the total file system space on the MDS. Is this true, and should we
>> then go for 2 x 2TB SAS disks for the MDS instead? Is SSD really
>> worth it there?
>
> There is a nice formula for approximating your MDS needs on the wiki.
> Basically it is something to the effect of
>
>     number-of-inodes-planned * 1kB = storage space required
>
> So, for 10 million inodes, you need ~10 GB of space. I am not sure if
> this helps, but you might be able to estimate your likely usage
> scenario. Updating MDSes isn't easy (i.e. you have to pre-plan).

It is 4KB/inode on the MDT. (It can be set to 2KB if you need 4 billion
files on an 8TB MDT.) My sizing rule of thumb has been roughly one MDT
drive in RAID 10 for each OST, to ensure you scale IOPS.

>> We have also read about having separate storage for the OSTs'
>> journals. Is it really useful to get a pair of extra small (16GB)
>> SSD disks for each OST to keep the journals and bitmaps?

It doesn't have to be SSD, and bitmaps are only applicable to software
RAID. But unless you use asynchronous journals, there is normally a big
win from external journals -- even with HW RAID having non-volatile
storage. The big win is putting journals on RAID 1 rather than RAID 5/6.

>> Finally, we have also read that it is important to have different
>> OSTs on different physical drives to avoid bottlenecks. Is that so
>> even if we make one big RAID volume and then several logical volumes
>> (done with the hardware RAID card, so the operating system would
>> just see different block devices)?
>
> Yes, though this will be suboptimal in performance. You want traffic
> to different LUNs not sharing the same physical disks. Build smaller
> RAID containers, and single LUNs atop those.

You get the best performance with one HW RAID per OST, and that RAID
should be optimized for 1MB IO (i.e., not 6+p) for best performance
without having to muck with a bunch of parameters. If the OSTs are on
the same drives, there will be excessive head contention as the
different OST filesystems seek the same disks, greatly reducing
throughput.
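[Editorial note: to make the 4KB/inode rule concrete, a hedged
back-of-envelope for the hardware described earlier in the thread; the
average file size is an assumption, not a number from the original
posts:

    raw OST space : 5 OSS x 20 drives x 2TB       ~= 200TB
    file count    : 200TB at a 1MB average size   ~= 200 million files
    MDT space     : 200e6 inodes x 4KB            ~= 800GB
    (at 4KB/inode, a 2 x 64GB SSD mirror tops out near 16 million files)

And a hedged sketch of the external-journal setup being discussed,
using the standard mke2fs/mkfs.lustre mechanism; device names and the
MGS NID are placeholders:

    # format a small RAID 1 device as an external journal
    mke2fs -O journal_dev -b 4096 /dev/mapper/raid1_journal
    # then point the new OST at it when formatting
    mkfs.lustre --ost --fsname=lfs01 --mgsnode=mgs@tcp0 \
        --mkfsoptions="-J device=/dev/mapper/raid1_journal" /dev/mapper/ost_raid
]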
Re: [Lustre-discuss] MDT backup (using tar) taking very long
On 2010-09-08, at 3:35, Frederik Ferner frederik.fer...@diamond.ac.uk wrote:
> I can now report that the backup was faster using this new version of
> tar. It is still not really fast, though: the backup still takes
> nearly 24h (that is, running getfattr followed by tar...).

That surprises me a bit, unless you have a huge number of files. You
should be able to do backups at least at the stat rate (5000/sec =
18M/hr is not unreasonable for almost any MDS).

> I seem to remember that the separate getfattr call during the backup
> is no longer required with the new version of tar. A quick look at
> the code seems to confirm this; at least it appears to be able to
> store the trusted extended attributes. Can you confirm whether 'tar
> --xattr --acl' on the ldiskfs-mounted MDT is sufficient to store all
> required extended attributes (and ACLs)?

Yes, that should work. It is always worthwhile to verify that they are
restored correctly, however. You can just start a restore into some
temp ext3 filesystem, interrupt it after some files in /ROOT have been
restored, and use getfattr to verify that the trusted.lov xattrs were
restored.

Cheers, Andreas
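[Editorial note: a hedged sketch of that spot check; the devices,
archive path, and file names are placeholders:

    # restore into a scratch ext3 filesystem
    mkfs.ext3 /dev/scratch
    mount /dev/scratch /mnt/scratch
    cd /mnt/scratch && tar xzpf /backup/mdt.tgz   # Ctrl-C once ROOT/ has files

    # the striping EA should be present again on the restored files
    getfattr -d -m trusted ROOT/some/dir/somefile
]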