Re: [Lustre-discuss] Compiling lustre with snmp feature

2010-10-20 Thread Alfonso Pardo
Thanks so much, I will update my Lustre to 1.8.4 in a few weeks. I am compiling the Lustre SNMP module for 1.8.0 and 1.8.4, with the same error in both versions. checking whether to try to build SNMP support... yes checking for net-snmp-config... net-snmp-config checking net-snmp/net-snmp-config.h

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
On 2010-10-19, at 17:01, Wojciech Turek wrote: Due to the local disk failure in an OSS one of our /scratch OSTs was formatted by an automatic installation script. This script created 5 small partitions and a 6th partition consisting of the remaining space on that OST. Nothing else was written to

Re: [Lustre-discuss] Compiling lustre with snmp feature

2010-10-20 Thread Alfonso Pardo
OK, the Lustre SNMP module is compiled. I have installed the newest net-snmp (v5.6) and the module compiled perfectly. When I do an snmpwalk, I don't get any information, but my module is compiled and loaded with dlmod. On Wed, 20-10-2010 at 08:58 +0200, Alfonso Pardo wrote: Thanks

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Brian J. Murrell
On Tue, 2010-10-19 at 21:00 -0400, Edward Walter wrote: Ed, That seems to validate how I'm interpreting the parameters. We have 10 data disks and 2 parity disks per array so it looks like we need to be at 64 KB or less. I think you have been missing everyone's point in this thread. The

Re: [Lustre-discuss] ldiskfs performance vs. XFS performance

2010-10-20 Thread Michael Kluge
Thanks a lot for all the replies. sgpdd shows 700+ MB/s for the device. We ran into one or two bugs with obdfilter-survey, as lctl has at least one bug in 1.8.3 when it uses multiple threads, and obdfilter-survey also causes an LBUG when you CTRL+C it. We see 600+ MB/s for obdfilter-survey over

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Edward Walter
Hi Brian, Thanks for the clarification. It didn't click that the optimal data size is exactly 1MB... Everything you're saying makes sense though. Obviously, with 12-disk arrays there's tension between maximizing space and maximizing performance. I was hoping/trying to get the best of

[Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Michael Kluge
Hi list, is it normal that a 'dd' or an 'IOR' pushing 10MB blocks to a Lustre file system shows up with a 100% CPU load within 'top'? The reason I am asking is that I can write from one client to one OST at 500 MB/s. The CPU load will be at 100% in this case. If I stripe over two OSTs

[Lustre-discuss] vanilla kernel with 2.0 version

2010-10-20 Thread jherold
Hello, will the vanilla kernels be supported with Lustre 2.0? Best regards, Jacek

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
Right - you need to recreate the LV exactly as it was before. If you created it all at once on the whole LUN then it is likely to be allocated in a linear way. If there are multiple LVs on the same LUN and they were expanded after use the chance of recovering them is very low. In the e2fsprogs

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Charland, Denis
Brian J. Murrell wrote: On Tue, 2010-10-19 at 21:00 -0400, Edward Walter wrote: This is why the recommendations in this thread have continued to be using a number of data disks that divides evenly into 1MB (i.e. powers of 2: 2, 4, 8, etc.). So for RAID6: 4+2 or 8+2, etc. What about

Re: [Lustre-discuss] vanilla kernel with 2.0 version

2010-10-20 Thread Andreas Dilger
I assume your question is related to the server, since clients generally work with vanilla kernels. We are working on the RHEL6 2.6.32 for Lustre 2.1 (available in bugzilla), and I'd hope that this will also work fairly well with the vanilla 2.6.32 kernel. There are no plans to add support

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Wojciech Turek
Many thanks for the prompt reply. On 20 October 2010 16:32, Andreas Dilger andreas.dil...@oracle.com wrote: Right - you need to recreate the LV exactly as it was before. If you created it all at once on the whole LUN then it is likely to be allocated in a linear way. If there are multiple LVs on

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Andreas Dilger
Is this client CPU or server CPU? If you are using Ethernet it will definitely be CPU hungry and can easily saturate a single core. Cheers, Andreas On 2010-10-20, at 8:41, Michael Kluge michael.kl...@tu-dresden.de wrote: Hi list, is it normal, that a 'dd' or an 'IOR' pushing 10MB blocks

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Edward Walter
Hi Denis, Changing the number of parity disks (RAID5 = 1, RAID6 = 2) doesn't change the math on the data disks and data segment size. You still need a power-of-2 number of data disks to ensure that the product of the RAID chunk size and the number of data disks is 1MB. Aside from that, I
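
The arithmetic discussed in this thread can be sketched as follows. This is only an illustration, assuming "1MB" means 1 MiB (1048576 bytes); the function name `chunk_size_for_full_stripe` and the example array layouts are hypothetical and not taken from the thread.

```python
MIB = 1024 * 1024  # assumed meaning of the "1MB" I/O size discussed in the thread


def chunk_size_for_full_stripe(data_disks, io_size=MIB):
    """Return the RAID chunk size (bytes) for which data_disks * chunk == io_size.

    Returns None when the number of data disks does not divide io_size evenly,
    i.e. when no chunk size gives a full stripe of exactly io_size.
    Parity disks are not an input: they do not change this calculation.
    """
    if data_disks <= 0:
        raise ValueError("data_disks must be positive")
    if io_size % data_disks != 0:
        return None
    return io_size // data_disks


if __name__ == "__main__":
    # (data disks, parity disks) layouts mentioned in the thread: 4+2, 8+2, 10+2
    for data, parity in [(4, 2), (8, 2), (10, 2)]:
        chunk = chunk_size_for_full_stripe(data)
        if chunk is None:
            print(f"{data}+{parity}: no chunk size yields a 1 MiB full stripe")
        else:
            print(f"{data}+{parity}: chunk size {chunk // 1024} KiB")
```

Because 1 MiB is a power of two, only power-of-two data disk counts divide it evenly, which is why a 10+2 array has no exact fit.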

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Bernd Schubert
That is normal and probably comes from the page cache; it should be about the same for lustre, ldiskfs, ext4, xfs, etc. It goes down if you specify -odirect, but that is obviously not optimal on Lustre clients. Cheers, Bernd On Wednesday, October 20, 2010, Andreas Dilger wrote: Is this client

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Bernd Schubert
On Wednesday, October 20, 2010, Charland, Denis wrote: Brian J. Murrell wrote: On Tue, 2010-10-19 at 21:00 -0400, Edward Walter wrote: This is why the recommendations in this thread have continued to be using a number of data disks that divides evenly into 1MB (i.e. powers of 2: 2,

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Michael Kluge
It is the CPU load on the client. The dd/IOR process is using one core completely. The clients and the servers are connected via DDR IB. LNET bandwidth is at 1.8 GB/s. Servers have 1.8.3, the client has 1.8.3 patchless. Micha On 20.10.2010 18:15, Andreas Dilger wrote: Is this client CPU

Re: [Lustre-discuss] ldiskfs performance vs. XFS performance

2010-10-20 Thread Bernd Schubert
For your final filesystem you probably still want to enable async journals (unless you are willing to enable the S2A unmirrored device cache). Most obdecho/obdfilter-survey bugs are gone in 1.8.4, except your ctrl+c problem, for which a patch exists:

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
On 2010-10-20, at 10:15, Wojciech Turek wj...@cam.ac.uk wrote: On 20 October 2010 16:32, Andreas Dilger andreas.dil...@oracle.com wrote: Right - you need to recreate the LV exactly as it was before. If you created it all at once on the whole LUN then it is likely to be allocated in a linear

Re: [Lustre-discuss] ldiskfs performance vs. XFS performance

2010-10-20 Thread Michael Kluge
For your final filesystem you probably still want to enable async journals (unless you are willing to enable the S2A unmirrored device cache). OK, thanks. We'll give this a try. Michael Most obdecho/obdfilter-survey bugs are gone in 1.8.4, except your ctrl+c problem, for which a patch

Re: [Lustre-discuss] mkfs options/tuning for RAID based OSTs

2010-10-20 Thread Wojciech Turek
Hi Edward, As Andreas mentioned earlier, the max OST size is 16TB if one uses ext4-based ldiskfs. So creating a RAID group bigger than that will definitely hurt your performance, because you would have to split the large array into smaller logical disks and that randomises IOs on the raid

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Andreas Dilger
On 2010-10-20, at 10:40, Michael Kluge michael.kl...@tu-dresden.de wrote: It is the CPU load on the client. The dd/IOR process is using one core completely. The clients and the servers are connected via DDR IB. LNET bandwidth is at 1.8 GB/s. Servers have 1.8.3, the client has 1.8.3 patchless.

Re: [Lustre-discuss] Maximum OST Size

2010-10-20 Thread Bernd Schubert
On Wednesday, October 20, 2010, Andreas Dilger wrote: On 2010-10-19, at 08:27, Roger Spellman wrote: I don't understand this comment: For the MDT, yes, you could potentially use -i 1500 as about the minimum space per inode, but then you risk running out of space in the filesystem before
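
As a rough illustration of what the `-i` (bytes-per-inode) ratio means, the sketch below estimates how many inodes a device of a given size would get. It is an approximation only: the real mke2fs rounds per block group and reserves space for metadata. The function name `approx_inode_count` and the 1 TiB example size are hypothetical and not taken from the thread.

```python
def approx_inode_count(device_bytes, bytes_per_inode):
    """Approximate inode count for a bytes-per-inode ratio (the mke2fs -i value).

    Ignores block-group rounding and metadata overhead, so treat the result
    as an order-of-magnitude estimate rather than what mke2fs would report.
    """
    if device_bytes < 0 or bytes_per_inode <= 0:
        raise ValueError("device_bytes must be >= 0 and bytes_per_inode > 0")
    return device_bytes // bytes_per_inode


if __name__ == "__main__":
    mdt_bytes = 2**40  # hypothetical 1 TiB MDT, chosen only for illustration
    for ratio in (1500, 4096):
        inodes = approx_inode_count(mdt_bytes, ratio)
        print(f"-i {ratio}: roughly {inodes:,} inodes")
```

A smaller ratio yields more inodes but leaves less space per inode for everything else the filesystem has to store, which is the trade-off the quoted comment is pointing at.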

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Bernd Schubert
On Wednesday, October 20, 2010, Andreas Dilger wrote: On 2010-10-20, at 10:40, Michael Kluge michael.kl...@tu-dresden.de wrote: It is the CPU load on the client. The dd/IOR process is using one core completely. The clients and the servers are connected via DDR IB. LNET bandwidth is at 1.8

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Michael Kluge
Using O_DIRECT reduces the CPU load but the magical limit of 500 MB/s for one thread remains. Are the CRC sums calculated on a per-thread basis? Or a per-stripe basis? Is there a way to test the checksumming speed only? Michael On 20.10.2010 18:53, Andreas Dilger wrote: On 2010-10-20, at 10:40,
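
One rough way to get a feel for single-thread checksum cost is to time a checksum over an in-memory buffer. The sketch below uses Python's userspace `zlib` routines, so it does not measure the Lustre client's in-kernel checksum code; the choice of crc32 and adler32 and the helper name `checksum_throughput` are assumptions made for illustration.

```python
import time
import zlib


def checksum_throughput(checksum, total_mb=64, block_mb=1):
    """Return MB/s achieved by one thread running `checksum` over zero-filled blocks.

    `checksum` must accept (data, running_value), as zlib.crc32 and
    zlib.adler32 do. No disk or network I/O is involved.
    """
    block = bytes(block_mb * 1024 * 1024)
    blocks = total_mb // block_mb
    value = 0
    start = time.perf_counter()
    for _ in range(blocks):
        value = checksum(block, value)
    # Guard against a zero interval on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return blocks * block_mb / elapsed


if __name__ == "__main__":
    for name, func in (("crc32", zlib.crc32), ("adler32", zlib.adler32)):
        print(f"{name}: {checksum_throughput(func):.0f} MB/s on one thread")
```

If a number like this sits near the observed per-thread write limit, checksumming is a plausible bottleneck; the figures are only indicative, since the kernel implementation may perform quite differently.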

[Lustre-discuss] Lustre 1.8.4 client error message

2010-10-20 Thread Jagga Soorma
Hi Guys, I received the following error message on one of my lustre 1.8.4 clients and don't see any network related issues on this node. Just to point out that my server is still running 1.8.1.1. Any ideas what this error message is: -- Oct 19 09:57:08 node20 kernel: LustreError:

Re: [Lustre-discuss] high CPU load limits bandwidth?

2010-10-20 Thread Michael Kluge
Disabling checksums boosts the performance to 660 MB/s for a single thread. Now placing 6 IOR processes on my eight-core box gives, with some striping, 1.6 GB/s, which is close to the LNET bandwidth. Thanks a lot again! Michael On 20.10.2010 19:13, Michael Kluge wrote: Using O_DIRECT reduces

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Wojciech Turek
Your help is most appreciated, Andreas. May I ask one more question? I would like to perform the recovery procedure on the image of the disk (I am making it using dd) rather than the physical device. In order to do that, is it enough to bind the image to the loop device and use that loop device as

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
On 2010-10-20, at 11:36, Wojciech Turek wrote: Your help is most appreciated, Andreas. May I ask one more question? I would like to perform the recovery procedure on the image of the disk (I am making it using dd) rather than the physical device. In order to do that, is it enough to bind the

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Wojciech Turek
Hi Andreas, If I am going to recreate LVM on the whole device (as it was originally created) do I still need to overwrite the MBR with zeros prior to that? I guess creation of the LVM will overwrite it but I am asking just to make sure. Wojciech On 20 October 2010 18:40, Andreas Dilger

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Andreas Dilger
Probably LVM will refuse to create a whole-device PV if there is a partition table. Cheers, Andreas On 2010-10-20, at 18:31, Wojciech Turek wj...@cam.ac.uk wrote: Hi Andreas, If I am going to recreate LVM on the whole device (as it was originally created) do I still need to overwrite the MBR

Re: [Lustre-discuss] recovering formatted OST

2010-10-20 Thread Sebastian Gutierrez
If everything was on LVM first, you may be able to recover if nothing has been written to the disk. I am assuming that you do not have your LVM backup files in /etc/lvm/backup/. If you did, you could use the pvcreate recovery procedure; there are a couple of different walkthroughs here that may