Disabling checksums boosts the performance to 660 MB/s for a single thread. Placing 6 IOR processes on my eight-core box, with some striping, now gives 1.6 GB/s, which is close to the LNET bandwidth. Thanks a lot again!
Michael

On 20.10.2010 19:13, Michael Kluge wrote:
> Using O_DIRECT reduces the CPU load, but the magical limit of 500 MB/s
> for one thread remains. Are the CRC sums calculated on a per-thread
> basis? Or per stripe? Is there a way to test the checksumming speed only?
>
>
> Michael
>
> On 20.10.2010 18:53, Andreas Dilger wrote:
>> On 2010-10-20, at 10:40, Michael Kluge <[email protected]> wrote:
>>> It is the CPU load on the client. The dd/IOR process is using one core
>>> completely. The clients and the servers are connected via DDR IB. LNET
>>> bandwidth is at 1.8 GB/s. Servers have 1.8.3, the client has 1.8.3
>>> patchless.
>>
>> If you only have a single-threaded write, then saturating a CPU is
>> somewhat unavoidable due to copy_from_user(). O_DIRECT will avoid this.
>>
>> Also, disabling data checksums and debugging can help considerably. There
>> is a patch in bugzilla to add support for h/w crc32c on Nehalem CPUs to
>> reduce this overhead, but it is still not as fast as no checksum at all.
>>
>> Cheers, Andreas
>>
>>> On 20.10.2010 18:15, Andreas Dilger wrote:
>>>> Is this client CPU or server CPU? If you are using Ethernet it will
>>>> definitely be CPU hungry and can easily saturate a single core.
>>>>
>>>> Cheers, Andreas
>>>>
>>>> On 2010-10-20, at 8:41, Michael Kluge <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi list,
>>>>>
>>>>> is it normal that a 'dd' or an 'IOR' pushing 10MB blocks to a lustre
>>>>> file system shows up with a 100% CPU load within 'top'? The reason why I
>>>>> am asking this is that I can write from one client to one OST with 500
>>>>> MB/s. The CPU load will be at 100% in this case. If I stripe over two
>>>>> OSTs (which use different OSS servers and different RAID controllers) I
>>>>> will get 500 as well (seeing 2x250 MB/s on the OSTs). The CPU load will
>>>>> be at 100% again.
>>>>>
>>>>> A 'dd' on my desktop pushing 10M blocks to the local disk shows 7-10%
>>>>> CPU load.
>>>>>
>>>>> Are there ways to tune this behavior?
>>>>> Changing max_rpcs_in_flight and max_dirty_mb did not help.
>>>>>
>>>>>
>>>>> Regards, Michael
>>>>>
>>>>> --
>>>>>
>>>>> Michael Kluge, M.Sc.
>>>>>
>>>>> Technische Universität Dresden
>>>>> Center for Information Services and
>>>>> High Performance Computing (ZIH)
>>>>> D-01062 Dresden
>>>>> Germany
>>>>>
>>>>> Contact:
>>>>> Willersbau, Room A 208
>>>>> Phone: (+49) 351 463-34217
>>>>> Fax: (+49) 351 463-37773
>>>>> e-mail: [email protected]
>>>>> WWW: http://www.tu-dresden.de/zih
>>>>> _______________________________________________
>>>>> Lustre-discuss mailing list
>>>>> [email protected]
>>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
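
The thread asks whether the checksumming speed can be tested on its own. One rough way to get a feel for it is to time a checksum over 10 MB buffers in a single user-space thread, as in the sketch below. This is only an approximation under stated assumptions: it uses the CRC-32 and Adler-32 routines from Python's zlib module, not the Lustre client's in-kernel checksum code, and the helper name `checksum_throughput`, the block count, and the use of 1 MB = 2^20 bytes are choices made here for illustration. The numbers it prints say something about what one core can checksum per second, not about what Lustre itself achieves.

```python
import time
import zlib


def checksum_throughput(func, block_size=10 * 1024 * 1024, blocks=20):
    """Return the MB/s one thread reaches running `func` over `blocks` buffers.

    `func` is assumed to follow the zlib convention func(data, running_value).
    """
    buf = bytes(block_size)      # zero-filled buffer, allocated before timing
    value = func(b"")            # algorithm-specific start value
    start = time.perf_counter()
    for _ in range(blocks):
        value = func(buf, value)  # running checksum over all blocks
    elapsed = time.perf_counter() - start
    return (block_size * blocks) / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Assumption: user-space zlib stands in for the kernel implementation.
    for name, func in (("crc32", zlib.crc32), ("adler32", zlib.adler32)):
        print(f"{name}: {checksum_throughput(func):.0f} MB/s")
```

If the rate measured this way lands in the same range as the 500 MB/s single-thread limit reported above, that would be consistent with the checksum being the per-thread bottleneck, although it does not prove it.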
