Mersenne Digest Thursday, December 25 2003 Volume 01 : Number 1099
----------------------------------------------------------------------

Date: Tue, 23 Dec 2003 14:13:20 GMT
From: [EMAIL PROTECTED]
Subject: Re: Mersenne: Large memory pages in Linux

> On Wed, Dec 10, 2003 at 06:26:54PM +0000, Nick Craig-Wood wrote:
> > I'm in the process of writing (not quite finished or working ;-) some
> > code which you load as an LD_PRELOAD library under linux. This gets
> > its fingers into the memory allocation, and makes all malloc space
> > come from hugetlbfs (how you get large pages under linux).

A much more complicated and completely transparent super-page
implementation can be found here:

http://shimizu-lab.dt.u-tokai.ac.jp/lsp.html

Unfortunately the patches have to be forward-ported to latter-day
linux kernels.

jasonp

_________________________________________________________________________
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Tue, 23 Dec 2003 21:15:46 +0100
From: Matthias Waldhauer <[EMAIL PROTECTED]>
Subject: Mersenne: Re: Large memory pages in Linux

For a TLB aware client like Prime95 the improvements are small as
expected. But I'm sure that speedups greater than ~2.5% are possible by
modifying the FFTs to make optimal use of large pages. For most FFT
sizes currently in use the available number of 2/4MB TLB entries covers
the complete working set. But for larger sizes it is again necessary to
take care of TLB entries.

Last Friday I read some messages about recent kernel modifications and
patches for version 2.6.0. There is an "imcplicit_large_page" patch,
allowing applications to use large pages without modifications. I don't
have the time to dig into it :(

I wish you all a Merry Christmas!
Matthias

------------------------------

Date: Wed, 24 Dec 2003 07:46:32 +0000
From: "Brian J. Beesley" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Re: Large memory pages in Linux

On Tuesday 23 December 2003 20:15, Matthias Waldhauer wrote:
>
> Last Friday I read some messages about recent kernel modifications and
> patches for version 2.6.0. There is an "imcplicit_large_page" patch,
> allowing applications to use large pages without modifications. I don't
> have the time to dig into it :(

Sure. This is a much better approach than mucking about with
application-specific modifications which would likely involve serious
security hazards (leaking kernel privileges to the application) and/or
clash with other applications' private large-page code and/or large page
enabled kernels in the future.

The "bad news" with kernel 2.6 is that the (default) jiffy timer
resolution is changed from 10ms to 1ms, resulting in the task scheduler
stealing 10 times as many cycles. This will likely cause a small but
noticeable drop in the performance of mprime. Probably ~1% on fast
systems. In other words the cycles gained by large page efficiency could
easily be swallowed up by the task scheduler being tuned to improve
interactive responsiveness (and cope with more processors in an SMP
setup).

I suppose you could retrofit a 10ms jiffy timer to the 2.6 kernel, but
then you could just as easily patch large page support into a 2.4 kernel
& (hopefully) keep the stability of a tried, tested & trusted kernel.

Finally, the "good news". Crandall & Pomerance p441 describes the "ping
pong" variant of the Stockham FFT, in which an extra copy of the data is
used but the innermost loop runs essentially consecutively through data
memory.
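
A minimal Python sketch of the ping-pong idea described above, for
illustration only. It is a textbook radix-2 Stockham autosort FFT, not
the formulation on C&P p441 and not Prime95/mprime code. The names
stockham_fft, naive_dft, src and dst are made up for this example. Two
buffers swap roles after every pass, and the inner loop walks both of
them with unit stride.

```python
import cmath


def stockham_fft(x):
    """Radix-2 Stockham autosort FFT using two ping-pong buffers.

    Assumes len(x) is a power of two. Returns the DFT in natural order,
    with no separate bit-reversal pass.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    src = [complex(v) for v in x]   # buffer read during the current pass
    dst = [0j] * n                  # the "extra copy", written during the pass
    half = n // 2                   # half the current sub-transform length
    stride = 1                      # number of interleaved sub-transforms
    while half >= 1:
        for j in range(half):
            w = cmath.exp(-2j * cmath.pi * j / (2 * half))  # twiddle factor
            for k in range(stride):  # unit-stride reads and writes
                a = src[k + stride * j]
                b = src[k + stride * (j + half)]
                dst[k + stride * (2 * j)] = a + b
                dst[k + stride * (2 * j + 1)] = (a - b) * w
        src, dst = dst, src         # ping-pong: swap the roles of the buffers
        half //= 2
        stride *= 2
    return src


def naive_dft(x):
    """O(n^2) reference DFT, used only to check the sketch."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]
```

For example, stockham_fft([1, 0, 0, 0]) returns four values equal to 1,
the transform of a unit impulse. The price of the sequential access
pattern is the second buffer of the same size as the data.
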
C&P note that contiguous memory access is "important" on vector
processors, but similar memory access techniques are surely the key to
avoiding problems with TLB architectures _and small processor caches_ -
and the largest caches present on commercial x86 architecture are indeed
small compared with the size of the work vectors we use for LL testing.

Perhaps implementation along these lines could reduce the cache size
dependency which seems to affect Prime95/mprime. That said, paying a
very large premium for the "extreme" version of the Intel Pentium 4 is
most certainly not cost effective in view of the small performance
benefit the extra cache generates, most probably because the
Prime95/mprime code appears not to be tuned for the P4 Extreme Edition.

Seasonal felicitations
Brian Beesley

------------------------------

Date: Wed, 24 Dec 2003 10:09:47 +0000
From: Nick Craig-Wood <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Re: Large memory pages in Linux

On Tue, Dec 23, 2003 at 09:15:46PM +0100, Matthias Waldhauer wrote:
> For a TLB aware client like Prime95 the improvements are small as
> expected.

Kudos to George ;-)

> But I'm sure that speedups greater than ~2.5% are possible by
> modifying the FFTs to make optimal use of large pages.

I'd love to have George's opinion on this.

> For most FFT sizes currently in use the available number of 2/4MB
> TLB entries cover the complete working set. But for larger sizes it
> is again necessary to take care of TLB entries.

My PII laptop has 32 TLB entries and 4 MB pages, so that's 128 MB of
"flat" memory which is really quite a large FFT!

> Last friday I read some messages about recent kernel modifications
> and patches for version 2.6.0. There is an "imcplicit_large_page"
> patch, allowing applications to use large pages without
> modifications. I don't have the time to dig into it :(

Do you mean wli's superpage patches? I think he is aiming those for
2.7. They sound very interesting but will require extensive reworking
of the kernel's internal assumptions that a page is constant size.
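
The TLB reach figure quoted earlier in this message (32 entries of 4 MB
pages) can be checked with a few lines of Python. This is only
back-of-envelope arithmetic: the function names are made up for this
example, the 8 bytes per element assumes double-precision data, and the
4 KB line is a hypothetical comparison, since real CPUs often have
different numbers of TLB entries for small and large pages.

```python
MIB = 1024 * 1024


def tlb_reach(entries, page_size):
    """Bytes of memory that `entries` TLB entries of `page_size` bytes can map."""
    return entries * page_size


def fits_in_tlb(fft_length, entries, page_size, bytes_per_element=8):
    """True if one FFT work vector of `fft_length` elements is covered.

    Assumption: 8 bytes per element (double precision). A real client
    also touches twiddle tables, code and stack, which this ignores.
    """
    return fft_length * bytes_per_element <= tlb_reach(entries, page_size)


print(tlb_reach(32, 4 * MIB) // MIB)    # 32 entries of 4 MB pages, in MB
print(tlb_reach(32, 4 * 1024) // 1024)  # the same 32 entries with 4 KB pages, in KB
```

The first line prints 128 (MB), matching the figure above; the second
prints 128 (KB), which shows why small pages run out of TLB coverage so
much sooner for large work vectors.
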
- -- 
Nick Craig-Wood
[EMAIL PROTECTED]

------------------------------

Date: Wed, 24 Dec 2003 10:22:14 +0000
From: Nick Craig-Wood <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Re: Large memory pages in Linux

On Wed, Dec 24, 2003 at 07:46:32AM +0000, Brian J. Beesley wrote:
> On Tuesday 23 December 2003 20:15, Matthias Waldhauer wrote:
> >
> > Last friday I read some messages about recent kernel modifications and
> > patches for version 2.6.0. There is an "imcplicit_large_page" patch,
> > allowing applications to use large pages without modifications. I don't
> > have the time to dig into it :(
>
> Sure. This is a much better approach than mucking about with
> application-specific modifications which would likely involve serious
> security hazards (leaking kernel priveleges to the application) and/or clash
> with other applications private large-page code and/or large page enabled
> kernels in the future.

It's also a much more invasive approach - changing something as
fundamental as the kernel's page size (and the assumption that it's
constant) is very hard work - don't expect it in 2.6!

As for security hazards - the hugepage implementation in 2.6 has been
very well thought out. No extra privs are required. The super user
must :-

  a) allocate some huge pages
  b) mount the hugetlbfs filesystem
  c) use standard unix permissions so the right user(s) can use hugetlbfs

The permissioned user can then allocate huge pages using mmap out of
hugetlbfs.

You use my particular hack with a couple of LD_PRELOAD libraries, which
doesn't require patching the application, eg

  LD_PRELOAD="`pwd`/intercept.so `pwd`/alloc.so" ./mprime -m

So you can use huge pages with any application that is dynamically
linked.
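
As a rough illustration of the last step above (a permissioned user
mapping huge pages out of hugetlbfs), here is a Python sketch. It is not
the LD_PRELOAD code described in this message. The function names, the
mount point /mnt/huge and the 4 MB page size are assumptions made for
the example; the real page size depends on the CPU and kernel.

```python
import mmap
import os

# Assumption: 4 MB huge pages, as on the PII mentioned earlier in the thread.
HUGE_PAGE_SIZE = 4 * 1024 * 1024


def round_up(nbytes, page_size=HUGE_PAGE_SIZE):
    """Round a request up to a whole number of huge pages."""
    return -(-nbytes // page_size) * page_size


def map_huge(path, nbytes, page_size=HUGE_PAGE_SIZE):
    """Map at least `nbytes` from a file at `path`.

    `path` is assumed to live on a mounted hugetlbfs (hypothetically
    /mnt/huge/...), so the mapping is backed by huge pages. Mappings on
    hugetlbfs must be a multiple of the huge page size.
    """
    length = round_up(nbytes, page_size)
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Python refuses to map beyond the end of the file, so size it first.
        os.ftruncate(fd, length)
        return mmap.mmap(fd, length, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        os.close(fd)  # the mmap object keeps its own duplicate descriptor
```

On a machine set up as in steps a) to c), a call such as
map_huge("/mnt/huge/arena", 10 * 1024 * 1024) would be expected to
return a 12 MB mapping, that is three 4 MB pages. An allocator hooked in
through LD_PRELOAD would hand out malloc space from a region like this.
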
This seems pretty secure and non-invasive to me ;-)

(I will shortly put full instructions and code on a web page - family
duties over Christmas permitting!)

- -- 
Nick Craig-Wood
[EMAIL PROTECTED]

------------------------------

Date: Fri, 26 Dec 2003 00:22:43 +0100
From: Ignacio Larrosa Cañestro <[EMAIL PROTECTED]>
Subject: Mersenne: Where is M23494381?

Where did M23494381 go?

I had that exponent assigned to me to factor, but today it disappeared
from my Individual Account Report. And I can't find it in the Assigned
Exponents Report or in the Cleared Exponents Report ...

Best regards,

Ignacio Larrosa Cañestro
A Coruña (España)
[EMAIL PROTECTED]

------------------------------

End of Mersenne Digest V1 #1099
*******************************