Very interesting and insightful posting :-)

On Monday, 26 February 2018 at 14:42:12 UTC+1, Bill Hart wrote:
>
> Hi all,
>
> I've been thinking recently about computing, and how an ideal community 
> would move forward with MPIR development. I realise we don't actually have 
> all that much of a community left, but perhaps my comments will inspire a 
> new generation of developers to pitch in, as I myself am totally engaged 
> with other projects nowadays and don't have time to commit to MPIR.
>
> Currently I see no reason to duplicate work being done in the official GMP 
> project. They are doing a good job of supporting modern CPU 
> microarchitectures. I believe MPIR development effort should focus 
> elsewhere. But let me explain why that is.
>
> There's a perception in our community (at least amongst older researchers) 
> that once clock speeds levelled off in about 2004/5, Moore's law was at an 
> end for single core performance. (GMP and MPIR are currently single core 
> libraries.) But that perception is not correct.
>
> In fact, single core performance of CPUs has increased unabated, with a 
> predicted levelling off only in the next year or so and a few minor 
> improvements yet to come from further die shrinks.
>
> Whilst clock speeds have not gone up, pipelines have gotten deeper and 
> numerous new features have been added: various revisions of SSE and then 
> AVX, more integer ports, specialised instructions for multiprecision 
> arithmetic (yes, the kind GMP/MPIR specialise in), performance management 
> through temperature profiles, hyperthreading (usually useless for 
> superoptimised code, though great for code coming from a C compiler, 
> depending on the application) and, more recently, real-time AI adjustment 
> of silicon usage to optimise performance.
>
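
For newer readers: the "specialised instructions for multiprecision
arithmetic" are presumably the BMI2/ADX extensions (MULX, ADCX, ADOX). As a
purely illustrative sketch, here is roughly what they look like from C via
compiler intrinsics, assuming GCC or Clang with -mbmi2. The mul_1 helper
below is hypothetical, not the actual MPIR code, and the real win of ADX
comes from running two independent carry chains (ADCX/ADOX) in hand-written
assembly, which this doesn't show:

    /* Sketch: multiply an n-limb number by a single 64-bit limb using the
     * MULX and add-with-carry intrinsics.  Illustration only. */
    #include <stddef.h>
    #include <immintrin.h>

    unsigned long long mul_1(unsigned long long *rp,
                             const unsigned long long *up,
                             size_t n, unsigned long long v)
    {
        unsigned long long carry = 0;
        for (size_t i = 0; i < n; i++) {
            unsigned long long hi, sum;
            unsigned long long lo = _mulx_u64(up[i], v, &hi); /* 64x64 -> 128 */
            unsigned char c = _addcarry_u64(0, lo, carry, &sum);
            rp[i] = sum;
            carry = hi + c;     /* hi <= 2^64 - 2, so this cannot overflow */
        }
        return carry;           /* high limb carried out of the product */
    }
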
> All of this development means that nothing prior to Intel's Core 
> architecture and nothing prior to AMD's K10 architecture is now relevant, 
> even for single core performance. But even these won't be relevant much 
> longer. There's a feeling that this past year, Core 2 has finally become 
> totally obsolete. 
>
> Looking to the future, bigger changes are coming. Because yields go down 
> as die sizes go up, it is inevitable that large monolithic CPU dies will 
> give way to a far greater number of smaller dies, especially as we move 
> from the current 14nm technology to an eventual 5nm, where things will 
> surely start to get really hard (Intel is already reportedly having major 
> problems with yield at 10nm).
>
> Then there are the performance decreases we have seen due to Spectre and 
> Meltdown. A lot of the superoptimisation MPIR does is surely worthless in 
> light of this. I see more and more of this sort of thing happening in the 
> future.
>
> Like it or not, the era of single core performance increases is almost at 
> an end.
>
> MPIR is also about 3-5 years behind anyway; the latest Intel 
> microarchitecture we optimise for is Skylake, which launched in 2015, and 
> the latest AMD microarchitecture we had a good look at was Piledriver, from 
> 2012, not that AMD has been terribly relevant in the period 2012-2016. 
> Admittedly, server architectures are usually a few generations behind 
> desktop architectures.
>
> But the face of CPU and computing technology development is changing fast. 
> Abstractly, modern computer processors consist (or will consist) of three 
> components: a CPU, a GPU and a TPU (tensor processing unit). The CPU as we 
> know it is becoming less relevant every day (it will still continue to 
> exist, but will change radically; for example, I/O, memory controllers and 
> cache will be separated from an increasing number of small dies for 
> arithmetic pipelines).
>
> Modern Ryzen/Threadripper/EPYC CPUs from AMD already basically use a kind 
> of TPU to control the temperature profile of the CPU (switching silicon on 
> and off and scheduling which silicon does what). Result: a 10-30% 
> performance improvement from adapting to the values coming from bazillions 
> of sensors, based on what code is currently running.
>
> Modern EPYC server racks have 640 CPU cores, but this is a tiny fraction 
> of their compute power. By far the greatest part of their compute power is 
> in their GPUs: if I have my calculations correct, in excess of 320,000 GPU 
> cores (actually there are three different kinds of cores in the GPUs), all 
> in 1U of rack space. These are targeted at data centres, AI research and 
> scientific computing, but why not also computer algebra system users? You 
> can use a GPU for Groebner basis computations or for linear algebra, right? 
> Why not other things too?
>
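
On the linear algebra point: even before any MPIR-specific GPU work, dense
double-precision linear algebra is already well served by vendor libraries.
As a rough, hypothetical sketch (error handling omitted, function and
variable names made up for illustration), calling a GPU matrix multiply from
plain C via Nvidia's cuBLAS looks something like this:

    /* Sketch: C = A * B for n x n column-major matrices, computed on the
     * GPU via cuBLAS.  Illustration only; not proposed MPIR code.
     * Build with e.g.:  nvcc gemm.c -lcublas */
    #include <stddef.h>
    #include <cuda_runtime.h>
    #include <cublas_v2.h>

    void gpu_dgemm(int n, const double *A, const double *B, double *C)
    {
        size_t bytes = (size_t)n * n * sizeof(double);
        double *dA, *dB, *dC;
        cudaMalloc((void **)&dA, bytes);
        cudaMalloc((void **)&dB, bytes);
        cudaMalloc((void **)&dC, bytes);
        cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice);

        cublasHandle_t handle;
        cublasCreate(&handle);
        const double alpha = 1.0, beta = 0.0;
        cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                    n, n, n, &alpha, dA, n, dB, n, &beta, dC, n);

        cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost);
        cublasDestroy(handle);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
    }
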
> Still worried about memory bandwidth? How does 11 Gbps sound?
>
> Then there are tensor cores. Nvidia's Volta architecture has 640 Tensor 
> cores, for 100 teraflops per device.
>
> Google's second generation TPUs (what Google use in their data centres 
> for photo processing and Google ranking: 100 million photos processed per 
> day per device) are 15-30 times faster than Nvidia's GPUs for many tasks 
> (23 petaflops per rack). And Nvidia are already ahead of AMD (who make 
> Ryzen, Threadripper and EPYC).
>
> It's very clear that the future of MPIR is in multicore/GPU/TPU code. This 
> means a fundamental change in approach, and a lot of work.
>
> If I were a young person looking to get involved in this, I would purge 
> MPIR of support for anything older than Core 2 Duo or AMD K10 and use 
> fallback C code on everything that is not AMD or Intel. I wouldn't worry 
> about support for Pentium and Atom. I'd support only Core i3/i5/i7/i9 and 
> Xeon. I wouldn't waste my time catching up on the AMD side of things. I'd 
> start with Zen support only (Ryzen, Threadripper, EPYC).
>
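
For anyone wondering what "fallback C code" means in practice, here is a
simplified, portable-C sketch of an mpn_add_n style limb addition, the kind
of generic routine that could back architectures without hand-written
assembly. It is not the actual MPIR implementation, and limb_t here is just
a stand-in for mp_limb_t:

    #include <stddef.h>

    typedef unsigned long long limb_t;   /* stand-in for mp_limb_t */

    /* rp[0..n-1] = up[0..n-1] + vp[0..n-1]; returns the carry out (0 or 1). */
    limb_t add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
    {
        limb_t carry = 0;
        for (size_t i = 0; i < n; i++) {
            limb_t u = up[i];
            limb_t s = u + vp[i];
            limb_t c1 = s < u;     /* carry out of u + vp[i] */
            limb_t r = s + carry;
            limb_t c2 = r < s;     /* carry out of adding the previous carry */
            rp[i] = r;
            carry = c1 | c2;       /* at most one of c1, c2 can be set */
        }
        return carry;
    }
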
> I'd radically simplify the build system, ditching the legacy autotools, 
> which is approximately useless, and adopt something more modern.
>
> I'd dump code from MPIR that isn't faster than GMP (the FFT and divide and 
> conquer division code should still be used from MPIR, I believe) and try to 
> get the two libraries as close together as possible so that taking code 
> from GMP to MPIR is easy.
>
> Then I would make the big change: switch development effort to supporting 
> GPUs and eventually TPUs. There are lots of interesting challenges here, 
> both theoretical and practical.
>
> The future is here today, and younger people who want to get stuck in 
> shouldn't wait for us old dinosaurs who aren't even up to date with the 
> latest tech. I believe it's time for a new generation to gradually start 
> taking over MPIR and showing us old dinosaurs the way forward.
>
> I'm looking forward to seeing what the next generation can do!
>
> Bill.
>

-- 
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to mpir-devel+unsubscr...@googlegroups.com.
To post to this group, send email to mpir-devel@googlegroups.com.
Visit this group at https://groups.google.com/group/mpir-devel.
For more options, visit https://groups.google.com/d/optout.
