Correction: 1U -> 2U

On 26 February 2018 at 14:42, Bill Hart <goodwillh...@googlemail.com> wrote:
> Hi all,
>
> I've been thinking recently about computing, and how an ideal community would move forward with MPIR development. I realise we don't actually have all that much of a community left, but perhaps my comments will inspire a new generation of developers to pitch in, as I myself am totally engaged with other projects nowadays and don't have time to commit to MPIR.
>
> Currently I see no reason to duplicate work being done in the official GMP project. They are doing a good job of supporting modern CPU microarchitectures. I believe MPIR development effort should focus elsewhere. But let me explain why that is.
>
> There's a perception in our community (at least amongst older researchers) that once clock speeds levelled off in about 2004/5, Moore's law was at an end for single-core performance. (GMP and MPIR are currently single-core libraries.) But that perception is not correct.
>
> In fact, single-core performance of CPUs has increased unabated, with a predicted levelling off only in the next year or so and a few minor improvements yet to come due to shrinking of dies.
>
> Whilst clock speeds have not gone up, pipelines have gotten deeper and numerous new features have been added, including various revisions of SSE and then AVX, more integer ports, specialised instructions for multiprecision arithmetic (yes, the kind GMP/MPIR specialise in), performance management through temperature profiles, hyperthreading (though this is usually useless for superoptimised code; it's great for code coming from a C compiler, depending on the application) and, more recently, real-time AI adjustment of silicon usage to optimise performance.
>
> All of this development means that nothing prior to Intel's Core architecture and nothing prior to AMD's K10 architecture is now relevant, even for single-core performance. But even these won't be relevant much longer. There's a feeling that this past year, Core 2 has finally become totally obsolete.
>
> Looking to the future, bigger changes are coming. Because yields go down as die sizes go up, it is inevitable that large monolithic CPU cores will give way to a far greater number of smaller dies, especially as we move from the current 14nm technology to an eventual 5nm, where things will surely start to get really hard (Intel is already reportedly having major problems with yield at 10nm).
>
> Then there are the performance decreases we have seen due to Spectre and Meltdown. A lot of the superoptimisation MPIR does is surely worthless in light of this. I see more and more of this sort of thing happening in the future.
>
> Like it or not, the era of single-core performance increases is almost at an end.
>
> MPIR is also about 3-5 years behind anyway; the latest Intel microarchitecture we optimise for is Skylake, which was 2015, and the latest AMD microarchitecture we had a good look at was Piledriver, from 2012 (not that AMD has been terribly relevant in the period 2012-2016). Admittedly, server architectures are usually a few generations behind desktop architectures.
>
> But the face of CPU and computing technology is changing fast. Abstractly, modern computer processors consist (or will consist) of three components: a CPU, a GPU and a TPU (tensor processing unit).
> The CPU as we know it is becoming less relevant every day (it will still continue to exist, but will change radically; for example, I/O, memory controllers and cache will be separated from an increasing number of small dies for arithmetic pipelines).
>
> Modern Ryzen/Threadripper/EPYC CPUs from AMD already basically use a kind of TPU to control the temperature profile of the CPU (switching silicon on and off and scheduling which silicon does what). The result: a 10-30% performance improvement from adapting to the values coming from bazillions of sensors, based on what code is currently running.
>
> Modern EPYC server racks have 640 CPU cores, but this is a tiny fraction of their compute power. By far the greatest part of their compute power is in their GPUs: if I have my calculations correct, in excess of 320,000 GPU cores (actually there are three different kinds of cores in the GPUs), all in a 1U rack. These are targeted at data centres, AI research and scientific computing, but why not also computer algebra system users? You can use a GPU for Groebner basis computations or for linear algebra, right? Why not other things too?
>
> Still worried about memory bandwidth? How does 11 Gbps sound?
>
> Then there are tensor cores. Nvidia's Volta architecture has 640 Tensor cores, for 100 teraflops per device.
>
> Google's second-generation TPUs (what Google use in their data centres for photo processing and Google ranking: 100 million photos processed per day per device) are 15-30 times faster than Nvidia's GPUs for many tasks (23 petaflops per rack). And Nvidia are already ahead of AMD (who make Ryzen, Threadripper and EPYC).
>
> It's very clear that the future of MPIR is in multicore/GPU/TPU code. This means a fundamental change in approach, and a lot of work.
>
> If I were a young person looking to get involved in this, I would purge MPIR of support for anything older than Core 2 Duo or AMD K10 and use fallback C code on everything that is not AMD or Intel. I wouldn't worry about support for Pentium and Atom. I'd support only i3/i5/i7/i9 and Xeon. I wouldn't waste my time catching up on the AMD side of things; I'd start with Zen support only (Ryzen, Threadripper, EPYC).
>
> I'd radically simplify the build system, ditching the legacy autotools, which is approximately useless, and adopt something more modern.
>
> I'd dump code from MPIR that isn't faster than GMP (the FFT and the divide-and-conquer division code should still be used from MPIR, I believe) and try to get the two libraries as close together as possible, so that taking code from GMP to MPIR is easy.
>
> Then I would make the big change: switch development effort to supporting GPUs and eventually TPUs. There are lots of interesting challenges here, both theoretical and practical.
>
> The future is here today, and younger people who want to get stuck in shouldn't wait for us old dinosaurs who aren't even up with the latest tech. I believe it's time for a new generation to gradually start taking over MPIR and showing us old dinosaurs the way forward.
>
> I'm looking forward to seeing what the next generation can do!
>
> Bill.
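
Since it came up in the message above: the "specialised instructions for multiprecision arithmetic" are, on current x86-64, MULX from BMI2 and the ADCX/ADOX carry-chain instructions from ADX. For anyone who hasn't looked at them, a rough sense of what they buy you can be had from the compiler intrinsics. The sketch below assumes a BMI2/ADX-capable CPU and GCC or Clang (built with something like cc -O2 -mbmi2 -madx); limb_t, sketch_mul_1 and sketch_add_n are names made up for this post, not MPIR's mpn interface, and real MPIR assembly would interleave the ADCX/ADOX carry chains by hand rather than leave it to the compiler.

    /* Sketch only: multiply n limbs by a single limb, and add two n-limb
       numbers, using the BMI2/ADX intrinsics.  Not MPIR's actual code. */
    #include <immintrin.h>

    typedef unsigned long long limb_t;   /* one 64-bit limb */

    /* rp[0..n-1] = up[0..n-1] * v; returns the high limb carried out. */
    limb_t sketch_mul_1(limb_t *rp, const limb_t *up, long n, limb_t v)
    {
        limb_t carry = 0;                            /* high half carried forward */
        for (long i = 0; i < n; i++) {
            limb_t hi;
            limb_t lo = _mulx_u64(up[i], v, &hi);    /* 64x64 -> 128-bit product */
            unsigned char cy = _addcarry_u64(0, lo, carry, &rp[i]);
            carry = hi + cy;                         /* hi <= 2^64 - 2, so no wrap */
        }
        return carry;
    }

    /* rp[0..n-1] = up[0..n-1] + vp[0..n-1]; returns the carry out (0 or 1). */
    limb_t sketch_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, long n)
    {
        unsigned char cy = 0;
        for (long i = 0; i < n; i++)
            cy = _addcarryx_u64(cy, up[i], vp[i], &rp[i]);
        return (limb_t) cy;
    }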
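
By contrast, the "fallback C code on everything that is not AMD or Intel" can be close to trivial and still correct; something like the portable loop below (again only a sketch with hypothetical names, not the actual MPIR or GMP generic code). The portable version buys correctness everywhere, and all of the per-microarchitecture effort then goes into beating it.

    /* Sketch of a portable fallback in plain C, no intrinsics or assembly. */
    typedef unsigned long long limb_t;   /* one 64-bit limb */

    /* rp[0..n-1] = up[0..n-1] + vp[0..n-1]; returns the carry out (0 or 1). */
    limb_t fallback_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, long n)
    {
        limb_t cy = 0;
        for (long i = 0; i < n; i++) {
            limb_t u = up[i];
            limb_t s = u + vp[i];
            limb_t c1 = s < u;           /* carry out of u + vp[i] */
            limb_t r = s + cy;
            limb_t c2 = r < s;           /* carry out of adding the old carry */
            rp[i] = r;
            cy = c1 | c2;                /* the two carries can never both be 1 */
        }
        return cy;
    }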