Re: [RFC] Old school parallelization of WPA streaming

2014-02-20 Thread Jan Hubicka
> > I plan to commit it shortly (I am just slowly progressing through the > > bug reports and TODOs accumulated) > > - indeed for bigger apps and edit/relink cycle it is a life saver ;) > > I haven't tested exactly around this, but I see a ~10s (~5%) improved kernel > LTO build time going from 4.9-2

Re: [RFC] Old school parallelization of WPA streaming

2014-02-20 Thread Andi Kleen
> I plan to commit it shortly (I am just slowly progressing through the > bug reports and TODOs accumulated) > - indeed for bigger apps and edit/relink cycle it is a life saver ;) I haven't tested exactly around this, but I see a ~10s (~5%) improved kernel LTO build time going from 4.9-20140209 to 2

Re: [RFC] Old school parallelization of WPA streaming

2014-02-20 Thread H.J. Lu
On Thu, Dec 5, 2013 at 3:54 PM, Jan Hubicka wrote: >> On Thu, 21 Nov 2013, Jan Hubicka wrote: >> >> > > >> > > Why do you need an additional -fparallelism? Wouldn't >> > > -fwpa=... be a better match, matching -flto=...? As we already >> > > pass down a -fwpa option to WPA this would make things

Re: [RFC] Old school parallelization of WPA streaming

2013-12-13 Thread Jan Hubicka
> On 2013.12.06 at 10:43 +0100, Richard Biener wrote: > > On Fri, 6 Dec 2013, Jan Hubicka wrote: > > > > > > On Thu, 21 Nov 2013, Jan Hubicka wrote: > > > > > > > > > > > > > > > > Why do you need an additional -fparallelism? Wouldn't > > > > > > -fwpa=... be a better match, matching -flto=...?

Re: [RFC] Old school parallelization of WPA streaming

2013-12-13 Thread Markus Trippelsdorf
On 2013.12.06 at 10:43 +0100, Richard Biener wrote: > On Fri, 6 Dec 2013, Jan Hubicka wrote: > > > > On Thu, 21 Nov 2013, Jan Hubicka wrote: > > > > > > > > > > > > > Why do you need an additional -fparallelism? Wouldn't > > > > > -fwpa=... be a better match, matching -flto=...? As we already

Re: [RFC] Old school parallelization of WPA streaming

2013-12-06 Thread Richard Biener
On Fri, 6 Dec 2013, Jan Hubicka wrote: > > On Thu, 21 Nov 2013, Jan Hubicka wrote: > > > > > > > > > > Why do you need an additional -fparallelism? Wouldn't > > > > -fwpa=... be a better match, matching -flto=...? As we already > > > > pass down a -fwpa option to WPA this would make things eas

Re: [RFC] Old school parallelization of WPA streaming

2013-12-05 Thread Jan Hubicka
> On Thu, 21 Nov 2013, Jan Hubicka wrote: > > > > > > > Why do you need an additional -fparallelism? Wouldn't > > > -fwpa=... be a better match, matching -flto=...? As we already > > > pass down a -fwpa option to WPA this would make things easier, no? > > > > My plan was to possibly use same o

Re: [RFC] Old school parallelization of WPA streaming

2013-11-21 Thread Richard Biener
On Thu, 21 Nov 2013, Jan Hubicka wrote: > > > > Why do you need an additional -fparallelism? Wouldn't > > -fwpa=... be a better match, matching -flto=...? As we already > > pass down a -fwpa option to WPA this would make things easier, no? > > My plan was to possibly use same option later for

Re: [RFC] Old school parallelization of WPA streaming

2013-11-21 Thread Jan Hubicka
> > Why do you need an additional -fparallelism? Wouldn't > -fwpa=... be a better match, matching -flto=...? As we already > pass down a -fwpa option to WPA this would make things easier, no? My plan was to possibly use same option later for parallelizing more parts of compiler, not only WPA st

Re: [RFC] Old school parallelization of WPA streaming

2013-11-21 Thread Richard Biener
On Thu, 21 Nov 2013, Jan Hubicka wrote: > Hi, > I am not sure where we converged concerning the fork trick. I am using it in > my > tree for months and it does save my waiting time for WPA compilations, so I am > re-attaching the patch. > > Does it seem reasonable for mainline? > > As for other

Re: [RFC] Old school parallelization of WPA streaming

2013-11-20 Thread Jan Hubicka
Hi, I am not sure where we converged concerning the fork trick. I am using it in my tree for months and it does save my waiting time for WPA compilations, so I am re-attaching the patch. Does it seem reasonable for mainline? As for other plans mentioned on this thread > > > > I still have some i

Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Andi Kleen
On Thu, Aug 29, 2013 at 03:58:45PM +0200, Jan Hubicka wrote: > > > Said that, I now have the fork() patch in all my trees and enjoy 50% > > > faster > > > WPA times. I changed my mind about the claim that streaming should be disk > > > bound - > > > it is hard to hope for disk boundness for somethin

Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Jan Hubicka
> > Said that, I now have the fork() patch in all my trees and enjoy 50% faster > > WPA times. I changed my mind about the claim that streaming should be disk > > bound - > > it is hard to hope for disk boundness for something that should fit in > > cache. > > It should at least limit its fork rate

Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Richard Biener
On Thu, 29 Aug 2013, Jan Hubicka wrote: > Jakub, > I am adding you to CC since I put my current thoughts on LTO and debug info > in here. > > > Fork-fire-forget is really a much simpler choice here IMO; no worries > > > about shared resources, less debug hassle. > > > > It might be not as cheap a

Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Michael Matz
Hi, On Thu, 29 Aug 2013, Richard Biener wrote: > > Fork-fire-forget is really a much simpler choice here IMO; no worries > > about shared resources, less debug hassle. > > It might be not as cheap as it is on Linux hosts on other hosts of > course. Sure. Don't use it there then. Not a reason

Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Jan Hubicka
Jakub, I am adding you to CC since I put my current thoughts on LTO and debug info in here. > > Fork-fire-forget is really a much simpler choice here IMO; no worries > > about shared resources, less debug hassle. > > It might be not as cheap as it is on Linux hosts on other hosts of > course. Als

Re: [RFC] Old school parallelization of WPA streaming

2013-08-29 Thread Richard Biener
On Wed, 28 Aug 2013, Michael Matz wrote: > Hi, > > On Wed, 21 Aug 2013, Richard Biener wrote: > > > I also fail to see why threads should not work here. Maybe simply > > annotate gcc with openmp? > > Threads simply don't work here, because the whole streamer infrastructure > (or anything els

Re: [RFC] Old school parallelization of WPA streaming

2013-08-28 Thread Michael Matz
Hi, On Wed, 21 Aug 2013, Richard Biener wrote: > I also fail to see why threads should not work here. Maybe simply > annotate gcc with openmp? Threads simply don't work here, because the whole streamer infrastructure (or anything else in GCC for that matter) isn't thread safe (you'd have to
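The contrast drawn here between threads and fork can be illustrated with a small C sketch. It is not taken from GCC: `global_counter` is a hypothetical stand-in for the compiler's global state, and `counter_after_child_mutation` is an illustrative helper. The point it shows is that a forked child writes to its own copy of the globals, so code that is not thread safe can still run in parallel processes without locking.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for global, non-thread-safe compiler state.  */
static int global_counter = 0;

/* Fork a child that modifies the global, wait for it, and return the
   value the parent sees afterwards (or -1 on error).  */
int
counter_after_child_mutation (void)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      /* This write only touches the child's private copy.  */
      global_counter = 42;
      _exit (global_counter == 42 ? 0 : 1);
    }

  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  if (!WIFEXITED (status) || WEXITSTATUS (status) != 0)
    return -1;

  /* The parent's copy is unchanged by the child's write.  */
  return global_counter;
}
```

With threads the same write would be visible to every other thread, which is why the shared state would need auditing first.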

Re: [RFC] Old school parallelization of WPA streaming

2013-08-21 Thread Jan Hubicka
> > We should also use a faster compressor Yep, at least once it arrives higher in profiles. So far other stuff is a lot slower. > > > For -flto=jobserver I simply fork all 32 processes. It may not be a > > disaster, > > but perhaps we should figure out how to communicate with jobserver. At

Re: [RFC] Old school parallelization of WPA streaming

2013-08-21 Thread Jan Hubicka
> > > >One risk is if someone streams to a spinning disk it may add more seeks > >for > >the parallel IO. But I think it's a reasonable tradeoff. > > It'll also wreck all WPA dump files. We do not dump anything during the main streaming. If we now stream 2GB for firefox, I think we can hope t

Re: [RFC] Old school parallelization of WPA streaming

2013-08-21 Thread Andi Kleen
> I also fail to see why threads should not work here. Maybe simply annotate > gcc with openmp? Don't you have to set an environment variable to set the number of threads for openmp? Otherwise it sounds like a reasonable way to do it. -Andi

Re: [RFC] Old school parallelization of WPA streaming

2013-08-21 Thread Richard Biener
Andi Kleen wrote: >On Wed, Aug 21, 2013 at 04:17:48PM +0200, Jan Hubicka wrote: >> Hi, >> this is my attempt to bring GCC into wonderful era of multicore CPUs >:) >> It is a hack, but it seems to help quite a lot. About 50% of WPA >time is spent >> by streaming the individual ltrans .o files. Th

Re: [RFC] Old school parallelization of WPA streaming

2013-08-21 Thread Andi Kleen
On Wed, Aug 21, 2013 at 04:17:48PM +0200, Jan Hubicka wrote: > Hi, > this is my attempt to bring GCC into wonderful era of multicore CPUs :) > It is a hack, but it seems to help quite a lot. About 50% of WPA time is > spent > by streaming the individual ltrans .o files. This can be easily parall

[RFC] Old school parallelization of WPA streaming

2013-08-21 Thread Jan Hubicka
Hi, this is my attempt to bring GCC into wonderful era of multicore CPUs :) It is a hack, but it seems to help quite a lot. About 50% of WPA time is spent by streaming the individual ltrans .o files. This can be easily parallelized by fork - we do nothing afterwards, just exit and pass the list t
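The fork approach described in this post, together with the later request in the thread to limit the fork rate, can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the patch itself: `stream_partitions`, `write_partition`, `reap_one`, the `prefix` file naming and the `max_jobs` cap are all hypothetical names, and the body of `write_partition` merely stands in for streaming one ltrans partition.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for streaming partition INDEX to disk.
   Returns 0 on success, nonzero on failure.  */
static int
write_partition (const char *prefix, int index)
{
  char name[256];
  snprintf (name, sizeof name, "%s.ltrans%d.o", prefix, index);
  FILE *f = fopen (name, "w");
  if (!f)
    return 1;
  fprintf (f, "partition %d\n", index);
  return fclose (f) != 0;
}

/* Wait for one child; return 1 if it failed, 0 otherwise.  */
static int
reap_one (void)
{
  int status;
  if (wait (&status) < 0)
    return 1;
  return !(WIFEXITED (status) && WEXITSTATUS (status) == 0);
}

/* Write NPARTS partitions with at most MAX_JOBS children running at
   once.  Returns the number of partitions that failed.  */
int
stream_partitions (const char *prefix, int nparts, int max_jobs)
{
  int running = 0, failed = 0;
  assert (max_jobs > 0);

  for (int i = 0; i < nparts; i++)
    {
      /* Throttle: wait for a free slot before forking again.  */
      if (running == max_jobs)
        {
          failed += reap_one ();
          running--;
        }

      fflush (NULL);  /* Keep buffered output from being duplicated.  */
      pid_t pid = fork ();
      if (pid < 0)
        failed += write_partition (prefix, i);  /* Serial fallback.  */
      else if (pid == 0)
        /* Child: write one file and exit; nothing else runs here.  */
        _exit (write_partition (prefix, i));
      else
        running++;
    }

  /* Parent: collect the remaining children before continuing.  */
  while (running-- > 0)
    failed += reap_one ();
  return failed;
}
```

A call such as `stream_partitions ("/tmp/demo", 32, 8)` would write 32 files with no more than 8 children alive at a time. Because each child exits right after writing its file, the parent needs nothing back from it except the exit status.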