Re: Enable SSE math on i386 with -Ofast

2013-10-08 Thread Jan Hubicka
Hi, this is the patch I ended up committing after some further testing. The difference from the initial version is that it now enables SSE math with -ffast-math too, and it does so outside the ugly target macro. Bootstrapped/regtested x86_64-linux, tested with -m32. Honza * config/i386/i386.c (ix86

Re: Enable SSE math on i386 with -Ofast

2013-10-07 Thread Jan Hubicka
> > In the meantime I (partially, since megrez stopped producing 32bit spec2k6
> > results) benchmarked -mfpmath=sse,387 and it does not seem to be a loss
> > anymore. So perhaps we can give it a try?
>
> Not sure ... I would guess that it's not a win on any recent architecture
> (and LRA i

Re: Enable SSE math on i386 with -Ofast

2013-10-07 Thread Richard Biener
On Mon, 7 Oct 2013, Jan Hubicka wrote:

> > On Fri, 4 Oct 2013, Jan Hubicka wrote:
> >
> > > Hi,
> > > this patch makes -Ofast also imply -mfpmath=sse. It is an important win on
> > > SPECfP (2000 and 2006). Even though, for example, the following
> > > float a(float b)
> > > {
> > >   retu

Re: Enable SSE math on i386 with -Ofast

2013-10-07 Thread Jan Hubicka
> On Fri, 4 Oct 2013, Jan Hubicka wrote:
>
> > Hi,
> > this patch makes -Ofast also imply -mfpmath=sse. It is an important win on
> > SPECfP (2000 and 2006). Even though, for example, the following
> > float a(float b)
> > {
> >   return b+10;
> > }
> >
> > results in somewhat ridiculous
> > a:
>

Re: Enable SSE math on i386 with -Ofast

2013-10-07 Thread Richard Biener
On Fri, 4 Oct 2013, Jan Hubicka wrote:

> Hi,
> this patch makes -Ofast also imply -mfpmath=sse. It is an important win on
> SPECfP (2000 and 2006). Even though, for example, the following
> float a(float b)
> {
>   return b+10;
> }
>
> results in somewhat ridiculous
> a:
> .LFB0:
> .cfi

Enable SSE math on i386 with -Ofast

2013-10-04 Thread Jan Hubicka
Hi,
this patch makes -Ofast also imply -mfpmath=sse. It is an important win on
SPECfP (2000 and 2006). Even though, for example, the following

float a(float b)
{
   return b+10;
}

results in somewhat ridiculous

a:
.LFB0:
        .cfi_startproc
        subl    $4, %esp
        .cfi_def_cfa_offset
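
To see which floating-point unit a given set of flags selects, a small probe along the following lines may help. This is only a sketch and not part of the patch: the helper names `float_math_unit` and `flt_eval_method` are made up for illustration, and it assumes a GCC-compatible compiler that predefines `__SSE_MATH__` when SSE math is in effect for `float`, plus the C99 `FLT_EVAL_METHOD` macro from `<float.h>`.

```c
#include <float.h>

/* The test function from the original post. */
float a(float b)
{
    return b + 10;
}

/* Hypothetical helper: reports which unit the compiler uses for float math.
   Assumes GCC's __SSE_MATH__ predefined macro; other compilers may not
   define it, in which case this falls through to the second branch. */
const char *float_math_unit(void)
{
#if defined(__SSE_MATH__)
    return "sse";
#else
    return "387 (or unknown)";
#endif
}

/* Hypothetical helper: exposes the C99 evaluation-method macro as a value. */
int flt_eval_method(void)
{
    return (int)FLT_EVAL_METHOD;
}
```

Calling `float_math_unit()` from a build made with, for example, `gcc -m32 -O2 -mfpmath=387` and again from one made with `gcc -m32 -O2 -msse2 -mfpmath=sse` should show the difference. `flt_eval_method()` would typically be 2 for x87 math and 0 for SSE math, though that depends on the compiler and target.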