So, by default it is running in fast mode.

With best wishes,
Leonid

-----Original Message-----
From: carsten.m...@gmail.com [mailto:carsten.m...@gmail.com] On Behalf Of ext Carsten Munk
Sent: 20 January, 2011 11:24
To: Moiseichuk Leonid (Nokia-MS/Helsinki)
Cc: thi...@kde.org; meego-dev@meego.com
Subject: Re: [MeeGo-dev] ARM RunFast by default in glibc

For those interested, on a Nokia N900 running hardfp-meego-runfast-by-default:

[root@localhost fpumode]# ./fpumodetest
* run scalar_load<float> testing => 28.499384 seconds
* run vector_load<float> testing => 14.4294176433 seconds
* run scalar_load<double> testing => 28.508026 seconds
* run vector_load<double> testing => 21.349396 seconds

[root@localhost fpumode]# LD_PRELOAD=$PWD/fpumode-ieee.so ./fpumodetest
* fpu mode build Oct 29 2010 14:38:19
* current fpu mode is 0x03000000 [RUN FAST]
* changing mode to 0x00001f00 [IEEE]
* run scalar_load<float> testing => 28.511074 seconds
* run vector_load<float> testing => 21.4294921398 seconds
* run scalar_load<double> testing => 29.4294486949 seconds
* run vector_load<double> testing => 21.333160 seconds

[root@localhost fpumode]# LD_PRELOAD=$PWD/fpumode-fast.so ./fpumodetest
* fpu mode build Oct 29 2010 14:38:19
* current fpu mode is 0x03000000 [RUN FAST]
* changing mode to 0x03000000 [RUN FAST]
* run scalar_load<float> testing => 29.4294486369 seconds
* run vector_load<float> testing => 13.230743 seconds
* run scalar_load<double> testing => 29.4294424724 seconds
* run vector_load<double> testing => 21.397857 seconds

/Carsten

2011/1/14 <leonid.moiseic...@nokia.com>:
> See the attached package. The Readme contains an example of usage. The binaries
> are compiled for ca8-hardfp, so you can launch them without recompilation if
> the hardware is the same.
>
> With best wishes,
> Leonid
>
>
> -----Original Message-----
> From: carsten.m...@gmail.com [mailto:carsten.m...@gmail.com] On Behalf Of ext
> Carsten Munk
> Sent: 14 January, 2011 11:21
> To: Moiseichuk Leonid (Nokia-MS/Helsinki)
> Cc: thi...@kde.org; meego-dev@meego.com
> Subject: Re: [MeeGo-dev] ARM RunFast by default in glibc
>
> 2011/1/14 <leonid.moiseic...@nokia.com>:
>> Enabling run-fast mode using -ffast-math is a non-trivial hack. It would also
>> require updating packages for compilation flags or global options.
>> Patching glibc is much cheaper to implement and safer.
>>
>> In the ideal case the speedup on Cortex-A8 could be around 40% (depending on
>> vector/matrix size); even non-vector operations with floats improve by a
>> margin of more than 10%.
>> BUT all float/double operations could be affected: you may get a
>> different outcome in comparison to IEEE mode.
>
> Got any good benchmarks/tools we can run so we can verify this on
> actual MeeGo hardfp?
>>
>> With best wishes,
>> Leonid
>>
>>
>> -----Original Message-----
>> From: meego-dev-boun...@meego.com [mailto:meego-dev-boun...@meego.com] On
>> Behalf Of ext Thiago Macieira
>> Sent: 12 January, 2011 17:55
>> To: meego-dev@meego.com
>> Subject: Re: [MeeGo-dev] ARM RunFast by default in glibc
>>
>> On Wednesday, 12 January 2011 16:01:31 Carsten Munk wrote:
>>> 2011/1/12 Arjan van de Ven <ar...@linux.intel.com>:
>>> > On 1/12/2011 1:06 AM, Carsten Munk wrote:
>>> >> Hi (ARM toolchain group mostly)
>>> >>
>>> >> Do we have a patch for glibc-2.11-12-g24c0bf7 and/or glibc-2.12.1
>>> >> that enables ARM RunFast[1] mode by default anywhere? Would be good
>>> >> to push it along with hardfp while we're at it and getting things
>>> >> tested through.
>>> >
>>> > can this be turned into something that's passed in via CFLAGS ?
>>> > that way apps will not be surprised, and there is an easy way for us
>>> > to toggle
>>
>> Right now, it's a context configuration, so there's nothing that will really
>> work from CFLAGS. Without changing gcc, the only thing we could do is supply
>> two different crt1.o files: one that puts the FPU in RunFast, and one that
>> doesn't.
>>
>> But this will, like I said, apply to all code within a process, so it
>> doesn't help the library case. Libraries will need to cope with running in
>> both modes.
>>
>>> > of course we can have a default in our OBS that you pick, but it
>>> > becomes an easy-to-manage (from a distro perspective) property
>>>
>>> That would be a better way than patching glibc, I would believe?
>>
>> Not necessarily. To do the right thing, the compiler would need to emit the
>> code that changes FPSCR before any FP operation, so this means an increase
>> in code size. I can also bet that there's a pipeline delay in modifying this
>> register.
>>
>> And there's no such GCC patch.
>>
>>> Wouldn't -ffast-math correspond to this on x86 side at least?
>>>
>>> Leonid, does this correspond to an auto-setup of RunFast on ARM, when
>>> used there?
>>
>> No, it's different.
>>
>> By the way, I should point out that on Cortex-A8, RunFast only has a
>> perceptible improvement for float. If you use double, you still have
>> performance issues.
>>
>> On Cortex-A9, both are fast.
>>
>> --
>> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
>> Senior Product Manager - Nokia, Qt Development Frameworks
>> PGP/GPG: 0x6EF45358; fingerprint:
>> E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
>> _______________________________________________
>> MeeGo-dev mailing list
>> MeeGo-dev@meego.com
>> http://lists.meego.com/listinfo/meego-dev
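
For anyone reading the fpumodetest output in this thread: the two mode words it prints, 0x03000000 and 0x00001f00, can be decoded against the FPSCR bit layout. The sketch below is only an illustration and is not taken from Leonid's package. The function names are hypothetical, and the bit positions (FZ = bit 24, DN = bit 25, trap enables in bits 8-12 and 15) are taken from the ARMv7 VFP register description as I understand it. It treats a word as RunFast when flush-to-zero and default-NaN are both set and no trap enable bit is set, and as IEEE when neither FZ nor DN is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* FPSCR bit positions (assumed from the ARMv7 VFP register layout). */
#define FPSCR_FZ           (1u << 24)   /* flush-to-zero */
#define FPSCR_DN           (1u << 25)   /* default NaN */
#define FPSCR_TRAP_ENABLES 0x00009f00u  /* IOE, DZE, OFE, UFE, IXE, IDE */

/* Hypothetical helper: true when the word has FZ and DN set and no
 * exception trap enable bits set. */
bool fpscr_is_runfast(uint32_t fpscr)
{
    return (fpscr & FPSCR_FZ) != 0
        && (fpscr & FPSCR_DN) != 0
        && (fpscr & FPSCR_TRAP_ENABLES) == 0;
}

/* Hypothetical helper: label a mode word the way the benchmark output does.
 * Words that are neither fully RunFast nor free of FZ/DN are called "mixed". */
const char *fpscr_mode_name(uint32_t fpscr)
{
    if (fpscr_is_runfast(fpscr))
        return "RUN FAST";
    if ((fpscr & (FPSCR_FZ | FPSCR_DN)) == 0)
        return "IEEE";
    return "mixed";
}
```

With these definitions, fpscr_mode_name(0x03000000u) gives "RUN FAST" and fpscr_mode_name(0x00001f00u) gives "IEEE", which matches the labels in the output above. Reading or writing the real register needs ARM-specific code and is deliberately left out here.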