[Bug target/29845] sh floating point emulation is inefficient
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845

--- Comment #9 from Oleg Endo olegendo at gcc dot gnu.org 2012-11-07 21:37:47 UTC ---
Jörn, I was curious whether the soft fpu code of yours is also available as C/C++, or did you write it in asm only? I guess it would be an interesting bunch of code quality tests for the compiler.
[Bug target/29845] sh floating point emulation is inefficient
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845

--- Comment #10 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org 2012-11-07 22:40:39 UTC ---
(In reply to comment #9)
> Jörn, I was curious whether the soft fpu code of yours is also available as
> C/C++, or did you write it in asm only? I guess it would be an interesting
> bunch of code quality tests for the compiler.

I wrote it as asm only. There are a number of C implementations of software floating point available, but the compiler is not much good at combining high-level transformations with streamlined data representation, ABI modification, register allocation and scheduling to make the best use of an architecture.
[Bug target/29845] sh floating point emulation is inefficient
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845

--- Comment #11 from Oleg Endo olegendo at gcc dot gnu.org 2012-11-07 23:33:55 UTC ---
(In reply to comment #10)
> but the compiler is not much good at combining high-level transformations
> with streamlined data representation, ABI modification, register allocation
> and scheduling to make the best use of an architecture.

Do you have any particular example in mind? Generally, true ... but things are improving slowly ;)
[Bug target/29845] sh floating point emulation is inefficient
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845

--- Comment #12 from Jorn Wolfgang Rennecke amylaar at gcc dot gnu.org 2012-11-07 23:56:37 UTC ---
(In reply to comment #11)
> Do you have any particular example in mind?

Just compare the size and performance of the code generated from fp-bit.c with the hand-coded asm. Also observe how comparisons for the latter are lighter on the caller, as fewer registers are clobbered.
[Bug target/29845] sh floating point emulation is inefficient
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845

--- Comment #13 from Oleg Endo olegendo at gcc dot gnu.org 2012-11-08 01:08:51 UTC ---
(In reply to comment #12)
> (In reply to comment #11)
> > Do you have any particular example in mind?
>
> Just compare the size and performance of the code generated from fp-bit.c
> with the hand-coded asm. Also observe how comparisons for the latter are
> lighter on the caller, as fewer registers are clobbered.

OK, I get the point (not much to compare -- it's pretty obvious). Maybe it would make sense to transition the sh soft fp stuff step by step (insn by insn) from the generic fp-bit to custom fp code...
[Bug target/29845] sh floating point emulation is inefficient
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845

Oleg Endo olegendo at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |olegendo at gcc dot gnu.org

--- Comment #8 from Oleg Endo olegendo at gcc dot gnu.org 2012-09-09 11:22:33 UTC ---
Just for the record, a rather recent discussion on the issue:

Original thread start: http://gcc.gnu.org/ml/gcc/2010-06/msg00388.html
Continuation: http://gcc.gnu.org/ml/gcc/2010-07/msg00250.html
Continuation: http://gcc.gnu.org/ml/gcc/2010-08/msg00044.html
Aggregated version: http://www.mentby.com/Group/gcc-discuss/sh-optimized-software-floating-point-routines.html
[Bug target/29845] sh floating point emulation is inefficient
--- Comment #4 from christian dot bruel at st dot com 2007-01-31 13:47 ---
Attached is a patch to fix a few problems:

1) Round to nearest must yield infinity if the infinitely precise result has a magnitude of at least 2^Emax * (2 - 2^-p) (ANSI/IEEE 754-1985, sect. 4.1). The implementation for addsf3 and adddf3 returned a NaN.

2) Infinity in divsf3.S was not set (case of 1.0/0.0).

2007-01-29  Christian Bruel  [EMAIL PROTECTED]

        * config/sh/IEEE-754/m3/adddf3.S: Fix inf mantissa.
        * config/sh/IEEE-754/m3/addsf3.S: Likewise.
        * config/sh/IEEE-754/m3/divsf3.S: Initialize xff00 label.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845
[Bug target/29845] sh floating point emulation is inefficient
--- Comment #5 from christian dot bruel at st dot com 2007-01-31 13:50 ---
Created an attachment (id=12986)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12986&action=view)
Fixes the nearest-to-infinity and divide-by-0 bugs.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845
[Bug target/29845] sh floating point emulation is inefficient
--- Comment #6 from christian dot bruel at st dot com 2007-01-31 13:56 ---
(From update of attachment 12986)
Note: this diff was made in the wrong direction; (-) shows the newest version. Sorry.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845
[Bug target/29845] sh floating point emulation is inefficient
--- Comment #7 from amylaar at gcc dot gnu dot org 2007-02-01 03:41 ---
(In reply to comment #4)
> Attached is a patch to fix a few problems:
>
> 1) Round to nearest must yield infinity if the infinitely precise result has
> a magnitude of at least 2^Emax * (2 - 2^-p) (ANSI/IEEE 754-1985, sect. 4.1).
> The implementation for addsf3 and adddf3 returned a NaN.

Not always a NaN, but it's a bug regardless. Good catch. However, this:

 LOCAL(inf):
+       mov #0,DBLRL

negatively impacts sh3 MA unit scheduling. It might be an idea to align LOCAL(pos_difference_0) for sh3.

-       tst r7,r0
+       tst r7,r0

You have to be careful with the whitespace.

> 2) Infinity in divsf3.S was not set (case of 1.0/0.0).

> * config/sh/IEEE-754/m3/divsf3.S: Initialize xff00 label.

This was supposed to be:

LOCAL(m1):
        .word -1
        .balign 4
LOCAL(xff00):
#ifdef __LITTLE_ENDIAN__
        .word 0
LOCAL(xff00):
        .word 0xff00
#else
LOCAL(xff00):
        .word 0xff00
        .word 0
#endif

saving four bytes over using unshared constants.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845
[Bug target/29845] sh floating point emulation is inefficient
--- Comment #2 from amylaar at gcc dot gnu dot org 2006-11-29 18:52 ---
Created an attachment (id=12708)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12708&action=view)
ChangeLog entries for softfp-diff-20061110

These are the ChangeLog entries for the SH specific code. The ChangeLog entries for the target-independent code are included in the patches for the blocking PRs (28618, 29846 and 29847).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845
[Bug target/29845] sh floating point emulation is inefficient
--- Comment #3 from amylaar at gcc dot gnu dot org 2006-11-29 19:03 ---
Created an attachment (id=12709)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12709&action=view)
ChangeLog entries for softfp-diff-20061110

The previous version was missing the enumeration of two new files.

--

amylaar at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #12708|0                           |1
        is obsolete|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845
[Bug target/29845] sh floating point emulation is inefficient
--- Comment #1 from amylaar at gcc dot gnu dot org 2006-11-15 18:02 ---
Created an attachment (id=12624)
-- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12624&action=view)
patch

This patch has been regression tested on i686-pc-linux-gnu X sh-elf. However, I need approval for the non-SH parts before I can commit it.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29845