On Sun, May 15, 2011 at 05:02:37PM +0300, Blue Swirl wrote: > On Sun, May 15, 2011 at 4:43 PM, Aurelien Jarno <aurel...@aurel32.net> wrote: > > On Sun, May 15, 2011 at 04:16:20PM +0300, Blue Swirl wrote: > >> On Sun, May 15, 2011 at 3:49 PM, Aurelien Jarno <aurel...@aurel32.net> > >> wrote: > >> > On Sun, May 15, 2011 at 03:30:17PM +0300, Blue Swirl wrote: > >> >> On Sun, May 15, 2011 at 2:36 PM, Aurelien Jarno <aurel...@aurel32.net> > >> >> wrote: > >> >> > On Sun, May 15, 2011 at 02:12:27PM +0300, Blue Swirl wrote: > >> >> >> On Sun, May 15, 2011 at 1:19 PM, Aurelien Jarno > >> >> >> <aurel...@aurel32.net> wrote: > >> >> >> > I still don't get where are this list of possible changes? Did I > >> >> >> > miss > >> >> >> > something in another thread? > >> >> >> > >> >> >> I'm referring to the patches I sent. > >> >> > > >> >> > Ok, patches 1, 2 and 4 to 7 looks ok at a first glance, though I think > >> >> > patches 6 and 7 should be done for all hosts or none of them. > >> >> > >> >> The changes can be done in steps, but of course removing temp_buf from > >> >> CPUState would need all targets to be converted first. > >> >> > >> >> >> > On the TCG generated code, the env structure is used almost for > >> >> >> > every > >> >> >> > op, so it really makes sense to keep it in a register instead of > >> >> >> > having to > >> >> >> > reload the address of env regularly from memory. Given it only > >> >> >> > affects > >> >> >> > TCG generated code, I don't see the point of portability here. > >> >> >> > >> >> >> For example, maybe the bugs in Sparc glibc could be avoided by using > >> >> >> one of %i set of registers (not accessible from helpers) for AREG0 > >> >> >> within generated code instead of %g registers which seem to be > >> >> >> fragile. > >> >> > > >> >> > First of all, but it's a different subject, I am not sure there are > >> >> > sparc glibc bugs, I'd rather says QEMU mis-uses some register. For > >> >> > example the following code is probably wrong: > >> >> > > >> >> > /* Note: must be synced with dyngen-exec.h */ > >> >> > #ifdef CONFIG_SOLARIS > >> >> > #define TCG_AREG0 TCG_REG_G2 > >> >> > #elif defined(__sparc_v9__) > >> >> > #define TCG_AREG0 TCG_REG_G5 > >> >> > #else > >> >> > #define TCG_AREG0 TCG_REG_G6 > >> >> > #endif > >> >> > > >> >> > __sparc_v9__ can set on the 32-bit ABI, when the compiler targets V8+, > >> >> > so the condition is probably wrong there. Secondly the SPARC ABI [1] > >> >> > on > >> >> > page 23 that %g5 to %g7 are reserved for system. I don't think QEMU > >> >> > has > >> >> > the right to use this registers. > >> >> > >> >> Yes, but the situation is not so nice. Please see this post for status > >> >> as of 2010: > >> >> http://article.gmane.org/gmane.comp.emulators.qemu/63610 > >> >> > >> >> This is from Debian glibc 2.11.2-10: > >> >> $ file /lib/libc-2.11.2.so > >> >> /lib/libc-2.11.2.so: ELF 32-bit MSB shared object, SPARC32PLUS, > >> >> version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux > >> >> 2.6.18, stripped > >> >> $ objdump -d /lib/libc.so.6 |grep %g1|wc -l > >> >> 69648 > >> >> $ objdump -d /lib/libc.so.6 |grep %g2|wc -l > >> >> 37299 > >> >> $ objdump -d /lib/libc.so.6 |grep %g3|wc -l > >> >> 20635 > >> >> $ objdump -d /lib/libc.so.6 |grep %g4|wc -l > >> >> 11603 > >> >> $ objdump -d /lib/libc.so.6 |grep %g5|wc -l > >> >> 448 > >> >> $ objdump -d /lib/libc.so.6 |grep %g6|wc -l > >> >> 150 > >> >> $ objdump -d /lib/libc.so.6 |grep %g7|wc -l > >> >> 3052 > >> >> > >> >> Glibc is compiled for Sparc32plus, so it should only use %g6 and %g7, > >> > > >> > From the calling convention point of view, sparc32 and sparc32plus are > >> > the same ABI, so %g5 is also reserved for system use. > >> > > >> >> or %g1 and %g5 for scratch purposes. However, it is the application > >> >> registers %g2 to %g4 that are used heaviest. Looking inside the > >> >> objdump it's easy to see that the uses are not for example saving or > >> >> restoring, but actually using them without saving the previous value > >> >> first: > >> > > >> > Well, we have to define system and application. System is defined as > >> > library in Chapter 6, and I don't see the libc there, and is probably > >> > considered as part of the application. > >> > >> No, for example unistd.h is described and even X11. GCC also says that > >> libraries should be compiled without using the registers: > >> > >> http://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcc/SPARC-Options.html#SPARC-Options > >> > >> >> 000211e0 <__divdi3>: > >> >> 211e0: 9d e3 bf a0 save %sp, -96, %sp > >> >> 211e4: 90 10 00 18 mov %i0, %o0 > >> >> 211e8: 92 10 00 19 mov %i1, %o1 > >> >> 211ec: 94 10 00 1a mov %i2, %o2 > >> >> 211f0: 96 10 00 1b mov %i3, %o3 > >> >> 211f4: 80 a6 20 00 cmp %i0, 0 > >> >> 211f8: 06 40 00 10 bl,pn %icc, 21238 <__divdi3+0x58> > >> >> 211fc: a0 10 20 00 clr %l0 > >> >> 21200: 80 a2 a0 00 cmp %o2, 0 > >> >> 21204: 26 40 00 13 bl,a,pn %icc, 21250 <__divdi3+0x70> > >> >> 21208: a0 38 00 10 xnor %g0, %l0, %l0 > >> >> 2120c: 7f ff fe ed call 20dc0 <__ashldi3+0x40> > >> >> 21210: 98 10 20 00 clr %o4 > >> >> 21214: 84 10 00 08 mov %o0, %g2 > >> >> > >> >> ...whoops... > >> >> > >> >> 21218: 80 a4 20 00 cmp %l0, 0 > >> >> 2121c: 02 40 00 04 be,pn %icc, 2122c <__divdi3+0x4c> > >> >> 21220: 86 10 00 09 mov %o1, %g3 > >> >> > >> >> ...whoops... > >> >> > >> >> 21224: 86 a0 00 09 subcc %g0, %o1, %g3 > >> >> 21228: 84 60 00 02 subc %g0, %g2, %g2 > >> >> 2122c: b2 10 00 03 mov %g3, %i1 > >> >> 21230: 81 cf e0 08 rett %i7 + 8 > >> >> 21234: 90 10 00 02 mov %g2, %o0 > >> >> 21238: 92 a0 00 19 subcc %g0, %i1, %o1 > >> >> 2123c: 90 60 00 18 subc %g0, %i0, %o0 > >> >> 21240: 80 a2 a0 00 cmp %o2, 0 > >> >> 21244: 16 4f ff f2 bge %icc, 2120c <__divdi3+0x2c> > >> >> 21248: a0 10 3f ff mov -1, %l0 > >> >> 2124c: a0 38 00 10 xnor %g0, %l0, %l0 > >> >> 21250: 96 a0 00 0b subcc %g0, %o3, %o3 > >> >> 21254: 10 6f ff ee b %xcc, 2120c <__divdi3+0x2c> > >> >> 21258: 94 60 00 0a subc %g0, %o2, %o2 > >> >> 2125c: 01 00 00 00 nop > >> >> > >> >> This is libc from OpenBSD/Sparc64 4.9: > >> >> $ objdump -d /usr/lib/libc.so.58.0 |grep %g1|wc -l > >> >> 40562 > >> >> $ objdump -d /usr/lib/libc.so.58.0 |grep %g2|wc -l > >> >> 20384 > >> >> $ objdump -d /usr/lib/libc.so.58.0 |grep %g3|wc -l > >> >> 10240 > >> >> $ objdump -d /usr/lib/libc.so.58.0 |grep %g4|wc -l > >> >> 6606 > >> >> $ objdump -d /usr/lib/libc.so.58.0 |grep %g5|wc -l > >> >> 3811 > >> >> $ objdump -d /usr/lib/libc.so.58.0 |grep %g6|wc -l > >> >> 4 > >> >> $ objdump -d /usr/lib/libc.so.58.0 |grep %g7|wc -l > >> >> 20 > >> >> > >> >> Not so great there either. > >> >> > >> >> > Anyway, I don't see why keeping TCG_AREG0 inside the TCG generated > >> >> > code > >> >> > would prevent you to use a register from the %i set for it. > >> >> > >> >> The helpers currently use global env register, but %i registers can't > >> >> be accessed from the next level of function call nesting hierarchy so > >> >> they can't be used for global env. > >> >> > >> > > >> > That's the current situation yes. Using %i registers for TCG_AREG0 does > >> > mean you can't use a global env register in the helpers, but it doesn't > >> > mean that internal TCG code can't use them for TCG_AREG0. > >> > >> Exactly. > >> > >> > What I am telling you since the beginning is that: > >> > - I have no objection that we stop using a fixed register in GCC > >> > generated code (that is completely removing HELPER_CFLAGS). However I > >> > don't really see the point of doing that, though the Sparc issue might > >> > be an argument. > >> > - I do have objection to remove TCG_AREG0 from inside the TCG generated > >> > code, this register is used for almost every TCG op, and I don't see > >> > any real argument for not keeping it. > >> > >> I'm pretty much open at this point for all alternatives. > >> > > > > So what about getting rid of TCG_AREG0 for GCC generated code only, at > > least as a first step? > > > > So what about the following changes: > > - Change TCG_AREG0 of all targets to a callee saved register (if > > possible, e.g. sparc) > > - Change the prologue of all TCG targets to take env as an argument, > > and save it into TCG_AREG0. > > This can be the first step. > > > - Change all helpers to explicitly take an env pointer instead of using > > the fixed register. Note that it also includes softmmu helpers, but > > the TCG load/store instructions should be kept unchanged. > > I think this step will lose performance slightly if TCG is not changed.
Agreed. What do you have in mind by "if TCG is not changed"? > > - Remove HELPER_CFLAGS from makefiles when all helpers have been > > changed. > > This should restore most of the performance loss from previous step, > maybe even improve. > > > - TCG_AREG0 can then be changed to another register if needed. > > I'd combine this with callee saved register change. Anyway, at this > point there should be a lot of flexibility with the register choice. > > > And later we can do more steps to get a complete removal of TCG_AREG0, > > including in TCG code, though I still think it is a really bad idea. > > Maybe. At this point most of the hard work has been done, so it's > possible to make experiments. That's exactly my point. We more or less agree on previous step, not on this one, so we will need to experiment for that. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net