Re: [Qemu-devel] floppy disk
On Mon, 2007-12-17 at 03:28 +, Thiemo Seufer wrote:
> Benjamin David Lunt wrote:
> > Hi everyone,
> >
> > I only recently started to use QEmu, following a request on the alt.os.development usenet group. My OS was not working on QEmu because it would not recognize the (emulated) floppy. After a lot of testing, I found that QEmu does not return the correct values for a Sense Interrupt command and the Status 0 byte. Looking over fdc.c, I see that someone has placed a hack right where it should return these values. I am just wondering if this hack is temporary or if it will be committed as code.
>
> That's the current state of the code in CVS.

This hack is terribly buggy and is likely to break the status of all commands...

> > For a more detailed description, QEmu returns the value 0x20 while Bochs, VMware, and real hardware return the values 0xC0, 0xC1, 0xC2, and 0xC3, for each drive 1 - 4 respectively.

The polling mode is not implemented (and is even emulated in real hardware) and only exists for 8" drive compatibility, according to the Intel specification; it therefore has no meaning when using other drive formats. Any reasonable OS would disable this feature with a CONFIGURE command, as it's pointless on any modern machine (all PCs, for example). But it can easily be added by implementing the missing internal emulated drive status change register, as described in the 82078 datasheet.

> > I am asking for more information on this subject.
>
> The interrupt status handling in QEMU's FDC emulation looks bogus to me, patches to fix it are welcome. :-)

After taking a look at the specs and the code, it seems to me that the greatest bug here is the hack added in the SENSE_INTERRUPT_STATUS command answer. The 'SE' bit in status register 0 is set properly in case of an 'implied seek' without the hack.
And, as far as I can see, not all hardware updates this bit after read and write commands (the Intel 82078 does, NS and SMC superIOs do not, according to their datasheets), so it should not hurt any OS not to have it set after a READ or WRITE command.

So, the hack is really more buggy than the previous code:
- the SE bit (0x20) should be set only when there has been an implied seek for a READ or WRITE command. This is hopefully not the common case, as OSes always try to do sequential reads/writes for performance reasons, and it is properly handled, at least in the DMA transfer case. There's a case to be checked for PIO transfers: the bit seems always set, even if no implied seek was done, which is buggy.
- the SE bit is never set by some hardware on any READ or WRITE command, so any code that relies on this bit being set on those commands won't run on real hardware and is broken.
- it breaks the status of all other commands.

One case that seems obviously incorrect is SENSE_INTERRUPT_STATUS: it should be treated as an invalid command when there is no interrupt condition set.

The following patch implements the polling mode and the invalid SENSE_INTERRUPT_STATUS case.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized

Index: hw/fdc.c
===
RCS file: /sources/qemu/qemu/hw/fdc.c,v
retrieving revision 1.33
diff -u -d -d -p -r1.33 fdc.c
--- hw/fdc.c 17 Nov 2007 17:14:41 - 1.33
+++ hw/fdc.c 17 Dec 2007 14:17:10 -
@@ -399,6 +399,8 @@ struct fdctrl_t {
     uint8_t lock;
     /* Power down config (also with status regB access mode */
     uint8_t pwrd;
+    /* Drive status change emulation */
+    uint8_t drstch;
     /* Sun4m quirks? */
     int sun4m;
     /* Floppy drives */
@@ -674,7 +676,7 @@ static void fdctrl_raise_irq (fdctrl_t *
         fdctrl->int_status = status;
         return;
     }
-    if (~(fdctrl->state & FD_CTRL_INTR)) {
+    if (!(fdctrl->state & FD_CTRL_INTR)) {
         qemu_set_irq(fdctrl->irq, 1);
         fdctrl->state |= FD_CTRL_INTR;
     }
@@ -698,6 +700,8 @@ static void fdctrl_reset (fdctrl_t *fdct
     fdctrl->data_dir = FD_DIR_WRITE;
     for (i = 0; i < MAX_FD; i++)
         fd_reset(&fdctrl->drives[i]);
+    /* Initialize the emulated drive status change register */
+    fdctrl->drstch = 0xF;
     fdctrl_reset_fifo(fdctrl);
     if (do_irq)
         fdctrl_raise_irq(fdctrl, 0xc0);
@@ -1410,17 +1414,36 @@ static void fdctrl_write_data (fdctrl_t
         /* SENSE_INTERRUPT_STATUS */
         FLOPPY_DPRINTF("SENSE_INTERRUPT_STATUS command (%02x)\n",
                        fdctrl->int_status);
-        /* No parameters cmd: returns status if no interrupt */
+        /* No parameters cmd: returns interrupt status */
+        if (!(fdctrl->config & 0x10) && fdctrl->drstch != 0x00) {
+            /* If emulated polling mode is active... */
+            int i;
+            for (i = 0; i < 4; i++) {
+                if (fdctrl->drstch & (1 << i)) {
+                    /* Emulate 8" drive 'i' status change */
+                    fdctrl->fifo[0] = 0xC0 | i;
+                    fdctrl->drstch &= ~(1 << i
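The polling-mode idea described in the message above can be sketched on its own. This is a minimal illustration and not the QEMU code: the struct and function names are hypothetical, the 0x10 CONFIGURE bit is taken from the patch, and the 0x80 "invalid command" ST0 value is an assumption based on the 82078 datasheet convention.

```c
#include <stdint.h>

/* Hypothetical, simplified controller state (not QEMU's fdctrl_t). */
struct fdc_sketch {
    uint8_t config;   /* bit 4 (0x10) set: polling disabled by CONFIGURE */
    uint8_t drstch;   /* one pending "status changed" bit per drive 0..3 */
};

/* After reset, every drive has a pending status change to report. */
static void fdc_sketch_reset(struct fdc_sketch *s)
{
    s->config = 0x00;
    s->drstch = 0x0F;
}

/*
 * SENSE INTERRUPT STATUS: report the lowest-numbered pending drive as
 * ST0 = 0xC0 | drive and clear its pending bit. With nothing pending
 * (or polling disabled), answer as an invalid command (assumed 0x80).
 */
static uint8_t fdc_sketch_sense_interrupt(struct fdc_sketch *s)
{
    if (!(s->config & 0x10)) {
        for (int i = 0; i < 4; i++) {
            if (s->drstch & (1u << i)) {
                s->drstch &= (uint8_t)~(1u << i);
                return (uint8_t)(0xC0 | i);
            }
        }
    }
    return 0x80;
}
```

Four consecutive calls after a reset would yield 0xC0 through 0xC3, which matches the values reported for Bochs, VMware and real hardware earlier in the thread.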
Re: [Qemu-devel] qemu hw/ppc_oldworld.c target-ppc/cpu.h target-...
On Sat, 2007-11-24 at 00:52 +, Paul Brook wrote:
> > > By your own admission, we can get away with not calculating the high 32 bits of the register. It follows that the high bits are completely meaningless.
> >
> > Not completely. There are even some ways to do 64-bit computations when running in 32-bit mode... Some may see this as an architecture hack, but this gives the only way to switch from 32-bit to 64-bit mode (as the sixty-four MSR bit lies in the highest bits of the register).
>
> Anything that involves switching to 64-bit mode to see the results is irrelevant because we don't implement that. You can't have it both ways. Either you need to implement the full 64-bit gpr for correctness, in which case I guess we're most of the way to scrapping ppc-softmmu and using ppc64-softmmu all the time, or the high bits are not part of the interesting CPU state.

Yes, when running on a 64-bit host, we could avoid compiling ppc-softmmu. It's still interesting to use it on 32-bit hosts, as an optimisation, because it runs much faster than the ppc64-softmmu version.

> I can believe that on some hosts it's cheaper to use a 64-bit gpr_t, and the architecture/implementation is such that it gives the same results as a 32-bit gpr_t. However this is an implementation detail, and should not be exposed to the user.
>
> > To complicate the situation, it's also required that standard implementations do all computations on 64-bit values
>
> Really? Are you sure? I can understand the architecture being defined in terms of 64-bit gprs. However if the high half of those registers is never visible to the application/OS then those aren't actually requirements, they're just a convenient shorthand for avoiding having to describe everything twice.
>
> > > I disagree. qemu is implementing ppc32.
> >
> > which does not exist.
>
> Well, I admit I've invented the term ppc32, but there are dozens of 32-bit PowerPC chips. I'd be amazed if they do 64-bit computations or have 64-bit GPRs. SPE doesn't count as the high half is effectively a separate register file on 32-bit cores.

OK. Maybe I did not state this properly: the spec says that if an implementation (said embedded...) does not implement any 64-bit operation, it may also optionally avoid using 64-bit GPRs. And of course this can mean that no 64-bit computation is implemented in the silicon. But this is not defined as the normal behavior of a PowerPC CPU. As you said, this does not change anything when seen from the execution environment, even if it's not architecturally correct.

SPE is not using separate registers. The specification actually says the GPRs are 64 bits even on 32-bit implementations if SPE is implemented. SPE operations affect all 64 bits of the register and this can then be seen with standard PowerPC operations.

I did take care of your remarks about the buggy prints and made a general pass over all the target-ppc code (it seems there were even more issues). I also made the SPE part of the GPRs available in CPU dumps. This leads me to say that if we ever want to change the behavior of the 32-bit PowerPC emulation, it will only need a one-line patch. For now I would really like to keep the current behavior, which is architecturally correct and helps me debug the 64-bit part; this is one of the reasons why I first decided to do it this way, the other one being that it seems to lead to better code on my x86_64 host. When the 64-bit emulation is fully usable, I could imagine coming back to strict 32 bits for the ppc-xxx targets, as those targets would become not so useful...

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] qemu hw/ppc_oldworld.c target-ppc/cpu.h target-...
On Fri, 2007-11-23 at 21:36 +, Paul Brook wrote:
> > > > Then I took a closer look at the code, to ensure I was not wrong. PowerPC 32 on 64-bit hosts is implemented the same way the specification says a PowerPC in 32-bit mode should be. So the higher bits are not garbage. They are what the PowerPC specification says they should be (unless there are some bugs in the implementation). Whether or not they are used by computations is another point. The fact is that the register values are correct.
> > >
> > > AFAICS the high bits are never used by anything.
> >
> > They are used the way the specification says they should be, i.e. when running in 32-bit mode, all computations are done in 64 bits.
>
> I think what you mean is that they work the way that ppc64 is defined, to remain compatible with ppc32. IMHO this is entirely irrelevant as we're emulating a ppc32. You could replace the high bits with garbage and nothing would ever be able to tell the difference.

PowerPC is a 64-bit architecture. PowerPC 32 on a 32-bit host is optimized not to compute the 32 highest bits, the same way it's allowed to cut down the GPRs when implementing a CPU that would not support the 64-bit mode (but this is a tolerance, this is not how the architecture is defined). PowerPC 32 on a 64-bit host is implemented as the PowerPC 32-bit mode. This is a choice that may be discussed, but this is the way it's done. So the 32 highest bits are to be computed properly, even if they do not actually participate in any result seen from the 32-bit application. Printing the 64-bit GPRs is therefore relevant. Running the PowerPC 32 emulation on a 64-bit host is strictly equivalent to running the PowerPC 64 emulation in 32-bit mode, as the architecture specifies it should be. One could then argue that the PowerPC 32 targets are not relevant when running on a 64-bit host, which is true.

> > And the fact is that printing a uint64_t on any 64-bit host (x86_64 or any other) using PRIx64 is exactly what is to be done, according to ISO C. So pretending that it would crash on any host is completely pointless.
>
> We weren't printing a 64-bit value. We were passing a 32-bit target_ulong with a PRIx64 format. Some concrete examples:
>
> translate.c:6052:
>     cpu_fprintf(f, "MSR " REGX FILL " HID0 " REGX FILL " HF " REGX FILL
>                 env->msr, env->hflags, env->spr[SPR_HID0],
>
> All these values are 32-bit target_ulong. Using a 64-bit format specifier for ppc32 targets is just nonsense.

OK. Those are real bugs to be fixed. I'll take a look. But I'll try not to break the GPR dump. In fact, GPRs should always be dumped as 64 bits, even when running on 32-bit hosts. This would be more consistent with the specification.

> And at line 6069 we even have an explicit cast to a 32-bit type:
>
>     cpu_fprintf(f, REGX, (target_ulong)env->gpr[i]);

OK, this is wrong. I'll remove this buggy cast.

> > > I see the SPE stuff that uses T0_64 et al, however this still stores the value in the low 32 bits of the {gpr,gprth} pair.
> >
> > The SPE dump is the case that does not work properly. Your patch does not solve anything here, it just breaks the mainstream case.
>
> I agree that SPE register dumping does not work, however I also assert that it was never even close to working, and if REGX is supposed to be the solution then most of the uses of REGX are incorrect. Please give a concrete example of something that worked before and does not now.

The fact that you cannot dump the full GPR is a bug. When GPRs are stored as 64 bits, they are to be dumped as 64-bit values. If you see bugs in my code, please tell me and I'll try to fix them (and I'll thank you for this). But please do not do weird things that are more buggy than the original code!

But once again, the biggest problem is that you break my code without any consultation and without even trying to understand why the code has been written this way. So, tell me you think there's a bug, or propose a patch. If I think the patch is OK, I'll tell you. If not, I'll try to address the bug the way I think it has to be done.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
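The format-specifier point raised in this message can be illustrated outside QEMU. The sketch below is an assumption-laden example with hypothetical helper names, not the target-ppc code: it only shows that the `<inttypes.h>` macro has to match the width of the argument actually passed, because a 32-bit argument consumed through a 64-bit conversion is undefined behaviour in a variadic call.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 32-bit register value: PRIx32 matches a uint32_t argument. */
static int format_reg32(char *buf, size_t len, uint32_t v)
{
    return snprintf(buf, len, "%08" PRIx32, v);
}

/* Format a 64-bit register value: PRIx64 matches a uint64_t argument. */
static int format_reg64(char *buf, size_t len, uint64_t v)
{
    return snprintf(buf, len, "%016" PRIx64, v);
}

/*
 * Printing a 32-bit value in a 64-bit wide column is fine as long as
 * the value is widened explicitly before being passed.
 */
static int format_reg32_wide(char *buf, size_t len, uint32_t v)
{
    return snprintf(buf, len, "%016" PRIx64, (uint64_t)v);
}
```

A register type whose width depends on the build (as `target_ulong` does in the discussion) would need either a matching per-build format macro or an explicit widening cast like the one in the last helper.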
Re: [Qemu-devel] qemu hw/ppc_oldworld.c target-ppc/cpu.h target-...
On Fri, 2007-11-23 at 22:23 +, Paul Brook wrote:
> > > I think what you mean is that they work the way that ppc64 is defined, to remain compatible with ppc32. IMHO this is entirely irrelevant as we're emulating a ppc32. You could replace the high bits with garbage and nothing would ever be able to tell the difference.
> >
> > PowerPC is a 64-bit architecture. PowerPC 32 on a 32-bit host is optimized not to compute the 32 highest bits, the same way it's allowed to cut down the GPRs when implementing a CPU that would not support the 64-bit mode (but this is a tolerance, this is not how the architecture is defined).
>
> No. PowerPC is defined as a 64-bit architecture. However there is a subset of this architecture (aka ppc32) that is a complete 32-bit architecture in its own right.

It used to be defined this way... years ago. The latest specifications say that there is one 64-bit architecture with 2 computational modes. They also say that an embedded implementation can avoid implementing some parts or the whole of the 64-bit computation mode. To complicate the situation, it's also required that standard implementations do all computations on 64-bit values, but that embedded implementations that implement the SPE extension never modify the highest 32 bits of the GPRs if they do not implement the 64-bit computation mode (this restriction does not exist if the implementation implements the 2 computation modes), which explains why I added the gprh registers: to be able to handle all cases without ifdefs.

> By your own admission, we can get away with not calculating the high 32 bits of the register. It follows that the high bits are completely meaningless.

Not completely. There are even some ways to do 64-bit computations when running in 32-bit mode... Some may see this as an architecture hack, but this gives the only way to switch from 32-bit to 64-bit mode (as the sixty-four MSR bit lies in the highest bits of the register).

> The qemu ppc32 emulation is implemented in such a way that on 64-bit hosts it looks a lot like a ppc64 implementation. However this need not, and should not be exposed to the user.
>
> > OK. Those are real bugs to be fixed. I'll take a look. But I'll try not to break the GPR dump. In fact, GPRs should always be dumped as 64 bits, even when running on 32-bit hosts. This would be more consistent with the specification.
>
> I disagree. qemu is implementing ppc32.

which does not exist.

> Showing more than 32 bits of register is completely bogus.

No. It's showing the full CPU state, which can be more than what the application (or the OS, when running virtualized on a real CPU) could see. The OS cannot see the whole CPU state, but Qemu must implement more than the OS can see and is then able to dump it. The 64-bit GPR is just a specific case of a general behavior.

> Any differences between a 32-bit host and a 64-bit host are a qemu bug. If you display 64 bits, then those 64 bits had better be the same when run on 32-bit hosts.

Why? The idea is that it costs too much to keep the whole state when running on a 32-bit host, so we act as a restricted embedded implementation. When the host CPU allows it without any extra cost, we act as the specification defines we should. This is a choice. Once again, this choice can be discussed and may be changed if I get convinced it would be better not to act this way. But this behavior is surely not a bug: it exactly follows (or, one may say, should exactly follow if well implemented) the PowerPC specification.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
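Both positions in this exchange rest on one arithmetic fact, which a small sketch can make concrete. This is a plain C illustration with hypothetical function names, not PowerPC or QEMU code, and it says nothing about what the PowerPC specification mandates: an addition carried out on 64-bit registers gives the same low 32 bits as the 32-bit addition, while the high half can hold something a 32-bit register file never stores.

```c
#include <stdint.h>

/* The add as a register file limited to 32 bits would compute it. */
static uint32_t add32(uint32_t a, uint32_t b)
{
    return a + b;            /* wraps modulo 2^32 */
}

/* The same add carried out on full 64-bit registers. */
static uint64_t add64(uint64_t a, uint64_t b)
{
    return a + b;            /* the carry lands in bit 32 */
}

/* Low half of a 64-bit register: what a 32-bit application observes. */
static uint32_t low_half(uint64_t r)
{
    return (uint32_t)r;
}

/* High half: only visible in a full 64-bit register dump. */
static uint32_t high_half(uint64_t r)
{
    return (uint32_t)(r >> 32);
}
```

With a = 0xFFFFFFFF and b = 1, `add32` gives 0, and `add64` gives a value whose low half is also 0 but whose high half is 1. That is why a 64-bit dump can differ between hosts even though the guest-visible result is identical.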
[Qemu-devel] [RFC] thunk.c / thunk.h bugfix
There's a problem in thunk.h as 2 possibly recursive functions are declared as inline. There could be 2 solutions for this.

We can never inline those functions and move them to thunk.c (see the attached patch). The drawback is that all cases that do not recurse would be less efficient.

The other solution would be to call another intermediary function that would not be inlined to handle the recursive case. The second solution may be better as it would keep the fast cases inlined.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized

Index: thunk.c
===
RCS file: /sources/qemu/qemu/thunk.c,v
retrieving revision 1.10
diff -u -d -d -p -r1.10 thunk.c
--- thunk.c 11 Nov 2007 19:31:34 - 1.10
+++ thunk.c 18 Nov 2007 15:50:56 -
@@ -31,7 +31,7 @@
 /* XXX: make it dynamic */
 StructEntry struct_entries[MAX_STRUCTS];
 
-static inline const argtype *thunk_type_next(const argtype *type_ptr)
+static const argtype *thunk_type_next(const argtype *type_ptr)
 {
     int type;
 
@@ -267,3 +267,78 @@ unsigned int host_to_target_bitmask(unsi
     }
     return(x86_mask);
 }
+
+#ifndef NO_THUNK_TYPE_SIZE
+int thunk_type_size(const argtype *type_ptr, int is_host)
+{
+    int type, size;
+    const StructEntry *se;
+
+    type = *type_ptr;
+    switch(type) {
+    case TYPE_CHAR:
+        return 1;
+    case TYPE_SHORT:
+        return 2;
+    case TYPE_INT:
+        return 4;
+    case TYPE_LONGLONG:
+    case TYPE_ULONGLONG:
+        return 8;
+    case TYPE_LONG:
+    case TYPE_ULONG:
+    case TYPE_PTRVOID:
+    case TYPE_PTR:
+        if (is_host) {
+            return HOST_LONG_SIZE;
+        } else {
+            return TARGET_ABI_BITS / 8;
+        }
+        break;
+    case TYPE_ARRAY:
+        size = type_ptr[1];
+        return size * thunk_type_size(type_ptr + 2, is_host);
+    case TYPE_STRUCT:
+        se = struct_entries + type_ptr[1];
+        return se->size[is_host];
+    default:
+        return -1;
+    }
+}
+
+int thunk_type_align(const argtype *type_ptr, int is_host)
+{
+    int type;
+    const StructEntry *se;
+
+    type = *type_ptr;
+    switch(type) {
+    case TYPE_CHAR:
+        return 1;
+    case TYPE_SHORT:
+        return 2;
+    case TYPE_INT:
+        return 4;
+    case TYPE_LONGLONG:
+    case TYPE_ULONGLONG:
+        return 8;
+    case TYPE_LONG:
+    case TYPE_ULONG:
+    case TYPE_PTRVOID:
+    case TYPE_PTR:
+        if (is_host) {
+            return HOST_LONG_SIZE;
+        } else {
+            return TARGET_ABI_BITS / 8;
+        }
+        break;
+    case TYPE_ARRAY:
+        return thunk_type_align(type_ptr + 2, is_host);
+    case TYPE_STRUCT:
+        se = struct_entries + type_ptr[1];
+        return se->align[is_host];
+    default:
+        return -1;
+    }
+}
+#endif /* ndef NO_THUNK_TYPE_SIZE */
Index: thunk.h
===
RCS file: /sources/qemu/qemu/thunk.h,v
retrieving revision 1.15
diff -u -d -d -p -r1.15 thunk.h
--- thunk.h 14 Oct 2007 16:27:28 - 1.15
+++ thunk.h 18 Nov 2007 15:50:56 -
@@ -75,78 +75,8 @@ const argtype *thunk_convert(void *dst,
 
 extern StructEntry struct_entries[];
 
-static inline int thunk_type_size(const argtype *type_ptr, int is_host)
-{
-    int type, size;
-    const StructEntry *se;
-
-    type = *type_ptr;
-    switch(type) {
-    case TYPE_CHAR:
-        return 1;
-    case TYPE_SHORT:
-        return 2;
-    case TYPE_INT:
-        return 4;
-    case TYPE_LONGLONG:
-    case TYPE_ULONGLONG:
-        return 8;
-    case TYPE_LONG:
-    case TYPE_ULONG:
-    case TYPE_PTRVOID:
-    case TYPE_PTR:
-        if (is_host) {
-            return HOST_LONG_SIZE;
-        } else {
-            return TARGET_ABI_BITS / 8;
-        }
-        break;
-    case TYPE_ARRAY:
-        size = type_ptr[1];
-        return size * thunk_type_size(type_ptr + 2, is_host);
-    case TYPE_STRUCT:
-        se = struct_entries + type_ptr[1];
-        return se->size[is_host];
-    default:
-        return -1;
-    }
-}
-
-static inline int thunk_type_align(const argtype *type_ptr, int is_host)
-{
-    int type;
-    const StructEntry *se;
-
-    type = *type_ptr;
-    switch(type) {
-    case TYPE_CHAR:
-        return 1;
-    case TYPE_SHORT:
-        return 2;
-    case TYPE_INT:
-        return 4;
-    case TYPE_LONGLONG:
-    case TYPE_ULONGLONG:
-        return 8;
-    case TYPE_LONG:
-    case TYPE_ULONG:
-    case TYPE_PTRVOID:
-    case TYPE_PTR:
-        if (is_host) {
-            return HOST_LONG_SIZE;
-        } else {
-            return TARGET_ABI_BITS / 8;
-        }
-        break;
-    case TYPE_ARRAY:
-        return thunk_type_align(type_ptr + 2, is_host);
-    case TYPE_STRUCT:
-        se = struct_entries + type_ptr[1];
-        return se->align[is_host];
-    default:
-        return -1;
-    }
-}
+int thunk_type_size(const argtype *type_ptr, int is_host);
+int thunk_type_align(const argtype *type_ptr, int is_host
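The second solution mentioned in the message (keep the fast cases inline and route recursion through a function that is not inlined) is not what the attached patch does, so here is a minimal sketch of how it might look. The type codes, the `argtype_sketch` typedef and both function names are hypothetical stand-ins, not the thunk.h API.

```c
/* Hypothetical type codes standing in for QEMU's TYPE_* values. */
enum { T_CHAR = 1, T_INT, T_LONGLONG, T_ARRAY };
typedef int argtype_sketch;

/* Out-of-line helper: the only function that takes part in recursion. */
int type_size_slow(const argtype_sketch *t);

/*
 * Inline fast path: scalar types return immediately, so the compiler
 * never has to inline a function into itself. Anything else is handed
 * to the out-of-line helper.
 */
static inline int type_size(const argtype_sketch *t)
{
    switch (*t) {
    case T_CHAR:
        return 1;
    case T_INT:
        return 4;
    case T_LONGLONG:
        return 8;
    default:
        return type_size_slow(t);
    }
}

int type_size_slow(const argtype_sketch *t)
{
    switch (*t) {
    case T_ARRAY:
        /* t[1] is the element count, t + 2 the element type. */
        return t[1] * type_size(t + 2);
    default:
        return -1;   /* unknown type code */
    }
}
```

The design trade-off is the one stated in the message: scalar lookups stay as cheap as before, and only arrays (and, in the real code, anything else that recurses) pay for a function call.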
[Qemu-devel] [RFC] Fix for random Qemu crashes
Here's an updated patch to fix the inlining problems that make some Qemu targets crash randomly. As we have at least one broken target in the CVS because of this bug (and maybe more), we have an urgent need of a fix. I'll then commit this patch today if there is no other fix proposed that actually solves the problem.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized

Index: exec-all.h
===
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.70
diff -u -d -d -p -r1.70 exec-all.h
--- exec-all.h 4 Nov 2007 02:24:57 - 1.70
+++ exec-all.h 18 Nov 2007 15:44:16 -
@@ -21,36 +21,6 @@
 /* allow to see translation results - the slowdown should be negligible, so we leave it */
 #define DEBUG_DISAS
 
-#ifndef glue
-#define xglue(x, y) x ## y
-#define glue(x, y) xglue(x, y)
-#define stringify(s) tostring(s)
-#define tostring(s) #s
-#endif
-
-#ifndef likely
-#if __GNUC__ < 3
-#define __builtin_expect(x, n) (x)
-#endif
-
-#define likely(x) __builtin_expect(!!(x), 1)
-#define unlikely(x) __builtin_expect(!!(x), 0)
-#endif
-
-#ifndef always_inline
-#if (__GNUC__ < 3) || defined(__APPLE__)
-#define always_inline inline
-#else
-#define always_inline __attribute__ (( always_inline )) inline
-#endif
-#endif
-
-#ifdef __i386__
-#define REGPARM(n) __attribute((regparm(n)))
-#else
-#define REGPARM(n)
-#endif
-
 /* is_jmp field values */
 #define DISAS_NEXT 0 /* next instruction can be analyzed */
 #define DISAS_JUMP 1 /* only pc was modified dynamically */
Index: osdep.h
===
RCS file: /sources/qemu/qemu/osdep.h,v
retrieving revision 1.10
diff -u -d -d -p -r1.10 osdep.h
--- osdep.h 7 Jun 2007 23:09:47 - 1.10
+++ osdep.h 18 Nov 2007 15:44:16 -
@@ -3,6 +3,44 @@
 
 #include <stdarg.h>
 
+#ifndef glue
+#define xglue(x, y) x ## y
+#define glue(x, y) xglue(x, y)
+#define stringify(s) tostring(s)
+#define tostring(s) #s
+#endif
+
+#ifndef likely
+#if __GNUC__ < 3
+#define __builtin_expect(x, n) (x)
+#endif
+
+#define likely(x) __builtin_expect(!!(x), 1)
+#define unlikely(x) __builtin_expect(!!(x), 0)
+#endif
+
+#ifndef MIN
+#define MIN(a, b) (((a) < (b)) ? (a) : (b))
+#endif
+#ifndef MAX
+#define MAX(a, b) (((a) > (b)) ? (a) : (b))
+#endif
+
+#ifndef always_inline
+#if (__GNUC__ < 3) || defined(__APPLE__)
+#define always_inline inline
+#else
+#define always_inline __attribute__ (( always_inline )) __inline__
+#endif
+#endif
+#define inline always_inline
+
+#ifdef __i386__
+#define REGPARM(n) __attribute((regparm(n)))
+#else
+#define REGPARM(n)
+#endif
+
 #define qemu_printf printf
 
 void *qemu_malloc(size_t size);
Index: qemu-common.h
===
RCS file: /sources/qemu/qemu/qemu-common.h,v
retrieving revision 1.2
diff -u -d -d -p -r1.2 qemu-common.h
--- qemu-common.h 17 Nov 2007 17:14:38 - 1.2
+++ qemu-common.h 18 Nov 2007 15:44:16 -
@@ -62,37 +62,6 @@ static inline char *realpath(const char
 
 #endif /* !defined(NEED_CPU_H) */
 
-#ifndef glue
-#define xglue(x, y) x ## y
-#define glue(x, y) xglue(x, y)
-#define stringify(s) tostring(s)
-#define tostring(s) #s
-#endif
-
-#ifndef likely
-#if __GNUC__ < 3
-#define __builtin_expect(x, n) (x)
-#endif
-
-#define likely(x) __builtin_expect(!!(x), 1)
-#define unlikely(x) __builtin_expect(!!(x), 0)
-#endif
-
-#ifndef MIN
-#define MIN(a, b) (((a) < (b)) ? (a) : (b))
-#endif
-#ifndef MAX
-#define MAX(a, b) (((a) > (b)) ? (a) : (b))
-#endif
-
-#ifndef always_inline
-#if (__GNUC__ < 3) || defined(__APPLE__)
-#define always_inline inline
-#else
-#define always_inline __attribute__ (( always_inline )) inline
-#endif
-#endif
-
 /* bottom halves */
 typedef struct QEMUBH QEMUBH;
 
Index: translate-op.c
===
RCS file: /sources/qemu/qemu/translate-op.c,v
retrieving revision 1.3
diff -u -d -d -p -r1.3 translate-op.c
--- translate-op.c 18 Nov 2007 01:44:36 - 1.3
+++ translate-op.c 18 Nov 2007 15:44:16 -
@@ -24,6 +24,7 @@
 #include <inttypes.h>
 
 #include "config.h"
+#include "osdep.h"
 
 enum {
 #define DEF(s, n, copy_size) INDEX_op_ ## s,
Index: darwin-user/qemu.h
===
RCS file: /sources/qemu/qemu/darwin-user/qemu.h,v
retrieving revision 1.1
diff -u -d -d -p -r1.1 qemu.h
--- darwin-user/qemu.h 18 Jan 2007 20:06:33 - 1.1
+++ darwin-user/qemu.h 18 Nov 2007 15:44:16 -
@@ -1,13 +1,13 @@
 #ifndef GEMU_H
 #define GEMU_H
 
-#include "thunk.h"
-
 #include <signal.h>
 #include <string.h>
 
 #include "cpu.h"
 
+#include "thunk.h"
+
 #include "gdbstub.h"
 
 typedef siginfo_t target_siginfo_t;
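For readers unfamiliar with the helper macros this patch moves into osdep.h, the following sketch shows what the two-level `glue`/`stringify` pairs buy and how `likely`/`unlikely` are typically used. `SUFFIX_SK`, `size_32`, `suffix_name`, `raw_name` and `safe_div` are hypothetical names for the example, and the non-GCC fallback for `likely`/`unlikely` is a simplification rather than the exact form used in the patch.

```c
#define xglue(x, y) x ## y
#define glue(x, y) xglue(x, y)
#define stringify(s) tostring(s)
#define tostring(s) #s

#if defined(__GNUC__) && __GNUC__ >= 3
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (!!(x))      /* simplified fallback for this sketch */
#define unlikely(x) (!!(x))
#endif

#define SUFFIX_SK 32             /* hypothetical width parameter */

/* glue() expands SUFFIX_SK first, so this defines size_32(). */
static int glue(size_, SUFFIX_SK)(void)
{
    return SUFFIX_SK / 8;
}

/* Two-level stringify: the argument is expanded, giving "32". */
static const char *suffix_name(void)
{
    return stringify(SUFFIX_SK);
}

/* One-level tostring: no expansion, giving "SUFFIX_SK". */
static const char *raw_name(void)
{
    return tostring(SUFFIX_SK);
}

/* unlikely() marks the error path as the cold branch. */
static int safe_div(int a, int b)
{
    if (unlikely(b == 0))
        return 0;
    return a / b;
}
```

The indirection through `xglue` and `tostring` is what allows macro arguments to be expanded before pasting or stringizing; calling `xglue(size_, SUFFIX_SK)` directly would instead produce the identifier `size_SUFFIX_SK`.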
Re: [Qemu-devel] Re: RFC: fix for random Qemu crashes
On Fri, 2007-11-16 at 18:58 -0800, Ben Pfaff wrote:
> J. Mayer [EMAIL PROTECTED] writes:
> > On Fri, 2007-11-16 at 21:32 +0100, andrzej zaborowski wrote:
> > > I think a line like
> > > #define inline __attribute__ (( always_inline )) inline
> > > in dyngen-exec.h should be
> >
> > As I already pointed out in the first message of the thread, this kind of define would expand recursively, [...]
>
> No. A macro is not expanded within its own expansion. See ISO C99:

I just took a look at what happens in *real life* while compiling the Linux kernel, which uses such a definition. As I reported, I had compilation problems due to this behavior; I inspected the preprocessor output and saw this result. I have to admit I did not check whether it happens only with some versions of gcc or whether this behavior has been changed in newer releases.

> 6.10.3.4 Rescanning and further replacement
> [...]
> 2 If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file's preprocessing tokens), it is not replaced.
>
> If it still bothers you, you could write it as
> #define inline __attribute__ (( always_inline )) __inline__
> since GCC accepts __inline__ as a synonym for inline.

You're right, this would be a good solution to avoid many changes in the code.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
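The rescanning rule quoted above can be checked with a few lines of C. This is a minimal sketch with hypothetical names (`base`, `read_base`, `twice`); it demonstrates the standard behaviour only and does not attempt to reproduce the kernel-header problem reported in the message.

```c
/*
 * Self-referential macro: during rescanning the inner "base" is left
 * alone (C99 6.10.3.4p2), so this expands exactly once.
 */
static int base = 10;
#define base (base + 1)

static int read_base(void)
{
    return base;                 /* preprocesses to (base + 1) */
}

/*
 * The spelling suggested in the thread: using __inline__ sidesteps the
 * self-reference question entirely. Guarded because the attribute and
 * the __inline__ keyword are GCC extensions (also accepted by Clang).
 */
#if defined(__GNUC__)
#define inline __attribute__((always_inline)) __inline__
#endif

static inline int twice(int x)
{
    return 2 * x;
}
```

Because the variable keeps its value of 10 and the macro adds one on every use, `read_base()` returns 11 rather than recursing without end.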
Re: [Qemu-devel] qemu softmmu_template.h
On Sat, 2007-11-17 at 09:53 +, Andrzej Zaborowski wrote:
> CVSROOT: /sources/qemu
> Module name: qemu
> Changes by: Andrzej Zaborowski balrog 07/11/17 09:53:42
>
> Modified files:
> . : softmmu_template.h
>
> Log message:
> Check permissions for the last byte first in unaligned slow_st accesses (patch from TeLeMan).
>
> CVSWeb URLs:
> http://cvs.savannah.gnu.org/viewcvs/qemu/softmmu_template.h?cvsroot=qemu&r1=1.19&r2=1.20

Has it been checked that it's legal for all architectures and cannot have any nasty side effect to do accesses in the reverse order? Real hardware does not ever seem to do this...

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] qemu softmmu_template.h
On Sat, 2007-11-17 at 12:57 +0100, andrzej zaborowski wrote:
> On 17/11/2007, J. Mayer [EMAIL PROTECTED] wrote:
> > On Sat, 2007-11-17 at 11:44 +0100, andrzej zaborowski wrote:
> > > On 17/11/2007, J. Mayer [EMAIL PROTECTED] wrote:
> > > > On Sat, 2007-11-17 at 11:14 +0100, andrzej zaborowski wrote:
> > > > > On 17/11/2007, J. Mayer [EMAIL PROTECTED] wrote:
> > > > > > On Sat, 2007-11-17 at 09:53 +, Andrzej Zaborowski wrote:
> > > > > > > CVSROOT: /sources/qemu
> > > > > > > Module name: qemu
> > > > > > > Changes by: Andrzej Zaborowski balrog 07/11/17 09:53:42
> > > > > > >
> > > > > > > Modified files:
> > > > > > > . : softmmu_template.h
> > > > > > >
> > > > > > > Log message:
> > > > > > > Check permissions for the last byte first in unaligned slow_st accesses (patch from TeLeMan).
> > > > > > >
> > > > > > > CVSWeb URLs:
> > > > > > > http://cvs.savannah.gnu.org/viewcvs/qemu/softmmu_template.h?cvsroot=qemu&r1=1.19&r2=1.20
> > > > > >
> > > > > > Has it been checked that it's legal for all architectures and cannot have any nasty side effect to do accesses in the reverse order? Real hardware does not ever seem to do this...
> > > > >
> > > > > For real hardware the store is a single operation.
> > > >
> > > > For PowerPC, at least, only aligned stores are defined as atomic. It's absolutely legal for an implementation to split all non-atomic accesses into smaller aligned accesses. And I guess it is the same for all architectures that can do unaligned accesses.
> > > >
> > > > > Logically it shouldn't have any side effects, but if it does then it would rather mean that other code for that architecture is (also) broken, I believe. I've only tested ARM, mips, x86 and x86_64 before committing, so please test. I figured that the patch won't get any comments on the mailing list if it isn't merged.
> > > >
> > > > I don't think it's so easy to test, because it may be very hard to trigger the cases that would have side effects, which are target dependent. I am then very curious to know how you checked that there is no problem with this patch.
> > >
> > > Well, for ARM, x86 and x86_64 I only checked that unaligned accesses still work, i.e. that I haven't made an obvious typo. I haven't tested cross-page accesses with the access to the second page being invalid, I also don't know how the specifications for other architectures define the effect of such accesses, so maybe I shouldn't have committed this, but I assumed a common sense in the design of cpu archs, meaning that in the example given by TeLeMan the addition is not performed two times on some bytes.
> >
> > One case that obviously can have nasty side effects is unaligned IO accesses. Doing accesses from the first byte to the last is very different from doing the accesses from the last to the first.
>
> Hmm, right, I had not thought about IO accesses. I will watch for reports of any breakage that may have any connection with this and revert if there's any such report.
>
> > What also can be very different is what happens when the instruction has to be restarted because of a page fault. I checked the PowerPC specification, and it appears that it allows split memory accesses to be done in any order. It also specifies that loads and stores are restartable even if they have been partially executed (i.e. some registers or memory locations have already been changed), so this patch is not likely to break this target (but I did not check all specific implementations to see if some have specific requirements). This is to be checked for all other targets before such a patch can be applied, imho.
>
> Yes, although in practice that means the workaround (not a proper bugfix) would never be in qemu CVS and would be maintained in other trees endlessly.

Hopefully not! It just means one has to check the target specifications. If the specifications say it's valid to do accesses in random order, then it's up to the emulation code to take care of that case and make it work properly, and the patch would not be to blame if it triggers some bugs. In the meantime, I checked the Alpha spec which seems, if I understood well, to allow such a behavior.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
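The effect of writing the last byte first can be shown with a toy memory model. This is a sketch under stated assumptions, not softmmu_template.h: the 16-byte page size, the `mem_sketch` struct and both functions are hypothetical, and the store is little-endian. Each byte store faults if its page is not writable, so the order of the loop decides what has already been written when the fault is raised.

```c
#include <stdint.h>

#define PAGE_SIZE_SK 16u   /* tiny hypothetical page size */
#define NUM_PAGES_SK 2u

struct mem_sketch {
    uint8_t bytes[PAGE_SIZE_SK * NUM_PAGES_SK];
    int writable[NUM_PAGES_SK];      /* per-page write permission */
};

/* Store one byte; returns -1 (a "fault") if the page is read-only. */
static int store_byte(struct mem_sketch *m, uint32_t addr, uint8_t v)
{
    if (!m->writable[addr / PAGE_SIZE_SK])
        return -1;
    m->bytes[addr] = v;
    return 0;
}

/*
 * Unaligned little-endian 32-bit store, last byte first. If the store
 * crosses into a read-only second page, the fault is raised before any
 * byte of the first page has been modified. The caller must keep
 * addr + 3 inside the modelled memory.
 */
static int store32_last_first(struct mem_sketch *m, uint32_t addr,
                              uint32_t val)
{
    for (int i = 3; i >= 0; i--) {
        if (store_byte(m, addr + (uint32_t)i, (uint8_t)(val >> (8 * i))) < 0)
            return -1;
    }
    return 0;
}
```

The sketch covers plain RAM only. As noted in the thread, for memory-mapped I/O the device observes the order of the byte accesses, so reversing it is not transparent there.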
Re: [Qemu-devel] qemu target-ppc/cpu.h target-ppc/op.c target-pp...
On Fri, 2007-11-16 at 14:11 +, Jocelyn Mayer wrote:
> CVSROOT: /sources/qemu
> Module name: qemu
> Changes by: Jocelyn Mayer j_mayer 07/11/16 14:11:29
>
> Modified files:
> target-ppc : cpu.h op.c op_helper.c op_helper.h translate.c
> . : translate-all.c
>
> Log message:
> Always make PowerPC hypervisor mode memory accesses and instructions available for full system emulation, then removing all #if TARGET_PPC64H from micro-ops and code translator. Add new macros to dramatically simplify memory access tables definitions in target-ppc/translate.c.

Remark: one should take care that having the hypervisor memory accessors available might trigger the gcc inlining limits bug. It therefore seems to me that a fix for this bug is needed asap, as reported in my previous messages (titled RFC: fix for random Qemu crashes).

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] RFC: fix for random Qemu crashes
On Fri, 2007-11-16 at 21:32 +0100, andrzej zaborowski wrote:
> On 16/11/2007, Jocelyn Mayer [EMAIL PROTECTED] wrote:
> > On Fri, 2007-11-16 at 15:52 +, Paul Brook wrote:
> > > > Then, I chose to replace 'inline' by 'always_inline', which is more invasive but has less risk of side effects. The diff is attached in always_inline.diff. The last thing that helps solve the problem is to change the inlining limits of gcc, at least to compile the op.o file.
> > >
> > > Presumably we only need one of the last two patches? It seems rather pointless to have always_inline *and* change the inlining heuristics.
> >
> > From the tests I made, it seems that adding always_inline helps but unfortunately does not solve all cases. I should check in the gcc source code why it is so...
> >
> > > I'm ok with using always_inline for op.o (and things it uses directly) as this is required for correctness. I'm not convinced that using always_inline everywhere is such a good idea.
> >
> > That's exactly what I did: I changed 'inline' to 'always_inline' in headers that are included by op.c; I did not make any change in other headers.
>
> I think a line like
> #define inline __attribute__ (( always_inline )) inline
> in dyngen-exec.h should be

As I already pointed out in the first message of the thread, this kind of define would expand recursively, which is particularly ugly, and which can in some cases lead to compiler warnings or errors. I already had this kind of problem using the Linux kernel headers, which use precisely this definition.

But, once again, adding always_inline to functions does not completely solve the problem (please read the thread!) or at least does not solve it with all gcc versions. The inline growth limits tweaking seems needed too.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
[Qemu-devel] RFC: fix for random Qemu crashes
Some may have experienced Qemu builds crashing, apparently at random places, but in a reproducible way. I found one reason for these crashes: it appears that with the growth of the op.c file, there may be cases where we reach the inlining limits of gcc. In such a case, gcc does not inline some functions declared inline but emits a call and provides a separate function. Unfortunately, this is not acceptable in the op.o context, as it will slow down the emulation and because the call is likely to break the specific compilation rules (i.e. reserved registers) used while compiling op.o.

I found some workarounds to avoid this behavior and I'd like to get opinions about them.

The first idea is to replace all occurrences of 'inline' with 'always_inline' in all headers included in op.c. It then appeared to me that always_inline is not globally declared and that the definition is duplicated in vl.h and exec-all.h. But it's not declared in darwin-user/qemu.h and linux-user/qemu.h, which is not consistent with the declaration in vl.h. Further investigation showed me that the osdep.h header is the one that is actually included everywhere. Even if those are more compiler than OS dependent, I decided to move the definitions of glue, tostring, likely, unlikely, always_inline and REGPARM to this header so they can be used globally. I also changed the include order in darwin-user/qemu.h to be sure this header is included first. This patch is attached in common_defs.diff.

Given this patch, I've been able to replace all occurrences of 'inline' with 'always_inline' in all headers included from op.c (according to the generated .d file). Some would say I'd better add a #define inline always_inline somewhere.
I personally dislike this solution, as this kind of macro tends to expand recursively (the always_inline definition contains the word inline) and this may lead to compilation warnings or errors in some contexts; one could do tests using the linux kernel headers to get convinced that it can happen. So I chose to replace 'inline' with 'always_inline', which is more invasive but has less risk of side effects. The diff is attached in always_inline.diff.

The last thing that helps solve the problem is to change the inlining limits of gcc, at least when compiling the op.o file. Unfortunately, there is no way to disable those limits (I checked in the source code), so I set them to an arbitrarily high level. I also added the -funit-at-a-time switch, as this kind of optimisation would not be relevant in op.o context. The diff is attached in gcc_inline_limits.diff.

Please comment.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized

Index: exec-all.h
===================================================================
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.70
diff -u -d -d -p -r1.70 exec-all.h
--- exec-all.h	4 Nov 2007 02:24:57 -0000	1.70
+++ exec-all.h	15 Nov 2007 22:58:45 -0000
@@ -21,36 +21,6 @@
 /* allow to see translation results - the slowdown should be negligible, so we leave it */
 #define DEBUG_DISAS
 
-#ifndef glue
-#define xglue(x, y) x ## y
-#define glue(x, y) xglue(x, y)
-#define stringify(s) tostring(s)
-#define tostring(s) #s
-#endif
-
-#ifndef likely
-#if __GNUC__ < 3
-#define __builtin_expect(x, n) (x)
-#endif
-
-#define likely(x) __builtin_expect(!!(x), 1)
-#define unlikely(x) __builtin_expect(!!(x), 0)
-#endif
-
-#ifndef always_inline
-#if (__GNUC__ < 3) || defined(__APPLE__)
-#define always_inline inline
-#else
-#define always_inline __attribute__ (( always_inline )) inline
-#endif
-#endif
-
-#ifdef __i386__
-#define REGPARM(n) __attribute((regparm(n)))
-#else
-#define REGPARM(n)
-#endif
-
 /* is_jmp field values */
 #define DISAS_NEXT    0 /* next instruction can be analyzed */
 #define DISAS_JUMP    1 /* only pc was modified dynamically */
Index: osdep.h
===================================================================
RCS file: /sources/qemu/qemu/osdep.h,v
retrieving revision 1.10
diff -u -d -d -p -r1.10 osdep.h
--- osdep.h	7 Jun 2007 23:09:47 -0000	1.10
+++ osdep.h	15 Nov 2007 22:58:45 -0000
@@ -3,6 +3,43 @@
 
 #include <stdarg.h>
 
+#ifndef glue
+#define xglue(x, y) x ## y
+#define glue(x, y) xglue(x, y)
+#define stringify(s) tostring(s)
+#define tostring(s) #s
+#endif
+
+#ifndef likely
+#if __GNUC__ < 3
+#define __builtin_expect(x, n) (x)
+#endif
+
+#define likely(x) __builtin_expect(!!(x), 1)
+#define unlikely(x) __builtin_expect(!!(x), 0)
+#endif
+
+#ifndef MIN
+#define MIN(a, b) (((a) < (b)) ? (a) : (b))
+#endif
+#ifndef MAX
+#define MAX(a, b) (((a) > (b)) ? (a) : (b))
+#endif
+
+#ifndef always_inline
+#if (__GNUC__ < 3) || defined(__APPLE__)
+#define always_inline inline
+#else
+#define always_inline __attribute__ (( always_inline )) inline
+#endif
+#endif
+
+#ifdef __i386__
+#define REGPARM(n) __attribute((regparm(n)))
+#else
+#define REGPARM(n)
+#endif
+
 #define qemu_printf printf
 
 void *qemu_malloc(size_t size);
Index: vl.h
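
As an illustration of the always_inline approach discussed in this thread, here is a minimal sketch in C. The macro follows the same logic as the definition the patch moves into osdep.h, with the condition written the other way round. The helper names demo_rotl32 and demo_op_rotl8 are hypothetical and are not taken from op.c; they only show how a small helper marked always_inline would be used from a micro-op style function.

```c
#include <stdint.h>

/* Same decision as the osdep.h definition: fall back to plain 'inline'
 * for gcc older than 3 and on Apple toolchains, force inlining otherwise. */
#ifndef always_inline
#if defined(__GNUC__) && (__GNUC__ >= 3) && !defined(__APPLE__)
#define always_inline __attribute__((always_inline)) inline
#else
#define always_inline inline
#endif
#endif

/* Hypothetical helper: rotate a 32-bit value left by n bits.
 * With always_inline, gcc is asked not to emit a separate function
 * and call for it, whatever its inlining limits are. */
static always_inline uint32_t demo_rotl32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Hypothetical micro-op style caller: the helper body is expected to be
 * expanded here rather than reached through a call instruction. */
uint32_t demo_op_rotl8(uint32_t v)
{
    return demo_rotl32(v, 8);
}
```

Whether the attribute alone is sufficient is exactly what the thread questions: the message above reports that the gcc inlining limits had to be raised as well.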
Re: [Qemu-devel] [PATCH] Fix NaN handling in softfloat
On Sat, 2007-11-10 at 10:35 +0100, Aurelien Jarno wrote:
> J. Mayer wrote:
>> On Thu, 2007-11-08 at 00:05 +0100, Aurelien Jarno wrote:
>>> On Tue, Nov 06, 2007 at 09:01:13PM +0100, J. Mayer wrote:
>>>> On Sat, 2007-11-03 at 22:28 +0100, Aurelien Jarno wrote:
>>>>> On Sat, Nov 03, 2007 at 02:06:04PM -0400, Daniel Jacobowitz wrote:
>>>>>> On Sat, Nov 03, 2007 at 06:35:48PM +0100, Aurelien Jarno wrote:
>>>>>>> Hi all, The current softfloat implementation changes qNaN into sNaN when converting between formats, for no reason. The attached patch fixes that. It also fixes an off-by-one in the extended double precision format (aka floatx80), the mantissa is 64-bit long and not 63-bit long. With this patch applied all the glibc 2.7 floating point tests are successful on MIPS and MIPSEL.
> [...]
>>> Anyway there is no way to do that in the target specific code *after the conversion*, as the detection of a mantissa being null when converting from double to single precision can only be done when both values are still known. In other words when the value is not fixed during the conversion, the value 0x7f80 can either be infinity or a conversion of NaN from double to single precision, and thus it is not possible to fix the value afterwards in the target specific code.
>> I don't say you have to return an infinity when the argument is a qNaN. I just say you have to return a qNaN in a generic way. Just return sign | 0x7f80 | mantissa, which is the more generic form and seems to me to even be OK for sNaNs. It's even needed for some target (not to say
> 0x7f80 is actually not a NaN, but infinity.
>> PowerPC) that specify that the result has to be equal to the operand (in the single precision format, of course) in such a case. This is simpler, it ensures that any target could then detect the presence of a NaN, know which one, and can then adjust the value according to its specification if needed. I then still can't see any reason for having target specific code in that area.
> Ok, let's give an example then. On MIPS let's say you want to convert 0x7ff1 (qNaN) to single precision. The mantissa shifted to the right becomes 0, so you have to generate a new value. As you proposed, let's generate a generic value 0x7fc0 in the softfloat routines. This value has to be converted to 0x7fbf in the MIPS target code.

OK, the values that can cause a problem are all values that would have a zero mantissa once rounded to single-precision. As the PowerPC requires that the result would have a zero mantissa (and the result class set to qNaN), I can see no way to handle this case in the generic code. And even adding a #ifdef TARGET_PPC won't solve the problem as the PowerPC code would not be able to make the distinction between infinity case and qNaN case. Then, the only solution, as you already mentioned, is to check for qNaN before calling the rounding function. As the target emulation code already has to check for sNaN to be able to raise an exception when it's needed, checking for qNaN would cost nothing more; one just has to replace the if (float64_is_signaling_nan) check with a check for NaN and handle the two cases by hand. I can see no other way to have all cases handled for all target specific cases, do you?

[...]

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
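
The "check for NaN before calling the rounding function" idea from the message above can be sketched as follows. This is only an illustration under stated assumptions: the names demo_f64_is_nan, demo_round_to_f32 and demo_target_f64_to_f32 are hypothetical and are not softfloat functions, the host's own double-to-float cast stands in for the generic rounding routine, and target_default_nan is a parameter standing in for whatever NaN pattern a given target wants when the whole payload is lost. Values are written out as full 64-bit and 32-bit encodings, and the sketch does not try to adjust the quiet/signaling bit.

```c
#include <stdint.h>
#include <string.h>

/* True for any double-precision NaN: exponent all ones, mantissa non-zero. */
static int demo_f64_is_nan(uint64_t d)
{
    return (d & 0x7fffffffffffffffULL) > 0x7ff0000000000000ULL;
}

/* Stand-in for the generic rounding routine (assumes an IEEE 754 host).
 * It is only called for inputs that are not NaNs. */
static uint32_t demo_round_to_f32(uint64_t d)
{
    double in;
    float out;
    uint32_t bits;

    memcpy(&in, &d, sizeof(in));
    out = (float)in;
    memcpy(&bits, &out, sizeof(bits));
    return bits;
}

/* Target-side wrapper: NaNs are handled by hand before rounding, while the
 * double-precision mantissa is still available for inspection. */
static uint32_t demo_target_f64_to_f32(uint64_t d, uint32_t target_default_nan)
{
    if (demo_f64_is_nan(d)) {
        uint32_t sign = (uint32_t)(d >> 63) << 31;
        uint32_t mant = (uint32_t)(d >> 29) & 0x007fffffu;

        if (mant == 0) {
            /* Every payload bit would be shifted out: without this branch
             * the result would read as infinity. */
            return target_default_nan;
        }
        return sign | 0x7f800000u | mant;
    }
    return demo_round_to_f32(d);
}
```

With a MIPS-style default NaN of 0x7fbfffff passed as the second argument, a double NaN whose only set mantissa bit is the lowest one yields that default instead of the infinity encoding, while a true infinity still converts to 0x7f800000.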
Re: [Qemu-devel] [PATCH] Fix NaN handling in softfloat
On Sat, 2007-11-10 at 17:15 +0100, Aurelien Jarno wrote:
> J. Mayer wrote:
>> On Sat, 2007-11-10 at 10:35 +0100, Aurelien Jarno wrote:
>>> J. Mayer wrote:
>>>> On Thu, 2007-11-08 at 00:05 +0100, Aurelien Jarno wrote:
>>>>> On Tue, Nov 06, 2007 at 09:01:13PM +0100, J. Mayer wrote:
>>>>>> On Sat, 2007-11-03 at 22:28 +0100, Aurelien Jarno wrote:
>>>>>>> On Sat, Nov 03, 2007 at 02:06:04PM -0400, Daniel Jacobowitz wrote:
>>>>>>>> On Sat, Nov 03, 2007 at 06:35:48PM +0100, Aurelien Jarno wrote:
>>>>>>>>> Hi all, The current softfloat implementation changes qNaN into sNaN when converting between formats, for no reason. The attached patch fixes that. It also fixes an off-by-one in the extended double precision format (aka floatx80), the mantissa is 64-bit long and not 63-bit long. With this patch applied all the glibc 2.7 floating point tests are successful on MIPS and MIPSEL.
>>> [...]
>>>>> Anyway there is no way to do that in the target specific code *after the conversion*, as the detection of a mantissa being null when converting from double to single precision can only be done when both values are still known. In other words when the value is not fixed during the conversion, the value 0x7f80 can either be infinity or a conversion of NaN from double to single precision, and thus it is not possible to fix the value afterwards in the target specific code.
>>>> I don't say you have to return an infinity when the argument is a qNaN. I just say you have to return a qNaN in a generic way. Just return sign | 0x7f80 | mantissa, which is the more generic form and seems to me to even be OK for sNaNs. It's even needed for some target (not to say
>>> 0x7f80 is actually not a NaN, but infinity.
>>>> PowerPC) that specify that the result has to be equal to the operand (in the single precision format, of course) in such a case. This is simpler, it ensures that any target could then detect the presence of a NaN, know which one, and can then adjust the value according to its specification if needed. I then still can't see any reason for having target specific code in that area.
>>> Ok, let's give an example then. On MIPS let's say you want to convert 0x7ff1 (qNaN) to single precision. The mantissa shifted to the right becomes 0, so you have to generate a new value. As you proposed, let's generate a generic value 0x7fc0 in the softfloat routines. This value has to be converted to 0x7fbf in the MIPS target code.
>> OK, the values that can cause a problem are all values that would have a zero mantissa once rounded to single-precision. As the PowerPC requires that the result would have a zero mantissa (and the result class set to
> Are you sure of that? According to IEEE 754 a zero mantissa is not a NaN. And tests on a real machine show different results. 0x7ff1 is converted to 0x7fc0 on a 740/750 CPU.

First, please note that a PowerPC does not have any single precision register nor internal representation. The operation here is round to single precision (frsp) but the result is still a 64 bits float. Then the result is more likely to be 0x7fc0. 0x7FF1 seems to be a SNaN, according to what I see in the PowerPC specification. Then the result is OK: when no exception is raised, SNaN is converted to QNaN during rounding to single operation (please see below). What about 0x7FF80001, which is a QNaN? According to the PowerPC specification, this should be rounded to 0x7FF8 which is also a QNaN, then is also OK. Then rounding the mantissa and copying sign || exponent || mantissa would, in fact, always be OK in the PowerPC case. What seems to appear to me now is that the problems are due to the fact that MIPS has an inverted representation of SNaN / QNaN (if I understood well) that does not allow distinction between a rounded QNaN and an infinity...

>> qNaN), I can see no way to handle this case in the generic code. And even adding a #ifdef TARGET_PPC won't solve the problem as the PowerPC code would not be able to make the distinction between infinity case and qNaN case. Then, the only solution, as you already mentioned, is to check for qNaN before calling the rounding function. As the target emulation code already has to check for sNaN to be able to raise an exception when it's needed, checking for qNaN would cost nothing more;
> Except this is currently done *after* the call to the rounding function, using the flags returned by the softmmu routines. Doing a check before and after would slow down the emulation.

On PowerPC at least, you have to check operands for sNaN _before_ doing any floating-point operation and check the result _after_ having done the floating-point computation in order to set the flags. The sNaN operand exception must be raised before any computation is done, because it's related to the operation operands which may be lost during the operation if the target register is the same as
Re: [Qemu-devel] [PATCH] Fix NaN handling in softfloat
On Sat, 2007-11-10 at 19:09 +0100, Aurelien Jarno wrote:
> J. Mayer wrote:
>> On Sat, 2007-11-10 at 17:15 +0100, Aurelien Jarno wrote:
>>> J. Mayer wrote:
>>>> On Sat, 2007-11-10 at 10:35 +0100, Aurelien Jarno wrote:
>>>>> J. Mayer wrote:
>>>>>> On Thu, 2007-11-08 at 00:05 +0100, Aurelien Jarno wrote:
>>>>>>> On Tue, Nov 06, 2007 at 09:01:13PM +0100, J. Mayer wrote:
>>>>>>>> On Sat, 2007-11-03 at 22:28 +0100, Aurelien Jarno wrote:
>>>>>>>>> On Sat, Nov 03, 2007 at 02:06:04PM -0400, Daniel Jacobowitz wrote:
>>>>>>>>>> On Sat, Nov 03, 2007 at 06:35:48PM +0100, Aurelien Jarno wrote:
>>>>>>>>>>> Hi all, The current softfloat implementation changes qNaN into sNaN when converting between formats, for no reason. The attached patch fixes that. It also fixes an off-by-one in the extended double precision format (aka floatx80), the mantissa is 64-bit long and not 63-bit long. With this patch applied all the glibc 2.7 floating point tests are successful on MIPS and MIPSEL.
>>>>> [...]
>>>>>>> Anyway there is no way to do that in the target specific code *after the conversion*, as the detection of a mantissa being null when converting from double to single precision can only be done when both values are still known. In other words when the value is not fixed during the conversion, the value 0x7f80 can either be infinity or a conversion of NaN from double to single precision, and thus it is not possible to fix the value afterwards in the target specific code.
>>>>>> I don't say you have to return an infinity when the argument is a qNaN. I just say you have to return a qNaN in a generic way. Just return sign | 0x7f80 | mantissa, which is the more generic form and seems to me to even be OK for sNaNs. It's even needed for some target (not to say
>>>>> 0x7f80 is actually not a NaN, but infinity.
>>>>>> PowerPC) that specify that the result has to be equal to the operand (in the single precision format, of course) in such a case. This is simpler, it ensures that any target could then detect the presence of a NaN, know which one, and can then adjust the value according to its specification if needed. I then still can't see any reason for having target specific code in that area.
>>>>> Ok, let's give an example then. On MIPS let's say you want to convert 0x7ff1 (qNaN) to single precision. The mantissa shifted to the right becomes 0, so you have to generate a new value. As you proposed, let's generate a generic value 0x7fc0 in the softfloat routines. This value has to be converted to 0x7fbf in the MIPS target code.
>>>> OK, the values that can cause a problem are all values that would have a zero mantissa once rounded to single-precision. As the PowerPC requires that the result would have a zero mantissa (and the result class set to
>>> Are you sure of that? According to IEEE 754 a zero mantissa is not a NaN. And tests on a real machine show different results. 0x7ff1 is converted to 0x7fc0 on a 740/750 CPU.
>> First, please note that a PowerPC does not have any single precision register nor internal representation. The operation here is round to single precision (frsp) but the result is still a 64 bits float. Then the result is more likely to be 0x7fc0. 0x7FF1 seems to be a SNaN, according to what I see in the PowerPC specification. Then the result is OK: when no exception is raised, SNaN is converted to QNaN during rounding to single operation (please see below). What about 0x7FF80001, which is a QNaN? According to the PowerPC specification, this should be rounded to 0x7FF8 which is also a QNaN, then is also OK. Then rounding the mantissa and copying sign || exponent || mantissa would, in fact, always be OK in the PowerPC case. What seems to appear to me now is that the problems are due to the fact that MIPS has an inverted representation of SNaN / QNaN (if I understood well) that does not allow distinction between a rounded QNaN and an infinity...
> Nope it is not due to the fact that MIPS uses an inverted representation. It is the same problem on x86 or other target, except that they can allow the distinction between a rounded SNaN and an infinity. The problem is present on all targets that can represent a single precision FP

For those targets that can make the distinction between rounded sNaN and infinite case, I don't see why there should be a problem. But, I realize now, as you, that PowerPC is really not a good example for comparison as it does not know about 32 bits floats. And I also realize it cannot use the softmmu routines to round 64 bits floats to single precision.

>>>> qNaN), I can see no way to handle this case in the generic code. And even adding a #ifdef TARGET_PPC won't solve the problem as the PowerPC code would not be able to make the distinction between infinity case and qNaN case. Then, the only solution, as you already mentioned, is to check for qNaN before calling the rounding
Re: [Qemu-devel] Removal of some target CPU macros
On Wed, 2007-11-07 at 23:37 +0000, Paul Brook wrote:
>> I can check the hypervisor feature is not present, for emulating PowerPC 620 on a target that would have hypervisor emulation support. But I cannot do as if the CPU does not have the feature if it's actually available. The PowerPC 64 target emulates PowerPC 64 without the hypervisor feature, which actually does not exist but looks like a G5 machine when running Linux on it. If the emulator has the hypervisor feature enabled, I need hypervisor software to boot and manage the machine
> I agree with this much.
>> There is nothing in the CPU that would allow me to make it run as if the hypervisor mode does not exist.
> So add one. It obviously exists conceptually, because that's what the non-hypervisor qemu emulates.

I admit you're right, here... Maybe just disabling the hypervisor mode flags and cleverly initialising all hypervisor specific registers could make it act like the current PowerPC 64 target, but this is to be checked when all hypervisor features are emulated.

>> The only possible runtime solution would be to duplicate every defined 64 bits CPU to define one model supporting the hypervisor feature and another acting as if this feature does not exist (the register definitions / access rights are not the same, and are defined at CPU instantiation time, adding run-time checks there would cost a lot...) and hope run-time checks won't cost too much.
> As I mentioned earlier, from looking at all the occurrences of TARGET_PPC64H I'd expect the runtime overhead to be minimal, if it's measurable at all.

Maybe because there are a lot of things missing for the hypervisor feature to be completely emulated... All the MMU part, specific registers, ..., is missing.

> I'm not sure what you're getting at about flags being defined at instantiation time. That's the same whether you have two binaries or one.

True.

> Duplicating the CPU definitions should also be fairly trivial. You're effectively already doing it when you build the separate ppc64 and ppc64h binaries. I find it hard to believe it would be hard to do the same transformation at runtime.

Yes, it's trivial to duplicate the CPU definitions. I'm just afraid of the confusion it could introduce for the user seeing two definitions of the same CPU.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
[Fwd: Re: [Qemu-devel] multiple boot devices]
What about this patch? Are there any remarks? Is it to be applied?

-------- Forwarded Message --------
From: J. Mayer [EMAIL PROTECTED]
Reply-To: qemu-devel@nongnu.org
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] multiple boot devices
Date: Mon, 05 Nov 2007 14:04:40 +0100

On Sat, 2007-11-03 at 01:18 +0000, Thiemo Seufer wrote:
> J. Mayer wrote:
> [snip]
>> It restricts the letter to the ones historically allowed by Qemu, not to anything specific to any architecture or hw platform. What I like in my implementation, compared to the strchr..., is that it exactly tells the user which given device is incorrect.
> Well, here it makes no difference, strchr tells you exactly same as much.

Yes, you're right. I was thinking about the original strspn.

>>> Instead of the check, the code could also allow everything from 'a' to 'z' and then just AND the produced 32bit bitmap with a machine defined bitmap that would be part of QEMUMachine.
>> I guess we would better stop at 'n', because we can easily define a semantic for devices 'c' to 'm' (i.e. hard disk drives in a hardware platform specific order) but we have to define what 'o' to 'z' mean. But I agree we would better extend it now, instead of having to rework it later...
> To select the network device to boot from would probably become a 'n' 'o' 'p' 'q' series.
> [snip]
>> Here's a second pass cleanup, adding the machine dependent checks for the PC machine and the PowerPC ones. As one can see, the OpenHack'Ware firmware is able to boot from devices 'e' and 'f'. For the PowerPC machines, I chose to try to boot from the first given usable device, some may not agree with this choice. It can be noticed that the available boot devices are not the same for PowerPC PreP, g3bw and mac99 machines. As I don't know the features and requirements for the other architectures, I preferred not to add any check for those ones.
> Most other machines ignore -boot and those that don't, shouldn't break from the introduced change, so please commit it when you feel ok with it.

I'd like to know what the feelings are about this patch and whether there are specific requirements and/or problems for some platforms to be addressed before...

> I think the proposed scheme (and the implementation) is flexible enough to accommodate all relevant platforms.

Here's an updated patch that addresses the remark about network boot devices.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
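
The bitmap scheme suggested in this thread (accept letters 'a' to 'z', build a 32-bit bitmap, then AND it with a per-machine bitmap) can be sketched as below. This is not the code of the posted patch: the function names demo_parse_boot_devices and demo_usable_boot_devices are hypothetical, the machine mask is a plain parameter rather than a QEMUMachine field, and the sketch assumes an ASCII-compatible character set. It keeps the property mentioned in the discussion of telling the caller exactly which letter was rejected.

```c
#include <stdint.h>

/* Convert a -boot style string such as "cad" into a bitmap where bit 0
 * stands for 'a', bit 1 for 'b', and so on up to 'z'.
 * Returns '\0' on success, or the first offending character; *bitmap is
 * only written on success. */
static char demo_parse_boot_devices(const char *s, uint32_t *bitmap)
{
    uint32_t map = 0;

    for (; *s != '\0'; s++) {
        if (*s < 'a' || *s > 'z') {
            return *s;              /* report exactly which letter is wrong */
        }
        map |= (uint32_t)1 << (*s - 'a');
    }
    *bitmap = map;
    return '\0';
}

/* Keep only the devices the machine declares it can boot from. */
static uint32_t demo_usable_boot_devices(uint32_t requested,
                                         uint32_t machine_mask)
{
    return requested & machine_mask;
}
```

A bitmap records which devices were requested but not the order in which they were given, so a machine that wants to "boot from the first given usable device", as described for the PowerPC machines, would still need to walk the original string.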
Re: [Qemu-devel] [PATCH] Fix NaN handling in softfloat
On Thu, 2007-11-08 at 00:05 +0100, Aurelien Jarno wrote:
> On Tue, Nov 06, 2007 at 09:01:13PM +0100, J. Mayer wrote:
>> On Sat, 2007-11-03 at 22:28 +0100, Aurelien Jarno wrote:
>>> On Sat, Nov 03, 2007 at 02:06:04PM -0400, Daniel Jacobowitz wrote:
>>>> On Sat, Nov 03, 2007 at 06:35:48PM +0100, Aurelien Jarno wrote:
>>>>> Hi all, The current softfloat implementation changes qNaN into sNaN when converting between formats, for no reason. The attached patch fixes that. It also fixes an off-by-one in the extended double precision format (aka floatx80), the mantissa is 64-bit long and not 63-bit long. With this patch applied all the glibc 2.7 floating point tests are successful on MIPS and MIPSEL.
>>>> FYI, I posted a similar patch and haven't had time to get back to it. Andreas reminded me that we need to make sure at least one mantissa bit is set. If we're confident that the common NaN format will already have some bit other than the qnan/snan bit set, this is fine; otherwise, we might want to forcibly set some other mantissa bit.
>>> Please find an updated patch below. I have tried to match real x86, MIPS, HPPA, PowerPC and SPARC hardware when all mantissa bits are cleared.
>> It's a good idea to fix NaN problems here but in my opinion, it's a bad idea to have target dependent code here. This code should implement IEEE behavior. Target specific behavior / deviations from the norm have to be
> As Thiemo already said, there is no IEEE behavior. If you look at the IEEE 754 document you will see that it has requirements on what should be supported by an IEEE compliant FPU, but has very few requirements on the implementation.

OK.

>> implemented in target specific code. As targets have to check the presence of a NaN to update the FP flags, it seems that uglifying this code with target specific hacks is pointless. If the target code does not check the presence of a NaN, that means that it does not implement precise FPU emulation, then there's no need to have specific code to return a precise value (I mean target dependent) from the generic code, imho.
> I actually know very few CPU that check for NaN in general. They check for sNaN as required by IEEE 754, but rarely for qNaN as their purpose is exactly to be propagated through all FPU operations as a normal FP number would be.

CPUs do check QNaNs because most of them update a specific flag that can be checked to know there was a NaN seen during FPU operations. I don't know for all FPUs, but I can see that the PowerPC gives me 4 bits that give the class of the last FPU result and I guess you have this kind of flags in most implementations.

> Anyway there is no way to do that in the target specific code *after the conversion*, as the detection of a mantissa being null when converting from double to single precision can only be done when both values are still known. In other words when the value is not fixed during the conversion, the value 0x7f80 can either be infinity or a conversion of NaN from double to single precision, and thus it is not possible to fix the value afterwards in the target specific code.

I don't say you have to return an infinity when the argument is a qNaN. I just say you have to return a qNaN in a generic way. Just return sign | 0x7f80 | mantissa, which is the more generic form and seems to me to even be OK for sNaNs. It's even needed for some target (not to say PowerPC) that specify that the result has to be equal to the operand (in the single precision format, of course) in such a case. This is simpler, it ensures that any target could then detect the presence of a NaN, know which one, and can then adjust the value according to its specification if needed. I then still can't see any reason for having target specific code in that area.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
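
The ambiguity argued over in this message can be reproduced with a few lines of C. The sketch below is an assumption-labeled illustration, not the softfloat code: demo_narrow_nan_naive is a hypothetical helper implementing the "sign | exponent | truncated mantissa" form literally, using the full encodings (single-precision infinity is 0x7f800000). It shows that a double NaN whose set mantissa bits all sit in the low 29 bits comes out with the infinity bit pattern, which is why the result cannot be repaired after the conversion.

```c
#include <stdint.h>

/* Hypothetical helper: narrow a double-precision NaN to single precision
 * by keeping the sign, forcing the exponent to all ones and truncating the
 * mantissa from 52 to 23 bits. */
static uint32_t demo_narrow_nan_naive(uint64_t d)
{
    uint32_t sign = (uint32_t)(d >> 63) << 31;
    uint32_t mant = (uint32_t)(d >> 29) & 0x007fffffu;

    return sign | 0x7f800000u | mant;
}

/* Single-precision classification on raw bit patterns. */
static int demo_f32_is_inf(uint32_t f)
{
    return (f & 0x7fffffffu) == 0x7f800000u;
}

static int demo_f32_is_nan(uint32_t f)
{
    return (f & 0x7fffffffu) > 0x7f800000u;
}
```

For the input 0x7ff0000000000001 (a NaN with only the lowest mantissa bit set) the helper returns 0x7f800000, which demo_f32_is_inf accepts and demo_f32_is_nan rejects; for 0x7ff8000000000000 it returns 0x7fc00000, which is still a NaN.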
Re: [Fwd: Re: [Qemu-devel] multiple boot devices]
On Sat, 2007-11-10 at 00:43 +0100, andrzej zaborowski wrote:
> On 09/11/2007, J. Mayer [EMAIL PROTECTED] wrote:
>> What about this patch? Are there any remarks? Is it to be applied?
> Yes, I'm also in favour. Regarding the machines that boot off flash, I will try to come up with some logical syntax. The Palm T|E board can boot off the ROM and it needs no kernel image in such case. Currently I was using -option-rom rom.image -boot n for this, as a hack, next week I should again have some time to play with it.

What I do is use the -L / -bios options to specify a boot ROM, the same way we can do for other machines. Maybe this scheme is not applicable to all machines (?). The only annoying thing I see is that I have to give qemu a fake disk image, because qemu refuses to run if no block device is specified, which is a pity when the target machine does not have any block device available.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] Removal of some target CPU macros
On Wed, 2007-11-07 at 22:55 +0100, Fabrice Bellard wrote:
> Jocelyn Mayer wrote:
>> On Wed, 2007-11-07 at 19:32 +0100, Fabrice Bellard wrote:
>>> I noticed that some target CPUs macros have been added while they do not seem necessary. I don't like that because it introduces more #ifdefs which prevent making a version supporting simultaneously all the CPUs. In particular I saw the following:
>>> - TARGET_MIPSN32 : it is always combined with TARGET_MIPS64 in target-mips/. If its only usage is to select a different Linux ABI, then I suggest keeping TARGET_MIPS64 and using another define to choose that.
>>> - TARGET_PPC64H, TARGET_PPCEMB : I see no reason why they cannot be handled dynamically as the other PowerPC CPU types, provided that TARGET_PPC64 is defined. Is it the long term plan ?
>> PowerPC embedded models are already available (or rather "should be", as none are actually implemented) when PPC64 is defined. But as those are mainly PowerPC 32 with some extensions to manipulate the 64 bits GPR, it's a great help if we can avoid doing all operations in 64 bits when running on a 32 bits host (which would decrease performance by at least a factor of two, which is not acceptable). Then having a specific 32 bits target using 64 bits registers is very useful if one wants to use those features, but may be disabled if the host is 64 bits. Note that most (all ?) embedded Freescale PowerPC microcontrollers implement those extensions and that some people are greatly interested in having a usable emulation available for those CPUs.
> OK for the speed gain, but such features make the code more difficult to test because there are a lot of possible combinations. I'd say the same about the fact that ppc_gpr_t can be 64 bit long on a pure 32 bit CPU.

I have no choice here, as the 64 bits extensions of those CPUs use the GPR for computations. And the programs are not supposed to take any care of the higher 32 bits when not using those extensions. I cannot make them 64 bits for all PowerPC targets and those CPUs are not PowerPC 64 ones, so they match neither the PowerPC target nor the PowerPC 64 one.

[...]

>> - someone provides an open-source hypervisor, compatible with the ones used on real machines, that would allow at least Linux to be able to run on a CPU with hypervisor mode available. Most 64 bits PowerPC, including the 970 (i.e. G5), have the hypervisor mode support implemented. If the hypervisor mode emulation is present, the OS won't be allowed to access most SPR and some exceptions will need to have some specific handlers in the hypervisor firmware. As I don't know of such software being available, the hypervisor mode cannot be enabled for standard PowerPC 64 emulation; or no-one will be able to actually use the emulator, except if using the venerable but mostly undocumented (and nearly impossible to find on real hardware) PowerPC 620 CPU. Furthermore, running (or emulating) a SMP machine on a 64 bits PowerPC with hypervisor features without hypervisor software is exactly impossible. Then I don't see how we can do without a separate target for hypervisor features support.
> What you say does not justify the separate ppc64h target : it just implies that you need to add a separate machine to make hypervisor tests.

There is no documented way to disable the hypervisor feature on 64 bits PowerPC CPUs, even if the Apple SMU is said to do this before the CPU boots (but there is no known way to check if it's true). Then, it will make all emulated PowerPC 64 CPUs (except the 620) unusable, as there will be no known open-source software to manage them. Removing the ppc64h target means, for me, removing any option to emulate the hypervisor feature at any time (if removed) or removing the ability to use the PowerPC 64 targets the way they are when booting on Apple G5 machines (if merged with the ppc64 target). None of those options seem acceptable to me.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] Removal of some target CPU macros
On Wed, 2007-11-07 at 22:47 +, Paul Brook wrote: Removing the ppc64h target means, for me, removing any option to emulate the hypervisor feature at any time (if removed) or removing the ability to use the PowerPC 64 targets the way they are when booting on Apple G5 machines (if merged with the ppc64 target). None of those options seem acceptable to me. ICBW, but it looks like it should be fairly straightforward to replace TARGET_PPC64H with a runtime check. The only really significant overhead I can see is having to flush an extra set of TLBs, and I guess there are ways round that if it is a problem. I can check the hypervisor feature is not present, for emulating PowerPC 620 on a target that would have hypervisor emulation support. But I cannot do as if the CPU do not have the feature if it's actually available. The PowerPC 64 target emulates PowerPC 64 without the hypervisor feature, which actually do not exist but looks like a G5 machine when running Linux on it. If the emulator has the hypervisor feature enabled, I need an hypervisor software to boot and manage the machine (trap the hypervisor exception, update or emulate the registers the OS is not allowed to access anymore, ...), which I don't have, then I would not be able to do any test or run any OS on this emulated target (you could argue that I do not have it to test the ppc64h target too, which is true...). There is nothing in the CPU that would allow me to make it run as if the hypervisor mode do not exists. Then I do not have anything to check. The only possible runtime solution would be to duplicate every defined 64 bits CPU to define one model supporting hypervisor feature and another acting as this feature do not exist (the register definitions / access rights are not the same, and are defined at CPU instanciation time, adding run-time checks there would cost a lot...) and hope run-time checks won't cost too much. 
I don't think this solution is cleaner than having a separate target: it would confuse users, imho, and it does not seem easier to test. I can see why you want CPUs with and without hypervisor mode, I'm just not convinced it needs a separate qemu binary. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] [PATCH] Fix NaN handling in softfloat
On Sat, 2007-11-03 at 22:28 +0100, Aurelien Jarno wrote: On Sat, Nov 03, 2007 at 02:06:04PM -0400, Daniel Jacobowitz wrote: On Sat, Nov 03, 2007 at 06:35:48PM +0100, Aurelien Jarno wrote: Hi all, The current softfloat implementation changes qNaN into sNaN when converting between formats, for no reason. The attached patch fixes that. It also fixes an off-by-one in the extended double precision format (aka floatx80): the mantissa is 64 bits long, not 63 bits long. With this patch applied all the glibc 2.7 floating point tests are successful on MIPS and MIPSEL. FYI, I posted a similar patch and haven't had time to get back to it. Andreas reminded me that we need to make sure at least one mantissa bit is set. If we're confident that the common NaN format will already have some bit other than the qnan/snan bit set, this is fine; otherwise, we might want to forcibly set some other mantissa bit. Please find an updated patch below. I have tried to match real x86, MIPS, HPPA, PowerPC and SPARC hardware when all mantissa bits are cleared. It's a good idea to fix NaN problems here but, in my opinion, it's a bad idea to have target-dependent code here. This code should implement IEEE behavior. Target-specific behavior / deviations from the norm have to be implemented in target-specific code. As targets have to check for the presence of a NaN to update the FP flags, it seems that uglifying this code with target-specific hacks is pointless. If the target code does not check for the presence of a NaN, that means it does not implement precise FPU emulation, so there's no need to have specific code to return a precise (I mean target-dependent) value from the generic code, imho. [...] -- J. Mayer [EMAIL PROTECTED] Never organized
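The two concerns raised in this thread (keep the quiet/signaling class across a format conversion, and never return a NaN with an all-zero mantissa) can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the patch under discussion and not the softfloat API: `nan64_to_nan32` is a hypothetical name, the quiet bit is assumed to be the most significant fraction bit being set (the x86 / PowerPC / SPARC convention), and the choice of which bit to force when the truncated fraction is zero is arbitrary.

```c
#include <stdint.h>

/* Hypothetical helper (not part of softfloat): narrow the bit pattern of
 * an IEEE 754 double-precision NaN to a single-precision NaN.
 * Assumption: the most significant fraction bit is the "quiet" bit. */
static uint32_t nan64_to_nan32(uint64_t a)
{
    uint32_t sign = (uint32_t)(a >> 63) << 31;
    /* Keep the top 23 of the 52 fraction bits, so bit 51 (the quiet bit
     * of a double) lands on bit 22 (the quiet bit of a float). */
    uint32_t frac = (uint32_t)(a >> 29) & 0x007FFFFFu;

    /* If only low-order payload bits were set, truncation leaves a zero
     * fraction, which would encode infinity instead of a NaN.
     * Assumption: forcing the lowest bit is good enough for this sketch;
     * it keeps the result in the signaling class. */
    if (frac == 0)
        frac = 1;

    return sign | 0x7F800000u | frac;
}
```

For instance, the default quiet NaN `0x7FF8000000000000` narrows to `0x7FC00000`, while a signaling NaN whose only payload bit is bit 0 narrows to `0x7F800001` rather than to infinity.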
Re: [Qemu-devel] multiple boot devices
On Sat, 2007-11-03 at 01:18 +, Thiemo Seufer wrote: J. Mayer wrote: [snip] It restricts the letter to the ones historically allowed by Qemu, not to anything specific to any architecture or hw platform. What I like in my implementation, compared to the strchr..., is that it tells the user exactly which given device is incorrect. Well, here it makes no difference, strchr tells you exactly as much. Yes, you're right. I was thinking about the original strspn. Instead of the check, the code could also allow everything from 'a' to 'z' and then just AND the produced 32-bit bitmap with a machine-defined bitmap that would be part of QEMUMachine. I guess we had better stop at 'n', because we can easily define a semantic for devices 'c' to 'm' (i.e. hard disk drives in a hardware platform specific order) but we have to define what 'o' to 'z' mean. But I agree we had better extend it now, instead of having to rework it later... Selecting the network device to boot from would probably become an 'n' 'o' 'p' 'q' series. [snip] Here's a second pass cleanup, adding the machine-dependent checks for the PC machine and the PowerPC ones. As one can see, the OpenHack'Ware firmware is able to boot from devices 'e' and 'f'. For the PowerPC machines, I chose to try to boot from the first given usable device; some may not agree with this choice. Note that the available boot devices are not the same for PowerPC PreP, g3bw and mac99 machines. As I don't know the features and requirements for the other architectures, I preferred not to add any check for those. Most other machines ignore -boot and those that don't, shouldn't break from the introduced change, so please commit it when you feel ok with it. I'd like to know what people think of this patch and whether there are specific requirements and/or problems for some platforms to be addressed first... I think the proposed scheme (and the implementation) is flexible enough to accommodate all relevant platforms.
Here's an updated patch that address the remark about network boot devices. -- J. Mayer [EMAIL PROTECTED] Never organized Index: vl.c === RCS file: /sources/qemu/qemu/vl.c,v retrieving revision 1.353 diff -u -d -d -p -r1.353 vl.c --- vl.c 31 Oct 2007 01:54:03 - 1.353 +++ vl.c 5 Nov 2007 12:07:05 - @@ -162,12 +162,6 @@ static DisplayState display_state; int nographic; const char* keyboard_layout = NULL; int64_t ticks_per_sec; -#if defined(TARGET_I386) -#define MAX_BOOT_DEVICES 3 -#else -#define MAX_BOOT_DEVICES 1 -#endif -static char boot_device[MAX_BOOT_DEVICES + 1]; int ram_size; int pit_min_timer_count = 0; int nb_nics; @@ -7556,14 +7552,16 @@ int main(int argc, char **argv) int use_gdbstub; const char *gdbstub_port; #endif +uint32_t boot_devices_bitmap = 0; int i, cdrom_index, pflash_index; -int snapshot, linux_boot; +int snapshot, linux_boot, net_boot; const char *initrd_filename; const char *hd_filename[MAX_DISKS], *fd_filename[MAX_FD]; const char *pflash_filename[MAX_PFLASH]; const char *sd_filename; const char *mtd_filename; const char *kernel_filename, *kernel_cmdline; +const char *boot_devices = ; DisplayState *ds = display_state; int cyls, heads, secs, translation; char net_clients[MAX_NET_CLIENTS][256]; @@ -7815,20 +7813,34 @@ int main(int argc, char **argv) } break; case QEMU_OPTION_boot: -if (strlen(optarg) MAX_BOOT_DEVICES) { -fprintf(stderr, qemu: too many boot devices\n); -exit(1); -} -strncpy(boot_device, optarg, MAX_BOOT_DEVICES); -#if defined(TARGET_SPARC) || defined(TARGET_I386) -#define BOOTCHARS acdn -#else -#define BOOTCHARS acd -#endif -if (strlen(boot_device) != strspn(boot_device, BOOTCHARS)) { -fprintf(stderr, qemu: invalid boot device -sequence '%s'\n, boot_device); -exit(1); +boot_devices = optarg; +/* We just do some generic consistency checks */ +{ +/* Could easily be extended to 64 devices if needed */ +const unsigned char *p; + +boot_devices_bitmap = 0; +for (p = boot_devices; *p != '\0'; p++) { +/* Allowed boot devices are: + * a b 
: floppy disk drives + * c ... f : IDE disk drives + * g ... m : machine implementation dependant drives + * n ... p : network devices + * It's up to each machine implementation to check
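The scheme discussed in this thread (turn the -boot letters into a 32-bit bitmap, then let each machine mask it against the devices it supports) might look roughly like the sketch below. It is a standalone illustration, not the patch itself: `parse_boot_devices` and the `allowed` mask are hypothetical names, rejecting duplicate letters is an assumption, and the letters are assumed to be contiguous as in ASCII.

```c
#include <stdint.h>

/* Hypothetical helper: bit 0 stands for 'a', bit 1 for 'b', and so on.
 * 'allowed' is the set of letters the selected machine can boot from.
 * Returns 0 on success and stores the bitmap, otherwise returns the
 * first offending character so the caller can report it to the user. */
static int parse_boot_devices(const char *devices, uint32_t allowed,
                              uint32_t *bitmap)
{
    uint32_t map = 0;
    const char *p;

    for (p = devices; *p != '\0'; p++) {
        uint32_t bit;

        if (*p < 'a' || *p > 'z')
            return *p;                 /* not a boot device letter */
        bit = UINT32_C(1) << (*p - 'a');
        if (!(allowed & bit))
            return *p;                 /* not supported by this machine */
        if (map & bit)
            return *p;                 /* given twice (assumed to be an error) */
        map |= bit;
    }
    *bitmap = map;
    return 0;
}
```

The bitmap only records which devices were requested; the boot order still has to be read from the original string, which is why the caller would keep both.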
Re: [Qemu-devel] [PATCH, RFC] Disable implicit self-modifying code support for RISC CPUs
On Sun, 2007-11-04 at 09:12 +0200, Blue Swirl wrote: On 11/4/07, Fabrice Bellard [EMAIL PROTECTED] wrote: Blue Swirl wrote: Hi, RISC CPUs don't support self-modifying code unless the affected area is flushed explicitly. This patch disables the extra effort for SMC. The changes in this version would affect all CPUs except x86, but I'd like to see if there are problems with some target, so that the committed change can be limited. Without comments, I'll just disable SMC for Sparc, as there are no problems. So please comment, especially if you want to opt in. For some reason, I can't disable all TB/TLB flushing, for example there was already one line with TARGET_HAS_SMC || 1, but removing the || 1 part causes crashing. Does anyone know why? With the current QEMU architecture, you cannot disable self-modifying code as you did. This is why I did not fully support the TARGET_HAS_SMC flag. The problem is that the translator makes the assumption that the RAM and the TB contents are consistent, for example when handling exceptions. Suppressing this assumption is possible but requires more work. I think the conclusion is that we would need some kind of emulator for i-cache for any accurate emulation. And handling the boot loader may need an uncached mode. The performance benefit from disabling SMC is unnoticeable according to my benchmarks. Adding a TB flush to i-cache flushing made things worse. Moreover, SMC is hardly ever used on Sparc. I'll just commit the debug statement fixes and the fix that separates PAGE_READ from PAGE_EXEC for Sparc. This patch is absolutely not needed. You have to directly call tlb_set_page_exec instead of tlb_set_page if you want to separate PAGE_READ from PAGE_EXEC. #ifdef TARGET_xxx should never occur in generic code and in that specific case, it's the Sparc target code that has to be fixed... Maybe this issue should be documented in qemu-tech.texi, there are also frequently some questions about caches.
Yes, some documentation on such tricks can never hurt! -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] [RFC] linux-user (mostly syscall.c)
On Sun, 2007-11-04 at 01:51 +, Paul Brook wrote: If you take a close look, you'll find more variations between Linux ABIs for different CPUs than between all BSD implementations: common syscalls of all BSD flavors do the same thing (and have the same ABI whatever the CPU...). You'll also find very few variations between the syscalls common to BSD and Linux because most of those directly map POSIX defined functions. Then, following the given argument, we never should try to share any code between linux-user for different targets, as the Linux ABI and behavior is different for different CPUs... I'd guess that the ones that are all the same are the ones that don't take any real effort to implement in the first place. If you can combine the implementations I'd also expect to be able to do cross emulation. e.g. run *BSD applications on a Linux host. This definitely works for simple cases, even in the extreme case of a windows host - as you say many syscalls map directly onto POSIX functions so there is only ever one implementation. Whether it works well enough for real applications or whole distributions of software I'm not so sure. If you can't do cross emulation I'm sceptical about how much they can be combined. Oops... I should have been more precise. What I had in mind was BSD-on-Linux. Let's say OpenBSD / NetBSD. FreeBSD has some specific tricks that might be difficult to map onto Linux (or even other BSDs), not to mention Darwin, which is quite impossible to emulate (unless one wants to emulate the IOKit...). The main difficulties in emulating BSD on Linux are the sysctl syscall, the trace facilities and the ioctls. I guess we can forget the ioctls... Most of the other syscall mappings are much like mapping one Linux port to another. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] How to split vl.h
On Sun, 2007-11-04 at 12:17 +, Paul Brook wrote: I have another solution: include all architecture specific files from the main file. I'd really rather not do this. I doubt it's going to be a win, as now you have to recompile the whole thing every time you change the implementation. At least with vl.h you only have to recompile when you change the interface. My feeling is that adding a hw/hw.h, included in all hw/*.c files, would greatly improve the situation: changing vl.h would mean recompiling the core emulator object files, while changing hw/hw.h would mean recompiling the hardware library. A first pass could be achieved with minimal effort, just by moving all prototypes and structure definitions that can be moved without having to change vl.c. Then things could be refined by moving some hardware-specific code from vl.c to the hw subdirectory: for example, the USB or display registration functions could go in a file in hw, which would avoid having USBDevice or DisplayState defined globally. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] How to split vl.h
On Sun, 2007-11-04 at 17:54 +, Paul Brook wrote: On Sunday 04 November 2007, J. Mayer wrote: On Sun, 2007-11-04 at 12:17 +, Paul Brook wrote: I have another solution: include all architecture specific files from the main file. I'd really rather not do this. I doubt it's going to be a win, as now you have to recompile the whole thing every time you change the implementation. At least with vl.h you only have to recompile when you change the interface. What I feel about this is that adding a hw/hw.h, included in all hw/*.c files would greatly improve the situation: changing vl.h would lead to recompile the core emulator object files, changing hw/hw.h would lead to recompile the hardware library. Well, most of the core emulator doesn't depend on vl.h anyway. It's just the device emulation and host disk/display code. I'm not sure a single hw/hw.h file will give any benefit because there's a fair amount of interfacing between the target devices emulation and the host side interaction code. i.e. there's not much that's only used inside hw/. hw/ is about as big as most of the rest of qemu put together, so splitting that is probably going to get the biggest wins. The hw library contains a lot of code, but not all of it is compiled for all targets. How about dividing things up by category? e.g. Have header files for all of (in no particular order): - Things included by everything - The block IO layer. - The character IO layer - Network IO layer. - Display interface. - Generic Device infrastructure (memory mapping, IRQs, etc). Maybe subdivide for common busses like scsi/usb/pci/i2c. - Prototypes for device emulation init routines. Each file can then include whichever categories it needs. Yes, that could be a great solution. Mine was just the quick, least-effort proposal! It seems you should also have headers for target-specific declarations. This would avoid recompiling all targets when working on devices specific to only one or a few of them. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] qemu vl.c vl.h hw/an5206.c hw/etraxfs.c hw/inte...
On Sat, 2007-11-03 at 01:18 +, Thiemo Seufer wrote: J. Mayer wrote: [snip] It restricts the letter to the ones historically allowed by Qemu, not to anything specific to any architecture or hw platform. What I like in my implementation, compared to the strchr..., is that it tells the user exactly which given device is incorrect. Well, here it makes no difference, strchr tells you exactly as much. Yes, you're right. I was thinking about the original strspn. Instead of the check, the code could also allow everything from 'a' to 'z' and then just AND the produced 32-bit bitmap with a machine-defined bitmap that would be part of QEMUMachine. I guess we had better stop at 'n', because we can easily define a semantic for devices 'c' to 'm' (i.e. hard disk drives in a hardware platform specific order) but we have to define what 'o' to 'z' mean. But I agree we had better extend it now, instead of having to rework it later... Selecting the network device to boot from would probably become an 'n' 'o' 'p' 'q' series. Seems OK. Can we say 'c' to 'm' is sufficient to address all disk drive cases, or are more possibilities needed for booting from SCSI or MTD devices? Maybe 'u'... for USB, as most available machines know how to boot from USB in the real world? Or may we just consider that 'c' to 'm' are sufficient and that it's up to the machine to determine the real meaning of the letters, according to its implementation? In that case, I think we had better provide a per-machine callback to handle the '?' case, printing help on the available boot device letters and their meaning... [snip] Here's a second pass cleanup, adding the machine-dependent checks for the PC machine and the PowerPC ones. As one can see, the OpenHack'Ware firmware is able to boot from devices 'e' and 'f'. For the PowerPC machines, I chose to try to boot from the first given usable device; some may not agree with this choice.
Note that the available boot devices are not the same for PowerPC PreP, g3bw and mac99 machines. As I don't know the features and requirements for the other architectures, I preferred not to add any check for those. Most other machines ignore -boot and those that don't, shouldn't break from the introduced change, so please commit it when you feel ok with it. I'd like to know what people think of this patch and whether there are specific requirements and/or problems for some platforms to be addressed first... I think the proposed scheme (and the implementation) is flexible enough to accommodate all relevant platforms. I think there are still a few problems here: 1/ it would not be easy to add a way to use the disk syntax as proposed here, but this could be useful. Another option could be added for this; or it could be part of the '-disk' syntax. 2/ doing a generic check in vl.c using the machine features would need a major rework. We would then have to first parse all options, then retrieve the machine features, then check all options against those features. But this can be designed and done later. 3/ it would be a great idea to provide a way to boot without any block device available. Embedded devices often just have a flash device or a ROM, so the checks done in vl.c to be sure there is at least one device present should be machine specific, imho. Then having an empty boot_devices string may not be a mistake. The last two points can certainly be addressed separately. For the first one, it seems to me that defining how it has to work would be useful: if it needs the -boot option to be extended or redesigned, I think the best would be to do it in the same patch... -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] [RFC] linux-user (mostly syscall.c)
On Sat, 2007-11-03 at 01:21 +, Thiemo Seufer wrote: Thayne Harbaugh wrote: There are several things that I'd like to see addressed in linux-user. Some of these are to fix bugs, some are to make qemu linux-user more like the Linux kernel, some are to make the internal qemu interfaces more consistent. An internal coding practice that is being addressed bit-by-bit is that of managing the interface between the host and the target. Currently this is a bit sloppy and inconsistent (some of which I've contributed to). There are examples of using target addresses for host pointers and host errnos for target errnos, using different types between target and host that don't sign-extend properly, as well as other things. This causes compiler warnings to actual run-time bugs. Currently I'm reviewing all of the linux-user code (mostly syscall.c) to fix these inconsistencies. I will be writing developer documentation describing the coding practices that should govern the target/host interface and submitting patches for the fixes. As obvious as it may seem I'll re-state that the linux-user emulation is emulating the Linux kernel (duh!). There are portions of qemu linux-user that are even excerpted directly from the Linux kernel. Consequently it is useful for internal qemu data and functions to closely mimic the kernel for best code sharing. There are also advantages to even structuring qemu directly and file organization in similar divisions, groupings and locations. Some of this organization might lead to good division so that other user/kernel divisions are cleaner (different kernel versions, other OSes - darwin-user and others). Internal qemu interfaces are consistent - except when they aren't. This causes coding errors when passing target and host arguments or return codes. I'll be documenting the coding practices as well as submitting patches to make these consistent. (That sounds a bit redundant with other things I've mentioned). 
I have about 40 patches already worked up that do this. Some of those patches might be broken up smaller. The qemu that we've been working with is nearly rock solid (still a few more bugs being wrung out). It can nearly build an entire Debian arm distribution for an arm target being hosted on x86_64. We're quite excited to get our patches upstream so that others can benefit and to ease our maintenance overhead. We're also turning our focus to PPC and other archs. Please let me know if you support the general idea of the coding changes above: General clean-up, consistent target/host interfaces, file splitting/reorganizing, etc.. In the meantime I'll be putting together the developer documentation/coding guidelines for review. FWIW, I agree with everything you said above. I agree too. Code cleanup and sanitization is needed there. I just have reservations about the code splitting point: as with the vl.h splitting, it should not lead to files with only one or two small functions inside. But it could be great to group the syscalls by categories, or so. For example, putting all POSIX compliant syscalls in a single file and using a syscall table could make it quite easy to develop a BSD-user target (I did this in the past, not in Qemu though...). POSIX compliant interfaces can mostly be shared with the Linux ones, and a lot of other syscalls are common to the 3 BSD flavors (Net, Open and Free..). Being able to add a BSD target sharing the same code would be proof that the code is flexible and well organized; I guess large parts of the Darwin user target could also be merged with a FreeBSD user target... These are just my two cents: I'm not saying it has to be implemented soon, I just think that keeping those long-term goals in mind may help in getting a flexible and clean implementation... -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] [RFC] linux-user (mostly syscall.c)
On Sat, 2007-11-03 at 19:16 -0600, Thayne Harbaugh wrote: On Sat, 2007-11-03 at 20:13 +0100, Fabrice Bellard wrote: Thayne Harbaugh wrote: On Sat, 2007-11-03 at 13:52 +0100, J. Mayer wrote: On Sat, 2007-11-03 at 01:21 +, Thiemo Seufer wrote: [...] But it could be great to group the syscalls by categories, or so. For example, putting all POSIX compliant syscalls in a single file and using a syscall table could make quite easy to develop a BSD-user target (I did this in the past, not in Qemu though...). POSIX compliant interfaces can mostly be shared with Linux ones and a lot of other syscalls are common to the 3 BSD flavors (Net, Open and Free..). Being able to add a BSD target sharing the same code would be a proof the code is flexible and well organized; I guess large parts of the Darwin user target could also be merged with a FreeBSD user target... That's a reasonable strategy as well. I've looked through some of the darwin code and have considered how common code could be merged. I am strongly against such merges. Different OS emulation must be handled in different directories (and maybe even in different projects) as they are likely to have subtle differences which make it impossible to test a modification made for one OS without testing all the other OSes. Agreed. If you take a close look, you'll find more variations between Linux ABIs for different CPUs than between all BSD implementations: common syscalls of all BSD flavors do the same thing (and have the same ABI whatever the CPU...). You'll also find very few variations between the syscalls common to BSD and Linux, because most of those map directly onto POSIX-defined functions. So, following the given argument, we should never try to share any code between linux-user for different targets, as the Linux ABI and behavior are different for different CPUs... -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] qemu exec-all.h host-utils.c host-utils.h targe...
On Sun, 2007-11-04 at 02:24 +, Jocelyn Mayer wrote: CVSROOT: /sources/qemu Module name: qemu Changes by: Jocelyn Mayer j_mayer 07/11/04 02:24:58 Modified files: . : exec-all.h host-utils.c host-utils.h target-alpha : op.c target-i386: helper.c Log message: For consistency, move muls64 / mulu64 prototypes to host-utils.h Make x86_64 optimized versions inline. Following this patch, I also got optimized versions of muls64 / mulu64 / clz64 for PowerPC 64 and clz32 for PowerPC 32 hosts. Seems like it could be useful... -- J. Mayer [EMAIL PROTECTED] Never organized Index: host-utils.h === RCS file: /sources/qemu/qemu/host-utils.h,v retrieving revision 1.3 diff -u -d -d -p -r1.3 host-utils.h --- host-utils.h 4 Nov 2007 02:24:57 - 1.3 +++ host-utils.h 4 Nov 2007 02:26:34 - @@ -40,6 +40,25 @@ static always_inline void muls64 (uint64 : =d (*phigh), =a (*plow) : a (a), 0 (b)); } +#elif defined(__powerpc64__) +#define __HAVE_FAST_MULU64__ +static always_inline void mulu64 (uint64_t *plow, uint64_t *phigh, + uint64_t a, uint64_t b) +{ +__asm__ (mulld %1, %2, %3 \n\t + mulhdu %0, %2, %3 \n\t + : =r(*phigh), =r(*plow) + : r(a), r(b)); +} +#define __HAVE_FAST_MULS64__ +static always_inline void muls64 (uint64_t *plow, uint64_t *phigh, + uint64_t a, uint64_t b) +{ +__asm__ (mulld %1, %2, %3 \n\t + mulhd %0, %2, %3 \n\t + : =r(*phigh), =r(*plow) + : r(a), r(b)); +} #else void muls64(int64_t *phigh, int64_t *plow, int64_t a, int64_t b); void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b); @@ -50,7 +69,19 @@ void mulu64(uint64_t *phigh, uint64_t *p cope with that. */ /* Binary search for leading zeros. 
*/ +#if defined(__powerpc__) +#define __HAVE_FAST_CLZ32__ +static always_inline int clz32 (uint32_t val) +{ +int cnt; + +__asm__ (cntlzw %0, %1 \n\t + : =r(cnt) + : r(val)); +return cnt; +} +#else static always_inline int clz32(uint32_t val) { int cnt = 0; @@ -80,12 +111,26 @@ static always_inline int clz32(uint32_t } return cnt; } +#endif static always_inline int clo32(uint32_t val) { return clz32(~val); } +#if defined(__powerpc64__) +#define __HAVE_FAST_CLZ64__ +static always_inline int clz64 (uint32_t val) +{ +int cnt; + +__asm__ (cntlzd %0, %1 \n\t + : =r(cnt) + : r(val)); + +return cnt; +} +#else static always_inline int clz64(uint64_t val) { int cnt = 0; @@ -98,6 +143,7 @@ static always_inline int clz64(uint64_t return cnt + clz32(val); } +#endif static always_inline int clo64(uint64_t val) {
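For comparison with the inline-assembly versions in this patch, here is a portable sketch of the binary-search fallback they replace. It is an illustration only: the `_portable` names are hypothetical and the code is not copied from host-utils.h. Note that the 64-bit variant takes a `uint64_t` argument so that the upper word is not truncated, and that both functions return the full width (32 or 64) for a zero input, which is also what the `cntlzw` and `cntlzd` instructions produce.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit value by binary search: at each step,
 * if the upper part of the remaining window is empty, skip over it. */
static int clz32_portable(uint32_t val)
{
    int cnt = 0;

    if (val == 0)
        return 32;
    if (!(val & 0xFFFF0000u)) { cnt += 16; val <<= 16; }
    if (!(val & 0xFF000000u)) { cnt += 8;  val <<= 8;  }
    if (!(val & 0xF0000000u)) { cnt += 4;  val <<= 4;  }
    if (!(val & 0xC0000000u)) { cnt += 2;  val <<= 2;  }
    if (!(val & 0x80000000u)) { cnt += 1; }
    return cnt;
}

/* 64-bit variant built on the 32-bit one: look at the high word first
 * and fall through to the low word only if the high word is zero. */
static int clz64_portable(uint64_t val)
{
    uint32_t high = (uint32_t)(val >> 32);

    if (high != 0)
        return clz32_portable(high);
    return 32 + clz32_portable((uint32_t)val);
}
```

A host-specific fast path can then be checked against these functions over a handful of values (0, 1, the top bit, and a value with only bit 32 set) before it is trusted.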
Re: [Qemu-devel] qemu-system-ppc problem with PVR access from user space
On Fri, 2007-11-02 at 08:04 -0500, Jason Wessel wrote: The typical kernel + user space I boot on the prep machine no longer boots due to an issue accessing the PVR special purpose register. When the PVR is accessed from user space, it should generate an exception with the PC set to the instruction that it occurred at when it saves to the stack. In the latest CVS, it is off by 4 bytes. With out the fix /sbin/init gets killed because the kernel's trap handler which does the userspace emulation of the instruction does not clean up the trap. I am using the attached patch to work around the problem, but I wonder if there is a more generic problem that was introduced as a regression with all ppc merges in the last month or so, given this used to work fine through the generic handler. Any insight into this would certainly be useful. Seems like I made a mistake for program exception generation while fixing floating-point ones, I'm sorry. Your patch is incorrect but the one attached should fix the problem. Could you please check it in your case ? -- J. Mayer [EMAIL PROTECTED] Never organized Index: target-ppc/helper.c === RCS file: /sources/qemu/qemu/target-ppc/helper.c,v retrieving revision 1.85 diff -u -d -d -p -r1.85 helper.c --- target-ppc/helper.c 28 Oct 2007 00:55:05 - 1.85 +++ target-ppc/helper.c 2 Nov 2007 13:35:52 - @@ -2146,10 +2145,9 @@ static always_inline void powerpc_excp ( new_msr |= (target_ulong)1 MSR_HV; #endif msr |= 0x0010; -if (msr_fe0 != msr_fe1) { -msr |= 0x0001; -goto store_current; -} +if (msr_fe0 == msr_fe1) +goto store_next; +msr |= 0x0001; break; case POWERPC_EXCP_INVAL: #if defined (DEBUG_EXCEPTIONS) @@ -2187,7 +2185,7 @@ static always_inline void powerpc_excp ( env-error_code); break; } -goto store_next; +goto store_current; case POWERPC_EXCP_FPU: /* Floating-point unavailable exception */ new_msr = ~((target_ulong)1 MSR_RI); #if defined(TARGET_PPC64H)
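Why a saved PC that is off by 4 bytes is fatal here can be shown with a toy model of a trap handler that emulates `mfpvr`. This is a sketch under stated assumptions, not kernel or QEMU code: `is_mfpvr_at` is a hypothetical helper, and the two constants are derived from the `mfspr` encoding (primary opcode 31, extended opcode 339, SPR 287 for the PVR) with the destination register field masked out.

```c
#include <stddef.h>
#include <stdint.h>

/* mfspr rD, 287 (PVR) with the rD field cleared, and a mask that
 * ignores rD. Both values are derived from the instruction encoding. */
#define MFSPR_PVR       0x7C1F42A6u
#define MFSPR_PVR_MASK  0xFC1FFFFFu

/* Toy model of what an emulating trap handler does first: fetch the
 * instruction at the saved PC and check whether it is mfpvr.
 * 'code' holds n_insns instructions loaded at address 'base'. */
static int is_mfpvr_at(const uint32_t *code, size_t n_insns,
                       uint32_t base, uint32_t saved_pc)
{
    size_t idx;

    if (saved_pc < base)
        return 0;
    idx = (saved_pc - base) / 4;
    if (idx >= n_insns)
        return 0;
    return (code[idx] & MFSPR_PVR_MASK) == MFSPR_PVR;
}
```

With `mfpvr r3` (0x7C7F42A6) at address 0x1000 followed by a `nop`, the check succeeds for a saved PC of 0x1000 and fails for 0x1004, in which case the handler has nothing to emulate and the process is killed.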
Re: [Qemu-devel] qemu-system-ppc problem with PVR access from user space
On Fri, 2007-11-02 at 16:46 -0400, Daniel Jacobowitz wrote: On Fri, Nov 02, 2007 at 05:23:59PM +0100, Jocelyn Mayer wrote: No, it's not accidental. An application accessing privileged SPRs, including the PVR, is likely to be buggy. I checked in the kernel (2.6.23): trapping the mfpvr instruction is a huge bug because it breaks the virtualisation features of the PowerPC architecture. Applications like mol will suffer from this, not being able to pretend the virtualized CPU is not the same as the host CPU. The PowerPC architecture has been designed to be fully virtualisable but the vanilla Linux kernel breaks this useful feature. The bug is then to be fixed in the kernel (and the glibc if it really uses mfpvr). I suggest you take this up with the PowerPC kernel maintainers, which might work, instead of making QEMU noisy about it; the people using QEMU don't care, and they'll just disable the warning. It wasn't an accidental decision on the kernel maintainers' part either. You're absolutely right, it's a kernel problem: it would prevent any attempt to enable a kqemu-like feature for the PowerPC, for example. And it seems this behavior has been in the Linux kernel for a very long time... I will disable the warning in the PVR specific case, but this is ugly as it will prevent detection of buggy PVR accesses when using OSes that respect the PowerPC specifications. I don't see the PVR read in current glibc, but I thought it was there; I don't remember exactly what happened. One thing is sure: any application which uses mfpvr is buggy. I guess there might be some libraries that would like to do it to enable some optimisations at run-time. Or applications like mplayer... But I don't see why init should ever need to know the CPU features... -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] qemu Makefile.target vl.h hw/cuda.c hw/grackle_...
On Fri, 2007-11-02 at 22:33 +0200, Blue Swirl wrote:
On 11/2/07, Jocelyn Mayer [EMAIL PROTECTED] wrote:
On Fri, 2007-11-02 at 17:18 +0200, Blue Swirl wrote:
On 11/2/07, J. Mayer [EMAIL PROTECTED] wrote:
On Thu, 2007-11-01 at 23:13 +0100, J. Mayer wrote:
On Thu, 2007-11-01 at 21:53 +0200, Blue Swirl wrote:
On 11/1/07, Blue Swirl [EMAIL PROTECTED] wrote:
On 10/29/07, Jocelyn Mayer [EMAIL PROTECTED] wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Jocelyn Mayer j_mayer 07/10/28 23:42:18
Modified files:
. : Makefile.target vl.h
hw : cuda.c grackle_pci.c heathrow_pic.c ppc.c ppc_chrp.c ppc_prep.c
Added files:
hw : mac_dbdma.c mac_nvram.c macio.c ppc_mac.h ppc_oldworld.c
Log message:
* sort the PowerPC target object files
* make PowerPC NVRAM accessors generic to be able to use a MacIO NVRAM instead of the M48T59 one
* split PowerMac targets code:
  - move all PowerMac related definitions and prototypes into hw/ppc_mac.h
  - add hw/mac_dbdma.c, hw/mac_nvram.c and macio.c which implements shared PowerMac devices
  - define the g3bw machine in a new hw/ppc_oldworld.c file
* Fix the g3bw target:
  - fix the Grackle host PCI device
  - connect the Heathrow PIC to the PowerPC 6xx bus pins
CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/qemu/Makefile.target?cvsroot=qemu&r1=1.212&r2=1.213
http://cvs.savannah.gnu.org/viewcvs/qemu/vl.h?cvsroot=qemu&r1=1.280&r2=1.281
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/cuda.c?cvsroot=qemu&r1=1.16&r2=1.17
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/grackle_pci.c?cvsroot=qemu&r1=1.6&r2=1.7
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/heathrow_pic.c?cvsroot=qemu&r1=1.5&r2=1.6
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc.c?cvsroot=qemu&r1=1.34&r2=1.35
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc_chrp.c?cvsroot=qemu&r1=1.44&r2=1.45
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc_prep.c?cvsroot=qemu&r1=1.47&r2=1.48
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/mac_dbdma.c?cvsroot=qemu&rev=1.1
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/mac_nvram.c?cvsroot=qemu&rev=1.1
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/macio.c?cvsroot=qemu&rev=1.1
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc_mac.h?cvsroot=qemu&rev=1.1
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc_oldworld.c?cvsroot=qemu&rev=1.1

You broke sparc64-softmmu build with this patch.

Am I missing something? I rebuilt all available targets before committing... but I now see that sparc64-softmmu seems not to be in the available targets, which could explain why I cannot check whether it compiles or not... Has it been removed by mistake?

I think the best solution to fix this is to put the nvram helpers to m48t59.h as inline functions instead of duplicating the code in several places.

You mean the NVRAM_set / get_xxx? I was going to remove the definitions from vl.h, I have to say, because those are supposed to be PowerPC (in fact OpenHack'Ware) related hacks. Those functions will never go in m48t59.h as they are not related to m48t59. Apple machines don't have such a device (even if Qemu pretends they have, this is to be removed in the days to come) but need those functions to pass arguments to the firmware. What I needed to do (and that is what I committed) is make those routines independent from m48t59 so I can remove this device from ppc_chrp.c and ppc_oldworld.c and use the real Mac nvram instead (but ppc_prep.c still uses m48t59...).

I see. Should sun4m use these functions too? On the other hand, there is no need to be too independent on Sparc, because I think all Sun4u machines use m48t59 and sun4m machines have either m48t08 or older m48t02 (not supported yet). So if you prefer, sun4u could use the same approach as sun4m and not use these functions?

Depends on how you feel about it... If there is a real need for a generic accessor for device registers and/or internal memory, used during target machine initialisation (the model I propose could be used not only for NVRAM...), then the code should be made more generic (ie renaming the nvram_t type to something more generic) and only one implementation should be kept. If this is only useful for the PowerPC target initialisation, then you should keep using the m48t59-only implementation you have now for Sparc64 and the PowerPC
Re: [Qemu-devel] qemu vl.c vl.h hw/an5206.c hw/etraxfs.c hw/inte...
On Sat, 2007-11-03 at 01:01 +0100, andrzej zaborowski wrote:
Hi, On 01/11/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Thu, 2007-11-01 at 01:01 +0100, andrzej zaborowski wrote:
On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 11:22 +0100, J. Mayer wrote:
On Wed, 2007-10-31 at 11:01 +0100, andrzej zaborowski wrote:
On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 03:35 +0100, andrzej zaborowski wrote:
Hi, On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 01:54 +, Andrzej Zaborowski wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Andrzej Zaborowski balrog 07/10/31 01:54:05
Modified files:
. : vl.c vl.h
hw : an5206.c etraxfs.c integratorcp.c mcf5208.c mips_malta.c mips_mipssim.c mips_pica61.c mips_r4k.c palm.c pc.c ppc405_boards.c ppc_chrp.c ppc_oldworld.c ppc_prep.c r2d.c realview.c shix.c spitz.c sun4m.c sun4u.c versatilepb.c
Log message: Set boot sequence from command line (Dan Kenigsberg).

There have been remarks about this patch that have not been addressed (not even answered, in fact). For example, MAX_BOOT_DEVICES is set to 3 when more than 3 boot devices can be selected (see the BOOTCHARS definition), which clearly shows the patch is not consistent.

I double-checked to make sure all remarks made on qemu-devel were addressed, but I may have missed something. It was explained that the default bios supports only three boot devices,

Then just take a look at the function boot_device2nibble in hw/pc.c. You can see 4 possibilities implemented there. And I think I've never seen a PC BIOS (on real machines, I mean) that doesn't allow more than 4 choices in the last 5 years (and maybe much more...)

MAX_BOOT_DEVICES doesn't limit the number of possible boot devices, it is only a limit for the length of the sequence given on command-line.

The second point is that, as the legacy PC-BIOS is maybe the least versatile architecture there is, putting limitations on the emulation model based on this spec seems to be nonsense in Qemu, which is supposed to emulate _a lot_ of different architectures. As a matter of fact, a specific implementation (ie legacy PC target) should not lead to hardcoded limits that would affect all other emulated targets.

I personally wouldn't hardcode any limit but this code was submitted this way and doesn't limit any current functionality in any way, it extends it. I prefer the GNU/Hurd style code where no software limits are ever imposed and even the standard unix limits are undefined (e.g. no MAXPATHLEN), sometimes at significant cost.

Imho, in that case, the only thing that can be checked is that the given string contains only characters that can be valid devices in Qemu. Then, making boot_device a pointer directly assigned to optarg, then checking that all chars are >= 'a' and < 'c' + MAX_DISKS || chars == 'n', would greatly simplify the code. And this kind of check is the only valid one you can do in the generic code.

Here's a generic implementation that checks only the boot devices known to be supported, ie 'a', 'c', 'd' and 'n', and thus needs no change in the machine emulation code to work. When the machines are able to check properly whether the boot devices match the emulated hardware and the BIOS ABI, it can be easily extended, changing one line, to allow booting from more devices. I think that this code should allow choosing to (try to...) boot from at least the 2 floppies and the 4 possible IDE devices. The consistency test could also be changed to add more drives if it seems to be needed. For consistency, I also made the boot_devices variable local to the main routine, as it's never used anywhere else.

Thanks for the rework, I'm in favour of this patch. However, similar to the previous approach it still restricts the set of drive letters and assumes that vl.c will be extended when some per-machine code needs more letters (which is okay with me, but I had understood that this was your concern). The letter check is equivalent to !strchr(BOOTCHARS, *p
Re: [Qemu-devel] qemu vl.c vl.h hw/an5206.c hw/etraxfs.c hw/inte...
On Thu, 2007-11-01 at 01:01 +0100, andrzej zaborowski wrote:
Hi, On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 11:22 +0100, J. Mayer wrote:
On Wed, 2007-10-31 at 11:01 +0100, andrzej zaborowski wrote:
On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 03:35 +0100, andrzej zaborowski wrote:
Hi, On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 01:54 +, Andrzej Zaborowski wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Andrzej Zaborowski balrog 07/10/31 01:54:05
Modified files:
. : vl.c vl.h
hw : an5206.c etraxfs.c integratorcp.c mcf5208.c mips_malta.c mips_mipssim.c mips_pica61.c mips_r4k.c palm.c pc.c ppc405_boards.c ppc_chrp.c ppc_oldworld.c ppc_prep.c r2d.c realview.c shix.c spitz.c sun4m.c sun4u.c versatilepb.c
Log message: Set boot sequence from command line (Dan Kenigsberg).

There have been remarks about this patch that have not been addressed (not even answered, in fact). For example, MAX_BOOT_DEVICES is set to 3 when more than 3 boot devices can be selected (see the BOOTCHARS definition), which clearly shows the patch is not consistent.

I double-checked to make sure all remarks made on qemu-devel were addressed, but I may have missed something. It was explained that the default bios supports only three boot devices,

Then just take a look at the function boot_device2nibble in hw/pc.c. You can see 4 possibilities implemented there. And I think I've never seen a PC BIOS (on real machines, I mean) that doesn't allow more than 4 choices in the last 5 years (and maybe much more...)

MAX_BOOT_DEVICES doesn't limit the number of possible boot devices, it is only a limit for the length of the sequence given on command-line.

The second point is that, as the legacy PC-BIOS is maybe the least versatile architecture there is, putting limitations on the emulation model based on this spec seems to be nonsense in Qemu, which is supposed to emulate _a lot_ of different architectures. As a matter of fact, a specific implementation (ie legacy PC target) should not lead to hardcoded limits that would affect all other emulated targets.

I personally wouldn't hardcode any limit but this code was submitted this way and doesn't limit any current functionality in any way, it extends it. I prefer the GNU/Hurd style code where no software limits are ever imposed and even the standard unix limits are undefined (e.g. no MAXPATHLEN), sometimes at significant cost.

Imho, in that case, the only thing that can be checked is that the given string contains only characters that can be valid devices in Qemu. Then, making boot_device a pointer directly assigned to optarg, then checking that all chars are >= 'a' and < 'c' + MAX_DISKS || chars == 'n', would greatly simplify the code. And this kind of check is the only valid one you can do in the generic code.

Here's a generic implementation that checks only the boot devices known to be supported, ie 'a', 'c', 'd' and 'n', and thus needs no change in the machine emulation code to work. When the machines are able to check properly whether the boot devices match the emulated hardware and the BIOS ABI, it can be easily extended, changing one line, to allow booting from more devices. I think that this code should allow choosing to (try to...) boot from at least the 2 floppies and the 4 possible IDE devices. The consistency test could also be changed to add more drives if it seems to be needed. For consistency, I also made the boot_devices variable local to the main routine, as it's never used anywhere else.

Thanks for the rework, I'm in favour of this patch. However, similar to the previous approach it still restricts the set of drive letters and assumes that vl.c will be extended when some per-machine code needs more letters (which is okay with me, but I had understood that this was your concern). The letter check is equivalent to !strchr(BOOTCHARS, *p).

It restricts the letters to the ones historically allowed by Qemu, not to anything specific to any architecture or hw platform. What I like in my implementation, compared to the strchr..., is that it tells the user exactly which given device is incorrect.

This patch does not make the code simpler (in fact it's even more complicated as it does more generic consistency checks) but is generic and extensible, not breaking the previous
Re: [Qemu-devel] qemu Makefile.target vl.h hw/cuda.c hw/grackle_...
On Thu, 2007-11-01 at 21:53 +0200, Blue Swirl wrote:
On 11/1/07, Blue Swirl [EMAIL PROTECTED] wrote:
On 10/29/07, Jocelyn Mayer [EMAIL PROTECTED] wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Jocelyn Mayer j_mayer 07/10/28 23:42:18
Modified files:
. : Makefile.target vl.h
hw : cuda.c grackle_pci.c heathrow_pic.c ppc.c ppc_chrp.c ppc_prep.c
Added files:
hw : mac_dbdma.c mac_nvram.c macio.c ppc_mac.h ppc_oldworld.c
Log message:
* sort the PowerPC target object files
* make PowerPC NVRAM accessors generic to be able to use a MacIO NVRAM instead of the M48T59 one
* split PowerMac targets code:
  - move all PowerMac related definitions and prototypes into hw/ppc_mac.h
  - add hw/mac_dbdma.c, hw/mac_nvram.c and macio.c which implements shared PowerMac devices
  - define the g3bw machine in a new hw/ppc_oldworld.c file
* Fix the g3bw target:
  - fix the Grackle host PCI device
  - connect the Heathrow PIC to the PowerPC 6xx bus pins
CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/qemu/Makefile.target?cvsroot=qemu&r1=1.212&r2=1.213
http://cvs.savannah.gnu.org/viewcvs/qemu/vl.h?cvsroot=qemu&r1=1.280&r2=1.281
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/cuda.c?cvsroot=qemu&r1=1.16&r2=1.17
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/grackle_pci.c?cvsroot=qemu&r1=1.6&r2=1.7
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/heathrow_pic.c?cvsroot=qemu&r1=1.5&r2=1.6
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc.c?cvsroot=qemu&r1=1.34&r2=1.35
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc_chrp.c?cvsroot=qemu&r1=1.44&r2=1.45
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc_prep.c?cvsroot=qemu&r1=1.47&r2=1.48
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/mac_dbdma.c?cvsroot=qemu&rev=1.1
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/mac_nvram.c?cvsroot=qemu&rev=1.1
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/macio.c?cvsroot=qemu&rev=1.1
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc_mac.h?cvsroot=qemu&rev=1.1
http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ppc_oldworld.c?cvsroot=qemu&rev=1.1

You broke sparc64-softmmu build with this patch.

Am I missing something? I rebuilt all available targets before committing... but I now see that sparc64-softmmu seems not to be in the available targets, which could explain why I cannot check whether it compiles or not... Has it been removed by mistake?

I think the best solution to fix this is to put the nvram helpers to m48t59.h as inline functions instead of duplicating the code in several places.

You mean the NVRAM_set / get_xxx? I was going to remove the definitions from vl.h, I have to say, because those are supposed to be PowerPC (in fact OpenHack'Ware) related hacks. Those functions will never go in m48t59.h as they are not related to m48t59. Apple machines don't have such a device (even if Qemu pretends they have, this is to be removed in the days to come) but need those functions to pass arguments to the firmware. What I needed to do (and that is what I committed) is make those routines independent from m48t59 so I can remove this device from ppc_chrp.c and ppc_oldworld.c and use the real Mac nvram instead (but ppc_prep.c still uses m48t59...). Anyway, I'll try to fix the sparc64 case as I broke it (if I find the code ;-) ).

--
J. Mayer [EMAIL PROTECTED]
Never organized
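The idea of NVRAM helpers that do not depend on one particular device can be sketched as below. This is only an illustration of the approach discussed in the message: the names nvram_accessor, nvram_put_byte, nvram_fetch_word and toy_nvram are hypothetical and are not taken from the QEMU tree, and the big-endian word layout is an assumption of the sketch.

```c
#include <stdint.h>

/* Hypothetical callback types; not the names used in the QEMU sources. */
typedef uint32_t (*nvram_read_fn)(void *opaque, uint32_t addr);
typedef void (*nvram_write_fn)(void *opaque, uint32_t addr, uint32_t val);

/* Generic accessor: the helpers below never see the concrete device. */
typedef struct nvram_accessor {
    void *opaque;            /* device state, e.g. an m48t59 or a Mac NVRAM */
    nvram_read_fn read_fn;   /* reads one byte at addr */
    nvram_write_fn write_fn; /* writes one byte at addr */
} nvram_accessor;

static void nvram_put_byte(nvram_accessor *nv, uint32_t addr, uint8_t val)
{
    nv->write_fn(nv->opaque, addr, val);
}

static uint8_t nvram_fetch_byte(nvram_accessor *nv, uint32_t addr)
{
    return (uint8_t)nv->read_fn(nv->opaque, addr);
}

/* 16-bit values are stored big-endian (an assumption of this sketch). */
static void nvram_put_word(nvram_accessor *nv, uint32_t addr, uint16_t val)
{
    nvram_put_byte(nv, addr, (uint8_t)(val >> 8));
    nvram_put_byte(nv, addr + 1, (uint8_t)(val & 0xFF));
}

static uint16_t nvram_fetch_word(nvram_accessor *nv, uint32_t addr)
{
    return (uint16_t)((nvram_fetch_byte(nv, addr) << 8) |
                      nvram_fetch_byte(nv, addr + 1));
}

/* Toy backend standing in for a real device model. */
typedef struct toy_nvram {
    uint8_t data[64];
} toy_nvram;

static uint32_t toy_nvram_read(void *opaque, uint32_t addr)
{
    toy_nvram *dev = opaque;
    return dev->data[addr % sizeof(dev->data)];
}

static void toy_nvram_write(void *opaque, uint32_t addr, uint32_t val)
{
    toy_nvram *dev = opaque;
    dev->data[addr % sizeof(dev->data)] = (uint8_t)val;
}
```

A board would fill in one nvram_accessor with the callbacks of whatever device it really has, and the firmware-argument helpers would only ever go through that structure.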
Re: [Qemu-devel] qemu vl.c vl.h hw/an5206.c hw/etraxfs.c hw/inte...
On Wed, 2007-10-31 at 11:01 +0100, andrzej zaborowski wrote:
On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 03:35 +0100, andrzej zaborowski wrote:
Hi, On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 01:54 +, Andrzej Zaborowski wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Andrzej Zaborowski balrog 07/10/31 01:54:05
Modified files:
. : vl.c vl.h
hw : an5206.c etraxfs.c integratorcp.c mcf5208.c mips_malta.c mips_mipssim.c mips_pica61.c mips_r4k.c palm.c pc.c ppc405_boards.c ppc_chrp.c ppc_oldworld.c ppc_prep.c r2d.c realview.c shix.c spitz.c sun4m.c sun4u.c versatilepb.c
Log message: Set boot sequence from command line (Dan Kenigsberg).

There have been remarks about this patch that have not been addressed (not even answered, in fact). For example, MAX_BOOT_DEVICES is set to 3 when more than 3 boot devices can be selected (see the BOOTCHARS definition), which clearly shows the patch is not consistent.

I double-checked to make sure all remarks made on qemu-devel were addressed, but I may have missed something. It was explained that the default bios supports only three boot devices,

Then just take a look at the function boot_device2nibble in hw/pc.c. You can see 4 possibilities implemented there. And I think I've never seen a PC BIOS (on real machines, I mean) that doesn't allow more than 4 choices in the last 5 years (and maybe much more...)

MAX_BOOT_DEVICES doesn't limit the number of possible boot devices, it is only a limit for the length of the sequence given on command-line.

The second point is that, as the legacy PC-BIOS is maybe the least versatile architecture there is, putting limitations on the emulation model based on this spec seems to be nonsense in Qemu, which is supposed to emulate _a lot_ of different architectures. As a matter of fact, a specific implementation (ie legacy PC target) should not lead to hardcoded limits that would affect all other emulated targets.

I personally wouldn't hardcode any limit but this code was submitted this way and doesn't limit any current functionality in any way, it extends it. I prefer the GNU/Hurd style code where no software limits are ever imposed and even the standard unix limits are undefined (e.g. no MAXPATHLEN), sometimes at significant cost.

Imho, in that case, the only thing that can be checked is that the given string contains only characters that can be valid devices in Qemu. Then, making boot_device a pointer directly assigned to optarg, then checking that all chars are >= 'a' and < 'c' + MAX_DISKS || chars == 'n', would greatly simplify the code. And this kind of check is the only valid one you can do in the generic code.

On second thought, I see how this may affect people using a non-default bios, but I guess 3 boot devices is better than only one that was possible without this patch.

For most emulation targets, there still is a limit of 1. And the global limit of 3 is not even related to the PC spec, according to the code committed in pc.c. Then, imho, it cannot be better, as it's inconsistent for the PC case and provides nothing in most cases.

The limit of three is the max boot sequence length and the same (or lower) limit is already hardcoded in the machine initialisations where they deal with passing the sequence to the BIOS.

Sure. And that is the only place you can hardcode / check anything like this, because this is _the_ place where you know what physical devices are actually emulated, and maybe (not always) what the BIOS features are, ...

What is the explanation for a global define of 1 for most targets when you cannot globally know how the information will be exploited?

Yes, you can, it all sits in one repository and is part of the same project. vl.c doesn't have to deal with cases where, say, hw/pc.c is modified. In such case the author is supposed to update vl.c too.

There is no information about the emulated target reaching vl.c. In fact, in an ideal world, there should not be even a single #ifdef TARGET_xxx in that code. All the hardware emulation part is in the hardware library and the generic code has no idea about the actual emulated hardware. It does not even know whether a family of devices is supported or not by the target. In microcontrollers, you often have no block device. But nothing prevents you from adding '-hda my_disk -boot c' to the command line...

It would seem really more logical to allow the user to give all defined possible boot devices to the -boot parameter, then it's up to the target initialisation code or the BIOS (some targets may use different BIOSes with different ABIs
Re: [Qemu-devel] qemu vl.c vl.h hw/an5206.c hw/etraxfs.c hw/inte...
On Wed, 2007-10-31 at 11:22 +0100, J. Mayer wrote:
On Wed, 2007-10-31 at 11:01 +0100, andrzej zaborowski wrote:
On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 03:35 +0100, andrzej zaborowski wrote:
Hi, On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 01:54 +, Andrzej Zaborowski wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Andrzej Zaborowski balrog 07/10/31 01:54:05
Modified files:
. : vl.c vl.h
hw : an5206.c etraxfs.c integratorcp.c mcf5208.c mips_malta.c mips_mipssim.c mips_pica61.c mips_r4k.c palm.c pc.c ppc405_boards.c ppc_chrp.c ppc_oldworld.c ppc_prep.c r2d.c realview.c shix.c spitz.c sun4m.c sun4u.c versatilepb.c
Log message: Set boot sequence from command line (Dan Kenigsberg).

There have been remarks about this patch that have not been addressed (not even answered, in fact). For example, MAX_BOOT_DEVICES is set to 3 when more than 3 boot devices can be selected (see the BOOTCHARS definition), which clearly shows the patch is not consistent.

I double-checked to make sure all remarks made on qemu-devel were addressed, but I may have missed something. It was explained that the default bios supports only three boot devices,

Then just take a look at the function boot_device2nibble in hw/pc.c. You can see 4 possibilities implemented there. And I think I've never seen a PC BIOS (on real machines, I mean) that doesn't allow more than 4 choices in the last 5 years (and maybe much more...)

MAX_BOOT_DEVICES doesn't limit the number of possible boot devices, it is only a limit for the length of the sequence given on command-line.

The second point is that, as the legacy PC-BIOS is maybe the least versatile architecture there is, putting limitations on the emulation model based on this spec seems to be nonsense in Qemu, which is supposed to emulate _a lot_ of different architectures. As a matter of fact, a specific implementation (ie legacy PC target) should not lead to hardcoded limits that would affect all other emulated targets.

I personally wouldn't hardcode any limit but this code was submitted this way and doesn't limit any current functionality in any way, it extends it. I prefer the GNU/Hurd style code where no software limits are ever imposed and even the standard unix limits are undefined (e.g. no MAXPATHLEN), sometimes at significant cost.

Imho, in that case, the only thing that can be checked is that the given string contains only characters that can be valid devices in Qemu. Then, making boot_device a pointer directly assigned to optarg, then checking that all chars are >= 'a' and < 'c' + MAX_DISKS || chars == 'n', would greatly simplify the code. And this kind of check is the only valid one you can do in the generic code.

Here's a generic implementation that checks only the boot devices known to be supported, ie 'a', 'c', 'd' and 'n', and thus needs no change in the machine emulation code to work. When the machines are able to check properly whether the boot devices match the emulated hardware and the BIOS ABI, it can be easily extended, changing one line, to allow booting from more devices. I think that this code should allow choosing to (try to...) boot from at least the 2 floppies and the 4 possible IDE devices. The consistency test could also be changed to add more drives if it seems to be needed. For consistency, I also made the boot_devices variable local to the main routine, as it's never used anywhere else.

This patch does not make the code simpler (in fact it's even more complicated as it does more generic consistency checks) but is generic and extensible, not breaking the previous patch and being consistent with the i386 machine BIOS features, as implemented now. The machine specific checks can be added later, for each target that needs some. Another solution could be that every machine implements a callback that returns a features bitmap; the generic code could then check whether the given command line arguments (including the -boot option, but not only) are consistent with the emulated hardware platform.

[...]

--
J. Mayer [EMAIL PROTECTED]
Never organized

Index: vl.c
===
RCS file: /sources/qemu/qemu/vl.c,v
retrieving revision 1.353
diff -u -d -d -p -r1.353 vl.c
--- vl.c 31 Oct 2007 01:54:03 - 1.353
+++ vl.c 31 Oct 2007 22:39:27 -
@@ -162,12 +162,6 @@ static DisplayState display_state;
 int nographic;
 const char* keyboard_layout = NULL;
 int64_t ticks_per_sec;
-#if defined(TARGET_I386)
-#define MAX_BOOT_DEVICES 3
-#else
-#define MAX_BOOT_DEVICES 1
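Since the patch itself is cut off in this archive, here is a minimal sketch of the kind of generic check the message describes: accept only 'a', 'c', 'd' and 'n', and report exactly which letter is wrong. The function name, the result codes and the duplicate-letter test are assumptions made for illustration; they are not taken from the actual vl.c change.

```c
/* Hypothetical result codes for this sketch. */
enum boot_check_result {
    BOOT_OK = 0,
    BOOT_BAD_DEVICE = -1,  /* letter is not a known boot device */
    BOOT_DUPLICATE = -2    /* letter was given more than once */
};

/*
 * Validate a -boot argument such as "cda".
 * On failure, *offender receives the letter that caused the error, so the
 * caller can tell the user exactly which device is incorrect.
 */
static int check_boot_devices(const char *devices, char *offender)
{
    unsigned int seen = 0; /* one bit per letter, 'a' is bit 0 */
    const char *p;

    for (p = devices; *p != '\0'; p++) {
        unsigned int bit;

        /* Only the historically supported letters are accepted. */
        if (*p != 'a' && *p != 'c' && *p != 'd' && *p != 'n') {
            *offender = *p;
            return BOOT_BAD_DEVICE;
        }
        bit = 1U << (*p - 'a');
        if (seen & bit) {
            *offender = *p;
            return BOOT_DUPLICATE;
        }
        seen |= bit;
    }
    return BOOT_OK;
}
```

Allowing more devices later would mean changing only the one condition that lists the accepted letters, which matches the "changing one line" remark in the message.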
Re: [Qemu-devel] qemu vl.c vl.h hw/an5206.c hw/etraxfs.c hw/inte...
On Wed, 2007-10-31 at 01:54 +, Andrzej Zaborowski wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Andrzej Zaborowski balrog 07/10/31 01:54:05
Modified files:
. : vl.c vl.h
hw : an5206.c etraxfs.c integratorcp.c mcf5208.c mips_malta.c mips_mipssim.c mips_pica61.c mips_r4k.c palm.c pc.c ppc405_boards.c ppc_chrp.c ppc_oldworld.c ppc_prep.c r2d.c realview.c shix.c spitz.c sun4m.c sun4u.c versatilepb.c
Log message: Set boot sequence from command line (Dan Kenigsberg).

There have been remarks about this patch that have not been addressed (not even answered, in fact). For example, MAX_BOOT_DEVICES is set to 3 when more than 3 boot devices can be selected (see the BOOTCHARS definition), which clearly shows the patch is not consistent. Furthermore, the patch breaks the coding style in some files (at least the ones I checked), which is weird. It seems _very_ strange to see it committed, then.

--
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] qemu vl.c vl.h hw/an5206.c hw/etraxfs.c hw/inte...
On Wed, 2007-10-31 at 03:35 +0100, andrzej zaborowski wrote:
Hi, On 31/10/2007, J. Mayer [EMAIL PROTECTED] wrote:
On Wed, 2007-10-31 at 01:54 +, Andrzej Zaborowski wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Andrzej Zaborowski balrog 07/10/31 01:54:05
Modified files:
. : vl.c vl.h
hw : an5206.c etraxfs.c integratorcp.c mcf5208.c mips_malta.c mips_mipssim.c mips_pica61.c mips_r4k.c palm.c pc.c ppc405_boards.c ppc_chrp.c ppc_oldworld.c ppc_prep.c r2d.c realview.c shix.c spitz.c sun4m.c sun4u.c versatilepb.c
Log message: Set boot sequence from command line (Dan Kenigsberg).

There have been remarks about this patch that have not been addressed (not even answered, in fact). For example, MAX_BOOT_DEVICES is set to 3 when more than 3 boot devices can be selected (see the BOOTCHARS definition), which clearly shows the patch is not consistent.

I double-checked to make sure all remarks made on qemu-devel were addressed, but I may have missed something. It was explained that the default bios supports only three boot devices,

Then just take a look at the function boot_device2nibble in hw/pc.c. You can see 4 possibilities implemented there. And I think I've never seen a PC BIOS (on real machines, I mean) that doesn't allow more than 4 choices in the last 5 years (and maybe much more...) The second point is that, as the legacy PC-BIOS is maybe the least versatile architecture there is, putting limitations on the emulation model based on this spec seems to be nonsense in Qemu, which is supposed to emulate _a lot_ of different architectures. As a matter of fact, a specific implementation (ie legacy PC target) should not lead to hardcoded limits that would affect all other emulated targets.

On second thought, I see how this may affect people using a non-default bios, but I guess 3 boot devices is better than only one that was possible without this patch.

For most emulation targets, there still is a limit of 1. And the global limit of 3 is not even related to the PC spec, according to the code committed in pc.c. Then, imho, it cannot be better, as it's inconsistent for the PC case and provides nothing in most cases. What is the explanation for a global define of 1 for most targets when you cannot globally know how the information will be exploited? It would seem really more logical to allow the user to give all defined possible boot devices to the -boot parameter, then it's up to the target initialisation code or the BIOS (some targets may use different BIOSes with different ABIs for different usages...) to determine if the information can be used totally, partially or not at all. Let me give an example: what is the meaning of the -boot parameter for an embedded board that can only boot from a flash device (see the ppc405_boards.c, for example...)? My answer is that the user can always give the -boot parameter but it will just be ignored by the target specific code. And the number of boot devices that may be useful for a target is target or BIOS dependent. It's not in any way CPU architecture dependent, so MAX_BOOT_DEVICES as it is implemented is wrong for the legacy PC architecture and has no meaning for all other cases.

Feel free to revert if you see any issues.

I don't think it breaks anything, so now that it's committed, it seems more urgent to see the patch reworked to make it consistent and really usable in all cases (PC is not the only Qemu target!) than to revert and generate CVS noise...

Furthermore, the patch breaks the coding style in some files (at least the ones I checked), which is weird.

I also tried to make sure that the original style in every file was retained (i.e. I wrapped lines crossing 80 chars)

Apparently, not totally. (including 80 chars wrapping lines).

--
J. Mayer [EMAIL PROTECTED]
Never organized
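The split argued for above, where generic code accepts any defined boot letter and the target code decides what it can actually use, might look like the sketch below. Everything here is hypothetical: machine_boot_caps and filter_boot_devices are names invented for the illustration, and the per-machine letter sets are made-up examples rather than the real capabilities of any QEMU machine.

```c
#include <string.h>

/* Hypothetical per-machine description of what -boot can mean. */
typedef struct machine_boot_caps {
    const char *name;      /* machine name, for messages only */
    const char *supported; /* letters this machine can boot from; "" if none */
    size_t max_len;        /* longest boot sequence its firmware accepts */
} machine_boot_caps;

/*
 * Keep, in order, only the requested devices the machine can use.
 * Letters the machine does not know are silently ignored, as are any
 * devices beyond max_len. Returns the number of devices kept.
 */
static size_t filter_boot_devices(const machine_boot_caps *m,
                                  const char *requested,
                                  char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0) {
        return 0;
    }
    for (; *requested != '\0' && n < m->max_len && n + 1 < out_size;
         requested++) {
        if (strchr(m->supported, *requested) != NULL) {
            out[n++] = *requested;
        }
    }
    out[n] = '\0';
    return n;
}
```

With this shape, a board that can only boot from flash would declare an empty letter set and simply end up with an empty sequence, whatever the user passed to -boot.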
Re: [Qemu-devel] qemu Makefile.target
On Mon, 2007-10-29 at 02:26 +0100, andrzej zaborowski wrote:
On 28/10/2007, Jocelyn Mayer [EMAIL PROTECTED] wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Jocelyn Mayer j_mayer 07/10/28 13:07:13
Modified files:
. : Makefile.target
Log message: Use cpp to generate correct build dependencies for target objects instead of using incomplete hardcoded ones.

This doesn't work very well for me, but I'm not sure why. Now the Makefile rebuilds much more than necessary, even after a make depend in the target dirs.

You may need to do a make clean to generate the initial dependency files. The 'depend' target needs to be reworked, afaik, as it does not handle all the needed files. make may rebuild a lot of things (or everything) when some headers are modified, because some headers (like vl.h) are included in almost every source file.

--
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] qemu/target-ppc op_helper.c
On Mon, 2007-10-29 at 22:12 +0100, Aurelien Jarno wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Jocelyn Mayer j_mayer 07/10/27 17:59:46
Modified files:
target-ppc : op_helper.c
Log message: PowerPC float bugfix: 64 bits float mantissa is 52 bits long.
CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/qemu/target-ppc/op_helper.c?cvsroot=qemu&r1=1.55&r2=1.56

I know that it looks strange, but this commit breaks perl. The function mkdir(dir, mode) sometimes does not set the mode correctly. This can easily be reproduced on Debian by using dpkg-source -x file.dsc, which fails due to wrong permissions. I am using qemu-system-ppc -M prep -cpu G3

Thanks for the report! Could you please try replacing the 0x3FF mask with 0x7FF? As the exponent is supposed to be 11 bits, my patch is obviously still buggy... I noticed there was a bug somewhere, but was not convinced it was in the FPU emulation... Now ...

--
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] qemu/target-ppc op_helper.c
On Mon, 2007-10-29 at 22:33 +0100, Aurelien Jarno wrote:
On Mon, Oct 29, 2007 at 10:21:26PM +0100, J. Mayer wrote:
On Mon, 2007-10-29 at 22:12 +0100, Aurelien Jarno wrote:

CVSROOT: /sources/qemu
Module name: qemu
Changes by: Jocelyn Mayer j_mayer 07/10/27 17:59:46
Modified files:
target-ppc : op_helper.c
Log message: PowerPC float bugfix: 64 bits float mantissa is 52 bits long.
CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/qemu/target-ppc/op_helper.c?cvsroot=qemu&r1=1.55&r2=1.56

I know that it looks strange, but this commit breaks perl. The function mkdir(dir, mode) sometimes does not set the mode correctly. This can easily be reproduced on Debian by using dpkg-source -x file.dsc, which fails due to wrong permissions. I am using qemu-system-ppc -M prep -cpu G3

Thanks for the report! Could you please try replacing the 0x3FF mask with 0x7FF? As the exponent is supposed to be 11 bits, my patch is obviously still buggy... I noticed there was a bug somewhere, but was not convinced it was in the FPU emulation... Now ...

Yep that fixes the problem. Thanks a lot!

Great! Thanks for your help! I'll commit it in a minute!

--
J. Mayer [EMAIL PROTECTED]
Never organized
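To make the mask discussion concrete: an IEEE 754 double has 1 sign bit, 11 exponent bits and 52 mantissa bits, so the biased exponent has to be extracted with an 11-bit mask (0x7FF). The sketch below is not the code from op_helper.c; the helper names are hypothetical and it only illustrates why a 10-bit mask (0x3FF) loses the top exponent bit.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit IEEE 754 encoding. */
static uint64_t double_bits(double d)
{
    uint64_t u;

    memcpy(&u, &d, sizeof(u));
    return u;
}

/* Biased exponent: 11 bits, located just above the 52-bit mantissa. */
static unsigned int double_biased_exponent(double d)
{
    return (unsigned int)((double_bits(d) >> 52) & 0x7FF);
}

/* Mantissa (fraction field): the low 52 bits. */
static uint64_t double_mantissa(double d)
{
    return double_bits(d) & 0x000FFFFFFFFFFFFFULL;
}

/* What a too-narrow 10-bit mask would return; shown only to expose the bug. */
static unsigned int double_biased_exponent_10bit(double d)
{
    return (unsigned int)((double_bits(d) >> 52) & 0x3FF);
}
```

For 2.0 the biased exponent is 0x400, which the 10-bit mask turns into 0, so any value of magnitude 2.0 or more would be misclassified by a test relying on it.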
Re: [Qemu-devel] Proposal for host-utils
On Sun, 2007-10-28 at 00:53 +0200, J. Mayer wrote:

Following the previous discussions about host-utils implementations, here's a patch with the following changes:
- move mulu64 and muls64 definitions from exec.h to host-utils.h, for consistency
- include host-utils.h in more files to reflect this change
- make the optimized version of mulu64 / muls64 for amd64 hosts static inline
- change clz64 to avoid 64 bits logical operations to optimize the 32 bits host case
- add ctz32, ctz64, cto32 and cto64, using the same method as the ctlzxx / cloxx implementations
- add ctpop8, ctpop16, ctpop32 and ctpop64, using the Sparc target implementation method; ctpop8 is used by the PowerPC target, I added ctpop16 for consistency
- change the Alpha and the PowerPC targets to use those helpers

I did commit the ctz, cto and ctpop helpers. I have a remaining patch that contains the changes for mulu64 / muls64 and optimizes clz64 and clo64, avoiding 64 bits value manipulation, which may help when running on a 32 bits host. Please comment.

--
J. Mayer [EMAIL PROTECTED]
Never organized

Index: exec-all.h
===
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.69
diff -u -d -d -p -r1.69 exec-all.h
--- exec-all.h 20 Oct 2007 19:45:43 - 1.69
+++ exec-all.h 28 Oct 2007 12:59:46 -
@@ -91,9 +91,6 @@ void optimize_flags_init(void);
 extern FILE *logfile;
 extern int loglevel;

-void muls64(int64_t *phigh, int64_t *plow, int64_t a, int64_t b);
-void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b);
-
 int gen_intermediate_code(CPUState *env, struct TranslationBlock *tb);
 int gen_intermediate_code_pc(CPUState *env, struct TranslationBlock *tb);
 void dump_ops(const uint16_t *opc_buf, const uint32_t *opparam_buf);
Index: host-utils.c
===
RCS file: /sources/qemu/qemu/host-utils.c,v
retrieving revision 1.4
diff -u -d -d -p -r1.4 host-utils.c
--- host-utils.c 26 Oct 2007 22:35:01 - 1.4
+++ host-utils.c 28 Oct 2007 12:59:46 -
@@ -28,6 +28,7 @@
 //#define DEBUG_MULDIV

 /* Long integer helpers */
+#if !defined(__x86_64__)
 static void add128 (uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b)
 {
     *plow += a;
@@ -69,17 +70,10 @@ static void mul64 (uint64_t *plow, uint6
     *phigh += v;
 }

-
 /* Unsigned 64x64 -> 128 multiplication */
 void mulu64 (uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b)
 {
-#if defined(__x86_64__)
-    __asm__ ("mul %0\n\t"
-             : "=d" (*phigh), "=a" (*plow)
-             : "a" (a), "0" (b));
-#else
     mul64(plow, phigh, a, b);
-#endif
 #if defined(DEBUG_MULDIV)
     printf("mulu64: 0x%016llx * 0x%016llx = 0x%016llx%016llx\n",
            a, b, *phigh, *plow);
@@ -89,11 +83,6 @@ void mulu64 (uint64_t *plow, uint64_t *p
 /* Signed 64x64 -> 128 multiplication */
 void muls64 (uint64_t *plow, uint64_t *phigh, int64_t a, int64_t b)
 {
-#if defined(__x86_64__)
-    __asm__ ("imul %0\n\t"
-             : "=d" (*phigh), "=a" (*plow)
-             : "a" (a), "0" (b));
-#else
     int sa, sb;

     sa = (a < 0);
@@ -106,9 +95,9 @@ void muls64 (uint64_t *plow, uint64_t *p
     if (sa ^ sb) {
         neg128(plow, phigh);
     }
-#endif
 #if defined(DEBUG_MULDIV)
     printf("muls64: 0x%016llx * 0x%016llx = 0x%016llx%016llx\n",
            a, b, *phigh, *plow);
 #endif
 }
+#endif /* !defined(__x86_64__) */
Index: host-utils.h
===
RCS file: /sources/qemu/qemu/host-utils.h,v
retrieving revision 1.2
diff -u -d -d -p -r1.2 host-utils.h
--- host-utils.h 28 Oct 2007 12:52:38 - 1.2
+++ host-utils.h 28 Oct 2007 12:59:46 -
@@ -23,6 +23,26 @@
  * THE SOFTWARE.
  */

+#if defined(__x86_64__)
+static always_inline void mulu64 (uint64_t *plow, uint64_t *phigh,
+                                  uint64_t a, uint64_t b)
+{
+    __asm__ ("mul %0\n\t"
+             : "=d" (*phigh), "=a" (*plow)
+             : "a" (a), "0" (b));
+}
+static always_inline void muls64 (uint64_t *plow, uint64_t *phigh,
+                                  int64_t a, int64_t b)
+{
+    __asm__ ("imul %0\n\t"
+             : "=d" (*phigh), "=a" (*plow)
+             : "a" (a), "0" (b));
+}
+#else
+void muls64(int64_t *phigh, int64_t *plow, int64_t a, int64_t b);
+void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b);
+#endif
+
 /* Note that some of those functions may end up calling libgcc functions,
    depending on the host machine. It is up to the target emulation to
    cope with that. */
@@ -68,34 +88,13 @@ static always_inline int clz64(uint64_t
 {
     int cnt = 0;

-    if (!(val & 0xFFFFFFFF00000000ULL)) {
+    if (!(val >> 32)) {
         cnt += 32;
-        val <<= 32;
-    }
-    if (!(val & 0xFFFF000000000000ULL)) {
-        cnt += 16;
-        val <<= 16;
-    }
-    if (!(val & 0xFF00000000000000ULL)) {
-        cnt += 8;
-        val <<= 8;
-    }
-    if (!(val & 0xF000000000000000ULL
Re: [Qemu-devel] qemu Makefile.target vl.h hw/cuda.c hw/grackle_...
On Mon, 2007-10-29 at 00:59 +0000, Stuart Brady wrote:
> On Sun, Oct 28, 2007 at 11:42:18PM +0000, Jocelyn Mayer wrote:
> > * Fix the g3bw target:
> >   - fix the Grackle host PCI device
> >   - connect the Heathrow PIC to the PowerPC 6xx bus pins
>
> Cool! With this, the Debian 3.1 install CD boots again! :)

Yes, I did not have time to update the PowerPC status file, but the g3bw machine should be able to boot again. I guess all the long pending regressions for the PowerPC targets have been solved.

Note that this may need a firmware update in order to boot properly. I'm working on a cleanup of the hacked OHW firmware (with a lot of traces...) I worked with in order to debug and find the g3bw regressions. As I also found some bugs and added new features in order to support more CPUs (including the (in)famous PowerPC 601...) and more boot scripts, I will commit and publish a new version soon (I hope!).

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] Mips 64 emulation not compiling
On Sat, 2007-10-27 at 12:19 +0100, Thiemo Seufer wrote:
> J. Mayer wrote:
> > The latest patches in clo make gcc 3.4.6 fail to build the mips64 targets on my amd64 host (looks like a register allocation clash in the optimizer code).
>
> Your version is likely faster as well.
>
> > Furthermore, the clz micro-op for Mips seems very suspect to me, according to the changes made in the clo implementation.
>
> It is correct, the sign-extension bits are zero in that case.

OK, you know better than me...

> > I did change the clz / clo implementation to use the same code as the one used for the PowerPC implementation. It seems to me that the result would be correct... And it compiles...
> > Please take a look at the following patch:
>
> We have now clz/clo in several places, so I expanded your patch a bit. For now it is only used for the mips target. Comments?

I fully agree with the idea of sharing this code, if it's OK according to all targets specifications. Please commit and I'll update the PowerPC and Alpha targets to use them.

Oh, I did an optimisation for clz64 used on 32 bits hosts, avoiding the use of 64 bits logical operations:

static always_inline int clz64(uint64_t val)
{
    int cnt = 0;

#if HOST_LONG_BITS == 64
    if (!(val & 0xFFFFFFFF00000000ULL)) {
        cnt += 32;
        val <<= 32;
    }
    if (!(val & 0xFFFF000000000000ULL)) {
        cnt += 16;
        val <<= 16;
    }
    if (!(val & 0xFF00000000000000ULL)) {
        cnt += 8;
        val <<= 8;
    }
    if (!(val & 0xF000000000000000ULL)) {
        cnt += 4;
        val <<= 4;
    }
    if (!(val & 0xC000000000000000ULL)) {
        cnt += 2;
        val <<= 2;
    }
    if (!(val & 0x8000000000000000ULL)) {
        cnt++;
        val <<= 1;
    }
    if (!(val & 0x8000000000000000ULL)) {
        cnt++;
    }
#else
    /* Make it easier on 32 bits host machines */
    if (!(val >> 32))
        cnt = _do_cntlzw(val) + 32;
    else
        cnt = _do_cntlzw(val >> 32);
#endif

    return cnt;
}

If gcc is really clever, this would not lead to better code, but it seemed that the 32 bits implementation led to more optimized code on 32 bits hosts. Maybe this implementation could also be used for 64 bits hosts, avoiding the #ifdef.

Count trailing zeros is also implemented on Alpha, it may be a good idea to share the implementation, if needed:

static always_inline int ctz32 (uint32_t val)
{
    int cnt = 0;

    if (!(val & 0x0000FFFFUL)) {
        cnt += 16;
        val >>= 16;
    }
    if (!(val & 0x000000FFUL)) {
        cnt += 8;
        val >>= 8;
    }
    if (!(val & 0x0000000FUL)) {
        cnt += 4;
        val >>= 4;
    }
    if (!(val & 0x00000003UL)) {
        cnt += 2;
        val >>= 2;
    }
    if (!(val & 0x00000001UL)) {
        cnt++;
        val >>= 1;
    }
    if (!(val & 0x00000001UL)) {
        cnt++;
    }

    return cnt;
}

static always_inline int ctz64 (uint64_t val)
{
    int cnt = 0;

    if (!(val & 0x00000000FFFFFFFFULL)) {
        cnt += 32;
        val >>= 32;
    }
    /* Make it easier for 32 bits hosts */
    cnt += ctz32(val);

    return cnt;
}

And of course cto32 and cto64 could also be added.

I also got optimized versions of bit population count which could also be shared:

static always_inline int ctpop32 (uint32_t val)
{
    int i;

    for (i = 0; val != 0; i++)
        val = val ^ (val - 1);

    return i;
}

If you prefer, I can add those shared functions (ctz32, ctz64, cto32, cto64, ctpop32, ctpop64) later, as they do not seem as widely used as the clxxx functions.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
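As a rough illustration of the "split the 64 bits value into two 32 bits halves" idea discussed above, here is a minimal standalone sketch. The `*_sketch` names are made up for this example and are not the Qemu helpers; the code only assumes a C99 compiler with `<stdint.h>`.

```c
#include <stdint.h>

/* Count leading zeros of a 32-bit value by narrowing the search window.
   Returns 32 when val is 0. */
static inline int clz32_sketch(uint32_t val)
{
    int cnt = 0;

    if (!(val & 0xFFFF0000UL)) { cnt += 16; val <<= 16; }
    if (!(val & 0xFF000000UL)) { cnt += 8;  val <<= 8;  }
    if (!(val & 0xF0000000UL)) { cnt += 4;  val <<= 4;  }
    if (!(val & 0xC0000000UL)) { cnt += 2;  val <<= 2;  }
    if (!(val & 0x80000000UL)) { cnt++;     val <<= 1;  }
    if (!(val & 0x80000000UL)) { cnt++; }
    return cnt;
}

/* Count trailing zeros of a 32-bit value. Returns 32 when val is 0. */
static inline int ctz32_sketch(uint32_t val)
{
    int cnt = 0;

    if (!(val & 0x0000FFFFUL)) { cnt += 16; val >>= 16; }
    if (!(val & 0x000000FFUL)) { cnt += 8;  val >>= 8;  }
    if (!(val & 0x0000000FUL)) { cnt += 4;  val >>= 4;  }
    if (!(val & 0x00000003UL)) { cnt += 2;  val >>= 2;  }
    if (!(val & 0x00000001UL)) { cnt++;     val >>= 1;  }
    if (!(val & 0x00000001UL)) { cnt++; }
    return cnt;
}

/* 64-bit variants: only one 64-bit shift, the rest is done on 32-bit halves,
   which is the point of the optimisation for 32 bits hosts. */
static inline int clz64_sketch(uint64_t val)
{
    uint32_t hi = (uint32_t)(val >> 32);

    if (hi == 0)
        return 32 + clz32_sketch((uint32_t)val);
    return clz32_sketch(hi);
}

static inline int ctz64_sketch(uint64_t val)
{
    uint32_t lo = (uint32_t)val;

    if (lo == 0)
        return 32 + ctz32_sketch((uint32_t)(val >> 32));
    return ctz32_sketch(lo);
}
```

With these definitions a zero operand gives 32 (or 64 for the 64 bits variants), which matches what the clz / ctz code in this thread returns.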
Re: [Qemu-devel] Mips 64 emulation not compiling
On Sat, 2007-10-27 at 16:01 +0300, Blue Swirl wrote:
> On 10/27/07, J. Mayer [EMAIL PROTECTED] wrote:
> > I also got optimized versions of bit population count which could also be shared:
> >
> > static always_inline int ctpop32 (uint32_t val)
> > {
> >     int i;
> >
> >     for (i = 0; val != 0; i++)
> >         val = val ^ (val - 1);
> >
> >     return i;
> > }
> >
> > If you prefer, I can add those shared functions (ctz32, ctz64, cto32, cto64, ctpop32, ctpop64) later, as they do not seem as widely used as clxxx functions.
>
> This would be interesting for Sparc64. Could you compare your version to do_popc() in target-sparc/op_helper.c?

My feeling is:
- my implementation does n loops, n being the number of bits set in the word, so it will always be faster than yours when only a few bits are set.
- your implementation could be better because:
  - it has a fixed cost
  - it does not do any tests / jumps / loops

The drawback of your implementation is that it generates a lot of code, thus could never be used directly in micro-ops: on my amd64 host, my implementation compiles to 36 bytes of code and the 64 bits version does not generate more code than the 32 bits one. Your (64 bits only) implementation compiles to 217 bytes of code. On x86, my 32 bits version is 49 bytes long, the 64 bits one is 79 bytes long and yours is 323 bytes long. But this would never be a problem when called from a helper.

So I'm not really sure which is the best choice here. We may have to do tests to see which one of the 2 implementations is more efficient.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] Mips 64 emulation not compiling
On Sat, 2007-10-27 at 15:27 +0200, Christian Eddie Dost wrote:
> The sparc64 popc works in O(lg(n))

No, it has a fixed cost, whatever the operand is. It has another advantage: it does not need any intermediate variable, which is great when running on a CISC host in the Qemu execution environment.

> , the optimized code below works in O(n).

Yes. But it's wrong: it should be val &= val - 1 instead of val ^= val - 1...

[...]

I did tests on my PC, which will imho close the debate: the Sparc implementation is at least 50 % faster. I generated 2 ^ 29 random numbers for this test (and checked that the distribution was OK).

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
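To make the two approaches compared in this sub-thread concrete, here is a small sketch of both. The loop version uses the "clear the lowest set bit" step (`val &= val - 1`); the fixed-cost version is a generic parallel bit-summing popcount written for this example. It is meant to be in the same spirit as a fixed-cost helper, and is not claimed to be the actual do_popc() code from target-sparc. Function names are made up.

```c
#include <stdint.h>

/* Loop version: one iteration per set bit.
   val &= val - 1 clears the lowest set bit at each step. */
static inline int ctpop32_loop(uint32_t val)
{
    int i;

    for (i = 0; val != 0; i++)
        val &= val - 1;

    return i;
}

/* Fixed-cost version: add neighbouring bit fields of doubling width.
   Always executes the same five steps, with no test and no jump. */
static inline int ctpop32_parallel(uint32_t val)
{
    val = (val & 0x55555555U) + ((val >> 1)  & 0x55555555U);
    val = (val & 0x33333333U) + ((val >> 2)  & 0x33333333U);
    val = (val & 0x0F0F0F0FU) + ((val >> 4)  & 0x0F0F0F0FU);
    val = (val & 0x00FF00FFU) + ((val >> 8)  & 0x00FF00FFU);
    val = (val & 0x0000FFFFU) + ((val >> 16) & 0x0000FFFFU);

    return (int)val;
}
```

The loop version is small and cheap for sparse words; the parallel version costs the same whatever the operand, which is consistent with the timing observation reported above for uniformly random inputs (about half of the bits set on average).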
[Qemu-devel] Proposal for host-utils
Following the previous discussions about host-utils implementations, here's a patch with the following changes:
- move mulu64 and muls64 definitions from exec.h to host-utils.h, for consistency
- include host-utils.h in more files to reflect this change
- make the optimized version of mulu64 / muls64 for amd64 hosts static inline
- change clz64 to avoid 64 bits logical operations to optimize the 32 bits host case
- add ctz32, ctz64, cto32 and cto64, using the same method as the clzxx / cloxx implementations
- add ctpop8, ctpop16, ctpop32 and ctpop64, using the Sparc target implementation method
  ctpop8 is used by the PowerPC target, I added ctpop16 for consistency
- change the Alpha and the PowerPC targets to use those helpers

Please comment.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized

Index: exec-all.h
===================================================================
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.69
diff -u -d -d -p -r1.69 exec-all.h
--- exec-all.h  20 Oct 2007 19:45:43 -0000  1.69
+++ exec-all.h  27 Oct 2007 22:45:08 -0000
@@ -91,9 +91,6 @@ void optimize_flags_init(void);
 extern FILE *logfile;
 extern int loglevel;
 
-void muls64(int64_t *phigh, int64_t *plow, int64_t a, int64_t b);
-void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b);
-
 int gen_intermediate_code(CPUState *env, struct TranslationBlock *tb);
 int gen_intermediate_code_pc(CPUState *env, struct TranslationBlock *tb);
 void dump_ops(const uint16_t *opc_buf, const uint32_t *opparam_buf);
Index: host-utils.c
===================================================================
RCS file: /sources/qemu/qemu/host-utils.c,v
retrieving revision 1.4
diff -u -d -d -p -r1.4 host-utils.c
--- host-utils.c  26 Oct 2007 22:35:01 -0000  1.4
+++ host-utils.c  27 Oct 2007 22:45:08 -0000
@@ -28,6 +28,7 @@
 //#define DEBUG_MULDIV
 
 /* Long integer helpers */
+#if !defined(__x86_64__)
 static void add128 (uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b)
 {
     *plow += a;
@@ -69,17 +70,10 @@ static void mul64 (uint64_t *plow, uint6
     *phigh += v;
 }
 
-
 /* Unsigned 64x64 -> 128 multiplication */
 void mulu64 (uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b)
 {
-#if defined(__x86_64__)
-    __asm__ ("mul %0\n\t"
-             : "=d" (*phigh), "=a" (*plow)
-             : "a" (a), "0" (b));
-#else
     mul64(plow, phigh, a, b);
-#endif
 #if defined(DEBUG_MULDIV)
     printf("mulu64: 0x%016llx * 0x%016llx = 0x%016llx%016llx\n",
            a, b, *phigh, *plow);
@@ -89,11 +83,6 @@ void mulu64 (uint64_t *plow, uint64_t *p
 /* Signed 64x64 -> 128 multiplication */
 void muls64 (uint64_t *plow, uint64_t *phigh, int64_t a, int64_t b)
 {
-#if defined(__x86_64__)
-    __asm__ ("imul %0\n\t"
-             : "=d" (*phigh), "=a" (*plow)
-             : "a" (a), "0" (b));
-#else
     int sa, sb;
 
     sa = (a < 0);
@@ -106,9 +95,9 @@ void muls64 (uint64_t *plow, uint64_t *p
     if (sa ^ sb) {
         neg128(plow, phigh);
     }
-#endif
 #if defined(DEBUG_MULDIV)
     printf("muls64: 0x%016llx * 0x%016llx = 0x%016llx%016llx\n",
            a, b, *phigh, *plow);
 #endif
 }
+#endif /* !defined(__x86_64__) */
Index: host-utils.h
===================================================================
RCS file: /sources/qemu/qemu/host-utils.h,v
retrieving revision 1.1
diff -u -d -d -p -r1.1 host-utils.h
--- host-utils.h  27 Oct 2007 13:05:54 -0000  1.1
+++ host-utils.h  27 Oct 2007 22:45:08 -0000
@@ -23,6 +23,26 @@
  * THE SOFTWARE.
  */
 
+#if defined(__x86_64__)
+static always_inline void mulu64 (uint64_t *plow, uint64_t *phigh,
+                                  uint64_t a, uint64_t b)
+{
+    __asm__ ("mul %0\n\t"
+             : "=d" (*phigh), "=a" (*plow)
+             : "a" (a), "0" (b));
+}
+static always_inline void muls64 (uint64_t *plow, uint64_t *phigh,
+                                  int64_t a, int64_t b)
+{
+    __asm__ ("imul %0\n\t"
+             : "=d" (*phigh), "=a" (*plow)
+             : "a" (a), "0" (b));
+}
+#else
+void muls64(int64_t *phigh, int64_t *plow, int64_t a, int64_t b);
+void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b);
+#endif
+
 /* Note that some of those functions may end up calling libgcc functions,
    depending on the host machine. It is up to the target emulation to
    cope with that. */
@@ -68,37 +88,113 @@ static always_inline int clz64(uint64_t
 {
     int cnt = 0;
 
-    if (!(val & 0xFFFFFFFF00000000ULL)) {
+    if (!(val >> 32)) {
         cnt += 32;
-        val <<= 32;
+    } else {
+        val >>= 32;
     }
-    if (!(val & 0xFFFF000000000000ULL)) {
+
+    return cnt + clz32(val);
+}
+
+static always_inline int clo64(uint64_t val)
+{
+    return clz64(~val);
+}
+
+static always_inline int ctz32 (uint32_t val)
+{
+    int cnt;
+
+    cnt = 0;
+    if (!(val & 0x0000FFFFUL)) {
         cnt += 16;
-        val <<= 16;
+        val >>= 16;
     }
-    if (!(val & 0xFF00000000000000ULL)) {
+    if (!(val & 0x000000FFUL)) {
         cnt += 8;
-        val <<= 8;
+        val >>= 8;
     }
-    if (!(val
Re: [Qemu-devel] qemu-2007-10-24 build error
On Wed, 2007-10-24 at 09:36 +0900, Hwang YunSong(황윤성) wrote:
> gcc32 -g -o qemu-system-cris vl.o osdep.o readline.o monitor.o pci.o console.o loader.o isa_mmio.o cutils.o block.o block-raw.o block-cow.o block-qcow.o aes.o block-vmdk.o block-cloop.o block-dmg.o block-bochs.o block-vpc.o block-vvfat.o block-qcow2.o block-parallels.o irq.o i2c.o smbus.o scsi-disk.o cdrom.o lsi53c895a.o usb.o usb-hub.o usb-linux.o usb-hid.o usb-ohci.o usb-msd.o usb-wacom.o eeprom93xx.o eepro100.o ne2000.o pcnet.o rtl8139.o etraxfs.o ptimer.o etraxfs_timer.o etraxfs_ser.o gdbstub.o sdl.o x_keymap.o vnc.o d3des.o slirp/cksum.o slirp/if.o slirp/ip_icmp.o slirp/ip_input.o slirp/ip_output.o slirp/slirp.o slirp/mbuf.o slirp/misc.o slirp/sbuf.o slirp/socket.o slirp/tcp_input.o slirp/tcp_output.o slirp/tcp_subr.o slirp/tcp_timer.o slirp/udp.o slirp/bootp.o slirp/debug.o slirp/tftp.o libqemu.a -lm -lz -lgnutls -L/usr/lib -lSDL -lpthread -lrt -lutil
> libqemu.a(helper.o): In function `do_interrupt':
> /usr/src/Haansoft/BUILD/qemu/target-cris/helper.c:137: undefined reference to `__builtin_clz'
> libqemu.a(translate-op.o): In function `dyngen_code':
> /home/hys545/qemu/cris-softmmu/op.h:1566: undefined reference to `__builtin_clz'
> libqemu.a(op.o): In function `op_lz_T0_T1':
> /usr/src/Haansoft/BUILD/qemu/target-cris/op.c:1009: undefined reference to `__builtin_clz'
> collect2: ld returned 1 exit status

It does not seem to be a good idea, imho, to use gcc builtins directly from micro-ops. But your compiler should implement __builtin_clz. As far as I can see, the 4.1.1 version I got (Gentoo distribution) has this builtin implemented, so there might be a problem in your gcc package.

I used this little program to check the builtin presence and found no version from gcc 3.4.4 to gcc 4.2.2 without __builtin_clz implemented:

int a = 123456;

int main (void)
{
    int b;

    b = __builtin_clz(a);

    return b;
}

Compiled with
gcc-version -O2 -Wall -W -o /tmp/clz /tmp/clz.c

[...]

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] qemu host-utils.c
On Wed, 2007-10-24 at 12:20 +0200, Fabrice Bellard wrote:
> I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file.

I fully agree with this. I still use the same code in the PowerPC op_helper.c file because I never convinced myself that the host-utils version was bug-free. I would likely switch to the common version if I could be sure it cannot lead to any regression.

> Thiemo Seufer wrote:
> > CVSROOT: /sources/qemu
> > Module name: qemu
> > Changes by: Thiemo Seufer ths 07/10/23 23:22:54
> >
> > Modified files:
> >     . : host-utils.c
> >
> > Log message:
> >     Fix overflow when multiplying two large positive numbers.
> >
> > CVSWeb URLs:
> > http://cvs.savannah.gnu.org/viewcvs/qemu/host-utils.c?cvsroot=qemu&r1=1.1&r2=1.2

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
[Qemu-devel] Mips 64 emulation not compiling
The latest patches in clo make gcc 3.4.6 fail to build the mips64 targets on my amd64 host (looks like a register allocation clash in the optimizer code).
Furthermore, the clz micro-op for Mips seems very suspect to me, according to the changes made in the clo implementation.

I did change the clz / clo implementation to use the same code as the one used for the PowerPC implementation. It seems to me that the result would be correct... And it compiles...

Please take a look at the following patch:

Index: target-mips/op.c
===================================================================
RCS file: /sources/qemu/qemu/target-mips/op.c,v
retrieving revision 1.80
diff -u -d -d -p -r1.80 op.c
--- target-mips/op.c  24 Oct 2007 00:10:32 -0000  1.80
+++ target-mips/op.c  24 Oct 2007 10:38:26 -0000
@@ -535,37 +535,44 @@ void op_rotrv (void)
     RETURN();
 }
 
-void op_clo (void)
+static always_inline int _do_cntlzw (uint32_t val)
 {
-    int n;
-
-    if (T0 == ~((target_ulong)0)) {
-        T0 = 32;
-    } else {
-        for (n = 0; n < 32; n++) {
-            if (!(((int32_t)T0) & (1 << 31)))
-                break;
-            T0 <<= 1;
-        }
-        T0 = n;
+    int cnt = 0;
+    if (!(val & 0xFFFF0000UL)) {
+        cnt += 16;
+        val <<= 16;
+    }
+    if (!(val & 0xFF000000UL)) {
+        cnt += 8;
+        val <<= 8;
     }
+    if (!(val & 0xF0000000UL)) {
+        cnt += 4;
+        val <<= 4;
+    }
+    if (!(val & 0xC0000000UL)) {
+        cnt += 2;
+        val <<= 2;
+    }
+    if (!(val & 0x80000000UL)) {
+        cnt++;
+        val <<= 1;
+    }
+    if (!(val & 0x80000000UL)) {
+        cnt++;
+    }
+    return cnt;
+}
+
+void op_clo (void)
+{
+    T0 = _do_cntlzw(~T0);
     RETURN();
 }
 
 void op_clz (void)
 {
-    int n;
-
-    if (T0 == 0) {
-        T0 = 32;
-    } else {
-        for (n = 0; n < 32; n++) {
-            if (T0 & (1 << 31))
-                break;
-            T0 <<= 1;
-        }
-        T0 = n;
-    }
+    T0 = _do_cntlzw(T0);
     RETURN();
 }

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] [Patch] set boot sequence from command line
On Wed, 2007-10-24 at 23:59 +0200, andrzej zaborowski wrote:
> On 24/10/2007, Dan Kenigsberg [EMAIL PROTECTED] wrote:
> > Real PCs try to boot from a CD, then try the hard drive, and finally go to the network. And they let their owner change that order.
>
> With the difference that on real PCs this is controlled by the BIOS menu rather than a hardware switch, but the latter seems more convenient for qemu.
>
> > This patch lets Qemu do the same with -boot dcn.
> >
> > I'll be happy to hear comments about it,
> >
> > Dan.
> >
> > diff --git a/hw/an5206.c b/hw/an5206.c
> > index 94ecccb..2134184 100644
> > --- a/hw/an5206.c
> > +++ b/hw/an5206.c
> > @@ -27,7 +27,7 @@ void DMA_run (void)
> >
> >  /* Board init. */
> >
> > -static void an5206_init(int ram_size, int vga_ram_size, int boot_device,
> > +static void an5206_init(int ram_size, int vga_ram_size, char *boot_device,
> >                       DisplayState *ds, const char **fd_filename, int snapshot,
> >                       const char *kernel_filename, const char *kernel_cmdline,
> >                       const char *initrd_filename, const char *cpu_model)
>
> BTW, it may be a good idea to pass all these values (maybe except ds) as a single struct, for purely practical reasons.
>
> Regards

Maybe the use of several structures would be better:
- one could be used to describe the kernel boot (with kernel_filename, kernel_cmdline, initrd_filename) and could be NULL
- one for the hardware emulation parameters (ram_size, vga_ram_size, cpu_model)
- one for the emulation parameters if needed (snapshot, ds...)
...
It may be more consistent than using a single structure mixing miscellaneous things that are not directly related.

One remark about the submitted patch: why is the boot order table limited to 3 elements? There are at least 4 choices available today (floppy, disk, CDROM, network) and maybe more in the future for some architectures (referring to the currently emulated hardware: 2 floppies, 4 IDE devices, n network devices, SCSI storage, ...).

I guess it's not such a good idea to override the boot_device table in the machine init routines. Imho, it would better be passed as a const char * argument.

For the PowerPC part, there could be a local int boot_device variable that would be initialized to the first element of the table. And this would not change the NVRAM initialisation API: if this API needs to support more than one boot device in the future, it will have to be completely reworked (for other reasons too), so I would suggest to be conservative and not change this API at all.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
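The grouping suggested above could look roughly like the following sketch. Every type and function name here (`KernelBootInfo`, `MachineHwParams`, `MachineInitArgs`, `first_boot_device`) is hypothetical and invented for illustration; none of them exists in Qemu, and the `'c'` fallback is only an assumed default.

```c
#include <stddef.h>

/* Hypothetical: how to boot a kernel directly; the whole struct is optional. */
typedef struct KernelBootInfo {
    const char *kernel_filename;
    const char *kernel_cmdline;
    const char *initrd_filename;
} KernelBootInfo;

/* Hypothetical: parameters of the emulated hardware. */
typedef struct MachineHwParams {
    int ram_size;
    int vga_ram_size;
    const char *cpu_model;
} MachineHwParams;

/* Hypothetical: everything a machine init routine would receive. */
typedef struct MachineInitArgs {
    const MachineHwParams *hw;
    const KernelBootInfo *kernel;  /* NULL when no kernel is given */
    const char *boot_order;        /* e.g. "dcn"; read-only, any length */
    int snapshot;
} MachineInitArgs;

/* A machine that only supports one boot device (as suggested for the
   PowerPC NVRAM case) just takes the first entry of the boot order. */
static int first_boot_device(const MachineInitArgs *args)
{
    if (args->boot_order == NULL || args->boot_order[0] == '\0')
        return 'c';  /* assumed default for this sketch */
    return args->boot_order[0];
}
```

Passing the boot order as a read-only string keeps the init routines from modifying it, and a string has no built-in 3-element limit.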
Re: [Qemu-devel] qemu host-utils.c
On Wed, 2007-10-24 at 18:37 +0100, Thiemo Seufer wrote:
> J. Mayer wrote:
> > On Wed, 2007-10-24 at 12:20 +0200, Fabrice Bellard wrote:
> > > I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file.
> >
> > I fully agree with this. I still use the same code in the PowerPC op_helper.c file because I never convinced myself that the host-utils version was bug-free. I would likely switch to the common version if I could be sure it cannot lead to any regression.
>
> Like this? Questions/Comments I have:
> [...]
> - The x86-64 assembler is untested for this version, could you check it works for you?

I wrote a small test program comparing the results of Fabrice's implementation and of the x86_64 optimized implementation, in both the signed and unsigned cases. I used the code from host-utils.c in CVS for the optimized case and the code from target-ppc/op_helper.c for the C case. For my test vectors, I first used a walking-one like pattern generation algorithm (including the 0 argument cases), then purely random numbers. I did more than 2^32 tests with no differences between the two implementations.

What I suggest, to be safe:
- do not change the current host-utils API and keep the x86_64 optimised case as it is. This way, we are sure not to break anything.
- just merge Fabrice's code to replace the non-x86_64 code.

As using this API could lead to more optimisations in the PowerPC implementation code, I can wait for you to commit this part and remove the private helpers as soon as you have committed. I will then also sanitize the Alpha case, which seems broken, even when running on 64 bits hosts. I don't know much about Sparc, so I won't change it.

[...]

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
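For readers who want to reproduce this kind of check, here is a minimal sketch of a portable 64x64 -> 128 multiplication together with a walking-one test. It is written for this example and is not the code from host-utils.c or target-ppc/op_helper.c; the `*_portable` names and `walking_one_mismatches` are made up. The argument order (low word first) follows the function definitions shown in the patches of this thread.

```c
#include <stdint.h>

/* Unsigned 64x64 -> 128 multiplication built from four 32x32 -> 64 products. */
static void mulu64_portable(uint64_t *plow, uint64_t *phigh,
                            uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* Sum of the middle column; at most 3 * (2^32 - 1), so no overflow. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    *plow = (mid << 32) | (uint32_t)p0;
    *phigh = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Signed variant: multiply the two's complement bit patterns as unsigned,
   then correct the high word for each negative operand. */
static void muls64_portable(uint64_t *plow, uint64_t *phigh,
                            int64_t a, int64_t b)
{
    mulu64_portable(plow, phigh, (uint64_t)a, (uint64_t)b);
    if (a < 0)
        *phigh -= (uint64_t)b;
    if (b < 0)
        *phigh -= (uint64_t)a;
}

/* Walking-one check: (1 << i) * (1 << j) must be a single bit at i + j.
   Returns the number of mismatches found. */
static int walking_one_mismatches(void)
{
    int i, j, bad = 0;

    for (i = 0; i < 64; i++) {
        for (j = 0; j < 64; j++) {
            uint64_t lo, hi;
            int s = i + j;
            uint64_t exp_lo = s < 64 ? (uint64_t)1 << s : 0;
            uint64_t exp_hi = s >= 64 ? (uint64_t)1 << (s - 64) : 0;

            mulu64_portable(&lo, &hi, (uint64_t)1 << i, (uint64_t)1 << j);
            if (lo != exp_lo || hi != exp_hi)
                bad++;
        }
    }
    return bad;
}
```

A fuller comparison, like the one described above, would also feed random operands to both this code and the assembly version and compare the two 128 bits results.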
Re: [Qemu-devel] [RFC/PATCH] remove $check_gfx from configure
On Mon, 2007-10-22 at 20:09 -0500, Carlo Marcelo Arenas Belon wrote:
> The following patch removes check_gfx from qemu's configure as the check it was trying to enforce is no longer valid. If neither sdl or cocoa are available, the video output for the console will be still available from the vnc server (which can't be disabled).

Then it seems better to me to add an option to disable the VNC server. Why should we have a graphical output when we want to emulate boards that don't have any kind of graphical device?

[...]

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] I got a kernel booted under qemu-system-ppc !
Hi,

Sorry for the delay...

On Fri, 2007-10-19 at 12:57 -0500, Milton Miller wrote:
> On Oct 18, 2007, at 6:46 PM, J. Mayer wrote:
> > On Thu, 2007-10-18 at 19:12 -0500, Rob Landley wrote:
> [...]
> Rob was complaining he needed arch ppc to boot prep qemu but kernel_headers only worked on powerpc. Rob chose prep because the header on the prep kernel was the only one that satisfied Open Hackware when supplied to -kernel.
>
> About this time Dave Gibson started code to run prep hardware under powerpc. It had several assumptions on what the machine looked like. As I remember it, I tried Dave's port on the qemu 0.6.2 found on the Knoppix 4.0.2 dvd and caused qemu to segv. I was not prepared to debug qemu; I assumed the assumptions were not met.
>
> Taking a look at Open Hackware 0.4, it appears that it treats the memory given to -kernel as a block device. That makes no sense to me. I would expect it to treat the memory as the file loaded from the block device.

It seems very common and useful to me to treat memory the same way you do with any block device. You can then have RAM disks, MTD, ... That's the idea behind it, even if the current implementation may not properly reflect it. It's even more logical to treat the PreP kernel image as a block device as it is exactly a block device image dump...

> Seeing that no updates to Open Hackware had been made in years, and that it only pretended to have an open firmware client interface and contained a mostly useless rtas (eg no pci config methods), I decided to try bypassing it and just give the kernel what it was looking for: a description of the hardware.

OpenFirmware should not be required to boot a PreP system, as it is only an option of the platform. The residual data should be sufficient for any OS, if we follow the specification. RTAS is not used on PreP, as far as I know; it seems to be useful mostly when running in a virtual partition, but this is not supported by Qemu for now.

> I chose to continue with prep because I knew the expected hardware much better than pmac (the heathrow emulation). I started by mining Open Hackware 0.4 for information describing the hardware and its startup. I found the entry point was 0xfffffffc and after a bit of experimentation I found qemu required the rom to be 512k and was read only. I found that since there were no caches I could do IO to serial in real mode, which meant I didn't have to decide what to map via bats.

I guess you found most of the useful information there!

[...]

I have to admit I never focused on trying to solve this issue, as I usually use the Mac99 or PowerPC 405 targets for tests, PreP machines are long obsolete and the heathrow target does not reflect any real machine. But this is to be solved, for sure.

> From the patch description you inserted an openpic between the 8259 and the cpu. If this is true, at a minimum it will require this to be described in the device tree for linux (or in the prep residual data in general for hackware). My qemu linux code wasn't expecting an openpic so it would need to adjust; or perhaps David's kernel code would now work.

There is no OpenPIC in the Qemu PreP target. There could be one, but currently there is only one i8259, when there should be 2, cascaded... (maybe some of the IRQ problems we have come from the fact that we don't have this second IRQ controller?).
The patch did add a description of what I called the 'internal PowerPC interrupt controller', which in fact is the emulation of the PowerPC bus interface. For the PreP target, only the 6xx bus is supported. It also added the proper connection between the Mac99 target OpenPIC controller and the PowerPC input pins, and should also provide the connection between the i8259 output pin and the PowerPC IRQ input pin for the PreP target. I realized some time ago that the connections between the heathrow PIC and the PowerPC input pins are not emulated, but there seem to be more problems in the heathrow emulation.

[]

If the proposed ROM image is best suited to make the PreP target run (and as I don't spend a lot of time hacking OHW these days), it may also be interesting to add a ppc_prep_rom.bin to the repository... In fact, it was maybe not a good idea to try to use the same BIOS image on all PowerPC targets...

> The code has drawbacks, as I mentioned. It's very linux specific and the only error it finds is no load segment in the presumed elf header. It wouldn't be that hard to add some error checking and even print errors to the serial. The code itself is not platform specific but the device tree attached to it is. As I said, we could link multiple copies together (one for each platform) and search until we got the right one; or just search trees based on the platform with the fixups in code specific to that tree.

In that regard, this would be an option to use
Re: [Qemu-devel] PreP kernels boot using Qemu
On Tue, 2007-10-23 at 12:47 +0100, Thiemo Seufer wrote:
> J. Mayer wrote:
> > On Tue, 2007-10-23 at 00:05 +0200, Aurelien Jarno wrote:
> > > J. Mayer wrote:
> > > > On Mon, 2007-10-22 at 18:28 +0200, Aurelien Jarno wrote:
> > > > > On Mon, Oct 22, 2007 at 09:36:07AM +0200, J. Mayer wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > I've been investigating more about PreP kernel boot using Qemu and I managed to boot 2.4.35, 2.6.12 and 2.6.22 kernels using Qemu CVS and unmodified OHW.
> > > > > [...]
> > > > > - The floating point problem I reported during the week-end does not exist, probably because of the switch from powerpc to ppc. I still don't know if it is a kernel problem or a QEMU problem (or both).
> > > >
> > > > There may be issues with the floating point emulation, especially if some kernel or programs rely on the FPSCR (floating-point status) register which is never updated in Qemu.
> > >
> > > Is there any technical reason behind that, or is it just a lack of time?
> >
> > I can say both: for most programs, using floating point arithmetic à la fast-math, it's not necessary to maintain a precise FPU state, as those programs will never raise any FPU exception, never generate NaNs, infinities, ...
> > The other reason is that it would need to check every FPU insn arguments and results at run time and treat all special cases following the actual PowerPC implementations behavior if we want to get a precise emulation. This behavior could be for example selected at compile time: then one would have the choice to have a quick FPU emulation model or a precise one.
>
> For mips I chose the middle ground: The emulation is architecturally correct but may not reflect FPU behaviour of the specific silicon. E.g. one effect is that in certain cases the emulation computes values close to underflow, while real hardware would throw the (mips FPU specific) unimplemented exception.
>
> For most cases this should be good enough, since only specialized software will rely on a specific implementation's oddities.

Well, what you've done for Mips is exactly what I called the precise emulation and is far slower than the fast math emulation I got for PowerPC. I was wrong talking about PowerPC implementations when I should have said PowerPC specification; but there should be no difference between the two (or it's not a PowerPC CPU...) because the POWER/PowerPC specification describes very precisely the behavior of the FPU.
The fast math model relies on the softfloat-native code and is sufficient for most applications. But there are a few instructions that should always take care of (or maybe at least reset) the FPSCR register, which is not done in the current code.

-- 
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] PreP kernels boot using Qemu
On Tue, 2007-10-23 at 23:59 +0200, Aurelien Jarno wrote: J. Mayer a écrit : On Tue, 2007-10-23 at 12:47 +0100, Thiemo Seufer wrote: J. Mayer wrote: On Tue, 2007-10-23 at 00:05 +0200, Aurelien Jarno wrote: J. Mayer a écrit : On Mon, 2007-10-22 at 18:28 +0200, Aurelien Jarno wrote: On Mon, Oct 22, 2007 at 09:36:07AM +0200, J. Mayer wrote: Hi all, I've been investigating more about PreP kernel boot using Qemu and I achieved to boot 2.4.35, 2.6.12 and 2.6.22 kernels using Qemu CVS and unmodified OHW. [...] - The floating point problem I reported during the week-end does not exists, probably because of the switch from powerpc to ppc. I still don't know if it is a kernel problem or a QEMU problem (or both). There may be issues with the floating point emulation, especially if some kernel or programs relies on the FPSCR (floating-point status) register which is never updated in Qemu. Is there any technical reason behind that, or is it just a lack of time? I can say both: for most program, using floating point arithmetic ala fast-math, it's not necessary to maintain a precise FPU state, as those program will never raise any FPU exception, never generate NaNs, infinites, ... The other reason is that it would need to check every FPU insn arguments and results at run time and treat all special cases following the actual PowerPC implementations behavior if we want to get a precise emulation. This behavior could be for example selected at compile time: then one would have the choice to have a quick FPU emulation model or a precise one. For mips I chose the middle ground: The emulation is architecturally correct but may not reflect FPU behaviour of the specific silicon. E.g. one effect is that in certain cases the emulation computes values close to underflow, while real hardware would throw the (mips FPU specific) unimplemented exception. For most cases this should be good enough, since only specialized software will rely on a specific implementation's oddities. 
Well, what you've done for Mips is exactly what I called the precise emulation and is far slower than the fast math emulation I got for PowerPC. I was wrong talking about PowerPC implementations when I should have said PowerPC specification; but there should be no difference between the two (or it's not a PowerPC CPU...) because the POWER/PowerPC specification describes very precisely the behavior of the FPU. The fast math model relies on the native-softmmu code and is suficient for most applications. But there are a few instructions that should always take care (or maybe at least reset) the FPSCR register, which is not done in the current code. Then I guess it is what has been done on the SPARC target: after each FP instruction, check_ieee_exceptions() is called to accumulate the IEEE exceptions and generate real exceptions if they are enabled. That doesn't look really complex, but I agree that could slow down a bit the emulation. I will get a closer look in two or three weeks. It's not so complex. What would greatly slow down the emulation is that you need to use the softfloat model instead of the softfloat-native one for this to produce the expected result. The PowerPC fadd instruction just compiles with 3 insns on amd64, using the fast math model: movlpd 0x1b8(%r14),%xmm0 ; /* Load env-ft0 into a MMX register */ addsd 0x1b0(%r14),%xmm0 ; /* Add env-ft1 */ movsd %xmm0,0x1b0(%r14) ; /* Store the result into env-ft0 */ With the precise model, you need to: 1/ Clear the floating point flags 2/ Load operands from env-ft0 env-ft1 into host registers 3/ Call the float64_add function 4/ Store the result into env-ft0 5/ Compute the architecture specific FPU flags which will lead to execute much more code for each FPU operation and will consume much more space in the TB buffer. It's a good idea to allow the use of such a precise model, when you want to use specific applications that rely on the FPU to properly handle NaNs, infinities and properly generate exceptions. 
But, as it's not needed by most applications, having a fast math model is also great to have a quicker emulation. I said it would be great to allow the choice of the model at compile time but it could in fact be chosen at run-time, just by tweaking the code translator (which should not lead to any performance penalty for the fast model case) and compiling the FPU micro-operations twice, once with CONFIG_SOFTFLOAT defined, once without. This way, the Qemu user could easily choose between the fast and precise models, just by changing a switch on the command line. -- J. Mayer [EMAIL PROTECTED] Never organized
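To make the fast/precise distinction above concrete, here is a minimal sketch in plain C. It is not Qemu code: the GuestFPU structure, the flag bits and both function names are made up for illustration, and the flag bits do not follow the real FPSCR layout. The fast variant just does the host addition; the precise variant inspects operands and result and accumulates sticky status bits, which is the extra per-instruction work discussed above.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical sticky status bits; NOT the real PowerPC FPSCR layout. */
enum {
    FP_INVALID  = 1u << 0,  /* e.g. (+inf) + (-inf) */
    FP_OVERFLOW = 1u << 1   /* finite operands, infinite result */
};

/* Hypothetical guest FPU state, loosely modelled on env->ft0 / env->ft1. */
typedef struct {
    double   ft0;     /* first operand, also receives the result */
    double   ft1;     /* second operand */
    uint32_t status;  /* accumulated (sticky) exception flags */
} GuestFPU;

/* Fast model: host arithmetic only, guest status is never touched. */
static void fadd_fast(GuestFPU *env)
{
    env->ft0 = env->ft0 + env->ft1;
}

/* Precise model: same addition, plus checks on operands and result. */
static void fadd_precise(GuestFPU *env)
{
    double a = env->ft0;
    double b = env->ft1;
    double r = a + b;
    uint32_t flags = 0;

    /* A NaN result from two non-NaN operands means an invalid operation. */
    if (isnan(r) && !isnan(a) && !isnan(b)) {
        flags |= FP_INVALID;
    }
    /* An infinite result from two finite operands means overflow. */
    if (isinf(r) && !isinf(a) && !isinf(b)) {
        flags |= FP_OVERFLOW;
    }

    env->ft0 = r;
    env->status |= flags;  /* flags are sticky: they accumulate */
}
```

A run-time switch as suggested above would then amount to the translator emitting a call to one variant or the other; a real precise model would also have to cover inexact, underflow, rounding modes and enabled exceptions.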
[Qemu-devel] PreP kernels boot using Qemu
Hi all, I've been investigating more about PreP kernel boot using Qemu and I achieved to boot 2.4.35, 2.6.12 and 2.6.22 kernels using Qemu CVS and unmodified OHW. The issues I found in the kernel are: - the OpenFirmware video console driver is broken in recent 2.4 kernels and have been removed from recent 2.6 kernel - I then decided to use the vga16fb console driver but needed to do some patches in order to make it compile properly - the CMOS RTC driver is not available for PPC architecture in 2.6 kernels and need some patches in order to be usable - I discovered that the mkprep utility is bugged in 2.4.35 and 2.6.12 kernels. The bugs are visible only when cross-compiling from a little-endian and/or 64 bits host. - I got issues (ie process freezing) when using the 2.6.22 kernel with HZ 100. It seems to run properly when the system timer is set to 100 Hz but this needs more tests for confirmation. - I got the 2.6.22 kernel crashing (ie kernel Oops in workqueue code) when it has no RTC available. There is no problem when the RTC is present. This is likely to be a kernel bug: when no RTC is available, it cannot calibrate its timers properly and the kernel timer seems to run very fast. Forcing (with hacks...) the timer to run at nearly real-time seems to prevent the bug to happen. I then generated some kernels that allow me to boot and use those 3 kernels. 
Here are 3 tarballs with: - a patch to be applied to the vanilla kernel sources to fix the mentionned bugs - the .config file I used to build the kernel - the zImage.prep image http://perso.magic.fr/l_indien/qemu-ppc/linux-tests/linux-2.4.35-prep.tar.bz2 http://perso.magic.fr/l_indien/qemu-ppc/linux-tests/linux-2.6.12-prep.tar.bz2 http://perso.magic.fr/l_indien/qemu-ppc/linux-tests/linux-2.6.22-prep.tar.bz2 I then run Qemu with the following command line template: ./ppc-softmmu/qemu-system-ppc -serial stdio -net nic,model=ne2k_isa -net tap -net nic,model=ne2k_pci -net tap -net nic,model=rtl8139 -net tap -cpu 604 -M prep -L pc-bios/ -hda my_first_disk -cdrom my_cdrom -kernel src_base/linux-kversion.patched/arch/ppc/boot/images/zImage.prep Hope this helps. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] PreP kernels boot using Qemu
On Mon, 2007-10-22 at 18:28 +0200, Aurelien Jarno wrote: On Mon, Oct 22, 2007 at 09:36:07AM +0200, J. Mayer wrote: Hi all, I've been investigating more about PreP kernel boot using Qemu and I achieved to boot 2.4.35, 2.6.12 and 2.6.22 kernels using Qemu CVS and unmodified OHW. The issues I found in the kernel are: - the OpenFirmware video console driver is broken in recent 2.4 kernels and have been removed from recent 2.6 kernel - I then decided to use the vga16fb console driver but needed to do some patches in order to make it compile properly - the CMOS RTC driver is not available for PPC architecture in 2.6 kernels and need some patches in order to be usable - I discovered that the mkprep utility is bugged in 2.4.35 and 2.6.12 kernels. The bugs are visible only when cross-compiling from a little-endian and/or 64 bits host. - I got issues (ie process freezing) when using the 2.6.22 kernel with HZ 100. It seems to run properly when the system timer is set to 100 Hz but this needs more tests for confirmation. - I got the 2.6.22 kernel crashing (ie kernel Oops in workqueue code) when it has no RTC available. There is no problem when the RTC is present. This is likely to be a kernel bug: when no RTC is available, it cannot calibrate its timers properly and the kernel timer seems to run very fast. Forcing (with hacks...) the timer to run at nearly real-time seems to prevent the bug to happen. I then generated some kernels that allow me to boot and use those 3 kernels. 
Here are 3 tarballs with: - a patch to be applied to the vanilla kernel sources to fix the mentionned bugs - the .config file I used to build the kernel - the zImage.prep image http://perso.magic.fr/l_indien/qemu-ppc/linux-tests/linux-2.4.35-prep.tar.bz2 http://perso.magic.fr/l_indien/qemu-ppc/linux-tests/linux-2.6.12-prep.tar.bz2 http://perso.magic.fr/l_indien/qemu-ppc/linux-tests/linux-2.6.22-prep.tar.bz2 I then run Qemu with the following command line template: ./ppc-softmmu/qemu-system-ppc -serial stdio -net nic,model=ne2k_isa -net tap -net nic,model=ne2k_pci -net tap -net nic,model=rtl8139 -net tap -cpu 604 -M prep -L pc-bios/ -hda my_first_disk -cdrom my_cdrom -kernel src_base/linux-kversion.patched/arch/ppc/boot/images/zImage.prep Hope this helps. Yes, this help a lot, thanks! With your config file, I have been able to build and boot a 2.6.22 kernel. I have used a Debian sid chroot. Here are a few remarks: - The NE2000 card doesn't work for the same reason as with the powerpc architecture. The kernel patch below fixes the problem. I will send it later along with the ppc patch. There's something else strange with the PCI ethernet devices: they got no IRQ assigned (as if the BIOS does not configure them properly). And the RTL8139 never has a mac address, never detects the PHY link, then there may be endianness issues in the emulation (I did not check at all). - The floating point problem I reported during the week-end does not exists, probably because of the switch from powerpc to ppc. I still don't know if it is a kernel problem or a QEMU problem (or both). There may be issues with the floating point emulation, especially if some kernel or programs relies on the FPSCR (floating-point status) register which is never updated in Qemu. - PCI is broken. 
PCI IDs are reported in the wrong endianness: 00:00.0 Non-VGA unclassified device: Unknown device 0148:5710 (rev 06) 00:01.0 Non-VGA unclassified device: Santa Cruz Operation Unknown device 3412 (rev 03) This does not happen with 2.4 kernels. Using the 2.4.35 image, all PCI descriptors are OK and the drivers properly recognize the devices. What I suspect is that 2.6 kernels tweak the chipset to make it handle the endian-reverse accesses. [...] -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] PreP kernels boot using Qemu
On Tue, 2007-10-23 at 00:05 +0200, Aurelien Jarno wrote: J. Mayer a écrit : On Mon, 2007-10-22 at 18:28 +0200, Aurelien Jarno wrote: On Mon, Oct 22, 2007 at 09:36:07AM +0200, J. Mayer wrote: Hi all, I've been investigating more about PreP kernel boot using Qemu and I achieved to boot 2.4.35, 2.6.12 and 2.6.22 kernels using Qemu CVS and unmodified OHW. [...] - The floating point problem I reported during the week-end does not exists, probably because of the switch from powerpc to ppc. I still don't know if it is a kernel problem or a QEMU problem (or both). There may be issues with the floating point emulation, especially if some kernel or programs relies on the FPSCR (floating-point status) register which is never updated in Qemu. Is there any technical reason behind that, or is it just a lack of time? I can say both: for most program, using floating point arithmetic ala fast-math, it's not necessary to maintain a precise FPU state, as those program will never raise any FPU exception, never generate NaNs, infinites, ... The other reason is that it would need to check every FPU insn arguments and results at run time and treat all special cases following the actual PowerPC implementations behavior if we want to get a precise emulation. This behavior could be for example selected at compile time: then one would have the choice to have a quick FPU emulation model or a precise one. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] qemu alpha?
On Mon, 2007-10-22 at 09:43 +0200, Oliver Falk wrote: On 10/21/2007 01:06 PM, J. Mayer wrote: On Sun, 2007-10-21 at 05:43 -0500, Rob Landley wrote: On Saturday 20 October 2007 3:56:12 am J. Mayer wrote: On Fri, 2007-10-19 at 19:49 -0500, Rob Landley wrote: On Sunday 14 October 2007 5:14:27 am J. Mayer wrote: On Sun, 2007-10-14 at 11:19 +0200, Oliver Falk wrote: Hi list! Hi you ! Just wanted to know how far the progress on alpha target is? I would be happy if I have some 'virtual alpha' to test new isos. If I can help some way (I have a few alphas around). Let me know. I'm happy to see someone interresting in improving Alpha support, which is very alpha for now ! I'm interested in testing Alpha too, but I haven't seem a qemu-system-alpha show up yet. Alas, I have no hardware or specific expertise in this platform, I'm just trying to build and boot Linux kernels (and corresponding root filesystems) on as many emulated target platforms as I can. There are a lot of things missing for qemu-system-alpha to be available: - the PALCode emulation is far from being complete or even usable I have no idea what that is. The PALCode is mainly equivalent to the microcode of most CPU architectures. What is different to microcode is that is uses only regular Alpha instructions, just adding 4 instructions to access special hardware registers and access the memory with different priviledge levels. Another main idea is that everyone can write its own PALCode image and switch to it at run-time. Then, for example, the PALCode ABI is not the same one if you run Linux or Windows NT. The PALCode handles all complex operations. For example, the CPU provides only TLB and the MMU tables search is actually implemented in software, in the PALCode. This greatly simplifies the CPU design and allows a high level of flexibility. And if your OS need a specific ABI for example to handle CPU exception, you define your ABI, write the PALCode using Alpha insns and use it ! 
The Alpha CPU also provide an instruction to do PALCode calls from the OS or applications. There are 3 (4 ?) native PALCode ABIs documented in the Alpha CPUs specifications then those can be emulated at the host side in Qemu. It is in fact needed to emulate a subset of the PALCode even to run user-mode programs. Pretty good explained! Thanks! However, what do you need to make the alpha emulation work? Does ssh to an Alpha help you? I'm quite sure I can offer you access to some ev5 machine very soon and I might give access to some ds10 (ev67 machine). There's also some ds10 (ev6 'only') machine in Australia, that actually works as a builder for the AlphaCore project - but it's not mine and I would need to ask if I can give access to someone else... I actually do not have a lot of time to spend of Alpha emulation, that's why it would be great if some could test and compare the execution of simple programs (then later more complicated one) in order to find the most obvious emulation bugs, with the linux-user mode emulation. For this, an access to any Alpha machine could help a lot. For the full system emulation, a lot of work is to be done, mostly the PALCode emulation and putting together all elements of an actual hardware machine. Note that the PALCode emulation could be avoided if the emulator is able to run native PALCode image but I don't know if those images are easily available... -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] qemu alpha?
On Sat, 2007-10-20 at 13:49 +0100, Thiemo Seufer wrote: J. Mayer wrote: On Fri, 2007-10-19 at 19:49 -0500, Rob Landley wrote: On Sunday 14 October 2007 5:14:27 am J. Mayer wrote: On Sun, 2007-10-14 at 11:19 +0200, Oliver Falk wrote: Hi list! Hi you ! Just wanted to know how far the progress on alpha target is? I would be happy if I have some 'virtual alpha' to test new isos. If I can help some way (I have a few alphas around). Let me know. I'm happy to see someone interresting in improving Alpha support, which is very alpha for now ! I'm interested in testing Alpha too, but I haven't seem a qemu-system-alpha show up yet. Alas, I have no hardware or specific expertise in this platform, I'm just trying to build and boot Linux kernels (and corresponding root filesystems) on as many emulated target platforms as I can. There are a lot of things missing for qemu-system-alpha to be available: - the PALCode emulation is far from being complete or even usable - there is no hardware machine emulation for Alpha in Qemu. As I have no Alpha platform, I don't know much about the hardware to be emulated. But the first step about the Alpha target would be to properly debug the linux-user-mode emulation, that would validate the core CPU INSNS emulation part. I guess my Alpha CPU and ABI knowledge is too restricted to find the problem of most program crashing for now. It seems to me that the Unique register is not initialized properly, but this is just a guess and I have no idea of what's going wrong with this register and what should be its value. Could you record the limitations you know about in a STATUS file and commit that to the target-alpha directory? You're right. I will commit a status file today. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] I got a kernel booted under qemu-system-ppc !
On Sat, 2007-10-20 at 23:49 +0200, Aurelien Jarno wrote: Aurelien Jarno wrote: Aurelien Jarno wrote: Aurelien Jarno wrote: I have used QEMU CVS with a Debian Sid image. It basically works, I am even able to login via SSH, but I have noticed two problems: - Some processes hang, stay in D state and become unkillable. It seems it can happen to all processes, but it is always reproducible with uptime or top. I still don't know if it is a problem of the kernel or if it comes from the emulation.

This problem arises when using floating point instructions. It can be easily triggered by running the following testcase:

    #include <stdio.h>
    int main() { double a = 1.34; printf("%.2f", a); return 0; }

This is actually not enough to trigger the bug. The testcase works if the bug has already been triggered in another process before, for example uptime. I finally found a testcase that triggers the bug in any case:

    #include <stdio.h>
    int main() { printf("%d %f\n", 7, 0.40); return 0; }

The bug can also be triggered with sprintf(), so this is not directly related to I/O. It happens when printing an integer followed by a float, even when the two are printed in two different calls to printf().

OK, thanks. I'll run tests with this program. It seems that floats are OK when running 2.4 kernels, so it may be a difference in recent glibc. I'll try to investigate more about it. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] I got a kernel booted under qemu-system-ppc !
On Sun, 2007-10-21 at 04:55 -0500, Rob Landley wrote: On Saturday 20 October 2007 3:50:52 am J. Mayer wrote: Sleep mode is currently implemented only for a few CPUs. I should add all the currently emulated cores. For this, I would have to emulate the HID registers, in most cases, which is still not done.

Getting it to exit in response to a shut down attempt would be really nice too. (It may already do so, but I have no idea how to trigger it and neither did Milton last I checked.)

Well, if it does not exit, that means that there should be something emulated in the chipset to do so and that the Linux kernel just enters an infinite loop instead of shutting down / resetting the board.

And you can get the list of all CPUs emulated by Qemu with the '-cpu ?' switch.

I did that, but -cpu ? gives output like:

    PowerPC 7448 PVR 80040201
    PowerPC 7448v1.0 PVR 80040100
    PowerPC 7448v1.1 PVR 80040101
    PowerPC 7448v2.0 PVR 80040200
    PowerPC 7448v2.1 PVR 80040201

I prefer the result of -M ? which makes it slightly clearer which field you need to feed qemu as an argument. (For the record, -cpu seems to want field $2 of the -cpu ? output.) Also, that doesn't tell me what the differences between any of them are.

The idea of showing the name of the model and the PVR (processor version register) is that it can be useful to give Qemu a PVR instead of a name, even if this possibility is not properly implemented in the committed version. It's sometimes more relevant to provide a PVR than a core name, imho... I will put a dump of the CPU features for all cores emulated by Qemu online soon.

From earlier research, I know that if you configure a toolchain for 7xx all major PowerPC variants except two will run that, it's more or less -mcpu 386 of the powerpc world. The two that won't run it (Motorola's 8xx and IBM's 4xx) are both embedded subsets of powerpc that have had instructions removed, and thus need their own toolchains. (Of course those two removed DIFFERENT instructions, sigh...)
Those two do not have any hardware floating point implemented and lack support for some optional instructions. If you want to compile executables that would run on any PowerPC core, you have to use the switch '-many' or '-mcom'. This is supposed to generate code that would even run on the original RS/6000 architecture. The '-mppc' switch generates code for 603/604, which should be a better PowerPC insns subset than the 750 one, for portability. If you want all programs to also run on 4xx/8xx/82xx, you may also add '-msoft-float' so no hardware floating point instructions will be used. But you may not care about those cores as they are used only in microcontrollers so I need to implement at least a subset of their internal devices to make them usable in the full-system emulation.

My random and confused notes about various hardware platforms are at http://landley.net/ols/ols2007/platforms.txt, which has a largeish section on ppc that probably makes sense to nobody but me. :) I don't actually have any _background_ in embedded hardware. Busybox, uClibc, and qemu all dragged me into it, and I've been trying to pick things up as I go along... Rob

P.S. I removed you from the CC: list because your ISP is still bouncing my emails as spam, and I don't know if the list sends you a copy if you're cc'd.

Strange, my ISP does not tag your mails as spam when I receive them. But it's not a problem for me not to be CCed... -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] qemu alpha?
On Sun, 2007-10-21 at 05:43 -0500, Rob Landley wrote: On Saturday 20 October 2007 3:56:12 am J. Mayer wrote: On Fri, 2007-10-19 at 19:49 -0500, Rob Landley wrote: On Sunday 14 October 2007 5:14:27 am J. Mayer wrote: On Sun, 2007-10-14 at 11:19 +0200, Oliver Falk wrote: Hi list! Hi you ! Just wanted to know how far the progress on alpha target is? I would be happy if I have some 'virtual alpha' to test new isos. If I can help some way (I have a few alphas around). Let me know. I'm happy to see someone interested in improving Alpha support, which is very alpha for now ! I'm interested in testing Alpha too, but I haven't seen a qemu-system-alpha show up yet. Alas, I have no hardware or specific expertise in this platform, I'm just trying to build and boot Linux kernels (and corresponding root filesystems) on as many emulated target platforms as I can. There are a lot of things missing for qemu-system-alpha to be available: - the PALCode emulation is far from being complete or even usable

I have no idea what that is.

The PALCode is mainly equivalent to the microcode of most CPU architectures. What differs from microcode is that it uses only regular Alpha instructions, just adding 4 instructions to access special hardware registers and to access the memory with different privilege levels. Another main idea is that anyone can write their own PALCode image and switch to it at run-time. So, for example, the PALCode ABI is not the same if you run Linux or Windows NT. The PALCode handles all complex operations. For example, the CPU provides only the TLB, and the MMU tables search is actually implemented in software, in the PALCode. This greatly simplifies the CPU design and allows a high level of flexibility. And if your OS needs a specific ABI, for example to handle CPU exceptions, you define your ABI, write the PALCode using Alpha insns and use it ! The Alpha CPU also provides an instruction to do PALCode calls from the OS or applications. There are 3 (4 ?) native PALCode ABIs documented in the Alpha CPU specifications, so those can be emulated on the host side in Qemu. It is in fact necessary to emulate a subset of the PALCode even to run user-mode programs. [...] -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] I got a kernel booted under qemu-system-ppc !
On Sun, 2007-10-21 at 12:24 +0200, J. Mayer wrote: On Sun, 2007-10-21 at 04:55 -0500, Rob Landley wrote: On Saturday 20 October 2007 3:50:52 am J. Mayer wrote: [...] And you can get the list of all CPUs emulated by Qemu with the '-cpu ?' switch. I did that, but it -cpu ? gives output like: PowerPC 7448 PVR 80040201 PowerPC 7448v1.0 PVR 80040100 PowerPC 7448v1.1 PVR 80040101 PowerPC 7448v2.0 PVR 80040200 PowerPC 7448v2.1 PVR 80040201 I prefer the result of -M ? which makes it slightly clearer which field you need to feed qemu as an argument. (For the record, -cpu seems to want filed $2 of the -cpu ? output.) Also, that doesn't tell me what the differences between any of them are. The idea of showing the name of the model and the PVR (processor version register) is that it can be useful to give Qemu a PVR instead of a name, even if this possibility is not properly implemented in the commited version. It's sometime more relevant to provide a PVR than a core name, imho... Something I forgot to say is that this output is supposed to give a help to the end-user. It seems to me that showing the CPU family (ie PowerPC) can be useful when POWER and RS64 families will be emulated too, in order to help the user choose which CPU to emulate. I will put a dump of the CPU features for all cores emulated by Qemu on line soon. Here you'll find a reference for all PowerPC CPUs / cores / microcontrollers available with the Qemu -cpu switch. Hope I did not forget anything. 
http://perso.magic.fr/l_indien/qemu-ppc/PowerPC_ref/PowerPC_ref.html For each CPU, you get: - its name, as provided with the Qemu -cpu switch - its PVR (processor version register) - the bits defined in the MSR (machine state register) - its MMU model and the TLB model when relevant - its exception handling model - its input bus model - the specific features provided by the MSR - its complete instructions set (only one insn is dumped when 2 instructions share the same opcode) - the complete list of SPRs (special purpose registers) with access rights in supervisor and user mode [...] -- J. Mayer [EMAIL PROTECTED] Never organized
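As an illustration of selecting a CPU by PVR rather than by name, here is a small lookup sketch in C. The table entries are the five lines of '-cpu ?' output quoted earlier in this thread; the CpuDef structure and the two function names are hypothetical and are not taken from target-ppc/translate_init.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical CPU definition: a model name and its PVR value. */
typedef struct {
    const char *name;
    uint32_t    pvr;
} CpuDef;

/* Entries copied from the '-cpu ?' output quoted in this thread. */
static const CpuDef cpu_table[] = {
    { "7448",     0x80040201 },
    { "7448v1.0", 0x80040100 },
    { "7448v1.1", 0x80040101 },
    { "7448v2.0", 0x80040200 },
    { "7448v2.1", 0x80040201 },
};

#define CPU_TABLE_LEN (sizeof(cpu_table) / sizeof(cpu_table[0]))

/* Return the first definition whose PVR matches, or NULL if none does. */
static const CpuDef *find_cpu_by_pvr(uint32_t pvr)
{
    for (size_t i = 0; i < CPU_TABLE_LEN; i++) {
        if (cpu_table[i].pvr == pvr) {
            return &cpu_table[i];
        }
    }
    return NULL;
}

/* Return the definition with exactly this name, or NULL if none does. */
static const CpuDef *find_cpu_by_name(const char *name)
{
    for (size_t i = 0; i < CPU_TABLE_LEN; i++) {
        if (strcmp(cpu_table[i].name, name) == 0) {
            return &cpu_table[i];
        }
    }
    return NULL;
}
```

Note that in the quoted output the generic "7448" entry and "7448v2.1" share the same PVR, so a lookup by PVR returns whichever comes first in the table; a lookup by name stays unambiguous.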
Re: [Qemu-devel] I got a kernel booted under qemu-system-ppc !
On Sat, 2007-10-20 at 01:08 -0500, Rob Landley wrote: On Friday 19 October 2007 3:33:52 pm Aurelien Jarno wrote: Aurelien Jarno a écrit : - The target CPU never gets into idle loop, so the host CPU is always used at 100% This is actually not a problem. The default CPU (604) does not support DOZE or NAP. Switching to a 603 CPU, the target CPU correctly goes into idle loop. Sleep mode is currently implemented only for a few CPUs. I should add all the currently emulated cores. For this, I would have to emulate the HID registers, in most case, which is still not done. This would be adding -cpu 603 to the command line? Yes Is there a web page listing all the powerpc processors somewhere? I'm still at the everything is 7xx except for 4xx and 8xx stage... I found this: http://www.power.org/resources/devcorner/roadmap But it groups by manufacturer rather than capabilities or software compatability... I could do this, as Qemu has definitions for most PowerPC cores (even if most are still not available). For now, you can take a look in target-ppc/translate_init.c. Most PowerPC are referenced here: - there's a big table with all the PVR I know (but there's still a lot missing) - the ppc_defs table contains most PowerPC definitions, with their features defined. I will think of doing a reference table on my web pages, to have a more readable PowerPC reference document. Of course, any information about missing PVRs or PowerPC implementation in welcome ! You can also take a look at the file target-ppc/STATUS file to figure out all cores emulation working in Qemu. And you can get the list of all CPUs emulated by Qemu with the '-cpu ?' switch. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] qemu alpha?
On Fri, 2007-10-19 at 19:49 -0500, Rob Landley wrote: On Sunday 14 October 2007 5:14:27 am J. Mayer wrote: On Sun, 2007-10-14 at 11:19 +0200, Oliver Falk wrote: Hi list! Hi you ! Just wanted to know how far the progress on alpha target is? I would be happy if I have some 'virtual alpha' to test new isos. If I can help some way (I have a few alphas around). Let me know. I'm happy to see someone interresting in improving Alpha support, which is very alpha for now ! I'm interested in testing Alpha too, but I haven't seem a qemu-system-alpha show up yet. Alas, I have no hardware or specific expertise in this platform, I'm just trying to build and boot Linux kernels (and corresponding root filesystems) on as many emulated target platforms as I can. There are a lot of things missing for qemu-system-alpha to be available: - the PALCode emulation is far from being complete or even usable - there is no hardware machine emulation for Alpha in Qemu. As I have no Alpha platform, I don't know much about the hardware to be emulated. But the first step about the Alpha target would be to properly debug the linux-user-mode emulation, that would validate the core CPU INSNS emulation part. I guess my Alpha CPU and ABI knowledge is too restricted to find the problem of most program crashing for now. It seems to me that the Unique register is not initialized properly, but this is just a guess and I have no idea of what's going wrong with this register and what should be its value. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] I got a kernel booted under qemu-system-ppc !
On Fri, 2007-10-19 at 17:19 +0200, Aurelien Jarno wrote: On Thu, Oct 18, 2007 at 07:12:57PM -0500, Rob Landley wrote: The easy way to reproduce this is go to http://landley.net/hg/firmware, download tip, and ./build.sh powerpc. When it finishes building everything, cd build and ./run-powerpc.sh. What I did is build a new ppc_rom.bin (attached, source code is at http://landley.net/hg/firmware/raw-diff/92f89c9c9495/sources/toys/make-ppc_rom.tar.bz2 ) which was written by Milton Miller. I use that firmware as the boot rom (point -L at the directory it's in) instead of Open Hackware, which still doesn't work for me. Then I build a 2.6.23 kernel with this patch: http://landley.net/hg/firmware/raw-diff/fdb6ddd4c3b7/sources/patches/linux-ppcqemu.patch which adds a qemu target. I then boot with the following command line (modulo wordwrap damage):

    qemu-system-ppc -M prep -nographic -hda image-powerpc.ext2 -kernel zImage-powerpc -append 'rw init=/tools/bin/sh panic=1 PATH=/tools/bin root=/dev/hda console=ttyS0' -L ../sources/toys

And I get a shell prompt inside qemu! (After almost _two_years_ of trying, I'm kind of happy about this.) The downside is that the result boots fine under qemu-0.9.0, but is broken with current cvs. I tracked it down to the specific patch with git bisect, and it's this one: http://git.kernel.dk/?p=qemu.git;a=commit;h=36f447f730f61ac413c5b1c4a512781f5dea0c94 author j_mayer j_mayer Mon, 9 Apr 2007 22:45:36 + (22:45 +) committer j_mayer j_mayer Mon, 9 Apr 2007 22:45:36 + (22:45 +) Implement embedded IRQ controller for PowerPC 6xx/740 750. Fix PowerPC external interrupt input handling and lowering. Fix OpenPIC output pins management. Fix multiples bugs in OpenPIC IRQ management. Fix OpenPIC CPU(s) reset function. Fix Mac99 machine to properly route OpenPIC outputs to the PowerPC input pins. Fix PREP machine to properly route i8259 output to the PowerPC external interrupt pin. Versions before that patch went in work fine.
Versions since then hang halfway through IDE controller initialization:

    Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
    ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
    hda: QEMU HARDDISK, ATA DISK drive
    hda: IRQ probe failed (0x0)
    hdb: IRQ probe failed (0x0)
    hdb: IRQ probe failed (0x0)
    hdb: QEMU CD-ROM, ATAPI CD/DVD-ROM drive
    hdb: IRQ probe failed (0x0)    <-- hangs here with the patch
    ide0 at 0x1f0-0x1f7,0x3f6 on irq 13
    hda: max request size: 512KiB
    hda: 4194304 sectors (2147 MB) w/256KiB Cache, CHS=4161/255/63
    hda: set_multmode: status=0x41 { DriveReady Error }
    hda: set_multmode: error=0x04 { DriveStatusError }
    ide: failed opcode was: 0xef
    hda: cache flushes supported
    hda: unknown partition table
    mice: PS/2 mouse device common for all mice

The small patch below fixes the IDE problem, but not the NE2000 ISA one. Please apply.

This patch makes the PreP target run for me, using OpenHackWare, and I got NE2000 working too. 2.4 vanilla kernels run perfectly, as well as old 2.6 ones. But there still seem to be problems with recent 2.6 kernels not using the frame buffer properly: I can see the kernel entering user mode, from the messages on the serial console, but I get no more messages from there. But I guess it's booting as I can see the CPU entering sleep mode a few seconds after reaching this point, the same way it does when I can see it waiting for the user login. So I will apply the patch. I also added PCI network devices but still haven't validated them.

    Index: hw/i8259.c
    ===
    RCS file: /sources/qemu/qemu/hw/i8259.c,v
    retrieving revision 1.25
    diff -u -d -p -r1.25 i8259.c
    --- hw/i8259.c 17 Sep 2007 08:09:46 - 1.25
    +++ hw/i8259.c 19 Oct 2007 15:17:22 -
    @@ -164,7 +164,7 @@ void pic_update_irq(PicState2 *s)
         }
         /* all targets should do this rather than acking the IRQ in the cpu */
    -#if defined(TARGET_MIPS)
    +#if defined(TARGET_MIPS) || defined(TARGET_PPC)
         else {
             qemu_irq_lower(s->parent_irq);
         }

-- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] I got a kernel booted under qemu-system-ppc !
On Thu, 2007-10-18 at 19:12 -0500, Rob Landley wrote: The easy way to reproduce this is go to http://landley.net/hg/firmware, download tip, and ./build.sh powerpc. When it finishes building everything, cd build and ./run-powerpc.sh. [...] The downside is that the result boots fine under qemu-0.9.0, but is broken with current cvs. I tracked it down to the specific patch with git bisect, and it's this one: http://git.kernel.dk/?p=qemu.git;a=commit;h=36f447f730f61ac413c5b1c4a512781f5dea0c94 author j_mayer j_mayer Mon, 9 Apr 2007 22:45:36 + (22:45 +) committer j_mayer j_mayer Mon, 9 Apr 2007 22:45:36 + (22:45 +)

What is strange is that it was reported as unable to boot long before this patch. It was broken when the Qemu PCI architecture was redesigned and has never been reported booting since then (2 years ago, if I remember well). In March 2007, I tested and reported that the PreP and heathrow targets were unable to boot, not receiving any IRQ from PCI (in fact, adding traces proves that no IRQ is even generated in the PCI code). Someone reported the faulty patch during this summer (but I have not been able to find the mail in the mailing list archive tonight). I have to admit I never put the focus on trying to solve this issue, as I usually use Mac99 or PowerPC 405 targets for tests, PreP machines are long obsolete and the heathrow target does not reflect any real machine. But this is to be solved, for sure. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] Mips target '-kernel' option bug
On Wed, 2007-10-17 at 22:06 +0300, Blue Swirl wrote: On 10/17/07, Jocelyn Mayer [EMAIL PROTECTED] wrote: On Wed, 2007-10-17 at 14:51 +0100, Thiemo Seufer wrote: J. Mayer wrote: I failed to run the Mips target test image on my amd64 machine and I have now found the cause of the bug: the kernel loader code used in hw/mips_r4k.c and hw/mips_malta.c implicitly assumes that ram_addr_t is 32 bits long. Unfortunately, on 64 bits hosts, this won't be the case and the kernel load address is then over 4 GB. Then, when computing the initrd_offset, the code always concludes that there's not enough RAM available to load it on top of the kernel. I found 2 ways of fixing the bug, but I don't know which one is correct in the Mips execution environment. The first patch is to make the VIRT_TO_PHYS_ADDEND negative, thus translating the kernel virtual address from 0x8000 to the physical one 0x (instead of 0x1, when running on 64 bits hosts). The second solution would be to always explicitly cast the kernel_high value to 32 bits. As I do not really know if some Mips target specific constraints would make one solution preferred over the other, I'd better let the specialist choose ! The good news is that, once this issue is fixed, the Mips test images run with the reverse-endian softmmu patch applied.

I think this patch is the correct fix. Please test and comment.

Thanks, I'll test it at home tonight. To satisfy my curiosity, is there a specific reason to have a positive VIRT_TO_PHYS_ADDEND ?

On Sparc, the OpenBIOS image is loaded to a physical address that is higher in the address space than the virtual address:

    #define PROM_PADDR 0xff000ULL
    #define PROM_VADDR 0xffd0

and

    #define PROM_ADDR 0x1fff000ULL
    #define PROM_VADDR 0x000ffd0ULL

OK, thanks. And the patch seems OK for me, it may be a good idea to commit it ! -- J. Mayer [EMAIL PROTECTED] Never organized
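The width problem described in this thread can be reproduced in a few lines of C. This is only a sketch under stated assumptions, not the code from hw/mips_r4k.c: the constant 0x80000000 is the MIPS32 KSEG0 base (the hex values in the mail above are truncated in the archive, so they are not reused here), the kernel address is assumed to be held zero-extended in a 64-bit variable, and the three function names are made up.

```c
#include <stdint.h>

/* Assumption: MIPS32 KSEG0 base, used here as the virtual-to-physical offset. */
#define KSEG0_BASE 0x80000000ULL

/* 64-bit arithmetic with a positive addend: the sum no longer wraps at
 * 32 bits, so a KSEG0 address ends up above 4 GB. */
static uint64_t virt_to_phys_positive_addend(uint64_t vaddr)
{
    return vaddr + KSEG0_BASE;
}

/* First fix discussed above: use a negative addend (subtract the base). */
static uint64_t virt_to_phys_negative_addend(uint64_t vaddr)
{
    return vaddr - KSEG0_BASE;
}

/* Second fix discussed above: keep the positive addend but explicitly
 * truncate the result to 32 bits, restoring the 32-bit wrap-around. */
static uint64_t virt_to_phys_cast32(uint64_t vaddr)
{
    return (uint32_t)(vaddr + KSEG0_BASE);
}
```

With a kernel linked at 0x80100000, the first function yields 0x100100000 on a 64 bits host while both fixes yield 0x00100000, which is why the initrd placement check only failed on 64 bits hosts.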
Re: [Qemu-devel] RFC: Code fetch optimisation
On Mon, 2007-10-15 at 23:42 +0100, Paul Brook wrote: VLE targets (x86, m68k) can translate almost a full page of instructions, and a page boundary can be anywhere within that block. Once we've spanned multiple pages there's no point stopping translation immediately. We may as well translate as many instructions as we can on the second page. I'd guess most TBs are much smaller than a page, so on average only a few instructions are going to come after the page boundary. This leads me to another reflection. For fixed-length encoding targets, we always stop translation when reaching a page boundary. If we keep using the current model and we optimize the slow case, it would be possible to stop only if we cross two page boundaries during code translation, and it seems that this case is not likely to happen. If we keep the current behavior, we could remove the second page_addr element in the tb structure and maybe optimize parts of the tb management and invalidation code. The latter may be the only feasible option. Some targets (ARMv5, maybe others) do not have an explicit fault address for MMU instruction faults. The faulting address is the address of the current instruction when the fault occurs. Prefetch aborts are generated at translation time, which effectively means the faulting instruction must be at the start of a TB. Terminating the TB on a page boundary guarantees this behavior. Well, we get the same behavior on PowerPC. What I was thinking of is that if we fix the VLE problems, the fix, if done in a proper way, could also benefit RISC targets. What I don't know is: would we really benefit from not stopping translation on page boundaries? For VLE targets we already get this wrong (the prefetch abort occurs some time before the faulting instruction executes). I don't know if this behavior is permitted by the ISA, but it's definitely possible to construct cases where it has a visible effect. I think that it would be possible to do things properly.
I'm not really sure what the best way to implement it is, but if, in the slow path of the low-level code fetch routine, we call the get_physical_address or the cpu_get_phys_page_debug function, we then have a way to know whether the code fetch is allowed. If it is, we would just have to adjust our host_pc and host_pc_start so that the next fetch is optimized. If not, we could stop the translation and generate a gen_op_raise_excp_error to raise the exception at the right place, respecting the ISA instruction execution ordering. Generating the exception from inside the TB won't be OK, as it may not be necessary on a second execution of the same TB; the solution may then be to link the TB with another, special TB that would just raise this exception and would be unlinked once the exception has been handled. Or maybe the solution would just be to stop the translation, knowing that the exception will be raised when trying to translate the first instruction in the next page. There may still be specific problems with instructions spanning 2 pages when using those solutions... -- J. Mayer [EMAIL PROTECTED] Never organized
[Qemu-devel] Mips target '-kernel' option bug
I failed to run the MIPS target test image on my amd64 machine and I have now found the cause of the bug: the kernel loader code used in hw/mips_r4k.c and hw/mips_malta.c implicitly assumes that ram_addr_t is 32 bits long. Unfortunately, on 64-bit hosts this is not the case, and the kernel load address then ends up above 4 GB. Then, when computing the initrd_offset, the code always concludes that there's not enough RAM available to load it at the top of the kernel. I found 2 ways of fixing the bug, but I don't know which one is correct in the MIPS execution environment. The first patch is to make the VIRT_TO_PHYS_ADDEND negative, thus translating the kernel virtual address from 0x8000 to the physical one 0x (instead of 0x1, when running on 64-bit hosts). The second solution would be to always explicitly cast the kernel_high value to 32 bits. As I do not really know whether some MIPS target-specific constraint would make one solution preferable to the other, I'd better let the specialists choose! The good news is that, once this issue is fixed, the MIPS test images run with the reverse-endian softmmu patch applied. -- J. Mayer [EMAIL PROTECTED] Never organized
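To make the width problem described above concrete, here is a minimal sketch. The addend value and the helper names are assumptions chosen for illustration (the addresses in the mail above are truncated and are not reproduced here); this is not the code from hw/mips_r4k.c.

```c
#include <stdint.h>

/* Illustrative value only (not taken from the mail): a positive
 * virtual-to-physical addend applied to a 32-bit kernel address. */
#define SKETCH_ADDEND 0x80000000u

/* 32-bit arithmetic: the sum wraps around, giving a low physical address. */
static uint32_t sketch_phys_32(uint32_t vaddr)
{
    return vaddr + SKETCH_ADDEND;
}

/* 64-bit arithmetic: the carry out of bit 31 is kept, so the result
 * lands above 4 GB. */
static uint64_t sketch_phys_64(uint32_t vaddr)
{
    return (uint64_t)vaddr + SKETCH_ADDEND;
}

/* Second fix mentioned in the mail: truncate the result to 32 bits. */
static uint64_t sketch_phys_64_cast(uint32_t vaddr)
{
    return (uint32_t)sketch_phys_64(vaddr);
}

/* First fix mentioned in the mail: use a negative addend instead. */
static uint64_t sketch_phys_64_negative(uint32_t vaddr)
{
    return (uint64_t)((int64_t)vaddr - (int64_t)SKETCH_ADDEND);
}
```

With these assumed values, both fixes give the same low physical address; which one is right for the real loader depends on MIPS-specific constraints that this sketch does not model.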
Re: [Qemu-devel] RFC: Code fetch optimisation
On Tue, 2007-10-16 at 23:00 +0100, Paul Brook wrote: Well, we get the same behavior on PowerPC. What I was thinking of is that if we fix the VLE problems, the fix, if done in a proper way, could also benefit RISC targets. What I don't know is: would we really benefit from not stopping translation on page boundaries? I suspect that we're going to want/need to break the TB to get the exception semantics right, so for RISC targets there's no point having TBs that span a page boundary. My opinion is that this is an optimisation that may be tried later, if it really gives an advantage in terms of translation efficiency, which is far from evident. So let's keep what works well and just try to solve the VLE problems for now... For VLE targets we already get this wrong (the prefetch abort occurs some time before the faulting instruction executes). I don't know if this behavior is permitted by the ISA, but it's definitely possible to construct cases where it has a visible effect. I think that it would be possible to do things properly. [...] Or maybe the solution would just be to stop the translation, knowing that the exception will be raised when trying to translate the first instruction in the next page. I'd go for this one. It's approximately the same method currently used for RISC targets. In general I think this will require target-specific support. For RISC targets this is trivial. For x86/m68k figuring out the length of an insn is trickier. Detecting crossing a page boundary on subsequent insns in the load/mmu routines is problematic because it happens relatively late. In particular it may theoretically happen after we've output ops that change CPU state. I suspect the best solution is to backtrack (remove the generated ops) after decoding the insn if we discover we've passed a page boundary. The ld*_code routines can simply return garbage (e.g. zero) if the read is not on the first page.
The incorrect returned value may have to be target-specific, to be sure it's always an invalid opcode. Backtracking should not be hard if we record the last cc pointer each time we finish translating an insn. I'll think about this solution, which really seems feasible to me. Trying to generate prefetch aborts at runtime sounds too hairy for my liking. It might be really tricky and is likely to be buggy, I agree. -- J. Mayer [EMAIL PROTECTED] Never organized
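The backtracking idea discussed above could look roughly like the following sketch. Everything here is hypothetical (the op buffer, the fixed two ops per instruction, the 4 KiB page size); it only illustrates "remember where the ops of the current insn start, and drop them if the insn turns out to cross the page boundary".

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_BITS 12   /* assumed 4 KiB pages */
#define SKETCH_MAX_OPS   64

typedef struct {
    uint16_t ops[SKETCH_MAX_OPS];
    size_t   n_ops;            /* plays the role of the recorded op pointer */
} SketchOpBuf;

/* Translate instructions of the given byte lengths (each >= 1) starting
 * at pc. Each instruction emits two placeholder ops. Returns the number
 * of instructions kept in the block. */
static size_t sketch_translate(SketchOpBuf *buf, uint32_t pc,
                               const uint8_t *lens, size_t n)
{
    uint32_t first_page = pc >> SKETCH_PAGE_BITS;
    size_t i;

    buf->n_ops = 0;
    for (i = 0; i < n; i++) {
        size_t mark = buf->n_ops;   /* where this insn's ops start */

        if (mark + 2 > SKETCH_MAX_OPS)
            break;
        /* Ops emitted while decoding the instruction. */
        buf->ops[buf->n_ops++] = (uint16_t)i;
        buf->ops[buf->n_ops++] = (uint16_t)i;

        /* Only after decoding do we know the insn length, hence whether
         * its last byte is still on the first page. */
        if (((pc + lens[i] - 1) >> SKETCH_PAGE_BITS) != first_page) {
            buf->n_ops = mark;      /* backtrack: drop this insn's ops */
            break;
        }
        pc += lens[i];
    }
    return i;
}
```

The instruction that was dropped would then become the first instruction of the next TB, which is where a prefetch abort can be raised at translation time.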
Re: [Qemu-devel] RFC: Code fetch optimisation
On Mon, 2007-10-15 at 03:30 +0100, Paul Brook wrote: On Sunday 14 October 2007, J. Mayer wrote: Here's an updated version of the code fetch optimisation patch against current CVS. As a reminder, this patch avoids the use of softmmu helpers to fetch the code in most cases. A new target define, TARGET_HAS_VLE_INSNS, has been added; it is used to handle the case of an instruction that spans 2 pages, when the target CPU uses a variable-length instruction encoding. For pure RISC, the code fetch is done using raw access routines. +unsigned long phys_pc; +unsigned long phys_pc_start; These are ram offsets, not physical addresses. I recommend naming them as such to avoid confusion. Well, those are host addresses. Fabrice even suggested that I replace them with void * to prevent confusion, but I kept using unsigned long because the _p functions API does not use pointers. As those values are defined as phys_ram_base + offset, those are likely to be host addresses, not RAM offsets, and are used directly to dereference host pointers in the ldxxx_p functions. Did I miss something? +opc = glue(glue(lds,SUFFIX),MEMSUFFIX)(virt_pc); +/* Avoid softmmu access on next load */ +/* XXX: dont: phys PC is not correct anymore + * We could call get_phys_addr_code(env, pc); and remove the else + * condition, here. + */ +//*start_pc = phys_pc; The commented out code is completely bogus, please remove it. The comment is also somewhat misleading/incorrect. The else would still be required for accesses that span a page boundary. I guess trying to optimize this case by retrieving the physical address would not bring any gain, as in fact only the last translated instruction of a TB (so only a few code loads) may hit this case. I'd like to keep a comment here to show that it may not be a good idea (or may not be as simple as it seems at first sight) to try to do more optimisation here, but you're right, this comment is not correct.
The code itself looks ok, though I'd be surprised if it made a significant difference. We're always going to hit the fast-path TLB lookup case anyway. It seems that the code generated for the code fetch is much more efficient than the one we get when using the softmmu routines. But it's true we do not get any significant performance boost. As was previously mentioned, the idea of the patch is more 'don't do unneeded things during code translation' than a great performance improvement. -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] RFC: reverse-endian softmmu memory accessors
On Sun, 2007-10-14 at 15:59 +0300, Blue Swirl wrote: On 10/14/07, J. Mayer [EMAIL PROTECTED] wrote: Here's an updated version of the patch against current CVS. This patch provides reverse-endian, little-endian and big-endian memory accessors, available with and without softmmu. It also provides an IO_MEM_REVERSE TLB flag to allow future support of per-page endianness control, which is required by some target CPU emulations. Having reverse-endian memory accessors also makes it possible to optimise reverse-endian memory accesses when the target CPU has dedicated instructions. For now, it includes optimisations for the PowerPC target. This breaks Sparc32 softmmu, I get a black screen. Your changes to target-sparc and hw/sun4m.c look fine, so the problem could be in IO? Did it work before my commits? I may have done something wrong during the merge... I will do more checks and more tests... -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] RFC: reverse-endian softmmu memory accessors
On Mon, 2007-10-15 at 19:02 +0300, Blue Swirl wrote: On 10/15/07, J. Mayer [EMAIL PROTECTED] wrote: On Sun, 2007-10-14 at 15:59 +0300, Blue Swirl wrote: On 10/14/07, J. Mayer [EMAIL PROTECTED] wrote: Here's an updated version of the patch against current CVS. This patch provides reverse-endian, little-endian and big-endian memory accessors, available with and without softmmu. It also provides an IO_MEM_REVERSE TLB flag to allow future support of per-page endianness control, which is required by some target CPU emulations. Having reverse-endian memory accessors also makes it possible to optimise reverse-endian memory accesses when the target CPU has dedicated instructions. For now, it includes optimisations for the PowerPC target. This breaks Sparc32 softmmu, I get a black screen. Your changes to target-sparc and hw/sun4m.c look fine, so the problem could be in IO? Did it work before my commits? I may have done something wrong during the merge... I will do more checks and more tests... If I disable the IOSWAP code, black screen is gone. I think this is logical: the io accessors return host CPU values, therefore no byte swapping need to be performed. Memory-mapped I/O access functions are expected to return data in the target endianness. This is the reason why there are so many #ifdef TARGET_WORDS_BIGENDIAN in the emulated devices' memory-mapped access routines, and also in the io_read and io_write functions for 64-bit accesses. And the emulated CPU expects data to always come in its own endianness when doing a load from memory, even if the access is a device access. Your patch works as long as you use neither load/store with reverse-endian accessor routines nor a TLB entry with the reverse-endian bit set. On PowerPC, when using reverse-endian loads and stores, the byteswap in the I/O routines is needed if most MMIO devices (like IDE, which always returns little-endian data) are ever to be accessed. The bug you report just means there's a logic error somewhere in my code.
I downloaded the Sparc test and was able to reproduce it. I'm working on finding the bug. And I finally found it. The bug is just that I did something completely stupid, defining IO_MEM_REVERSE as 3 instead of 4: it's obvious that it has to be a power of 2 to be combined with the other TB bits. I wonder how the PowerPC case was able to run with such a huge bug... My apologies. I'm going to do more tests with this fix and try to merge the sparc_reverse_endian in my code and repost an updated patch. -- J. Mayer [EMAIL PROTECTED] Never organized
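Why a flag value of 3 cannot be combined with other flag bits, while 4 can, is easy to show with a small sketch. The flag names and the values of the two "other" flags below are hypothetical and are not the real IO_MEM_* definitions.

```c
#include <stdint.h>

/* Hypothetical flag layout, for illustration only. */
#define SKETCH_FLAG_A        1u
#define SKETCH_FLAG_B        2u
#define SKETCH_REVERSE_BAD   3u   /* not a power of 2: overlaps A and B */
#define SKETCH_REVERSE_GOOD  4u   /* occupies its own bit */

/* True if exactly one bit is set in v. */
static int sketch_is_single_bit(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* True if every bit of flag is set in flags. */
static int sketch_has_flag(uint32_t flags, uint32_t flag)
{
    return (flags & flag) == flag;
}
```

With the value 3, an entry that merely has both of the other flags set looks as if the reverse flag were set too; with the value 4, the test is unambiguous.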
Re: [Qemu-devel] RFC: Code fetch optimisation
On Mon, 2007-10-15 at 17:01 +0100, Paul Brook wrote: +unsigned long phys_pc; +unsigned long phys_pc_start; These are ram offsets, not physical addresses. I recommend naming them as such to avoid confusion. Well, those are host addresses. Fabrice even suggested that I replace them with void * to prevent confusion, but I kept using unsigned long because the _p functions API does not use pointers. As those values are defined as phys_ram_base + offset, those are likely to be host addresses, not RAM offsets, and are used directly to dereference host pointers in the ldxxx_p functions. Did I miss something? You are correct, they are host addresses. I still think calling them phys_pc is confusing. It took me a while to convince myself that unsigned long was an appropriate type (ignoring 64-bit windows hosts for now). How about host_pc? It's OK for me. +/* Avoid softmmu access on next load */ +/* XXX: dont: phys PC is not correct anymore + * We could call get_phys_addr_code(env, pc); and remove the else + * condition, here. + */ +//*start_pc = phys_pc; The commented out code is completely bogus, please remove it. The comment is also somewhat misleading/incorrect. The else would still be required for accesses that span a page boundary. I guess trying to optimize this case by retrieving the physical address would not bring any gain, as in fact only the last translated instruction of a TB (so only a few code loads) may hit this case. VLE targets (x86, m68k) can translate almost a full page of instructions, and a page boundary can be anywhere within that block. Once we've spanned multiple pages there's no point stopping translation immediately. We may as well translate as many instructions as we can on the second page. I'd guess most TBs are much smaller than a page, so on average only a few instructions are going to come after the page boundary. This leads me to another reflection. For fixed-length encoding targets, we always stop translation when reaching a page boundary.
If we keep using the current model and we optimize the slow case, it would be possible to stop only if we cross two page boundaries during code translation, and it seems that this case is not likely to happen. If we keep the current behavior, we could remove the second page_addr element in the tb structure and maybe optimize parts of the tb management and invalidation code. I'd like to keep a comment here to show that it may not be a good idea (or may not be as simple as it seems at first sight) to try to do more optimisation here, but you're right, this comment is not correct. Agreed. The code itself looks ok, though I'd be surprised if it made a significant difference. We're always going to hit the fast-path TLB lookup case anyway. It seems that the code generated for the code fetch is much more efficient than the one we get when using the softmmu routines. But it's true we do not get any significant performance boost. As was previously mentioned, the idea of the patch is more 'don't do unneeded things during code translation' than a great performance improvement. OTOH it does make the code more complicated. I'm agnostic about whether this patch should be applied. I agree that this proposal was more an answer to a challenging idea that I received than a real need. The worst thing in this patch, imho, is that you need to increase 2 values each time you want to change the PC. This is likely to introduce bugs when someone forgets to increase one of the two. I was thinking of hiding the pc, host_pc and host_pc_start (and maybe also pc_start) in a structure and adding inline helpers:
* get_pc would return the current virtual PC, as needed by the jump and relative memory access functions.
* get_tb_len would return the difference between the virtual PC and the virtual pc_start, as is done at the end of the gen_intermediate_code functions.
* move_pc would add an offset to the virtual and the physical PC. This has to be target dependent, due to the special case for Sparc.
* update_phys_pc would be void for most targets, except for Sparc, where the phys_pc needs to be adjusted after the translation of each target instruction.
And maybe more, if needed. This structure could also contain target-specific information. To address the problem of the segment limit check reported by Fabrice Bellard, we could for example add the address of the next segment limit for the x86 target and add a target-specific check at the start of the ldx_code_p function. But I don't know much about the subtleties of segmentation on x86, so this idea may not be appropriate to solve this problem. -- J. Mayer [EMAIL PROTECTED] Never organized
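The structure and helpers proposed above might be sketched as follows. The type and function names are hypothetical (prefixed with sketch_ to make that clear), the field widths are assumptions, and the Sparc-specific update_phys_pc helper is left out.

```c
#include <stdint.h>

/* Hypothetical container for the values the translator currently has to
 * keep in sync by hand. */
typedef struct {
    uint64_t  pc;             /* current virtual PC */
    uint64_t  pc_start;       /* virtual PC at the start of the TB */
    uintptr_t host_pc;        /* host address matching pc */
    uintptr_t host_pc_start;  /* host address matching pc_start */
} SketchCodeCursor;

static inline void sketch_cursor_init(SketchCodeCursor *c,
                                      uint64_t pc, uintptr_t host_pc)
{
    c->pc = c->pc_start = pc;
    c->host_pc = c->host_pc_start = host_pc;
}

/* Current virtual PC, as needed for jumps and PC-relative accesses. */
static inline uint64_t sketch_get_pc(const SketchCodeCursor *c)
{
    return c->pc;
}

/* Size of the code translated so far. */
static inline uint64_t sketch_get_tb_len(const SketchCodeCursor *c)
{
    return c->pc - c->pc_start;
}

/* Advance the virtual and host PCs together, so that neither update
 * can be forgotten. */
static inline void sketch_move_pc(SketchCodeCursor *c, int64_t offset)
{
    c->pc += (uint64_t)offset;
    c->host_pc += (uintptr_t)offset;
}
```

The point of the design is that translators never touch the two PCs directly, so they cannot drift apart.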
Re: [Qemu-devel] RFC: reverse-endian softmmu memory accessors
On Sun, 2007-10-14 at 11:19 +0300, Blue Swirl wrote: On 10/14/07, J. Mayer [EMAIL PROTECTED] wrote: On Sat, 2007-10-13 at 16:17 +0200, J. Mayer wrote: On Sat, 2007-10-13 at 16:07 +0300, Blue Swirl wrote: On 10/13/07, J. Mayer [EMAIL PROTECTED] wrote: On Sat, 2007-10-13 at 13:47 +0300, Blue Swirl wrote: On 10/13/07, J. Mayer [EMAIL PROTECTED] wrote: The problem: some CPU architectures, namely PowerPC and maybe others, offer facilities to access memory or I/O in the reverse endianness, i.e. little-endian instead of big-endian for PowerPC, or provide instructions to make memory accesses in the reverse endianness. This is implemented as a global flag on some CPUs. This case is already handled by the PowerPC emulation but it is far from optimal. Some other implementations allow the OS to store a reverse-endian flag in the TLB or the segment descriptors, thus providing per-page or per-segment endianness control. This is mostly used to ease driver migration from a PC platform to PowerPC without taking any care of the device endianness in the driver code (yes, this is bad...). Nice, this may be useful for Sparc64. It has a global CPU flag for endianness, individual pages can be marked as reverse endian, and finally there are instructions that access memory in reverse endian. The end result is a XOR of all these reverses. Though I don't know if any of these features are used at all. I realized that I/O accesses for reverse-endian pages were not correct in the softmmu_template.h header. This new version fixes this. It also removes duplicated code in the case of unaligned accesses in a reverse-endian page. I think 64 bit access case is not handled correctly, but to solve that it would be nice to extend the current IO access system to 64 bits. I think that if it was previously correct, it should still be, but... I don't know how interesting having 64-bit I/O accesses is, as I don't know if there are real hw buses that have a 64-bit data path...
Here's another version taking care of your remark about the ldl memory accessors. * I replaced all ldl occurrences with ldul. * When TARGET_LONG_BITS == 64, I also added ldsl accessors, and I started using them in the PowerPC memory access micro-ops. The patch is therefore much more invasive than the previous ones. As far as I can tell, this still does not break the PowerPC or i386 targets. Here's a new version. The only change is that, for consistency, I added the big-endian and little-endian accessors that were documented in cpu-all.h as unimplemented. The implementation is quite trivial, having native and reverse-endian accessors available, and changes nothing functionally compared to the previous version. The patch does not apply anymore. The Sparc part looks OK. The benefits from the patch can be gained by mapping Sparc64 lduw and ldsw in op_mem.h directly to ldul and ldsl using SPARC_LD_OP and replacing the ldl+bswap etc. for the LE cases with ldlr in op_helper.c. If you prefer, I can do this after you have applied the patch. Yes, there are conflicts between this patch and the mmu_idx one I just committed. I will regenerate an updated diff in the hours to come, after I have finished committing the PowerPC fixes and improvements I have waiting in stock. As for the Sparc improvements, since I merged the PowerPC improvements into the patch, I think it can be a good idea to include them directly in the patch. I'm also wondering if it would not be a good idea to define lduq/ldsq even if they in fact do exactly what ldq does now, just to have a fully consistent API. -- J. Mayer [EMAIL PROTECTED] Never organized
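As a rough illustration of the accessors discussed in this thread, a reverse-endian 32-bit load can be built from a native load plus a byte swap, and the "XOR of all these reverses" mentioned for Sparc64 decides which one to use. The names below are hypothetical and are not the accessors from the patch.

```c
#include <stdint.h>
#include <string.h>

static inline uint32_t sketch_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Load 32 bits in host byte order (memcpy avoids alignment issues). */
static inline uint32_t sketch_ldul_native(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Load 32 bits in the opposite byte order. */
static inline uint32_t sketch_ldul_reverse(const void *p)
{
    return sketch_bswap32(sketch_ldul_native(p));
}

/* A CPU-wide flag, a per-page flag and a per-instruction flag each
 * reverse the access; two reversals cancel out. */
static inline int sketch_is_reversed(int cpu_flag, int page_flag, int insn_flag)
{
    return (!!cpu_flag) ^ (!!page_flag) ^ (!!insn_flag);
}

static inline uint32_t sketch_ldul(const void *p, int cpu_flag,
                                   int page_flag, int insn_flag)
{
    return sketch_is_reversed(cpu_flag, page_flag, insn_flag)
        ? sketch_ldul_reverse(p)
        : sketch_ldul_native(p);
}
```

A target with dedicated reverse-endian instructions could call the reverse accessor directly instead of emitting a load followed by a separate swap.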
Re: [Qemu-devel] qemu alpha?
On Sun, 2007-10-14 at 11:19 +0200, Oliver Falk wrote: Hi list! Hi you! Just wanted to know how far the progress on alpha target is? I would be happy if I have some 'virtual alpha' to test new isos. If I can help some way (I have a few alphas around). Let me know. I'm happy to see someone interested in improving Alpha support, which is very alpha for now! The current status is that alpha-linux-user is able to launch some executables. Unfortunately, all the ones I have crash after some time, probably because the unique register is not initialized properly. And I have no Alpha machine to compare against to see what's going wrong in the emulation code. The softmmu Alpha support is far from being usable, for many reasons. First of all, I started to code a host-side PALcode emulation but it still needs a lot of work before being usable. The second thing is that I don't really know which CPU model is the best to emulate first, and I don't have a precise specification of the hardware platform to be able to code the needed hw/alpha.c file. If you feel like helping to develop the Alpha target support, I think the first goal would be to work on the linux-user mode, which mostly needs debugging and bugfixes. Once the core CPU is validated, the softmmu support could be done too. I can of course help you with this if you need advice or support, having started the target as an invitation for Alpha specialists to make it really usable. So your help and knowledge are welcome here! -- J. Mayer [EMAIL PROTECTED] Never organized
[Qemu-devel] RFC: Code fetch optimisation
Here's an updated version of the code fetch optimisation patch against current CVS. As a reminder, this patch avoids the use of softmmu helpers to fetch the code in most cases. A new target define, TARGET_HAS_VLE_INSNS, has been added; it is used to handle the case of an instruction that spans 2 pages, when the target CPU uses a variable-length instruction encoding. For pure RISC, the code fetch is done using raw access routines.
-- J. Mayer [EMAIL PROTECTED] Never organized

Index: cpu-all.h
===================================================================
RCS file: /sources/qemu/qemu/cpu-all.h,v
retrieving revision 1.76
diff -u -d -d -p -r1.76 cpu-all.h
--- cpu-all.h 23 Sep 2007 15:28:03 - 1.76
+++ cpu-all.h 14 Oct 2007 11:35:53 -
@@ -646,6 +646,13 @@ static inline void stfq_be_p(void *ptr,
 #define ldl_code(p) ldl_raw(p)
 #define ldq_code(p) ldq_raw(p)
 
+#define ldub_code_p(sp, pp, p) ldub_raw(p)
+#define ldsb_code_p(sp, pp, p) ldsb_raw(p)
+#define lduw_code_p(sp, pp, p) lduw_raw(p)
+#define ldsw_code_p(sp, pp, p) ldsw_raw(p)
+#define ldl_code_p(sp, pp, p) ldl_raw(p)
+#define ldq_code_p(sp, pp, p) ldq_raw(p)
+
 #define ldub_kernel(p) ldub_raw(p)
 #define ldsb_kernel(p) ldsb_raw(p)
 #define lduw_kernel(p) lduw_raw(p)
Index: cpu-exec.c
===================================================================
RCS file: /sources/qemu/qemu/cpu-exec.c,v
retrieving revision 1.120
diff -u -d -d -p -r1.120 cpu-exec.c
--- cpu-exec.c 14 Oct 2007 07:07:04 - 1.120
+++ cpu-exec.c 14 Oct 2007 11:35:53 -
@@ -133,6 +133,7 @@ static TranslationBlock *tb_find_slow(ta
     tb->tc_ptr = tc_ptr;
     tb->cs_base = cs_base;
     tb->flags = flags;
+    tb->page_addr[0] = phys_page1;
     cpu_gen_code(env, tb, CODE_GEN_MAX_SIZE, &code_gen_size);
     code_gen_ptr = (void *)(((unsigned long)code_gen_ptr + code_gen_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
Index: softmmu_header.h
===================================================================
RCS file: /sources/qemu/qemu/softmmu_header.h,v
retrieving revision 1.18
diff -u -d -d -p -r1.18 softmmu_header.h
--- softmmu_header.h 14 Oct 2007 07:07:05 - 1.18
+++ softmmu_header.h 14 Oct 2007 11:35:53 -
@@ -289,6 +289,68 @@ static inline void glue(glue(st, SUFFIX)
     }
 }
 
+#else
+
+#if DATA_SIZE <= 2
+static inline RES_TYPE glue(glue(glue(lds,SUFFIX),MEMSUFFIX),_p)(unsigned long *start_pc,
+                                                                 unsigned long phys_pc,
+                                                                 target_ulong virt_pc)
+{
+    RES_TYPE opc;
+
+    /* XXX: Target executing code from MMIO ares is not supported for now */
+#if defined(TARGET_HAS_VLE_INSNS) /* || defined(TARGET_MMIO_CODE) */
+    if (unlikely((*start_pc ^
+                  (phys_pc + sizeof(RES_TYPE) - 1)) >> TARGET_PAGE_BITS)) {
+        /* Slow path: phys_pc is not in the same page than start_pc
+         *            or the insn is spanning two pages
+         */
+        opc = glue(glue(lds,SUFFIX),MEMSUFFIX)(virt_pc);
+        /* Avoid softmmu access on next load */
+        /* XXX: dont: phys PC is not correct anymore
+         *      We could call get_phys_addr_code(env, pc); and remove the else
+         *      condition, here.
+         */
+        //*start_pc = phys_pc;
+    } else
+#endif
+    {
+        opc = glue(glue(lds,SUFFIX),_raw)(phys_pc);
+    }
+
+    return opc;
+}
+#endif
+
+static inline RES_TYPE glue(glue(glue(ld,USUFFIX),MEMSUFFIX),_p)(unsigned long *start_pc,
+                                                                 unsigned long phys_pc,
+                                                                 target_ulong virt_pc)
+{
+    RES_TYPE opc;
+
+    /* XXX: Target executing code from MMIO ares is not supported for now */
+#if defined(TARGET_HAS_VLE_INSNS) /* || defined(TARGET_MMIO_CODE) */
+    if (unlikely((*start_pc ^
+                  (phys_pc + sizeof(RES_TYPE) - 1)) >> TARGET_PAGE_BITS)) {
+        /* Slow path: phys_pc is not in the same page than start_pc
+         *            or the insn is spanning two pages
+         */
+        opc = glue(glue(ld,USUFFIX),MEMSUFFIX)(virt_pc);
+        /* Avoid softmmu access on next load */
+        /* XXX: dont: phys PC is not correct anymore
+         *      We could call get_phys_addr_code(env, pc); and remove the else
+         *      condition, here.
+         */
+        //*start_pc = phys_pc;
+    } else
+#endif
+    {
+        opc = glue(glue(ld,USUFFIX),_raw)(phys_pc);
+    }
+
+    return opc;
+}
+
 #endif /* ACCESS_TYPE != (NB_MMU_MODES + 1) */
 
 #endif /* !asm */
Index: target-alpha/translate.c
===================================================================
RCS file: /sources/qemu/qemu/target-alpha/translate.c,v
retrieving revision 1.6
diff -u -d -d -p -r1.6 translate.c
--- target-alpha/translate.c 14 Oct 2007 08:50:17 - 1.6
+++ target-alpha/translate.c 14 Oct 2007 11:35:54 -
@@ -1965,6 +1965,7 @@ int gen_intermediate_code_internal (CPUS
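The test that selects the slow path in the patch above can be looked at in isolation. The sketch below is a standalone restatement under assumptions (4 KiB pages, a hypothetical function name, and a right shift by the page bits as the comparison); it is not the code from softmmu_header.h.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_BITS 12   /* assumed 4 KiB pages */

/* Returns non-zero when the last byte of a fetch of `size` bytes at
 * host_pc lies on a different page than start_pc. XOR-ing the two
 * addresses and shifting out the in-page offset leaves only the bits
 * in which the page numbers differ. */
static inline int sketch_fetch_needs_slow_path(uintptr_t start_pc,
                                               uintptr_t host_pc,
                                               size_t size)
{
    return ((start_pc ^ (host_pc + size - 1)) >> SKETCH_PAGE_BITS) != 0;
}
```

A 4-byte fetch ending exactly at the page end stays on the fast path, while one that starts two bytes before the end does not.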
[Qemu-devel] Qemu build dependencies
Following the discussion initiated last week about Qemu build dependencies, I propose to include the attached patch (or the one that was previously proposed, which was very close to this one). Please raise any objections or suggestions for improvement.
-- J. Mayer [EMAIL PROTECTED] Never organized

Index: Makefile.target
===================================================================
RCS file: /sources/qemu/qemu/Makefile.target,v
retrieving revision 1.209
diff -u -d -d -p -r1.209 Makefile.target
--- Makefile.target 14 Oct 2007 08:38:29 - 1.209
+++ Makefile.target 14 Oct 2007 10:25:11 -
@@ -24,7 +24,7 @@ TARGET_BASE_ARCH:=sparc
 endif
 TARGET_PATH=$(SRC_PATH)/target-$(TARGET_BASE_ARCH)
 VPATH=$(SRC_PATH):$(TARGET_PATH):$(SRC_PATH)/hw:$(SRC_PATH)/audio
-CPPFLAGS=-I. -I.. -I$(TARGET_PATH) -I$(SRC_PATH)
+CPPFLAGS=-I. -I.. -I$(TARGET_PATH) -I$(SRC_PATH) -MMD -MP
 ifdef CONFIG_DARWIN_USER
 VPATH+=:$(SRC_PATH)/darwin-user
 CPPFLAGS+=-I$(SRC_PATH)/darwin-user -I$(SRC_PATH)/darwin-user/$(TARGET_ARCH)
@@ -636,58 +642,6 @@ cpu-exec.o: cpu-exec.c
 signal.o: signal.c
 	$(CC) $(HELPER_CFLAGS) $(CPPFLAGS) $(BASE_CFLAGS) -c -o $@ $<
 
-vga.o: pixel_ops.h
-
-tcx.o: pixel_ops.h
-
-ifeq ($(TARGET_BASE_ARCH), i386)
-op.o: op.c opreg_template.h ops_template.h ops_template_mem.h ops_mem.h ops_sse.h
-endif
-
-ifeq ($(TARGET_ARCH), arm)
-op.o: op.c op_template.h
-pl110.o: pl110_template.h
-endif
-
-ifeq ($(TARGET_BASE_ARCH), sparc)
-helper.o: cpu.h exec-all.h
-op.o: op.c op_template.h op_mem.h fop_template.h fbranch_template.h exec.h cpu.h
-op_helper.o: exec.h softmmu_template.h cpu.h
-translate.o: cpu.h exec-all.h disas.h
-endif
-
-ifeq ($(TARGET_BASE_ARCH), ppc)
-op.o: op.c op_template.h op_mem.h op_helper.h
-op_helper.o: op_helper.c mfrom_table.c op_helper_mem.h op_helper.h
-translate.o: translate.c translate_init.c
-endif
-
-ifeq ($(TARGET_BASE_ARCH), mips)
-helper.o: cpu.h exec-all.h
-op.o: op_template.c fop_template.c op_mem.c exec.h cpu.h
-op_helper.o: exec.h softmmu_template.h cpu.h
-translate.o: translate_init.c exec-all.h disas.h
-endif
-
-loader.o: loader.c elf_ops.h
-
-ifeq ($(TARGET_ARCH), sh4)
-op.o: op.c op_mem.c cpu.h
-op_helper.o: op_helper.c exec.h cpu.h
-helper.o: helper.c exec.h cpu.h
-sh7750.o: sh7750.c sh7750_regs.h sh7750_regnames.h cpu.h
-shix.o: shix.c sh7750_regs.h sh7750_regnames.h
-sh7750_regnames.o: sh7750_regnames.c sh7750_regnames.h sh7750_regs.h
-tc58128.o: tc58128.c
-endif
-
-ifeq ($(TARGET_BASE_ARCH), alpha)
-op.o: op.c op_template.h op_mem.h
-op_helper.o: op_helper_mem.h
-endif
-
-$(OBJS) $(LIBOBJS) $(VL_OBJS): config.h ../config-host.h
-
 %.o: %.c
 	$(CC) $(CFLAGS) $(CPPFLAGS) $(BASE_CFLAGS) -c -o $@ $<
 
@@ -695,7 +649,8 @@ $(OBJS) $(LIBOBJS) $(VL_OBJS): config.h
 	$(CC) $(CPPFLAGS) -c -o $@ $<
 
 clean:
-	rm -f *.o *.a *~ $(PROGS) gen-op.h opc.h op.h nwfpe/*.o slirp/*.o fpu/*.o
+	rm -f *.o *.a *~ $(PROGS) gen-op.h opc.h op.h nwfpe/*.o slirp/*.o fpu/*.o
+	rm -f *.d */*.d
 
 install: all
 ifneq ($(PROGS),)
@@ -711,3 +666,6 @@ audio.o sdlaudio.o dsoundaudio.o ossaudi
 fmodaudio.o alsaaudio.o mixeng.o sb16.o es1370.o gus.o adlib.o: \
 CFLAGS := $(CFLAGS) -Wall -Werror -W -Wsign-compare
 endif
+
+# Include automatically generated dependency files
+-include $(wildcard *.d */*.d)
Re: [Fwd: Re: [Qemu-devel] RFC: Code fetch optimisation]
On Sat, 2007-10-13 at 10:11 +0300, Blue Swirl wrote: On 10/13/07, J. Mayer [EMAIL PROTECTED] wrote: Forwarded Message From: Jocelyn Mayer [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED], qemu-devel@nongnu.org To: qemu-devel@nongnu.org Subject: Re: [Qemu-devel] RFC: Code fetch optimisation Date: Fri, 12 Oct 2007 20:24:44 +0200 On Fri, 2007-10-12 at 18:21 +0300, Blue Swirl wrote: On 10/12/07, J. Mayer [EMAIL PROTECTED] wrote: Here's a small patch that allow an optimisation for code fetch, at least for RISC CPU targets, as suggested by Fabrice Bellard. The main idea is that a translated block is never to span over a page boundary. As the tb_find_slow routine already gets the physical address of the page of code to be translated, the code translator could then fetch the code using raw host memory accesses instead of doing it through the softmmu routines. This patch could also be adapted to RISC CPU targets, with care for the last instruction of a page. For now, I did implement it for alpha, arm, mips, PowerPC and SH4. I don't actually know if the optimsation would bring a sensible speed gain or if it will be absolutelly marginal. Please comment. This will not work correctly for execution of MMIO registers, but maybe that won't work on real hardware either. Who cares. I wonder if this is important or not... But maybe, when retrieving the physical address we could check if it is inside ROM/RAM or an I/O area and in the last case do not give the phys_addr information to the translator. In that case, it would go on using the ldxx_code. I guess if we want to do that, a set of helpers would be appreciated to avoid adding code like: if (phys_pc == 0) opc = ldul_code(virt_pc) else opc = ldul_raw(phys_pc) everywhere... I could also add another check so this set of macro would automatically use ldxx_code if we reach a page boundary, which would then make easy to use this optimisation for CISC/VLE architectures too. 
I'm not sure of the proper solution to allow executing code from MMIO devices. But adding specific accessors to handle the CISC/VLE case remains to be done. [...]

I did update my patch along these lines and it's now able to run x86 and PowerPC targets. PowerPC is the easy case, x86 is maybe the worst... Well, I'm not really sure of what I've done for Sparc, but other targets should be safe.

It broke Sparc, delay slot handling makes things complicated. The updated patch passes my tests.

OK. I will take a look at how you solved this issue.

For extra performance, I bypassed the ldl_code_p. On Sparc, instructions can't be split between two pages. Isn't translation always contained to the same page for all targets like Sparc?

Yes, for RISC targets running in 32-bit mode, we always stop translation when we reach the end of a code page. The problem comes with CISC architectures, like x86 or m68k, or RISC architectures running 16/32-bit code, like ARM in Thumb mode or PowerPC in VLE mode. In all those cases, there can be instructions spanning two pages, so we need the ldx_code_p functions. My idea in always using the ldx_code_p functions is that we may get the chance to make them more clever and have the slow case handle code execution in MMIO areas, when that becomes possible.

--
J. Mayer [EMAIL PROTECTED]
Never organized
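The fast-path/slow-path split discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: FetchCtx, fetch_insn32, slow_fetch32 and the TOY_PAGE_* constants are hypothetical names, the page size is assumed to be 4 KiB, and the slow path is a stub standing in for a softmmu access such as ldl_code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_BITS 12                          /* assumed 4 KiB pages */
#define TOY_PAGE_SIZE (1u << TOY_PAGE_BITS)
#define TOY_PAGE_MASK (~(uint32_t)(TOY_PAGE_SIZE - 1))

/* Hypothetical translator state: host pointer to the page being
 * translated, plus the guest virtual address of that page. */
typedef struct {
    const uint8_t *host_page;
    uint32_t virt_page;      /* must be page aligned */
    unsigned slow_count;     /* number of slow-path fetches, for the demo */
} FetchCtx;

/* Stub for the slow path; a real emulator would go through the softmmu. */
static uint32_t slow_fetch32(uint32_t virt_pc)
{
    (void)virt_pc;
    return 0xdeadbeefu;
}

/* Fetch a 32-bit instruction (host byte order): raw host access when the
 * whole instruction lies inside the current page, slow path otherwise. */
static uint32_t fetch_insn32(FetchCtx *ctx, uint32_t virt_pc)
{
    uint32_t last = virt_pc + (uint32_t)sizeof(uint32_t) - 1;
    uint32_t insn;

    assert((ctx->virt_page & ~TOY_PAGE_MASK) == 0);
    if (((virt_pc ^ ctx->virt_page) & TOY_PAGE_MASK) != 0 ||
        ((last ^ ctx->virt_page) & TOY_PAGE_MASK) != 0) {
        /* Different page, or the instruction spans two pages. */
        ctx->slow_count++;
        return slow_fetch32(virt_pc);
    }
    memcpy(&insn, ctx->host_page + (virt_pc & ~TOY_PAGE_MASK), sizeof(insn));
    return insn;
}
```

Keeping the page check inside a single helper is what would let CISC or mixed 16/32-bit targets share the fast path without repeating the test at every fetch site.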
Re: [Qemu-devel] RFC: avoid #ifdef for target cpu list - for x86, too.
On Fri, 2007-10-12 at 10:54 +0200, Dan Kenigsberg wrote:

This seems like a good excuse to send my suggested -cpu option for the x86 target. It is just like my previous take 4, but fits the newly unified cpu_list.

I don't know x86 well enough to comment on the x86 CPU definitions, but having this option for x86 too is welcome... [...]

--
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] RFC: cleanups: CPU_MEM_INDEX
On Fri, 2007-10-12 at 09:01 +0200, J. Mayer wrote:
On Thu, 2007-10-11 at 14:09 +0200, J. Mayer wrote:
On Wed, 2007-10-10 at 07:06 +0200, J. Mayer wrote:
On Wed, 2007-10-10 at 01:12 +0100, Thiemo Seufer wrote:
J. Mayer wrote:

Here's a proposal to add an int cpu_mem_index (CPUState *env) function in each target's cpu.h header. The idea of this patch is:
- avoid many #ifdef TARGET_xxx in exec-all.h and softmmu_header.h, making the code more readable
- avoid multiple implementations of the same code (3, in this particular case), to avoid potential conflicts if the definition has to be updated for any reason (i.e. support for new memory access modes, emulation optimisation...)
Please comment.
-- J. Mayer [EMAIL PROTECTED] Never organized
[...]

Here's an updated version of the patch. My comments about it stay valid, with two additions:
1/ when is_user is needed to maintain compatibility with existing code, I now define it as:
    int is_user = mmu_idx == MMU_USER_IDX;
instead of just is_user = mmu_idx. This definition will then remain correct even if the definitions of the MMU modes are later changed for a specific target.
2/ I now precompute the mmu_idx on the PowerPC platform as it can never change inside a single TB. This may save a few instructions for every memory access. I guess the same optimisation can be made for the other targets but, not knowing exactly when it would have to be recomputed for most targets, I prefer not to do this optimisation myself.

Here's another update, taking care of the commit I just made (which changes some of the target-xxx/cpu.h files). It also fixes an issue in softmmu_header.h; this was missing:
 #if (DATA_SIZE <= 4) && (TARGET_LONG_BITS == 32) && defined(__i386__) && \
-(ACCESS_TYPE <= 1) && defined(ASM_SOFTMMU)
+(ACCESS_TYPE < NB_MMU_MODES) && defined(ASM_SOFTMMU)
 #define CPU_TLB_ENTRY_BITS 4
As this affects only the i386 target, which is defined with 2 MMU modes, the omission had no run-time consequence but was still a bug.
I presume cpu_mem_index is supposed to do more than checking for usermode. In that case, is_user should get renamed, and the cpu_mem_index implementation of some (most?) CPUs should have a FIXME comment as reminder to implement the missing MMU modes.

You're right, calling this variable is_user is only valid because this code assumes it knows what cpu_mem_index means. For targets with more than 2 modes of execution, this is not correct. My first idea was to try not to change the code too much. After thinking more about the problem, it appears to me that:
1/ in the softmmu routines, we should make no assumption about the meaning of the memory index
2/ so the softmmu routines should use an index, and all exported interfaces (i.e. tlb_fill and handle_mmu_fault) should take an index instead of is_user as an argument.
3/ to maintain compatibility with the existing code, I chose to add an is_user variable inside most handle_mmu_fault implementations, initialized from the value of the given index, which is then passed to the target MMU translation routines.
4/ to ease implementation of targets with more than 2 execution modes, I chose to define a per-target NB_MMU_MODES in each target_xxx/cpu.h (instead of the pre-existing hack for PowerPC 64 and Alpha) and add a local definition of the meaning of each mmu_index value. For PowerPC, I chose to use the same convention as in translate.c, which seems more logical to me: 0 = user, 1 = supervisor, 2 = hypervisor.
5/ to avoid confusion between the memory index used in the translation context, which may contain more than the access mode information, and the one used by the softmmu routines, I chose to name the one used in softmmu 'mmu_idx' (the one in target_xxx/translate.c is called mem_idx).
6/ I chose to add a constant MMU_USER_IDX, which is used in the user-mode handle_cpu_signal routine, thus addressing your first remark.
This patch solves a problem I had no solution to until today: how to add new MMU modes (i.e. hypervisor for PowerPC 64, supervisor and executive for Alpha) for some specific targets. The result is a much more invasive patch, but it is supposed (!) to:
1/ not change the behavior of the current target implementations
2/ be less hardcoded, more flexible and extensible for any specific target's requirements.
As this new version of the patch could badly break the softmmu mode and I have no way to properly check all targets, I would greatly appreciate it if some people could run tests for the arm, cris, m68k, mips, sh4 and sparc targets. For now, I did a few tests, running Linux (debian hdd installation) for PowerPC on PPC 603 750 in 32 bits and 64 bits mode emulation on x86 and amd64
[Qemu-devel] RFC: Code fetch optimisation
Here's a small patch that allow an optimisation for code fetch, at least for RISC CPU targets, as suggested by Fabrice Bellard. The main idea is that a translated block is never to span over a page boundary. As the tb_find_slow routine already gets the physical address of the page of code to be translated, the code translator could then fetch the code using raw host memory accesses instead of doing it through the softmmu routines. This patch could also be adapted to RISC CPU targets, with care for the last instruction of a page. For now, I did implement it for alpha, arm, mips, PowerPC and SH4. I don't actually know if the optimsation would bring a sensible speed gain or if it will be absolutelly marginal. Please comment. -- J. Mayer [EMAIL PROTECTED] Never organized Index: cpu-exec.c === RCS file: /sources/qemu/qemu/cpu-exec.c,v retrieving revision 1.119 diff -u -d -d -p -r1.119 cpu-exec.c --- cpu-exec.c 8 Oct 2007 13:16:13 - 1.119 +++ cpu-exec.c 12 Oct 2007 07:14:43 - @@ -133,6 +133,7 @@ static TranslationBlock *tb_find_slow(ta tb-tc_ptr = tc_ptr; tb-cs_base = cs_base; tb-flags = flags; +tb-page_addr[0] = phys_page1; cpu_gen_code(env, tb, CODE_GEN_MAX_SIZE, code_gen_size); code_gen_ptr = (void *)(((unsigned long)code_gen_ptr + code_gen_size + CODE_GEN_ALIGN - 1) ~(CODE_GEN_ALIGN - 1)); Index: target-alpha/translate.c === RCS file: /sources/qemu/qemu/target-alpha/translate.c,v retrieving revision 1.5 diff -u -d -d -p -r1.5 translate.c --- target-alpha/translate.c 16 Sep 2007 21:08:01 - 1.5 +++ target-alpha/translate.c 12 Oct 2007 07:14:47 - @@ -1966,12 +1966,15 @@ int gen_intermediate_code_internal (CPUS #endif DisasContext ctx, *ctxp = ctx; target_ulong pc_start; +unsigned long phys_pc; uint32_t insn; uint16_t *gen_opc_end; int j, lj = -1; int ret; pc_start = tb-pc; +phys_pc = (unsigned long)phys_ram_base + tb-page_addr[0] + +(pc_start ~TARGET_PAGE_MASK); gen_opc_ptr = gen_opc_buf; gen_opc_end = gen_opc_buf + OPC_MAX_SIZE; gen_opparam_ptr = gen_opparam_buf; @@ 
-2010,7 +2013,7 @@ int gen_intermediate_code_internal (CPUS ctx.pc, ctx.mem_idx); } #endif -insn = ldl_code(ctx.pc); +insn = ldl_raw(phys_pc); #if defined ALPHA_DEBUG_DISAS insn_count++; if (logfile != NULL) { @@ -2018,6 +2021,7 @@ int gen_intermediate_code_internal (CPUS } #endif ctx.pc += 4; +phys_pc += 4; ret = translate_one(ctxp, insn); if (ret != 0) break; Index: target-arm/translate.c === RCS file: /sources/qemu/qemu/target-arm/translate.c,v retrieving revision 1.57 diff -u -d -d -p -r1.57 translate.c --- target-arm/translate.c 17 Sep 2007 08:09:51 - 1.57 +++ target-arm/translate.c 12 Oct 2007 07:14:47 - @@ -38,6 +38,7 @@ /* internal defines */ typedef struct DisasContext { target_ulong pc; +unsigned long phys_pc; int is_jmp; /* Nonzero if this instruction has been conditionally skipped. */ int condjmp; @@ -2206,8 +2207,9 @@ static void disas_arm_insn(CPUState * en { unsigned int cond, insn, val, op1, i, shift, rm, rs, rn, rd, sh; -insn = ldl_code(s-pc); +insn = ldl_raw(s-phys_pc); s-pc += 4; +s-phys_pc += 4; cond = insn 28; if (cond == 0xf){ @@ -2971,8 +2973,9 @@ static void disas_thumb_insn(DisasContex int32_t offset; int i; -insn = lduw_code(s-pc); +insn = lduw_raw(s-phys_pc); s-pc += 2; +s-phys_pc += 2; switch (insn 12) { case 0: case 1: @@ -3494,7 +3497,7 @@ static void disas_thumb_insn(DisasContex break; } offset = ((int32_t)insn 21) 10; -insn = lduw_code(s-pc); +insn = lduw_raw(s-phys_pc); offset |= insn 0x7ff; val = (uint32_t)s-pc + 2; @@ -3544,6 +3547,8 @@ static inline int gen_intermediate_code_ dc-is_jmp = DISAS_NEXT; dc-pc = pc_start; +dc-phys_pc = (unsigned long)phys_ram_base + tb-page_addr[0] + +(pc_start ~TARGET_PAGE_MASK); dc-singlestep_enabled = env-singlestep_enabled; dc-condjmp = 0; dc-thumb = env-thumb; Index: target-mips/translate.c === RCS file: /sources/qemu/qemu/target-mips/translate.c,v retrieving revision 1.106 diff -u -d -d -p -r1.106 translate.c --- target-mips/translate.c 9 Oct 2007 03:39:58 - 1.106 +++ target-mips/translate.c 12 
Oct 2007 07:14:48 - @@ -6483,6 +6483,7 @@ gen_intermediate_code_internal (CPUState { DisasContext ctx; target_ulong pc_start; +unsigned long phys_pc; uint16_t *gen_opc_end; int j, lj = -1; @@ -6490,6 +6491,8 @@ gen_intermediate_code_internal
Re: [Qemu-devel] Unable to Run Gprof Successfully on QEMU
On Fri, 2007-10-12 at 01:00 -0700, Atoosaah S wrote:

I'd appreciate any input on how to run gprof successfully on qemu. I'm new to gprof and am probably missing some steps. I successfully ran gprof on a sorting program available online, then I attempted to run gprof on qemu, but was unsuccessful. Here are the steps I take. My OS is Linux, my qemu version is 0.8.2. I configure qemu with the options
    configure --prefix=/install_path --enable-gprof
Then I make and make install. I run qemu successfully using the options
    /install_path/qemu -hda diskimage.img -m 256
which results in the gmon.out file. My run of qemu involved starting the image (virtual Linux OS), running a few simple commands and shutting the image down. Finally, I run
    gprof /install_path/qemu gmon.out > result.txt
which gives the error:
    gprof: file 'qemu' has no symbols
Are there any other configuration options required? Should the image be run differently?

You need a qemu executable with debugging symbols. Distributed versions are usually stripped, which means the debug symbols are not present anymore. A way to get the debug symbols is to fetch the source and recompile it...

--
J. Mayer [EMAIL PROTECTED]
Never organized
[Fwd: Re: [Qemu-devel] RFC: Code fetch optimisation]
Forwarded Message
From: Jocelyn Mayer [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED], qemu-devel@nongnu.org
To: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] RFC: Code fetch optimisation
Date: Fri, 12 Oct 2007 20:24:44 +0200

On Fri, 2007-10-12 at 18:21 +0300, Blue Swirl wrote:
On 10/12/07, J. Mayer [EMAIL PROTECTED] wrote:

Here's a small patch that allows an optimisation for code fetch, at least for RISC CPU targets, as suggested by Fabrice Bellard. The main idea is that a translated block must never span a page boundary. As the tb_find_slow routine already gets the physical address of the page of code to be translated, the code translator could then fetch the code using raw host memory accesses instead of going through the softmmu routines. This patch could also be adapted to RISC CPU targets, with care for the last instruction of a page. For now, I have implemented it for alpha, arm, mips, PowerPC and SH4. I don't actually know whether the optimisation would bring a noticeable speed gain or whether it will be absolutely marginal. Please comment.

This will not work correctly for execution of MMIO registers, but maybe that won't work on real hardware either. Who cares.

I wonder if this is important or not... But maybe, when retrieving the physical address, we could check whether it is inside ROM/RAM or an I/O area and, in the latter case, not give the phys_addr information to the translator. In that case, it would go on using the ldxx_code.

I guess if we want to do that, a set of helpers would be appreciated to avoid adding code like:
    if (phys_pc == 0) opc = ldul_code(virt_pc) else opc = ldul_raw(phys_pc)
everywhere...

I could also add another check so that this set of macros would automatically use ldxx_code if we reach a page boundary, which would then make it easy to use this optimisation for CISC/VLE architectures too. I'm not sure of the proper solution to allow executing code from MMIO devices. But adding specific accessors to handle the CISC/VLE case remains to be done. [...]
I did update my patch following this way and it's now able to run x86 and PowerPC targets. PowerPC is the easy case, x86 is maybe the worst... Well, I'm not really sure of what I've done for Sparc, but other targets should be safe. Please comment. -- J. Mayer [EMAIL PROTECTED] Never organized Index: cpu-all.h === RCS file: /sources/qemu/qemu/cpu-all.h,v retrieving revision 1.76 diff -u -d -d -p -r1.76 cpu-all.h --- cpu-all.h 23 Sep 2007 15:28:03 - 1.76 +++ cpu-all.h 12 Oct 2007 22:53:37 - @@ -646,6 +646,13 @@ static inline void stfq_be_p(void *ptr, #define ldl_code(p) ldl_raw(p) #define ldq_code(p) ldq_raw(p) +#define ldub_code_p(sp, pp, p) ldub_raw(p) +#define ldsb_code_p(sp, pp, p) ldsb_raw(p) +#define lduw_code_p(sp, pp, p) lduw_raw(p) +#define ldsw_code_p(sp, pp, p) ldsw_raw(p) +#define ldl_code_p(sp, pp, p) ldl_raw(p) +#define ldq_code_p(sp, pp, p) ldq_raw(p) + #define ldub_kernel(p) ldub_raw(p) #define ldsb_kernel(p) ldsb_raw(p) #define lduw_kernel(p) lduw_raw(p) Index: cpu-exec.c === RCS file: /sources/qemu/qemu/cpu-exec.c,v retrieving revision 1.119 diff -u -d -d -p -r1.119 cpu-exec.c --- cpu-exec.c 8 Oct 2007 13:16:13 - 1.119 +++ cpu-exec.c 12 Oct 2007 22:53:37 - @@ -133,6 +133,7 @@ static TranslationBlock *tb_find_slow(ta tb-tc_ptr = tc_ptr; tb-cs_base = cs_base; tb-flags = flags; +tb-page_addr[0] = phys_page1; cpu_gen_code(env, tb, CODE_GEN_MAX_SIZE, code_gen_size); code_gen_ptr = (void *)(((unsigned long)code_gen_ptr + code_gen_size + CODE_GEN_ALIGN - 1) ~(CODE_GEN_ALIGN - 1)); Index: softmmu_header.h === RCS file: /sources/qemu/qemu/softmmu_header.h,v retrieving revision 1.17 diff -u -d -d -p -r1.17 softmmu_header.h --- softmmu_header.h 8 Oct 2007 13:16:14 - 1.17 +++ softmmu_header.h 12 Oct 2007 22:53:37 - @@ -336,6 +336,60 @@ static inline void glue(glue(st, SUFFIX) } } +#else + +#if DATA_SIZE = 2 +static inline RES_TYPE glue(glue(glue(lds,SUFFIX),MEMSUFFIX),_p)(unsigned long *start_pc, + unsigned long phys_pc, + target_ulong virt_pc) +{ +RES_TYPE 
opc; + +if (unlikely((*start_pc ^ + (phys_pc + sizeof(RES_TYPE) - 1)) TARGET_PAGE_BITS)) { +/* Slow path: phys_pc is not in the same page than start_pc + *or the insn is spanning two pages + */ +opc = glue(glue(lds,SUFFIX),MEMSUFFIX)(virt_pc); +/* Avoid softmmu access on next load */ +/* XXX: dont: phys PC is not correct anymore + * We chould call get_phys_addr_code(env, pc); and remove the else
Re: [Qemu-devel] [PATCH] syscall_target_errno.patch
On Wed, 2007-10-10 at 21:38 -0600, Thayne Harbaugh wrote:

I appreciate the work that Jocelyn did to correct the types used throughout linux-user/syscall.c. Along those same lines I am working on several patches to eliminate some incorrect constructs that have crept into syscall.c - some of which I have ignorantly propagated in previous patches that I have submitted.

I have noticed that many functions in syscall.c return a *host* errno when a *target* errno should be returned. At the same time, there are several places in syscall.c:do_syscall() that immediately return an errno rather than setting the return value and exiting through the syscall return value reporting at the end of do_syscall(). This patch addresses both of those problems at once rather than touching the exact same errno return lines twice in do_syscall(). It also touches a few functions in linux-user/signal.c that are called from do_syscall().

Please send comments - I have several more patches that will build on this one as well as a few more patches that will fix other incorrect constructs with target/host address handling. Thanks.

Hi, there are still a lot of problems hidden in syscall.c and signal.c, as you noticed. Your patch seems OK to me, and adding all those comments is imho really great. My only remark is a cosmetic one: I don't much like hiding 'goto' in macros...

--
J. Mayer [EMAIL PROTECTED]
Never organized
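The two problems described above (host versus target errno, and early returns that bypass the common exit of do_syscall()) can be illustrated with a small sketch. It is not taken from the patch: the function names carry a _sketch suffix or are otherwise hypothetical, and the TARGET_* values are illustrative numbers for an imaginary guest ABI whose errno numbering differs from the host's.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative errno numbers for a hypothetical guest ABI. */
#define TARGET_EAGAIN  35
#define TARGET_EDEADLK 11

/* Translate a host errno into the hypothetical target numbering. */
static int host_to_target_errno_sketch(int host_err)
{
    assert(host_err >= 0);
    switch (host_err) {
    case EAGAIN:  return TARGET_EAGAIN;
    case EDEADLK: return TARGET_EDEADLK;
    default:      return host_err;  /* assume identical numbering otherwise */
    }
}

/* Turn a host call result (-1 plus errno) into -target_errno. */
static long get_target_result(long host_ret, int host_errno)
{
    if (host_ret == -1)
        return -(long)host_to_target_errno_sketch(host_errno);
    return host_ret;
}

/* Toy dispatcher: every case only sets ret, so all paths leave through
 * the single exit point where result reporting could be done. */
static long do_syscall_sketch(int num)
{
    long ret;

    switch (num) {
    case 0:  /* pretend the host call succeeded */
        ret = get_target_result(42, 0);
        break;
    case 1:  /* pretend the host call failed with EAGAIN */
        ret = get_target_result(-1, EAGAIN);
        break;
    default: /* unknown syscall */
        ret = -(long)host_to_target_errno_sketch(ENOSYS);
        break;
    }
    /* common exit: logging of ret would go here */
    return ret;
}
```

With a single exit point, the conversion to target errno values only has to be right in one helper instead of at every return statement.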
Re: [Qemu-devel] RFC: fix run of 32 bits Linux executables on 64 bits targets
On Thu, 2007-10-11 at 22:26 +0300, Blue Swirl wrote:
On 10/10/07, Fabrice Bellard [EMAIL PROTECTED] wrote:
Thiemo Seufer wrote:
Fabrice Bellard wrote:
J. Mayer wrote:

Following the patches done for elfload32, it appeared to me that there were still problems that would prevent 32-bit executables from running on 64-bit targets in linux user mode emulation. [...]

Are you sure it is a good idea to try to add 32 bit executable support to a 64 bit target? In the end you will need to write a 64 bit to 32 bit linux syscall converter, which would mean duplicating all the linux-user code of the corresponding 32 bit target (think of ioctls with structures, signal frames, etc...).

I would think this feature will be limited to platforms which can handle 32bit and 64bit binaries with a single personality.

I am not sure it is a common case! However, I suggest emulating a 32 bit user linux system with a 64 bit guest CPU running in 32 bit compatibility mode. It would be useful to test 64 bit CPUs in 32 bit compatibility mode. The only required modification in linux user is to rename target_ulong so that it can have a size different from the CPU's default word size.

I made a patch to rename target_ulong/long to abi_ulong/long and also add a new emulator target that uses the 32 bit ABI with a 64 bit CPU. Some Sparc32 binaries run, others don't, possibly indicating bugs in the Sparc64 emulation! The patch is quite large because of the renaming, but this shouldn't have any effect on other targets. Any comments?

Great! The patch seems safe at first look, but I noticed a few things that are not correct or may be improved:
* In linux-user/main.c: PowerPC DCR access should keep using target_ulong. This is a hardware bus, not ABI-dependent stuff. If a 32-bit cast is needed, it should be done in the micro-ops that handle the DCR bus accesses.
* in linux-user/qemu.h: why is there still an OVERRIDE_ELF_CLASS variable, when checking TARGET_ABI32 should be sufficient?
  It seems to me that having 2 defines which are, in fact, synonymous may be a source of confusion.
* in configure: you also added a sparc64-softmmu target, which seems unrelated to this particular patch.
* in configure: why add a specific TARGET_ABI32_DIR variable for that case? It seems to me that a TARGET_ABI_DIR variable could be useful for all targets. Let me give an example: I want to add a ppcemb-linux-user target, emulating a PowerPC 32 with 64-bit registers and SIMD extensions, but I don't want to duplicate the linux-user/ppc subdirectory. Having a TARGET_ABI_DIR available for all targets would solve my problem. In fact, even ppc and ppc64 could be merged... As you need this feature in your case, I think it would be a good idea to add it for all targets. And then, the kludge in Makefile.target could be replaced by:
-CPPFLAGS+=-I$(SRC_PATH)/linux-user -I $(SRC_PATH)/linux-user/$(TARGET_ARCH)
+CPPFLAGS+=-I$(SRC_PATH)/linux-user -I $(SRC_PATH)/linux-user/$(TARGET_ABI_DIR)
which is simpler and easier to understand, imho.

--
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] RFC: cleanups: CPU_MEM_INDEX
On Thu, 2007-10-11 at 18:46 +0100, Thiemo Seufer wrote:
J. Mayer wrote:
On Wed, 2007-10-10 at 07:06 +0200, J. Mayer wrote:
On Wed, 2007-10-10 at 01:12 +0100, Thiemo Seufer wrote:
J. Mayer wrote:

Here's a proposal to add an int cpu_mem_index (CPUState *env) function in each target's cpu.h header. The idea of this patch is:
- avoid many #ifdef TARGET_xxx in exec-all.h and softmmu_header.h, making the code more readable
- avoid multiple implementations of the same code (3, in this particular case), to avoid potential conflicts if the definition has to be updated for any reason (i.e. support for new memory access modes, emulation optimisation...)
Please comment.
-- J. Mayer [EMAIL PROTECTED] Never organized
[...]

Here's an updated version of the patch. My comments about it stay valid, with two additions:
1/ when is_user is needed to maintain compatibility with existing code, I now define it as:
    int is_user = mmu_idx == MMU_USER_IDX;
instead of just is_user = mmu_idx. This definition will then remain correct even if the definitions of the MMU modes are later changed for a specific target.
2/ I now precompute the mmu_idx on the PowerPC platform as it can never change inside a single TB. This may save a few instructions for every memory access. I guess the same optimisation can be made for the other targets but, not knowing exactly when it would have to be recomputed for most targets, I prefer not to do this optimisation myself.

I like this version. Tested with x86 and mips, on Linux/ppc host.

Thanks for testing. I guess it's safe... but I'd like to get more reports or comments about it before applying this!

--
J. Mayer [EMAIL PROTECTED]
Never organized
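As a rough illustration of the is_user definition discussed in this thread, here is a minimal sketch. It assumes the PowerPC-style numbering mentioned elsewhere in the discussion (0 = user, 1 = supervisor, 2 = hypervisor); ToyCPUState, its two flags, the MMU_SUPERVISOR_IDX and MMU_HYPERVISOR_IDX names and toy_handle_mmu_fault are hypothetical and much simpler than the real CPUState and fault handlers.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed numbering: 0 = user, 1 = supervisor, 2 = hypervisor. */
enum {
    MMU_USER_IDX       = 0,
    MMU_SUPERVISOR_IDX = 1,  /* hypothetical name */
    MMU_HYPERVISOR_IDX = 2,  /* hypothetical name */
    NB_MMU_MODES       = 3
};

/* Toy CPU state with just the two mode bits needed for the demo. */
typedef struct {
    bool msr_pr;  /* set when running in user (problem) state */
    bool msr_hv;  /* set when running in hypervisor state */
} ToyCPUState;

/* Per-target helper: map the current execution mode to an MMU index. */
static int cpu_mem_index(const ToyCPUState *env)
{
    if (env->msr_pr)
        return MMU_USER_IDX;
    return env->msr_hv ? MMU_HYPERVISOR_IDX : MMU_SUPERVISOR_IDX;
}

/* Toy fault handler taking an index; legacy code inside it still wants
 * a boolean, so is_user is derived by comparison, not by truthiness. */
static int toy_handle_mmu_fault(int mmu_idx)
{
    assert(mmu_idx >= 0 && mmu_idx < NB_MMU_MODES);
    int is_user = mmu_idx == MMU_USER_IDX;
    return is_user;
}
```

With this numbering, writing is_user = mmu_idx would report user mode as 0 and hypervisor mode as nonzero, which is why the comparison against MMU_USER_IDX is the safer form.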
[Fwd: [Qemu-devel] RFC: avoid #ifdef for target cpu list]
Forwarded Message
From: J. Mayer [EMAIL PROTECTED]
Reply-To: qemu-devel@nongnu.org
To: qemu-devel@nongnu.org
Subject: [Qemu-devel] RFC: avoid #ifdef for target cpu list
Date: Wed, 10 Oct 2007 07:14:22 +0200

This tiny patch unifies the -cpu ? option for all CPUs that can actually handle it. It changes arm_cpu_list to use the same prototype as ppc, mips and sparc and adds a new define cpu_list in target_xxx/cpu.h. As CPU selection is not implemented for all targets, I had to protect the call to cpu_list with a #if defined(cpu_list) that will have to be removed once all targets implement this feature. Please comment. If there is no objection to it, I'll apply it today.

--
J. Mayer [EMAIL PROTECTED]
Never organized
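The #if defined(cpu_list) guard described above might look roughly like this. The sketch is hypothetical: toy_cpu_list and show_cpu_help are made-up names, and the single FILE * parameter is a simplification chosen for the example rather than the prototype used by the real per-target list functions.

```c
#include <assert.h>
#include <stdio.h>

/* What a target header could provide: its own listing function, plus a
 * cpu_list define pointing at it. Targets without CPU selection would
 * simply not define cpu_list. */
static int toy_cpu_list(FILE *f)
{
    return fprintf(f, "  toy-cpu-a\n  toy-cpu-b\n");
}
#define cpu_list toy_cpu_list

/* Generic code: no #ifdef TARGET_xxx, only a test on the define. */
static int show_cpu_help(FILE *f)
{
#if defined(cpu_list)
    return cpu_list(f);
#else
    return fprintf(f, "No CPU list available for this target\n");
#endif
}
```

Once every target defines cpu_list, the guard can be dropped and the generic code calls cpu_list unconditionally.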
[Qemu-devel] RFC: fix run of 32 bits Linux executables on 64 bits targets
Following the patches done for elfload32, it appeared to me that there were still problems that would prevent 32 bits executables to run on 64 bits target in linux user mode emulation. First of all, the personality was never set to PER_LINUX32 The second problem was that pointers used to set the values on the stack were still of target_ulong size, which lead 32 bits executable crash dereferencing NULL pointers as soon as they wanted to parse their arguments. The attached patch makes 32 bits PowerPC executables run in ppc64_linux_user target. More fixes may be needed in the start_thread function and elf_check_arch to make other targets run as well. Please comment. -- J. Mayer [EMAIL PROTECTED] Never organized Index: configure === RCS file: /sources/qemu/qemu/configure,v retrieving revision 1.161 diff -u -d -d -p -r1.161 configure --- configure 9 Oct 2007 16:34:28 - 1.161 +++ configure 10 Oct 2007 07:38:59 - @@ -1029,11 +1030,12 @@ elif test $target_cpu = ppc ; then echo TARGET_ARCH=ppc $config_mak echo #define TARGET_ARCH \ppc\ $config_h echo #define TARGET_PPC 1 $config_h elif test $target_cpu = ppc64 ; then echo TARGET_ARCH=ppc64 $config_mak echo #define TARGET_ARCH \ppc64\ $config_h echo #define TARGET_PPC 1 $config_h echo #define TARGET_PPC64 1 $config_h + elfload32=yes elif test $target_cpu = ppcemb ; then echo TARGET_ARCH=ppcemb $config_mak echo #define TARGET_ARCH \ppcemb\ $config_h Index: linux-user/elfload.c === RCS file: /sources/qemu/qemu/linux-user/elfload.c,v retrieving revision 1.51 diff -u -d -d -p -r1.51 elfload.c --- linux-user/elfload.c 9 Oct 2007 16:34:29 - 1.51 +++ linux-user/elfload.c 10 Oct 2007 07:38:59 - @@ -12,65 +12,10 @@ #include qemu.h #include disas.h -/* from personality.h */ - -/* - * Flags for bug emulation. - * - * These occupy the top three bytes. 
- */ -enum { - ADDR_NO_RANDOMIZE = 0x004, /* disable randomization of VA space */ - FDPIC_FUNCPTRS = 0x008, /* userspace function ptrs point to descriptors - * (signal handling) - */ - MMAP_PAGE_ZERO = 0x010, - ADDR_COMPAT_LAYOUT = 0x020, - READ_IMPLIES_EXEC = 0x040, - ADDR_LIMIT_32BIT = 0x080, - SHORT_INODE = 0x100, - WHOLE_SECONDS = 0x200, - STICKY_TIMEOUTS = 0x400, - ADDR_LIMIT_3GB = 0x800, -}; - -/* - * Personality types. - * - * These go in the low byte. Avoid using the top bit, it will - * conflict with error returns. - */ -enum { - PER_LINUX = 0x, - PER_LINUX_32BIT = 0x | ADDR_LIMIT_32BIT, - PER_LINUX_FDPIC = 0x | FDPIC_FUNCPTRS, - PER_SVR4 = 0x0001 | STICKY_TIMEOUTS | MMAP_PAGE_ZERO, - PER_SVR3 = 0x0002 | STICKY_TIMEOUTS | SHORT_INODE, - PER_SCOSVR3 = 0x0003 | STICKY_TIMEOUTS | - WHOLE_SECONDS | SHORT_INODE, - PER_OSR5 = 0x0003 | STICKY_TIMEOUTS | WHOLE_SECONDS, - PER_WYSEV386 = 0x0004 | STICKY_TIMEOUTS | SHORT_INODE, - PER_ISCR4 = 0x0005 | STICKY_TIMEOUTS, - PER_BSD = 0x0006, - PER_SUNOS = 0x0006 | STICKY_TIMEOUTS, - PER_XENIX = 0x0007 | STICKY_TIMEOUTS | SHORT_INODE, - PER_LINUX32 = 0x0008, - PER_LINUX32_3GB = 0x0008 | ADDR_LIMIT_3GB, - PER_IRIX32 = 0x0009 | STICKY_TIMEOUTS,/* IRIX5 32-bit */ - PER_IRIXN32 = 0x000a | STICKY_TIMEOUTS,/* IRIX6 new 32-bit */ - PER_IRIX64 = 0x000b | STICKY_TIMEOUTS,/* IRIX6 64-bit */ - PER_RISCOS = 0x000c, - PER_SOLARIS = 0x000d | STICKY_TIMEOUTS, - PER_UW7 = 0x000e | STICKY_TIMEOUTS | MMAP_PAGE_ZERO, - PER_OSF4 = 0x000f, /* OSF/1 v4 */ - PER_HPUX = 0x0010, - PER_MASK = 0x00ff, -}; - /* * Return the base personality without flags. 
*/ -#define personality(pers) (pers PER_MASK) +#define personality(pers) ((pers) PER_MASK) /* this flag is uneffective under linux too, should be deleted */ #ifndef MAP_DENYWRITE @@ -215,6 +160,7 @@ enum #define ELF_START_MMAP 0x8000 #define elf_check_arch(x) ( (x) == EM_SPARCV9 || (x) == EM_SPARC32PLUS ) +#define elf_check_arch32(x) ( (x) == EM_SPARC32PLUS ) #define ELF_CLASS ELFCLASS64 #define ELF_DATAELFDATA2MSB @@ -261,7 +207,8 @@ static inline void init_thread(struct ta #ifdef TARGET_PPC64 -#define elf_check_arch(x) ( (x) == EM_PPC64 ) +#define elf_check_arch(x) ( (x) == EM_PPC64 || (x) == EM_PPC ) +#define elf_check_arch32(x) ( (x) == EM_PPC ) #define ELF_CLASS ELFCLASS64 @@ -311,32 +258,51 @@ do { NEW_AUX_ENT(AT_IGNOREPPC, AT_IGNOREPPC); \ } while (0) -static inline void init_thread(struct target_pt_regs *_regs, struct image_info *infop) +static inline void init_thread(struct target_pt_regs *_regs, + struct image_info *infop) { target_ulong pos = infop-start_stack; -target_ulong tmp; +target_ulong tmp, elen; #ifdef TARGET_PPC64 target_ulong entry, toc; #endif -_regs-msr = 1 MSR_PR; /* Set user mode */ +_regs-msr = 1ULL MSR_PR
Re: [Qemu-devel] RFC: fix run of 32 bits Linux executables on 64 bits targets
On Wed, 2007-10-10 at 19:01 +0300, Blue Swirl wrote:
On 10/10/07, J. Mayer [EMAIL PROTECTED] wrote:

Following the patches done for elfload32, it appeared to me that there were still problems that would prevent 32-bit executables from running on 64-bit targets in linux user mode emulation. First of all, the personality was never set to PER_LINUX32

It's set in elfload32.c, but I think your approach is better. The check for elf_ex->e_ident[EI_CLASS] == ELFCLASS64 could be moved from elfload32.c.

Well, it is overridden just before the create_elf_table call... And it's especially needed there and in the start_thread code, at least for PowerPC. As the kernel sets it up at this point, it seems to be a good idea to do the same!

The second problem was that pointers used to set the values on the stack were still of target_ulong size, which led 32-bit executables to crash dereferencing NULL pointers as soon as they wanted to parse their arguments.

Nice, I was wondering why my test program crashed.

I realized there are tons of unneeded checks/code in my patch, as this code is compiled twice. I will repost a cleaned one soon...

--
J. Mayer [EMAIL PROTECTED]
Never organized
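The NULL pointer crash described above can be reproduced in miniature. The following is only a sketch under assumptions: the guest is taken to be big-endian (as PowerPC is), the helper names are hypothetical, and a plain byte array stands in for the guest stack. It shows what a 32-bit program reads when the loader writes argv-style pointers in 8-byte slots instead of 4-byte ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store value big-endian in width bytes at p (width is 4 or 8). */
static void put_be(uint8_t *p, uint64_t value, size_t width)
{
    assert(width == 4 || width == 8);
    for (size_t i = 0; i < width; i++)
        p[i] = (uint8_t)(value >> (8 * (width - 1 - i)));
}

/* Load a big-endian 32-bit word, as a 32-bit guest program would. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Loader side: lay out n pointers using slots of the given width. */
static void write_argv(uint8_t *stack, const uint32_t *ptrs, size_t n,
                       size_t width)
{
    for (size_t i = 0; i < n; i++)
        put_be(stack + i * width, ptrs[i], width);
}

/* Guest side: a 32-bit program always indexes argv in 4-byte steps. */
static uint32_t guest_argv(const uint8_t *stack, size_t i)
{
    return get_be32(stack + 4 * i);
}
```

With 8-byte slots, the upper half of the first pointer lands where the guest expects argv[0], so the guest sees a NULL pointer and the real pointer shows up one slot late.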
Re: [Qemu-devel] RFC: fix run of 32 bits Linux executables on 64 bits targets
On Wed, 2007-10-10 at 22:02 +0300, Blue Swirl wrote:
On 10/10/07, Fabrice Bellard [EMAIL PROTECTED] wrote:
Thiemo Seufer wrote:
Fabrice Bellard wrote:
J. Mayer wrote:

Following the patches done for elfload32, it appeared to me that there were still problems that would prevent 32-bit executables from running on 64-bit targets in linux user mode emulation. [...]

Are you sure it is a good idea to try to add 32 bit executable support to a 64 bit target? In the end you will need to write a 64 bit to 32 bit linux syscall converter, which would mean duplicating all the linux-user code of the corresponding 32 bit target (think of ioctls with structures, signal frames, etc...).

I would think this feature will be limited to platforms which can handle 32bit and 64bit binaries with a single personality.

I am not sure it is a common case! However, I suggest emulating a 32 bit user linux system with a 64 bit guest CPU running in 32 bit compatibility mode. It would be useful to test 64 bit CPUs in 32 bit compatibility mode. The only required modification in linux user is to rename target_ulong so that it can have a size different from the CPU's default word size.

I think this would be sufficient for the Sparc and this way there would be no need to convert the structures. Brilliant! Should we revert the elfload32 patch? What about PPC?

We can keep the elfload32 for now, it does not hurt. This approach is OK for PPC too. And as I got some 32-bit programs running in the 64-bit linux-user emulator, the behavior of the same programs can be compared to find possible issues...

--
J. Mayer [EMAIL PROTECTED]
Never organized
Re: [Qemu-devel] qemu Makefile.target
On Mon, 2007-10-08 at 21:33 +0200, Stefan Weil wrote:
Blue Swirl wrote:
On 6/1/07, Stefan Weil [EMAIL PROTECTED] wrote:

Wouldn't it be better to let the compiler create dependency files which make can read? I posted a patch some time ago and have been using it for many months for different QEMU target platforms.

I don't know, the dependencies aren't changing very often.

The current Makefile.target contains a lot of dependencies, and I think they change often, their number is growing and nevertheless they are far from being complete. The appended patch removes most explicit dependencies and adds gcc flags which create dependency files during normal compilation (so they are always up-to-date).

I came to the same conclusion, that hand-coded dependencies are always buggy and/or incomplete. I was about to propose a similar patch with a few small differences. As -MMD / -MP are preprocessor options, it seems more logical to add them to CPPFLAGS. The second difference is that it seems better to use the following syntax to include dependency files:
-include $(wildcard *.d)
-include $(wildcard */*.d)
(some also include /dev/null in the list so that it's never empty...)

This patch has a great drawback: you have to recompile almost everything any time you change vl.h or target-xxx/cpu.h, for example. This, imho, should not prevent us from applying it, but it shows us that most headers should be split in a better way. As an example, it seems strange to me to declare all device prototypes in vl.h when those declarations should never be used outside of the hw subdirectory, where all hardware device emulation related stuff should stay. I guess we could find a lot of examples of declarations/prototypes globally exported but only used locally. Fixing this would make the code cleaner and, as a side effect, would avoid a lot of time wasted recompiling useless stuff when changing headers.

--
J. Mayer [EMAIL PROTECTED]
Never organized
[Qemu-devel] RFC: cleanups: CPU_MEM_INDEX
Here's a proposal to add an int cpu_mem_index (CPUState *env) function in each target's cpu.h header. The idea of this patch is:
- avoid many #ifdef TARGET_xxx in exec-all.h and softmmu_header.h, and so make the code more readable
- avoid multiple implementations of the same code (3, in this particular case), to avoid potential conflicts if the definition has to be updated for any reason (i.e. support for new memory access modes, emulation optimisation...)

Please comment.

-- J. Mayer [EMAIL PROTECTED] Never organized

Index: cpu-exec.c
===================================================================
RCS file: /sources/qemu/qemu/cpu-exec.c,v
retrieving revision 1.119
diff -u -d -d -p -r1.119 cpu-exec.c
--- cpu-exec.c 8 Oct 2007 13:16:13 -0000 1.119
+++ cpu-exec.c 9 Oct 2007 10:36:07 -0000
@@ -885,7 +885,7 @@ static inline int handle_cpu_signal(unsi
     /* see if it is an MMU fault */
     ret = cpu_x86_handle_mmu_fault(env, address, is_write,
-                                   ((env->hflags & HF_CPL_MASK) == 3), 0);
+                                   cpu_mem_index(env), 0);
     if (ret < 0)
         return 0; /* not an MMU fault */
     if (ret == 0)
@@ -1007,7 +1009,8 @@ static inline int handle_cpu_signal(unsi
     }
     /* see if it is an MMU fault */
-    ret = cpu_ppc_handle_mmu_fault(env, address, is_write, msr_pr, 0);
+    ret = cpu_ppc_handle_mmu_fault(env, address, is_write,
+                                   cpu_mem_index(env), 0);
     if (ret < 0)
         return 0; /* not an MMU fault */
     if (ret == 0)
@@ -1191,7 +1197,8 @@ static inline int handle_cpu_signal(unsi
     }
     /* see if it is an MMU fault */
-    ret = cpu_alpha_handle_mmu_fault(env, address, is_write, 1, 0);
+    ret = cpu_alpha_handle_mmu_fault(env, address, is_write,
+                                     cpu_mem_index(env), 0);
     if (ret < 0)
         return 0; /* not an MMU fault */
     if (ret == 0)
Index: exec-all.h
===================================================================
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.67
diff -u -d -d -p -r1.67 exec-all.h
--- exec-all.h 8 Oct 2007 13:16:14 -0000 1.67
+++ exec-all.h 9 Oct 2007 10:36:07 -0000
@@ -601,27 +606,7 @@ static inline target_ulong get_phys_addr
     int is_user, index, pd;
     index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
-#if defined(TARGET_I386)
-    is_user = ((env->hflags & HF_CPL_MASK) == 3);
-#elif defined (TARGET_PPC)
-    is_user = msr_pr;
-#elif defined (TARGET_MIPS)
-    is_user = ((env->hflags & MIPS_HFLAG_MODE) == MIPS_HFLAG_UM);
-#elif defined (TARGET_SPARC)
-    is_user = (env->psrs == 0);
-#elif defined (TARGET_ARM)
-    is_user = ((env->uncached_cpsr & CPSR_M) == ARM_CPU_MODE_USR);
-#elif defined (TARGET_SH4)
-    is_user = ((env->sr & SR_MD) == 0);
-#elif defined (TARGET_ALPHA)
-    is_user = ((env->ps >> 3) & 3);
-#elif defined (TARGET_M68K)
-    is_user = ((env->sr & SR_S) == 0);
-#elif defined (TARGET_CRIS)
-    is_user = (0);
-#else
-#error unimplemented CPU
-#endif
+    is_user = cpu_mem_index(env);
     if (__builtin_expect(env->tlb_table[is_user][index].addr_code !=
                          (addr & TARGET_PAGE_MASK), 0)) {
         ldub_code(addr);
Index: softmmu_header.h
===================================================================
RCS file: /sources/qemu/qemu/softmmu_header.h,v
retrieving revision 1.17
diff -u -d -d -p -r1.17 softmmu_header.h
--- softmmu_header.h 8 Oct 2007 13:16:14 -0000 1.17
+++ softmmu_header.h 9 Oct 2007 10:36:07 -0000
@@ -51,54 +51,12 @@
 #elif ACCESS_TYPE == 2
-#ifdef TARGET_I386
-#define CPU_MEM_INDEX ((env->hflags & HF_CPL_MASK) == 3)
-#elif defined (TARGET_PPC)
-#define CPU_MEM_INDEX (msr_pr)
-#elif defined (TARGET_MIPS)
-#define CPU_MEM_INDEX ((env->hflags & MIPS_HFLAG_MODE) == MIPS_HFLAG_UM)
-#elif defined (TARGET_SPARC)
-#define CPU_MEM_INDEX ((env->psrs) == 0)
-#elif defined (TARGET_ARM)
-#define CPU_MEM_INDEX ((env->uncached_cpsr & CPSR_M) == ARM_CPU_MODE_USR)
-#elif defined (TARGET_SH4)
-#define CPU_MEM_INDEX ((env->sr & SR_MD) == 0)
-#elif defined (TARGET_ALPHA)
-#define CPU_MEM_INDEX ((env->ps >> 3) & 3)
-#elif defined (TARGET_M68K)
-#define CPU_MEM_INDEX ((env->sr & SR_S) == 0)
-#elif defined (TARGET_CRIS)
-/* CRIS FIXME: I guess we want to validate supervisor mode acceses here. */
-#define CPU_MEM_INDEX (0)
-#else
-#error unsupported CPU
-#endif
+#define CPU_MEM_INDEX (cpu_mem_index(env))
 #define MMUSUFFIX _mmu
 #elif ACCESS_TYPE == 3
-#ifdef TARGET_I386
-#define CPU_MEM_INDEX ((env->hflags & HF_CPL_MASK) == 3)
-#elif defined (TARGET_PPC)
-#define CPU_MEM_INDEX (msr_pr)
-#elif defined (TARGET_MIPS)
-#define CPU_MEM_INDEX ((env->hflags & MIPS_HFLAG_MODE) == MIPS_HFLAG_UM)
-#elif defined (TARGET_SPARC)
-#define CPU_MEM_INDEX ((env->psrs) == 0)
-#elif defined (TARGET_ARM)
-#define CPU_MEM_INDEX ((env->uncached_cpsr & CPSR_M) == ARM_CPU_MODE_USR)
-#elif defined (TARGET_SH4)
-#define CPU_MEM_INDEX ((env->sr & SR_MD) == 0)
-#elif defined (TARGET_ALPHA)
-#define CPU_MEM_INDEX ((env->ps
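The effect of the proposal can be sketched for a single target. This is a minimal sketch, assuming a stand-in `CPUState` that carries only an `hflags` field and an `HF_CPL_MASK` of 3; `pick_tlb_set` is a hypothetical caller added for illustration and is not part of the patch.

```c
#include <stdint.h>

/* Stand-in for the x86 CPUState; only the field used here is declared. */
typedef struct CPUState {
    uint32_t hflags;
} CPUState;

/* Assumed value: the current privilege level sits in the low two bits. */
#define HF_CPL_MASK 3

/* What the x86 cpu.h could provide: 1 in user mode (CPL 3), 0 otherwise. */
static inline int cpu_mem_index(CPUState *env)
{
    return (env->hflags & HF_CPL_MASK) == 3;
}

/* Generic code then needs a single definition and no per-target #ifdef. */
#define CPU_MEM_INDEX (cpu_mem_index(env))

/* Hypothetical generic caller: selects which TLB set to look up. */
static inline int pick_tlb_set(CPUState *env)
{
    return CPU_MEM_INDEX;
}
```

Each other target would supply its own `cpu_mem_index` in its cpu.h, using the expression currently hidden behind its `#elif` branch.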
Re: [Qemu-devel] [RFC, PATCH] Support for loading 32 bit ELF files for 64 bit linux-user
On Sun, 2007-10-07 at 15:45 +0300, Blue Swirl wrote:

Hi,

Hi,

This patch adds support for loading a 32-bit ELF file in the 64-bit user mode emulator. This means that qemu-sparc64 can be used to execute 32-bit ELF files containing V9 instructions (SPARC32PLUS). This format is used by Solaris/Sparc and maybe by Debian in the future. Other targets shouldn't be affected, but I have done only compile testing. Any comments?

The idea of loading 32-bit executables on a 64-bit target seems great. I have two remarks about this patch:
- it seems that it does not take my patch into account. As I was about to commit it today, I wonder if I still should do it. But then, your patch lacks some bugfixes (start_data not properly computed and TARGET_LONG_BITS != HOST_LONG_BITS problems).
- it seems that almost all the ELF loader code is affected by your patch. I think (maybe too naively) that adding functions to read the ELF information should be sufficient, i.e. adding read_elf_ehdr, ..., functions and a few patches in the create_elf_table function. Then all the information needed to load a 32-bit executable can be kept in the 64-bit structures.

As the kernel does not duplicate the code to handle this case, I think the QEMU loader should be kept as simple as the kernel one, and elfload_ops.h seems useless to me. In fact, the QEMU loader could (should?) even be the same code as the kernel one, with just a few helpers for endianness swaps and the fixes needed to avoid confusion between host_long and target_long...

-- J. Mayer [EMAIL PROTECTED] Never organized
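The reader-function idea from the second remark can be sketched as follows. This is only an illustration under stated assumptions: the signature of `read_elf_ehdr`, the local struct names and the return convention are hypothetical, the file is assumed to have the host's byte order (the endianness helpers mentioned above are left out), and the ELF magic is not checked. The header layouts follow the standard ELF32 and ELF64 definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EI_NIDENT  16
#define EI_CLASS   4   /* index of the class byte in e_ident */
#define ELFCLASS32 1
#define ELFCLASS64 2

struct elf32_ehdr {
    unsigned char e_ident[EI_NIDENT];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};

struct elf64_ehdr {
    unsigned char e_ident[EI_NIDENT];
    uint16_t e_type, e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};

/* Hypothetical helper: read either header class into the 64-bit structure.
 * Returns 0 on success, -1 on a short buffer or an unknown class. */
static int read_elf_ehdr(const unsigned char *buf, size_t len,
                         struct elf64_ehdr *out)
{
    struct elf32_ehdr h32;

    if (len < EI_NIDENT)
        return -1;
    if (buf[EI_CLASS] == ELFCLASS64) {
        if (len < sizeof(*out))
            return -1;
        memcpy(out, buf, sizeof(*out));
        return 0;
    }
    if (buf[EI_CLASS] != ELFCLASS32 || len < sizeof(h32))
        return -1;

    /* Widen every 32-bit field; the rest of the loader sees one format. */
    memcpy(&h32, buf, sizeof(h32));
    memcpy(out->e_ident, h32.e_ident, EI_NIDENT);
    out->e_type = h32.e_type;
    out->e_machine = h32.e_machine;
    out->e_version = h32.e_version;
    out->e_entry = h32.e_entry;
    out->e_phoff = h32.e_phoff;
    out->e_shoff = h32.e_shoff;
    out->e_flags = h32.e_flags;
    out->e_ehsize = h32.e_ehsize;
    out->e_phentsize = h32.e_phentsize;
    out->e_phnum = h32.e_phnum;
    out->e_shentsize = h32.e_shentsize;
    out->e_shnum = h32.e_shnum;
    out->e_shstrndx = h32.e_shstrndx;
    return 0;
}
```

A loader built this way would need similar readers for program headers and symbols, but the code after the read step could stay identical for both classes.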