Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Peter Maydell peter.mayd...@linaro.org writes: On 9 August 2013 08:35, Rusty Russell ru...@rustcorp.com.au wrote: That's a lot of replumbing and indirect function calls for a fairly obscure case. We certainly don't have a nice CPUState lying around in virtio at the moment, for example. Actually if you're in an IO routine you do: it will be in the global variable 'current_cpu'. Most IO functions don't need this, but it's what we use for things like this device behaviour varies depending on which CPU accesses it. Hmm, that was NULL for me when called from virtio. I have stuck with first_cpu in the ppc64 case. Patches coming, Rusty.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Anthony Liguori anth...@codemonkey.ws writes: Rusty Russell ru...@rustcorp.com.au writes: (Qemu run under eatmydata to eliminate syncs) FYI, cache=unsafe is equivalent to using eatmydata. Ah, thanks! I can reproduce this although I also see a larger standard deviation. BEFORE: MIN: 496 MAX: 1055 AVG: 873.22 STDEV: 136.88 AFTER: MIN: 494 MAX: 1456 AVG: 947.77 STDEV: 150.89 BTW, how did you generate these stats? Consider this my plug for my little stats filter: https://github.com/rustyrussell/stats In my datasets, the stdev is higher in the after case implying that there is more variation. Indeed, the MIN is pretty much the same. GCC is inlining the functions, I'm still surprised that it's measurable at all. GCC won't inline across compilation units without -flto though, so the stub call won't be inlined, right? Cheers, Rusty.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Benjamin Herrenschmidt b...@kernel.crashing.org writes: This whole exercise should have nothing to do with the current endian mode of the CPU. If for example you are running lx86 (the x86 emulator IBM provides) which exploits MSR:LE on POWER7 to run x86 binaries in userspace, you don't want virtio to suddenly change endian ! The information we care about is the endianness of the operating system. Which is why my original patches nabbed the endianness when the target updated the virtio device status. You're making an assumption about the nature of the guest, that they don't pass the virtio device directly through to userspace. I don't care, though. The point is to make something which works, until the Real Fix (LE virtio). The most logical way to infer it is a different bit, which used to be MSR:ILE and is now in LPCR for guests and controlled via a hypercall on pseries, which indicates what is the endianness of interrupt vectors. IE. It indicates how the cpu should set MSR:LE when taking an interrupt, regardless of what the current MSR:LE value is at any given point in time. So what should be done in fact is whenever *that* bit is changed (currently via hcall, maybe via MSR:ILE if we emulate that on older models or LPCR when we emulate that), then the qemu cpu model can call out to change the OS endianness which we can propagate to virtio. Anything trying to do stuff based on the current endianness in the MSR sounds like a cesspit to me. OK. What should Anton's gdb stub do then? Cheers, Rusty.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
On Mon, 2013-08-12 at 09:58 +0930, Rusty Russell wrote: Benjamin Herrenschmidt b...@kernel.crashing.org writes: This whole exercise should have nothing to do with the current endian mode of the CPU. If for example you are running lx86 (the x86 emulator IBM provides) which exploits MSR:LE on POWER7 to run x86 binaries in userspace, you don't want virtio to suddenly change endian ! The information we care about is the endianness of the operating system. Which is why my original patches nabbed the endianness when the target updated the virtio device status. You're making an assumption about the nature of the guest, that they don't pass the virtio device directly through to userspace. Two points here: - Userspace is VERY likely to have the same endianness as the operating system. - The case where we might support foreign endian userspace *and* pass virtio directly to it *and* give a shit about virtio v1.0 doesn't exist anywhere but your imagination right now :-) I don't care, though. The point is to make something which works, until the Real Fix (LE virtio). Exactly. The most logical way to infer it is a different bit, which used to be MSR:ILE and is now in LPCR for guests and controlled via a hypercall on pseries, which indicates what is the endianness of interrupt vectors. IE. It indicates how the cpu should set MSR:LE when taking an interrupt, regardless of what the current MSR:LE value is at any given point in time. So what should be done in fact is whenever *that* bit is changed (currently via hcall, maybe via MSR:ILE if we emulate that on older models or LPCR when we emulate that), then the qemu cpu model can call out to change the OS endianness which we can propagate to virtio. Anything trying to do stuff based on the current endianness in the MSR sounds like a cesspit to me. OK. What should Anton's gdb stub do then? Something else. It's a different problem and needs a different solution. For one, I think, we should first fix the root problem with gdb (tagging endianness in the protocol etc...) and once that's done, look at what band-aid can be applied for old stuff if we care at all (it's not like LE ppc64 is going to not require a new gdb anyway). Cheers, Ben.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Anthony Liguori anth...@codemonkey.ws writes: I suspect this is a premature optimization. With a weak function called directly in the accessors below, I suspect you would see no measurable performance overhead compared to this approach. It's all very predictable so the CPU should do a decent job optimizing the if () away. Perhaps. I was leery of introducing performance regressions, but the actual I/O tends to dominate anyway. So I tested this, by adding the patch (below) and benchmarking qemu-system-i386 on my laptop before and after. Setup: Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz (Performance cpu governer enabled) Guest: virtio user net, virtio block on raw file, 1 CPU, 512MB RAM. (Qemu run under eatmydata to eliminate syncs) First test: ping -f -c 1 -q 10.0.2.0 (100 times) (Ping chosen since packets stay in qemu's user net code) BEFORE: MIN: 824ms MAX: 914ms AVG: 876.95ms STDDEV: 16ms AFTER: MIN: 872ms MAX: 933ms AVG: 904.35ms STDDEV: 15ms Second test: dd if=/dev/vda iflag=direct count=1 of=/dev/null (100 times) BEFORE: MIN: 0.927994sec MAX: 1.051640sec AVG: 0.99733sec STDDEV: 0.028sec AFTER: MIN: 0.941706sec MAX: 1.034810sec AVG: 0.988692sec STDDEV: 0.021sec So, we can notice performance on ping, but anything which does actual IO is a wash. Cheers, Rusty. diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 2887f17..df8733b 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -85,20 +85,6 @@ struct VirtQueue EventNotifier host_notifier; }; -#ifdef TARGET_VIRTIO_SWAPENDIAN -bool virtio_byteswap; - -/* Ask target code if we should swap endian for all vring and config access. */ -static void mark_endian(void) -{ -virtio_byteswap = virtio_swap_endian(); -} -#else -static void mark_endian(void) -{ -} -#endif - /* virt queue functions */ static void virtqueue_init(VirtQueue *vq) { @@ -540,9 +526,6 @@ void virtio_set_status(VirtIODevice *vdev, uint8_t val) VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); trace_virtio_set_status(vdev, val); -/* If guest virtio endian is uncertain, set it now. */ -mark_endian(); - if (k-set_status) { k-set_status(vdev, val); } diff --git a/include/hw/virtio/virtio-access.h b/include/hw/virtio/virtio-access.h index b1d531e..ea4166a 100644 --- a/include/hw/virtio/virtio-access.h +++ b/include/hw/virtio/virtio-access.h @@ -13,18 +13,9 @@ #ifndef _QEMU_VIRTIO_ACCESS_H #define _QEMU_VIRTIO_ACCESS_H -#ifdef TARGET_VIRTIO_SWAPENDIAN -/* Architectures which need biendian define this function. */ -extern bool virtio_swap_endian(void); - -extern bool virtio_byteswap; -#else -#define virtio_byteswap false -#endif - static inline uint16_t virtio_lduw_phys(hwaddr pa) { -if (virtio_byteswap) { +if (cpu_get_byteswap()) { return bswap16(lduw_phys(pa)); } return lduw_phys(pa); @@ -33,7 +24,7 @@ static inline uint16_t virtio_lduw_phys(hwaddr pa) static inline uint32_t virtio_ldl_phys(hwaddr pa) { -if (virtio_byteswap) { +if (cpu_get_byteswap()) { return bswap32(ldl_phys(pa)); } return ldl_phys(pa); @@ -41,7 +32,7 @@ static inline uint32_t virtio_ldl_phys(hwaddr pa) static inline uint64_t virtio_ldq_phys(hwaddr pa) { -if (virtio_byteswap) { +if (cpu_get_byteswap()) { return bswap64(ldq_phys(pa)); } return ldq_phys(pa); @@ -49,7 +40,7 @@ static inline uint64_t virtio_ldq_phys(hwaddr pa) static inline void virtio_stw_phys(hwaddr pa, uint16_t value) { -if (virtio_byteswap) { +if (cpu_get_byteswap()) { stw_phys(pa, bswap16(value)); } else { stw_phys(pa, value); @@ -58,7 +49,7 @@ static inline void virtio_stw_phys(hwaddr pa, uint16_t value) static inline void virtio_stl_phys(hwaddr pa, uint32_t value) { -if (virtio_byteswap) { +if (cpu_get_byteswap()) { stl_phys(pa, bswap32(value)); } else { stl_phys(pa, value); @@ -67,7 +58,7 @@ static inline void virtio_stl_phys(hwaddr pa, uint32_t value) static inline void virtio_stw_p(void *ptr, uint16_t v) { -if (virtio_byteswap) { +if (cpu_get_byteswap()) { stw_p(ptr, bswap16(v)); } else { stw_p(ptr, v); @@ -76,7 +67,7 @@ static inline void virtio_stw_p(void *ptr, uint16_t v) static inline void virtio_stl_p(void *ptr, uint32_t v) { -if (virtio_byteswap) { +if (cpu_get_byteswap()) { stl_p(ptr, bswap32(v)); } else { stl_p(ptr, v); @@ -85,7 +76,7 @@ static inline void virtio_stl_p(void *ptr, uint32_t v) static inline void virtio_stq_p(void *ptr, uint64_t v) { -if (virtio_byteswap) { +if (cpu_get_byteswap()) { stq_p(ptr, bswap64(v)); } else { stq_p(ptr, v); @@ -94,7 +85,7 @@ static inline void virtio_stq_p(void *ptr, uint64_t v) static inline int virtio_lduw_p(const void
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Andreas Färber afaer...@suse.de writes: Am 08.08.2013 15:31, schrieb Anthony Liguori: Rusty Russell ru...@rustcorp.com.au writes: We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. We'd need to check current_cpu then, which for Xen is always NULL. Below is the minimal solution, which is sufficient for virtio. If Anton wants per-cpu endianness for gdb, he'll need something more sophisticated. Feedback welcome! Rusty. Subject: cpu_get_byteswap: function for endian-ambivalent targets. Signed-off-by: Rusty Russell ru...@rustcorp.com.au diff --git a/include/qom/cpu.h b/include/qom/cpu.h index a5bb515..ed84267 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -357,4 +357,13 @@ void cpu_reset_interrupt(CPUState *cpu, int mask); */ void cpu_resume(CPUState *cpu); +/** + * cpu_get_byteswap: + * + * Is (any) CPU running in byteswapped mode: normally false. This + * doesn't take a cpu argument, because we don't support heterogeneous + * endianness. + */ +bool cpu_get_byteswap(void); + #endif diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index 9b701b4..d4af94a 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -25,3 +25,4 @@ stub-obj-y += vm-stop.o stub-obj-y += vmstate.o stub-obj-$(CONFIG_WIN32) += fd-register.o stub-obj-y += cpus.o +stub-obj-y += cpu_byteswap.o diff --git a/stubs/cpu_byteswap.c b/stubs/cpu_byteswap.c new file mode 100644 index 000..b3b669f --- /dev/null +++ b/stubs/cpu_byteswap.c @@ -0,0 +1,6 @@ +#include qom/cpu.h + +bool cpu_get_byteswap(void) +{ +return false; +} Subject: target-ppc: ppc64 targets can be either endian. In this case, we just query the first cpu. Signed-off-by: Rusty Russell ru...@rustcorp.com.au diff --git a/target-ppc/misc_helper.c b/target-ppc/misc_helper.c index 616aab6..0a508eb 100644 --- a/target-ppc/misc_helper.c +++ b/target-ppc/misc_helper.c @@ -116,3 +116,8 @@ void ppc_store_msr(CPUPPCState *env, target_ulong value) { hreg_store_msr(env, value, 0); } + +bool cpu_get_byteswap(void) +{ +return first_cpu-hflags (1 MSR_LE); +}
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Andreas Färber afaer...@suse.de writes: Am 08.08.2013 17:40, schrieb Anthony Liguori: Andreas Färber afaer...@suse.de writes: Am 08.08.2013 15:31, schrieb Anthony Liguori: We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. PPC64 is big endian. AFAIK, there is no such thing as a little endian PPC64 processor. This is just a processor mode where loads/stores are byte lane swapped. Hence the name 'cpu_get_byteswap'. It's just asking whether the load/stores are being swapped or not. Exactly, just read it as is in ... Endian mode. On the CPUs I am more familiar with (e.g., 970), this used to be controlled via an MSR bit, which as CPUPPCState::msr exists per CPUState. I did not check on real hardware, but from the QEMU code this would allow for the mixed-endian scenario described above. At least for PPC64, it's not possible to enable/disable byte lane swapping for individual CPUs. It's done through a system-wide hcall. What is offending me is only the following: If we name it cpu_get_byteswap() as proposed by you, then its first argument should be a CPUState *cpu. Its value would be read from the derived type's state, such as the MSR bit in the code path that you wanted duplicated. The function implementing that register-reading would be a hook in CPUClass, with a default implementation in qom/cpu.c rather than a fallback in stubs/. To access CPUClass, CPUState cannot be NULL, as brought up by Stefano for cpu_do_unassigned_access(); not following that pattern prevents mixing CPU architectures, which my large refactorings have partially been about. Cf. my guest-memory-dump refactoring. If it is just some random global value, then please don't call it cpu_*(). Since sPAPR is not a target of its own, I don't see how/where you want to implement that hcall query as per-target function either, that might rather call for a QEMUMachine hook? I don't care or argue about byte lanes here, I am just trying to keep API design consistent and not regressing on the way to heterogeneous emulation. That's a lot of replumbing and indirect function calls for a fairly obscure case. We certainly don't have a nice CPUState lying around in virtio at the moment, for example. I can try to plumb this in if there's consensus, but I suspect it's making the job 10x harder. (The next logical step would be for st* and ld* to take the cpu to query its endianness, Anthony's weird ideas notwithstanding). Cheers, Rusty.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
On 9 August 2013 08:35, Rusty Russell ru...@rustcorp.com.au wrote: That's a lot of replumbing and indirect function calls for a fairly obscure case. We certainly don't have a nice CPUState lying around in virtio at the moment, for example. Actually if you're in an IO routine you do: it will be in the global variable 'current_cpu'. Most IO functions don't need this, but it's what we use for things like this device behaviour varies depending on which CPU accesses it. -- PMM
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
On Fri, 2013-08-09 at 17:05 +0930, Rusty Russell wrote: Exactly, just read it as is in ... Endian mode. On the CPUs I am more familiar with (e.g., 970), this used to be controlled via an MSR bit, 970 didn't have an LE mode :-) which as CPUPPCState::msr exists per CPUState. I did not check on real hardware, but from the QEMU code this would allow for the mixed-endian scenario described above. This whole exercise should have nothing to do with the current endian mode of the CPU. If for example you are running lx86 (the x86 emulator IBM provides) which exploits MSR:LE on POWER7 to run x86 binaries in userspace, you don't want virtio to suddenly change endian ! The information we care about is the endianness of the operating system. The most logical way to infer it is a different bit, which used to be MSR:ILE and is now in LPCR for guests and controlled via a hypercall on pseries, which indicates what is the endianness of interrupt vectors. IE. It indicates how the cpu should set MSR:LE when taking an interrupt, regardless of what the current MSR:LE value is at any given point in time. So what should be done in fact is whenever *that* bit is changed (currently via hcall, maybe via MSR:ILE if we emulate that on older models or LPCR when we emulate that), then the qemu cpu model can call out to change the OS endianness which we can propagate to virtio. Anything trying to do stuff based on the current endianness in the MSR sounds like a cesspit to me. (The next logical step would be for st* and ld* to take the cpu to query its endianness, Anthony's weird ideas notwithstanding). Why would we even consider something like that ? Ben.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
On 9 August 2013 03:58, Rusty Russell ru...@rustcorp.com.au wrote: Anthony Liguori anth...@codemonkey.ws writes: The distinction is important in QEMU. ppc64 is still TARGET_WORDS_BIGENDIAN. We still want most stl_phys to treat integers as big endian. There's just this extra concept that CPU loads/stores are sometimes byte swapped. That affects virtio but not a lot else. You've redefined endian here; please don't do that. Endian is the order in memory which a CPU does loads and stores. Agreed (subject to the complicating factor that it's possible for a CPU to have the order to be different for data, instruction and TLB walk loads, so it's not a single setting for a CPU). From any reasonable definition, PPC is bi-endian. It's actually a weird thing for the qemu core to know at all: It's a TCG performance optimisation and/or simplification hack, basically -- by hardcoding the endianness we think the core is at compile time we can either always-byteswap or never-byteswap in the fast paths. almost everything which cares is in target-specific code. The exceptions are gdb stubs and virtio, both of which are native endian (and that weird code in exec.c: what is notdirty_mem_write?). The code for actually doing TCG CPU loads and stores cares too. notdirty_mem_write is (I think) part of how we handle self-modifying code: if you write to a page in memory that we have translated some code from, then we need to intercept that write so that we can throw away the translated code. notdirty_mem_write() is the function that does that interception, throws away the code, figures out if we still need to intercept next time around, and actually does the access the guest asked for. (It might also be used for spotting when the guest touches memory during migration and thus that page needs to be retransmitted -- I haven't checked.) -- PMM
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Rusty Russell ru...@rustcorp.com.au writes: Anthony Liguori anth...@codemonkey.ws writes: I suspect this is a premature optimization. With a weak function called directly in the accessors below, I suspect you would see no measurable performance overhead compared to this approach. It's all very predictable so the CPU should do a decent job optimizing the if () away. Perhaps. I was leery of introducing performance regressions, but the actual I/O tends to dominate anyway. So I tested this, by adding the patch (below) and benchmarking qemu-system-i386 on my laptop before and after. Setup: Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz (Performance cpu governer enabled) Guest: virtio user net, virtio block on raw file, 1 CPU, 512MB RAM. (Qemu run under eatmydata to eliminate syncs) FYI, cache=unsafe is equivalent to using eatmydata. First test: ping -f -c 1 -q 10.0.2.0 (100 times) (Ping chosen since packets stay in qemu's user net code) BEFORE: MIN: 824ms MAX: 914ms AVG: 876.95ms STDDEV: 16ms AFTER: MIN: 872ms MAX: 933ms AVG: 904.35ms STDDEV: 15ms I can reproduce this although I also see a larger standard deviation. BEFORE: MIN: 496 MAX: 1055 AVG: 873.22 STDEV: 136.88 AFTER: MIN: 494 MAX: 1456 AVG: 947.77 STDEV: 150.89 In my datasets, the stdev is higher in the after case implying that there is more variation. Indeed, the MIN is pretty much the same. GCC is inlining the functions, I'm still surprised that it's measurable at all. At any rate, I think the advantage of not increasing the amount of target specific code outweighs the performance difference here. As you said, if there is real I/O, the differences isn't noticable. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Rusty Russell ru...@rustcorp.com.au writes: Anthony Liguori anth...@codemonkey.ws writes: Daniel P. Berrange berra...@redhat.com writes: The distinction is important in QEMU. ppc64 is still TARGET_WORDS_BIGENDIAN. We still want most stl_phys to treat integers as big endian. There's just this extra concept that CPU loads/stores are sometimes byte swapped. That affects virtio but not a lot else. You've redefined endian here; please don't do that. Endian is the order in memory which a CPU does loads and stores. From any reasonable definition, PPC is bi-endian. It's actually a weird thing for the qemu core to know at all: almost everything which cares is in target-specific code. The exceptions are gdb stubs and virtio, both of which are native endian (and that weird code in exec.c: what is notdirty_mem_write?). Your argument that we shouldn't fix stl_* might be justifiable (ie. just hack virtio and gdb as one-offs), but it's neither clear nor least surprise. That's not what I'm suggesting. I'm suggesting that we should introduce multiple variants of {ld,st}* for different types of memory access. These are bad names, but I'm thinking along the lines of: /* Read a word as the load/store instructions would */ cpu_ldst_ldw() /* Read a word as the instruction fetch unit would */ cpu_fetch_ldw() /* Read a word as the hardware MMU would */ cpu_mmu_ldw() Peter was suggesting that instead of having separate functions, we should use a context: ldw(cpu-ldst, ..) ldw(cpu-fetch, ..) ... I think I prefer functions though over a context. But this is really about TCG, not virtio. As Ben pointed out, virtio endianness needs to be independent of CPUs. We process the ring outside of a specific CPU context and it's possible that if we pick an arbitrary one, it will be in the wrong context (if running BE userspace). The only real problem I have with your original patch is putting virtio knowledge in the target code. I think your adjusted version is fine. Regards, Anthony Liguori Chers, Rusty.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Am 09.08.2013 09:00, schrieb Rusty Russell: Andreas Färber afaer...@suse.de writes: Am 08.08.2013 15:31, schrieb Anthony Liguori: Rusty Russell ru...@rustcorp.com.au writes: We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. We'd need to check current_cpu then, which for Xen is always NULL. Below is the minimal solution, which is sufficient for virtio. If Anton wants per-cpu endianness for gdb, he'll need something more sophisticated. Feedback welcome! Rusty. Subject: cpu_get_byteswap: function for endian-ambivalent targets. Signed-off-by: Rusty Russell ru...@rustcorp.com.au diff --git a/include/qom/cpu.h b/include/qom/cpu.h index a5bb515..ed84267 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -357,4 +357,13 @@ void cpu_reset_interrupt(CPUState *cpu, int mask); */ void cpu_resume(CPUState *cpu); +/** + * cpu_get_byteswap: + * + * Is (any) CPU running in byteswapped mode: normally false. This + * doesn't take a cpu argument, because we don't support heterogeneous + * endianness. + */ +bool cpu_get_byteswap(void); + #endif diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index 9b701b4..d4af94a 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -25,3 +25,4 @@ stub-obj-y += vm-stop.o stub-obj-y += vmstate.o stub-obj-$(CONFIG_WIN32) += fd-register.o stub-obj-y += cpus.o +stub-obj-y += cpu_byteswap.o diff --git a/stubs/cpu_byteswap.c b/stubs/cpu_byteswap.c new file mode 100644 index 000..b3b669f --- /dev/null +++ b/stubs/cpu_byteswap.c @@ -0,0 +1,6 @@ +#include qom/cpu.h + +bool cpu_get_byteswap(void) +{ +return false; +} That is exactly what I asked not to do. first_cpu_get_byteswap() or virtio_get_byteswap() etc. would be names OK for me. But see below. Subject: target-ppc: ppc64 targets can be either endian. In this case, we just query the first cpu. Signed-off-by: Rusty Russell ru...@rustcorp.com.au diff --git a/target-ppc/misc_helper.c b/target-ppc/misc_helper.c index 616aab6..0a508eb 100644 --- a/target-ppc/misc_helper.c +++ b/target-ppc/misc_helper.c @@ -116,3 +116,8 @@ void ppc_store_msr(CPUPPCState *env, target_ulong value) { hreg_store_msr(env, value, 0); } + +bool cpu_get_byteswap(void) +{ +return first_cpu-hflags (1 MSR_LE); +} This assumes that first_cpu != NULL, which I pointed out is not the case for Xen. I'm not aware of a ppc Xen implementation, but it shows that you should define which endianness (probably little) you want to adopt in such a case. You should also urgently update your QEMU since that code will not build. first_cpu is CPUState since a number of weeks, you would need to cast to PowerPCCPU *cpu = POWERPC_CPU(first_cpu); in the non-NULL case. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Am 09.08.2013 09:35, schrieb Rusty Russell: Andreas Färber afaer...@suse.de writes: [...] If we name it cpu_get_byteswap() as proposed by you, then its first argument should be a CPUState *cpu. Its value would be read from the derived type's state, such as the MSR bit in the code path that you wanted duplicated. The function implementing that register-reading would be a hook in CPUClass, with a default implementation in qom/cpu.c rather than a fallback in stubs/. To access CPUClass, CPUState cannot be NULL, as brought up by Stefano for cpu_do_unassigned_access(); not following that pattern prevents mixing CPU architectures, which my large refactorings have partially been about. Cf. my guest-memory-dump refactoring. If it is just some random global value, then please don't call it cpu_*(). Since sPAPR is not a target of its own, I don't see how/where you want to implement that hcall query as per-target function either, that might rather call for a QEMUMachine hook? I don't care or argue about byte lanes here, I am just trying to keep API design consistent and not regressing on the way to heterogeneous emulation. That's a lot of replumbing and indirect function calls for a fairly obscure case. It's how QOM methods generally work. And yes, little endian ppc64 is in fact a pretty obscure case. But IBM was just recently reported to adopt the IP licensing model like ARM, so chances are we will see the same mixed-core scenarios as with ARM + MicroBlaze/SuperH these days. http://news.techeye.net/chips/ibms-launches-intel-server-challenge Generally the problem is that we can't have multiple same-named global functions when combining multiple targets, so we need a way to dispatch - either from the individual CPU or from the machine. I would assume in practice mixed cores will have the same endianness. Or by making endianness a user-tweakable property of the virtio devices rather than trying to auto-detect it. We certainly don't have a nice CPUState lying around in virtio at the moment, for example. Compare http://git.qemu.org/?p=qemu.git;a=commit;h=c658b94f6e8c206c59d02aa6fbac285b86b53d2c cpu_single_env has since been renamed to the mentioned current_cpu and been changed to CPUState type. I can try to plumb this in if there's consensus, but I suspect it's making the job 10x harder. I doubt it's that complicated, estimated less than ten minutes for me, and not doing it is making the other job significantly harder. cpu_get_dump_info() is already a hard nut to crack. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Rusty Russell ru...@rustcorp.com.au writes: Virtio is currently defined to work as guest endian, but this is a problem if the guest can change endian. As most targets can't change endian, we make it a per-target option to avoid pessimising. This is based on a simpler patch by Anthony Liguouri, which only handled the vring accesses. We also need some drivers to access these helpers, eg. for data which contains headers. Signed-off-by: Rusty Russell ru...@rustcorp.com.au --- hw/virtio/virtio.c| 46 + include/hw/virtio/virtio-access.h | 138 ++ 2 files changed, 170 insertions(+), 14 deletions(-) create mode 100644 include/hw/virtio/virtio-access.h diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 8176c14..2887f17 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -18,6 +18,7 @@ #include hw/virtio/virtio.h #include qemu/atomic.h #include hw/virtio/virtio-bus.h +#include hw/virtio/virtio-access.h /* The alignment to use between consumer and producer parts of vring. * x86 pagesize again. */ @@ -84,6 +85,20 @@ struct VirtQueue EventNotifier host_notifier; }; +#ifdef TARGET_VIRTIO_SWAPENDIAN +bool virtio_byteswap; + +/* Ask target code if we should swap endian for all vring and config access. */ +static void mark_endian(void) +{ +virtio_byteswap = virtio_swap_endian(); +} +#else +static void mark_endian(void) +{ +} +#endif + It would be very good to avoid a target specific define here. We would like to move to only building a single copy of the virtio code. We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. /* virt queue functions */ static void virtqueue_init(VirtQueue *vq) { @@ -100,49 +115,49 @@ static inline uint64_t vring_desc_addr(hwaddr desc_pa, int i) { hwaddr pa; pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, addr); -return ldq_phys(pa); +return virtio_ldq_phys(pa); } static inline uint32_t vring_desc_len(hwaddr desc_pa, int i) { hwaddr pa; pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, len); -return ldl_phys(pa); +return virtio_ldl_phys(pa); } static inline uint16_t vring_desc_flags(hwaddr desc_pa, int i) { hwaddr pa; pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, flags); -return lduw_phys(pa); +return virtio_lduw_phys(pa); } static inline uint16_t vring_desc_next(hwaddr desc_pa, int i) { hwaddr pa; pa = desc_pa + sizeof(VRingDesc) * i + offsetof(VRingDesc, next); -return lduw_phys(pa); +return virtio_lduw_phys(pa); } static inline uint16_t vring_avail_flags(VirtQueue *vq) { hwaddr pa; pa = vq-vring.avail + offsetof(VRingAvail, flags); -return lduw_phys(pa); +return virtio_lduw_phys(pa); } static inline uint16_t vring_avail_idx(VirtQueue *vq) { hwaddr pa; pa = vq-vring.avail + offsetof(VRingAvail, idx); -return lduw_phys(pa); +return virtio_lduw_phys(pa); } static inline uint16_t vring_avail_ring(VirtQueue *vq, int i) { hwaddr pa; pa = vq-vring.avail + offsetof(VRingAvail, ring[i]); -return lduw_phys(pa); +return virtio_lduw_phys(pa); } static inline uint16_t vring_used_event(VirtQueue *vq) @@ -154,42 +169,42 @@ static inline void vring_used_ring_id(VirtQueue *vq, int i, uint32_t val) { hwaddr pa; pa = vq-vring.used + offsetof(VRingUsed, ring[i].id); -stl_phys(pa, val); +virtio_stl_phys(pa, val); } static inline void vring_used_ring_len(VirtQueue *vq, int i, uint32_t val) { hwaddr pa; pa = vq-vring.used + offsetof(VRingUsed, ring[i].len); -stl_phys(pa, val); +virtio_stl_phys(pa, val); } static uint16_t vring_used_idx(VirtQueue *vq) { hwaddr pa; pa = vq-vring.used + offsetof(VRingUsed, idx); -return lduw_phys(pa); +return virtio_lduw_phys(pa); } static inline void vring_used_idx_set(VirtQueue *vq, uint16_t val) { hwaddr pa; pa = vq-vring.used + offsetof(VRingUsed, idx); -stw_phys(pa, val); +virtio_stw_phys(pa, val); } static inline void vring_used_flags_set_bit(VirtQueue *vq, int mask) { hwaddr pa; pa = vq-vring.used + offsetof(VRingUsed, flags); -stw_phys(pa, lduw_phys(pa) | mask); +virtio_stw_phys(pa, virtio_lduw_phys(pa) | mask); } static inline void vring_used_flags_unset_bit(VirtQueue *vq, int mask) { hwaddr pa; pa = vq-vring.used + offsetof(VRingUsed, flags); -stw_phys(pa, lduw_phys(pa) ~mask); +virtio_stw_phys(pa, virtio_lduw_phys(pa) ~mask); } static inline void vring_avail_event(VirtQueue *vq, uint16_t val) @@ -199,7 +214,7 @@ static inline void vring_avail_event(VirtQueue *vq,
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Am 08.08.2013 15:31, schrieb Anthony Liguori: Rusty Russell ru...@rustcorp.com.au writes: Virtio is currently defined to work as guest endian, but this is a problem if the guest can change endian. As most targets can't change endian, we make it a per-target option to avoid pessimising. This is based on a simpler patch by Anthony Liguouri, which only handled the vring accesses. We also need some drivers to access these helpers, eg. for data which contains headers. Signed-off-by: Rusty Russell ru...@rustcorp.com.au --- hw/virtio/virtio.c| 46 + include/hw/virtio/virtio-access.h | 138 ++ 2 files changed, 170 insertions(+), 14 deletions(-) create mode 100644 include/hw/virtio/virtio-access.h diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 8176c14..2887f17 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -18,6 +18,7 @@ #include hw/virtio/virtio.h #include qemu/atomic.h #include hw/virtio/virtio-bus.h +#include hw/virtio/virtio-access.h /* The alignment to use between consumer and producer parts of vring. * x86 pagesize again. */ @@ -84,6 +85,20 @@ struct VirtQueue EventNotifier host_notifier; }; +#ifdef TARGET_VIRTIO_SWAPENDIAN +bool virtio_byteswap; + +/* Ask target code if we should swap endian for all vring and config access. */ +static void mark_endian(void) +{ +virtio_byteswap = virtio_swap_endian(); +} +#else +static void mark_endian(void) +{ +} +#endif + It would be very good to avoid a target specific define here. We would like to move to only building a single copy of the virtio code. +1 We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. We'd need to check current_cpu then, which for Xen is always NULL. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Andreas Färber afaer...@suse.de writes: Am 08.08.2013 15:31, schrieb Anthony Liguori: Rusty Russell ru...@rustcorp.com.au writes: Virtio is currently defined to work as guest endian, but this is a problem if the guest can change endian. As most targets can't change endian, we make it a per-target option to avoid pessimising. This is based on a simpler patch by Anthony Liguouri, which only handled the vring accesses. We also need some drivers to access these helpers, eg. for data which contains headers. Signed-off-by: Rusty Russell ru...@rustcorp.com.au --- hw/virtio/virtio.c| 46 + include/hw/virtio/virtio-access.h | 138 ++ 2 files changed, 170 insertions(+), 14 deletions(-) create mode 100644 include/hw/virtio/virtio-access.h diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 8176c14..2887f17 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -18,6 +18,7 @@ #include hw/virtio/virtio.h #include qemu/atomic.h #include hw/virtio/virtio-bus.h +#include hw/virtio/virtio-access.h /* The alignment to use between consumer and producer parts of vring. * x86 pagesize again. */ @@ -84,6 +85,20 @@ struct VirtQueue EventNotifier host_notifier; }; +#ifdef TARGET_VIRTIO_SWAPENDIAN +bool virtio_byteswap; + +/* Ask target code if we should swap endian for all vring and config access. */ +static void mark_endian(void) +{ +virtio_byteswap = virtio_swap_endian(); +} +#else +static void mark_endian(void) +{ +} +#endif + It would be very good to avoid a target specific define here. We would like to move to only building a single copy of the virtio code. +1 We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. PPC64 is big endian. AFAIK, there is no such thing as a little endian PPC64 processor. This is just a processor mode where loads/stores are byte lane swapped. Hence the name 'cpu_get_byteswap'. It's just asking whether the load/stores are being swapped or not. At least for PPC64, it's not possible to enable/disable byte lane swapping for individual CPUs. It's done through a system-wide hcall. FWIW, I think most bi-endian architectures are this way too so I think this is equally applicable to ARM. Peter, is that right? Regards, Anthony Liguori We'd need to check current_cpu then, which for Xen is always NULL. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
On Thu, Aug 08, 2013 at 10:40:28AM -0500, Anthony Liguori wrote: Andreas Färber afaer...@suse.de writes: We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. PPC64 is big endian. AFAIK, there is no such thing as a little endian PPC64 processor. Unless I'm misunderstanding, this thread seems to suggest otherwise: [Qemu-devel] [PATCH 0/5] 64bit PowerPC little endian support https://lists.nongnu.org/archive/html/qemu-devel/2013-08/msg00813.html Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
On 8 August 2013 16:40, Anthony Liguori anth...@codemonkey.ws wrote: PPC64 is big endian. AFAIK, there is no such thing as a little endian PPC64 processor. What's your definition of little endian processor here if it isn't one which is doing byte swaps of data? I would describe a PPC64 with the relevant mode bit set as little endian. This is just a processor mode where loads/stores are byte lane swapped. Hence the name 'cpu_get_byteswap'. It's just asking whether the load/stores are being swapped or not. At least for PPC64, it's not possible to enable/disable byte lane swapping for individual CPUs. It's done through a system-wide hcall. FWIW, I think most bi-endian architectures are this way too so I think this is equally applicable to ARM. Peter, is that right? ARM's bi-endian story is complicated and depends on the CPU. Older CPUs didn't do byte-lane swapping of data; instead when the CPU was configured in big endian mode they would do address munging (XOR the address with 3 when doing a byte access; XOR with 1 for halfword access). New CPUs do byte-lane swapping (but only for data, not for instruction fetch). CPUs in the transition period can do either. In all cases, this is a per-cpu-core thing: you can have one core configured to be bigendian and the other little endian. You could in theory have the OS support both big and little endian userspace processes. We ideally would want to support QEMU is a little endian process but the VM's vcpu is currently bigendian. Fuller writeup here: http://translatedcode.wordpress.com/2012/04/02/this-end-up/ -- PMM
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Peter Maydell peter.mayd...@linaro.org writes: On 8 August 2013 16:40, Anthony Liguori anth...@codemonkey.ws wrote: PPC64 is big endian. AFAIK, there is no such thing as a little endian PPC64 processor. What's your definition of little endian processor here if it isn't one which is doing byte swaps of data? I would describe a PPC64 with the relevant mode bit set as little endian. Let's focus this to QEMU. PPC64 is still TARGET_WORDS_BIGENDIAN. It would be totally wrong to make this change to either a function call or to TARGET_WORDS_LITTLEENDIAN. This is just a processor mode where loads/stores are byte lane swapped. Hence the name 'cpu_get_byteswap'. It's just asking whether the load/stores are being swapped or not. At least for PPC64, it's not possible to enable/disable byte lane swapping for individual CPUs. It's done through a system-wide hcall. FWIW, I think most bi-endian architectures are this way too so I think this is equally applicable to ARM. Peter, is that right? ARM's bi-endian story is complicated and depends on the CPU. Older CPUs didn't do byte-lane swapping of data; instead when the CPU was configured in big endian mode they would do address munging (XOR the address with 3 when doing a byte access; XOR with 1 for halfword access). New CPUs do byte-lane swapping (but only for data, not for instruction fetch). CPUs in the transition period can do either. In all cases, this is a per-cpu-core thing: you can have one core configured to be bigendian and the other little endian. You could in theory have the OS support both big and little endian userspace processes. We ideally would want to support QEMU is a little endian process but the VM's vcpu is currently bigendian. Eek. Yeah, I guess we need to tie this to a CPUState then. Fuller writeup here: http://translatedcode.wordpress.com/2012/04/02/this-end-up/ Excellent, thanks! Regards, Anthony Liguori -- PMM
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Daniel P. Berrange berra...@redhat.com writes: On Thu, Aug 08, 2013 at 10:40:28AM -0500, Anthony Liguori wrote: Andreas Färber afaer...@suse.de writes: We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. PPC64 is big endian. AFAIK, there is no such thing as a little endian PPC64 processor. Unless I'm misunderstanding, this thread seems to suggest otherwise: [Qemu-devel] [PATCH 0/5] 64bit PowerPC little endian support https://lists.nongnu.org/archive/html/qemu-devel/2013-08/msg00813.html Yeah, it's confusing. It feels like little endian to most software but the distinction in hardware (and therefore QEMU) is important. It's the same processor. It still starts executing big endian instructions. A magic register value is tweaked and loads/stores are swapped. CPU data structures are still read as big endian though. It's really just load/stores that are affected. The distinction is important in QEMU. ppc64 is still TARGET_WORDS_BIGENDIAN. We still want most stl_phys to treat integers as big endian. There's just this extra concept that CPU loads/stores are sometimes byte swapped. That affects virtio but not a lot else. Regards, Anthony Liguori Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
On 8 August 2013 17:07, Anthony Liguori anth...@codemonkey.ws wrote: It's the same processor. It still starts executing big endian instructions. A magic register value is tweaked and loads/stores are swapped. I dunno about PPC, but for ARM generally the boot-up state is controlled by config signals which a SoC or board can hardwire, so you can have a SoC which is configured to start in big-endian mode. CPU data structures are still read as big endian though. Do you have an example of what you mean by CPU data structure? The distinction is important in QEMU. ppc64 is still TARGET_WORDS_BIGENDIAN. Ideally TARGET_WORDS_BIGENDIAN would go away -- it is forcing at compile time a setting which is actually a runtime one, and a lot of the weirdness here flows from that. We still want most stl_phys to treat integers as big endian. Any stl_phys() should [in an ideal design] be tied to a bus master which has its own idea of which endianness it is. That is, an stl_phys() for a DMA controller model ought to use the endianness programmed for the DMA controller, not whatever the CPU happens to be using. -- PMM
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Am 08.08.2013 17:40, schrieb Anthony Liguori: Andreas Färber afaer...@suse.de writes: Am 08.08.2013 15:31, schrieb Anthony Liguori: We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. PPC64 is big endian. AFAIK, there is no such thing as a little endian PPC64 processor. This is just a processor mode where loads/stores are byte lane swapped. Hence the name 'cpu_get_byteswap'. It's just asking whether the load/stores are being swapped or not. Exactly, just read it as is in ... Endian mode. On the CPUs I am more familiar with (e.g., 970), this used to be controlled via an MSR bit, which as CPUPPCState::msr exists per CPUState. I did not check on real hardware, but from the QEMU code this would allow for the mixed-endian scenario described above. At least for PPC64, it's not possible to enable/disable byte lane swapping for individual CPUs. It's done through a system-wide hcall. What is offending me is only the following: If we name it cpu_get_byteswap() as proposed by you, then its first argument should be a CPUState *cpu. Its value would be read from the derived type's state, such as the MSR bit in the code path that you wanted duplicated. The function implementing that register-reading would be a hook in CPUClass, with a default implementation in qom/cpu.c rather than a fallback in stubs/. To access CPUClass, CPUState cannot be NULL, as brought up by Stefano for cpu_do_unassigned_access(); not following that pattern prevents mixing CPU architectures, which my large refactorings have partially been about. Cf. my guest-memory-dump refactoring. If it is just some random global value, then please don't call it cpu_*(). Since sPAPR is not a target of its own, I don't see how/where you want to implement that hcall query as per-target function either, that might rather call for a QEMUMachine hook? I don't care or argue about byte lanes here, I am just trying to keep API design consistent and not regressing on the way to heterogeneous emulation. Regards, Andreas FWIW, I think most bi-endian architectures are this way too so I think this is equally applicable to ARM. Peter, is that right? Regards, Anthony Liguori We'd need to check current_cpu then, which for Xen is always NULL. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Peter Maydell peter.mayd...@linaro.org writes: On 8 August 2013 17:07, Anthony Liguori anth...@codemonkey.ws wrote: It's the same processor. It still starts executing big endian instructions. A magic register value is tweaked and loads/stores are swapped. I dunno about PPC, but for ARM generally the boot-up state is controlled by config signals which a SoC or board can hardwire, so you can have a SoC which is configured to start in big-endian mode. CPU data structures are still read as big endian though. Do you have an example of what you mean by CPU data structure? MMU tlb hash table. If you grep ldl target-ppc/* you'll see that there are only a few cases where bswap occurs. The distinction is important in QEMU. ppc64 is still TARGET_WORDS_BIGENDIAN. Ideally TARGET_WORDS_BIGENDIAN would go away -- it is forcing at compile time a setting which is actually a runtime one, and a lot of the weirdness here flows from that. We still want most stl_phys to treat integers as big endian. Any stl_phys() should [in an ideal design] be tied to a bus master which has its own idea of which endianness it is. That is, an stl_phys() for a DMA controller model ought to use the endianness programmed for the DMA controller, not whatever the CPU happens to be using. We have the DMA API that attempts to do this but maybe we need to generalize it a bit more... I think it's pretty true that we need a context and that the context for, say instruction fetch, is distinct from the context for load/store instructions. Regards, Anthony Liguori -- PMM
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
On 8 August 2013 17:25, Anthony Liguori anth...@codemonkey.ws wrote: Peter Maydell peter.mayd...@linaro.org writes: On 8 August 2013 17:07, Anthony Liguori anth...@codemonkey.ws wrote: CPU data structures are still read as big endian though. Do you have an example of what you mean by CPU data structure? MMU tlb hash table. If you grep ldl target-ppc/* you'll see that there are only a few cases where bswap occurs. Oh, right, that sort of in-memory data structure. If I understand correctly, the equivalent of that for ARM would be the MMU translation tables; on ARM there's a system control register bit for which endianness those are. Any stl_phys() should [in an ideal design] be tied to a bus master which has its own idea of which endianness it is. That is, an stl_phys() for a DMA controller model ought to use the endianness programmed for the DMA controller, not whatever the CPU happens to be using. We have the DMA API that attempts to do this but maybe we need to generalize it a bit more... I think it's pretty true that we need a context and that the context for, say instruction fetch, is distinct from the context for load/store instructions. A context might also give us a place to put other interesting information which in hardware gets passed around as transaction attributes on the bus, such as is this a userspace or privileged instruction. -- PMM
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Andreas Färber afaer...@suse.de writes: Am 08.08.2013 15:31, schrieb Anthony Liguori: Rusty Russell ru...@rustcorp.com.au writes: Virtio is currently defined to work as guest endian, but this is a problem if the guest can change endian. As most targets can't change endian, we make it a per-target option to avoid pessimising. This is based on a simpler patch by Anthony Liguouri, which only handled the vring accesses. We also need some drivers to access these helpers, eg. for data which contains headers. Signed-off-by: Rusty Russell ru...@rustcorp.com.au --- hw/virtio/virtio.c| 46 + include/hw/virtio/virtio-access.h | 138 ++ 2 files changed, 170 insertions(+), 14 deletions(-) create mode 100644 include/hw/virtio/virtio-access.h diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 8176c14..2887f17 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -18,6 +18,7 @@ #include hw/virtio/virtio.h #include qemu/atomic.h #include hw/virtio/virtio-bus.h +#include hw/virtio/virtio-access.h /* The alignment to use between consumer and producer parts of vring. * x86 pagesize again. */ @@ -84,6 +85,20 @@ struct VirtQueue EventNotifier host_notifier; }; +#ifdef TARGET_VIRTIO_SWAPENDIAN +bool virtio_byteswap; + +/* Ask target code if we should swap endian for all vring and config access. */ +static void mark_endian(void) +{ +virtio_byteswap = virtio_swap_endian(); +} +#else +static void mark_endian(void) +{ +} +#endif + It would be very good to avoid a target specific define here. We would like to move to only building a single copy of the virtio code. +1 We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. We'd need to check current_cpu then, which for Xen is always NULL. This is why the check is performed on a random CPU when they first acknowledge the device. It's a hacky assumption, but that's why there's a proposal to nail virtio to LE for the 1.0 OASIS standard. You can't actually change endian of a virtio device in flight: it doesn't make sense since there's readable state there already. Cheers, Rusty.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Anthony Liguori anth...@codemonkey.ws writes: Daniel P. Berrange berra...@redhat.com writes: On Thu, Aug 08, 2013 at 10:40:28AM -0500, Anthony Liguori wrote: Andreas Färber afaer...@suse.de writes: We have a mechanism to do weak functions via stubs/. I think it would be better to do cpu_get_byteswap() as a stub function and then overload it in the ppc64 code. If this as your name indicates is a per-CPU function then it should go into CPUClass. Interesting question is, what is virtio supposed to do if we have two ppc CPUs, one is Big Endian, the other is Little Endian. PPC64 is big endian. AFAIK, there is no such thing as a little endian PPC64 processor. Unless I'm misunderstanding, this thread seems to suggest otherwise: [Qemu-devel] [PATCH 0/5] 64bit PowerPC little endian support https://lists.nongnu.org/archive/html/qemu-devel/2013-08/msg00813.html Yeah, it's confusing. It feels like little endian to most software but the distinction in hardware (and therefore QEMU) is important. It's the same processor. It still starts executing big endian instructions. A magic register value is tweaked and loads/stores are swapped. CPU data structures are still read as big endian though. It's really just load/stores that are affected. The distinction is important in QEMU. ppc64 is still TARGET_WORDS_BIGENDIAN. We still want most stl_phys to treat integers as big endian. There's just this extra concept that CPU loads/stores are sometimes byte swapped. That affects virtio but not a lot else. You've redefined endian here; please don't do that. Endian is the order in memory which a CPU does loads and stores. From any reasonable definition, PPC is bi-endian. It's actually a weird thing for the qemu core to know at all: almost everything which cares is in target-specific code. The exceptions are gdb stubs and virtio, both of which are native endian (and that weird code in exec.c: what is notdirty_mem_write?). Your argument that we shouldn't fix stl_* might be justifiable (ie. just hack virtio and gdb as one-offs), but it's neither clear nor least surprise. Chers, Rusty.
Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access
Hi, The distinction is important in QEMU. ppc64 is still TARGET_WORDS_BIGENDIAN. We still want most stl_phys to treat integers as big endian. There's just this extra concept that CPU loads/stores are sometimes byte swapped. That affects virtio but not a lot else. You've redefined endian here; please don't do that. Endian is the order in memory which a CPU does loads and stores. From any reasonable definition, PPC is bi-endian. It's actually a weird thing for the qemu core to know at all: almost everything which cares is in target-specific code. The exceptions are gdb stubs and virtio, both of which are native endian (and that weird code in exec.c: what is notdirty_mem_write?). Your argument that we shouldn't fix stl_* might be justifiable (ie. just hack virtio and gdb as one-offs), but it's neither clear nor least surprise. Here is the hack I have to get gdbstub going with a little endian PowerPC kernel. Basically: LE guest - BE QEMU - BE gdb (pointing at the LE vmlinux) In this setup, gdb expects registers to be sent in little endian mode. It's a pretty big mistake for the gdb remote protocol to be using native endian to transfer registers especially when there is no other protocol negotation to work out what endian that is. Anton -- Index: b/gdbstub.c === --- a/gdbstub.c +++ b/gdbstub.c @@ -317,6 +317,8 @@ static GDBState *gdbserver_state; bool gdb_has_xml; +bool gdbstub_cross_endian; + #ifdef CONFIG_USER_ONLY /* XXX: This is not thread safe. Do we care? */ static int gdbserver_fd = -1; Index: b/include/exec/gdbstub.h === --- a/include/exec/gdbstub.h +++ b/include/exec/gdbstub.h @@ -42,8 +42,13 @@ static inline int cpu_index(CPUState *cp /* The GDB remote protocol transfers values in target byte order. This means * we can use the raw memory access routines to access the value buffer. * Conveniently, these also handle the case where the buffer is mis-aligned. + * + * We do need to byte swap if the CPU isn't running in the QEMU compiled + * target endian mode. */ +extern bool gdbstub_cross_endian; + static inline int gdb_get_reg8(uint8_t *mem_buf, uint8_t val) { stb_p(mem_buf, val); @@ -52,28 +57,49 @@ static inline int gdb_get_reg8(uint8_t * static inline int gdb_get_reg16(uint8_t *mem_buf, uint16_t val) { -stw_p(mem_buf, val); +if (gdbstub_cross_endian) +stw_p(mem_buf, bswap16(val)); +else +stw_p(mem_buf, val); return 2; } static inline int gdb_get_reg32(uint8_t *mem_buf, uint32_t val) { -stl_p(mem_buf, val); +if (gdbstub_cross_endian) +stq_p(mem_buf, bswap32(val)); +else +stl_p(mem_buf, val); return 4; } static inline int gdb_get_reg64(uint8_t *mem_buf, uint64_t val) { -stq_p(mem_buf, val); +if (gdbstub_cross_endian) +stq_p(mem_buf, bswap64(val)); +else +stq_p(mem_buf, val); return 8; } #if TARGET_LONG_BITS == 64 #define gdb_get_regl(buf, val) gdb_get_reg64(buf, val) -#define ldtul_p(addr) ldq_p(addr) +static inline uint64_t ldtul_p(const void *ptr) +{ + uint64_t tmp = ldq_p(ptr); + if (gdbstub_cross_endian) + tmp = bswap64(tmp); + return tmp; +} #else #define gdb_get_regl(buf, val) gdb_get_reg32(buf, val) -#define ldtul_p(addr) ldl_p(addr) +static inline uint32_t ldtul_p(const void *ptr) +{ + uint32_t tmp = ldl_p(ptr); + if (gdbstub_cross_endian) + tmp = bswap32(tmp); + return tmp; +} #endif #endif