[PATCH] hw/intc/arm_gicv3_kvm.c: Set the qemu_irq/gsi mapping for VFIO platform
Eric added the qemu_irq/gsi hash table so that a VFIO platform device can set up irqfd when KVM is enabled [1], and he set up the qemu_irq/gsi mapping in arm_gic_kvm.c [2]. However, this mapping is not set up in arm_gicv3_kvm.c. When a VM uses a VFIO platform device with GICv3, the irqfd setup fails and falls back to a userspace-handled eventfd in `vfio_start_irqfd_injection`.

This patch sets up the qemu_irq/gsi mapping for GICv3, so that a VFIO platform device with GICv3 can use KVM irqfd for acceleration.

[1] https://lore.kernel.org/qemu-devel/20150706183506.15635.61812.st...@gimli.home/
[2] https://lore.kernel.org/qemu-devel/20150706183512.15635.915.st...@gimli.home/

Signed-off-by: Luca Wei
---
 hw/intc/arm_gicv3_kvm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 72ad916d3d..7e90f8b723 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -807,6 +807,11 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)

     gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL);

+    for (i = 0; i < s->num_irq - GIC_INTERNAL; i++) {
+        qemu_irq irq = qdev_get_gpio_in(dev, i);
+        kvm_irqchip_set_qemuirq_gsi(kvm_state, irq, i);
+    }
+
     for (i = 0; i < s->num_cpu; i++) {
         ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));

-- 
2.41.0.windows.3
[PATCH] Fixed incorrect LLONG alignment for openrisc and cris
From: Luca Bonissi
Date: Thu, 3 Aug 2023 02:15:57 +0200
Subject: [PATCH] Fixed incorrect LLONG alignment for openrisc and cris

OpenRISC (or1k) aligns long long to 4 bytes, but this is currently not defined in abitypes.h. This leads to incorrect packing of the /epoll_event/ structure and eventually to an infinite loop while waiting for file descriptor events.

Also fixed the CRIS alignments (1 byte for all types).

Signed-off-by: Luca Bonissi
---
 include/exec/user/abitypes.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/exec/user/abitypes.h b/include/exec/user/abitypes.h
index 6191ce9f74..6178453d94 100644
--- a/include/exec/user/abitypes.h
+++ b/include/exec/user/abitypes.h
@@ -15,8 +15,16 @@
 #define ABI_LLONG_ALIGNMENT 2
 #endif

+#ifdef TARGET_CRIS
+#define ABI_SHORT_ALIGNMENT 1
+#define ABI_INT_ALIGNMENT 1
+#define ABI_LONG_ALIGNMENT 1
+#define ABI_LLONG_ALIGNMENT 1
+#endif
+
 #if (defined(TARGET_I386) && !defined(TARGET_X86_64)) \
     || defined(TARGET_SH4) \
+    || defined(TARGET_OPENRISC) \
     || defined(TARGET_MICROBLAZE) \
     || defined(TARGET_NIOS2)
 #define ABI_LLONG_ALIGNMENT 4
-- 
2.35.8
Re: [PATCH] Wrong unpacked structure for epoll_event on qemu-or1k (openrisc user-space)
On 19/07/23 10:49, Laurent Vivier wrote:
> According to linux/glibc sources, epoll is only packed for x86_64.

And, in recent glibc, also for i386, even though it does not seem necessary: even if __alignof__(long long) is 8, structures like epoll_event are only 12 bytes, maybe "packed" for historical reasons. Ancient i386 gcc versions (<3.0.0) have 4 bytes for __alignof__(long long).

> Perhaps the default alignment of long is not correctly defined in qemu for openrisc?

__alignof__(long long):
- 8 bytes: all 64 bit targets + arm, hppa, mips, ppc, sparc, xtensa, x86
- 4 bytes: microblaze, nios2, or1k, sh4
- 2 bytes: m68k
- 1 byte : cris

offsetof(struct epoll_event,data):
- 8: all 64 bit targets + arm, hppa, mips, ppc, sparc, xtensa
- 4: cris, m68k, microblaze, nios2, or1k, sh4, x86

So, epoll_event is "naturally" packed on the following targets (checked in linux-user container and/or with cross-compiler):
- cris, m68k, microblaze, nios2, or1k, sh4, x86 (32bit)

> See include/exec/user/abitypes.h to update the value.

OK, abitypes.h should be updated with the following patch (discard the previous patch on syscall_defs.h):

Signed-off-by: Luca Bonissi
---
diff -up a/include/exec/user/abitypes.h b/include/exec/user/abitypes.h
--- a/include/exec/user/abitypes.h 2023-03-27 15:41:42.511916232 +0200
+++ b/include/exec/user/abitypes.h 2023-07-19 12:09:03.001687788 +0200
@@ -15,7 +15,15 @@
 #define ABI_LLONG_ALIGNMENT 2
 #endif

+#ifdef TARGET_CRIS
+#define ABI_SHORT_ALIGNMENT 1
+#define ABI_INT_ALIGNMENT 1
+#define ABI_LONG_ALIGNMENT 1
+#define ABI_LLONG_ALIGNMENT 1
+#endif
+
-#if (defined(TARGET_I386) && !defined(TARGET_X86_64)) || defined(TARGET_SH4)
+#if (defined(TARGET_I386) && !defined(TARGET_X86_64)) || defined(TARGET_SH4) || \
+    defined(TARGET_OPENRISC) || defined(TARGET_NIOS2) || defined(TARGET_MICROBLAZE)
 #define ABI_LLONG_ALIGNMENT 4
 #endif
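
The effect of the alignment values listed above can be checked on any host with a small layout experiment. The following is only a sketch under stated assumptions: the names `u64_align4`, `u64_align8`, `ev_align4` and `ev_align8` are invented for illustration, and the code relies on the GCC/Clang `aligned` attribute on a typedef to imitate a target whose 64-bit type is 4- or 8-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a target 64-bit type with a chosen alignment.
 * GCC/Clang extension: on a typedef, aligned() may lower as well as raise
 * the alignment. */
typedef uint64_t u64_align4 __attribute__((aligned(4)));
typedef uint64_t u64_align8 __attribute__((aligned(8)));

/* Same field order as epoll_event: 32-bit events, then 64-bit data. */
struct ev_align4 {
    uint32_t   events;
    u64_align4 data;
};

struct ev_align8 {
    uint32_t   events;
    u64_align8 data;
};

/* Offset of the data member for each alignment choice. */
static size_t data_offset_align4(void)
{
    return offsetof(struct ev_align4, data);
}

static size_t data_offset_align8(void)
{
    return offsetof(struct ev_align8, data);
}
```

With 4-byte alignment the data member sits at offset 4 and the structure is 12 bytes; with 8-byte alignment it moves to offset 8 and the structure grows to 16 bytes, so an emulator using the wrong alignment writes the data member where the guest does not look for it.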
Re: [PATCH] Missing CASA instruction handling for SPARC qemu-user
On qemu-sparc (user-space), the CASA instruction is not handled for SPARC32 even if the selected CPU (e.g. LEON3) supports it. A working patch follows.

The patch also fixes an incorrect CPU type for 32-bit and adds the missing configurable CPU features TA0_SHUTDOWN, ASR17, CACHE_CTRL, POWERDOWN, and CASA.

Re-posting to add the "Signed-off-by" line. This version also removes the unused functions from the qemu-sparc (32-bit) build, and consequently drops the helper patch (needed only by those removed functions).

Signed-off-by: Luca Bonissi
---
diff -urp a/linux-user/syscall.c b/linux-user/syscall.c
--- a/linux-user/syscall.c 2023-03-27 15:41:42.0 +0200
+++ b/linux-user/syscall.c 2023-04-01 13:54:14.709136932 +0200
@@ -8286,7 +8286,11 @@ static int open_net_route(CPUArchState *
 #if defined(TARGET_SPARC)
 static int open_cpuinfo(CPUArchState *cpu_env, int fd)
 {
+#if defined(TARGET_SPARC64)
     dprintf(fd, "type\t\t: sun4u\n");
+#else
+    dprintf(fd, "type\t\t: sun4m\n");
+#endif
     return 0;
 }
 #endif
diff -urp a/target/sparc/cpu.c b/target/sparc/cpu.c
--- a/target/sparc/cpu.c 2023-03-27 15:41:42.0 +0200
+++ b/target/sparc/cpu.c 2023-03-31 21:32:54.927008782 +0200
@@ -560,6 +560,11 @@ static const char * const feature_name[]
     "hypv",
     "cmt",
     "gl",
+    "ta0shdn",
+    "asr17",
+    "cachectrl",
+    "powerdown",
+    "casa",
 };

 static void print_features(uint32_t features, const char *prefix)
@@ -852,6 +857,11 @@ static Property sparc_cpu_properties[] =
     DEFINE_PROP_BIT("hypv", SPARCCPU, env.def.features, 11, false),
     DEFINE_PROP_BIT("cmt", SPARCCPU, env.def.features, 12, false),
     DEFINE_PROP_BIT("gl", SPARCCPU, env.def.features, 13, false),
+    DEFINE_PROP_BIT("ta0shdn", SPARCCPU, env.def.features, 14, false),
+    DEFINE_PROP_BIT("asr17", SPARCCPU, env.def.features, 15, false),
+    DEFINE_PROP_BIT("cachectrl", SPARCCPU, env.def.features, 16, false),
+    DEFINE_PROP_BIT("powerdown", SPARCCPU, env.def.features, 17, false),
+    DEFINE_PROP_BIT("casa", SPARCCPU, env.def.features, 18, false),
     DEFINE_PROP_UNSIGNED("iu-version", SPARCCPU, env.def.iu_version, 0,
                          qdev_prop_uint64, target_ulong),
     DEFINE_PROP_UINT32("fpu-version", SPARCCPU, env.def.fpu_version, 0),
diff -urp a/target/sparc/translate.c b/target/sparc/translate.c
--- a/target/sparc/translate.c 2023-03-27 15:41:42.0 +0200
+++ b/target/sparc/translate.c 2023-07-18 17:27:30.681134549 +0200
@@ -1917,7 +1917,6 @@ static void gen_ldstub(DisasContext *dc,
 }

 /* asi moves */
-#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
 typedef enum {
     GET_ASI_HELPER,
     GET_ASI_EXCP,
@@ -2149,6 +2148,7 @@ static DisasASI get_asi(DisasContext *dc
     return (DisasASI){ type, asi, mem_idx, memop };
 }

+#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
 static void gen_ld_asi(DisasContext *dc, TCGv dst, TCGv addr,
                        int insn, MemOp memop)
 {
@@ -2277,6 +2277,7 @@ static void gen_swap_asi(DisasContext *d
         break;
     }
 }
+#endif // !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)

 static void gen_cas_asi(DisasContext *dc, TCGv addr, TCGv cmpv,
                         int insn, int rd)
@@ -2300,6 +2301,7 @@ static void gen_cas_asi(DisasContext *dc
     }
 }

+#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
 static void gen_ldstub_asi(DisasContext *dc, TCGv dst, TCGv addr, int insn)
 {
     DisasASI da = get_asi(dc, insn, MO_UB);
@@ -5508,7 +5510,6 @@ static void disas_sparc_insn(DisasContex
     case 0x37: /* stdc */
         goto ncp_insn;
 #endif
-#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
     case 0x3c: /* V9 or LEON3 casa */
 #ifndef TARGET_SPARC64
         CHECK_IU_FEATURE(dc, CASA);
@@ -5517,7 +5518,6 @@ static void disas_sparc_insn(DisasContex
         cpu_src2 = gen_load_gpr(dc, rs2);
         gen_cas_asi(dc, cpu_addr, cpu_src2, insn, rd);
         break;
-#endif
     default:
         goto illegal_insn;
     }
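
For readers unfamiliar with the instruction: CASA is a 32-bit compare-and-swap. It compares a register with a word in memory, stores the new value only if the two match, and in either case returns the old memory word. A rough C11 model follows; it is purely illustrative, it ignores the ASI operand, and the name `casa_model` is invented here (it is not a QEMU function).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative model of compare-and-swap: if *addr == cmp, store new_val.
 * The previous memory contents are returned in either case. */
static uint32_t casa_model(_Atomic uint32_t *addr, uint32_t cmp,
                           uint32_t new_val)
{
    uint32_t expected = cmp;

    /* On failure, atomic_compare_exchange_strong() writes the current
     * memory value into 'expected'; on success 'expected' already equals
     * the old value. Either way it ends up holding the old contents. */
    atomic_compare_exchange_strong(addr, &expected, new_val);
    return expected;
}
```

Guest code typically builds locks and lock-free updates on this primitive, which is why a user-space program for a CPU that advertises CASA cannot run if the translator rejects the opcode.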
Re: [PATCH] Wrong unpacked structure for epoll_event on qemu-or1k (openrisc user-space)
On 18/07/23 16:40, Peter Maydell wrote:
> Hi; thanks for this patch. Unfortunately we need patches to include a Signed-off-by: line that says you're legally OK with it being contributed to QEMU, or we can't take them.

Sorry for the missing "signed-off-by" line, adding it just now:

==

The or1k epoll_event structure - unlike other architectures - is packed, so we need to define it as packed in qemu-user, otherwise it leads to infinite loop due to missing file descriptor in the returned data:

Signed-off-by: Luca Bonissi
---
diff -up a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
--- a/linux-user/syscall_defs.h 2023-03-27 15:41:42.0 +0200
+++ b/linux-user/syscall_defs.h 2023-06-30 17:29:39.034322213 +0200
@@ -2714,7 +2709,7 @@
 #define FUTEX_CMD_MASK ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)

 #ifdef CONFIG_EPOLL
-#if defined(TARGET_X86_64)
+#if defined(TARGET_X86_64) || defined(TARGET_OPENRISC)
 #define TARGET_EPOLL_PACKED QEMU_PACKED
 #else
 #define TARGET_EPOLL_PACKED
Re: [PATCH] Wrong signed data type on pageflags_* functions - limit to 2GB memory allocation
On 32-bit qemu-user targets, memory allocation fails after about 2GB because the "last" parameter of the pageflags_find and pageflags_next functions (file accel/tcg/user-exec.c) is declared signed instead of unsigned. On 32-bit targets the parameter is sign-extended when it is passed on as a 64-bit uint64_t, leading to incorrect comparisons in the RBTree (only the first call to mmap in the upper 2GB of memory succeeds).

The patch below fixes the bug (re-submitted to add "Signed-off-by"):

Signed-off-by: Luca Bonissi
---
diff -up a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
--- a/accel/tcg/user-exec.c 2023-03-27 15:41:42.0 +0200
+++ b/accel/tcg/user-exec.c 2023-07-15 14:09:07.160453759 +0200
@@ -144,7 +144,7 @@ typedef struct PageFlagsNode {

 static IntervalTreeRoot pageflags_root;

-static PageFlagsNode *pageflags_find(target_ulong start, target_long last)
+static PageFlagsNode *pageflags_find(target_ulong start, target_ulong last)
 {
     IntervalTreeNode *n;

@@ -153,7 +153,7 @@ static PageFlagsNode *pageflags_find(tar
 }

 static PageFlagsNode *pageflags_next(PageFlagsNode *p, target_ulong start,
-                                     target_long last)
+                                     target_ulong last)
 {
     IntervalTreeNode *n;
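
The sign-extension effect described above is easy to see in isolation. In this minimal sketch `int32_t` and `uint32_t` stand in for target_long and target_ulong on a 32-bit target; the function names `widen_signed` and `widen_unsigned` are made up for the illustration.

```c
#include <stdint.h>

/* Widening a signed 32-bit value sign-extends: bit 31 is copied into
 * the upper 32 bits of the 64-bit result. */
static uint64_t widen_signed(int32_t last)
{
    return (uint64_t)last;
}

/* Widening an unsigned 32-bit value zero-extends: the upper 32 bits
 * of the result are always zero. */
static uint64_t widen_unsigned(uint32_t last)
{
    return (uint64_t)last;
}
```

Any address at or above 0x80000000 has bit 31 set, so the signed variant turns it into a value near the top of the 64-bit range, and a range whose "last" is compared that way no longer matches the intervals stored in the tree.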
Missing CASA instruction handling for SPARC qemu-user
On qemu-sparc (user-space), the CASA instruction is not handled for SPARC32 even if the selected CPU (e.g. LEON3) supports it. A working patch follows.

I created "fake" ld/st_asi helpers: everything seems to work fine, but I don't know whether we should make real ld/st helpers as for SPARC64 user-space. Please check.

The patch also fixes an incorrect CPU type for 32-bit and adds the missing configurable CPU features TA0_SHUTDOWN, ASR17, CACHE_CTRL, POWERDOWN, and CASA.

diff -urp qemu-20230327.orig/linux-user/syscall.c qemu-20230327/linux-user/syscall.c
--- qemu-20230327.orig/linux-user/syscall.c 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/linux-user/syscall.c 2023-04-01 13:54:14.709136932 +0200
@@ -8286,7 +8286,11 @@ static int open_net_route(CPUArchState *
 #if defined(TARGET_SPARC)
 static int open_cpuinfo(CPUArchState *cpu_env, int fd)
 {
+#if defined(TARGET_SPARC64)
     dprintf(fd, "type\t\t: sun4u\n");
+#else
+    dprintf(fd, "type\t\t: sun4m\n");
+#endif
     return 0;
 }
 #endif
diff -urp qemu-20230327.orig/target/sparc/cpu.c qemu-20230327/target/sparc/cpu.c
--- qemu-20230327.orig/target/sparc/cpu.c 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/target/sparc/cpu.c 2023-03-31 21:32:54.927008782 +0200
@@ -560,6 +560,11 @@ static const char * const feature_name[]
     "hypv",
     "cmt",
     "gl",
+    "ta0shdn",
+    "asr17",
+    "cachectrl",
+    "powerdown",
+    "casa",
 };

 static void print_features(uint32_t features, const char *prefix)
@@ -852,6 +857,11 @@ static Property sparc_cpu_properties[] =
     DEFINE_PROP_BIT("hypv", SPARCCPU, env.def.features, 11, false),
     DEFINE_PROP_BIT("cmt", SPARCCPU, env.def.features, 12, false),
     DEFINE_PROP_BIT("gl", SPARCCPU, env.def.features, 13, false),
+    DEFINE_PROP_BIT("ta0shdn", SPARCCPU, env.def.features, 14, false),
+    DEFINE_PROP_BIT("asr17", SPARCCPU, env.def.features, 15, false),
+    DEFINE_PROP_BIT("cachectrl", SPARCCPU, env.def.features, 16, false),
+    DEFINE_PROP_BIT("powerdown", SPARCCPU, env.def.features, 17, false),
+    DEFINE_PROP_BIT("casa", SPARCCPU, env.def.features, 18, false),
     DEFINE_PROP_UNSIGNED("iu-version", SPARCCPU, env.def.iu_version, 0,
                          qdev_prop_uint64, target_ulong),
     DEFINE_PROP_UINT32("fpu-version", SPARCCPU, env.def.fpu_version, 0),
diff -urp qemu-20230327.orig/target/sparc/helper.h qemu-20230327/target/sparc/helper.h
--- qemu-20230327.orig/target/sparc/helper.h 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/target/sparc/helper.h 2023-03-31 20:41:36.084224862 +0200
@@ -38,10 +38,10 @@ DEF_HELPER_3(tsubcctv, tl, env, tl, tl)
 DEF_HELPER_FLAGS_3(sdivx, TCG_CALL_NO_WG, s64, env, s64, s64)
 DEF_HELPER_FLAGS_3(udivx, TCG_CALL_NO_WG, i64, env, i64, i64)
 #endif
-#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
+//#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
 DEF_HELPER_FLAGS_4(ld_asi, TCG_CALL_NO_WG, i64, env, tl, int, i32)
 DEF_HELPER_FLAGS_5(st_asi, TCG_CALL_NO_WG, void, env, tl, i64, int, i32)
-#endif
+//#endif
 DEF_HELPER_FLAGS_1(check_ieee_exceptions, TCG_CALL_NO_WG, tl, env)
 DEF_HELPER_FLAGS_3(ldfsr, TCG_CALL_NO_RWG, tl, env, tl, i32)
 DEF_HELPER_FLAGS_1(fabss, TCG_CALL_NO_RWG_SE, f32, f32)
diff -urp qemu-20230327.orig/target/sparc/ldst_helper.c qemu-20230327/target/sparc/ldst_helper.c
--- qemu-20230327.orig/target/sparc/ldst_helper.c 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/target/sparc/ldst_helper.c 2023-03-31 21:02:21.897968335 +0200
@@ -1167,7 +1168,19 @@ void helper_st_asi(CPUSPARCState *env, t
 #endif
 }

+#else /* CONFIG_USER_ONLY */
+uint64_t helper_ld_asi(CPUSPARCState *env, target_ulong addr,
+                       int asi, uint32_t memop)
+{
+    return(0);
+}
+void helper_st_asi(CPUSPARCState *env, target_ulong addr, uint64_t val,
+                   int asi, uint32_t memop)
+{
+}
+
 #endif /* CONFIG_USER_ONLY */
+
 #else /* TARGET_SPARC64 */

 #ifdef CONFIG_USER_ONLY
diff -urp qemu-20230327.orig/target/sparc/translate.c qemu-20230327/target/sparc/translate.c
--- qemu-20230327.orig/target/sparc/translate.c 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/target/sparc/translate.c 2023-04-01 15:24:18.293176711 +0200
@@ -1910,7 +1910,8 @@ static void gen_ldstub(DisasContext *dc,
 }

 /* asi moves */
-#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
+//#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
+#if 1
 typedef enum {
     GET_ASI_HELPER,
     GET_ASI_EXCP,
@@ -5521,7 +5522,7 @@ static void disas_sparc_insn(DisasContex
     case 0x37: /* stdc */
         goto ncp_insn;
 #endif
-#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
+//#if !defined(CONFIG_USER_ONLY) || defined(TARGET_SPARC64)
     case 0x3c: /* V9 or LEON3 casa */
 #ifndef TARGET_SPARC64
         CHECK_IU_FEATURE(dc, CASA);
@@ -5529,8 +5530,8 @@ static void disas_sparc_insn(DisasContex
         rs2 = GE
Wrong unpacked structure for epoll_event on qemu-or1k (openrisc user-space)
The or1k epoll_event structure - unlike other architectures - is packed, so we need to define it as packed in qemu-user, otherwise it leads to infinite loop due to missing file descriptor in the returned data:

--- qemu-20230327/linux-user/syscall_defs.h.orig 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/linux-user/syscall_defs.h 2023-06-30 17:29:39.034322213 +0200
@@ -2714,7 +2709,7 @@
 #define FUTEX_CMD_MASK ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)

 #ifdef CONFIG_EPOLL
-#if defined(TARGET_X86_64)
+#if defined(TARGET_X86_64) || defined(TARGET_OPENRISC)
 #define TARGET_EPOLL_PACKED QEMU_PACKED
 #else
 #define TARGET_EPOLL_PACKED
Wrong signed data type on pageflags_* functions - limit to 2GB memory allocation
On 32-bit qemu-user targets, memory allocation fails after about 2GB because the "last" parameter of the pageflags_find and pageflags_next functions (file accel/tcg/user-exec.c) is declared signed instead of unsigned. On 32-bit targets the parameter is sign-extended when it is passed on as a 64-bit uint64_t, leading to incorrect comparisons in the RBTree (only the first call to mmap in the upper 2GB of memory succeeds).

The patch below fixes the bug:

--- qemu-20230327.orig/accel/tcg/user-exec.c 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/accel/tcg/user-exec.c 2023-07-15 14:09:07.160453759 +0200
@@ -144,7 +144,7 @@ typedef struct PageFlagsNode {

 static IntervalTreeRoot pageflags_root;

-static PageFlagsNode *pageflags_find(target_ulong start, target_long last)
+static PageFlagsNode *pageflags_find(target_ulong start, target_ulong last)
 {
     IntervalTreeNode *n;

@@ -153,7 +153,7 @@ static PageFlagsNode *pageflags_find(tar
 }

 static PageFlagsNode *pageflags_next(PageFlagsNode *p, target_ulong start,
-                                     target_long last)
+                                     target_ulong last)
 {
     IntervalTreeNode *n;
Re: stat64 wrong on sparc64 user
On 29/03/23 18:22, Laurent Vivier wrote:
> On 28/03/2023 at 14:22, Luca Bonissi wrote:
>> On 28/03/23 13:55, Thomas Huth wrote:
>>> On 28/03/2023 13.48, Luca Bonissi wrote:
>>>> --- qemu-20230327/linux-user/syscall_defs.h 2023-03-27 15:41:42.0 +0200
>>>> +++ qemu-20230327/linux-user/syscall_defs.h.new 2023-03-27 21:43:25.615115126 +0200
>>>> @@ -1450,7 +1450,7 @@ struct target_stat {
>>>>      unsigned int st_dev;
>>>>      abi_ulong st_ino;
>>>>      unsigned int st_mode;
>>>> -    unsigned int st_nlink;
>>>> +    short int st_nlink;
>>>>      unsigned int st_uid;
>
> To have automatic alignment according to target ABI, you must use abi_XXX type (see include/exec/user/abitypes.h)

I tried to keep the source code as untouched as possible, but it is no problem to change all fields to abi_XXX. Tested for sparc and sparc64:

--- qemu-20230327/linux-user/syscall_defs.h.orig 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/linux-user/syscall_defs.h 2023-03-30 12:52:46.308640526 +0200
@@ -1447,13 +1447,13 @@ struct target_eabi_stat64 {
 #elif defined(TARGET_SPARC64) && !defined(TARGET_ABI32)
 struct target_stat {
-    unsigned int st_dev;
+    abi_uint st_dev;
     abi_ulong st_ino;
-    unsigned int st_mode;
-    unsigned int st_nlink;
-    unsigned int st_uid;
-    unsigned int st_gid;
-    unsigned int st_rdev;
+    abi_uint st_mode;
+    abi_short st_nlink;
+    abi_uint st_uid;
+    abi_uint st_gid;
+    abi_uint st_rdev;
     abi_long st_size;
     abi_long target_st_atime;
     abi_long target_st_mtime;
@@ -1465,25 +1465,23 @@ struct target_stat {
 #define TARGET_HAS_STRUCT_STAT64
 struct target_stat64 {
-    unsigned char __pad0[6];
-    unsigned short st_dev;
+    abi_ullong st_dev;
-    uint64_t st_ino;
-    uint64_t st_nlink;
+    abi_ullong st_ino;
+    abi_ullong st_nlink;
-    unsigned int st_mode;
+    abi_uint st_mode;
-    unsigned int st_uid;
-    unsigned int st_gid;
+    abi_uint st_uid;
+    abi_uint st_gid;
-    unsigned char __pad2[6];
-    unsigned short st_rdev;
+    abi_uint __pad0;
+    abi_ullong st_rdev;
-    int64_t st_size;
-    int64_t st_blksize;
+    abi_llong st_size;
+    abi_llong st_blksize;
-    unsigned char __pad4[4];
-    unsigned int st_blocks;
+    abi_llong st_blocks;
     abi_ulong target_st_atime;
     abi_ulong target_st_atime_nsec;
@@ -1501,13 +1499,13 @@ struct target_stat64 {
 #define TARGET_STAT_HAVE_NSEC
 struct target_stat {
-    unsigned short st_dev;
+    abi_ushort st_dev;
     abi_ulong st_ino;
-    unsigned short st_mode;
-    short st_nlink;
-    unsigned short st_uid;
-    unsigned short st_gid;
-    unsigned short st_rdev;
+    abi_ushort st_mode;
+    abi_short st_nlink;
+    abi_ushort st_uid;
+    abi_ushort st_gid;
+    abi_ushort st_rdev;
     abi_long st_size;
     abi_long target_st_atime;
     abi_ulong target_st_atime_nsec;
@@ -1522,39 +1520,37 @@ struct target_stat {
 #define TARGET_HAS_STRUCT_STAT64
 struct target_stat64 {
-    unsigned char __pad0[6];
-    unsigned short st_dev;
+    abi_ullong st_dev;
-    uint64_t st_ino;
+    abi_ullong st_ino;
-    unsigned int st_mode;
-    unsigned int st_nlink;
+    abi_uint st_mode;
+    abi_uint st_nlink;
-    unsigned int st_uid;
-    unsigned int st_gid;
+    abi_uint st_uid;
+    abi_uint st_gid;
-    unsigned char __pad2[6];
-    unsigned short st_rdev;
+    abi_ullong st_rdev;
     unsigned char __pad3[8];
-    int64_t st_size;
-    unsigned int st_blksize;
+    abi_llong st_size;
+    abi_uint st_blksize;
     unsigned char __pad4[8];
-    unsigned int st_blocks;
+    abi_uint st_blocks;
-    unsigned int target_st_atime;
-    unsigned int target_st_atime_nsec;
+    abi_uint target_st_atime;
+    abi_uint target_st_atime_nsec;
-    unsigned int target_st_mtime;
-    unsigned int target_st_mtime_nsec;
+    abi_uint target_st_mtime;
+    abi_uint target_st_mtime_nsec;
-    unsigned int target_st_ctime;
-    unsigned int target_st_ctime_nsec;
+    abi_uint target_st_ctime;
+    abi_uint target_st_ctime_nsec;
-    unsigned int __unused1;
-    unsigned int __unused2;
+    abi_uint __unused1;
+    abi_uint __unused2;
 };
 #elif defined(TARGET_PPC)
Re: stat64 wrong on sparc64 user
On 28/03/23 13:55, Thomas Huth wrote:
> On 28/03/2023 13.48, Luca Bonissi wrote:
>> --- qemu-20230327/linux-user/syscall_defs.h 2023-03-27 15:41:42.0 +0200
>> +++ qemu-20230327/linux-user/syscall_defs.h.new 2023-03-27 21:43:25.615115126 +0200
>> @@ -1450,7 +1450,7 @@ struct target_stat {
>>      unsigned int st_dev;
>>      abi_ulong st_ino;
>>      unsigned int st_mode;
>> -    unsigned int st_nlink;
>> +    short int st_nlink;
>>      unsigned int st_uid;
>
> That looks wrong at a first glance. IIRC Sparc is a very strictly aligned architecture, so if the previous field "st_mode" was aligned to a 4-byte boundary, the "st_uid" field now would not be aligned anymore... are you sure about this change? Maybe it needs a padding field now?

The padding is automatic (on both Sparc and x86-64): short is aligned to a 2-byte boundary, int to a 4-byte boundary, and long to an 8-byte boundary.

E.g.:
st_dev=0x05060708;
st_ino=0x1112131415161718;
st_mode=0x1a1b1c1d;
st_nlink=0x2728;
st_uid=0x2a2b2c2d;
st_gid=0x3a3b3c3d;
st_rdev=0x35363738;
st_size=0x4142434445464748;
st_blksize=0x5152535455565758;

will result in (sparc64 - big endian):
00: 05 06 07 08 00 00 00 00
08: 11 12 13 14 15 16 17 18
10: 1A 1B 1C 1D 27 28 00 00
18: 2A 2B 2C 2D 3A 3B 3C 3D
20: 35 36 37 38 00 00 00 00
28: 41 42 43 44 45 46 47 48
30: 00 00 00 00 00 00 00 00
38: 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00
48: 51 52 53 54 55 56 57 58
50: 00 00 00 00 00 00 00 00
58: 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00

Or on x86-64 (little endian):
00: 08 07 06 05 00 00 00 00
08: 18 17 16 15 14 13 12 11
10: 1D 1C 1B 1A 28 27 00 00
18: 2D 2C 2B 2A 3D 3C 3B 3A
20: 38 37 36 35 00 00 00 00
28: 48 47 46 45 44 43 42 41
30: 00 00 00 00 00 00 00 00
38: 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00
48: 58 57 56 55 54 53 52 51
50: 00 00 00 00 00 00 00 00
58: 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00

Please note the automatic padding between "st_dev" and "st_ino" (offset 0x04, 4 bytes), "st_nlink" and "st_uid" (offset 0x16, 2 bytes), and "st_rdev" and "st_size" (offset 0x24, 4 bytes).

Declaring st_nlink as int would result in an incorrect big/little endian conversion, so it should be declared as short. If you prefer clearer source code, you can optionally add explicit padding, but it is not mandatory.

Thanks!
Luca
stat64 wrong on sparc64 user
On qemu-sparc64 (userspace) the struct "target_stat64" is not correctly padded, so the field st_rdev is not correctly aligned and reports the wrong major/minor (e.g. for /dev/zero it reports 0,0x1050 instead of 1,5).

Here is a patch to solve the issue (it also fixes the incorrect size of some fields):

--- qemu-20230327/linux-user/syscall_defs.h 2023-03-27 15:41:42.0 +0200
+++ qemu-20230327/linux-user/syscall_defs.h.new 2023-03-27 21:43:25.615115126 +0200
@@ -1450,7 +1450,7 @@ struct target_stat {
     unsigned int st_dev;
     abi_ulong st_ino;
     unsigned int st_mode;
-    unsigned int st_nlink;
+    short int st_nlink;
     unsigned int st_uid;
     unsigned int st_gid;
     unsigned int st_rdev;
@@ -1465,8 +1465,7 @@ struct target_stat {
 #define TARGET_HAS_STRUCT_STAT64
 struct target_stat64 {
-    unsigned char __pad0[6];
-    unsigned short st_dev;
+    uint64_t st_dev;
     uint64_t st_ino;
     uint64_t st_nlink;
@@ -1476,14 +1475,13 @@ struct target_stat64 {
     unsigned int st_uid;
     unsigned int st_gid;
-    unsigned char __pad2[6];
-    unsigned short st_rdev;
+    unsigned int __pad0;
+    uint64_t st_rdev;
     int64_t st_size;
     int64_t st_blksize;
-    unsigned char __pad4[4];
-    unsigned int st_blocks;
+    int64_t st_blocks;
     abi_ulong target_st_atime;
     abi_ulong target_st_atime_nsec;
@@ -1522,8 +1520,7 @@ struct target_stat {
 #define TARGET_HAS_STRUCT_STAT64
 struct target_stat64 {
-    unsigned char __pad0[6];
-    unsigned short st_dev;
+    uint64_t st_dev;
     uint64_t st_ino;
@@ -1533,8 +1530,7 @@ struct target_stat64 {
     unsigned int st_uid;
     unsigned int st_gid;
-    unsigned char __pad2[6];
-    unsigned short st_rdev;
+    uint64_t st_rdev;
     unsigned char __pad3[8];
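
Why the char-array padding misplaces st_rdev can be seen by comparing the two layouts with offsetof. The sketch below copies only the leading fields of the old and patched sparc64 target_stat64 from the diff above; the struct names `old_stat64_head` and `new_stat64_head` are invented, the `__pad` members are renamed to avoid reserved identifiers, and the offsets assume a host where 64-bit integers are 8-byte aligned (e.g. x86-64).

```c
#include <stddef.h>
#include <stdint.h>

/* Old layout: a char array has alignment 1, so after three 32-bit fields
 * the 6-byte pad starts at offset 36 and st_rdev lands at offset 42. */
struct old_stat64_head {
    unsigned char  pad0[6];
    unsigned short st_dev;
    uint64_t       st_ino;
    uint64_t       st_nlink;
    unsigned int   st_mode;
    unsigned int   st_uid;
    unsigned int   st_gid;
    unsigned char  pad2[6];
    unsigned short st_rdev;
    int64_t        st_size;
};

/* Patched layout: one 32-bit pad brings the 64-bit st_rdev to offset 40. */
struct new_stat64_head {
    uint64_t     st_dev;
    uint64_t     st_ino;
    uint64_t     st_nlink;
    unsigned int st_mode;
    unsigned int st_uid;
    unsigned int st_gid;
    unsigned int pad0;
    uint64_t     st_rdev;
    int64_t      st_size;
};
```

In the old layout the 16-bit value occupies bytes 42..43, whereas the low 16 bits of a big-endian 64-bit field at offset 40 occupy bytes 46..47, so a guest reading the 64-bit field sees the device number shifted into the wrong bits. Both layouts agree again from st_size (offset 48) onward.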
Re: [PATCH] linux-user: Implement faccessat2
On 18/10/22 11:58, Michael Tokarev wrote:
> 10.10.2022 11:53, Helge Deller wrote:
>> On 10/9/22 08:08, WANG Xuerui wrote:
>>> User space has been preferring this syscall for a while, due to its closer match with C semantics, and newer platforms such as LoongArch apparently have libc implementations that don't fallback to faccessat so normal access checks are failing without the emulation in place.
>>
>> https://lore.kernel.org/qemu-devel/YzLdcnL6x646T61W@p100/

I think this one is the most complete and simplest solution. The only change I would make is to replace:

+#if defined(TARGET_NR_faccessat2) && defined(__NR_faccessat2)

with:

+#if defined(TARGET_NR_faccessat2)

(the host does not need to have __NR_faccessat2) and to replace "faccessat2(...)" with "faccessat(...)", so that it uses the glibc implementation. glibc uses __NR_faccessat2 if the host has this syscall; otherwise it falls back to faccessat, with an additional fstatat if flags != 0 (obviously, the definition of syscall4(... faccessat2 ...) should then be removed).

> For loongarch64 users this has become essential, because this is a new enough arch so that userspace does not bother using older syscalls, in this case it uses faccessat2() for everything, and simplest programs fail under qemu due to no fallback whatsoever.

I agree that it has become essential. Development with qemu-user is much faster than with qemu-system, with all the benefits of using chroot on a shared file system.

I tested (and am currently testing) the above patch with the Slackware-current build scripts on an x86_64 host: everything works fine!

Thanks!
Luca
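
The suggestion above amounts to calling the C library's faccessat() with the guest's flags and letting the library choose between the faccessat2 syscall and its fallback. A minimal host-side sketch of such a call follows; the wrapper name `check_access` is invented for illustration and is not the code from the linked patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <unistd.h>

/* Check access to 'path', resolved relative to the current directory.
 * 'flags' may be 0 or a combination such as AT_EACCESS; which syscall
 * ends up being used is left to the C library. Returns 0 on success,
 * -1 with errno set on failure. */
static int check_access(const char *path, int mode, int flags)
{
    return faccessat(AT_FDCWD, path, mode, flags);
}
```

For example, `check_access("/", F_OK, 0)` should return 0 on any Linux host, while a path that does not exist yields -1 with errno set to ENOENT.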
Memory address of ivshmem device
Hi,

I'm using KVM from the command line to run a VM, and I have to create an ivshmem region between host and guest. The options that I pass are:

-device ivshmem-plain, memdev=id
-object memory-backend-file,size=1M,share,mem-path=/dev/shm/ivshmem,id=id

From the host side I can read and write the shared memory. From the guest I cannot, because the OS in the VM doesn't have a PCI device manager. I would like to know whether the device has a fixed address in a KVM VM, so that the application in the VM can read and write that address directly.

Best regards
Luca Belluardo
[Bug 1895305] [NEW] pthread_cancel fails with "RT33" with musl libc
Public bug reported:

From my testing it seems that QEMU built against musl libc crashes on pthread_cancel calls - if the binary is also built with musl libc.

Minimal sample:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void* threadfunc(void* ignored) {
    while (1) {
        pause();
    }
    return NULL;
}

int main() {
    pthread_t thread;
    pthread_create(&thread, NULL, &threadfunc, NULL);
    sleep(1);
    pthread_cancel(thread);
    printf("OK, alive\n");
}

In an Alpine Linux aarch64 chroot (on an x86_64 host) the binary will just output RT33 and has exit code 161.

Using qemu-aarch64 on an x86_64 host results in the output (fish shell)
  fish: “qemu-aarch64-static ./musl-stat…” terminated by signal Unknown (Unknown)
or (bash)
  Real-time signal 2
and exit code 164.

It doesn't matter whether the binary is linked dynamically or statically. You can see my test results in the following table:

|                      | QEMU glibc | QEMU musl |
|----------------------|------------|-----------|
| binary glibc dynamic | ✓          | ✓         |
| binary glibc static  | ✓          | ✓         |
| binary musl dynamic  | ✓          | ✗         |
| binary musl static   | ✓          | ✗         |

Both QEMU builds are v5.1.0 (glibc v2.32 / musl v1.2.1)

I've uploaded all my compile and test commands (plus a script to conveniently run them all) to https://github.com/z3ntu/qemu-pthread_cancel . It also includes the built binaries if needed. The test script output can be found at https://github.com/z3ntu/qemu-pthread_cancel/blob/master/results.txt

Further links:
- https://gitlab.com/postmarketOS/pmaports/-/issues/190#note_141902075
- https://gitlab.com/postmarketOS/pmbootstrap/-/issues/1970

** Affects: qemu
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1895305

Title:
  pthread_cancel fails with "RT33" with musl libc

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1895305/+subscriptions
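
The two exit codes in the report can be read with the usual shell convention that a process terminated by signal N is reported with status 128 + N. A tiny helper, with the invented name `signal_from_exit_status`, makes the arithmetic explicit:

```c
/* Shell convention: a process killed by signal N is reported with exit
 * status 128 + N. Returns the signal number, or 0 if the status is not
 * of that form. */
static int signal_from_exit_status(int status)
{
    return status > 128 ? status - 128 : 0;
}
```

Under that convention exit code 161 corresponds to signal 33, which matches the "RT33" text, and 164 corresponds to signal 36, which bash on a glibc host would typically describe as "Real-time signal 2" (assuming SIGRTMIN is 34 there).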
[Qemu-devel] [Bug 1826401] [NEW] qemu-system-aarch64 has a high cpu usage on windows
Public bug reported:

Running qemu-system-aarch64 leads to high CPU consumption on Windows 10.

Tested with qemu: 4.0.0-rc4 & 3.1.0 & 2.11.0

Command:
qemu_start_command = [
    "qemu-system-aarch64",
    "-pidfile", target_path + "/qemu" + str(instance) + ".pid",
    "-machine", "virt",
    "-cpu", "cortex-a57",
    "-nographic",
    "-smp", "2",
    "-m", "2048",
    "-kernel", kernel_path,
    "--append", "console=ttyAMA0 root=/dev/vda2 rw ipx=" + qemu_instance_ip + "/64 net.ifnames=0 biosdevname=0",
    "-drive", "file=" + qemu_instance_img_path + ",if=none,id=blk",
    "-device", "virtio-blk-device,drive=blk",
    "-netdev", "socket,id=mynet0,udp=127.0.0.1:2000,localaddr=127.0.0.1:" + qemu_instance_port,
    "-device", "virtio-net-device,netdev=mynet0",
    "-serial", "file:" + target_path + "/qemu" + str(instance) + ".log"
]

* The CPU consumption is ~70%.
* No acceleration is used.
* This CPU consumption is observed just by running the above command, with no workload on the guest OS.

** Affects: qemu
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1826401

Title:
  qemu-system-aarch64 has a high cpu usage on windows

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1826401/+subscriptions
[Qemu-devel] [Bug 1323001] Re: Netlink socket support for MIPS*
** Changed in: qemu Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1323001 Title: Netlink socket support for MIPS* Status in QEMU: Fix Released To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1323001/+subscriptions
[Qemu-devel] [Bug 1323001] Re: Netlink socket support for MIPS*
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=480eda2eda7c464e252f17ac87ec61bccc14f285 ** Changed in: qemu Status: New => Fix Committed -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1323001 Title: Netlink socket support for MIPS* Status in QEMU: Fix Committed To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1323001/+subscriptions
[Qemu-devel] [Bug 1323001] Re: Netlink socket support for MIPS*
This patches solves the issue for me: http://patchwork.ozlabs.org/patch/346018/ ** Tags added: patch -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1323001 Title: Netlink socket support for MIPS* Status in QEMU: New To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1323001/+subscriptions
[Qemu-devel] [Bug 1323001] [NEW] Netlink socket support for MIPS*
Public bug reported:

It seems QEMU does not support Netlink sockets on MIPS*.

Trying to compile and run this simple program:

#include <stdio.h>
#include <errno.h>
#include <libaudit.h>

int main()
{
    int audit_fd = audit_open ();
    printf("fd is %d\n", audit_fd);
    printf("errno is %d\n", errno);
    return 0;
}

I receive the following output:

$ ./test
fd is -1
errno is 97
$

errno 97 is

#define EAFNOSUPPORT 97 /* Address family not supported by protocol */

** Affects: qemu
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1323001

Title:
  Netlink socket support for MIPS*

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1323001/+subscriptions
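The failure can probably be narrowed down without libaudit. The following is only a sketch, under the assumption that audit_open() essentially opens an AF_NETLINK socket with the NETLINK_AUDIT protocol (that is my understanding of libaudit, not something stated in the report). The helper name try_netlink() is made up for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Hypothetical helper: try to open a raw netlink socket for the given
 * protocol. Returns 0 on success, or the errno value set by socket(). */
static int try_netlink(int protocol)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, protocol);

    if (fd < 0) {
        int err = errno;   /* save errno before calling anything else */
        printf("netlink protocol %d: errno %d (%s)\n",
               protocol, err, strerror(err));
        return err;
    }
    printf("netlink protocol %d: ok, fd %d\n", protocol, fd);
    close(fd);
    return 0;
}
```

Calling try_netlink(NETLINK_AUDIT) under the user-mode emulator and natively should show whether the EAFNOSUPPORT comes from the socket() call itself; on a host without audit permissions a different error may be reported.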
[Qemu-devel] [Bug 1256546] Re: qemu-s390x-static: segmentation fault entering chroot
** Changed in: qemu (Ubuntu) Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1256546 Title: qemu-s390x-static: segmentation fault entering chroot Status in QEMU: Fix Released Status in “qemu” package in Ubuntu: Fix Released Status in “qemu” package in Debian: Fix Released Bug description: Host: Ubuntu Trusty i386 Guest: Debian Sid s390x When attempting to debootstrap a Debian Sid s390x guest the second stage process immediately fails with a segmentation fault, and any subsequent attempts to run any command while in the chroot. I: Running command: chroot s390x /debootstrap/debootstrap --second-stage Segmentation fault (core dumped) # chroot s390x/ # ps Segmentation fault (core dumped) # ls Segmentation fault (core dumped) # exit exit ProblemType: Bug DistroRelease: Ubuntu 14.04 Package: qemu-user-static 1.6.0+dfsg-2ubuntu4 ProcVersionSignature: Ubuntu 3.12.0-4.12-generic 3.12.1 Uname: Linux 3.12.0-4-generic i686 ApportVersion: 2.12.7-0ubuntu1 Architecture: i386 Date: Sat Nov 30 18:19:59 2013 InstallationDate: Installed on 2013-11-29 (1 days ago) InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Alpha i386 (20131126) ProcEnviron: LANGUAGE=en_GB:en TERM=xterm PATH=(custom, no user) LANG=en_GB.UTF-8 SHELL=/bin/bash SourcePackage: qemu UpgradeStatus: No upgrade log present (probably fresh install) To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1256546/+subscriptions
Re: [Qemu-devel] [sheepdog] [PATCH 00/11] sheepdog: reconnect server after connection failure
Is this series of patches applicable to sheepdog-stable-0.6 and qemu 1.5.0? I've seen they use async I/O...

MORITA Kazutaka wrote:
>Currently, if a sheepdog server exits, all the connecting VMs need to
>be restarted. This series implements a feature to reconnect the
>server, and enables us to do online sheepdog upgrade and avoid
>restarting VMs when sheepdog servers crash unexpectedly.
>
>MORITA Kazutaka (11):
>  ignore SIGPIPE in qemu-img and qemu-io
>  iov: handle eof in iov_send_recv
>  qemu-sockets: make wait_for_connect be invoked in qemu_aio_wait
>  sheepdog: make connect nonblocking
>  sheepdog: check return values of qemu_co_recv/send correctly
>  sheepdog: handle vdi objects in resend_aio_req
>  sheepdog: reload inode outside of resend_aioreq
>  coroutine: add co_aio_sleep_ns() to allow sleep in block drivers
>  sheepdog: try to reconnect to sheepdog after network error
>  sheepdog: make add_aio_request and send_aioreq void functions
>  sheepdog: cancel aio requests if possible
>
> Makefile                  |   4 +-
> block/sheepdog.c          | 314 --
> include/block/coroutine.h |   8 ++
> qemu-coroutine-sleep.c    |  47 +++
> qemu-img.c                |   4 +
> qemu-io.c                 |   4 +
> util/iov.c                |   6 +
> util/qemu-sockets.c       |  15 ++-
> 8 files changed, 303 insertions(+), 99 deletions(-)
>
>--
>1.8.1.3.566.gaa39828
>
>--
>sheepdog mailing list
>sheep...@lists.wpkg.org
>http://lists.wpkg.org/mailman/listinfo/sheepdog
[Qemu-devel] [Bug 829455] [NEW] user mode network stack - hostfwd not working with restrict=y
Public bug reported:

I find that explicit hostfwd commands do not seem to work with the restrict=yes option, even though the docs clearly state that hostfwd should override the restrict setting.

I am using this config:

-net user,name=test,net=192.168.100.0/24,host=192.168.100.44,restrict=y,hostfwd=tcp:127.0.0.1:3389-192.168.100.1:3389

(my guest has a static IP address configured as 192.168.100.1/24)

and I cannot log into my guest via RDP; the client hangs indefinitely. Just changing to "restrict=no" lets me log in.

Maybe I am doing something wrong, but I cannot figure out what.

Running QEMU emulator version 0.14.0 (qemu-kvm-0.14.0)

** Affects: qemu
   Importance: Undecided
   Status: New

** Tags: hostfwd restrict

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/829455

Title:
  user mode network stack - hostfwd not working with restrict=y

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/829455/+subscriptions
Re: [Qemu-devel] [RFC] QEMU Object Model
On Thu, Jul 21, 2011 at 4:49 PM, Anthony Liguori wrote:
[cut]
> And it's really not that much nicer than the C version. The problem with
> C++ is that even though the type system is much, much nicer, it still
> doesn't have introspection or decorators. These two things would be the
> killer feature for doing the sort of things we need to do and is really
> where most of the ugliness comes from.

Qt has introspection; what about using its core (i.e. QObject and moc) to build the QEMU object model?

L
Re: [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format
On Mon, Sep 6, 2010 at 12:25 PM, Alexander Graf wrote:
> On 06.09.2010, at 12:04, Stefan Hajnoczi wrote:
>> +
>> +const char *bytes_to_str(uint64_t size)
>> +{
>> +    static char buffer[64];
>> +
>> +    if (size < (1ULL << 10)) {
>> +        snprintf(buffer, sizeof(buffer), "%" PRIu64 " byte(s)", size);
>> +    } else if (size < (1ULL << 20)) {
>> +        snprintf(buffer, sizeof(buffer), "%" PRIu64 " KB(s)", size >> 10);
>> +    } else if (size < (1ULL << 30)) {
>> +        snprintf(buffer, sizeof(buffer), "%" PRIu64 " MB(s)", size >> 20);
>> +    } else if (size < (1ULL << 40)) {
>> +        snprintf(buffer, sizeof(buffer), "%" PRIu64 " GB(s)", size >> 30);
>> +    } else {
>> +        snprintf(buffer, sizeof(buffer), "%" PRIu64 " TB(s)", size >> 40);
>> +    }
>> +
>> +    return buffer;
>
> This returns a variable from the stack! Please make the target buffer caller
> defined.

It's static, so it's formally correct. But probably not a good idea :)

Luca
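To illustrate why the static buffer is "probably not a good idea": every call returns the same pointer, so something like printf("%s %s", bytes_to_str(a), bytes_to_str(b)) prints the same text twice, and the function is not thread-safe. A minimal sketch of the caller-provided variant requested in the review follows; the three-argument signature is an assumption made for illustration, not the one from the patch.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Sketch: same formatting as the quoted patch, but the caller owns the
 * buffer, so two results can be alive at the same time. Returns buf. */
static char *bytes_to_str(char *buf, size_t len, uint64_t size)
{
    static const char *const units[] = {
        "byte(s)", "KB(s)", "MB(s)", "GB(s)", "TB(s)"
    };
    unsigned shift = 0;
    unsigned i = 0;

    /* Move to the next unit while the value is at least 1024 of the
     * current one; stop at TB like the original else-branch does. */
    while (i < 4 && size >= (1ULL << (shift + 10))) {
        shift += 10;
        i++;
    }
    snprintf(buf, len, "%" PRIu64 " %s", size >> shift, units[i]);
    return buf;
}
```

A caller would declare e.g. char a[64], b[64]; and pass each buffer separately, which keeps both strings valid for a single printf().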
Re: [Qemu-devel] [PATCH RFC] Advertise IDE physical block size as 4K
On Tue, Dec 29, 2009 at 2:21 PM, Jamie Lokier wrote:
> Avi Kivity wrote:
>> Guests use this number as a hint for alignment and I/O request sizes.
>
> It's not just a hint. It is also the "radius of corruption on failed
> write" - important for journalling filesystems and databases.
>
>> Given
>> that modern disks have 4K block sizes,
>
> Do they, yet?

Yes, there are WD disks in the wild with 4k blocks, although in this first transition phase the firmware hides the fact and emulates the old 512-byte sector.

>> We probably need to make this configurable depending on machine type. It
>> should be the default for -M 0.13 only as it can affect guest code paths.
>
> What about that Windows/Linux 4k sectors incompatibility thing, where
> disks with 4k sectors have to sense whether the first partition starts
> at 512-byte sector 63 (Linux) or 512-byte sector 1024 (or something;
> Windows), and then adjust their 512-byte sector to 4k-sector mapping
> so that 4k blocks within the partition are aligned to 4k sectors?

Linux tools put the first partition at sector 63 (512-byte) to retain compatibility with Windows; Linux itself does not have any problem with different layouts. See e.g. [1]
The problem seems to be limited to Win 5.x (XP, 2k3) and WD has a utility [2] to re-align partitions in this case, so I guess that they cope fine with a 4k-aligned partition table; they just create it unaligned by default.

> It has been discussed for hardware disk design with 4k sectors, and
> somehow there were plans to map sectors so that the Linux partition
> scheme results in nicely aligned filesystem blocks

Ugh, I hope you're wrong ;-) AFAICS remapping will lead only to headaches... Linux does not have any problem with aligned partitions.

Luca

[1] http://thunk.org/tytso/blog/2009/02/20/aligning-filesystems-to-an-ssds-erase-block-size/
[2] http://support.wdc.com/product/download.asp?groupid=805&sid=123&lang=en
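For reference, the alignment problem with the traditional layout is plain arithmetic: a partition starting at 512-byte sector 63 begins at byte 32256, which is 3584 bytes past a 4k boundary. A small sketch, with helper names made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define LOGICAL_SECTOR_BYTES   512u   /* sector size exposed to the guest */
#define PHYSICAL_SECTOR_BYTES 4096u   /* sector size of the medium */

/* Bytes by which a partition starting at the given 512-byte LBA
 * overshoots the previous 4k boundary (0 means aligned). */
static uint64_t misalignment_bytes(uint64_t start_lba)
{
    return (start_lba * LOGICAL_SECTOR_BYTES) % PHYSICAL_SECTOR_BYTES;
}

static bool is_4k_aligned(uint64_t start_lba)
{
    return misalignment_bytes(start_lba) == 0;
}
```

With these definitions is_4k_aligned(63) is false while is_4k_aligned(64) is true, i.e. any start sector that is a multiple of 8 avoids read-modify-write cycles on the 4k medium for aligned filesystem blocks.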
Re: [Qemu-devel] qemu Changelog Makefile Makefile.target TODO ae...
On 9/17/07, J. Mayer <[EMAIL PROTECTED]> wrote: > On Mon, 2007-09-17 at 23:14 +0200, Luca wrote: > > > > since I mentionned "you should have used Git", I'll repeat: > > > > this commit was not disruptive to any of the Git users, and will > > > > never be. > > > > > > > > Evolve, or prepare to be assimilated into the Collective... > > > > > > Both the qemu.org and the Savannah project page only mention CVS. If > > > there are better ways to get the code then inform your users how to > > > use that. > > > http://brick.kernel.dk/git/?p=qemu.git;a=summary > > It's tracking QEMU CVS; you're right that it's not mentioned anywhere > > on the site (AFAICS). > > You can also DIY with git-cvsimport; see e.g. > > http://chneukirchen.org/blog/archive/2006/04/tracking-the-ruby-cvs-with-git.html > > Another point is CVS is an industry standard. It has many drawbacks but > is prooven to do its job as specified in a very reliable way. For now, > not such a thing for git, afaik. If it ever become the new industry > standard, after having prooven its reliability and long term stability, > then you may be able to expect everyone to use it. > Did anyone has done a long term comparison of CVS and git running on two > copies of the > same production repository and have made sure that any extraction at any > time of any data (ie, checkout in the present and any date in the past, > diffs, ...) of the two gives exactly the same result? Actually CVS doesn't provide _any_ guarantee about data integrity. GIT does. So... > Please show me > such studies and I may reconsider my position... If not, you can always > use it, closing your eyes and praying for everything to be OK... ...yes, I'm willing to trust GIT over CVS any time ;) Luca
Re: [Qemu-devel] qemu Changelog Makefile Makefile.target TODO ae...
On 9/17/07, Andreas Färber <[EMAIL PROTECTED]> wrote: > > Am 17.09.2007 um 14:18 schrieb Christian MICHON: > > > On 9/17/07, Philip Boulain <[EMAIL PROTECTED]> wrote: > >>>>>> DON'T DO THIS KIND OF COMMIT AGAIN, PLEASE. > >>>>> if we were using git (but you can do it locally anyway), you > >>>>> would not > >>>>> have these conflicts problems... > >>>> Maybe... but Savannah uses a CVS frontend, as far as I know... > >>> Those are excuses. > >> > >> So is a "you should have used X" argument. It doesn't invalidate the > >> point that the commit was disruptive, and merely acts as bait for the > >> grand old "version repository" flamewar.* > >> > > > > since I mentionned "you should have used Git", I'll repeat: > > this commit was not disruptive to any of the Git users, and will > > never be. > > > > Evolve, or prepare to be assimilated into the Collective... > > Both the qemu.org and the Savannah project page only mention CVS. If > there are better ways to get the code then inform your users how to > use that. http://brick.kernel.dk/git/?p=qemu.git;a=summary It's tracking QEMU CVS; you're right that it's not mentioned anywhere on the site (AFAICS). You can also DIY with git-cvsimport; see e.g. http://chneukirchen.org/blog/archive/2006/04/tracking-the-ruby-cvs-with-git.html Luca
Re: [Qemu-devel] Build failure on OS X - dynticks
On 8/31/07, Andreas Färber <[EMAIL PROTECTED]> wrote:
>
> On 31.08.2007 at 21:45, Luca Tettamanti wrote:
>
> > Andreas Färber wrote:
> >> On 25.08.2007 at 09:37, Andreas Färber wrote:
> >>
> >>> One of the recent patches (dynticks) has broken compilation on Mac
> >>> OS X v10.4. Should this work on OS X or should this have been
> >>> limited to Linux?
> >>
> >> Getting no answer on this I have prepared a quickfix which wraps all
> >> dynticks in conditional sections for __linux__, restoring compilation
> >> on OS X.
> >
> > Sorry for the late reply, I totally missed your mail. The timers I've
> > used
> > are POSIX and are supported on *BSD so I guessed they would work on
> > OSX
> > too... clearly that's not the case.
> > Can you grep the headers for timer_create? Maybe the prototypes are
> > in a
> > different file.
>
> I had that thought myself but man had nothing on it and Spotlight
> didn't find anything apart from Qemu itself.

Oh well, your patch is fine then.

Luca
Re: [Qemu-devel] Build failure on OS X - dynticks
Andreas Färber wrote:
> On 25.08.2007 at 09:37, Andreas Färber wrote:
>
>> One of the recent patches (dynticks) has broken compilation on Mac
>> OS X v10.4. Should this work on OS X or should this have been
>> limited to Linux?
>
> Getting no answer on this I have prepared a quickfix which wraps all
> dynticks in conditional sections for __linux__, restoring compilation
> on OS X.
> If this is the right way to fix then the conditional sections can be
> merged with HPET, which is already limited to __linux__.

Sorry for the late reply, I totally missed your mail. The timers I've used are POSIX and are supported on *BSD, so I guessed they would work on OS X too... clearly that's not the case.
Can you grep the headers for timer_create? Maybe the prototypes are in a different file.

Luca
--
What comes afterwards is not always progress.
Alessandro Manzoni
Re: [kvm-devel] [Qemu-devel] [PATCH 3/4] Add support for HPET periodic timer.
On 8/23/07, Dan Kenigsberg <[EMAIL PROTECTED]> wrote: > On Thu, Aug 23, 2007 at 12:09:47AM +0200, Andi Kleen wrote: > > > $ dmesg |grep -i hpet > > > ACPI: HPET 7D5B6AE0, 0038 (r1 A M I OEMHPET 5000708 MSFT 97) > > > ACPI: HPET id: 0x8086a301 base: 0xfed0 > > > hpet0: at MMIO 0xfed0, IRQs 2, 8, 0, 0 > > > hpet0: 4 64-bit timers, 14318180 Hz > > > hpet_resources: 0xfed0 is busy > > > > What kernel version was that? There was a bug that caused this pre .22 > > > > I have vanilla 2.6.22.3 on that machine. Try: cat /sys/devices/system/clocksource/clocksource0/available_clocksource do you see HPET listed twice? Luca
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastrucure - take2
On 8/22/07, Dor Laor <[EMAIL PROTECTED]> wrote: > >> > >>> This is QEMU, with dynticks and HPET: > >> > >>> > >> > >>> % time seconds usecs/call callserrors syscall > >> > >>> -- --- --- - - > - > >--- > >> > >>> 52.100.002966 0 96840 > clock_gettime > >> > >>> 19.500.001110 0 37050 > timer_gettime > >> > >>> 10.660.000607 0 20086 > timer_settime > >> > >>> 10.400.000592 0 8985 2539 sigreturn > >> > >>> 4.940.000281 0 8361 2485 select > >> > >>> 2.410.000137 0 8362 gettimeofday > >> > >>> -- --- --- - - > - > >--- > >> > >>> 100.000.005693179684 5024 total > >> > >>> > >> > >>> > >> > >> This looks like 250 Hz? > >> > >> > >> > > > >> > > Nope: > >> > > > >> > > # CONFIG_NO_HZ is not set > >> > > # CONFIG_HZ_100 is not set > >> > > # CONFIG_HZ_250 is not set > >> > > # CONFIG_HZ_300 is not set > >> > > CONFIG_HZ_1000=y > >> > > CONFIG_HZ=1000 > >> > > > >> > > and I'm reading it from /proc/config.gz on the guest. > >> > > > >> > > >> > Yeah, thought so -- so dyntick is broken at present. > >> > >> I see a lot of sub ms timer_settime(). Many of them are the result of > >> ->expire_time being less than the current qemu_get_clock(). This > >> results into 250us timer due to MIN_TIMER_REARM_US; this happens only > >> for the REALTIME timer. Other sub-ms timers are generated by the > >> VIRTUAL timer. > >> > >> This first issue is easily fixed; if expire_time < current time then > >> the timer has expired and hasn't been reprogrammed (and thus can be > >> ignored). > >> VIRTUAL just becomes more accurate with dyntics, before multiple > >> timers were batched together. > >> > >> > Or maybe your host kernel can't support such a high rate. > >> > >> I don't know... a simple printf tells me that the signal handler is > >> called about 1050 times per second, which sounds about right. > > > >...unless strace is attached. ptrace()'ing the process really screw up > >the timing with dynticks; HPET is also affected but the performance > >hit is not as severe. 
> > > I didn't figure out how you use both hpet and dyn-tick together. I don't. Only one timer source is active at any time; the selection is done at startup with -clock option. > Hpet has periodic timer while dyn-tick is one shot timer each time. > Is ther a chance that both are working and that's the source of our > problems? No, the various sources are exclusive (though it might be possible to use HPET in one shot mode). Luca
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastrucure - take2
On 8/22/07, Luca <[EMAIL PROTECTED]> wrote: > On 8/22/07, Avi Kivity <[EMAIL PROTECTED]> wrote: > > Luca wrote: > > >>> This is QEMU, with dynticks and HPET: > > >>> > > >>> % time seconds usecs/call callserrors syscall > > >>> -- --- --- - - > > >>> 52.100.002966 0 96840 clock_gettime > > >>> 19.500.001110 0 37050 timer_gettime > > >>> 10.660.000607 0 20086 timer_settime > > >>> 10.400.000592 0 8985 2539 sigreturn > > >>> 4.940.000281 0 8361 2485 select > > >>> 2.410.000137 0 8362 gettimeofday > > >>> -- --- --- - - > > >>> 100.000.005693179684 5024 total > > >>> > > >>> > > >> This looks like 250 Hz? > > >> > > > > > > Nope: > > > > > > # CONFIG_NO_HZ is not set > > > # CONFIG_HZ_100 is not set > > > # CONFIG_HZ_250 is not set > > > # CONFIG_HZ_300 is not set > > > CONFIG_HZ_1000=y > > > CONFIG_HZ=1000 > > > > > > and I'm reading it from /proc/config.gz on the guest. > > > > > > > Yeah, thought so -- so dyntick is broken at present. > > I see a lot of sub ms timer_settime(). Many of them are the result of > ->expire_time being less than the current qemu_get_clock(). This > results into 250us timer due to MIN_TIMER_REARM_US; this happens only > for the REALTIME timer. Other sub-ms timers are generated by the > VIRTUAL timer. > > This first issue is easily fixed; if expire_time < current time then > the timer has expired and hasn't been reprogrammed (and thus can be > ignored). > VIRTUAL just becomes more accurate with dyntics, before multiple > timers were batched together. > > > Or maybe your host kernel can't support such a high rate. > > I don't know... a simple printf tells me that the signal handler is > called about 1050 times per second, which sounds about right. ...unless strace is attached. ptrace()'ing the process really screw up the timing with dynticks; HPET is also affected but the performance hit is not as severe. Luca
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastrucure - take2
On 8/22/07, Luca <[EMAIL PROTECTED]> wrote: > I see a lot of sub ms timer_settime(). Many of them are the result of > ->expire_time being less than the current qemu_get_clock(). False alarm, this was a bug in the debug code :D Luca
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastrucure - take2
On 8/22/07, Avi Kivity <[EMAIL PROTECTED]> wrote:
> Luca wrote:
> >>> This is QEMU, with dynticks and HPET:
> >>>
> >>> % time     seconds  usecs/call     calls    errors syscall
> >>> ------ ----------- ----------- --------- --------- ----------------
> >>>  52.10    0.002966           0     96840           clock_gettime
> >>>  19.50    0.001110           0     37050           timer_gettime
> >>>  10.66    0.000607           0     20086           timer_settime
> >>>  10.40    0.000592           0      8985      2539 sigreturn
> >>>   4.94    0.000281           0      8361      2485 select
> >>>   2.41    0.000137           0      8362           gettimeofday
> >>> ------ ----------- ----------- --------- --------- ----------------
> >>> 100.00    0.005693                179684      5024 total
> >>>
> >> This looks like 250 Hz?
> >>
> >
> > Nope:
> >
> > # CONFIG_NO_HZ is not set
> > # CONFIG_HZ_100 is not set
> > # CONFIG_HZ_250 is not set
> > # CONFIG_HZ_300 is not set
> > CONFIG_HZ_1000=y
> > CONFIG_HZ=1000
> >
> > and I'm reading it from /proc/config.gz on the guest.
> >
>
> Yeah, thought so -- so dyntick is broken at present.

I see a lot of sub-ms timer_settime() calls. Many of them are the result of ->expire_time being less than the current qemu_get_clock(). This results in a 250us timer due to MIN_TIMER_REARM_US; this happens only for the REALTIME timer. Other sub-ms timers are generated by the VIRTUAL timer.

This first issue is easily fixed; if expire_time < current time then the timer has expired and hasn't been reprogrammed (and thus can be ignored).
VIRTUAL just becomes more accurate with dynticks; before, multiple timers were batched together.

> Or maybe your host kernel can't support such a high rate.

I don't know... a simple printf tells me that the signal handler is called about 1050 times per second, which sounds about right.

Luca
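The clamping described above can be sketched as follows. This is not the code from the patch series; the function names are hypothetical, and only the MIN_TIMER_REARM_US value of 250 is taken from the message. The second function shows why the delta has to be computed in a signed type: with unsigned arithmetic a deadline in the past wraps around to a huge delay instead of being clamped.

```c
#include <stdint.h>

#define MIN_TIMER_REARM_US 250   /* value quoted in the message above */

/* Sketch: delay in microseconds to program into the host timer.
 * A deadline in the past gives a negative delta, which is clamped
 * to the minimum re-arm interval. */
static int64_t rearm_delay_us(int64_t expire_us, int64_t now_us)
{
    int64_t delta = expire_us - now_us;

    if (delta < MIN_TIMER_REARM_US) {
        delta = MIN_TIMER_REARM_US;
    }
    return delta;
}

/* Same computation with unsigned types, for comparison only:
 * if expire_us < now_us the subtraction wraps around. */
static uint64_t rearm_delay_unsigned_us(uint64_t expire_us, uint64_t now_us)
{
    uint64_t delta = expire_us - now_us;

    if (delta < MIN_TIMER_REARM_US) {
        delta = MIN_TIMER_REARM_US;
    }
    return delta;
}
```

For a timer that expired 500us ago, the signed version yields 250 while the unsigned one yields a value close to 2^64.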
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastrucure - take2
On 8/22/07, Avi Kivity <[EMAIL PROTECTED]> wrote:
> Luca Tettamanti wrote:
> > On Wed, Aug 22, 2007 at 08:02:07AM +0300, Avi Kivity wrote:
> >
> >> Luca Tettamanti wrote:
> >>
> >>> Actually I'm having troubles with cyclesoak (probably it's calibration),
> >>> numbers are not very stable across multiple runs...
> >>>
> >> I've had good results with cyclesoak; maybe you need to run it in
> >> runlevel 3 so the load generated by moving the mouse or breathing
> >> doesn't affect meaurements.
> >>
> >
> > This is what I did, I tested with -no-grapich in text console.
>
> Okay. Maybe cpu frequency scaling confused it then. Or something else?

I set it to performance; the frequency was locked at 2.1GHz.

> >>> The guest is an idle kernel with HZ=1000.
> >>>
> >> Can you double check this? The dyntick results show that this is either
> >> a 100Hz kernel, or that there is a serious bug in dynticks.
> >>
> >
> > Ops I sent the wrong files, sorry.
> >
> > This is QEMU, with dynticks and HPET:
> >
> > % time     seconds  usecs/call     calls    errors syscall
> > ------ ----------- ----------- --------- --------- ----------------
> >  52.10    0.002966           0     96840           clock_gettime
> >  19.50    0.001110           0     37050           timer_gettime
> >  10.66    0.000607           0     20086           timer_settime
> >  10.40    0.000592           0      8985      2539 sigreturn
> >   4.94    0.000281           0      8361      2485 select
> >   2.41    0.000137           0      8362           gettimeofday
> > ------ ----------- ----------- --------- --------- ----------------
> > 100.00    0.005693                179684      5024 total
> >
>
> This looks like 250 Hz?

Nope:

# CONFIG_NO_HZ is not set
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000

and I'm reading it from /proc/config.gz on the guest.

> And a huge number of settime calls?

Yes, maybe some QEMU timer is using an interval < 1ms?
Dan, do you have any idea of what's going on?

Luca
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastrucure - take2
On Wed, Aug 22, 2007 at 08:02:07AM +0300, Avi Kivity wrote:
> Luca Tettamanti wrote:
>
> > Actually I'm having troubles with cyclesoak (probably it's calibration),
> > numbers are not very stable across multiple runs...
> >
>
> I've had good results with cyclesoak; maybe you need to run it in
> runlevel 3 so the load generated by moving the mouse or breathing
> doesn't affect meaurements.

This is what I did; I tested with -nographic in a text console.

> > The guest is an idle kernel with HZ=1000.
> >
>
> Can you double check this? The dyntick results show that this is either
> a 100Hz kernel, or that there is a serious bug in dynticks.

Oops, I sent the wrong files, sorry.

This is QEMU, with dynticks and HPET:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 52.10    0.002966           0     96840           clock_gettime
 19.50    0.001110           0     37050           timer_gettime
 10.66    0.000607           0     20086           timer_settime
 10.40    0.000592           0      8985      2539 sigreturn
  4.94    0.000281           0      8361      2485 select
  2.41    0.000137           0      8362           gettimeofday
------ ----------- ----------- --------- --------- ----------------
100.00    0.005693                179684      5024 total

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.37    0.025541           3     10194     10193 select
  4.82    0.001319           0     33259           clock_gettime
  1.10    0.000301           0     10195           gettimeofday
  0.71    0.000195           0     10196     10194 sigreturn
------ ----------- ----------- --------- --------- ----------------
100.00    0.027356                 63844     20387 total

And this is KVM:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 42.66    0.002885           0     45527        24 ioctl
 25.62    0.001733           0     89305           clock_gettime
 13.12    0.000887           0     34894           timer_gettime
  7.97    0.000539           0     18016           timer_settime
  4.70    0.000318           0     12224      7270 rt_sigtimedwait
  2.79    0.000189           0      7271           select
  1.86    0.000126           0      7271           gettimeofday
  1.27    0.000086           0      4954           rt_sigaction
------ ----------- ----------- --------- --------- ----------------
100.00    0.006763                219462      7294 total

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 49.41    0.004606           0     59900        27 ioctl
 24.14    0.002250           0     31252     21082 rt_sigtimedwait
  9.65    0.000900           0     51856           clock_gettime
  8.44    0.000787           0     17819           select
  4.42    0.000412           0     17819           gettimeofday
  3.94    0.000367           0     10170           rt_sigaction
------ ----------- ----------- --------- --------- ----------------
100.00    0.009322                188816     21109 total

Luca
--
Runtime error 6D at f000:a12f : incompetent user
Re: [Qemu-devel] [PATCH 3/4] Add support for HPET periodic timer.
On 8/21/07, Matthew Kent <[EMAIL PROTECTED]> wrote: > On Sat, 2007-18-08 at 01:11 +0200, Luca Tettamanti wrote: > > plain text document attachment (clock-hpet) > > Linux operates the HPET timer in legacy replacement mode, which means that > > the periodic interrupt of the CMOS RTC is not delivered (qemu won't be able > > to use /dev/rtc). Add support for HPET (/dev/hpet) as a replacement for the > > RTC; the periodic interrupt is delivered via SIGIO and is handled in the > > same way as the RTC timer. > > > > Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]> > > I must be missing something silly here.. should I be able to open more > than one instance of qemu with -clock hpet? Because upon invoking a > second instance of qemu HPET_IE_ON fails. It depends on your hardware. Theoretically it's possible, but I've yet to see a motherboard with more than one periodic timer. "dmesg | grep hpet" should tell you something like: hpet0: 3 64-bit timers, 14318180 Hz Luca
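For readers unfamiliar with /dev/hpet, the sequence under discussion looks roughly like this. It is a minimal sketch, not the code from the patch: it assumes the ioctls from <linux/hpet.h> (HPET_IRQFREQ, HPET_EPI, HPET_IE_ON), the function name is hypothetical, and the SIGIO setup mentioned in the patch description is left out.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hpet.h>

/* Sketch: open an HPET device node and enable its periodic interrupt.
 * Returns the open file descriptor, or -1 if any step fails. */
static int hpet_start_periodic(const char *path, unsigned long freq_hz)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        return -1;
    }
    if (ioctl(fd, HPET_IRQFREQ, freq_hz) < 0 ||  /* requested interrupt rate */
        ioctl(fd, HPET_EPI, 0) < 0 ||            /* switch to periodic mode */
        ioctl(fd, HPET_IE_ON, 0) < 0) {          /* the step reported to fail
                                                    for a second instance */
        close(fd);
        return -1;
    }
    return fd;
}
```

If the hardware exposes a single periodic-capable timer, a second process calling hpet_start_periodic("/dev/hpet", ...) would be expected to get -1 back, which matches the behaviour described above.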
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastrucure - take2
Avi Kivity wrote:
> Luca Tettamanti wrote:
>> At 1000Hz:
>>
>> QEMU
>> hpet       5.5%
>> dynticks  11.7%
>>
>> KVM
>> hpet       3.4%
>> dynticks   7.3%
>>
>> No surprises here, you can see the additional 1k syscalls per second.
>
> This is very surprising to me. The 6.2% difference for the qemu case
> translates to 62ms per second, or 62us per tick at 1000Hz. That's more
> than a hundred simple syscalls on modern processors. We shouldn't have to
> issue a hundred syscalls per guest clock tick.

APIC or PIT interrupts are delivered using the timer, which will be re-armed after each tick, so I'd expect 1k timer_settime calls per second. But according to strace it's not happening; maybe I'm misreading the code?

> The difference with kvm is smaller (just 3.9%), which is not easily
> explained as the time for the extra syscalls should be about the same. My
> guess is that guest behavior is different; with dynticks the guest does
> about twice as much work as with hpet.

Actually I'm having trouble with cyclesoak (probably its calibration); the numbers are not very stable across multiple runs...

I've also tried APC, which was suggested by malc[1], and:
- readings are far more stable
- the gap between dynticks and non-dynticks seems not significant

> Can you verify this by running
>
>    strace -c -p `pgrep qemu` & sleep 10; pkill strace
>
> for all 4 cases, and posting the results?

Plain QEMU:

With dynticks:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 57.97    0.000469           0     13795           clock_gettime
 32.88    0.000266           0      1350           gettimeofday
  7.42    0.000060           0      1423      1072 sigreturn
  1.73    0.000014           0      5049           timer_gettime
  0.00    0.000000           0      1683      1072 select
  0.00    0.000000           0      2978           timer_settime
------ ----------- ----------- --------- --------- ----------------
100.00    0.000809                 26278      2144 total

HPET:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 87.48    0.010459           1     10381     10050 select
  8.45    0.001010           0     40736           clock_gettime
  2.73    0.000326           0     10049           gettimeofday
  1.35    0.000161           0     10086     10064 sigreturn
------ ----------- ----------- --------- --------- ----------------
100.00    0.011956                 71252     20114 total

Unix (SIGALRM):

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.36    0.011663           1     10291      9959 select
  7.38    0.000953           0     40355           clock_gettime
  2.05    0.000264           0      9960           gettimeofday
  0.21    0.000027           0      9985      9969 sigreturn
------ ----------- ----------- --------- --------- ----------------
100.00    0.012907                 70591     19928 total

And KVM:

dynticks:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.90    0.004001           1      6681      5088 rt_sigtimedwait
 10.87    0.000551           0     27901           clock_gettime
  4.93    0.000250           0      7622           timer_settime
  4.30    0.000218           0     10078           timer_gettime
  0.39    0.000020           0      3863           gettimeofday
  0.35    0.000018           0      6054           ioctl
  0.26    0.000013           0      4196           select
  0.00    0.000000           0      1593           rt_sigaction
------ ----------- ----------- --------- --------- ----------------
100.00    0.005071                 67988      5088 total

HPET:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90.20    0.011029           0     32437     22244 rt_sigtimedwait
  4.46    0.000545           0     44164           clock_gettime
  2.59    0.000317           0     12128           gettimeofday
  1.50    0.000184           0     10193           rt_sigaction
  1.10    0.000134           0     12461           select
  0.15    0.000018           0      6060           ioctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.012227                117443     22244 total

Unix:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.29    0.012522           0     31652     21709 rt_sigtimedwait
  6.91    0.001039           0     43125           clock_gettime
  3.50    0.000526           0      6042
Re: [kvm-devel] [Qemu-devel] Re: [PATCH 0/4] Rework alarm timer infrastructure - take2
On 8/20/07, malc <[EMAIL PROTECTED]> wrote:
> On Mon, 20 Aug 2007, Luca Tettamanti wrote:
>
> > On Sun, Aug 19, 2007 at 10:31:26PM +0300, Avi Kivity wrote:
> >> Luca wrote:
> >>> On 8/19/07, Luca Tettamanti <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> +static uint64_t qemu_next_deadline(void) {
> >>>> +    uint64_t nearest_delta_us = ULLONG_MAX;
> >>>> +    uint64_t vmdelta_us;
> >>>>
> >>>
> >>> Hum, I introduced a bug here... those vars should be signed.
> >>>
> >>> On the overhead introduced: how do you measure it?
> >>>
> >>
> >> Run a 100Hz guest, measure cpu usage using something accurate like
> >> cyclesoak, with and without dynticks, with and without kvm.
> >
> > Ok, here I've measured the CPU usage on the host when running an idle
> > guest.
>
> [...]
> The upshot is this - if you have used any standard utility (iostat,
> top - basically anything /proc/stat based) the accounting has a fair
> chance of being inaccurate. If cyclesoak is what you have used then
> the results should be better, but still i would be worried about
> them.

Yes, I've used cyclesoak.

Luca
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastructure - take2
On Sun, Aug 19, 2007 at 10:31:26PM +0300, Avi Kivity wrote:
> Luca wrote:
> > On 8/19/07, Luca Tettamanti <[EMAIL PROTECTED]> wrote:
> >
> >> +static uint64_t qemu_next_deadline(void) {
> >> +    uint64_t nearest_delta_us = ULLONG_MAX;
> >> +    uint64_t vmdelta_us;
> >>
> >
> > Hum, I introduced a bug here... those vars should be signed.
> >
> > On the overhead introduced: how do you measure it?
> >
>
> Run a 100Hz guest, measure cpu usage using something accurate like
> cyclesoak, with and without dynticks, with and without kvm.

Ok, here I've measured the CPU usage on the host when running an idle
guest.

At 100Hz:

QEMU
hpet      4.8%
dynticks  5.1%

Note: I've taken the mean over a period of 20 secs, but the difference
between hpet and dynticks is well inside the variability of the test.

KVM
hpet      2.2%
dynticks  1.0%

Hum... here the numbers jump around a bit, but dynticks is always below
hpet.

At 1000Hz:

QEMU
hpet      5.5%
dynticks 11.7%

KVM
hpet      3.4%
dynticks  7.3%

No surprises here, you can see the additional 1k syscalls per second. On
the bright side, keep in mind that with a tickless guest and dynticks
I've seen as few as 50-60 timer ticks per second.

Hackbench (hackbench -pipe 50) inside the guest:

QEMU: impossible to measure, the variance of the results is much bigger
than the difference between dynticks and hpet.
KVM: Around 0.8s slower in the case of dynticks; variance of the results
is about 0.3s in both cases.

Luca
-- 
"He who speaks in a courteous tone but keeps preparing will be able to
advance; he who speaks in a bellicose tone and advances rapidly will
have to retreat"
Sun Tzu -- The Art of War
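
To put the 1000Hz figures in perspective, the gap in CPU usage can be
converted into a cost per guest tick. The following is only a rough sketch:
the function name is hypothetical, and the model simply assumes that all of
the extra CPU time is spent once per guest timer tick.

```c
#include <assert.h>

/* Hypothetical helper: convert a difference in host CPU usage (percent)
 * into microseconds of extra CPU time per guest timer tick.
 * 1% of one second is 10,000 us, which is then spread over `hz` ticks. */
static double extra_us_per_tick(double base_percent, double other_percent,
                                double hz)
{
    assert(hz > 0.0);
    double extra_us_per_sec = (other_percent - base_percent) * 10000.0;
    return extra_us_per_sec / hz;
}
```

With the QEMU numbers above (5.5% vs 11.7% at 1000Hz) this comes out at
roughly 62 us per tick, and at roughly 39 us per tick for KVM (3.4% vs 7.3%).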
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastructure - take2
On 8/19/07, Luca Tettamanti <[EMAIL PROTECTED]> wrote:
> +static uint64_t qemu_next_deadline(void) {
> +    uint64_t nearest_delta_us = ULLONG_MAX;
> +    uint64_t vmdelta_us;

Hum, I introduced a bug here... those vars should be signed.

On the overhead introduced: how do you measure it?

Luca
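
The signedness problem mentioned here can be illustrated in isolation. This
is a minimal sketch, not code from the patch: the helper names are
hypothetical, and it only shows what happens to "time until expiry" when the
deadline is already in the past.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: time left until `expire_us`, computed in an
 * unsigned type. If the deadline is already in the past, the subtraction
 * wraps around and yields a huge positive value instead of "overdue". */
static uint64_t delta_unsigned(uint64_t expire_us, uint64_t now_us)
{
    return expire_us - now_us;
}

/* Same computation with a signed result: an overdue deadline gives a
 * negative delta, which the caller can detect. */
static int64_t delta_signed(int64_t expire_us, int64_t now_us)
{
    return expire_us - now_us;
}

/* Clamp an overdue (negative) delta to zero, i.e. "fire as soon as
 * possible". */
static int64_t clamp_delta(int64_t delta_us)
{
    return delta_us < 0 ? 0 : delta_us;
}
```

With unsigned variables an expired timer looks like the one that is furthest
away, so a "nearest deadline" search would skip exactly the timer that needs
attention first.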
[Qemu-devel] Re: [kvm-devel] [PATCH 0/4] Rework alarm timer infrastructure - take2
Dor Laor wrote:
> >> Hello,
> >> in reply to this mail I will send a series of 4 patches that clean up
> >> and expand the alarm timer handling in QEMU. Patches have been rebased
> >> on QEMU CVS.
> >>
> >> Patch 1 is mostly a cleanup of the existing code; instead of having
> >> multiple #ifdefs to handle different timers scattered all over the code
> >> I've created a modular infrastructure where each timer type is
> >> self-contained and generic code is more readable. The resulting code is
> >> functionally equivalent to the old one.
> >>
> >> Patch 2 implements the "-clock" command line option proposed by Daniel
> >> Berrange and Avi Kivity. By default QEMU tries RTC and then falls back
> >> to unix timer; user can override the order of the timers through this
> >> option. Syntax is pretty simple: -clock timer1,timer2,etc. (QEMU will
> >> pick the first one that works).
> >>
> >> Patch 3 adds support for HPET under Linux (which is basically my old
> >> patch). As suggested HPET takes precedence over other timers, but of
> >> course this can be overridden.
> >>
> >> Patch 4 introduces the "dynticks" timer source; the patch is mostly
> >> based on the work of Dan Kenigsberg. dynticks is now the default alarm
> >> timer.
> >
> > Why do you guard dynticks with #ifdef? Is there any reason why you
> > wouldn't want to use dynticks?
>
> I also think that it should be the default.
> There is no regression in performance nor behaviour with this option.

Ok, I've updated the patch. It was pretty easy to implement the same
feature for win32 (lightly tested inside a winxp VM).

> We didn't test qemu dyn-tick with kernels that don't have
> high-res-timer+dyn-tick.

I did ;)

> In this case the dyn-tick minimum res will be 1msec. I believe it should
> work ok since this is the case without any dyn-tick.

Actually the minimum resolution depends on the host HZ setting, but - yes -
essentially you get the same behaviour as the "unix" timer, plus the
overhead of reprogramming the timer.


Add support for dynamic ticks.

If the dynticks alarm timer is used qemu does not attempt to generate
SIGALRM at a constant rate. Rather, the system timer is set to generate
SIGALRM only when it is needed. Dynticks timer reduces the number of
SIGALRMs sent to idle dynamic-ticked guests.

Original patch from Dan Kenigsberg <[EMAIL PROTECTED]>

Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]>

---
 vl.c |  178 +++
 1 file changed, 170 insertions(+), 8 deletions(-)

Index: qemu/vl.c
===================================================================
--- qemu.orig/vl.c  2007-08-18 23:23:47.000000000 +0200
+++ qemu/vl.c  2007-08-18 23:23:53.000000000 +0200
@@ -784,12 +784,31 @@

 struct qemu_alarm_timer {
     char const *name;
+    unsigned int flags;

     int (*start)(struct qemu_alarm_timer *t);
     void (*stop)(struct qemu_alarm_timer *t);
+    void (*rearm)(struct qemu_alarm_timer *t);
     void *priv;
 };

+#define ALARM_FLAG_DYNTICKS 0x1
+
+static inline int alarm_has_dynticks(struct qemu_alarm_timer *t)
+{
+    return t->flags & ALARM_FLAG_DYNTICKS;
+}
+
+static void qemu_rearm_alarm_timer(struct qemu_alarm_timer *t) {
+    if (!alarm_has_dynticks(t))
+        return;
+
+    t->rearm(t);
+}
+
+/* TODO: MIN_TIMER_REARM_US should be optimized */
+#define MIN_TIMER_REARM_US 250
+
 static struct qemu_alarm_timer *alarm_timer;

 #ifdef _WIN32
@@ -802,12 +821,17 @@

 static int win32_start_timer(struct qemu_alarm_timer *t);
 static void win32_stop_timer(struct qemu_alarm_timer *t);
+static void win32_rearm_timer(struct qemu_alarm_timer *t);

 #else

 static int unix_start_timer(struct qemu_alarm_timer *t);
 static void unix_stop_timer(struct qemu_alarm_timer *t);

+static int dynticks_start_timer(struct qemu_alarm_timer *t);
+static void dynticks_stop_timer(struct qemu_alarm_timer *t);
+static void dynticks_rearm_timer(struct qemu_alarm_timer *t);
+
 #ifdef __linux__

 static int hpet_start_timer(struct qemu_alarm_timer *t);
@@ -816,21 +840,23 @@
 static int rtc_start_timer(struct qemu_alarm_timer *t);
 static void rtc_stop_timer(struct qemu_alarm_timer *t);

-#endif
+#endif /* __linux__ */

 #endif /* _WIN32 */

 static struct qemu_alarm_timer alarm_timers[] = {
+#ifndef _WIN32
+    {"dynticks", ALARM_FLAG_DYNTICKS, dynticks_start_timer, dynticks_stop_timer, dynticks_rearm_timer, NULL},
 #ifdef __linux__
     /* HPET - if available - is preferred */
-    {"hpet", hpet_start_timer, hpet_stop_timer, NULL
Re: [kvm-devel] [Qemu-devel] [PATCH 0/4] Rework alarm timer infrastructure - take 2
On 8/18/07, Christian MICHON <[EMAIL PROTECTED]> wrote:
> there's a typo line 1432 on vl.c after applying all 4 patches
> (missing ';')

Oops...

> beyond this small typo, I managed to include this in a win32 qemu build.
> is there a specific practical test to see the improvement in a linux guest
> when running on a windows host ?

The improvements - beyond the refactoring - are either specific to Linux
(HPET timer) or to UNIX in general (dynticks - POSIX timers are used).
It may be possible to use a one-shot timer on Windows too, but I'm not
really familiar with the win32 API.

Luca
[Qemu-devel] [PATCH 2/4] Add -clock option.
Allow user to override the list of available alarm timers and their
priority. The format of the option is -clock clk1,clk2,...

Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]>

---
 vl.c |   72 +++
 1 file changed, 72 insertions(+)

Index: qemu/vl.c
===================================================================
--- qemu.orig/vl.c  2007-08-17 17:31:09.000000000 +0200
+++ qemu/vl.c  2007-08-18 00:40:22.000000000 +0200
@@ -829,6 +829,71 @@
     {NULL, }
 };

+static void show_available_alarms()
+{
+    int i;
+
+    printf("Available alarm timers, in order of precedence:\n");
+    for (i = 0; alarm_timers[i].name; i++)
+        printf("%s\n", alarm_timers[i].name);
+}
+
+static void configure_alarms(char const *opt)
+{
+    int i;
+    int cur = 0;
+    int count = (sizeof(alarm_timers) / sizeof(*alarm_timers)) - 1;
+    char *arg;
+    char *name;
+
+    if (!strcmp(opt, "help")) {
+        show_available_alarms();
+        exit(0);
+    }
+
+    arg = strdup(opt);
+
+    /* Reorder the array */
+    name = strtok(arg, ",");
+    while (name) {
+        struct qemu_alarm_timer tmp;
+
+        for (i = 0; i < count; i++) {
+            if (!strcmp(alarm_timers[i].name, name))
+                break;
+        }
+
+        if (i == count) {
+            fprintf(stderr, "Unknown clock %s\n", name);
+            goto next;
+        }
+
+        if (i < cur)
+            /* Ignore */
+            goto next;
+
+        /* Swap */
+        tmp = alarm_timers[i];
+        alarm_timers[i] = alarm_timers[cur];
+        alarm_timers[cur] = tmp;
+
+        cur++;
+next:
+        name = strtok(NULL, ",");
+    }
+
+    free(arg);
+
+    if (cur) {
+        /* Disable remaining timers */
+        for (i = cur; i < count; i++)
+            alarm_timers[i].name = NULL;
+    }
+
+    /* debug */
+    show_available_alarms();
+}
+
 QEMUClock *rt_clock;
 QEMUClock *vm_clock;

@@ -6791,6 +6856,8 @@
 #ifdef TARGET_SPARC
            "-prom-env variable=value set OpenBIOS nvram variables\n"
 #endif
+           "-clock force the use of the given methods for timer alarm.\n"
+           "To see what timers are available use -clock help\n"
            "\n"
            "During emulation, the following keys are useful:\n"
            "ctrl-alt-f toggle full screen\n"
@@ -6888,6 +6955,7 @@
     QEMU_OPTION_name,
     QEMU_OPTION_prom_env,
     QEMU_OPTION_old_param,
+    QEMU_OPTION_clock,
 };

 typedef struct QEMUOption {
@@ -6992,6 +7060,7 @@
 #if defined(TARGET_ARM)
     { "old-param", 0, QEMU_OPTION_old_param },
 #endif
+    { "clock", HAS_ARG, QEMU_OPTION_clock },
     { NULL },
 };

@@ -7771,6 +7840,9 @@
             case QEMU_OPTION_old_param:
                 old_param = 1;
 #endif
+            case QEMU_OPTION_clock:
+                configure_alarms(optarg);
+                break;
             }
         }
     }
-- 
[Qemu-devel] [PATCH 4/4] Add support for dynamic ticks.
If DYNAMIC_TICKS is defined qemu does not attempt to generate SIGALRM at a
constant rate. Rather, the system timer is set to generate SIGALRM only
when it is needed. DYNAMIC_TICKS reduces the number of SIGALRMs sent to
idle dynamic-ticked guests.

Original patch from Dan Kenigsberg <[EMAIL PROTECTED]>

Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]>

---
 configure |    5 ++
 vl.c      |  149 +++---
 2 files changed, 148 insertions(+), 6 deletions(-)

Index: qemu/vl.c
===================================================================
--- qemu.orig/vl.c  2007-08-17 17:45:00.000000000 +0200
+++ qemu/vl.c  2007-08-18 00:38:03.000000000 +0200
@@ -784,12 +784,42 @@

 struct qemu_alarm_timer {
     char const *name;
+    unsigned int flags;

     int (*start)(struct qemu_alarm_timer *t);
     void (*stop)(struct qemu_alarm_timer *t);
+    void (*rearm)(struct qemu_alarm_timer *t);
     void *priv;
 };

+#define ALARM_FLAG_DYNTICKS 0x1
+
+#ifdef DYNAMIC_TICKS
+
+static inline int alarm_has_dynticks(struct qemu_alarm_timer *t)
+{
+    return t->flags & ALARM_FLAG_DYNTICKS;
+}
+
+static void qemu_rearm_alarm_timer(struct qemu_alarm_timer *t) {
+    if (!alarm_has_dynticks(t))
+        return;
+
+    t->rearm(t);
+}
+
+#else /* DYNAMIC_TICKS */
+
+static inline int alarm_has_dynticks(struct qemu_alarm_timer *t)
+{
+    return 0;
+}
+
+static void qemu_rearm_alarm_timer(void) {
+}
+
+#endif /* DYNAMIC_TICKS */
+
 static struct qemu_alarm_timer *alarm_timer;

 #ifdef _WIN32
@@ -808,6 +838,14 @@
 static int unix_start_timer(struct qemu_alarm_timer *t);
 static void unix_stop_timer(struct qemu_alarm_timer *t);

+#ifdef DYNAMIC_TICKS
+
+static int dynticks_start_timer(struct qemu_alarm_timer *t);
+static void dynticks_stop_timer(struct qemu_alarm_timer *t);
+static void dynticks_rearm_timer(struct qemu_alarm_timer *t);
+
+#endif
+
 #ifdef __linux__

 static int hpet_start_timer(struct qemu_alarm_timer *t);
@@ -821,16 +859,19 @@
 #endif /* _WIN32 */

 static struct qemu_alarm_timer alarm_timers[] = {
+#ifndef _WIN32
+#ifdef DYNAMIC_TICKS
+    {"dynticks", ALARM_FLAG_DYNTICKS, dynticks_start_timer, dynticks_stop_timer, dynticks_rearm_timer, NULL},
+#endif
 #ifdef __linux__
     /* HPET - if available - is preferred */
-    {"hpet", hpet_start_timer, hpet_stop_timer, NULL},
+    {"hpet", 0, hpet_start_timer, hpet_stop_timer, NULL, NULL},
     /* ...otherwise try RTC */
-    {"rtc", rtc_start_timer, rtc_stop_timer, NULL},
+    {"rtc", 0, rtc_start_timer, rtc_stop_timer, NULL, NULL},
 #endif
-#ifndef _WIN32
-    {"unix", unix_start_timer, unix_stop_timer, NULL},
+    {"unix", 0, unix_start_timer, unix_stop_timer, NULL, NULL},
 #else
-    {"win32", win32_start_timer, win32_stop_timer, &alarm_win32_data},
+    {"win32", 0, win32_start_timer, win32_stop_timer, NULL, &alarm_win32_data},
 #endif
     {NULL, }
 };
@@ -949,6 +990,8 @@
         }
         pt = &t->next;
     }
+
+    qemu_rearm_alarm_timer(alarm_timer);
 }

 /* modify the current timer so that it will be fired when current_time
@@ -1008,6 +1051,7 @@
         /* run the callback (the timer list can be modified) */
         ts->cb(ts->opaque);
     }
+    qemu_rearm_alarm_timer(alarm_timer);
 }

 int64_t qemu_get_clock(QEMUClock *clock)
@@ -1115,7 +1159,8 @@
         last_clock = ti;
     }
 #endif
-    if (qemu_timer_expired(active_timers[QEMU_TIMER_VIRTUAL],
+    if (alarm_has_dynticks(alarm_timer) ||
+        qemu_timer_expired(active_timers[QEMU_TIMER_VIRTUAL],
                            qemu_get_clock(vm_clock)) ||
         qemu_timer_expired(active_timers[QEMU_TIMER_REALTIME],
                            qemu_get_clock(rt_clock))) {
@@ -1243,6 +1288,97 @@

 #endif /* !defined(__linux__) */

+#ifdef DYNAMIC_TICKS
+static int dynticks_start_timer(struct qemu_alarm_timer *t)
+{
+    struct sigevent ev;
+    timer_t host_timer;
+    struct sigaction act;
+
+    sigfillset(&act.sa_mask);
+    act.sa_flags = 0;
+#if defined(TARGET_I386) && defined(USE_CODE_COPY)
+    act.sa_flags |= SA_ONSTACK;
+#endif
+    act.sa_handler = host_alarm_handler;
+
+    sigaction(SIGALRM, &act, NULL);
+
+    ev.sigev_value.sival_int = 0;
+    ev.sigev_notify = SIGEV_SIGNAL;
+    ev.sigev_signo = SIGALRM;
+
+    if (timer_create(CLOCK_REALTIME, &ev, &host_timer)) {
+        perror("timer_create");
+
+        /* disable dynticks */
+        fprintf(stderr, "Dynamic Ticks disabled\n");
+
+        return -1;
+    }
+
+    t->priv = (void *)host_timer;
+
+    return 0;
+}
+
+static void dynticks_stop_timer(struct qemu_alarm_timer *t)
+{
+    timer_t host_timer = (timer_t)t->priv;
+
+    timer_delete(host_timer);
+}
+
+static void dynticks_rearm_timer(struct qemu_alarm_timer *t)
+{
+    timer_t host_timer = (timer_t)t->
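
The core of a dynticks-style rearm is turning "time until the next timer
expires" into a one-shot timer value. The sketch below is not the patch code
(which is cut off above): `next_timeout` and `MIN_REARM_US` are hypothetical
names, and the 250 us floor is an assumption that merely mirrors the
MIN_TIMER_REARM_US value used in a later revision in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Assumed floor for the rearm interval. A zero it_value would disarm a
 * POSIX timer instead of firing it immediately, so the result is never 0. */
#define MIN_REARM_US 250

/* Pick the nearer of two deadlines (microseconds from now), clamp it to
 * the minimum rearm interval and convert it to a struct timespec, the
 * type of the it_value field that timer_settime() takes. */
static struct timespec next_timeout(int64_t vm_delta_us, int64_t rt_delta_us)
{
    int64_t delta = vm_delta_us < rt_delta_us ? vm_delta_us : rt_delta_us;
    struct timespec ts;

    /* Overdue (negative) or very short deadlines fire after the minimum. */
    if (delta < MIN_REARM_US)
        delta = MIN_REARM_US;

    ts.tv_sec = (time_t)(delta / 1000000);
    ts.tv_nsec = (long)(delta % 1000000) * 1000;
    assert(ts.tv_nsec < 1000000000L);
    return ts;
}
```

Note that the deltas are signed here, so an already-expired timer shows up
as a negative value and is clamped rather than wrapping around.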
[Qemu-devel] [PATCH 3/4] Add support for HPET periodic timer.
Linux operates the HPET timer in legacy replacement mode, which means that
the periodic interrupt of the CMOS RTC is not delivered (qemu won't be able
to use /dev/rtc). Add support for HPET (/dev/hpet) as a replacement for the
RTC; the periodic interrupt is delivered via SIGIO and is handled in the
same way as the RTC timer.

Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]>

---
 vl.c |   57 -
 1 file changed, 56 insertions(+), 1 deletion(-)

Index: qemu/vl.c
===================================================================
--- qemu.orig/vl.c  2007-08-17 17:39:21.000000000 +0200
+++ qemu/vl.c  2007-08-18 00:40:16.000000000 +0200
@@ -56,6 +56,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #else
@@ -809,6 +810,9 @@

 #ifdef __linux__

+static int hpet_start_timer(struct qemu_alarm_timer *t);
+static void hpet_stop_timer(struct qemu_alarm_timer *t);
+
 static int rtc_start_timer(struct qemu_alarm_timer *t);
 static void rtc_stop_timer(struct qemu_alarm_timer *t);

@@ -818,7 +822,9 @@

 static struct qemu_alarm_timer alarm_timers[] = {
 #ifdef __linux__
-    /* RTC - if available - is preferred */
+    /* HPET - if available - is preferred */
+    {"hpet", hpet_start_timer, hpet_stop_timer, NULL},
+    /* ...otherwise try RTC */
     {"rtc", rtc_start_timer, rtc_stop_timer, NULL},
 #endif
 #ifndef _WIN32
@@ -1153,6 +1159,55 @@
     fcntl(fd, F_SETOWN, getpid());
 }

+static int hpet_start_timer(struct qemu_alarm_timer *t)
+{
+    struct hpet_info info;
+    int r, fd;
+
+    fd = open("/dev/hpet", O_RDONLY);
+    if (fd < 0)
+        return -1;
+
+    /* Set frequency */
+    r = ioctl(fd, HPET_IRQFREQ, RTC_FREQ);
+    if (r < 0) {
+        fprintf(stderr, "Could not configure '/dev/hpet' to have a 1024Hz timer. This is not a fatal\n"
+                "error, but for better emulation accuracy type:\n"
+                "'echo 1024 > /proc/sys/dev/hpet/max-user-freq' as root.\n");
+        goto fail;
+    }
+
+    /* Check capabilities */
+    r = ioctl(fd, HPET_INFO, &info);
+    if (r < 0)
+        goto fail;
+
+    /* Enable periodic mode */
+    r = ioctl(fd, HPET_EPI, 0);
+    if (info.hi_flags && (r < 0))
+        goto fail;
+
+    /* Enable interrupt */
+    r = ioctl(fd, HPET_IE_ON, 0);
+    if (r < 0)
+        goto fail;
+
+    enable_sigio_timer(fd);
+    t->priv = (void *)fd;
+
+    return 0;
+fail:
+    close(fd);
+    return -1;
+}
+
+static void hpet_stop_timer(struct qemu_alarm_timer *t)
+{
+    int fd = (int)t->priv;
+
+    close(fd);
+}
+
 static int rtc_start_timer(struct qemu_alarm_timer *t)
 {
     int rtc_fd;
-- 
[Qemu-devel] [PATCH 0/4] Rework alarm timer infrastructure - take 2
Hello,
in reply to this mail I will send a series of 4 patches that clean up and
expand the alarm timer handling in QEMU. Patches have been rebased on QEMU
CVS.

Patch 1 is mostly a cleanup of the existing code; instead of having multiple
#ifdefs to handle different timers scattered all over the code I've created
a modular infrastructure where each timer type is self-contained and generic
code is more readable. The resulting code is functionally equivalent to the
old one.

Patch 2 implements the "-clock" command line option proposed by Daniel
Berrange and Avi Kivity. By default QEMU tries RTC and then falls back to
unix timer; user can override the order of the timers through this option.
Syntax is pretty simple: -clock timer1,timer2,etc. (QEMU will pick the first
one that works).

Patch 3 adds support for HPET under Linux (which is basically my old patch).
As suggested HPET takes precedence over other timers, but of course this can
be overridden.

Patch 4 introduces the "dynticks" timer source; the patch is mostly based on
the work of Dan Kenigsberg. dynticks is now the default alarm timer.

Luca
-- 
[Qemu-devel] [PATCH 1/4] Rework alarm timer infrastructure.
Make the alarm code modular, removing #ifdef from the generic code and abstract a common interface for all the timer. The result is functionally equivalent to the old code. Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]> --- vl.c | 287 +++ vl.h |1 2 files changed, 185 insertions(+), 103 deletions(-) Index: qemu/vl.c === --- qemu.orig/vl.c 2007-08-17 16:48:32.0 +0200 +++ qemu/vl.c 2007-08-18 00:40:25.0 +0200 @@ -781,18 +781,58 @@ struct QEMUTimer *next; }; -QEMUClock *rt_clock; -QEMUClock *vm_clock; +struct qemu_alarm_timer { +char const *name; + +int (*start)(struct qemu_alarm_timer *t); +void (*stop)(struct qemu_alarm_timer *t); +void *priv; +}; + +static struct qemu_alarm_timer *alarm_timer; -static QEMUTimer *active_timers[2]; #ifdef _WIN32 -static MMRESULT timerID; -static HANDLE host_alarm = NULL; -static unsigned int period = 1; + +struct qemu_alarm_win32 { +MMRESULT timerId; +HANDLE host_alarm; +unsigned int period; +} alarm_win32_data = {0, NULL, -1}; + +static int win32_start_timer(struct qemu_alarm_timer *t); +static void win32_stop_timer(struct qemu_alarm_timer *t); + +#else + +static int unix_start_timer(struct qemu_alarm_timer *t); +static void unix_stop_timer(struct qemu_alarm_timer *t); + +#ifdef __linux__ + +static int rtc_start_timer(struct qemu_alarm_timer *t); +static void rtc_stop_timer(struct qemu_alarm_timer *t); + +#endif + +#endif /* _WIN32 */ + +static struct qemu_alarm_timer alarm_timers[] = { +#ifdef __linux__ +/* RTC - if available - is preferred */ +{"rtc", rtc_start_timer, rtc_stop_timer, NULL}, +#endif +#ifndef _WIN32 +{"unix", unix_start_timer, unix_stop_timer, NULL}, #else -/* frequency of the times() clock tick */ -static int timer_freq; +{"win32", win32_start_timer, win32_stop_timer, &alarm_win32_data}, #endif +{NULL, } +}; + +QEMUClock *rt_clock; +QEMUClock *vm_clock; + +static QEMUTimer *active_timers[2]; QEMUClock *qemu_new_clock(int type) { @@ -1009,7 +1049,8 @@ qemu_timer_expired(active_timers[QEMU_TIMER_REALTIME], 
qemu_get_clock(rt_clock))) { #ifdef _WIN32 -SetEvent(host_alarm); +struct qemu_alarm_win32 *data = ((struct qemu_alarm_timer*)dwUser)->priv; +SetEvent(data->host_alarm); #endif CPUState *env = cpu_single_env; if (env) { @@ -1030,10 +1071,27 @@ #define RTC_FREQ 1024 -static int rtc_fd; +static void enable_sigio_timer(int fd) +{ +struct sigaction act; -static int start_rtc_timer(void) +/* timer signal */ +sigfillset(&act.sa_mask); +act.sa_flags = 0; +#if defined (TARGET_I386) && defined(USE_CODE_COPY) +act.sa_flags |= SA_ONSTACK; +#endif +act.sa_handler = host_alarm_handler; + +sigaction(SIGIO, &act, NULL); +fcntl(fd, F_SETFL, O_ASYNC); +fcntl(fd, F_SETOWN, getpid()); +} + +static int rtc_start_timer(struct qemu_alarm_timer *t) { +int rtc_fd; + TFR(rtc_fd = open("/dev/rtc", O_RDONLY)); if (rtc_fd < 0) return -1; @@ -1048,117 +1106,142 @@ close(rtc_fd); return -1; } -pit_min_timer_count = PIT_FREQ / RTC_FREQ; + +enable_sigio_timer(rtc_fd); + +t->priv = (void *)rtc_fd; + return 0; } -#else - -static int start_rtc_timer(void) +static void rtc_stop_timer(struct qemu_alarm_timer *t) { -return -1; +int rtc_fd = (int)t->priv; + +close(rtc_fd); } #endif /* !defined(__linux__) */ -#endif /* !defined(_WIN32) */ +static int unix_start_timer(struct qemu_alarm_timer *t) +{ +struct sigaction act; +struct itimerval itv; +int err; + +/* timer signal */ +sigfillset(&act.sa_mask); +act.sa_flags = 0; +#if defined(TARGET_I386) && defined(USE_CODE_COPY) +act.sa_flags |= SA_ONSTACK; +#endif +act.sa_handler = host_alarm_handler; + +sigaction(SIGALRM, &act, NULL); + +itv.it_interval.tv_sec = 0; +/* for i386 kernel 2.6 to get 1 ms */ +itv.it_interval.tv_usec = 999; +itv.it_value.tv_sec = 0; +itv.it_value.tv_usec = 10 * 1000; -static void init_timer_alarm(void) +err = setitimer(ITIMER_REAL, &itv, NULL); +if (err) +return -1; + +return 0; +} + +static void unix_stop_timer(struct qemu_alarm_timer *t) { +struct itimerval itv; + +memset(&itv, 0, sizeof(itv)); +setitimer(ITIMER_REAL, &itv, NULL); 
+} + +#endif /* !defined(_WIN32) */ + #ifdef _WIN32 -{ -int count=0; -TIMECAPS tc; -ZeroMemory(&tc, sizeof(TIMECAPS)); -timeGetDevCaps(&tc, sizeof(TIMECAPS)); -if (period < tc.wPeriodMin) -period = tc.wPeriodMin; -timeBeginPeriod(period); -timerID = timeSetEvent(1, // in
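
The table-driven design introduced by this patch boils down to walking a
NULL-terminated array of backends and keeping the first one whose start()
succeeds. A minimal sketch under simplified assumptions: `struct
alarm_backend`, `pick_alarm` and the two fake backends are hypothetical
stand-ins for `struct qemu_alarm_timer` and the real rtc/unix timers, and
the selection loop is an illustration rather than the code from the
(truncated) patch.

```c
#include <assert.h>
#include <stddef.h>

struct alarm_backend {
    const char *name;
    int (*start)(struct alarm_backend *t);   /* 0 on success, -1 on failure */
    void (*stop)(struct alarm_backend *t);
    void *priv;                              /* backend-private state */
};

/* Fake backend that is never available (e.g. device node missing). */
static int fake_rtc_start(struct alarm_backend *t) { (void)t; return -1; }
/* Fake backend that always works. */
static int fake_unix_start(struct alarm_backend *t) { (void)t; return 0; }
static void fake_stop(struct alarm_backend *t) { (void)t; }

/* Ordered by preference; the entry with a NULL name terminates the table. */
static struct alarm_backend backends[] = {
    {"rtc",  fake_rtc_start,  fake_stop, NULL},
    {"unix", fake_unix_start, fake_stop, NULL},
    {NULL, NULL, NULL, NULL}
};

/* Return the first backend that starts, or NULL if none does. */
static struct alarm_backend *pick_alarm(struct alarm_backend *table)
{
    assert(table != NULL);
    for (struct alarm_backend *t = table; t->name; t++) {
        if (t->start(t) == 0)
            return t;
    }
    return NULL;
}
```

Because generic code only ever sees the start/stop pointers, adding a new
timer source means adding one table entry instead of another #ifdef branch.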
[Qemu-devel] [PATCH/RFC 4/4] Add support for dynamic ticks.
If DYNAMIC_TICKS is defined qemu does not attepmt to generate SIGALRM at a constant rate. Rather, the system timer is set to generate SIGALRM only when it is needed. DYNAMIC_TICKS reduces the number of SIGALRMs sent to idle dynamic-ticked guests. Original patch from Dan Kenigsberg <[EMAIL PROTECTED]> Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]> --- qemu/configure |5 ++ qemu/vl.c | 149 -- 2 files changed, 148 insertions(+), 6 deletions(-) diff --git a/qemu/configure b/qemu/configure index 365b7fb..38373db 100755 --- a/qemu/configure +++ b/qemu/configure @@ -262,6 +262,8 @@ for opt do ;; --enable-uname-release=*) uname_release="$optarg" ;; + --disable-dynamic-ticks) dynamic_ticks="no" + ;; esac done @@ -788,6 +790,9 @@ echo "TARGET_DIRS=$target_list" >> $config_mak if [ "$build_docs" = "yes" ] ; then echo "BUILD_DOCS=yes" >> $config_mak fi +if test "$dynamic_ticks" != "no" ; then + echo "#define DYNAMIC_TICKS 1" >> $config_h +fi # XXX: suppress that if [ "$bsd" = "yes" ] ; then diff --git a/qemu/vl.c b/qemu/vl.c index 0373beb..096729d 100644 --- a/qemu/vl.c +++ b/qemu/vl.c @@ -748,12 +748,42 @@ struct QEMUTimer { struct qemu_alarm_timer { char const *name; +unsigned int flags; int (*start)(struct qemu_alarm_timer *t); void (*stop)(struct qemu_alarm_timer *t); +void (*rearm)(struct qemu_alarm_timer *t); void *priv; }; +#define ALARM_FLAG_DYNTICKS 0x1 + +#ifdef DYNAMIC_TICKS + +static inline int alarm_has_dynticks(struct qemu_alarm_timer *t) +{ +return t->flags & ALARM_FLAG_DYNTICKS; +} + +static void qemu_rearm_alarm_timer(struct qemu_alarm_timer *t) { +if (!alarm_has_dynticks(t)) +return; + +t->rearm(t); +} + +#else /* DYNAMIC_TICKS */ + +static inline int alarm_has_dynticks(struct qemu_alarm_timer *t) +{ +return 0; +} + +static void qemu_rearm_alarm_timer(void) { +} + +#endif /* DYNAMIC_TICKS */ + static struct qemu_alarm_timer *alarm_timer; #ifdef _WIN32 @@ -772,6 +802,14 @@ static void win32_stop_timer(struct qemu_alarm_timer *t); static int 
unix_start_timer(struct qemu_alarm_timer *t); static void unix_stop_timer(struct qemu_alarm_timer *t); +#ifdef DYNAMIC_TICKS + +static int dynticks_start_timer(struct qemu_alarm_timer *t); +static void dynticks_stop_timer(struct qemu_alarm_timer *t); +static void dynticks_rearm_timer(struct qemu_alarm_timer *t); + +#endif + #ifdef __linux__ static int hpet_start_timer(struct qemu_alarm_timer *t); @@ -785,16 +823,19 @@ static void rtc_stop_timer(struct qemu_alarm_timer *t); #endif /* _WIN32 */ static struct qemu_alarm_timer alarm_timers[] = { +#ifndef _WIN32 +#ifdef DYNAMIC_TICKS +{"dynticks", ALARM_FLAG_DYNTICKS, dynticks_start_timer, dynticks_stop_timer, dynticks_rearm_timer, NULL}, +#endif #ifdef __linux__ /* HPET - if available - is preferred */ -{"hpet", hpet_start_timer, hpet_stop_timer, NULL}, +{"hpet", 0, hpet_start_timer, hpet_stop_timer, NULL, NULL}, /* ...otherwise try RTC */ -{"rtc", rtc_start_timer, rtc_stop_timer, NULL}, +{"rtc", 0, rtc_start_timer, rtc_stop_timer, NULL, NULL}, #endif -#ifndef _WIN32 -{"unix", unix_start_timer, unix_stop_timer, NULL}, +{"unix", 0, unix_start_timer, unix_stop_timer, NULL, NULL}, #else -{"win32", win32_start_timer, win32_stop_timer, &alarm_win32_data}, +{"win32", 0, win32_start_timer, win32_stop_timer, NULL, &alarm_win32_data}, #endif {NULL, } }; @@ -913,6 +954,8 @@ void qemu_del_timer(QEMUTimer *ts) } pt = &t->next; } + +qemu_rearm_alarm_timer(alarm_timer); } /* modify the current timer so that it will be fired when current_time @@ -972,6 +1015,7 @@ static void qemu_run_timers(QEMUTimer **ptimer_head, int64_t current_time) /* run the callback (the timer list can be modified) */ ts->cb(ts->opaque); } +qemu_rearm_alarm_timer(alarm_timer); } int64_t qemu_get_clock(QEMUClock *clock) @@ -1079,7 +1123,8 @@ static void host_alarm_handler(int host_signum) last_clock = ti; } #endif -if (qemu_timer_expired(active_timers[QEMU_TIMER_VIRTUAL], +if (alarm_has_dynticks(alarm_timer) || 
+qemu_timer_expired(active_timers[QEMU_TIMER_VIRTUAL], qemu_get_clock(vm_clock)) || qemu_timer_expired(active_timers[QEMU_TIMER_REALTIME], qemu_get_clock(rt_clock))) { @@ -1207,6 +1252,97 @@ static void rtc_stop_timer(struct qemu_alarm_timer *t) #endif /* !defined(__linux__) */ +#ifdef DYNAMIC_TICKS +static int dynticks_start_timer(struct qemu_alarm_timer *t) +{ +struct sigevent ev; +timer_t host_
[Qemu-devel] [PATCH/RFC 1/4] Rework alarm timer infrastructure.
Make the alarm code modular, removing #ifdef from the generic code and abstract a common interface for all the timer. The result is functionally equivalent to the old code. Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]> --- qemu/vl.c | 292 +-- 1 files changed, 189 insertions(+), 103 deletions(-) diff --git a/qemu/vl.c b/qemu/vl.c index 5360ed7..33443ca 100644 --- a/qemu/vl.c +++ b/qemu/vl.c @@ -745,18 +745,58 @@ struct QEMUTimer { struct QEMUTimer *next; }; -QEMUClock *rt_clock; -QEMUClock *vm_clock; +struct qemu_alarm_timer { +char const *name; + +int (*start)(struct qemu_alarm_timer *t); +void (*stop)(struct qemu_alarm_timer *t); +void *priv; +}; + +static struct qemu_alarm_timer *alarm_timer; -static QEMUTimer *active_timers[2]; #ifdef _WIN32 -static MMRESULT timerID; -static HANDLE host_alarm = NULL; -static unsigned int period = 1; + +struct qemu_alarm_win32 { +MMRESULT timerId; +HANDLE host_alarm; +unsigned int period; +} alarm_win32_data = {0, NULL, -1}; + +static int win32_start_timer(struct qemu_alarm_timer *t); +static void win32_stop_timer(struct qemu_alarm_timer *t); + +#else + +static int unix_start_timer(struct qemu_alarm_timer *t); +static void unix_stop_timer(struct qemu_alarm_timer *t); + +#ifdef __linux__ + +static int rtc_start_timer(struct qemu_alarm_timer *t); +static void rtc_stop_timer(struct qemu_alarm_timer *t); + +#endif + +#endif /* _WIN32 */ + +static struct qemu_alarm_timer alarm_timers[] = { +#ifdef __linux__ +/* RTC - if available - is preferred */ +{"rtc", rtc_start_timer, rtc_stop_timer, NULL}, +#endif +#ifndef _WIN32 +{"unix", unix_start_timer, unix_stop_timer, NULL}, #else -/* frequency of the times() clock tick */ -static int timer_freq; +{"win32", win32_start_timer, win32_stop_timer, &alarm_win32_data}, #endif +{NULL, } +}; + +QEMUClock *rt_clock; +QEMUClock *vm_clock; + +static QEMUTimer *active_timers[2]; QEMUClock *qemu_new_clock(int type) { @@ -973,7 +1013,8 @@ static void host_alarm_handler(int host_signum) 
qemu_timer_expired(active_timers[QEMU_TIMER_REALTIME], qemu_get_clock(rt_clock))) { #ifdef _WIN32 -SetEvent(host_alarm); +struct qemu_alarm_win32 *data = ((struct qemu_alarm_timer*)dwUser)->priv; +SetEvent(data->host_alarm); #endif CPUState *env = cpu_single_env; if (env) { @@ -995,10 +1036,31 @@ static void host_alarm_handler(int host_signum) #define RTC_FREQ 1024 static int use_rtc = 1; -static int rtc_fd; -static int start_rtc_timer(void) +static void enable_sigio_timer(int fd) +{ +struct sigaction act; + +/* timer signal */ +sigfillset(&act.sa_mask); +act.sa_flags = 0; +#if defined (TARGET_I386) && defined(USE_CODE_COPY) +act.sa_flags |= SA_ONSTACK; +#endif +act.sa_handler = host_alarm_handler; + +sigaction(SIGIO, &act, NULL); +fcntl(fd, F_SETFL, O_ASYNC); +fcntl(fd, F_SETOWN, getpid()); +} + +static int rtc_start_timer(struct qemu_alarm_timer *t) { +int rtc_fd; + +if (!use_rtc) +return -1; + rtc_fd = open("/dev/rtc", O_RDONLY); if (rtc_fd < 0) return -1; @@ -1009,121 +1071,145 @@ static int start_rtc_timer(void) goto fail; } if (ioctl(rtc_fd, RTC_PIE_ON, 0) < 0) { -fail: +fail: close(rtc_fd); return -1; } -pit_min_timer_count = PIT_FREQ / RTC_FREQ; + +enable_sigio_timer(rtc_fd); + +t->priv = (void *)rtc_fd; + return 0; } -#else - -static int start_rtc_timer(void) +static void rtc_stop_timer(struct qemu_alarm_timer *t) { -return -1; +int rtc_fd = (int)t->priv; + +close(rtc_fd); } #endif /* !defined(__linux__) */ -#endif /* !defined(_WIN32) */ +static int unix_start_timer(struct qemu_alarm_timer *t) +{ +struct sigaction act; +struct itimerval itv; +int err; -static void init_timer_alarm(void) +/* timer signal */ +sigfillset(&act.sa_mask); +act.sa_flags = 0; +#if defined(TARGET_I386) && defined(USE_CODE_COPY) +act.sa_flags |= SA_ONSTACK; +#endif +act.sa_handler = host_alarm_handler; + +sigaction(SIGALRM, &act, NULL); + +itv.it_interval.tv_sec = 0; +/* for i386 kernel 2.6 to get 1 ms */ +itv.it_interval.tv_usec = 999; +itv.it_value.tv_sec = 0; 
+itv.it_value.tv_usec = 10 * 1000; + +err = setitimer(ITIMER_REAL, &itv, NULL); +if (err) +return -1; + +return 0; +} + +static void unix_stop_timer(struct qemu_alarm_timer *t) { +struct itimerval itv; + +memset(&itv, 0, sizeof(itv)); +setitimer(ITIMER_REAL, &itv, NULL); +} + +#endif /* !defined(_WIN32) */ + #ifdef _WIN32 -{ -int count=0; -TIMECAPS tc; - -ZeroMemory(&tc, sizeof(TIMECAPS)); -timeGetDevCaps(&t
[Qemu-devel] [PATCH/RFC 3/4] Add support for HPET periodic timer.
Linux operates the HPET timer in legacy replacement mode, which means that the periodic interrupt of the CMOS RTC is not delivered (qemu won't be able to use /dev/rtc). Add support for HPET (/dev/hpet) as a replacement for the RTC; the periodic interrupt is delivered via SIGIO and is handled in the same way as the RTC timer. Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]> --- qemu/vl.c | 57 ++- 1 files changed, 56 insertions(+), 1 deletions(-) diff --git a/qemu/vl.c b/qemu/vl.c index f0b4896..0373beb 100644 --- a/qemu/vl.c +++ b/qemu/vl.c @@ -54,6 +54,7 @@ #include #include #include +#include #include #endif #endif @@ -773,6 +774,9 @@ static void unix_stop_timer(struct qemu_alarm_timer *t); #ifdef __linux__ +static int hpet_start_timer(struct qemu_alarm_timer *t); +static void hpet_stop_timer(struct qemu_alarm_timer *t); + static int rtc_start_timer(struct qemu_alarm_timer *t); static void rtc_stop_timer(struct qemu_alarm_timer *t); @@ -782,7 +786,9 @@ static void rtc_stop_timer(struct qemu_alarm_timer *t); static struct qemu_alarm_timer alarm_timers[] = { #ifdef __linux__ -/* RTC - if available - is preferred */ +/* HPET - if available - is preferred */ +{"hpet", hpet_start_timer, hpet_stop_timer, NULL}, +/* ...otherwise try RTC */ {"rtc", rtc_start_timer, rtc_stop_timer, NULL}, #endif #ifndef _WIN32 @@ -1117,6 +1123,55 @@ static void enable_sigio_timer(int fd) fcntl(fd, F_SETOWN, getpid()); } +static int hpet_start_timer(struct qemu_alarm_timer *t) +{ +struct hpet_info info; +int r, fd; + +fd = open("/dev/hpet", O_RDONLY); +if (fd < 0) +return -1; + +/* Set frequency */ +r = ioctl(fd, HPET_IRQFREQ, RTC_FREQ); +if (r < 0) { +fprintf(stderr, "Could not configure '/dev/hpet' to have a 1024Hz timer. 
This is not a fatal\n" +"error, but for better emulation accuracy type:\n" +"'echo 1024 > /proc/sys/dev/hpet/max-user-freq' as root.\n"); +goto fail; +} + +/* Check capabilities */ +r = ioctl(fd, HPET_INFO, &info); +if (r < 0) +goto fail; + +/* Enable periodic mode */ +r = ioctl(fd, HPET_EPI, 0); +if (info.hi_flags && (r < 0)) +goto fail; + +/* Enable interrupt */ +r = ioctl(fd, HPET_IE_ON, 0); +if (r < 0) +goto fail; + +enable_sigio_timer(fd); +t->priv = (void *)fd; + +return 0; +fail: +close(fd); +return -1; +} + +static void hpet_stop_timer(struct qemu_alarm_timer *t) +{ +int fd = (int)t->priv; + +close(fd); +} + static int rtc_start_timer(struct qemu_alarm_timer *t) { int rtc_fd; -- 1.5.2.4
[Qemu-devel] [PATCH/RFC 0/4] Rework alarm timer infrastructure.
Hello,
in reply to this mail I will send a series of 4 patches that clean up and expand the alarm timer handling in QEMU. The patches apply to the current kvm-userspace tree, but I think I can rebase them to QEMU svn if desired.

Patch 1 is mostly a cleanup of the existing code; instead of having multiple #ifdefs to handle different timers scattered all over the code, I've created a modular infrastructure where each timer type is self-contained and the generic code is more readable. The resulting code is functionally equivalent to the old one.

Patch 2 implements the "-clock" command line option proposed by Daniel Berrange and Avi Kivity. By default QEMU tries RTC and then falls back to the unix timer; the user can override the order of the timers through this option. The syntax is pretty simple: -clock timer1,timer2,etc. (QEMU will pick the first one that works).

Patch 3 adds support for HPET under Linux (which is basically my old patch). As suggested, HPET takes precedence over the other timers, but of course this can be overridden.

Patch 4 introduces the "dynticks" timer source; the patch is mostly based on the work of Dan Kenigsberg. dynticks is now the default alarm timer.
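The "pick the first one that works" behaviour described for -clock can be sketched roughly as follows. This is only an illustration under stated assumptions: struct alarm_timer, start_rtc, start_unix and pick_alarm_timer are hypothetical stand-ins, not the code from the patches, and the two backends simply pretend that RTC is unavailable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the timer descriptor used by the series. */
struct alarm_timer {
    const char *name;
    int (*start)(struct alarm_timer *t);   /* 0 on success, -1 on failure */
};

/* Illustrative backends: pretend /dev/rtc cannot be opened on this host. */
static int start_rtc(struct alarm_timer *t)  { (void)t; return -1; }
static int start_unix(struct alarm_timer *t) { (void)t; return 0; }

/* Candidates in order of precedence, terminated by a NULL name. */
static struct alarm_timer timers[] = {
    { "rtc",  start_rtc  },
    { "unix", start_unix },
    { NULL,   NULL       },
};

/* Walk the list in order and return the first timer that starts. */
static struct alarm_timer *pick_alarm_timer(struct alarm_timer *list)
{
    size_t i;

    assert(list != NULL);
    for (i = 0; list[i].name != NULL; i++) {
        if (list[i].start(&list[i]) == 0)
            return &list[i];
    }
    return NULL;   /* no timer could be started */
}
```

With the table above, pick_alarm_timer(timers) skips "rtc" (its start function fails) and returns the "unix" entry; reordering the table is all that is needed to change the preference.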
[Qemu-devel] [PATCH/RFC 2/4] Add -clock option.
Allow user to override the list of available alarm timers and their priority. The format of the options is -clock clk1,clk2,... Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]> --- qemu/vl.c | 90 -- 1 files changed, 72 insertions(+), 18 deletions(-) diff --git a/qemu/vl.c b/qemu/vl.c index 33443ca..f0b4896 100644 --- a/qemu/vl.c +++ b/qemu/vl.c @@ -793,6 +793,71 @@ static struct qemu_alarm_timer alarm_timers[] = { {NULL, } }; +static void show_available_alarms() +{ +int i; + +printf("Available alarm timers, in order of precedence:\n"); +for (i = 0; alarm_timers[i].name; i++) +printf("%s\n", alarm_timers[i].name); +} + +static void configure_alarms(char const *opt) +{ +int i; +int cur = 0; +int count = (sizeof(alarm_timers) / sizeof(*alarm_timers)) - 1; +char *arg; +char *name; + +if (!strcmp(opt, "help")) { +show_available_alarms(); +exit(0); +} + +arg = strdup(opt); + +/* Reorder the array */ +name = strtok(arg, ","); +while (name) { +struct qemu_alarm_timer tmp; + +for (i = 0; i < count; i++) { +if (!strcmp(alarm_timers[i].name, name)) +break; +} + +if (i == count) { +fprintf(stderr, "Unknown clock %s\n", name); +goto next; +} + +if (i < cur) +/* Ignore */ +goto next; + + /* Swap */ +tmp = alarm_timers[i]; +alarm_timers[i] = alarm_timers[cur]; +alarm_timers[cur] = tmp; + +cur++; +next: +name = strtok(NULL, ","); +} + +free(arg); + +if (cur) { + /* Disable remaining timers */ +for (i = cur; i < count; i++) +alarm_timers[i].name = NULL; +} + +/* debug */ +show_available_alarms(); +} + QEMUClock *rt_clock; QEMUClock *vm_clock; @@ -1035,8 +1100,6 @@ static void host_alarm_handler(int host_signum) #define RTC_FREQ 1024 -static int use_rtc = 1; - static void enable_sigio_timer(int fd) { struct sigaction act; @@ -1058,9 +1121,6 @@ static int rtc_start_timer(struct qemu_alarm_timer *t) { int rtc_fd; -if (!use_rtc) -return -1; - rtc_fd = open("/dev/rtc", O_RDONLY); if (rtc_fd < 0) return -1; @@ -6566,9 +6626,8 @@ void help(void) "-daemonize daemonize QEMU after 
initializing\n" #endif "-tdfinject timer interrupts that got lost\n" -#if defined(__linux__) - "-no-rtc don't use /dev/rtc for timer alarm (do use gettimeofday)\n" -#endif + "-clock force the use of the given methods for timer alarm.\n" + "To see what timers are available use -clock help\n" "-option-rom rom load a file, rom, into the option ROM space\n" "\n" "During emulation, the following keys are useful:\n" @@ -6658,9 +6717,7 @@ enum { QEMU_OPTION_semihosting, QEMU_OPTION_incoming, QEMU_OPTION_tdf, -#if defined(__linux__) -QEMU_OPTION_no_rtc, -#endif +QEMU_OPTION_clock, QEMU_OPTION_cpu_vendor, }; @@ -6755,9 +6812,7 @@ const QEMUOption qemu_options[] = { { "semihosting", 0, QEMU_OPTION_semihosting }, #endif { "tdf", 0, QEMU_OPTION_tdf }, /* enable time drift fix */ -#if defined(__linux__) -{ "no-rtc", 0, QEMU_OPTION_no_rtc }, -#endif +{ "clock", HAS_ARG, QEMU_OPTION_clock }, { "cpu-vendor", HAS_ARG, QEMU_OPTION_cpu_vendor }, { NULL }, }; @@ -7477,11 +7532,10 @@ int main(int argc, char **argv) break; case QEMU_OPTION_tdf: time_drift_fix = 1; -#if defined(__linux__) - case QEMU_OPTION_no_rtc: - use_rtc = 0; break; -#endif +case QEMU_OPTION_clock: +configure_alarms(optarg); +break; case QEMU_OPTION_cpu_vendor: cpu_vendor_string = optarg; break; -- 1.5.2.4
[Qemu-devel] Re: [kvm-devel] [PATCH] Dynamic ticks
\n" "\n" "During emulation, the following keys are useful:\n" @@ -6630,6 +6671,9 @@ enum { QEMU_OPTION_use_hpet, #endif QEMU_OPTION_cpu_vendor, +#ifdef DYNAMIC_TICKS +QEMU_OPTION_no_dyntick, +#endif }; typedef struct QEMUOption { @@ -6728,6 +6772,9 @@ const QEMUOption qemu_options[] = { { "use-hpet", 0, QEMU_OPTION_use_hpet }, #endif { "cpu-vendor", HAS_ARG, QEMU_OPTION_cpu_vendor }, +#if defined(DYNAMIC_TICKS) +{ "no-dyntick", 0, QEMU_OPTION_no_dyntick }, +#endif { NULL }, }; @@ -6932,6 +6979,78 @@ static BOOL WINAPI qemu_ctrl_handler(DWORD type) #define MAX_NET_CLIENTS 32 +#ifdef DYNAMIC_TICKS + +static timer_t host_timer; + +static int dynticks_create_timer() { +struct sigevent ev; + +ev.sigev_value.sival_int = 0; +ev.sigev_notify = SIGEV_SIGNAL; +ev.sigev_signo = SIGALRM; + +if (timer_create(CLOCK_REALTIME, &ev, &host_timer)) { +perror("timer_create"); + +/* disable dynticks */ +fprintf(stderr, "Dynamic Ticks disabled\n"); +use_dynticks = 0; + +return -1; +} + +return 0; +} + +/* call host_alarm_handler just when the nearest QEMUTimer expires */ +/* expire_time is measured in nanosec for vm_clock but in millisec + * for rt_clock + */ +static void rearm_host_timer(void) { +struct itimerspec timeout; +int64_t nearest_delta_us = INT64_MAX; + +if (!use_dynamic_ticks()) +return; + +if (active_timers[QEMU_TIMER_REALTIME] || +active_timers[QEMU_TIMER_VIRTUAL]) { +int64_t vmdelta_us, current_us; + +if (active_timers[QEMU_TIMER_REALTIME]) +nearest_delta_us = (active_timers[QEMU_TIMER_REALTIME]->expire_time - qemu_get_clock(rt_clock))*1000; + +if (active_timers[QEMU_TIMER_VIRTUAL]) { +/* round up */ +vmdelta_us = (active_timers[QEMU_TIMER_VIRTUAL]->expire_time - qemu_get_clock(vm_clock)+999)/1000; +if (vmdelta_us < nearest_delta_us) +nearest_delta_us = vmdelta_us; +} + +/* Avoid arming the timer to negative, zero, or too low values */ +/* MIN_TIMER_REARM_US should be optimized */ +#define MIN_TIMER_REARM_US 250 +if (nearest_delta_us <= MIN_TIMER_REARM_US) 
+nearest_delta_us = MIN_TIMER_REARM_US; + +/* check whether a timer is already running */ +if (timer_gettime(host_timer, &timeout)) +perror("gettime"); +current_us = timeout.it_value.tv_sec * 1000000 + timeout.it_value.tv_nsec/1000; +if (current_us && current_us <= nearest_delta_us) +return; + +timeout.it_interval.tv_sec = 0; +timeout.it_interval.tv_nsec = 0; /* 0 for one-shot timer */ +timeout.it_value.tv_sec = nearest_delta_us / 1000000; +timeout.it_value.tv_nsec = nearest_delta_us % 1000000 * 1000; +if (timer_settime(host_timer, 0 /* RELATIVE */, &timeout, NULL)) +perror("settime"); +} +} +#endif /* DYNAMIC_TICKS */ + static int saved_argc; static char **saved_argv; @@ -7457,6 +7576,11 @@ int main(int argc, char **argv) case QEMU_OPTION_cpu_vendor: cpu_vendor_string = optarg; break; +#ifdef DYNAMIC_TICKS +case QEMU_OPTION_no_dyntick: +use_dynticks = 0; +break; +#endif } } } Luca -- If you can't convince them, confuse them.
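The unit handling in rearm_host_timer is easy to get wrong, so here is a small standalone sketch of just the conversion between a delay in microseconds and the seconds/nanoseconds pair stored in it_value. The helper names us_to_timespec and timespec_to_us are made up for this example; only MIN_TIMER_REARM_US and its value of 250 come from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define MIN_TIMER_REARM_US 250       /* same floor as in the patch */
#define US_PER_SEC         1000000   /* microseconds per second */

/* Convert a delay in microseconds to a timespec, clamping it to the
 * minimum rearm interval so the timer is never armed with a zero,
 * negative or very small value. */
static struct timespec us_to_timespec(int64_t delta_us)
{
    struct timespec ts;

    if (delta_us < MIN_TIMER_REARM_US)
        delta_us = MIN_TIMER_REARM_US;

    ts.tv_sec  = (time_t)(delta_us / US_PER_SEC);
    ts.tv_nsec = (long)(delta_us % US_PER_SEC) * 1000;
    return ts;
}

/* Inverse direction: remaining time of a timer, in microseconds. */
static int64_t timespec_to_us(const struct timespec *ts)
{
    assert(ts != NULL);
    return (int64_t)ts->tv_sec * US_PER_SEC + ts->tv_nsec / 1000;
}
```

For instance, a delay of 1500000 us becomes 1 s plus 500000000 ns, and a delay of 10 us is raised to the 250 us floor before conversion.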
[Qemu-devel] Re: [kvm-devel] [PATCH] Add support for HPET periodic timer.
On 8/13/07, Daniel P. Berrange <[EMAIL PROTECTED]> wrote:
> On Mon, Aug 13, 2007 at 11:04:46AM +0300, Avi Kivity wrote:
> > Luca Tettamanti wrote:
> > Something like:
> >
> > - try to use HPET (unless -no-rtc selected)
> > - try to use RTC (unless -no-rtc selected)
> > - fallback to normal unix facilities
>
> If we're going to add command line args it probably makes sense to be a
> little more generic than -no-rtc or -use-hpet. Have a list of preferred
> clock sources eg
>
> -clock hpet,rtc,unix
>
> If -clock is omitted, then default to trying all in the priority you
> describe.

Makes sense, I'll prepare a patch (maybe after a couple of days of vacation :P).

Luca
[Qemu-devel] [PATCH] Add support for HPET periodic timer.
Linux operates the HPET timer in legacy replacement mode, which means that the periodic interrupt of the CMOS RTC is not delivered (qemu won't be able to use /dev/rtc). Add support for HPET (/dev/hpet) as a replacement for the RTC; the periodic interrupt is delivered via SIGIO and is handled in the same way as the RTC timer. HPET must be explicitly enabled with -use-hpet. Signed-off-by: Luca Tettamanti <[EMAIL PROTECTED]> --- qemu/vl.c | 62 +- 1 files changed, 60 insertions(+), 2 deletions(-) diff --git a/qemu/vl.c b/qemu/vl.c index 4ad39f1..db7262b 100644 --- a/qemu/vl.c +++ b/qemu/vl.c @@ -54,6 +54,7 @@ #include #include #include +#include #include #endif #endif @@ -996,8 +997,9 @@ static void host_alarm_handler(int host_signum) static int use_rtc = 1; static int rtc_fd; +static int use_hpet; -static int start_rtc_timer(void) +static int enable_rtc(void) { rtc_fd = open("/dev/rtc", O_RDONLY); if (rtc_fd < 0) @@ -1017,6 +1019,56 @@ static int start_rtc_timer(void) return 0; } +static char default_hpet[] = "/dev/hpet"; + +static int enable_hpet(void) +{ +struct hpet_info info; +int r; + +rtc_fd = open(default_hpet, O_RDONLY); +if (rtc_fd < 0) +return -1; + +/* Set frequency */ +r = ioctl(rtc_fd, HPET_IRQFREQ, RTC_FREQ); +if (r < 0) { +fprintf(stderr, "Could not configure '/dev/hpet' to have a 1024 Hz timer. 
This is not a fatal\n" +"error, but for better emulation accuracy type:\n" +"'echo 1024 > /proc/sys/dev/hpet/max-user-freq' as root.\n"); +goto fail; +} + +/* Check capabilities */ +r = ioctl(rtc_fd, HPET_INFO, &info); +if (r < 0) +goto fail; + +/* Enable periodic mode */ +r = ioctl(rtc_fd, HPET_EPI, 0); +if (info.hi_flags && (r < 0)) +goto fail; + +/* Enable interrupt */ +r = ioctl(rtc_fd, HPET_IE_ON, 0); +if (r < 0) +goto fail; + +pit_min_timer_count = PIT_FREQ / RTC_FREQ; + +return 0; +fail: +close(rtc_fd); +return -1; +} + +static int start_rtc_timer(void) +{ +if (use_hpet) +return enable_hpet(); +else +return enable_rtc(); +} #else static int start_rtc_timer(void) @@ -1090,7 +1142,7 @@ static void init_timer_alarm(void) 2.6 kernels */ if (itv.it_interval.tv_usec > 1000 || 1) { /* try to use /dev/rtc to have a faster timer */ -if (!use_rtc || (start_rtc_timer() < 0)) +if ((!use_rtc && !use_hpet) || (start_rtc_timer() < 0)) goto use_itimer; /* disable itimer */ itv.it_interval.tv_sec = 0; @@ -6482,6 +6534,7 @@ void help(void) "-tdfinject timer interrupts that got lost\n" #if defined(__linux__) "-no-rtc don't use /dev/rtc for timer alarm (do use gettimeofday)\n" + "-use-hpet use /dev/hpet (HPET) instead of RTC for timer alarm\n" #endif "-option-rom rom load a file, rom, into the option ROM space\n" "\n" @@ -6574,6 +6627,7 @@ enum { QEMU_OPTION_tdf, #if defined(__linux__) QEMU_OPTION_no_rtc, +QEMU_OPTION_use_hpet, #endif QEMU_OPTION_cpu_vendor, }; @@ -6671,6 +6725,7 @@ const QEMUOption qemu_options[] = { { "tdf", 0, QEMU_OPTION_tdf }, /* enable time drift fix */ #if defined(__linux__) { "no-rtc", 0, QEMU_OPTION_no_rtc }, +{ "use-hpet", 0, QEMU_OPTION_use_hpet }, #endif { "cpu-vendor", HAS_ARG, QEMU_OPTION_cpu_vendor }, { NULL }, @@ -7395,6 +7450,9 @@ int main(int argc, char **argv) case QEMU_OPTION_no_rtc: use_rtc = 0; break; + case QEMU_OPTION_use_hpet: + use_hpet = 1; + break; #endif case QEMU_OPTION_cpu_vendor: cpu_vendor_string = optarg; -- 1.5.2.3
[Qemu-devel] Re: [kvm-devel] PIIX/IDE: ports disabled in PCI config space?
On 6/5/07, Avi Kivity <[EMAIL PROTECTED]> wrote:
> Luca Tettamanti wrote:
> > Hello,
> > I'm testing the new Fedora7 under KVM. As you may know Fedora has
> > migrated to the new libata drivers.
> >
> > ata_piix is unhappy with the PIIX IDE controller provided by QEmu/KVM:
> >
> > [...]
> > The following patch fixes the problem (i.e. ata_piix finds both the HD
> > and the cdrom):
> >
> > --- a/hw/ide.c 2007-06-04 19:34:25.0 +0200
> > +++ b/hw/ide.c 2007-06-04 21:45:28.0 +0200
> > @@ -2586,6 +2586,8 @@ static void piix3_reset(PCIIDEState *d)
> > pci_conf[0x06] = 0x80; /* FBC */
> > pci_conf[0x07] = 0x02; // PCI_status_devsel_medium
> > pci_conf[0x20] = 0x01; /* BMIBA: 20-23h */
> > +pci_conf[0x41] = 0x80; // enable port 0
> > +pci_conf[0x43] = 0x80; // enable port 1
> > }
> >
> > void pci_piix_ide_init(PCIBus *bus, BlockDriverState **hd_table, int devfn)
> >
>
> I imagine the reset state in the spec is disabled?

Yes, you are right. Btw, this is the doc:
http://www.intel.com/design/intarch/datashts/290550.htm

> If so, then the long-term fix is to enable these bits in the bios.

Good point. I'm looking at bochs right now: the BIOS doesn't touch those bits. The 2 ports are enabled/disabled - using values from the config file - by the reset function of the emulated controller (iodev/pci_ide.cc), so they're doing the same thing I've done for QEMU.

I'd rather not touch the BIOS, it's a bit obscure ;-)

Luca
[Qemu-devel] PIIX/IDE: ports disabled in PCI config space?
Hello,
I'm testing the new Fedora7 under KVM. As you may know Fedora has migrated to the new libata drivers.

ata_piix is unhappy with the PIIX IDE controller provided by QEmu/KVM:

libata version 2.20 loaded.
ata_piix :00:01.1: version 2.10ac1
PCI: Setting latency timer of device :00:01.1 to 64
ata1: PATA max MWDMA2 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x00011400 irq 14
ata2: PATA max MWDMA2 cmd 0x00010170 ctl 0x00010376 bmdma 0x00011408 irq 15
scsi0 : ata_piix
ata1: port disabled. ignoring.
scsi1 : ata_piix
ata2: port disabled. ignoring.

The "port disabled" messages are generated by piix_pata_prereset (called by piix_pata_error_handler), see drivers/ata/ata_piix.c.

piix_pata_prereset checks PCI config space to see whether the port is active or not:

static int piix_pata_prereset(struct ata_port *ap, unsigned long deadline)
{
        struct pci_dev *pdev = to_pci_dev(ap->host->dev);

        if (!pci_test_config_bits(pdev, &piix_enable_bits[ap->port_no]))
                return -ENOENT;
        return ata_std_prereset(ap, deadline);
}

with:

static struct pci_bits piix_enable_bits[] = {
        { 0x41U, 1U, 0x80UL, 0x80UL }, /* port 0 */
        { 0x43U, 1U, 0x80UL, 0x80UL }, /* port 1 */
};

which means that it will read 1 byte at offset 0x41 (or 0x43), mask it with 0x80, and expect the result to be 0x80 (i.e. it checks bit 7).

Bit 7 is described in the Intel docs in this way:

"IDE Decode Enable (IDE). 1=Enable; 0=Disable. When enabled, I/O transactions on PCI targeting the IDE ATA register blocks (command block and control block) are positively decoded on PCI and driven on the IDE interface. When disabled, these accesses are subtractively decoded to ISA."

Now this is the config space of the IDE controller:

00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] (prog-if 80 [Master])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-
Re: [Qemu-devel] QEMU GUI
Chris Wilson wrote:
>
> QT is only now free on Windows, and supports far fewer platforms than wx
> (no Mac support?). I personally don't like tcl as a language, and prefer
> to code in C++ for efficiency.

qt/mac exists.

>
> GTK is also specific to Unix (not Mac) and Windows, and looks weird on
> Windows, not very native.

gtk/cocoa exists.

>
> Or platforms other than Windows and Unix.

name them, macosx has X and cocoa mostly supported by major toolkits.

tk+tile works everywhere and is also looking good. =)

lu

--
Luca Barbato
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero

___
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel
Re: [Qemu-devel] QEMU GUI
Chris Wilson wrote:
>
> I'd be interested to know why you dislike it.

The library is incompatible with itself depending on the configure-time options (see string constructors vs unicode string constructors).
Its ABI/API changes too often (ok, that is the result of them fixing lots of bugs that require radical changes, but those bugs shouldn't have been there in the first place...).
Its architecture is a tad old.

> I actually find it very
> nice to code in wx, much easier than GTK or MFC or raw Win32 API.

Try Qt or ewl/etk if you don't like the default tcl/tk look; all 4 are quite a bit nicer architecture-wise and less painful to handle as a dependency.

MFC and winapi are surely worse than wx; gtk, on the other hand, is simple and relatively easy to learn.

The main/only point of wx is that it mimics some sort of native look&feel quite well, and that is only nice if you have to handle windows users or idiotic managers.

--
Luca Barbato
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero
Re: [Qemu-devel] QEMU GUI
Fabrice Bellard wrote:
> Hi,
>
> Concerning the QEMU GUI, my mind slightly evolved since my last posts on
> the topic: I think that a wxWidgets GUI would be the best as it is
> reasonably portable and because it uses the native GUIs.

wx is nasty at best.

--
Luca Barbato
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero
Re: [Qemu-devel] qemu 0.7.0 does not compile with binutils 2.16
Kevin F. Quinn wrote:
> On 10/6/2005 19:54:53, Gioele Barabucci ([EMAIL PROTECTED]) wrote:
>
> >>I tried to compile qemu 0.7.0 on my gentoo-ppc box.
> > ...
> >>I used gcc-3.4.3, binutils 2.16 and gentoo's glibc 2.3.4.20041102-r1 with
> >>NPTL.
>
> Posting a bug to Gentoo's bugzilla would be a good idea
> (http://bugs.gentoo.org).
> Since binutils-2.16 is ~ppc (i.e. being tested), you could try the latest
> stable version of binutils (2.15.90.0.3-r5). That would demonstrate
> whether it's a binutils issue or not.

The problem has already been reported in the Gentoo bugzilla and in the binutils bugzilla.

As I wrote before, the issue is in the linker scripts. It isn't an ld problem (ld does the right thing) nor a distribution problem (I'm not so keen on using an older version that probably has more open issues).

lu

--
Luca Barbato
Gentoo/linux Developer Gentoo/PPC Operational Leader
http://dev.gentoo.org/~lu_zero
[Qemu-devel] Linker issues with binutils 2.16 on ppc
ld complains about "Not enough room for program headers".

That is due to the use of SIZEOF_HEADERS in the linker script.

Reproduced with the current cvs on the -user targets.

Since I have no experience with ld scripts, I didn't try to figure out a fix myself.

Basic system info:
gcc-3.4.4, glibc-2.3.5-r0, 2.6.11-gentoo-r9
binutils-2.16

lu

--
Luca Barbato
Gentoo/linux Developer Gentoo/PPC Operational Leader
http://dev.gentoo.org/~lu_zero
[Qemu-devel] Can't compile on ubuntu 5.04 amd64
Hi!
I can't compile qemu on my Ubuntu 5.04 Amd64; after a bit of work the compilation stops with:

for d in i386-user arm-user armeb-user sparc-user ppc-user i386-softmmu ppc-softmmu sparc-softmmu x86_64-softmmu; do \
make -C $d all || exit 1 ; \
done
make[1]: Entering directory `/prateria/tmp/qemu-0.7.0/i386-user'
gcc-3.4 -g -Wl,-T,/prateria/tmp/qemu-0.7.0/x86_64.ld -o qemu-i386 elfload.o main.o syscall.o mmap.o signal.o path.o osdep.o thunk.o vm86.o libqemu.a gdbstub.o -lm
/usr/bin/ld:/prateria/tmp/qemu-0.7.0/x86_64.ld:62: parse error
collect2: ld returned 1 exit status
make[1]: *** [qemu-i386] Error 1
make[1]: Leaving directory `/prateria/tmp/qemu-0.7.0/i386-user'
make: *** [all] Error 1

As you can see, it seems to be an ld error. My binutils version is 2.15. To configure I used the trivial:

./configure

Playing a bit with configure, if I do:

./configure --target-list="x86_64-softmmu i386-softmmu"

I can compile a working qemu, so it is the i386-user target that creates problems...

The only problem I have with these binaries is that networking doesn't work (neither usermode nor tun/tap), but looking in the ML archives I found it is a known amd64 issue.

--
"Uhm... did I say it, or did I only think it?"
.::. Ziabice aka Luca Gambetta .::.