date:20230907

Re: [PULL v2 00/35] ppc queue

2023-09-07 Thread Nicholas Piggin

On Fri Sep 8, 2023 at 8:15 AM AEST, Cédric Le Goater wrote:
> On 9/7/23 21:10, Michael Tokarev wrote:
> > 06.09.2023 17:36, Cédric Le Goater wrote:
> > ...
> >> ppc queue :
> >>
> >> * debug facility improvements
> >> * timebase and decrementer fixes
> >> * record-replay fixes
> >> * TCG fixes
> >> * XIVE model improvements for multichip
> >>
> >> 
> >> Cédric Le Goater (4):
> >>    ppc/xive: Use address_space routines to access the machine RAM
> >>    ppc/xive: Introduce a new XiveRouter end_notify() handler
> >>    ppc/xive: Handle END triggers between chips with MMIOs
> >>    ppc/xive: Add support for the PC MMIOs
> >>
> >> Joel Stanley (1):
> >>    ppc: Add stub implementation of TRIG SPRs
> >>
> >> Maksim Kostin (1):
> >>    hw/ppc/e500: fix broken snapshot replay
> >>
> >> Nicholas Piggin (26):
> >>    target/ppc: Remove single-step suppression inside 0x100-0xf00
> >>    target/ppc: Improve book3s branch trace interrupt for v2.07S
> >>    target/ppc: Suppress single step interrupts on rfi-type instructions
> >>    target/ppc: Implement breakpoint debug facility for v2.07S
> >>    target/ppc: Implement watchpoint debug facility for v2.07S
> >>    spapr: implement H_SET_MODE debug facilities
> >>    ppc/vhyp: reset exception state when handling vhyp hcall
> >>    ppc/vof: Fix missed fields in VOF cleanup
> >>    hw/ppc/ppc.c: Tidy over-long lines
> >>    hw/ppc: Introduce functions for conversion between timebase and nanoseconds
> >>    host-utils: Add muldiv64_round_up
> >>    hw/ppc: Round up the decrementer interval when converting to ns
> >>    hw/ppc: Avoid decrementer rounding errors
> >>    target/ppc: Sign-extend large decrementer to 64-bits
> >>    hw/ppc: Always store the decrementer value
> >>    target/ppc: Migrate DECR SPR
> >>    hw/ppc: Reset timebase facilities on machine reset
> >>    hw/ppc: Read time only once to perform decrementer write
> >>    target/ppc: Fix CPU reservation migration for record-replay
> >>    target/ppc: Fix timebase reset with record-replay
> >>    spapr: Fix machine reset deadlock from replay-record
> >>    spapr: Fix record-replay machine reset consuming too many events
> >>    tests/avocado: boot ppc64 pseries replay-record test to Linux VFS mount
> >>    tests/avocado: reverse-debugging cope with re-executing breakpoints
> >>    tests/avocado: ppc64 reverse debugging tests for pseries and powernv
> >>    target/ppc: Fix LQ, STQ register-pair order for big-endian
> >>
> >> Richard Henderson (1):
> >>    target/ppc: Flush inputs to zero with NJ in ppc_store_vscr
> >>
> >> Shawn Anastasio (1):
> >>    target/ppc: Generate storage interrupts for radix RC changes
> >>
> >> jianchunfu (1):
> >>    target/ppc: Fix the order of kvm_enable judgment about kvmppc_set_interrupt()
> > 
> > Is there anything in there worth to pick for -stable?
> > Like, for example, some decrementer fixes, 
>
> The decrementer fixes are good candidates but there are quite a few
> patches and you might encounter conflicts.

Decrementer I was nervous about since there were quite a lot of
interacting issues. Decrementer has worked okay for a while, so
even though there are some bugs, they're mostly in edge cases
that most OSes don't hit or care so much about.

Possibly the decrementer migration patch could be a candidate.

In any case I would like them to get more testing upstream for
a while first.

>
> > or some of these:
> > 
> >   ppc/vof: Fix missed fields in VOF cleanup

vof patch I think is a candidate. Simple and fixes leaks.

> >   spapr: Fix machine reset deadlock from replay-record
> >   hw/ppc/e500: fix broken snapshot replay
>
> I can not tell if replay-record is important for stable. Nick ?

It seems to have been broken in many ways for long enough that
nobody was really using it (at least on pseries). Maybe e500
because an issue was filed for that and the fix looked small.

>   
> > or something else?
>
> These are :
>
>    target/ppc: Flush inputs to zero with NJ in ppc_store_vscr
>    target/ppc: Fix LQ, STQ register-pair order for big-endian

Yes definitely these two.

Thanks,
Nick

[RESEND] qemu/timer: Add host ticks function for RISC-V

2023-09-07 Thread LIU Zhiwei

From: LIU Zhiwei 

Signed-off-by: LIU Zhiwei 
---
 include/qemu/timer.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 9a91cb1248..105767c195 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -979,6 +979,25 @@ static inline int64_t cpu_get_host_ticks(void)
 return cur - ofs;
 }
 
+#elif defined(__riscv) && defined(__riscv_xlen) && __riscv_xlen == 32
+static inline int64_t cpu_get_host_ticks(void)
+{
+uint32_t lo, hi;
+asm volatile("RDCYCLE %0\n\t"
+ "RDCYCLEH %1"
+ : "=r"(lo), "=r"(hi));
+return lo | (uint64_t)hi << 32;
+}
+
+#elif defined(__riscv) && defined(__riscv_xlen) && __riscv_xlen > 32
+static inline int64_t cpu_get_host_ticks(void)
+{
+int64_t val;
+
+asm volatile("RDCYCLE %0" : "=r"(val));
+return val;
+}
+
 #else
 /* The host CPU doesn't have an easily accessible cycle counter.
Just return a monotonically increasing value.  This will be
-- 
2.17.1
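
Note on the RV32 variant above: the low and high halves are read by two separate instructions, so a carry out of the low half between the reads could produce a torn 64-bit value. The RISC-V ISA manual describes re-reading the high half and retrying when it changed. The following is only a portable sketch of that loop, not part of the patch; read_half_fn, read_split_counter and the mock_counter helpers are hypothetical names that stand in for RDCYCLE/RDCYCLEH so the logic can be exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor type standing in for RDCYCLE / RDCYCLEH. */
typedef uint32_t (*read_half_fn)(void *opaque);

/*
 * Read a 64-bit counter exposed as two 32-bit halves.  The high half is
 * read before and after the low half; if it changed, the low half may
 * have wrapped in between, so the sequence is retried.
 */
static uint64_t read_split_counter(read_half_fn read_lo, read_half_fn read_hi,
                                   void *opaque)
{
    uint32_t hi, lo, hi2;

    assert(read_lo && read_hi);
    do {
        hi = read_hi(opaque);
        lo = read_lo(opaque);
        hi2 = read_hi(opaque);
    } while (hi != hi2);

    return ((uint64_t)hi << 32) | lo;
}

/* Mock counter for testing: every read advances the counter by `step`. */
struct mock_counter {
    uint64_t value;
    uint64_t step;
};

static uint32_t mock_read_lo(void *opaque)
{
    struct mock_counter *c = opaque;
    c->value += c->step;
    return (uint32_t)c->value;
}

static uint32_t mock_read_hi(void *opaque)
{
    struct mock_counter *c = opaque;
    c->value += c->step;
    return (uint32_t)(c->value >> 32);
}
```

On real hardware the two accessors would be the RDCYCLEH and RDCYCLE instructions; whether the extra read matters for cpu_get_host_ticks(), which is only used for profiling-style measurements, is a judgment call.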

[PATCH] qemu/timer: Add host ticks function for RISC-V

2023-09-07 Thread LIU Zhiwei

From: LIU Zhiwei 

Signed-off-by: LIU Zhiwei 
---
 include/qemu/timer.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 9a91cb1248..ce0b66d122 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -979,6 +979,25 @@ static inline int64_t cpu_get_host_ticks(void)
 return cur - ofs;
 }
 
+#elif defined(__riscv) && defined(__riscv_xlen) && __riscv_xlen == 32
+static inline int64_t cpu_get_host_ticks(void)
+{
+uint32_t lo, hi;
+asm volatile("RDCYCLE %0\n\t"
+ "RDCYCLEH %1"
+ : "=r"(lo), "=r"(hi));
+return lo | (uint64_t)hi << 32;
+}
+
+#elif defined(__riscv) && defined(__riscv_xlen) && __riscv_xlen > 32
+static inline int64_t cpu_get_host_ticks(void)
+{
+int64_t val;
+
+asm volatile("RDCYCLE %0" : "=r"(val));
+return val;
+}
+
 #else
 /* The host CPU doesn't have an easily accessible cycle counter.
Just return a monotonically increasing value.  This will be
-- 
2.17.1

Re: [PATCH RESEND v5 03/57] target/loongarch: Use gen_helper_gvec_4_ptr for 4OP + env vector instructions

2023-09-07 Thread gaosong


On 2023/9/8 at 1:34 AM, Richard Henderson wrote:

On 9/7/23 01:31, Song Gao wrote:
+static bool gen__ptr_vl(DisasContext *ctx, arg_ *a, uint32_t oprsz,
+    gen_helper_gvec_4_ptr *fn)
+{
+    tcg_gen_gvec_4_ptr(vec_full_offset(a->vd),
+   vec_full_offset(a->vj),
+   vec_full_offset(a->vk),
+   vec_full_offset(a->va),
+   cpu_env,
+   oprsz, ctx->vl / 8, oprsz, fn);
                        ^^^^^

This next to last argument is 'data', which is unused for this case.
Just use 0 here.


Got it,  I will correct the other 6 similar patches.

Thanks.
Song Gao

Re: [PATCH RESEND v5 02/57] target/loongarch: Implement gvec_*_vl functions

2023-09-07 Thread gaosong


On 2023/9/8 at 1:19 AM, Richard Henderson wrote:

On 9/7/23 01:31, Song Gao wrote:

Using gvec_*_vl functions hides oprsz. We can use gvec_v* for oprsz 16.
and gvec_v* for oprsz 32.

Signed-off-by: Song Gao
---
  target/loongarch/insn_trans/trans_vec.c.inc | 68 +
  1 file changed, 44 insertions(+), 24 deletions(-)


The description above is not quite right.  How about:

   Create gvec_*_vl functions in order to hide oprsz.
   This is used by gvec_v* functions for oprsz 16,
   and will be used by gvec_x* functions for oprsz 32.


Yes, I will correct it.

Thanks.
Song Gao

[PATCH v4 02/16] tcg/loongarch64: Lower basic tcg vec ops to LSX

2023-09-07 Thread Jiajie Chen

LSX support on host cpu is detected via hwcap.

Lower the following ops to LSX:

- dup_vec
- dupi_vec
- dupm_vec
- ld_vec
- st_vec

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target-con-set.h |   2 +
 tcg/loongarch64/tcg-target-con-str.h |   1 +
 tcg/loongarch64/tcg-target.c.inc | 219 ++-
 tcg/loongarch64/tcg-target.h |  38 -
 tcg/loongarch64/tcg-target.opc.h |  12 ++
 5 files changed, 270 insertions(+), 2 deletions(-)
 create mode 100644 tcg/loongarch64/tcg-target.opc.h

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index c2bde44613..37b3f80bf9 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -17,7 +17,9 @@
 C_O0_I1(r)
 C_O0_I2(rZ, r)
 C_O0_I2(rZ, rZ)
+C_O0_I2(w, r)
 C_O1_I1(r, r)
+C_O1_I1(w, r)
 C_O1_I2(r, r, rC)
 C_O1_I2(r, r, ri)
 C_O1_I2(r, r, rI)
diff --git a/tcg/loongarch64/tcg-target-con-str.h b/tcg/loongarch64/tcg-target-con-str.h
index 6e9ccca3ad..81b8d40278 100644
--- a/tcg/loongarch64/tcg-target-con-str.h
+++ b/tcg/loongarch64/tcg-target-con-str.h
@@ -14,6 +14,7 @@
  * REGS(letter, register_mask)
  */
 REGS('r', ALL_GENERAL_REGS)
+REGS('w', ALL_VECTOR_REGS)
 
 /*
  * Define constraint letters for constants:
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index baf5fc3819..150278e112 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -32,6 +32,8 @@
 #include "../tcg-ldst.c.inc"
 #include 
 
+bool use_lsx_instructions;
+
 #ifdef CONFIG_DEBUG_TCG
 static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
 "zero",
@@ -65,7 +67,39 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
 "s5",
 "s6",
 "s7",
-"s8"
+"s8",
+"vr0",
+"vr1",
+"vr2",
+"vr3",
+"vr4",
+"vr5",
+"vr6",
+"vr7",
+"vr8",
+"vr9",
+"vr10",
+"vr11",
+"vr12",
+"vr13",
+"vr14",
+"vr15",
+"vr16",
+"vr17",
+"vr18",
+"vr19",
+"vr20",
+"vr21",
+"vr22",
+"vr23",
+"vr24",
+"vr25",
+"vr26",
+"vr27",
+"vr28",
+"vr29",
+"vr30",
+"vr31",
 };
 #endif
 
@@ -102,6 +136,15 @@ static const int tcg_target_reg_alloc_order[] = {
 TCG_REG_A2,
 TCG_REG_A1,
 TCG_REG_A0,
+
+/* Vector registers */
+TCG_REG_V0, TCG_REG_V1, TCG_REG_V2, TCG_REG_V3,
+TCG_REG_V4, TCG_REG_V5, TCG_REG_V6, TCG_REG_V7,
+TCG_REG_V8, TCG_REG_V9, TCG_REG_V10, TCG_REG_V11,
+TCG_REG_V12, TCG_REG_V13, TCG_REG_V14, TCG_REG_V15,
+TCG_REG_V16, TCG_REG_V17, TCG_REG_V18, TCG_REG_V19,
+TCG_REG_V20, TCG_REG_V21, TCG_REG_V22, TCG_REG_V23,
+/* V24 - V31 are callee-saved, and skipped.  */
 };
 
 static const int tcg_target_call_iarg_regs[] = {
@@ -135,6 +178,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
 #define TCG_CT_CONST_WSZ   0x2000
 
 #define ALL_GENERAL_REGS   MAKE_64BIT_MASK(0, 32)
+#define ALL_VECTOR_REGS    MAKE_64BIT_MASK(32, 32)
 
 static inline tcg_target_long sextreg(tcg_target_long val, int pos, int len)
 {
@@ -1486,6 +1530,154 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 }
 }
 
+static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
+TCGReg rd, TCGReg rs)
+{
+switch (vece) {
+case MO_8:
+tcg_out_opc_vreplgr2vr_b(s, rd, rs);
+break;
+case MO_16:
+tcg_out_opc_vreplgr2vr_h(s, rd, rs);
+break;
+case MO_32:
+tcg_out_opc_vreplgr2vr_w(s, rd, rs);
+break;
+case MO_64:
+tcg_out_opc_vreplgr2vr_d(s, rd, rs);
+break;
+default:
+g_assert_not_reached();
+}
+return true;
+}
+
+static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
+ TCGReg r, TCGReg base, intptr_t offset)
+{
+/* Handle imm overflow and division (vldrepl.d imm is divided by 8) */
+if (offset < -0x800 || offset > 0x7ff || \
+(offset & ((1 << vece) - 1)) != 0) {
+tcg_out_addi(s, TCG_TYPE_I64, TCG_REG_TMP0, base, offset);
+base = TCG_REG_TMP0;
+offset = 0;
+}
+offset >>= vece;
+
+switch (vece) {
+case MO_8:
+tcg_out_opc_vldrepl_b(s, r, base, offset);
+break;
+case MO_16:
+tcg_out_opc_vldrepl_h(s, r, base, offset);
+break;
+case MO_32:
+tcg_out_opc_vldrepl_w(s, r, base, offset);
+break;
+case MO_64:
+tcg_out_opc_vldrepl_d(s, r, base, offset);
+break;
+default:
+g_assert_not_reached();
+}
+return true;
+}
+
+static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece,
+ TCGReg rd, int64_t v64)
+{
+/* Try vldi if imm can fit */
+int64_t value = sextract64(v64, 0, 8 << vece);
+if (-0x200 <= value && value <= 0x1FF) {
+
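
On the offset handling in tcg_out_dupm_vec earlier in this patch: the offset is used directly only when it fits a signed 12-bit byte range and is a multiple of the element size, and it is then scaled down by the element size. The following is a standalone sketch of just that decision, assuming nothing beyond what the hunk shows; dupm_offset_encodable is a hypothetical helper name, not a QEMU function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a byte offset can be used directly for a load-and-
 * replicate of elements of size (1 << vece) bytes.  Mirrors the check in
 * the patch: byte offset in [-0x800, 0x7ff] and aligned to the element
 * size.  On success the element-scaled immediate is stored in *scaled.
 * When this returns false, the patch instead computes base + offset into
 * a temporary register and uses an offset of 0.
 */
static bool dupm_offset_encodable(intptr_t offset, unsigned vece,
                                  intptr_t *scaled)
{
    intptr_t elem_bytes;

    assert(vece <= 3 && scaled);
    elem_bytes = (intptr_t)1 << vece;

    if (offset < -0x800 || offset > 0x7ff) {
        return false;               /* does not fit the byte range */
    }
    if ((offset & (elem_bytes - 1)) != 0) {
        return false;               /* not a multiple of the element size */
    }
    *scaled = offset / elem_bytes;  /* exact, alignment was checked */
    return true;
}
```

For example, a byte offset of 16 with 64-bit elements gives a scaled immediate of 2, while a byte offset of 4 with 64-bit elements takes the fallback path.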

[PATCH v4 10/16] tcg/loongarch64: Lower vector saturated ops

2023-09-07 Thread Jiajie Chen

Lower the following ops:

- ssadd_vec
- usadd_vec
- sssub_vec
- ussub_vec

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 32 
 tcg/loongarch64/tcg-target.h |  2 +-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index bdf22d8807..90c52c38cf 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1713,6 +1713,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 static const LoongArchInsn umax_vec_insn[4] = {
 OPC_VMAX_BU, OPC_VMAX_HU, OPC_VMAX_WU, OPC_VMAX_DU
 };
+static const LoongArchInsn ssadd_vec_insn[4] = {
+OPC_VSADD_B, OPC_VSADD_H, OPC_VSADD_W, OPC_VSADD_D
+};
+static const LoongArchInsn usadd_vec_insn[4] = {
+OPC_VSADD_BU, OPC_VSADD_HU, OPC_VSADD_WU, OPC_VSADD_DU
+};
+static const LoongArchInsn sssub_vec_insn[4] = {
+OPC_VSSUB_B, OPC_VSSUB_H, OPC_VSSUB_W, OPC_VSSUB_D
+};
+static const LoongArchInsn ussub_vec_insn[4] = {
+OPC_VSSUB_BU, OPC_VSSUB_HU, OPC_VSSUB_WU, OPC_VSSUB_DU
+};
 
 a0 = args[0];
 a1 = args[1];
@@ -1829,6 +1841,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_umax_vec:
 tcg_out32(s, encode_vdvjvk_insn(umax_vec_insn[vece], a0, a1, a2));
 break;
+case INDEX_op_ssadd_vec:
+tcg_out32(s, encode_vdvjvk_insn(ssadd_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_usadd_vec:
+tcg_out32(s, encode_vdvjvk_insn(usadd_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_sssub_vec:
+tcg_out32(s, encode_vdvjvk_insn(sssub_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_ussub_vec:
+tcg_out32(s, encode_vdvjvk_insn(ussub_vec_insn[vece], a0, a1, a2));
+break;
 case INDEX_op_dupm_vec:
 tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
 break;
@@ -1860,6 +1884,10 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
 case INDEX_op_smax_vec:
 case INDEX_op_umin_vec:
 case INDEX_op_umax_vec:
+case INDEX_op_ssadd_vec:
+case INDEX_op_usadd_vec:
+case INDEX_op_sssub_vec:
+case INDEX_op_ussub_vec:
 return 1;
 default:
 return 0;
@@ -2039,6 +2067,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_smax_vec:
 case INDEX_op_umin_vec:
 case INDEX_op_umax_vec:
+case INDEX_op_ssadd_vec:
+case INDEX_op_usadd_vec:
+case INDEX_op_sssub_vec:
+case INDEX_op_ussub_vec:
 return C_O1_I2(w, w, w);
 
 case INDEX_op_not_vec:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index ec725aaeaa..fa14558275 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -192,7 +192,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_roti_vec 0
 #define TCG_TARGET_HAS_rots_vec 0
 #define TCG_TARGET_HAS_rotv_vec 0
-#define TCG_TARGET_HAS_sat_vec  0
+#define TCG_TARGET_HAS_sat_vec  1
 #define TCG_TARGET_HAS_minmax_vec   1
 #define TCG_TARGET_HAS_bitsel_vec   0
 #define TCG_TARGET_HAS_cmpsel_vec   0
-- 
2.42.0
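
For readers unfamiliar with the four ops lowered above, here is a scalar reference of what saturating add/subtract compute on one 8-bit lane (the MO_8 case). This is a sketch for illustration only; the function names are made up here and the real lowering emits a single LSX instruction per op.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a widened intermediate result into the int8_t range. */
static int8_t clamp_s8(int v)
{
    if (v > INT8_MAX) {
        return INT8_MAX;
    }
    if (v < INT8_MIN) {
        return INT8_MIN;
    }
    return (int8_t)v;
}

/* ssadd_vec / sssub_vec per lane: signed, clamped to [-128, 127]. */
static int8_t ssadd8(int8_t a, int8_t b) { return clamp_s8(a + b); }
static int8_t sssub8(int8_t a, int8_t b) { return clamp_s8(a - b); }

/* usadd_vec per lane: unsigned, clamped to 255. */
static uint8_t usadd8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* ussub_vec per lane: unsigned, clamped to 0. */
static uint8_t ussub8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Apply usadd8 across n byte lanes, as a V128 op does across 16 lanes. */
static void usadd8_lanes(uint8_t *d, const uint8_t *a, const uint8_t *b,
                         size_t n)
{
    assert(d && a && b);
    for (size_t i = 0; i < n; i++) {
        d[i] = usadd8(a[i], b[i]);
    }
}
```

The wider element sizes behave the same way with the clamp bounds adjusted to the lane width.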

[PATCH v4 12/16] tcg/loongarch64: Lower bitsel_vec to vbitsel

2023-09-07 Thread Jiajie Chen

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target.c.inc | 11 ++-
 tcg/loongarch64/tcg-target.h |  2 +-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 3f530ad4d8..914572d21b 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -35,4 +35,5 @@ C_O1_I2(r, rZ, rZ)
 C_O1_I2(w, w, w)
 C_O1_I2(w, w, wM)
 C_O1_I2(w, w, wA)
+C_O1_I3(w, w, w, w)
 C_O1_I4(r, rZ, rJ, rZ, rZ)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 6958fd219c..a33ec594ee 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1676,7 +1676,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
const int const_args[TCG_MAX_OP_ARGS])
 {
 TCGType type = vecl + TCG_TYPE_V64;
-TCGArg a0, a1, a2;
+TCGArg a0, a1, a2, a3;
 TCGReg temp = TCG_REG_TMP0;
 TCGReg temp_vec = TCG_VEC_TMP0;
 
@@ -1738,6 +1738,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 a0 = args[0];
 a1 = args[1];
 a2 = args[2];
+a3 = args[3];
 
 /* Currently only supports V128 */
 tcg_debug_assert(type == TCG_TYPE_V128);
@@ -1871,6 +1872,10 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_sarv_vec:
 tcg_out32(s, encode_vdvjvk_insn(sarv_vec_insn[vece], a0, a1, a2));
 break;
+case INDEX_op_bitsel_vec:
+/* vbitsel vd, vj, vk, va = bitsel_vec vd, va, vk, vj */
+tcg_out_opc_vbitsel_v(s, a0, a3, a2, a1);
+break;
 case INDEX_op_dupm_vec:
 tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
 break;
@@ -1909,6 +1914,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
 case INDEX_op_shlv_vec:
 case INDEX_op_shrv_vec:
 case INDEX_op_sarv_vec:
+case INDEX_op_bitsel_vec:
 return 1;
 default:
 return 0;
@@ -2101,6 +2107,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_neg_vec:
 return C_O1_I1(w, w);
 
+case INDEX_op_bitsel_vec:
+return C_O1_I3(w, w, w, w);
+
 default:
 g_assert_not_reached();
 }
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 7e9fb61c47..bc56939a57 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -194,7 +194,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_rotv_vec 0
 #define TCG_TARGET_HAS_sat_vec  1
 #define TCG_TARGET_HAS_minmax_vec   1
-#define TCG_TARGET_HAS_bitsel_vec   0
+#define TCG_TARGET_HAS_bitsel_vec   1
 #define TCG_TARGET_HAS_cmpsel_vec   0
 
 #define TCG_TARGET_DEFAULT_MO (0)
-- 
2.42.0
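
The operand reordering in the bitsel_vec hunk above can be checked with a small bit-level model. This sketch assumes the usual definitions: TCG bitsel_vec d, a, b, c computes (b & a) | (c & ~a), and vbitsel.v vd, vj, vk, va computes (vk & va) | (vj & ~va). The function names are illustrative and operate on one 64-bit chunk rather than a full vector.

```c
#include <assert.h>
#include <stdint.h>

/* TCG bitsel_vec: take bits of b where a is 1, bits of c where a is 0. */
static uint64_t tcg_bitsel(uint64_t a, uint64_t b, uint64_t c)
{
    return (b & a) | (c & ~a);
}

/* Assumed LSX vbitsel.v semantics: va is the selector mask. */
static uint64_t lsx_vbitsel(uint64_t vj, uint64_t vk, uint64_t va)
{
    return (vk & va) | (vj & ~va);
}

/* Operand order used by the patch: vbitsel(vd, a3, a2, a1). */
static uint64_t bitsel_via_vbitsel(uint64_t a1, uint64_t a2, uint64_t a3)
{
    return lsx_vbitsel(a3, a2, a1);
}

/* Assert that the reordered call matches the TCG definition. */
static void check_bitsel_permutation(uint64_t a, uint64_t b, uint64_t c)
{
    assert(bitsel_via_vbitsel(a, b, c) == tcg_bitsel(a, b, c));
}
```

Under those assumptions, passing (a3, a2, a1) as (vj, vk, va) yields the same result as the TCG op for any inputs, since the two expressions are identical after substitution.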

[PATCH v4 03/16] tcg: pass vece to tcg_target_const_match()

2023-09-07 Thread Jiajie Chen

Pass vece to tcg_target_const_match() to allow correct interpretation of
const args of vector ops.

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.c.inc | 2 +-
 tcg/arm/tcg-target.c.inc | 2 +-
 tcg/i386/tcg-target.c.inc| 2 +-
 tcg/loongarch64/tcg-target.c.inc | 2 +-
 tcg/mips/tcg-target.c.inc| 2 +-
 tcg/ppc/tcg-target.c.inc | 2 +-
 tcg/riscv/tcg-target.c.inc   | 2 +-
 tcg/s390x/tcg-target.c.inc   | 2 +-
 tcg/sparc64/tcg-target.c.inc | 2 +-
 tcg/tcg.c| 4 ++--
 tcg/tci/tcg-target.c.inc | 2 +-
 11 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 0931a69448..a1e2b6be16 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -272,7 +272,7 @@ static bool is_shimm1632(uint32_t v32, int *cmode, int *imm8)
 }
 }
 
-static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 {
 if (ct & TCG_CT_CONST) {
 return 1;
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index acb5f23b54..76f1345002 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -509,7 +509,7 @@ static bool is_shimm1632(uint32_t v32, int *cmode, int *imm8)
  * mov operand2: values represented with x << (2 * y), x < 0x100
  * add, sub, eor...: ditto
  */
-static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 {
 if (ct & TCG_CT_CONST) {
 return 1;
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 0c3d1e4cef..aed91e515e 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -198,7 +198,7 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 }
 
 /* test if a constant matches the constraint */
-static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 {
 if (ct & TCG_CT_CONST) {
 return 1;
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 150278e112..07a0326e5d 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -186,7 +186,7 @@ static inline tcg_target_long sextreg(tcg_target_long val, int pos, int len)
 }
 
 /* test if a constant matches the constraint */
-static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 {
 if (ct & TCG_CT_CONST) {
 return true;
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 9faa8bdf0b..c6662889f0 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -190,7 +190,7 @@ static bool is_p2m1(tcg_target_long val)
 }
 
 /* test if a constant matches the constraint */
-static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 {
 if (ct & TCG_CT_CONST) {
 return 1;
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 090f11e71c..ccf245191d 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -261,7 +261,7 @@ static bool reloc_pc14(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 }
 
 /* test if a constant matches the constraint */
-static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 {
 if (ct & TCG_CT_CONST) {
 return 1;
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 9be81c1b7b..3bd7959e7e 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -145,7 +145,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
 #define sextreg  sextract64
 
 /* test if a constant matches the constraint */
-static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 {
 if (ct & TCG_CT_CONST) {
 return 1;
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index ecd8aaf2a1..f4d3abcb71 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -540,7 +540,7 @@ static bool risbg_mask(uint64_t c)
 }
 
 /* Test if a constant matches the constraint. */
-static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 {
 if (ct & TCG_CT_CONST) {
 return 1;
diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc
index 81a08bb6c5..6b9be4c520 100644
--- a/tcg/sparc64/tcg-target.c.inc
+++ b/tcg/sparc64/tcg-target.c.inc
@@ -322,7 +322,7 @@

[PATCH v4 15/16] tcg/loongarch64: Lower rotli_vec to vrotri

2023-09-07 Thread Jiajie Chen

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 21 +
 tcg/loongarch64/tcg-target.h |  2 +-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 8f448823b0..82901d678a 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1902,6 +1902,26 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 tcg_out32(s, encode_vdvjvk_insn(rotrv_vec_insn[vece], a0, a1,
 temp_vec));
 break;
+case INDEX_op_rotli_vec:
+/* rotli_vec a1, a2 = rotri_vec a1, -a2 */
+a2 = extract32(-a2, 0, 3 + vece);
+switch (vece) {
+case MO_8:
+tcg_out_opc_vrotri_b(s, a0, a1, a2);
+break;
+case MO_16:
+tcg_out_opc_vrotri_h(s, a0, a1, a2);
+break;
+case MO_32:
+tcg_out_opc_vrotri_w(s, a0, a1, a2);
+break;
+case MO_64:
+tcg_out_opc_vrotri_d(s, a0, a1, a2);
+break;
+default:
+g_assert_not_reached();
+}
+break;
 case INDEX_op_bitsel_vec:
 /* vbitsel vd, vj, vk, va = bitsel_vec vd, va, vk, vj */
 tcg_out_opc_vbitsel_v(s, a0, a3, a2, a1);
@@ -2140,6 +2160,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 case INDEX_op_shli_vec:
 case INDEX_op_shri_vec:
 case INDEX_op_sari_vec:
+case INDEX_op_rotli_vec:
 return C_O1_I1(w, w);
 
 case INDEX_op_bitsel_vec:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index d5c69bc192..67b0a95532 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -189,7 +189,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_shi_vec  1
 #define TCG_TARGET_HAS_shs_vec  0
 #define TCG_TARGET_HAS_shv_vec  1
-#define TCG_TARGET_HAS_roti_vec 0
+#define TCG_TARGET_HAS_roti_vec 1
 #define TCG_TARGET_HAS_rots_vec 0
 #define TCG_TARGET_HAS_rotv_vec 1
 #define TCG_TARGET_HAS_sat_vec  1
-- 
2.42.0
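
The identity the rotli_vec hunk relies on, rotate-left by n equals rotate-right by (-n) modulo the lane width, can be checked per lane with the sketch below. The helper names are hypothetical; rotl_imm_to_rotr_imm mirrors the extract32(-a2, 0, 3 + vece) step in the patch, where 3 + vece is log2 of the lane width in bits.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones mask covering one lane of (8 << vece) bits. */
static uint64_t lane_mask(unsigned vece)
{
    unsigned width = 8u << vece;
    return width == 64 ? UINT64_MAX : (1ULL << width) - 1;
}

/* Rotate one lane right by n bits, 0 <= n < lane width. */
static uint64_t rotr_lane(uint64_t x, unsigned n, unsigned vece)
{
    unsigned width = 8u << vece;
    uint64_t mask = lane_mask(vece);

    assert(vece <= 3 && n < width);
    x &= mask;
    if (n == 0) {
        return x;
    }
    return ((x >> n) | (x << (width - n))) & mask;
}

/* Rotate one lane left by n bits, 0 <= n < lane width. */
static uint64_t rotl_lane(uint64_t x, unsigned n, unsigned vece)
{
    unsigned width = 8u << vece;
    uint64_t mask = lane_mask(vece);

    assert(vece <= 3 && n < width);
    x &= mask;
    if (n == 0) {
        return x;
    }
    return ((x << n) | (x >> (width - n))) & mask;
}

/* Negate the count and keep its low (3 + vece) bits, as the patch does. */
static unsigned rotl_imm_to_rotr_imm(unsigned n, unsigned vece)
{
    return (0u - n) & ((1u << (3 + vece)) - 1);
}
```

For an 8-bit lane, a left rotate by 1 becomes a right rotate by 7; a left rotate by 0 stays 0 because the negated count is masked to the lane width.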

[PATCH v4 13/16] tcg/loongarch64: Lower vector shift integer ops

2023-09-07 Thread Jiajie Chen

Lower the following ops:

- shli_vec
- shri_vec
- sari_vec

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 21 +
 tcg/loongarch64/tcg-target.h |  2 +-
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index a33ec594ee..c21c917083 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1734,6 +1734,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 static const LoongArchInsn sarv_vec_insn[4] = {
 OPC_VSRA_B, OPC_VSRA_H, OPC_VSRA_W, OPC_VSRA_D
 };
+static const LoongArchInsn shli_vec_insn[4] = {
+OPC_VSLLI_B, OPC_VSLLI_H, OPC_VSLLI_W, OPC_VSLLI_D
+};
+static const LoongArchInsn shri_vec_insn[4] = {
+OPC_VSRLI_B, OPC_VSRLI_H, OPC_VSRLI_W, OPC_VSRLI_D
+};
+static const LoongArchInsn sari_vec_insn[4] = {
+OPC_VSRAI_B, OPC_VSRAI_H, OPC_VSRAI_W, OPC_VSRAI_D
+};
 
 a0 = args[0];
 a1 = args[1];
@@ -1872,6 +1881,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_sarv_vec:
 tcg_out32(s, encode_vdvjvk_insn(sarv_vec_insn[vece], a0, a1, a2));
 break;
+case INDEX_op_shli_vec:
+tcg_out32(s, encode_vdvjuk3_insn(shli_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_shri_vec:
+tcg_out32(s, encode_vdvjuk3_insn(shri_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_sari_vec:
+tcg_out32(s, encode_vdvjuk3_insn(sari_vec_insn[vece], a0, a1, a2));
+break;
 case INDEX_op_bitsel_vec:
 /* vbitsel vd, vj, vk, va = bitsel_vec vd, va, vk, vj */
 tcg_out_opc_vbitsel_v(s, a0, a3, a2, a1);
@@ -2105,6 +2123,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 
 case INDEX_op_not_vec:
 case INDEX_op_neg_vec:
+case INDEX_op_shli_vec:
+case INDEX_op_shri_vec:
+case INDEX_op_sari_vec:
 return C_O1_I1(w, w);
 
 case INDEX_op_bitsel_vec:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index bc56939a57..d7b806e252 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -186,7 +186,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_nor_vec  1
 #define TCG_TARGET_HAS_eqv_vec  0
 #define TCG_TARGET_HAS_mul_vec  1
-#define TCG_TARGET_HAS_shi_vec  0
+#define TCG_TARGET_HAS_shi_vec  1
 #define TCG_TARGET_HAS_shs_vec  0
 #define TCG_TARGET_HAS_shv_vec  1
 #define TCG_TARGET_HAS_roti_vec 0
-- 
2.42.0

[PATCH v4 04/16] tcg/loongarch64: Lower cmp_vec to vseq/vsle/vslt

2023-09-07 Thread Jiajie Chen

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target-con-str.h |  1 +
 tcg/loongarch64/tcg-target.c.inc | 65 
 3 files changed, 67 insertions(+)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 37b3f80bf9..8c8ea5d919 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -31,4 +31,5 @@ C_O1_I2(r, 0, rZ)
 C_O1_I2(r, rZ, ri)
 C_O1_I2(r, rZ, rJ)
 C_O1_I2(r, rZ, rZ)
+C_O1_I2(w, w, wM)
 C_O1_I4(r, rZ, rJ, rZ, rZ)
diff --git a/tcg/loongarch64/tcg-target-con-str.h b/tcg/loongarch64/tcg-target-con-str.h
index 81b8d40278..a8a1c44014 100644
--- a/tcg/loongarch64/tcg-target-con-str.h
+++ b/tcg/loongarch64/tcg-target-con-str.h
@@ -26,3 +26,4 @@ CONST('U', TCG_CT_CONST_U12)
 CONST('Z', TCG_CT_CONST_ZERO)
 CONST('C', TCG_CT_CONST_C12)
 CONST('W', TCG_CT_CONST_WSZ)
+CONST('M', TCG_CT_CONST_VCMP)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 07a0326e5d..129dd92910 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -176,6 +176,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
 #define TCG_CT_CONST_U12   0x800
 #define TCG_CT_CONST_C12   0x1000
 #define TCG_CT_CONST_WSZ   0x2000
+#define TCG_CT_CONST_VCMP  0x4000
 
 #define ALL_GENERAL_REGS   MAKE_64BIT_MASK(0, 32)
 #define ALL_VECTOR_REGS    MAKE_64BIT_MASK(32, 32)
@@ -209,6 +210,10 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct, int vece)
 if ((ct & TCG_CT_CONST_WSZ) && val == (type == TCG_TYPE_I32 ? 32 : 64)) {
 return true;
 }
+int64_t vec_val = sextract64(val, 0, 8 << vece);
+if ((ct & TCG_CT_CONST_VCMP) && -0x10 <= vec_val && vec_val <= 0x1f) {
+return true;
+}
 return false;
 }
 
@@ -1624,6 +1629,23 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 TCGType type = vecl + TCG_TYPE_V64;
 TCGArg a0, a1, a2;
 TCGReg temp = TCG_REG_TMP0;
+TCGReg temp_vec = TCG_VEC_TMP0;
+
+static const LoongArchInsn cmp_vec_insn[16][4] = {
+[TCG_COND_EQ] = {OPC_VSEQ_B, OPC_VSEQ_H, OPC_VSEQ_W, OPC_VSEQ_D},
+[TCG_COND_LE] = {OPC_VSLE_B, OPC_VSLE_H, OPC_VSLE_W, OPC_VSLE_D},
+[TCG_COND_LEU] = {OPC_VSLE_BU, OPC_VSLE_HU, OPC_VSLE_WU, OPC_VSLE_DU},
+[TCG_COND_LT] = {OPC_VSLT_B, OPC_VSLT_H, OPC_VSLT_W, OPC_VSLT_D},
+[TCG_COND_LTU] = {OPC_VSLT_BU, OPC_VSLT_HU, OPC_VSLT_WU, OPC_VSLT_DU},
+};
+static const LoongArchInsn cmp_vec_imm_insn[16][4] = {
+[TCG_COND_EQ] = {OPC_VSEQI_B, OPC_VSEQI_H, OPC_VSEQI_W, OPC_VSEQI_D},
+[TCG_COND_LE] = {OPC_VSLEI_B, OPC_VSLEI_H, OPC_VSLEI_W, OPC_VSLEI_D},
+[TCG_COND_LEU] = {OPC_VSLEI_BU, OPC_VSLEI_HU, OPC_VSLEI_WU, OPC_VSLEI_DU},
+[TCG_COND_LT] = {OPC_VSLTI_B, OPC_VSLTI_H, OPC_VSLTI_W, OPC_VSLTI_D},
+[TCG_COND_LTU] = {OPC_VSLTI_BU, OPC_VSLTI_HU, OPC_VSLTI_WU, OPC_VSLTI_DU},
+};
+LoongArchInsn insn;
 
 a0 = args[0];
 a1 = args[1];
@@ -1651,6 +1673,45 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 tcg_out_opc_vldx(s, a0, a1, temp);
 }
 break;
+case INDEX_op_cmp_vec:
+TCGCond cond = args[3];
+if (const_args[2]) {
+/*
+ * cmp_vec dest, src, value
+ * Try vseqi/vslei/vslti
+ */
+int64_t value = sextract64(a2, 0, 8 << vece);
+if ((cond == TCG_COND_EQ || cond == TCG_COND_LE || \
+ cond == TCG_COND_LT) && (-0x10 <= value && value <= 0x0f)) {
+tcg_out32(s, encode_vdvjsk5_insn(cmp_vec_imm_insn[cond][vece], \
+ a0, a1, value));
+break;
+} else if ((cond == TCG_COND_LEU || cond == TCG_COND_LTU) &&
+(0x00 <= value && value <= 0x1f)) {
+tcg_out32(s, encode_vdvjuk5_insn(cmp_vec_imm_insn[cond][vece], \
+ a0, a1, value));
+break;
+}
+
+/*
+ * Fallback to:
+ * dupi_vec temp, a2
+ * cmp_vec a0, a1, temp, cond
+ */
+tcg_out_dupi_vec(s, type, vece, temp_vec, a2);
+a2 = temp_vec;
+}
+
+insn = cmp_vec_insn[cond][vece];
+if (insn == 0) {
+TCGArg t;
+t = a1, a1 = a2, a2 = t;
+cond = tcg_swap_cond(cond);
+insn = cmp_vec_insn[cond][vece];
+tcg_debug_assert(insn != 0);
+}
+tcg_out32(s, encode_vdvjvk_insn(insn, a0, a1, a2));
+break;
 case INDEX_op_dupm_vec:
 tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
 break;
@@ -1666,6 +1727,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
 case INDEX_op_st_vec:
 case
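
The immediate selection in the cmp_vec hunk above can be summarised as: sign-extend the constant to the element width, then use the signed 5-bit form for EQ/LE/LT when the value is in [-0x10, 0x0f], or the unsigned 5-bit form for LEU/LTU when it is in [0, 0x1f]; anything else falls back to materialising the constant in a temporary vector. The sketch below models only that choice. sext_low, imm_form and pick_cmp_imm_form are illustrative names, and the single is_unsigned_cond flag is a simplification of the condition checks in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the low `len` bits of value, 1 <= len <= 64. */
static int64_t sext_low(uint64_t value, unsigned len)
{
    uint64_t sign;

    assert(len >= 1 && len <= 64);
    if (len == 64) {
        return (int64_t)value;
    }
    sign = 1ULL << (len - 1);
    value &= (sign << 1) - 1;
    return (int64_t)((value ^ sign) - sign);
}

enum imm_form {
    IMM_NONE,       /* no immediate form: dupi_vec + register compare */
    IMM_SIGNED5,    /* vseqi/vslei/vslti with a signed 5-bit immediate */
    IMM_UNSIGNED5   /* vslei.*u/vslti.*u with an unsigned 5-bit immediate */
};

/*
 * is_unsigned_cond: true for LEU/LTU, false for EQ/LE/LT.  Other
 * conditions are not modelled here; the patch sends them down the
 * register-compare path.
 */
static enum imm_form pick_cmp_imm_form(uint64_t imm, unsigned vece,
                                       bool is_unsigned_cond)
{
    int64_t value;

    assert(vece <= 3);
    value = sext_low(imm, 8u << vece);

    if (!is_unsigned_cond && value >= -0x10 && value <= 0x0f) {
        return IMM_SIGNED5;
    }
    if (is_unsigned_cond && value >= 0 && value <= 0x1f) {
        return IMM_UNSIGNED5;
    }
    return IMM_NONE;
}
```

This also shows why tcg_target_const_match() accepts the union [-0x10, 0x1f] for the 'M' constraint: which half of that range is directly encodable depends on the condition, which is only known when the op is emitted.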

[PATCH v4 08/16] tcg/loongarch64: Lower mul_vec to vmul

2023-09-07 Thread Jiajie Chen

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 8 
 tcg/loongarch64/tcg-target.h | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index b36b706e39..0814f62905 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1698,6 +1698,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 static const LoongArchInsn neg_vec_insn[4] = {
 OPC_VNEG_B, OPC_VNEG_H, OPC_VNEG_W, OPC_VNEG_D
 };
+static const LoongArchInsn mul_vec_insn[4] = {
+OPC_VMUL_B, OPC_VMUL_H, OPC_VMUL_W, OPC_VMUL_D
+};
 
 a0 = args[0];
 a1 = args[1];
@@ -1799,6 +1802,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_neg_vec:
 tcg_out32(s, encode_vdvj_insn(neg_vec_insn[vece], a0, a1));
 break;
+case INDEX_op_mul_vec:
+tcg_out32(s, encode_vdvjvk_insn(mul_vec_insn[vece], a0, a1, a2));
+break;
 case INDEX_op_dupm_vec:
 tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
 break;
@@ -1825,6 +1831,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, 
unsigned vece)
 case INDEX_op_nor_vec:
 case INDEX_op_not_vec:
 case INDEX_op_neg_vec:
+case INDEX_op_mul_vec:
 return 1;
 default:
 return 0;
@@ -1999,6 +2006,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 case INDEX_op_orc_vec:
 case INDEX_op_xor_vec:
 case INDEX_op_nor_vec:
+case INDEX_op_mul_vec:
 return C_O1_I2(w, w, w);
 
 case INDEX_op_not_vec:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 64c72d0857..2c2266ed31 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -185,7 +185,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_nand_vec 0
 #define TCG_TARGET_HAS_nor_vec  1
 #define TCG_TARGET_HAS_eqv_vec  0
-#define TCG_TARGET_HAS_mul_vec  0
+#define TCG_TARGET_HAS_mul_vec  1
 #define TCG_TARGET_HAS_shi_vec  0
 #define TCG_TARGET_HAS_shs_vec  0
 #define TCG_TARGET_HAS_shv_vec  0
-- 
2.42.0

[PATCH v4 06/16] tcg/loongarch64: Lower vector bitwise operations

2023-09-07 Thread Jiajie Chen

Lower the following ops:

- and_vec
- andc_vec
- or_vec
- orc_vec
- xor_vec
- nor_vec
- not_vec

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target-con-set.h |  2 ++
 tcg/loongarch64/tcg-target.c.inc | 44 
 tcg/loongarch64/tcg-target.h |  8 ++---
 3 files changed, 50 insertions(+), 4 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h 
b/tcg/loongarch64/tcg-target-con-set.h
index 2d5dce75c3..3f530ad4d8 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -20,6 +20,7 @@ C_O0_I2(rZ, rZ)
 C_O0_I2(w, r)
 C_O1_I1(r, r)
 C_O1_I1(w, r)
+C_O1_I1(w, w)
 C_O1_I2(r, r, rC)
 C_O1_I2(r, r, ri)
 C_O1_I2(r, r, rI)
@@ -31,6 +32,7 @@ C_O1_I2(r, 0, rZ)
 C_O1_I2(r, rZ, ri)
 C_O1_I2(r, rZ, rJ)
 C_O1_I2(r, rZ, rZ)
+C_O1_I2(w, w, w)
 C_O1_I2(w, w, wM)
 C_O1_I2(w, w, wA)
 C_O1_I4(r, rZ, rJ, rZ, rZ)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 1a369b237c..d569e443dd 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1722,6 +1722,32 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 tcg_out_opc_vldx(s, a0, a1, temp);
 }
 break;
+case INDEX_op_and_vec:
+tcg_out_opc_vand_v(s, a0, a1, a2);
+break;
+case INDEX_op_andc_vec:
+/*
+ * vandn vd, vj, vk: vd = vk & ~vj
+ * andc_vec vd, vj, vk: vd = vj & ~vk
+ * vj and vk are swapped
+ */
+tcg_out_opc_vandn_v(s, a0, a2, a1);
+break;
+case INDEX_op_or_vec:
+tcg_out_opc_vor_v(s, a0, a1, a2);
+break;
+case INDEX_op_orc_vec:
+tcg_out_opc_vorn_v(s, a0, a1, a2);
+break;
+case INDEX_op_xor_vec:
+tcg_out_opc_vxor_v(s, a0, a1, a2);
+break;
+case INDEX_op_nor_vec:
+tcg_out_opc_vnor_v(s, a0, a1, a2);
+break;
+case INDEX_op_not_vec:
+tcg_out_opc_vnor_v(s, a0, a1, a1);
+break;
 case INDEX_op_cmp_vec:
 TCGCond cond = args[3];
 if (const_args[2]) {
@@ -1785,6 +1811,13 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, 
unsigned vece)
 case INDEX_op_cmp_vec:
 case INDEX_op_add_vec:
 case INDEX_op_sub_vec:
+case INDEX_op_and_vec:
+case INDEX_op_andc_vec:
+case INDEX_op_or_vec:
+case INDEX_op_orc_vec:
+case INDEX_op_xor_vec:
+case INDEX_op_nor_vec:
+case INDEX_op_not_vec:
 return 1;
 default:
 return 0;
@@ -1953,6 +1986,17 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 case INDEX_op_sub_vec:
 return C_O1_I2(w, w, wA);
 
+case INDEX_op_and_vec:
+case INDEX_op_andc_vec:
+case INDEX_op_or_vec:
+case INDEX_op_orc_vec:
+case INDEX_op_xor_vec:
+case INDEX_op_nor_vec:
+return C_O1_I2(w, w, w);
+
+case INDEX_op_not_vec:
+return C_O1_I1(w, w);
+
 default:
 g_assert_not_reached();
 }
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index daaf38ee31..f9c5cb12ca 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -177,13 +177,13 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_v128 use_lsx_instructions
 #define TCG_TARGET_HAS_v256 0
 
-#define TCG_TARGET_HAS_not_vec  0
+#define TCG_TARGET_HAS_not_vec  1
 #define TCG_TARGET_HAS_neg_vec  0
 #define TCG_TARGET_HAS_abs_vec  0
-#define TCG_TARGET_HAS_andc_vec 0
-#define TCG_TARGET_HAS_orc_vec  0
+#define TCG_TARGET_HAS_andc_vec 1
+#define TCG_TARGET_HAS_orc_vec  1
 #define TCG_TARGET_HAS_nand_vec 0
-#define TCG_TARGET_HAS_nor_vec  0
+#define TCG_TARGET_HAS_nor_vec  1
 #define TCG_TARGET_HAS_eqv_vec  0
 #define TCG_TARGET_HAS_mul_vec  0
 #define TCG_TARGET_HAS_shi_vec  0
-- 
2.42.0
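
The operand swap for andc_vec and the vnor trick for not_vec in the patch above can be checked with a scalar model. The following is only a sketch on a single 64-bit lane: the function names are hypothetical, and the vandn/vnor semantics are taken from the comment in the patch rather than from an independent reading of the LSX manual.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of LSX vandn on one lane, per the patch comment: vd = vk & ~vj. */
static uint64_t model_vandn(uint64_t vj, uint64_t vk)
{
    return vk & ~vj;
}

/* Scalar model of vnor on one lane: vd = ~(vj | vk). */
static uint64_t model_vnor(uint64_t vj, uint64_t vk)
{
    return ~(vj | vk);
}

/* TCG andc_vec semantics on one lane: vd = a1 & ~a2. */
static uint64_t model_andc(uint64_t a1, uint64_t a2)
{
    return a1 & ~a2;
}

/* Lowering as in the patch: the two sources are passed in swapped order. */
static uint64_t lowered_andc(uint64_t a1, uint64_t a2)
{
    return model_vandn(a2, a1);
}

/* Lowering of not_vec as in the patch: vnor with both sources equal. */
static uint64_t lowered_not(uint64_t a1)
{
    return model_vnor(a1, a1);
}

/* Compare the lowered forms against the reference semantics. */
static void check_bitwise_lowering(void)
{
    const uint64_t samples[] = { 0, 1, 0xF0F0, 0x00FF, UINT64_MAX };
    const unsigned n = sizeof(samples) / sizeof(samples[0]);

    for (unsigned i = 0; i < n; i++) {
        assert(lowered_not(samples[i]) == ~samples[i]);
        for (unsigned j = 0; j < n; j++) {
            assert(lowered_andc(samples[i], samples[j]) ==
                   model_andc(samples[i], samples[j]));
        }
    }
}
```

Calling check_bitwise_lowering() exercises both identities; nothing here touches the real TCG backend.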

[PATCH v4 11/16] tcg/loongarch64: Lower vector shift vector ops

2023-09-07 Thread Jiajie Chen

Lower the following ops:

- shlv_vec
- shrv_vec
- sarv_vec

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 24 
 tcg/loongarch64/tcg-target.h |  2 +-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 90c52c38cf..6958fd219c 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1725,6 +1725,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 static const LoongArchInsn ussub_vec_insn[4] = {
 OPC_VSSUB_BU, OPC_VSSUB_HU, OPC_VSSUB_WU, OPC_VSSUB_DU
 };
+static const LoongArchInsn shlv_vec_insn[4] = {
+OPC_VSLL_B, OPC_VSLL_H, OPC_VSLL_W, OPC_VSLL_D
+};
+static const LoongArchInsn shrv_vec_insn[4] = {
+OPC_VSRL_B, OPC_VSRL_H, OPC_VSRL_W, OPC_VSRL_D
+};
+static const LoongArchInsn sarv_vec_insn[4] = {
+OPC_VSRA_B, OPC_VSRA_H, OPC_VSRA_W, OPC_VSRA_D
+};
 
 a0 = args[0];
 a1 = args[1];
@@ -1853,6 +1862,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_ussub_vec:
 tcg_out32(s, encode_vdvjvk_insn(ussub_vec_insn[vece], a0, a1, a2));
 break;
+case INDEX_op_shlv_vec:
+tcg_out32(s, encode_vdvjvk_insn(shlv_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_shrv_vec:
+tcg_out32(s, encode_vdvjvk_insn(shrv_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_sarv_vec:
+tcg_out32(s, encode_vdvjvk_insn(sarv_vec_insn[vece], a0, a1, a2));
+break;
 case INDEX_op_dupm_vec:
 tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
 break;
@@ -1888,6 +1906,9 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, 
unsigned vece)
 case INDEX_op_usadd_vec:
 case INDEX_op_sssub_vec:
 case INDEX_op_ussub_vec:
+case INDEX_op_shlv_vec:
+case INDEX_op_shrv_vec:
+case INDEX_op_sarv_vec:
 return 1;
 default:
 return 0;
@@ -2071,6 +2092,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 case INDEX_op_usadd_vec:
 case INDEX_op_sssub_vec:
 case INDEX_op_ussub_vec:
+case INDEX_op_shlv_vec:
+case INDEX_op_shrv_vec:
+case INDEX_op_sarv_vec:
 return C_O1_I2(w, w, w);
 
 case INDEX_op_not_vec:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index fa14558275..7e9fb61c47 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -188,7 +188,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_mul_vec  1
 #define TCG_TARGET_HAS_shi_vec  0
 #define TCG_TARGET_HAS_shs_vec  0
-#define TCG_TARGET_HAS_shv_vec  0
+#define TCG_TARGET_HAS_shv_vec  1
 #define TCG_TARGET_HAS_roti_vec 0
 #define TCG_TARGET_HAS_rots_vec 0
 #define TCG_TARGET_HAS_rotv_vec 0
-- 
2.42.0

[PATCH v4 00/16] Lower TCG vector ops to LSX

2023-09-07 Thread Jiajie Chen

This patch series allows qemu to utilize LSX instructions on LoongArch
machines to execute TCG vector ops.

Passed tcg tests with x86_64 and aarch64 cross compilers.

Changes since v3:

- Refactor add/sub_vec handling code to use a helper function
- Only use vldx/vstx for MO_128 load/store, otherwise fall back to two ld/st

Changes since v2:

- Add vece argument to tcg_target_const_match() for const args of vector ops
- Use custom constraint for cmp_vec/add_vec/sub_vec for better const arg 
handling
- Implement 128-bit load & store using vldx/vstx

Changes since v1:

- Optimize dupi_vec/st_vec/ld_vec/cmp_vec/add_vec/sub_vec generation
- Lower not_vec/shi_vec/roti_vec/rotv_vec


Jiajie Chen (16):
  tcg/loongarch64: Import LSX instructions
  tcg/loongarch64: Lower basic tcg vec ops to LSX
  tcg: pass vece to tcg_target_const_match()
  tcg/loongarch64: Lower cmp_vec to vseq/vsle/vslt
  tcg/loongarch64: Lower add/sub_vec to vadd/vsub
  tcg/loongarch64: Lower vector bitwise operations
  tcg/loongarch64: Lower neg_vec to vneg
  tcg/loongarch64: Lower mul_vec to vmul
  tcg/loongarch64: Lower vector min max ops
  tcg/loongarch64: Lower vector saturated ops
  tcg/loongarch64: Lower vector shift vector ops
  tcg/loongarch64: Lower bitsel_vec to vbitsel
  tcg/loongarch64: Lower vector shift integer ops
  tcg/loongarch64: Lower rotv_vec ops to LSX
  tcg/loongarch64: Lower rotli_vec to vrotri
  tcg/loongarch64: Implement 128-bit load & store

 tcg/aarch64/tcg-target.c.inc         |    2 +-
 tcg/arm/tcg-target.c.inc             |    2 +-
 tcg/i386/tcg-target.c.inc            |    2 +-
 tcg/loongarch64/tcg-insn-defs.c.inc  | 6251 +-
 tcg/loongarch64/tcg-target-con-set.h |    9 +
 tcg/loongarch64/tcg-target-con-str.h |    3 +
 tcg/loongarch64/tcg-target.c.inc     |  619 ++-
 tcg/loongarch64/tcg-target.h         |   40 +-
 tcg/loongarch64/tcg-target.opc.h     |   12 +
 tcg/mips/tcg-target.c.inc            |    2 +-
 tcg/ppc/tcg-target.c.inc             |    2 +-
 tcg/riscv/tcg-target.c.inc           |    2 +-
 tcg/s390x/tcg-target.c.inc           |    2 +-
 tcg/sparc64/tcg-target.c.inc         |    2 +-
 tcg/tcg.c                            |    4 +-
 tcg/tci/tcg-target.c.inc             |    2 +-
 16 files changed, 6824 insertions(+), 132 deletions(-)
 create mode 100644 tcg/loongarch64/tcg-target.opc.h

-- 
2.42.0

[PATCH v4 14/16] tcg/loongarch64: Lower rotv_vec ops to LSX

2023-09-07 Thread Jiajie Chen

Lower the following ops:

- rotrv_vec
- rotlv_vec

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 14 ++
 tcg/loongarch64/tcg-target.h |  2 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index c21c917083..8f448823b0 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1743,6 +1743,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 static const LoongArchInsn sari_vec_insn[4] = {
 OPC_VSRAI_B, OPC_VSRAI_H, OPC_VSRAI_W, OPC_VSRAI_D
 };
+static const LoongArchInsn rotrv_vec_insn[4] = {
+OPC_VROTR_B, OPC_VROTR_H, OPC_VROTR_W, OPC_VROTR_D
+};
 
 a0 = args[0];
 a1 = args[1];
@@ -1890,6 +1893,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_sari_vec:
 tcg_out32(s, encode_vdvjuk3_insn(sari_vec_insn[vece], a0, a1, a2));
 break;
+case INDEX_op_rotrv_vec:
+tcg_out32(s, encode_vdvjvk_insn(rotrv_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_rotlv_vec:
+/* rotlv_vec a1, a2 = rotrv_vec a1, -a2 */
+tcg_out32(s, encode_vdvj_insn(neg_vec_insn[vece], temp_vec, a2));
+tcg_out32(s, encode_vdvjvk_insn(rotrv_vec_insn[vece], a0, a1,
+temp_vec));
+break;
 case INDEX_op_bitsel_vec:
 /* vbitsel vd, vj, vk, va = bitsel_vec vd, va, vk, vj */
 tcg_out_opc_vbitsel_v(s, a0, a3, a2, a1);
@@ -2119,6 +2131,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 case INDEX_op_shlv_vec:
 case INDEX_op_shrv_vec:
 case INDEX_op_sarv_vec:
+case INDEX_op_rotrv_vec:
+case INDEX_op_rotlv_vec:
 return C_O1_I2(w, w, w);
 
 case INDEX_op_not_vec:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index d7b806e252..d5c69bc192 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -191,7 +191,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_shv_vec  1
 #define TCG_TARGET_HAS_roti_vec 0
 #define TCG_TARGET_HAS_rots_vec 0
-#define TCG_TARGET_HAS_rotv_vec 0
+#define TCG_TARGET_HAS_rotv_vec 1
 #define TCG_TARGET_HAS_sat_vec  1
 #define TCG_TARGET_HAS_minmax_vec   1
 #define TCG_TARGET_HAS_bitsel_vec   1
-- 
2.42.0
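
The rotlv_vec lowering above relies on the identity that a left rotate by n equals a right rotate by -n, modulo the element width. A minimal scalar sketch for 8-bit lanes follows. The helper names are hypothetical, and it assumes the rotate instruction only looks at the low log2(width) bits of each count, which is how the patch appears to use it.

```c
#include <assert.h>
#include <stdint.h>

/* Reference rotate-right on an 8-bit lane; only the low 3 bits of n are used. */
static uint8_t rotr8(uint8_t x, unsigned n)
{
    n &= 7;
    return (uint8_t)((x >> n) | (x << ((8 - n) & 7)));
}

/* Reference rotate-left on an 8-bit lane. */
static uint8_t rotl8(uint8_t x, unsigned n)
{
    n &= 7;
    return (uint8_t)((x << n) | (x >> ((8 - n) & 7)));
}

/*
 * Lowering as in the patch: negate the count lane (models vneg.b),
 * then rotate right by the negated count (models vrotr.b).
 */
static uint8_t lowered_rotl8(uint8_t x, uint8_t n)
{
    uint8_t neg = (uint8_t)(0u - n);
    return rotr8(x, neg);
}

/* Exhaustively compare the lowering with the reference for all inputs. */
static void check_rotl_lowering(void)
{
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned n = 0; n < 256; n++) {
            assert(lowered_rotl8((uint8_t)x, (uint8_t)n) ==
                   rotl8((uint8_t)x, n));
        }
    }
}
```

The same argument carries over to 16, 32 and 64-bit lanes, since negation modulo 2^k preserves the low bits that the rotate consumes.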

[PATCH v4 16/16] tcg/loongarch64: Implement 128-bit load & store

2023-09-07 Thread Jiajie Chen

If LSX is available, use LSX instructions to implement 128-bit load &
store when MO_128 is required, otherwise use two 64-bit loads & stores.

Signed-off-by: Jiajie Chen 
---
 tcg/loongarch64/tcg-target-con-set.h |  2 +
 tcg/loongarch64/tcg-target.c.inc | 59 
 tcg/loongarch64/tcg-target.h |  2 +-
 3 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h 
b/tcg/loongarch64/tcg-target-con-set.h
index 914572d21b..77d62e38e7 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -18,6 +18,7 @@ C_O0_I1(r)
 C_O0_I2(rZ, r)
 C_O0_I2(rZ, rZ)
 C_O0_I2(w, r)
+C_O0_I3(r, r, r)
 C_O1_I1(r, r)
 C_O1_I1(w, r)
 C_O1_I1(w, w)
@@ -37,3 +38,4 @@ C_O1_I2(w, w, wM)
 C_O1_I2(w, w, wA)
 C_O1_I3(w, w, w, w)
 C_O1_I4(r, rZ, rJ, rZ, rZ)
+C_O2_I1(r, r, r)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 82901d678a..6e9f334fed 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1081,6 +1081,48 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg 
data_reg, TCGReg addr_reg,
 }
 }
 
+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg data_lo, TCGReg 
data_hi,
+   TCGReg addr_reg, MemOpIdx oi, bool is_ld)
+{
+TCGLabelQemuLdst *ldst;
+HostAddress h;
+
+ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
+
+if (h.aa.atom == MO_128) {
+/*
+ * Use VLDX/VSTX when 128-bit atomicity is required.
+ * If address is aligned to 16-bytes, the 128-bit load/store is atomic.
+ */
+if (is_ld) {
+tcg_out_opc_vldx(s, TCG_VEC_TMP0, h.base, h.index);
+tcg_out_opc_vpickve2gr_d(s, data_lo, TCG_VEC_TMP0, 0);
+tcg_out_opc_vpickve2gr_d(s, data_hi, TCG_VEC_TMP0, 1);
+} else {
+tcg_out_opc_vinsgr2vr_d(s, TCG_VEC_TMP0, data_lo, 0);
+tcg_out_opc_vinsgr2vr_d(s, TCG_VEC_TMP0, data_hi, 1);
+tcg_out_opc_vstx(s, TCG_VEC_TMP0, h.base, h.index);
+}
+} else {
+/* otherwise use a pair of LD/ST */
+tcg_out_opc_add_d(s, TCG_REG_TMP0, h.base, h.index);
+if (is_ld) {
+tcg_out_opc_ld_d(s, data_lo, TCG_REG_TMP0, 0);
+tcg_out_opc_ld_d(s, data_hi, TCG_REG_TMP0, 8);
+} else {
+tcg_out_opc_st_d(s, data_lo, TCG_REG_TMP0, 0);
+tcg_out_opc_st_d(s, data_hi, TCG_REG_TMP0, 8);
+}
+}
+
+if (ldst) {
+ldst->type = TCG_TYPE_I128;
+ldst->datalo_reg = data_lo;
+ldst->datahi_reg = data_hi;
+ldst->raddr = tcg_splitwx_to_rx(s->code_ptr);
+}
+}
+
 /*
  * Entry-points
  */
@@ -1145,6 +1187,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 TCGArg a0 = args[0];
 TCGArg a1 = args[1];
 TCGArg a2 = args[2];
+TCGArg a3 = args[3];
 int c2 = const_args[2];
 
 switch (opc) {
@@ -1507,6 +1550,10 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_qemu_ld_a64_i64:
 tcg_out_qemu_ld(s, a0, a1, a2, TCG_TYPE_I64);
 break;
+case INDEX_op_qemu_ld_a32_i128:
+case INDEX_op_qemu_ld_a64_i128:
+tcg_out_qemu_ldst_i128(s, a0, a1, a2, a3, true);
+break;
 case INDEX_op_qemu_st_a32_i32:
 case INDEX_op_qemu_st_a64_i32:
 tcg_out_qemu_st(s, a0, a1, a2, TCG_TYPE_I32);
@@ -1515,6 +1562,10 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_qemu_st_a64_i64:
 tcg_out_qemu_st(s, a0, a1, a2, TCG_TYPE_I64);
 break;
+case INDEX_op_qemu_st_a32_i128:
+case INDEX_op_qemu_st_a64_i128:
+tcg_out_qemu_ldst_i128(s, a0, a1, a2, a3, false);
+break;
 
 case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
 case INDEX_op_mov_i64:
@@ -1996,6 +2047,14 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 case INDEX_op_qemu_st_a64_i64:
 return C_O0_I2(rZ, r);
 
+case INDEX_op_qemu_ld_a32_i128:
+case INDEX_op_qemu_ld_a64_i128:
+return C_O2_I1(r, r, r);
+
+case INDEX_op_qemu_st_a32_i128:
+case INDEX_op_qemu_st_a64_i128:
+return C_O0_I3(r, r, r);
+
 case INDEX_op_brcond_i32:
 case INDEX_op_brcond_i64:
 return C_O0_I2(rZ, rZ);
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 67b0a95532..03017672f6 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -171,7 +171,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
 
-#define TCG_TARGET_HAS_qemu_ldst_i128   0
+#define TCG_TARGET_HAS_qemu_ldst_i128   use_lsx_instructions
 
 #define TCG_TARGET_HAS_v64  0
 #define TCG_TARGET_HAS_v128 use_lsx_instructions
-- 
2.42.0
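
For readers unfamiliar with the fallback path in the patch above, the pair of 64-bit accesses at offsets 0 and 8 can be modelled in portable C as below. This is a sketch only: U128Pair, load_pair and store_pair are hypothetical names, and the model says nothing about atomicity, which is exactly what the vldx/vstx path is there to provide when MO_128 is requested.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A 128-bit value held as two 64-bit halves, like data_lo/data_hi in the patch. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128Pair;

/* Fallback load: two independent 64-bit loads at offsets 0 and 8. */
static U128Pair load_pair(const unsigned char *p)
{
    U128Pair v;
    memcpy(&v.lo, p, 8);
    memcpy(&v.hi, p + 8, 8);
    return v;
}

/* Fallback store: two independent 64-bit stores at offsets 0 and 8. */
static void store_pair(unsigned char *p, U128Pair v)
{
    memcpy(p, &v.lo, 8);
    memcpy(p + 8, &v.hi, 8);
}

/* Round-trip a value through a 16-byte buffer. */
static void check_pair_roundtrip(void)
{
    unsigned char buf[16];
    U128Pair in = { 0x0123456789abcdefULL, 0xfedcba9876543210ULL };
    U128Pair out;

    store_pair(buf, in);
    out = load_pair(buf);
    assert(out.lo == in.lo && out.hi == in.hi);
}
```

Because the two halves are accessed separately, another thread could observe a torn value between them, which is why this path is only acceptable when 128-bit atomicity is not required.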

[PATCH v4 09/16] tcg/loongarch64: Lower vector min max ops

2023-09-07 Thread Jiajie Chen

Lower the following ops:

- smin_vec
- smax_vec
- umin_vec
- umax_vec

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 32 
 tcg/loongarch64/tcg-target.h |  2 +-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 0814f62905..bdf22d8807 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1701,6 +1701,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 static const LoongArchInsn mul_vec_insn[4] = {
 OPC_VMUL_B, OPC_VMUL_H, OPC_VMUL_W, OPC_VMUL_D
 };
+static const LoongArchInsn smin_vec_insn[4] = {
+OPC_VMIN_B, OPC_VMIN_H, OPC_VMIN_W, OPC_VMIN_D
+};
+static const LoongArchInsn umin_vec_insn[4] = {
+OPC_VMIN_BU, OPC_VMIN_HU, OPC_VMIN_WU, OPC_VMIN_DU
+};
+static const LoongArchInsn smax_vec_insn[4] = {
+OPC_VMAX_B, OPC_VMAX_H, OPC_VMAX_W, OPC_VMAX_D
+};
+static const LoongArchInsn umax_vec_insn[4] = {
+OPC_VMAX_BU, OPC_VMAX_HU, OPC_VMAX_WU, OPC_VMAX_DU
+};
 
 a0 = args[0];
 a1 = args[1];
@@ -1805,6 +1817,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_mul_vec:
 tcg_out32(s, encode_vdvjvk_insn(mul_vec_insn[vece], a0, a1, a2));
 break;
+case INDEX_op_smin_vec:
+tcg_out32(s, encode_vdvjvk_insn(smin_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_smax_vec:
+tcg_out32(s, encode_vdvjvk_insn(smax_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_umin_vec:
+tcg_out32(s, encode_vdvjvk_insn(umin_vec_insn[vece], a0, a1, a2));
+break;
+case INDEX_op_umax_vec:
+tcg_out32(s, encode_vdvjvk_insn(umax_vec_insn[vece], a0, a1, a2));
+break;
 case INDEX_op_dupm_vec:
 tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
 break;
@@ -1832,6 +1856,10 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, 
unsigned vece)
 case INDEX_op_not_vec:
 case INDEX_op_neg_vec:
 case INDEX_op_mul_vec:
+case INDEX_op_smin_vec:
+case INDEX_op_smax_vec:
+case INDEX_op_umin_vec:
+case INDEX_op_umax_vec:
 return 1;
 default:
 return 0;
@@ -2007,6 +2035,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 case INDEX_op_xor_vec:
 case INDEX_op_nor_vec:
 case INDEX_op_mul_vec:
+case INDEX_op_smin_vec:
+case INDEX_op_smax_vec:
+case INDEX_op_umin_vec:
+case INDEX_op_umax_vec:
 return C_O1_I2(w, w, w);
 
 case INDEX_op_not_vec:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 2c2266ed31..ec725aaeaa 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -193,7 +193,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_rots_vec 0
 #define TCG_TARGET_HAS_rotv_vec 0
 #define TCG_TARGET_HAS_sat_vec  0
-#define TCG_TARGET_HAS_minmax_vec   0
+#define TCG_TARGET_HAS_minmax_vec   1
 #define TCG_TARGET_HAS_bitsel_vec   0
 #define TCG_TARGET_HAS_cmpsel_vec   0
 
-- 
2.42.0

[PATCH v4 05/16] tcg/loongarch64: Lower add/sub_vec to vadd/vsub

2023-09-07 Thread Jiajie Chen

Lower the following ops:

- add_vec
- sub_vec

Signed-off-by: Jiajie Chen 
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target-con-str.h |  1 +
 tcg/loongarch64/tcg-target.c.inc | 61 
 3 files changed, 63 insertions(+)

diff --git a/tcg/loongarch64/tcg-target-con-set.h 
b/tcg/loongarch64/tcg-target-con-set.h
index 8c8ea5d919..2d5dce75c3 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -32,4 +32,5 @@ C_O1_I2(r, rZ, ri)
 C_O1_I2(r, rZ, rJ)
 C_O1_I2(r, rZ, rZ)
 C_O1_I2(w, w, wM)
+C_O1_I2(w, w, wA)
 C_O1_I4(r, rZ, rJ, rZ, rZ)
diff --git a/tcg/loongarch64/tcg-target-con-str.h 
b/tcg/loongarch64/tcg-target-con-str.h
index a8a1c44014..2ba9c135ac 100644
--- a/tcg/loongarch64/tcg-target-con-str.h
+++ b/tcg/loongarch64/tcg-target-con-str.h
@@ -27,3 +27,4 @@ CONST('Z', TCG_CT_CONST_ZERO)
 CONST('C', TCG_CT_CONST_C12)
 CONST('W', TCG_CT_CONST_WSZ)
 CONST('M', TCG_CT_CONST_VCMP)
+CONST('A', TCG_CT_CONST_VADD)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 129dd92910..1a369b237c 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -177,6 +177,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind 
kind, int slot)
 #define TCG_CT_CONST_C12   0x1000
 #define TCG_CT_CONST_WSZ   0x2000
 #define TCG_CT_CONST_VCMP  0x4000
+#define TCG_CT_CONST_VADD  0x8000
 
 #define ALL_GENERAL_REGS   MAKE_64BIT_MASK(0, 32)
 #define ALL_VECTOR_REGS    MAKE_64BIT_MASK(32, 32)
@@ -214,6 +215,9 @@ static bool tcg_target_const_match(int64_t val, TCGType 
type, int ct, int vece)
 if ((ct & TCG_CT_CONST_VCMP) && -0x10 <= vec_val && vec_val <= 0x1f) {
 return true;
 }
+if ((ct & TCG_CT_CONST_VADD) && -0x1f <= vec_val && vec_val <= 0x1f) {
+return true;
+}
 return false;
 }
 
@@ -1621,6 +1625,51 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType 
type, unsigned vece,
 }
 }
 
+static void tcg_out_addsub_vec(TCGContext *s, unsigned vece, const TCGArg a0,
+   const TCGArg a1, const TCGArg a2,
+   bool a2_is_const, bool is_add)
+{
+static const LoongArchInsn add_vec_insn[4] = {
+OPC_VADD_B, OPC_VADD_H, OPC_VADD_W, OPC_VADD_D
+};
+static const LoongArchInsn add_vec_imm_insn[4] = {
+OPC_VADDI_BU, OPC_VADDI_HU, OPC_VADDI_WU, OPC_VADDI_DU
+};
+static const LoongArchInsn sub_vec_insn[4] = {
+OPC_VSUB_B, OPC_VSUB_H, OPC_VSUB_W, OPC_VSUB_D
+};
+static const LoongArchInsn sub_vec_imm_insn[4] = {
+OPC_VSUBI_BU, OPC_VSUBI_HU, OPC_VSUBI_WU, OPC_VSUBI_DU
+};
+
+if (a2_is_const) {
+int64_t value = sextract64(a2, 0, 8 << vece);
+if (!is_add) {
+value = -value;
+}
+
+/* Try vaddi/vsubi */
+if (0 <= value && value <= 0x1f) {
+tcg_out32(s, encode_vdvjuk5_insn(add_vec_imm_insn[vece], a0, \
+ a1, value));
+return;
+} else if (-0x1f <= value && value < 0) {
+tcg_out32(s, encode_vdvjuk5_insn(sub_vec_imm_insn[vece], a0, \
+ a1, -value));
+return;
+}
+
+/* constraint TCG_CT_CONST_VADD ensures unreachable */
+g_assert_not_reached();
+}
+
+if (is_add) {
+tcg_out32(s, encode_vdvjvk_insn(add_vec_insn[vece], a0, a1, a2));
+} else {
+tcg_out32(s, encode_vdvjvk_insn(sub_vec_insn[vece], a0, a1, a2));
+}
+}
+
 static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
unsigned vecl, unsigned vece,
const TCGArg args[TCG_MAX_OP_ARGS],
@@ -1712,6 +1761,12 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 }
 tcg_out32(s, encode_vdvjvk_insn(insn, a0, a1, a2));
 break;
+case INDEX_op_add_vec:
+tcg_out_addsub_vec(s, vece, a0, a1, a2, const_args[2], true);
+break;
+case INDEX_op_sub_vec:
+tcg_out_addsub_vec(s, vece, a0, a1, a2, const_args[2], false);
+break;
 case INDEX_op_dupm_vec:
 tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
 break;
@@ -1728,6 +1783,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, 
unsigned vece)
 case INDEX_op_dup_vec:
 case INDEX_op_dupm_vec:
 case INDEX_op_cmp_vec:
+case INDEX_op_add_vec:
+case INDEX_op_sub_vec:
 return 1;
 default:
 return 0;
@@ -1892,6 +1949,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 case INDEX_op_cmp_vec:
 return C_O1_I2(w, w, wM);
 
+case INDEX_op_add_vec:
+case INDEX_op_sub_vec:
+return C_O1_I2(w, w, wA);
+
 default:
 g_assert_not_reached();
 }
-- 
2.42.0
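
The immediate selection in tcg_out_addsub_vec above can be followed more easily in isolation. The sketch below mirrors its decision logic without emitting any instructions. AddSubForm, sext and pick_form are hypothetical names, sext stands in for QEMU's sextract64 with a start bit of 0, and the out-of-range result replaces the g_assert_not_reached() that the real code can afford because of the TCG_CT_CONST_VADD constraint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { USE_VADDI, USE_VSUBI, OUT_OF_RANGE } AddSubForm;

/*
 * Sign-extend the low `len` bits of value (1 <= len <= 64).
 * Relies on arithmetic right shift of signed values, as GCC and Clang provide.
 */
static int64_t sext(uint64_t value, unsigned len)
{
    assert(len >= 1 && len <= 64);
    return (int64_t)(value << (64 - len)) >> (64 - len);
}

/*
 * Decide which immediate form fits a constant operand.
 * vece is log2 of the element size in bytes (0..3); on success *imm
 * receives the unsigned 5-bit immediate.
 */
static AddSubForm pick_form(uint64_t a2, unsigned vece, bool is_add,
                            unsigned *imm)
{
    int64_t value = sext(a2, 8u << vece);

    if (!is_add) {
        if (value == INT64_MIN) {
            return OUT_OF_RANGE;    /* negation would overflow */
        }
        value = -value;             /* x - c is handled as x + (-c) */
    }
    if (0 <= value && value <= 0x1f) {
        *imm = (unsigned)value;
        return USE_VADDI;
    }
    if (-0x1f <= value && value < 0) {
        *imm = (unsigned)-value;
        return USE_VSUBI;
    }
    return OUT_OF_RANGE;
}
```

For example, adding the byte constant 0xff sign-extends to -1 and therefore selects the subtract-immediate form with an immediate of 1.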

[PATCH v4 07/16] tcg/loongarch64: Lower neg_vec to vneg

2023-09-07 Thread Jiajie Chen

Signed-off-by: Jiajie Chen 
Reviewed-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 8 
 tcg/loongarch64/tcg-target.h | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index d569e443dd..b36b706e39 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1695,6 +1695,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 [TCG_COND_LTU] = {OPC_VSLTI_BU, OPC_VSLTI_HU, OPC_VSLTI_WU, 
OPC_VSLTI_DU},
 };
 LoongArchInsn insn;
+static const LoongArchInsn neg_vec_insn[4] = {
+OPC_VNEG_B, OPC_VNEG_H, OPC_VNEG_W, OPC_VNEG_D
+};
 
 a0 = args[0];
 a1 = args[1];
@@ -1793,6 +1796,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_sub_vec:
 tcg_out_addsub_vec(s, vece, a0, a1, a2, const_args[2], false);
 break;
+case INDEX_op_neg_vec:
+tcg_out32(s, encode_vdvj_insn(neg_vec_insn[vece], a0, a1));
+break;
 case INDEX_op_dupm_vec:
 tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
 break;
@@ -1818,6 +1824,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, 
unsigned vece)
 case INDEX_op_xor_vec:
 case INDEX_op_nor_vec:
 case INDEX_op_not_vec:
+case INDEX_op_neg_vec:
 return 1;
 default:
 return 0;
@@ -1995,6 +2002,7 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 return C_O1_I2(w, w, w);
 
 case INDEX_op_not_vec:
+case INDEX_op_neg_vec:
 return C_O1_I1(w, w);
 
 default:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index f9c5cb12ca..64c72d0857 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -178,7 +178,7 @@ extern bool use_lsx_instructions;
 #define TCG_TARGET_HAS_v256 0
 
 #define TCG_TARGET_HAS_not_vec  1
-#define TCG_TARGET_HAS_neg_vec  0
+#define TCG_TARGET_HAS_neg_vec  1
 #define TCG_TARGET_HAS_abs_vec  0
 #define TCG_TARGET_HAS_andc_vec 1
 #define TCG_TARGET_HAS_orc_vec  1
-- 
2.42.0

[PULL 08/13] qemu-nbd: define struct NbdClientOpts when HAVE_NBD_DEVICE is not defined

2023-09-07 Thread Eric Blake

From: "Denis V. Lunev" 

This patch also drops definition of some locals in main() to avoid
useless data copy.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
Message-ID: <20230906093210.339585-3-...@openvz.org>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 qemu-nbd.c | 60 --
 1 file changed, 27 insertions(+), 33 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index e2480061a16..acbdc0cd8fd 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -253,6 +253,12 @@ static int qemu_nbd_client_list(SocketAddress *saddr, 
QCryptoTLSCreds *tls,
 }


+struct NbdClientOpts {
+char *device;
+bool fork_process;
+bool verbose;
+};
+
 #if HAVE_NBD_DEVICE
 static void *show_parts(void *arg)
 {
@@ -271,12 +277,6 @@ static void *show_parts(void *arg)
 return NULL;
 }

-struct NbdClientOpts {
-char *device;
-bool fork_process;
-bool verbose;
-};
-
 static void *nbd_client_thread(void *arg)
 {
 struct NbdClientOpts *opts = arg;
@@ -519,7 +519,6 @@ int main(int argc, char **argv)
 const char *bindto = NULL;
 const char *port = NULL;
 char *sockpath = NULL;
-char *device = NULL;
 QemuOpts *sn_opts = NULL;
 const char *sn_id_or_name = NULL;
 const char *sopt = "hVb:o:p:rsnc:dvk:e:f:tl:x:T:D:AB:L";
@@ -582,16 +581,16 @@ int main(int argc, char **argv)
 const char *tlshostname = NULL;
 bool imageOpts = false;
 bool writethrough = false; /* Client will flush as needed. */
-bool verbose = false;
-bool fork_process = false;
 bool list = false;
 unsigned socket_activation;
 const char *pid_file_name = NULL;
 const char *selinux_label = NULL;
 BlockExportOptions *export_opts;
-#if HAVE_NBD_DEVICE
-struct NbdClientOpts opts;
-#endif
+struct NbdClientOpts opts = {
+.fork_process = false,
+.verbose = false,
+.device = NULL,
+};

 #ifdef CONFIG_POSIX
 os_setup_early_signal_handling();
@@ -719,7 +718,7 @@ int main(int argc, char **argv)
 disconnect = true;
 break;
 case 'c':
-device = optarg;
+opts.device = optarg;
 break;
 case 'e':
 if (qemu_strtoi(optarg, NULL, 0, ) < 0 ||
@@ -750,7 +749,7 @@ int main(int argc, char **argv)
 }
 break;
 case 'v':
-verbose = true;
+opts.verbose = true;
 break;
 case 'V':
 version(argv[0]);
@@ -782,7 +781,7 @@ int main(int argc, char **argv)
 tlsauthz = optarg;
 break;
 case QEMU_NBD_OPT_FORK:
-fork_process = true;
+opts.fork_process = true;
 break;
 case 'L':
 list = true;
@@ -802,12 +801,12 @@ int main(int argc, char **argv)
 exit(EXIT_FAILURE);
 }
 if (export_name || export_description || dev_offset ||
-device || disconnect || fmt || sn_id_or_name || bitmaps ||
+opts.device || disconnect || fmt || sn_id_or_name || bitmaps ||
 alloc_depth || seen_aio || seen_discard || seen_cache) {
 error_report("List mode is incompatible with per-device settings");
 exit(EXIT_FAILURE);
 }
-if (fork_process) {
+if (opts.fork_process) {
 error_report("List mode is incompatible with forking");
 exit(EXIT_FAILURE);
 }
@@ -832,7 +831,8 @@ int main(int argc, char **argv)
 }
 } else {
 /* Using socket activation - check user didn't use -p etc. */
-const char *err_msg = socket_activation_validate_opts(device, sockpath,
+const char *err_msg = socket_activation_validate_opts(opts.device,
+  sockpath,
   bindto, port,
   selinux_label,
   list);
@@ -850,7 +850,7 @@ int main(int argc, char **argv)
 }

 if (tlscredsid) {
-if (device) {
+if (opts.device) {
 error_report("TLS is not supported with a host device");
 exit(EXIT_FAILURE);
 }
@@ -880,7 +880,7 @@ int main(int argc, char **argv)

 if (selinux_label) {
 #ifdef CONFIG_SELINUX
-if (sockpath == NULL && device == NULL) {
+if (sockpath == NULL && opts.device == NULL) {
 error_report("--selinux-label is not permitted without --socket");
 exit(EXIT_FAILURE);
 }
@@ -897,7 +897,7 @@ int main(int argc, char **argv)
 }

 #if !HAVE_NBD_DEVICE
-if (disconnect || device) {
+if (disconnect || opts.device) {
 error_report("Kernel /dev/nbdN support not available");
 exit(EXIT_FAILURE);
 }
@@ -919,7 +919,7 @@ int main(int argc, char **argv)
 }
 #endif

-

[PULL 06/13] util/iov: Avoid dynamic stack allocation

2023-09-07 Thread Eric Blake

From: Philippe Mathieu-Daudé 

Use autofree heap allocation instead of variable-length array on the
stack.

The codebase has very few VLAs, and if we can get rid of them all we
can make the compiler error on new additions.  This is a defensive
measure against security bugs where an on-stack dynamic allocation
isn't correctly size-checked (e.g.  CVE-2021-3527).

Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
Message-ID: <20230824164706.2652277-1-peter.mayd...@linaro.org>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 util/iov.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/iov.c b/util/iov.c
index 866fb577f30..7e73948f5e3 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -571,7 +571,7 @@ static int sortelem_cmp_src_index(const void *a, const void 
*b)
  */
 void qemu_iovec_clone(QEMUIOVector *dest, const QEMUIOVector *src, void *buf)
 {
-IOVectorSortElem sortelems[src->niov];
+g_autofree IOVectorSortElem *sortelems = g_new(IOVectorSortElem, 
src->niov);
 void *last_end;
 int i;

-- 
2.41.0
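
The VLA-to-heap conversion above follows a general pattern that can be shown without GLib. The sketch below uses calloc/free as a stand-in for g_new and g_autofree, so unlike the patch it has to free the buffer by hand and check for allocation failure. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/*
 * Find the smallest element by sorting a scratch copy.
 * Returns 0 on success, -1 on empty input or allocation failure.
 */
static int smallest_via_sorted_copy(const int *src, size_t n, int *out)
{
    int *tmp;

    assert(src != NULL && out != NULL);
    if (n == 0) {
        return -1;
    }

    /* Previously this could have been a VLA: int tmp[n]; */
    tmp = calloc(n, sizeof(*tmp));
    if (tmp == NULL) {
        return -1;
    }

    memcpy(tmp, src, n * sizeof(*tmp));
    qsort(tmp, n, sizeof(*tmp), cmp_int);
    *out = tmp[0];

    free(tmp);
    return 0;
}
```

With the scratch buffer on the heap, an unexpectedly large n results in a failed allocation rather than a stack overflow.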

[PULL 12/13] qemu-nbd: Restore "qemu-nbd -v --fork" output

2023-09-07 Thread Eric Blake

From: "Denis V. Lunev" 

Closing stderr earlier is good for daemonized qemu-nbd run over ssh, but
it breaks the case where -v is used to track what is happening in the
server, as in iotest 233.

When we know we are verbose, we should preserve the original stderr and
restore it once the setup stage is done. This commit restores the
original behavior with the -v option, so the original output inside the
test is kept intact.

Reported-by: Kevin Wolf 
Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
CC: Hanna Reitz 
CC: Mike Maslenkin 
Fixes: 5c56dd27a2 ("qemu-nbd: fix regression with qemu-nbd --fork run over ssh")
Message-ID: <20230906093210.339585-7-...@openvz.org>
Reviewed-by: Eric Blake 
Tested-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 qemu-nbd.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 7c4e22def17..1cdc41ed292 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -255,18 +255,23 @@ struct NbdClientOpts {
 char *device;
 char *srcpath;
 SocketAddress *saddr;
+int stderr;
 bool fork_process;
 bool verbose;
 };

-static void nbd_client_release_pipe(void)
+static void nbd_client_release_pipe(int old_stderr)
 {
 /* Close stderr so that the qemu-nbd process exits.  */
-if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
+if (dup2(old_stderr, STDERR_FILENO) < 0) {
 error_report("Could not release pipe to parent: %s",
  strerror(errno));
 exit(EXIT_FAILURE);
 }
+if (old_stderr != STDOUT_FILENO && close(old_stderr) < 0) {
+error_report("Could not release qemu-nbd: %s", strerror(errno));
+exit(EXIT_FAILURE);
+}
 }

 #if HAVE_NBD_DEVICE
@@ -332,7 +337,7 @@ static void *nbd_client_thread(void *arg)
 fprintf(stderr, "NBD device %s is now connected to %s\n",
 opts->device, opts->srcpath);
 } else {
-nbd_client_release_pipe();
+nbd_client_release_pipe(opts->stderr);
 }

 if (nbd_client(fd) < 0) {
@@ -597,6 +602,7 @@ int main(int argc, char **argv)
 .device = NULL,
 .srcpath = NULL,
 .saddr = NULL,
+.stderr = STDOUT_FILENO,
 };

 #ifdef CONFIG_POSIX
@@ -951,6 +957,16 @@ int main(int argc, char **argv)

 close(stderr_fd[0]);

+/* Remember parent's stderr if we will be restoring it. */
+if (opts.verbose /* fork_process is set */) {
+opts.stderr = dup(STDERR_FILENO);
+if (opts.stderr < 0) {
+error_report("Could not dup original stderr: %s",
+ strerror(errno));
+exit(EXIT_FAILURE);
+}
+}
+
 ret = qemu_daemon(1, 0);
 saved_errno = errno;/* dup2 will overwrite error below */

@@ -1181,7 +1197,7 @@ int main(int argc, char **argv)
 }

 if (opts.fork_process) {
-nbd_client_release_pipe();
+nbd_client_release_pipe(opts.stderr);
 }

 state = RUNNING;
-- 
2.41.0
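
The save-and-restore pattern used by this patch can be sketched on its own with POSIX dup() and dup2(). This is only an illustration under assumptions: save_fd() and restore_fd() are hypothetical names, and error reporting is left to the caller rather than done with error_report() as in qemu-nbd.

```c
#include <assert.h>
#include <unistd.h>

/* Duplicate fd so that it can be restored later; returns the copy or -1. */
int save_fd(int fd)
{
    assert(fd >= 0);
    return dup(fd);
}

/*
 * Make target refer to the file that saved refers to, then drop the
 * extra copy. Returns 0 on success, -1 on error with errno set by
 * dup2() or close().
 */
int restore_fd(int saved, int target)
{
    assert(saved >= 0 && target >= 0);
    if (dup2(saved, target) < 0) {
        return -1;
    }
    /* Do not close the descriptor we just restored onto. */
    if (saved != target && close(saved) < 0) {
        return -1;
    }
    return 0;
}
```

A caller would take the copy before daemonizing, for example save_fd(STDERR_FILENO), and call restore_fd() with it once the setup stage is done.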

[PULL 00/13] NBD patches through 2023-09-07

2023-09-07 Thread Eric Blake

The following changes since commit 03a3a62fbd0aa5227e978eef3c67d3978aec9e5f:

  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging 
(2023-09-07 10:29:06 -0400)

are available in the Git repository at:

  https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2023-09-07

for you to fetch changes up to 737ff1b137b7ce1d613c3851e0efaae9b820dbc0:

  qemu-nbd: document -v behavior in respect to --fork in man (2023-09-07 
20:32:11 -0500)


NBD patches for 2023-09-07

- Andrey Drobyshev - fix regression in iotest 197 under -nbd
- Stefan Hajnoczi - allow coroutine read and write context to split
across threads
- Philippe Mathieu-Daudé - remove a VLA allocation
- Denis V. Lunev - fix regression in iotest 233 with qemu-nbd -v --fork


Andrey Drobyshev (1):
  qemu-iotests/197: use more generic commands for formats other than qcow2

Denis V. Lunev (7):
  qemu-nbd: improve error message for dup2 error
  qemu-nbd: define struct NbdClientOpts when HAVE_NBD_DEVICE is not defined
  qemu-nbd: move srcpath into struct NbdClientOpts
  qemu-nbd: put saddr into into struct NbdClientOpts
  qemu-nbd: invent nbd_client_release_pipe() helper
  qemu-nbd: Restore "qemu-nbd -v --fork" output
  qemu-nbd: document -v behavior in respect to --fork in man

Philippe Mathieu-Daudé (1):
  util/iov: Avoid dynamic stack allocation

Stefan Hajnoczi (4):
  nbd: drop unused nbd_receive_negotiate() aio_context argument
  nbd: drop unused nbd_start_negotiate() aio_context argument
  io: check there are no qio_channel_yield() coroutines during ->finalize()
  io: follow coroutine AioContext in qio_channel_yield()

 docs/tools/qemu-nbd.rst  |   4 +-
 include/block/nbd.h  |   3 +-
 include/io/channel-util.h|  23 +++
 include/io/channel.h |  69 +---
 include/qemu/vhost-user-server.h |   1 +
 block/nbd.c  |  11 +---
 io/channel-command.c |  10 ++-
 io/channel-file.c|   9 ++-
 io/channel-null.c|   3 +-
 io/channel-socket.c  |   9 ++-
 io/channel-tls.c |   6 +-
 io/channel-util.c|  24 +++
 io/channel.c | 124 ++--
 migration/channel-block.c|   3 +-
 migration/rdma.c |  25 
 nbd/client-connection.c  |   3 +-
 nbd/client.c |  14 ++---
 nbd/server.c |  14 +
 qemu-nbd.c   | 133 +--
 scsi/qemu-pr-helper.c|   4 +-
 util/iov.c   |   2 +-
 util/vhost-user-server.c |  27 +---
 tests/qemu-iotests/197   |   8 +--
 tests/qemu-iotests/197.out   |  18 +++---
 24 files changed, 328 insertions(+), 219 deletions(-)

-- 
2.41.0

[PULL 10/13] qemu-nbd: put saddr into into struct NbdClientOpts

2023-09-07 Thread Eric Blake

From: "Denis V. Lunev" 

We pass other parameters into nbd_client_thread() in this way. This patch
makes the code more consistent.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
Message-ID: <20230906093210.339585-5-...@openvz.org>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 qemu-nbd.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 16c59424f13..86bb2f04e24 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -73,7 +73,6 @@

 #define MBR_SIZE 512

-static SocketAddress *saddr;
 static int persistent = 0;
 static enum { RUNNING, TERMINATE, TERMINATED } state;
 static int shared = 1;
@@ -255,6 +254,7 @@ static int qemu_nbd_client_list(SocketAddress *saddr, 
QCryptoTLSCreds *tls,
 struct NbdClientOpts {
 char *device;
 char *srcpath;
+SocketAddress *saddr;
 bool fork_process;
 bool verbose;
 };
@@ -289,7 +289,7 @@ static void *nbd_client_thread(void *arg)

 sioc = qio_channel_socket_new();
 if (qio_channel_socket_connect_sync(sioc,
-saddr,
+opts->saddr,
 &local_error) < 0) {
 error_report_err(local_error);
 goto out;
@@ -591,6 +591,7 @@ int main(int argc, char **argv)
 .verbose = false,
 .device = NULL,
 .srcpath = NULL,
+.saddr = NULL,
 };

 #ifdef CONFIG_POSIX
@@ -892,8 +893,8 @@ int main(int argc, char **argv)
 }

 if (list) {
-saddr = nbd_build_socket_address(sockpath, bindto, port);
-return qemu_nbd_client_list(saddr, tlscreds,
+opts.saddr = nbd_build_socket_address(sockpath, bindto, port);
+return qemu_nbd_client_list(opts.saddr, tlscreds,
 tlshostname ? tlshostname : bindto);
 }

@@ -1024,8 +1025,8 @@ int main(int argc, char **argv)
 exit(EXIT_FAILURE);
 }
 #endif
-saddr = nbd_build_socket_address(sockpath, bindto, port);
-if (qio_net_listener_open_sync(server, saddr, backlog,
+opts.saddr = nbd_build_socket_address(sockpath, bindto, port);
+if (qio_net_listener_open_sync(server, opts.saddr, backlog,
 &local_err) < 0) {
 object_unref(OBJECT(server));
 error_report_err(local_err);
-- 
2.41.0

[PULL 05/13] io: follow coroutine AioContext in qio_channel_yield()

2023-09-07 Thread Eric Blake

From: Stefan Hajnoczi 

The ongoing QEMU multi-queue block layer effort makes it possible for multiple
threads to process I/O in parallel. The nbd block driver is not compatible with
the multi-queue block layer yet because QIOChannel cannot be used easily from
coroutines running in multiple threads. This series changes the QIOChannel API
to make that possible.

In the current API, calling qio_channel_attach_aio_context() sets the
AioContext where qio_channel_yield() installs an fd handler prior to yielding:

  qio_channel_attach_aio_context(ioc, my_ctx);
  ...
  qio_channel_yield(ioc); // my_ctx is used here
  ...
  qio_channel_detach_aio_context(ioc);

This API design has limitations: reading and writing must be done in the same
AioContext and moving between AioContexts involves a cumbersome sequence of API
calls that is not suitable for doing on a per-request basis.

There is no fundamental reason why a QIOChannel needs to run within the
same AioContext every time qio_channel_yield() is called. QIOChannel
only uses the AioContext while inside qio_channel_yield(). The rest of
the time, QIOChannel is independent of any AioContext.

In the new API, qio_channel_yield() queries the AioContext from the current
coroutine using qemu_coroutine_get_aio_context(). There is no need to
explicitly attach/detach AioContexts anymore and
qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone.
One coroutine can read from the QIOChannel while another coroutine writes from
a different AioContext.

This API change allows the nbd block driver to use QIOChannel from any thread.
It's important to keep in mind that the block driver already synchronizes
QIOChannel access and ensures that two coroutines never read simultaneously or
write simultaneously.

This patch updates all users of qio_channel_attach_aio_context() to the
new API. Most conversions are simple, but vhost-user-server requires a
new qemu_coroutine_yield() call to quiesce the vu_client_trip()
coroutine when not attached to any AioContext.

While the API has become simpler, there is one wart: QIOChannel has a
special case for the iohandler AioContext (used for handlers that must not run
in nested event loops). I didn't find an elegant way to preserve that behavior,
so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true|false)
for opting in to the new AioContext model. By default QIOChannel uses the
iohandler AioContext. Code that formerly called
qio_channel_attach_aio_context() now calls
qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is
created.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Acked-by: Daniel P. Berrangé 
Message-ID: <20230830224802.493686-5-stefa...@redhat.com>
[eblake: also fix migration/rdma.c]
Signed-off-by: Eric Blake 
---
 include/io/channel-util.h|  23 ++
 include/io/channel.h |  69 --
 include/qemu/vhost-user-server.h |   1 +
 block/nbd.c  |  11 +--
 io/channel-command.c |  10 ++-
 io/channel-file.c|   9 ++-
 io/channel-null.c|   3 +-
 io/channel-socket.c  |   9 ++-
 io/channel-tls.c |   6 +-
 io/channel-util.c|  24 +++
 io/channel.c | 120 ++-
 migration/channel-block.c|   3 +-
 migration/rdma.c |  25 +++
 nbd/server.c |  14 +---
 scsi/qemu-pr-helper.c|   4 +-
 util/vhost-user-server.c |  27 +--
 16 files changed, 229 insertions(+), 129 deletions(-)

diff --git a/include/io/channel-util.h b/include/io/channel-util.h
index a5d720d9a04..fa18a3756d8 100644
--- a/include/io/channel-util.h
+++ b/include/io/channel-util.h
@@ -49,4 +49,27 @@
 QIOChannel *qio_channel_new_fd(int fd,
Error **errp);

+/**
+ * qio_channel_util_set_aio_fd_handler:
+ * @read_fd: the file descriptor for the read handler
+ * @read_ctx: the AioContext for the read handler
+ * @io_read: the read handler
+ * @write_fd: the file descriptor for the write handler
+ * @write_ctx: the AioContext for the write handler
+ * @io_write: the write handler
+ * @opaque: the opaque argument to the read and write handler
+ *
+ * Set the read and write handlers when @read_ctx and @write_ctx are non-NULL,
+ * respectively. To leave a handler in its current state, pass a NULL
+ * AioContext. To clear a handler, pass a non-NULL AioContext and a NULL
+ * handler.
+ */
+void qio_channel_util_set_aio_fd_handler(int read_fd,
+ AioContext *read_ctx,
+ IOHandler *io_read,
+ int write_fd,
+ AioContext *write_ctx,
+ IOHandler *io_write,
+ void *opaque);
+
 #endif /* QIO_CHANNEL_UTIL_H */

[PULL 01/13] qemu-iotests/197: use more generic commands for formats other than qcow2

2023-09-07 Thread Eric Blake

From: Andrey Drobyshev 

In the previous commit e2f938265e0 ("tests/qemu-iotests/197: add
testcase for CoR with subclusters") we've introduced a new testcase for
copy-on-read with subclusters.  Test 197 always forces qcow2 as the top
image, but allows backing image to be in any format.  That last test
case didn't meet these requirements, so let's fix it by using more
generic "qemu-io -c map" command.

Signed-off-by: Andrey Drobyshev 
Message-ID: <20230907220718.983430-1-andrey.drobys...@virtuozzo.com>
Tested-by: Eric Blake 
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/197 |  8 
 tests/qemu-iotests/197.out | 18 --
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/197 b/tests/qemu-iotests/197
index f07a9da136a..8ad2bdb035e 100755
--- a/tests/qemu-iotests/197
+++ b/tests/qemu-iotests/197
@@ -136,18 +136,18 @@ IMGPROTO=file IMGFMT=qcow2 TEST_IMG_FILE="$TEST_WRAP" \
 $QEMU_IO -c "write -P 0xaa 0 64k" "$TEST_IMG" | _filter_qemu_io

 # Allocate individual subclusters in the top image, and not the whole cluster
-$QEMU_IO -c "write -P 0xbb 28K 2K" -c "write -P 0xcc 34K 2K" "$TEST_WRAP" \
+$QEMU_IO -f qcow2 -c "write -P 0xbb 28K 2K" -c "write -P 0xcc 34K 2K" "$TEST_WRAP" \
 | _filter_qemu_io

 # Only 2 subclusters should be allocated in the top image at this point
-$QEMU_IMG map "$TEST_WRAP" | _filter_qemu_img_map
+$QEMU_IO -f qcow2 -c map "$TEST_WRAP"

 # Actual copy-on-read operation
-$QEMU_IO -C -c "read -P 0xaa 30K 4K" "$TEST_WRAP" | _filter_qemu_io
+$QEMU_IO -f qcow2 -C -c "read -P 0xaa 30K 4K" "$TEST_WRAP" | _filter_qemu_io

 # And here we should have 4 subclusters allocated right in the middle of the
 # top image. Make sure the whole cluster remains unallocated
-$QEMU_IMG map "$TEST_WRAP" | _filter_qemu_img_map
+$QEMU_IO -f qcow2 -c map "$TEST_WRAP"

 _check_test_img

diff --git a/tests/qemu-iotests/197.out b/tests/qemu-iotests/197.out
index 8f34a30afea..86c57b51d30 100644
--- a/tests/qemu-iotests/197.out
+++ b/tests/qemu-iotests/197.out
@@ -42,17 +42,15 @@ wrote 2048/2048 bytes at offset 28672
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 2048/2048 bytes at offset 34816
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-Offset  Length  File
-0   0x7000  TEST_DIR/t.IMGFMT
-0x7000  0x800   TEST_DIR/t.wrap.IMGFMT
-0x7800  0x1000  TEST_DIR/t.IMGFMT
-0x8800  0x800   TEST_DIR/t.wrap.IMGFMT
-0x9000  0x7000  TEST_DIR/t.IMGFMT
+28 KiB (0x7000) bytes not allocated at offset 0 bytes (0x0)
+2 KiB (0x800) bytes allocated at offset 28 KiB (0x7000)
+4 KiB (0x1000) bytes not allocated at offset 30 KiB (0x7800)
+2 KiB (0x800) bytes allocated at offset 34 KiB (0x8800)
+28 KiB (0x7000) bytes not allocated at offset 36 KiB (0x9000)
 read 4096/4096 bytes at offset 30720
 4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-Offset  Length  File
-0   0x7000  TEST_DIR/t.IMGFMT
-0x7000  0x2000  TEST_DIR/t.wrap.IMGFMT
-0x9000  0x7000  TEST_DIR/t.IMGFMT
+28 KiB (0x7000) bytes not allocated at offset 0 bytes (0x0)
+8 KiB (0x2000) bytes allocated at offset 28 KiB (0x7000)
+28 KiB (0x7000) bytes not allocated at offset 36 KiB (0x9000)
 No errors were found on the image.
 *** done
-- 
2.41.0

[PULL 09/13] qemu-nbd: move srcpath into struct NbdClientOpts

2023-09-07 Thread Eric Blake

From: "Denis V. Lunev" 

We pass other parameters into nbd_client_thread() in this way. This patch
makes the code more consistent.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
Message-ID: <20230906093210.339585-4-...@openvz.org>
Reviewed-by: Eric Blake 
[eblake: Note that this also cleans up a -Wshadow issue, first
introduced in e5b815b0]
Signed-off-by: Eric Blake 
---
 qemu-nbd.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index acbdc0cd8fd..16c59424f13 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -73,7 +73,6 @@

 #define MBR_SIZE 512

-static char *srcpath;
 static SocketAddress *saddr;
 static int persistent = 0;
 static enum { RUNNING, TERMINATE, TERMINATED } state;
@@ -255,6 +254,7 @@ static int qemu_nbd_client_list(SocketAddress *saddr, 
QCryptoTLSCreds *tls,

 struct NbdClientOpts {
 char *device;
+char *srcpath;
 bool fork_process;
 bool verbose;
 };
@@ -320,7 +320,7 @@ static void *nbd_client_thread(void *arg)

 if (opts->verbose && !opts->fork_process) {
 fprintf(stderr, "NBD device %s is now connected to %s\n",
-opts->device, srcpath);
+opts->device, opts->srcpath);
 } else {
 /* Close stderr so that the qemu-nbd process exits.  */
 if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
@@ -590,6 +590,7 @@ int main(int argc, char **argv)
 .fork_process = false,
 .verbose = false,
 .device = NULL,
+.srcpath = NULL,
 };

 #ifdef CONFIG_POSIX
@@ -1059,19 +1060,19 @@ int main(int argc, char **argv)
 bdrv_init();
 atexit(qemu_nbd_shutdown);

-srcpath = argv[optind];
+opts.srcpath = argv[optind];
 if (imageOpts) {
-QemuOpts *opts;
+QemuOpts *o;
 if (fmt) {
 error_report("--image-opts and -f are mutually exclusive");
 exit(EXIT_FAILURE);
 }
-opts = qemu_opts_parse_noisily(_opts, srcpath, true);
-if (!opts) {
+o = qemu_opts_parse_noisily(_opts, opts.srcpath, true);
+if (!o) {
 qemu_opts_reset(_opts);
 exit(EXIT_FAILURE);
 }
-options = qemu_opts_to_qdict(opts, NULL);
+options = qemu_opts_to_qdict(o, NULL);
 qemu_opts_reset(_opts);
 blk = blk_new_open(NULL, NULL, options, flags, _err);
 } else {
@@ -1079,7 +1080,7 @@ int main(int argc, char **argv)
 options = qdict_new();
 qdict_put_str(options, "driver", fmt);
 }
-blk = blk_new_open(srcpath, NULL, options, flags, _err);
+blk = blk_new_open(opts.srcpath, NULL, options, flags, _err);
 }

 if (!blk) {
-- 
2.41.0

[PULL 02/13] nbd: drop unused nbd_receive_negotiate() aio_context argument

2023-09-07 Thread Eric Blake

From: Stefan Hajnoczi 

aio_context is always NULL, so drop it.

Suggested-by: Fabiano Rosas 
Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20230830224802.493686-2-stefa...@redhat.com>
Signed-off-by: Eric Blake 
---
 include/block/nbd.h | 3 +--
 nbd/client-connection.c | 3 +--
 nbd/client.c| 5 ++---
 qemu-nbd.c  | 4 ++--
 4 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 4428bcffbb9..f672b76173b 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -324,8 +324,7 @@ typedef struct NBDExportInfo {
 char **contexts;
 } NBDExportInfo;

-int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
-  QCryptoTLSCreds *tlscreds,
+int nbd_receive_negotiate(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
   const char *hostname, QIOChannel **outioc,
   NBDExportInfo *info, Error **errp);
 void nbd_free_export_list(NBDExportInfo *info, int count);
diff --git a/nbd/client-connection.c b/nbd/client-connection.c
index 3d14296c042..aafb3d0fb43 100644
--- a/nbd/client-connection.c
+++ b/nbd/client-connection.c
@@ -146,8 +146,7 @@ static int nbd_connect(QIOChannelSocket *sioc, 
SocketAddress *addr,
 return 0;
 }

-ret = nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc), tlscreds,
-tlshostname,
+ret = nbd_receive_negotiate(QIO_CHANNEL(sioc), tlscreds, tlshostname,
 outioc, info, errp);
 if (ret < 0) {
 /*
diff --git a/nbd/client.c b/nbd/client.c
index 479208d5d9d..16ec10c8a91 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -1014,8 +1014,7 @@ static int nbd_negotiate_finish_oldstyle(QIOChannel *ioc, 
NBDExportInfo *info,
  * Returns: negative errno: failure talking to server
  *  0: server is connected
  */
-int nbd_receive_negotiate(AioContext *aio_context, QIOChannel *ioc,
-  QCryptoTLSCreds *tlscreds,
+int nbd_receive_negotiate(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
   const char *hostname, QIOChannel **outioc,
   NBDExportInfo *info, Error **errp)
 {
@@ -1027,7 +1026,7 @@ int nbd_receive_negotiate(AioContext *aio_context, 
QIOChannel *ioc,
 assert(info->name && strlen(info->name) <= NBD_MAX_STRING_SIZE);
 trace_nbd_receive_negotiate_name(info->name);

-result = nbd_start_negotiate(aio_context, ioc, tlscreds, hostname, outioc,
+result = nbd_start_negotiate(NULL, ioc, tlscreds, hostname, outioc,
  info->structured_reply, , errp);
 if (result < 0) {
 return result;
diff --git a/qemu-nbd.c b/qemu-nbd.c
index aaccaa33184..b47459f781d 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -295,8 +295,8 @@ static void *nbd_client_thread(void *arg)
 goto out;
 }

-if (nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),
-  NULL, NULL, NULL, , _error) < 0) {
+if (nbd_receive_negotiate(QIO_CHANNEL(sioc), NULL, NULL, NULL,
+  , _error) < 0) {
 if (local_error) {
 error_report_err(local_error);
 }
-- 
2.41.0

[PULL 03/13] nbd: drop unused nbd_start_negotiate() aio_context argument

2023-09-07 Thread Eric Blake

From: Stefan Hajnoczi 

aio_context is always NULL, so drop it.

Suggested-by: Fabiano Rosas 
Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20230830224802.493686-3-stefa...@redhat.com>
Signed-off-by: Eric Blake 
---
 nbd/client.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index 16ec10c8a91..bd7e2001366 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -877,8 +877,7 @@ static int nbd_list_meta_contexts(QIOChannel *ioc,
  * Returns: negative errno: failure talking to server
  *  non-negative: enum NBDMode describing server abilities
  */
-static int nbd_start_negotiate(AioContext *aio_context, QIOChannel *ioc,
-   QCryptoTLSCreds *tlscreds,
+static int nbd_start_negotiate(QIOChannel *ioc, QCryptoTLSCreds *tlscreds,
const char *hostname, QIOChannel **outioc,
bool structured_reply, bool *zeroes,
Error **errp)
@@ -946,10 +945,6 @@ static int nbd_start_negotiate(AioContext *aio_context, 
QIOChannel *ioc,
 return -EINVAL;
 }
 ioc = *outioc;
-if (aio_context) {
-qio_channel_set_blocking(ioc, false, NULL);
-qio_channel_attach_aio_context(ioc, aio_context);
-}
 } else {
 error_setg(errp, "Server does not support STARTTLS");
 return -EINVAL;
@@ -1026,7 +1021,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, 
QCryptoTLSCreds *tlscreds,
 assert(info->name && strlen(info->name) <= NBD_MAX_STRING_SIZE);
 trace_nbd_receive_negotiate_name(info->name);

-result = nbd_start_negotiate(NULL, ioc, tlscreds, hostname, outioc,
+result = nbd_start_negotiate(ioc, tlscreds, hostname, outioc,
  info->structured_reply, , errp);
 if (result < 0) {
 return result;
@@ -1149,7 +1144,7 @@ int nbd_receive_export_list(QIOChannel *ioc, 
QCryptoTLSCreds *tlscreds,
 QIOChannel *sioc = NULL;

 *info = NULL;
-result = nbd_start_negotiate(NULL, ioc, tlscreds, hostname, , true,
+result = nbd_start_negotiate(ioc, tlscreds, hostname, , true,
  NULL, errp);
 if (tlscreds && sioc) {
 ioc = sioc;
-- 
2.41.0

[PULL 07/13] qemu-nbd: improve error message for dup2 error

2023-09-07 Thread Eric Blake

From: "Denis V. Lunev" 

This error happens if we are not able to close the pipe to the
parent (to trace errors in the child process) and assign stderr to
/dev/null as required by the daemonizing convention.

Signed-off-by: Denis V. Lunev 
Suggested-by: Eric Blake 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
Message-ID: <20230906093210.339585-2-...@openvz.org>
Reviewed-by: Eric Blake 
[eblake: commit message grammar]
Signed-off-by: Eric Blake 
---
 qemu-nbd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index b47459f781d..e2480061a16 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -324,7 +324,7 @@ static void *nbd_client_thread(void *arg)
 } else {
 /* Close stderr so that the qemu-nbd process exits.  */
 if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
-error_report("Could not set stderr to /dev/null: %s",
+error_report("Could not release pipe to parent: %s",
  strerror(errno));
 exit(EXIT_FAILURE);
 }
@@ -1181,7 +1181,7 @@ int main(int argc, char **argv)

 if (fork_process) {
 if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
-error_report("Could not set stderr to /dev/null: %s",
+error_report("Could not release pipe to parent: %s",
  strerror(errno));
 exit(EXIT_FAILURE);
 }
-- 
2.41.0

[PULL 13/13] qemu-nbd: document -v behavior in respect to --fork in man

2023-09-07 Thread Eric Blake

From: "Denis V. Lunev" 

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
Message-ID: <20230906093210.339585-8-...@openvz.org>
Reviewed-by: Eric Blake 
[eblake: Wording improvement]
Signed-off-by: Eric Blake 
---
 docs/tools/qemu-nbd.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index faf6349ea51..329f44d9895 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -197,7 +197,9 @@ driver options if :option:`--image-opts` is specified.

 .. option:: -v, --verbose

-  Display extra debugging information.
+  Display extra debugging information. This option also keeps the original
+  *STDERR* stream open if the ``qemu-nbd`` process is daemonized due to
+  other options like :option:`--fork` or :option:`-c`.

 .. option:: -h, --help

-- 
2.41.0

[PULL 04/13] io: check there are no qio_channel_yield() coroutines during ->finalize()

2023-09-07 Thread Eric Blake

From: Stefan Hajnoczi 

Callers must clean up their coroutines before calling
object_unref(OBJECT(ioc)) to prevent an fd handler leak. Add an
assertion to check this.

This patch is preparation for the fd handler changes that follow.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Eric Blake 
Message-ID: <20230830224802.493686-4-stefa...@redhat.com>
Signed-off-by: Eric Blake 
---
 io/channel.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/io/channel.c b/io/channel.c
index 72f0066af55..c415f3fc885 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -653,6 +653,10 @@ static void qio_channel_finalize(Object *obj)
 {
 QIOChannel *ioc = QIO_CHANNEL(obj);

+/* Must not have coroutines in qio_channel_yield() */
+assert(!ioc->read_coroutine);
+assert(!ioc->write_coroutine);
+
 g_free(ioc->name);

 #ifdef _WIN32
-- 
2.41.0

[PULL 11/13] qemu-nbd: invent nbd_client_release_pipe() helper

2023-09-07 Thread Eric Blake

From: "Denis V. Lunev" 

Move the code from main() and nbd_client_thread() into the specific
helper. This code is going to be grown.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
Message-ID: <20230906093210.339585-6-...@openvz.org>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 qemu-nbd.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 86bb2f04e24..7c4e22def17 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -259,6 +259,16 @@ struct NbdClientOpts {
 bool verbose;
 };

+static void nbd_client_release_pipe(void)
+{
+/* Close stderr so that the qemu-nbd process exits.  */
+if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
+error_report("Could not release pipe to parent: %s",
+ strerror(errno));
+exit(EXIT_FAILURE);
+}
+}
+
 #if HAVE_NBD_DEVICE
 static void *show_parts(void *arg)
 {
@@ -322,12 +332,7 @@ static void *nbd_client_thread(void *arg)
 fprintf(stderr, "NBD device %s is now connected to %s\n",
 opts->device, opts->srcpath);
 } else {
-/* Close stderr so that the qemu-nbd process exits.  */
-if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
-error_report("Could not release pipe to parent: %s",
- strerror(errno));
-exit(EXIT_FAILURE);
-}
+nbd_client_release_pipe();
 }

 if (nbd_client(fd) < 0) {
@@ -1176,11 +1181,7 @@ int main(int argc, char **argv)
 }

 if (opts.fork_process) {
-if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
-error_report("Could not release pipe to parent: %s",
- strerror(errno));
-exit(EXIT_FAILURE);
-}
+nbd_client_release_pipe();
 }

 state = RUNNING;
-- 
2.41.0

Re: [PATCH] softmmu/dirtylimit: Fix usleep early return on signal

2023-09-07 Thread alloc young





On 2023/9/4 21:27, Yong Huang wrote:

On Fri, Sep 1, 2023 at 10:19 AM > wrote:

From: alloc <alloc.yo...@outlook.com>

Timeout functions like usleep can return early on signal, which results in
more dirty pages than expected. In the dirtylimit case, the dirtyrate meter
thread needs to kick all vcpus out to sync. The callchain:

vcpu_calculate_dirtyrate
     global_dirty_log_sync
         memory_global_dirty_log_sync
             kvm_log_sync_global
                 kvm_dirty_ring_flush
                     kvm_cpu_synchronize_kick_all < send vcpu signal

For long time sleep, use qemu_cond_timedwait_iothread to handle cpu stop
event.


The Dirty Limit algorithm seeks to keep the vCPU dirty page rate within
the set limit; since it focuses more emphasis on processing time and
precision, I feel that improvement should strive for the same result.
Could you please provide the final test results showing the impact of
that improvement?


The kvm_cpu_sync in the dirty ring flush has to wait for all vCPUs to exit
before it can run the sync action, and a sleeping vCPU may block this. Before
this patch, the kick can make a vCPU return early from its sleep and dirty
more pages. It seems that kvm_cpu_sync conflicts with the vCPU sleep; why not
measure the dirty rate when the dirty ring is full?




Signed-off-by: alloc <alloc.yo...@outlook.com>
---
  softmmu/dirtylimit.c | 19 +--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c
index fa959d7743..ee938c636d 100644
--- a/softmmu/dirtylimit.c
+++ b/softmmu/dirtylimit.c
@@ -411,13 +411,28 @@ void dirtylimit_set_all(uint64_t quota,

  void dirtylimit_vcpu_execute(CPUState *cpu)
  {
+    int64_t sleep_us, endtime_us;
+
+    dirtylimit_state_lock();
      if (dirtylimit_in_service() &&
          dirtylimit_vcpu_get_state(cpu->cpu_index)->enabled &&
          cpu->throttle_us_per_full) {
          trace_dirtylimit_vcpu_execute(cpu->cpu_index,
                  cpu->throttle_us_per_full);
-        usleep(cpu->throttle_us_per_full);
-    }
+        sleep_us = cpu->throttle_us_per_full;
+        dirtylimit_state_unlock();
+        endtime_us = qemu_clock_get_us(QEMU_CLOCK_REALTIME) + sleep_us;
+        while (sleep_us > 0 && !cpu->stop) {
+            if (sleep_us > SCALE_US) {
+                qemu_mutex_lock_iothread();
+                qemu_cond_timedwait_iothread(cpu->halt_cond,
sleep_us / SCALE_US);
+                qemu_mutex_unlock_iothread();
+            } else
+                g_usleep(sleep_us);
+            sleep_us = endtime_us -
qemu_clock_get_us(QEMU_CLOCK_REALTIME);
+        }
+    } else
+        dirtylimit_state_unlock();
  }

  static void dirtylimit_init(void)
-- 
2.39.3
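
The core idea of the patch, sleeping against a fixed deadline so that a signal cannot shorten the throttle time, can be sketched outside QEMU. The following is a minimal illustration under assumptions, not the patch itself: it uses POSIX nanosleep() and CLOCK_MONOTONIC instead of qemu_clock_get_us() and qemu_cond_timedwait_iothread(), it does not model the cpu->stop check, and now_us() and sleep_full_us() are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in microseconds. */
static int64_t now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/*
 * Sleep until sleep_us microseconds have passed, resuming after signal
 * interruptions; gives up only on an unexpected nanosleep() error.
 * Returns the elapsed time in microseconds.
 */
int64_t sleep_full_us(int64_t sleep_us)
{
    int64_t start, end, left;

    assert(sleep_us >= 0);
    start = now_us();
    end = start + sleep_us;

    /* Recompute the remaining time from the deadline on every pass. */
    while ((left = end - now_us()) > 0) {
        struct timespec req = {
            .tv_sec = left / 1000000,
            .tv_nsec = (left % 1000000) * 1000,
        };

        /* nanosleep() fails with EINTR when a signal arrives: retry. */
        if (nanosleep(&req, NULL) < 0 && errno != EINTR) {
            break;
        }
    }
    return now_us() - start;
}
```

Computing the remaining time from an absolute deadline, as the patch does with endtime_us, avoids accumulating drift when the sleep is interrupted many times.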




--
Best regards

Re: [PATCH] hw/riscv: split RAM into low and high memory

2023-09-07 Thread Wu, Fei

On 9/7/2023 11:46 PM, Anup Patel wrote:
> On Tue, Aug 1, 2023 at 4:16 AM Daniel Henrique Barboza
>  wrote:
>>
>>
>>
>> On 7/30/23 22:53, Fei Wu wrote:
>>> riscv virt platform's memory started at 0x8000 and
>>> straddled the 4GiB boundary. Curiously enough, this choice
>>> of a memory layout will prevent from launching a VM with
>>> a bit more than 2000MiB and PCIe pass-thru on an x86 host, due
>>> to identity mapping requirements for the MSI doorbell on x86,
>>> and these (APIC/IOAPIC) live right below 4GiB.
>>>
>>> So just split the RAM range into two portions:
>>> - 1 GiB range from 0x8000 to 0xc000.
>>> - The remainder at 0x1
>>>
>>> ...leaving a hole between the ranges.
>>
>> I am afraid this breaks some existing distro setups, like Ubuntu. After this
>> patch this emulation stopped working:
>>
>> ~/work/qemu/build/qemu-system-riscv64 \
>> -machine virt -nographic -m 8G -smp 8 \
>> -kernel ./uboot-ubuntu/usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
>> -drive file=snapshot.img,format=qcow2,if=virtio \
>> -netdev bridge,id=bridge1,br=virbr0 -device virtio-net-pci,netdev=bridge1
>>
>>
>> This is basically a guest created via the official Canonical tutorial:
>>
>> https://wiki.ubuntu.com/RISC-V/QEMU
>>
>> The error being thrown:
>>
>> =
>>
>> Boot HART ID  : 4
>> Boot HART Domain  : root
>> Boot HART Priv Version: v1.12
>> Boot HART Base ISA: rv64imafdch
>> Boot HART ISA Extensions  : time,sstc
>> Boot HART PMP Count   : 16
>> Boot HART PMP Granularity : 4
>> Boot HART PMP Address Bits: 54
>> Boot HART MHPM Count  : 16
>> Boot HART MIDELEG : 0x1666
>> Boot HART MEDELEG : 0x00f0b509
>>
>>
>> U-Boot 2022.07+dfsg-1ubuntu4.2 (Nov 24 2022 - 18:47:41 +)
>>
>> CPU:   
>> rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zca_zcd_zba_zbb_zbc_zbs_sstc_svadu
>> Model: riscv-virtio,qemu
>> DRAM:  Unhandled exception: Store/AMO access fault
>> EPC: 802018b8 RA: 802126a0 TVAL: ff733f90
>>
>> Code: b823 06b2 bc23 06b2 b023 08b2 b423 08b2 (b823 08b2)
>>
>>
>> resetting ...
>> System reset not supported on this platform
>> ### ERROR ### Please RESET the board ###
>> QEMU 8.0.90 monitor - type 'help' for more infor
>> =
> 
> Can you try again after setting CONFIG_NR_DRAM_BANKS=2 in
> qemu-riscv64_smode_defconfig and qemu-riscv64_spl_defconfig ?
> 
Yes, I made a u-boot patch to change this setting and also use
fdtdec_setup_mem_size_base_lowest() instead of fdtdec_setup_mem_size_base()
in dram_init(); the latter change is also necessary. The patch has been posted
to the u-boot mailing list but has received no reply yet:
https://lists.denx.de/pipermail/u-boot/2023-September/529729.html

Thanks,
Fei.

> Regards,
> Anup
> 
>>
>>
>> Based on the change made I can make an educated guess on what is going wrong.
>> We have another board with a similar memory topology you're making here, the
>> Microchip Polarfire (microchip_pfsoc.c). We were having some problems with
>> this board while trying to consolidate the boot process between all boards in
>> hw/riscv/boot.c because of its non-continuous RAM bank. The full story can be
>> read in the commit message of 4b402886ac89 ("hw/riscv: change
>> riscv_compute_fdt_addr() semantics") but the short version can be seen in
>> riscv_compute_fdt_addr() from boot.c:
>>
>> - if ram_start is less than 3072 MiB, the FDT will be put at the lowest value
>>   between 3072 MiB and the end of that RAM bank;
>>
>> - if ram_start is higher than 3072 MiB the FDT will be put at the end of the
>>   RAM bank.
>>
>> So, after this patch, since riscv_compute_fdt_addr() is being used with the
>> now lower RAM bank, the fdt is being put in LOW_MEM - fdt_size for any setup
>> that has more than 1Gb RAM, and this breaks assumptions made by uboot and
>> Ubuntu and possibly others that are trying to retrieve the FDT from the gap
>> that you created between low and hi mem in this patch.
>>
>> In fact, this same Ubuntu guest I mentioned above will boot if I put only 1
>> Gb of RAM (-m 1Gb). If I try with -m 1.1Gb I reproduce this error. This can
>> be a validation of the guess I'm making here: Ubuntu is trying to fetch stuff
>> (probably the fdt) from the gap between the memory areas.
>>
>> This change on top of this patch doesn't work either:
>>
>> $ git diff
>> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
>> index 8fbdc7220c..dfff48d849 100644
>> --- a/hw/riscv/virt.c
>> +++ b/hw/riscv/virt.c
>> @@ -1335,9 +1335,16 @@ static void virt_machine_done(Notifier *notifier, 
>> void *data)
>>kernel_start_addr, true, NULL);
>>   }
>>
>> -fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
>> +if (machine->ram_size < memmap[VIRT_DRAM].size) {
>> +fdt_load_addr = riscv_compute_fdt_addr(memmap[VIRT_DRAM].base,
>>
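
For readers following the FDT placement rule quoted above, here is a minimal sketch of the two cases as they are described in the message. It is not the code from hw/riscv/boot.c: the function name fdt_addr_sketch() and its parameters are hypothetical, and any alignment the real riscv_compute_fdt_addr() applies is deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Hypothetical helper illustrating the rule quoted above; it is not
 * QEMU's riscv_compute_fdt_addr() and it skips any alignment step.
 */
static uint64_t fdt_addr_sketch(uint64_t bank_base, uint64_t bank_size,
                                uint64_t fdt_size)
{
    const uint64_t limit = 3072 * MiB;
    uint64_t bank_end = bank_base + bank_size;
    uint64_t top;

    /* The FDT has to fit inside the bank it is placed in. */
    assert(bank_size >= fdt_size);

    if (bank_base < limit) {
        /* Bank starts below 3072 MiB: use the lower of bank end and 3072 MiB. */
        top = bank_end < limit ? bank_end : limit;
    } else {
        /* Bank starts at or above 3072 MiB: use the end of the bank. */
        top = bank_end;
    }

    /* Place the FDT so that it ends at "top". */
    return top - fdt_size;
}
```

The bank base and size are plain parameters here, so the same helper can be called once per RAM bank when the memory is split into several ranges.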

Re: [PATCH v2 4/3] qemu-iotests/197: use more generic commands for formats other than qcow2

2023-09-07 Thread Eric Blake

On Fri, Sep 08, 2023 at 01:07:18AM +0300, Andrey Drobyshev via wrote:
> In the previous commit e2f938265e0 ("tests/qemu-iotests/197: add
> testcase for CoR with subclusters") we've introduced a new testcase for
> copy-on-read with subclusters.  Test 197 always forces qcow2 as the top
> image, but allows backing image to be in any format.  That last test
> case didn't meet these requirements, so let's fix it by using more
> generic "qemu-io -c map" command.
> 
> Signed-off-by: Andrey Drobyshev 
> ---
>  tests/qemu-iotests/197 |  8 
>  tests/qemu-iotests/197.out | 18 --
>  2 files changed, 12 insertions(+), 14 deletions(-)

Tested-by: Eric Blake 

> 
> diff --git a/tests/qemu-iotests/197 b/tests/qemu-iotests/197
> index f07a9da136..8ad2bdb035 100755
> --- a/tests/qemu-iotests/197
> +++ b/tests/qemu-iotests/197
> @@ -136,18 +136,18 @@ IMGPROTO=file IMGFMT=qcow2 TEST_IMG_FILE="$TEST_WRAP" \
>  $QEMU_IO -c "write -P 0xaa 0 64k" "$TEST_IMG" | _filter_qemu_io
>  
>  # Allocate individual subclusters in the top image, and not the whole cluster
> -$QEMU_IO -c "write -P 0xbb 28K 2K" -c "write -P 0xcc 34K 2K" "$TEST_WRAP" \
> +$QEMU_IO -f qcow2 -c "write -P 0xbb 28K 2K" -c "write -P 0xcc 34K 2K" "$TEST_WRAP" \
>  | _filter_qemu_io

Adding the -f qcow2 makes sense (this is a test of subcluster
behavior); and the backing file remains whatever format was passed to
./check.

> +++ b/tests/qemu-iotests/197.out
> @@ -42,17 +42,15 @@ wrote 2048/2048 bytes at offset 28672
>  2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  wrote 2048/2048 bytes at offset 34816
>  2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -Offset  Length  File
> -0   0x7000  TEST_DIR/t.IMGFMT
> -0x7000  0x800   TEST_DIR/t.wrap.IMGFMT
> -0x7800  0x1000  TEST_DIR/t.IMGFMT
> -0x8800  0x800   TEST_DIR/t.wrap.IMGFMT
> -0x9000  0x7000  TEST_DIR/t.IMGFMT
> +28 KiB (0x7000) bytes not allocated at offset 0 bytes (0x0)
> +2 KiB (0x800) bytes allocated at offset 28 KiB (0x7000)
> +4 KiB (0x1000) bytes not allocated at offset 30 KiB (0x7800)
> +2 KiB (0x800) bytes allocated at offset 34 KiB (0x8800)
> +28 KiB (0x7000) bytes not allocated at offset 36 KiB (0x9000)
>  read 4096/4096 bytes at offset 30720

Same information, but without the backing file details (which clears
up the problem with -nbd).

Reviewed-by: Eric Blake 

Adding to my NBD queue, for a pull request soon.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

Re: riscv64 virt board crash upon startup

2023-09-07 Thread Laszlo Ersek

On 9/8/23 01:47, Laszlo Ersek wrote:

> I don't know why qemu_console_is_multihead() used a lot of QOM
> trickery for this in the first place, but here's what I'd propose as
> fix -- simply try to locate a QemuGraphicConsole in "consoles" that
> references the same "device" that *this* QemuGraphicConsole
> references, but by a different "head" number.

So, the final version of the function would look like:

static bool qemu_graphic_console_is_multihead(QemuGraphicConsole *c)
{
    QemuConsole *con;

    QTAILQ_FOREACH(con, &consoles, next) {
        if (!QEMU_IS_GRAPHIC_CONSOLE(con)) {
            continue;
        }
        QemuGraphicConsole *candidate = QEMU_GRAPHIC_CONSOLE(con);
        if (candidate->device != c->device) {
            continue;
        }

        if (candidate->head != c->head) {
            return true;
        }
    }
    return false;
}

Laszlo
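
The check proposed above can be tried outside of QEMU with a reduced model. The following is only a sketch under stated assumptions: SketchConsole and sketch_is_multihead() are hypothetical stand-ins, a flat array replaces the "consoles" list, and a boolean replaces the QOM type check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for QEMU's console objects. */
typedef struct SketchConsole {
    bool is_graphic;     /* text consoles have no device or head */
    const void *device;  /* owning display device, compared by identity */
    uint32_t head;       /* head number on that device */
} SketchConsole;

/*
 * Return true if some other graphic console references the same device
 * as "c" under a different head number.
 */
static bool sketch_is_multihead(const SketchConsole *all, size_t n,
                                const SketchConsole *c)
{
    assert(c != NULL && c->is_graphic);

    for (size_t i = 0; i < n; i++) {
        /* Skip text consoles and consoles of other devices. */
        if (!all[i].is_graphic || all[i].device != c->device) {
            continue;
        }
        if (all[i].head != c->head) {
            return true;
        }
    }
    return false;
}
```

Text consoles are skipped before any device comparison, which mirrors the point of the proposal: nothing is looked up on a console type that does not carry a "device".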

Re: [PATCH v3 27/32] machine: Print CPU model name instead of CPU type name

2023-09-07 Thread Gavin Shan


On 9/7/23 19:05, Philippe Mathieu-Daudé wrote:

On 7/9/23 02:35, Gavin Shan wrote:

The names of supported CPU models instead of CPU types should be
printed when the user specified CPU type isn't supported, to be
consistent with the output from '-cpu ?'.

Correct the error messages to print CPU model names instead of CPU
type names.

Signed-off-by: Gavin Shan 
---
  hw/core/machine.c | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)




@@ -1373,11 +1374,18 @@ static void is_cpu_type_supported(MachineState 
*machine, Error **errp)
  /* The user specified CPU type isn't valid */
  if (!mc->valid_cpu_types[i]) {
-    error_setg(errp, "Invalid CPU type: %s", machine->cpu_type);
-    error_append_hint(errp, "The valid types are: %s",
-  mc->valid_cpu_types[0]);
+    model = cpu_model_from_type(machine->cpu_type);
+    error_setg(errp, "Invalid CPU type: %s", model);
+    g_free(model);
+
+    model = cpu_model_from_type(mc->valid_cpu_types[0]);
+    error_append_hint(errp, "The valid types are: %s", model);
+    g_free(model);
+
  for (i = 1; mc->valid_cpu_types[i]; i++) {
-    error_append_hint(errp, ", %s", mc->valid_cpu_types[i]);
+    model = cpu_model_from_type(mc->valid_cpu_types[i]);


cpu_model_from_type() can return NULL:

  char *cpu_model_from_type(const char *typename)
  {
  const char *suffix = "-" CPU_RESOLVING_TYPE;

  if (!object_class_by_name(typename)) {
  return NULL;
  }

Don't we want to skip that case?

    if (!model) {
    continue;
    }



No, it's intentional. "(null)" will be printed in this specific case so that
it can be identified quickly, and mc->valid_cpu_types[] then needs to be fixed
by developers.


+    error_append_hint(errp, ", %s", model);
+    g_free(model);
  }
  error_append_hint(errp, "\n");




Thanks,
Gavin

Re: riscv64 virt board crash upon startup

2023-09-07 Thread Laszlo Ersek

Question for Gerd below:

On 9/7/23 14:29, Philippe Mathieu-Daudé wrote:
> On 7/9/23 13:25, Laszlo Ersek wrote:
>> This is with QEMU v8.1.0-391-gc152379422a2.
>>
>> I use the command line from (scroll to the bottom):
>>
>>    https://github.com/tianocore/edk2/commit/49f06b664018
>>
>> (with "-full-screen" removed).
>>
>> The crash is as follows:
>>
>>    Unexpected error in object_property_find_err() at
>> ../../src/upstream/qemu/qom/object.c:1314:
>>    qemu: Property 'qemu-fixed-text-console.device' not found
>>    Aborted (core dumped)
> 
> Cc'ing Marc-André for commit b208f745a8
> ("ui/console: introduce different console objects")

First bad commit:

58d5870845c61cea1e7df287b86c2607b2bf48a9 is the first bad commit
commit 58d5870845c61cea1e7df287b86c2607b2bf48a9
Author: Marc-André Lureau 
Date:   Wed Aug 30 13:38:03 2023 +0400

ui/console: move graphic fields to QemuGraphicConsole

Move fields specific to graphic console to the console subclass.

qemu_console_get_head() is adapated to accomodate QemuTextConsole, and
always returns 0.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Daniel P. Berrangé 
Message-Id: <20230830093843.3531473-30-marcandre.lur...@redhat.com>

 ui/console.c | 110 ++-
 1 file changed, 64 insertions(+), 46 deletions(-)

Bisection log:

git bisect start
# status: waiting for both good and bad commits
# bad: [c152379422a204109f34ca2b43ecc538c7d738ae] Merge tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging
git bisect bad c152379422a204109f34ca2b43ecc538c7d738ae
# status: waiting for good commit(s), bad commit known
# good: [17780edd81d27fcfdb7a802efc870a99788bd2fc] Merge tag 'quick-fix-pull-request' of https://gitlab.com/bsdimp/qemu into staging
git bisect good 17780edd81d27fcfdb7a802efc870a99788bd2fc
# good: [912a9efd6bf4d808b238e17d26de2e4bb9bc4743] Merge tag 'pull-aspeed-20230901' of https://github.com/legoater/qemu into staging
git bisect good 912a9efd6bf4d808b238e17d26de2e4bb9bc4743
# bad: [6ce7b1fa8844db668f0a3c0b37b78b08d331a16a] ui/console: remove need for g_width/g_height
git bisect bad 6ce7b1fa8844db668f0a3c0b37b78b08d331a16a
# good: [6505fd8d2390e57c6a2e84f9c07b9e408ad7da76] ui/vc: move VCCharDev specific fields out of QemuConsole
git bisect good 6505fd8d2390e57c6a2e84f9c07b9e408ad7da76
# good: [7fa4b8041b870951642515e0954d274ec4d599b1] ui/console: update the head from unused QemuConsole
git bisect good 7fa4b8041b870951642515e0954d274ec4d599b1
# good: [b2bb9cc43dbb942a5333a6271629fd6094771bca] ui/vc: move text fields to QemuTextConsole
git bisect good b2bb9cc43dbb942a5333a6271629fd6094771bca
# bad: [98ee9dab81b2bc75d6ccf86681053ed80f9fc9af] ui/vc: fold text_console_do_init() in vc_chr_open()
git bisect bad 98ee9dab81b2bc75d6ccf86681053ed80f9fc9af
# bad: [58d5870845c61cea1e7df287b86c2607b2bf48a9] ui/console: move graphic fields to QemuGraphicConsole
git bisect bad 58d5870845c61cea1e7df287b86c2607b2bf48a9
# first bad commit: [58d5870845c61cea1e7df287b86c2607b2bf48a9] ui/console: move graphic fields to QemuGraphicConsole

The problem is that the commit in question didn't update 
qemu_console_is_multihead().

qemu_console_is_multihead() checks, effectively, if there is another console in 
the system that refers to *this* console's device, but under a different "head" 
number.

I don't know why qemu_console_is_multihead() used a lot of QOM trickery for 
this in the first place, but here's what I'd propose as fix -- simply try to 
locate a QemuGraphicConsole in "consoles" that references the same "device" 
that *this* QemuGraphicConsole references, but by a different "head" number.


* Patch #1 -- make "qemu_console_is_multihead" static:

diff --git a/include/ui/console.h b/include/ui/console.h
index 1ccd432b4d64..d715f88b1be2 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -506,7 +506,6 @@ bool qemu_console_is_visible(QemuConsole *con);
 bool qemu_console_is_graphic(QemuConsole *con);
 bool qemu_console_is_fixedsize(QemuConsole *con);
 bool qemu_console_is_gl_blocked(QemuConsole *con);
-bool qemu_console_is_multihead(DeviceState *dev);
 char *qemu_console_get_label(QemuConsole *con);
 int qemu_console_get_index(QemuConsole *con);
 uint32_t qemu_console_get_head(QemuConsole *con);
diff --git a/ui/console.c b/ui/console.c
index e4d61794bb2c..adacc3473140 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -2365,7 +2365,7 @@ bool qemu_console_is_gl_blocked(QemuConsole *con)
 return con->gl_block;
 }
 
-bool qemu_console_is_multihead(DeviceState *dev)
+static bool qemu_console_is_multihead(DeviceState *dev)
 {
 QemuConsole *con;
 Object *obj;


* Patch #2 -- only check QemuGraphicConsoles for referencing our "device" by a
different "head" number:

diff --git a/ui/console.c b/ui/console.c
index adacc3473140..2ee65207b430 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -2373,6 +2373,9 @@ static bool

Re: [PATCH v3 15/32] target/s390x: Use generic helper to show CPU model names

2023-09-07 Thread Gavin Shan




On 9/7/23 18:20, David Hildenbrand wrote:

On 07.09.23 02:35, Gavin Shan wrote:

For target/s390x, the CPU type name is always the combination of the
CPU modle name and suffix. The CPU model names have been correctly
shown in s390_print_cpu_model_list_entry() and create_cpu_model_list().

Use generic helper cpu_model_from_type() to show the CPU model names
in the above two functions. Besides, we need validate the CPU class
in s390_cpu_class_by_name(), as other targets do.

Signed-off-by: Gavin Shan 
---
  target/s390x/cpu_models.c    | 18 +++---
  target/s390x/cpu_models_sysemu.c |  9 -
  2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 91ce896491..103e9072b8 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -338,7 +338,8 @@ static void s390_print_cpu_model_list_entry(gpointer data, 
gpointer user_data)
  {
  const S390CPUClass *scc = S390_CPU_CLASS((ObjectClass *)data);
  CPUClass *cc = CPU_CLASS(scc);
-    char *name = g_strdup(object_class_get_name((ObjectClass *)data));
+    const char *typename = object_class_get_name((ObjectClass *)data);
+    char *model = cpu_model_from_type(typename);
  g_autoptr(GString) details = g_string_new("");
  if (scc->is_static) {
@@ -355,14 +356,12 @@ static void s390_print_cpu_model_list_entry(gpointer 
data, gpointer user_data)
  g_string_truncate(details, details->len - 2);
  }
-    /* strip off the -s390x-cpu */
-    g_strrstr(name, "-" TYPE_S390_CPU)[0] = 0;
  if (details->len) {
-    qemu_printf("s390 %-15s %-35s (%s)\n", name, scc->desc, details->str);
+    qemu_printf("s390 %-15s %-35s (%s)\n", model, scc->desc, details->str);
  } else {
-    qemu_printf("s390 %-15s %-35s\n", name, scc->desc);
+    qemu_printf("s390 %-15s %-35s\n", model, scc->desc);
  }
-    g_free(name);
+    g_free(model);
  }
  static gint s390_cpu_list_compare(gconstpointer a, gconstpointer b)
@@ -916,7 +915,12 @@ ObjectClass *s390_cpu_class_by_name(const char *name)
  oc = object_class_by_name(typename);
  g_free(typename);
-    return oc;
+    if (object_class_dynamic_cast(oc, TYPE_S390_CPU) &&
+    !object_class_is_abstract(oc)) {
+    return oc;
+    }
+
+    return NULL;


Why is that change required?



Good question. It's possible that another class's name conflicts with a
CPU class's name. For example, class "abc-base-s390x-cpu" may have been
registered for a misc class other than a CPU class. We don't want
s390_cpu_class_by_name() to return the class for "abc-base-s390x-cpu".
Almost all other targets do a similar check.

Thanks,
Gavin
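
To make the name-conflict argument above concrete, here is a minimal sketch with a toy registry. Everything in it is an assumption for illustration: SketchClass and sketch_cpu_class_by_name() are hypothetical, and the two booleans stand in for what the patch checks with object_class_dynamic_cast(oc, TYPE_S390_CPU) and object_class_is_abstract(oc).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, flattened stand-in for a registered QOM class. */
typedef struct SketchClass {
    const char *name;
    bool is_cpu;       /* would be the dynamic cast to the CPU base type */
    bool is_abstract;  /* would be object_class_is_abstract() */
} SketchClass;

/*
 * Look a class up by its full type name, but only hand it back when it
 * is a concrete CPU class. Anything else is reported as "not found".
 */
static const SketchClass *sketch_cpu_class_by_name(const SketchClass *reg,
                                                   size_t n,
                                                   const char *type_name)
{
    assert(type_name != NULL);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(reg[i].name, type_name) != 0) {
            continue;
        }
        /* Name matched: accept only non-abstract CPU classes. */
        if (reg[i].is_cpu && !reg[i].is_abstract) {
            return &reg[i];
        }
        return NULL;
    }
    return NULL;
}
```

With this shape, a non-CPU class that happens to end in the CPU suffix is rejected the same way as a name that was never registered.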

Re: [PATCH v3 15/32] target/s390x: Use generic helper to show CPU model names

2023-09-07 Thread Gavin Shan





On 9/7/23 18:31, Thomas Huth wrote:

On 07/09/2023 02.35, Gavin Shan wrote:

For target/s390x, the CPU type name is always the combination of the
CPU modle name and suffix. The CPU model names have been correctly


s/modle/model/



Thanks, will be fixed in next respin.

Thanks,
Gavin

Re: [PATCH v3 01/32] cpu: Add helper cpu_model_from_type()

2023-09-07 Thread Gavin Shan


On 9/7/23 18:54, Philippe Mathieu-Daudé wrote:

On 7/9/23 02:35, Gavin Shan wrote:

Add helper cpu_model_from_type() to extract the CPU model name from
the CPU type name in two circumstances: (1) The CPU type name is the
combination of the CPU model name and suffix. (2) The CPU type name
is same to the CPU model name.

The helper will be used in the subsequent patches to conver the


"patches to conver" -> "commits to convert"



Thanks, it will be fixed in next respin.


CPU type name to the CPU model name.

Suggested-by: Igor Mammedov 
Signed-off-by: Gavin Shan 
---
  cpu.c | 16 
  include/hw/core/cpu.h | 12 
  2 files changed, 28 insertions(+)




Thanks,
Gavin
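
The two circumstances listed in the commit message above can be sketched in plain C. This is an illustration only, not the helper from the patch: model_from_type_sketch() is a hypothetical name, the suffix is passed in explicitly instead of being built from CPU_RESOLVING_TYPE, and the check that the type is actually registered is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: derive a CPU model name from a CPU type name.
 * The caller owns the returned string and must free() it.
 */
static char *model_from_type_sketch(const char *type_name, const char *suffix)
{
    size_t tlen, slen, mlen;
    char *model;

    assert(type_name != NULL && suffix != NULL);
    tlen = strlen(type_name);
    slen = strlen(suffix);

    /* Case 2 by default: the type name is already the model name. */
    mlen = tlen;

    /* Case 1: the type name is "<model><suffix>", so drop the suffix. */
    if (tlen > slen && strcmp(type_name + tlen - slen, suffix) == 0) {
        mlen = tlen - slen;
    }

    model = malloc(mlen + 1);
    if (model == NULL) {
        return NULL;
    }
    memcpy(model, type_name, mlen);
    model[mlen] = '\0';
    return model;
}
```

Returning a freshly allocated string in both cases keeps the ownership rule uniform for callers, which matches how the patches in this series pair each call with g_free().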

Re: [PATCH 8/8] qemu-nbd: fix formatting in main()

2023-09-07 Thread Eric Blake

On Wed, Sep 06, 2023 at 11:32:10AM +0200, Denis V. Lunev wrote:
> Just a formatting, no functional changes.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> ---
> Do not really sure that this patch is mandatory, just stabs my eye. Feel free
> to drop if this is too useless.
> 
>  qemu-nbd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index b9c74ce77c..8eb1d1f40b 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -581,7 +581,8 @@ int main(int argc, char **argv)
>  pthread_t client_thread;
>  const char *fmt = NULL;
>  Error *local_err = NULL;
> -BlockdevDetectZeroesOptions detect_zeroes = 
> BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;
> +BlockdevDetectZeroesOptions detect_zeroes =
> +
> BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;

check-patch allows code up to 90 columns although it does advise
staying under 80.  You fixed the long line by keeping the wrapped
portion right-flushed to 80 columns; I think more typical tree-wide is
to just indent by four spaces (at least, that's what emacs suggests I
do).  But me changing what you wrote would be a complete rewrite, so I'm
reluctant to include it in my upcoming pull request, although I'm not
ruling out a later cleanup (perhaps if it touches more than one
stylistic thing at once).

I'm queuing 1-7 through my NBD tree, and running another round of
iotests before sending the pull request this week.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

Re: [PULL v2 00/35] ppc queue

2023-09-07 Thread Cédric Le Goater


On 9/7/23 21:10, Michael Tokarev wrote:

06.09.2023 17:36, Cédric Le Goater wrote:
...

ppc queue :

* debug facility improvements
* timebase and decrementer fixes
* record-replay fixes
* TCG fixes
* XIVE model improvements for multichip


Cédric Le Goater (4):
   ppc/xive: Use address_space routines to access the machine RAM
   ppc/xive: Introduce a new XiveRouter end_notify() handler
   ppc/xive: Handle END triggers between chips with MMIOs
   ppc/xive: Add support for the PC MMIOs

Joel Stanley (1):
   ppc: Add stub implementation of TRIG SPRs

Maksim Kostin (1):
   hw/ppc/e500: fix broken snapshot replay

Nicholas Piggin (26):
   target/ppc: Remove single-step suppression inside 0x100-0xf00
   target/ppc: Improve book3s branch trace interrupt for v2.07S
   target/ppc: Suppress single step interrupts on rfi-type instructions
   target/ppc: Implement breakpoint debug facility for v2.07S
   target/ppc: Implement watchpoint debug facility for v2.07S
   spapr: implement H_SET_MODE debug facilities
   ppc/vhyp: reset exception state when handling vhyp hcall
   ppc/vof: Fix missed fields in VOF cleanup
   hw/ppc/ppc.c: Tidy over-long lines
   hw/ppc: Introduce functions for conversion between timebase and 
nanoseconds
   host-utils: Add muldiv64_round_up
   hw/ppc: Round up the decrementer interval when converting to ns
   hw/ppc: Avoid decrementer rounding errors
   target/ppc: Sign-extend large decrementer to 64-bits
   hw/ppc: Always store the decrementer value
   target/ppc: Migrate DECR SPR
   hw/ppc: Reset timebase facilities on machine reset
   hw/ppc: Read time only once to perform decrementer write
   target/ppc: Fix CPU reservation migration for record-replay
   target/ppc: Fix timebase reset with record-replay
   spapr: Fix machine reset deadlock from replay-record
   spapr: Fix record-replay machine reset consuming too many events
   tests/avocado: boot ppc64 pseries replay-record test to Linux VFS mount
   tests/avocado: reverse-debugging cope with re-executing breakpoints
   tests/avocado: ppc64 reverse debugging tests for pseries and powernv
   target/ppc: Fix LQ, STQ register-pair order for big-endian

Richard Henderson (1):
   target/ppc: Flush inputs to zero with NJ in ppc_store_vscr

Shawn Anastasio (1):
   target/ppc: Generate storage interrupts for radix RC changes

jianchunfu (1):
   target/ppc: Fix the order of kvm_enable judgment about 
kvmppc_set_interrupt()


Is there anything in there worth picking for -stable?
Like, for example, some decrementer fixes,


The decrementer fixes are good candidates but there are quite a few
patches and you might encounter conflicts.


or some of these:

  ppc/vof: Fix missed fields in VOF cleanup
  spapr: Fix machine reset deadlock from replay-record
  hw/ppc/e500: fix broken snapshot replay


I can not tell if replay-record is important for stable. Nick ?
 

or something else?


These are :

  target/ppc: Flush inputs to zero with NJ in ppc_store_vscr
  target/ppc: Fix LQ, STQ register-pair order for big-endian

Thanks,

C.



Thanks!

/mjt

Re: [PATCH v2 0/3] block: align CoR requests to subclusters

2023-09-07 Thread Denis V. Lunev


On 9/7/23 22:11, Michael Tokarev wrote:

11.07.2023 20:25, Andrey Drobyshev via wrote:

v1 --> v2:
  * Fixed line indentation;
  * Fixed wording in a comment;
  * Added R-b.

v1: 
https://lists.nongnu.org/archive/html/qemu-block/2023-06/msg00606.html


Andrey Drobyshev (3):
   block: add subcluster_size field to BlockDriverInfo
   block/io: align requests to subcluster_size
   tests/qemu-iotests/197: add testcase for CoR with subclusters

  block.c  |  7 +
  block/io.c   | 50 ++--
  block/mirror.c   |  8 +++---
  block/qcow2.c    |  1 +
  include/block/block-common.h |  5 
  include/block/block-io.h |  8 +++---
  tests/qemu-iotests/197   | 29 +
  tests/qemu-iotests/197.out   | 24 +
  8 files changed, 99 insertions(+), 33 deletions(-)


So, given the size of the patch series and the amount of time it has
been sitting there, I'm hesitant to apply it to -stable.
The whole issue, while real, smells like a somewhat unusual case.

Any comments on this?

Thanks,

/mjt


We observed the issue in practice on a not yet released product, but
the series requires a correction, which Andrey sent today as the 4/3
patch. Unit tests are broken at the moment.

Thanks,
    Den

Re: [PATCH v2 3/3] tests/qemu-iotests/197: add testcase for CoR with subclusters

2023-09-07 Thread Andrey Drobyshev

On 9/6/23 12:43, Denis V. Lunev wrote:
> On 7/11/23 19:25, Andrey Drobyshev wrote:
>> Add testcase which checks that allocations during copy-on-read are
>> performed on the subcluster basis when subclusters are enabled in target
>> image.
>>
>> This testcase also triggers the following assert with previous commit
>> not being applied, so we check that as well:
>>
>> qemu-io: ../block/io.c:1236: bdrv_co_do_copy_on_readv: Assertion
>> `skip_bytes < pnum' failed.
>>
>> Reviewed-by: Eric Blake 
>> Reviewed-by: Denis V. Lunev 
>> Signed-off-by: Andrey Drobyshev 
>> ---
>>   tests/qemu-iotests/197 | 29 +
>>   tests/qemu-iotests/197.out | 24 
>>   2 files changed, 53 insertions(+)
>>
>> diff --git a/tests/qemu-iotests/197 b/tests/qemu-iotests/197
>> index a2547bc280..f07a9da136 100755
>> --- a/tests/qemu-iotests/197
>> +++ b/tests/qemu-iotests/197
>> @@ -122,6 +122,35 @@ $QEMU_IO -f qcow2 -C -c 'read 0 1024'
>> "$TEST_WRAP" | _filter_qemu_io
>>   $QEMU_IO -f qcow2 -c map "$TEST_WRAP"
>>   _check_test_img
>>   +echo
>> +echo '=== Copy-on-read with subclusters ==='
>> +echo
>> +
>> +# Create base and top images 64K (1 cluster) each.  Make subclusters
>> enabled
>> +# for the top image
>> +_make_test_img 64K
>> +IMGPROTO=file IMGFMT=qcow2 TEST_IMG_FILE="$TEST_WRAP" \
>> +    _make_test_img --no-opts -o extended_l2=true -F "$IMGFMT" -b
>> "$TEST_IMG" \
>> +    64K | _filter_img_create
>> +
>> +$QEMU_IO -c "write -P 0xaa 0 64k" "$TEST_IMG" | _filter_qemu_io
>> +
>> +# Allocate individual subclusters in the top image, and not the whole
>> cluster
>> +$QEMU_IO -c "write -P 0xbb 28K 2K" -c "write -P 0xcc 34K 2K"
>> "$TEST_WRAP" \
>> +    | _filter_qemu_io
>> +
>> +# Only 2 subclusters should be allocated in the top image at this point
>> +$QEMU_IMG map "$TEST_WRAP" | _filter_qemu_img_map
>> +
>> +# Actual copy-on-read operation
>> +$QEMU_IO -C -c "read -P 0xaa 30K 4K" "$TEST_WRAP" | _filter_qemu_io
>> +
>> +# And here we should have 4 subclusters allocated right in the middle
>> of the
>> +# top image. Make sure the whole cluster remains unallocated
>> +$QEMU_IMG map "$TEST_WRAP" | _filter_qemu_img_map
>> +
>> +_check_test_img
>> +
>>   # success, all done
>>   echo '*** done'
>>   status=0
>> diff --git a/tests/qemu-iotests/197.out b/tests/qemu-iotests/197.out
>> index ad414c3b0e..8f34a30afe 100644
>> --- a/tests/qemu-iotests/197.out
>> +++ b/tests/qemu-iotests/197.out
>> @@ -31,4 +31,28 @@ read 1024/1024 bytes at offset 0
>>   1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>   1 KiB (0x400) bytes allocated at offset 0 bytes (0x0)
>>   No errors were found on the image.
>> +
>> +=== Copy-on-read with subclusters ===
>> +
>> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
>> +Formatting 'TEST_DIR/t.wrap.IMGFMT', fmt=IMGFMT size=65536
>> backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
>> +wrote 65536/65536 bytes at offset 0
>> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> +wrote 2048/2048 bytes at offset 28672
>> +2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> +wrote 2048/2048 bytes at offset 34816
>> +2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> +Offset  Length  File
>> +0   0x7000  TEST_DIR/t.IMGFMT
>> +0x7000  0x800   TEST_DIR/t.wrap.IMGFMT
>> +0x7800  0x1000  TEST_DIR/t.IMGFMT
>> +0x8800  0x800   TEST_DIR/t.wrap.IMGFMT
>> +0x9000  0x7000  TEST_DIR/t.IMGFMT
>> +read 4096/4096 bytes at offset 30720
>> +4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> +Offset  Length  File
>> +0   0x7000  TEST_DIR/t.IMGFMT
>> +0x7000  0x2000  TEST_DIR/t.wrap.IMGFMT
>> +0x9000  0x7000  TEST_DIR/t.IMGFMT
>> +No errors were found on the image.
>>   *** done
> It is revealed that this patch seems to break unit tests if run for NBD
> format
> 
> iris ~/src/qemu/build/tests/qemu-iotests $ ./check -nbd 197
> QEMU  --
> "/home/den/src/qemu/build/tests/qemu-iotests/../../qemu-system-x86_64"
> -nodefaults -display none -accel qtest
> QEMU_IMG  --
> "/home/den/src/qemu/build/tests/qemu-iotests/../../qemu-img"
> QEMU_IO   --
> "/home/den/src/qemu/build/tests/qemu-iotests/../../qemu-io" --cache
> writeback --aio threads -f raw
> QEMU_NBD  --
> "/home/den/src/qemu/build/tests/qemu-iotests/../../qemu-nbd"
> IMGFMT    -- raw
> IMGPROTO  -- nbd
> PLATFORM  -- Linux/x86_64 iris 6.2.0-31-generic
> TEST_DIR  -- /home/den/src/qemu/build/tests/qemu-iotests/scratch
> SOCK_DIR  -- /tmp/tmpzw5ky8d3
> GDB_OPTIONS   --
> VALGRIND_QEMU --
> PRINT_QEMU_OUTPUT --
> 
> 197   fail   [11:41:26] [11:41:31]   5.7s   (last: 3.8s)  output
> mismatch (see
> /home/den/src/qemu/build/tests/qemu-iotests/scratch/raw-nbd-197/197.out.bad)
> --- /home/den/src/qemu/tests/qemu-iotests/197.out
> +++
>

[PATCH v2 4/3] qemu-iotests/197: use more generic commands for formats other than qcow2

2023-09-07 Thread Andrey Drobyshev via

In the previous commit e2f938265e0 ("tests/qemu-iotests/197: add
testcase for CoR with subclusters") we've introduced a new testcase for
copy-on-read with subclusters.  Test 197 always forces qcow2 as the top
image, but allows backing image to be in any format.  That last test
case didn't meet these requirements, so let's fix it by using more
generic "qemu-io -c map" command.

Signed-off-by: Andrey Drobyshev 
---
 tests/qemu-iotests/197 |  8 
 tests/qemu-iotests/197.out | 18 --
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/tests/qemu-iotests/197 b/tests/qemu-iotests/197
index f07a9da136..8ad2bdb035 100755
--- a/tests/qemu-iotests/197
+++ b/tests/qemu-iotests/197
@@ -136,18 +136,18 @@ IMGPROTO=file IMGFMT=qcow2 TEST_IMG_FILE="$TEST_WRAP" \
 $QEMU_IO -c "write -P 0xaa 0 64k" "$TEST_IMG" | _filter_qemu_io
 
 # Allocate individual subclusters in the top image, and not the whole cluster
-$QEMU_IO -c "write -P 0xbb 28K 2K" -c "write -P 0xcc 34K 2K" "$TEST_WRAP" \
+$QEMU_IO -f qcow2 -c "write -P 0xbb 28K 2K" -c "write -P 0xcc 34K 2K" "$TEST_WRAP" \
 | _filter_qemu_io
 
 # Only 2 subclusters should be allocated in the top image at this point
-$QEMU_IMG map "$TEST_WRAP" | _filter_qemu_img_map
+$QEMU_IO -f qcow2 -c map "$TEST_WRAP"
 
 # Actual copy-on-read operation
-$QEMU_IO -C -c "read -P 0xaa 30K 4K" "$TEST_WRAP" | _filter_qemu_io
+$QEMU_IO -f qcow2 -C -c "read -P 0xaa 30K 4K" "$TEST_WRAP" | _filter_qemu_io
 
 # And here we should have 4 subclusters allocated right in the middle of the
 # top image. Make sure the whole cluster remains unallocated
-$QEMU_IMG map "$TEST_WRAP" | _filter_qemu_img_map
+$QEMU_IO -f qcow2 -c map "$TEST_WRAP"
 
 _check_test_img
 
diff --git a/tests/qemu-iotests/197.out b/tests/qemu-iotests/197.out
index 8f34a30afe..86c57b51d3 100644
--- a/tests/qemu-iotests/197.out
+++ b/tests/qemu-iotests/197.out
@@ -42,17 +42,15 @@ wrote 2048/2048 bytes at offset 28672
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 2048/2048 bytes at offset 34816
 2 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-Offset  Length  File
-0   0x7000  TEST_DIR/t.IMGFMT
-0x7000  0x800   TEST_DIR/t.wrap.IMGFMT
-0x7800  0x1000  TEST_DIR/t.IMGFMT
-0x8800  0x800   TEST_DIR/t.wrap.IMGFMT
-0x9000  0x7000  TEST_DIR/t.IMGFMT
+28 KiB (0x7000) bytes not allocated at offset 0 bytes (0x0)
+2 KiB (0x800) bytes allocated at offset 28 KiB (0x7000)
+4 KiB (0x1000) bytes not allocated at offset 30 KiB (0x7800)
+2 KiB (0x800) bytes allocated at offset 34 KiB (0x8800)
+28 KiB (0x7000) bytes not allocated at offset 36 KiB (0x9000)
 read 4096/4096 bytes at offset 30720
 4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-Offset  Length  File
-0   0x7000  TEST_DIR/t.IMGFMT
-0x7000  0x2000  TEST_DIR/t.wrap.IMGFMT
-0x9000  0x7000  TEST_DIR/t.IMGFMT
+28 KiB (0x7000) bytes not allocated at offset 0 bytes (0x0)
+8 KiB (0x2000) bytes allocated at offset 28 KiB (0x7000)
+28 KiB (0x7000) bytes not allocated at offset 36 KiB (0x9000)
 No errors were found on the image.
 *** done
-- 
2.39.3
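
A note on the numbers in the expected output above: with extended_l2=true a qcow2 cluster is divided into 32 subclusters, so the 64K cluster used by this test has 2K subclusters. The 4K copy-on-read at 30K is already subcluster-aligned and fills the gap between the two 2K writes, which is why the second map reports 8 KiB allocated at 28 KiB. The rounding involved can be sketched as below; SketchRange and sketch_align_to_subcluster() are hypothetical names and this is not the code from block/io.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: an aligned byte range. */
typedef struct SketchRange {
    uint64_t offset;
    uint64_t bytes;
} SketchRange;

/*
 * Widen [offset, offset + bytes) so that both ends fall on subcluster
 * boundaries: the start is rounded down, the end is rounded up.
 */
static SketchRange sketch_align_to_subcluster(uint64_t offset, uint64_t bytes,
                                              uint64_t sc_size)
{
    SketchRange r;
    uint64_t end;

    assert(sc_size > 0 && bytes > 0);

    end = offset + bytes;
    r.offset = offset - offset % sc_size;
    end = (end + sc_size - 1) / sc_size * sc_size;
    r.bytes = end - r.offset;
    return r;
}
```

For the request in the test (offset 30K, length 4K, subcluster size 2K) the range comes back unchanged, while an unaligned request such as 2K at offset 31K would be widened to the same 4K span.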

Re: [PATCH 7/8] qemu-nbd: document -v behavior in respect to --fork in man

2023-09-07 Thread Denis V. Lunev


On 9/8/23 00:01, Eric Blake wrote:

On Wed, Sep 06, 2023 at 11:32:09AM +0200, Denis V. Lunev wrote:

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
---
  docs/tools/qemu-nbd.rst | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index faf6349ea5..5c48ee7345 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -197,7 +197,9 @@ driver options if :option:`--image-opts` is specified.
  
  .. option:: -v, --verbose
  
-  Display extra debugging information.

+  Display extra debugging information. This option also keeps opened original
+  *STDERR* stream if ``qemu-nbd`` process is daemonized due to other options
+  like :option:`--fork` or :option:`-c`.

As a native speaker, I find the following a bit easier to parse:

This option also keeps the original *STDERR* stream open if ...

I can make that touchup as part of queuing the series.

Reviewed-by: Eric Blake 


That would be great, thanks!

Re: [PATCH 7/8] qemu-nbd: document -v behavior in respect to --fork in man

2023-09-07 Thread Eric Blake

On Wed, Sep 06, 2023 at 11:32:09AM +0200, Denis V. Lunev wrote:
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> ---
>  docs/tools/qemu-nbd.rst | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
> index faf6349ea5..5c48ee7345 100644
> --- a/docs/tools/qemu-nbd.rst
> +++ b/docs/tools/qemu-nbd.rst
> @@ -197,7 +197,9 @@ driver options if :option:`--image-opts` is specified.
>  
>  .. option:: -v, --verbose
>  
> -  Display extra debugging information.
> +  Display extra debugging information. This option also keeps opened original
> +  *STDERR* stream if ``qemu-nbd`` process is daemonized due to other options
> +  like :option:`--fork` or :option:`-c`.

As a native speaker, I find the following a bit easier to parse:

This option also keeps the original *STDERR* stream open if ...

I can make that touchup as part of queuing the series.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

Re: [PATCH 6/8] qemu-nbd: Restore "qemu-nbd -v --fork" output

2023-09-07 Thread Eric Blake

On Wed, Sep 06, 2023 at 11:32:08AM +0200, Denis V. Lunev wrote:
> Closing stderr earlier is good for daemonized qemu-nbd under ssh,
> but breaks the case where -v is being used to track what is
> happening in the server, as in iotest 233.
> 
> When we know we are verbose, we should preserve original stderr and
> restore it once the setup stage is done. This commit restores the
> original behavior with -v option. In this case original output
> inside the test is kept intact.
> 
> Reported-by: Kevin Wolf 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> CC: Hanna Reitz 
> CC: Mike Maslenkin 
> Fixes: 5c56dd27a2 ("qemu-nbd: fix regression with qemu-nbd --fork run over ssh")
> ---
>  qemu-nbd.c | 24 
>  1 file changed, 20 insertions(+), 4 deletions(-)

Tested-by: Eric Blake 
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
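
The save-and-restore idea described in the commit message can be tried in isolation. The following is only a minimal sketch under stated assumptions: save_stderr() and restore_stderr() are hypothetical helper names, not functions from qemu-nbd.c, and the actual patch may structure this differently. It relies only on POSIX dup() and dup2().

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: keep a duplicate of the current stderr for later. */
static int save_stderr(void)
{
    return dup(STDERR_FILENO);   /* returns -1 on failure, errno is set */
}

/* Hypothetical helper: make stderr refer to the saved descriptor again. */
static int restore_stderr(int saved_fd)
{
    fflush(stderr);              /* push out anything buffered beforehand */
    if (dup2(saved_fd, STDERR_FILENO) < 0) {
        return -1;
    }
    close(saved_fd);             /* the duplicate is no longer needed */
    return 0;
}
```

A verbose daemon would presumably call save_stderr() before stderr is redirected during setup and restore_stderr() once setup is done; a non-verbose one would skip both and leave stderr redirected.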

Re: [PATCH 5/8] qemu-nbd: invent nbd_client_release_pipe() helper

2023-09-07 Thread Eric Blake

On Wed, Sep 06, 2023 at 11:32:07AM +0200, Denis V. Lunev wrote:
> Move the code from main() and nbd_client_thread() into the specific
> helper. This code is going to grow.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> ---
>  qemu-nbd.c | 23 ---
>  1 file changed, 12 insertions(+), 11 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

[PATCH 1/1] block: improve alignment detection and fix 271 test

2023-09-07 Thread Denis V. Lunev

Unfortunately, iotest 271 is broken when run in non-cached mode.

Commits
commit a6b257a08e3d72219f03e461a52152672fec0612
Author: Nir Soffer 
Date:   Tue Aug 13 21:21:03 2019 +0300
file-posix: Handle undetectable alignment
and
commit 9c60a5d1978e6dcf85c0e01b50e6f7f54ca09104
Author: Kevin Wolf 
Date:   Thu Jul 16 16:26:00 2020 +0200
block: Require aligned image size to avoid assertion failure
have an interesting side effect when used together.

If the image size is not a multiple of 4k and the image falls under
the original constraints of Nir's patch, the image cannot be opened
due to the check in bdrv_check_perm().

The patch tries to satisfy the requirements of bdrv_check_perm()
inside raw_probe_alignment(). In my opinion, this is better than just
disallowing that test in non-cached mode. The operation is legal
in itself.

Signed-off-by: Denis V. Lunev 
CC: Nir Soffer 
CC: Kevin Wolf 
CC: Hanna Reitz 
CC: Alberto Garcia 
---
 block/file-posix.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index b16e9c21a1..988cfdc76c 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -447,8 +447,21 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
 for (i = 0; i < ARRAY_SIZE(alignments); i++) {
 align = alignments[i];
 if (raw_is_io_aligned(fd, buf, align)) {
-/* Fallback to safe value. */
-bs->bl.request_alignment = (align != 1) ? align : max_align;
+if (align != 1) {
+bs->bl.request_alignment = align;
+break;
+}
+/*
+ * Fallback to safe value. max_align is perfect, but the size of the device must be multiple of
+ * the virtual length of the device. In the other case we will get a error in
+ * bdrv_node_refresh_perm().
+ */
+for (align = max_align; align > 1; align /= 2) {
+if ((bs->total_sectors * BDRV_SECTOR_SIZE) % align == 0) {
+break;
+}
+}
+bs->bl.request_alignment = align;
 break;
 }
 }
-- 
2.34.1
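
The fallback loop added by the hunk above can be exercised outside QEMU. This is a sketch only: pick_request_alignment() is a hypothetical stand-alone function, and it takes the image size in bytes directly instead of reading bs->total_sectors. It assumes max_align is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the fallback loop: starting from
 * max_align, halve the candidate until the image size is a multiple of it.
 * Returns 1 when no larger power of two divides the size.
 */
static uint32_t pick_request_alignment(uint64_t image_bytes, uint32_t max_align)
{
    uint32_t align;

    /* max_align is expected to be a non-zero power of two, e.g. 4096. */
    assert(max_align != 0 && (max_align & (max_align - 1)) == 0);

    for (align = max_align; align > 1; align /= 2) {
        if (image_bytes % align == 0) {
            break;
        }
    }
    return align;
}
```

With max_align 4096, a 12288-byte image keeps 4096, a 1536-byte image drops to 512, and an odd-sized image ends up at 1.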

Re: [PATCH 4/8] qemu-nbd: put saddr into struct NbdClientOpts

2023-09-07 Thread Eric Blake

On Wed, Sep 06, 2023 at 11:32:06AM +0200, Denis V. Lunev wrote:
> We pass other parameters into nbd_client_thread() in this way. This patch
> makes the code more consistent.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> ---
>  qemu-nbd.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

Re: [PATCH 3/8] qemu-nbd: move srcpath into struct NbdClientOpts

2023-09-07 Thread Eric Blake

On Wed, Sep 06, 2023 at 11:32:05AM +0200, Denis V. Lunev wrote:
> We pass other parameters into nbd_client_thread() in this way. This patch
> makes the code more consistent.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> ---
>  qemu-nbd.c | 17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 

> @@ -1059,19 +1060,19 @@ int main(int argc, char **argv)
>  bdrv_init();
>  atexit(qemu_nbd_shutdown);
>  
> -srcpath = argv[optind];
> +opts.srcpath = argv[optind];
>  if (imageOpts) {
> -QemuOpts *opts;
> +QemuOpts *o;
>  if (fmt) {
>  error_report("--image-opts and -f are mutually exclusive");
>  exit(EXIT_FAILURE);
>  }
> -opts = qemu_opts_parse_noisily(&file_opts, srcpath, true);
> -if (!opts) {
> +o = qemu_opts_parse_noisily(&file_opts, opts.srcpath, true);
> +if (!o) {

Hmm - this would have been flagged by -Wshadow, and there are other
series working to clean up tree-wide issues that shadowing can cause.
Looking again, the shadowing was previously introduced before this
series, but only when HAVE_NBD_DEVICE was defined; then patch 2/8 made
the shadowing unconditional.  Reworking the series to clean up the
shadowing earlier in 2/8 is just churn, so I don't mind that it took
us to this point to notice it; however, I'm inclined to add a note to
the commit message that it is a (happy) side-effect.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
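
To illustrate the shadowing point, here is a small sketch; struct client_opts and srcpath_via_inner_scope() are hypothetical stand-ins, not code from qemu-nbd.c. It shows the renamed form: the inner variable gets its own name so the outer struct stays reachable inside the block.

```c
/* Hypothetical stand-in for struct NbdClientOpts. */
struct client_opts {
    const char *srcpath;
};

static const char *srcpath_via_inner_scope(const char *path)
{
    struct client_opts opts = { .srcpath = path };

    {
        /*
         * Had this inner variable also been named 'opts', it would hide
         * the struct above, 'opts.srcpath' would no longer compile here,
         * and -Wshadow would flag the declaration. A distinct name avoids
         * both problems.
         */
        const char *o = opts.srcpath;
        return o;
    }
}
```

This appears to be why the rename to 'o' in the hunk above is needed once srcpath lives in the outer struct, in addition to keeping -Wshadow quiet.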

Re: [PATCH 2/8] qemu-nbd: define struct NbdClientOpts when HAVE_NBD_DEVICE is not defined

2023-09-07 Thread Eric Blake

On Wed, Sep 06, 2023 at 11:32:04AM +0200, Denis V. Lunev wrote:
> This patch also drops definition of some locals in main() to avoid
> useless data copy.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> ---
>  qemu-nbd.c | 60 --
>  1 file changed, 27 insertions(+), 33 deletions(-)

> @@ -519,7 +519,6 @@ int main(int argc, char **argv)
>  const char *bindto = NULL;
>  const char *port = NULL;
>  char *sockpath = NULL;
> -char *device = NULL;
>  QemuOpts *sn_opts = NULL;
>  const char *sn_id_or_name = NULL;
>  const char *sopt = "hVb:o:p:rsnc:dvk:e:f:tl:x:T:D:AB:L";
> @@ -582,16 +581,16 @@ int main(int argc, char **argv)
>  const char *tlshostname = NULL;
>  bool imageOpts = false;
>  bool writethrough = false; /* Client will flush as needed. */
> -bool verbose = false;
> -bool fork_process = false;
>  bool list = false;
>  unsigned socket_activation;
>  const char *pid_file_name = NULL;
>  const char *selinux_label = NULL;
>  BlockExportOptions *export_opts;
> -#if HAVE_NBD_DEVICE
> -struct NbdClientOpts opts;
> -#endif
> +struct NbdClientOpts opts = {
> +.fork_process = false,
> +.verbose = false,
> +.device = NULL,
> +};

Could also do 'struct NbdClientOpts opts = {};' since you happen to be
zero-initializing, but this may not remain the case if more fields get
added to the struct, so I'm fine leaving it as written.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
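
The zero-initialization remark can be checked with a short sketch. The struct below is a hypothetical stand-in for NbdClientOpts, and 'later_field' is an invented member that plays the role of a field added to the struct later.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the real struct. */
struct nbd_client_opts {
    bool fork_process;
    bool verbose;
    char *device;
    int later_field;     /* invented: a member added after the initializer was written */
};

static struct nbd_client_opts make_default_opts(void)
{
    /* Members not named in a designated initializer are zero-initialized. */
    struct nbd_client_opts opts = {
        .fork_process = false,
        .verbose = false,
        .device = NULL,
    };
    return opts;
}
```

Note that the empty form '= {}' is a GNU extension before C23; '= {0}' is the portable spelling. The explicit form above keeps the intended defaults visible if a non-zero default is ever needed.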

Re: [RFC 1/3] hmp: avoid the nested event loop in handle_hmp_command()

2023-09-07 Thread Stefan Hajnoczi

On Thu, 7 Sept 2023 at 16:53, Dr. David Alan Gilbert  wrote:
>
> * Stefan Hajnoczi (stefa...@gmail.com) wrote:
> > On Thu, 7 Sept 2023 at 10:07, Dr. David Alan Gilbert  wrote:
> > >
> > > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > > On Thu, Sep 07, 2023 at 01:06:39AM +, Dr. David Alan Gilbert wrote:
> > > > > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > > > > Coroutine HMP commands currently run to completion in a nested event
> > > > > > loop with the Big QEMU Lock (BQL) held. The call_rcu thread also uses
> > > > > > the BQL and cannot process work while the coroutine monitor command is
> > > > > > running. A deadlock occurs when monitor commands attempt to wait for
> > > > > > call_rcu work to finish.
> > > > >
> > > > > I hate to think if there's anywhere else that ends up doing that
> > > > > other than the monitors.
> > > >
> > > > Luckily drain_call_rcu() has few callers: just
> > > > xen_block_device_destroy() and qmp_device_add(). We only need to worry
> > > > about their call stacks.
> > > >
> > > > I haven't looked at the Xen code.
> > > >
> > > > >
> > > > > But, not knowing the semantics of the rcu code, it looks kind of OK to
> > > > > me from the monitor.
> > > > >
> > > > > (Do you ever get anything like qemu quitting from one of the other
> > > > > monitors while this coroutine hasn't been run?)
> > > >
> > > > Not sure what you mean?
> > >
> > > Imagine that just after you create your coroutine, a vCPU does a
> > > shutdown and qemu is configured to quit, or on another monitor someone
> > > does a quit;  does your coroutine get executed or not?
> >
> > I think the answer is that it depends.
> >
> > A coroutine can run for a while and then yield while waiting for a
> > timer, BH, fd handler, etc. If the coroutine has yielded then I think
> > QEMU could terminate.
> >
> > The behavior of entering a coroutine for the first time depends on the
> > API that is used (e.g. qemu_coroutine_enter()/aio_co_enter()/etc).
> > qemu_coroutine_enter() is immediate but aio_co_enter() contains
> > indirect code paths like scheduling a BH.
> >
> > To summarize: ¯\_(ツ)_/¯
>
> That does mean you leave your g_new'd data and qdict allocated at
> exit - meh
>
> I'm not sure if it means you're making any other assumptions about what
> happens if the coroutine gets run during the exit path; although I guess
> there are plenty of other cases like that.

Yes, I think QEMU has some resources (memory, file descriptors, etc)
that are not freed on exit.

Stefan

>
> Dave
>
> > Stefan
> >
> > >
> > > Dave
> > >
> > > > Stefan
> > > >
> > > > >
> > > > > Dave
> > > > >
> > > > > > This patch refactors the HMP monitor to use the existing event loop
> > > > > > instead of creating a nested event loop. This will allow the next
> > > > > > patches to rely on draining call_rcu work.
> > > > > >
> > > > > > Signed-off-by: Stefan Hajnoczi 
> > > > > > ---
> > > > > >  monitor/hmp.c | 28 +++-
> > > > > >  1 file changed, 15 insertions(+), 13 deletions(-)
> > > > > >
> > > > > > diff --git a/monitor/hmp.c b/monitor/hmp.c
> > > > > > index 69c1b7e98a..6cff2810aa 100644
> > > > > > --- a/monitor/hmp.c
> > > > > > +++ b/monitor/hmp.c
> > > > > > @@ -,15 +,17 @@ typedef struct HandleHmpCommandCo {
> > > > > >  Monitor *mon;
> > > > > >  const HMPCommand *cmd;
> > > > > >  QDict *qdict;
> > > > > > -bool done;
> > > > > >  } HandleHmpCommandCo;
> > > > > >
> > > > > > -static void handle_hmp_command_co(void *opaque)
> > > > > > +static void coroutine_fn handle_hmp_command_co(void *opaque)
> > > > > >  {
> > > > > >  HandleHmpCommandCo *data = opaque;
> > > > > > +
> > > > > >  handle_hmp_command_exec(data->mon, data->cmd, data->qdict);
> > > > > >  monitor_set_cur(qemu_coroutine_self(), NULL);
> > > > > > -data->done = true;
> > > > > > +qobject_unref(data->qdict);
> > > > > > +monitor_resume(data->mon);
> > > > > > +g_free(data);
> > > > > >  }
> > > > > >
> > > > > >  void handle_hmp_command(MonitorHMP *mon, const char *cmdline)
> > > > > > @@ -1157,20 +1159,20 @@ void handle_hmp_command(MonitorHMP *mon, 
> > > > > > const char *cmdline)
> > > > > >  Monitor *old_mon = monitor_set_cur(qemu_coroutine_self(), 
> > > > > > >common);
> > > > > >  handle_hmp_command_exec(>common, cmd, qdict);
> > > > > >  monitor_set_cur(qemu_coroutine_self(), old_mon);
> > > > > > +qobject_unref(qdict);
> > > > > >  } else {
> > > > > > -HandleHmpCommandCo data = {
> > > > > > -.mon = >common,
> > > > > > -.cmd = cmd,
> > > > > > -.qdict = qdict,
> > > > > > -.done = false,
> > > > > > -};
> > > > > > -Coroutine *co = 
> > > > > > qemu_coroutine_create(handle_hmp_command_co, );
> > > > > > +HandleHmpCommandCo *data; /* freed by 
> > > > > > handle_hmp_command_co() */
> > > > > > +
> > > > > > +data =

[PATCH v3 1/2] block: add BDRV_BLOCK_COMPRESSED flag for bdrv_block_status()

2023-09-07 Thread Andrey Drobyshev via

Functions qcow2_get_host_offset(), get_cluster_offset(),
vmdk_co_block_status() explicitly report compressed cluster types when data
is compressed.  However, this information is never passed further.  Let's
make use of it by adding new BDRV_BLOCK_COMPRESSED flag for
bdrv_block_status(), so that caller may know that the data range is
compressed.  In particular, we're going to use this flag to tweak
"qemu-img map" output.

This new flag is only being utilized by qcow, qcow2 and vmdk formats, as only
those support compression.

Reviewed-by: Denis V. Lunev 
Reviewed-by: Hanna Czenczek 
Signed-off-by: Andrey Drobyshev 
---
 block/qcow.c | 5 -
 block/qcow2.c| 3 +++
 block/vmdk.c | 2 ++
 include/block/block-common.h | 3 +++
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/block/qcow.c b/block/qcow.c
index 577bd70324..d56d24ab6d 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -549,7 +549,10 @@ qcow_co_block_status(BlockDriverState *bs, bool want_zero,
 if (!cluster_offset) {
 return 0;
 }
-if ((cluster_offset & QCOW_OFLAG_COMPRESSED) || s->crypto) {
+if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
+return BDRV_BLOCK_DATA | BDRV_BLOCK_COMPRESSED;
+}
+if (s->crypto) {
 return BDRV_BLOCK_DATA;
 }
 *map = cluster_offset | index_in_cluster;
diff --git a/block/qcow2.c b/block/qcow2.c
index b48cd9ce63..b81dc5066b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2162,6 +2162,9 @@ qcow2_co_block_status(BlockDriverState *bs, bool want_zero, int64_t offset,
 {
 status |= BDRV_BLOCK_RECURSE;
 }
+if (type == QCOW2_SUBCLUSTER_COMPRESSED) {
+status |= BDRV_BLOCK_COMPRESSED;
+}
 return status;
 }
 
diff --git a/block/vmdk.c b/block/vmdk.c
index 70066c2b01..56b3d5151d 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1770,6 +1770,8 @@ vmdk_co_block_status(BlockDriverState *bs, bool want_zero,
 if (extent->flat) {
 ret |= BDRV_BLOCK_RECURSE;
 }
+} else {
+ret |= BDRV_BLOCK_COMPRESSED;
 }
 *file = extent->file->bs;
 break;
diff --git a/include/block/block-common.h b/include/block/block-common.h
index df5ffc8d09..5b5ae07c93 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -287,6 +287,8 @@ typedef enum {
  *   layer rather than any backing, set by block layer
  * BDRV_BLOCK_EOF: the returned pnum covers through end of file for this
  * layer, set by block layer
+ * BDRV_BLOCK_COMPRESSED: the underlying data is compressed; only valid for
+ *the formats supporting compression: qcow, qcow2
  *
  * Internal flags:
  * BDRV_BLOCK_RAW: for use by passthrough drivers, such as raw, to request
@@ -322,6 +324,7 @@ typedef enum {
 #define BDRV_BLOCK_ALLOCATED0x10
 #define BDRV_BLOCK_EOF  0x20
 #define BDRV_BLOCK_RECURSE  0x40
+#define BDRV_BLOCK_COMPRESSED   0x80
 
 typedef QTAILQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) BlockReopenQueue;
 
-- 
2.39.3
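
A caller-side check for the new flag might look like the sketch below. range_is_compressed() is a hypothetical helper, not part of the patch. The value of BDRV_BLOCK_COMPRESSED is taken from the hunk above; BDRV_BLOCK_DATA is assumed to be 0x01, which is not shown in this patch. The sketch also assumes the usual convention that a negative status is an error code.

```c
#include <stdbool.h>

/* BDRV_BLOCK_COMPRESSED is copied from the hunk above. */
#define BDRV_BLOCK_COMPRESSED   0x80
/* Assumption: BDRV_BLOCK_DATA is 0x01; the patch does not show its value. */
#define BDRV_BLOCK_DATA         0x01

/* Hypothetical helper: true when a valid status reports compressed data. */
static bool range_is_compressed(int status)
{
    /* Treat negative values as errors, never as flag sets. */
    return status >= 0 && (status & BDRV_BLOCK_COMPRESSED) != 0;
}
```

A tool such as "qemu-img map" could use a test of this shape to decide what to print for a range, though the exact code in patch 2/2 is not shown here.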

[PATCH v3 0/2] qemu-img: map: implement support for compressed clusters

2023-09-07 Thread Andrey Drobyshev via

v2 --> v3:
  * Make "compressed" field mandatory, not optional;
  * Adjust field description in qapi/block-core.json;
  * Squash patch 3 into patch 2 so that failing tests don't break bisect;
  * Update even more tests' outputs now that the field is mandatory.

v2: https://lists.nongnu.org/archive/html/qemu-block/2023-07/msg00106.html

Andrey Drobyshev (2):
  block: add BDRV_BLOCK_COMPRESSED flag for bdrv_block_status()
  qemu-img: map: report compressed data blocks

 block/qcow.c  |   5 +-
 block/qcow2.c |   3 +
 block/vmdk.c  |   2 +
 include/block/block-common.h  |   3 +
 qapi/block-core.json  |   7 +-
 qemu-img.c|   8 +-
 tests/qemu-iotests/122.out|  84 +-
 tests/qemu-iotests/146.out| 780 +-
 tests/qemu-iotests/154.out| 194 ++---
 tests/qemu-iotests/179.out| 178 ++--
 tests/qemu-iotests/209.out|   4 +-
 tests/qemu-iotests/221.out|  16 +-
 tests/qemu-iotests/223.out|  60 +-
 tests/qemu-iotests/241.out|  10 +-
 tests/qemu-iotests/244.out|  24 +-
 tests/qemu-iotests/252.out|  10 +-
 tests/qemu-iotests/253.out|  20 +-
 tests/qemu-iotests/274.out|  48 +-
 .../tests/nbd-qemu-allocation.out |  16 +-
 tests/qemu-iotests/tests/qemu-img-bitmaps.out |  24 +-
 20 files changed, 757 insertions(+), 739 deletions(-)

-- 
2.39.3

Re: [PATCH] hw/display/xlnx_dp: update comments

2023-09-07 Thread Michael Tokarev


07.09.2023 23:34, Michael Tokarev wrote:


--- a/hw/display/xlnx_dp.c
+++ b/hw/display/xlnx_dp.c
@@ -1,4 +1,4 @@
-/*
+?*
   * Xilinx Display Port


Without this glitch ofc, - already fixed.

/mjt

Re: [RFC 1/3] hmp: avoid the nested event loop in handle_hmp_command()

2023-09-07 Thread Dr. David Alan Gilbert

* Stefan Hajnoczi (stefa...@gmail.com) wrote:
> On Thu, 7 Sept 2023 at 10:07, Dr. David Alan Gilbert  wrote:
> >
> > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > On Thu, Sep 07, 2023 at 01:06:39AM +, Dr. David Alan Gilbert wrote:
> > > > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > > > Coroutine HMP commands currently run to completion in a nested event
> > > > > loop with the Big QEMU Lock (BQL) held. The call_rcu thread also uses
> > > > > the BQL and cannot process work while the coroutine monitor command is
> > > > > running. A deadlock occurs when monitor commands attempt to wait for
> > > > > call_rcu work to finish.
> > > >
> > > > I hate to think if there's anywhere else that ends up doing that
> > > > other than the monitors.
> > >
> > > Luckily drain_call_rcu() has few callers: just
> > > xen_block_device_destroy() and qmp_device_add(). We only need to worry
> > > about their call stacks.
> > >
> > > I haven't looked at the Xen code.
> > >
> > > >
> > > > But, not knowing the semantics of the rcu code, it looks kind of OK to
> > > > me from the monitor.
> > > >
> > > > (Do you ever get anything like qemu quitting from one of the other
> > > > monitors while this coroutine hasn't been run?)
> > >
> > > Not sure what you mean?
> >
> > Imagine that just after you create your coroutine, a vCPU does a
> > shutdown and qemu is configured to quit, or on another monitor someone
> > does a quit;  does your coroutine get executed or not?
> 
> I think the answer is that it depends.
> 
> A coroutine can run for a while and then yield while waiting for a
> timer, BH, fd handler, etc. If the coroutine has yielded then I think
> QEMU could terminate.
> 
> The behavior of entering a coroutine for the first time depends on the
> API that is used (e.g. qemu_coroutine_enter()/aio_co_enter()/etc).
> qemu_coroutine_enter() is immediate but aio_co_enter() contains
> indirect code paths like scheduling a BH.
> 
> To summarize: ¯\_(ツ)_/¯

That does mean you leave your g_new'd data and qdict allocated at
exit - meh

I'm not sure if it means you're making any other assumptions about what
happens if the coroutine gets run during the exit path; although I guess
there are plenty of other cases like that.

Dave

> Stefan
> 
> >
> > Dave
> >
> > > Stefan
> > >
> > > >
> > > > Dave
> > > >
> > > > > This patch refactors the HMP monitor to use the existing event loop
> > > > > instead of creating a nested event loop. This will allow the next
> > > > > patches to rely on draining call_rcu work.
> > > > >
> > > > > Signed-off-by: Stefan Hajnoczi 
> > > > > ---
> > > > >  monitor/hmp.c | 28 +++-
> > > > >  1 file changed, 15 insertions(+), 13 deletions(-)
> > > > >
> > > > > diff --git a/monitor/hmp.c b/monitor/hmp.c
> > > > > index 69c1b7e98a..6cff2810aa 100644
> > > > > --- a/monitor/hmp.c
> > > > > +++ b/monitor/hmp.c
> > > > > @@ -,15 +,17 @@ typedef struct HandleHmpCommandCo {
> > > > >  Monitor *mon;
> > > > >  const HMPCommand *cmd;
> > > > >  QDict *qdict;
> > > > > -bool done;
> > > > >  } HandleHmpCommandCo;
> > > > >
> > > > > -static void handle_hmp_command_co(void *opaque)
> > > > > +static void coroutine_fn handle_hmp_command_co(void *opaque)
> > > > >  {
> > > > >  HandleHmpCommandCo *data = opaque;
> > > > > +
> > > > >  handle_hmp_command_exec(data->mon, data->cmd, data->qdict);
> > > > >  monitor_set_cur(qemu_coroutine_self(), NULL);
> > > > > -data->done = true;
> > > > > +qobject_unref(data->qdict);
> > > > > +monitor_resume(data->mon);
> > > > > +g_free(data);
> > > > >  }
> > > > >
> > > > >  void handle_hmp_command(MonitorHMP *mon, const char *cmdline)
> > > > > @@ -1157,20 +1159,20 @@ void handle_hmp_command(MonitorHMP *mon, 
> > > > > const char *cmdline)
> > > > >  Monitor *old_mon = monitor_set_cur(qemu_coroutine_self(), 
> > > > > >common);
> > > > >  handle_hmp_command_exec(>common, cmd, qdict);
> > > > >  monitor_set_cur(qemu_coroutine_self(), old_mon);
> > > > > +qobject_unref(qdict);
> > > > >  } else {
> > > > > -HandleHmpCommandCo data = {
> > > > > -.mon = >common,
> > > > > -.cmd = cmd,
> > > > > -.qdict = qdict,
> > > > > -.done = false,
> > > > > -};
> > > > > -Coroutine *co = qemu_coroutine_create(handle_hmp_command_co, 
> > > > > );
> > > > > +HandleHmpCommandCo *data; /* freed by 
> > > > > handle_hmp_command_co() */
> > > > > +
> > > > > +data = g_new(HandleHmpCommandCo, 1);
> > > > > +data->mon = >common;
> > > > > +data->cmd = cmd;
> > > > > +data->qdict = qdict; /* freed by handle_hmp_command_co() */
> > > > > +
> > > > > +Coroutine *co = qemu_coroutine_create(handle_hmp_command_co, 
> > > > > data);
> > > > > +monitor_suspend(>common); /* resumed by 
> > > > > handle_hmp_command_co() */
> > > > >  monitor_set_cur(co,

Re: [PATCH 1/8] qemu-nbd: improve error message for dup2 error

2023-09-07 Thread Eric Blake

On Wed, Sep 06, 2023 at 11:32:03AM +0200, Denis V. Lunev wrote:
> This error is happened when we are not able to close the pipe to the

s/is happened when/happens if/

> parent (to trace errors in the child process) and assign stderr to
> /dev/null as required by the daemonizing convention.
> 
> Signed-off-by: Denis V. Lunev 
> Suggested-by: Eric Blake 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> ---
>  qemu-nbd.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index aaccaa3318..4575e4291e 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -324,7 +324,7 @@ static void *nbd_client_thread(void *arg)
>  } else {
>  /* Close stderr so that the qemu-nbd process exits.  */
>  if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
> -error_report("Could not set stderr to /dev/null: %s",
> +error_report("Could not release pipe to parent: %s",
>   strerror(errno));
>  exit(EXIT_FAILURE);
>  }
> @@ -1181,7 +1181,7 @@ int main(int argc, char **argv)
>  
>  if (fork_process) {
>  if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
> -error_report("Could not set stderr to /dev/null: %s",
> +error_report("Could not release pipe to parent: %s",
>   strerror(errno));
>  exit(EXIT_FAILURE);
>  }
> -- 
> 2.34.1
>

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

Re: [PATCH v3 4/4] io: follow coroutine AioContext in qio_channel_yield()

2023-09-07 Thread Eric Blake

On Wed, Aug 30, 2023 at 06:48:02PM -0400, Stefan Hajnoczi wrote:
> The ongoing QEMU multi-queue block layer effort makes it possible for multiple
> threads to process I/O in parallel. The nbd block driver is not compatible with
> the multi-queue block layer yet because QIOChannel cannot be used easily from
> coroutines running in multiple threads. This series changes the QIOChannel API
> to make that possible.
> 
...
> 
> This API change allows the nbd block driver to use QIOChannel from any thread.
> It's important to keep in mind that the block driver already synchronizes
> QIOChannel access and ensures that two coroutines never read simultaneously or
> write simultaneously.
> 
> This patch updates all users of qio_channel_attach_aio_context() to the
> new API. Most conversions are simple, but vhost-user-server requires a
> new qemu_coroutine_yield() call to quiesce the vu_client_trip()
> coroutine when not attached to any AioContext.
> 
> While the API has become simpler, there is one wart: QIOChannel has a
> special case for the iohandler AioContext (used for handlers that must not run
> in nested event loops). I didn't find an elegant way to preserve that behavior,
> so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true|false)
> for opting in to the new AioContext model. By default QIOChannel uses the
> iohandler AioHandler. Code that formerly called
> qio_channel_attach_aio_context() now calls
> qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is
> created.
> 
> Signed-off-by: Stefan Hajnoczi 
> Reviewed-by: Eric Blake 
> Acked-by: Daniel P. Berrangé 
> ---
>  include/io/channel-util.h|  23 ++
>  include/io/channel.h |  69 --
>  include/qemu/vhost-user-server.h |   1 +
>  block/nbd.c  |  11 +--
>  io/channel-command.c |  10 ++-
>  io/channel-file.c|   9 ++-
>  io/channel-null.c|   3 +-
>  io/channel-socket.c  |   9 ++-
>  io/channel-tls.c |   6 +-
>  io/channel-util.c|  24 +++
>  io/channel.c | 120 ++-
>  migration/channel-block.c|   3 +-
>  nbd/server.c |  14 +---
>  scsi/qemu-pr-helper.c|   4 +-
>  util/vhost-user-server.c |  27 +--
>  15 files changed, 216 insertions(+), 117 deletions(-)

Looks like migration/rdma.c is also impacted:

../migration/rdma.c: In function ‘qio_channel_rdma_class_init’:
../migration/rdma.c:4037:38: error: assignment to ‘void (*)(QIOChannel *, AioContext *, void (*)(void *), AioContext *, void (*)(void *), void *)’ from incompatible pointer type ‘void (*)(QIOChannel *, AioContext *, void (*)(void *), void (*)(void *), void *)’ [-Werror=incompatible-pointer-types]
 4037 | ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler;
      |  ^

I'm squashing this in:

diff --git i/migration/rdma.c w/migration/rdma.c
index ca430d319d9..a2a3db35b1d 100644
--- i/migration/rdma.c
+++ w/migration/rdma.c
@@ -3103,22 +3103,23 @@ static GSource *qio_channel_rdma_create_watch(QIOChannel *ioc,
 }

 static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
-  AioContext *ctx,
-  IOHandler *io_read,
-  IOHandler *io_write,
-  void *opaque)
+AioContext *read_ctx,
+IOHandler *io_read,
+AioContext *write_ctx,
+IOHandler *io_write,
+void *opaque)
 {
 QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
 if (io_read) {
-aio_set_fd_handler(ctx, rioc->rdmain->recv_comp_channel->fd, io_read,
-   io_write, NULL, NULL, opaque);
-aio_set_fd_handler(ctx, rioc->rdmain->send_comp_channel->fd, io_read,
-   io_write, NULL, NULL, opaque);
+aio_set_fd_handler(read_ctx, rioc->rdmain->recv_comp_channel->fd,
+   io_read, io_write, NULL, NULL, opaque);
+aio_set_fd_handler(read_ctx, rioc->rdmain->send_comp_channel->fd,
+   io_read, io_write, NULL, NULL, opaque);
 } else {
-aio_set_fd_handler(ctx, rioc->rdmaout->recv_comp_channel->fd, io_read,
-   io_write, NULL, NULL, opaque);
-aio_set_fd_handler(ctx, rioc->rdmaout->send_comp_channel->fd, io_read,
-   io_write, NULL, NULL, opaque);
+aio_set_fd_handler(write_ctx, rioc->rdmaout->recv_comp_channel->fd,
+   io_read, io_write, NULL, NULL, opaque);
+

[PATCH] hw/display/xlnx_dp: update comments

2023-09-07 Thread Michael Tokarev

From: Peter Maydell 

Clarify somewhat misleading code comments.

Signed-off-by: Michael Tokarev 
---
 hw/display/xlnx_dp.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

Peter, this is the result of your suggestions in this area.
Since it's entirely your wording, can I specify your
authorship too?

diff --git a/hw/display/xlnx_dp.c b/hw/display/xlnx_dp.c
index 822355ecc6..300daf9d9b 100644
--- a/hw/display/xlnx_dp.c
+++ b/hw/display/xlnx_dp.c
@@ -1,4 +1,4 @@
-/*
+?*
  * Xilinx Display Port
  *
  *  Copyright (C) 2015 : GreenSocs Ltd
@@ -380,13 +380,16 @@ static inline void xlnx_dp_audio_mix_buffer(XlnxDPState *s)
 static void xlnx_dp_audio_callback(void *opaque, int avail)
 {
 /*
- * Get some data from the DPDMA and compute these data.
- * Then wait for QEMU's audio subsystem to call this callback.
+ * Get the individual left and right audio streams from the DPDMA,
+ * and fill the output buffer with the combined stereo audio data
+ * adjusted by the volume controls.
+ * QEMU's audio subsystem will call this callback repeatedly;
+ * we return the data from the output buffer until it is emptied,
+ * and then we will read data from the DPDMA again.
  */
 XlnxDPState *s = XLNX_DP(opaque);
 size_t written = 0;
 
-/* If there are already some data don't get more data. */
 if (s->byte_left == 0) {
 s->audio_data_available[0] = xlnx_dpdma_start_operation(s->dpdma, 4,
   true);
-- 
2.39.2

Re: [PATCH v2 7/7] vhost-user: call VHOST_USER_SET_VRING_ENABLE synchronously

2023-09-07 Thread Laszlo Ersek

On 9/7/23 17:59, Eugenio Perez Martin wrote:
> On Wed, Aug 30, 2023 at 3:41 PM Laszlo Ersek  wrote:
>>
>> (1) The virtio-1.2 specification
>>  writes:
>>
>>> 3 General Initialization And Device Operation
>>> 3.1   Device Initialization
>>> 3.1.1 Driver Requirements: Device Initialization
>>>
>>> [...]
>>>
>>> 7. Perform device-specific setup, including discovery of virtqueues for
>>>the device, optional per-bus setup, reading and possibly writing the
>>>device’s virtio configuration space, and population of virtqueues.
>>>
>>> 8. Set the DRIVER_OK status bit. At this point the device is “live”.
>>
>> and
>>
>>> 4 Virtio Transport Options
>>> 4.1   Virtio Over PCI Bus
>>> 4.1.4 Virtio Structure PCI Capabilities
>>> 4.1.4.3   Common configuration structure layout
>>> 4.1.4.3.2 Driver Requirements: Common configuration structure layout
>>>
>>> [...]
>>>
>>> The driver MUST configure the other virtqueue fields before enabling the
>>> virtqueue with queue_enable.
>>>
>>> [...]
>>
>> (The same statements are present in virtio-1.0 identically, at
>> .)
>>
>> These together mean that the following sub-sequence of steps is valid for
>> a virtio-1.0 guest driver:
>>
>> (1.1) set "queue_enable" for the needed queues as the final part of device
>> initialization step (7),
>>
>> (1.2) set DRIVER_OK in step (8),
>>
>> (1.3) immediately start sending virtio requests to the device.
>>
>> (2) When vhost-user is enabled, and the VHOST_USER_F_PROTOCOL_FEATURES
>> special virtio feature is negotiated, then virtio rings start in disabled
>> state, according to
>> .
>> In this case, explicit VHOST_USER_SET_VRING_ENABLE messages are needed for
>> enabling vrings.
>>
>> Therefore setting "queue_enable" from the guest (1.1) is a *control plane*
>> operation, which travels from the guest through QEMU to the vhost-user
>> backend, using a unix domain socket.
>>
> 
> The code looks good to me, but this part of the message is not precise
> if I understood it correctly.
> 
> Guest PCI "queue_enable" writes remain in the qemu virtio device model
> until the guest writes DRIVER_OK to the status. I'm referring to
> hw/virtio/virtio-pci.c:virtio_pci_common_write, case
> VIRTIO_PCI_COMMON_Q_ENABLE. From there, virtio_queue_enable just saves
> the info in VirtIOPCIProxy.
> 
> After the needed queues are enabled, the guest writes DRIVER_OK status
> bit. Then, the vhost backend is started and qemu sends the
> VHOST_USER_SET_VRING_ENABLE through the unix socket. And that is the
> source of the message that is racing with the dataplane.

OK, so this means that 1.1 is "buffered" in QEMU until 1.2, but the race
between 1.2 and 1.3 is just the same.

I can reword the commit message to take this into account.

Thanks!
Laszlo
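
As a rough illustration of the buffering discussed above, here is a toy
model in C. It is only a sketch, not QEMU code: ToyDevice,
toy_write_queue_enable(), toy_write_status() and the sent_vring_enable
array are hypothetical names invented for this example. Only the
DRIVER_OK bit value (0x04) is taken from the virtio specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_QUEUES 4
#define TOY_STATUS_DRIVER_OK 0x04 /* DRIVER_OK bit from the virtio spec */

/* Hypothetical device model; these are not QEMU's real data structures. */
typedef struct ToyDevice {
    bool queue_enabled[TOY_NUM_QUEUES];     /* buffered guest writes */
    bool sent_vring_enable[TOY_NUM_QUEUES]; /* messages sent to the backend */
    uint8_t status;
} ToyDevice;

/* Step (1.1): the guest's queue_enable write is only recorded locally. */
static void toy_write_queue_enable(ToyDevice *d, size_t q)
{
    if (q < TOY_NUM_QUEUES) {
        d->queue_enabled[q] = true;
    }
}

/*
 * Step (1.2): when DRIVER_OK is first set, the backend is started and the
 * buffered enables are flushed as (modelled) SET_VRING_ENABLE messages.
 */
static void toy_write_status(ToyDevice *d, uint8_t status)
{
    bool starting = !(d->status & TOY_STATUS_DRIVER_OK) &&
                    (status & TOY_STATUS_DRIVER_OK);

    d->status = status;
    if (!starting) {
        return;
    }
    for (size_t q = 0; q < TOY_NUM_QUEUES; q++) {
        if (d->queue_enabled[q]) {
            d->sent_vring_enable[q] = true;
        }
    }
}
```

In this model nothing reaches the backend at step (1.1); the enable
messages only go out at step (1.2), so a request sent right afterwards
(1.3) can still overtake them.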

> 
> I didn't confirm it with virtiofs through tracing / debugging, so I
> may be missing something.
> 
> Even with the small nit, the code fixes the problem.
> 
> Acked-by: Eugenio Pérez 
> 
>> Whereas sending a virtio request (1.3) is a *data plane* operation, which
>> evades QEMU -- it travels from guest to the vhost-user backend via
>> eventfd.
>>
>> This means that steps (1.1) and (1.3) travel through different channels,
>> and their relative order can be reversed, as perceived by the vhost-user
>> backend.
>>
>> That's exactly what happens when OVMF's virtiofs driver (VirtioFsDxe) runs
>> against the Rust-language virtiofsd version 1.7.2. (Which uses version
>> 0.10.1 of the vhost-user-backend crate, and version 0.8.1 of the vhost
>> crate.)
>>
>> Namely, when VirtioFsDxe binds a virtiofs device, it goes through the
>> device initialization steps (i.e., control plane operations), and
>> immediately sends a FUSE_INIT request too (i.e., performs a data plane
>> operation). In the Rust-language virtiofsd, this creates a race between
>> two components that run *concurrently*, i.e., in different threads or
>> processes:
>>
>> - Control plane, handling vhost-user protocol messages:
>>
>>   The "VhostUserSlaveReqHandlerMut::set_vring_enable" method
>>   [crates/vhost-user-backend/src/handler.rs] handles
>>   VHOST_USER_SET_VRING_ENABLE messages, and updates each vring's "enabled"
>>   flag according to the message processed.
>>
>> - Data plane, handling virtio / FUSE requests:
>>
>>   The "VringEpollHandler::handle_event" method
>>   [crates/vhost-user-backend/src/event_loop.rs] handles the incoming
>>   virtio / FUSE request, consuming the virtio kick at the same time. If
>>   the vring's "enabled" flag is set, the virtio / FUSE request is
>>   processed genuinely. If the vring's "enabled" flag is clear, then the
>>   virtio / FUSE request is discarded.
>>
>> Note that OVMF enables the queue *first*, and sends FUSE_INIT *second*.
>> However, if the data plane processor in virtiofsd wins the race, then

Re: [PATCH v2 0/3] block: align CoR requests to subclusters

2023-09-07 Thread Michael Tokarev


11.07.2023 20:25, Andrey Drobyshev via wrote:

v1 --> v2:
  * Fixed line indentation;
  * Fixed wording in a comment;
  * Added R-b.

v1: https://lists.nongnu.org/archive/html/qemu-block/2023-06/msg00606.html

Andrey Drobyshev (3):
   block: add subcluster_size field to BlockDriverInfo
   block/io: align requests to subcluster_size
   tests/qemu-iotests/197: add testcase for CoR with subclusters

  block.c  |  7 +
  block/io.c   | 50 ++--
  block/mirror.c   |  8 +++---
  block/qcow2.c|  1 +
  include/block/block-common.h |  5 
  include/block/block-io.h |  8 +++---
  tests/qemu-iotests/197   | 29 +
  tests/qemu-iotests/197.out   | 24 +
  8 files changed, 99 insertions(+), 33 deletions(-)


So, given the size of the patch series and how long it has been
sitting on the list, I'm hesitant to apply it to -stable.
The whole issue, while real, looks like a somewhat unusual case.

Any comments on this?

Thanks,

/mjt

Re: [PATCH] meson.build: Make keyutils independent from keyring

2023-09-07 Thread Michael Tokarev


24.08.2023 12:42, Thomas Huth wrote:

Commit 0db0fbb5cf ("Add conditional dependency for libkeyutils")
tried to provide a possibility for the user to disable keyutils
if not required by making it depend on the keyring feature. This
looked reasonable at first glance (the unit test in tests/unit/
needs both), but the condition in meson.build fails if the feature
is meant to be detected automatically, and there is also another
spot in backends/meson.build where keyutils is used independently
from keyring. So let's remove the dependency on keyring again and
introduce a proper meson build option instead.

Cc: qemu-sta...@nongnu.org
Fixes: 0db0fbb5cf ("Add conditional dependency for libkeyutils")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1842
Signed-off-by: Thomas Huth 


Ping? Has this been forgotten?

Thanks,

/mjt

Re: [RFC PATCH] softmmu: Fix async_run_on_cpu() use in tcg_commit_cpu()

2023-09-07 Thread Philippe Mathieu-Daudé


On 7/9/23 18:28, Richard Henderson wrote:

On 9/7/23 09:14, Philippe Mathieu-Daudé wrote:

CPUState::halt_cond is an accelerator specific pointer, used
in particular by TCG (which tcg_commit() is about).
The pointer is set by the AccelOpsClass::create_vcpu_thread()
handler.
AccelOpsClass::create_vcpu_thread() is called by the generic
qemu_init_vcpu(), which expects the accelerator handler to
eventually call cpu_thread_signal_created() which is protected
with a QemuCond. It is safer to check the vCPU is created with
this field rather than the 'halt_cond' pointer set in
create_vcpu_thread() before the vCPU thread is initialized.

This avoids calling tcg_commit() until all CPUs are realized.

Here we can see for a machine with N CPUs, tcg_commit()
is called N times before the 'machine_creation_done' event:

   (lldb) settings set -- target.run-args  "-M" "virt" "-smp" "512" "-display" "none"
   (lldb) breakpoint set --name qemu_machine_creation_done --one-shot true

   (lldb) breakpoint set --name tcg_commit_cpu --auto-continue true
   (lldb) run
   Process 84089 launched: 'qemu-system-aarch64' (arm64)
   Process 84089 stopped
   * thread #1, queue = 'com.apple.main-thread', stop reason = one-shot breakpoint 2

   (lldb) breakpoint list --brief
   Current breakpoints:
   2: name = 'tcg_commit_cpu', locations = 2, resolved = 2, hit count = 512 Options: enabled auto-continue
                                                            ^^^^^^^^^^^^^^^




Of course the function is called 512 times: you asked for 512 cpus, and 
each has its own address space which needs initializing.


The AS are still initialized at the same time, but we defer the listener
callback until the vCPU is ready (what was expected first IIUC).

If you skip the call before cpu->created, when exactly are you going to 
do it?


With this patch tcg_commit_cpu() is only called by vCPU threads, in
their processing loop. Comparing backtraces, the first hit is now:
(lldb) bt
* thread #514, stop reason = breakpoint 4.2
  * frame #0: 0x1005d9d48 qemu-system-aarch64`tcg_commit_cpu(cpu=0x173358000, data=...) at physmem.c:2493:63
    frame #1: 0x1d684 qemu-system-aarch64`process_queued_cpu_work(cpu=0x173358000) at cpus-common.c:360:13
    frame #2: 0x100297390 qemu-system-aarch64`qemu_wait_io_event [inlined] qemu_wait_io_event_common(cpu=) at cpus.c:412:5 [artificial]
    frame #3: 0x100623b98 qemu-system-aarch64`mttcg_cpu_thread_fn(arg=0x173358000) at tcg-accel-ops-mttcg.c:123:9
    frame #4: 0x10079f15c qemu-system-aarch64`qemu_thread_start(args=) at qemu-thread-posix.c:541:9
    frame #5: 0x18880ffa8 libsystem_pthread.dylib`_pthread_start + 148

Re: [PATCH] hw/net/vmxnet3: Fix guest-triggerable assert()

2023-09-07 Thread Michael Tokarev


17.08.2023 15:56, Thomas Huth wrote:

The assert() that checks for valid MTU sizes can be triggered by
the guest (e.g. with the reproducer code from the bug ticket
https://gitlab.com/qemu-project/qemu/-/issues/517 ). Let's avoid
this problem by simply logging the error and refusing to activate
the device instead.


I'm applying this to trivial-patches tree with some hesitation :)

Thanks,

/mjt


Fixes: d05dcd94ae ("net: vmxnet3: validate configuration values during 
activate")
Signed-off-by: Thomas Huth 
---
  hw/net/vmxnet3.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 5dfacb1098..6674122a7e 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -1439,7 +1439,10 @@ static void vmxnet3_activate_device(VMXNET3State *s)
  vmxnet3_setup_rx_filtering(s);
  /* Cache fields from shared memory */
  s->mtu = VMXNET3_READ_DRV_SHARED32(d, s->drv_shmem, devRead.misc.mtu);
-assert(VMXNET3_MIN_MTU <= s->mtu && s->mtu <= VMXNET3_MAX_MTU);
+if (s->mtu < VMXNET3_MIN_MTU || s->mtu > VMXNET3_MAX_MTU) {
+qemu_log_mask(LOG_GUEST_ERROR, "vmxnet3: Bad MTU size: %d\n", s->mtu);
+return;
+}
  VMW_CFPRN("MTU is %u", s->mtu);
  
  s->max_rx_frags =
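
The pattern in the diff above (reject a bad guest-supplied value and log
it, instead of asserting) can be sketched in isolation as follows. This
is a standalone illustration, not the vmxnet3 code: ToyNic and
toy_activate() are hypothetical, and the TOY_MIN_MTU / TOY_MAX_MTU
bounds are assumed values chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed bounds for illustration only. */
#define TOY_MIN_MTU 60u
#define TOY_MAX_MTU 9000u

/* Hypothetical device state. */
typedef struct ToyNic {
    uint32_t mtu;
    bool active;
} ToyNic;

/*
 * Returns true if the device was activated. An out-of-range value coming
 * from the guest is logged and the activation is refused; the process is
 * never aborted because of guest-controlled input.
 */
static bool toy_activate(ToyNic *nic, uint32_t guest_mtu)
{
    if (guest_mtu < TOY_MIN_MTU || guest_mtu > TOY_MAX_MTU) {
        fprintf(stderr, "toy-nic: bad MTU size: %u\n", (unsigned)guest_mtu);
        nic->active = false;
        return false;
    }
    nic->mtu = guest_mtu;
    nic->active = true;
    return true;
}
```

The design point is that anything the guest can write must be treated
as untrusted input: assert() is reserved for conditions that only a bug
in the emulator itself could violate.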

Re: [PATCH] arm64: Restore trapless ptimer access

2023-09-07 Thread Michael Tokarev


31.08.2023 22:00, Colton Lewis wrote:

Due to recent KVM changes, QEMU is setting a ptimer offset resulting
in unintended trap and emulate access and a consequent performance
hit. Filter out the PTIMER_CNT register to restore trapless ptimer
access.

Quoting Andrew Jones:

Simply reading the CNT register and writing back the same value is
enough to set an offset, since the timer will have certainly moved
past whatever value was read by the time it's written.  QEMU
frequently saves and restores all registers in the get-reg-list array,
unless they've been explicitly filtered out (with Linux commit
680232a94c12, KVM_REG_ARM_PTIMER_CNT is now in the array). So, to
restore trapless ptimer accesses, we need a QEMU patch to filter out
the register.

See
https://lore.kernel.org/kvmarm/gsntttsonus5@coltonlewis-kvm.c.googlers.com/T/#m0770023762a821db2a3f0dd0a7dc6aa54e0d0da9
for additional context.

Signed-off-by: Andrew Jones 
---
  target/arm/kvm64.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 4d904a1d11..2dd46e0a99 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -672,6 +672,7 @@ typedef struct CPRegStateLevel {
   */
  static const CPRegStateLevel non_runtime_cpregs[] = {
  { KVM_REG_ARM_TIMER_CNT, KVM_PUT_FULL_STATE },
+{ KVM_REG_ARM_PTIMER_CNT, KVM_PUT_FULL_STATE },
  };
  
  int kvm_arm_cpreg_level(uint64_t regidx)


While this patch itself is a trivial one-liner, I'd rather
not apply it to the trivial-patches tree: it requires a little
bit more than trivial expertise in this area.

So basically, ping for qemu-arm@ ? :)

Thanks,

/mjt
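
To make the quoted explanation concrete, here is a toy model of why a
plain save/restore of the counter ends up setting an offset. It is a
sketch under stated assumptions, not KVM's implementation: ToyTimer and
the toy_* helpers are hypothetical, and the model simply assumes that
the guest-visible count equals the physical count minus an offset.

```c
#include <stdint.h>

/* Hypothetical timer model: guest view = phys - offset. */
typedef struct ToyTimer {
    uint64_t phys;   /* free-running physical counter */
    uint64_t offset; /* zero means the guest sees the physical counter */
} ToyTimer;

/* Read the guest-visible counter value. */
static uint64_t toy_read_cnt(const ToyTimer *t)
{
    return t->phys - t->offset;
}

/* Writing the counter is modelled as choosing the offset that yields it. */
static void toy_write_cnt(ToyTimer *t, uint64_t value)
{
    t->offset = t->phys - value;
}

/* Let the physical counter advance by n ticks. */
static void toy_tick(ToyTimer *t, uint64_t n)
{
    t->phys += n;
}
```

Reading the counter, letting even a few ticks pass, and writing the same
value back leaves a non-zero offset in this model, which is the
situation the patch avoids by filtering the register out of the
regular save/restore path.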

[PATCH v1 0/2] i386/a-b-bootblock: zero the first byte of each page on start

2023-09-07 Thread Daniil Tatianin

This series fixes an issue where the outcome of the migration qtest
relies on the initial memory contents all being the same across the
first 100MiB of RAM, which is a very fragile invariant.

We fix this by making sure we zero the first byte of every testable page
in range beforehand.

Daniil Tatianin (2):
  i386/a-b-bootblock: factor test memory addresses out into constants
  i386/a-b-bootblock: zero the first byte of each page on start

 tests/migration/i386/a-b-bootblock.S | 18 +++---
 tests/migration/i386/a-b-bootblock.h | 16 
 2 files changed, 23 insertions(+), 11 deletions(-)

-- 
2.34.1
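
For readers unfamiliar with the test, the fragility can be sketched with
a small C model. This is a simplified stand-in, not the real
check_guests_ram() from tests/qtest/migration-test.c: the toy_* names,
TOY_PAGES, and the exact checking rule are assumptions made for the
example. The guest loop bumps the first byte of every page in order, so
at any instant each page should hold either the value of the first page
or that value minus one.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGES 8

/* Guest loop model: bump the first byte of pages [0, upto). */
static void toy_sweep(uint8_t first_bytes[TOY_PAGES], size_t upto)
{
    for (size_t i = 0; i < upto && i < TOY_PAGES; i++) {
        first_bytes[i]++;
    }
}

/* The fix: start from a known value in every page. */
static void toy_zero(uint8_t first_bytes[TOY_PAGES])
{
    for (size_t i = 0; i < TOY_PAGES; i++) {
        first_bytes[i] = 0;
    }
}

/*
 * Simplified consistency check: pages must equal the first page's byte
 * until the sweep edge, and that byte minus one (mod 256) after it.
 * Returns the number of pages that fit neither case.
 */
static int toy_count_bad(const uint8_t first_bytes[TOY_PAGES])
{
    uint8_t first = first_bytes[0];
    uint8_t last = (uint8_t)(first - 1);
    int hit_edge = 0;
    int bad = 0;

    for (size_t i = 1; i < TOY_PAGES; i++) {
        uint8_t b = first_bytes[i];

        if (b == first && !hit_edge) {
            continue;
        }
        if (b == last) {
            hit_edge = 1;
            continue;
        }
        bad++;
    }
    return bad;
}
```

If firmware has already scribbled on a couple of pages before the guest
loop starts, those pages never line up with the others and the check
reports them; zeroing every first byte up front removes the dependence
on initial contents.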

[PATCH v1 2/2] i386/a-b-bootblock: zero the first byte of each page on start

2023-09-07 Thread Daniil Tatianin

Until now, the migration qtest worked by sheer luck: it relied on the
first byte of every page from 1MiB to 100MiB initially holding the same
value.

This easily breaks if we reduce the amount of RAM for the test instances
from 150MiB to e.g. 110MiB, since that makes SeaBIOS dirty some of the
pages starting at about 0x5dd2000 (~93 MiB), which it reuses for the
HighMemory allocator since commit dc88f9b72df ("malloc: use large
ZoneHigh when there is enough memory").

This would result in the following errors:
12/60 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test ERROR   2.74s   killed by signal 6 SIGABRT
stderr:
Memory content inconsistency at 5dd2000 first_byte = cc last_byte = cb current = 9e hit_edge = 1
Memory content inconsistency at 5dd3000 first_byte = cc last_byte = cb current = 89 hit_edge = 1
Memory content inconsistency at 5dd4000 first_byte = cc last_byte = cb current = 23 hit_edge = 1
Memory content inconsistency at 5dd5000 first_byte = cc last_byte = cb current = 31 hit_edge = 1
Memory content inconsistency at 5dd6000 first_byte = cc last_byte = cb current = 70 hit_edge = 1
Memory content inconsistency at 5dd7000 first_byte = cc last_byte = cb current = ff hit_edge = 1
Memory content inconsistency at 5dd8000 first_byte = cc last_byte = cb current = 54 hit_edge = 1
Memory content inconsistency at 5dd9000 first_byte = cc last_byte = cb current = 64 hit_edge = 1
Memory content inconsistency at 5dda000 first_byte = cc last_byte = cb current = 1d hit_edge = 1
Memory content inconsistency at 5ddb000 first_byte = cc last_byte = cb current = 1a hit_edge = 1
and in another 26 pages
**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)

Fix this by always zeroing the first byte of each page in the range so
that we get consistent results no matter the initial contents.

Fixes: ea0c6d62391 ("test: Postcopy")
Signed-off-by: Daniil Tatianin 
---
 tests/migration/i386/a-b-bootblock.S |  9 +
 tests/migration/i386/a-b-bootblock.h | 16 
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/tests/migration/i386/a-b-bootblock.S 
b/tests/migration/i386/a-b-bootblock.S
index 036216e4a7..6bbd60 100644
--- a/tests/migration/i386/a-b-bootblock.S
+++ b/tests/migration/i386/a-b-bootblock.S
@@ -44,6 +44,15 @@ start: # at 0x7c00 ?
 
 # bl keeps a counter so we limit the output speed
 mov $0, %bl
+
+pre_zero:
+mov $TEST_MEM_START,%eax
+do_zero:
+movb $0, (%eax)
+add $4096,%eax
+cmp $TEST_MEM_END,%eax
+jl do_zero
+
 mainloop:
 mov $TEST_MEM_START,%eax
 innerloop:
diff --git a/tests/migration/i386/a-b-bootblock.h 
b/tests/migration/i386/a-b-bootblock.h
index b7b0fce2ee..5b523917ce 100644
--- a/tests/migration/i386/a-b-bootblock.h
+++ b/tests/migration/i386/a-b-bootblock.h
@@ -4,18 +4,18 @@
  * the header and the assembler differences in your patch submission.
  */
 unsigned char x86_bootsect[] = {
-  0xfa, 0x0f, 0x01, 0x16, 0x78, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
+  0xfa, 0x0f, 0x01, 0x16, 0x8c, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
   0x0f, 0x22, 0xc0, 0x66, 0xea, 0x20, 0x7c, 0x00, 0x00, 0x08, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xe4, 0x92, 0x0c, 0x02,
   0xe6, 0x92, 0xb8, 0x10, 0x00, 0x00, 0x00, 0x8e, 0xd8, 0x66, 0xb8, 0x41,
   0x00, 0x66, 0xba, 0xf8, 0x03, 0xee, 0xb3, 0x00, 0xb8, 0x00, 0x00, 0x10,
-  0x00, 0xfe, 0x00, 0x05, 0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00, 0x40,
-  0x06, 0x7c, 0xf2, 0xfe, 0xc3, 0x80, 0xe3, 0x3f, 0x75, 0xe6, 0x66, 0xb8,
-  0x42, 0x00, 0x66, 0xba, 0xf8, 0x03, 0xee, 0xeb, 0xdb, 0x8d, 0x76, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00,
-  0x00, 0x9a, 0xcf, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00,
-  0x27, 0x00, 0x60, 0x7c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0xc6, 0x00, 0x00, 0x05, 0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00,
+  0x40, 0x06, 0x7c, 0xf1, 0xb8, 0x00, 0x00, 0x10, 0x00, 0xfe, 0x00, 0x05,
+  0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00, 0x40, 0x06, 0x7c, 0xf2, 0xfe,
+  0xc3, 0x80, 0xe3, 0x3f, 0x75, 0xe6, 0x66, 0xb8, 0x42, 0x00, 0x66, 0xba,
+  0xf8, 0x03, 0xee, 0xeb, 0xdb, 0x8d, 0x76, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00,
+  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00, 0x27, 0x00, 0x74, 0x7c,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-- 
2.34.1

[PATCH v1 1/2] i386/a-b-bootblock: factor test memory addresses out into constants

2023-09-07 Thread Daniil Tatianin

So that we have fewer magic numbers to deal with. This also allows us to
reuse these in the following commits.

Signed-off-by: Daniil Tatianin 
---
 tests/migration/i386/a-b-bootblock.S | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tests/migration/i386/a-b-bootblock.S 
b/tests/migration/i386/a-b-bootblock.S
index 3d464c7568..036216e4a7 100644
--- a/tests/migration/i386/a-b-bootblock.S
+++ b/tests/migration/i386/a-b-bootblock.S
@@ -34,6 +34,10 @@ start: # at 0x7c00 ?
 mov $16,%eax
 mov %eax,%ds
 
+# Start from 1MB
+.set TEST_MEM_START, (1024*1024)
+.set TEST_MEM_END, (100*1024*1024)
+
 mov $65,%ax
 mov $0x3f8,%dx
 outb %al,%dx
@@ -41,12 +45,11 @@ start: # at 0x7c00 ?
 # bl keeps a counter so we limit the output speed
 mov $0, %bl
 mainloop:
-# Start from 1MB
-mov $(1024*1024),%eax
+mov $TEST_MEM_START,%eax
 innerloop:
 incb (%eax)
 add $4096,%eax
-cmp $(100*1024*1024),%eax
+cmp $TEST_MEM_END,%eax
 jl innerloop
 
 inc %bl
-- 
2.34.1

[PATCH 2/2] i386/a-b-bootblock: zero the first byte of each page on start

2023-09-07 Thread Daniil Tatianin

Until now, the migration qtest worked by sheer luck: it relied on the
first byte of every page from 1MiB to 100MiB initially holding the same
value.

This easily breaks if we reduce the amount of RAM for the test instances
from 150MiB to e.g. 110MiB, since that makes SeaBIOS dirty some of the
pages starting at about 0x5dd2000 (~93 MiB), which it reuses for the
HighMemory allocator since commit dc88f9b72df ("malloc: use large
ZoneHigh when there is enough memory").

This would result in the following errors:
12/60 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test ERROR   2.74s   killed by signal 6 SIGABRT
stderr:
Memory content inconsistency at 5dd2000 first_byte = cc last_byte = cb current = 9e hit_edge = 1
Memory content inconsistency at 5dd3000 first_byte = cc last_byte = cb current = 89 hit_edge = 1
Memory content inconsistency at 5dd4000 first_byte = cc last_byte = cb current = 23 hit_edge = 1
Memory content inconsistency at 5dd5000 first_byte = cc last_byte = cb current = 31 hit_edge = 1
Memory content inconsistency at 5dd6000 first_byte = cc last_byte = cb current = 70 hit_edge = 1
Memory content inconsistency at 5dd7000 first_byte = cc last_byte = cb current = ff hit_edge = 1
Memory content inconsistency at 5dd8000 first_byte = cc last_byte = cb current = 54 hit_edge = 1
Memory content inconsistency at 5dd9000 first_byte = cc last_byte = cb current = 64 hit_edge = 1
Memory content inconsistency at 5dda000 first_byte = cc last_byte = cb current = 1d hit_edge = 1
Memory content inconsistency at 5ddb000 first_byte = cc last_byte = cb current = 1a hit_edge = 1
and in another 26 pages
**
ERROR:../tests/qtest/migration-test.c:300:check_guests_ram: assertion failed: (bad == 0)

Fix this by always zeroing the first byte of each page in the range so
that we get consistent results no matter the initial contents.

Fixes: ea0c6d62391 ("test: Postcopy")
Signed-off-by: Daniil Tatianin 
---
 tests/migration/i386/a-b-bootblock.S |  9 +
 tests/migration/i386/a-b-bootblock.h | 16 
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/tests/migration/i386/a-b-bootblock.S 
b/tests/migration/i386/a-b-bootblock.S
index 036216e4a7..6bbd60 100644
--- a/tests/migration/i386/a-b-bootblock.S
+++ b/tests/migration/i386/a-b-bootblock.S
@@ -44,6 +44,15 @@ start: # at 0x7c00 ?
 
 # bl keeps a counter so we limit the output speed
 mov $0, %bl
+
+pre_zero:
+mov $TEST_MEM_START,%eax
+do_zero:
+movb $0, (%eax)
+add $4096,%eax
+cmp $TEST_MEM_END,%eax
+jl do_zero
+
 mainloop:
 mov $TEST_MEM_START,%eax
 innerloop:
diff --git a/tests/migration/i386/a-b-bootblock.h 
b/tests/migration/i386/a-b-bootblock.h
index b7b0fce2ee..5b523917ce 100644
--- a/tests/migration/i386/a-b-bootblock.h
+++ b/tests/migration/i386/a-b-bootblock.h
@@ -4,18 +4,18 @@
  * the header and the assembler differences in your patch submission.
  */
 unsigned char x86_bootsect[] = {
-  0xfa, 0x0f, 0x01, 0x16, 0x78, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
+  0xfa, 0x0f, 0x01, 0x16, 0x8c, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
   0x0f, 0x22, 0xc0, 0x66, 0xea, 0x20, 0x7c, 0x00, 0x00, 0x08, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xe4, 0x92, 0x0c, 0x02,
   0xe6, 0x92, 0xb8, 0x10, 0x00, 0x00, 0x00, 0x8e, 0xd8, 0x66, 0xb8, 0x41,
   0x00, 0x66, 0xba, 0xf8, 0x03, 0xee, 0xb3, 0x00, 0xb8, 0x00, 0x00, 0x10,
-  0x00, 0xfe, 0x00, 0x05, 0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00, 0x40,
-  0x06, 0x7c, 0xf2, 0xfe, 0xc3, 0x80, 0xe3, 0x3f, 0x75, 0xe6, 0x66, 0xb8,
-  0x42, 0x00, 0x66, 0xba, 0xf8, 0x03, 0xee, 0xeb, 0xdb, 0x8d, 0x76, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00,
-  0x00, 0x9a, 0xcf, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00,
-  0x27, 0x00, 0x60, 0x7c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0xc6, 0x00, 0x00, 0x05, 0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00,
+  0x40, 0x06, 0x7c, 0xf1, 0xb8, 0x00, 0x00, 0x10, 0x00, 0xfe, 0x00, 0x05,
+  0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00, 0x40, 0x06, 0x7c, 0xf2, 0xfe,
+  0xc3, 0x80, 0xe3, 0x3f, 0x75, 0xe6, 0x66, 0xb8, 0x42, 0x00, 0x66, 0xba,
+  0xf8, 0x03, 0xee, 0xeb, 0xdb, 0x8d, 0x76, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00,
+  0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00, 0x27, 0x00, 0x74, 0x7c,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
-- 
2.34.1

[PATCH 1/2] i386/a-b-bootblock: factor test memory addresses out into constants

2023-09-07 Thread Daniil Tatianin

So that we have fewer magic numbers to deal with. This also allows us to
reuse these in the following commits.

Signed-off-by: Daniil Tatianin 
---
 tests/migration/i386/a-b-bootblock.S | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tests/migration/i386/a-b-bootblock.S 
b/tests/migration/i386/a-b-bootblock.S
index 3d464c7568..036216e4a7 100644
--- a/tests/migration/i386/a-b-bootblock.S
+++ b/tests/migration/i386/a-b-bootblock.S
@@ -34,6 +34,10 @@ start: # at 0x7c00 ?
 mov $16,%eax
 mov %eax,%ds
 
+# Start from 1MB
+.set TEST_MEM_START, (1024*1024)
+.set TEST_MEM_END, (100*1024*1024)
+
 mov $65,%ax
 mov $0x3f8,%dx
 outb %al,%dx
@@ -41,12 +45,11 @@ start: # at 0x7c00 ?
 # bl keeps a counter so we limit the output speed
 mov $0, %bl
 mainloop:
-# Start from 1MB
-mov $(1024*1024),%eax
+mov $TEST_MEM_START,%eax
 innerloop:
 incb (%eax)
 add $4096,%eax
-cmp $(100*1024*1024),%eax
+cmp $TEST_MEM_END,%eax
 jl innerloop
 
 inc %bl
-- 
2.34.1

[PATCH 0/2] i386/a-b-bootblock: zero the first byte of each page on start

2023-09-07 Thread Daniil Tatianin

This series fixes an issue where the outcome of the migration qtest
relies on the initial memory contents all being the same across the
first 100MiB of RAM, which is a very fragile invariant.

We fix this by making sure we zero the first byte of every testable page
in range beforehand.

Daniil Tatianin (2):
  i386/a-b-bootblock: factor test memory addresses out into constants
  i386/a-b-bootblock: zero the first byte of each page on start

 tests/migration/i386/a-b-bootblock.S | 18 +++---
 tests/migration/i386/a-b-bootblock.h | 16 
 2 files changed, 23 insertions(+), 11 deletions(-)

-- 
2.34.1

Re: [virtio-dev] [RFC PATCH v2] docs/interop: define PROBE feature for vhost-user VirtIO devices

2023-09-07 Thread Stefan Hajnoczi

On Tue, Sep 05, 2023 at 10:34:11AM +0100, Alex Bennée wrote:
> 
> Albert Esteve  writes:
> 
> > This looks great! Thanks for this proposal.
> >
> > On Fri, Sep 1, 2023 at 1:00 PM Alex Bennée  wrote:
> >
> >  Currently QEMU has to know some details about the VirtIO device
> >  supported by a vhost-user daemon to be able to setup the guest. This
> >  makes it hard for QEMU to add support for additional vhost-user
> >  daemons without adding specific stubs for each additional VirtIO
> >  device.
> >
> >  This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_PROBE)
> >  which the back-end can advertise which allows a probe message to be
> >  sent to get all the details QEMU needs to know in one message.
> >
> >  Together with the existing features VHOST_USER_PROTOCOL_F_STATUS and
> >  VHOST_USER_PROTOCOL_F_CONFIG we can create "standalone" vhost-user
> >  daemons which are capable of handling all aspects of the VirtIO
> >  transactions with only a generic stub on the QEMU side. These daemons
> >  can also be used without QEMU in situations where there isn't a full
> >  VMM managing their setup.
> >
> >  Signed-off-by: Alex Bennée 
> >
> >  ---
> >  v2
> >- dropped F_STANDALONE in favour of F_PROBE
> >- split probe details across several messages
> >- probe messages don't automatically imply a standalone daemon
> >- add wording where probe details interact (F_MQ/F_CONFIG)
> >- define VMM and make clear QEMU is only one of many potential VMMs
> >- reword commit message
> >  ---
> >   docs/interop/vhost-user.rst | 90 -
> >   hw/virtio/vhost-user.c  |  8 
> >   2 files changed, 88 insertions(+), 10 deletions(-)
> >
> >  diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> >  index 5a070adbc1..ba3b5e07b7 100644
> >  --- a/docs/interop/vhost-user.rst
> >  +++ b/docs/interop/vhost-user.rst
> >  @@ -7,6 +7,7 @@ Vhost-user Protocol
> >   ..
> > Copyright 2014 Virtual Open Systems Sarl.
> > Copyright 2019 Intel Corporation
> >  +  Copyright 2023 Linaro Ltd
> > Licence: This work is licensed under the terms of the GNU GPL,
> >  version 2 or later. See the COPYING file in the top-level
> >  directory.
> >  @@ -27,17 +28,31 @@ The protocol defines 2 sides of the communication, 
> > *front-end* and
> >   *back-end*. The *front-end* is the application that shares its 
> > virtqueues, in
> >   our case QEMU. The *back-end* is the consumer of the virtqueues.
> >
> >  -In the current implementation QEMU is the *front-end*, and the *back-end*
> >  -is the external process consuming the virtio queues, for example a
> >  -software Ethernet switch running in user space, such as Snabbswitch,
> >  -or a block device back-end processing read & write to a virtual
> >  -disk. In order to facilitate interoperability between various back-end
> >  -implementations, it is recommended to follow the :ref:`Backend program
> >  -conventions `.
> >  +In the current implementation a Virtual Machine Manager (VMM) such as
> >  +QEMU is the *front-end*, and the *back-end* is the external process
> >  +consuming the virtio queues, for example a software Ethernet switch
> >  +running in user space, such as Snabbswitch, or a block device back-end
> >  +processing read & write to a virtual disk. In order to facilitate
> >  +interoperability between various back-end implementations, it is
> >  +recommended to follow the :ref:`Backend program conventions
> >  +`.
> >
> >   The *front-end* and *back-end* can be either a client (i.e. connecting) or
> >   server (listening) in the socket communication.
> >
> >  +Probing device details
> >  +--
> >  +
> >  +Traditionally the vhost-user daemon *back-end* shares configuration
> >  +responsibilities with the VMM *front-end* which needs to know certain
> >  +key bits of information about the device. This means the VMM needs to
> >  +define at least a minimal stub for each VirtIO device it wants to
> >  +support. If the daemon supports the right set of protocol features the
> >  +VMM can probe the daemon for the information it needs to setup the
> >  +device. See :ref:`Probing features for standalone daemons
> >  +` for more details.
> >  +
> >  +
> >   Support for platforms other than Linux
> >   --
> >
> >  @@ -316,6 +331,7 @@ replies. Here is a list of the ones that do:
> >   * ``VHOST_USER_GET_VRING_BASE``
> >   * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
> >   * ``VHOST_USER_GET_INFLIGHT_FD`` (if 
> > ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
> >  +* ``VHOST_USER_GET_BACKEND_SPECS`` (if 
> > ``VHOST_USER_PROTOCOL_F_STANDALONE``)
> >
> >   .. seealso::
> >
> >  @@ -396,9 +412,10 @@ must support changing some configuration aspects on 
> > the fly.
> >   Multiple queue support
> >   --
> >
> >  -Many devices have a fixed number of virtqueues.  In this case the 
> > front-end
> >

Re: [PATCH for-8.2 0/2] ppc: get rid of free() (gitlab #1798)

2023-09-07 Thread Michael Tokarev


30.07.2023 20:13, Daniel Henrique Barboza wrote:



On 7/29/23 12:35, Peter Maydell wrote:

On Fri, 28 Jul 2023 at 21:57, Daniel Henrique Barboza
 wrote:

Here's some trivial changes following Peter's call to arms against
free() and friends in gitlab issue #1798 in an attempt to enforce
our memory management guidelines [1].


To clarify, this isn't a "call to arms". The issue is marked up as
a "bite-sized task", which is to say that it's a potential easy
place to start for newcomers to the community who might be making
their first contribution to the codebase. The changes it suggests
aren't urgent; at most they're a nice-to-have, since glib
guarantees that you can mix malloc/free and g_malloc/g_free.


I failed to realize it was a bite-sized task :/ and my Coccinelle comment
in the bug makes me feel dumb hehe (given that Coccinelle is not newcomer
friendly).



We've had this sitting around as a suggestion on the wiki page
for bite-sized-tasks for years, and occasionally people come
through and have a go at it. I wanted to clean up and expand
on the description of what we had in mind for the change, to
give those people a better chance of successfully completing
the task.


What we can do then, since I already sent these, is perhaps link these patches
as example/template in the gitlab issue later on.


Applied to my trivial-patches branch, adding the suggested commit comment
fixes while at it; hopefully there's nothing more to do :)

Thanks,

/mjt

Re: [RFC PATCH v2] docs/interop: define PROBE feature for vhost-user VirtIO devices

2023-09-07 Thread Stefan Hajnoczi

On Fri, Sep 01, 2023 at 12:00:18PM +0100, Alex Bennée wrote:
> Currently QEMU has to know some details about the VirtIO device
> supported by a vhost-user daemon to be able to setup the guest. This
> makes it hard for QEMU to add support for additional vhost-user
> daemons without adding specific stubs for each additional VirtIO
> device.
> 
> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_PROBE)
> which the back-end can advertise which allows a probe message to be
> sent to get all the details QEMU needs to know in one message.
> 
> Together with the existing features VHOST_USER_PROTOCOL_F_STATUS and
> VHOST_USER_PROTOCOL_F_CONFIG we can create "standalone" vhost-user
> daemons which are capable of handling all aspects of the VirtIO
> transactions with only a generic stub on the QEMU side. These daemons
> can also be used without QEMU in situations where there isn't a full
> VMM managing their setup.
> 
> Signed-off-by: Alex Bennée 

I think the mindset for this change should be "vhost-user is becoming a
VIRTIO Transport". VIRTIO Transports have a reasonably well-defined
feature set in the VIRTIO specification. The goal should be to cover
every VIRTIO Transport operation via vhost-user protocol messages so
that the VIRTIO device model can be fully conveyed over vhost-user.

Anything less is yet another ad-hoc protocol extension that will lead to
more bugs and hacks when it turns out some VIRTIO devices cannot be
expressed due to limitations in the protocol.

This requires going through the VIRTIO spec to find a correspondence
between virtio-pci/virtio-mmio/virtio-ccw's interfaces and vhost-user
protocol messages. In most cases vhost-user already offers messages and
your patch adds more of what is missing. I think this effort is already
very close but missing the final check that it really matches the VIRTIO
spec.

Please do the comparison against the VIRTIO Transports and then adjust
this patch to make it clear that the back-end is becoming a full-fledged
VIRTIO Transport:
- The name of the patch series should reflect that.
- The vhost-user protocol feature should be named F_TRANSPORT.
- The messages added in this patch should have a 1:1 correspondence with
  the VIRTIO spec including using the same terminology for consistency.

Sorry for the hassle, but I think this is a really crucial point where
we have the chance to make vhost-user work smoothly in the future...but
only if we can faithfully expose VIRTIO Transport semantics.
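
A rough sketch of what that comparison could look like, written as a small
checklist script. The operation names on the left are paraphrased, not quoted
from the VIRTIO spec, and the pairing with existing vhost-user messages is an
assumption to be verified against docs/interop/vhost-user.rst. The list is
deliberately incomplete. Entries mapped to None are the gaps a new protocol
feature would have to fill.

```python
# Illustrative checklist only. The transport operations are paraphrased and
# the mapping to vhost-user messages is an assumption, not spec text.
TRANSPORT_OPS = {
    "read device feature bits": "VHOST_USER_GET_FEATURES",
    "write driver feature bits": "VHOST_USER_SET_FEATURES",
    "read device status": "VHOST_USER_GET_STATUS",    # needs F_STATUS
    "write device status": "VHOST_USER_SET_STATUS",   # needs F_STATUS
    "read config space": "VHOST_USER_GET_CONFIG",     # needs F_CONFIG
    "write config space": "VHOST_USER_SET_CONFIG",    # needs F_CONFIG
    "query number of queues": "VHOST_USER_GET_QUEUE_NUM",
    "set queue size": "VHOST_USER_SET_VRING_NUM",
    "set queue addresses": "VHOST_USER_SET_VRING_ADDR",
    # Assumed gap: no existing message reports the VIRTIO device ID.
    "read device ID": None,
}


def missing_ops(ops):
    """Return the transport operations that have no vhost-user message."""
    return sorted(op for op, msg in ops.items() if msg is None)


if __name__ == "__main__":
    for op in missing_ops(TRANSPORT_OPS):
        print("not covered by vhost-user:", op)
```

Keeping the table keyed by the spec's own operations, rather than by
vhost-user message, makes it obvious when a transport operation has no
counterpart at all.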

> 
> ---
> v2
>   - dropped F_STANDALONE in favour of F_PROBE
>   - split probe details across several messages
>   - probe messages don't automatically imply a standalone daemon
>   - add wording where probe details interact (F_MQ/F_CONFIG)
>   - define VMM and make clear QEMU is only one of many potential VMMs
>   - reword commit message
> ---
>  docs/interop/vhost-user.rst | 90 -
>  hw/virtio/vhost-user.c  |  8 
>  2 files changed, 88 insertions(+), 10 deletions(-)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 5a070adbc1..ba3b5e07b7 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -7,6 +7,7 @@ Vhost-user Protocol
>  ..
>Copyright 2014 Virtual Open Systems Sarl.
>Copyright 2019 Intel Corporation
> +  Copyright 2023 Linaro Ltd
>Licence: This work is licensed under the terms of the GNU GPL,
> version 2 or later. See the COPYING file in the top-level
> directory.
> @@ -27,17 +28,31 @@ The protocol defines 2 sides of the communication, 
> *front-end* and
>  *back-end*. The *front-end* is the application that shares its virtqueues, in
>  our case QEMU. The *back-end* is the consumer of the virtqueues.
>  
> -In the current implementation QEMU is the *front-end*, and the *back-end*
> -is the external process consuming the virtio queues, for example a
> -software Ethernet switch running in user space, such as Snabbswitch,
> -or a block device back-end processing read & write to a virtual
> -disk. In order to facilitate interoperability between various back-end
> -implementations, it is recommended to follow the :ref:`Backend program
> -conventions `.
> +In the current implementation a Virtual Machine Manager (VMM) such as
> +QEMU is the *front-end*, and the *back-end* is the external process
> +consuming the virtio queues, for example a software Ethernet switch
> +running in user space, such as Snabbswitch, or a block device back-end
> +processing read & write to a virtual disk. In order to facilitate
> +interoperability between various back-end implementations, it is
> +recommended to follow the :ref:`Backend program conventions
> +`.
>  
>  The *front-end* and *back-end* can be either a client (i.e. connecting) or
>  server (listening) in the socket communication.
>  
> +Probing device details
> +----------------------
> +
> +Traditionally the vhost-user daemon *back-end* shares configuration
> +responsibilities with

Re: [PULL v2 00/35] ppc queue

2023-09-07 Thread Michael Tokarev


06.09.2023 17:36, Cédric Le Goater wrote:
...

ppc queue :

* debug facility improvements
* timebase and decrementer fixes
* record-replay fixes
* TCG fixes
* XIVE model improvements for multichip


Cédric Le Goater (4):
   ppc/xive: Use address_space routines to access the machine RAM
   ppc/xive: Introduce a new XiveRouter end_notify() handler
   ppc/xive: Handle END triggers between chips with MMIOs
   ppc/xive: Add support for the PC MMIOs

Joel Stanley (1):
   ppc: Add stub implementation of TRIG SPRs

Maksim Kostin (1):
   hw/ppc/e500: fix broken snapshot replay

Nicholas Piggin (26):
   target/ppc: Remove single-step suppression inside 0x100-0xf00
   target/ppc: Improve book3s branch trace interrupt for v2.07S
   target/ppc: Suppress single step interrupts on rfi-type instructions
   target/ppc: Implement breakpoint debug facility for v2.07S
   target/ppc: Implement watchpoint debug facility for v2.07S
   spapr: implement H_SET_MODE debug facilities
   ppc/vhyp: reset exception state when handling vhyp hcall
   ppc/vof: Fix missed fields in VOF cleanup
   hw/ppc/ppc.c: Tidy over-long lines
   hw/ppc: Introduce functions for conversion between timebase and nanoseconds
   host-utils: Add muldiv64_round_up
   hw/ppc: Round up the decrementer interval when converting to ns
   hw/ppc: Avoid decrementer rounding errors
   target/ppc: Sign-extend large decrementer to 64-bits
   hw/ppc: Always store the decrementer value
   target/ppc: Migrate DECR SPR
   hw/ppc: Reset timebase facilities on machine reset
   hw/ppc: Read time only once to perform decrementer write
   target/ppc: Fix CPU reservation migration for record-replay
   target/ppc: Fix timebase reset with record-replay
   spapr: Fix machine reset deadlock from replay-record
   spapr: Fix record-replay machine reset consuming too many events
   tests/avocado: boot ppc64 pseries replay-record test to Linux VFS mount
   tests/avocado: reverse-debugging cope with re-executing breakpoints
   tests/avocado: ppc64 reverse debugging tests for pseries and powernv
   target/ppc: Fix LQ, STQ register-pair order for big-endian

Richard Henderson (1):
   target/ppc: Flush inputs to zero with NJ in ppc_store_vscr

Shawn Anastasio (1):
   target/ppc: Generate storage interrupts for radix RC changes

jianchunfu (1):
   target/ppc: Fix the order of kvm_enable judgment about kvmppc_set_interrupt()


Is there anything in there worth picking for -stable?
Like, for example, some decrementer fixes, or some of these:

 ppc/vof: Fix missed fields in VOF cleanup
 spapr: Fix machine reset deadlock from replay-record
 hw/ppc/e500: fix broken snapshot replay

or something else?

Thanks!

/mjt

Re: [PATCH 0/2] virtio: Drop out of coroutine context in virtio_load()

2023-09-07 Thread Stefan Hajnoczi

On Tue, Sep 05, 2023 at 04:50:00PM +0200, Kevin Wolf wrote:
> This fixes a recently introduced assertion failure that was reported to
> happen when migrating virtio-net with a failover. The latent bug that
> we're executing code in coroutine context that was never supposed to run
> there has existed for a long time. However, the new assertion that
> callers of bdrv_graph_rdlock_main_loop() don't run in coroutine context
> makes it very visible because it's now always a crash.
> 
> Kevin Wolf (2):
>   vmstate: Mark VMStateInfo.get/put() coroutine_mixed_fn
>   virtio: Drop out of coroutine context in virtio_load()
> 
>  include/migration/vmstate.h |  8 ---
>  hw/virtio/virtio.c  | 45 -
>  2 files changed, 45 insertions(+), 8 deletions(-)

This looks like a bandaid for a specific instance of this problem rather
than a solution that takes care of the root cause.

Is it possible to make VMStateInfo.get/put() consistently coroutine_fn?

Stefan



Re: [PATCH 2/2] virtio: Drop out of coroutine context in virtio_load()

2023-09-07 Thread Stefan Hajnoczi

On Tue, Sep 05, 2023 at 04:50:02PM +0200, Kevin Wolf wrote:
> virtio_load() as a whole should run in coroutine context because it
> reads from the migration stream and we don't want this to block.

Is that "should" a "must" or a "can"?

If it's a "must" then virtio_load() needs assert(qemu_in_coroutine()).

But the previous patch mentioned that loadvm for snapshots calls it
outside coroutine context. So maybe it's a "can"?

> 
> However, it calls virtio_set_features_nocheck() and devices don't
> expect their .set_features callback to run in a coroutine and therefore
> call functions that may not be called in coroutine context. To fix this,
> drop out of coroutine context for calling virtio_set_features_nocheck().
> 
> Without this fix, the following crash was reported:
> 
>   #0  __pthread_kill_implementation (threadid=, 
> signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
>   #1  0x7efc738c05d3 in __pthread_kill_internal (signo=6, 
> threadid=) at pthread_kill.c:78
>   #2  0x7efc73873d26 in __GI_raise (sig=sig@entry=6) at 
> ../sysdeps/posix/raise.c:26
>   #3  0x7efc738477f3 in __GI_abort () at abort.c:79
>   #4  0x7efc7384771b in __assert_fail_base (fmt=0x7efc739dbcb8 "", 
> assertion=assertion@entry=0x560aebfbf5cf "!qemu_in_coroutine()",
>  file=file@entry=0x560aebfcd2d4 "../block/graph-lock.c", 
> line=line@entry=275, function=function@entry=0x560aebfcd34d "void 
> bdrv_graph_rdlock_main_loop(void)") at assert.c:92
>   #5  0x7efc7386ccc6 in __assert_fail (assertion=0x560aebfbf5cf 
> "!qemu_in_coroutine()", file=0x560aebfcd2d4 "../block/graph-lock.c", line=275,
>  function=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at 
> assert.c:101
>   #6  0x560aebcd8dd6 in bdrv_register_buf ()
>   #7  0x560aeb97ed97 in ram_block_added.llvm ()
>   #8  0x560aebb8303f in ram_block_add.llvm ()
>   #9  0x560aebb834fa in qemu_ram_alloc_internal.llvm ()
>   #10 0x560aebb2ac98 in vfio_region_mmap ()
>   #11 0x560aebb3ea0f in vfio_bars_register ()
>   #12 0x560aebb3c628 in vfio_realize ()
>   #13 0x560aeb90f0c2 in pci_qdev_realize ()
>   #14 0x560aebc40305 in device_set_realized ()
>   #15 0x560aebc48e07 in property_set_bool.llvm ()
>   #16 0x560aebc46582 in object_property_set ()
>   #17 0x560aebc4cd58 in object_property_set_qobject ()
>   #18 0x560aebc46ba7 in object_property_set_bool ()
>   #19 0x560aeb98b3ca in qdev_device_add_from_qdict ()
>   #20 0x560aebb1fbaf in virtio_net_set_features ()
>   #21 0x560aebb46b51 in virtio_set_features_nocheck ()
>   #22 0x560aebb47107 in virtio_load ()
>   #23 0x560aeb9ae7ce in vmstate_load_state ()
>   #24 0x560aeb9d2ee9 in qemu_loadvm_state_main ()
>   #25 0x560aeb9d45e1 in qemu_loadvm_state ()
>   #26 0x560aeb9bc32c in process_incoming_migration_co.llvm ()
>   #27 0x560aebeace56 in coroutine_trampoline.llvm ()
> 
> Cc: qemu-sta...@nongnu.org
> Buglink: https://issues.redhat.com/browse/RHEL-832
> Signed-off-by: Kevin Wolf 
> ---
>  hw/virtio/virtio.c | 45 -
>  1 file changed, 40 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi 



Re: [PATCH v1 1/7] qapi: scripts: add a generator for qapi's examples

2023-09-07 Thread Victor Toso

Hi,

On Wed, Sep 06, 2023 at 10:15:52AM +0100, Daniel P. Berrangé wrote:
> On Tue, Sep 05, 2023 at 09:48:40PM +0200, Victor Toso wrote:
> > This generator has two goals:
> >  1. Mechanical validation of QAPI examples
> >  2. Generate the examples in a JSON format to be consumed for extra
> > validation.
> > 
> > The generator iterates over every Example section, parsing both server
> > and client messages. The generator prints any inconsistency found, for
> > example:
> > 
> >  |  Error: Extra data: line 1 column 39 (char 38)
> >  |  Location: cancel-vcpu-dirty-limit at qapi/migration.json:2017
> >  |  Data: {"execute": "cancel-vcpu-dirty-limit"},
> >  |  "arguments": { "cpu-index": 1 } }
> > 
> > The generator will output other JSON file with all the examples in the
> > QAPI module that they came from. This can be used to validate the
> > introspection between QAPI/QMP to language bindings, for example:
> > 
> >  | { "examples": [
> >  |   {
> >  | "id": "ksuxwzfayw",
> >  | "client": [
> >  | {
> >  |   "sequence-order": 1
> >  |   "message-type": "command",
> >  |   "message":
> >  |   { "arguments":
> >  | { "device": "scratch", "size": 1073741824 },
> >  | "execute": "block_resize"
> >  |   },
> >  |} ],
> >  |"server": [
> >  |{
> >  |  "sequence-order": 2
> >  |  "message-type": "return",
> >  |  "message": { "return": {} },
> >  |} ]
> >  |}
> >  |  ] }
> > 
> > Note that the order matters, as read by the Example section and
> > translated into "sequence-order". A language binding project can then
> > consume this files to Marshal and Unmarshal, comparing if the results
> > are what is to be expected.
> > 
> > RFC discussion:
> > https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04641.html
> > 
> > Signed-off-by: Victor Toso 
> > ---
> >  scripts/qapi/dumpexamples.py | 194 +++
> >  scripts/qapi/main.py |   2 +
> >  2 files changed, 196 insertions(+)
> >  create mode 100644 scripts/qapi/dumpexamples.py
> > 
> > diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py
> > new file mode 100644
> > index 00..c14ed11774
> > --- /dev/null
> > +++ b/scripts/qapi/dumpexamples.py
> > @@ -0,0 +1,194 @@
> > +"""
> > +Dump examples for Developers
> > +"""
> > +# Copyright (c) 2022 Red Hat Inc.
> > +#
> > +# Authors:
> > +#  Victor Toso 
> > +#
> > +# This work is licensed under the terms of the GNU GPL, version 2.
> > +# See the COPYING file in the top-level directory.
> > +
> > +# Just for type hint on self
> > +from __future__ import annotations
> > +
> > +import os
> > +import json
> > +import random
> > +import string
> > +
> > +from typing import Dict, List, Optional
> > +
> > +from .schema import (
> > +QAPISchema,
> > +QAPISchemaType,
> > +QAPISchemaVisitor,
> > +QAPISchemaEnumMember,
> > +QAPISchemaFeature,
> > +QAPISchemaIfCond,
> > +QAPISchemaObjectType,
> > +QAPISchemaObjectTypeMember,
> > +QAPISchemaVariants,
> > +)
> > +from .source import QAPISourceInfo
> > +
> > +
> > +def gen_examples(schema: QAPISchema,
> > + output_dir: str,
> > + prefix: str) -> None:
> > +vis = QAPISchemaGenExamplesVisitor(prefix)
> > +schema.visit(vis)
> > +vis.write(output_dir)
> > +
> > +
> > +def get_id(random, size: int) -> str:
> > +letters = string.ascii_lowercase
> > +return ''.join(random.choice(letters) for i in range(size))
> > +
> > +
> > +def next_object(text, start, end, context) -> Dict:
> > +# Start of json object
> > +start = text.find("{", start)
> > +end = text.rfind("}", start, end+1)
> > +
> > +# try catch, pretty print issues
> > +try:
> > +ret = json.loads(text[start:end+1])
> > +except Exception as e:
> > +print("Error: {}\nLocation: {}\nData: {}\n".format(
> > +  str(e), context, text[start:end+1]))
> 
> This prints an error, but the caller ignores this and carries on
> as normal.
> 
> After applying this series, we still have multiple errors being
> printed on console

The first one is an easy error to fix. The other two are caused by
metadata (elision placeholders) inserted in otherwise valid examples,
see:
> Error: Expecting ',' delimiter: line 12 column 19 (char 336)
> Location: query-blockstats at 
> ../storage-daemon/qapi/../../qapi/block-core.json:1259

Indeed.
 
> Error: Expecting property name enclosed in double quotes: line 7 column 19 
> (char 264)
> Location: query-rocker-of-dpa-flows at ../qapi/rocker.json:256

251 #   "mask": {"in-pport": 4294901760}
252 #  },
 -> 253 #  {...more...},
254 #]}

> 
> Error: Expecting value: line 28 column 15 (char 775)
> Location: query-spice at ../qapi/ui.json:372

365 #"tls": false
366 # },
 -> 367 # [ ... more channels follow ... ]
368 #  ]

It would be good
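
On the point that the error is printed but the caller carries on: a minimal
sketch of one way to surface the failures, assuming the parsing helper returns
the error instead of printing it. The names parse_example and check_examples
are hypothetical and not part of the patch. The second sample reproduces the
"Extra data" case from the commit message, and the third uses the
{...more...} placeholder from rocker.json.

```python
import json


def parse_example(text, context):
    """Return (obj, None) on success or (None, message) on failure."""
    # Same slicing idea as next_object(): first '{' to last '}'.
    start = text.find("{")
    end = text.rfind("}")
    snippet = text[start:end + 1]
    try:
        return json.loads(snippet), None
    except json.JSONDecodeError as e:
        msg = "Error: {}\nLocation: {}\nData: {}".format(e, context, snippet)
        return None, msg


def check_examples(examples):
    """Parse every (context, text) pair and collect all failures."""
    errors = []
    for context, text in examples:
        _, err = parse_example(text, context)
        if err is not None:
            errors.append(err)
    return errors


examples = [
    ("block_resize",
     '{ "execute": "block_resize",'
     ' "arguments": { "device": "scratch", "size": 1073741824 } }'),
    ("cancel-vcpu-dirty-limit",
     '{"execute": "cancel-vcpu-dirty-limit"},\n'
     '"arguments": { "cpu-index": 1 } }'),
    ("query-rocker-of-dpa-flows",
     '{ "return": [ {"mask": {"in-pport": 4294901760}}, {...more...} ] }'),
]
errors = check_examples(examples)
```

A caller could then print the collected errors and exit non-zero when the list
is not empty, so that a broken example fails the build instead of scrolling
past on the console.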

[PULL 4/5] hw/ufs: Support for UFS logical unit

2023-09-07 Thread Stefan Hajnoczi

From: Jeuk Kim 

This commit adds support for ufs logical unit.
The LU handles processing for the SCSI command,
unit descriptor query request.

This commit enables the UFS device to process
IO requests.

Signed-off-by: Jeuk Kim 
Reviewed-by: Stefan Hajnoczi 
Message-id: 
beacc504376ab6a14b1a3830bb3c69382cf6aebc.1693980783.git.jeuk20@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 hw/ufs/ufs.h |   43 ++
 include/scsi/constants.h |1 +
 hw/ufs/lu.c  | 1445 ++
 hw/ufs/ufs.c |  252 ++-
 hw/ufs/meson.build   |2 +-
 hw/ufs/trace-events  |   25 +
 6 files changed, 1761 insertions(+), 7 deletions(-)
 create mode 100644 hw/ufs/lu.c

diff --git a/hw/ufs/ufs.h b/hw/ufs/ufs.h
index 3d1b2cff4e..f244228617 100644
--- a/hw/ufs/ufs.h
+++ b/hw/ufs/ufs.h
@@ -18,6 +18,18 @@
 #define UFS_MAX_LUS 32
 #define UFS_BLOCK_SIZE 4096
 
+typedef struct UfsBusClass {
+BusClass parent_class;
+bool (*parent_check_address)(BusState *bus, DeviceState *dev, Error **errp);
+} UfsBusClass;
+
+typedef struct UfsBus {
+SCSIBus parent_bus;
+} UfsBus;
+
+#define TYPE_UFS_BUS "ufs-bus"
+DECLARE_OBJ_CHECKERS(UfsBus, UfsBusClass, UFS_BUS, TYPE_UFS_BUS)
+
 typedef enum UfsRequestState {
 UFS_REQUEST_IDLE = 0,
 UFS_REQUEST_READY = 1,
@@ -29,6 +41,7 @@ typedef enum UfsRequestState {
 typedef enum UfsReqResult {
 UFS_REQUEST_SUCCESS = 0,
 UFS_REQUEST_FAIL = 1,
+UFS_REQUEST_NO_COMPLETE = 2,
 } UfsReqResult;
 
 typedef struct UfsRequest {
@@ -44,6 +57,17 @@ typedef struct UfsRequest {
 QEMUSGList *sg;
 } UfsRequest;
 
+typedef struct UfsLu {
+SCSIDevice qdev;
+uint8_t lun;
+UnitDescriptor unit_desc;
+} UfsLu;
+
+typedef struct UfsWLu {
+SCSIDevice qdev;
+uint8_t lun;
+} UfsWLu;
+
 typedef struct UfsParams {
 char *serial;
 uint8_t nutrs; /* Number of UTP Transfer Request Slots */
@@ -52,12 +76,18 @@ typedef struct UfsParams {
 
 typedef struct UfsHc {
 PCIDevice parent_obj;
+UfsBus bus;
 MemoryRegion iomem;
 UfsReg reg;
 UfsParams params;
 uint32_t reg_size;
 UfsRequest *req_list;
 
+UfsLu *lus[UFS_MAX_LUS];
+UfsWLu *report_wlu;
+UfsWLu *dev_wlu;
+UfsWLu *boot_wlu;
+UfsWLu *rpmb_wlu;
 DeviceDescriptor device_desc;
 GeometryDescriptor geometry_desc;
 Attributes attributes;
@@ -71,6 +101,12 @@ typedef struct UfsHc {
 #define TYPE_UFS "ufs"
 #define UFS(obj) OBJECT_CHECK(UfsHc, (obj), TYPE_UFS)
 
+#define TYPE_UFS_LU "ufs-lu"
+#define UFSLU(obj) OBJECT_CHECK(UfsLu, (obj), TYPE_UFS_LU)
+
+#define TYPE_UFS_WLU "ufs-wlu"
+#define UFSWLU(obj) OBJECT_CHECK(UfsWLu, (obj), TYPE_UFS_WLU)
+
 typedef enum UfsQueryFlagPerm {
 UFS_QUERY_FLAG_NONE = 0x0,
 UFS_QUERY_FLAG_READ = 0x1,
@@ -85,4 +121,11 @@ typedef enum UfsQueryAttrPerm {
 UFS_QUERY_ATTR_WRITE = 0x2,
 } UfsQueryAttrPerm;
 
+static inline bool is_wlun(uint8_t lun)
+{
+return (lun == UFS_UPIU_REPORT_LUNS_WLUN ||
+lun == UFS_UPIU_UFS_DEVICE_WLUN || lun == UFS_UPIU_BOOT_WLUN ||
+lun == UFS_UPIU_RPMB_WLUN);
+}
+
 #endif /* HW_UFS_UFS_H */
diff --git a/include/scsi/constants.h b/include/scsi/constants.h
index 6a8bad556a..9b98451912 100644
--- a/include/scsi/constants.h
+++ b/include/scsi/constants.h
@@ -231,6 +231,7 @@
 #define MODE_PAGE_FLEXIBLE_DISK_GEOMETRY  0x05
 #define MODE_PAGE_CACHING 0x08
 #define MODE_PAGE_AUDIO_CTL   0x0e
+#define MODE_PAGE_CONTROL 0x0a
 #define MODE_PAGE_POWER   0x1a
 #define MODE_PAGE_FAULT_FAIL  0x1c
 #define MODE_PAGE_TO_PROTECT  0x1d
diff --git a/hw/ufs/lu.c b/hw/ufs/lu.c
new file mode 100644
index 00..e1c46bddb1
--- /dev/null
+++ b/hw/ufs/lu.c
@@ -0,0 +1,1445 @@
+/*
+ * QEMU UFS Logical Unit
+ *
+ * Copyright (c) 2023 Samsung Electronics Co., Ltd. All rights reserved.
+ *
+ * Written by Jeuk Kim 
+ *
+ * This code is licensed under the GNU GPL v2 or later.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu/memalign.h"
+#include "hw/scsi/scsi.h"
+#include "scsi/constants.h"
+#include "sysemu/block-backend.h"
+#include "qemu/cutils.h"
+#include "trace.h"
+#include "ufs.h"
+
+/*
+ * The code below handling SCSI commands is copied from hw/scsi/scsi-disk.c,
+ * with minor adjustments to make it work for UFS.
+ */
+
+#define SCSI_DMA_BUF_SIZE (128 * KiB)
+#define SCSI_MAX_INQUIRY_LEN 256
+#define SCSI_INQUIRY_DATA_SIZE 36
+#define SCSI_MAX_MODE_LEN 256
+
+typedef struct UfsSCSIReq {
+SCSIRequest req;
+/* Both sector and sector_count are in terms of BDRV_SECTOR_SIZE bytes.  */
+uint64_t sector;
+uint32_t sector_count;
+uint32_t buflen;
+bool started;
+bool need_fua_emulation;
+struct iovec iov;
+QEMUIOVector qiov;
+BlockAcctCookie acct;
+} UfsSCSIReq;
+
+static void ufs_scsi_free_request(SCSIRequest *req)
+{
+UfsSCSIReq *r =

Re: [PATCH v1 0/7] Validate and test qapi examples

2023-09-07 Thread Victor Toso

Hi,

Thanks for the quick review Daniel!

On Wed, Sep 06, 2023 at 10:17:04AM +0100, Daniel P. Berrangé wrote:
> On Tue, Sep 05, 2023 at 09:48:39PM +0200, Victor Toso wrote:
> > Hi,
> > 
> > This is a follow up from the RFC sent in the end of 08-2022:
> > https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04525.html
> > 
> > The generator code was rebased, without conflicts. The commit log was
> > improved as per Markus suggestion [0], although I'm sure it can be
> > improved further.
> > 
> > To clarify, consuming the Examples as data for testing the qapi-go
> > work has been very very helpful. I'm positive it can be of use for other
> > bindings in the future, besides keeping the examples functional
> > 
> > Cheers,
> > 
> > [0] https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04682.html
> > 
> > Victor Toso (7):
> >   qapi: scripts: add a generator for qapi's examples
> >   qapi: fix example of get-win32-socket command
> >   qapi: fix example of dumpdtb command
> >   qapi: fix example of cancel-vcpu-dirty-limit command
> >   qapi: fix example of set-vcpu-dirty-limit command
> >   qapi: fix example of calc-dirty-rate command
> >   qapi: fix example of NETDEV_STREAM_CONNECTED event
> > 
> >  qapi/machine.json|   2 +-
> >  qapi/migration.json  |   6 +-
> >  qapi/misc.json   |   2 +-
> >  qapi/net.json|   6 +-
> >  scripts/qapi/dumpexamples.py | 194 +++
> >  scripts/qapi/main.py |   2 +
> >  6 files changed, 204 insertions(+), 8 deletions(-)
> >  create mode 100644 scripts/qapi/dumpexamples.py
> 
> After applying this series, aside from the extra broken examples
> mentioned in my patch 1 comments, I also see a test suite failure
> during build

My bad.

> FAILED: tests/qapi-builtin-types.c tests/qapi-builtin-types.h 
> tests/qapi-builtin-visit.c tests/qapi-builtin-visit.h 
> tests/test-qapi-commands-sub-sub-module.c 
> tests/test-qapi-commands-sub-sub-module.h tests/test-qapi-commands.c 
> tests/test-qapi-commands.h tests/test-qapi-emit-events.c 
> tests/test-qapi-emit-events.h tests/test-qapi-events-sub-sub-module.c 
> tests/test-qapi-events-sub-sub-module.h tests/test-qapi-events.c 
> tests/test-qapi-events.h tests/test-qapi-init-commands.c 
> tests/test-qapi-init-commands.h tests/test-qapi-introspect.c 
> tests/test-qapi-introspect.h tests/test-qapi-types-sub-sub-module.c 
> tests/test-qapi-types-sub-sub-module.h tests/test-qapi-types.c 
> tests/test-qapi-types.h tests/test-qapi-visit-sub-sub-module.c 
> tests/test-qapi-visit-sub-sub-module.h tests/test-qapi-visit.c 
> tests/test-qapi-visit.h 
> /home/berrange/src/virt/qemu/build/pyvenv/bin/python3 
> /home/berrange/src/virt/qemu/scripts/qapi-gen.py -o 
> /home/berrange/src/virt/qemu/build/tests -b -p test- 
> ../tests/qapi-schema/qapi-schema-test.json --suppress-tracing
> Traceback (most recent call last):
>   File "/home/berrange/src/virt/qemu/scripts/qapi-gen.py", line 19, in 
> 
> sys.exit(main.main())
>  ^^^
>   File "/home/berrange/src/virt/qemu/scripts/qapi/main.py", line 96, in main
> generate(args.schema,
>   File "/home/berrange/src/virt/qemu/scripts/qapi/main.py", line 58, in 
> generate
> gen_examples(schema, output_dir, prefix)
>   File "/home/berrange/src/virt/qemu/scripts/qapi/dumpexamples.py", line 40, 
> in gen_examples
> schema.visit(vis)
>   File "/home/berrange/src/virt/qemu/scripts/qapi/schema.py", line 1227, in 
> visit
> mod.visit(visitor)
>   File "/home/berrange/src/virt/qemu/scripts/qapi/schema.py", line 209, in 
> visit
> entity.visit(visitor)
>   File "/home/berrange/src/virt/qemu/scripts/qapi/schema.py", line 857, in 
> visit
> visitor.visit_command(
>   File "/home/berrange/src/virt/qemu/scripts/qapi/dumpexamples.py", line 184, 
> in visit_command
> parse_examples_of(self, name)
>   File "/home/berrange/src/virt/qemu/scripts/qapi/dumpexamples.py", line 118, 
> in parse_examples_of
> assert((obj.doc is not None))
> ^^^
> AssertionError
> ninja: build stopped: subcommand failed.
> 
> not sure if that's related to the examples that still need fixing or not ?

This happens because the script is being fed entities that have no
documentation. In general, asserting should be the right approach
because we don't want API without docs, but this failure comes
from the test schema. That is, adding the following diff:

diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py
index c14ed11774..a961c0575d 100644
--- a/scripts/qapi/dumpexamples.py
+++ b/scripts/qapi/dumpexamples.py
@@ -115,6 +115,10 @@ def parse_examples_of(self: QAPISchemaGenExamplesVisitor,

 assert(name in self.schema._entity_dict)
 obj = self.schema._entity_dict[name]
+if obj.doc is None:
+print(f"{name} does not have documentation")
+return
+
 assert((obj.doc is not None))
 module_name = obj._module.name

gives:

user-def-cmd0 does not have

[PULL 3/5] hw/ufs: Support for Query Transfer Requests

2023-09-07 Thread Stefan Hajnoczi

From: Jeuk Kim 

This commit makes the UFS device support query
and nop out transfer requests.

The next patch would be support for UFS logical
unit and scsi command transfer request.

Signed-off-by: Jeuk Kim 
Reviewed-by: Stefan Hajnoczi 
Message-id: 
ff7a5f0fd26761936a553ffb89d3df0ba62844e9.1693980783.git.jeuk20@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 hw/ufs/ufs.h|  46 +++
 hw/ufs/ufs.c| 988 +++-
 hw/ufs/trace-events |   1 +
 3 files changed, 1033 insertions(+), 2 deletions(-)

diff --git a/hw/ufs/ufs.h b/hw/ufs/ufs.h
index d9d195caec..3d1b2cff4e 100644
--- a/hw/ufs/ufs.h
+++ b/hw/ufs/ufs.h
@@ -18,6 +18,32 @@
 #define UFS_MAX_LUS 32
 #define UFS_BLOCK_SIZE 4096
 
+typedef enum UfsRequestState {
+UFS_REQUEST_IDLE = 0,
+UFS_REQUEST_READY = 1,
+UFS_REQUEST_RUNNING = 2,
+UFS_REQUEST_COMPLETE = 3,
+UFS_REQUEST_ERROR = 4,
+} UfsRequestState;
+
+typedef enum UfsReqResult {
+UFS_REQUEST_SUCCESS = 0,
+UFS_REQUEST_FAIL = 1,
+} UfsReqResult;
+
+typedef struct UfsRequest {
+struct UfsHc *hc;
+UfsRequestState state;
+int slot;
+
+UtpTransferReqDesc utrd;
+UtpUpiuReq req_upiu;
+UtpUpiuRsp rsp_upiu;
+
+/* for scsi command */
+QEMUSGList *sg;
+} UfsRequest;
+
 typedef struct UfsParams {
 char *serial;
 uint8_t nutrs; /* Number of UTP Transfer Request Slots */
@@ -30,6 +56,12 @@ typedef struct UfsHc {
 UfsReg reg;
 UfsParams params;
 uint32_t reg_size;
+UfsRequest *req_list;
+
+DeviceDescriptor device_desc;
+GeometryDescriptor geometry_desc;
+Attributes attributes;
+Flags flags;
 
 qemu_irq irq;
 QEMUBH *doorbell_bh;
@@ -39,4 +71,18 @@ typedef struct UfsHc {
 #define TYPE_UFS "ufs"
 #define UFS(obj) OBJECT_CHECK(UfsHc, (obj), TYPE_UFS)
 
+typedef enum UfsQueryFlagPerm {
+UFS_QUERY_FLAG_NONE = 0x0,
+UFS_QUERY_FLAG_READ = 0x1,
+UFS_QUERY_FLAG_SET = 0x2,
+UFS_QUERY_FLAG_CLEAR = 0x4,
+UFS_QUERY_FLAG_TOGGLE = 0x8,
+} UfsQueryFlagPerm;
+
+typedef enum UfsQueryAttrPerm {
+UFS_QUERY_ATTR_NONE = 0x0,
+UFS_QUERY_ATTR_READ = 0x1,
+UFS_QUERY_ATTR_WRITE = 0x2,
+} UfsQueryAttrPerm;
+
 #endif /* HW_UFS_UFS_H */
diff --git a/hw/ufs/ufs.c b/hw/ufs/ufs.c
index df87f2a6d5..56a8ec286b 100644
--- a/hw/ufs/ufs.c
+++ b/hw/ufs/ufs.c
@@ -15,10 +15,221 @@
 #include "ufs.h"
 
 /* The QEMU-UFS device follows spec version 3.1 */
-#define UFS_SPEC_VER 0x0310
+#define UFS_SPEC_VER 0x0310
 #define UFS_MAX_NUTRS 32
 #define UFS_MAX_NUTMRS 8
 
+static MemTxResult ufs_addr_read(UfsHc *u, hwaddr addr, void *buf, int size)
+{
+hwaddr hi = addr + size - 1;
+
+if (hi < addr) {
+return MEMTX_DECODE_ERROR;
+}
+
+if (!FIELD_EX32(u->reg.cap, CAP, 64AS) && (hi >> 32)) {
+return MEMTX_DECODE_ERROR;
+}
+
+return pci_dma_read(PCI_DEVICE(u), addr, buf, size);
+}
+
+static MemTxResult ufs_addr_write(UfsHc *u, hwaddr addr, const void *buf,
+  int size)
+{
+hwaddr hi = addr + size - 1;
+if (hi < addr) {
+return MEMTX_DECODE_ERROR;
+}
+
+if (!FIELD_EX32(u->reg.cap, CAP, 64AS) && (hi >> 32)) {
+return MEMTX_DECODE_ERROR;
+}
+
+return pci_dma_write(PCI_DEVICE(u), addr, buf, size);
+}
+
+static void ufs_complete_req(UfsRequest *req, UfsReqResult req_result);
+
+static inline hwaddr ufs_get_utrd_addr(UfsHc *u, uint32_t slot)
+{
+hwaddr utrl_base_addr = (((hwaddr)u->reg.utrlbau) << 32) + u->reg.utrlba;
+hwaddr utrd_addr = utrl_base_addr + slot * sizeof(UtpTransferReqDesc);
+
+return utrd_addr;
+}
+
+static inline hwaddr ufs_get_req_upiu_base_addr(const UtpTransferReqDesc *utrd)
+{
+uint32_t cmd_desc_base_addr_lo =
+le32_to_cpu(utrd->command_desc_base_addr_lo);
+uint32_t cmd_desc_base_addr_hi =
+le32_to_cpu(utrd->command_desc_base_addr_hi);
+
+return (((hwaddr)cmd_desc_base_addr_hi) << 32) + cmd_desc_base_addr_lo;
+}
+
+static inline hwaddr ufs_get_rsp_upiu_base_addr(const UtpTransferReqDesc *utrd)
+{
+hwaddr req_upiu_base_addr = ufs_get_req_upiu_base_addr(utrd);
+uint32_t rsp_upiu_byte_off =
+le16_to_cpu(utrd->response_upiu_offset) * sizeof(uint32_t);
+return req_upiu_base_addr + rsp_upiu_byte_off;
+}
+
+static MemTxResult ufs_dma_read_utrd(UfsRequest *req)
+{
+UfsHc *u = req->hc;
+hwaddr utrd_addr = ufs_get_utrd_addr(u, req->slot);
+MemTxResult ret;
+
+ret = ufs_addr_read(u, utrd_addr, &req->utrd, sizeof(req->utrd));
+if (ret) {
+trace_ufs_err_dma_read_utrd(req->slot, utrd_addr);
+}
+return ret;
+}
+
+static MemTxResult ufs_dma_read_req_upiu(UfsRequest *req)
+{
+UfsHc *u = req->hc;
+hwaddr req_upiu_base_addr = ufs_get_req_upiu_base_addr(&req->utrd);
+UtpUpiuReq *req_upiu = &req->req_upiu;
+uint32_t copy_size;
+uint16_t data_segment_length;
+MemTxResult ret;
+
+/*
+ * To know the size of the req_upiu, we need to read the
+ * data_segment_length

[PULL 5/5] tests/qtest: Introduce tests for UFS

2023-09-07 Thread Stefan Hajnoczi

From: Jeuk Kim 

This patch includes the following tests
  Test mmio read
  Test ufs device initialization and ufs-lu recognition
  Test I/O (Performs a write followed by a read to verify)

Signed-off-by: Jeuk Kim 
Acked-by: Thomas Huth 
Reviewed-by: Stefan Hajnoczi 
Message-id: 
9e9207f54505e9ba30931849f949ff6f474ac333.1693980783.git.jeuk20@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 MAINTAINERS |   1 +
 tests/qtest/ufs-test.c  | 587 
 tests/qtest/meson.build |   1 +
 3 files changed, 589 insertions(+)
 create mode 100644 tests/qtest/ufs-test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 3ac4ac6219..bf2366815b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2253,6 +2253,7 @@ M: Jeuk Kim 
 S: Supported
 F: hw/ufs/*
 F: include/block/ufs.h
+F: tests/qtest/ufs-test.c
 
 megasas
 M: Hannes Reinecke 
diff --git a/tests/qtest/ufs-test.c b/tests/qtest/ufs-test.c
new file mode 100644
index 00..ed3dbca154
--- /dev/null
+++ b/tests/qtest/ufs-test.c
@@ -0,0 +1,587 @@
+/*
+ * QTest testcase for UFS
+ *
+ * Copyright (c) 2023 Samsung Electronics Co., Ltd. All rights reserved.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/module.h"
+#include "qemu/units.h"
+#include "libqtest.h"
+#include "libqos/qgraph.h"
+#include "libqos/pci.h"
+#include "scsi/constants.h"
+#include "include/block/ufs.h"
+
+/* Test images sizes in Bytes */
+#define TEST_IMAGE_SIZE (64 * 1024 * 1024)
+/* Timeout for various operations, in seconds. */
+#define TIMEOUT_SECONDS 10
+/* Maximum PRD entry count */
+#define MAX_PRD_ENTRY_COUNT 10
+#define PRD_ENTRY_DATA_SIZE 4096
+/* Constants to build upiu */
+#define UTP_COMMAND_DESCRIPTOR_SIZE 4096
+#define UTP_RESPONSE_UPIU_OFFSET 1024
+#define UTP_PRDT_UPIU_OFFSET 2048
+
+typedef struct QUfs QUfs;
+
+struct QUfs {
+QOSGraphObject obj;
+QPCIDevice dev;
+QPCIBar bar;
+
+uint64_t utrlba;
+uint64_t utmrlba;
+uint64_t cmd_desc_addr;
+uint64_t data_buffer_addr;
+
+bool enabled;
+};
+
+static inline uint32_t ufs_rreg(QUfs *ufs, size_t offset)
+{
+return qpci_io_readl(&ufs->dev, ufs->bar, offset);
+}
+
+static inline void ufs_wreg(QUfs *ufs, size_t offset, uint32_t value)
+{
+qpci_io_writel(&ufs->dev, ufs->bar, offset, value);
+}
+
+static void ufs_wait_for_irq(QUfs *ufs)
+{
+uint64_t end_time;
+uint32_t is;
+/* Wait for device to reset as the linux driver does. */
+end_time = g_get_monotonic_time() + TIMEOUT_SECONDS * G_TIME_SPAN_SECOND;
+do {
+qtest_clock_step(ufs->dev.bus->qts, 100);
+is = ufs_rreg(ufs, A_IS);
+} while (is == 0 && g_get_monotonic_time() < end_time);
+}
+
+static UtpTransferReqDesc ufs_build_req_utrd(uint64_t cmd_desc_addr,
+ uint8_t slot,
+ uint32_t data_direction,
+ uint16_t prd_table_length)
+{
+UtpTransferReqDesc req = { 0 };
+uint64_t command_desc_base_addr =
+cmd_desc_addr + slot * UTP_COMMAND_DESCRIPTOR_SIZE;
+
+req.header.dword_0 =
+cpu_to_le32(1 << 28 | data_direction | UFS_UTP_REQ_DESC_INT_CMD);
+req.header.dword_2 = cpu_to_le32(UFS_OCS_INVALID_COMMAND_STATUS);
+
+req.command_desc_base_addr_hi = cpu_to_le32(command_desc_base_addr >> 32);
+req.command_desc_base_addr_lo =
+cpu_to_le32(command_desc_base_addr & 0xffffffff);
+req.response_upiu_offset =
+cpu_to_le16(UTP_RESPONSE_UPIU_OFFSET / sizeof(uint32_t));
+req.response_upiu_length = cpu_to_le16(sizeof(UtpUpiuRsp));
+req.prd_table_offset = cpu_to_le16(UTP_PRDT_UPIU_OFFSET / sizeof(uint32_t));
+req.prd_table_length = cpu_to_le16(prd_table_length);
+return req;
+}
+
+static void ufs_send_nop_out(QUfs *ufs, uint8_t slot,
+ UtpTransferReqDesc *utrd_out, UtpUpiuRsp *rsp_out)
+{
+/* Build up utp transfer request descriptor */
+UtpTransferReqDesc utrd = ufs_build_req_utrd(ufs->cmd_desc_addr, slot,
+ UFS_UTP_NO_DATA_TRANSFER, 0);
+uint64_t utrd_addr = ufs->utrlba + slot * sizeof(UtpTransferReqDesc);
+uint64_t req_upiu_addr =
+ufs->cmd_desc_addr + slot * UTP_COMMAND_DESCRIPTOR_SIZE;
+uint64_t rsp_upiu_addr = req_upiu_addr + UTP_RESPONSE_UPIU_OFFSET;
+qtest_memwrite(ufs->dev.bus->qts, utrd_addr, &utrd, sizeof(utrd));
+
+/* Build up request upiu */
+UtpUpiuReq req_upiu = { 0 };
+req_upiu.header.trans_type = UFS_UPIU_TRANSACTION_NOP_OUT;
+req_upiu.header.task_tag = slot;
+qtest_memwrite(ufs->dev.bus->qts, req_upiu_addr, &req_upiu,
+   sizeof(req_upiu));
+
+/* Ring Doorbell */
+ufs_wreg(ufs, A_UTRLDBR, 1);
+ufs_wait_for_irq(ufs);
+g_assert_true(FIELD_EX32(ufs_rreg(ufs, A_IS), IS, UTRCS));
+ufs_wreg(ufs, A_IS, FIELD_DP32(0, IS, UTRCS, 1));
+
+qtest_memread(ufs->dev.bus->qts, utrd_addr, utrd_out,

[PULL 0/5] Block patches

2023-09-07 Thread Stefan Hajnoczi

The following changes since commit 03a3a62fbd0aa5227e978eef3c67d3978aec9e5f:

  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2023-09-07 10:29:06 -0400)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 631c872614aca91eaf947c1748f0f27f99635d92:

  tests/qtest: Introduce tests for UFS (2023-09-07 14:01:29 -0400)


Pull request

- Jeuk Kim's emulated UFS device
- Fabiano Rosas' IOThread GSource "name" debugging aid



Fabiano Rosas (1):
  iothread: Set the GSource "name" field

Jeuk Kim (4):
  hw/ufs: Initial commit for emulated Universal-Flash-Storage
  hw/ufs: Support for Query Transfer Requests
  hw/ufs: Support for UFS logical unit
  tests/qtest: Introduce tests for UFS

 MAINTAINERS  |7 +
 docs/specs/pci-ids.rst   |2 +
 meson.build  |1 +
 hw/ufs/trace.h   |1 +
 hw/ufs/ufs.h |  131 
 include/block/ufs.h  | 1090 +++
 include/hw/pci/pci.h |1 +
 include/hw/pci/pci_ids.h |1 +
 include/scsi/constants.h |1 +
 hw/ufs/lu.c  | 1445 
 hw/ufs/ufs.c | 1502 ++
 iothread.c   |   14 +-
 tests/qtest/ufs-test.c   |  587 +++
 hw/Kconfig   |1 +
 hw/meson.build   |1 +
 hw/ufs/Kconfig   |4 +
 hw/ufs/meson.build   |1 +
 hw/ufs/trace-events  |   58 ++
 tests/qtest/meson.build  |1 +
 19 files changed, 4843 insertions(+), 6 deletions(-)
 create mode 100644 hw/ufs/trace.h
 create mode 100644 hw/ufs/ufs.h
 create mode 100644 include/block/ufs.h
 create mode 100644 hw/ufs/lu.c
 create mode 100644 hw/ufs/ufs.c
 create mode 100644 tests/qtest/ufs-test.c
 create mode 100644 hw/ufs/Kconfig
 create mode 100644 hw/ufs/meson.build
 create mode 100644 hw/ufs/trace-events

-- 
2.41.0

[PULL 2/5] hw/ufs: Initial commit for emulated Universal-Flash-Storage

2023-09-07 Thread Stefan Hajnoczi

From: Jeuk Kim 

Universal Flash Storage (UFS) is a high-performance mass storage device
with a serial interface. It is primarily used as a high-performance
data storage device for embedded applications.

This commit contains code for UFS device to be recognized
as a UFS PCI device.
Patches to handle UFS logical unit and Transfer Request will follow.

Signed-off-by: Jeuk Kim 
Reviewed-by: Stefan Hajnoczi 
Message-id: 10232660d462ee5cd10cf673f1a9a1205fc8276c.1693980783.git.jeuk20@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 MAINTAINERS  |6 +
 docs/specs/pci-ids.rst   |2 +
 meson.build  |1 +
 hw/ufs/trace.h   |1 +
 hw/ufs/ufs.h |   42 ++
 include/block/ufs.h  | 1090 ++
 include/hw/pci/pci.h |1 +
 include/hw/pci/pci_ids.h |1 +
 hw/ufs/ufs.c |  278 ++
 hw/Kconfig   |1 +
 hw/meson.build   |1 +
 hw/ufs/Kconfig   |4 +
 hw/ufs/meson.build   |1 +
 hw/ufs/trace-events  |   32 ++
 14 files changed, 1461 insertions(+)
 create mode 100644 hw/ufs/trace.h
 create mode 100644 hw/ufs/ufs.h
 create mode 100644 include/block/ufs.h
 create mode 100644 hw/ufs/ufs.c
 create mode 100644 hw/ufs/Kconfig
 create mode 100644 hw/ufs/meson.build
 create mode 100644 hw/ufs/trace-events

diff --git a/MAINTAINERS b/MAINTAINERS
index b471973e1e..3ac4ac6219 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2248,6 +2248,12 @@ F: tests/qtest/nvme-test.c
 F: docs/system/devices/nvme.rst
 T: git git://git.infradead.org/qemu-nvme.git nvme-next
 
+ufs
+M: Jeuk Kim 
+S: Supported
+F: hw/ufs/*
+F: include/block/ufs.h
+
 megasas
 M: Hannes Reinecke 
 L: qemu-bl...@nongnu.org
diff --git a/docs/specs/pci-ids.rst b/docs/specs/pci-ids.rst
index e302bea484..d6707fa069 100644
--- a/docs/specs/pci-ids.rst
+++ b/docs/specs/pci-ids.rst
@@ -92,6 +92,8 @@ PCI devices (other than virtio):
   PCI PVPanic device (``-device pvpanic-pci``)
 1b36:0012
   PCI ACPI ERST device (``-device acpi-erst``)
+1b36:0013
+  PCI UFS device (``-device ufs``)
 
 All these devices are documented in :doc:`index`.
 
diff --git a/meson.build b/meson.build
index bf9831c715..0e31bdfabf 100644
--- a/meson.build
+++ b/meson.build
@@ -3287,6 +3287,7 @@ if have_system
 'hw/ssi',
 'hw/timer',
 'hw/tpm',
+'hw/ufs',
 'hw/usb',
 'hw/vfio',
 'hw/virtio',
diff --git a/hw/ufs/trace.h b/hw/ufs/trace.h
new file mode 100644
index 00..2dbd6397c3
--- /dev/null
+++ b/hw/ufs/trace.h
@@ -0,0 +1 @@
+#include "trace/trace-hw_ufs.h"
diff --git a/hw/ufs/ufs.h b/hw/ufs/ufs.h
new file mode 100644
index 00..d9d195caec
--- /dev/null
+++ b/hw/ufs/ufs.h
@@ -0,0 +1,42 @@
+/*
+ * QEMU UFS
+ *
+ * Copyright (c) 2023 Samsung Electronics Co., Ltd. All rights reserved.
+ *
+ * Written by Jeuk Kim 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HW_UFS_UFS_H
+#define HW_UFS_UFS_H
+
+#include "hw/pci/pci_device.h"
+#include "hw/scsi/scsi.h"
+#include "block/ufs.h"
+
+#define UFS_MAX_LUS 32
+#define UFS_BLOCK_SIZE 4096
+
+typedef struct UfsParams {
+char *serial;
+uint8_t nutrs; /* Number of UTP Transfer Request Slots */
+uint8_t nutmrs; /* Number of UTP Task Management Request Slots */
+} UfsParams;
+
+typedef struct UfsHc {
+PCIDevice parent_obj;
+MemoryRegion iomem;
+UfsReg reg;
+UfsParams params;
+uint32_t reg_size;
+
+qemu_irq irq;
+QEMUBH *doorbell_bh;
+QEMUBH *complete_bh;
+} UfsHc;
+
+#define TYPE_UFS "ufs"
+#define UFS(obj) OBJECT_CHECK(UfsHc, (obj), TYPE_UFS)
+
+#endif /* HW_UFS_UFS_H */
diff --git a/include/block/ufs.h b/include/block/ufs.h
new file mode 100644
index 00..fd884eb8ce
--- /dev/null
+++ b/include/block/ufs.h
@@ -0,0 +1,1090 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef BLOCK_UFS_H
+#define BLOCK_UFS_H
+
+#include "hw/registerfields.h"
+
+typedef struct QEMU_PACKED UfsReg {
+uint32_t cap;
+uint32_t rsvd0;
+uint32_t ver;
+uint32_t rsvd1;
+uint32_t hcpid;
+uint32_t hcmid;
+uint32_t ahit;
+uint32_t rsvd2;
+uint32_t is;
+uint32_t ie;
+uint32_t rsvd3[2];
+uint32_t hcs;
+uint32_t hce;
+uint32_t uecpa;
+uint32_t uecdl;
+uint32_t uecn;
+uint32_t uect;
+uint32_t uecdme;
+uint32_t utriacr;
+uint32_t utrlba;
+uint32_t utrlbau;
+uint32_t utrldbr;
+uint32_t utrlclr;
+uint32_t utrlrsr;
+uint32_t utrlcnr;
+uint32_t rsvd4[2];
+uint32_t utmrlba;
+uint32_t utmrlbau;
+uint32_t utmrldbr;
+uint32_t utmrlclr;
+uint32_t utmrlrsr;
+uint32_t rsvd5[3];
+uint32_t uiccmd;
+uint32_t ucmdarg1;
+uint32_t ucmdarg2;
+uint32_t ucmdarg3;
+uint32_t rsvd6[4];
+uint32_t rsvd7[4];
+uint32_t rsvd8[16];
+uint32_t ccap;
+} UfsReg;
+
+REG32(CAP, offsetof(UfsReg, cap))
+FIELD(CAP, NUTRS, 0, 5)
+FIELD(CAP, RTT, 8, 8)
+FIELD(CAP, NUTMRS, 16, 3)
+FIELD(CAP,

Re: [PULL for-6.2 0/7] Ide patches

2023-09-07 Thread Michael Tokarev

07.09.2023 19:54, John Snow wrote:
..

 > 
 >
 > Niklas Cassel (7):
 >    hw/ide/core: set ERR_STAT in unsupported command completion
 >    hw/ide/ahci: write D2H FIS when processing NCQ command
 >    hw/ide/ahci: simplify and document PxCI handling
 >    hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is cleared
 >    hw/ide/ahci: PxCI should not get cleared when ERR_STAT is set
 >    hw/ide/ahci: fix ahci_write_fis_sdb()
 >    hw/ide/ahci: fix broken SError handling

>> Is anything here stable-worthy?
>
> Yes, assuming it doesn't break anything.

Hmm. I was thinking maybe one or two of the above.
Are you suggesting the *whole* lot?

> I can't give IDE the testing it deserves anymore, but I trust Niklas. I
> don't have good test suites for *inside* linux/windows guests so I am
> admittedly relying on qtests and for people to bark if something regressed.
>
> I'd say to tentatively add them to your list and if we find regressions
> during this window, we can exclude them from a stable release.

Yeah, sure, that's okay.

Thank you!

/mjt

[PULL 1/5] iothread: Set the GSource "name" field

2023-09-07 Thread Stefan Hajnoczi

From: Fabiano Rosas 

Having a name in the source helps with debugging core dumps when one
might not have access to TLS data to cross-reference AioContexts with
their addresses.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20230905180359.14083-1-faro...@suse.de
Signed-off-by: Stefan Hajnoczi 
---
 iothread.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/iothread.c b/iothread.c
index b41c305bd9..b753286414 100644
--- a/iothread.c
+++ b/iothread.c
@@ -138,12 +138,14 @@ static void iothread_instance_finalize(Object *obj)
 qemu_sem_destroy(&iothread->init_done_sem);
 }
 
-static void iothread_init_gcontext(IOThread *iothread)
+static void iothread_init_gcontext(IOThread *iothread, const char *thread_name)
 {
 GSource *source;
+g_autofree char *name = g_strdup_printf("%s aio-context", thread_name);
 
 iothread->worker_context = g_main_context_new();
 source = aio_get_g_source(iothread_get_aio_context(iothread));
+g_source_set_name(source, name);
 g_source_attach(source, iothread->worker_context);
 g_source_unref(source);
 iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
@@ -180,7 +182,7 @@ static void iothread_init(EventLoopBase *base, Error **errp)
 {
 Error *local_error = NULL;
 IOThread *iothread = IOTHREAD(base);
-char *thread_name;
+g_autofree char *thread_name = NULL;
 
 iothread->stopping = false;
 iothread->running = true;
@@ -189,11 +191,14 @@ static void iothread_init(EventLoopBase *base, Error **errp)
 return;
 }
 
+thread_name = g_strdup_printf("IO %s",
+object_get_canonical_path_component(OBJECT(base)));
+
 /*
  * Init one GMainContext for the iothread unconditionally, even if
  * it's not used
  */
-iothread_init_gcontext(iothread);
+iothread_init_gcontext(iothread, thread_name);
 
 iothread_set_aio_context_params(base, &local_error);
 if (local_error) {
@@ -206,11 +211,8 @@ static void iothread_init(EventLoopBase *base, Error **errp)
 /* This assumes we are called from a thread with useful CPU affinity for us
  * to inherit.
  */
-thread_name = g_strdup_printf("IO %s",
-object_get_canonical_path_component(OBJECT(base)));
 qemu_thread_create(&iothread->thread, thread_name, iothread_run,
iothread, QEMU_THREAD_JOINABLE);
-g_free(thread_name);
 
 /* Wait for initialization to complete */
 while (iothread->thread_id == -1) {
-- 
2.41.0

Re: [PATCH v10 0/4] hw/ufs: Add Universal Flash Storage (UFS) support

2023-09-07 Thread Stefan Hajnoczi

On Wed, 6 Sept 2023 at 03:45, Jeuk Kim  wrote:
>
> Since v9:
> - Added the "UFS_" prefix to all define and enum defined in block/ufs.h.
> This fixes
> https://gitlab.com/qemu-project/qemu/-/jobs/4977255992
> which is a win32 build error.
>
> - Fixed not to use pointer type casting (uint32_t * -> unsigned long *).
> It causes the bug in the find_first_bit() function on big endian host pc.
> This fixes
> https://gitlab.com/qemu-project/qemu/-/jobs/4977256030
> which is qos-test failure on s390x hosts.
>
> Please let me know if there are any problems.
> Thank you very much!

Applied, thanks!

https://gitlab.com/stefanha/qemu/-/commits/block

Stefan

>
> Jeuk
>
> Jeuk Kim (4):
>   hw/ufs: Initial commit for emulated Universal-Flash-Storage
>   hw/ufs: Support for Query Transfer Requests
>   hw/ufs: Support for UFS logical unit
>   tests/qtest: Introduce tests for UFS
>
>  MAINTAINERS  |7 +
>  docs/specs/pci-ids.rst   |2 +
>  hw/Kconfig   |1 +
>  hw/meson.build   |1 +
>  hw/ufs/Kconfig   |4 +
>  hw/ufs/lu.c  | 1445 
>  hw/ufs/meson.build   |1 +
>  hw/ufs/trace-events  |   58 ++
>  hw/ufs/trace.h   |1 +
>  hw/ufs/ufs.c | 1502 ++
>  hw/ufs/ufs.h |  131 
>  include/block/ufs.h  | 1090 +++
>  include/hw/pci/pci.h |1 +
>  include/hw/pci/pci_ids.h |1 +
>  include/scsi/constants.h |1 +
>  meson.build  |1 +
>  tests/qtest/meson.build  |1 +
>  tests/qtest/ufs-test.c   |  587 +++
>  18 files changed, 4835 insertions(+)
>  create mode 100644 hw/ufs/Kconfig
>  create mode 100644 hw/ufs/lu.c
>  create mode 100644 hw/ufs/meson.build
>  create mode 100644 hw/ufs/trace-events
>  create mode 100644 hw/ufs/trace.h
>  create mode 100644 hw/ufs/ufs.c
>  create mode 100644 hw/ufs/ufs.h
>  create mode 100644 include/block/ufs.h
>  create mode 100644 tests/qtest/ufs-test.c
>
> --
> 2.34.1
>
>

Re: [PATCH RESEND v5 03/57] target/loongarch: Use gen_helper_gvec_4_ptr for 4OP + env vector instructions

2023-09-07 Thread Richard Henderson


On 9/7/23 01:31, Song Gao wrote:

+static bool gen__ptr_vl(DisasContext *ctx, arg_ *a, uint32_t oprsz,
+gen_helper_gvec_4_ptr *fn)
+{
+tcg_gen_gvec_4_ptr(vec_full_offset(a->vd),
+   vec_full_offset(a->vj),
+   vec_full_offset(a->vk),
+   vec_full_offset(a->va),
+   cpu_env,
+   oprsz, ctx->vl / 8, oprsz, fn);

  ^

This next to last argument is 'data', which is unused for this case.
Just use 0 here.

Otherwise,
Reviewed-by: Richard Henderson 


r~

Re: [PATCH RESEND v5 02/57] target/loongarch: Implement gvec_*_vl functions

2023-09-07 Thread Richard Henderson


On 9/7/23 01:31, Song Gao wrote:

Using gvec_*_vl functions hides oprsz. We can use gvec_v* for oprsz 16.
and gvec_v* for oprsz 32.

Signed-off-by: Song Gao
---
  target/loongarch/insn_trans/trans_vec.c.inc | 68 +
  1 file changed, 44 insertions(+), 24 deletions(-)


The description above is not quite right.  How about:

  Create gvec_*_vl functions in order to hide oprsz.
  This is used by gvec_v* functions for oprsz 16,
  and will be used by gvec_x* functions for oprsz 32.

The code is correct.

Reviewed-by: Richard Henderson 


r~

Re: [PULL for-6.2 0/7] Ide patches

2023-09-07 Thread John Snow

On Thu, Sep 7, 2023, 12:49 PM Michael Tokarev  wrote:

> 07.09.2023 06:42, John Snow wrote:
>
> > 
> > IDE Pull request
> >
> > 
> >
> > Niklas Cassel (7):
> >hw/ide/core: set ERR_STAT in unsupported command completion
> >hw/ide/ahci: write D2H FIS when processing NCQ command
> >hw/ide/ahci: simplify and document PxCI handling
> >hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is cleared
> >hw/ide/ahci: PxCI should not get cleared when ERR_STAT is set
> >hw/ide/ahci: fix ahci_write_fis_sdb()
> >hw/ide/ahci: fix broken SError handling
>
> Is anything here stable-worthy?
>
> /mjt
>

Yes, assuming it doesn't break anything.

I can't give IDE the testing it deserves anymore, but I trust Niklas. I
don't have good test suites for *inside* linux/windows guests so I am
admittedly relying on qtests and for people to bark if something regressed.

I'd say to tentatively add them to your list and if we find regressions
during this window, we can exclude them from a stable release.

>

Re: [PULL for-6.2 0/7] Ide patches

2023-09-07 Thread Michael Tokarev


07.09.2023 06:42, John Snow wrote:



IDE Pull request



Niklas Cassel (7):
   hw/ide/core: set ERR_STAT in unsupported command completion
   hw/ide/ahci: write D2H FIS when processing NCQ command
   hw/ide/ahci: simplify and document PxCI handling
   hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is cleared
   hw/ide/ahci: PxCI should not get cleared when ERR_STAT is set
   hw/ide/ahci: fix ahci_write_fis_sdb()
   hw/ide/ahci: fix broken SError handling


Is anything here stable-worthy?

/mjt

Re: [PATCH RESEND v5 01/57] target/loongarch: Renamed lsx.c to vec .c

2023-09-07 Thread Richard Henderson


On 9/7/23 01:31, Song Gao wrote:

Renamed lsx_helper.c to vec_helper.c and trans_lsx.c.inc to trans_vec.c.inc
So LASX can used them.

Signed-off-by: Song Gao
---
  target/loongarch/translate.c| 2 +-
  target/loongarch/{lsx_helper.c => vec_helper.c} | 2 +-
  .../loongarch/insn_trans/{trans_lsx.c.inc => trans_vec.c.inc}   | 2 +-
  target/loongarch/meson.build| 2 +-
  4 files changed, 4 insertions(+), 4 deletions(-)
  rename target/loongarch/{lsx_helper.c => vec_helper.c} (99%)
  rename target/loongarch/insn_trans/{trans_lsx.c.inc => trans_vec.c.inc} (99%)


Reviewed-by: Richard Henderson 


r~

Re: [RFC PATCH] softmmu: Fix async_run_on_cpu() use in tcg_commit_cpu()

2023-09-07 Thread Richard Henderson


On 9/7/23 09:14, Philippe Mathieu-Daudé wrote:

CPUState::halt_cond is an accelerator specific pointer, used
in particular by TCG (which tcg_commit() is about).
The pointer is set by the AccelOpsClass::create_vcpu_thread()
handler.
AccelOpsClass::create_vcpu_thread() is called by the generic
qemu_init_vcpu(), which expect the accelerator handler to
eventually call cpu_thread_signal_created() which is protected
with a QemuCond. It is safer to check the vCPU is created with
this field rather than the 'halt_cond' pointer set in
create_vcpu_thread() before the vCPU thread is initialized.

This avoids calling tcg_commit() until all CPUs are realized.

Here we can see for a machine with N CPUs, tcg_commit()
is called N times before the 'machine_creation_done' event:

   (lldb) settings set -- target.run-args  "-M" "virt" "-smp" "512" "-display" "none"
   (lldb) breakpoint set --name qemu_machine_creation_done --one-shot true
   (lldb) breakpoint set --name tcg_commit_cpu --auto-continue true
   (lldb) run
   Process 84089 launched: 'qemu-system-aarch64' (arm64)
   Process 84089 stopped
   * thread #1, queue = 'com.apple.main-thread', stop reason = one-shot breakpoint 2
   (lldb) breakpoint list --brief
   Current breakpoints:
   2: name = 'tcg_commit_cpu', locations = 2, resolved = 2, hit count = 512 Options: enabled auto-continue
  ^^^^^



Of course the function is called 512 times: you asked for 512 cpus, and each has its own 
address space which needs initializing.


If you skip the call before cpu->created, when exactly are you going to do it?


r~

Re: [PATCH v3 30/32] hw/arm/sbsa-ref: Check CPU type in machine_run_board_init()

2023-09-07 Thread Leif Lindholm

On Thu, Sep 07, 2023 at 10:35:51 +1000, Gavin Shan wrote:
> Set mc->valid_cpu_types so that the user specified CPU type can
> be validated in machine_run_board_init(). We needn't to do it
> by ourselves.
> 
> Signed-off-by: Gavin Shan 

Reviewed-by: Leif Lindholm 

> ---
>  hw/arm/sbsa-ref.c | 21 +++--
>  1 file changed, 3 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
> index bc89eb4806..f24be53ea2 100644
> --- a/hw/arm/sbsa-ref.c
> +++ b/hw/arm/sbsa-ref.c
> @@ -149,26 +149,15 @@ static const int sbsa_ref_irqmap[] = {
>  [SBSA_GWDT_WS0] = 16,
>  };
>  
> -static const char * const valid_cpus[] = {
> +static const char * const valid_cpu_types[] = {
>  ARM_CPU_TYPE_NAME("cortex-a57"),
>  ARM_CPU_TYPE_NAME("cortex-a72"),
>  ARM_CPU_TYPE_NAME("neoverse-n1"),
>  ARM_CPU_TYPE_NAME("neoverse-v1"),
>  ARM_CPU_TYPE_NAME("max"),
> +NULL,
>  };
>  
> -static bool cpu_type_valid(const char *cpu)
> -{
> -int i;
> -
> -for (i = 0; i < ARRAY_SIZE(valid_cpus); i++) {
> -if (strcmp(cpu, valid_cpus[i]) == 0) {
> -return true;
> -}
> -}
> -return false;
> -}
> -
>  static uint64_t sbsa_ref_cpu_mp_affinity(SBSAMachineState *sms, int idx)
>  {
>  uint8_t clustersz = ARM_DEFAULT_CPUS_PER_CLUSTER;
> @@ -730,11 +719,6 @@ static void sbsa_ref_init(MachineState *machine)
>  const CPUArchIdList *possible_cpus;
>  int n, sbsa_max_cpus;
>  
> -if (!cpu_type_valid(machine->cpu_type)) {
> -error_report("sbsa-ref: CPU type %s not supported", machine->cpu_type);
> -exit(1);
> -}
> -
>  if (kvm_enabled()) {
>  error_report("sbsa-ref: KVM is not supported for this machine");
>  exit(1);
> @@ -899,6 +883,7 @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
>  mc->init = sbsa_ref_init;
>  mc->desc = "QEMU 'SBSA Reference' ARM Virtual Machine";
>  mc->default_cpu_type = ARM_CPU_TYPE_NAME("neoverse-n1");
> +mc->valid_cpu_types = valid_cpu_types;
>  mc->max_cpus = 512;
>  mc->pci_allow_0_address = true;
>  mc->minimum_page_bits = 12;
> -- 
> 2.41.0
>
