[PATCH] tcg: drop unused tcg_temp_free define

2023-10-14 Thread Mike Frysinger
Use of the API was removed a while back, but the define wasn't.

Signed-off-by: Mike Frysinger 
---
 include/tcg/tcg-op.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index 80cfcf8104b6..3ead59e4594d 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -52,7 +52,6 @@ static inline void tcg_gen_insn_start(target_ulong pc, 
target_ulong a1,
 typedef TCGv_i32 TCGv;
 #define tcg_temp_new() tcg_temp_new_i32()
 #define tcg_global_mem_new tcg_global_mem_new_i32
-#define tcg_temp_free tcg_temp_free_i32
 #define tcgv_tl_temp tcgv_i32_temp
 #define tcg_gen_qemu_ld_tl tcg_gen_qemu_ld_i32
 #define tcg_gen_qemu_st_tl tcg_gen_qemu_st_i32
@@ -60,7 +59,6 @@ typedef TCGv_i32 TCGv;
 typedef TCGv_i64 TCGv;
 #define tcg_temp_new() tcg_temp_new_i64()
 #define tcg_global_mem_new tcg_global_mem_new_i64
-#define tcg_temp_free tcg_temp_free_i64
 #define tcgv_tl_temp tcgv_i64_temp
 #define tcg_gen_qemu_ld_tl tcg_gen_qemu_ld_i64
 #define tcg_gen_qemu_st_tl tcg_gen_qemu_st_i64
-- 
2.39.0




Re: [PATCH v2 2/3] target/riscv: Initialize gdb_core_xml_file only once

2023-10-14 Thread Akihiko Odaki

On 2023/10/15 3:19, Daniel Henrique Barboza wrote:



On 10/14/23 00:35, Akihiko Odaki wrote:

gdb_core_xml_file was assigned each time a CPU is instantiated before
this change.

Signed-off-by: Akihiko Odaki 
---
  target/riscv/cpu.c | 5 +
  target/riscv/tcg/tcg-cpu.c | 4 
  2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index ac4a6c7eec..a811215150 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1575,6 +1575,11 @@ static void riscv_cpu_class_init(ObjectClass 
*c, void *data)

  cc->get_pc = riscv_cpu_get_pc;
  cc->gdb_read_register = riscv_cpu_gdb_read_register;
  cc->gdb_write_register = riscv_cpu_gdb_write_register;
+#ifdef TARGET_RISCV64
+    cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
+#elif defined(TARGET_RISCV32)
+    cc->gdb_core_xml_file = "riscv-32bit-cpu.xml";
+#endif
  cc->gdb_num_core_regs = 33;
  cc->gdb_stop_before_watchpoint = true;
  cc->disas_set_info = riscv_cpu_disas_set_info;
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index e0cbc56320..626fb2acea 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -150,8 +150,6 @@ static void 
riscv_cpu_validate_misa_priv(CPURISCVState *env, Error **errp)

  static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, Error **errp)
  {
-    RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(cpu);
-    CPUClass *cc = CPU_CLASS(mcc);
  CPURISCVState *env = >env;
  /* Validate that MISA_MXL is set properly. */
@@ -159,11 +157,9 @@ static void riscv_cpu_validate_misa_mxl(RISCVCPU 
*cpu, Error **errp)

  #ifdef TARGET_RISCV64
  case MXL_RV64:
  case MXL_RV128:
-    cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
  break;
  #elif defined(TARGET_RISCV32)
  case MXL_RV32:
-    cc->gdb_core_xml_file = "riscv-32bit-cpu.xml";
  break;



Hmmm the issue here is that, in patch 1, you added an "elif 
defined(TARGET_RISCV32)"
based on an assumption that you changed here since there's no more 
gdb_core files being

set.


It's opposite. Patch 1 added "elif defined(TARGET_RISCV32)" because 
configs/targets/riscv64-linux-user.mak and 
configs/targets/riscv64-softmmu.mak do not list riscv-32bit-cpu.xml and 
referencing it from TARGET_RISCV64 results in an error.


Since patch 1 now ensures TARGET_RISCV64 will never have MXL_RV32 as 
misa_mxl_max, patch 2 can safely assume TARGET_RISCV64 always has 
riscv-64bit-cpu.xml as gdb_core_xml_file.




My suggestion is to use patch 1 from v1, where you removed the 
misa_mxl_max == misa_mxl
check at the end of this function. And then in this patch you can remove 
this function
altogether since you're assigning gdb_core in riscv_cpu_class_init() and 
the function will

be left doing nothing of note.


Assigning gdb_core_xml_file is more like a side-effect of 
riscv_cpu_validate_misa_mxl(), and the main purpose of this function is 
to validate misa_mxl[_max]. I think it's still a good idea to validate 
misa_mxl_max in particular. Specifying MXL_RV32 as misa_mxl_max for 
TARGET_RISCV64 or specifying MXL_RV64/MXL_RV128 for TARGET_RISCV32 will 
not work because of the incompatible gdb_core_xml_file (and probably 
other reasons).




Re: [RFC] virtio: enforce link up

2023-10-14 Thread Vincent Jardin
On 10/14/23 18:37, Michael S. Tsirkin wrote:
> On Sat, Oct 14, 2023 at 06:22:34PM +0200, Vincent Jardin wrote:
>> Using interface's settings, let's enforce an always on link up.
>>
>> Signed-off-by: Vincent Jardin 
> 
> What is going on here? Just don't set it down.

The purpose is to have a stable vLink for the VMs that don't support 
such sysctl arp_evict_nocarrier:
https://patchwork.kernel.org/project/netdevbpf/patch/20211101173630.300969-2-prest...@gmail.com/

We are facing some users of vSwitches that use vhost-user and that 
disconnect and reconnect during some operations. For most of the VMs on 
their deployments with such vSwitches, those VMs' vLink should not flap.

For those VMs, the flaps are critical and they can lead to some 
convergence issues.

If this capability is not at the virtio-net level, should it be at 
qemu's net_vhost_user_event() ?
For instance, from 
https://github.com/qemu/qemu/blob/63011373ad22c794a013da69663c03f1297a5c56/net/vhost-user.c#L266
 
?

best regards,
   Vincent


> 
>> ---
>>   hw/net/virtio-net.c| 8 
>>   include/hw/virtio/virtio-net.h | 2 ++
>>   2 files changed, 10 insertions(+)
>>
>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>> index 29e33ea5ed..e731b4fdea 100644
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -78,6 +78,9 @@
>>  tso/gso/gro 'off'. */
>>   #define VIRTIO_NET_RSC_DEFAULT_INTERVAL 30
>>   
>> +/* force always link up */
>> +#define VIRTIO_NET_LINK_UP false
>> +
>>   #define VIRTIO_NET_RSS_SUPPORTED_HASHES (VIRTIO_NET_RSS_HASH_TYPE_IPv4 | \
>>VIRTIO_NET_RSS_HASH_TYPE_TCPv4 | \
>>VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | \
>> @@ -447,6 +450,9 @@ static void virtio_net_set_link_status(NetClientState 
>> *nc)
>>   else
>>   n->status |= VIRTIO_NET_S_LINK_UP;
>>   
>> +if (n->net_conf.link_up)
>> +n->status |= VIRTIO_NET_S_LINK_UP;
>> +
>>   if (n->status != old_status)
>>   virtio_notify_config(vdev);
>>   
>> @@ -3947,6 +3953,8 @@ static Property virtio_net_properties[] = {
>> VIRTIO_NET_F_GUEST_USO6, true),
>>   DEFINE_PROP_BIT64("host_uso", VirtIONet, host_features,
>> VIRTIO_NET_F_HOST_USO, true),
>> +DEFINE_PROP_BOOL("link_up", VirtIONet, net_conf.link_up,
>> +   VIRTIO_NET_LINK_UP),
>>   DEFINE_PROP_END_OF_LIST(),
>>   };
>>   
>> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
>> index 55977f01f0..385bebab34 100644
>> --- a/include/hw/virtio/virtio-net.h
>> +++ b/include/hw/virtio/virtio-net.h
>> @@ -56,6 +56,7 @@ typedef struct virtio_net_conf
>>   char *duplex_str;
>>   uint8_t duplex;
>>   char *primary_id_str;
>> +bool link_up; /* if set enforce link up, never down */
>>   } virtio_net_conf;
>>   
>>   /* Coalesced packets type & status */
>> @@ -180,6 +181,7 @@ struct VirtIONet {
>>   size_t guest_hdr_len;
>>   uint64_t host_features;
>>   uint32_t rsc_timeout;
>> +uint32_t link_up; /* if set enforce link up, never down */
>>   uint8_t rsc4_enabled;
>>   uint8_t rsc6_enabled;
>>   uint8_t has_ufo;
>> -- 
>> 2.34.1
> 



Re: [PATCH 08/12] hw/hppa: Require at least SeaBIOS-hppa version 10

2023-10-14 Thread BALATON Zoltan

On Sat, 14 Oct 2023, del...@kernel.org wrote:

From: Helge Deller 

The new SeaBIOS-hppa version 10 includes initial support
for PA2.0 CPUs.


Patch does a few misc things not mentioned in commit message. Would those 
need to be split of in separate patch or mentioned here?


Regards,
BALATON Zoltan


Signed-off-by: Helge Deller 
---
hw/hppa/machine.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index cf28cb9586..c6d8deffcf 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -1,6 +1,8 @@
/*
 * QEMU HPPA hardware system emulator.
- * Copyright 2018 Helge Deller 
+ * (C) Copyright 2018-2023 Helge Deller 
+ *
+ * This work is licensed under the GNU GPL license version 2 or later.
 */

#include "qemu/osdep.h"
@@ -29,7 +31,7 @@
#include "net/net.h"
#include "qemu/log.h"

-#define MIN_SEABIOS_HPPA_VERSION 6 /* require at least this fw version */
+#define MIN_SEABIOS_HPPA_VERSION 10 /* require at least this fw version */

#define HPA_POWER_BUTTON (FIRMWARE_END - 0x10)

@@ -95,9 +97,7 @@ static ISABus *hppa_isa_bus(void)

isa_bus = isa_bus_new(NULL, get_system_memory(), isa_region,
  _abort);
-isa_irqs = i8259_init(isa_bus,
-  /* qemu_allocate_irq(dino_set_isa_irq, s, 0)); */
-  NULL);
+isa_irqs = i8259_init(isa_bus, NULL);
isa_bus_register_input_irqs(isa_bus, isa_irqs);

return isa_bus;





Re: [PATCH 06/12] tulip: Use the HP PCI vendor ID instead of number

2023-10-14 Thread BALATON Zoltan

On Sat, 14 Oct 2023, del...@kernel.org wrote:

From: Helge Deller 

Signed-off-by: Helge Deller 
---
hw/net/tulip.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/tulip.c b/hw/net/tulip.c
index 915e5fb595..11d866e431 100644
--- a/hw/net/tulip.c
+++ b/hw/net/tulip.c
@@ -1020,7 +1020,7 @@ static void tulip_class_init(ObjectClass *klass, void 
*data)
k->exit = pci_tulip_exit;
k->vendor_id = PCI_VENDOR_ID_DEC;
k->device_id = PCI_DEVICE_ID_DEC_21143;
-k->subsystem_vendor_id = 0x103c;
+k->subsystem_vendor_id = PCI_VENDOR_ID_HP;


Seems to be first use of value introduced in patch 2. I don't know if this 
is worrh two separate patches. I'd just squash patch 2 in this patch but 
not sure what others or maintainer prefers.


Regards,
BALATON Zoltan


k->subsystem_id = 0x104f;
k->class_id = PCI_CLASS_NETWORK_ETHERNET;
dc->vmsd = _pci_tulip;





Re: [PATCH 05/12] lasips2: LASI PS/2 devices are not user-createable

2023-10-14 Thread BALATON Zoltan

On Sat, 14 Oct 2023, del...@kernel.org wrote:

From: Helge Deller 

Those PS/2 ports are created with the LASI controller when
a 32-bit PA-RISC machine is created.

Mark them not user-createable to avoid showing them in
the qemu device list.

Signed-off-by: Helge Deller 
Cc: qemu-sta...@nongnu.org
---
hw/input/lasips2.c | 4 
1 file changed, 4 insertions(+)

diff --git a/hw/input/lasips2.c b/hw/input/lasips2.c
index ea7c07a2ba..93c9c887d3 100644
--- a/hw/input/lasips2.c
+++ b/hw/input/lasips2.c
@@ -351,6 +351,8 @@ static void lasips2_port_class_init(ObjectClass *klass, 
void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);

+/* Lasi devices can not be created by users */


That's what the next line says so this comment does not add any info. It 
should instead explain why, such as "part of LASI" or something like that.


Regards,
BALATON Zoltan


+dc->user_creatable = false;
dc->realize = lasips2_port_realize;
}

@@ -397,6 +399,8 @@ static void lasips2_kbd_port_class_init(ObjectClass *klass, 
void *data)
DeviceClass *dc = DEVICE_CLASS(klass);
LASIPS2PortDeviceClass *lpdc = LASIPS2_PORT_CLASS(klass);

+/* Lasi devices can not be created by users */
+dc->user_creatable = false;
device_class_set_parent_realize(dc, lasips2_kbd_port_realize,
>parent_realize);
}





Re: -drive if=none: can't we make this the default?

2023-10-14 Thread BALATON Zoltan

On Sat, 14 Oct 2023, Michael Tokarev wrote:

Can't we make -drive if=none the default?

Yes, I know current default is ide, and whole world have to use if=none 
explicitly
to undo this.  I think at this point we can deprecate if=ide default and 
switch to

if=none in the next release.  I think it will be a welcome change.


I don't think that would be welcome by all people now using -drive 
media=disk,format-raw,file= shortcut (which they were forced to use 
instead of the previous -hda shortcut just to specify format=raw) and now 
they would need to change that to -drive lot,of,options,here -device 
ide-hd,drive=[what was it called and which drive is that?] instead. That's 
a lot more typing and much less convenient so please keep the convenience 
option at least somewhat convenient for command line users.


Also machine class has a member (possibly) set by board code: 
mc->block_default_type = IF_IDE; Isn't that the default used if no "if" 
property given?


Maybe I don't understand how this works but if it makes less convenient 
for users and breaks their scripts then I don't think it's a good idea.


Regards,
BALATON Zoltan



[PATCH 02/12] pci_ids: Add PCI vendor ID for HP

2023-10-14 Thread deller
From: Helge Deller 

Signed-off-by: Helge Deller 
---
 include/hw/pci/pci_ids.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 85469b9b53..3c0e72df0e 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -171,6 +171,8 @@
 #define PCI_VENDOR_ID_DEC0x1011
 #define PCI_DEVICE_ID_DEC_21143  0x0019
 
+#define PCI_VENDOR_ID_HP 0x103c
+
 #define PCI_VENDOR_ID_CIRRUS 0x1013
 
 #define PCI_VENDOR_ID_IBM0x1014
-- 
2.41.0




[PATCH 00/12] target/hppa: Add emulation of a C3700 HP-PARISC workstation

2023-10-14 Thread deller
From: Helge Deller 

This series adds a new PA-RISC machine emulation for the HP-PARISC
C3700 workstation.

The physical HP C3700 machine has a PA2.0 (64-bit) CPU, in contrast to
the existing emulation of a B160L workstation which is a 32-bit only
machine and where it's Dino PCI controller isn't 64-bit capable.

With the HP C3700 machine emulation (together with the emulated Astro
Memory controller and the Elroy PCI bridge) it's now possible to
enhance the hppa CPU emulation to support the 64-bit instruction set
in upcoming patches.

Please review.

Helge

Helge Deller (12):
  target/hppa: Update to SeaBIOS-hppa version 10
  pci_ids: Add PCI vendor ID for HP
  hw/pci-host: Add Astro system bus adapter found on PA-RISC machines
  MAINTAINERS: Add Astro PCI host for hppa machines
  lasips2: LASI PS/2 devices are not user-createable
  tulip: Use the HP PCI vendor ID instead of number
  pci-host: Wire up new Astro/Elroy PCI bridge
  hw/hppa: Require at least SeaBIOS-hppa version 10
  hw/hppa: Export machine name, BTLBs, power-button address via fw_cfg
  hw/hppa: Provide RTC and DebugOutputPort on CPU #0
  hw/hppa: Split out machine creation
  hw/hppa: Add new HP C3700 machine

 MAINTAINERS |   5 +-
 hw/hppa/Kconfig |   1 +
 hw/hppa/hppa_hardware.h |   1 -
 hw/hppa/machine.c   | 364 +++
 hw/input/lasips2.c  |   4 +
 hw/net/tulip.c  |   2 +-
 hw/pci-host/Kconfig |   4 +
 hw/pci-host/astro.c | 876 
 hw/pci-host/meson.build |   1 +
 hw/pci-host/trace-events|  11 +
 include/hw/pci-host/astro.h |  92 
 include/hw/pci/pci_ids.h|   2 +
 pc-bios/hppa-firmware.img   | Bin 732376 -> 755480 bytes
 roms/seabios-hppa   |   2 +-
 14 files changed, 1284 insertions(+), 81 deletions(-)
 create mode 100644 hw/pci-host/astro.c
 create mode 100644 include/hw/pci-host/astro.h

-- 
2.41.0




[PATCH 04/12] MAINTAINERS: Add Astro PCI host for hppa machines

2023-10-14 Thread deller
From: Helge Deller 

Signed-off-by: Helge Deller 
---
 MAINTAINERS | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index ceea4c2bf2..68d086a0f3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1174,7 +1174,7 @@ F: hw/*/etraxfs_*.c
 
 HP-PARISC Machines
 --
-HP B160L
+HP B160L, HP C3700
 M: Richard Henderson 
 R: Helge Deller 
 S: Odd Fixes
@@ -1182,11 +1182,14 @@ F: configs/devices/hppa-softmmu/default.mak
 F: hw/hppa/
 F: hw/net/*i82596*
 F: hw/misc/lasi.c
+F: hw/pci-host/astro.c
 F: hw/pci-host/dino.c
 F: include/hw/misc/lasi.h
 F: include/hw/net/lasi_82596.h
+F: include/hw/pci-host/astro.h
 F: include/hw/pci-host/dino.h
 F: pc-bios/hppa-firmware.img
+F: roms/seabios-hppa/
 
 LoongArch Machines
 --
-- 
2.41.0




[PATCH 11/12] hw/hppa: Split out machine creation

2023-10-14 Thread deller
From: Helge Deller 

This is a preparation patch to allow the creation of additional
hppa machine.

It splits out the creation of the machine into a
- machine_HP_common_init_cpus(), and a
- machine_HP_common_init_tail()
function.

This will allow to reuse the basic functions which are common to
all parisc machines.

Signed-off-by: Helge Deller 
---
 hw/hppa/machine.c | 170 ++
 1 file changed, 98 insertions(+), 72 deletions(-)

diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index 299b5bc95c..71087a3f2f 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -23,6 +23,7 @@
 #include "hw/net/lasi_82596.h"
 #include "hw/nmi.h"
 #include "hw/pci/pci.h"
+#include "hw/pci/pci_device.h"
 #include "hw/pci-host/dino.h"
 #include "hw/misc/lasi.h"
 #include "hppa_hardware.h"
@@ -38,6 +39,7 @@
 #define enable_lasi_lan()   0
 
 static PCIBus *pci_bus;
+static DeviceState *lasi_dev;
 
 static void hppa_powerdown_req(Notifier *n, void *opaque)
 {
@@ -251,29 +253,20 @@ static DinoState *dino_init(MemoryRegion *addr_space)
 return DINO_PCI_HOST_BRIDGE(dev);
 }
 
-static void machine_hppa_init(MachineState *machine)
+/*
+ * Step 1: Create CPUs and Memory
+ */
+static void machine_HP_common_init_cpus(MachineState *machine)
 {
-const char *kernel_filename = machine->kernel_filename;
-const char *kernel_cmdline = machine->kernel_cmdline;
-const char *initrd_filename = machine->initrd_filename;
-MachineClass *mc = MACHINE_GET_CLASS(machine);
-DeviceState *dev, *dino_dev, *lasi_dev;
-PCIBus *pci_bus;
-ISABus *isa_bus;
-char *firmware_filename;
-uint64_t firmware_low, firmware_high;
-long size;
-uint64_t kernel_entry = 0, kernel_low, kernel_high;
 MemoryRegion *addr_space = get_system_memory();
-MemoryRegion *rom_region;
 MemoryRegion *cpu_region;
 long i;
 unsigned int smp_cpus = machine->smp.cpus;
-SysBusDevice *s;
+char *name;
 
 /* Create CPUs.  */
 for (i = 0; i < smp_cpus; i++) {
-char *name = g_strdup_printf("cpu%ld-io-eir", i);
+name = g_strdup_printf("cpu%ld-io-eir", i);
 cpu[i] = HPPA_CPU(cpu_create(machine->cpu_type));
 
 cpu_region = g_new(MemoryRegion, 1);
@@ -296,45 +289,27 @@ static void machine_hppa_init(MachineState *machine)
 exit(EXIT_FAILURE);
 }
 memory_region_add_subregion_overlap(addr_space, 0, machine->ram, -1);
+}
 
-
-/* Init Lasi chip */
-lasi_dev = DEVICE(lasi_init());
-memory_region_add_subregion(addr_space, LASI_HPA,
-sysbus_mmio_get_region(
-SYS_BUS_DEVICE(lasi_dev), 0));
-
-/* Init Dino (PCI host bus chip).  */
-dino_dev = DEVICE(dino_init(addr_space));
-memory_region_add_subregion(addr_space, DINO_HPA,
-sysbus_mmio_get_region(
-SYS_BUS_DEVICE(dino_dev), 0));
-pci_bus = PCI_BUS(qdev_get_child_bus(dino_dev, "pci"));
-assert(pci_bus);
-
-/* Create ISA bus. */
-isa_bus = hppa_isa_bus();
-assert(isa_bus);
-
-/* Realtime clock, used by firmware for PDC_TOD call. */
-mc146818_rtc_init(isa_bus, 2000, NULL);
-
-/* Serial ports: Lasi and Dino use a 7.272727 MHz clock. */
-serial_mm_init(addr_space, LASI_UART_HPA + 0x800, 0,
-qdev_get_gpio_in(lasi_dev, LASI_IRQ_UART_HPA), 7272727 / 16,
-serial_hd(0), DEVICE_BIG_ENDIAN);
-
-serial_mm_init(addr_space, DINO_UART_HPA + 0x800, 0,
-qdev_get_gpio_in(dino_dev, DINO_IRQ_RS232INT), 7272727 / 16,
-serial_hd(1), DEVICE_BIG_ENDIAN);
-
-/* Parallel port */
-parallel_mm_init(addr_space, LASI_LPT_HPA + 0x800, 0,
- qdev_get_gpio_in(lasi_dev, LASI_IRQ_LAN_HPA),
- parallel_hds[0]);
-
-/* fw_cfg configuration interface */
-create_fw_cfg(machine);
+/*
+ * Last creation step: Add SCSI discs, NICs, graphics & load firmware
+ */
+static void machine_HP_common_init_tail(MachineState *machine)
+{
+const char *kernel_filename = machine->kernel_filename;
+const char *kernel_cmdline = machine->kernel_cmdline;
+const char *initrd_filename = machine->initrd_filename;
+MachineClass *mc = MACHINE_GET_CLASS(machine);
+DeviceState *dev;
+char *firmware_filename;
+uint64_t firmware_low, firmware_high;
+long size;
+uint64_t kernel_entry = 0, kernel_low, kernel_high;
+MemoryRegion *addr_space = get_system_memory();
+MemoryRegion *rom_region;
+long i;
+unsigned int smp_cpus = machine->smp.cpus;
+SysBusDevice *s;
 
 /* SCSI disk setup. */
 dev = DEVICE(pci_create_simple(pci_bus, -1, "lsi53c895a"));
@@ -362,21 +337,12 @@ static void machine_hppa_init(MachineState *machine)
 }
 }
 
-/* PS/2 Keyboard/Mouse */
-dev = qdev_new(TYPE_LASIPS2);
-sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), _fatal);
-sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0,
-   

[PATCH 05/12] lasips2: LASI PS/2 devices are not user-createable

2023-10-14 Thread deller
From: Helge Deller 

Those PS/2 ports are created with the LASI controller when
a 32-bit PA-RISC machine is created.

Mark them not user-createable to avoid showing them in
the qemu device list.

Signed-off-by: Helge Deller 
Cc: qemu-sta...@nongnu.org
---
 hw/input/lasips2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/input/lasips2.c b/hw/input/lasips2.c
index ea7c07a2ba..93c9c887d3 100644
--- a/hw/input/lasips2.c
+++ b/hw/input/lasips2.c
@@ -351,6 +351,8 @@ static void lasips2_port_class_init(ObjectClass *klass, 
void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
+/* Lasi devices can not be created by users */
+dc->user_creatable = false;
 dc->realize = lasips2_port_realize;
 }
 
@@ -397,6 +399,8 @@ static void lasips2_kbd_port_class_init(ObjectClass *klass, 
void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 LASIPS2PortDeviceClass *lpdc = LASIPS2_PORT_CLASS(klass);
 
+/* Lasi devices can not be created by users */
+dc->user_creatable = false;
 device_class_set_parent_realize(dc, lasips2_kbd_port_realize,
 >parent_realize);
 }
-- 
2.41.0




[PATCH 08/12] hw/hppa: Require at least SeaBIOS-hppa version 10

2023-10-14 Thread deller
From: Helge Deller 

The new SeaBIOS-hppa version 10 includes initial support
for PA2.0 CPUs.

Signed-off-by: Helge Deller 
---
 hw/hppa/machine.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index cf28cb9586..c6d8deffcf 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -1,6 +1,8 @@
 /*
  * QEMU HPPA hardware system emulator.
- * Copyright 2018 Helge Deller 
+ * (C) Copyright 2018-2023 Helge Deller 
+ *
+ * This work is licensed under the GNU GPL license version 2 or later.
  */
 
 #include "qemu/osdep.h"
@@ -29,7 +31,7 @@
 #include "net/net.h"
 #include "qemu/log.h"
 
-#define MIN_SEABIOS_HPPA_VERSION 6 /* require at least this fw version */
+#define MIN_SEABIOS_HPPA_VERSION 10 /* require at least this fw version */
 
 #define HPA_POWER_BUTTON (FIRMWARE_END - 0x10)
 
@@ -95,9 +97,7 @@ static ISABus *hppa_isa_bus(void)
 
 isa_bus = isa_bus_new(NULL, get_system_memory(), isa_region,
   _abort);
-isa_irqs = i8259_init(isa_bus,
-  /* qemu_allocate_irq(dino_set_isa_irq, s, 0)); */
-  NULL);
+isa_irqs = i8259_init(isa_bus, NULL);
 isa_bus_register_input_irqs(isa_bus, isa_irqs);
 
 return isa_bus;
-- 
2.41.0




[PATCH 12/12] hw/hppa: Add new HP C3700 machine

2023-10-14 Thread deller
From: Helge Deller 

Add code to create an emulated C3700 machine.
It includes the following components:
- HP Powerbar SP2 Diva BMC card (serial port only)
- PCI 4x serial card (for serial ports #1-#4)
- USB OHCI controller with USB keyboard and USB mouse

Signed-off-by: Helge Deller 
---
 hw/hppa/machine.c | 100 ++
 1 file changed, 100 insertions(+)

diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index 71087a3f2f..365aa1d508 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -22,8 +22,10 @@
 #include "hw/input/lasips2.h"
 #include "hw/net/lasi_82596.h"
 #include "hw/nmi.h"
+#include "hw/usb.h"
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_device.h"
+#include "hw/pci-host/astro.h"
 #include "hw/pci-host/dino.h"
 #include "hw/misc/lasi.h"
 #include "hppa_hardware.h"
@@ -301,6 +303,7 @@ static void machine_HP_common_init_tail(MachineState 
*machine)
 const char *initrd_filename = machine->initrd_filename;
 MachineClass *mc = MACHINE_GET_CLASS(machine);
 DeviceState *dev;
+PCIDevice *pci_dev;
 char *firmware_filename;
 uint64_t firmware_low, firmware_high;
 long size;
@@ -337,6 +340,36 @@ static void machine_HP_common_init_tail(MachineState 
*machine)
 }
 }
 
+/* BMC board: HP Powerbar SP2 Diva (with console only) */
+pci_dev = pci_new(-1, "pci-serial");
+if (!lasi_dev) {
+/* bind default keyboard/serial to Diva card */
+qdev_prop_set_chr(DEVICE(pci_dev), "chardev", serial_hd(0));
+}
+qdev_prop_set_uint8(DEVICE(pci_dev), "prog_if", 0);
+pci_realize_and_unref(pci_dev, pci_bus, _fatal);
+pci_config_set_vendor_id(pci_dev->config, PCI_VENDOR_ID_HP);
+pci_config_set_device_id(pci_dev->config, 0x1048);
+pci_set_word(_dev->config[PCI_SUBSYSTEM_VENDOR_ID], PCI_VENDOR_ID_HP);
+pci_set_word(_dev->config[PCI_SUBSYSTEM_ID], 0x1227); /* Powerbar */
+
+/* create a second serial PCI card when running Astro */
+if (!lasi_dev) {
+pci_dev = pci_new(-1, "pci-serial-4x");
+qdev_prop_set_chr(DEVICE(pci_dev), "chardev1", serial_hd(1));
+qdev_prop_set_chr(DEVICE(pci_dev), "chardev2", serial_hd(2));
+qdev_prop_set_chr(DEVICE(pci_dev), "chardev3", serial_hd(3));
+qdev_prop_set_chr(DEVICE(pci_dev), "chardev4", serial_hd(4));
+pci_realize_and_unref(pci_dev, pci_bus, _fatal);
+}
+
+/* create USB OHCI controller for USB keyboard & mouse on Astro machines */
+if (!lasi_dev && machine->enable_graphics) {
+pci_create_simple(pci_bus, -1, "pci-ohci");
+usb_create_simple(usb_bus_find(-1), "usb-kbd");
+usb_create_simple(usb_bus_find(-1), "usb-mouse");
+}
+
 /* register power switch emulation */
 qemu_register_powerdown_notifier(_system_powerdown_notifier);
 
@@ -520,6 +553,41 @@ static void machine_HP_B160L_init(MachineState *machine)
 machine_HP_common_init_tail(machine);
 }
 
+static AstroState *astro_init(void)
+{
+DeviceState *dev;
+
+dev = qdev_new(TYPE_ASTRO_CHIP);
+sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), _fatal);
+
+return ASTRO_CHIP(dev);
+}
+
+/*
+ * Create HP C3700 workstation
+ */
+static void machine_HP_C3700_init(MachineState *machine)
+{
+AstroState *astro;
+DeviceState *astro_dev;
+MemoryRegion *addr_space = get_system_memory();
+
+/* Create CPUs and RAM.  */
+machine_HP_common_init_cpus(machine);
+
+/* Init Astro and the Elroys (PCI host bus chips).  */
+astro = astro_init();
+astro_dev = DEVICE(astro);
+memory_region_add_subregion(addr_space, ASTRO_HPA,
+sysbus_mmio_get_region(
+SYS_BUS_DEVICE(astro_dev), 0));
+pci_bus = PCI_BUS(qdev_get_child_bus(DEVICE(astro->elroy[0]), "pci"));
+assert(pci_bus);
+
+/* Add SCSI discs, NICs, graphics & load firmware */
+machine_HP_common_init_tail(machine);
+}
+
 static void hppa_machine_reset(MachineState *ms, ShutdownCause reason)
 {
 unsigned int smp_cpus = ms->smp.cpus;
@@ -599,9 +667,41 @@ static const TypeInfo HP_B160L_machine_init_typeinfo = {
 },
 };
 
+static void HP_C3700_machine_init_class_init(ObjectClass *oc, void *data)
+{
+MachineClass *mc = MACHINE_CLASS(oc);
+NMIClass *nc = NMI_CLASS(oc);
+
+mc->desc = "HP C3700 workstation";
+mc->default_cpu_type = TYPE_HPPA_CPU;
+mc->init = machine_HP_C3700_init;
+mc->reset = hppa_machine_reset;
+mc->block_default_type = IF_SCSI;
+mc->max_cpus = HPPA_MAX_CPUS;
+mc->default_cpus = 1;
+mc->is_default = false;
+mc->default_ram_size = 1024 * MiB;
+mc->default_boot_order = "cd";
+mc->default_ram_id = "ram";
+mc->default_nic = "tulip";
+
+nc->nmi_monitor_handler = hppa_nmi;
+}
+
+static const TypeInfo HP_C3700_machine_init_typeinfo = {
+.name = MACHINE_TYPE_NAME("C3700"),
+.parent = TYPE_MACHINE,
+.class_init = HP_C3700_machine_init_class_init,
+.interfaces = 

[PATCH 09/12] hw/hppa: Export machine name, BTLBs, power-button address via fw_cfg

2023-10-14 Thread deller
From: Helge Deller 

Provide necessary info to SeaBIOS-hppa.

Signed-off-by: Helge Deller 
---
 hw/hppa/machine.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index c6d8deffcf..be0caf4675 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -37,6 +37,7 @@
 
 #define enable_lasi_lan()   0
 
+static PCIBus *pci_bus;
 
 static void hppa_powerdown_req(Notifier *n, void *opaque)
 {
@@ -123,6 +124,8 @@ static FWCfgState *create_fw_cfg(MachineState *ms)
 FWCfgState *fw_cfg;
 uint64_t val;
 const char qemu_version[] = QEMU_VERSION;
+MachineClass *mc = MACHINE_GET_CLASS(ms);
+int len;
 
 fw_cfg = fw_cfg_init_mem(FW_CFG_IO_BASE, FW_CFG_IO_BASE + 4);
 fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, ms->smp.cpus);
@@ -137,8 +140,20 @@ static FWCfgState *create_fw_cfg(MachineState *ms)
 fw_cfg_add_file(fw_cfg, "/etc/cpu/tlb_entries",
 g_memdup(, sizeof(val)), sizeof(val));
 
+val = cpu_to_le64(HPPA_BTLB_ENTRIES);
+fw_cfg_add_file(fw_cfg, "/etc/cpu/btlb_entries",
+g_memdup(, sizeof(val)), sizeof(val));
+
+len = strlen(mc->name) + 1;
+fw_cfg_add_file(fw_cfg, "/etc/hppa/machine",
+g_memdup(mc->name, len), len);
+
 val = cpu_to_le64(HPA_POWER_BUTTON);
-fw_cfg_add_file(fw_cfg, "/etc/power-button-addr",
+fw_cfg_add_file(fw_cfg, "/etc/hppa/power-button-addr",
+g_memdup(, sizeof(val)), sizeof(val));
+
+val = cpu_to_le64(CPU_HPA + 24);
+fw_cfg_add_file(fw_cfg, "/etc/hppa/DebugOutputPort",
 g_memdup(, sizeof(val)), sizeof(val));
 
 fw_cfg_add_i16(fw_cfg, FW_CFG_BOOT_DEVICE, ms->boot_config.order[0]);
@@ -148,6 +163,8 @@ static FWCfgState *create_fw_cfg(MachineState *ms)
 g_memdup(qemu_version, sizeof(qemu_version)),
 sizeof(qemu_version));
 
+fw_cfg_add_extra_pci_roots(pci_bus, fw_cfg);
+
 return fw_cfg;
 }
 
-- 
2.41.0




[PATCH 07/12] pci-host: Wire up new Astro/Elroy PCI bridge

2023-10-14 Thread deller
From: Helge Deller 

Allow the Astro source to be built.

Signed-off-by: Helge Deller 
---
 hw/hppa/Kconfig | 1 +
 hw/pci-host/Kconfig | 4 
 hw/pci-host/meson.build | 1 +
 3 files changed, 6 insertions(+)

diff --git a/hw/hppa/Kconfig b/hw/hppa/Kconfig
index 5dd8b5b21e..ff8528aaa8 100644
--- a/hw/hppa/Kconfig
+++ b/hw/hppa/Kconfig
@@ -3,6 +3,7 @@ config HPPA_B160L
 imply PCI_DEVICES
 imply E1000_PCI
 imply VIRTIO_VGA
+select ASTRO
 select DINO
 select LASI
 select SERIAL
diff --git a/hw/pci-host/Kconfig b/hw/pci-host/Kconfig
index a07070eddf..54a609d2ca 100644
--- a/hw/pci-host/Kconfig
+++ b/hw/pci-host/Kconfig
@@ -82,6 +82,10 @@ config DINO
 bool
 select PCI
 
+config ASTRO
+bool
+select PCI
+
 config GT64120
 bool
 select PCI
diff --git a/hw/pci-host/meson.build b/hw/pci-host/meson.build
index 64eada76fe..f891f026cb 100644
--- a/hw/pci-host/meson.build
+++ b/hw/pci-host/meson.build
@@ -27,6 +27,7 @@ pci_ss.add(when: 'CONFIG_MV64361', if_true: 
files('mv64361.c'))
 pci_ss.add(when: 'CONFIG_VERSATILE_PCI', if_true: files('versatile.c'))
 
 # HPPA devices
+pci_ss.add(when: 'CONFIG_ASTRO', if_true: files('astro.c'))
 pci_ss.add(when: 'CONFIG_DINO', if_true: files('dino.c'))
 
 system_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss)
-- 
2.41.0




[PATCH 06/12] tulip: Use the HP PCI vendor ID instead of number

2023-10-14 Thread deller
From: Helge Deller 

Signed-off-by: Helge Deller 
---
 hw/net/tulip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/tulip.c b/hw/net/tulip.c
index 915e5fb595..11d866e431 100644
--- a/hw/net/tulip.c
+++ b/hw/net/tulip.c
@@ -1020,7 +1020,7 @@ static void tulip_class_init(ObjectClass *klass, void 
*data)
 k->exit = pci_tulip_exit;
 k->vendor_id = PCI_VENDOR_ID_DEC;
 k->device_id = PCI_DEVICE_ID_DEC_21143;
-k->subsystem_vendor_id = 0x103c;
+k->subsystem_vendor_id = PCI_VENDOR_ID_HP;
 k->subsystem_id = 0x104f;
 k->class_id = PCI_CLASS_NETWORK_ETHERNET;
 dc->vmsd = _pci_tulip;
-- 
2.41.0




[PATCH 03/12] hw/pci-host: Add Astro system bus adapter found on PA-RISC machines

2023-10-14 Thread deller
From: Helge Deller 

The 64-bit PA-RISC machines use a Astro system bus adapter (SBA)
with Elroy PCI host chips.
Later generation Astro chips were named Pluto, Ike and REO.

Signed-off-by: Helge Deller 
---
 hw/pci-host/astro.c | 876 
 hw/pci-host/trace-events|  11 +
 include/hw/pci-host/astro.h |  92 
 3 files changed, 979 insertions(+)
 create mode 100644 hw/pci-host/astro.c
 create mode 100644 include/hw/pci-host/astro.h

diff --git a/hw/pci-host/astro.c b/hw/pci-host/astro.c
new file mode 100644
index 00..0adbf470c3
--- /dev/null
+++ b/hw/pci-host/astro.c
@@ -0,0 +1,876 @@
+/*
+ * HP-PARISC Astro/Pluto/Ike/REO system bus adapter (SBA)
+ * with Elroy PCI bus (LBA) adapter emulation
+ * Found in C3000 and similar machines
+ *
+ * (C) 2023 by Helge Deller 
+ *
+ * This work is licensed under the GNU GPL license version 2 or later.
+ *
+ * Chip documentation is available at:
+ * https://parisc.wiki.kernel.org/index.php/Technical_Documentation
+ *
+ * TODO:
+ * - All user-added devices are currently attached to the first
+ *   Elroy (PCI bus) only for now. To fix this additional work in
+ *   SeaBIOS and this driver is needed. See "user_creatable" flag below.
+ * - GMMIO (Greater than 4 GB MMIO) register
+ */
+
+#define TYPE_ASTRO_IOMMU_MEMORY_REGION "astro-iommu-memory-region"
+
+#include "qemu/osdep.h"
+#include "qemu/module.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "hw/irq.h"
+#include "hw/pci/pci_device.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/qdev-properties.h"
+#include "hw/pci-host/astro.h"
+#include "hw/hppa/hppa_hardware.h"
+#include "migration/vmstate.h"
+#include "trace.h"
+#include "qom/object.h"
+
+/*
+ * Helper functions
+ */
+
+static uint64_t mask_32bit_val(hwaddr addr, unsigned size, uint64_t val)
+{
+if (size == 8) {
+return val;
+}
+if (addr & 4) {
+val >>= 32;
+} else {
+val = (uint32_t) val;
+}
+return val;
+}
+
+static void put_val_in_int64(uint64_t *p, hwaddr addr, unsigned size,
+ uint64_t val)
+{
+if (size == 8) {
+*p = val;
+} else if (size == 4) {
+if (addr & 4) {
+*p = ((*p << 32) >> 32) | (val << 32);
+} else {
+*p = ((*p >> 32) << 32) | (uint32_t) val;
+}
+}
+}
+
+static void put_val_in_arrary(uint64_t *array, hwaddr start_addr,
+  hwaddr addr, unsigned size, uint64_t val)
+{
+int index;
+
+index = (addr - start_addr) / 8;
+put_val_in_int64([index], addr, size, val);
+}
+
+
+/*
+ * The Elroy PCI host bridge. We have at least 4 of those under Astro.
+ */
+
+static MemTxResult elroy_chip_read_with_attrs(void *opaque, hwaddr addr,
+ uint64_t *data, unsigned size,
+ MemTxAttrs attrs)
+{
+MemTxResult ret = MEMTX_OK;
+ElroyState *s = opaque;
+uint64_t val = -1;
+int index;
+
+switch ((addr >> 3) << 3) {
+case 0x0008:
+val = 0x605; /* func_class */
+break;
+case 0x0058:
+/*
+ * Scratch register, but firmware initializes it with the
+ * PCI BUS number and Linux/HP-UX uses it then.
+ */
+val = s->pci_bus_num;
+/* Upper byte holds the end of this bus number */
+val |= s->pci_bus_num << 8;
+break;
+case 0x0080:
+val = s->arb_mask; /* set ARB mask */
+break;
+case 0x0108:
+val = s->status_control;
+break;
+case 0x200 ... 0x250 - 1: /* LMMIO, GMMIO, WLMMIO, WGMMIO, ... */
+index = (addr - 0x200) / 8;
+val = s->mmio_base[index];
+break;
+case 0x0680:
+val = s->error_config;
+break;
+case 0x0688:
+val = 0;/* ERROR_STATUS */
+break;
+case 0x0800:/* IOSAPIC_REG_SELECT */
+val = s->iosapic_reg_select;
+break;
+case 0x0808:
+val = UINT64_MAX;/* XXX: tbc. */
+g_assert_not_reached();
+break;
+case 0x0810:/* IOSAPIC_REG_WINDOW */
+switch (s->iosapic_reg_select) {
+case 0x01:  /* IOSAPIC_REG_VERSION */
+val = (32 << 16) | 1; /* upper 16bit holds max entries */
+break;
+default:
+if (s->iosapic_reg_select < ARRAY_SIZE(s->iosapic_reg)) {
+val = s->iosapic_reg[s->iosapic_reg_select];
+} else {
+trace_iosapic_reg_read(s->iosapic_reg_select, size, val);
+g_assert_not_reached();
+}
+}
+trace_iosapic_reg_read(s->iosapic_reg_select, size, val);
+break;
+default:
+trace_elroy_read(addr, size, val);
+g_assert_not_reached();
+}
+trace_elroy_read(addr, size, val);
+
+/* for 32-bit accesses mask return value */
+val = 

[PATCH 10/12] hw/hppa: Provide RTC and DebugOutputPort on CPU #0

2023-10-14 Thread deller
From: Helge Deller 

For SeaBIOS-hppa, the RTC and DebugOutputPort were in the I/O area of
the LASI chip of the emulated B160L machine.
Since we will add other machines without a LASI chip, move the emulated
devices into the I/O area of CPU#0 instead.

Signed-off-by: Helge Deller 
---
 hw/hppa/hppa_hardware.h |  1 -
 hw/hppa/machine.c   | 67 +
 2 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/hw/hppa/hppa_hardware.h b/hw/hppa/hppa_hardware.h
index a5ac3dd0fd..a9be7bb851 100644
--- a/hw/hppa/hppa_hardware.h
+++ b/hw/hppa/hppa_hardware.h
@@ -18,7 +18,6 @@
 #define LASI_UART_HPA   0xffd05000
 #define LASI_SCSI_HPA   0xffd06000
 #define LASI_LAN_HPA0xffd07000
-#define LASI_RTC_HPA0xffd09000
 #define LASI_LPT_HPA0xffd02000
 #define LASI_AUDIO_HPA  0xffd04000
 #define LASI_PS2KBD_HPA 0xffd08000
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index be0caf4675..299b5bc95c 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -104,6 +104,63 @@ static ISABus *hppa_isa_bus(void)
 return isa_bus;
 }
 
+/*
+ * Helper functions to emulate RTC clock and DebugOutputPort
+ */
+static time_t rtc_ref;
+
+static uint64_t io_cpu_read(void *opaque, hwaddr addr, unsigned size)
+{
+uint64_t val = 0;
+
+switch (addr) {
+case 0: /* RTC clock */
+val = time(NULL);
+val += rtc_ref;
+break;
+case 8: /* DebugOutputPort */
+return 0xe9;/* readback */
+}
+return val;
+}
+
+static void io_cpu_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
+unsigned char ch;
+Chardev *debugout;
+
+switch (addr) {
+case 0: /* RTC clock */
+rtc_ref = val - time(NULL);
+break;
+case 8: /* DebugOutputPort */
+ch = val;
+debugout = serial_hd(0);
+if (debugout) {
+qemu_chr_fe_write_all(debugout->be, , 1);
+} else {
+fprintf(stderr, "%c", ch);
+}
+break;
+}
+}
+
+static const MemoryRegionOps hppa_io_helper_ops = {
+.read = io_cpu_read,
+.write = io_cpu_write,
+.endianness = DEVICE_BIG_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 8,
+},
+.impl = {
+.min_access_size = 1,
+.max_access_size = 8,
+},
+};
+
+
 static uint64_t cpu_hppa_to_phys(void *opaque, uint64_t addr)
 {
 addr &= (0x1000 - 1);
@@ -152,6 +209,10 @@ static FWCfgState *create_fw_cfg(MachineState *ms)
 fw_cfg_add_file(fw_cfg, "/etc/hppa/power-button-addr",
 g_memdup(, sizeof(val)), sizeof(val));
 
+val = cpu_to_le64(CPU_HPA + 16);
+fw_cfg_add_file(fw_cfg, "/etc/hppa/rtc-addr",
+g_memdup(, sizeof(val)), sizeof(val));
+
 val = cpu_to_le64(CPU_HPA + 24);
 fw_cfg_add_file(fw_cfg, "/etc/hppa/DebugOutputPort",
 g_memdup(, sizeof(val)), sizeof(val));
@@ -223,6 +284,12 @@ static void machine_hppa_init(MachineState *machine)
 g_free(name);
 }
 
+/* RTC and DebugOutputPort on CPU #0 */
+cpu_region = g_new(MemoryRegion, 1);
+memory_region_init_io(cpu_region, OBJECT(cpu[0]), _io_helper_ops,
+  cpu[0], "cpu0-io-rtc", 2 * sizeof(uint64_t));
+memory_region_add_subregion(addr_space, CPU_HPA + 16, cpu_region);
+
 /* Main memory region. */
 if (machine->ram_size > 3 * GiB) {
 error_report("RAM size is currently restricted to 3GB");
-- 
2.41.0




Re: [PATCH v2 1/3] via-ide: Fix legacy mode emulation

2023-10-14 Thread BALATON Zoltan

On Sat, 14 Oct 2023, Mark Cave-Ayland wrote:

On 09/10/2023 20:54, BALATON Zoltan wrote:

The initial value for BARs were set in reset method for emulating
legacy mode at start but this does not work because PCI code resets
BARs after calling device reset method. Remove this ineffective
default to avoid confusion.

Instead move setting the BARs to a callback on writing the PCI config
regsiter that sets legacy mode (which firmwares needing this mode seem
to do) and fix their values to program it to use legacy port numbers
in this case. This does not fully emulate what the data sheet says
(which is not very clear on this) but it implements enogh to allow
both modes as used by firmwares of machines we emulate.

Signed-off-by: BALATON Zoltan 
---
  hw/ide/via.c | 41 -
  1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index fff23803a6..43e8af8d69 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -132,11 +132,6 @@ static void via_ide_reset(DeviceState *dev)
  pci_set_word(pci_conf + PCI_STATUS, PCI_STATUS_FAST_BACK |
   PCI_STATUS_DEVSEL_MEDIUM);
  -pci_set_long(pci_conf + PCI_BASE_ADDRESS_0, 0x01f0);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_1, 0x03f4);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_2, 0x0170);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_3, 0x0374);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_4, 0xcc01); /* BMIBA: 
20-23h */

  pci_set_long(pci_conf + PCI_INTERRUPT_LINE, 0x010e);
/* IDE chip enable, IDE configuration 1/2, IDE FIFO Configuration*/
@@ -159,6 +154,41 @@ static void via_ide_reset(DeviceState *dev)
  pci_set_long(pci_conf + 0xc0, 0x00020001);
  }
  +static void via_ide_cfg_write(PCIDevice *pd, uint32_t addr,
+  uint32_t val, int len)
+{
+pci_default_write_config(pd, addr, val, len);
+/*
+ * Bits 0 and 2 of the PCI programming interface register select 
between
+ * legacy and native mode for the two IDE channels. We don't emulate 
this
+ * because we cannot easily switch between ISA and PCI in QEMU so 
instead


As per my previous email, this statement is demonstrably false: this is now 
achievable using the portio_list*() APIs.


+ * when guest selects legacy mode we set the PCI BARs to legacy ports 
which
+ * works the same. We also don't care about setting each channel 
separately
+ * as no guest is known to do or need that. We only do this when BARs 
are
+ * unset when writing this register as logs from real hardware show 
that
+ * setting legacy mode after BARs were set it will still use ports set 
by
+ * BARs not ISA ports (e.g. pegasos2 Linux does this after firmware 
set

+ * native mode and programmed BARs and calls it non-100% native mode).
+ * But if 0x8a is written righr after reset without setting BARs then 
we

+ * want legacy ports (this is done by the AmigaOne firmware).
+ */
+if (addr == PCI_CLASS_PROG && val == 0x8a &&
+pci_get_long(pd->config + PCI_BASE_ADDRESS_0) ==
+PCI_BASE_ADDRESS_SPACE_IO) {
+pci_set_long(pd->config + PCI_BASE_ADDRESS_0, 0x1f0
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_1, 0x3f6
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_2, 0x170
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_3, 0x376
+ | PCI_BASE_ADDRESS_SPACE_IO);
+/* BMIBA: 20-23h */
+pci_set_long(pd->config + PCI_BASE_ADDRESS_4, 0xcc00
+ | PCI_BASE_ADDRESS_SPACE_IO);
+}
+}


Another hint that this is not the right way to be doing this: the values you 
are placing in BARS 1 and 3 are illegal. PCI IO BARs have bit 1 forced to 0 
and bit 0 set to 1 which forces a minimum alignment of 4, so either the 
addresses 0x3f6/0x376 are being rounded internally to 0x3f4/0x374 and/or 
you're lucky that this just happens to work on QEMU.


Using the portio_list*() APIs really is the right way to implement this to 
avoid being affected by such issues.


On second thought I don't think that would work as pegaos2 Linux writes 
0x8a to prog_if but then keeps using the ports as set by BARs so if you 
use ISA ports in this case this will break. I think that proves that real 
chip also uses BARs to emulate legacy mode similar to this approach so 
what we have in this patch is close enough and works well. Your proposed 
alternative is more complex, would not work any better and probably even 
does not work for all cases this works for so I think this is the better 
way now. I've sent a v3 with values changed to match BAR default values 
with 0x3x4 and updating comment and commit message.


Regards,
BALATON Zoltan


  static void via_ide_realize(PCIDevice *dev, Error **errp)
  {
  PCIIDEState *d = PCI_IDE(dev);

[PATCH v3 0/3] Add emulation of AmigaOne XE board

2023-10-14 Thread BALATON Zoltan
Changes in v3:
- Update values, comment and commit message in patch 1 again

Changes in v2:
- Update comment and commit message in patch 1 (Mark)
- Fix irq mapping in patch 2 (Volker)

Regards,
BALATON Zoltan

BALATON Zoltan (3):
  via-ide: Fix legacy mode emulation
  hw/pci-host: Add emulation of Mai Logic Articia S
  hw/ppc: Add emulation of AmigaOne XE board

 MAINTAINERS |   8 +
 configs/devices/ppc-softmmu/default.mak |   1 +
 hw/ide/via.c|  41 +++-
 hw/pci-host/Kconfig |   5 +
 hw/pci-host/articia.c   | 293 
 hw/pci-host/meson.build |   2 +
 hw/ppc/Kconfig  |   7 +
 hw/ppc/amigaone.c   | 164 +
 hw/ppc/meson.build  |   2 +
 include/hw/pci-host/articia.h   |  17 ++
 10 files changed, 535 insertions(+), 5 deletions(-)
 create mode 100644 hw/pci-host/articia.c
 create mode 100644 hw/ppc/amigaone.c
 create mode 100644 include/hw/pci-host/articia.h

-- 
2.30.9




[PATCH v3 2/3] hw/pci-host: Add emulation of Mai Logic Articia S

2023-10-14 Thread BALATON Zoltan
The Articia S is a generic chipset supporting several different CPUs
that were among others used on some PPC boards. This is a minimal
emulation of the parts needed for emulating the AmigaOne board.

Signed-off-by: BALATON Zoltan 
---
 hw/pci-host/Kconfig   |   5 +
 hw/pci-host/articia.c | 293 ++
 hw/pci-host/meson.build   |   2 +
 include/hw/pci-host/articia.h |  17 ++
 4 files changed, 317 insertions(+)
 create mode 100644 hw/pci-host/articia.c
 create mode 100644 include/hw/pci-host/articia.h

diff --git a/hw/pci-host/Kconfig b/hw/pci-host/Kconfig
index a07070eddf..33014c80a4 100644
--- a/hw/pci-host/Kconfig
+++ b/hw/pci-host/Kconfig
@@ -73,6 +73,11 @@ config SH_PCI
 bool
 select PCI
 
+config ARTICIA
+bool
+select PCI
+select I8259
+
 config MV64361
 bool
 select PCI
diff --git a/hw/pci-host/articia.c b/hw/pci-host/articia.c
new file mode 100644
index 00..f3fcc49f81
--- /dev/null
+++ b/hw/pci-host/articia.c
@@ -0,0 +1,293 @@
+/*
+ * Mai Logic Articia S emulation
+ *
+ * Copyright (c) 2023 BALATON Zoltan
+ *
+ * This work is licensed under the GNU GPL license version 2 or later.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "hw/pci/pci_device.h"
+#include "hw/pci/pci_host.h"
+#include "hw/irq.h"
+#include "hw/i2c/bitbang_i2c.h"
+#include "hw/intc/i8259.h"
+#include "hw/pci-host/articia.h"
+
+/*
+ * This is a minimal emulation of this chip as used in AmigaOne board.
+ * Most features are missing but those are not needed by firmware and guests.
+ */
+
+OBJECT_DECLARE_SIMPLE_TYPE(ArticiaState, ARTICIA)
+
+OBJECT_DECLARE_SIMPLE_TYPE(ArticiaHostState, ARTICIA_PCI_HOST)
+struct ArticiaHostState {
+PCIDevice parent_obj;
+
+ArticiaState *as;
+};
+
+/* TYPE_ARTICIA */
+
+struct ArticiaState {
+PCIHostState parent_obj;
+
+qemu_irq irq[PCI_NUM_PINS];
+MemoryRegion io;
+MemoryRegion mem;
+MemoryRegion reg;
+
+bitbang_i2c_interface smbus;
+uint32_t gpio; /* bits 0-7 in, 8-15 out, 16-23 direction (0 in, 1 out) */
+hwaddr gpio_base;
+MemoryRegion gpio_reg;
+};
+
+static uint64_t articia_gpio_read(void *opaque, hwaddr addr, unsigned int size)
+{
+ArticiaState *s = opaque;
+
+return (s->gpio >> (addr * 8)) & 0xff;
+}
+
+static void articia_gpio_write(void *opaque, hwaddr addr, uint64_t val,
+   unsigned int size)
+{
+ArticiaState *s = opaque;
+uint32_t sh = addr * 8;
+
+if (addr == 0) {
+/* in bits read only? */
+return;
+}
+
+if ((s->gpio & (0xff << sh)) != (val & 0xff) << sh) {
+s->gpio &= ~(0xff << sh | 0xff);
+s->gpio |= (val & 0xff) << sh;
+s->gpio |= bitbang_i2c_set(>smbus, BITBANG_I2C_SDA,
+   s->gpio & BIT(16) ?
+   !!(s->gpio & BIT(8)) : 1);
+if ((s->gpio & BIT(17))) {
+s->gpio &= ~BIT(0);
+s->gpio |= bitbang_i2c_set(>smbus, BITBANG_I2C_SCL,
+   !!(s->gpio & BIT(9)));
+}
+}
+}
+
+static const MemoryRegionOps articia_gpio_ops = {
+.read = articia_gpio_read,
+.write = articia_gpio_write,
+.valid.min_access_size = 1,
+.valid.max_access_size = 1,
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static uint64_t articia_reg_read(void *opaque, hwaddr addr, unsigned int size)
+{
+ArticiaState *s = opaque;
+uint64_t ret = UINT_MAX;
+
+switch (addr) {
+case 0xc00cf8:
+ret = pci_host_conf_le_ops.read(PCI_HOST_BRIDGE(s), 0, size);
+break;
+case 0xe00cfc ... 0xe00cff:
+ret = pci_host_data_le_ops.read(PCI_HOST_BRIDGE(s), addr - 0xe00cfc, 
size);
+break;
+case 0xf0:
+ret = pic_read_irq(isa_pic);
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "%s: Unimplemented register read 0x%"
+  HWADDR_PRIx " %d\n", __func__, addr, size);
+break;
+}
+return ret;
+}
+
+static void articia_reg_write(void *opaque, hwaddr addr, uint64_t val,
+  unsigned int size)
+{
+ArticiaState *s = opaque;
+
+switch (addr) {
+case 0xc00cf8:
+pci_host_conf_le_ops.write(PCI_HOST_BRIDGE(s), 0, val, size);
+break;
+case 0xe00cfc ... 0xe00cff:
+pci_host_data_le_ops.write(PCI_HOST_BRIDGE(s), addr, val, size);
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "%s: Unimplemented register write 0x%"
+  HWADDR_PRIx " %d <- %"PRIx64"\n", __func__, addr, size, 
val);
+break;
+}
+}
+
+static const MemoryRegionOps articia_reg_ops = {
+.read = articia_reg_read,
+.write = articia_reg_write,
+.valid.min_access_size = 1,
+.valid.max_access_size = 4,
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static void articia_pcihost_set_irq(void *opaque, int n, int level)
+{
+ArticiaState *s = opaque;
+

[PATCH v3 3/3] hw/ppc: Add emulation of AmigaOne XE board

2023-10-14 Thread BALATON Zoltan
The AmigaOne is a rebranded MAI Teron board that uses U-Boot firmware
with patches to support AmigaOS and is very similar to pegasos2 so can
be easily emulated sharing most code with pegasos2. The reason to
emulate it is that AmigaOS comes in different versions for AmigaOne
and PegasosII which only have drivers for one machine and firmware so
these only run on the specific machine. Adding this board allows
another AmigaOS version to be used reusing already existing peagasos2
emulation. (The AmigaOne was the first of these boards so likely most
widespread which then inspired Pegasos that was later replaced with
PegasosII due to problems with Articia S, so these have a lot of
similarity. Pegasos mainly ran MorphOS while the PegasosII version of
AmigaOS was added later and therefore less common than the AmigaOne
version.)

Signed-off-by: BALATON Zoltan 
---
 MAINTAINERS |   8 ++
 configs/devices/ppc-softmmu/default.mak |   1 +
 hw/ppc/Kconfig  |   7 +
 hw/ppc/amigaone.c   | 164 
 hw/ppc/meson.build  |   2 +
 5 files changed, 182 insertions(+)
 create mode 100644 hw/ppc/amigaone.c

diff --git a/MAINTAINERS b/MAINTAINERS
index ceea4c2bf2..ca405bc1bc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1510,6 +1510,14 @@ F: hw/pci-host/mv64361.c
 F: hw/pci-host/mv643xx.h
 F: include/hw/pci-host/mv64361.h
 
+amigaone
+M: BALATON Zoltan 
+L: qemu-...@nongnu.org
+S: Maintained
+F: hw/ppc/amigaone.c
+F: hw/pci-host/articia.c
+F: include/hw/pci-host/articia.h
+
 Virtual Open Firmware (VOF)
 M: Alexey Kardashevskiy 
 R: David Gibson 
diff --git a/configs/devices/ppc-softmmu/default.mak 
b/configs/devices/ppc-softmmu/default.mak
index a887f5438b..b85fd2bcd7 100644
--- a/configs/devices/ppc-softmmu/default.mak
+++ b/configs/devices/ppc-softmmu/default.mak
@@ -14,6 +14,7 @@ CONFIG_SAM460EX=y
 CONFIG_MAC_OLDWORLD=y
 CONFIG_MAC_NEWWORLD=y
 
+CONFIG_AMIGAONE=y
 CONFIG_PEGASOS2=y
 
 # For PReP
diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index 5dfbf47ef5..56f0475a8e 100644
--- a/hw/ppc/Kconfig
+++ b/hw/ppc/Kconfig
@@ -69,6 +69,13 @@ config SAM460EX
 select USB_OHCI
 select FDT_PPC
 
+config AMIGAONE
+bool
+imply ATI_VGA
+select ARTICIA
+select VT82C686
+select SMBUS_EEPROM
+
 config PEGASOS2
 bool
 imply ATI_VGA
diff --git a/hw/ppc/amigaone.c b/hw/ppc/amigaone.c
new file mode 100644
index 00..3589924c8a
--- /dev/null
+++ b/hw/ppc/amigaone.c
@@ -0,0 +1,164 @@
+/*
+ * QEMU Eyetech AmigaOne/Mai Logic Teron emulation
+ *
+ * Copyright (c) 2023 BALATON Zoltan
+ *
+ * This work is licensed under the GNU GPL license version 2 or later.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/datadir.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/ppc/ppc.h"
+#include "hw/boards.h"
+#include "hw/loader.h"
+#include "hw/pci-host/articia.h"
+#include "hw/isa/vt82c686.h"
+#include "hw/ide/pci.h"
+#include "hw/i2c/smbus_eeprom.h"
+#include "hw/ppc/ppc.h"
+#include "sysemu/reset.h"
+#include "kvm_ppc.h"
+
+#define BUS_FREQ_HZ 1
+
+/*
+ * Firmware binary available at
+ * 
https://www.hyperion-entertainment.com/index.php/downloads?view=files=28
+ * then "tail -c 524288 updater.image >u-boot-amigaone.bin"
+ *
+ * BIOS emulator in firmware cannot run QEMU vgabios and hangs on it, use
+ * -device VGA,romfile=VGABIOS-lgpl-latest.bin
+ * from http://www.nongnu.org/vgabios/ instead.
+ */
+#define PROM_FILENAME "u-boot-amigaone.bin"
+#define PROM_ADDR 0xfff0
+#define PROM_SIZE (512 * KiB)
+
+static void amigaone_cpu_reset(void *opaque)
+{
+PowerPCCPU *cpu = opaque;
+
+cpu_reset(CPU(cpu));
+cpu_ppc_tb_reset(>env);
+}
+
+static void fix_spd_data(uint8_t *spd)
+{
+uint32_t bank_size = 4 * MiB * spd[31];
+uint32_t rows = bank_size / spd[13] / spd[17];
+spd[3] = ctz32(rows) - spd[4];
+}
+
+static void amigaone_init(MachineState *machine)
+{
+PowerPCCPU *cpu;
+CPUPPCState *env;
+MemoryRegion *rom, *pci_mem, *mr;
+const char *fwname = machine->firmware ?: PROM_FILENAME;
+char *filename;
+ssize_t sz;
+PCIBus *pci_bus;
+Object *via;
+DeviceState *dev;
+I2CBus *i2c_bus;
+uint8_t *spd_data;
+int i;
+
+/* init CPU */
+cpu = POWERPC_CPU(cpu_create(machine->cpu_type));
+env = >env;
+if (PPC_INPUT(env) != PPC_FLAGS_INPUT_6xx) {
+error_report("Incompatible CPU, only 6xx bus supported");
+exit(1);
+}
+cpu_ppc_tb_init(env, BUS_FREQ_HZ / 4);
+qemu_register_reset(amigaone_cpu_reset, cpu);
+
+/* RAM */
+if (machine->ram_size > 2 * GiB) {
+error_report("RAM size more than 2 GiB is not supported");
+exit(1);
+}
+memory_region_add_subregion(get_system_memory(), 0, machine->ram);
+if (machine->ram_size < 1 * GiB + 32 * KiB) {
+/* Firmware uses this area for startup */
+mr = 

[PATCH v3 1/3] via-ide: Fix legacy mode emulation

2023-10-14 Thread BALATON Zoltan
The initial value for BARs were set in reset method for emulating
legacy mode at start but this does not work because PCI code resets
BARs after calling device reset method. Remove this ineffective
default to avoid confusion.

Instead move setting the BARs to a callback on writing the PCI config
regsiter that sets legacy mode (which firmwares needing this mode seem
to do). This does not fully emulate what the data sheet says (which is
not very clear on this) but it implements enough to allow both modes
as used by firmwares and guests on machines we emulate.

Signed-off-by: BALATON Zoltan 
---
 hw/ide/via.c | 41 -
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index fff23803a6..ca9a3b8f49 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -132,11 +132,6 @@ static void via_ide_reset(DeviceState *dev)
 pci_set_word(pci_conf + PCI_STATUS, PCI_STATUS_FAST_BACK |
  PCI_STATUS_DEVSEL_MEDIUM);
 
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_0, 0x01f0);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_1, 0x03f4);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_2, 0x0170);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_3, 0x0374);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_4, 0xcc01); /* BMIBA: 20-23h 
*/
 pci_set_long(pci_conf + PCI_INTERRUPT_LINE, 0x010e);
 
 /* IDE chip enable, IDE configuration 1/2, IDE FIFO Configuration*/
@@ -159,6 +154,41 @@ static void via_ide_reset(DeviceState *dev)
 pci_set_long(pci_conf + 0xc0, 0x00020001);
 }
 
+static void via_ide_cfg_write(PCIDevice *pd, uint32_t addr,
+  uint32_t val, int len)
+{
+pci_default_write_config(pd, addr, val, len);
+/*
+ * Bits 0 and 2 of the PCI programming interface register are documented to
+ * select between legacy and native mode for the two IDE channels. when the
+ * guest selects legacy mode we reset the PCI BARs to legacy ports which is
+ * their default value. We don't care about setting each channel separately
+ * as no guest is known to do or need that. Also only do this when BARs are
+ * unset when writing this register as logs from real hardware show that
+ * setting legacy mode after BARs were set will still use ports set by BARs
+ * not ISA ports (e.g. pegasos2 Linux does this after firmware set native
+ * mode and programmed BARs). But if 0x8a is written righr after reset
+ * without setting BARs then we want legacy ports (this is done by the
+ * AmigaOne firmware). We can't set these in via_ide_reset() because PCI
+ * code clears BARs after calling device reset method.
+ */
+if (addr == PCI_CLASS_PROG && val == 0x8a &&
+pci_get_long(pd->config + PCI_BASE_ADDRESS_0) ==
+PCI_BASE_ADDRESS_SPACE_IO) {
+pci_set_long(pd->config + PCI_BASE_ADDRESS_0, 0x1f0
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_1, 0x3f4
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_2, 0x170
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_3, 0x374
+ | PCI_BASE_ADDRESS_SPACE_IO);
+/* BMIBA: 20-23h */
+pci_set_long(pd->config + PCI_BASE_ADDRESS_4, 0xcc00
+ | PCI_BASE_ADDRESS_SPACE_IO);
+}
+}
+
 static void via_ide_realize(PCIDevice *dev, Error **errp)
 {
 PCIIDEState *d = PCI_IDE(dev);
@@ -221,6 +251,7 @@ static void via_ide_class_init(ObjectClass *klass, void 
*data)
 /* Reason: only works as function of VIA southbridge */
 dc->user_creatable = false;
 
+k->config_write = via_ide_cfg_write;
 k->realize = via_ide_realize;
 k->exit = via_ide_exitfn;
 k->vendor_id = PCI_VENDOR_ID_VIA;
-- 
2.30.9




-drive if=none: can't we make this the default?

2023-10-14 Thread Michael Tokarev

Can't we make -drive if=none the default?

Yes, I know current default is ide, and whole world have to use if=none 
explicitly
to undo this.  I think at this point we can deprecate if=ide default and switch 
to
if=none in the next release.  I think it will be a welcome change.

/mjt



sector size = 4096?

2023-10-14 Thread Michael Tokarev

Hi!

I wanted to experiment with sector size = 4096, to see how an OS
reacts.  And naturally, the first instinct is to try it in qemu.
But it turned to be.. difficult.

First, there's no documentation.  But I remember seeing properties
of ide-hd device, like physical_sector_size etc.  Ok.

 $ qemu-img create -f raw img 32G

 $ kvm -machine q35 -drive file=img,format=raw,if=none,id=d \
   -device ide-hd,drive=d,physical_sector_size=4096

This way, the guest see size of the drive as 0.  This is obviously
not right (tm).

Okay, lets try full 4096 sectors:

   -device ide-hd,drive=d,physical_sector_size=4096,logical_sector_size=4096

This fails:

   discard_granularity must be a multiple of logical_block_size

Let's adjust discard_granularity (I think it'd be good if qemu does
this based on physical_sector_size already):

   -device 
ide-hd,drive=d,physical_sector_size=4096,logical_sector_size=4096,discard_granularity=0

This fails:

  logical_block_size must be 512 for IDE

Huh? Should the tell all the sata Advanced-Format (AF) drives manufacturers
their drives does not exist, or what? :)


Ok.  Tried the same with nvme.  There, sector sizes seems to be entirely
ignored, - no matter which values I choose, guest always see them as
512.


What's the proper way to run guest with 4096-byte sector size in qemu?

Thanks,

/mjt



Re: [PATCH v2 2/3] target/riscv: Initialize gdb_core_xml_file only once

2023-10-14 Thread Daniel Henrique Barboza




On 10/14/23 00:35, Akihiko Odaki wrote:

gdb_core_xml_file was assigned each time a CPU is instantiated before
this change.

Signed-off-by: Akihiko Odaki 
---
  target/riscv/cpu.c | 5 +
  target/riscv/tcg/tcg-cpu.c | 4 
  2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index ac4a6c7eec..a811215150 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1575,6 +1575,11 @@ static void riscv_cpu_class_init(ObjectClass *c, void 
*data)
  cc->get_pc = riscv_cpu_get_pc;
  cc->gdb_read_register = riscv_cpu_gdb_read_register;
  cc->gdb_write_register = riscv_cpu_gdb_write_register;
+#ifdef TARGET_RISCV64
+cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
+#elif defined(TARGET_RISCV32)
+cc->gdb_core_xml_file = "riscv-32bit-cpu.xml";
+#endif
  cc->gdb_num_core_regs = 33;
  cc->gdb_stop_before_watchpoint = true;
  cc->disas_set_info = riscv_cpu_disas_set_info;
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index e0cbc56320..626fb2acea 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -150,8 +150,6 @@ static void riscv_cpu_validate_misa_priv(CPURISCVState 
*env, Error **errp)
  
  static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, Error **errp)

  {
-RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(cpu);
-CPUClass *cc = CPU_CLASS(mcc);
  CPURISCVState *env = >env;
  
  /* Validate that MISA_MXL is set properly. */

@@ -159,11 +157,9 @@ static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, 
Error **errp)
  #ifdef TARGET_RISCV64
  case MXL_RV64:
  case MXL_RV128:
-cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
  break;
  #elif defined(TARGET_RISCV32)
  case MXL_RV32:
-cc->gdb_core_xml_file = "riscv-32bit-cpu.xml";
  break;



Hmmm the issue here is that, in patch 1, you added an "elif 
defined(TARGET_RISCV32)"
based on an assumption that you changed here since there's no more gdb_core 
files being
set.

My suggestion is to use patch 1 from v1, where you removed the misa_mxl_max == 
misa_mxl
check at the end of this function. And then in this patch you can remove this 
function
altogether since you're assigning gdb_core in riscv_cpu_class_init() and the 
function will
be left doing nothing of note.


Thanks,

Daniel


  #endif
  default:




Re: [PATCH v2 1/3] target/riscv: Do not allow MXL_RV32 for TARGET_RISCV64

2023-10-14 Thread Daniel Henrique Barboza




On 10/14/23 00:35, Akihiko Odaki wrote:

TARGET_RISCV64 does not have riscv-32bit-cpu.xml so it shouldn't accept
MXL_RV32.

Signed-off-by: Akihiko Odaki 
---


Reviewed-by: Daniel Henrique Barboza 



  target/riscv/tcg/tcg-cpu.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index a28918ab30..e0cbc56320 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -161,10 +161,11 @@ static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, 
Error **errp)
  case MXL_RV128:
  cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
  break;
-#endif
+#elif defined(TARGET_RISCV32)
  case MXL_RV32:
  cc->gdb_core_xml_file = "riscv-32bit-cpu.xml";
  break;
+#endif
  default:
  g_assert_not_reached();
  }




Re: [RFC] virtio: enforce link up

2023-10-14 Thread Michael S. Tsirkin
On Sat, Oct 14, 2023 at 06:22:34PM +0200, Vincent Jardin wrote:
> Using interface's settings, let's enforce an always on link up.
> 
> Signed-off-by: Vincent Jardin 

What is going on here? Just don't set it down.

> ---
>  hw/net/virtio-net.c| 8 
>  include/hw/virtio/virtio-net.h | 2 ++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 29e33ea5ed..e731b4fdea 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -78,6 +78,9 @@
> tso/gso/gro 'off'. */
>  #define VIRTIO_NET_RSC_DEFAULT_INTERVAL 30
>  
> +/* force always link up */
> +#define VIRTIO_NET_LINK_UP false
> +
>  #define VIRTIO_NET_RSS_SUPPORTED_HASHES (VIRTIO_NET_RSS_HASH_TYPE_IPv4 | \
>   VIRTIO_NET_RSS_HASH_TYPE_TCPv4 | \
>   VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | \
> @@ -447,6 +450,9 @@ static void virtio_net_set_link_status(NetClientState *nc)
>  else
>  n->status |= VIRTIO_NET_S_LINK_UP;
>  
> +if (n->net_conf.link_up)
> +n->status |= VIRTIO_NET_S_LINK_UP;
> +
>  if (n->status != old_status)
>  virtio_notify_config(vdev);
>  
> @@ -3947,6 +3953,8 @@ static Property virtio_net_properties[] = {
>VIRTIO_NET_F_GUEST_USO6, true),
>  DEFINE_PROP_BIT64("host_uso", VirtIONet, host_features,
>VIRTIO_NET_F_HOST_USO, true),
> +DEFINE_PROP_BOOL("link_up", VirtIONet, net_conf.link_up,
> +   VIRTIO_NET_LINK_UP),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> index 55977f01f0..385bebab34 100644
> --- a/include/hw/virtio/virtio-net.h
> +++ b/include/hw/virtio/virtio-net.h
> @@ -56,6 +56,7 @@ typedef struct virtio_net_conf
>  char *duplex_str;
>  uint8_t duplex;
>  char *primary_id_str;
> +bool link_up; /* if set enforce link up, never down */
>  } virtio_net_conf;
>  
>  /* Coalesced packets type & status */
> @@ -180,6 +181,7 @@ struct VirtIONet {
>  size_t guest_hdr_len;
>  uint64_t host_features;
>  uint32_t rsc_timeout;
> +uint32_t link_up; /* if set enforce link up, never down */
>  uint8_t rsc4_enabled;
>  uint8_t rsc6_enabled;
>  uint8_t has_ufo;
> -- 
> 2.34.1




[RFC] virtio: enforce link up

2023-10-14 Thread Vincent Jardin
Using interface's settings, let's enforce an always on link up.

Signed-off-by: Vincent Jardin 
---
 hw/net/virtio-net.c| 8 
 include/hw/virtio/virtio-net.h | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 29e33ea5ed..e731b4fdea 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -78,6 +78,9 @@
tso/gso/gro 'off'. */
 #define VIRTIO_NET_RSC_DEFAULT_INTERVAL 30
 
+/* force always link up */
+#define VIRTIO_NET_LINK_UP false
+
 #define VIRTIO_NET_RSS_SUPPORTED_HASHES (VIRTIO_NET_RSS_HASH_TYPE_IPv4 | \
  VIRTIO_NET_RSS_HASH_TYPE_TCPv4 | \
  VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | \
@@ -447,6 +450,9 @@ static void virtio_net_set_link_status(NetClientState *nc)
 else
 n->status |= VIRTIO_NET_S_LINK_UP;
 
+if (n->net_conf.link_up)
+n->status |= VIRTIO_NET_S_LINK_UP;
+
 if (n->status != old_status)
 virtio_notify_config(vdev);
 
@@ -3947,6 +3953,8 @@ static Property virtio_net_properties[] = {
   VIRTIO_NET_F_GUEST_USO6, true),
 DEFINE_PROP_BIT64("host_uso", VirtIONet, host_features,
   VIRTIO_NET_F_HOST_USO, true),
+DEFINE_PROP_BOOL("link_up", VirtIONet, net_conf.link_up,
+   VIRTIO_NET_LINK_UP),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 55977f01f0..385bebab34 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -56,6 +56,7 @@ typedef struct virtio_net_conf
 char *duplex_str;
 uint8_t duplex;
 char *primary_id_str;
+bool link_up; /* if set enforce link up, never down */
 } virtio_net_conf;
 
 /* Coalesced packets type & status */
@@ -180,6 +181,7 @@ struct VirtIONet {
 size_t guest_hdr_len;
 uint64_t host_features;
 uint32_t rsc_timeout;
+uint32_t link_up; /* if set enforce link up, never down */
 uint8_t rsc4_enabled;
 uint8_t rsc6_enabled;
 uint8_t has_ufo;
-- 
2.34.1




Re: [PATCH v2 1/3] via-ide: Fix legacy mode emulation

2023-10-14 Thread BALATON Zoltan

On Sat, 14 Oct 2023, Mark Cave-Ayland wrote:

On 09/10/2023 20:54, BALATON Zoltan wrote:


The initial value for BARs were set in reset method for emulating
legacy mode at start but this does not work because PCI code resets
BARs after calling device reset method. Remove this ineffective
default to avoid confusion.

Instead move setting the BARs to a callback on writing the PCI config
regsiter that sets legacy mode (which firmwares needing this mode seem
to do) and fix their values to program it to use legacy port numbers
in this case. This does not fully emulate what the data sheet says
(which is not very clear on this) but it implements enogh to allow
both modes as used by firmwares of machines we emulate.

Signed-off-by: BALATON Zoltan 
---
  hw/ide/via.c | 41 -
  1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index fff23803a6..43e8af8d69 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -132,11 +132,6 @@ static void via_ide_reset(DeviceState *dev)
  pci_set_word(pci_conf + PCI_STATUS, PCI_STATUS_FAST_BACK |
   PCI_STATUS_DEVSEL_MEDIUM);
  -pci_set_long(pci_conf + PCI_BASE_ADDRESS_0, 0x01f0);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_1, 0x03f4);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_2, 0x0170);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_3, 0x0374);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_4, 0xcc01); /* BMIBA: 
20-23h */

  pci_set_long(pci_conf + PCI_INTERRUPT_LINE, 0x010e);
/* IDE chip enable, IDE configuration 1/2, IDE FIFO Configuration*/
@@ -159,6 +154,41 @@ static void via_ide_reset(DeviceState *dev)
  pci_set_long(pci_conf + 0xc0, 0x00020001);
  }
  +static void via_ide_cfg_write(PCIDevice *pd, uint32_t addr,
+  uint32_t val, int len)
+{
+pci_default_write_config(pd, addr, val, len);
+/*
+ * Bits 0 and 2 of the PCI programming interface register select 
between
+ * legacy and native mode for the two IDE channels. We don't emulate 
this
+ * because we cannot easily switch between ISA and PCI in QEMU so 
instead


As per my previous email, this statement is demonstrably false: this is now 
achievable using the portio_list*() APIs.


+ * when guest selects legacy mode we set the PCI BARs to legacy ports 
which
+ * works the same. We also don't care about setting each channel 
separately
+ * as no guest is known to do or need that. We only do this when BARs 
are
+ * unset when writing this register as logs from real hardware show 
that
+ * setting legacy mode after BARs were set it will still use ports set 
by
+ * BARs not ISA ports (e.g. pegasos2 Linux does this after firmware 
set

+ * native mode and programmed BARs and calls it non-100% native mode).
+ * But if 0x8a is written righr after reset without setting BARs then 
we

+ * want legacy ports (this is done by the AmigaOne firmware).
+ */
+if (addr == PCI_CLASS_PROG && val == 0x8a &&
+pci_get_long(pd->config + PCI_BASE_ADDRESS_0) ==
+PCI_BASE_ADDRESS_SPACE_IO) {
+pci_set_long(pd->config + PCI_BASE_ADDRESS_0, 0x1f0
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_1, 0x3f6
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_2, 0x170
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_3, 0x376
+ | PCI_BASE_ADDRESS_SPACE_IO);
+/* BMIBA: 20-23h */
+pci_set_long(pd->config + PCI_BASE_ADDRESS_4, 0xcc00
+ | PCI_BASE_ADDRESS_SPACE_IO);
+}
+}


Another hint that this is not the right way to be doing this: the values you 
are placing in BARS 1 and 3 are illegal. PCI IO BARs have bit 1 forced to 0 
and bit 0 set to 1 which forces a minimum alignment of 4, so either the 
addresses 0x3f6/0x376 are being rounded internally to 0x3f4/0x374 and/or 
you're lucky that this just happens to work on QEMU.


The data sheet lists these values for legacy mode bur it seems that bit 1 
is ignored for BAR here and it ends up set to 0x3x4 with the actual reg 
mapped to 0x3x7 for both values ending in 4 or 6 here and both works the 
same with AmigaOS even if I change the values here to 0x3[7f]4 so I can do 
that and that should then match the default values for these regs but not 
match the values listed for legacy mode so the data sheet is wrong either 
way. It still does not make sense to set these in reset method which will 
be overwritten so only works if I set them here.


Using the portio_list*() APIs really is the right way to implement this to 
avoid being affected by such issues.


Can you provide an alternative patch using portio_list? I don't know how 
to do that and have no example to follow either so it would be hard for me 
to figure out. Or give some pointers on how 

Re: [PATCH 07/18] target/i386: introduce flags writeback mechanism

2023-10-14 Thread Richard Henderson

On 10/14/23 03:01, Paolo Bonzini wrote:

+static void prepare_update1_cc(X86DecodedInsn *decode, DisasContext *s, CCOp 
op)
+{
+decode->cc_dst = s->T0;
+set_cc_op(s, op);
+}


You must delay the set_cc_op() until the end too, for the same reason.  The function call 
will emit discard opcodes, which will kill cc_foo while still live via the memory exception.



r~



Re: [PATCH 1/3] hw/pci-host/sh_pcic: Declare CPU QOM types using DEFINE_TYPES() macro

2023-10-14 Thread Yoshinori Sato
On Thu, 12 Oct 2023 13:12:35 +0900,
Philippe Mathieu-Daudé wrote:
> 
> When multiple QOM types are registered in the same file,
> it is simpler to use the the DEFINE_TYPES() macro. In
> particular because type array declared with such macro
> are easier to review.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Yoshinori Sato 

> ---
>  hw/pci-host/sh_pci.c | 40 +---
>  1 file changed, 17 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/pci-host/sh_pci.c b/hw/pci-host/sh_pci.c
> index 77e7bbc65f..41aed48c85 100644
> --- a/hw/pci-host/sh_pci.c
> +++ b/hw/pci-host/sh_pci.c
> @@ -167,17 +167,6 @@ static void sh_pci_host_class_init(ObjectClass *klass, 
> void *data)
>  dc->user_creatable = false;
>  }
>  
> -static const TypeInfo sh_pci_host_info = {
> -.name  = "sh_pci_host",
> -.parent= TYPE_PCI_DEVICE,
> -.instance_size = sizeof(PCIDevice),
> -.class_init= sh_pci_host_class_init,
> -.interfaces = (InterfaceInfo[]) {
> -{ INTERFACE_CONVENTIONAL_PCI_DEVICE },
> -{ },
> -},
> -};
> -
>  static void sh_pci_device_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -185,17 +174,22 @@ static void sh_pci_device_class_init(ObjectClass 
> *klass, void *data)
>  dc->realize = sh_pci_device_realize;
>  }
>  
> -static const TypeInfo sh_pci_device_info = {
> -.name  = TYPE_SH_PCI_HOST_BRIDGE,
> -.parent= TYPE_PCI_HOST_BRIDGE,
> -.instance_size = sizeof(SHPCIState),
> -.class_init= sh_pci_device_class_init,
> +static const TypeInfo sh_pcic_types[] = {
> +{
> +.name   = TYPE_SH_PCI_HOST_BRIDGE,
> +.parent = TYPE_PCI_HOST_BRIDGE,
> +.instance_size  = sizeof(SHPCIState),
> +.class_init = sh_pci_device_class_init,
> +}, {
> +.name   = "sh_pci_host",
> +.parent = TYPE_PCI_DEVICE,
> +.instance_size  = sizeof(PCIDevice),
> +.class_init = sh_pci_host_class_init,
> +.interfaces = (InterfaceInfo[]) {
> +{ INTERFACE_CONVENTIONAL_PCI_DEVICE },
> +{ },
> +},
> +},
>  };
>  
> -static void sh_pci_register_types(void)
> -{
> -type_register_static(_pci_device_info);
> -type_register_static(_pci_host_info);
> -}
> -
> -type_init(sh_pci_register_types)
> +DEFINE_TYPES(sh_pcic_types)
> -- 
> 2.41.0
> 



Re: [PATCH 3/3] hw/pci-host/sh_pcic: Replace magic value by proper definition

2023-10-14 Thread Yoshinori Sato
On Thu, 12 Oct 2023 13:12:37 +0900,
Philippe Mathieu-Daudé wrote:
> 
> Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Yoshinori Sato 

> ---
>  hw/pci-host/sh_pci.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/pci-host/sh_pci.c b/hw/pci-host/sh_pci.c
> index 580e273d96..4edebced5e 100644
> --- a/hw/pci-host/sh_pci.c
> +++ b/hw/pci-host/sh_pci.c
> @@ -40,7 +40,7 @@ struct SHPCIState {
>  PCIHostState parent_obj;
>  
>  PCIDevice *dev;
> -qemu_irq irq[4];
> +qemu_irq irq[PCI_NUM_PINS];
>  MemoryRegion memconfig_p4;
>  MemoryRegion memconfig_a7;
>  MemoryRegion isa;
> @@ -131,7 +131,8 @@ static void sh_pcic_host_realize(DeviceState *dev, Error 
> **errp)
>   s->irq,
>   get_system_memory(),
>   get_system_io(),
> - PCI_DEVFN(0, 0), 4, TYPE_PCI_BUS);
> + PCI_DEVFN(0, 0), PCI_NUM_PINS,
> + TYPE_PCI_BUS);
>  memory_region_init_io(>memconfig_p4, OBJECT(s), _pci_reg_ops, s,
>"sh_pci", 0x224);
>  memory_region_init_alias(>memconfig_a7, OBJECT(s), "sh_pci.2",
> -- 
> 2.41.0
> 



Re: [PATCH 2/3] hw/pci-host/sh_pcic: Correct PCI host / devfn#0 function names

2023-10-14 Thread Yoshinori Sato
On Thu, 12 Oct 2023 13:12:36 +0900,
Philippe Mathieu-Daudé wrote:
> 
> Host bridge device and PCI function #0 are inverted.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Yoshinori Sato 

> ---
>  hw/pci-host/sh_pci.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/pci-host/sh_pci.c b/hw/pci-host/sh_pci.c
> index 41aed48c85..580e273d96 100644
> --- a/hw/pci-host/sh_pci.c
> +++ b/hw/pci-host/sh_pci.c
> @@ -116,7 +116,7 @@ static void sh_pci_set_irq(void *opaque, int irq_num, int 
> level)
>  qemu_set_irq(pic[irq_num], level);
>  }
>  
> -static void sh_pci_device_realize(DeviceState *dev, Error **errp)
> +static void sh_pcic_host_realize(DeviceState *dev, Error **errp)
>  {
>  SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
>  SHPCIState *s = SH_PCI_HOST_BRIDGE(dev);
> @@ -145,19 +145,19 @@ static void sh_pci_device_realize(DeviceState *dev, 
> Error **errp)
>  s->dev = pci_create_simple(phb->bus, PCI_DEVFN(0, 0), "sh_pci_host");
>  }
>  
> -static void sh_pci_host_realize(PCIDevice *d, Error **errp)
> +static void sh_pcic_pci_realize(PCIDevice *d, Error **errp)
>  {
>  pci_set_word(d->config + PCI_COMMAND, PCI_COMMAND_WAIT);
>  pci_set_word(d->config + PCI_STATUS, PCI_STATUS_CAP_LIST |
>   PCI_STATUS_FAST_BACK | PCI_STATUS_DEVSEL_MEDIUM);
>  }
>  
> -static void sh_pci_host_class_init(ObjectClass *klass, void *data)
> +static void sh_pcic_pci_class_init(ObjectClass *klass, void *data)
>  {
>  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  
> -k->realize = sh_pci_host_realize;
> +k->realize = sh_pcic_pci_realize;
>  k->vendor_id = PCI_VENDOR_ID_HITACHI;
>  k->device_id = PCI_DEVICE_ID_HITACHI_SH7751R;
>  /*
> @@ -167,11 +167,11 @@ static void sh_pci_host_class_init(ObjectClass *klass, 
> void *data)
>  dc->user_creatable = false;
>  }
>  
> -static void sh_pci_device_class_init(ObjectClass *klass, void *data)
> +static void sh_pcic_host_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  
> -dc->realize = sh_pci_device_realize;
> +dc->realize = sh_pcic_host_realize;
>  }
>  
>  static const TypeInfo sh_pcic_types[] = {
> @@ -179,12 +179,12 @@ static const TypeInfo sh_pcic_types[] = {
>  .name   = TYPE_SH_PCI_HOST_BRIDGE,
>  .parent = TYPE_PCI_HOST_BRIDGE,
>  .instance_size  = sizeof(SHPCIState),
> -.class_init = sh_pci_device_class_init,
> +.class_init = sh_pcic_host_class_init,
>  }, {
>  .name   = "sh_pci_host",
>  .parent = TYPE_PCI_DEVICE,
>  .instance_size  = sizeof(PCIDevice),
> -.class_init = sh_pci_host_class_init,
> +.class_init = sh_pcic_pci_class_init,
>  .interfaces = (InterfaceInfo[]) {
>  { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>  { },
> -- 
> 2.41.0
> 



Re: [PATCH v1 7/9] qapi: golang: Generate qapi's command types in Go

2023-10-14 Thread Victor Toso
Hi,

On Thu, Sep 28, 2023 at 03:32:54PM +0100, Daniel P. Berrangé wrote:
> On Wed, Sep 27, 2023 at 01:25:42PM +0200, Victor Toso wrote:
> > This patch handles QAPI command types and generates data structures in
> > Go that decodes from QMP JSON Object to Go data structure and vice
> > versa.
> > 
> > Similar to Event, this patch adds a Command interface and two helper
> > functions MarshalCommand and UnmarshalCommand.
> > 
> > Example:
> > qapi:
> >  | { 'command': 'set_password',
> >  |   'boxed': true,
> >  |   'data': 'SetPasswordOptions' }
> > 
> > go:
> >  | type SetPasswordCommand struct {
> >  | SetPasswordOptions
> >  | CommandId string `json:"-"`
> 
> IIUC, you renamed that to MessageId in the code now.
> 
> >  | }
> 
> Overall, I'm not entirely convinced that we will want to
> have the SetPasswordCommand struct wrappers, byut it is
> hard to say,

I was playing with removing these embed structs now, similar to
how we did for all other base types. The extra complexity might
not be worthy to remove it, IMHO.

Actually, the SetPasswordCommand can be used as example.

No embed (proposed):
  type SetPasswordCommand struct {
  CommandId string `json:"-"`

  Password  string `json:"password"`
  Connected *SetPasswordAction `json:"connected,omitempty"`

  Protocol  DisplayProtocol`json:"protocol"`
  Password  string `json:"password"`
  Connected *SetPasswordAction `json:"connected,omitempty"`

  // Variants fields
  Vnc *SetPasswordOptionsVnc `json:"-"`

  // Unbranched enum fields
  Spice bool `json:"-"`
  }

As SetPasswordOptions is a union, now we have to treat
SetPasswordCommand as union too, for Marshal/Unmarshal.

The data type could also be an Alternate, which has different
logic for Marshal/Unmarshal.

So, doing embed would add quite a bit of complexity to handling
Commands and Events too although no Event use boxed=true at the
moment in QEMU.

> as what we're missing still is the eventual application facing
> API.
> 
> eg something that ultimately looks more like this:
> 
> qemu = qemu.QMPConnection()
> qemu.Dial("/path/to/unix/socket.sock")
> 
> qemu.VncConnectedEvent(func(ev *VncConnectedEvent) {
>  fmt.Printf("VNC client %s connected\n", ev.Client.Host)
> })
> 
> resp, err := qemu.SetPassword(SetPasswordArguments{
> protocol: "vnc",
> password: "123456",
> })
> 
> if err != nil {
> fmt.Fprintf(os.Stderr, "Cannot set passwd: %s", err)
> }
> 
> ..do something wit resp   (well SetPassword has no response, but 
> other cmmands do)
> 
> It isn't clear that the SetPasswordCommand struct will be
> needed internally for the impl if QMPCommand.

Yes. Let's get this in so we can work on the next layer. We don't
need to declare this Go module as stable for now, till we get the
next few bits figure out.

Cheers,
Victor
 
> > usage:
> >  | input := `{"execute":"set_password",` +
> >  |  `"arguments":{"protocol":"vnc",` +
> >  |  `"password":"secret"}}`
> >  |
> >  | c, err := UnmarshalCommand([]byte(input))
> >  | if err != nil {
> >  | panic(err)
> >  | }
> >  |
> >  | if c.GetName() == `set_password` {
> >  | m := c.(*SetPasswordCommand)
> >  | // m.Password == "secret"
> >  | }
> > 
> > Signed-off-by: Victor Toso 
> > ---
> >  scripts/qapi/golang.py | 97 --
> >  1 file changed, 94 insertions(+), 3 deletions(-)
> > 
> > diff --git a/scripts/qapi/golang.py b/scripts/qapi/golang.py
> > index ff3b1dd020..52a9124641 100644
> > --- a/scripts/qapi/golang.py
> > +++ b/scripts/qapi/golang.py
> > @@ -246,6 +246,51 @@
> >  }}
> >  '''
> >  
> > +TEMPLATE_COMMAND_METHODS = '''
> > +func (c *{type_name}) GetName() string {{
> > +return "{name}"
> > +}}
> > +
> > +func (s *{type_name}) GetId() string {{
> > +return s.MessageId
> > +}}
> > +'''
> > +
> > +TEMPLATE_COMMAND = '''
> > +type Command interface {{
> > +GetId() string
> > +GetName()   string
> > +}}
> > +
> > +func MarshalCommand(c Command) ([]byte, error) {{
> > +m := make(map[string]any)
> > +m["execute"] = c.GetName()
> > +if id := c.GetId(); len(id) > 0 {{
> > +m["id"] = id
> > +}}
> > +if bytes, err := json.Marshal(c); err != nil {{
> > +return []byte{{}}, err
> > +}} else if len(bytes) > 2 {{
> > +m["arguments"] = c
> > +}}
> > +return json.Marshal(m)
> > +}}
> > +
> > +func UnmarshalCommand(data []byte) (Command, error) {{
> > +base := struct {{
> > +MessageId string `json:"id,omitempty"`
> > +Name  string `json:"execute"`
> > +}}{{}}
> > +if err := json.Unmarshal(data, ); err != nil {{
> > +return nil, fmt.Errorf("Failed to decode command: %s", 
> > string(data))
> > +}}
> > +
> > +switch base.Name {{
> > +{cases}
> > +}}
> > +return nil, errors.New("Failed to recognize command")
> > +}}

[PATCH for-8.1] vfio/display: Fix missing update to set backing fields

2023-10-14 Thread Edmund Raile
Hi,
I can confirm that the patch indeed fixes the issue.
Kind regards,
Edmund Raile

Tested-by: Edmund Raile 




[PATCH] qemu-ui-gtk clipboard: fix for freeze-crashes v2

2023-10-14 Thread Edmund Raile
summary

adresses https://gitlab.com/qemu-project/qemu/-/issues/1150
replaces blocking gtk_clipboard_wait_is_text_available in gd_owner_change and 
gtk_clipboard_wait_for_text in gd_clipboard_request with asynchronous 
gtk_clipboard_request_text
uses serial_last and serial of QemuClipboardInfo to only process new clipboard 
data



In response to [gemu-gtk-clipboard freezing and crashing 
guests](https://gitlab.com/qemu-project/qemu/-/issues/1150).

I think I might have a solution for the gtk clipboard sometimes crashing guests.

@kolAflash I couldn't have done it without you, figuring out 
`gtk_clipboard_wait_is_text_available(clipboard)` was the issue is half the 
work.

The real issue is that it's blocking and I'd wager that's a big no-no since 
qemu & KVM have to run the VM + OS, preferably as real-time as possible.
Something times out and you get a core dump.

As a replacement, `gtk_clipboard_request_text`, which is async and non-blocking 
is a better choice, hopefully.
It requires an additional function to handle receiving text.

In a previous (first) attempt of submitting a patch 
(https://lists.gnu.org/archive/html/qemu-devel/2023-09/msg06027.html), Mr. 
Lureau gave advice which I followed here:
 * now both gd_owner_change and gd_clipboard_request are asynchronous by means 
of the gd_clipboard_transfer_text_to_guest_callback
 * the QemuClipboardInfo struct now carries a serial_last field which is used 
by gd_clipboard_transfer_text_to_guest_callback to only process the clipboard 
when new

I hope the comments are acceptable, they should be very useful to people not 
particularly familiar with he workings of both qemu and gtk clipboards (like 
me).

To not risk breaking anything in the mailing list, I'm starting this new mail 
thread instead of replying to my first one.
Hopefully I'll get it right this time.

Kind regards,
Edmund Raile

Signed-off-by: Edmund Raile 



---
 include/ui/clipboard.h |  2 ++
 ui/gtk-clipboard.c | 34 --
 2 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/include/ui/clipboard.h b/include/ui/clipboard.h
index ab6acdbd8a..123c04fc07 100644
--- a/include/ui/clipboard.h
+++ b/include/ui/clipboard.h
@@ -106,6 +106,7 @@ struct QemuClipboardNotify {
  * @types: clipboard data array (one entry per type).
  * @has_serial: whether @serial is available.
  * @serial: the grab serial counter.
+ * @serial_last: used by GTK UI to discard outdated transaction results.
  *
  * Clipboard content data and metadata.
  */
@@ -115,6 +116,7 @@ struct QemuClipboardInfo {
 QemuClipboardSelection selection;
 bool has_serial;
 uint32_t serial;
+uint32_t serial_last;
 struct {
 bool available;
 bool requested;
diff --git a/ui/gtk-clipboard.c b/ui/gtk-clipboard.c
index 8d8a636fd1..9e96cc2fb5 100644
--- a/ui/gtk-clipboard.c
+++ b/ui/gtk-clipboard.c
@@ -133,26 +133,38 @@ static void gd_clipboard_notify(Notifier *notifier, void 
*data)
 }
 }
 
+/* asynchronous clipboard text transfer (host -> guest): callback */
+static void gd_clipboard_transfer_text_to_guest_callback(GtkClipboard 
*clipboard, const gchar *text, gpointer data)
+{
+QemuClipboardInfo *info = (QemuClipboardInfo *)data;
+
+// serial_last is intentionally not stored as a static in this function as 
callbacks implementing other data types (e.g. images) need access as well
+
+if (text && (info->serial > info->serial_last)) {
+info->types[QEMU_CLIPBOARD_TYPE_TEXT].available = true;
+qemu_clipboard_update(info);
+info->serial_last = info->serial;
+}
+
+qemu_clipboard_info_unref(info);
+}
+
+/* asynchronous clipboard transfer (host -> guest) initiator when guest 
requests clipboard data */
 static void gd_clipboard_request(QemuClipboardInfo *info,
  QemuClipboardType type)
 {
 GtkDisplayState *gd = container_of(info->owner, GtkDisplayState, cbpeer);
-char *text;
 
 switch (type) {
 case QEMU_CLIPBOARD_TYPE_TEXT:
-text = gtk_clipboard_wait_for_text(gd->gtkcb[info->selection]);
-if (text) {
-qemu_clipboard_set_data(>cbpeer, info, type,
-strlen(text), text, true);
-g_free(text);
-}
+gtk_clipboard_request_text(gd->gtkcb[info->selection], 
gd_clipboard_transfer_text_to_guest_callback, info);
 break;
 default:
 break;
 }
 }
 
+/* asynchronous clipboard transfer (host -> guest) initiator when host has new 
clipboard data */
 static void gd_owner_change(GtkClipboard *clipboard,
 GdkEvent *event,
 gpointer data)
@@ -166,16 +178,10 @@ static void gd_owner_change(GtkClipboard *clipboard,
 return;
 }
 
-
 switch (event->owner_change.reason) {
 case GDK_OWNER_CHANGE_NEW_OWNER:
 info = qemu_clipboard_info_new(>cbpeer, s);
-if (gtk_clipboard_wait_is_text_available(clipboard)) {

Re: [PATCH v2 1/3] via-ide: Fix legacy mode emulation

2023-10-14 Thread Mark Cave-Ayland

On 09/10/2023 20:54, BALATON Zoltan wrote:


The initial value for BARs were set in reset method for emulating
legacy mode at start but this does not work because PCI code resets
BARs after calling device reset method. Remove this ineffective
default to avoid confusion.

Instead move setting the BARs to a callback on writing the PCI config
regsiter that sets legacy mode (which firmwares needing this mode seem
to do) and fix their values to program it to use legacy port numbers
in this case. This does not fully emulate what the data sheet says
(which is not very clear on this) but it implements enogh to allow
both modes as used by firmwares of machines we emulate.

Signed-off-by: BALATON Zoltan 
---
  hw/ide/via.c | 41 -
  1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index fff23803a6..43e8af8d69 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -132,11 +132,6 @@ static void via_ide_reset(DeviceState *dev)
  pci_set_word(pci_conf + PCI_STATUS, PCI_STATUS_FAST_BACK |
   PCI_STATUS_DEVSEL_MEDIUM);
  
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_0, 0x01f0);

-pci_set_long(pci_conf + PCI_BASE_ADDRESS_1, 0x03f4);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_2, 0x0170);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_3, 0x0374);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_4, 0xcc01); /* BMIBA: 20-23h 
*/
  pci_set_long(pci_conf + PCI_INTERRUPT_LINE, 0x010e);
  
  /* IDE chip enable, IDE configuration 1/2, IDE FIFO Configuration*/

@@ -159,6 +154,41 @@ static void via_ide_reset(DeviceState *dev)
  pci_set_long(pci_conf + 0xc0, 0x00020001);
  }
  
+static void via_ide_cfg_write(PCIDevice *pd, uint32_t addr,

+  uint32_t val, int len)
+{
+pci_default_write_config(pd, addr, val, len);
+/*
+ * Bits 0 and 2 of the PCI programming interface register select between
+ * legacy and native mode for the two IDE channels. We don't emulate this
+ * because we cannot easily switch between ISA and PCI in QEMU so instead


As per my previous email, this statement is demonstrably false: this is now 
achievable using the portio_list*() APIs.



+ * when guest selects legacy mode we set the PCI BARs to legacy ports which
+ * works the same. We also don't care about setting each channel separately
+ * as no guest is known to do or need that. We only do this when BARs are
+ * unset when writing this register as logs from real hardware show that
+ * setting legacy mode after BARs were set it will still use ports set by
+ * BARs not ISA ports (e.g. pegasos2 Linux does this after firmware set
+ * native mode and programmed BARs and calls it non-100% native mode).
+ * But if 0x8a is written righr after reset without setting BARs then we
+ * want legacy ports (this is done by the AmigaOne firmware).
+ */
+if (addr == PCI_CLASS_PROG && val == 0x8a &&
+pci_get_long(pd->config + PCI_BASE_ADDRESS_0) ==
+PCI_BASE_ADDRESS_SPACE_IO) {
+pci_set_long(pd->config + PCI_BASE_ADDRESS_0, 0x1f0
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_1, 0x3f6
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_2, 0x170
+ | PCI_BASE_ADDRESS_SPACE_IO);
+pci_set_long(pd->config + PCI_BASE_ADDRESS_3, 0x376
+ | PCI_BASE_ADDRESS_SPACE_IO);
+/* BMIBA: 20-23h */
+pci_set_long(pd->config + PCI_BASE_ADDRESS_4, 0xcc00
+ | PCI_BASE_ADDRESS_SPACE_IO);
+}
+}


Another hint that this is not the right way to be doing this: the values you are 
placing in BARS 1 and 3 are illegal. PCI IO BARs have bit 1 forced to 0 and bit 0 set 
to 1 which forces a minimum alignment of 4, so either the addresses 0x3f6/0x376 are 
being rounded internally to 0x3f4/0x374 and/or you're lucky that this just happens to 
work on QEMU.


Using the portio_list*() APIs really is the right way to implement this to avoid 
being affected by such issues.



  static void via_ide_realize(PCIDevice *dev, Error **errp)
  {
  PCIIDEState *d = PCI_IDE(dev);
@@ -221,6 +251,7 @@ static void via_ide_class_init(ObjectClass *klass, void 
*data)
  /* Reason: only works as function of VIA southbridge */
  dc->user_creatable = false;
  
+k->config_write = via_ide_cfg_write;

  k->realize = via_ide_realize;
  k->exit = via_ide_exitfn;
  k->vendor_id = PCI_VENDOR_ID_VIA;



ATB,

Mark.




Re: [PATCH 1/1] block: improve alignment detection and fix 271 test

2023-10-14 Thread Nir Soffer
On Fri, Sep 8, 2023 at 12:54 AM Denis V. Lunev  wrote:

> Unfortunately 271 IO test is broken if started in non-cached mode.
>

Is this a real world issue? For example in oVirt you cannot create a disk
with
size < 4k so there is no way that 4k is not a good alignment.

Should we fix the test to reflect real world usage?

_reset_img 2083k

I guess it works with:

_reset_img 2084k

Commits
> commit a6b257a08e3d72219f03e461a52152672fec0612
> Author: Nir Soffer 
> Date:   Tue Aug 13 21:21:03 2019 +0300
> file-posix: Handle undetectable alignment
> and
> commit 9c60a5d1978e6dcf85c0e01b50e6f7f54ca09104
> Author: Kevin Wolf 
> Date:   Thu Jul 16 16:26:00 2020 +0200
> block: Require aligned image size to avoid assertion failure
> have interesting side effect if used togather.
>
> If the image size is not multiple of 4k and that image falls under
> original constraints of Nil's patch, the image can not be opened
> due to the check in the bdrv_check_perm().
>
> The patch tries to satisfy the requirements of bdrv_check_perm()
> inside raw_probe_alignment(). This is at my opinion better that just
> disallowing to run that test in non-cached mode. The operation is legal
> by itself.
>
> Signed-off-by: Denis V. Lunev 
> CC: Nir Soffer 
> CC: Kevin Wolf 
> CC: Hanna Reitz 
> CC: Alberto Garcia 
> ---
>  block/file-posix.c | 17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index b16e9c21a1..988cfdc76c 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -447,8 +447,21 @@ static void raw_probe_alignment(BlockDriverState *bs,
> int fd, Error **errp)
>  for (i = 0; i < ARRAY_SIZE(alignments); i++) {
>  align = alignments[i];
>  if (raw_is_io_aligned(fd, buf, align)) {
> -/* Fallback to safe value. */
> -bs->bl.request_alignment = (align != 1) ? align :
> max_align;
> +if (align != 1) {
> +bs->bl.request_alignment = align;
> +break;
> +}
> +/*
> + * Fallback to safe value. max_align is perfect, but the
> size of the device must be multiple of
> + * the virtual length of the device. In the other case we
> will get a error in
> + * bdrv_node_refresh_perm().
> + */
> +for (align = max_align; align > 1; align /= 2) {
> +if ((bs->total_sectors * BDRV_SECTOR_SIZE) % align ==
> 0) {
>

Moving image size calculation out of the loop would make the intent of the
code
more clear:

if (image_size % align == 0) {

Since qemu does not enforce image size alignment, I can see how you create
a 512 bytes
aligned image and in the case when qemu cannot detect the alignment, we end
with
align = 4k. In this case this loop would select align = 512, but with the
image aligned to
some strange value, this loop may select align = 2 or some other value that
does not
make sense.

So I can see using 4k or 512 bytes as a good fallback value, but anything
else should not
be possible, so maybe we should fix this in bdrv_check_perm()?

Nir


[PATCH 11/18] target/i386: move 00-5F opcodes to new decoder

2023-10-14 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 116 ++
 target/i386/tcg/decode-new.h |   3 +
 target/i386/tcg/emit.c.inc   | 201 +++
 target/i386/tcg/translate.c  |   2 +-
 4 files changed, 321 insertions(+), 1 deletion(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index fb95e0b9268..91f79c09b73 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -102,6 +102,8 @@
 
 #define X86_OP_GROUP2(op, op0, s0, op1, s1, ...)  \
 X86_OP_GROUP3(op, op0, s0, 2op, s0, op1, s1, ## __VA_ARGS__)
+#define X86_OP_GROUPw(op, op0, s0, ...)   \
+X86_OP_GROUP3(op, op0, s0, None, None, None, None, ## __VA_ARGS__)
 #define X86_OP_GROUP0(op, ...)\
 X86_OP_GROUP3(op, None, None, None, None, None, None, ## __VA_ARGS__)
 
@@ -127,10 +129,13 @@
 X86_OP_ENTRY3(op, op0, s0, None, None, None, None, ## __VA_ARGS__)
 #define X86_OP_ENTRYr(op, op0, s0, ...)   \
 X86_OP_ENTRY3(op, None, None, None, None, op0, s0, ## __VA_ARGS__)
+#define X86_OP_ENTRY1(op, op0, s0, ...)   \
+X86_OP_ENTRY3(op, op0, s0, 2op, s0, None, None, ## __VA_ARGS__)
 #define X86_OP_ENTRY0(op, ...)\
 X86_OP_ENTRY3(op, None, None, None, None, None, None, ## __VA_ARGS__)
 
 #define cpuid(feat) .cpuid = X86_FEAT_##feat,
+#define nowb .special = X86_SPECIAL_NoWriteback,
 #define xchg .special = X86_SPECIAL_Locked,
 #define mmx .special = X86_SPECIAL_MMX,
 #define zext0 .special = X86_SPECIAL_ZExtOp0,
@@ -1074,7 +1079,114 @@ static void decode_0F(DisasContext *s, CPUX86State 
*env, X86OpEntry *entry, uint
 }
 
 static const X86OpEntry opcodes_root[256] = {
+[0x00] = X86_OP_ENTRY2(ADD, E,b, G,b),
+[0x01] = X86_OP_ENTRY2(ADD, E,v, G,v),
+[0x02] = X86_OP_ENTRY2(ADD, G,b, E,b),
+[0x03] = X86_OP_ENTRY2(ADD, G,v, E,v),
+[0x04] = X86_OP_ENTRY2(ADD, 0,b, I,b),   /* AL, Ib */
+[0x05] = X86_OP_ENTRY2(ADD, 0,v, I,z),   /* rAX, Iz */
+[0x06] = X86_OP_ENTRYr(PUSH, ES, w, chk(i64)),
+[0x07] = X86_OP_ENTRYw(POP, ES, w, chk(i64)),
+
+[0x10] = X86_OP_ENTRY2(ADC, E,b, G,b),
+[0x11] = X86_OP_ENTRY2(ADC, E,v, G,v),
+[0x12] = X86_OP_ENTRY2(ADC, G,b, E,b),
+[0x13] = X86_OP_ENTRY2(ADC, G,v, E,v),
+[0x14] = X86_OP_ENTRY2(ADC, 0,b, I,b),   /* AL, Ib */
+[0x15] = X86_OP_ENTRY2(ADC, 0,v, I,z),   /* rAX, Iz */
+[0x16] = X86_OP_ENTRYr(PUSH, SS, w, chk(i64)),
+[0x17] = X86_OP_ENTRYw(POP, SS, w, chk(i64)),
+
+[0x20] = X86_OP_ENTRY2(AND, E,b, G,b),
+[0x21] = X86_OP_ENTRY2(AND, E,v, G,v),
+[0x22] = X86_OP_ENTRY2(AND, G,b, E,b),
+[0x23] = X86_OP_ENTRY2(AND, G,v, E,v),
+[0x24] = X86_OP_ENTRY2(AND, 0,b, I,b),   /* AL, Ib */
+[0x25] = X86_OP_ENTRY2(AND, 0,v, I,z),   /* rAX, Iz */
+[0x26] = {},
+[0x27] = X86_OP_ENTRY0(DAA, chk(i64)),
+
+[0x30] = X86_OP_ENTRY2(XOR, E,b, G,b),
+[0x31] = X86_OP_ENTRY2(XOR, E,v, G,v),
+[0x32] = X86_OP_ENTRY2(XOR, G,b, E,b),
+[0x33] = X86_OP_ENTRY2(XOR, G,v, E,v),
+[0x34] = X86_OP_ENTRY2(XOR, 0,b, I,b),   /* AL, Ib */
+[0x35] = X86_OP_ENTRY2(XOR, 0,v, I,z),   /* rAX, Iz */
+[0x36] = {},
+[0x37] = X86_OP_ENTRY0(AAA, chk(i64)),
+
+[0x40] = X86_OP_ENTRY1(INC, 0,v, chk(i64)),
+[0x41] = X86_OP_ENTRY1(INC, 1,v, chk(i64)),
+[0x42] = X86_OP_ENTRY1(INC, 2,v, chk(i64)),
+[0x43] = X86_OP_ENTRY1(INC, 3,v, chk(i64)),
+[0x44] = X86_OP_ENTRY1(INC, 4,v, chk(i64)),
+[0x45] = X86_OP_ENTRY1(INC, 5,v, chk(i64)),
+[0x46] = X86_OP_ENTRY1(INC, 6,v, chk(i64)),
+[0x47] = X86_OP_ENTRY1(INC, 7,v, chk(i64)),
+
+[0x50] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x51] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x52] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x53] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x54] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x55] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x56] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+[0x57] = X86_OP_ENTRYr(PUSH, LoBits,d64),
+
+
+[0x08] = X86_OP_ENTRY2(OR, E,b, G,b),
+[0x09] = X86_OP_ENTRY2(OR, E,v, G,v),
+[0x0A] = X86_OP_ENTRY2(OR, G,b, E,b),
+[0x0B] = X86_OP_ENTRY2(OR, G,v, E,v),
+[0x0C] = X86_OP_ENTRY2(OR, 0,b, I,b),   /* AL, Ib */
+[0x0D] = X86_OP_ENTRY2(OR, 0,v, I,z),   /* rAX, Iz */
+[0x0E] = X86_OP_ENTRYr(PUSH, CS, w, chk(i64)),
 [0x0F] = X86_OP_GROUP0(0F),
+
+[0x18] = X86_OP_ENTRY2(SBB, E,b, G,b),
+[0x19] = X86_OP_ENTRY2(SBB, E,v, G,v),
+[0x1A] = X86_OP_ENTRY2(SBB, G,b, E,b),
+[0x1B] = X86_OP_ENTRY2(SBB, G,v, E,v),
+[0x1C] = X86_OP_ENTRY2(SBB, 0,b, I,b),   /* AL, Ib */
+[0x1D] = X86_OP_ENTRY2(SBB, 0,v, I,z),   /* rAX, Iz */
+[0x1E] = X86_OP_ENTRYr(PUSH, DS, w, chk(i64)),
+[0x1F] = X86_OP_ENTRYw(POP, DS, w, chk(i64)),
+
+[0x28] = X86_OP_ENTRY2(SUB, E,b, G,b),
+[0x29] = X86_OP_ENTRY2(SUB, 

[PATCH 15/18] target/i386: move operand load and writeback out of gen_cmovcc1

2023-10-14 Thread Paolo Bonzini
Similar to gen_setcc1, make gen_cmovcc1 receive TCGv.  This is more friendly
to simultaneous implementation in the old and the new decoder.

A small wart is that s->T0 of CMOV is currently the *second* argument (which
would ordinarily be in T1).  Therefore, the condition as to be inverted in
order to overwrite s->T0 with cpu_regs[reg] if the MOV is not performed.

This only applies to the old decoder, and this code will go away soon.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 9c799b5a980..2c4e680a69e 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2503,26 +2503,20 @@ static void gen_jcc(DisasContext *s, int b, int diff)
 gen_jmp_rel(s, s->dflag, diff, 0);
 }
 
-static void gen_cmovcc1(CPUX86State *env, DisasContext *s, MemOp ot, int b,
-int modrm, int reg)
+static void gen_cmovcc1(DisasContext *s, int b, TCGv dest, TCGv src)
 {
 CCPrepare cc;
 
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
-
-cc = gen_prepare_cc(s, b, s->T1);
+cc = gen_prepare_cc(s, b, s->tmp4);
 if (cc.mask != -1) {
-TCGv t0 = tcg_temp_new();
-tcg_gen_andi_tl(t0, cc.reg, cc.mask);
-cc.reg = t0;
+tcg_gen_andi_tl(s->tmp4, cc.reg, cc.mask);
+cc.reg = s->tmp4;
 }
 if (!cc.use_reg2) {
 cc.reg2 = tcg_constant_tl(cc.imm);
 }
 
-tcg_gen_movcond_tl(cc.cond, s->T0, cc.reg, cc.reg2,
-   s->T0, cpu_regs[reg]);
-gen_op_mov_reg_v(s, ot, reg, s->T0);
+tcg_gen_movcond_tl(cc.cond, dest, cc.reg, cc.reg2, src, dest);
 }
 
 static inline void gen_op_movl_T0_seg(DisasContext *s, X86Seg seg_reg)
@@ -5265,7 +5259,9 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 ot = dflag;
 modrm = x86_ldub_code(env, s);
 reg = ((modrm >> 3) & 7) | REX_R(s);
-gen_cmovcc1(env, s, ot, b, modrm, reg);
+gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
+gen_cmovcc1(s, b ^ 1, s->T0, cpu_regs[reg]);
+gen_op_mov_reg_v(s, ot, reg, s->T0);
 break;
 
 //
-- 
2.41.0




[PATCH 14/18] target/i386: move 60-BF opcodes to new decoder

2023-10-14 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 157 ++
 target/i386/tcg/decode-new.h |   3 +
 target/i386/tcg/emit.c.inc   | 340 +++
 target/i386/tcg/translate.c  |  38 ++--
 4 files changed, 522 insertions(+), 16 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 37ed669bde0..d03bc5a9720 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -136,6 +136,7 @@
 
 #define cpuid(feat) .cpuid = X86_FEAT_##feat,
 #define nowb .special = X86_SPECIAL_NoWriteback,
+#define noseg .special = X86_SPECIAL_NoSeg,
 #define xchg .special = X86_SPECIAL_Locked,
 #define mmx .special = X86_SPECIAL_MMX,
 #define zext0 .special = X86_SPECIAL_ZExtOp0,
@@ -179,6 +180,9 @@
 #define p_66_f3_f2.valid_prefix = P_66 | P_F3 | P_F2,
 #define p_00_66_f3_f2 .valid_prefix = P_00 | P_66 | P_F3 | P_F2,
 
+static X86OpEntry illegal_opcode =
+X86_OP_ENTRY0(illegal);
+
 static uint8_t get_modrm(DisasContext *s, CPUX86State *env)
 {
 if (!s->has_modrm) {
@@ -1078,6 +1082,46 @@ static void decode_0F(DisasContext *s, CPUX86State *env, 
X86OpEntry *entry, uint
 do_decode_0F(s, env, entry, b);
 }
 
+static void decode_63(DisasContext *s, CPUX86State *env, X86OpEntry *entry, 
uint8_t *b)
+{
+static const X86OpEntry arpl = X86_OP_ENTRY2(ARPL, E,w, G,w, chk(prot));
+static const X86OpEntry mov = X86_OP_ENTRY3(MOV, G,v, E,v, None, None);
+static const X86OpEntry movsxd = X86_OP_ENTRY3(MOVSXD, G,v, E,d, None, 
None);
+if (!CODE64(s)) {
+*entry = arpl;
+} else if (REX_W(s)) {
+*entry = movsxd;
+} else {
+*entry = mov;
+}
+}
+
+static void decode_group1(DisasContext *s, CPUX86State *env, X86OpEntry 
*entry, uint8_t *b)
+{
+static const X86GenFunc group1_gen[8] = {
+gen_ADD, gen_OR, gen_ADC, gen_SBB, gen_AND, gen_SUB, gen_XOR, gen_SUB,
+};
+int op = (get_modrm(s, env) >> 3) & 7;
+entry->gen = group1_gen[op];
+
+if (op == 7) {
+/* CMP */
+entry->special = X86_SPECIAL_NoWriteback;
+}
+}
+
+static void decode_group1A(DisasContext *s, CPUX86State *env, X86OpEntry 
*entry, uint8_t *b)
+{
+int op = (get_modrm(s, env) >> 3) & 7;
+if (op != 0) {
+*entry = illegal_opcode;
+} else {
+entry->gen = gen_POP;
+/* The address must use the value of ESP after the pop.  */
+s->popl_esp_hack = 1 << mo_pushpop(s, s->dflag);
+}
+}
+
 static const X86OpEntry opcodes_root[256] = {
 [0x00] = X86_OP_ENTRY2(ADD, E,b, G,b),
 [0x01] = X86_OP_ENTRY2(ADD, E,v, G,v),
@@ -1133,6 +1177,60 @@ static const X86OpEntry opcodes_root[256] = {
 [0x56] = X86_OP_ENTRYr(PUSH, LoBits,d64),
 [0x57] = X86_OP_ENTRYr(PUSH, LoBits,d64),
 
+[0x60] = X86_OP_ENTRY0(PUSHA, chk(i64)),
+[0x61] = X86_OP_ENTRY0(POPA, chk(i64)),
+[0x62] = X86_OP_ENTRY2(BOUND, G,v, M,a, chk(i64)),
+[0x63] = X86_OP_GROUP0(63),
+[0x64] = {},
+[0x65] = {},
+[0x66] = {},
+[0x67] = {},
+
+[0x70] = X86_OP_ENTRYr(Jcc, J,b),
+[0x71] = X86_OP_ENTRYr(Jcc, J,b),
+[0x72] = X86_OP_ENTRYr(Jcc, J,b),
+[0x73] = X86_OP_ENTRYr(Jcc, J,b),
+[0x74] = X86_OP_ENTRYr(Jcc, J,b),
+[0x75] = X86_OP_ENTRYr(Jcc, J,b),
+[0x76] = X86_OP_ENTRYr(Jcc, J,b),
+[0x77] = X86_OP_ENTRYr(Jcc, J,b),
+
+[0x80] = X86_OP_GROUP2(group1, E,b, I,b),
+[0x81] = X86_OP_GROUP2(group1, E,v, I,z),
+[0x82] = X86_OP_GROUP2(group1, E,b, I,b, chk(i64)),
+[0x83] = X86_OP_GROUP2(group1, E,v, I,b),
+[0x84] = X86_OP_ENTRY2(AND, E,b, G,b, nowb),
+[0x85] = X86_OP_ENTRY2(AND, E,v, G,v, nowb),
+[0x86] = X86_OP_ENTRY2(XCHG, E,b, G,b, xchg),
+[0x87] = X86_OP_ENTRY2(XCHG, E,v, G,v, xchg),
+
+[0x90] = X86_OP_ENTRY2(XCHG, 0,v, LoBits,v),
+[0x91] = X86_OP_ENTRY2(XCHG, 0,v, LoBits,v),
+[0x92] = X86_OP_ENTRY2(XCHG, 0,v, LoBits,v),
+[0x93] = X86_OP_ENTRY2(XCHG, 0,v, LoBits,v),
+[0x94] = X86_OP_ENTRY2(XCHG, 0,v, LoBits,v),
+[0x95] = X86_OP_ENTRY2(XCHG, 0,v, LoBits,v),
+[0x96] = X86_OP_ENTRY2(XCHG, 0,v, LoBits,v),
+[0x97] = X86_OP_ENTRY2(XCHG, 0,v, LoBits,v),
+
+[0xA0] = X86_OP_ENTRY3(MOV, 0,b, O,b, None, None), /* AL, Ob */
+[0xA1] = X86_OP_ENTRY3(MOV, 0,v, O,v, None, None), /* rAX, Ov */
+[0xA2] = X86_OP_ENTRY3(MOV, O,b, 0,b, None, None), /* Ob, AL */
+[0xA3] = X86_OP_ENTRY3(MOV, O,v, 0,v, None, None), /* Ov, rAX */
+[0xA4] = X86_OP_ENTRY2(MOVS, Y,b, X,b, nowb),
+[0xA5] = X86_OP_ENTRY2(MOVS, Y,v, X,v, nowb),
+[0xA6] = X86_OP_ENTRY2(CMPS, Y,b, X,b, nowb),
+[0xA7] = X86_OP_ENTRY2(CMPS, Y,v, X,v, nowb),
+
+[0xB0] = X86_OP_ENTRY3(MOV, LoBits,b, I,b, None, None),
+[0xB1] = X86_OP_ENTRY3(MOV, LoBits,b, I,b, None, None),
+[0xB2] = X86_OP_ENTRY3(MOV, LoBits,b, I,b, None, None),
+[0xB3] = X86_OP_ENTRY3(MOV, LoBits,b, I,b, None, None),
+[0xB4] = X86_OP_ENTRY3(MOV, LoBits,b, I,b, None, None),
+[0xB5] = 

[PATCH 12/18] target/i386: adjust decoding of J operand

2023-10-14 Thread Paolo Bonzini
gen_jcc() has been changed to accept a relative offset since the
new decoder was written.  Adjust the J operand, which is meant
to be used with jump instructions such as gen_jcc(), to not
include the program counter and to not truncate the result, as
both operations are now performed by common code.

The result is that J is now the same as the I operand.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 10 --
 1 file changed, 10 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 91f79c09b73..37ed669bde0 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1448,19 +1448,9 @@ static bool decode_op(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode,
 }
 
 case X86_TYPE_I:  /* Immediate */
-op->unit = X86_OP_IMM;
-decode->immediate = insn_get_signed(env, s, op->ot);
-break;
-
 case X86_TYPE_J:  /* Relative offset for a jump */
 op->unit = X86_OP_IMM;
 decode->immediate = insn_get_signed(env, s, op->ot);
-decode->immediate += s->pc - s->cs_base;
-if (s->dflag == MO_16) {
-decode->immediate &= 0x;
-} else if (!CODE64(s)) {
-decode->immediate &= 0xu;
-}
 break;
 
 case X86_TYPE_L:  /* The upper 4 bits of the immediate select a 128-bit 
register */
-- 
2.41.0




[PATCH 08/18] target/i386: implement CMPccXADD

2023-10-14 Thread Paolo Bonzini
The main difficulty here is that a page fault when writing to the destination
must not overwrite the flags.  Therefore, the compute-flags helper must be
called with a temporary destination instead of using gen_jcc1*.

For simplicity, I am using an unconditional cmpxchg operation, that becomes
a NOP if the comparison fails.

Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.c|  2 +-
 target/i386/tcg/decode-new.c.inc | 30 ++
 target/i386/tcg/decode-new.h |  2 +
 target/i386/tcg/emit.c.inc   | 98 
 target/i386/tcg/translate.c  |  2 +
 5 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 8beb989701c..80f0445301b 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -738,7 +738,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
 #define TCG_7_0_EDX_FEATURES (CPUID_7_0_EDX_FSRM | 
CPUID_7_0_EDX_KERNEL_FEATURES)
 
 #define TCG_7_1_EAX_FEATURES (CPUID_7_1_EAX_FZRM | CPUID_7_1_EAX_FSRS | \
-  CPUID_7_1_EAX_FSRC)
+  CPUID_7_1_EAX_FSRC | CPUID_7_1_EAX_CMPCCXADD)
 #define TCG_7_1_EDX_FEATURES 0
 #define TCG_7_2_EDX_FEATURES 0
 #define TCG_APM_FEATURES 0
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index bad561ff66d..01c46e6a789 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -516,6 +516,28 @@ static const X86OpEntry opcodes_0F38_00toEF[240] = {
 [0xdd] = X86_OP_ENTRY3(VAESENCLAST, V,x,  H,x,   W,x,  vex4 cpuid(AES) 
p_66),
 [0xde] = X86_OP_ENTRY3(VAESDEC, V,x,  H,x,   W,x,  vex4 cpuid(AES) 
p_66),
 [0xdf] = X86_OP_ENTRY3(VAESDECLAST, V,x,  H,x,   W,x,  vex4 cpuid(AES) 
p_66),
+
+/*
+ * REG selects srcdest2 operand, VEX. selects src3.  VEX class not 
found
+ * in manual, assumed to be 13 from the VEX.L0 = constraint.
+ */
+[0xe0] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xe1] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xe2] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xe3] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xe4] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xe5] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xe6] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xe7] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+
+[0xe8] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xe9] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xea] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xeb] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xec] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xed] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xee] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
+[0xef] = X86_OP_ENTRY3(CMPccXADD,   EM,y, G,y, B,y,  vex13 xchg chk(o64) 
cpuid(CMPCCXADD) p_66),
 };
 
 /* five rows for no prefix, 66, F3, F2, 66+F2  */
@@ -1273,8 +1295,13 @@ static bool decode_op(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode,
 
 case X86_TYPE_WM:  /* modrm byte selects an XMM/YMM memory operand */
 op->unit = X86_OP_SSE;
+goto get_modrm_mem;
+
+case X86_TYPE_EM:  /* modrm byte selects an ALU memory operand */
+op->unit = X86_OP_INT;
 /* fall through */
 case X86_TYPE_M:  /* modrm byte selects a memory operand */
+get_modrm_mem:
 modrm = get_modrm(s, env);
 if ((modrm >> 6) == 3) {
 return false;
@@ -1511,6 +1538,9 @@ static bool has_cpuid_feature(DisasContext *s, 
X86CPUIDFeature cpuid)
 return (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2);
 case X86_FEAT_SHA_NI:
 return (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_SHA_NI);
+
+case X86_FEAT_CMPCCXADD:
+return (s->cpuid_7_1_eax_features & CPUID_7_1_EAX_CMPCCXADD);
 }
 g_assert_not_reached();
 }
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index b2879136614..b22de02ce54 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -47,6 +47,7 @@ typedef enum X86OpType {
 X86_TYPE_Y, /* string destination */
 
 /* Custom */
+X86_TYPE_EM, /* modrm byte selects an ALU memory operand */
 X86_TYPE_WM, /* modrm byte selects an XMM/YMM memory operand */

[PATCH 04/18] tests/tcg/i386: initialize more registers in test-avx

2023-10-14 Thread Paolo Bonzini
Some instructions use YMM0 implicitly, or use YMM9 as a read-modify-write
register destination.  Initialize those registers as well.

Signed-off-by: Paolo Bonzini 
---
 tests/tcg/i386/test-avx.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tests/tcg/i386/test-avx.c b/tests/tcg/i386/test-avx.c
index c39c0e5bce8..2a04c1ecf15 100644
--- a/tests/tcg/i386/test-avx.c
+++ b/tests/tcg/i386/test-avx.c
@@ -316,6 +316,8 @@ int main(int argc, char *argv[])
 int i;
 
 init_all();
+init_intreg([0]);
+init_intreg([9]);
 init_intreg([10]);
 init_intreg([11]);
 init_intreg([12]);
@@ -324,6 +326,8 @@ int main(int argc, char *argv[])
 dump_regs();
 
 init_all();
+init_f16reg([0]);
+init_f16reg([9]);
 init_f16reg([10]);
 init_f16reg([11]);
 init_f16reg([12]);
@@ -333,6 +337,8 @@ int main(int argc, char *argv[])
 dump_regs();
 
 init_all();
+init_f32reg([0]);
+init_f32reg([9]);
 init_f32reg([10]);
 init_f32reg([11]);
 init_f32reg([12]);
@@ -342,6 +348,8 @@ int main(int argc, char *argv[])
 dump_regs();
 
 init_all();
+init_f64reg([0]);
+init_f64reg([9]);
 init_f64reg([10]);
 init_f64reg([11]);
 init_f64reg([12]);
-- 
2.41.0




[PATCH 06/18] target/i386: accept full MemOp in gen_ext_tl

2023-10-14 Thread Paolo Bonzini
Use MO_SIGN to indicate signed vs. unsigned extension, and filter out
bits other than MO_SIGN and MO_SIZE.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 4f6f9fa7e52..d7d6c85877d 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -699,18 +699,18 @@ static inline void gen_op_movl_T0_Dshift(DisasContext *s, 
MemOp ot)
 tcg_gen_shli_tl(s->T0, s->T0, ot);
 };
 
-static TCGv gen_ext_tl(TCGv dst, TCGv src, MemOp size, bool sign)
+static TCGv gen_ext_tl(TCGv dst, TCGv src, MemOp ot)
 {
-switch (size) {
+switch (ot & MO_SIZE) {
 case MO_8:
-if (sign) {
+if (ot & MO_SIGN) {
 tcg_gen_ext8s_tl(dst, src);
 } else {
 tcg_gen_ext8u_tl(dst, src);
 }
 return dst;
 case MO_16:
-if (sign) {
+if (ot & MO_SIGN) {
 tcg_gen_ext16s_tl(dst, src);
 } else {
 tcg_gen_ext16u_tl(dst, src);
@@ -718,7 +718,7 @@ static TCGv gen_ext_tl(TCGv dst, TCGv src, MemOp size, bool 
sign)
 return dst;
 #ifdef TARGET_X86_64
 case MO_32:
-if (sign) {
+if (ot & MO_SIGN) {
 tcg_gen_ext32s_tl(dst, src);
 } else {
 tcg_gen_ext32u_tl(dst, src);
@@ -732,12 +732,12 @@ static TCGv gen_ext_tl(TCGv dst, TCGv src, MemOp size, 
bool sign)
 
 static void gen_extu(MemOp ot, TCGv reg)
 {
-gen_ext_tl(reg, reg, ot, false);
+gen_ext_tl(reg, reg, ot);
 }
 
 static void gen_exts(MemOp ot, TCGv reg)
 {
-gen_ext_tl(reg, reg, ot, true);
+gen_ext_tl(reg, reg, ot | MO_SIGN);
 }
 
 static void gen_op_j_ecx(DisasContext *s, TCGCond cond, TCGLabel *label1)
@@ -926,7 +926,7 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv 
reg)
 case CC_OP_SUBB ... CC_OP_SUBQ:
 /* (DATA_TYPE)CC_SRCT < (DATA_TYPE)CC_SRC */
 size = s->cc_op - CC_OP_SUBB;
-t1 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
+t1 = gen_ext_tl(s->tmp0, cpu_cc_src, size);
 /* If no temporary was used, be careful not to alias t1 and t0.  */
 t0 = t1 == cpu_cc_src ? s->tmp0 : reg;
 tcg_gen_mov_tl(t0, s->cc_srcT);
@@ -936,8 +936,8 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv 
reg)
 case CC_OP_ADDB ... CC_OP_ADDQ:
 /* (DATA_TYPE)CC_DST < (DATA_TYPE)CC_SRC */
 size = s->cc_op - CC_OP_ADDB;
-t1 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
-t0 = gen_ext_tl(reg, cpu_cc_dst, size, false);
+t1 = gen_ext_tl(s->tmp0, cpu_cc_src, size);
+t0 = gen_ext_tl(reg, cpu_cc_dst, size);
 add_sub:
 return (CCPrepare) { .cond = TCG_COND_LTU, .reg = t0,
  .reg2 = t1, .mask = -1, .use_reg2 = true };
@@ -965,7 +965,7 @@ static CCPrepare gen_prepare_eflags_c(DisasContext *s, TCGv 
reg)
 
 case CC_OP_BMILGB ... CC_OP_BMILGQ:
 size = s->cc_op - CC_OP_BMILGB;
-t0 = gen_ext_tl(reg, cpu_cc_src, size, false);
+t0 = gen_ext_tl(reg, cpu_cc_src, size);
 return (CCPrepare) { .cond = TCG_COND_EQ, .reg = t0, .mask = -1 };
 
 case CC_OP_ADCX:
@@ -1017,7 +1017,7 @@ static CCPrepare gen_prepare_eflags_s(DisasContext *s, 
TCGv reg)
 default:
 {
 MemOp size = (s->cc_op - CC_OP_ADDB) & 3;
-TCGv t0 = gen_ext_tl(reg, cpu_cc_dst, size, true);
+TCGv t0 = gen_ext_tl(reg, cpu_cc_dst, size | MO_SIGN);
 return (CCPrepare) { .cond = TCG_COND_LT, .reg = t0, .mask = -1 };
 }
 }
@@ -1062,7 +1062,7 @@ static CCPrepare gen_prepare_eflags_z(DisasContext *s, 
TCGv reg)
 default:
 {
 MemOp size = (s->cc_op - CC_OP_ADDB) & 3;
-TCGv t0 = gen_ext_tl(reg, cpu_cc_dst, size, false);
+TCGv t0 = gen_ext_tl(reg, cpu_cc_dst, size);
 return (CCPrepare) { .cond = TCG_COND_EQ, .reg = t0, .mask = -1 };
 }
 }
@@ -1088,7 +1088,7 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, 
TCGv reg)
 case JCC_BE:
 tcg_gen_mov_tl(s->tmp4, s->cc_srcT);
 gen_extu(size, s->tmp4);
-t0 = gen_ext_tl(s->tmp0, cpu_cc_src, size, false);
+t0 = gen_ext_tl(s->tmp0, cpu_cc_src, size);
 cc = (CCPrepare) { .cond = TCG_COND_LEU, .reg = s->tmp4,
.reg2 = t0, .mask = -1, .use_reg2 = true };
 break;
@@ -1101,7 +1101,7 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, 
TCGv reg)
 fast_jcc_l:
 tcg_gen_mov_tl(s->tmp4, s->cc_srcT);
 gen_exts(size, s->tmp4);
-t0 = gen_ext_tl(s->tmp0, cpu_cc_src, size, true);
+t0 = gen_ext_tl(s->tmp0, cpu_cc_src, size | MO_SIGN);
 cc = (CCPrepare) { .cond = cond, .reg = s->tmp4,
.reg2 = t0, .mask = -1, 

[PATCH 16/18] target/i386: move remaining conditional operations to new decoder

2023-10-14 Thread Paolo Bonzini
Move long-displacement Jcc, SETcc and CMOVcc to the new decoder.
While filling in the tables makes the code seem longer, the new
emitters are all just one line of code.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 56 
 target/i386/tcg/decode-new.h |  1 +
 target/i386/tcg/emit.c.inc   | 10 ++
 target/i386/tcg/translate.c  |  4 ++-
 4 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index d03bc5a9720..fdbe7bce968 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -944,6 +944,15 @@ static const X86OpEntry opcodes_0F[256] = {
 /* Incorrectly listed as Mq,Vq in the manual */
 [0x17] = X86_OP_ENTRY3(VMOVHPx_st,  M,q, None,None, V,dq, vex5 p_00_66),
 
+[0x40] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x41] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x42] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x43] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x44] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x45] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x46] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x47] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+
 [0x50] = X86_OP_ENTRY3(MOVMSK, G,y, None,None, U,x, vex7 p_00_66),
 [0x51] = X86_OP_GROUP3(sse_unary,  V,x, H,x, W,x, vex2_rep3 
p_00_66_f3_f2), /* sqrtps */
 [0x52] = X86_OP_GROUP3(sse_unary,  V,x, H,x, W,x, vex4_rep5 p_00_f3), /* 
rsqrtps */
@@ -971,6 +980,24 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x76] = X86_OP_ENTRY3(PCMPEQD,V,x, H,x, W,x,  vex4 mmx avx2_256 
p_00_66),
 [0x77] = X86_OP_GROUP0(0F77),
 
+[0x80] = X86_OP_ENTRYr(Jcc, J,z),
+[0x81] = X86_OP_ENTRYr(Jcc, J,z),
+[0x82] = X86_OP_ENTRYr(Jcc, J,z),
+[0x83] = X86_OP_ENTRYr(Jcc, J,z),
+[0x84] = X86_OP_ENTRYr(Jcc, J,z),
+[0x85] = X86_OP_ENTRYr(Jcc, J,z),
+[0x86] = X86_OP_ENTRYr(Jcc, J,z),
+[0x87] = X86_OP_ENTRYr(Jcc, J,z),
+
+[0x90] = X86_OP_ENTRYw(SETcc, E,b),
+[0x91] = X86_OP_ENTRYw(SETcc, E,b),
+[0x92] = X86_OP_ENTRYw(SETcc, E,b),
+[0x93] = X86_OP_ENTRYw(SETcc, E,b),
+[0x94] = X86_OP_ENTRYw(SETcc, E,b),
+[0x95] = X86_OP_ENTRYw(SETcc, E,b),
+[0x96] = X86_OP_ENTRYw(SETcc, E,b),
+[0x97] = X86_OP_ENTRYw(SETcc, E,b),
+
 [0x28] = X86_OP_ENTRY3(MOVDQ,  V,x,  None,None, W,x, vex1 p_00_66), /* 
MOVAPS */
 [0x29] = X86_OP_ENTRY3(MOVDQ,  W,x,  None,None, V,x, vex1 p_00_66), /* 
MOVAPS */
 [0x2A] = X86_OP_GROUP0(0F2A),
@@ -983,6 +1010,15 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x38] = X86_OP_GROUP0(0F38),
 [0x3a] = X86_OP_GROUP0(0F3A),
 
+[0x48] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x49] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4a] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4b] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4c] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4d] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4e] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+[0x4f] = X86_OP_ENTRY2(CMOVcc, G,v, E,v, cpuid(CMOV)),
+
 [0x58] = X86_OP_ENTRY3(VADD,   V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2),
 [0x59] = X86_OP_ENTRY3(VMUL,   V,x, H,x, W,x, vex2_rep3 p_00_66_f3_f2),
 [0x5a] = X86_OP_GROUP0(0F5A),
@@ -1008,6 +1044,24 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x7e] = X86_OP_GROUP0(0F7E),
 [0x7f] = X86_OP_GROUP0(0F7F),
 
+[0x88] = X86_OP_ENTRYr(Jcc, J,z),
+[0x89] = X86_OP_ENTRYr(Jcc, J,z),
+[0x8a] = X86_OP_ENTRYr(Jcc, J,z),
+[0x8b] = X86_OP_ENTRYr(Jcc, J,z),
+[0x8c] = X86_OP_ENTRYr(Jcc, J,z),
+[0x8d] = X86_OP_ENTRYr(Jcc, J,z),
+[0x8e] = X86_OP_ENTRYr(Jcc, J,z),
+[0x8f] = X86_OP_ENTRYr(Jcc, J,z),
+
+[0x98] = X86_OP_ENTRYw(SETcc, E,b),
+[0x99] = X86_OP_ENTRYw(SETcc, E,b),
+[0x9a] = X86_OP_ENTRYw(SETcc, E,b),
+[0x9b] = X86_OP_ENTRYw(SETcc, E,b),
+[0x9c] = X86_OP_ENTRYw(SETcc, E,b),
+[0x9d] = X86_OP_ENTRYw(SETcc, E,b),
+[0x9e] = X86_OP_ENTRYw(SETcc, E,b),
+[0x9f] = X86_OP_ENTRYw(SETcc, E,b),
+
 [0xae] = X86_OP_GROUP0(group15),
 
 [0xc2] = X86_OP_ENTRY4(VCMP,   V,x, H,x, W,x,   vex2_rep3 
p_00_66_f3_f2),
@@ -1743,6 +1797,8 @@ static bool has_cpuid_feature(DisasContext *s, 
X86CPUIDFeature cpuid)
 switch (cpuid) {
 case X86_FEAT_None:
 return true;
+case X86_FEAT_CMOV:
+return (s->cpuid_features & CPUID_CMOV);
 case X86_FEAT_F16C:
 return (s->cpuid_ext_features & CPUID_EXT_F16C);
 case X86_FEAT_FMA:
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 98671579abe..663dce7384d 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -105,6 +105,7 @@ typedef enum 

[PATCH 03/18] target/i386: implement SHA instructions

2023-10-14 Thread Paolo Bonzini
The implementation was validated with OpenSSL and with the test vectors in
https://github.com/rust-lang/stdarch/blob/master/crates/core_arch/src/x86/sha.rs.

The instructions provide a ~25% improvement on hashing a 64 MiB file:
runtime goes down from 1.8 seconds to 1.4 seconds; instruction count on
the host goes down from 5.8 billion to 4.8 billion with slightly better
IPC too.  Good job Intel. ;)

Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.c|   2 +-
 target/i386/ops_sse.h| 128 +++
 target/i386/tcg/decode-new.c.inc |  11 +++
 target/i386/tcg/decode-new.h |   1 +
 target/i386/tcg/emit.c.inc   |  54 +++
 target/i386/tcg/ops_sse_header.h.inc |  14 +++
 6 files changed, 209 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3aab05ddadc..8beb989701c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -714,7 +714,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
   CPUID_7_0_EBX_PCOMMIT | CPUID_7_0_EBX_CLFLUSHOPT |\
   CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_MPX | CPUID_7_0_EBX_FSGSBASE | \
   CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_RDSEED | \
-  CPUID_7_0_EBX_KERNEL_FEATURES)
+  CPUID_7_0_EBX_SHA_NI | CPUID_7_0_EBX_KERNEL_FEATURES)
   /* missing:
   CPUID_7_0_EBX_HLE
   CPUID_7_0_EBX_INVPCID, CPUID_7_0_EBX_RTM */
diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 33908c0691f..6a465a35fdb 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2527,6 +2527,134 @@ SSE_HELPER_FMAP(helper_fma4ps,  ZMM_S, 2 << SHIFT, 
float32_muladd)
 SSE_HELPER_FMAP(helper_fma4pd,  ZMM_D, 1 << SHIFT, float64_muladd)
 #endif
 
+#if SHIFT == 1
+#define SSE_HELPER_SHA1RNDS4(name, F, K) \
+void name(Reg *d, Reg *a, Reg *b)   \
+{   \
+uint32_t A, B, C, D, E, t, i;   \
+\
+A = a->L(3);\
+B = a->L(2);\
+C = a->L(1);\
+D = a->L(0);\
+E = 0;  \
+\
+for (i = 0; i <= 3; i++) {  \
+t = F(B, C, D) + rol32(A, 5) + b->L(3 - i) + E + K; \
+E = D;  \
+D = C;  \
+C = rol32(B, 30);   \
+B = A;  \
+A = t;  \
+}   \
+\
+d->L(3) = A;\
+d->L(2) = B;\
+d->L(1) = C;\
+d->L(0) = D;\
+}
+
+#define SHA1_F0(b, c, d) (((b) & (c)) ^ (~(b) & (d)))
+#define SHA1_F1(b, c, d) ((b) ^ (c) ^ (d))
+#define SHA1_F2(b, c, d) (((b) & (c)) ^ ((b) & (d)) ^ ((c) & (d)))
+
+SSE_HELPER_SHA1RNDS4(helper_sha1rnds4_f0, SHA1_F0, 0x5A827999)
+SSE_HELPER_SHA1RNDS4(helper_sha1rnds4_f1, SHA1_F1, 0x6ED9EBA1)
+SSE_HELPER_SHA1RNDS4(helper_sha1rnds4_f2, SHA1_F2, 0x8F1BBCDC)
+SSE_HELPER_SHA1RNDS4(helper_sha1rnds4_f3, SHA1_F1, 0xCA62C1D6)
+
+void helper_sha1nexte(Reg *d, Reg *a, Reg *b)
+{
+d->L(3) = b->L(3) + rol32(a->L(3), 30);
+d->L(2) = b->L(2);
+d->L(1) = b->L(1);
+d->L(0) = b->L(0);
+}
+
+void helper_sha1msg1(Reg *d, Reg *a, Reg *b)
+{
+/* These could be overwritten by the first two assignments, save them.  */
+uint32_t b3 = b->L(3);
+uint32_t b2 = b->L(2);
+
+d->L(3) = a->L(3) ^ a->L(1);
+d->L(2) = a->L(2) ^ a->L(0);
+d->L(1) = a->L(1) ^ b3;
+d->L(0) = a->L(0) ^ b2;
+}
+
+void helper_sha1msg2(Reg *d, Reg *a, Reg *b)
+{
+d->L(3) = rol32(a->L(3) ^ b->L(2), 1);
+d->L(2) = rol32(a->L(2) ^ b->L(1), 1);
+d->L(1) = rol32(a->L(1) ^ b->L(0), 1);
+d->L(0) = rol32(a->L(0) ^ d->L(3), 1);
+}
+
+#define SHA256_CH(e, f, g)  (((e) & (f)) ^ (~(e) & (g)))
+#define SHA256_MAJ(a, b, c) (((a) & (b)) ^ ((a) & (c)) ^ ((b) & (c)))
+
+#define SHA256_RNDS0(w) (ror32((w), 2) ^ 

[PATCH 17/18] target/i386: remove now converted opcodes from old decoder

2023-10-14 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 705 +---
 1 file changed, 4 insertions(+), 701 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 1f3cc6361c0..6e091fdb7f6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3176,7 +3176,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #ifndef CONFIG_USER_ONLY
 use_new &= b <= limit;
 #endif
-if (use_new && b <= 0xbf) {
+if (use_new && 0) {
 disas_insn_new(s, cpu, b);
 return true;
 }
@@ -3186,9 +3186,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #ifndef CONFIG_USER_ONLY
 use_new &= b <= limit;
 #endif
-if (use_new &&
-   ((b >= 0x140 && b <= 0x14f) ||
-(b >= 0x180 && b <= 0x19f))) {
+if (use_new && 0) {
 disas_insn_new(s, cpu, b);
 return true;
 }
@@ -3289,119 +3287,6 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 switch (b) {
 /**/
 /* arith & logic */
-case 0x00 ... 0x05:
-case 0x08 ... 0x0d:
-case 0x10 ... 0x15:
-case 0x18 ... 0x1d:
-case 0x20 ... 0x25:
-case 0x28 ... 0x2d:
-case 0x30 ... 0x35:
-case 0x38 ... 0x3d:
-{
-int f;
-op = (b >> 3) & 7;
-f = (b >> 1) & 3;
-
-ot = mo_b_d(b, dflag);
-
-switch(f) {
-case 0: /* OP Ev, Gv */
-modrm = x86_ldub_code(env, s);
-reg = ((modrm >> 3) & 7) | REX_R(s);
-mod = (modrm >> 6) & 3;
-rm = (modrm & 7) | REX_B(s);
-if (mod != 3) {
-gen_lea_modrm(env, s, modrm);
-opreg = OR_TMP0;
-} else if (op == OP_XORL && rm == reg) {
-xor_zero:
-/* xor reg, reg optimisation */
-set_cc_op(s, CC_OP_CLR);
-tcg_gen_movi_tl(s->T0, 0);
-gen_op_mov_reg_v(s, ot, reg, s->T0);
-break;
-} else {
-opreg = rm;
-}
-gen_op_mov_v_reg(s, ot, s->T1, reg);
-gen_op(s, op, ot, opreg);
-break;
-case 1: /* OP Gv, Ev */
-modrm = x86_ldub_code(env, s);
-mod = (modrm >> 6) & 3;
-reg = ((modrm >> 3) & 7) | REX_R(s);
-rm = (modrm & 7) | REX_B(s);
-if (mod != 3) {
-gen_lea_modrm(env, s, modrm);
-gen_op_ld_v(s, ot, s->T1, s->A0);
-} else if (op == OP_XORL && rm == reg) {
-goto xor_zero;
-} else {
-gen_op_mov_v_reg(s, ot, s->T1, rm);
-}
-gen_op(s, op, ot, reg);
-break;
-case 2: /* OP A, Iv */
-val = insn_get(env, s, ot);
-tcg_gen_movi_tl(s->T1, val);
-gen_op(s, op, ot, OR_EAX);
-break;
-}
-}
-break;
-
-case 0x82:
-if (CODE64(s))
-goto illegal_op;
-/* fall through */
-case 0x80: /* GRP1 */
-case 0x81:
-case 0x83:
-{
-ot = mo_b_d(b, dflag);
-
-modrm = x86_ldub_code(env, s);
-mod = (modrm >> 6) & 3;
-rm = (modrm & 7) | REX_B(s);
-op = (modrm >> 3) & 7;
-
-if (mod != 3) {
-if (b == 0x83)
-s->rip_offset = 1;
-else
-s->rip_offset = insn_const_size(ot);
-gen_lea_modrm(env, s, modrm);
-opreg = OR_TMP0;
-} else {
-opreg = rm;
-}
-
-switch(b) {
-default:
-case 0x80:
-case 0x81:
-case 0x82:
-val = insn_get(env, s, ot);
-break;
-case 0x83:
-val = (int8_t)insn_get(env, s, MO_8);
-break;
-}
-tcg_gen_movi_tl(s->T1, val);
-gen_op(s, op, ot, opreg);
-}
-break;
-
-/**/
-/* inc, dec, and other misc arith */
-case 0x40 ... 0x47: /* inc Gv */
-ot = dflag;
-gen_inc(s, ot, OR_EAX + (b & 7), 1);
-break;
-case 0x48 ... 0x4f: /* dec Gv */
-ot = dflag;
-gen_inc(s, ot, OR_EAX + (b & 7), -1);
-break;
 case 0xf6: /* GRP3 */
 case 0xf7:
 ot = mo_b_d(b, dflag);
@@ -3725,81 +3610,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 break;
 
-case 0x84: /* test Ev, Gv */
-case 0x85:
-ot = mo_b_d(b, dflag);
-
-modrm = x86_ldub_code(env, s);
-reg = 

[PATCH 07/18] target/i386: introduce flags writeback mechanism

2023-10-14 Thread Paolo Bonzini
ALU instructions can write to both memory and flags.  If the CC_SRC*
and CC_DST locations have been written already when a memory access
causes a fault, the value in CC_SRC* and CC_DST might be interpreted
with the wrong CC_OP (the one that is in effect before the instruction.

Besides just using the wrong result for the flags, something like
subtracting -1 can have disastrous effects if the current CC_OP is
CC_OP_EFLAGS: this is because QEMU does not expect bits outside the ALU
flags to be set in CC_SRC, and env->eflags can end up set to all-ones.
In the case of the attached testcase, this sets IOPL to 3 and would
cause an assertion failure if SUB is moved to the new decoder.

This mechanism is not really needed for BMI instructions, which can
only write to a register, but put it to use anyway for cleanliness.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 20 +
 target/i386/tcg/decode-new.h |  2 ++
 target/i386/tcg/emit.c.inc   | 15 +++--
 tests/tcg/i386/Makefile.target   |  2 +-
 tests/tcg/i386/test-flags.c  | 37 
 5 files changed, 69 insertions(+), 7 deletions(-)
 create mode 100644 tests/tcg/i386/test-flags.c

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index eb2400095f8..bad561ff66d 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1943,6 +1943,26 @@ static void disas_insn_new(DisasContext *s, CPUState 
*cpu, int b)
 decode.e.gen(s, env, );
 gen_writeback(s, , 0, s->T0);
 }
+
+/*
+ * Write back flags after last memory access.  Some newer ALU 
instructions, as
+ * well as SSE instructions, write flags in the gen_* function, but that 
can
+ * cause incorrect tracking of CC_OP for instructions that write to both 
memory
+ * and flags.
+ */
+if (decode.cc_dst) {
+tcg_gen_mov_tl(cpu_cc_dst, decode.cc_dst);
+}
+if (decode.cc_src) {
+tcg_gen_mov_tl(cpu_cc_src, decode.cc_src);
+}
+if (decode.cc_src2) {
+tcg_gen_mov_tl(cpu_cc_src, decode.cc_src2);
+}
+if (decode.cc_srcT) {
+tcg_gen_mov_tl(s->cc_srcT, decode.cc_srcT);
+}
+
 return;
  gp_fault:
 gen_exception_gpf(s);
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index ab21fa6db97..b2879136614 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -265,6 +265,8 @@ struct X86DecodedInsn {
 target_ulong immediate;
 AddressParts mem;
 
+TCGv cc_dst, cc_src, cc_src2, cc_srcT;
+
 uint8_t b;
 };
 
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 82da5488d47..b5dfdc409e5 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -323,6 +323,12 @@ static inline int vector_len(DisasContext *s, 
X86DecodedInsn *decode)
 return s->vex_l ? 32 : 16;
 }
 
+static void prepare_update1_cc(X86DecodedInsn *decode, DisasContext *s, CCOp 
op)
+{
+decode->cc_dst = s->T0;
+set_cc_op(s, op);
+}
+
 static void gen_store_sse(DisasContext *s, X86DecodedInsn *decode, int src_ofs)
 {
 MemOp ot = decode->op[0].ot;
@@ -1073,8 +1079,7 @@ static void gen_ANDN(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 MemOp ot = decode->op[0].ot;
 
 tcg_gen_andc_tl(s->T0, s->T1, s->T0);
-gen_op_update1_cc(s);
-set_cc_op(s, CC_OP_LOGICB + ot);
+prepare_update1_cc(decode, s, CC_OP_LOGICB + ot);
 }
 
 static void gen_BEXTR(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
@@ -1105,8 +1110,7 @@ static void gen_BEXTR(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 tcg_gen_movcond_tl(TCG_COND_LEU, s->T1, s->A0, bound, s->T1, zero);
 tcg_gen_andc_tl(s->T0, s->T0, s->T1);
 
-gen_op_update1_cc(s);
-set_cc_op(s, CC_OP_LOGICB + ot);
+prepare_update1_cc(decode, s, CC_OP_LOGICB + ot);
 }
 
 static void gen_BLSI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
@@ -1161,8 +1165,7 @@ static void gen_BZHI(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 tcg_gen_movcond_tl(TCG_COND_LEU, s->A0, s->T1, bound, s->A0, zero);
 tcg_gen_andc_tl(s->T0, s->T0, s->A0);
 
-gen_op_update1_cc(s);
-set_cc_op(s, CC_OP_BMILGB + ot);
+prepare_update1_cc(decode, s, CC_OP_BMILGB + ot);
 }
 
 static void gen_CRC32(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index fdf757c6ce4..ca0f543ef16 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -13,7 +13,7 @@ config-cc.mak: Makefile
 
 I386_SRCS=$(notdir $(wildcard $(I386_SRC)/*.c))
 ALL_X86_TESTS=$(I386_SRCS:.c=)
-SKIP_I386_TESTS=test-i386-ssse3 test-avx test-3dnow test-mmx
+SKIP_I386_TESTS=test-i386-ssse3 test-avx test-3dnow test-mmx test-flags
 X86_64_TESTS:=$(filter test-i386-adcox test-i386-bmi2 $(SKIP_I386_TESTS), 
$(ALL_X86_TESTS))
 
 

[PATCH 05/18] tests/tcg/i386: test-avx: add test cases for SHA new instructions

2023-10-14 Thread Paolo Bonzini
---
 tests/tcg/i386/test-avx.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/tcg/i386/test-avx.py b/tests/tcg/i386/test-avx.py
index 641a2ef69eb..6063fb2d11d 100755
--- a/tests/tcg/i386/test-avx.py
+++ b/tests/tcg/i386/test-avx.py
@@ -9,7 +9,7 @@
 archs = [
 "SSE", "SSE2", "SSE3", "SSSE3", "SSE4_1", "SSE4_2",
 "AES", "AVX", "AVX2", "AES+AVX", "VAES+AVX",
-"F16C", "FMA",
+"F16C", "FMA", "SHA",
 ]
 
 ignore = set(["FISTTP",
@@ -43,6 +43,7 @@
 'vPS[LR][AL][WDQ]': 0x3f,
 'vPS[RL]LDQ': 0x1f,
 'vROUND[PS][SD]': 0x7,
+'SHA1RNDS4': 0x03,
 'vSHUFPD': 0x0f,
 'vSHUFPS': 0xff,
 'vAESKEYGENASSIST': 0xff,
-- 
2.41.0




[PATCH 00/18] target/i386: decoder changes for 8.2

2023-10-14 Thread Paolo Bonzini
This includes:

- implementing SHA and CMPccXADD instruction extensions

- introducing a new mechanism for flags writeback that avoids a
  tricky failure

- converting the more orthogonal parts of the one-byte opcode
  map, as well as the CMOVcc and SETcc instructions.

Tested by booting several 32-bit and 64-bit guests.

Paolo


Paolo Bonzini (18):
  target/i386: group common checks in the decoding phase
  target/i386: validate VEX.W for AVX instructions
  target/i386: implement SHA instructions
  tests/tcg/i386: initialize more registers in test-avx
  tests/tcg/i386: test-avx: add test cases for SHA new instructions
  target/i386: accept full MemOp in gen_ext_tl
  target/i386: introduce flags writeback mechanism
  target/i386: implement CMPccXADD
  target/i386: do not clobber A0 in POP translation
  target/i386: reintroduce debugging mechanism
  target/i386: move 00-5F opcodes to new decoder
  target/i386: adjust decoding of J operand
  target/i386: split eflags computation out of gen_compute_eflags
  target/i386: move 60-BF opcodes to new decoder
  target/i386: move operand load and writeback out of gen_cmovcc1
  target/i386: move remaining conditional operations to new decoder
  target/i386: remove now converted opcodes from old decoder
  target/i386: remove gen_op

 target/i386/cpu.c|4 +-
 target/i386/ops_sse.h|  128 
 target/i386/tcg/decode-new.c.inc |  605 ++--
 target/i386/tcg/decode-new.h |   41 +-
 target/i386/tcg/emit.c.inc   |  726 ++-
 target/i386/tcg/ops_sse_header.h.inc |   14 +
 target/i386/tcg/translate.c  | 1001 +++---
 tests/tcg/i386/Makefile.target   |2 +-
 tests/tcg/i386/test-avx.c|8 +
 tests/tcg/i386/test-avx.py   |3 +-
 tests/tcg/i386/test-flags.c  |   37 +
 11 files changed, 1593 insertions(+), 976 deletions(-)
 create mode 100644 tests/tcg/i386/test-flags.c

-- 
2.41.0




[PATCH 10/18] target/i386: reintroduce debugging mechanism

2023-10-14 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc |  5 -
 target/i386/tcg/translate.c  | 27 +++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 01c46e6a789..fb95e0b9268 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1701,6 +1701,9 @@ static void disas_insn_new(DisasContext *s, CPUState 
*cpu, int b)
 X86DecodedInsn decode;
 X86DecodeFunc decode_func = decode_root;
 
+#ifdef CONFIG_USER_ONLY
+if (limit) { --limit; }
+#endif
 s->has_modrm = false;
 
  next_byte:
@@ -1987,7 +1990,7 @@ static void disas_insn_new(DisasContext *s, CPUState 
*cpu, int b)
 tcg_gen_mov_tl(cpu_cc_src, decode.cc_src);
 }
 if (decode.cc_src2) {
-tcg_gen_mov_tl(cpu_cc_src, decode.cc_src2);
+tcg_gen_mov_tl(cpu_cc_src2, decode.cc_src2);
 }
 if (decode.cc_srcT) {
 tcg_gen_mov_tl(s->cc_srcT, decode.cc_srcT);
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 39b5752e780..080b56840da 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2980,6 +2980,9 @@ static void gen_sty_env_A0(DisasContext *s, int offset, 
bool align)
 tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
 }
 
+static bool first = true;
+static unsigned long limit;
+
 #include "decode-new.h"
 #include "emit.c.inc"
 #include "decode-new.c.inc"
@@ -3134,15 +3137,39 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 
 prefixes = 0;
 
+if (first) {
+const char *env = getenv("QEMU_I386_LIMIT");
+limit = env ? atol(env) : -1;
+first = false;
+}
+bool use_new = true;
+#ifdef CONFIG_USER_ONLY
+use_new &= limit > 0;
+#endif
+
  next_byte:
 s->prefix = prefixes;
 b = x86_ldub_code(env, s);
 /* Collect prefixes.  */
 switch (b) {
 default:
+#ifndef CONFIG_USER_ONLY
+use_new &= b <= limit;
+#endif
+if (use_new && 0) {
+disas_insn_new(s, cpu, b);
+return true;
+}
 break;
 case 0x0f:
 b = x86_ldub_code(env, s) + 0x100;
+#ifndef CONFIG_USER_ONLY
+use_new &= b <= limit;
+#endif
+if (use_new && 0) {
+disas_insn_new(s, cpu, b);
+return true;
+}
 break;
 case 0xf3:
 prefixes |= PREFIX_REPZ;
-- 
2.41.0




[PATCH 01/18] target/i386: group common checks in the decoding phase

2023-10-14 Thread Paolo Bonzini
In preparation for adding more similar checks, move the VEX.L=0 check
and several X86_SPECIAL_* checks to a new field, where each bit represent
a common check on unused bits, or a restriction on the processor mode.

Likewise, many SVM intercepts can be checked during the decoding phase,
the main exception being the selective CR0 write, MSR and IOIO intercepts.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 79 +++-
 target/i386/tcg/decode-new.h | 25 +++---
 target/i386/tcg/emit.c.inc   |  8 
 3 files changed, 76 insertions(+), 36 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 7d76f152758..790339eaf25 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -90,8 +90,6 @@
 X86_OP_ENTRY3(op, None, None, None, None, None, None, ## __VA_ARGS__)
 
 #define cpuid(feat) .cpuid = X86_FEAT_##feat,
-#define i64 .special = X86_SPECIAL_i64,
-#define o64 .special = X86_SPECIAL_o64,
 #define xchg .special = X86_SPECIAL_Locked,
 #define mmx .special = X86_SPECIAL_MMX,
 #define zext0 .special = X86_SPECIAL_ZExtOp0,
@@ -114,6 +112,9 @@
 #define vex12 .vex_class = 12,
 #define vex13 .vex_class = 13,
 
+#define chk(a) .check = X86_CHECK_##a,
+#define svm(a) .intercept = SVM_EXIT_##a,
+
 #define avx2_256 .vex_special = X86_VEX_AVX2_256,
 
 #define P_00  1
@@ -161,8 +162,8 @@ static void decode_group15(DisasContext *s, CPUX86State 
*env, X86OpEntry *entry,
 };
 
 static const X86OpEntry group15_mem[8] = {
-[2] = X86_OP_ENTRYr(LDMXCSR,E,d, vex5),
-[3] = X86_OP_ENTRYw(STMXCSR,E,d, vex5),
+[2] = X86_OP_ENTRYr(LDMXCSR,E,d, vex5 chk(VEX128)),
+[3] = X86_OP_ENTRYw(STMXCSR,E,d, vex5 chk(VEX128)),
 };
 
 uint8_t modrm = get_modrm(s, env);
@@ -1579,6 +1580,12 @@ static bool validate_vex(DisasContext *s, X86DecodedInsn 
*decode)
 if (s->flags & HF_EM_MASK) {
 goto illegal;
 }
+
+if (e->check & X86_CHECK_VEX128) {
+if (s->vex_l) {
+goto illegal;
+}
+}
 return true;
 
 nm_exception:
@@ -1764,6 +1771,25 @@ static void disas_insn_new(DisasContext *s, CPUState 
*cpu, int b)
 goto illegal_op;
 }
 
+/* Checks that result in #UD come first.  */
+if (decode.e.check) {
+if (decode.e.check & X86_CHECK_i64) {
+if (CODE64(s)) {
+goto illegal_op;
+}
+}
+if (decode.e.check & X86_CHECK_o64) {
+if (!CODE64(s)) {
+goto illegal_op;
+}
+}
+if (decode.e.check & X86_CHECK_prot) {
+if (!PE(s) || VM86(s)) {
+goto illegal_op;
+}
+}
+}
+
 switch (decode.e.special) {
 case X86_SPECIAL_None:
 break;
@@ -1774,23 +1800,6 @@ static void disas_insn_new(DisasContext *s, CPUState 
*cpu, int b)
 }
 break;
 
-case X86_SPECIAL_ProtMode:
-if (!PE(s) || VM86(s)) {
-goto illegal_op;
-}
-break;
-
-case X86_SPECIAL_i64:
-if (CODE64(s)) {
-goto illegal_op;
-}
-break;
-case X86_SPECIAL_o64:
-if (!CODE64(s)) {
-goto illegal_op;
-}
-break;
-
 case X86_SPECIAL_ZExtOp0:
 assert(decode.op[0].unit == X86_OP_INT);
 if (!decode.op[0].has_ea) {
@@ -1820,6 +1829,31 @@ static void disas_insn_new(DisasContext *s, CPUState 
*cpu, int b)
 if (!validate_vex(s, )) {
 return;
 }
+
+/*
+ * Checks that result in #GP or VMEXIT come second.  Intercepts are
+ * generally checked after non-memory exceptions (i.e. before all
+ * exceptions if there is no memory operand).  Exceptions are
+ * vm86 checks (INTn, IRET, PUSHF/POPF), RSM and XSETBV (!).
+ *
+ * RSM and XSETBV will be handled in the gen_* functions
+ * instead of using chk().
+ */
+if (decode.e.check & X86_CHECK_cpl0) {
+if (CPL(s) != 0) {
+goto gp_fault;
+}
+}
+if (decode.e.intercept && unlikely(GUEST(s))) {
+gen_helper_svm_check_intercept(tcg_env,
+   tcg_constant_i32(decode.e.intercept));
+}
+if (decode.e.check & X86_CHECK_vm86_iopl) {
+if (VM86(s) && IOPL(s) < 3) {
+goto gp_fault;
+}
+}
+
 if (decode.e.special == X86_SPECIAL_MMX &&
 !(s->prefix & (PREFIX_REPZ | PREFIX_REPNZ | PREFIX_DATA))) {
 gen_helper_enter_mmx(tcg_env);
@@ -1846,6 +1880,9 @@ static void disas_insn_new(DisasContext *s, CPUState 
*cpu, int b)
 gen_writeback(s, , 0, s->T0);
 }
 return;
+ gp_fault:
+gen_exception_gpf(s);
+return;
  illegal_op:
 gen_illegal_opcode(s);
 return;
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index a542ec16813..631d39220bb 100644
--- 

[PATCH 18/18] target/i386: remove gen_op

2023-10-14 Thread Paolo Bonzini
It is not used anymore by the old decoder, inline the CMP case into CMPS and 
SCAS.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 145 +++-
 1 file changed, 12 insertions(+), 133 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 6e091fdb7f6..3d5cdf4d29a 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -238,21 +238,8 @@ static void gen_eob(DisasContext *s);
 static void gen_jr(DisasContext *s);
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num);
 static void gen_jmp_rel_csize(DisasContext *s, int diff, int tb_num);
-static void gen_op(DisasContext *s1, int op, MemOp ot, int d);
 static void gen_exception_gpf(DisasContext *s);
 
-/* i386 arith/logic operations */
-enum {
-OP_ADDL,
-OP_ORL,
-OP_ADCL,
-OP_SBBL,
-OP_ANDL,
-OP_SUBL,
-OP_XORL,
-OP_CMPL,
-};
-
 /* i386 shift ops */
 enum {
 OP_ROL,
@@ -853,13 +840,6 @@ static void gen_op_update2_cc(DisasContext *s)
 tcg_gen_mov_tl(cpu_cc_dst, s->T0);
 }
 
-static void gen_op_update3_cc(DisasContext *s, TCGv reg)
-{
-tcg_gen_mov_tl(cpu_cc_src2, reg);
-tcg_gen_mov_tl(cpu_cc_src, s->T1);
-tcg_gen_mov_tl(cpu_cc_dst, s->T0);
-}
-
 static inline void gen_op_testl_T0_T1_cc(DisasContext *s)
 {
 tcg_gen_and_tl(cpu_cc_dst, s->T0, s->T1);
@@ -1288,7 +1268,12 @@ static void gen_scas(DisasContext *s, MemOp ot)
 {
 gen_string_movl_A0_EDI(s);
 gen_op_ld_v(s, ot, s->T1, s->A0);
-gen_op(s, OP_CMPL, ot, R_EAX);
+gen_op_mov_v_reg(s, ot, s->T0, R_EAX);
+tcg_gen_mov_tl(cpu_cc_src, s->T1);
+tcg_gen_mov_tl(s->cc_srcT, s->T0);
+tcg_gen_sub_tl(cpu_cc_dst, s->T0, s->T1);
+set_cc_op(s, CC_OP_SUBB + ot);
+
 gen_op_movl_T0_Dshift(s, ot);
 gen_op_add_reg_T0(s, s->aflag, R_EDI);
 }
@@ -1298,7 +1283,12 @@ static void gen_cmps(DisasContext *s, MemOp ot)
 gen_string_movl_A0_EDI(s);
 gen_op_ld_v(s, ot, s->T1, s->A0);
 gen_string_movl_A0_ESI(s);
-gen_op(s, OP_CMPL, ot, OR_TMP0);
+gen_op_ld_v(s, ot, s->T0, s->A0);
+tcg_gen_mov_tl(cpu_cc_src, s->T1);
+tcg_gen_mov_tl(s->cc_srcT, s->T0);
+tcg_gen_sub_tl(cpu_cc_dst, s->T0, s->T1);
+set_cc_op(s, CC_OP_SUBB + ot);
+
 gen_op_movl_T0_Dshift(s, ot);
 gen_op_add_reg_T0(s, s->aflag, R_ESI);
 gen_op_add_reg_T0(s, s->aflag, R_EDI);
@@ -1506,117 +1496,6 @@ static bool check_iopl(DisasContext *s)
 return false;
 }
 
-/* if d == OR_TMP0, it means memory operand (address in A0) */
-static void gen_op(DisasContext *s1, int op, MemOp ot, int d)
-{
-if (d != OR_TMP0) {
-if (s1->prefix & PREFIX_LOCK) {
-/* Lock prefix when destination is not memory.  */
-gen_illegal_opcode(s1);
-return;
-}
-gen_op_mov_v_reg(s1, ot, s1->T0, d);
-} else if (!(s1->prefix & PREFIX_LOCK)) {
-gen_op_ld_v(s1, ot, s1->T0, s1->A0);
-}
-switch(op) {
-case OP_ADCL:
-gen_compute_eflags_c(s1, s1->tmp4);
-if (s1->prefix & PREFIX_LOCK) {
-tcg_gen_add_tl(s1->T0, s1->tmp4, s1->T1);
-tcg_gen_atomic_add_fetch_tl(s1->T0, s1->A0, s1->T0,
-s1->mem_index, ot | MO_LE);
-} else {
-tcg_gen_add_tl(s1->T0, s1->T0, s1->T1);
-tcg_gen_add_tl(s1->T0, s1->T0, s1->tmp4);
-gen_op_st_rm_T0_A0(s1, ot, d);
-}
-gen_op_update3_cc(s1, s1->tmp4);
-set_cc_op(s1, CC_OP_ADCB + ot);
-break;
-case OP_SBBL:
-gen_compute_eflags_c(s1, s1->tmp4);
-if (s1->prefix & PREFIX_LOCK) {
-tcg_gen_add_tl(s1->T0, s1->T1, s1->tmp4);
-tcg_gen_neg_tl(s1->T0, s1->T0);
-tcg_gen_atomic_add_fetch_tl(s1->T0, s1->A0, s1->T0,
-s1->mem_index, ot | MO_LE);
-} else {
-tcg_gen_sub_tl(s1->T0, s1->T0, s1->T1);
-tcg_gen_sub_tl(s1->T0, s1->T0, s1->tmp4);
-gen_op_st_rm_T0_A0(s1, ot, d);
-}
-gen_op_update3_cc(s1, s1->tmp4);
-set_cc_op(s1, CC_OP_SBBB + ot);
-break;
-case OP_ADDL:
-if (s1->prefix & PREFIX_LOCK) {
-tcg_gen_atomic_add_fetch_tl(s1->T0, s1->A0, s1->T1,
-s1->mem_index, ot | MO_LE);
-} else {
-tcg_gen_add_tl(s1->T0, s1->T0, s1->T1);
-gen_op_st_rm_T0_A0(s1, ot, d);
-}
-gen_op_update2_cc(s1);
-set_cc_op(s1, CC_OP_ADDB + ot);
-break;
-case OP_SUBL:
-if (s1->prefix & PREFIX_LOCK) {
-tcg_gen_neg_tl(s1->T0, s1->T1);
-tcg_gen_atomic_fetch_add_tl(s1->cc_srcT, s1->A0, s1->T0,
-s1->mem_index, ot | MO_LE);
-tcg_gen_sub_tl(s1->T0, s1->cc_srcT, s1->T1);
-} else {
-tcg_gen_mov_tl(s1->cc_srcT, s1->T0);
-tcg_gen_sub_tl(s1->T0, s1->T0, 

[PATCH 09/18] target/i386: do not clobber A0 in POP translation

2023-10-14 Thread Paolo Bonzini
The new decoder likes to compute the address in A0 very early, so the
gen_lea_v_seg in gen_pop_T0 would clobber the address of the memory
operand.  Instead use T0 since it is already available and will be
overwritten immediately after.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 34 --
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 038151a8c3e..39b5752e780 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -628,17 +628,17 @@ static TCGv eip_cur_tl(DisasContext *s)
 }
 }
 
-/* Compute SEG:REG into A0.  SEG is selected from the override segment
+/* Compute SEG:REG into DEST.  SEG is selected from the override segment
(OVR_SEG) and the default segment (DEF_SEG).  OVR_SEG may be -1 to
indicate no override.  */
-static void gen_lea_v_seg(DisasContext *s, MemOp aflag, TCGv a0,
-  int def_seg, int ovr_seg)
+static void gen_lea_v_seg_dest(DisasContext *s, MemOp aflag, TCGv dest, TCGv 
a0,
+   int def_seg, int ovr_seg)
 {
 switch (aflag) {
 #ifdef TARGET_X86_64
 case MO_64:
 if (ovr_seg < 0) {
-tcg_gen_mov_tl(s->A0, a0);
+tcg_gen_mov_tl(dest, a0);
 return;
 }
 break;
@@ -649,14 +649,14 @@ static void gen_lea_v_seg(DisasContext *s, MemOp aflag, 
TCGv a0,
 ovr_seg = def_seg;
 }
 if (ovr_seg < 0) {
-tcg_gen_ext32u_tl(s->A0, a0);
+tcg_gen_ext32u_tl(dest, a0);
 return;
 }
 break;
 case MO_16:
 /* 16 bit address */
-tcg_gen_ext16u_tl(s->A0, a0);
-a0 = s->A0;
+tcg_gen_ext16u_tl(dest, a0);
+a0 = dest;
 if (ovr_seg < 0) {
 if (ADDSEG(s)) {
 ovr_seg = def_seg;
@@ -673,17 +673,23 @@ static void gen_lea_v_seg(DisasContext *s, MemOp aflag, 
TCGv a0,
 TCGv seg = cpu_seg_base[ovr_seg];
 
 if (aflag == MO_64) {
-tcg_gen_add_tl(s->A0, a0, seg);
+tcg_gen_add_tl(dest, a0, seg);
 } else if (CODE64(s)) {
-tcg_gen_ext32u_tl(s->A0, a0);
-tcg_gen_add_tl(s->A0, s->A0, seg);
+tcg_gen_ext32u_tl(dest, a0);
+tcg_gen_add_tl(dest, dest, seg);
 } else {
-tcg_gen_add_tl(s->A0, a0, seg);
-tcg_gen_ext32u_tl(s->A0, s->A0);
+tcg_gen_add_tl(dest, a0, seg);
+tcg_gen_ext32u_tl(dest, dest);
 }
 }
 }
 
+static void gen_lea_v_seg(DisasContext *s, MemOp aflag, TCGv a0,
+  int def_seg, int ovr_seg)
+{
+gen_lea_v_seg_dest(s, aflag, s->A0, a0, def_seg, ovr_seg);
+}
+
 static inline void gen_string_movl_A0_ESI(DisasContext *s)
 {
 gen_lea_v_seg(s, s->aflag, cpu_regs[R_ESI], R_DS, s->override);
@@ -2590,8 +2596,8 @@ static MemOp gen_pop_T0(DisasContext *s)
 {
 MemOp d_ot = mo_pushpop(s, s->dflag);
 
-gen_lea_v_seg(s, mo_stacksize(s), cpu_regs[R_ESP], R_SS, -1);
-gen_op_ld_v(s, d_ot, s->T0, s->A0);
+gen_lea_v_seg_dest(s, mo_stacksize(s), s->T0, cpu_regs[R_ESP], R_SS, -1);
+gen_op_ld_v(s, d_ot, s->T0, s->T0);
 
 return d_ot;
 }
-- 
2.41.0




[PATCH 13/18] target/i386: split eflags computation out of gen_compute_eflags

2023-10-14 Thread Paolo Bonzini
The new x86 decoder wants to compute EFLAGS before writeback, which
can be an issue for some instructions such as ARPL.  Extract code
to compute the EFLAGS without clobbering CC_SRC, in case the ARPL
memory write causes a fault.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index e13bf7df591..2da7c357cdc 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -872,18 +872,20 @@ static void gen_op_update_neg_cc(DisasContext *s)
 tcg_gen_movi_tl(s->cc_srcT, 0);
 }
 
-/* compute all eflags to cc_src */
-static void gen_compute_eflags(DisasContext *s)
+/* compute all eflags to reg */
+static void gen_mov_eflags(DisasContext *s, TCGv reg)
 {
 TCGv zero, dst, src1, src2;
 int live, dead;
 
 if (s->cc_op == CC_OP_EFLAGS) {
+if (reg != cpu_cc_src) {
+tcg_gen_mov_tl(reg, cpu_cc_src);
+}
 return;
 }
 if (s->cc_op == CC_OP_CLR) {
-tcg_gen_movi_tl(cpu_cc_src, CC_Z | CC_P);
-set_cc_op(s, CC_OP_EFLAGS);
+tcg_gen_movi_tl(reg, CC_Z | CC_P);
 return;
 }
 
@@ -909,7 +911,13 @@ static void gen_compute_eflags(DisasContext *s)
 }
 
 gen_update_cc_op(s);
-gen_helper_cc_compute_all(cpu_cc_src, dst, src1, src2, cpu_cc_op);
+gen_helper_cc_compute_all(reg, dst, src1, src2, cpu_cc_op);
+}
+
+/* compute all eflags to cc_src */
+static void gen_compute_eflags(DisasContext *s)
+{
+gen_mov_eflags(s, cpu_cc_src);
 set_cc_op(s, CC_OP_EFLAGS);
 }
 
-- 
2.41.0




[PATCH 02/18] target/i386: validate VEX.W for AVX instructions

2023-10-14 Thread Paolo Bonzini
Instructions in VEX exception class 6 generally look at the value of
VEX.W.  Note that the manual places some instructions incorrectly in
class 4, for example VPERMQ which has no non-VEX encoding and no legacy
SSE analogue.  AMD does a mess of its own, as documented in the comment
that this patch adds.

Most of them are checked for VEX.W=0, and are listed in the manual
(though with an omission) in table 2-16; VPERMQ and VPERMPD check for
VEX.W=1, which is only listed in the instruction description.  Others,
such as VPSRLV, VPSLLV and the FMA3 instructions, use VEX.W to switch
between a 32-bit and 64-bit operation.

Fix more of the class 4/class 6 mismatches, and implement the check for
VEX.W in TCG.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 133 +--
 target/i386/tcg/decode-new.h |   6 ++
 2 files changed, 99 insertions(+), 40 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 790339eaf25..850271e0898 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -43,6 +43,47 @@
  * There are a couple cases in which instructions (e.g. MOVD) write the
  * whole XMM or MM register but are established incorrectly in the manual
  * as "d" or "q".  These have to be fixed for the decoder to work correctly.
+ *
+ * Speaking about imprecisions in the manual, the decoder treats all
+ * exception-class 4 instructions as having an optional VEX prefix, and
+ * all exception-class 6 instructions as having a mandatory VEX prefix.
+ * This is true except for a dozen instructions; these are in exception
+ * class 4 but do not ignore the VEX.W bit (which does not even exist
+ * without a VEX prefix).  These instructions are mostly listed in Intel's
+ * table 2-16, but with a few exceptions.
+ *
+ * The AMD manual has more precise subclasses for exceptions, and unlike Intel
+ * they list the VEX.W requirements in the exception classes as well (except
+ * when they don't).  AMD describes class 6 as "AVX Mixed Memory Argument"
+ * without defining what a mixed memory argument is, but still use 4 as the
+ * primary exception class... except when they don't.
+ *
+ * The summary is:
+ *   Intel AMD VEX.W   note
+ * ---
+ * vpblendd  4 4J  0
+ * vpblendvb 4 4E-X0   (*)
+ * vpbroadcastq  6 6D  0   (+)
+ * vpermd/vpermps4 4H  0   (§)
+ * vpermq/vpermpd4 4H-11   (§)
+ * vpermilpd/vpermilps   4 6E  0   (^)
+ * vpmaskmovd6 4K  significant (^)
+ * vpsllv4 4K  significant
+ * vpsrav4 4J  0
+ * vpsrlv4 4K  significant
+ * vtestps/vtestpd   4 4G  0
+ *
+ *(*)  AMD lists VPBLENDVB as related to SSE4.1 PBLENDVB, which may
+ * explain why it is considered exception class 4.  However,
+ * Intel says that VEX-only instructions should be in class 6...
+ *
+ *(+)  Not found in Intel's table 2-16
+ *
+ *(§)  4H and 4H-1 do not mention VEX.W requirements, which are
+ * however present in the description of the instruction
+ *
+ *(^)  these are the two cases in which Intel and AMD disagree on the
+ * primary exception class
  */
 
 #define X86_OP_NONE { 0 },
@@ -338,11 +379,11 @@ static const X86OpEntry opcodes_0F38_00toEF[240] = {
 [0x07] = X86_OP_ENTRY3(PHSUBSW,   V,x,  H,x,   W,x,  vex4 cpuid(SSSE3) mmx 
avx2_256 p_00_66),
 
 [0x10] = X86_OP_ENTRY2(PBLENDVB,  V,x, W,x,  vex4 cpuid(SSE41) 
avx2_256 p_66),
-[0x13] = X86_OP_ENTRY2(VCVTPH2PS, V,x, W,xh, vex11 cpuid(F16C) 
p_66),
+[0x13] = X86_OP_ENTRY2(VCVTPH2PS, V,x, W,xh, vex11 chk(W0) 
cpuid(F16C) p_66),
 [0x14] = X86_OP_ENTRY2(BLENDVPS,  V,x, W,x,  vex4 cpuid(SSE41) 
p_66),
 [0x15] = X86_OP_ENTRY2(BLENDVPD,  V,x, W,x,  vex4 cpuid(SSE41) 
p_66),
 /* Listed incorrectly as type 4 */
-[0x16] = X86_OP_ENTRY3(VPERMD,V,qq, H,qq,  W,qq,  vex6 cpuid(AVX2) 
p_66),
+[0x16] = X86_OP_ENTRY3(VPERMD,V,qq, H,qq,  W,qq,  vex6 chk(W0) 
cpuid(AVX2) p_66), /* vpermps */
 [0x17] = X86_OP_ENTRY3(VPTEST,None,None, V,x,  W,x,   vex4 
cpuid(SSE41) p_66),
 
 /*
@@ -363,14 +404,14 @@ static const X86OpEntry opcodes_0F38_00toEF[240] = {
 [0x33] = X86_OP_ENTRY3(VPMOVZXWD, V,x,  None,None, W,q,   vex5 
cpuid(SSE41) avx_movx avx2_256 p_66),
 [0x34] = X86_OP_ENTRY3(VPMOVZXWQ, V,x,  None,None, W,d,   vex5 
cpuid(SSE41) avx_movx avx2_256 p_66),
 [0x35] = X86_OP_ENTRY3(VPMOVZXDQ, V,x,  None,None, W,q,   vex5 
cpuid(SSE41) avx_movx avx2_256 p_66),
-[0x36] = X86_OP_ENTRY3(VPERMD,

Re: [PATCH v2 1/2] qdev: add IOThreadVirtQueueMappingList property type

2023-10-14 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> virtio-blk and virtio-scsi devices will need a way to specify the
> mapping between IOThreads and virtqueues. At the moment all virtqueues
> are assigned to a single IOThread or the main loop. This single thread
> can be a CPU bottleneck, so it is necessary to allow finer-grained
> assignment to spread the load.
>
> Introduce DEFINE_PROP_IOTHREAD_VQ_MAPPING_LIST() so devices can take a
> parameter that maps virtqueues to IOThreads. The command-line syntax for
> this new property is as follows:
>
>   --device 
> '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]}'
>
> IOThreads are specified by name and virtqueues are specified by 0-based
> index.
>
> It will be common to simply assign virtqueues round-robin across a set
> of IOThreads. A convenient syntax that does not require specifying
> individual virtqueue indices is available:
>
>   --device 
> '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]}'
>
> Signed-off-by: Stefan Hajnoczi 
> ---
>  qapi/virtio.json| 30 ++
>  include/hw/qdev-properties-system.h |  4 +++
>  hw/core/qdev-properties-system.c| 47 +
>  3 files changed, 81 insertions(+)
>
> diff --git a/qapi/virtio.json b/qapi/virtio.json
> index e6dcee7b83..cb341ae596 100644
> --- a/qapi/virtio.json
> +++ b/qapi/virtio.json
> @@ -928,3 +928,33 @@
>'data': { 'path': 'str', 'queue': 'uint16', '*index': 'uint16' },
>'returns': 'VirtioQueueElement',
>'features': [ 'unstable' ] }
> +
> +##
> +# @IOThreadVirtQueueMapping:
> +#
> +# Describes the subset of virtqueues assigned to an IOThread.
> +#
> +# @iothread: the id of IOThread object
> +# @vqs: an optional array of virtqueue indices that will be handled by this
> +#   IOThread. When absent, virtqueues are assigned round-robin across all
> +#   IOThreadVirtQueueMappings provided. Either all
> +#   IOThreadVirtQueueMappings must have @vqs or none of them must have 
> it.
> +#
> +# Since: 8.2
> +#
> +##

Please format like

   ##
   # @IOThreadVirtQueueMapping:
   #
   # Describes the subset of virtqueues assigned to an IOThread.
   #
   # @iothread: the id of IOThread object
   #
   # @vqs: an optional array of virtqueue indices that will be handled by
   # this IOThread.  When absent, virtqueues are assigned round-robin
   # across all IOThreadVirtQueueMappings provided.  Either all
   # IOThreadVirtQueueMappings must have @vqs or none of them must
   # have it.
   #
   # Since: 8.2
   ##

to blend in with recent commit a937b6aa739 (qapi: Reformat doc comments
to conform to current conventions).

> +
> +{ 'struct': 'IOThreadVirtQueueMapping',
> +  'data': { 'iothread': 'str', '*vqs': ['uint16'] } }
> +
> +##
> +# @IOThreadVirtQueueMappings:
> +#
> +# IOThreadVirtQueueMapping list. This struct is not actually used but the
> +# IOThreadVirtQueueMappingList type it generates is!

Two spaces between sentences for consistency, please.

Doc comments are QMP reference documentation for users.  Does this
paragraph belong there?

> +#
> +# Since: 8.2
> +##
> +
> +{ 'struct': 'IOThreadVirtQueueMappings',
> +  'data': { 'mappings': ['IOThreadVirtQueueMapping'] } }
> diff --git a/include/hw/qdev-properties-system.h 
> b/include/hw/qdev-properties-system.h
> index 0ac327ae60..c526e502c8 100644
> --- a/include/hw/qdev-properties-system.h
> +++ b/include/hw/qdev-properties-system.h
> @@ -22,6 +22,7 @@ extern const PropertyInfo qdev_prop_audiodev;
>  extern const PropertyInfo qdev_prop_off_auto_pcibar;
>  extern const PropertyInfo qdev_prop_pcie_link_speed;
>  extern const PropertyInfo qdev_prop_pcie_link_width;
> +extern const PropertyInfo qdev_prop_iothread_vq_mapping_list;
>  
>  #define DEFINE_PROP_PCI_DEVFN(_n, _s, _f, _d)   \
>  DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_pci_devfn, int32_t)
> @@ -73,5 +74,8 @@ extern const PropertyInfo qdev_prop_pcie_link_width;
>  #define DEFINE_PROP_UUID_NODEFAULT(_name, _state, _field) \
>  DEFINE_PROP(_name, _state, _field, qdev_prop_uuid, QemuUUID)
>  
> +#define DEFINE_PROP_IOTHREAD_VQ_MAPPING_LIST(_name, _state, _field) \
> +DEFINE_PROP(_name, _state, _field, qdev_prop_iothread_vq_mapping_list, \
> +IOThreadVirtQueueMappingList *)
>  
>  #endif
> diff --git a/hw/core/qdev-properties-system.c 
> b/hw/core/qdev-properties-system.c
> index 6d5d43eda2..831796e106 100644
> --- a/hw/core/qdev-properties-system.c
> +++ b/hw/core/qdev-properties-system.c
> @@ -18,6 +18,7 @@
>  #include "qapi/qapi-types-block.h"
>  #include "qapi/qapi-types-machine.h"
>  #include "qapi/qapi-types-migration.h"
> +#include "qapi/qapi-visit-virtio.h"
>  #include "qapi/qmp/qerror.h"
>  #include "qemu/ctype.h"
>  #include "qemu/cutils.h"
> @@ -1147,3 +1148,49 @@ const PropertyInfo qdev_prop_uuid = {
>  .set   = set_uuid,
>  .set_default_value = set_default_uuid_auto,
>  };
> +
> +/* --- 

Re: [PATCH v3 0/10] hw/char/pl011: Implement TX (async) FIFO to avoid blocking the main loop

2023-10-14 Thread Mark Cave-Ayland

On 13/10/2023 15:11, Philippe Mathieu-Daudé wrote:


Hi,

This series add support for (async) FIFO on the transmit path
of the PL011 UART.

Since v2:
- Added R-b tags
- Addressed Richard comments on migration

Since v1:
- Restrict pl011_ops[] impl access_size,
- Do not check transmitter is enabled (Peter),
- Addressed Alex's review comments,
- Simplified migration trying to care about backward compat,
   but still unsure...

Philippe Mathieu-Daudé (9):
   util/fifo8: Allow fifo8_pop_buf() to not populate popped length
   util/fifo8: Introduce fifo8_peek_buf()
   hw/char/pl011: Split RX/TX path of pl011_reset_fifo()
   hw/char/pl011: Extract pl011_write_txdata() from pl011_write()
   hw/char/pl011: Extract pl011_read_rxdata() from pl011_read()
   hw/char/pl011: Warn when using disabled transmitter
   hw/char/pl011: Check if receiver is enabled
   hw/char/pl011: Rename RX FIFO methods
   hw/char/pl011: Add transmit FIFO to PL011State
   hw/char/pl011: Implement TX FIFO

  include/hw/char/pl011.h |   2 +
  include/qemu/fifo8.h|  37 +--
  hw/char/pl011.c | 140 +---
  util/fifo8.c|  28 ++--
  hw/char/trace-events|   4 +-
  5 files changed, 161 insertions(+), 50 deletions(-)


Looks like patch 10 where all the interesting stuff is didn't make it to the list? 
Patchew also agrees here: 
https://patchew.org/QEMU/20231013141131.1531-1-phi...@linaro.org/.



ATB,

Mark.




Re: [PATCH 11/11] qdev: Rework array properties based on list visitor

2023-10-14 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 22.09.2023 um 17:05 hat Markus Armbruster geschrieben:
>> Kevin Wolf  writes:
>> 
>> > Until now, array properties are actually implemented with a hack that
>> > uses multiple properties on the QOM level: a static "foo-len" property
>> > and after it is set, dynamically created "foo[i]" properties.
>> >
>> > In external interfaces (-device on the command line and device_add in
>> > QMP), this interface was broken by commit f3558b1b ('qdev: Base object
>> > creation on QDict rather than QemuOpts') because QDicts are unordered
>> > and therefore it could happen that QEMU tried to set the indexed
>> > properties before setting the length, which fails and effectively makes
>> > array properties inaccessible. In particular, this affects the 'ports'
>> > property of the 'rocker' device.
>> >
>> > This patch reworks the external interface so that instead of using a
>> > separate top-level property for the length and for each element, we use
>> > a single true array property that accepts a list value. In the external
>> > interfaces, this is naturally expressed as a JSON list and makes array
>> > properties accessible again.
>> >
>> > Creating an array property on the command line without using JSON format
>> > is currently not possible. This could be fixed by switching from
>> > QemuOpts to a keyval parser, which however requires consideration of the
>> > compatibility implications.
>> >
>> > All internal users of devices with array properties go through
>> > qdev_prop_set_array() at this point, so updating it takes care of all of
>> > them.
>> >
>> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1090
>> > Fixes: f3558b1b763683bb877f7dd5b282469cdadc65c3
>> > Signed-off-by: Kevin Wolf 

[...]

>> > diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
>> > index 950ef48e01..b2303a6fbc 100644
>> > --- a/hw/core/qdev-properties.c
>> > +++ b/hw/core/qdev-properties.c
>> > @@ -546,98 +546,152 @@ const PropertyInfo qdev_prop_size32 = {
>> >  
>> >  /* --- support for array properties --- */
>> >  
>> > -/* Used as an opaque for the object properties we add for each
>> > - * array element. Note that the struct Property must be first
>> > - * in the struct so that a pointer to this works as the opaque
>> > - * for the underlying element's property hooks as well as for
>> > - * our own release callback.
>> > - */
>> > -typedef struct {
>> > -struct Property prop;
>> > -char *propname;
>> > -ObjectPropertyRelease *release;
>> > -} ArrayElementProperty;
>> > -
>> > -/* object property release callback for array element properties:
>> > - * we call the underlying element's property release hook, and
>> > - * then free the memory we allocated when we added the property.
>> > +static Property array_elem_prop(Object *obj, Property *parent_prop,
>> > +const char *name, char *elem)
>> 
>> @parent_prop is an array property.  It's backed by an uint32_t length
>> and an element array.  @elem points into the element array.  Correct?
>
> Correct.

Worth explaining in a comment?

>> > +{
>> > +return (Property) {
>> > +.info = parent_prop->arrayinfo,
>> > +.name = name,
>> > +/*
>> > + * This ugly piece of pointer arithmetic sets up the offset so
>> > + * that when the underlying release hook calls qdev_get_prop_ptr
>> > + * they get the right answer despite the array element not 
>> > actually
>> > + * being inside the device struct.
>> > + */
>> > +.offset = elem - (char *) obj,
>> 
>> Isn't this is undefined behavior?
>
> It should be at least less illegal than the old version of it, which did
> the calculation on void * and still worked in practice...
>
> But yes, strictly speaking, it's probably undefined behaviour. I can
> calculate on uintptr_t instead, and then it should be defined here.
>
> The QOM counterpart object_field_prop_ptr() is probably still undefined
> because it calculates on a pointer and I think the spec allows casting
> back to a pointer only after we've applied the offset so that we stay in
> the same object with pointer arithmetics.

We should not have to waste time on worrying about compilers using UB
fine print against us, but sadly we do.

I'm not objecting to your code, I'm merely pointing out a potential time
bomb.  In a programming environment that has embraced time bombing with
gusto.

>> Delete the space between (char *) and obj.
>> 
>> > +};
>> > +}
>> > +
>> > +/*
>> > + * Object property release callback for array properties: We call the 
>> > underlying
>> > + * element's property release hook for each element.
>> > + *
>> > + * Note that it is the responsibility of the individual device's deinit 
>> > to free
>> > + * the array proper.
>> 
>> What is a device's "deinit"?  Is it the unrealize() method?  The
>> instance_finalize() method?
>
> Who knows? I only moved this comment.

Opportunity to improve it.  Not a demand.

> My 

Re: [PATCH 00/85] target/sparc: Convert to decodetree

2023-10-14 Thread Mark Cave-Ayland

On 13/10/2023 22:27, Richard Henderson wrote:


While doing some other testing the other day, I noticed my sparc64
chroot running particularly slowly.  I think I know what the problem
is there, but fixing that was going to be particularly ugly with the
existing sparc translator.

So I've converted the translator to something more managable.  :-)

I've only done avocado testing so far, fingers crossed.
 
r~


Oh wow, that's amazing - looking into this has been on my TODO list for quite some 
time now :O


FWIW I'm still struggling with hangs on sun4m which I've noticed a lot more whilst 
working on my ESP changes. I *think* it is the same issue I saw before when testing 
your original gen_helper_lookup_tb_ptr() conversion series for target/sparc, which 
did disappear in the final version of the series but I can now reproduce fairly 
consistently with git master.


The reproducer here is easy with Solaris 8:
./build/qemu-system-sparc -cdrom sol8-cd1.iso -boot d

Then when the splash screen appears keep wiggling the mouse until everything locks 
up. In my ESP traces I sometimes see random hangs where the trace-events would end 
with "esp_raise_irq()" and sit there for 10s of seconds before resuming, so both of 
this seems to suggest that interrupts aren't getting through when they should.


Anyhow I can certainly give this series a spin on my OpenBIOS test images over the 
next few days as time allows.



ATB,

Mark.


Richard Henderson (85):
   target/sparc: Set TCG_GUEST_DEFAULT_MO
   configs: Enable MTTCG for sparc, sparc64
   target/sparc: Remove always-set cpu features
   target/sparc: Add decodetree infrastructure
   target/sparc: Define AM_CHECK for sparc32
   target/sparc: Move CALL to decodetree
   target/sparc: Move BPcc and Bicc to decodetree
   target/sparc: Move BPr to decodetree
   target/sparc: Move FBPfcc and FBfcc to decodetree
   target/sparc: Merge gen_cond with only caller
   target/sparc: Merge gen_fcond with only caller
   target/sparc: Merge gen_branch_[an] with only caller
   target/sparc: Pass DisasCompare to advance_jump_cond
   target/sparc: Move SETHI to decodetree
   target/sparc: Move Tcc to decodetree
   target/sparc: Move RDASR, STBAR, MEMBAR to decodetree
   target/sparc: Move RDPSR, RDHPR to decodetree
   target/sparc: Move RDWIM, RDPR to decodetree
   target/sparc: Move RDTBR, FLUSHW to decodetree
   target/sparc: Move WRASR to decodetree
   target/sparc: Move WRPSR, SAVED, RESTORED to decodetree
   target/sparc: Move WRWIM, WRPR to decodetree
   target/sparc: Move WRTBR, WRHPR to decodetree
   target/sparc: Move basic arithmetic to decodetree
   target/sparc: Move ADDC to decodetree
   target/sparc: Move MULX to decodetree
   target/sparc: Move UMUL, SMUL to decodetree
   target/sparc: Move SUBC to decodetree
   target/sparc: Move UDIVX, SDIVX to decodetree
   target/sparc: Move UDIV, SDIV to decodetree
   target/sparc: Move TADD, TSUB, MULS to decodetree
   target/sparc: Move SLL, SRL, SRA to decodetree
   target/sparc: Move MOVcc, MOVR to decodetree
   target/sparc: Move POPC to decodetree
   target/sparc: Convert remaining v8 coproc insns to decodetree
   target/sparc: Move JMPL, RETT, RETURN to decodetree
   target/sparc: Move FLUSH, SAVE, RESTORE to decodetree
   target/sparc: Move DONE, RETRY to decodetree
   target/sparc: Split out resolve_asi
   target/sparc: Drop ifdef around get_asi and friends
   target/sparc: Split out ldst functions with asi pre-computed
   target/sparc: Use tcg_gen_qemu_{ld,st}_i128 for GET_ASI_DTWINX
   target/sparc: Move simple integer load/store to decodetree
   target/sparc: Move asi integer load/store to decodetree
   target/sparc: Move LDSTUB, LDSTUBA to decodetree
   target/sparc: Move SWAP, SWAPA to decodetree
   target/sparc: Move CASA, CASXA to decodetree
   target/sparc: Move PREFETCH, PREFETCHA to decodetree
   target/sparc: Split out fp ldst functions with asi precomputed
   target/sparc: Move simple fp load/store to decodetree
   target/sparc: Move asi fp load/store to decodetree
   target/sparc: Move LDFSR, STFSR to decodetree
   target/sparc: Merge LDFSR, LDXFSR implementations
   target/sparc: Move EDGE* to decodetree
   target/sparc: Move ARRAY* to decodetree
   target/sparc: Move ADDRALIGN* to decodetree
   target/sparc: Move BMASK to decodetree
   target/sparc: Move FMOVS, FNEGS, FABSS, FSRC*S, FNOT*S to decodetree
   target/sparc: Move FMOVD, FNEGD, FABSD, FSRC*D, FNOT*D to decodetree
   target/sparc: Use tcg_gen_vec_{add,sub}*
   target/sparc: Move gen_ne_fop_FFF insns to decodetree
   target/sparc: Move gen_ne_fop_DDD insns to decodetree
   target/sparc: Move PDIST to decodetree
   target/sparc: Move gen_gsr_fop_DDD insns to decodetree
   target/sparc: Move gen_fop_FF insns to decodetree
   target/sparc: Move gen_fop_DD insns to decodetree
   target/sparc: Move FSQRTq to decodetree
   target/sparc: Move gen_fop_FFF insns to decodetree
   target/sparc: Move gen_fop_DDD insns to decodetree
   target/sparc: Move