date:20230131

Re: [PATCH 08/25] hw/arm/boot: Export write_bootloader for Aspeed machines

2023-01-31 Thread Cédric Le Goater


On 2/1/23 06:45, Joel Stanley wrote:

On Thu, 19 Jan 2023 at 12:37, Cédric Le Goater  wrote:


AST2600 Aspeed machines have a home-made boot loader for secondaries.
To improve support, export the internal ARM boot loader and use it
instead.


I didn't quite follow why we're doing this. Is it just a cleanup?


It comes from this discussion :

  https://lore.kernel.org/qemu-devel/2022115549.86872-1-phi...@linaro.org/

I will let Phil handle it. The patch is on the list now.

Thanks,

C.




Signed-off-by: Cédric Le Goater 
---
  include/hw/arm/boot.h | 24 
  hw/arm/aspeed.c   | 42 ++
  hw/arm/boot.c | 34 +++---
  3 files changed, 53 insertions(+), 47 deletions(-)

diff --git a/include/hw/arm/boot.h b/include/hw/arm/boot.h
index f18cc3064f..23edd0d31b 100644
--- a/include/hw/arm/boot.h
+++ b/include/hw/arm/boot.h
@@ -183,4 +183,28 @@ void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
  const struct arm_boot_info *info,
  hwaddr mvbar_addr);

+typedef enum {
+FIXUP_NONE = 0, /* do nothing */
+FIXUP_TERMINATOR,   /* end of insns */
+FIXUP_BOARDID,  /* overwrite with board ID number */
+FIXUP_BOARD_SETUP,  /* overwrite with board specific setup code address */
+FIXUP_ARGPTR_LO,/* overwrite with pointer to kernel args */
+FIXUP_ARGPTR_HI,/* overwrite with pointer to kernel args (high half) */
+FIXUP_ENTRYPOINT_LO, /* overwrite with kernel entry point */
+FIXUP_ENTRYPOINT_HI, /* overwrite with kernel entry point (high half) */
+FIXUP_GIC_CPU_IF,   /* overwrite with GIC CPU interface address */
+FIXUP_BOOTREG,  /* overwrite with boot register address */
+FIXUP_DSB,  /* overwrite with correct DSB insn for cpu */
+FIXUP_MAX,
+} FixupType;
+
+typedef struct ARMInsnFixup {
+uint32_t insn;
+FixupType fixup;
+} ARMInsnFixup;
+
+void arm_write_bootloader(const char *name, hwaddr addr,
+  const ARMInsnFixup *insns, uint32_t *fixupcontext,
+  AddressSpace *as);
+
  #endif /* HW_ARM_BOOT_H */
diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index 4919a1fe9e..c373bd2851 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -198,33 +198,35 @@ struct AspeedMachineState {
  static void aspeed_write_smpboot(ARMCPU *cpu,
   const struct arm_boot_info *info)
  {
-static const uint32_t poll_mailbox_ready[] = {
+AddressSpace *as = arm_boot_address_space(cpu, info);
+static const ARMInsnFixup poll_mailbox_ready[] = {
  /*
   * r2 = per-cpu go sign value
   * r1 = AST_SMP_MBOX_FIELD_ENTRY
   * r0 = AST_SMP_MBOX_FIELD_GOSIGN
   */
-0xee100fb0,  /* mrc p15, 0, r0, c0, c0, 5 */
-0xe21000ff,  /* ands r0, r0, #255 */
-0xe59f201c,  /* ldr r2, [pc, #28] */
-0xe1822000,  /* orr r2, r2, r0*/
-
-0xe59f1018,  /* ldr r1, [pc, #24] */
-0xe59f0018,  /* ldr r0, [pc, #24] */
-
-0xe320f002,  /* wfe   */
-0xe5904000,  /* ldr r4, [r0]  */
-0xe1520004,  /* cmp r2, r4*/
-0x1afffffb,  /* bne <wfe> */
-0xe591f000,  /* ldr pc, [r1]  */
-AST_SMP_MBOX_GOSIGN,
-AST_SMP_MBOX_FIELD_ENTRY,
-AST_SMP_MBOX_FIELD_GOSIGN,
+{ 0xee100fb0 },  /* mrc p15, 0, r0, c0, c0, 5 */
+{ 0xe21000ff },  /* ands r0, r0, #255 */
+{ 0xe59f201c },  /* ldr r2, [pc, #28] */
+{ 0xe1822000 },  /* orr r2, r2, r0*/
+
+{ 0xe59f1018 },  /* ldr r1, [pc, #24] */
+{ 0xe59f0018 },  /* ldr r0, [pc, #24] */
+
+{ 0xe320f002 },  /* wfe   */
+{ 0xe5904000 },  /* ldr r4, [r0]  */
+{ 0xe1520004 },  /* cmp r2, r4*/
+{ 0x1afffffb },  /* bne <wfe> */
+{ 0xe591f000 },  /* ldr pc, [r1]  */
+{ AST_SMP_MBOX_GOSIGN },
+{ AST_SMP_MBOX_FIELD_ENTRY },
+{ AST_SMP_MBOX_FIELD_GOSIGN },
+{ 0, FIXUP_TERMINATOR }
  };
+uint32_t fixupcontext[FIXUP_MAX] = { 0 };

-rom_add_blob_fixed("aspeed.smpboot", poll_mailbox_ready,
-   sizeof(poll_mailbox_ready),
-   info->smp_loader_start);
+arm_write_bootloader("aspeed.smpboot", info->smp_loader_start,
+ poll_mailbox_ready, fixupcontext, as);
  }

  static void aspeed_reset_secondary(ARMCPU *cpu,
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 3d7d11f782..ed6fd7c77f 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -59,26 +59,6 @@ AddressSpace

Re: [PATCH 0/3] util/userfaultfd: Support /dev/userfaultfd

2023-01-31 Thread Michal Prívozník

On 1/31/23 22:01, Peter Xu wrote:
> I'll wait 1-2 more days to see whether Michal has anything to comment.

Yeah, we can go with your patches and leave FD passing for future work.
It's orthogonal after all.

In the end we can have (in the order of precedence):

1) new cmd line argument, say:

   qemu-system-x86_64 -userfaultfd fd=5 # where FD 5 is passed by
libvirt when exec()-ing qemu, just like other FDs, e.g. -chardev
socket,fd=XXX

2) your patches, where qemu opens /dev/userfaultfd

3) current behavior, userfaultfd syscall


Michal
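
The order of precedence listed above can be sketched as a small selection function. This is only an illustration, not QEMU code: the `-userfaultfd fd=` option is a proposal in this thread, and the function and parameter names below are hypothetical.

```python
import os
from typing import Optional


def pick_userfaultfd_source(passed_fd: Optional[int],
                            dev_path: str = "/dev/userfaultfd") -> str:
    """Return which mechanism would be tried first (hypothetical helper)."""
    # 1) An FD handed over by the management layer wins (proposed, not implemented).
    if passed_fd is not None:
        return "passed-fd"
    # 2) Otherwise use the device node if this process may open it.
    if os.access(dev_path, os.R_OK | os.W_OK):
        return "dev-node"
    # 3) Fall back to the userfaultfd() syscall, the current behavior.
    return "syscall"
```

Keeping the decision in one place would make it easy to add the FD-passing case later without touching the other two paths.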

Re: [QEMU][PATCH v5 01/10] hw/i386/xen/: move xen-mapcache.c to hw/xen/

2023-01-31 Thread Paul Durrant


On 31/01/2023 22:51, Vikram Garhwal wrote:

xen-mapcache.c contains common functions which can be used for enabling Xen on
aarch64 with IOREQ handling. Moving it out from hw/i386/xen to hw/xen to make it
accessible for both aarch64 and x86.

Signed-off-by: Vikram Garhwal 
Signed-off-by: Stefano Stabellini 
---
  hw/i386/meson.build  | 1 +
  hw/i386/xen/meson.build  | 1 -
  hw/i386/xen/trace-events | 5 -
  hw/xen/meson.build   | 4 
  hw/xen/trace-events  | 5 +
  hw/{i386 => }/xen/xen-mapcache.c | 0
  6 files changed, 10 insertions(+), 6 deletions(-)
  rename hw/{i386 => }/xen/xen-mapcache.c (100%)



Reviewed-by: Paul Durrant

Re: [PATCH 04/25] avocado/boot_linux_console.py: Update ast2600 test

2023-01-31 Thread Cédric Le Goater


On 2/1/23 06:46, Joel Stanley wrote:

On Thu, 19 Jan 2023 at 12:35, Cédric Le Goater  wrote:


From: Joel Stanley 

Update the test_arm_ast2600_debian test to

  - the latest Debian kernel


Would you like a newer version of this patch that uses the latest kernel?


Sure. We cannot test all kernels. Using the latest is the best approach, I think.

Thanks,

C.





  - use the Rainier machine instead of Tacoma

Both of which contain support for more hardware and thus exercise more
of the hardware QEMU models.

Signed-off-by: Joel Stanley 
Reviewed-by: Cédric Le Goater 
Message-Id: <20220607011938.1676459-1-j...@jms.id.au>
Signed-off-by: Cédric Le Goater 
---
  tests/avocado/boot_linux_console.py | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/avocado/boot_linux_console.py 
b/tests/avocado/boot_linux_console.py
index 8c1d981586..f3a1d00be9 100644
--- a/tests/avocado/boot_linux_console.py
+++ b/tests/avocado/boot_linux_console.py
@@ -1098,18 +1098,18 @@ def test_arm_vexpressa9(self):
  def test_arm_ast2600_debian(self):
  """
  :avocado: tags=arch:arm
-:avocado: tags=machine:tacoma-bmc
+:avocado: tags=machine:rainier-bmc
  """
  deb_url = ('http://snapshot.debian.org/archive/debian/'
-   '20210302T203551Z/'
+   '20220606T211338Z/'
 'pool/main/l/linux/'
-   'linux-image-5.10.0-3-armmp_5.10.13-1_armhf.deb')
-deb_hash = 
'db40d32fe39255d05482bea48d72467b67d6225bb2a2a4d6f618cb8976f1e09e'
+   'linux-image-5.17.0-2-armmp_5.17.6-1%2Bb1_armhf.deb')
+deb_hash = 
'8acb2b4439faedc2f3ed4bdb2847ad4f6e0491f73debaeb7f660c8abe4dcdc0e'
  deb_path = self.fetch_asset(deb_url, asset_hash=deb_hash,
  algorithm='sha256')
-kernel_path = self.extract_from_deb(deb_path, 
'/boot/vmlinuz-5.10.0-3-armmp')
+kernel_path = self.extract_from_deb(deb_path, 
'/boot/vmlinuz-5.17.0-2-armmp')
  dtb_path = self.extract_from_deb(deb_path,
-
'/usr/lib/linux-image-5.10.0-3-armmp/aspeed-bmc-opp-tacoma.dtb')
+
'/usr/lib/linux-image-5.17.0-2-armmp/aspeed-bmc-ibm-rainier.dtb')

  self.vm.set_console()
  self.vm.add_args('-kernel', kernel_path,
--
2.39.0

Re: [PATCH 02/25] aspeed: Add Supermicro X11 SPI machine type

2023-01-31 Thread Cédric Le Goater


On 2/1/23 06:39, Joel Stanley wrote:

On Thu, 19 Jan 2023 at 12:36, Cédric Le Goater  wrote:


From: Guenter Roeck 

supermicrox11-bmc is configured with ast2400-a1 SoC. This does not match
the Supermicro documentation for X11 BMCs, and it does not match the
devicetree file in the Linux kernel.


I found this sentence confusing; AFAICT X11 doesn't name a specific
motherboard. It appears to be a marketing term for a bunch of
different things.


As it turns out, some Supermicro X11 motherboards use AST2400 SoCs,
while others use AST2500.

Introduce new machine type supermicrox11-spi-bmc with AST2500 SoC


How about supermicro-x11spi-bmc? It would match the product name:

https://www.supermicro.com/en/products/motherboard/X11SPi-TF

as well as the device tree filename.


Indeed,

model = "X11SPI BMC";
compatible = "supermicro,x11spi-bmc", "aspeed,ast2500";

I was the one who suggested the name. Let's change if Guenter agrees.

Thanks,

C.

Re: Emulating device configuration / max_virtqueue_pairs in vhost-vdpa and vhost-user

2023-01-31 Thread Eugenio Perez Martin

On Wed, Feb 1, 2023 at 4:29 AM Jason Wang  wrote:
>
> On Wed, Feb 1, 2023 at 3:11 AM Eugenio Perez Martin  
> wrote:
> >
> > On Tue, Jan 31, 2023 at 8:10 PM Eugenio Perez Martin
> >  wrote:
> > >
> > > Hi,
> > >
> > > The current approach of offering an emulated CVQ to the guest and mapping
> > > the commands to vhost-user is not scaling well:
> > > * Some devices already offer it, so the transformation is redundant.
> > > * There is no support for commands with variable length (RSS?)
> > >
> > > We can solve both of them by offering it through vhost-user the same
> > > way as vhost-vdpa does. With this approach qemu needs to track the
> > > commands, for similar reasons as vhost-vdpa: qemu needs to track the
> > > device status for live migration. vhost-user should use the same SVQ
> > > code for this, so we avoid duplications.
> > >
> > > One of the challenges here is to know what virtqueue to shadow /
> > > isolate. The vhost-user device may not have the same queues as the
> > > device frontend:
> > > * The first depends on the actual vhost-user device, and qemu fetches
> > > it with VHOST_USER_GET_QUEUE_NUM at the moment.
> > > * The qemu device frontend's is set by netdev queues= cmdline parameter in qemu
> > >
> > > For the device, the CVQ is the last one it offers, but for the guest
> > > it is the last one offered in config space.
> > >
> > > To create a new vhost-user command to decrease that maximum number of
> > > queues may be an option. But we can do it without adding more
> > > commands, remapping the CVQ index at virtqueue setup. I think it
> > > should be doable using (struct vhost_dev).vq_index and maybe a few
> > > adjustments here and there.
> > >
> > > Thoughts?
> > >
> > > Thanks!
> >
> >
> > (Starting a separated thread to vhost-vdpa related use case)
> >
> > This could also work for vhost-vdpa if we ever decide to honor netdev
> > queues argument. It is totally ignored now, as opposed to the rest of
> > backends:
> > * vhost-kernel, whose tap device has the requested number of queues.
> > * vhost-user, that errors with ("you are asking more queues than
> > supported") if the vhost-user parent device has fewer queues than
> > requested (by vhost-user msg VHOST_USER_GET_QUEUE_NUM).
> >
> > One of the reasons for this is that device configuration space is
> > totally passthrough, with the values for mtu, rss conditions, etc.
> > This is not ideal, as qemu cannot check src and destination
> > equivalence and they can change under the feet of the guest in the
> > event of a migration.
>
> This does not look like the responsibility of qemu but of the upper layer (to
> provision the same config/features in src/dst).

I think both share it. Or, at least, that it is inconsistent that QEMU
is in charge of checking / providing consistency for virtio features,
but not virtio-net config space.

If we follow that to the extreme, we could simply delete the feature
checks, right?

>
> > External tools are needed for this, duplicating
> > part of the effort.
> >
> > Starting to intercept config space accesses and offering an emulated one
> > to the guest with these kinds of adjustments is beneficial, as it makes
> > vhost-vdpa more similar to the rest of the backends, making the surprise
> > on a change way lower.
>
> This probably needs more thought, since vDPA already provides a kind
> of emulation in the kernel. My understanding is that it would be
> sufficient to add checks to make sure the config that guests see is
> consistent with what host provisioned?
>

By "host provisioned", do you mean with the "vdpa" tool or with qemu? Also, we
need a way to communicate the guest values to it if those checks are
added in the kernel.

The reasoning here is the same as above: QEMU already filters features
with its own emulated layer, so the operator can specify a feature
that will never appear to the guest. It has other uses (abstract
between transport for example), but feature filtering is definitely a
thing there.

A feature set to off in a VM (or that does not exist in that
particular qemu version) will never appear as on even in the case of
migration to modern qemu versions.

We don't have the equivalent protection for device config space. QEMU
could assure a consistent MTU, number of queues, etc for the guest in
virtio_net_get_config (and equivalent for other kinds of devices).
QEMU already has some transformations there. It shouldn't take a lot
of code.

Having said that:
* I'm ok with starting just with checks there instead of
transformations like the queues remap proposed here.
* If we choose not to implement it, I'm not proposing to actually
delete the features checks, as I see them useful :).

Thanks!
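
The "checks instead of transformations" option could look roughly like the following. This is only a sketch: the dictionary keys mirror the virtio-net config fields named in this thread (MTU, number of queue pairs), while `check_net_config`, `ConfigMismatch`, and the exact acceptance rules are hypothetical and not part of QEMU.

```python
class ConfigMismatch(Exception):
    """Raised when the device config differs from what was provisioned."""


def check_net_config(provisioned: dict, device: dict) -> None:
    # Assumption: the MTU shown to the guest must not change across a migration.
    if device["mtu"] != provisioned["mtu"]:
        raise ConfigMismatch(f"mtu: {device['mtu']} != {provisioned['mtu']}")
    # Assumption: the device must offer at least the queue pairs the guest was promised.
    if device["max_virtqueue_pairs"] < provisioned["max_virtqueue_pairs"]:
        raise ConfigMismatch("device offers fewer queue pairs than provisioned")
```

A check like this only rejects inconsistent setups; it does not rewrite what the guest sees, which is the part the queue remapping would add.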

Re: bamboo machine

2023-01-31 Thread Cédric Le Goater


Hello Mobin,

On 1/31/23 20:24, Mobin Shaikh wrote:

Hello Cédric,

I found your contact information from GitHub. I am a new QEMU enthusiast and
learning about QEMU. I emulated the PPC bamboo machine using QEMU but I couldn't
completely boot the OS. Would you be able to share the example image file and
QEMU command line you use to test QEMU emulation? I want to run the test to
make sure I am not missing anything in configuring/building QEMU. I would really
appreciate your help and pointers.


I simply run the command:

  qemu-system-ppc -M bamboo -kernel /path/to/vmlinux -net nic,model=virtio-net-pci -net user -serial mon:stdio -nodefaults -nographic

Here is a kernel/buildroot image:

  https://github.com/legoater/qemu-ppc-boot/tree/main/buildroot/qemu_ppc_bamboo-2022.02-4-geae5011c83-20220309

and the associated defconfig:

  https://github.com/legoater/buildroot/blob/qemu-ppc/configs/qemu_ppc_bamboo_defconfig

Thanks,

C.

Re: Emulating device configuration / max_virtqueue_pairs in vhost-vdpa and vhost-user

2023-01-31 Thread Eugenio Perez Martin

On Wed, Feb 1, 2023 at 4:27 AM Jason Wang  wrote:
>
> On Wed, Feb 1, 2023 at 3:10 AM Eugenio Perez Martin  
> wrote:
> >
> > Hi,
> >
> > The current approach of offering an emulated CVQ to the guest and mapping
> > the commands to vhost-user is not scaling well:
> > * Some devices already offer it, so the transformation is redundant.
> > * There is no support for commands with variable length (RSS?)
> >
> > We can solve both of them by offering it through vhost-user the same
> > way as vhost-vdpa does. With this approach qemu needs to track the
> > commands, for similar reasons as vhost-vdpa: qemu needs to track the
> > device status for live migration. vhost-user should use the same SVQ
> > code for this, so we avoid duplications.
>
> Note that it really depends on the model we used. SVQ works well for
> trap and emulation (without new API to be invented). But if save and
> load API is invented, SVQ is not a must.
>

That's right, but the premise in the proposal is that control vq RSS
messages are already complex enough to avoid adding more vhost-user
messages. I cannot imagine expanding the vhost-user or virtio API to
add the save and restore functions soon :).

> >
> > One of the challenges here is to know what virtqueue to shadow /
> > isolate. The vhost-user device may not have the same queues as the
> > device frontend:
> > * The first depends on the actual vhost-user device, and qemu fetches
> > it with VHOST_USER_GET_QUEUE_NUM at the moment.
> > * The qemu device frontend's is set by netdev queues= cmdline parameter in qemu
> >
> > For the device, the CVQ is the last one it offers, but for the guest
> > it is the last one offered in config space.
> >
> > To create a new vhost-user command to decrease that maximum number of
> > queues may be an option. But we can do it without adding more
> > commands, remapping the CVQ index at virtqueue setup. I think it
> > should be doable using (struct vhost_dev).vq_index and maybe a few
> > adjustments here and there.
>
> It requires device-specific knowledge; it might work for networking
> devices but not others (or need new code).
>

Yes, I didn't review all the other kinds of devices for the proposal,
but I'm assuming:
* There is no other device that has already implemented CVQ over
vhost-user (or these problems would have been solved)
* All vhost-user devices' config spaces are already offered by qemu, as with
vhost-user net, and the cvq-like index is well defined in the
standard, as for -net.

So this proposal should fit all those devices, shouldn't it?

Thanks!
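
The index remapping can be illustrated with the virtio-net queue layout, where the control virtqueue follows the last RX/TX pair and so sits at index 2 * max_virtqueue_pairs. A minimal sketch, assuming that layout; `remap_vq_index` is a hypothetical helper and does not use `(struct vhost_dev).vq_index`.

```python
def cvq_index(queue_pairs: int) -> int:
    # RX/TX queues occupy 0 .. 2*N-1, so the CVQ is at 2*N.
    return 2 * queue_pairs


def remap_vq_index(guest_index: int, guest_pairs: int, device_pairs: int) -> int:
    """Translate a guest-visible virtqueue index to the device's index."""
    if guest_pairs > device_pairs:
        raise ValueError("guest asks for more queue pairs than the device has")
    if guest_index == cvq_index(guest_pairs):
        # The guest's CVQ maps to the device's last queue.
        return cvq_index(device_pairs)
    if guest_index > cvq_index(guest_pairs):
        raise ValueError("index beyond the guest-visible queues")
    # Data queues keep their index.
    return guest_index
```

For example, with `queues=2` on the frontend and a device offering 8 pairs, guest index 4 (the CVQ) would map to device index 16, while data queues 0 to 3 stay where they are.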

Re: accel/tcg/translator.c question about translator_access

2023-01-31 Thread Richard Henderson


On 1/31/23 17:06, Sid Manning wrote:
There is an assert in translator_access that I hit while running on a version of QEMU
integrated into a Virtual Platform.


Since this function can return null anyway I tried the following experiment:

...

-    assert(phys_page != -1);
+    if(phys_page == -1) {
+    return NULL;
+    }

...

which avoided the issue and the test ran to completion.  What is this assert
trying to catch?



One half of the instruction in ram and the other half of the instruction in mmio.

If the entire instruction is in mmio, then we correctly translate, but do not cache the
result (since the io can produce different results on every access).  But if we have
started caching the result, because we start in ram, then we will incorrectly cache the
mmio access.


This really should never happen.  How did it occur?


r~
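
The cases described above can be modelled as follows. This is a toy classification for illustration only, not the logic in accel/tcg/translator.c; the page-kind strings and `classify_insn_fetch` are hypothetical.

```python
def classify_insn_fetch(first_half: str, second_half: str) -> str:
    """Each argument is "ram" or "mmio" for the page backing that half."""
    if first_half == "ram" and second_half == "ram":
        return "translate and cache"
    if first_half == "mmio" and second_half == "mmio":
        # I/O reads may differ on every access, so nothing is cached.
        return "translate without caching"
    # One half in ram and the other in mmio: the situation the assert rejects.
    raise AssertionError("instruction spans ram and mmio")
```

Returning NULL instead of asserting would silently let the mixed case through, which is why the experiment hides the problem rather than fixing it.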

Re: [PATCH 03/25] hw/net: Fix read of uninitialized memory in ftgmac100

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:39, Cédric Le Goater  wrote:
>
> From: Stephen Longfield 
>
> With the `size += 4` before the call to `crc32`, the CRC calculation
> would overrun the buffer. Size is used in the while loop starting on
> line 1009 to determine how much data to write back, with the last
> four bytes coming from `crc_ptr`, so we do need to increase it, but should
> do this after the computation.
>
> I'm unsure why this use of uninitialized memory in the CRC doesn't
> result in CRC errors, but it seems clear to me that it should not be
> included in the calculation.

Does this affect the error counters observed under Linux?

>
> Signed-off-by: Stephen Longfield 
> Reviewed-by: Hao Wu 
> Message-Id: <20221220221437.3303721-1-slongfi...@google.com>
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Joel Stanley 

> ---
>  hw/net/ftgmac100.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/net/ftgmac100.c b/hw/net/ftgmac100.c
> index 83ef0a783e..d3bf14be53 100644
> --- a/hw/net/ftgmac100.c
> +++ b/hw/net/ftgmac100.c
> @@ -980,9 +980,9 @@ static ssize_t ftgmac100_receive(NetClientState *nc, const uint8_t *buf,
>  return size;
>  }
>
> -/* 4 bytes for the CRC.  */
> -size += 4;
>  crc = cpu_to_be32(crc32(~0, buf, size));
> +/* Increase size by 4, loop below reads the last 4 bytes from crc_ptr. */
> +size += 4;
>  crc_ptr = (uint8_t *) &crc;
>
>  /* Huge frames are truncated.  */
> --
> 2.39.0
>
>
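
The ordering issue can be reproduced outside QEMU. A minimal sketch using Python's `zlib.crc32` as a stand-in for the CRC routine (an assumption: it does not reproduce the `~0` seed or the descriptor handling in ftgmac100.c); `append_crc` is a hypothetical helper.

```python
import zlib


def append_crc(payload: bytes) -> bytes:
    # Compute the CRC over the payload only, before the length grows.
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    size = len(payload)
    # Only now account for the 4 trailing CRC bytes.
    size += 4
    frame = payload + crc.to_bytes(4, "big")
    assert len(frame) == size
    return frame


frame = append_crc(b"123456789")
print(frame[-4:].hex())
```

Growing `size` before the CRC call would make the checksum cover four bytes past the end of the received data, which is the overrun the patch removes.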

Re: [PATCH 08/25] hw/arm/boot: Export write_bootloader for Aspeed machines

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:37, Cédric Le Goater  wrote:
>
> AST2600 Aspeed machines have a home-made boot loader for secondaries.
> To improve support, export the internal ARM boot loader and use it
> instead.

I didn't quite follow why we're doing this. Is it just a cleanup?

>
> Signed-off-by: Cédric Le Goater 
> ---
>  include/hw/arm/boot.h | 24 
>  hw/arm/aspeed.c   | 42 ++
>  hw/arm/boot.c | 34 +++---
>  3 files changed, 53 insertions(+), 47 deletions(-)
>
> diff --git a/include/hw/arm/boot.h b/include/hw/arm/boot.h
> index f18cc3064f..23edd0d31b 100644
> --- a/include/hw/arm/boot.h
> +++ b/include/hw/arm/boot.h
> @@ -183,4 +183,28 @@ void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
>  const struct arm_boot_info *info,
>  hwaddr mvbar_addr);
>
> +typedef enum {
> +FIXUP_NONE = 0, /* do nothing */
> +FIXUP_TERMINATOR,   /* end of insns */
> +FIXUP_BOARDID,  /* overwrite with board ID number */
> +FIXUP_BOARD_SETUP,  /* overwrite with board specific setup code address */
> +FIXUP_ARGPTR_LO,/* overwrite with pointer to kernel args */
> +FIXUP_ARGPTR_HI,/* overwrite with pointer to kernel args (high half) */
> +FIXUP_ENTRYPOINT_LO, /* overwrite with kernel entry point */
> +FIXUP_ENTRYPOINT_HI, /* overwrite with kernel entry point (high half) */
> +FIXUP_GIC_CPU_IF,   /* overwrite with GIC CPU interface address */
> +FIXUP_BOOTREG,  /* overwrite with boot register address */
> +FIXUP_DSB,  /* overwrite with correct DSB insn for cpu */
> +FIXUP_MAX,
> +} FixupType;
> +
> +typedef struct ARMInsnFixup {
> +uint32_t insn;
> +FixupType fixup;
> +} ARMInsnFixup;
> +
> +void arm_write_bootloader(const char *name, hwaddr addr,
> +  const ARMInsnFixup *insns, uint32_t *fixupcontext,
> +  AddressSpace *as);
> +
>  #endif /* HW_ARM_BOOT_H */
> diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
> index 4919a1fe9e..c373bd2851 100644
> --- a/hw/arm/aspeed.c
> +++ b/hw/arm/aspeed.c
> @@ -198,33 +198,35 @@ struct AspeedMachineState {
>  static void aspeed_write_smpboot(ARMCPU *cpu,
>   const struct arm_boot_info *info)
>  {
> -static const uint32_t poll_mailbox_ready[] = {
> +AddressSpace *as = arm_boot_address_space(cpu, info);
> +static const ARMInsnFixup poll_mailbox_ready[] = {
>  /*
>   * r2 = per-cpu go sign value
>   * r1 = AST_SMP_MBOX_FIELD_ENTRY
>   * r0 = AST_SMP_MBOX_FIELD_GOSIGN
>   */
> -0xee100fb0,  /* mrc p15, 0, r0, c0, c0, 5 */
> -0xe21000ff,  /* ands r0, r0, #255 */
> -0xe59f201c,  /* ldr r2, [pc, #28] */
> -0xe1822000,  /* orr r2, r2, r0*/
> -
> -0xe59f1018,  /* ldr r1, [pc, #24] */
> -0xe59f0018,  /* ldr r0, [pc, #24] */
> -
> -0xe320f002,  /* wfe   */
> -0xe5904000,  /* ldr r4, [r0]  */
> -0xe1520004,  /* cmp r2, r4*/
> -0x1afffffb,  /* bne <wfe> */
> -0xe591f000,  /* ldr pc, [r1]  */
> -AST_SMP_MBOX_GOSIGN,
> -AST_SMP_MBOX_FIELD_ENTRY,
> -AST_SMP_MBOX_FIELD_GOSIGN,
> +{ 0xee100fb0 },  /* mrc p15, 0, r0, c0, c0, 5 */
> +{ 0xe21000ff },  /* ands r0, r0, #255 */
> +{ 0xe59f201c },  /* ldr r2, [pc, #28] */
> +{ 0xe1822000 },  /* orr r2, r2, r0*/
> +
> +{ 0xe59f1018 },  /* ldr r1, [pc, #24] */
> +{ 0xe59f0018 },  /* ldr r0, [pc, #24] */
> +
> +{ 0xe320f002 },  /* wfe   */
> +{ 0xe5904000 },  /* ldr r4, [r0]  */
> +{ 0xe1520004 },  /* cmp r2, r4*/
> +{ 0x1afffffb },  /* bne <wfe> */
> +{ 0xe591f000 },  /* ldr pc, [r1]  */
> +{ AST_SMP_MBOX_GOSIGN },
> +{ AST_SMP_MBOX_FIELD_ENTRY },
> +{ AST_SMP_MBOX_FIELD_GOSIGN },
> +{ 0, FIXUP_TERMINATOR }
>  };
> +uint32_t fixupcontext[FIXUP_MAX] = { 0 };
>
> -rom_add_blob_fixed("aspeed.smpboot", poll_mailbox_ready,
> -   sizeof(poll_mailbox_ready),
> -   info->smp_loader_start);
> +arm_write_bootloader("aspeed.smpboot", info->smp_loader_start,
> + poll_mailbox_ready, fixupcontext, as);
>  }
>
>  static void aspeed_reset_secondary(ARMCPU *cpu,
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 3d7d11f782..ed6fd7c77f 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -59,26 +59,6 @@ AddressSpace *arm_boot_address_space(ARMCPU *cpu,
>
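
For readers unfamiliar with the fixup table, the idea behind `arm_write_bootloader()` can be sketched as below. This is a simplified Python model based only on the `FixupType`/`ARMInsnFixup` declarations in the patch: it covers three fixup types, and the byte swapping and ROM registration a real implementation needs are omitted. `resolve_fixups` is a hypothetical name.

```python
from enum import IntEnum


class Fixup(IntEnum):
    NONE = 0        # keep the instruction word as written
    TERMINATOR = 1  # marks the end of the table
    BOARDID = 2     # replace the word with a value from the context


def resolve_fixups(insns, fixupcontext):
    """Turn (word, fixup) pairs into the final list of 32-bit words."""
    code = []
    for word, fixup in insns:
        if fixup == Fixup.TERMINATOR:
            break
        if fixup != Fixup.NONE:
            # Any other fixup type overwrites the word from the context array.
            word = fixupcontext[fixup]
        code.append(word & 0xFFFFFFFF)
    return code
```

The Aspeed table in this patch uses only `FIXUP_NONE` entries plus the terminator, so its words pass through unchanged and the all-zero `fixupcontext` is never read.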

Re: [PATCH 06/25] tests/avocado/machine_aspeed.py: update buildroot tests

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:37, Cédric Le Goater  wrote:
>
> Use buildroot 2022.11 based images plus some customization :
>
>   - Linux version is bumped to 6.0.9 and kernel is built with a custom
> config similar to what OpenBMC provides.
>   - U-Boot is switched to the one provided by OpenBMC for better support.
>   - defconfigs includes more target tools for dev.
>
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Joel Stanley 

> ---
>  tests/avocado/machine_aspeed.py | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/tests/avocado/machine_aspeed.py b/tests/avocado/machine_aspeed.py
> index 1fc385e1c8..773b1ff3a9 100644
> --- a/tests/avocado/machine_aspeed.py
> +++ b/tests/avocado/machine_aspeed.py
> @@ -123,8 +123,8 @@ def test_arm_ast2500_evb_buildroot(self):
>  """
>
>  image_url = 
> ('https://github.com/legoater/qemu-aspeed-boot/raw/master/'
> - 'images/ast2500-evb/buildroot-2022.05/flash.img')
> -image_hash = 
> ('549db6e9d8cdaf4367af21c36385a68bb465779c18b5e37094fc7343decccd3f')
> + 
> 'images/ast2500-evb/buildroot-2022.11-2-g15d3648df9/flash.img')
> +image_hash = 
> ('f96d11db521fe7a2787745e9e391225dc3318ee0fc07c8b799b8833dd474')
>  image_path = self.fetch_asset(image_url, asset_hash=image_hash,
>algorithm='sha256')
>
> @@ -151,8 +151,8 @@ def test_arm_ast2600_evb_buildroot(self):
>  """
>
>  image_url = 
> ('https://github.com/legoater/qemu-aspeed-boot/raw/master/'
> - 'images/ast2600-evb/buildroot-2022.05/flash.img')
> -image_hash = 
> ('6cc9e7d128fd4fa1fd01c883af67593cae8072c3239a0b8b6ace857f3538a92d')
> + 
> 'images/ast2600-evb/buildroot-2022.11-2-g15d3648df9/flash.img')
> +image_hash = 
> ('e598d86e5ea79671ca8b59212a326c911bc8bea728dec1a1f5390d717a28bb8b')
>  image_path = self.fetch_asset(image_url, asset_hash=image_hash,
>algorithm='sha256')
>
> --
> 2.39.0
>
>

Re: [PATCH 07/25] tests/avocado/machine_aspeed.py: Mask systemd services to speed up SDK boot

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:39, Cédric Le Goater  wrote:
>
> Signed-off-by: Cédric Le Goater 

Nice!

Reviewed-by: Joel Stanley 

> ---
>  tests/avocado/machine_aspeed.py | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/tests/avocado/machine_aspeed.py b/tests/avocado/machine_aspeed.py
> index 773b1ff3a9..1cab946727 100644
> --- a/tests/avocado/machine_aspeed.py
> +++ b/tests/avocado/machine_aspeed.py
> @@ -183,7 +183,14 @@ def test_arm_ast2600_evb_buildroot(self):
>
>  class AST2x00MachineSDK(QemuSystemTest):
>
> -EXTRA_BOOTARGS = ' quiet'
> +EXTRA_BOOTARGS = (
> +'quiet '
> +'systemd.mask=org.openbmc.HostIpmi.service '
> +'systemd.mask=xyz.openbmc_project.Chassis.Control.Power@0.service '
> +'systemd.mask=modprobe@fuse.service '
> +'systemd.mask=rngd.service '
> +'systemd.mask=obmc-console@ttyS2.service '
> +)
>
>  # FIXME: Although these tests boot a whole distro they are still
>  # slower than comparable machine models. There may be some
> @@ -208,7 +215,7 @@ def do_test_arm_aspeed_sdk_start(self, image):
>  interrupt_interactive_console_until_pattern(
>  self, 'Hit any key to stop autoboot:', 'ast#')
>  exec_command_and_wait_for_pattern(
> -self, 'setenv bootargs ${bootargs}' + self.EXTRA_BOOTARGS, 
> 'ast#')
> +self, 'setenv bootargs ${bootargs} ' + self.EXTRA_BOOTARGS, 
> 'ast#')
>  exec_command_and_wait_for_pattern(
>  self, 'boot', '## Loading kernel from FIT Image')
>  self.wait_for_console_pattern('Starting kernel ...')
> --
> 2.39.0
>
>
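
The second hunk adds a space after `${bootargs}` because the new `EXTRA_BOOTARGS` no longer starts with one. A small standalone check of the resulting U-Boot command, using a shortened copy of the tuple from the patch:

```python
# Adjacent string literals are concatenated, so each entry keeps a trailing space.
EXTRA_BOOTARGS = (
    'quiet '
    'systemd.mask=rngd.service '
    'systemd.mask=obmc-console@ttyS2.service '
)

command = 'setenv bootargs ${bootargs} ' + EXTRA_BOOTARGS
print(command)
```

Without the added space the first option would be glued onto whatever `${bootargs}` expands to.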

Re: [PULL 16/22] tcg/aarch64: Reorg goto_tb implementation

2023-01-31 Thread Richard Henderson


On 1/31/23 15:45, Zenghui Yu wrote:

On 2023/1/18 7:10, Richard Henderson wrote:

+void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
+  uintptr_t jmp_rx, uintptr_t jmp_rw)
+{
+    uintptr_t d_addr = tb->jmp_target_addr[n];
+    ptrdiff_t d_offset = d_addr - jmp_rx;
+    tcg_insn_unit insn;
+
+    /* Either directly branch, or indirect branch load. */
+    if (d_offset == sextract64(d_offset, 0, 28)) {
+    insn = deposit32(I3206_B, 0, 26, d_offset >> 2);
+    } else {
+    uintptr_t i_addr = (uintptr_t)&tb->jmp_target_addr[n];
+    ptrdiff_t i_offset = i_addr - jmp_rx;
+
+    /* Note that we asserted this in range in tcg_out_goto_tb. */
+    insn = deposit32(I3305_LDR | TCG_REG_TMP, 0, 5, i_offset >> 2);


'offset' should be bits [23:5] of LDR instruction, rather than [4:0].


Quite right.  Oops.


r~
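
The review comment can be checked with a small model of the bit-field helper. A sketch, assuming the A64 LDR (literal) layout with imm19 in bits [23:5] and Rt in bits [4:0]; `deposit32` here is a Python re-implementation for illustration, and the constants are assumed values rather than ones taken from the patch.

```python
def deposit32(value: int, start: int, length: int, fieldval: int) -> int:
    """Replace `length` bits of `value` starting at bit `start` with `fieldval`."""
    mask = ((1 << length) - 1) << start
    return ((value & ~mask) | ((fieldval << start) & mask)) & 0xFFFFFFFF


LDR_LITERAL_X = 0x58000000  # A64 "LDR Xt, label": imm19 in [23:5], Rt in [4:0]
RT = 30                     # assumed scratch register number, for illustration

word_offset = 3
buggy = deposit32(LDR_LITERAL_X | RT, 0, 5, word_offset)   # overwrites Rt
fixed = deposit32(LDR_LITERAL_X | RT, 5, 19, word_offset)  # fills imm19
print(hex(buggy), hex(fixed))
```

With `start=0, length=5` the offset lands in the register field and the immediate stays zero; with `start=5, length=19` the register number survives and the offset goes where the instruction expects it.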

Re: [PATCH 02/25] aspeed: Add Supermicro X11 SPI machine type

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:36, Cédric Le Goater  wrote:
>
> From: Guenter Roeck 
>
> supermicrox11-bmc is configured with ast2400-a1 SoC. This does not match
> the Supermicro documentation for X11 BMCs, and it does not match the
> devicetree file in the Linux kernel.

I found this sentence confusing; AFAICT X11 doesn't name a specific
motherboard. It appears to be a marketing term for a bunch of
different things.

> As it turns out, some Supermicro X11 motherboards use AST2400 SoCs,
> while others use AST2500.
>
> Introduce new machine type supermicrox11-spi-bmc with AST2500 SoC

How about supermicro-x11spi-bmc? It would match the product name:

https://www.supermicro.com/en/products/motherboard/X11SPi-TF

as well as the device tree filename.

> to match the devicetree description in the Linux kernel. Hardware
> configuration details for this machine type are guesswork and taken
> from defaults as well as from the Linux kernel devicetree file.
>
> The new machine type was tested with aspeed-bmc-supermicro-x11spi.dts
> from the Linux kernel and with Linux versions 6.0.3 and 6.1-rc2.
> Linux booted successfully from initrd and from both SPI interfaces.
> Ethernet interfaces were confirmed to be operational.
>
> Signed-off-by: Guenter Roeck 
> Reviewed-by: Philippe Mathieu-Daudé 
> Link: https://lore.kernel.org/r/20221025165109.1226001-1-li...@roeck-us.net
> Message-Id: <20221025165109.1226001-1-li...@roeck-us.net>
> Signed-off-by: Cédric Le Goater 
> ---
>  hw/arm/aspeed.c | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
> index 55f114ef72..4919a1fe9e 100644
> --- a/hw/arm/aspeed.c
> +++ b/hw/arm/aspeed.c
> @@ -71,6 +71,16 @@ struct AspeedMachineState {
>  SCU_HW_STRAP_VGA_SIZE_SET(VGA_16M_DRAM) |   \
>  SCU_AST2400_HW_STRAP_BOOT_MODE(AST2400_SPI_BOOT))
>
> +/* TODO: Find the actual hardware value */
> +#define SUPERMICROX11_SPI_BMC_HW_STRAP1 (   \
> +AST2500_HW_STRAP1_DEFAULTS |\
> +SCU_AST2500_HW_STRAP_SPI_AUTOFETCH_ENABLE | \
> +SCU_AST2500_HW_STRAP_GPIO_STRAP_ENABLE |\
> +SCU_AST2500_HW_STRAP_UART_DEBUG |   \
> +SCU_AST2500_HW_STRAP_DDR4_ENABLE |  \
> +SCU_HW_STRAP_SPI_WIDTH |\
> +SCU_HW_STRAP_SPI_MODE(SCU_HW_STRAP_SPI_M_S_EN))
> +
>  /* AST2500 evb hardware value: 0xF100C2E6 */
>  #define AST2500_EVB_HW_STRAP1 ((\
>  AST2500_HW_STRAP1_DEFAULTS |\
> @@ -1141,6 +1151,25 @@ static void 
> aspeed_machine_supermicrox11_bmc_class_init(ObjectClass *oc,
>  mc->default_ram_size = 256 * MiB;
>  }
>
> +static void aspeed_machine_supermicrox11_spi_bmc_class_init(ObjectClass *oc,
> +void *data)
> +{
> +MachineClass *mc = MACHINE_CLASS(oc);
> +AspeedMachineClass *amc = ASPEED_MACHINE_CLASS(oc);
> +
> +mc->desc   = "Supermicro X11 SPI BMC (ARM1176)";
> +amc->soc_name  = "ast2500-a1";
> +amc->hw_strap1 = SUPERMICROX11_SPI_BMC_HW_STRAP1;
> +amc->fmc_model = "mx25l25635e";
> +amc->spi_model = "mx25l25635e";
> +amc->num_cs= 1;
> +amc->macs_mask = ASPEED_MAC0_ON | ASPEED_MAC1_ON;
> +amc->i2c_init  = palmetto_bmc_i2c_init;
> +mc->default_ram_size = 512 * MiB;
> +mc->default_cpus = mc->min_cpus = mc->max_cpus =
> +aspeed_soc_num_cpus(amc->soc_name);
> +}
> +
>  static void aspeed_machine_ast2500_evb_class_init(ObjectClass *oc, void 
> *data)
>  {
>  MachineClass *mc = MACHINE_CLASS(oc);
> @@ -1522,6 +1551,10 @@ static const TypeInfo aspeed_machine_types[] = {
>  .name  = MACHINE_TYPE_NAME("supermicrox11-bmc"),
>  .parent= TYPE_ASPEED_MACHINE,
>  .class_init= aspeed_machine_supermicrox11_bmc_class_init,
> +}, {
> +.name  = MACHINE_TYPE_NAME("supermicrox11-spi-bmc"),
> +.parent= TYPE_ASPEED_MACHINE,
> +.class_init= aspeed_machine_supermicrox11_spi_bmc_class_init,
>  }, {
>  .name  = MACHINE_TYPE_NAME("ast2500-evb"),
>  .parent= TYPE_ASPEED_MACHINE,
> --
> 2.39.0
>
>

Re: [PATCH 04/25] avocado/boot_linux_console.py: Update ast2600 test

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:35, Cédric Le Goater  wrote:
>
> From: Joel Stanley 
>
> Update the test_arm_ast2600_debian test to
>
>  - the latest Debian kernel

Would you like a newer version of this patch that uses the latest kernel?

>  - use the Rainier machine instead of Tacoma
>
> Both of which contain support for more hardware and thus exercise more
> of the hardware QEMU models.
>
> Signed-off-by: Joel Stanley 
> Reviewed-by: Cédric Le Goater 
> Message-Id: <20220607011938.1676459-1-j...@jms.id.au>
> Signed-off-by: Cédric Le Goater 
> ---
>  tests/avocado/boot_linux_console.py | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/tests/avocado/boot_linux_console.py 
> b/tests/avocado/boot_linux_console.py
> index 8c1d981586..f3a1d00be9 100644
> --- a/tests/avocado/boot_linux_console.py
> +++ b/tests/avocado/boot_linux_console.py
> @@ -1098,18 +1098,18 @@ def test_arm_vexpressa9(self):
>  def test_arm_ast2600_debian(self):
>  """
>  :avocado: tags=arch:arm
> -:avocado: tags=machine:tacoma-bmc
> +:avocado: tags=machine:rainier-bmc
>  """
>  deb_url = ('http://snapshot.debian.org/archive/debian/'
> -   '20210302T203551Z/'
> +   '20220606T211338Z/'
> 'pool/main/l/linux/'
> -   'linux-image-5.10.0-3-armmp_5.10.13-1_armhf.deb')
> -deb_hash = 
> 'db40d32fe39255d05482bea48d72467b67d6225bb2a2a4d6f618cb8976f1e09e'
> +   'linux-image-5.17.0-2-armmp_5.17.6-1%2Bb1_armhf.deb')
> +deb_hash = 
> '8acb2b4439faedc2f3ed4bdb2847ad4f6e0491f73debaeb7f660c8abe4dcdc0e'
>  deb_path = self.fetch_asset(deb_url, asset_hash=deb_hash,
>  algorithm='sha256')
> -kernel_path = self.extract_from_deb(deb_path, 
> '/boot/vmlinuz-5.10.0-3-armmp')
> +kernel_path = self.extract_from_deb(deb_path, 
> '/boot/vmlinuz-5.17.0-2-armmp')
>  dtb_path = self.extract_from_deb(deb_path,
> -
> '/usr/lib/linux-image-5.10.0-3-armmp/aspeed-bmc-opp-tacoma.dtb')
> +
> '/usr/lib/linux-image-5.17.0-2-armmp/aspeed-bmc-ibm-rainier.dtb')
>
>  self.vm.set_console()
>  self.vm.add_args('-kernel', kernel_path,
> --
> 2.39.0
>

Re: [PATCH 09/25] hw/core/loader: Remove declarations of option_rom_has_mr/rom_file_has_mr

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:37, Cédric Le Goater  wrote:
>
> These globals were moved to MachineClass by commit 71ae9e94d9 ("pc: Move
> option_rom_has_mr/rom_file_has_mr globals to MachineClass"). Finish cleanup.
>
> Cc: Eduardo Habkost 
> Cc: Marcel Apfelbaum 
> Reviewed-by: Alex Bennée 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Joel Stanley 

> ---
>  include/hw/loader.h | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/include/hw/loader.h b/include/hw/loader.h
> index 70248e0da7..1384796a4b 100644
> --- a/include/hw/loader.h
> +++ b/include/hw/loader.h
> @@ -251,9 +251,6 @@ void pstrcpy_targphys(const char *name,
>hwaddr dest, int buf_size,
>const char *source);
>
> -extern bool option_rom_has_mr;
> -extern bool rom_file_has_mr;
> -
>  ssize_t rom_add_file(const char *file, const char *fw_dir,
>   hwaddr addr, int32_t bootindex,
>   bool option_rom, MemoryRegion *mr, AddressSpace *as);
> --
> 2.39.0
>
>

Re: [PATCH 20/25] hw/arm/aspeed_ast10x0: Add various unimplemented peripherals

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:36, Cédric Le Goater  wrote:
>
> From: Philippe Mathieu-Daudé 
>
> Based on booting Zephyr demo from [1] running QEMU with
> '-d unimp' and checking missing devices in [2].
>
> [1] https://github.com/AspeedTech-BMC/zephyr/releases/tag/v00.01.07
> [2] 
> https://github.com/AspeedTech-BMC/zephyr/blob/v00.01.08/dts/arm/aspeed/ast10x0.dtsi
>
> Signed-off-by: Philippe Mathieu-Daudé 
> Reviewed-by: Peter Delevoryas 
> Reviewed-by: Cédric Le Goater 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Joel Stanley 

> ---
>  include/hw/arm/aspeed_soc.h | 11 +++
>  hw/arm/aspeed_ast10x0.c | 35 +++
>  2 files changed, 46 insertions(+)
>
> diff --git a/include/hw/arm/aspeed_soc.h b/include/hw/arm/aspeed_soc.h
> index 8389200b2d..9a5e3c0bac 100644
> --- a/include/hw/arm/aspeed_soc.h
> +++ b/include/hw/arm/aspeed_soc.h
> @@ -44,6 +44,7 @@
>  #define ASPEED_CPUS_NUM  2
>  #define ASPEED_MACS_NUM  4
>  #define ASPEED_UARTS_NUM 13
> +#define ASPEED_JTAG_NUM  2
>
>  struct AspeedSoCState {
>  /*< private >*/
> @@ -87,6 +88,11 @@ struct AspeedSoCState {
>  UnimplementedDeviceState video;
>  UnimplementedDeviceState emmc_boot_controller;
>  UnimplementedDeviceState dpmcu;
> +UnimplementedDeviceState pwm;
> +UnimplementedDeviceState espi;
> +UnimplementedDeviceState udc;
> +UnimplementedDeviceState sgpiom;
> +UnimplementedDeviceState jtag[ASPEED_JTAG_NUM];
>  };
>
>  #define TYPE_ASPEED_SOC "aspeed-soc"
> @@ -174,6 +180,11 @@ enum {
>  ASPEED_DEV_DPMCU,
>  ASPEED_DEV_DP,
>  ASPEED_DEV_I3C,
> +ASPEED_DEV_ESPI,
> +ASPEED_DEV_UDC,
> +ASPEED_DEV_SGPIOM,
> +ASPEED_DEV_JTAG0,
> +ASPEED_DEV_JTAG1,
>  };
>
>  qemu_irq aspeed_soc_get_irq(AspeedSoCState *s, int dev);
> diff --git a/hw/arm/aspeed_ast10x0.c b/hw/arm/aspeed_ast10x0.c
> index b483735dc2..b970a5ea58 100644
> --- a/hw/arm/aspeed_ast10x0.c
> +++ b/hw/arm/aspeed_ast10x0.c
> @@ -27,10 +27,15 @@ static const hwaddr aspeed_soc_ast1030_memmap[] = {
>  [ASPEED_DEV_FMC]   = 0x7E62,
>  [ASPEED_DEV_SPI1]  = 0x7E63,
>  [ASPEED_DEV_SPI2]  = 0x7E64,
> +[ASPEED_DEV_UDC]   = 0x7E6A2000,
>  [ASPEED_DEV_SCU]   = 0x7E6E2000,
> +[ASPEED_DEV_JTAG0] = 0x7E6E4000,
> +[ASPEED_DEV_JTAG1] = 0x7E6E4100,
>  [ASPEED_DEV_ADC]   = 0x7E6E9000,
> +[ASPEED_DEV_ESPI]  = 0x7E6EE000,
>  [ASPEED_DEV_SBC]   = 0x7E6F2000,
>  [ASPEED_DEV_GPIO]  = 0x7E78,
> +[ASPEED_DEV_SGPIOM]= 0x7E780500,
>  [ASPEED_DEV_TIMER1]= 0x7E782000,
>  [ASPEED_DEV_UART1] = 0x7E783000,
>  [ASPEED_DEV_UART2] = 0x7E78D000,
> @@ -78,12 +83,17 @@ static const int aspeed_soc_ast1030_irqmap[] = {
>  [ASPEED_DEV_LPC]   = 35,
>  [ASPEED_DEV_PECI]  = 38,
>  [ASPEED_DEV_FMC]   = 39,
> +[ASPEED_DEV_ESPI]  = 42,
>  [ASPEED_DEV_PWM]   = 44,
>  [ASPEED_DEV_ADC]   = 46,
>  [ASPEED_DEV_SPI1]  = 65,
>  [ASPEED_DEV_SPI2]  = 66,
>  [ASPEED_DEV_I2C]   = 110, /* 110 ~ 123 */
>  [ASPEED_DEV_KCS]   = 138, /* 138 -> 142 */
> +[ASPEED_DEV_UDC]   = 9,
> +[ASPEED_DEV_SGPIOM]= 51,
> +[ASPEED_DEV_JTAG0] = 27,
> +[ASPEED_DEV_JTAG1] = 53,

nit: The array is kind of sorted by irq number, these could probably go above?

>  };
>
>  static qemu_irq aspeed_soc_ast1030_get_irq(AspeedSoCState *s, int dev)
> @@ -154,6 +164,15 @@ static void aspeed_soc_ast1030_init(Object *obj)
>  object_initialize_child(obj, "iomem", &s->iomem, 
> TYPE_UNIMPLEMENTED_DEVICE);
>  object_initialize_child(obj, "sbc-unimplemented", &s->sbc_unimplemented,
>  TYPE_UNIMPLEMENTED_DEVICE);
> +object_initialize_child(obj, "pwm", &s->pwm, TYPE_UNIMPLEMENTED_DEVICE);
> +object_initialize_child(obj, "espi", &s->espi, 
> TYPE_UNIMPLEMENTED_DEVICE);
> +object_initialize_child(obj, "udc", &s->udc, TYPE_UNIMPLEMENTED_DEVICE);
> +object_initialize_child(obj, "sgpiom", &s->sgpiom,
> +TYPE_UNIMPLEMENTED_DEVICE);
> +object_initialize_child(obj, "jtag[0]", &s->jtag[0],
> +TYPE_UNIMPLEMENTED_DEVICE);
> +object_initialize_child(obj, "jtag[1]", &s->jtag[1],
> +TYPE_UNIMPLEMENTED_DEVICE);
>  }
>
>  static void aspeed_soc_ast1030_realize(DeviceState *dev_soc, Error **errp)
> @@ -336,6 +355,22 @@ static void aspeed_soc_ast1030_realize(DeviceState 
> *dev_soc, Error **errp)
>  sc->memmap[ASPEED_DEV_GPIO]);
>  sysbus_connect_irq(SYS_BUS_DEVICE(&s->gpio), 0,
> aspeed_soc_get_irq(s, ASPEED_DEV_GPIO));
> +
> +aspeed_mmio_map_unimplemented(s, SYS_BUS_DEVICE(&s->pwm), "aspeed.pwm",
> +  sc->memmap[ASPEED_DEV_PWM], 0x100);
> +
> +aspeed_mmio_map_unimplemented(s, SYS_BUS_DEVICE(&s->espi), "aspeed.espi",
> +

Re: [PATCH 05/25] m25p80: Add the is25wp256 SFPD table

2023-01-31 Thread Joel Stanley

On Thu, 19 Jan 2023 at 12:36, Cédric Le Goater  wrote:
>
> From: Guenter Roeck 
>
> Generated from hardware using the following command and then padding
> with 0xff to fill out a power-of-2:
> xxd -p /sys/bus/spi/devices/spi0.0/spi-nor/sfdp
>
> Cc: Michael Walle 
> Cc: Tudor Ambarus 
> Signed-off-by: Guenter Roeck 
> Reviewed-by: Cédric Le Goater 
> Message-Id: <20221221122213.1458540-1-li...@roeck-us.net>
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Joel Stanley 

I wonder if we could update the code so the padding is assumed.
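
To make the idea concrete, here is a minimal sketch of a read helper that
assumes the padding instead of storing it. The names sfdp_example,
sfdp_read_padded and sfdp_example_read are hypothetical and the table is
shortened for illustration; this is not the existing define_sfdp_read()
helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, shortened table: only bytes dumped from hardware, no padding. */
static const uint8_t sfdp_example[] = {
    0x53, 0x46, 0x44, 0x50, 0x06, 0x01, 0x01, 0xff,
    0x00, 0x06, 0x01, 0x10, 0x30, 0x00, 0x00, 0xff,
};

/*
 * Return the SFDP byte at addr. Anything past the end of the table is
 * treated as erased flash (0xff), so the table needs no explicit padding.
 */
static uint8_t sfdp_read_padded(const uint8_t *table, size_t len, uint32_t addr)
{
    assert(table != NULL);
    return addr < len ? table[addr] : 0xff;
}

/* Per-model accessor, analogous in shape to the m25p80_sfdp_*() readers. */
uint8_t sfdp_example_read(uint32_t addr)
{
    return sfdp_read_padded(sfdp_example, sizeof(sfdp_example), addr);
}
```

One behavioural difference to keep in mind: a reader that indexes the table
modulo its size wraps around on large addresses, whereas this sketch returns
0xff for every address beyond the table.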

> ---
>  hw/block/m25p80_sfdp.h |  2 ++
>  hw/block/m25p80.c  |  3 ++-
>  hw/block/m25p80_sfdp.c | 40 
>  3 files changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/hw/block/m25p80_sfdp.h b/hw/block/m25p80_sfdp.h
> index df7adfb5ce..011a880f66 100644
> --- a/hw/block/m25p80_sfdp.h
> +++ b/hw/block/m25p80_sfdp.h
> @@ -26,4 +26,6 @@ uint8_t m25p80_sfdp_w25q512jv(uint32_t addr);
>
>  uint8_t m25p80_sfdp_w25q01jvq(uint32_t addr);
>
> +uint8_t m25p80_sfdp_is25wp256(uint32_t addr);
> +
>  #endif
> diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> index 68a757abf3..dc5ffbc4ff 100644
> --- a/hw/block/m25p80.c
> +++ b/hw/block/m25p80.c
> @@ -222,7 +222,8 @@ static const FlashPartInfo known_devices[] = {
>  { INFO("is25wp032",   0x9d7016,  0,  64 << 10,  64, ER_4K) },
>  { INFO("is25wp064",   0x9d7017,  0,  64 << 10, 128, ER_4K) },
>  { INFO("is25wp128",   0x9d7018,  0,  64 << 10, 256, ER_4K) },
> -{ INFO("is25wp256",   0x9d7019,  0,  64 << 10, 512, ER_4K) },
> +{ INFO("is25wp256",   0x9d7019,  0,  64 << 10, 512, ER_4K),
> +  .sfdp_read = m25p80_sfdp_is25wp256 },
>
>  /* Macronix */
>  { INFO("mx25l2005a",  0xc22012,  0,  64 << 10,   4, ER_4K) },
> diff --git a/hw/block/m25p80_sfdp.c b/hw/block/m25p80_sfdp.c
> index 77615fa29e..b33811a4f5 100644
> --- a/hw/block/m25p80_sfdp.c
> +++ b/hw/block/m25p80_sfdp.c
> @@ -330,3 +330,43 @@ static const uint8_t sfdp_w25q01jvq[] = {
>  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
>  };
>  define_sfdp_read(w25q01jvq);
> +
> +/*
> + * Integrated Silicon Solution (ISSI)
> + */
> +
> +static const uint8_t sfdp_is25wp256[] = {
> +0x53, 0x46, 0x44, 0x50, 0x06, 0x01, 0x01, 0xff,
> +0x00, 0x06, 0x01, 0x10, 0x30, 0x00, 0x00, 0xff,
> +0x9d, 0x05, 0x01, 0x03, 0x80, 0x00, 0x00, 0x02,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xe5, 0x20, 0xf9, 0xff, 0xff, 0xff, 0xff, 0x0f,
> +0x44, 0xeb, 0x08, 0x6b, 0x08, 0x3b, 0x80, 0xbb,
> +0xfe, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0xff,
> +0xff, 0xff, 0x44, 0xeb, 0x0c, 0x20, 0x0f, 0x52,
> +0x10, 0xd8, 0x00, 0xff, 0x23, 0x4a, 0xc9, 0x00,
> +0x82, 0xd8, 0x11, 0xce, 0xcc, 0xcd, 0x68, 0x46,
> +0x7a, 0x75, 0x7a, 0x75, 0xf7, 0xae, 0xd5, 0x5c,
> +0x4a, 0x42, 0x2c, 0xff, 0xf0, 0x30, 0xfa, 0xa9,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0x50, 0x19, 0x50, 0x16, 0x9f, 0xf9, 0xc0, 0x64,
> +0x8f, 0xef, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
> +};
> +define_sfdp_read(is25wp256);
> --
> 2.39.0
>
>

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

2023-01-31 Thread H. Peter Anvin

On January 31, 2023 1:22:43 PM PST, "Jason A. Donenfeld"  
wrote:
>On Tue, Jan 31, 2023, 15:55 H. Peter Anvin  wrote:
>
>> On January 30, 2023 12:19:14 PM PST, "Michael S. Tsirkin" 
>> wrote:
>> >From: "Jason A. Donenfeld" 
>> >
>> >The setup_data links are appended to the compressed kernel image. Since
>> >the kernel image is typically loaded at 0x10, setup_data lives at
>> >`0x10 + compressed_size`, which does not get relocated during the
>> >kernel's boot process.
>> >
>> >The kernel typically decompresses the image starting at address
>> >0x100 (note: there's one more zero there than the compressed image
>> >above). This usually is fine for most kernels.
>> >
>> >However, if the compressed image is actually quite large, then
>> >setup_data will live at a `0x10 + compressed_size` that extends into
>> >the decompressed zone at 0x100. In other words, if compressed_size
>> >is larger than `0x100 - 0x10`, then the decompression step will
>> >clobber setup_data, resulting in crashes.
>> >
>> >Visually, what happens now is that QEMU appends setup_data to the kernel
>> >image:
>> >
>> >  kernel imagesetup_data
>> >   |--|||
>> >0x10  0x10+l1 0x10+l1+l2
>> >
>> >The problem is that this decompresses to 0x100 (one more zero). So
>> >if l1 is > (0x100-0x10), then this winds up looking like:
>> >
>> >  kernel imagesetup_data
>> >   |--|||
>> >0x10  0x10+l1 0x10+l1+l2
>> >
>> > d e c o m p r e s s e d   k e r n e l
>> >
>>  |-|
>> >0x100
>>  0x100+l3
>> >
>> >The decompressed kernel seemingly overwriting the compressed kernel
>> >image isn't a problem, because that gets relocated to a higher address
>> >early on in the boot process, at the end of startup_64. setup_data,
>> >however, stays in the same place, since those links are self referential
>> >and nothing fixes them up.  So the decompressed kernel clobbers it.
>> >
>> >Fix this by appending setup_data to the cmdline blob rather than the
>> >kernel image blob, which remains at a lower address that won't get
>> >clobbered.
>> >
>> >This could have been done by overwriting the initrd blob instead, but
>> >that poses big difficulties, such as no longer being able to use memory
>> >mapped files for initrd, hurting performance, and, more importantly, the
>> >initrd address calculation is hard coded in qboot, and it always grows
>> >down rather than up, which means lots of brittle semantics would have to
>> >be changed around, incurring more complexity. In contrast, using cmdline
>> >is simple and doesn't interfere with anything.
>> >
>> >The microvm machine has a gross hack where it fiddles with fw_cfg data
>> >after the fact. So this hack is updated to account for this appending,
>> >by reserving some bytes.
>> >
>> >Fixup-by: Michael S. Tsirkin 
>> >Cc: x...@kernel.org
>> >Cc: Philippe Mathieu-Daudé 
>> >Cc: H. Peter Anvin 
>> >Cc: Borislav Petkov 
>> >Cc: Eric Biggers 
>> >Signed-off-by: Jason A. Donenfeld 
>> >Message-Id: <20221230220725.618763-1-ja...@zx2c4.com>
>> >Message-ID: <20230128061015-mutt-send-email-...@kernel.org>
>> >Reviewed-by: Michael S. Tsirkin 
>> >Signed-off-by: Michael S. Tsirkin 
>> >Tested-by: Eric Biggers 
>> >Tested-by: Mathias Krause 
>> >---
>> > include/hw/i386/microvm.h |  5 ++--
>> > include/hw/nvram/fw_cfg.h |  9 +++
>> > hw/i386/microvm.c | 15 +++
>> > hw/i386/x86.c | 52 +--
>> > hw/nvram/fw_cfg.c |  9 +++
>> > 5 files changed, 59 insertions(+), 31 deletions(-)
>> >
>> >diff --git a/include/hw/i386/microvm.h b/include/hw/i386/microvm.h
>> >index fad97a891d..e8af61f194 100644
>> >--- a/include/hw/i386/microvm.h
>> >+++ b/include/hw/i386/microvm.h
>> >@@ -50,8 +50,9 @@
>> >  */
>> >
>> > /* Platform virtio definitions */
>> >-#define VIRTIO_MMIO_BASE  0xfeb0
>> >-#define VIRTIO_CMDLINE_MAXLEN 64
>> >+#define VIRTIO_MMIO_BASE0xfeb0
>> >+#define VIRTIO_CMDLINE_MAXLEN   64
>> >+#define VIRTIO_CMDLINE_TOTAL_MAX_LEN((VIRTIO_CMDLINE_MAXLEN + 1) *
>> 16)
>> >
>> > #define GED_MMIO_BASE 0xfea0
>> > #define GED_MMIO_BASE_MEMHP   (GED_MMIO_BASE + 0x100)
>> >diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
>> >index 2e503904dc..990dcdbb2e 100644
>> >--- a/include/hw/nvram/fw_cfg.h
>> >+++ b/include/hw/nvram/fw_cfg.h
>> >@@ -139,6 +139,15 @@ void fw_cfg_add_bytes_callback(FWCfgState *s,
>> uint16_t key,
>> >void *data, size_t len,
>> >bool read_only);
>> >
>> >+/**
>> >+ * fw_cfg_read_bytes_ptr:
>> >+ * @s: fw_cfg device being modified
>> >+ * @key: selector key value for new fw_cfg item
>> >+ *

Re: [PATCH v3 8/9] igb: respect VT_CTL ignore MAC field

2023-01-31 Thread Akihiko Odaki


On 2023/01/31 18:42, Sriram Yagnaraman wrote:

Also trace out a warning if replication mode is disabled, since we only
support replication mode enabled.

Signed-off-by: Sriram Yagnaraman 
---
  hw/net/igb_core.c   | 9 +
  hw/net/trace-events | 2 ++
  2 files changed, 11 insertions(+)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index c5f9c14f47..8115be2d76 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -964,6 +964,10 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
struct eth_header *ehdr,
  }
  
  if (core->mac[MRQC] & 1) {

+if (!(core->mac[VT_CTL] & E1000_VT_CTL_VM_REPL_EN)) {
+trace_igb_rx_vmdq_replication_mode_disabled();
+}
+
  if (is_broadcast_ether_addr(ehdr->h_dest)) {
  for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
  if (core->mac[VMOLR0 + i] & E1000_VMOLR_BAM) {
@@ -1010,6 +1014,11 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
struct eth_header *ehdr,
  }
  }
  
+/* assume a full pool list if IGMAC is set */

+if (core->mac[VT_CTL] & E1000_VT_CTL_IGNORE_MAC) {
+queues = BIT(IGB_MAX_VF_FUNCTIONS) - 1;
+}
+


This overwrites "queues", but "external_tx" is not overwritten.


  if (e1000x_vlan_rx_filter_enabled(core->mac)) {
  uint16_t mask = 0;
  
diff --git a/hw/net/trace-events b/hw/net/trace-events

index e94172e748..9bc7658692 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -288,6 +288,8 @@ igb_rx_desc_buff_write(uint64_t addr, uint16_t offset, 
const void* source, uint3
  
  igb_rx_metadata_rss(uint32_t rss) "RSS data: 0x%X"
  
+igb_rx_vmdq_replication_mode_disabled(void) "WARN: Only replication mode enabled is supported"

+
  igb_irq_icr_clear_gpie_nsicr(void) "Clearing ICR on read due to GPIE.NSICR 
enabled"
  igb_irq_icr_write(uint32_t bits, uint32_t old_icr, uint32_t new_icr) "Clearing ICR 
bits 0x%x: 0x%x --> 0x%x"
  igb_irq_set_iam(uint32_t icr) "Update IAM: 0x%x"

Re: [PATCH v3 9/9] igb: respect VMVIR and VMOLR for VLAN

2023-01-31 Thread Akihiko Odaki


On 2023/01/31 18:42, Sriram Yagnaraman wrote:

Add support for stripping/inserting VLAN for VFs.

Signed-off-by: Sriram Yagnaraman 
---
  hw/net/igb_core.c | 51 ++-
  1 file changed, 42 insertions(+), 9 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 8115be2d76..a697fcf56a 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -386,6 +386,25 @@ igb_rss_parse_packet(IGBCore *core, struct NetRxPkt *pkt, 
bool tx,
  info->queue = E1000_RSS_QUEUE(&core->mac[RETA], info->hash);
  }
  
+static inline bool

+igb_tx_insert_vlan(IGBCore *core, uint16_t qn,
+   struct igb_tx *tx, bool desc_vle)
+{
+if (core->mac[MRQC] & 1) {
+uint16_t pool = qn % IGB_NUM_VM_POOLS;
+
+if (core->mac[VMVIR0 + pool] & E1000_VMVIR_VLANA_DEFAULT) {
+/* always insert default VLAN */
+desc_vle = true;
+tx->vlan = core->mac[VMVIR0 + pool] & 0x;
+} else if (core->mac[VMVIR0 + pool] & E1000_VMVIR_VLANA_NEVER) {
+return false;
+}
+}
+
+return desc_vle && e1000x_vlan_enabled(core->mac);
+}
+
  static bool
  igb_setup_tx_offloads(IGBCore *core, struct igb_tx *tx)
  {
@@ -581,7 +600,8 @@ igb_process_tx_desc(IGBCore *core,
  
  if (cmd_type_len & E1000_TXD_CMD_EOP) {

  if (!tx->skip_cp && net_tx_pkt_parse(tx->tx_pkt)) {
-if (cmd_type_len & E1000_TXD_CMD_VLE) {
+if (igb_tx_insert_vlan(core, queue_index, tx,
+!!(cmd_type_len & E1000_TXD_CMD_VLE))) {
  net_tx_pkt_setup_vlan_header_ex(tx->tx_pkt, tx->vlan,
  core->mac[VET] & 0x);
  }
@@ -1543,6 +1563,20 @@ igb_write_packet_to_guest(IGBCore *core, struct NetRxPkt 
*pkt,
  igb_update_rx_stats(core, rxi, size, total_size);
  }
  
+static bool

+igb_rx_strip_vlan(IGBCore *core, const E1000E_RingInfo *rxi)
+{
+if (core->mac[MRQC] & 1) {
+uint16_t pool = rxi->idx % IGB_NUM_VM_POOLS;
+/* Sec 7.10.3.8: CTRL.VME is ignored, only VMOLR/RPLOLR is used */
+return (net_rx_pkt_get_packet_type(core->rx_pkt) == ETH_PKT_MCAST) ?
+core->mac[RPLOLR] & E1000_RPLOLR_STRVLAN :
+core->mac[VMOLR0 + pool] & E1000_VMOLR_STRVLAN;
+}
+
+return e1000x_vlan_enabled(core->mac);
+}
+
  static inline void
  igb_rx_fix_l4_csum(IGBCore *core, struct NetRxPkt *pkt)
  {
@@ -1624,10 +1658,7 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
  
  ehdr = PKT_GET_ETH_HDR(filter_buf);

  net_rx_pkt_set_packet_type(core->rx_pkt, get_eth_packet_type(ehdr));
-
-net_rx_pkt_attach_iovec_ex(core->rx_pkt, iov, iovcnt, iov_ofs,
-   e1000x_vlan_enabled(core->mac),
-   core->mac[VET] & 0x);
+net_rx_pkt_set_protocols(core->rx_pkt, filter_buf, size);
  
  queues = igb_receive_assign(core, ehdr, size, &rss_info, external_tx);

  if (!queues) {
@@ -1635,11 +1666,8 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
  return orig_size;
  }
  
-total_size = net_rx_pkt_get_total_len(core->rx_pkt) +

-e1000x_fcs_len(core->mac);
-
  retval = orig_size;
-igb_rx_fix_l4_csum(core, core->rx_pkt);
+total_size = size + e1000x_fcs_len(core->mac);


This change to total_size should be reverted; total_size will be 
different from size if VLAN stripping is enabled. There is also no 
reason to reorder the statements.
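
As a simplified model of why the two values diverge (an assumption-laden
sketch, not the net_rx_pkt implementation): stripping an 802.1Q tag removes
4 bytes from what is delivered to the guest, so a length derived from the
raw frame size overstates the buffer space needed. The function name and
parameters below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VLAN_TAG_LEN 4 /* one 802.1Q tag: TPID + TCI */

/*
 * Length written to guest buffers for a frame of wire_size bytes.
 * If the frame is tagged and stripping is enabled, the tag is not
 * delivered; fcs_len is added when the FCS is kept.
 */
size_t rx_total_len(size_t wire_size, bool vlan_tagged, bool strip_vlan,
                    size_t fcs_len)
{
    size_t len = wire_size;

    if (vlan_tagged && strip_vlan) {
        assert(wire_size >= VLAN_TAG_LEN);
        len -= VLAN_TAG_LEN;
    }
    return len + fcs_len;
}
```

For a tagged 64-byte frame with a 4-byte FCS, this model gives 64 with
stripping and 68 without, whereas size + fcs_len is 68 in both cases.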


  
  for (i = 0; i < IGB_NUM_QUEUES; i++) {

  if (!(queues & BIT(i)) ||
@@ -1648,6 +1676,11 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
  }
  
  igb_rx_ring_init(core, &rxr, i);

+net_rx_pkt_attach_iovec_ex(core->rx_pkt, iov, iovcnt, iov_ofs,
+   igb_rx_strip_vlan(core, rxr.i),
+   core->mac[VET] & 0x);
+igb_rx_fix_l4_csum(core, core->rx_pkt);
+
  if (!igb_has_rxbufs(core, rxr.i, total_size)) {
  icr_bits |= E1000_ICS_RXO;
  continue;

Re: [PATCH v3 4/9] igb: implement VFRE and VFTE registers

2023-01-31 Thread Akihiko Odaki


On 2023/01/31 18:42, Sriram Yagnaraman wrote:

Also add checks for RXDCTL/TXDCTL queue enable bits

Signed-off-by: Sriram Yagnaraman 
---
  hw/net/igb_core.c | 30 +-
  hw/net/igb_core.h |  1 +
  hw/net/igb_regs.h |  3 +++
  3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index e78bc3611a..4a1b98bf0e 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -780,6 +780,18 @@ igb_txdesc_writeback(IGBCore *core, dma_addr_t base,
  return igb_tx_wb_eic(core, txi->idx);
  }
  
+static inline bool

+igb_tx_enabled(IGBCore *core, const E1000E_RingInfo *txi)
+{
+bool vmdq = core->mac[MRQC] & 1;
+uint16_t qn = txi->idx;
+uint16_t pool = qn % IGB_NUM_VM_POOLS;
+
+return (core->mac[TCTL] & E1000_TCTL_EN) &&
+(!vmdq || core->mac[VFTE] & BIT(pool)) &&
+(core->mac[TXDCTL0 + (qn * 16)] & E1000_TXDCTL_QUEUE_ENABLE);
+}
+
  static void
  igb_start_xmit(IGBCore *core, const IGB_TxRing *txr)
  {
@@ -789,8 +801,7 @@ igb_start_xmit(IGBCore *core, const IGB_TxRing *txr)
  const E1000E_RingInfo *txi = txr->i;
  uint32_t eic = 0;
  
-/* TODO: check if the queue itself is enabled too. */

-if (!(core->mac[TCTL] & E1000_TCTL_EN)) {
+if (!igb_tx_enabled(core, txi)) {
  trace_e1000e_tx_disabled();
  return;
  }
@@ -1005,6 +1016,7 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
struct eth_header *ehdr,
  queues = BIT(def_pl >> E1000_VT_CTL_DEFAULT_POOL_SHIFT);
  }
  
+queues &= core->mac[VFRE];

  igb_rss_parse_packet(core, core->rx_pkt, external_tx != NULL, 
rss_info);
  if (rss_info->queue & 1) {
  queues <<= 8;
@@ -1564,12 +1576,12 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
  igb_rx_fix_l4_csum(core, core->rx_pkt);
  
  for (i = 0; i < IGB_NUM_QUEUES; i++) {

-if (!(queues & BIT(i))) {
+if (!(queues & BIT(i)) ||
+!(core->mac[RXDCTL0 + (i * 16)] & E1000_RXDCTL_QUEUE_ENABLE)) {
  continue;
  }
  
  igb_rx_ring_init(core, &rxr, i);

-
  if (!igb_has_rxbufs(core, rxr.i, total_size)) {
  icr_bits |= E1000_ICS_RXO;
  continue;
@@ -1973,9 +1985,16 @@ static void igb_set_vfmailbox(IGBCore *core, int index, 
uint32_t val)
  
  static void igb_vf_reset(IGBCore *core, uint16_t vfn)

  {
+uint16_t qn0 = vfn;
+uint16_t qn1 = vfn + IGB_NUM_VM_POOLS;
+
  /* disable Rx and Tx for the VF*/
-core->mac[VFTE] &= ~BIT(vfn);
+core->mac[RXDCTL0 + (qn0 * 16)] &= ~E1000_RXDCTL_QUEUE_ENABLE;
+core->mac[RXDCTL0 + (qn1 * 16)] &= ~E1000_RXDCTL_QUEUE_ENABLE;
+core->mac[TXDCTL0 + (qn0 * 16)] &= ~E1000_TXDCTL_QUEUE_ENABLE;
+core->mac[TXDCTL0 + (qn1 * 16)] &= ~E1000_TXDCTL_QUEUE_ENABLE;
  core->mac[VFRE] &= ~BIT(vfn);
+core->mac[VFTE] &= ~BIT(vfn);
  /* indicate VF reset to PF */
  core->mac[VFLRE] |= BIT(vfn);
  /* VFLRE and mailbox use the same interrupt cause */
@@ -3881,6 +3900,7 @@ igb_phy_reg_init[] = {
  static const uint32_t igb_mac_reg_init[] = {
  [LEDCTL]= 2 | (3 << 8) | BIT(15) | (6 << 16) | (7 << 24),
  [EEMNGCTL]  = BIT(31),
+[TXDCTL0]   = E1000_TXDCTL_QUEUE_ENABLE,
  [RXDCTL0]   = E1000_RXDCTL_QUEUE_ENABLE | (1 << 16),
  [RXDCTL1]   = 1 << 16,
  [RXDCTL2]   = 1 << 16,
diff --git a/hw/net/igb_core.h b/hw/net/igb_core.h
index cc3b4d1f2b..9938922598 100644
--- a/hw/net/igb_core.h
+++ b/hw/net/igb_core.h
@@ -47,6 +47,7 @@
  #define IGB_MSIX_VEC_NUM(10)
  #define IGBVF_MSIX_VEC_NUM  (3)
  #define IGB_NUM_QUEUES  (16)
+#define IGB_NUM_VM_POOLS(8)


If you are adding this definition, search for "8" and replace the 
occurrences with it where appropriate.


  
  typedef struct IGBCore IGBCore;
  
diff --git a/hw/net/igb_regs.h b/hw/net/igb_regs.h

index ddc0f931d6..4d98079906 100644
--- a/hw/net/igb_regs.h
+++ b/hw/net/igb_regs.h
@@ -160,6 +160,9 @@ union e1000_adv_rx_desc {
  #define E1000_MRQC_RSS_FIELD_IPV6_UDP   0x0080
  #define E1000_MRQC_RSS_FIELD_IPV6_UDP_EX0x0100
  
+/* Additional Transmit Descriptor Control definitions */

+#define E1000_TXDCTL_QUEUE_ENABLE  0x0200 /* Enable specific Tx Queue */
+
  /* Additional Receive Descriptor Control definitions */
  #define E1000_RXDCTL_QUEUE_ENABLE  0x0200 /* Enable specific Rx Queue */

Re: [PATCH v3 5/9] igb: check oversized packets for VMDq

2023-01-31 Thread Akihiko Odaki


On 2023/01/31 18:42, Sriram Yagnaraman wrote:

Signed-off-by: Sriram Yagnaraman 
---
  hw/net/igb_core.c | 48 +++
  1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 4a1b98bf0e..2f6f30341f 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -912,12 +912,27 @@ igb_rx_l4_cso_enabled(IGBCore *core)
  return !!(core->mac[RXCSUM] & E1000_RXCSUM_TUOFLD);
  }
  
+static bool

+igb_rx_is_oversized(IGBCore *core, uint16_t qn, size_t size)
+{
+uint16_t pool = qn % IGB_NUM_VM_POOLS;
+bool lpe = !!(core->mac[VMOLR0 + pool] & E1000_VMOLR_LPE);
+int maximum_ethernet_lpe_size =
+core->mac[VMOLR0 + pool] & E1000_VMOLR_RLPML_MASK;
+int maximum_ethernet_vlan_size = 1522;
+
+return lpe ? size > maximum_ethernet_lpe_size :
+size > maximum_ethernet_vlan_size;


Instead do:
size > (lpe ? maximum_ethernet_lpe_size : maximum_ethernet_vlan_size)
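
Spelled out as a standalone sketch, the two forms are equivalent; the second
selects the limit first and compares once. The function names and the use of
size_t for the limits are assumptions made for the example, not the patch's
exact types.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ETHERNET_VLAN_SIZE 1522

/* Form used in the patch: the comparison is duplicated in both branches. */
bool is_oversized_branches(bool lpe, size_t size, size_t max_lpe_size)
{
    return lpe ? size > max_lpe_size : size > MAX_ETHERNET_VLAN_SIZE;
}

/* Suggested form: pick the limit, then compare once. */
bool is_oversized(bool lpe, size_t size, size_t max_lpe_size)
{
    return size > (lpe ? max_lpe_size : (size_t)MAX_ETHERNET_VLAN_SIZE);
}
```

Keeping both limits in the same unsigned type also avoids a signed/unsigned
comparison between size and the int-typed limits.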


+}
+
  static uint16_t igb_receive_assign(IGBCore *core, const struct eth_header 
*ehdr,
-   E1000E_RSSInfo *rss_info, bool *external_tx)
+   size_t size, E1000E_RSSInfo *rss_info,
+   bool *external_tx)
  {
  static const int ta_shift[] = { 4, 3, 2, 0 };
  uint32_t f, ra[2], *macp, rctl = core->mac[RCTL];
  uint16_t queues = 0;
+uint16_t oversized = 0;
  uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(ehdr)->h_tci) & VLAN_VID_MASK;
  bool accepted = false;
  int i;
@@ -943,7 +958,7 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
struct eth_header *ehdr,
  
  if (core->mac[MRQC] & 1) {

  if (is_broadcast_ether_addr(ehdr->h_dest)) {
-for (i = 0; i < 8; i++) {
+for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
  if (core->mac[VMOLR0 + i] & E1000_VMOLR_BAM) {
  queues |= BIT(i);
  }
@@ -977,7 +992,7 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
struct eth_header *ehdr,
  f = ta_shift[(rctl >> E1000_RCTL_MO_SHIFT) & 3];
  f = (((ehdr->h_dest[5] << 8) | ehdr->h_dest[4]) >> f) & 0xfff;
  if (macp[f >> 5] & (1 << (f & 0x1f))) {
-for (i = 0; i < 8; i++) {
+for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
  if (core->mac[VMOLR0 + i] & E1000_VMOLR_ROMPE) {
  queues |= BIT(i);
  }
@@ -1000,7 +1015,7 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
struct eth_header *ehdr,
  }
  }
  } else {
-for (i = 0; i < 8; i++) {
+for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
  if (core->mac[VMOLR0 + i] & E1000_VMOLR_AUPE) {
  mask |= BIT(i);
  }
@@ -1017,9 +1032,26 @@ static uint16_t igb_receive_assign(IGBCore *core, const 
struct eth_header *ehdr,
  }
  
  queues &= core->mac[VFRE];

-igb_rss_parse_packet(core, core->rx_pkt, external_tx != NULL, 
rss_info);
-if (rss_info->queue & 1) {
-queues <<= 8;
+if (queues) {
+for (i = 0; i < IGB_NUM_VM_POOLS; i++) {
+if ((queues & BIT(i)) && igb_rx_is_oversized(core, i, size)) {
+oversized |= BIT(i);
+}
+}
+/* 8.19.37 increment ROC if packet is oversized for all queues */
+if (oversized == queues) {
+trace_e1000x_rx_oversized(size);
+e1000x_inc_reg_if_not_full(core->mac, ROC);
+}
+queues &= ~oversized;
+}
+
+if (queues) {
+igb_rss_parse_packet(core, core->rx_pkt,
+ external_tx != NULL, rss_info);
+if (rss_info->queue & 1) {
+queues <<= 8;
+}
  }
  } else {
  switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
@@ -1563,7 +1595,7 @@ igb_receive_internal(IGBCore *core, const struct iovec 
*iov, int iovcnt,
 e1000x_vlan_enabled(core->mac),
 core->mac[VET] & 0x);
  
-queues = igb_receive_assign(core, ehdr, &rss_info, external_tx);

+queues = igb_receive_assign(core, ehdr, size, &rss_info, external_tx);
  if (!queues) {
  trace_e1000e_rx_flt_dropped();
  return orig_size;

Re: [PATCH v3 3/9] igb: add ICR_RXDW

2023-01-31 Thread Akihiko Odaki


On 2023/01/31 18:42, Sriram Yagnaraman wrote:

IGB uses RXDW ICR bit to indicate that rx descriptor has been written
back. This is the same as RXT0 bit in older HW.

Signed-off-by: Sriram Yagnaraman 
---
  hw/net/e1000x_regs.h |  4 
  hw/net/igb_core.c| 46 +---
  2 files changed, 22 insertions(+), 28 deletions(-)

diff --git a/hw/net/e1000x_regs.h b/hw/net/e1000x_regs.h
index fb5b861135..f509db73a7 100644
--- a/hw/net/e1000x_regs.h
+++ b/hw/net/e1000x_regs.h
@@ -335,6 +335,7 @@
  #define E1000_ICR_RXDMT00x0010 /* rx desc min. threshold (0) */
  #define E1000_ICR_RXO   0x0040 /* rx overrun */
  #define E1000_ICR_RXT0  0x0080 /* rx timer intr (ring 0) */
+#define E1000_ICR_RXDW  0x0080 /* rx desc written back */
  #define E1000_ICR_MDAC  0x0200 /* MDIO access complete */
  #define E1000_ICR_RXCFG 0x0400 /* RX /c/ ordered set */
  #define E1000_ICR_GPI_EN0   0x0800 /* GP Int 0 */
@@ -378,6 +379,7 @@
  #define E1000_ICS_RXDMT0E1000_ICR_RXDMT0/* rx desc min. threshold */
  #define E1000_ICS_RXO   E1000_ICR_RXO   /* rx overrun */
  #define E1000_ICS_RXT0  E1000_ICR_RXT0  /* rx timer intr */
+#define E1000_ICS_RXDW  E1000_ICR_RXDW  /* rx desc written back */
  #define E1000_ICS_MDAC  E1000_ICR_MDAC  /* MDIO access complete */
  #define E1000_ICS_RXCFG E1000_ICR_RXCFG /* RX /c/ ordered set */
  #define E1000_ICS_GPI_EN0   E1000_ICR_GPI_EN0   /* GP Int 0 */
@@ -407,6 +409,7 @@
  #define E1000_IMS_RXDMT0E1000_ICR_RXDMT0/* rx desc min. threshold */
  #define E1000_IMS_RXO   E1000_ICR_RXO   /* rx overrun */
  #define E1000_IMS_RXT0  E1000_ICR_RXT0  /* rx timer intr */
+#define E1000_IMS_RXDW  E1000_ICR_RXDW  /* rx desc written back */
  #define E1000_IMS_MDAC  E1000_ICR_MDAC  /* MDIO access complete */
  #define E1000_IMS_RXCFG E1000_ICR_RXCFG /* RX /c/ ordered set */
  #define E1000_IMS_GPI_EN0   E1000_ICR_GPI_EN0   /* GP Int 0 */
@@ -441,6 +444,7 @@
  #define E1000_IMC_RXDMT0E1000_ICR_RXDMT0/* rx desc min. threshold */
  #define E1000_IMC_RXO   E1000_ICR_RXO   /* rx overrun */
  #define E1000_IMC_RXT0  E1000_ICR_RXT0  /* rx timer intr */
+#define E1000_IMC_RXDW  E1000_ICR_RXDW  /* rx desc written back */
  #define E1000_IMC_MDAC  E1000_ICR_MDAC  /* MDIO access complete */
  #define E1000_IMC_RXCFG E1000_ICR_RXCFG /* RX /c/ ordered set */
  #define E1000_IMC_GPI_EN0   E1000_ICR_GPI_EN0   /* GP Int 0 */
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 9c32ad5e36..e78bc3611a 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1488,7 +1488,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
  static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
  
  uint16_t queues = 0;

-uint32_t n;
+uint32_t icr_bits = 0;
  uint8_t min_buf[ETH_ZLEN];
  struct iovec min_iov;
  struct eth_header *ehdr;
@@ -1561,6 +1561,7 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
  e1000x_fcs_len(core->mac);
  
  retval = orig_size;

+igb_rx_fix_l4_csum(core, core->rx_pkt);
  
  for (i = 0; i < IGB_NUM_QUEUES; i++) {

  if (!(queues & BIT(i))) {
@@ -1569,43 +1570,32 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
  
  igb_rx_ring_init(core, &rxr, i);
  
-trace_e1000e_rx_rss_dispatched_to_queue(rxr.i->idx);

-
  if (!igb_has_rxbufs(core, rxr.i, total_size)) {
-retval = 0;
+icr_bits |= E1000_ICS_RXO;
+continue;
  }
-}
  
-if (retval) {

-n = E1000_ICR_RXT0;
-
-igb_rx_fix_l4_csum(core, core->rx_pkt);
-
-for (i = 0; i < IGB_NUM_QUEUES; i++) {
-if (!(queues & BIT(i))) {
-continue;
-}
-
-igb_rx_ring_init(core, &rxr, i);
+trace_e1000e_rx_rss_dispatched_to_queue(rxr.i->idx);
+igb_write_packet_to_guest(core, core->rx_pkt, &rxr, &rss_info);
  
-igb_write_packet_to_guest(core, core->rx_pkt, &rxr, &rss_info);

+/* Check if receive descriptor minimum threshold hit */
+if (igb_rx_descr_threshold_hit(core, rxr.i)) {
+icr_bits |= E1000_ICS_RXDMT0;
+}
  
-/* Check if receive descriptor minimum threshold hit */

-if (igb_rx_descr_threshold_hit(core, rxr.i)) {
-n |= E1000_ICS_RXDMT0;
-}
+core->mac[EICR] |= igb_rx_wb_eic(core, rxr.i->idx);
  
-core->mac[EICR] |= igb_rx_wb_eic(core, rxr.i->idx);

-}
+icr_bits |= E1000_ICR_RXDW;
+}
  
-trace_e1000e_rx_written_to_guest(n);

+if (icr_bits & E1000_ICR_RXDW) {
+trace_e1000e_rx_written_to_guest(icr_bits);
  } else {
-n = E1000_ICS_RXO;
-trace_e1000e_rx_not_written_to_guest(n);
+

[PATCH v7 9/9] docs/system/devices/igb: Add igb documentation

2023-01-31 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
---
 MAINTAINERS  |  1 +
 docs/system/device-emulation.rst |  1 +
 docs/system/devices/igb.rst  | 71 
 3 files changed, 73 insertions(+)
 create mode 100644 docs/system/devices/igb.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index c0831aeb56..e85957e37f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2224,6 +2224,7 @@ F: tests/qtest/libqos/e1000e.*
 igb
 M: Akihiko Odaki 
 S: Maintained
+F: docs/system/devices/igb.rst
 F: hw/net/igb*
 F: tests/avocado/igb.py
 F: tests/qtest/igb-test.c
diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst
index 0506006056..c1b1934e3d 100644
--- a/docs/system/device-emulation.rst
+++ b/docs/system/device-emulation.rst
@@ -93,3 +93,4 @@ Emulated Devices
devices/virtio-pmem.rst
devices/vhost-user-rng.rst
devices/canokey.rst
+   devices/igb.rst
diff --git a/docs/system/devices/igb.rst b/docs/system/devices/igb.rst
new file mode 100644
index 00..70edadd574
--- /dev/null
+++ b/docs/system/devices/igb.rst
@@ -0,0 +1,71 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+.. _igb:
+
+igb
+---
+
+igb is a family of Intel's gigabit ethernet controllers. In QEMU, 82576
+emulation is implemented in particular. Its datasheet is available at [1]_.
+
+This implementation is expected to be useful to test SR-IOV networking without
+requiring physical hardware.
+
+Limitations
+===========
+
+This igb implementation was tested with Linux Test Project [2]_ and Windows HLK
+[3]_ during the initial development. The command used when testing with LTP is:
+
+.. code-block:: shell
+
+  network.sh -6mta
+
+Be aware that this implementation lacks many functionalities available with the
+actual hardware, and you may experience various failures if you try to use it
+with an operating system other than Linux or Windows, or if you try
+functionalities not covered by the tests.
+
+Using igb
+=========
+
+Using igb should be no different from using any other network device. See
+:ref:`pcsys_005fnetwork` in general.
+
+However, you may also need to perform additional steps to activate the SR-IOV
+feature on your guest. For Linux, refer to [4]_.
+
+Developing igb
+==============
+
+igb is the successor of e1000e, and e1000e is the successor of e1000 in turn.
+As these devices are very similar, if you make a change for igb and the same
+change can be applied to e1000e and e1000, please do so.
+
+Please do not forget to run tests before submitting a change. As the tests
+included in QEMU are very minimal, run some application that is likely to be
+affected by the change to confirm it works in an integrated system.
+
+Testing igb
+===========
+
+A qtest of the basic functionality is available. Run the following in the build
+directory:
+
+.. code-block:: shell
+
+  meson test qtest-x86_64/qos-test
+
+ethtool can test register accesses, interrupts, etc. It is automated as an
+Avocado test and can be run with the following command:
+
+.. code-block:: shell
+
+  make check-avocado AVOCADO_TESTS=tests/avocado/igb.py
+
+References
+==========
+
+.. [1] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82576eb-gigabit-ethernet-controller-datasheet.pdf
+.. [2] https://github.com/linux-test-project/ltp
+.. [3] https://learn.microsoft.com/en-us/windows-hardware/test/hlk/
+.. [4] https://docs.kernel.org/PCI/pci-iov-howto.html
-- 
2.39.1

[PATCH v7 3/9] e1000: Split header files

2023-01-31 Thread Akihiko Odaki

Some definitions in the header files are invalid for igb so extract
them to new header files to keep igb from referring to them.

Signed-off-by: Gal Hammer 
Signed-off-by: Marcel Apfelbaum 
Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000.c |   1 +
 hw/net/e1000_common.h  | 102 +
 hw/net/e1000_regs.h| 927 +---
 hw/net/e1000e.c|   3 +-
 hw/net/e1000e_core.c   |   1 +
 hw/net/e1000x_common.c |   1 +
 hw/net/e1000x_common.h |  74 
 hw/net/e1000x_regs.h   | 940 +
 8 files changed, 1049 insertions(+), 1000 deletions(-)
 create mode 100644 hw/net/e1000_common.h
 create mode 100644 hw/net/e1000x_regs.h

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index c81d914a02..23d3d32403 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -39,6 +39,7 @@
 #include "qemu/module.h"
 #include "qemu/range.h"
 
+#include "e1000_common.h"
 #include "e1000x_common.h"
 #include "trace.h"
 #include "qom/object.h"
diff --git a/hw/net/e1000_common.h b/hw/net/e1000_common.h
new file mode 100644
index 00..48feda7404
--- /dev/null
+++ b/hw/net/e1000_common.h
@@ -0,0 +1,102 @@
+/*
+ * QEMU e1000(e) emulation - shared definitions
+ *
+ * Copyright (c) 2008 Qumranet
+ *
+ * Based on work done by:
+ * Nir Peleg, Tutis Systems Ltd. for Qumranet Inc.
+ * Copyright (c) 2007 Dan Aloni
+ * Copyright (c) 2004 Antony T Curtis
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#ifndef HW_NET_E1000_COMMON_H
+#define HW_NET_E1000_COMMON_H
+
+#include "e1000_regs.h"
+
+#define defreg(x)   x = (E1000_##x >> 2)
+enum {
+defreg(CTRL),defreg(EECD),defreg(EERD),defreg(GPRC),
+defreg(GPTC),defreg(ICR), defreg(ICS), defreg(IMC),
+defreg(IMS), defreg(LEDCTL),  defreg(MANC),defreg(MDIC),
+defreg(MPC), defreg(PBA), defreg(RCTL),defreg(RDBAH0),
+defreg(RDBAL0),  defreg(RDH0),defreg(RDLEN0),  defreg(RDT0),
+defreg(STATUS),  defreg(SWSM),defreg(TCTL),defreg(TDBAH),
+defreg(TDBAL),   defreg(TDH), defreg(TDLEN),   defreg(TDT),
+defreg(TDLEN1),  defreg(TDBAL1),  defreg(TDBAH1),  defreg(TDH1),
+defreg(TDT1),defreg(TORH),defreg(TORL),defreg(TOTH),
+defreg(TOTL),defreg(TPR), defreg(TPT), defreg(TXDCTL),
+defreg(WUFC),defreg(RA),  defreg(MTA), defreg(CRCERRS),
+defreg(VFTA),defreg(VET), defreg(RDTR),defreg(RADV),
+defreg(TADV),defreg(ITR), defreg(SCC), defreg(ECOL),
+defreg(MCC), defreg(LATECOL), defreg(COLC),defreg(DC),
+defreg(TNCRS),   defreg(SEQEC),   defreg(CEXTERR), defreg(RLEC),
+defreg(XONRXC),  defreg(XONTXC),  defreg(XOFFRXC), defreg(XOFFTXC),
+defreg(FCRUC),   defreg(AIT), defreg(TDFH),defreg(TDFT),
+defreg(TDFHS),   defreg(TDFTS),   defreg(TDFPC),   defreg(WUC),
+defreg(WUS), defreg(POEMB),   defreg(PBS), defreg(RDFH),
+defreg(RDFT),defreg(RDFHS),   defreg(RDFTS),   defreg(RDFPC),
+defreg(PBM), defreg(IPAV),defreg(IP4AT),   defreg(IP6AT),
+defreg(WUPM),defreg(FFLT),defreg(FFMT),defreg(FFVT),
+defreg(TARC0),   defreg(TARC1),   defreg(IAM), defreg(EXTCNF_CTRL),
+defreg(GCR), defreg(TIMINCA), defreg(EIAC),defreg(CTRL_EXT),
+defreg(IVAR),defreg(MFUTP01), defreg(MFUTP23), defreg(MANC2H),
+defreg(MFVAL),   defreg(MDEF),defreg(FACTPS),  defreg(FTFT),
+defreg(RUC), defreg(ROC), defreg(RFC), defreg(RJC),
+defreg(PRC64),   defreg(PRC127),  defreg(PRC255),  defreg(PRC511),
+defreg(PRC1023), defreg(PRC1522), defreg(PTC64),   defreg(PTC127),
+defreg(PTC255),  defreg(PTC511),  defreg(PTC1023), defreg(PTC1522),
+defreg(GORCL),   defreg(GORCH),   defreg(GOTCL),   defreg(GOTCH),
+defreg(RNBC),defreg(BPRC),defreg(MPRC),defreg(RFCTL),
+defreg(PSRCTL),  defreg(MPTC),defreg(BPTC),defreg(TSCTFC),
+defreg(IAC), defreg(MGTPRC),  defreg(MGTPDC),  defreg(MGTPTC),
+defreg(TSCTC),   defreg(RXCSUM),  defreg(FUNCTAG), defreg(GSCL_1),
+defreg(GSCL_2),  defreg(GSCL_3),  defreg(GSCL_4),  defreg(GSCN_0),
+defreg(GSCN_1),  defreg(GSCN_2),  defreg(GSCN_3),  defreg(GCR2),
+defreg(RAID),defreg(RSRPD),   defreg(TIDV),defreg(EITR),
+defreg(MRQC),

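For readers unfamiliar with the defreg() idiom in the hunk above, here is a minimal standalone sketch of what it expands to. The three byte offsets are the values I recall for CTRL, STATUS and ICR and should be treated as illustrative; `mac` and `read_reg()` are hypothetical names that are not part of the patch.

```c
#include <stdint.h>

/* Illustrative byte offsets; the authoritative values live in the
 * e1000 register headers, not here. */
#define E1000_CTRL   0x00000
#define E1000_STATUS 0x00008
#define E1000_ICR    0x000C0

/* Same shape as the macro in the patch: paste the E1000_ prefix onto the
 * name and turn the byte offset into a 32-bit word index. */
#define defreg(x) x = (E1000_##x >> 2)

enum {
    defreg(CTRL), defreg(STATUS), defreg(ICR),
};

/* Hypothetical register file, indexed by word rather than by byte. */
static uint32_t mac[0x8000];

/* Hypothetical accessor: translate a guest byte offset into an index. */
static uint32_t read_reg(uint32_t byte_offset)
{
    return mac[byte_offset >> 2];
}
```

With this, `mac[ICR]` and `read_reg(E1000_ICR)` name the same slot, which is why the device code can index its register array with the short enum names.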
[PATCH v7 7/9] igb: Introduce qtest for igb device

2023-01-31 Thread Akihiko Odaki

This change is derived from the qtest for the e1000e device.

Signed-off-by: Akihiko Odaki 
Acked-by: Thomas Huth 
---
 MAINTAINERS |   2 +
 hw/net/igb_core.c   |   8 +-
 tests/qtest/fuzz/generic_fuzz_configs.h |   5 +
 tests/qtest/igb-test.c  | 243 
 tests/qtest/libqos/igb.c| 185 ++
 tests/qtest/libqos/meson.build  |   1 +
 tests/qtest/meson.build |   1 +
 7 files changed, 441 insertions(+), 4 deletions(-)
 create mode 100644 tests/qtest/igb-test.c
 create mode 100644 tests/qtest/libqos/igb.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f9e9638290..127fd92541 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2225,6 +2225,8 @@ igb
 M: Akihiko Odaki 
 S: Maintained
 F: hw/net/igb*
+F: tests/qtest/igb-test.c
+F: tests/qtest/libqos/igb.c
 
 eepro100
 M: Stefan Weil 
diff --git a/hw/net/igb_core.c b/hw/net/igb_core.c
index 90eb7b9083..cb3e2d0be3 100644
--- a/hw/net/igb_core.c
+++ b/hw/net/igb_core.c
@@ -1559,6 +1559,8 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 total_size = net_rx_pkt_get_total_len(core->rx_pkt) +
 e1000x_fcs_len(core->mac);
 
+igb_rx_fix_l4_csum(core, core->rx_pkt);
+
 for (i = 0; i < IGB_NUM_QUEUES; i++) {
 if (!(queues & BIT(i))) {
 continue;
@@ -1572,17 +1574,15 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
 continue;
 }
 
-n |= E1000_ICR_RXT0;
-
-igb_rx_fix_l4_csum(core, core->rx_pkt);
 igb_write_packet_to_guest(core, core->rx_pkt, &rxr, &rss_info);
+core->mac[EICR] |= igb_rx_wb_eic(core, rxr.i->idx);
 
 /* Check if receive descriptor minimum threshold hit */
 if (igb_rx_descr_threshold_hit(core, rxr.i)) {
 n |= E1000_ICS_RXDMT0;
 }
 
-core->mac[EICR] |= igb_rx_wb_eic(core, rxr.i->idx);
+n |= E1000_ICR_RXT0;
 
 trace_e1000e_rx_written_to_guest(rxr.i->idx);
 }
diff --git a/tests/qtest/fuzz/generic_fuzz_configs.h b/tests/qtest/fuzz/generic_fuzz_configs.h
index a825b78c14..50689da653 100644
--- a/tests/qtest/fuzz/generic_fuzz_configs.h
+++ b/tests/qtest/fuzz/generic_fuzz_configs.h
@@ -90,6 +90,11 @@ const generic_fuzz_config predefined_configs[] = {
 .args = "-M q35 -nodefaults "
 "-device e1000e,netdev=net0 -netdev user,id=net0",
 .objects = "e1000e",
+},{
+.name = "igb",
+.args = "-M q35 -nodefaults "
+"-device igb,netdev=net0 -netdev user,id=net0",
+.objects = "igb",
 },{
 .name = "cirrus-vga",
 .args = "-machine q35 -nodefaults -device cirrus-vga",
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
new file mode 100644
index 00..b36ddece75
--- /dev/null
+++ b/tests/qtest/igb-test.c
@@ -0,0 +1,243 @@
+/*
+ * QTest testcase for igb NIC
+ *
+ * Copyright (c) 2022-2023 Red Hat, Inc.
+ * Copyright (c) 2015 Ravello Systems LTD (http://ravellosystems.com)
+ * Developed by Daynix Computing LTD (http://www.daynix.com)
+ *
+ * Authors:
+ * Akihiko Odaki 
+ * Dmitry Fleytman 
+ * Leonid Bloch 
+ * Yan Vugenfirer 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+#include "libqos/pci-pc.h"
+#include "net/eth.h"
+#include "qemu/sockets.h"
+#include "qemu/iov.h"
+#include "qemu/module.h"
+#include "qemu/bitops.h"
+#include "libqos/libqos-malloc.h"
+#include "libqos/e1000e.h"
+#include "hw/net/igb_regs.h"
+
+static const struct eth_header packet = {
+.h_dest = E1000E_ADDRESS,
+.h_source = E1000E_ADDRESS,
+};
+
+static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *alloc)
+{
+union e1000_adv_tx_desc descr;
+char buffer[64];
+int ret;
+uint32_t recv_len;
+
+/* Prepare test data buffer */
+uint64_t data = guest_alloc(alloc, sizeof(buffer));
+memwrite(data, &packet, sizeof(packet));
+
+/* Prepare TX descriptor */
+memset(&descr, 0, sizeof(descr));
+descr.read.buffer_addr = cpu_to_le64(data);
+descr.read.cmd_type_len = cpu_to_le32(E1000_TXD_CMD_RS   |
+  E1000_TXD_CMD_EOP  |
+  E1000_TXD_DTYP_D   |
+

[PATCH v7 6/9] tests/qtest/libqos/e1000e: Export macreg functions

2023-01-31 Thread Akihiko Odaki

They will be useful for igb testing.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Thomas Huth 
---
 tests/qtest/libqos/e1000e.c | 12 
 tests/qtest/libqos/e1000e.h | 12 
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
index 28fb3052aa..925654c7fd 100644
--- a/tests/qtest/libqos/e1000e.c
+++ b/tests/qtest/libqos/e1000e.c
@@ -36,18 +36,6 @@
 
 #define E1000E_RING_LEN (0x1000)
 
-static void e1000e_macreg_write(QE1000E *d, uint32_t reg, uint32_t val)
-{
-QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
-qpci_io_writel(&d_pci->pci_dev, d_pci->mac_regs, reg, val);
-}
-
-static uint32_t e1000e_macreg_read(QE1000E *d, uint32_t reg)
-{
-QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
-return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
-}
-
 void e1000e_tx_ring_push(QE1000E *d, void *descr)
 {
 QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
index 5e2b201aa7..30643c8094 100644
--- a/tests/qtest/libqos/e1000e.h
+++ b/tests/qtest/libqos/e1000e.h
@@ -42,6 +42,18 @@ struct QE1000E_PCI {
 QE1000E e1000e;
 };
 
+static inline void e1000e_macreg_write(QE1000E *d, uint32_t reg, uint32_t val)
+{
+QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
+qpci_io_writel(&d_pci->pci_dev, d_pci->mac_regs, reg, val);
+}
+
+static inline uint32_t e1000e_macreg_read(QE1000E *d, uint32_t reg)
+{
+QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
+return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
+}
+
 void e1000e_wait_isr(QE1000E *d, uint16_t msg_id);
 void e1000e_tx_ring_push(QE1000E *d, void *descr);
 void e1000e_rx_ring_push(QE1000E *d, void *descr);
-- 
2.39.1
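Both exported helpers rely on container_of() to get from the embedded device struct back to its PCI wrapper. A minimal sketch of that pattern follows, assuming a simplified container_of (QEMU's own macro adds type checking) and hypothetical `Dev`/`DevPci` types standing in for QE1000E and QE1000E_PCI.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified container_of: step back from a member pointer to the
 * start of the enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for QE1000E and QE1000E_PCI. */
typedef struct Dev {
    int rx_queue;
} Dev;

typedef struct DevPci {
    uint64_t mac_regs;  /* placeholder for the mapped register BAR */
    Dev dev;            /* embedded, like "QE1000E e1000e" in the patch */
} DevPci;

/* Header-friendly helper: static inline, so every file that includes the
 * header gets its own copy and no symbol has to be exported. */
static inline uint64_t dev_mac_regs(Dev *d)
{
    DevPci *d_pci = container_of(d, DevPci, dev);
    return d_pci->mac_regs;
}
```

Making the helpers `static inline` in the header, as the patch does, lets a second test file reuse them without adding new exported symbols to the libqos object.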

[PATCH v7 8/9] tests/avocado: Add igb test

2023-01-31 Thread Akihiko Odaki

This automates ethtool tests for igb registers, interrupts, etc.

Signed-off-by: Akihiko Odaki 
---
 MAINTAINERS   |  1 +
 .../org.centos/stream/8/x86_64/test-avocado   |  1 +
 tests/avocado/igb.py  | 38 +++
 3 files changed, 40 insertions(+)
 create mode 100644 tests/avocado/igb.py

diff --git a/MAINTAINERS b/MAINTAINERS
index 127fd92541..c0831aeb56 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2225,6 +2225,7 @@ igb
 M: Akihiko Odaki 
 S: Maintained
 F: hw/net/igb*
+F: tests/avocado/igb.py
 F: tests/qtest/igb-test.c
 F: tests/qtest/libqos/igb.c
 
diff --git a/scripts/ci/org.centos/stream/8/x86_64/test-avocado b/scripts/ci/org.centos/stream/8/x86_64/test-avocado
index 7aeecbcfb8..7e07dbcc89 100755
--- a/scripts/ci/org.centos/stream/8/x86_64/test-avocado
+++ b/scripts/ci/org.centos/stream/8/x86_64/test-avocado
@@ -37,6 +37,7 @@ make get-vm-images
 tests/avocado/cpu_queries.py:QueryCPUModelExpansion.test \
 tests/avocado/empty_cpu_model.py:EmptyCPUModel.test \
 tests/avocado/hotplug_cpu.py:HotPlugCPU.test \
+tests/avocado/igb.py:IGB.test \
 tests/avocado/info_usernet.py:InfoUsernet.test_hostfwd \
 tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu \
 tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu_pt \
diff --git a/tests/avocado/igb.py b/tests/avocado/igb.py
new file mode 100644
index 00..abf5dfa07f
--- /dev/null
+++ b/tests/avocado/igb.py
@@ -0,0 +1,38 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+# ethtool tests for igb registers, interrupts, etc
+
+from avocado_qemu import LinuxTest
+
+class IGB(LinuxTest):
+"""
+:avocado: tags=accel:kvm
+:avocado: tags=arch:x86_64
+:avocado: tags=distro:fedora
+:avocado: tags=distro_version:31
+:avocado: tags=machine:q35
+"""
+
+timeout = 180
+
+def test(self):
+self.require_accelerator('kvm')
+kernel_url = self.distro.pxeboot_url + 'vmlinuz'
+kernel_hash = '5b6f6876e1b5bda314f93893271da0d5777b1f3c'
+kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
+initrd_url = self.distro.pxeboot_url + 'initrd.img'
+initrd_hash = 'dd0340a1b39bd28f88532babd4581c67649ec5b1'
+initrd_path = self.fetch_asset(initrd_url, asset_hash=initrd_hash)
+
+# Ideally we want to test MSI as well, but it is blocked by a bug
+# fixed with:
+# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=28e96556baca7056d11d9fb3cdd0aba4483e00d8
+kernel_params = self.distro.default_kernel_params + ' pci=nomsi'
+
+self.vm.add_args('-kernel', kernel_path,
+ '-initrd', initrd_path,
+ '-append', kernel_params,
+ '-accel', 'kvm',
+ '-device', 'igb')
+self.launch_and_wait()
+self.ssh_command('dnf -y install ethtool')
+self.ssh_command('ethtool -t eth1 offline')
-- 
2.39.1

[PATCH v7 1/9] hw/net/net_tx_pkt: Introduce net_tx_pkt_get_eth_hdr

2023-01-31 Thread Akihiko Odaki

Expose the ethernet header so that igb can utilize it to perform the
internal routing among its SR-IOV functions.

Signed-off-by: Gal Hammer 
Signed-off-by: Marcel Apfelbaum 
Signed-off-by: Akihiko Odaki 
---
 hw/net/net_tx_pkt.c | 6 ++
 hw/net/net_tx_pkt.h | 8 
 2 files changed, 14 insertions(+)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 986a3adfe9..be5b65f0e9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -273,6 +273,12 @@ bool net_tx_pkt_parse(struct NetTxPkt *pkt)
 }
 }
 
+struct eth_header *net_tx_pkt_get_eth_hdr(struct NetTxPkt *pkt)
+{
+assert(pkt);
+return (struct eth_header *)&pkt->l2_hdr;
+}
+
 struct virtio_net_hdr *net_tx_pkt_get_vhdr(struct NetTxPkt *pkt)
 {
 assert(pkt);
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index f57b4e034b..2e51b73b6c 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -45,6 +45,14 @@ void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
  */
 void net_tx_pkt_uninit(struct NetTxPkt *pkt);
 
+/**
+ * get ethernet header
+ *
+ * @pkt:packet
+ * @ret:ethernet header
+ */
+struct eth_header *net_tx_pkt_get_eth_hdr(struct NetTxPkt *pkt);
+
 /**
  * get virtio header
  *
-- 
2.39.1

[PATCH v7 2/9] pcie: Introduce pcie_sriov_num_vfs

2023-01-31 Thread Akihiko Odaki

igb can use this function to change its behavior depending on the
number of virtual functions currently enabled.

Signed-off-by: Gal Hammer 
Signed-off-by: Marcel Apfelbaum 
Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/pci/pcie_sriov.c | 5 +
 include/hw/pci/pcie_sriov.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
index f0bd72e069..aa5a757b11 100644
--- a/hw/pci/pcie_sriov.c
+++ b/hw/pci/pcie_sriov.c
@@ -300,3 +300,8 @@ PCIDevice *pcie_sriov_get_vf_at_index(PCIDevice *dev, int n)
 }
 return NULL;
 }
+
+uint16_t pcie_sriov_num_vfs(PCIDevice *dev)
+{
+return dev->exp.sriov_pf.num_vfs;
+}
diff --git a/include/hw/pci/pcie_sriov.h b/include/hw/pci/pcie_sriov.h
index 96cc743309..095fb0c9ed 100644
--- a/include/hw/pci/pcie_sriov.h
+++ b/include/hw/pci/pcie_sriov.h
@@ -76,4 +76,7 @@ PCIDevice *pcie_sriov_get_pf(PCIDevice *dev);
  */
 PCIDevice *pcie_sriov_get_vf_at_index(PCIDevice *dev, int n);
 
+/* Returns the current number of virtual functions. */
+uint16_t pcie_sriov_num_vfs(PCIDevice *dev);
+
 #endif /* QEMU_PCIE_SRIOV_H */
-- 
2.39.1

[PATCH v7 5/9] tests/qtest/e1000e-test: Fabricate ethernet header

2023-01-31 Thread Akihiko Odaki

e1000e parses the Ethernet header, so fabricate a convincing one.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Thomas Huth 
---
 tests/qtest/e1000e-test.c   | 25 +++--
 tests/qtest/libqos/e1000e.h |  2 ++
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index b63a4d3c91..de9738fdb7 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -27,6 +27,7 @@
 #include "qemu/osdep.h"
 #include "libqtest-single.h"
 #include "libqos/pci-pc.h"
+#include "net/eth.h"
 #include "qemu/sockets.h"
 #include "qemu/iov.h"
 #include "qemu/module.h"
@@ -35,9 +36,13 @@
 #include "libqos/e1000e.h"
 #include "hw/net/e1000_regs.h"
 
+static const struct eth_header packet = {
+.h_dest = E1000E_ADDRESS,
+.h_source = E1000E_ADDRESS,
+};
+
 static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *alloc)
 {
-static const char test[] = "TEST";
 struct e1000_tx_desc descr;
 char buffer[64];
 int ret;
@@ -45,7 +50,7 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
 
 /* Prepare test data buffer */
 uint64_t data = guest_alloc(alloc, sizeof(buffer));
-memwrite(data, test, sizeof(test));
+memwrite(data, &packet, sizeof(packet));
 
 /* Prepare TX descriptor */
 memset(&descr, 0, sizeof(descr));
@@ -71,7 +76,7 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
 g_assert_cmpint(ret, == , sizeof(recv_len));
 ret = recv(test_sockets[0], buffer, sizeof(buffer), 0);
 g_assert_cmpint(ret, ==, sizeof(buffer));
-g_assert_cmpstr(buffer, == , test);
+g_assert_false(memcmp(buffer, &packet, sizeof(packet)));
 
 /* Free test data buffer */
 guest_free(alloc, data);
@@ -81,15 +86,15 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
 {
 union e1000_rx_desc_extended descr;
 
-char test[] = "TEST";
-int len = htonl(sizeof(test));
+struct eth_header test_iov = packet;
+int len = htonl(sizeof(packet));
 struct iovec iov[] = {
 {
 .iov_base = &len,
 .iov_len = sizeof(len),
 },{
-.iov_base = test,
-.iov_len = sizeof(test),
+.iov_base = &test_iov,
+.iov_len = sizeof(packet),
 },
 };
 
@@ -97,8 +102,8 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
 int ret;
 
 /* Send a dummy packet to device's socket*/
-ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) + sizeof(test));
-g_assert_cmpint(ret, == , sizeof(test) + sizeof(len));
+ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) + sizeof(packet));
+g_assert_cmpint(ret, == , sizeof(packet) + sizeof(len));
 
 /* Prepare test data buffer */
 uint64_t data = guest_alloc(alloc, sizeof(buffer));
@@ -119,7 +124,7 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
 
 /* Check data sent to the backend */
 memread(data, buffer, sizeof(buffer));
-g_assert_cmpstr(buffer, == , test);
+g_assert_false(memcmp(buffer, &packet, sizeof(packet)));
 
 /* Free test data buffer */
 guest_free(alloc, data);
diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
index 091ce139da..5e2b201aa7 100644
--- a/tests/qtest/libqos/e1000e.h
+++ b/tests/qtest/libqos/e1000e.h
@@ -25,6 +25,8 @@
 #define E1000E_RX0_MSG_ID   (0)
 #define E1000E_TX0_MSG_ID   (1)
 
+#define E1000E_ADDRESS { 0x52, 0x54, 0x00, 0x12, 0x34, 0x56 }
+
 typedef struct QE1000E QE1000E;
 typedef struct QE1000E_PCI QE1000E_PCI;
 
-- 
2.39.1
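One consequence of the patch above is the switch from g_assert_cmpstr() to memcmp(): the fabricated MAC address contains a 0x00 byte, so a string comparison would stop after two bytes. The sketch below illustrates this with a hypothetical `demo_eth_header` standing in for struct eth_header; the names and the helper are assumptions, not part of the test suite.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct eth_header from net/eth.h. */
struct demo_eth_header {
    uint8_t  h_dest[6];
    uint8_t  h_source[6];
    uint16_t h_proto;
};

/* Same bytes as E1000E_ADDRESS in the patch; note the 0x00 third byte. */
#define DEMO_ADDRESS { 0x52, 0x54, 0x00, 0x12, 0x34, 0x56 }

static const struct demo_eth_header demo_packet = {
    .h_dest = DEMO_ADDRESS,
    .h_source = DEMO_ADDRESS,
};

/* Returns 1 when the leading bytes of buf equal the whole header.
 * memcmp compares every byte, including those after an embedded NUL. */
static int demo_header_matches(const void *buf)
{
    return memcmp(buf, &demo_packet, sizeof(demo_packet)) == 0;
}
```

A strcmp()-style check on the same buffers only ever looks at 0x52 0x54 before hitting the terminator, so corruption anywhere after the second byte would go unnoticed.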

[PATCH v7 0/9] Introduce igb

2023-01-31 Thread Akihiko Odaki

Based-on: <20230201033539.30049-1-akihiko.od...@daynix.com>
([PATCH v5 00/29] e1000x cleanups (preliminary for IGB))

igb is a family of Intel's gigabit ethernet controllers. This series implements
82576 emulation in particular. You can see the last patch for the documentation.

Note that there is another, independently developed effort to bring 82576
emulation to QEMU, by Sriram Yagnaraman:
https://lists.gnu.org/archive/html/qemu-devel/2022-12/msg04670.html

V6 -> V7:
- Reordered statements in igb_receive_internal() so that the checksum is
  calculated only once and the code is closer to e1000e_receive_internal().

V5 -> V6:
- Rebased.
- Renamed "test" to "packet" in tests/qtest/e1000e-test.c.
- Fixed Rx logic so that a Rx pool without enough space won't prevent other
  pools from receiving, based on Sriram Yagnaraman's work.

V4 -> V5:
- Rebased.
- Squashed patches to copy from e1000e code and modify it.
- Listed the implemented features.
- Added a check for interrupt availability on PF.
- Fixed the declaration of igb_receive_internal(). (Sriram Yagnaraman)

V3 -> V4:
- Rebased.
- Corrected PCIDevice specified for DMA.

V2 -> V3:
- Rebased.
- Fixed PCIDevice reference in hw/net/igbvf.c.
- Fixed TX packet switching when VM loopback is enabled.
- Fixed VMDq enablement check.
- Fixed RX descriptor length parser.
- Fixed the definitions of RQDPC readers.
- Implemented VLAN VM filter.
- Implemented VT_CTL.Def_PL.
- Implemented the combination of VMDq and RSS.
- Noted that igb is tested with Windows HLK.

V1 -> V2:
- Spun off e1000e general improvements to a distinct series.
- Restored vnet_hdr offload as nothing seems to prevent it.

Akihiko Odaki (9):
  hw/net/net_tx_pkt: Introduce net_tx_pkt_get_eth_hdr
  pcie: Introduce pcie_sriov_num_vfs
  e1000: Split header files
  Introduce igb device emulation
  tests/qtest/e1000e-test: Fabricate ethernet header
  tests/qtest/libqos/e1000e: Export macreg functions
  igb: Introduce qtest for igb device
  tests/avocado: Add igb test
  docs/system/devices/igb: Add igb documentation

 MAINTAINERS   |9 +
 docs/system/device-emulation.rst  |1 +
 docs/system/devices/igb.rst   |   71 +
 hw/net/Kconfig|5 +
 hw/net/e1000.c|1 +
 hw/net/e1000_common.h |  102 +
 hw/net/e1000_regs.h   |  927 +---
 hw/net/e1000e.c   |3 +-
 hw/net/e1000e_core.c  |1 +
 hw/net/e1000x_common.c|1 +
 hw/net/e1000x_common.h|   74 -
 hw/net/e1000x_regs.h  |  940 
 hw/net/igb.c  |  612 +++
 hw/net/igb_common.h   |  146 +
 hw/net/igb_core.c | 4043 +
 hw/net/igb_core.h |  144 +
 hw/net/igb_regs.h |  648 +++
 hw/net/igbvf.c|  327 ++
 hw/net/meson.build|2 +
 hw/net/net_tx_pkt.c   |6 +
 hw/net/net_tx_pkt.h   |8 +
 hw/net/trace-events   |   32 +
 hw/pci/pcie_sriov.c   |5 +
 include/hw/pci/pcie_sriov.h   |3 +
 .../org.centos/stream/8/x86_64/test-avocado   |1 +
 tests/avocado/igb.py  |   38 +
 tests/qtest/e1000e-test.c |   25 +-
 tests/qtest/fuzz/generic_fuzz_configs.h   |5 +
 tests/qtest/igb-test.c|  243 +
 tests/qtest/libqos/e1000e.c   |   12 -
 tests/qtest/libqos/e1000e.h   |   14 +
 tests/qtest/libqos/igb.c  |  185 +
 tests/qtest/libqos/meson.build|1 +
 tests/qtest/meson.build   |1 +
 34 files changed, 7614 insertions(+), 1022 deletions(-)
 create mode 100644 docs/system/devices/igb.rst
 create mode 100644 hw/net/e1000_common.h
 create mode 100644 hw/net/e1000x_regs.h
 create mode 100644 hw/net/igb.c
 create mode 100644 hw/net/igb_common.h
 create mode 100644 hw/net/igb_core.c
 create mode 100644 hw/net/igb_core.h
 create mode 100644 hw/net/igb_regs.h
 create mode 100644 hw/net/igbvf.c
 create mode 100644 tests/avocado/igb.py
 create mode 100644 tests/qtest/igb-test.c
 create mode 100644 tests/qtest/libqos/igb.c

-- 
2.39.1
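The V5 -> V6 note about an Rx pool without enough space not blocking other pools can be sketched as a standalone loop. This is only an illustration under stated assumptions: the queue count, the `DEMO_ICR_*` bit values, and the `has_room` bitmask are hypothetical simplifications, and the real code checks descriptor rings rather than a mask.

```c
#include <stdint.h>

#define DEMO_NUM_QUEUES 16
#define DEMO_ICR_RXDW   (1u << 0)  /* placeholder bit, not the hardware value */
#define DEMO_ICR_RXO    (1u << 1)  /* placeholder bit, not the hardware value */

/*
 * queues:   bitmask of queues the packet was routed to
 * has_room: bitmask of queues with enough free descriptors
 * written:  out parameter, bitmask of queues that received the packet
 * Returns the accumulated interrupt cause bits.
 */
static uint32_t demo_receive(uint16_t queues, uint16_t has_room,
                             uint16_t *written)
{
    uint32_t icr_bits = 0;

    *written = 0;
    for (int i = 0; i < DEMO_NUM_QUEUES; i++) {
        if (!(queues & (1u << i))) {
            continue;               /* packet not routed to this queue */
        }
        if (!(has_room & (1u << i))) {
            icr_bits |= DEMO_ICR_RXO;
            continue;               /* overrun here, keep trying the others */
        }
        *written |= (uint16_t)(1u << i);
        icr_bits |= DEMO_ICR_RXDW;  /* at least one descriptor written back */
    }
    return icr_bits;
}
```

The design point is the `continue` in the overrun branch: a full queue records an overrun cause but no longer zeroes the result for every other queue.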

Re: [PATCH v9 4/5] riscv: Introduce satp mode hw capabilities

2023-01-31 Thread Bin Meng

On Tue, Jan 31, 2023 at 10:41 PM Alexandre Ghiti  wrote:
>
> Currently, the max satp mode is set with the only constraint that it must be
> implemented in QEMU, i.e. set in valid_vm_1_10_[32|64].
>
> But we actually need to add another level of constraint: what the hw is
> actually capable of, because currently, a linux booting on a sifive-u54
> boots in sv57 mode which is incompatible with the cpu's sv39 max
> capability.
>
> So add a new bitmap to RISCVSATPMap which contains this capability and
> initialize it in every XXX_cpu_init.
>
> Finally:
> - valid_vm_1_10_[32|64] constrains which satp mode the CPU can use
> - the CPU hw capabilities constrains what the user may select
> - the user's selection then constrains what's available to the guest
>   OS.
>
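To make the layering above concrete, here is a small standalone sketch of how a per-CPU maximum can be turned into a bitmap of supported modes. It is not the patch's code: the `demo_` names and the validity table are hypothetical, and only the mode numbers follow the RV64 satp MODE encoding (bare = 0, sv39 = 8, sv48 = 9, sv57 = 10).

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    DEMO_MBARE = 0,
    DEMO_SV39  = 8,
    DEMO_SV48  = 9,
    DEMO_SV57  = 10,
};

/* Hypothetical table of modes the emulator itself implements. */
static const bool demo_valid_vm[16] = {
    [DEMO_MBARE] = true,
    [DEMO_SV39]  = true,
    [DEMO_SV48]  = true,
    [DEMO_SV57]  = true,
};

/* Every implemented mode up to and including max_mode is marked as
 * supported by the hardware model. */
static uint16_t demo_supported_map(uint8_t max_mode)
{
    uint16_t supported = 0;

    for (int i = 0; i <= max_mode && i < 16; ++i) {
        if (demo_valid_vm[i]) {
            supported |= (uint16_t)(1u << i);
        }
    }
    return supported;
}
```

Under these assumptions a core capped at sv39 yields a map with only the bare and sv39 bits set, so a guest probing for sv57 would not find it.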
> Signed-off-by: Alexandre Ghiti 
> Reviewed-by: Andrew Jones 
> ---
>  target/riscv/cpu.c | 79 +++---
>  target/riscv/cpu.h |  8 +++--
>  2 files changed, 60 insertions(+), 27 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 3a7a1746aa..6dd76355ec 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -292,26 +292,36 @@ const char *satp_mode_str(uint8_t satp_mode, bool is_32_bit)
>  g_assert_not_reached();
>  }
>
> -/* Sets the satp mode to the max supported */
> -static void set_satp_mode_default_map(RISCVCPU *cpu)
> +static void set_satp_mode_max_supported(RISCVCPU *cpu,
> +uint8_t satp_mode)
>  {
>  bool rv32 = riscv_cpu_mxl(&cpu->env) == MXL_RV32;
> +const bool *valid_vm = rv32 ? valid_vm_1_10_32 : valid_vm_1_10_64;
>
> -if (riscv_feature(&cpu->env, RISCV_FEATURE_MMU)) {
> -cpu->cfg.satp_mode.map |=
> -(1 << satp_mode_from_str(rv32 ? "sv32" : "sv57"));
> -} else {
> -cpu->cfg.satp_mode.map |= (1 << satp_mode_from_str("mbare"));
> +for (int i = 0; i <= satp_mode; ++i) {
> +if (valid_vm[i]) {
> +cpu->cfg.satp_mode.supported |= (1 << i);
> +}
>  }
>  }
>
> +/* Set the satp mode to the max supported */
> +static void set_satp_mode_default_map(RISCVCPU *cpu)
> +{
> +cpu->cfg.satp_mode.map = cpu->cfg.satp_mode.supported;
> +}
> +
>  static void riscv_any_cpu_init(Object *obj)
>  {
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
> +RISCVCPU *cpu = RISCV_CPU(obj);
> +
>  #if defined(TARGET_RISCV32)
>  set_misa(env, MXL_RV32, RVI | RVM | RVA | RVF | RVD | RVC | RVU);
> +set_satp_mode_max_supported(cpu, VM_1_10_SV32);
>  #elif defined(TARGET_RISCV64)
>  set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVU);
> +set_satp_mode_max_supported(cpu, VM_1_10_SV57);
>  #endif
>  set_priv_version(env, PRIV_VERSION_1_12_0);
>  register_cpu_props(obj);
> @@ -321,18 +331,24 @@ static void riscv_any_cpu_init(Object *obj)
>  static void rv64_base_cpu_init(Object *obj)
>  {
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
> +RISCVCPU *cpu = RISCV_CPU(obj);
> +
>  /* We set this in the realise function */
>  set_misa(env, MXL_RV64, 0);
>  register_cpu_props(obj);
>  /* Set latest version of privileged specification */
>  set_priv_version(env, PRIV_VERSION_1_12_0);
> +set_satp_mode_max_supported(cpu, VM_1_10_SV57);
>  }
>
>  static void rv64_sifive_u_cpu_init(Object *obj)
>  {
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
> +RISCVCPU *cpu = RISCV_CPU(obj);
> +
>  set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
>  set_priv_version(env, PRIV_VERSION_1_10_0);
> +set_satp_mode_max_supported(cpu, VM_1_10_SV39);
>  }
>
>  static void rv64_sifive_e_cpu_init(Object *obj)
> @@ -343,6 +359,7 @@ static void rv64_sifive_e_cpu_init(Object *obj)
>  set_misa(env, MXL_RV64, RVI | RVM | RVA | RVC | RVU);
>  set_priv_version(env, PRIV_VERSION_1_10_0);
>  cpu->cfg.mmu = false;
> +set_satp_mode_max_supported(cpu, VM_1_10_MBARE);
>  }
>
>  static void rv128_base_cpu_init(Object *obj)
> @@ -354,28 +371,36 @@ static void rv128_base_cpu_init(Object *obj)
>  exit(EXIT_FAILURE);
>  }
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
> +RISCVCPU *cpu = RISCV_CPU(obj);

nit: for consistency with the other cpu_init functions, this needs a blank line after it

>  /* We set this in the realise function */
>  set_misa(env, MXL_RV128, 0);
>  register_cpu_props(obj);
>  /* Set latest version of privileged specification */
>  set_priv_version(env, PRIV_VERSION_1_12_0);
> +set_satp_mode_max_supported(cpu, VM_1_10_SV57);
>  }
>  #else
>  static void rv32_base_cpu_init(Object *obj)
>  {
>  CPURISCVState *env = &RISCV_CPU(obj)->env;
> +RISCVCPU *cpu = RISCV_CPU(obj);
> +
>  /* We set this in the realise function */
>  set_misa(env, MXL_RV32, 0);
>  register_cpu_props(obj);
>  /* Set latest version of privileged specification */
>  set_priv_version(env, PRIV_VERSION_1_12_0);
> +set_satp_mode_max_supported(cpu, VM_1_10_SV32);
>  }
>
>  static void

Re: [PATCH v9 3/5] riscv: Allow user to set the satp mode

2023-01-31 Thread Bin Meng

On Tue, Jan 31, 2023 at 11:13 PM Alexandre Ghiti  wrote:
>
> RISC-V specifies multiple sizes for addressable memory and Linux probes for
> the machine's support at startup via the satp CSR register (done in
> csr.c:validate_vm).
>
> As per the specification, sv64 must support sv57, which in turn must
> support sv48, and so on. So we can restrict machine support by simply setting
> the "highest" supported mode; the bare mode is always supported.
>
> You can set the satp mode using the new properties "sv32", "sv39", "sv48",
> "sv57" and "sv64" as follows:
> -cpu rv64,sv57=on  # Linux will boot using sv57 scheme
> -cpu rv64,sv39=on  # Linux will boot using sv39 scheme
> -cpu rv64,sv57=off # Linux will boot using sv48 scheme
> -cpu rv64  # Linux will boot using sv57 scheme by default
>
> We take the highest level set by the user:
> -cpu rv64,sv48=on,sv57=on # Linux will boot using sv57 scheme
>
> We make sure that invalid configurations are rejected:
> -cpu rv64,sv39=off,sv48=on # sv39 must be supported if higher modes are
>                            # enabled
>
> We accept "redundant" configurations:
> -cpu rv64,sv48=on,sv57=off # Linux will boot using sv48 scheme
>
> And contradictory configurations:
> -cpu rv64,sv48=on,sv48=off # Linux will boot using sv39 scheme
>
> Co-Developed-by: Ludovic Henry 
> Signed-off-by: Ludovic Henry 
> Signed-off-by: Alexandre Ghiti 
> Reviewed-by: Andrew Jones 
> ---
>  target/riscv/cpu.c | 207 +
>  target/riscv/cpu.h |  19 +
>  target/riscv/csr.c |  12 ++-
>  3 files changed, 231 insertions(+), 7 deletions(-)
>
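For readers following along, the "highest supported mode" rule from the commit message can be sketched in standalone C. This is only an illustration under stated assumptions, not the code from the patch: the table `valid_rv64`, the helpers `supported_mask()` and `highest_mode()`, and the 16-entry table size are hypothetical, and the mode numbers are assumed to follow the satp MODE encoding for RV64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed satp MODE encodings for RV64; MODE_COUNT is an illustrative bound. */
enum {
    MODE_BARE = 0,
    MODE_SV39 = 8,
    MODE_SV48 = 9,
    MODE_SV57 = 10,
    MODE_COUNT = 16,
};

/* Hypothetical validity table: which encodings name a real mode. */
static const bool valid_rv64[MODE_COUNT] = {
    [MODE_BARE] = true,
    [MODE_SV39] = true,
    [MODE_SV48] = true,
    [MODE_SV57] = true,
};

/* Build a bitmap of every valid mode up to and including max_mode. */
static uint16_t supported_mask(uint8_t max_mode)
{
    uint16_t mask = 0;

    assert(max_mode < MODE_COUNT);
    for (int i = 0; i <= max_mode; ++i) {
        if (valid_rv64[i]) {
            mask |= (uint16_t)(1u << i);
        }
    }
    return mask;
}

/* Return the highest mode set in the bitmap, or -1 if the bitmap is empty. */
static int highest_mode(uint16_t mask)
{
    for (int i = MODE_COUNT - 1; i >= 0; --i) {
        if (mask & (1u << i)) {
            return i;
        }
    }
    return -1;
}
```

Under these assumptions, `supported_mask(MODE_SV48)` sets bits 0, 8 and 9, so bare mode stays available and every lower translation mode is implied by naming only the highest one.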

Reviewed-by: Bin Meng

[PATCH v6 9/9] docs/system/devices/igb: Add igb documentation

2023-01-31 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
---
 MAINTAINERS  |  1 +
 docs/system/device-emulation.rst |  1 +
 docs/system/devices/igb.rst  | 71 
 3 files changed, 73 insertions(+)
 create mode 100644 docs/system/devices/igb.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index c0831aeb56..e85957e37f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2224,6 +2224,7 @@ F: tests/qtest/libqos/e1000e.*
 igb
 M: Akihiko Odaki 
 S: Maintained
+F: docs/system/devices/igb.rst
 F: hw/net/igb*
 F: tests/avocado/igb.py
 F: tests/qtest/igb-test.c
diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst
index 0506006056..c1b1934e3d 100644
--- a/docs/system/device-emulation.rst
+++ b/docs/system/device-emulation.rst
@@ -93,3 +93,4 @@ Emulated Devices
devices/virtio-pmem.rst
devices/vhost-user-rng.rst
devices/canokey.rst
+   devices/igb.rst
diff --git a/docs/system/devices/igb.rst b/docs/system/devices/igb.rst
new file mode 100644
index 00..70edadd574
--- /dev/null
+++ b/docs/system/devices/igb.rst
@@ -0,0 +1,71 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+.. _igb:
+
+igb
+---
+
+igb is a family of Intel's gigabit ethernet controllers. QEMU implements
+emulation of the 82576 in particular. Its datasheet is available at [1]_.
+
+This implementation is expected to be useful to test SR-IOV networking without
+requiring physical hardware.
+
+Limitations
+===========
+
+This igb implementation was tested with Linux Test Project [2]_ and Windows HLK
+[3]_ during the initial development. The command used when testing with LTP is:
+
+.. code-block:: shell
+
+  network.sh -6mta
+
+Be aware that this implementation lacks many features available on the actual
+hardware, and you may experience various failures if you use it with an
+operating system other than Linux or Windows, or if you exercise features not
+covered by the tests.
+
+Using igb
+=========
+
+Using igb is no different from using any other network device. See
+:ref:`pcsys_005fnetwork` for general information.
+
+However, you may also need to perform additional steps to enable the SR-IOV
+feature on your guest. For Linux, refer to [4]_.
+
+Developing igb
+==============
+
+igb is the successor of e1000e, and e1000e is the successor of e1000 in turn.
+As these devices are very similar, if you make a change for igb and the same
+change can be applied to e1000e and e1000, please do so.
+
+Please do not forget to run tests before submitting a change. As the tests
+included in QEMU are very minimal, also run an application that is likely to be
+affected by the change to confirm it works in an integrated system.
+
+Testing igb
+===========
+
+A qtest of the basic functionality is available. Run the following in the build
+directory:
+
+.. code-block:: shell
+
+  meson test qtest-x86_64/qos-test
+
+ethtool can test register accesses, interrupts, etc. It is automated as an
+Avocado test and can be run with the following command:
+
+.. code:: shell
+
+  make check-avocado AVOCADO_TESTS=tests/avocado/igb.py
+
+References
+==========
+
+.. [1] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82576eb-gigabit-ethernet-controller-datasheet.pdf
+.. [2] https://github.com/linux-test-project/ltp
+.. [3] https://learn.microsoft.com/en-us/windows-hardware/test/hlk/
+.. [4] https://docs.kernel.org/PCI/pci-iov-howto.html
-- 
2.39.1

[PATCH v6 1/9] hw/net/net_tx_pkt: Introduce net_tx_pkt_get_eth_hdr

2023-01-31 Thread Akihiko Odaki

Expose the ethernet header so that igb can utilize it to perform the
internal routing among its SR-IOV functions.
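As a rough illustration of why the header is useful here, internal routing can be thought of as matching the destination MAC against a table of per-function addresses. The sketch below is an assumption-laden stand-in, not igb's actual logic: `struct sketch_eth_header`, `route_by_dest()` and the MAC table are hypothetical names, and real routing also has to deal with broadcast, multicast and VLAN filtering.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Simplified stand-in for an Ethernet header (illustrative layout only). */
struct sketch_eth_header {
    uint8_t h_dest[SKETCH_ETH_ALEN];
    uint8_t h_source[SKETCH_ETH_ALEN];
    uint16_t h_proto;
};

/*
 * Return the index of the function whose MAC equals the destination
 * address of the frame, or -1 if no function claims it.
 */
static int route_by_dest(const struct sketch_eth_header *hdr,
                         const uint8_t macs[][SKETCH_ETH_ALEN], int n)
{
    assert(hdr);
    for (int i = 0; i < n; i++) {
        if (memcmp(hdr->h_dest, macs[i], SKETCH_ETH_ALEN) == 0) {
            return i;
        }
    }
    return -1;
}
```

A caller would obtain the header from the packet being transmitted and loop the frame back to the matching function instead of sending it to the backend.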

Signed-off-by: Gal Hammer 
Signed-off-by: Marcel Apfelbaum 
Signed-off-by: Akihiko Odaki 
---
 hw/net/net_tx_pkt.c | 6 ++
 hw/net/net_tx_pkt.h | 8 
 2 files changed, 14 insertions(+)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 986a3adfe9..be5b65f0e9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -273,6 +273,12 @@ bool net_tx_pkt_parse(struct NetTxPkt *pkt)
 }
 }
 
+struct eth_header *net_tx_pkt_get_eth_hdr(struct NetTxPkt *pkt)
+{
+assert(pkt);
+return (struct eth_header *)&pkt->l2_hdr;
+}
+
 struct virtio_net_hdr *net_tx_pkt_get_vhdr(struct NetTxPkt *pkt)
 {
 assert(pkt);
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index f57b4e034b..2e51b73b6c 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -45,6 +45,14 @@ void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
  */
 void net_tx_pkt_uninit(struct NetTxPkt *pkt);
 
+/**
+ * get ethernet header
+ *
+ * @pkt:packet
+ * @ret:ethernet header
+ */
+struct eth_header *net_tx_pkt_get_eth_hdr(struct NetTxPkt *pkt);
+
 /**
  * get virtio header
  *
-- 
2.39.1

[PATCH v6 5/9] tests/qtest/e1000e-test: Fabricate ethernet header

2023-01-31 Thread Akihiko Odaki

e1000e parses the ethernet header, so fabricate a plausible one.
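One consequence worth spelling out: once the payload is a binary header rather than the string "TEST", the checks have to compare bytes instead of C strings, because a MAC address such as 52:54:00:12:34:56 contains a zero byte. A minimal sketch, assuming a simplified header layout; `struct sketch_eth_header` and `frame_matches()` are hypothetical names, not part of the test code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Simplified stand-in for an Ethernet header (illustrative layout only). */
struct sketch_eth_header {
    uint8_t h_dest[SKETCH_ETH_ALEN];
    uint8_t h_source[SKETCH_ETH_ALEN];
    uint16_t h_proto;
};

/* Compare the first len bytes, embedded zero bytes included. */
static bool frame_matches(const void *received, const void *expected,
                          size_t len)
{
    assert(received && expected);
    return memcmp(received, expected, len) == 0;
}
```

A string comparison would stop at the third byte of the address and report two different frames as equal, which is presumably why the diff below switches from g_assert_cmpstr() to memcmp().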

Signed-off-by: Akihiko Odaki 
Reviewed-by: Thomas Huth 
---
 tests/qtest/e1000e-test.c   | 25 +++--
 tests/qtest/libqos/e1000e.h |  2 ++
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/tests/qtest/e1000e-test.c b/tests/qtest/e1000e-test.c
index b63a4d3c91..de9738fdb7 100644
--- a/tests/qtest/e1000e-test.c
+++ b/tests/qtest/e1000e-test.c
@@ -27,6 +27,7 @@
 #include "qemu/osdep.h"
 #include "libqtest-single.h"
 #include "libqos/pci-pc.h"
+#include "net/eth.h"
 #include "qemu/sockets.h"
 #include "qemu/iov.h"
 #include "qemu/module.h"
@@ -35,9 +36,13 @@
 #include "libqos/e1000e.h"
 #include "hw/net/e1000_regs.h"
 
+static const struct eth_header packet = {
+.h_dest = E1000E_ADDRESS,
+.h_source = E1000E_ADDRESS,
+};
+
 static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *alloc)
 {
-static const char test[] = "TEST";
 struct e1000_tx_desc descr;
 char buffer[64];
 int ret;
@@ -45,7 +50,7 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
 
 /* Prepare test data buffer */
 uint64_t data = guest_alloc(alloc, sizeof(buffer));
-memwrite(data, test, sizeof(test));
+memwrite(data, &packet, sizeof(packet));
 
 /* Prepare TX descriptor */
 memset(&descr, 0, sizeof(descr));
@@ -71,7 +76,7 @@ static void e1000e_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *a
 g_assert_cmpint(ret, == , sizeof(recv_len));
 ret = recv(test_sockets[0], buffer, sizeof(buffer), 0);
 g_assert_cmpint(ret, ==, sizeof(buffer));
-g_assert_cmpstr(buffer, == , test);
+g_assert_false(memcmp(buffer, &packet, sizeof(packet)));
 
 /* Free test data buffer */
 guest_free(alloc, data);
@@ -81,15 +86,15 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
 {
 union e1000_rx_desc_extended descr;
 
-char test[] = "TEST";
-int len = htonl(sizeof(test));
+struct eth_header test_iov = packet;
+int len = htonl(sizeof(packet));
 struct iovec iov[] = {
 {
 .iov_base = &len,
 .iov_len = sizeof(len),
 },{
-.iov_base = test,
-.iov_len = sizeof(test),
+.iov_base = &test_iov,
+.iov_len = sizeof(packet),
 },
 };
 
@@ -97,8 +102,8 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
 int ret;
 
 /* Send a dummy packet to device's socket*/
-ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) + sizeof(test));
-g_assert_cmpint(ret, == , sizeof(test) + sizeof(len));
+ret = iov_send(test_sockets[0], iov, 2, 0, sizeof(len) + sizeof(packet));
+g_assert_cmpint(ret, == , sizeof(packet) + sizeof(len));
 
 /* Prepare test data buffer */
 uint64_t data = guest_alloc(alloc, sizeof(buffer));
@@ -119,7 +124,7 @@ static void e1000e_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator
 
 /* Check data sent to the backend */
 memread(data, buffer, sizeof(buffer));
-g_assert_cmpstr(buffer, == , test);
+g_assert_false(memcmp(buffer, &packet, sizeof(packet)));
 
 /* Free test data buffer */
 guest_free(alloc, data);
diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
index 091ce139da..5e2b201aa7 100644
--- a/tests/qtest/libqos/e1000e.h
+++ b/tests/qtest/libqos/e1000e.h
@@ -25,6 +25,8 @@
 #define E1000E_RX0_MSG_ID   (0)
 #define E1000E_TX0_MSG_ID   (1)
 
+#define E1000E_ADDRESS { 0x52, 0x54, 0x00, 0x12, 0x34, 0x56 }
+
 typedef struct QE1000E QE1000E;
 typedef struct QE1000E_PCI QE1000E_PCI;
 
-- 
2.39.1

[PATCH v6 7/9] igb: Introduce qtest for igb device

2023-01-31 Thread Akihiko Odaki

This change is derived from qtest for e1000e device.

Signed-off-by: Akihiko Odaki 
Acked-by: Thomas Huth 
---
 MAINTAINERS |   2 +
 tests/qtest/fuzz/generic_fuzz_configs.h |   5 +
 tests/qtest/igb-test.c  | 243 
 tests/qtest/libqos/igb.c| 185 ++
 tests/qtest/libqos/meson.build  |   1 +
 tests/qtest/meson.build |   1 +
 6 files changed, 437 insertions(+)
 create mode 100644 tests/qtest/igb-test.c
 create mode 100644 tests/qtest/libqos/igb.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f9e9638290..127fd92541 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2225,6 +2225,8 @@ igb
 M: Akihiko Odaki 
 S: Maintained
 F: hw/net/igb*
+F: tests/qtest/igb-test.c
+F: tests/qtest/libqos/igb.c
 
 eepro100
 M: Stefan Weil 
diff --git a/tests/qtest/fuzz/generic_fuzz_configs.h b/tests/qtest/fuzz/generic_fuzz_configs.h
index a825b78c14..50689da653 100644
--- a/tests/qtest/fuzz/generic_fuzz_configs.h
+++ b/tests/qtest/fuzz/generic_fuzz_configs.h
@@ -90,6 +90,11 @@ const generic_fuzz_config predefined_configs[] = {
 .args = "-M q35 -nodefaults "
 "-device e1000e,netdev=net0 -netdev user,id=net0",
 .objects = "e1000e",
+},{
+.name = "igb",
+.args = "-M q35 -nodefaults "
+"-device igb,netdev=net0 -netdev user,id=net0",
+.objects = "igb",
 },{
 .name = "cirrus-vga",
 .args = "-machine q35 -nodefaults -device cirrus-vga",
diff --git a/tests/qtest/igb-test.c b/tests/qtest/igb-test.c
new file mode 100644
index 00..b36ddece75
--- /dev/null
+++ b/tests/qtest/igb-test.c
@@ -0,0 +1,243 @@
+/*
+ * QTest testcase for igb NIC
+ *
+ * Copyright (c) 2022-2023 Red Hat, Inc.
+ * Copyright (c) 2015 Ravello Systems LTD (http://ravellosystems.com)
+ * Developed by Daynix Computing LTD (http://www.daynix.com)
+ *
+ * Authors:
+ * Akihiko Odaki 
+ * Dmitry Fleytman 
+ * Leonid Bloch 
+ * Yan Vugenfirer 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+#include "libqos/pci-pc.h"
+#include "net/eth.h"
+#include "qemu/sockets.h"
+#include "qemu/iov.h"
+#include "qemu/module.h"
+#include "qemu/bitops.h"
+#include "libqos/libqos-malloc.h"
+#include "libqos/e1000e.h"
+#include "hw/net/igb_regs.h"
+
+static const struct eth_header packet = {
+.h_dest = E1000E_ADDRESS,
+.h_source = E1000E_ADDRESS,
+};
+
+static void igb_send_verify(QE1000E *d, int *test_sockets, QGuestAllocator *alloc)
+{
+union e1000_adv_tx_desc descr;
+char buffer[64];
+int ret;
+uint32_t recv_len;
+
+/* Prepare test data buffer */
+uint64_t data = guest_alloc(alloc, sizeof(buffer));
+memwrite(data, &packet, sizeof(packet));
+
+/* Prepare TX descriptor */
+memset(&descr, 0, sizeof(descr));
+descr.read.buffer_addr = cpu_to_le64(data);
+descr.read.cmd_type_len = cpu_to_le32(E1000_TXD_CMD_RS   |
+  E1000_TXD_CMD_EOP  |
+  E1000_TXD_DTYP_D   |
+  sizeof(buffer));
+
+/* Put descriptor to the ring */
+e1000e_tx_ring_push(d, &descr);
+
+/* Wait for TX WB interrupt */
+e1000e_wait_isr(d, E1000E_TX0_MSG_ID);
+
+/* Check DD bit */
+g_assert_cmphex(le32_to_cpu(descr.wb.status) & E1000_TXD_STAT_DD, ==,
+E1000_TXD_STAT_DD);
+
+/* Check data sent to the backend */
+ret = recv(test_sockets[0], &recv_len, sizeof(recv_len), 0);
+g_assert_cmpint(ret, == , sizeof(recv_len));
+ret = recv(test_sockets[0], buffer, sizeof(buffer), 0);
+g_assert_cmpint(ret, ==, sizeof(buffer));
+g_assert_false(memcmp(buffer, &packet, sizeof(packet)));
+
+/* Free test data buffer */
+guest_free(alloc, data);
+}
+
+static void igb_receive_verify(QE1000E *d, int *test_sockets, QGuestAllocator *alloc)
+{
+union e1000_adv_rx_desc descr;
+
+struct eth_header test_iov = packet;
+int len = htonl(sizeof(packet));
+struct iovec iov[] = {
+{
+.iov_base = &len,
+.iov_len = sizeof(len),
+},{
+.iov_base = &test_iov,
+.iov_len = sizeof(packet),
+},
+};
+
+char buffer[64];
+int ret;
+
+/* Send a dummy packet to device's socket*/
+ret

[PATCH v6 6/9] tests/qtest/libqos/e1000e: Export macreg functions

2023-01-31 Thread Akihiko Odaki

They will be useful for igb testing.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Thomas Huth 
---
 tests/qtest/libqos/e1000e.c | 12 
 tests/qtest/libqos/e1000e.h | 12 
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tests/qtest/libqos/e1000e.c b/tests/qtest/libqos/e1000e.c
index 28fb3052aa..925654c7fd 100644
--- a/tests/qtest/libqos/e1000e.c
+++ b/tests/qtest/libqos/e1000e.c
@@ -36,18 +36,6 @@
 
 #define E1000E_RING_LEN (0x1000)
 
-static void e1000e_macreg_write(QE1000E *d, uint32_t reg, uint32_t val)
-{
-QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
-qpci_io_writel(&d_pci->pci_dev, d_pci->mac_regs, reg, val);
-}
-
-static uint32_t e1000e_macreg_read(QE1000E *d, uint32_t reg)
-{
-QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
-return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
-}
-
 void e1000e_tx_ring_push(QE1000E *d, void *descr)
 {
 QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
diff --git a/tests/qtest/libqos/e1000e.h b/tests/qtest/libqos/e1000e.h
index 5e2b201aa7..30643c8094 100644
--- a/tests/qtest/libqos/e1000e.h
+++ b/tests/qtest/libqos/e1000e.h
@@ -42,6 +42,18 @@ struct QE1000E_PCI {
 QE1000E e1000e;
 };
 
+static inline void e1000e_macreg_write(QE1000E *d, uint32_t reg, uint32_t val)
+{
+QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
+qpci_io_writel(&d_pci->pci_dev, d_pci->mac_regs, reg, val);
+}
+
+static inline uint32_t e1000e_macreg_read(QE1000E *d, uint32_t reg)
+{
+QE1000E_PCI *d_pci = container_of(d, QE1000E_PCI, e1000e);
+return qpci_io_readl(&d_pci->pci_dev, d_pci->mac_regs, reg);
+}
+
 void e1000e_wait_isr(QE1000E *d, uint16_t msg_id);
 void e1000e_tx_ring_push(QE1000E *d, void *descr);
 void e1000e_rx_ring_push(QE1000E *d, void *descr);
-- 
2.39.1

Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support

2023-01-31 Thread Alex Williamson

On Tue, 31 Jan 2023 19:29:48 -0400
Jason Gunthorpe  wrote:

> On Tue, Jan 31, 2023 at 03:43:01PM -0700, Alex Williamson wrote:
> 
> > How does this affect our path towards supported migration?  I'm
> > thinking about a user experience where QEMU supports migration if
> > device A OR device B are attached, but not devices A and B attached to
> > the same VM.  We might have a device C where QEMU supports migration
> > with B AND C, but not A AND C, nor A AND B AND C.  This would be the
> > case if device B and device C both supported P2P states, but device A
> > did not. The user has no observability of this feature, so all of this
> > looks effectively random to the user.  
> 
> I think qemu should just log if it encounters a device without P2P
> support.

Better for debugging, but still poor from a VM management perspective.

> > Even in the single device case, we need to make an assumption that a
> > device that does not support P2P migration states (or when QEMU doesn't
> > make use of P2P states) cannot be a DMA target, or otherwise have its
> > MMIO space accessed while in a STOP state.  Can we guarantee that when
> > other devices have not yet transitioned to STOP?  
> 
> You mean the software devices created by qemu?

Other devices, software or otherwise, yes.

> > We could disable the direct map MemoryRegions when we move to a STOP
> > state, which would give QEMU visibility to those accesses, but besides
> > pulling an abort should such an access occur, could we queue them in
> > software, add them to the migration stream, and replay them after the
> > device moves to the RUNNING state?  We'd need to account for the lack of
> > RESUMING_P2P states as well, trapping and queue accesses from devices
> > already RUNNING to those still in RESUMING (not _P2P).  
> 
> I think any internal SW devices should just fail all accesses to the
> P2P space, all the time.
> 
> qemu simply acts like a real system that doesn't support P2P.
> 
> IMHO this is generally the way forward to do multi-device as well,
> remove the MMIO from all the address maps: VFIO, SW access, all of
> them. Nothing can touch MMIO except for the vCPU.

Are you suggesting this relative to migration or in general?  P2P is a
feature with tangible benefits and real use cases.  Real systems seem
to be moving towards making P2P work better, so it would seem short
sighted to move to and enforce only configurations w/o P2P in QEMU
generally.  Besides, this would require some sort of deprecation, so are
we intending to make users choose between migration and P2P?
 
> > This all looks complicated.  Is it better to start with requiring P2P
> > state support?  Thanks,  
> 
> People have built HW without it, so I don't see this as so good..

Are we obliged to start with that hardware?  I'm just trying to think
about whether a single device restriction is sufficient to prevent any
possible P2P or whether there might be an easier starting point for
more capable hardware.  There's no shortage of hardware that could
support migration given sufficient effort.  Thanks,

Alex

[PATCH v6 8/9] tests/avocado: Add igb test

2023-01-31 Thread Akihiko Odaki

This automates ethtool tests for igb registers, interrupts, etc.

Signed-off-by: Akihiko Odaki 
---
 MAINTAINERS   |  1 +
 .../org.centos/stream/8/x86_64/test-avocado   |  1 +
 tests/avocado/igb.py  | 38 +++
 3 files changed, 40 insertions(+)
 create mode 100644 tests/avocado/igb.py

diff --git a/MAINTAINERS b/MAINTAINERS
index 127fd92541..c0831aeb56 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2225,6 +2225,7 @@ igb
 M: Akihiko Odaki 
 S: Maintained
 F: hw/net/igb*
+F: tests/avocado/igb.py
 F: tests/qtest/igb-test.c
 F: tests/qtest/libqos/igb.c
 
diff --git a/scripts/ci/org.centos/stream/8/x86_64/test-avocado b/scripts/ci/org.centos/stream/8/x86_64/test-avocado
index 7aeecbcfb8..7e07dbcc89 100755
--- a/scripts/ci/org.centos/stream/8/x86_64/test-avocado
+++ b/scripts/ci/org.centos/stream/8/x86_64/test-avocado
@@ -37,6 +37,7 @@ make get-vm-images
 tests/avocado/cpu_queries.py:QueryCPUModelExpansion.test \
 tests/avocado/empty_cpu_model.py:EmptyCPUModel.test \
 tests/avocado/hotplug_cpu.py:HotPlugCPU.test \
+tests/avocado/igb.py:IGB.test \
 tests/avocado/info_usernet.py:InfoUsernet.test_hostfwd \
 tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu \
 tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu_pt \
diff --git a/tests/avocado/igb.py b/tests/avocado/igb.py
new file mode 100644
index 00..abf5dfa07f
--- /dev/null
+++ b/tests/avocado/igb.py
@@ -0,0 +1,38 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+# ethtool tests for igb registers, interrupts, etc
+
+from avocado_qemu import LinuxTest
+
+class IGB(LinuxTest):
+"""
+:avocado: tags=accel:kvm
+:avocado: tags=arch:x86_64
+:avocado: tags=distro:fedora
+:avocado: tags=distro_version:31
+:avocado: tags=machine:q35
+"""
+
+timeout = 180
+
+def test(self):
+self.require_accelerator('kvm')
+kernel_url = self.distro.pxeboot_url + 'vmlinuz'
+kernel_hash = '5b6f6876e1b5bda314f93893271da0d5777b1f3c'
+kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
+initrd_url = self.distro.pxeboot_url + 'initrd.img'
+initrd_hash = 'dd0340a1b39bd28f88532babd4581c67649ec5b1'
+initrd_path = self.fetch_asset(initrd_url, asset_hash=initrd_hash)
+
+# Ideally we want to test MSI as well, but it is blocked by a bug
+# fixed with:
+# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=28e96556baca7056d11d9fb3cdd0aba4483e00d8
+kernel_params = self.distro.default_kernel_params + ' pci=nomsi'
+
+self.vm.add_args('-kernel', kernel_path,
+ '-initrd', initrd_path,
+ '-append', kernel_params,
+ '-accel', 'kvm',
+ '-device', 'igb')
+self.launch_and_wait()
+self.ssh_command('dnf -y install ethtool')
+self.ssh_command('ethtool -t eth1 offline')
-- 
2.39.1

[PATCH v6 3/9] e1000: Split header files

2023-01-31 Thread Akihiko Odaki

Some definitions in the header files are invalid for igb so extract
them to new header files to keep igb from referring to them.

Signed-off-by: Gal Hammer 
Signed-off-by: Marcel Apfelbaum 
Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000.c |   1 +
 hw/net/e1000_common.h  | 102 +
 hw/net/e1000_regs.h| 927 +---
 hw/net/e1000e.c|   3 +-
 hw/net/e1000e_core.c   |   1 +
 hw/net/e1000x_common.c |   1 +
 hw/net/e1000x_common.h |  74 
 hw/net/e1000x_regs.h   | 940 +
 8 files changed, 1049 insertions(+), 1000 deletions(-)
 create mode 100644 hw/net/e1000_common.h
 create mode 100644 hw/net/e1000x_regs.h

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index c81d914a02..23d3d32403 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -39,6 +39,7 @@
 #include "qemu/module.h"
 #include "qemu/range.h"
 
+#include "e1000_common.h"
 #include "e1000x_common.h"
 #include "trace.h"
 #include "qom/object.h"
diff --git a/hw/net/e1000_common.h b/hw/net/e1000_common.h
new file mode 100644
index 00..48feda7404
--- /dev/null
+++ b/hw/net/e1000_common.h
@@ -0,0 +1,102 @@
+/*
+ * QEMU e1000(e) emulation - shared definitions
+ *
+ * Copyright (c) 2008 Qumranet
+ *
+ * Based on work done by:
+ * Nir Peleg, Tutis Systems Ltd. for Qumranet Inc.
+ * Copyright (c) 2007 Dan Aloni
+ * Copyright (c) 2004 Antony T Curtis
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_NET_E1000_COMMON_H
+#define HW_NET_E1000_COMMON_H
+
+#include "e1000_regs.h"
+
+#define defreg(x)   x = (E1000_##x >> 2)
+enum {
+defreg(CTRL),defreg(EECD),defreg(EERD),defreg(GPRC),
+defreg(GPTC),defreg(ICR), defreg(ICS), defreg(IMC),
+defreg(IMS), defreg(LEDCTL),  defreg(MANC),defreg(MDIC),
+defreg(MPC), defreg(PBA), defreg(RCTL),defreg(RDBAH0),
+defreg(RDBAL0),  defreg(RDH0),defreg(RDLEN0),  defreg(RDT0),
+defreg(STATUS),  defreg(SWSM),defreg(TCTL),defreg(TDBAH),
+defreg(TDBAL),   defreg(TDH), defreg(TDLEN),   defreg(TDT),
+defreg(TDLEN1),  defreg(TDBAL1),  defreg(TDBAH1),  defreg(TDH1),
+defreg(TDT1),defreg(TORH),defreg(TORL),defreg(TOTH),
+defreg(TOTL),defreg(TPR), defreg(TPT), defreg(TXDCTL),
+defreg(WUFC),defreg(RA),  defreg(MTA), defreg(CRCERRS),
+defreg(VFTA),defreg(VET), defreg(RDTR),defreg(RADV),
+defreg(TADV),defreg(ITR), defreg(SCC), defreg(ECOL),
+defreg(MCC), defreg(LATECOL), defreg(COLC),defreg(DC),
+defreg(TNCRS),   defreg(SEQEC),   defreg(CEXTERR), defreg(RLEC),
+defreg(XONRXC),  defreg(XONTXC),  defreg(XOFFRXC), defreg(XOFFTXC),
+defreg(FCRUC),   defreg(AIT), defreg(TDFH),defreg(TDFT),
+defreg(TDFHS),   defreg(TDFTS),   defreg(TDFPC),   defreg(WUC),
+defreg(WUS), defreg(POEMB),   defreg(PBS), defreg(RDFH),
+defreg(RDFT),defreg(RDFHS),   defreg(RDFTS),   defreg(RDFPC),
+defreg(PBM), defreg(IPAV),defreg(IP4AT),   defreg(IP6AT),
+defreg(WUPM),defreg(FFLT),defreg(FFMT),defreg(FFVT),
+defreg(TARC0),   defreg(TARC1),   defreg(IAM), defreg(EXTCNF_CTRL),
+defreg(GCR), defreg(TIMINCA), defreg(EIAC),defreg(CTRL_EXT),
+defreg(IVAR),defreg(MFUTP01), defreg(MFUTP23), defreg(MANC2H),
+defreg(MFVAL),   defreg(MDEF),defreg(FACTPS),  defreg(FTFT),
+defreg(RUC), defreg(ROC), defreg(RFC), defreg(RJC),
+defreg(PRC64),   defreg(PRC127),  defreg(PRC255),  defreg(PRC511),
+defreg(PRC1023), defreg(PRC1522), defreg(PTC64),   defreg(PTC127),
+defreg(PTC255),  defreg(PTC511),  defreg(PTC1023), defreg(PTC1522),
+defreg(GORCL),   defreg(GORCH),   defreg(GOTCL),   defreg(GOTCH),
+defreg(RNBC),defreg(BPRC),defreg(MPRC),defreg(RFCTL),
+defreg(PSRCTL),  defreg(MPTC),defreg(BPTC),defreg(TSCTFC),
+defreg(IAC), defreg(MGTPRC),  defreg(MGTPDC),  defreg(MGTPTC),
+defreg(TSCTC),   defreg(RXCSUM),  defreg(FUNCTAG), defreg(GSCL_1),
+defreg(GSCL_2),  defreg(GSCL_3),  defreg(GSCL_4),  defreg(GSCN_0),
+defreg(GSCN_1),  defreg(GSCN_2),  defreg(GSCN_3),  defreg(GCR2),
+defreg(RAID),defreg(RSRPD),   defreg(TIDV),defreg(EITR),
+defreg(MRQC),

[PATCH v6 2/9] pcie: Introduce pcie_sriov_num_vfs

2023-01-31 Thread Akihiko Odaki

igb can use this function to change its behavior depending on the
number of virtual functions currently enabled.

Signed-off-by: Gal Hammer 
Signed-off-by: Marcel Apfelbaum 
Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/pci/pcie_sriov.c | 5 +
 include/hw/pci/pcie_sriov.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
index f0bd72e069..aa5a757b11 100644
--- a/hw/pci/pcie_sriov.c
+++ b/hw/pci/pcie_sriov.c
@@ -300,3 +300,8 @@ PCIDevice *pcie_sriov_get_vf_at_index(PCIDevice *dev, int n)
 }
 return NULL;
 }
+
+uint16_t pcie_sriov_num_vfs(PCIDevice *dev)
+{
+return dev->exp.sriov_pf.num_vfs;
+}
diff --git a/include/hw/pci/pcie_sriov.h b/include/hw/pci/pcie_sriov.h
index 96cc743309..095fb0c9ed 100644
--- a/include/hw/pci/pcie_sriov.h
+++ b/include/hw/pci/pcie_sriov.h
@@ -76,4 +76,7 @@ PCIDevice *pcie_sriov_get_pf(PCIDevice *dev);
  */
 PCIDevice *pcie_sriov_get_vf_at_index(PCIDevice *dev, int n);
 
+/* Returns the current number of virtual functions. */
+uint16_t pcie_sriov_num_vfs(PCIDevice *dev);
+
 #endif /* QEMU_PCIE_SRIOV_H */
-- 
2.39.1

[PATCH v6 0/9] Introduce igb

2023-01-31 Thread Akihiko Odaki

Based-on: <20230201033539.30049-1-akihiko.od...@daynix.com>
([PATCH v5 00/29] e1000x cleanups (preliminary for IGB))

igb is a family of Intel's gigabit ethernet controllers. This series implements
82576 emulation in particular. You can see the last patch for the documentation.

Note that there is another effort to bring 82576 emulation to QEMU. That series
was developed independently by Sriram Yagnaraman:
https://lists.gnu.org/archive/html/qemu-devel/2022-12/msg04670.html

V5 -> V6:
- Rebased.
- Renamed "test" to "packet" in tests/qtest/e1000e-test.c.
- Fixed Rx logic so that a Rx pool without enough space won't prevent other
  pools from receiving, based on Sriram Yagnaraman's work.

V4 -> V5:
- Rebased.
- Squashed patches to copy from e1000e code and modify it.
- Listed the implemented features.
- Added a check for interrupt availability on PF.
- Fixed the declaration of igb_receive_internal(). (Sriram Yagnaraman)

V3 -> V4:
- Rebased.
- Corrected PCIDevice specified for DMA.

V2 -> V3:
- Rebased.
- Fixed PCIDevice reference in hw/net/igbvf.c.
- Fixed TX packet switching when VM loopback is enabled.
- Fixed VMDq enablement check.
- Fixed RX descriptor length parser.
- Fixed the definitions of RQDPC readers.
- Implemented VLAN VM filter.
- Implemented VT_CTL.Def_PL.
- Implemented the combination of VMDq and RSS.
- Noted that igb is tested with Windows HLK.

V1 -> V2:
- Spun off e1000e general improvements to a distinct series.
- Restored vnet_hdr offload as nothing seems to prevent it.

Akihiko Odaki (9):
  hw/net/net_tx_pkt: Introduce net_tx_pkt_get_eth_hdr
  pcie: Introduce pcie_sriov_num_vfs
  e1000: Split header files
  Introduce igb device emulation
  tests/qtest/e1000e-test: Fabricate ethernet header
  tests/qtest/libqos/e1000e: Export macreg functions
  igb: Introduce qtest for igb device
  tests/avocado: Add igb test
  docs/system/devices/igb: Add igb documentation

 MAINTAINERS   |9 +
 docs/system/device-emulation.rst  |1 +
 docs/system/devices/igb.rst   |   71 +
 hw/net/Kconfig|5 +
 hw/net/e1000.c|1 +
 hw/net/e1000_common.h |  102 +
 hw/net/e1000_regs.h   |  927 +---
 hw/net/e1000e.c   |3 +-
 hw/net/e1000e_core.c  |1 +
 hw/net/e1000x_common.c|1 +
 hw/net/e1000x_common.h|   74 -
 hw/net/e1000x_regs.h  |  940 
 hw/net/igb.c  |  612 +++
 hw/net/igb_common.h   |  146 +
 hw/net/igb_core.c | 4043 +
 hw/net/igb_core.h |  144 +
 hw/net/igb_regs.h |  648 +++
 hw/net/igbvf.c|  327 ++
 hw/net/meson.build|2 +
 hw/net/net_tx_pkt.c   |6 +
 hw/net/net_tx_pkt.h   |8 +
 hw/net/trace-events   |   32 +
 hw/pci/pcie_sriov.c   |5 +
 include/hw/pci/pcie_sriov.h   |3 +
 .../org.centos/stream/8/x86_64/test-avocado   |1 +
 tests/avocado/igb.py  |   38 +
 tests/qtest/e1000e-test.c |   25 +-
 tests/qtest/fuzz/generic_fuzz_configs.h   |5 +
 tests/qtest/igb-test.c|  243 +
 tests/qtest/libqos/e1000e.c   |   12 -
 tests/qtest/libqos/e1000e.h   |   14 +
 tests/qtest/libqos/igb.c  |  185 +
 tests/qtest/libqos/meson.build|1 +
 tests/qtest/meson.build   |1 +
 34 files changed, 7614 insertions(+), 1022 deletions(-)
 create mode 100644 docs/system/devices/igb.rst
 create mode 100644 hw/net/e1000_common.h
 create mode 100644 hw/net/e1000x_regs.h
 create mode 100644 hw/net/igb.c
 create mode 100644 hw/net/igb_common.h
 create mode 100644 hw/net/igb_core.c
 create mode 100644 hw/net/igb_core.h
 create mode 100644 hw/net/igb_regs.h
 create mode 100644 hw/net/igbvf.c
 create mode 100644 tests/avocado/igb.py
 create mode 100644 tests/qtest/igb-test.c
 create mode 100644 tests/qtest/libqos/igb.c

-- 
2.39.1

Re: [PATCH v2] target/riscv: set tval for triggered watchpoints

2023-01-31 Thread Bin Meng

On Wed, Feb 1, 2023 at 1:35 AM Sergey Matyukevich  wrote:
>
> From: Sergey Matyukevich 
>
> According to priviledged spec, if [sm]tval is written with a nonzero

typo: privileged

> value when a breakpoint exception occurs, then [sm]tval will contain
> the faulting virtual address. Set tval to hit address when breakpoint
> exception is triggered by hardware watchpoint.
>
> Signed-off-by: Sergey Matyukevich 
>
> ---
>
> v1 -> v2
> - do not set tval blindly for every breakpoint exception,
>   handle current specific case under consideration
>
>  target/riscv/cpu_helper.c | 6 ++
>  target/riscv/debug.c  | 1 -
>  2 files changed, 6 insertions(+), 1 deletion(-)
>

Reviewed-by: Bin Meng

[PATCH v5 14/29] e1000e: Configure ResettableClass

2023-01-31 Thread Akihiko Odaki

This is part of the recent effort to refactor e1000 and e1000e.

DeviceClass's reset member is deprecated so migrate to ResettableClass.
There is no behavioral difference.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000e.c | 10 ++
 hw/net/trace-events |  2 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 0bc222d354..ec274319c4 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -513,11 +513,11 @@ static void e1000e_pci_uninit(PCIDevice *pci_dev)
 msi_uninit(pci_dev);
 }
 
-static void e1000e_qdev_reset(DeviceState *dev)
+static void e1000e_qdev_reset_hold(Object *obj)
 {
-E1000EState *s = E1000E(dev);
+E1000EState *s = E1000E(obj);
 
-trace_e1000e_cb_qdev_reset();
+trace_e1000e_cb_qdev_reset_hold();
 
 e1000e_core_reset(&s->core);
 
@@ -669,6 +669,7 @@ static Property e1000e_properties[] = {
 static void e1000e_class_init(ObjectClass *class, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(class);
+ResettableClass *rc = RESETTABLE_CLASS(class);
 PCIDeviceClass *c = PCI_DEVICE_CLASS(class);
 
 c->realize = e1000e_pci_realize;
@@ -679,8 +680,9 @@ static void e1000e_class_init(ObjectClass *class, void 
*data)
 c->romfile = "efi-e1000e.rom";
 c->class_id = PCI_CLASS_NETWORK_ETHERNET;
 
+rc->phases.hold = e1000e_qdev_reset_hold;
+
 dc->desc = "Intel 82574L GbE Controller";
-dc->reset = e1000e_qdev_reset;
 dc->vmsd = _vmstate;
 
 e1000e_prop_disable_vnet = qdev_prop_uint8;
diff --git a/hw/net/trace-events b/hw/net/trace-events
index 8fa4299704..c98ad12537 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -251,7 +251,7 @@ e1000e_vm_state_stopped(void) "VM state is stopped"
 # e1000e.c
 e1000e_cb_pci_realize(void) "E1000E PCI realize entry"
 e1000e_cb_pci_uninit(void) "E1000E PCI unit entry"
-e1000e_cb_qdev_reset(void) "E1000E qdev reset entry"
+e1000e_cb_qdev_reset_hold(void) "E1000E qdev reset hold"
 e1000e_cb_pre_save(void) "E1000E pre save entry"
 e1000e_cb_post_load(void) "E1000E post load entry"
 
-- 
2.39.1

[PATCH v5 28/29] MAINTAINERS: Add e1000e test files

2023-01-31 Thread Akihiko Odaki

Signed-off-by: Akihiko Odaki 
Acked-by: Thomas Huth 
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 958915f227..e920d0061e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2218,6 +2218,8 @@ R: Akihiko Odaki 
 S: Maintained
 F: hw/net/e1000e*
 F: tests/qtest/fuzz-e1000e-test.c
+F: tests/qtest/e1000e-test.c
+F: tests/qtest/libqos/e1000e.*
 
 eepro100
 M: Stefan Weil 
-- 
2.39.1

[PATCH v5 29/29] e1000e: Combine rx traces

2023-01-31 Thread Akihiko Odaki

Whether a packet will be written back to the guest depends on the
remaining space of the queue. Therefore, e1000e_rx_written_to_guest and
e1000e_rx_not_written_to_guest should log the index of the queue instead
of generated interrupts. This also removes the need for
e1000e_rx_rss_dispatched_to_queue, which logs the queue index.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 6 ++
 hw/net/trace-events  | 5 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 76c7814cb8..4fec6dfe7e 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1759,8 +1759,6 @@ e1000e_receive_internal(E1000ECore *core, const struct 
iovec *iov, int iovcnt,
 e1000e_rss_parse_packet(core, core->rx_pkt, &rss_info);
 e1000e_rx_ring_init(core, &rxr, rss_info.queue);
 
-trace_e1000e_rx_rss_dispatched_to_queue(rxr.i->idx);
-
 total_size = net_rx_pkt_get_total_len(core->rx_pkt) +
 e1000x_fcs_len(core->mac);
 
@@ -1786,12 +1784,12 @@ e1000e_receive_internal(E1000ECore *core, const struct 
iovec *iov, int iovcnt,
 rdmts_hit = e1000e_rx_descr_threshold_hit(core, rxr.i);
 n |= e1000e_rx_wb_interrupt_cause(core, rxr.i->idx, rdmts_hit);
 
-trace_e1000e_rx_written_to_guest(n);
+trace_e1000e_rx_written_to_guest(rxr.i->idx);
 } else {
 n |= E1000_ICS_RXO;
 retval = 0;
 
-trace_e1000e_rx_not_written_to_guest(n);
+trace_e1000e_rx_not_written_to_guest(rxr.i->idx);
 }
 
 if (!e1000e_intrmgr_delay_rx_causes(core, &n)) {
diff --git a/hw/net/trace-events b/hw/net/trace-events
index f7257a0693..d24ba945dc 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -165,8 +165,8 @@ e1000e_rx_descr(int ridx, uint64_t base, uint8_t len) "Next 
RX descriptor: ring
 e1000e_rx_set_rctl(uint32_t rctl) "RCTL = 0x%x"
 e1000e_rx_receive_iov(int iovcnt) "Received vector of %d fragments"
 e1000e_rx_flt_dropped(void) "Received packet dropped by RX filter"
-e1000e_rx_written_to_guest(uint32_t causes) "Received packet written to guest 
(ICR causes %u)"
-e1000e_rx_not_written_to_guest(uint32_t causes) "Received packet NOT written 
to guest (ICR causes %u)"
+e1000e_rx_written_to_guest(int queue_idx) "Received packet written to guest 
(queue %d)"
+e1000e_rx_not_written_to_guest(int queue_idx) "Received packet NOT written to 
guest (queue %d)"
 e1000e_rx_interrupt_set(uint32_t causes) "Receive interrupt set (ICR causes 
%u)"
 e1000e_rx_interrupt_delayed(uint32_t causes) "Receive interrupt delayed (ICR 
causes %u)"
 e1000e_rx_set_cso(int cso_state) "RX CSO state set to %d"
@@ -180,7 +180,6 @@ e1000e_rx_rss_type(uint32_t type) "RSS type is %u"
 e1000e_rx_rss_ip4(bool isfragment, bool istcp, uint32_t mrqc, bool 
tcpipv4_enabled, bool ipv4_enabled) "RSS IPv4: fragment %d, tcp %d, mrqc 0x%X, 
tcpipv4 enabled %d, ipv4 enabled %d"
 e1000e_rx_rss_ip6_rfctl(uint32_t rfctl) "RSS IPv6: rfctl 0x%X"
 e1000e_rx_rss_ip6(bool ex_dis, bool new_ex_dis, bool istcp, bool 
has_ext_headers, bool ex_dst_valid, bool ex_src_valid, uint32_t mrqc, bool 
tcpipv6_enabled, bool ipv6ex_enabled, bool ipv6_enabled) "RSS IPv6: ex_dis: %d, 
new_ex_dis: %d, tcp %d, has_ext_headers %d, ex_dst_valid %d, ex_src_valid %d, 
mrqc 0x%X, tcpipv6 enabled %d, ipv6ex enabled %d, ipv6 enabled %d"
-e1000e_rx_rss_dispatched_to_queue(int queue_idx) "Packet being dispatched to 
queue %d"
 
 e1000e_rx_metadata_protocols(bool isip4, bool isip6, bool isudp, bool istcp) 
"protocols: ip4: %d, ip6: %d, udp: %d, tcp: %d"
 e1000e_rx_metadata_vlan(uint16_t vlan_tag) "VLAN tag is 0x%X"
-- 
2.39.1

[PATCH v5 06/29] e1000e: Mask registers when writing

2023-01-31 Thread Akihiko Odaki

When a register has fewer effective bits than its width, the old code
masked inconsistently, sometimes when writing and sometimes when reading.
Make the code consistent by always masking when writing, and remove some
code duplication.
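As an illustration of the mask-on-write approach described above, here is a
minimal standalone sketch. It is not QEMU code: ToyCore, the REG_* indices,
toy_set_*bit() and toy_read() are hypothetical names chosen for this example,
and only the macro shape mirrors the one in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the device state; not QEMU's real types. */
#define BIT(n) (1u << (n))

enum { REG_AIT, REG_RDFH, REG_COUNT };

typedef struct {
    uint32_t mac[REG_COUNT];
} ToyCore;

/* Generate a setter that stores only the low `num` bits of the value. */
#define LOW_BITS_SET_FUNC(num)                                      \
static void toy_set_##num##bit(ToyCore *core, int index,            \
                               uint32_t val)                        \
{                                                                   \
    assert(index >= 0 && index < REG_COUNT);                        \
    core->mac[index] = val & (BIT(num) - 1);                        \
}

LOW_BITS_SET_FUNC(13)
LOW_BITS_SET_FUNC(16)

/* Because the mask is applied on write, a plain read needs no mask. */
static uint32_t toy_read(const ToyCore *core, int index)
{
    assert(index >= 0 && index < REG_COUNT);
    return core->mac[index];
}
```

With this arrangement the stored value is always within the register's
effective width, so every reader can share one generic accessor.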

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 94 +++-
 1 file changed, 40 insertions(+), 54 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 181c1e0c2a..e6fc85ea51 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2440,17 +2440,19 @@ e1000e_set_fcrtl(E1000ECore *core, int index, uint32_t 
val)
 core->mac[FCRTL] = val & 0x8000FFF8;
 }
 
-static inline void
-e1000e_set_16bit(E1000ECore *core, int index, uint32_t val)
-{
-core->mac[index] = val & 0xffff;
-}
+#define E1000E_LOW_BITS_SET_FUNC(num)\
+static void  \
+e1000e_set_##num##bit(E1000ECore *core, int index, uint32_t val) \
+{\
+core->mac[index] = val & (BIT(num) - 1); \
+}
 
-static void
-e1000e_set_12bit(E1000ECore *core, int index, uint32_t val)
-{
-core->mac[index] = val & 0xfff;
-}
+E1000E_LOW_BITS_SET_FUNC(4)
+E1000E_LOW_BITS_SET_FUNC(6)
+E1000E_LOW_BITS_SET_FUNC(11)
+E1000E_LOW_BITS_SET_FUNC(12)
+E1000E_LOW_BITS_SET_FUNC(13)
+E1000E_LOW_BITS_SET_FUNC(16)
 
 static void
 e1000e_set_vet(E1000ECore *core, int index, uint32_t val)
@@ -2621,22 +2623,6 @@ e1000e_mac_ims_read(E1000ECore *core, int index)
 return core->mac[IMS];
 }
 
-#define E1000E_LOW_BITS_READ_FUNC(num)  \
-static uint32_t \
-e1000e_mac_low##num##_read(E1000ECore *core, int index) \
-{   \
-return core->mac[index] & (BIT(num) - 1);   \
-}   \
-
-#define E1000E_LOW_BITS_READ(num)   \
-e1000e_mac_low##num##_read
-
-E1000E_LOW_BITS_READ_FUNC(4);
-E1000E_LOW_BITS_READ_FUNC(6);
-E1000E_LOW_BITS_READ_FUNC(11);
-E1000E_LOW_BITS_READ_FUNC(13);
-E1000E_LOW_BITS_READ_FUNC(16);
-
 static uint32_t
 e1000e_mac_swsm_read(E1000ECore *core, int index)
 {
@@ -2930,7 +2916,19 @@ static const readops e1000e_macreg_readops[] = {
 e1000e_getreg(LATECOL),
 e1000e_getreg(SEQEC),
 e1000e_getreg(XONTXC),
+e1000e_getreg(AIT),
+e1000e_getreg(TDFH),
+e1000e_getreg(TDFT),
+e1000e_getreg(TDFHS),
+e1000e_getreg(TDFTS),
+e1000e_getreg(TDFPC),
 e1000e_getreg(WUS),
+e1000e_getreg(PBS),
+e1000e_getreg(RDFH),
+e1000e_getreg(RDFT),
+e1000e_getreg(RDFHS),
+e1000e_getreg(RDFTS),
+e1000e_getreg(RDFPC),
 e1000e_getreg(GORCL),
 e1000e_getreg(MGTPRC),
 e1000e_getreg(EERD),
@@ -3066,16 +3064,9 @@ static const readops e1000e_macreg_readops[] = {
 [MPTC]= e1000e_mac_read_clr4,
 [IAC] = e1000e_mac_read_clr4,
 [ICR] = e1000e_mac_icr_read,
-[RDFH]= E1000E_LOW_BITS_READ(13),
-[RDFHS]   = E1000E_LOW_BITS_READ(13),
-[RDFPC]   = E1000E_LOW_BITS_READ(13),
-[TDFH]= E1000E_LOW_BITS_READ(13),
-[TDFHS]   = E1000E_LOW_BITS_READ(13),
 [STATUS]  = e1000e_get_status,
 [TARC0]   = e1000e_get_tarc,
-[PBS] = E1000E_LOW_BITS_READ(6),
 [ICS] = e1000e_mac_ics_read,
-[AIT] = E1000E_LOW_BITS_READ(16),
 [TORH]= e1000e_mac_read_clr8,
 [GORCH]   = e1000e_mac_read_clr8,
 [PRC127]  = e1000e_mac_read_clr4,
@@ -3091,11 +3082,6 @@ static const readops e1000e_macreg_readops[] = {
 [BPTC]= e1000e_mac_read_clr4,
 [TSCTC]   = e1000e_mac_read_clr4,
 [ITR] = e1000e_mac_itr_read,
-[RDFT]= E1000E_LOW_BITS_READ(13),
-[RDFTS]   = E1000E_LOW_BITS_READ(13),
-[TDFPC]   = E1000E_LOW_BITS_READ(13),
-[TDFT]= E1000E_LOW_BITS_READ(13),
-[TDFTS]   = E1000E_LOW_BITS_READ(13),
 [CTRL]= e1000e_get_ctrl,
 [TARC1]   = e1000e_get_tarc,
 [SWSM]= e1000e_mac_swsm_read,
@@ -3108,10 +3094,10 @@ static const readops e1000e_macreg_readops[] = {
 [WUPM ... WUPM + 31]   = e1000e_mac_readreg,
 [MTA ... MTA + 127]= e1000e_mac_readreg,
 [VFTA ... VFTA + 127]  = e1000e_mac_readreg,
-[FFMT ... FFMT + 254]  = E1000E_LOW_BITS_READ(4),
+[FFMT ... FFMT + 254]  = e1000e_mac_readreg,
 [FFVT ... FFVT + 254]  = e1000e_mac_readreg,
 [MDEF ... MDEF + 7]= e1000e_mac_readreg,
-[FFLT ... FFLT + 10]   = E1000E_LOW_BITS_READ(11),
+[FFLT ... FFLT + 10]   = e1000e_mac_readreg,
 [FTFT ... FTFT + 254]  = e1000e_mac_readreg,
 [PBM ... PBM + 10239]  = e1000e_mac_readreg,
 [RETA ... RETA + 31]   = e1000e_mac_readreg,
@@ -3134,19 +3120,8 @@ static const writeops e1000e_macreg_writeops[] = {
 e1000e_putreg(LEDCTL),
 e1000e_putreg(FCAL),
 e1000e_putreg(FCRUC),
-e1000e_putreg(AIT),
-

[PATCH v5 08/29] e1000e: Use more constant definitions

2023-01-31 Thread Akihiko Odaki

The definitions of SW Semaphore Register were copied from:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/intel/e1000e/defines.h?h=v6.0.9#n374

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000_regs.h  |  7 +++
 hw/net/e1000e_core.c | 49 
 2 files changed, 34 insertions(+), 22 deletions(-)

diff --git a/hw/net/e1000_regs.h b/hw/net/e1000_regs.h
index 3f6b5d0c52..6a36573802 100644
--- a/hw/net/e1000_regs.h
+++ b/hw/net/e1000_regs.h
@@ -525,6 +525,13 @@
 #define M88E1000_PHY_VCO_REG_BIT8  0x100 /* Bits 8 & 11 are adjusted for */
 #define M88E1000_PHY_VCO_REG_BIT11 0x800/* improved BER performance */
 
+/* SW Semaphore Register */
+#define E1000_SWSM_SMBI 0x0001 /* Driver Semaphore bit */
+#define E1000_SWSM_SWESMBI  0x0002 /* FW Semaphore bit */
+#define E1000_SWSM_DRV_LOAD 0x0008 /* Driver Loaded Bit */
+
+#define E1000_SWSM2_LOCK0x0002 /* Secondary driver semaphore bit */
+
 /* Interrupt Cause Read */
 #define E1000_ICR_TXDW  0x0001 /* Transmit desc written back */
 #define E1000_ICR_TXQE  0x0002 /* Transmit Queue empty */
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index e6fc85ea51..6a4da72bd3 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1022,10 +1022,11 @@ e1000e_receive_filter(E1000ECore *core, const uint8_t 
*buf, int size)
 
 if (e1000x_is_vlan_packet(buf, core->mac[VET]) &&
 e1000x_vlan_rx_filter_enabled(core->mac)) {
-uint16_t vid = lduw_be_p(buf + 14);
-uint32_t vfta = ldl_le_p((uint32_t *)(core->mac + VFTA) +
- ((vid >> 5) & 0x7f));
-if ((vfta & (1 << (vid & 0x1f))) == 0) {
+uint16_t vid = lduw_be_p(_GET_VLAN_HDR(buf)->h_tci);
+uint32_t vfta =
+ldl_le_p((uint32_t *)(core->mac + VFTA) +
+ ((vid >> E1000_VFTA_ENTRY_SHIFT) & 
E1000_VFTA_ENTRY_MASK));
+if ((vfta & (1 << (vid & E1000_VFTA_ENTRY_BIT_SHIFT_MASK))) == 0) {
 trace_e1000e_rx_flt_vlan_mismatch(vid);
 return false;
 } else {
@@ -1679,16 +1680,13 @@ e1000e_rx_fix_l4_csum(E1000ECore *core, struct NetRxPkt 
*pkt)
 }
 }
 
-/* Min. octets in an ethernet frame sans FCS */
-#define MIN_BUF_SIZE 60
-
 ssize_t
 e1000e_receive_iov(E1000ECore *core, const struct iovec *iov, int iovcnt)
 {
-static const int maximum_ethernet_hdr_len = (14 + 4);
+static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
 
 uint32_t n = 0;
-uint8_t min_buf[MIN_BUF_SIZE];
+uint8_t min_buf[ETH_ZLEN];
 struct iovec min_iov;
 uint8_t *filter_buf;
 size_t size, orig_size;
@@ -2627,7 +2625,7 @@ static uint32_t
 e1000e_mac_swsm_read(E1000ECore *core, int index)
 {
 uint32_t val = core->mac[SWSM];
-core->mac[SWSM] = val | 1;
+core->mac[SWSM] = val | E1000_SWSM_SMBI;
 return val;
 }
 
@@ -3092,8 +3090,8 @@ static const readops e1000e_macreg_readops[] = {
 [IP4AT ... IP4AT + 6]  = e1000e_mac_readreg,
 [RA ... RA + 31]   = e1000e_mac_readreg,
 [WUPM ... WUPM + 31]   = e1000e_mac_readreg,
-[MTA ... MTA + 127]= e1000e_mac_readreg,
-[VFTA ... VFTA + 127]  = e1000e_mac_readreg,
+[MTA ... MTA + E1000_MC_TBL_SIZE - 1] = e1000e_mac_readreg,
+[VFTA ... VFTA + E1000_VLAN_FILTER_TBL_SIZE - 1]  = e1000e_mac_readreg,
 [FFMT ... FFMT + 254]  = e1000e_mac_readreg,
 [FFVT ... FFVT + 254]  = e1000e_mac_readreg,
 [MDEF ... MDEF + 7]= e1000e_mac_readreg,
@@ -3245,8 +3243,8 @@ static const writeops e1000e_macreg_writeops[] = {
 [IP4AT ... IP4AT + 6]= e1000e_mac_writereg,
 [RA + 2 ... RA + 31] = e1000e_mac_writereg,
 [WUPM ... WUPM + 31] = e1000e_mac_writereg,
-[MTA ... MTA + 127]  = e1000e_mac_writereg,
-[VFTA ... VFTA + 127]= e1000e_mac_writereg,
+[MTA ... MTA + E1000_MC_TBL_SIZE - 1] = e1000e_mac_writereg,
+[VFTA ... VFTA + E1000_VLAN_FILTER_TBL_SIZE - 1]= e1000e_mac_writereg,
 [FFMT ... FFMT + 254]= e1000e_set_4bit,
 [FFVT ... FFVT + 254]= e1000e_mac_writereg,
 [PBM ... PBM + 10239]= e1000e_mac_writereg,
@@ -3276,7 +3274,7 @@ static const uint16_t mac_reg_access[E1000E_MAC_SIZE] = {
 [TDH_A]   = 0x0cf8, [TDT_A]   = 0x0cf8, [TIDV_A] = 0x0cf8,
 [TDFH_A]  = 0xed00, [TDFT_A]  = 0xed00,
 [RA_A ... RA_A + 31]  = 0x14f0,
-[VFTA_A ... VFTA_A + 127] = 0x1400,
+[VFTA_A ... VFTA_A + E1000_VLAN_FILTER_TBL_SIZE - 1] = 0x1400,
 [RDBAL0_A ... RDLEN0_A] = 0x09bc,
 [TDBAL_A ... TDLEN_A]   = 0x0cf8,
 /* Access options */
@@ -3433,13 +3431,20 @@ 
e1000e_phy_reg_init[E1000E_PHY_PAGES][E1000E_PHY_PAGE_SIZE] = {
 
 [MII_PHYID1]= 0x141,
 [MII_PHYID2]= E1000_PHY_ID2_82574x,
-[MII_ANAR]  = 0xde1,
-[MII_ANLPAR]= 0x7e0,
-[MII_ANER]  = BIT(2),
-[MII_ANNP]

[PATCH v5 05/29] e1000: Mask registers when writing

2023-01-31 Thread Akihiko Odaki

When a register has fewer effective bits than its width, the old code
masked inconsistently, sometimes when writing and sometimes when reading.
Make the code consistent by always masking when writing, and remove some
code duplication.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000.c | 84 +++---
 1 file changed, 31 insertions(+), 53 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 9619a2e481..0925a99511 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1062,30 +1062,6 @@ mac_readreg(E1000State *s, int index)
 return s->mac_reg[index];
 }
 
-static uint32_t
-mac_low4_read(E1000State *s, int index)
-{
-return s->mac_reg[index] & 0xf;
-}
-
-static uint32_t
-mac_low11_read(E1000State *s, int index)
-{
-return s->mac_reg[index] & 0x7ff;
-}
-
-static uint32_t
-mac_low13_read(E1000State *s, int index)
-{
-return s->mac_reg[index] & 0x1fff;
-}
-
-static uint32_t
-mac_low16_read(E1000State *s, int index)
-{
-return s->mac_reg[index] & 0xffff;
-}
-
 static uint32_t
 mac_icr_read(E1000State *s, int index)
 {
@@ -1138,11 +1114,17 @@ set_rdt(E1000State *s, int index, uint32_t val)
 }
 }
 
-static void
-set_16bit(E1000State *s, int index, uint32_t val)
-{
-s->mac_reg[index] = val & 0xffff;
-}
+#define LOW_BITS_SET_FUNC(num) \
+static void\
+set_##num##bit(E1000State *s, int index, uint32_t val) \
+{  \
+s->mac_reg[index] = val & (BIT(num) - 1);  \
+}
+
+LOW_BITS_SET_FUNC(4)
+LOW_BITS_SET_FUNC(11)
+LOW_BITS_SET_FUNC(13)
+LOW_BITS_SET_FUNC(16)
 
 static void
 set_dlen(E1000State *s, int index, uint32_t val)
@@ -1196,7 +1178,9 @@ static const readops macreg_readops[] = {
 getreg(XONRXC),   getreg(XONTXC),   getreg(XOFFRXC),  getreg(XOFFTXC),
 getreg(RFC),  getreg(RJC),  getreg(RNBC), getreg(TSCTFC),
 getreg(MGTPRC),   getreg(MGTPDC),   getreg(MGTPTC),   getreg(GORCL),
-getreg(GOTCL),
+getreg(GOTCL),getreg(RDFH), getreg(RDFT), getreg(RDFHS),
+getreg(RDFTS),getreg(RDFPC),getreg(TDFH), getreg(TDFT),
+getreg(TDFHS),getreg(TDFTS),getreg(TDFPC),getreg(AIT),
 
 [TOTH]= mac_read_clr8,  [TORH]= mac_read_clr8,
 [GOTCH]   = mac_read_clr8,  [GORCH]   = mac_read_clr8,
@@ -1214,22 +1198,15 @@ static const readops macreg_readops[] = {
 [MPTC]= mac_read_clr4,
 [ICR] = mac_icr_read,   [EECD]= get_eecd,
 [EERD]= flash_eerd_read,
-[RDFH]= mac_low13_read, [RDFT]= mac_low13_read,
-[RDFHS]   = mac_low13_read, [RDFTS]   = mac_low13_read,
-[RDFPC]   = mac_low13_read,
-[TDFH]= mac_low11_read, [TDFT]= mac_low11_read,
-[TDFHS]   = mac_low13_read, [TDFTS]   = mac_low13_read,
-[TDFPC]   = mac_low13_read,
-[AIT] = mac_low16_read,
 
 [CRCERRS ... MPC] = &mac_readreg,
 [IP6AT ... IP6AT + 3] = &mac_readreg,[IP4AT ... IP4AT + 6] = 
&mac_readreg,
-[FFLT ... FFLT + 6]   = &mac_low11_read,
+[FFLT ... FFLT + 6]   = &mac_readreg,
 [RA ... RA + 31]  = &mac_readreg,
 [WUPM ... WUPM + 31]  = &mac_readreg,
 [MTA ... MTA + 127]   = &mac_readreg,
 [VFTA ... VFTA + 127] = &mac_readreg,
-[FFMT ... FFMT + 254] = &mac_low4_read,
+[FFMT ... FFMT + 254] = &mac_readreg,
 [FFVT ... FFVT + 254] = &mac_readreg,
 [PBM ... PBM + 16383] = &mac_readreg,
 };
@@ -1241,26 +1218,27 @@ static const writeops macreg_writeops[] = {
 putreg(PBA),  putreg(EERD), putreg(SWSM), putreg(WUFC),
 putreg(TDBAL),putreg(TDBAH),putreg(TXDCTL),   putreg(RDBAH),
 putreg(RDBAL),putreg(LEDCTL),   putreg(VET),  putreg(FCRUC),
-putreg(TDFH), putreg(TDFT), putreg(TDFHS),putreg(TDFTS),
-putreg(TDFPC),putreg(RDFH), putreg(RDFT), putreg(RDFHS),
-putreg(RDFTS),putreg(RDFPC),putreg(IPAV), putreg(WUC),
-putreg(WUS),  putreg(AIT),
-
-[TDLEN]  = set_dlen,   [RDLEN]  = set_dlen,   [TCTL] = set_tctl,
-[TDT]= set_tctl,   [MDIC]   = set_mdic,   [ICS]  = set_ics,
-[TDH]= set_16bit,  [RDH]= set_16bit,  [RDT]  = set_rdt,
-[IMC]= set_imc,[IMS]= set_ims,[ICR]  = set_icr,
-[EECD]   = set_eecd,   [RCTL]   = set_rx_control, [CTRL] = set_ctrl,
-[RDTR]   = set_16bit,  [RADV]   = set_16bit,  [TADV] = set_16bit,
-[ITR]= set_16bit,
+putreg(IPAV), putreg(WUC),
+putreg(WUS),
+
+[TDLEN]  = set_dlen,   [RDLEN]  = set_dlen,   [TCTL]  = set_tctl,
+[TDT]= set_tctl,   [MDIC]   = set_mdic,   [ICS]   = set_ics,
+[TDH]= set_16bit,  [RDH]= set_16bit,  [RDT]   = set_rdt,
+[IMC]= set_imc,[IMS]= set_ims,[ICR]   = set_icr,
+[EECD]   = set_eecd,   [RCTL]   = set_rx_control, [CTRL]  = set_ctrl,
+[RDTR]   = set_16bit,  [RADV]   = set_16bit,  [TADV]  = set_16bit,
+[ITR]

[PATCH v5 24/29] hw/net/net_tx_pkt: Implement TCP segmentation

2023-01-31 Thread Akihiko Odaki

There was no proper implementation of TCP segmentation before this
change, and net_tx_pkt relied solely on IPv4 fragmentation. Not only
is this not aligned with the specification, but it also resulted in
corrupted IPv6 packets.

This is particularly problematic for igb, a newly proposed device
implementation; igb provides a loopback feature for VMDq, and the feature
relies on software segmentation.

Implement proper TCP segmentation in net_tx_pkt to fix such a scenario.
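For readers unfamiliar with software TCP segmentation, the following
standalone sketch shows the general technique: the payload is cut into
MSS-sized pieces, each piece gets an advanced sequence number, and FIN/PSH
are kept only on the last piece. This is an illustration under stated
assumptions, not the code of this patch; ToySegment, toy_tcp_segment() and
the TOY_TH_* constants are hypothetical names, and header checksums and IP
length fixups are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TCP flag bits as defined by the TCP header layout. */
#define TOY_TH_FIN  0x01
#define TOY_TH_PUSH 0x08

typedef struct {
    uint32_t seq;     /* sequence number of this segment */
    size_t   offset;  /* payload offset within the original packet */
    size_t   len;     /* payload bytes carried by this segment */
    uint8_t  flags;   /* TCP flags for this segment */
} ToySegment;

/*
 * Split `payload_len` bytes into segments of at most `mss` bytes.
 * Returns the number of segments written to `out` (at most `max_out`).
 */
static size_t toy_tcp_segment(uint32_t seq, uint8_t flags,
                              size_t payload_len, size_t mss,
                              ToySegment *out, size_t max_out)
{
    size_t n = 0;
    size_t offset = 0;

    assert(mss > 0);

    while (offset < payload_len && n < max_out) {
        size_t len = payload_len - offset;
        int last;

        if (len > mss) {
            len = mss;
        }
        last = (offset + len == payload_len);

        out[n].seq = seq + (uint32_t)offset;
        out[n].offset = offset;
        out[n].len = len;
        /* Only the final segment may carry FIN and PSH. */
        out[n].flags = last ? flags
                            : (uint8_t)(flags & ~(TOY_TH_FIN | TOY_TH_PUSH));

        offset += len;
        n++;
    }

    return n;
}
```

Unlike IPv4 fragmentation, every segment produced this way is a complete
TCP packet, which is why the approach also works for IPv6.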

Signed-off-by: Akihiko Odaki 
---
 hw/net/net_tx_pkt.c | 248 
 include/net/eth.h   |   5 -
 net/eth.c   |  27 -
 3 files changed, 206 insertions(+), 74 deletions(-)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 6afd3f6693..4a35e8429d 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -326,7 +326,8 @@ bool net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool 
tso_enable,
 case VIRTIO_NET_HDR_GSO_TCPV6:
 bytes_read = iov_to_buf(&pkt->vec[NET_TX_PKT_PL_START_FRAG],
 pkt->payload_frags, 0, &l4hdr, sizeof(l4hdr));
-if (bytes_read < sizeof(l4hdr)) {
+if (bytes_read < sizeof(l4hdr) ||
+l4hdr.th_off * sizeof(uint32_t) < sizeof(l4hdr)) {
 return false;
 }
 
@@ -466,15 +467,14 @@ void net_tx_pkt_reset(struct NetTxPkt *pkt)
 pkt->l4proto = 0;
 }
 
-static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
+static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt,
+  struct iovec *iov, uint32_t iov_len,
+  uint16_t csl)
 {
-struct iovec *iov = &pkt->vec[NET_TX_PKT_L2HDR_FRAG];
 uint32_t csum_cntr;
 uint16_t csum = 0;
 uint32_t cso;
 /* num of iovec without vhdr */
-uint32_t iov_len = pkt->payload_frags + NET_TX_PKT_PL_START_FRAG - 1;
-uint16_t csl;
 size_t csum_offset = pkt->virt_hdr.csum_start + pkt->virt_hdr.csum_offset;
 uint16_t l3_proto = eth_get_l3_proto(iov, 1, iov->iov_len);
 
@@ -482,8 +482,6 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
 iov_from_buf(iov, iov_len, csum_offset, &csum, sizeof csum);
 
 /* Calculate L4 TCP/UDP checksum */
-csl = pkt->payload_len;
-
 csum_cntr = 0;
 cso = 0;
 /* add pseudo header to csum */
@@ -509,14 +507,13 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
 #define NET_MAX_FRAG_SG_LIST (64)
 
 static size_t net_tx_pkt_fetch_fragment(struct NetTxPkt *pkt,
-int *src_idx, size_t *src_offset, struct iovec *dst, int *dst_idx)
+int *src_idx, size_t *src_offset, size_t src_len,
+struct iovec *dst, int *dst_idx)
 {
 size_t fetched = 0;
 struct iovec *src = pkt->vec;
 
-*dst_idx = NET_TX_PKT_PL_START_FRAG;
-
-while (fetched < IP_FRAG_ALIGN_SIZE(pkt->virt_hdr.gso_size)) {
+while (fetched < src_len) {
 
 /* no more place in fragment iov */
 if (*dst_idx == NET_MAX_FRAG_SG_LIST) {
@@ -531,7 +528,7 @@ static size_t net_tx_pkt_fetch_fragment(struct NetTxPkt 
*pkt,
 
 dst[*dst_idx].iov_base = src[*src_idx].iov_base + *src_offset;
 dst[*dst_idx].iov_len = MIN(src[*src_idx].iov_len - *src_offset,
-IP_FRAG_ALIGN_SIZE(pkt->virt_hdr.gso_size) - fetched);
+src_len - fetched);
 
 *src_offset += dst[*dst_idx].iov_len;
 fetched += dst[*dst_idx].iov_len;
@@ -560,58 +557,223 @@ static void net_tx_pkt_sendv(
 }
 }
 
+static bool net_tx_pkt_tcp_fragment_init(struct NetTxPkt *pkt,
+ struct iovec *fragment,
+ int *pl_idx,
+ size_t *l4hdr_len,
+ int *src_idx,
+ size_t *src_offset,
+ size_t *src_len)
+{
+struct iovec *l4 = fragment + NET_TX_PKT_PL_START_FRAG;
+size_t bytes_read = 0;
+struct tcp_hdr *th;
+
+if (!pkt->payload_frags) {
+return false;
+}
+
+l4->iov_len = pkt->virt_hdr.hdr_len - pkt->hdr_len;
+l4->iov_base = g_malloc(l4->iov_len);
+
+*src_idx = NET_TX_PKT_PL_START_FRAG;
+while (pkt->vec[*src_idx].iov_len < l4->iov_len - bytes_read) {
+memcpy((char *)l4->iov_base + bytes_read, pkt->vec[*src_idx].iov_base,
+   pkt->vec[*src_idx].iov_len);
+
+bytes_read += pkt->vec[*src_idx].iov_len;
+
+(*src_idx)++;
+if (*src_idx >= pkt->payload_frags + NET_TX_PKT_PL_START_FRAG) {
+g_free(l4->iov_base);
+return false;
+}
+}
+
+*src_offset = l4->iov_len - bytes_read;
+memcpy((char *)l4->iov_base + bytes_read, pkt->vec[*src_idx].iov_base,
+   *src_offset);
+
+th = l4->iov_base;
+th->th_flags &= ~(TH_FIN | TH_PUSH);
+
+*pl_idx = NET_TX_PKT_PL_START_FRAG + 1;
+*l4hdr_len = l4->iov_len;
+*src_len = pkt->virt_hdr.gso_size;
+
+return true;
+}
+
+static void

[PATCH v5 23/29] e1000e: Perform software segmentation for loopback

2023-01-31 Thread Akihiko Odaki

e1000e didn't perform software segmentation for loopback if the virtio-net
header is enabled, which is wrong.

To fix the problem, introduce net_tx_pkt_send_custom(), which allows the
caller to specify whether offloading should be assumed or not.

net_tx_pkt_send_custom() also allows the caller to provide a custom
sending function. Packets with virtio-net headers and ones without
virtio-net headers will be provided at the same time so the function
can choose the preferred version. In the case of e1000e loopback, it prefers
to have virtio-net headers as they allow skipping the checksum
verification if VIRTIO_NET_HDR_F_DATA_VALID is set.
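The callback arrangement described above can be sketched in isolation as
follows. This is a simplified model, not the net_tx_pkt implementation:
toy_send_custom(), ToySendFunc, ToyLoopback and toy_loopback_cb() are
hypothetical names, and the sketch assumes the first iovec element holds the
virtio-net header so that both views can share one array.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Callback that is offered both views of the same packet. */
typedef void (*ToySendFunc)(void *opaque,
                            const struct iovec *iov, int iovcnt,
                            const struct iovec *virt_iov, int virt_iovcnt);

/*
 * vec[0] is assumed to be the virtio-net header and vec[1..] the frame.
 * The view without the header simply skips the first element.
 */
static void toy_send_custom(const struct iovec *vec, int cnt,
                            ToySendFunc cb, void *opaque)
{
    assert(cnt >= 1);
    cb(opaque, vec + 1, cnt - 1, vec, cnt);
}

typedef struct {
    int seen_iovcnt;   /* element count of the view the callback used */
    size_t first_len;  /* length of the first element of that view */
} ToyLoopback;

/* A loopback-style consumer that prefers the view with the header. */
static void toy_loopback_cb(void *opaque,
                            const struct iovec *iov, int iovcnt,
                            const struct iovec *virt_iov, int virt_iovcnt)
{
    ToyLoopback *lb = opaque;

    (void)iov;
    (void)iovcnt;
    lb->seen_iovcnt = virt_iovcnt;
    lb->first_len = virt_iov[0].iov_len;
}
```

Handing both views to the callback leaves the choice to the consumer, so a
sender that wants raw frames can ignore the header view at no extra cost.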

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 27 ++--
 hw/net/net_rx_pkt.c  |  7 
 hw/net/net_rx_pkt.h  |  8 +
 hw/net/net_tx_pkt.c  | 76 +---
 hw/net/net_tx_pkt.h  | 21 ++--
 5 files changed, 88 insertions(+), 51 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 95245c42f5..ff93547f88 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -61,6 +61,10 @@ union e1000_rx_desc_union {
 union e1000_rx_desc_packet_split packet_split;
 };
 
+static ssize_t
+e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
+bool has_vnet);
+
 static inline void
 e1000e_set_interrupt_cause(E1000ECore *core, uint32_t val);
 
@@ -655,6 +659,15 @@ e1000e_setup_tx_offloads(E1000ECore *core, struct 
e1000e_tx *tx)
 return true;
 }
 
+static void e1000e_tx_pkt_callback(void *core,
+   const struct iovec *iov,
+   int iovcnt,
+   const struct iovec *virt_iov,
+   int virt_iovcnt)
+{
+e1000e_receive_internal(core, virt_iov, virt_iovcnt, true);
+}
+
 static bool
 e1000e_tx_pkt_send(E1000ECore *core, struct e1000e_tx *tx, int queue_index)
 {
@@ -669,7 +682,8 @@ e1000e_tx_pkt_send(E1000ECore *core, struct e1000e_tx *tx, 
int queue_index)
 
 if ((core->phy[0][MII_BMCR] & MII_BMCR_LOOPBACK) ||
 ((core->mac[RCTL] & E1000_RCTL_LBM_MAC) == E1000_RCTL_LBM_MAC)) {
-return net_tx_pkt_send_loopback(tx->tx_pkt, queue);
+return net_tx_pkt_send_custom(tx->tx_pkt, false,
+  e1000e_tx_pkt_callback, core);
 } else {
 return net_tx_pkt_send(tx->tx_pkt, queue);
 }
@@ -1674,6 +1688,13 @@ e1000e_rx_fix_l4_csum(E1000ECore *core, struct NetRxPkt 
*pkt)
 
 ssize_t
 e1000e_receive_iov(E1000ECore *core, const struct iovec *iov, int iovcnt)
+{
+return e1000e_receive_internal(core, iov, iovcnt, core->has_vnet);
+}
+
+static ssize_t
+e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
+bool has_vnet)
 {
 static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
 
@@ -1696,9 +1717,11 @@ e1000e_receive_iov(E1000ECore *core, const struct iovec 
*iov, int iovcnt)
 }
 
 /* Pull virtio header in */
-if (core->has_vnet) {
+if (has_vnet) {
 net_rx_pkt_set_vhdr_iovec(core->rx_pkt, iov, iovcnt);
 iov_ofs = sizeof(struct virtio_net_hdr);
+} else {
+net_rx_pkt_unset_vhdr(core->rx_pkt);
 }
 
 filter_buf = iov->iov_base + iov_ofs;
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index b309c2f476..a53e7561c5 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -463,6 +463,13 @@ void net_rx_pkt_set_vhdr_iovec(struct NetRxPkt *pkt,
 iov_to_buf(iov, iovcnt, 0, &pkt->virt_hdr, sizeof pkt->virt_hdr);
 }
 
+void net_rx_pkt_unset_vhdr(struct NetRxPkt *pkt)
+{
+assert(pkt);
+
+memset(&pkt->virt_hdr, 0, sizeof(pkt->virt_hdr));
+}
+
 bool net_rx_pkt_is_vlan_stripped(struct NetRxPkt *pkt)
 {
 assert(pkt);
diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
index 7277907a22..8b69ddb2da 100644
--- a/hw/net/net_rx_pkt.h
+++ b/hw/net/net_rx_pkt.h
@@ -312,6 +312,14 @@ void net_rx_pkt_set_vhdr(struct NetRxPkt *pkt,
 void net_rx_pkt_set_vhdr_iovec(struct NetRxPkt *pkt,
 const struct iovec *iov, int iovcnt);
 
+/**
+ * unset vhdr data from packet context
+ *
+ * @pkt:packet
+ *
+ */
+void net_rx_pkt_unset_vhdr(struct NetRxPkt *pkt);
+
 /**
  * save packet type in packet context
  *
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index cf46c8457f..6afd3f6693 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -53,8 +53,6 @@ struct NetTxPkt {
 uint16_t hdr_len;
 eth_pkt_types_e packet_type;
 uint8_t l4proto;
-
-bool is_loopback;
 };
 
 void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
@@ -508,12 +506,6 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
 iov_from_buf(iov, iov_len, csum_offset, &csum, sizeof csum);
 }
 
-enum {
-NET_TX_PKT_FRAGMENT_L2_HDR_POS = 0,
-NET_TX_PKT_FRAGMENT_L3_HDR_POS,
-NET_TX_PKT_FRAGMENT_HEADER_NUM
-};
-
 #define NET_MAX_FRAG_SG_LIST (64)
 
 static

[PATCH v5 21/29] hw/net/net_tx_pkt: Automatically determine if virtio-net header is used

2023-01-31 Thread Akihiko Odaki

The new function qemu_get_using_vnet_hdr() makes it possible to determine
automatically whether a virtio-net header is used.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c |  3 +--
 hw/net/net_tx_pkt.c  | 19 ++-
 hw/net/net_tx_pkt.h  |  3 +--
 hw/net/vmxnet3.c |  6 ++
 4 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 38d374fba3..954a007151 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -3376,8 +3376,7 @@ e1000e_core_pci_realize(E1000ECore *core,
 qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
 
 for (i = 0; i < E1000E_NUM_QUEUES; i++) {
-net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
-E1000E_MAX_TX_FRAGS, core->has_vnet);
+net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner, E1000E_MAX_TX_FRAGS);
 }
 
 net_rx_pkt_init(&core->rx_pkt, core->has_vnet);
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 8a23899a4d..cf46c8457f 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -35,7 +35,6 @@ struct NetTxPkt {
 PCIDevice *pci_dev;
 
 struct virtio_net_hdr virt_hdr;
-bool has_virt_hdr;
 
 struct iovec *raw;
 uint32_t raw_frags;
@@ -59,7 +58,7 @@ struct NetTxPkt {
 };
 
 void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
-uint32_t max_frags, bool has_virt_hdr)
+uint32_t max_frags)
 {
 struct NetTxPkt *p = g_malloc0(sizeof *p);
 
@@ -71,10 +70,8 @@ void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice 
*pci_dev,
 
 p->max_payload_frags = max_frags;
 p->max_raw_frags = max_frags;
-p->has_virt_hdr = has_virt_hdr;
 p->vec[NET_TX_PKT_VHDR_FRAG].iov_base = &p->virt_hdr;
-p->vec[NET_TX_PKT_VHDR_FRAG].iov_len =
-p->has_virt_hdr ? sizeof p->virt_hdr : 0;
+p->vec[NET_TX_PKT_VHDR_FRAG].iov_len = sizeof p->virt_hdr;
 p->vec[NET_TX_PKT_L2HDR_FRAG].iov_base = &p->l2_hdr;
 p->vec[NET_TX_PKT_L3HDR_FRAG].iov_base = &p->l3_hdr;
 
@@ -617,9 +614,11 @@ static bool net_tx_pkt_do_sw_fragmentation(struct NetTxPkt 
*pkt,
 
 bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc)
 {
+bool using_vnet_hdr = qemu_get_using_vnet_hdr(nc->peer);
+
 assert(pkt);
 
-if (!pkt->has_virt_hdr &&
+if (!using_vnet_hdr &&
 pkt->virt_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
 net_tx_pkt_do_sw_csum(pkt);
 }
@@ -636,11 +635,13 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState 
*nc)
 }
 }
 
-if (pkt->has_virt_hdr ||
+if (using_vnet_hdr ||
 pkt->virt_hdr.gso_type == VIRTIO_NET_HDR_GSO_NONE) {
+int index = using_vnet_hdr ?
+NET_TX_PKT_VHDR_FRAG : NET_TX_PKT_L2HDR_FRAG;
 net_tx_pkt_fix_ip6_payload_len(pkt);
-net_tx_pkt_sendv(pkt, nc, pkt->vec,
-pkt->payload_frags + NET_TX_PKT_PL_START_FRAG);
+net_tx_pkt_sendv(pkt, nc, pkt->vec + index,
+pkt->payload_frags + NET_TX_PKT_PL_START_FRAG - index);
 return true;
 }
 
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index 2e38a5fa69..8d3faa42fb 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -32,10 +32,9 @@ struct NetTxPkt;
  * @pkt:packet pointer
  * @pci_dev:PCI device processing this packet
  * @max_frags:  max tx ip fragments
- * @has_virt_hdr:   device uses virtio header.
  */
 void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
-uint32_t max_frags, bool has_virt_hdr);
+uint32_t max_frags);
 
 /**
  * Clean all tx packet resources.
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index c63bbb59bd..8c3f5d6e14 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -1521,8 +1521,7 @@ static void vmxnet3_activate_device(VMXNET3State *s)
 
 /* Preallocate TX packet wrapper */
 VMW_CFPRN("Max TX fragments is %u", s->max_tx_frags);
-net_tx_pkt_init(&s->tx_pkt, PCI_DEVICE(s),
-s->max_tx_frags, s->peer_has_vhdr);
+net_tx_pkt_init(&s->tx_pkt, PCI_DEVICE(s), s->max_tx_frags);
 net_rx_pkt_init(&s->rx_pkt, s->peer_has_vhdr);
 
 /* Read rings memory locations for RX queues */
@@ -2402,8 +2401,7 @@ static int vmxnet3_post_load(void *opaque, int version_id)
 {
 VMXNET3State *s = opaque;
 
-net_tx_pkt_init(&s->tx_pkt, PCI_DEVICE(s),
-s->max_tx_frags, s->peer_has_vhdr);
+net_tx_pkt_init(&s->tx_pkt, PCI_DEVICE(s), s->max_tx_frags);
 net_rx_pkt_init(&s->rx_pkt, s->peer_has_vhdr);
 
 if (s->msix_used) {
-- 
2.39.1
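
For illustration only (not part of the patch): a minimal sketch of the
index arithmetic used in net_tx_pkt_send() above, assuming a fragment
layout in which the virtio-net header occupies slot 0. The enum values and
the helper select_send_vec() are hypothetical names invented for this
sketch; only the "start at the L2 header when the peer does not use the
virtio-net header" idea comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical fragment layout, modelled on the NET_TX_PKT_*_FRAG idea. */
enum {
    FRAG_VHDR = 0,      /* virtio-net header */
    FRAG_L2HDR,         /* Ethernet header */
    FRAG_L3HDR,         /* IP header */
    FRAG_PL_START,      /* first payload fragment */
};

/*
 * Hypothetical helper: pick the first fragment to send and the number of
 * fragments that follow it. When the peer does not use the virtio-net
 * header, the vector starts at the L2 header and is one entry shorter.
 */
static const struct iovec *select_send_vec(const struct iovec *vec,
                                           int payload_frags,
                                           bool using_vnet_hdr,
                                           int *out_cnt)
{
    int index = using_vnet_hdr ? FRAG_VHDR : FRAG_L2HDR;

    assert(vec && out_cnt && payload_frags >= 0);

    *out_cnt = payload_frags + FRAG_PL_START - index;
    return vec + index;
}
```

With two payload fragments, the helper yields five entries starting at the
virtio-net header when the peer uses it, and four entries starting at the
L2 header otherwise.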

[PATCH v5 17/29] e1000e: Remove extra pointer indirection

2023-01-31 Thread Akihiko Odaki

e1000e_write_packet_to_guest() passes a reference to the variable ba as a
pointer to an array, and that pointer indirection is unnecessary: none of
the functions that receive the reference perform any pointer operation
on it; they simply dereference it. Remove the extra pointer
indirection.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000e_core.c | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 736708407c..d143f2ae6f 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1075,31 +1075,31 @@ e1000e_read_ext_rx_descr(E1000ECore *core, union 
e1000_rx_desc_extended *desc,
 static inline void
 e1000e_read_ps_rx_descr(E1000ECore *core,
 union e1000_rx_desc_packet_split *desc,
-hwaddr (*buff_addr)[MAX_PS_BUFFERS])
+hwaddr buff_addr[MAX_PS_BUFFERS])
 {
 int i;
 
 for (i = 0; i < MAX_PS_BUFFERS; i++) {
-(*buff_addr)[i] = le64_to_cpu(desc->read.buffer_addr[i]);
+buff_addr[i] = le64_to_cpu(desc->read.buffer_addr[i]);
 }
 
-trace_e1000e_rx_desc_ps_read((*buff_addr)[0], (*buff_addr)[1],
- (*buff_addr)[2], (*buff_addr)[3]);
+trace_e1000e_rx_desc_ps_read(buff_addr[0], buff_addr[1],
+ buff_addr[2], buff_addr[3]);
 }
 
 static inline void
 e1000e_read_rx_descr(E1000ECore *core, union e1000_rx_desc_union *desc,
- hwaddr (*buff_addr)[MAX_PS_BUFFERS])
+ hwaddr buff_addr[MAX_PS_BUFFERS])
 {
 if (e1000e_rx_use_legacy_descriptor(core)) {
-e1000e_read_lgcy_rx_descr(core, &desc->legacy, &(*buff_addr)[0]);
-(*buff_addr)[1] = (*buff_addr)[2] = (*buff_addr)[3] = 0;
+e1000e_read_lgcy_rx_descr(core, &desc->legacy, &buff_addr[0]);
+buff_addr[1] = buff_addr[2] = buff_addr[3] = 0;
 } else {
 if (core->mac[RCTL] & E1000_RCTL_DTYP_PS) {
 e1000e_read_ps_rx_descr(core, &desc->packet_split, buff_addr);
 } else {
-e1000e_read_ext_rx_descr(core, &desc->extended, &(*buff_addr)[0]);
-(*buff_addr)[1] = (*buff_addr)[2] = (*buff_addr)[3] = 0;
+e1000e_read_ext_rx_descr(core, &desc->extended, &buff_addr[0]);
+buff_addr[1] = buff_addr[2] = buff_addr[3] = 0;
 }
 }
 }
@@ -1420,14 +1420,14 @@ typedef struct e1000e_ba_state_st {
 
 static inline void
 e1000e_write_hdr_to_rx_buffers(E1000ECore *core,
-   hwaddr (*ba)[MAX_PS_BUFFERS],
+   hwaddr ba[MAX_PS_BUFFERS],
e1000e_ba_state *bastate,
const char *data,
dma_addr_t data_len)
 {
 assert(data_len <= core->rxbuf_sizes[0] - bastate->written[0]);
 
-pci_dma_write(core->owner, (*ba)[0] + bastate->written[0], data, data_len);
+pci_dma_write(core->owner, ba[0] + bastate->written[0], data, data_len);
 bastate->written[0] += data_len;
 
 bastate->cur_idx = 1;
@@ -1435,7 +1435,7 @@ e1000e_write_hdr_to_rx_buffers(E1000ECore *core,
 
 static void
 e1000e_write_to_rx_buffers(E1000ECore *core,
-   hwaddr (*ba)[MAX_PS_BUFFERS],
+   hwaddr ba[MAX_PS_BUFFERS],
e1000e_ba_state *bastate,
const char *data,
dma_addr_t data_len)
@@ -1447,13 +1447,13 @@ e1000e_write_to_rx_buffers(E1000ECore *core,
 uint32_t bytes_to_write = MIN(data_len, cur_buf_bytes_left);
 
 trace_e1000e_rx_desc_buff_write(bastate->cur_idx,
-(*ba)[bastate->cur_idx],
+ba[bastate->cur_idx],
 bastate->written[bastate->cur_idx],
 data,
 bytes_to_write);
 
 pci_dma_write(core->owner,
-(*ba)[bastate->cur_idx] + bastate->written[bastate->cur_idx],
+ba[bastate->cur_idx] + bastate->written[bastate->cur_idx],
 data, bytes_to_write);
 
 bastate->written[bastate->cur_idx] += bytes_to_write;
@@ -1577,7 +1577,7 @@ e1000e_write_packet_to_guest(E1000ECore *core, struct 
NetRxPkt *pkt,
 
 trace_e1000e_rx_descr(rxi->idx, base, core->rx_desc_len);
 
-e1000e_read_rx_descr(core, &desc, &ba);
+e1000e_read_rx_descr(core, &desc, ba);
 
 if (ba[0]) {
 if (desc_offset < size) {
@@ -1596,7 +1596,7 @@ e1000e_write_packet_to_guest(E1000ECore *core, struct 
NetRxPkt *pkt,
 iov_copy = MIN(ps_hdr_len - ps_hdr_copied,
iov->iov_len - iov_ofs);
 
-e1000e_write_hdr_to_rx_buffers(core, , ,
+

[PATCH v5 13/29] e1000: Configure ResettableClass

2023-01-31 Thread Akihiko Odaki

This is part of recent efforts of refactoring e1000 and e1000e.

DeviceClass's reset member is deprecated so migrate to ResettableClass.
There is no behavioral difference.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 3353a3752c..c81d914a02 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -377,9 +377,9 @@ static bool e1000_vet_init_need(void *opaque)
 return chkflag(VET);
 }
 
-static void e1000_reset(void *opaque)
+static void e1000_reset_hold(Object *obj)
 {
-E1000State *d = opaque;
+E1000State *d = E1000(obj);
 E1000BaseClass *edc = E1000_GET_CLASS(d);
 uint8_t *macaddr = d->conf.macaddr.a;
 
@@ -1731,12 +1731,6 @@ static void pci_e1000_realize(PCIDevice *pci_dev, Error 
**errp)
 e1000_flush_queue_timer, d);
 }
 
-static void qdev_e1000_reset(DeviceState *dev)
-{
-E1000State *d = E1000(dev);
-e1000_reset(d);
-}
-
 static Property e1000_properties[] = {
 DEFINE_NIC_PROPERTIES(E1000State, conf),
 DEFINE_PROP_BIT("autonegotiation", E1000State,
@@ -1762,6 +1756,7 @@ typedef struct E1000Info {
 static void e1000_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
+ResettableClass *rc = RESETTABLE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 E1000BaseClass *e = E1000_CLASS(klass);
 const E1000Info *info = data;
@@ -1774,9 +1769,9 @@ static void e1000_class_init(ObjectClass *klass, void 
*data)
 k->revision = info->revision;
 e->phy_id2 = info->phy_id2;
 k->class_id = PCI_CLASS_NETWORK_ETHERNET;
+rc->phases.hold = e1000_reset_hold;
 set_bit(DEVICE_CATEGORY_NETWORK, dc->categories);
 dc->desc = "Intel Gigabit Ethernet";
-dc->reset = qdev_e1000_reset;
 dc->vmsd = &vmstate_e1000;
 device_class_set_props(dc, e1000_properties);
 }
-- 
2.39.1

[PATCH v5 10/29] e1000e: Use memcpy to initialize registers

2023-01-31 Thread Akihiko Odaki

Use memcpy instead of memmove to initialize registers. The initial
register templates and register table instances will never overlap.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 6a4da72bd3..87f964cdc1 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -3511,9 +3511,9 @@ e1000e_core_reset(E1000ECore *core)
 e1000e_intrmgr_reset(core);
 
 memset(core->phy, 0, sizeof core->phy);
-memmove(core->phy, e1000e_phy_reg_init, sizeof e1000e_phy_reg_init);
+memcpy(core->phy, e1000e_phy_reg_init, sizeof e1000e_phy_reg_init);
 memset(core->mac, 0, sizeof core->mac);
-memmove(core->mac, e1000e_mac_reg_init, sizeof e1000e_mac_reg_init);
+memcpy(core->mac, e1000e_mac_reg_init, sizeof e1000e_mac_reg_init);
 
 core->rxbuf_min_shift = 1 + E1000_RING_DESC_LEN_SHIFT;
 
-- 
2.39.1
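
For illustration only (not part of the patch): a minimal sketch of the
reset pattern the commit message describes, assuming a constant template
table and a separate per-device register array. Because the two objects
are distinct, the regions cannot overlap, which is the precondition
memcpy() needs. The names reg_init, struct dev and dev_reset() are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NUM_REGS = 8 };

/* Hypothetical template: initial values for the first three registers. */
static const uint16_t reg_init[] = { 0x1140, 0x0000, 0x0141 };

struct dev {
    uint16_t regs[NUM_REGS];
};

/*
 * Reset the register file: clear everything, then copy the template.
 * The template is a separate constant object, so source and destination
 * never overlap and memcpy() is sufficient; memmove() is only required
 * when the regions may overlap.
 */
static void dev_reset(struct dev *d)
{
    assert(d);
    assert(sizeof reg_init <= sizeof d->regs);

    memset(d->regs, 0, sizeof d->regs);
    memcpy(d->regs, reg_init, sizeof reg_init);
}
```

Registers not covered by the template keep the zero value written by
memset(), as in e1000e_core_reset().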

[PATCH v5 18/29] net: Check L4 header size

2023-01-31 Thread Akihiko Odaki

net_tx_pkt_build_vheader() inspects the TCP header but did not check
the header size, resulting in undefined behavior. Check the header
size and drop the packet if the header is too small.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 19 ++-
 hw/net/net_tx_pkt.c  | 13 ++---
 hw/net/net_tx_pkt.h  |  3 ++-
 hw/net/vmxnet3.c | 14 +++---
 4 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index d143f2ae6f..38d374fba3 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -629,23 +629,30 @@ e1000e_rss_parse_packet(E1000ECore *core,
 info->queue = E1000_RSS_QUEUE(&core->mac[RETA], info->hash);
 }
 
-static void
+static bool
 e1000e_setup_tx_offloads(E1000ECore *core, struct e1000e_tx *tx)
 {
 if (tx->props.tse && tx->cptse) {
-net_tx_pkt_build_vheader(tx->tx_pkt, true, true, tx->props.mss);
+if (!net_tx_pkt_build_vheader(tx->tx_pkt, true, true, tx->props.mss)) {
+return false;
+}
+
 net_tx_pkt_update_ip_checksums(tx->tx_pkt);
 e1000x_inc_reg_if_not_full(core->mac, TSCTC);
-return;
+return true;
 }
 
 if (tx->sum_needed & E1000_TXD_POPTS_TXSM) {
-net_tx_pkt_build_vheader(tx->tx_pkt, false, true, 0);
+if (!net_tx_pkt_build_vheader(tx->tx_pkt, false, true, 0)) {
+return false;
+}
 }
 
 if (tx->sum_needed & E1000_TXD_POPTS_IXSM) {
 net_tx_pkt_update_ip_hdr_checksum(tx->tx_pkt);
 }
+
+return true;
 }
 
 static bool
@@ -654,7 +661,9 @@ e1000e_tx_pkt_send(E1000ECore *core, struct e1000e_tx *tx, 
int queue_index)
 int target_queue = MIN(core->max_queue_num, queue_index);
 NetClientState *queue = qemu_get_subqueue(core->owner_nic, target_queue);
 
-e1000e_setup_tx_offloads(core, tx);
+if (!e1000e_setup_tx_offloads(core, tx)) {
+return false;
+}
 
 net_tx_pkt_dump(tx->tx_pkt);
 
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 2533ea2700..8a23899a4d 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -304,10 +304,11 @@ func_exit:
 return rc;
 }
 
-void net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool tso_enable,
+bool net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool tso_enable,
 bool csum_enable, uint32_t gso_size)
 {
 struct tcp_hdr l4hdr;
+size_t bytes_read;
 assert(pkt);
 
 /* csum has to be enabled if tso is. */
@@ -328,8 +329,12 @@ void net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool 
tso_enable,
 
 case VIRTIO_NET_HDR_GSO_TCPV4:
 case VIRTIO_NET_HDR_GSO_TCPV6:
-iov_to_buf(&pkt->vec[NET_TX_PKT_PL_START_FRAG], pkt->payload_frags,
-   0, &l4hdr, sizeof(l4hdr));
+bytes_read = iov_to_buf(&pkt->vec[NET_TX_PKT_PL_START_FRAG],
+pkt->payload_frags, 0, &l4hdr, sizeof(l4hdr));
+if (bytes_read < sizeof(l4hdr)) {
+return false;
+}
+
 pkt->virt_hdr.hdr_len = pkt->hdr_len + l4hdr.th_off * sizeof(uint32_t);
 pkt->virt_hdr.gso_size = gso_size;
 break;
@@ -354,6 +359,8 @@ void net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool 
tso_enable,
 break;
 }
 }
+
+return true;
 }
 
 void net_tx_pkt_setup_vlan_header_ex(struct NetTxPkt *pkt,
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index 4ec8bbe9bd..2e38a5fa69 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -59,9 +59,10 @@ struct virtio_net_hdr *net_tx_pkt_get_vhdr(struct NetTxPkt 
*pkt);
  * @tso_enable: TSO enabled
  * @csum_enable:CSO enabled
  * @gso_size:   MSS size for TSO
+ * @ret:operation result
  *
  */
-void net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool tso_enable,
+bool net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool tso_enable,
 bool csum_enable, uint32_t gso_size);
 
 /**
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index d2ab527ef4..c63bbb59bd 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -440,19 +440,19 @@ vmxnet3_setup_tx_offloads(VMXNET3State *s)
 {
 switch (s->offload_mode) {
 case VMXNET3_OM_NONE:
-net_tx_pkt_build_vheader(s->tx_pkt, false, false, 0);
-break;
+return net_tx_pkt_build_vheader(s->tx_pkt, false, false, 0);
 
 case VMXNET3_OM_CSUM:
-net_tx_pkt_build_vheader(s->tx_pkt, false, true, 0);
 VMW_PKPRN("L4 CSO requested\n");
-break;
+return net_tx_pkt_build_vheader(s->tx_pkt, false, true, 0);
 
 case VMXNET3_OM_TSO:
-net_tx_pkt_build_vheader(s->tx_pkt, true, true,
-s->cso_or_gso_size);
-net_tx_pkt_update_ip_checksums(s->tx_pkt);
 VMW_PKPRN("GSO offload requested.");
+if (!net_tx_pkt_build_vheader(s->tx_pkt, true, true,
+s->cso_or_gso_size)) {
+return false;
+}
+net_tx_pkt_update_ip_checksums(s->tx_pkt);
 break;
 
 default:
-- 
2.39.1
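
For illustration only (not part of the patch): a minimal sketch of the
"read, then compare the byte count" check, assuming a scatter-gather
payload. gather() is a simplified, hypothetical stand-in for QEMU's
iov_to_buf() (it always starts at offset 0), and struct mini_tcp_hdr is a
raw 20-byte view rather than the real struct tcp_hdr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Copy up to len bytes from the start of iov into buf; return bytes copied. */
static size_t gather(const struct iovec *iov, int cnt, void *buf, size_t len)
{
    size_t done = 0;

    for (int i = 0; i < cnt && done < len; i++) {
        size_t left = len - done;
        size_t n = iov[i].iov_len < left ? iov[i].iov_len : left;

        if (n > 0) {
            memcpy((uint8_t *)buf + done, iov[i].iov_base, n);
            done += n;
        }
    }
    return done;
}

/* Raw view of the fixed 20-byte part of a TCP header. */
struct mini_tcp_hdr {
    uint8_t bytes[20];
};

/*
 * Compute the TCP header length from the data-offset field. Returns false
 * (and leaves *hdr_len untouched) when the payload is too short to hold a
 * TCP header, so the caller can drop the packet instead of reading
 * uninitialized bytes.
 */
static bool tcp_header_len(const struct iovec *iov, int cnt, size_t *hdr_len)
{
    struct mini_tcp_hdr h;

    if (gather(iov, cnt, &h, sizeof h) < sizeof h) {
        return false;
    }

    /* Data offset: upper four bits of byte 12, counted in 32-bit words. */
    *hdr_len = (size_t)(h.bytes[12] >> 4) * sizeof(uint32_t);
    return true;
}
```

A caller would propagate the false return upward, much as
e1000e_setup_tx_offloads() does in the patch.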

[PATCH v5 11/29] e1000e: Remove pending interrupt flags

2023-01-31 Thread Akihiko Odaki

They duplicate the running flags of the throttling timers and are
incomplete, as they are not cleared when the interrupts are fired or the
device is reset.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e.c  |  5 ++---
 hw/net/e1000e_core.c | 19 +++
 hw/net/e1000e_core.h |  2 --
 hw/net/trace-events  |  2 --
 4 files changed, 5 insertions(+), 23 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index d591d01c07..0bc222d354 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -631,12 +631,11 @@ static const VMStateDescription e1000e_vmstate = {
 VMSTATE_E1000E_INTR_DELAY_TIMER(core.tidv, E1000EState),
 
 VMSTATE_E1000E_INTR_DELAY_TIMER(core.itr, E1000EState),
-VMSTATE_BOOL(core.itr_intr_pending, E1000EState),
+VMSTATE_UNUSED(1),
 
 VMSTATE_E1000E_INTR_DELAY_TIMER_ARRAY(core.eitr, E1000EState,
   E1000E_MSIX_VEC_NUM),
-VMSTATE_BOOL_ARRAY(core.eitr_intr_pending, E1000EState,
-   E1000E_MSIX_VEC_NUM),
+VMSTATE_UNUSED(E1000E_MSIX_VEC_NUM),
 
 VMSTATE_UINT32(core.itr_guest_value, E1000EState),
 VMSTATE_UINT32_ARRAY(core.eitr_guest_value, E1000EState,
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 87f964cdc1..37aec6a970 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -154,11 +154,6 @@ e1000e_intrmgr_on_throttling_timer(void *opaque)
 
 timer->running = false;
 
-if (!timer->core->itr_intr_pending) {
-trace_e1000e_irq_throttling_no_pending_interrupts();
-return;
-}
-
 if (msi_enabled(timer->core->owner)) {
 trace_e1000e_irq_msi_notify_postponed();
 /* Clear msi_causes_pending to fire MSI eventually */
@@ -180,11 +175,6 @@ e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
 
 timer->running = false;
 
-if (!timer->core->eitr_intr_pending[idx]) {
-trace_e1000e_irq_throttling_no_pending_vec(idx);
-return;
-}
-
 trace_e1000e_irq_msix_notify_postponed_vec(idx);
 msix_notify(timer->core->owner, idx);
 }
@@ -2015,13 +2005,11 @@ e1000e_clear_ims_bits(E1000ECore *core, uint32_t bits)
 }
 
 static inline bool
-e1000e_postpone_interrupt(bool *interrupt_pending,
-   E1000IntrDelayTimer *timer)
+e1000e_postpone_interrupt(E1000IntrDelayTimer *timer)
 {
 if (timer->running) {
 trace_e1000e_irq_postponed_by_xitr(timer->delay_reg << 2);
 
-*interrupt_pending = true;
 return true;
 }
 
@@ -2035,14 +2023,13 @@ e1000e_postpone_interrupt(bool *interrupt_pending,
 static inline bool
 e1000e_itr_should_postpone(E1000ECore *core)
 {
-return e1000e_postpone_interrupt(&core->itr_intr_pending, &core->itr);
+return e1000e_postpone_interrupt(&core->itr);
 }
 
 static inline bool
 e1000e_eitr_should_postpone(E1000ECore *core, int idx)
 {
-return e1000e_postpone_interrupt(&core->eitr_intr_pending[idx],
- &core->eitr[idx]);
+return e1000e_postpone_interrupt(&core->eitr[idx]);
 }
 
 static void
diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
index b8f38c47a0..d0a14b4523 100644
--- a/hw/net/e1000e_core.h
+++ b/hw/net/e1000e_core.h
@@ -95,10 +95,8 @@ struct E1000Core {
 E1000IntrDelayTimer tidv;
 
 E1000IntrDelayTimer itr;
-bool itr_intr_pending;
 
 E1000IntrDelayTimer eitr[E1000E_MSIX_VEC_NUM];
-bool eitr_intr_pending[E1000E_MSIX_VEC_NUM];
 
 VMChangeStateEntry *vmstate;
 
diff --git a/hw/net/trace-events b/hw/net/trace-events
index 4c0ec3fda1..8fa4299704 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -201,10 +201,8 @@ e1000e_rx_metadata_ipv6_filtering_disabled(void) "IPv6 RX 
filtering disabled by
 e1000e_vlan_vet(uint16_t vet) "Setting VLAN ethernet type 0x%X"
 
 e1000e_irq_msi_notify(uint32_t cause) "MSI notify 0x%x"
-e1000e_irq_throttling_no_pending_interrupts(void) "No pending interrupts to 
notify"
 e1000e_irq_msi_notify_postponed(void) "Sending MSI postponed by ITR"
 e1000e_irq_legacy_notify_postponed(void) "Raising legacy IRQ postponed by ITR"
-e1000e_irq_throttling_no_pending_vec(int idx) "No pending interrupts for 
vector %d"
 e1000e_irq_msix_notify_postponed_vec(int idx) "Sending MSI-X postponed by 
EITR[%d]"
 e1000e_irq_legacy_notify(bool level) "IRQ line state: %d"
 e1000e_irq_msix_notify_vec(uint32_t vector) "MSI-X notify vector 0x%x"
-- 
2.39.1

[PATCH v5 19/29] e1000x: Alter the signature of e1000x_is_vlan_packet

2023-01-31 Thread Akihiko Odaki

e1000x_is_vlan_packet() had a pointer to uint8_t as a parameter, but
it does not have to be uint8_t. Change the type to void *.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000x_common.c | 2 +-
 hw/net/e1000x_common.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/e1000x_common.c b/hw/net/e1000x_common.c
index b3bbf31582..e6387dde53 100644
--- a/hw/net/e1000x_common.c
+++ b/hw/net/e1000x_common.c
@@ -47,7 +47,7 @@ bool e1000x_rx_ready(PCIDevice *d, uint32_t *mac)
 return true;
 }
 
-bool e1000x_is_vlan_packet(const uint8_t *buf, uint16_t vet)
+bool e1000x_is_vlan_packet(const void *buf, uint16_t vet)
 {
 uint16_t eth_proto = lduw_be_p(&PKT_GET_ETH_HDR(buf)->h_proto);
 bool res = (eth_proto == vet);
diff --git a/hw/net/e1000x_common.h b/hw/net/e1000x_common.h
index b991d814b1..86a31b69f8 100644
--- a/hw/net/e1000x_common.h
+++ b/hw/net/e1000x_common.h
@@ -178,7 +178,7 @@ uint32_t e1000x_rxbufsize(uint32_t rctl);
 
 bool e1000x_rx_ready(PCIDevice *d, uint32_t *mac);
 
-bool e1000x_is_vlan_packet(const uint8_t *buf, uint16_t vet);
+bool e1000x_is_vlan_packet(const void *buf, uint16_t vet);
 
 bool e1000x_rx_group_filter(uint32_t *mac, const uint8_t *buf);
 
-- 
2.39.1
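
For illustration only (not part of the patch): a minimal sketch of why a
const void * parameter is convenient for a frame inspector. Callers can
pass a byte buffer, a char buffer, or a header struct without casting. The
helper names load_be16() and is_vlan_frame() are hypothetical, and the
EtherType is assumed to sit at byte offset 12 of an untagged Ethernet
header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { ETH_PROTO_OFFSET = 12 };   /* after two 6-byte MAC addresses */

/* Load a big-endian 16-bit value from an arbitrarily aligned address. */
static uint16_t load_be16(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;

    return (uint16_t)((b[0] << 8) | b[1]);
}

/*
 * Return true when the frame's EtherType equals the configured VLAN
 * EtherType (vet). The buffer type is left open on purpose.
 */
static bool is_vlan_frame(const void *frame, uint16_t vet)
{
    const uint8_t *bytes = (const uint8_t *)frame;

    assert(frame);
    return load_be16(bytes + ETH_PROTO_OFFSET) == vet;
}
```

The same function accepts uint8_t[], char[] or a pointer to a header
struct at the call site.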

[PATCH v5 20/29] net: Strip virtio-net header when dumping

2023-01-31 Thread Akihiko Odaki

filter-dump specifies Ethernet as the PCAP LinkType, which does not expect
a virtio-net header. Having a virtio-net header in such a PCAP file makes
it unconsumable. Unfortunately there is currently no LinkType for
virtio-net, so for now strip the virtio-net header to convert the output
to Ethernet.

Signed-off-by: Akihiko Odaki 
---
 include/net/net.h |  6 ++
 net/dump.c| 11 +++
 net/net.c | 18 ++
 net/tap.c | 16 
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/include/net/net.h b/include/net/net.h
index dc20b31e9f..4b2d72b3fc 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -56,8 +56,10 @@ typedef RxFilterInfo *(QueryRxFilter)(NetClientState *);
 typedef bool (HasUfo)(NetClientState *);
 typedef bool (HasVnetHdr)(NetClientState *);
 typedef bool (HasVnetHdrLen)(NetClientState *, int);
+typedef bool (GetUsingVnetHdr)(NetClientState *);
 typedef void (UsingVnetHdr)(NetClientState *, bool);
 typedef void (SetOffload)(NetClientState *, int, int, int, int, int);
+typedef int (GetVnetHdrLen)(NetClientState *);
 typedef void (SetVnetHdrLen)(NetClientState *, int);
 typedef int (SetVnetLE)(NetClientState *, bool);
 typedef int (SetVnetBE)(NetClientState *, bool);
@@ -84,8 +86,10 @@ typedef struct NetClientInfo {
 HasUfo *has_ufo;
 HasVnetHdr *has_vnet_hdr;
 HasVnetHdrLen *has_vnet_hdr_len;
+GetUsingVnetHdr *get_using_vnet_hdr;
 UsingVnetHdr *using_vnet_hdr;
 SetOffload *set_offload;
+GetVnetHdrLen *get_vnet_hdr_len;
 SetVnetHdrLen *set_vnet_hdr_len;
 SetVnetLE *set_vnet_le;
 SetVnetBE *set_vnet_be;
@@ -183,9 +187,11 @@ void qemu_format_nic_info_str(NetClientState *nc, uint8_t 
macaddr[6]);
 bool qemu_has_ufo(NetClientState *nc);
 bool qemu_has_vnet_hdr(NetClientState *nc);
 bool qemu_has_vnet_hdr_len(NetClientState *nc, int len);
+bool qemu_get_using_vnet_hdr(NetClientState *nc);
 void qemu_using_vnet_hdr(NetClientState *nc, bool enable);
 void qemu_set_offload(NetClientState *nc, int csum, int tso4, int tso6,
   int ecn, int ufo);
+int qemu_get_vnet_hdr_len(NetClientState *nc);
 void qemu_set_vnet_hdr_len(NetClientState *nc, int len);
 int qemu_set_vnet_le(NetClientState *nc, bool is_le);
 int qemu_set_vnet_be(NetClientState *nc, bool is_be);
diff --git a/net/dump.c b/net/dump.c
index 6a63b15359..7d05f16ca7 100644
--- a/net/dump.c
+++ b/net/dump.c
@@ -61,12 +61,13 @@ struct pcap_sf_pkthdr {
 uint32_t len;
 };
 
-static ssize_t dump_receive_iov(DumpState *s, const struct iovec *iov, int cnt)
+static ssize_t dump_receive_iov(DumpState *s, const struct iovec *iov, int cnt,
+int offset)
 {
 struct pcap_sf_pkthdr hdr;
 int64_t ts;
 int caplen;
-size_t size = iov_size(iov, cnt);
+size_t size = iov_size(iov, cnt) - offset;
 struct iovec dumpiov[cnt + 1];
 
 /* Early return in case of previous error. */
@@ -84,7 +85,7 @@ static ssize_t dump_receive_iov(DumpState *s, const struct 
iovec *iov, int cnt)
 
 dumpiov[0].iov_base = &hdr;
 dumpiov[0].iov_len = sizeof(hdr);
-cnt = iov_copy(&dumpiov[1], cnt, iov, cnt, 0, caplen);
+cnt = iov_copy(&dumpiov[1], cnt, iov, cnt, offset, caplen);
 
 if (writev(s->fd, dumpiov, cnt + 1) != sizeof(hdr) + caplen) {
 error_report("network dump write error - stopping dump");
@@ -153,8 +154,10 @@ static ssize_t filter_dump_receive_iov(NetFilterState *nf, 
NetClientState *sndr,
int iovcnt, NetPacketSent *sent_cb)
 {
 NetFilterDumpState *nfds = FILTER_DUMP(nf);
+int offset = qemu_get_using_vnet_hdr(nf->netdev) ?
+ qemu_get_vnet_hdr_len(nf->netdev) : 0;
 
-dump_receive_iov(&nfds->ds, iov, iovcnt);
+dump_receive_iov(&nfds->ds, iov, iovcnt, offset);
 return 0;
 }
 
diff --git a/net/net.c b/net/net.c
index 2d01472998..03f17de5fc 100644
--- a/net/net.c
+++ b/net/net.c
@@ -513,6 +513,15 @@ bool qemu_has_vnet_hdr_len(NetClientState *nc, int len)
 return nc->info->has_vnet_hdr_len(nc, len);
 }
 
+bool qemu_get_using_vnet_hdr(NetClientState *nc)
+{
+if (!nc || !nc->info->get_using_vnet_hdr) {
+return false;
+}
+
+return nc->info->get_using_vnet_hdr(nc);
+}
+
 void qemu_using_vnet_hdr(NetClientState *nc, bool enable)
 {
 if (!nc || !nc->info->using_vnet_hdr) {
@@ -532,6 +541,15 @@ void qemu_set_offload(NetClientState *nc, int csum, int 
tso4, int tso6,
 nc->info->set_offload(nc, csum, tso4, tso6, ecn, ufo);
 }
 
+int qemu_get_vnet_hdr_len(NetClientState *nc)
+{
+if (!nc || !nc->info->get_vnet_hdr_len) {
+return 0;
+}
+
+return nc->info->get_vnet_hdr_len(nc);
+}
+
 void qemu_set_vnet_hdr_len(NetClientState *nc, int len)
 {
 if (!nc || !nc->info->set_vnet_hdr_len) {
diff --git a/net/tap.c b/net/tap.c
index 7d7bc1dc5f..1bf085d422 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -255,6 +255,13 @@ static bool tap_has_vnet_hdr_len(NetClientState *nc, int 
len)
 return

[PATCH v5 04/29] e1000: Use hw/net/mii.h

2023-01-31 Thread Akihiko Odaki

hw/net/mii.h provides common definitions for MII.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000.c | 86 ++--
 hw/net/e1000_regs.h| 46 
 hw/net/e1000e.c|  1 +
 hw/net/e1000e_core.c   | 99 +-
 hw/net/e1000x_common.c |  5 ++-
 hw/net/e1000x_common.h |  8 ++--
 6 files changed, 101 insertions(+), 144 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 8ee30aa37c..9619a2e481 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -26,6 +26,7 @@
 
 
 #include "qemu/osdep.h"
+#include "hw/net/mii.h"
 #include "hw/pci/pci_device.h"
 #include "hw/qdev-properties.h"
 #include "migration/vmstate.h"
@@ -181,67 +182,67 @@ e1000_autoneg_done(E1000State *s)
 static bool
 have_autoneg(E1000State *s)
 {
-return chkflag(AUTONEG) && (s->phy_reg[PHY_CTRL] & MII_CR_AUTO_NEG_EN);
+return chkflag(AUTONEG) && (s->phy_reg[MII_BMCR] & MII_BMCR_AUTOEN);
 }
 
 static void
 set_phy_ctrl(E1000State *s, int index, uint16_t val)
 {
-/* bits 0-5 reserved; MII_CR_[RESTART_AUTO_NEG,RESET] are self clearing */
-s->phy_reg[PHY_CTRL] = val & ~(0x3f |
-   MII_CR_RESET |
-   MII_CR_RESTART_AUTO_NEG);
+/* bits 0-5 reserved; MII_BMCR_[ANRESTART,RESET] are self clearing */
+s->phy_reg[MII_BMCR] = val & ~(0x3f |
+   MII_BMCR_RESET |
+   MII_BMCR_ANRESTART);
 
 /*
  * QEMU 1.3 does not support link auto-negotiation emulation, so if we
  * migrate during auto negotiation, after migration the link will be
  * down.
  */
-if (have_autoneg(s) && (val & MII_CR_RESTART_AUTO_NEG)) {
+if (have_autoneg(s) && (val & MII_BMCR_ANRESTART)) {
 e1000x_restart_autoneg(s->mac_reg, s->phy_reg, s->autoneg_timer);
 }
 }
 
 static void (*phyreg_writeops[])(E1000State *, int, uint16_t) = {
-[PHY_CTRL] = set_phy_ctrl,
+[MII_BMCR] = set_phy_ctrl,
 };
 
 enum { NPHYWRITEOPS = ARRAY_SIZE(phyreg_writeops) };
 
 enum { PHY_R = 1, PHY_W = 2, PHY_RW = PHY_R | PHY_W };
 static const char phy_regcap[0x20] = {
-[PHY_STATUS]  = PHY_R, [M88E1000_EXT_PHY_SPEC_CTRL] = PHY_RW,
-[PHY_ID1] = PHY_R, [M88E1000_PHY_SPEC_CTRL] = PHY_RW,
-[PHY_CTRL]= PHY_RW,[PHY_1000T_CTRL] = PHY_RW,
-[PHY_LP_ABILITY]  = PHY_R, [PHY_1000T_STATUS]   = PHY_R,
-[PHY_AUTONEG_ADV] = PHY_RW,[M88E1000_RX_ERR_CNTR]   = PHY_R,
-[PHY_ID2] = PHY_R, [M88E1000_PHY_SPEC_STATUS]   = PHY_R,
-[PHY_AUTONEG_EXP] = PHY_R,
+[MII_BMSR]   = PHY_R, [M88E1000_EXT_PHY_SPEC_CTRL] = PHY_RW,
+[MII_PHYID1] = PHY_R, [M88E1000_PHY_SPEC_CTRL] = PHY_RW,
+[MII_BMCR]   = PHY_RW,[MII_CTRL1000]   = PHY_RW,
+[MII_ANLPAR] = PHY_R, [MII_STAT1000]   = PHY_R,
+[MII_ANAR]   = PHY_RW,[M88E1000_RX_ERR_CNTR]   = PHY_R,
+[MII_PHYID2] = PHY_R, [M88E1000_PHY_SPEC_STATUS]   = PHY_R,
+[MII_ANER]   = PHY_R,
 };
 
-/* PHY_ID2 documented in 8254x_GBe_SDM.pdf, pp. 250 */
+/* MII_PHYID2 documented in 8254x_GBe_SDM.pdf, pp. 250 */
 static const uint16_t phy_reg_init[] = {
-[PHY_CTRL]   = MII_CR_SPEED_SELECT_MSB |
-   MII_CR_FULL_DUPLEX |
-   MII_CR_AUTO_NEG_EN,
-
-[PHY_STATUS] = MII_SR_EXTENDED_CAPS |
-   MII_SR_LINK_STATUS |   /* link initially up */
-   MII_SR_AUTONEG_CAPS |
-   /* MII_SR_AUTONEG_COMPLETE: initially NOT completed */
-   MII_SR_PREAMBLE_SUPPRESS |
-   MII_SR_EXTENDED_STATUS |
-   MII_SR_10T_HD_CAPS |
-   MII_SR_10T_FD_CAPS |
-   MII_SR_100X_HD_CAPS |
-   MII_SR_100X_FD_CAPS,
-
-[PHY_ID1] = 0x141,
-/* [PHY_ID2] configured per DevId, from e1000_reset() */
-[PHY_AUTONEG_ADV] = 0xde1,
-[PHY_LP_ABILITY] = 0x1e0,
-[PHY_1000T_CTRL] = 0x0e00,
-[PHY_1000T_STATUS] = 0x3c00,
+[MII_BMCR] = MII_BMCR_SPEED1000 |
+ MII_BMCR_FD |
+ MII_BMCR_AUTOEN,
+
+[MII_BMSR] = MII_BMSR_EXTCAP |
+ MII_BMSR_LINK_ST |   /* link initially up */
+ MII_BMSR_AUTONEG |
+ /* MII_BMSR_AN_COMP: initially NOT completed */
+ MII_BMSR_MFPS |
+ MII_BMSR_EXTSTAT |
+ MII_BMSR_10T_HD |
+ MII_BMSR_10T_FD |
+ MII_BMSR_100TX_HD |
+ MII_BMSR_100TX_FD,
+
+[MII_PHYID1] = 0x141,
+/* [MII_PHYID2] configured per DevId, from e1000_reset() */
+[MII_ANAR] = 0xde1,
+[MII_ANLPAR] = 0x1e0,
+[MII_CTRL1000] = 0x0e00,
+[MII_STAT1000] = 0x3c00,
 [M88E1000_PHY_SPEC_CTRL] = 0x360,
 [M88E1000_PHY_SPEC_STATUS] = 0xac00,
 [M88E1000_EXT_PHY_SPEC_CTRL] =

[PATCH v5 25/29] hw/net/net_tx_pkt: Check the payload length

2023-01-31 Thread Akihiko Odaki

When checksumming is requested, check the payload length to ensure the
payload has room for the resulting checksum value.

This bug was found by Alexander Bulekov with the fuzzer:
https://patchew.org/QEMU/20230129053316.1071513-1-alx...@bu.edu/

The fixed test case is:
fuzz/crash_6aeaa33e7211ecd603726c53e834df4c6d1e08bc

Fixes: e263cd49c7 ("Packet abstraction for VMWARE network devices")
Signed-off-by: Akihiko Odaki 
---
 hw/net/net_tx_pkt.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 4a35e8429d..986a3adfe9 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -342,11 +342,17 @@ bool net_tx_pkt_build_vheader(struct NetTxPkt *pkt, bool 
tso_enable,
 if (csum_enable) {
 switch (pkt->l4proto) {
 case IP_PROTO_TCP:
+if (pkt->payload_len < sizeof(struct tcp_hdr)) {
+return false;
+}
 pkt->virt_hdr.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
 pkt->virt_hdr.csum_start = pkt->hdr_len;
 pkt->virt_hdr.csum_offset = offsetof(struct tcp_hdr, th_sum);
 break;
 case IP_PROTO_UDP:
+if (pkt->payload_len < sizeof(struct udp_hdr)) {
+return false;
+}
 pkt->virt_hdr.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
 pkt->virt_hdr.csum_start = pkt->hdr_len;
 pkt->virt_hdr.csum_offset = offsetof(struct udp_hdr, uh_sum);
-- 
2.39.1
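
For illustration only (not part of the patch): a minimal sketch of the
length check, assuming the standard header layouts (TCP checksum at offset
16 of a 20-byte minimum header, UDP checksum at offset 6 of an 8-byte
header). struct csum_plan and plan_l4_csum() are hypothetical names; the
real code fills in struct virtio_net_hdr instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    PROTO_TCP = 6,
    PROTO_UDP = 17,
    TCP_HDR_MIN_LEN = 20,
    UDP_HDR_LEN = 8,
    TCP_CSUM_OFFSET = 16,
    UDP_CSUM_OFFSET = 6,
};

/* Hypothetical result record for a checksum-offload request. */
struct csum_plan {
    bool needs_csum;
    uint16_t csum_start;    /* where checksumming begins in the packet */
    uint16_t csum_offset;   /* checksum field offset from csum_start */
};

/*
 * Decide whether checksum offload can be requested. Returns false when
 * the payload is too short to contain the L4 header, so the checksum
 * field would lie outside the packet. Other protocols need no offload.
 */
static bool plan_l4_csum(int l4proto, uint16_t hdr_len, size_t payload_len,
                         struct csum_plan *plan)
{
    size_t min_len;
    uint16_t csum_offset;

    assert(plan);
    plan->needs_csum = false;
    plan->csum_start = 0;
    plan->csum_offset = 0;

    switch (l4proto) {
    case PROTO_TCP:
        min_len = TCP_HDR_MIN_LEN;
        csum_offset = TCP_CSUM_OFFSET;
        break;
    case PROTO_UDP:
        min_len = UDP_HDR_LEN;
        csum_offset = UDP_CSUM_OFFSET;
        break;
    default:
        return true;
    }

    if (payload_len < min_len) {
        return false;
    }

    plan->needs_csum = true;
    plan->csum_start = hdr_len;
    plan->csum_offset = csum_offset;
    return true;
}
```

Rejecting the packet before recording csum_start and csum_offset keeps a
later software checksum pass from writing past the end of the payload.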

[PATCH v5 22/29] hw/net/net_rx_pkt: Remove net_rx_pkt_has_virt_hdr

2023-01-31 Thread Akihiko Odaki

When the virtio-net header is not set, net_rx_pkt_get_vhdr() returns a
zero-filled virtio_net_hdr, which is actually valid. In fact, the tap device
uses a zero-filled virtio_net_hdr when the virtio-net header is not provided
by the peer. Therefore, we can just remove net_rx_pkt_has_virt_hdr() and
always assume NetRxPkt has a valid virtio-net header.

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 16 
 hw/net/net_rx_pkt.c  | 11 +--
 hw/net/net_rx_pkt.h  | 12 +---
 hw/net/trace-events  |  1 -
 hw/net/virtio-net.c  |  2 +-
 hw/net/vmxnet3.c | 12 ++--
 6 files changed, 9 insertions(+), 45 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 954a007151..95245c42f5 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -1240,12 +1240,6 @@ e1000e_build_rx_metadata(E1000ECore *core,
 goto func_exit;
 }
 
-if (!net_rx_pkt_has_virt_hdr(pkt)) {
-trace_e1000e_rx_metadata_no_virthdr();
-e1000e_verify_csum_in_sw(core, pkt, status_flags, istcp, isudp);
-goto func_exit;
-}
-
 vhdr = net_rx_pkt_get_vhdr(pkt);
 
 if (!(vhdr->flags & VIRTIO_NET_HDR_F_DATA_VALID) &&
@@ -1671,12 +1665,10 @@ e1000e_write_packet_to_guest(E1000ECore *core, struct 
NetRxPkt *pkt,
 static inline void
 e1000e_rx_fix_l4_csum(E1000ECore *core, struct NetRxPkt *pkt)
 {
-if (net_rx_pkt_has_virt_hdr(pkt)) {
-struct virtio_net_hdr *vhdr = net_rx_pkt_get_vhdr(pkt);
+struct virtio_net_hdr *vhdr = net_rx_pkt_get_vhdr(pkt);
 
-if (vhdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
-net_rx_pkt_fix_l4_csum(pkt);
-}
+if (vhdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+net_rx_pkt_fix_l4_csum(pkt);
 }
 }
 
@@ -3379,7 +3371,7 @@ e1000e_core_pci_realize(E1000ECore *core,
 net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner, E1000E_MAX_TX_FRAGS);
 }
 
-net_rx_pkt_init(&core->rx_pkt, core->has_vnet);
+net_rx_pkt_init(&core->rx_pkt);
 
 e1000x_core_prepare_eeprom(core->eeprom,
eeprom_templ,
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 1e1c504e42..b309c2f476 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -30,7 +30,6 @@ struct NetRxPkt {
 uint32_t tot_len;
 uint16_t tci;
 size_t ehdr_buf_len;
-bool has_virt_hdr;
 eth_pkt_types_e packet_type;
 
 /* Analysis results */
@@ -48,10 +47,9 @@ struct NetRxPkt {
 eth_l4_hdr_info  l4hdr_info;
 };
 
-void net_rx_pkt_init(struct NetRxPkt **pkt, bool has_virt_hdr)
+void net_rx_pkt_init(struct NetRxPkt **pkt)
 {
 struct NetRxPkt *p = g_malloc0(sizeof *p);
-p->has_virt_hdr = has_virt_hdr;
 p->vec = NULL;
 p->vec_len_total = 0;
 *pkt = p;
@@ -472,13 +470,6 @@ bool net_rx_pkt_is_vlan_stripped(struct NetRxPkt *pkt)
 return pkt->ehdr_buf_len ? true : false;
 }
 
-bool net_rx_pkt_has_virt_hdr(struct NetRxPkt *pkt)
-{
-assert(pkt);
-
-return pkt->has_virt_hdr;
-}
-
 uint16_t net_rx_pkt_get_vlan_tag(struct NetRxPkt *pkt)
 {
 assert(pkt);
diff --git a/hw/net/net_rx_pkt.h b/hw/net/net_rx_pkt.h
index 048e3461f0..7277907a22 100644
--- a/hw/net/net_rx_pkt.h
+++ b/hw/net/net_rx_pkt.h
@@ -37,10 +37,9 @@ void net_rx_pkt_uninit(struct NetRxPkt *pkt);
  * Init function for rx packet functionality
  *
  * @pkt:packet pointer
- * @has_virt_hdr:   device uses virtio header
  *
  */
-void net_rx_pkt_init(struct NetRxPkt **pkt, bool has_virt_hdr);
+void net_rx_pkt_init(struct NetRxPkt **pkt);
 
 /**
  * returns total length of data attached to rx context
@@ -214,15 +213,6 @@ uint16_t net_rx_pkt_get_vlan_tag(struct NetRxPkt *pkt);
  */
 bool net_rx_pkt_is_vlan_stripped(struct NetRxPkt *pkt);
 
-/**
- * notifies caller if the packet has virtio header
- *
- * @pkt:packet
- * @ret:true if packet has virtio header, false otherwize
- *
- */
-bool net_rx_pkt_has_virt_hdr(struct NetRxPkt *pkt);
-
 /**
 * attach scatter-gather data to rx packet
 *
diff --git a/hw/net/trace-events b/hw/net/trace-events
index c98ad12537..f7257a0693 100644
--- a/hw/net/trace-events
+++ b/hw/net/trace-events
@@ -188,7 +188,6 @@ e1000e_rx_metadata_rss(uint32_t rss, uint32_t mrq) "RSS 
data: rss: 0x%X, mrq: 0x
 e1000e_rx_metadata_ip_id(uint16_t ip_id) "the IPv4 ID is 0x%X"
 e1000e_rx_metadata_ack(void) "the packet is TCP ACK"
 e1000e_rx_metadata_pkt_type(uint32_t pkt_type) "the packet type is %u"
-e1000e_rx_metadata_no_virthdr(void) "the packet has no virt-header"
 e1000e_rx_metadata_virthdr_no_csum_info(void) "virt-header does not contain 
checksum info"
 e1000e_rx_metadata_l3_cso_disabled(void) "IP4 CSO is disabled"
 e1000e_rx_metadata_l4_cso_disabled(void) "TCP/UDP CSO is disabled"
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 3ae909041a..1795e1aa7d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3703,7 +3703,7 @@ static void virtio_net_device_realize(DeviceState *dev, 
Error **errp)
 QTAILQ_INIT(&n->rsc_chains);

[PATCH v5 27/29] MAINTAINERS: Add Akihiko Odaki as a e1000e reviewer

2023-01-31 Thread Akihiko Odaki

I want to be notified when there is a new change for e1000e, as e1000e
is similar to igb and such a change may also be applicable to igb.

Signed-off-by: Akihiko Odaki 
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 08ad1e5341..958915f227 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2208,11 +2208,13 @@ F: docs/specs/rocker.txt
 
 e1000x
 M: Dmitry Fleytman 
+R: Akihiko Odaki 
 S: Maintained
 F: hw/net/e1000x*
 
 e1000e
 M: Dmitry Fleytman 
+R: Akihiko Odaki 
 S: Maintained
 F: hw/net/e1000e*
 F: tests/qtest/fuzz-e1000e-test.c
-- 
2.39.1

[PATCH v5 12/29] e1000e: Improve software reset

2023-01-31 Thread Akihiko Odaki

This change makes e1000e reset more state when a software reset is
triggered. The datasheet exempts some registers from software reset, and
this change implements that behavior as well.
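
A minimal sketch of the reset rule described above, assuming hypothetical
register indices, table sizes and default values (none of the demo_ or DEMO_
names are QEMU identifiers): a software reset leaves the exempt registers
untouched, while a full reset reloads every register from the template or
zeroes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices and table size, for illustration only. */
enum {
    DEMO_CTRL = 0,
    DEMO_PBA = 1,
    DEMO_PBS = 2,
    DEMO_FLA = 3,
    DEMO_REG_COUNT = 8
};

/* Hypothetical defaults; registers past the end of this table reset to 0. */
static const uint32_t demo_reg_init[] = {
    [DEMO_CTRL] = 0x140240,
    [DEMO_PBA] = 0x140014,
};

/* Registers that a software reset must leave untouched. */
static bool demo_survives_sw_reset(size_t i)
{
    return i == DEMO_PBA || i == DEMO_PBS || i == DEMO_FLA;
}

/* sw == true models a software reset, sw == false a full device reset. */
static void demo_reset(uint32_t *regs, bool sw)
{
    const size_t n_init = sizeof(demo_reg_init) / sizeof(demo_reg_init[0]);

    assert(regs != NULL);

    for (size_t i = 0; i < DEMO_REG_COUNT; i++) {
        if (sw && demo_survives_sw_reset(i)) {
            continue;
        }
        regs[i] = i < n_init ? demo_reg_init[i] : 0;
    }
}
```

Writing the reset as one per-register loop keeps the exemption list in a
single place, which a bulk memset()/memcpy() pair cannot express.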

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 37aec6a970..b8670662c8 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -58,6 +58,8 @@
 static inline void
 e1000e_set_interrupt_cause(E1000ECore *core, uint32_t val);
 
+static void e1000e_reset(E1000ECore *core, bool sw);
+
 static inline void
 e1000e_process_ts_option(E1000ECore *core, struct e1000_tx_desc *dp)
 {
@@ -1882,7 +1884,7 @@ e1000e_set_ctrl(E1000ECore *core, int index, uint32_t val)
 
 if (val & E1000_CTRL_RST) {
 trace_e1000e_core_ctrl_sw_reset();
-e1000x_reset_mac_addr(core->owner_nic, core->mac, core->permanent_mac);
+e1000e_reset(core, true);
 }
 
 if (val & E1000_CTRL_PHY_RST) {
@@ -3488,8 +3490,7 @@ static const uint32_t e1000e_mac_reg_init[] = {
 [EITR...EITR + E1000E_MSIX_VEC_NUM - 1] = E1000E_MIN_XITR,
 };
 
-void
-e1000e_core_reset(E1000ECore *core)
+static void e1000e_reset(E1000ECore *core, bool sw)
 {
 int i;
 
@@ -3499,8 +3500,15 @@ e1000e_core_reset(E1000ECore *core)
 
 memset(core->phy, 0, sizeof core->phy);
 memcpy(core->phy, e1000e_phy_reg_init, sizeof e1000e_phy_reg_init);
-memset(core->mac, 0, sizeof core->mac);
-memcpy(core->mac, e1000e_mac_reg_init, sizeof e1000e_mac_reg_init);
+
+for (i = 0; i < E1000E_MAC_SIZE; i++) {
+if (sw && (i == PBA || i == PBS || i == FLA)) {
+continue;
+}
+
+core->mac[i] = i < ARRAY_SIZE(e1000e_mac_reg_init) ?
+   e1000e_mac_reg_init[i] : 0;
+}
 
 core->rxbuf_min_shift = 1 + E1000_RING_DESC_LEN_SHIFT;
 
@@ -3517,6 +3525,12 @@ e1000e_core_reset(E1000ECore *core)
 }
 }
 
+void
+e1000e_core_reset(E1000ECore *core)
+{
+e1000e_reset(core, false);
+}
+
 void e1000e_core_pre_save(E1000ECore *core)
 {
 int i;
-- 
2.39.1

[PATCH v5 15/29] e1000e: Introduce e1000_rx_desc_union

2023-01-31 Thread Akihiko Odaki

Before this change, e1000e_write_packet_to_guest() allocated the
receive descriptor buffer as an array of uint8_t. This does not ensure
the buffer is sufficiently aligned.

Introduce e1000_rx_desc_union type, a union type of all receive
descriptor types to correct this.
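
A minimal sketch of why a union helps, assuming simplified, hypothetical
descriptor layouts (the demo_ types are not the QEMU definitions): a union
takes the strictest alignment of its members, whereas a uint8_t array is only
guaranteed byte alignment.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified descriptor layouts for illustration only. */
struct demo_legacy_desc {
    uint64_t buffer_addr;
    uint16_t length;
    uint16_t csum;
    uint8_t status;
    uint8_t errors;
    uint16_t special;
};

struct demo_extended_desc {
    uint64_t buffer_addr;
    uint64_t reserved;
};

/* The union is as large and as aligned as its most demanding member. */
union demo_desc_union {
    struct demo_legacy_desc legacy;
    struct demo_extended_desc extended;
};

/* Accessors take the typed member, so no cast from uint8_t * is needed. */
static uint64_t demo_read_legacy_addr(const struct demo_legacy_desc *d)
{
    assert(d != NULL);
    return d->buffer_addr;
}
```

A caller declares one `union demo_desc_union` on the stack and passes
`&desc.legacy` or `&desc.extended` to the matching accessor.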

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000_regs.h  |   1 -
 hw/net/e1000e_core.c | 115 +--
 2 files changed, 57 insertions(+), 59 deletions(-)

diff --git a/hw/net/e1000_regs.h b/hw/net/e1000_regs.h
index 6a36573802..4545fe25a6 100644
--- a/hw/net/e1000_regs.h
+++ b/hw/net/e1000_regs.h
@@ -1061,7 +1061,6 @@ union e1000_rx_desc_packet_split {
 #define E1000_RING_DESC_LEN_SHIFT (4)
 
 #define E1000_MIN_RX_DESC_LEN   E1000_RING_DESC_LEN
-#define E1000_MAX_RX_DESC_LEN   (sizeof(union e1000_rx_desc_packet_split))
 
 /* Receive Descriptor bit definitions */
 #define E1000_RXD_STAT_DD   0x01/* Descriptor Done */
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index b8670662c8..d8c17baf8f 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -55,6 +55,12 @@
 
 #define E1000E_MAX_TX_FRAGS (64)
 
+union e1000_rx_desc_union {
+struct e1000_rx_desc legacy;
+union e1000_rx_desc_extended extended;
+union e1000_rx_desc_packet_split packet_split;
+};
+
 static inline void
 e1000e_set_interrupt_cause(E1000ECore *core, uint32_t val);
 
@@ -1053,29 +1059,28 @@ e1000e_receive_filter(E1000ECore *core, const uint8_t 
*buf, int size)
 }
 
 static inline void
-e1000e_read_lgcy_rx_descr(E1000ECore *core, uint8_t *desc, hwaddr *buff_addr)
+e1000e_read_lgcy_rx_descr(E1000ECore *core, struct e1000_rx_desc *desc,
+  hwaddr *buff_addr)
 {
-struct e1000_rx_desc *d = (struct e1000_rx_desc *) desc;
-*buff_addr = le64_to_cpu(d->buffer_addr);
+*buff_addr = le64_to_cpu(desc->buffer_addr);
 }
 
 static inline void
-e1000e_read_ext_rx_descr(E1000ECore *core, uint8_t *desc, hwaddr *buff_addr)
+e1000e_read_ext_rx_descr(E1000ECore *core, union e1000_rx_desc_extended *desc,
+ hwaddr *buff_addr)
 {
-union e1000_rx_desc_extended *d = (union e1000_rx_desc_extended *) desc;
-*buff_addr = le64_to_cpu(d->read.buffer_addr);
+*buff_addr = le64_to_cpu(desc->read.buffer_addr);
 }
 
 static inline void
-e1000e_read_ps_rx_descr(E1000ECore *core, uint8_t *desc,
+e1000e_read_ps_rx_descr(E1000ECore *core,
+union e1000_rx_desc_packet_split *desc,
 hwaddr (*buff_addr)[MAX_PS_BUFFERS])
 {
 int i;
-union e1000_rx_desc_packet_split *d =
-(union e1000_rx_desc_packet_split *) desc;
 
 for (i = 0; i < MAX_PS_BUFFERS; i++) {
-(*buff_addr)[i] = le64_to_cpu(d->read.buffer_addr[i]);
+(*buff_addr)[i] = le64_to_cpu(desc->read.buffer_addr[i]);
 }
 
 trace_e1000e_rx_desc_ps_read((*buff_addr)[0], (*buff_addr)[1],
@@ -1083,17 +1088,17 @@ e1000e_read_ps_rx_descr(E1000ECore *core, uint8_t *desc,
 }
 
 static inline void
-e1000e_read_rx_descr(E1000ECore *core, uint8_t *desc,
+e1000e_read_rx_descr(E1000ECore *core, union e1000_rx_desc_union *desc,
  hwaddr (*buff_addr)[MAX_PS_BUFFERS])
 {
 if (e1000e_rx_use_legacy_descriptor(core)) {
-e1000e_read_lgcy_rx_descr(core, desc, &(*buff_addr)[0]);
+e1000e_read_lgcy_rx_descr(core, &desc->legacy, &(*buff_addr)[0]);
 (*buff_addr)[1] = (*buff_addr)[2] = (*buff_addr)[3] = 0;
 } else {
 if (core->mac[RCTL] & E1000_RCTL_DTYP_PS) {
-e1000e_read_ps_rx_descr(core, desc, buff_addr);
+e1000e_read_ps_rx_descr(core, &desc->packet_split, buff_addr);
 } else {
-e1000e_read_ext_rx_descr(core, desc, &(*buff_addr)[0]);
+e1000e_read_ext_rx_descr(core, &desc->extended, &(*buff_addr)[0]);
 (*buff_addr)[1] = (*buff_addr)[2] = (*buff_addr)[3] = 0;
 }
 }
@@ -1264,7 +1269,7 @@ func_exit:
 }
 
 static inline void
-e1000e_write_lgcy_rx_descr(E1000ECore *core, uint8_t *desc,
+e1000e_write_lgcy_rx_descr(E1000ECore *core, struct e1000_rx_desc *desc,
struct NetRxPkt *pkt,
const E1000E_RSSInfo *rss_info,
uint16_t length)
@@ -1272,71 +1277,66 @@ e1000e_write_lgcy_rx_descr(E1000ECore *core, uint8_t 
*desc,
 uint32_t status_flags, rss, mrq;
 uint16_t ip_id;
 
-struct e1000_rx_desc *d = (struct e1000_rx_desc *) desc;
-
 assert(!rss_info->enabled);
 
-d->length = cpu_to_le16(length);
-d->csum = 0;
+desc->length = cpu_to_le16(length);
+desc->csum = 0;
 
 e1000e_build_rx_metadata(core, pkt, pkt != NULL,
  rss_info,
  &rss, &mrq,
  &status_flags, &ip_id,
- &d->special);
-d->errors = (uint8_t) (le32_to_cpu(status_flags) >> 24);
-d->status = (uint8_t) le32_to_cpu(status_flags);
+ &desc->special);
+desc->errors =

[PATCH v5 16/29] e1000e: Set MII_ANER_NWAY

2023-01-31 Thread Akihiko Odaki

This keeps Windows driver 12.18.9.23 from generating an event with ID
30. The description of the event is as follows:
> Intel(R) 82574L Gigabit Network Connection
>  PROBLEM: The network adapter is configured for auto-negotiation but
> the link partner is not.  This may result in a duplex mismatch.
>  ACTION: Configure the link partner for auto-negotiation.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000e_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index d8c17baf8f..736708407c 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -3426,7 +3426,7 @@ 
e1000e_phy_reg_init[E1000E_PHY_PAGES][E1000E_PHY_PAGE_SIZE] = {
 [MII_ANLPAR]= MII_ANLPAR_10 | MII_ANLPAR_10FD |
   MII_ANLPAR_TX | MII_ANLPAR_TXFD |
   MII_ANLPAR_T4 | MII_ANLPAR_PAUSE,
-[MII_ANER]  = MII_ANER_NP,
+[MII_ANER]  = MII_ANER_NP | MII_ANER_NWAY,
 [MII_ANNP]  = 1 | MII_ANNP_MP,
 [MII_CTRL1000]  = MII_CTRL1000_HALF | MII_CTRL1000_FULL |
   MII_CTRL1000_PORT | MII_CTRL1000_MASTER,
-- 
2.39.1

[PATCH v5 26/29] e1000e: Do not assert when MSI-X is disabled later

2023-01-31 Thread Akihiko Odaki

The assertions fail if MSI-X gets disabled while a timer for MSI-X
interrupts is running, so remove them to avoid aborting. Nothing bad
happens without the assertions, because msix_notify(), called by the
timer handlers, does nothing when MSI-X is disabled.

This bug was found by Alexander Bulekov when fuzzing igb, a new
device implementation derived from e1000e:
https://patchew.org/QEMU/20230129053316.1071513-1-alx...@bu.edu/

The fixed test case is:
fuzz/crash_aea040166819193cf9fedb810c6d100221da721a

Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000e_core.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index ff93547f88..76c7814cb8 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -162,8 +162,6 @@ e1000e_intrmgr_on_throttling_timer(void *opaque)
 {
 E1000IntrDelayTimer *timer = opaque;
 
-assert(!msix_enabled(timer->core->owner));
-
 timer->running = false;
 
 if (msi_enabled(timer->core->owner)) {
@@ -183,8 +181,6 @@ e1000e_intrmgr_on_msix_throttling_timer(void *opaque)
 E1000IntrDelayTimer *timer = opaque;
 int idx = timer - &timer->core->eitr[0];
 
-assert(msix_enabled(timer->core->owner));
-
 timer->running = false;
 
 trace_e1000e_irq_msix_notify_postponed_vec(idx);
-- 
2.39.1

[PATCH v5 07/29] e1000: Use more constant definitions

2023-01-31 Thread Akihiko Odaki

The definitions for E1000_VFTA_ENTRY_SHIFT, E1000_VFTA_ENTRY_MASK, and
E1000_VFTA_ENTRY_BIT_SHIFT_MASK were copied from:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/intel/e1000/e1000_hw.h?h=v6.0.9#n306

The definitions for E1000_NUM_UNICAST, E1000_MC_TBL_SIZE, and
E1000_VLAN_FILTER_TBL_SIZE were copied from:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/intel/e1000/e1000_hw.h?h=v6.0.9#n707

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000.c | 50 +++---
 hw/net/e1000_regs.h|  9 
 hw/net/e1000x_common.c |  5 +++--
 hw/net/e1000x_common.h |  2 +-
 4 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 0925a99511..d9d048f665 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -43,8 +43,6 @@
 #include "trace.h"
 #include "qom/object.h"
 
-static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
-
 /* #define E1000_DEBUG */
 
 #ifdef E1000_DEBUG
@@ -67,9 +65,8 @@ static int debugflags = DBGBIT(TXERR) | DBGBIT(GENERAL);
 
 #define IOPORT_SIZE   0x40
 #define PNPMMIO_SIZE  0x2
-#define MIN_BUF_SIZE  60 /* Min. octets in an ethernet frame sans FCS */
 
-#define MAXIMUM_ETHERNET_HDR_LEN (14+4)
+#define MAXIMUM_ETHERNET_HDR_LEN (ETH_HLEN + 4)
 
 /*
  * HW models:
@@ -239,10 +236,16 @@ static const uint16_t phy_reg_init[] = {
 
 [MII_PHYID1] = 0x141,
 /* [MII_PHYID2] configured per DevId, from e1000_reset() */
-[MII_ANAR] = 0xde1,
-[MII_ANLPAR] = 0x1e0,
-[MII_CTRL1000] = 0x0e00,
-[MII_STAT1000] = 0x3c00,
+[MII_ANAR] = MII_ANAR_CSMACD | MII_ANAR_10 |
+ MII_ANAR_10FD | MII_ANAR_TX |
+ MII_ANAR_TXFD | MII_ANAR_PAUSE |
+ MII_ANAR_PAUSE_ASYM,
+[MII_ANLPAR] = MII_ANLPAR_10 | MII_ANLPAR_10FD |
+   MII_ANLPAR_TX | MII_ANLPAR_TXFD,
+[MII_CTRL1000] = MII_CTRL1000_FULL | MII_CTRL1000_PORT |
+ MII_CTRL1000_MASTER,
+[MII_STAT1000] = MII_STAT1000_HALF | MII_STAT1000_FULL |
+ MII_STAT1000_ROK | MII_STAT1000_LOK,
 [M88E1000_PHY_SPEC_CTRL] = 0x360,
 [M88E1000_PHY_SPEC_STATUS] = 0xac00,
 [M88E1000_EXT_PHY_SPEC_CTRL] = 0x0d60,
@@ -548,9 +551,9 @@ putsum(uint8_t *data, uint32_t n, uint32_t sloc, uint32_t 
css, uint32_t cse)
 static inline void
 inc_tx_bcast_or_mcast_count(E1000State *s, const unsigned char *arr)
 {
-if (!memcmp(arr, bcast, sizeof bcast)) {
+if (is_broadcast_ether_addr(arr)) {
 e1000x_inc_reg_if_not_full(s->mac_reg, BPTC);
-} else if (arr[0] & 1) {
+} else if (is_multicast_ether_addr(arr)) {
 e1000x_inc_reg_if_not_full(s->mac_reg, MPTC);
 }
 }
@@ -804,14 +807,16 @@ static int
 receive_filter(E1000State *s, const uint8_t *buf, int size)
 {
 uint32_t rctl = s->mac_reg[RCTL];
-int isbcast = !memcmp(buf, bcast, sizeof bcast), ismcast = (buf[0] & 1);
+int isbcast = is_broadcast_ether_addr(buf);
+int ismcast = is_multicast_ether_addr(buf);
 
 if (e1000x_is_vlan_packet(buf, le16_to_cpu(s->mac_reg[VET])) &&
 e1000x_vlan_rx_filter_enabled(s->mac_reg)) {
-uint16_t vid = lduw_be_p(buf + 14);
-uint32_t vfta = ldl_le_p((uint32_t *)(s->mac_reg + VFTA) +
- ((vid >> 5) & 0x7f));
-if ((vfta & (1 << (vid & 0x1f))) == 0) {
+uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(buf)->h_tci);
+uint32_t vfta =
+ldl_le_p((uint32_t *)(s->mac_reg + VFTA) +
+ ((vid >> E1000_VFTA_ENTRY_SHIFT) & 
E1000_VFTA_ENTRY_MASK));
+if ((vfta & (1 << (vid & E1000_VFTA_ENTRY_BIT_SHIFT_MASK))) == 0) {
 return 0;
 }
 }
@@ -909,7 +914,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec 
*iov, int iovcnt)
 uint32_t rdh_start;
 uint16_t vlan_special = 0;
 uint8_t vlan_status = 0;
-uint8_t min_buf[MIN_BUF_SIZE];
+uint8_t min_buf[ETH_ZLEN];
 struct iovec min_iov;
 uint8_t *filter_buf = iov->iov_base;
 size_t size = iov_size(iov, iovcnt);
@@ -1204,8 +1209,8 @@ static const readops macreg_readops[] = {
 [FFLT ... FFLT + 6]   = &mac_readreg,
 [RA ... RA + 31]  = &mac_readreg,
 [WUPM ... WUPM + 31]  = &mac_readreg,
-[MTA ... MTA + 127]   = &mac_readreg,
-[VFTA ... VFTA + 127] = &mac_readreg,
+[MTA ... MTA + E1000_MC_TBL_SIZE - 1]   = &mac_readreg,
+[VFTA ... VFTA + E1000_VLAN_FILTER_TBL_SIZE - 1] = &mac_readreg,
 [FFMT ... FFMT + 254] = &mac_readreg,
 [FFVT ... FFVT + 254] = &mac_readreg,
 [PBM ... PBM + 16383] = &mac_readreg,
@@ -1236,8 +1241,8 @@ static const writeops macreg_writeops[] = {
 [FFLT ... FFLT + 6]   = &set_11bit,
 [RA ... RA + 31]  = &mac_writereg,
 [WUPM ... WUPM + 31]  = &mac_writereg,
-[MTA ... MTA + 127]   = &mac_writereg,
-[VFTA ... VFTA + 127] = &mac_writereg,
+[MTA ... MTA + E1000_MC_TBL_SIZE - 1] =

[PATCH v5 01/29] e1000e: Fix the code style

2023-01-31 Thread Akihiko Odaki

The igb implementation starts off by copying the e1000e code. Correct
the code style before that.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/e1000.c |  41 
 hw/net/e1000e.c|  72 ++--
 hw/net/e1000e_core.c   | 103 ++---
 hw/net/e1000e_core.h   |  66 +-
 hw/net/e1000x_common.h |  44 +-
 5 files changed, 168 insertions(+), 158 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 7efb8a4c52..8ee30aa37c 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -808,10 +808,11 @@ receive_filter(E1000State *s, const uint8_t *buf, int 
size)
 if (e1000x_is_vlan_packet(buf, le16_to_cpu(s->mac_reg[VET])) &&
 e1000x_vlan_rx_filter_enabled(s->mac_reg)) {
 uint16_t vid = lduw_be_p(buf + 14);
-uint32_t vfta = ldl_le_p((uint32_t*)(s->mac_reg + VFTA) +
+uint32_t vfta = ldl_le_p((uint32_t *)(s->mac_reg + VFTA) +
  ((vid >> 5) & 0x7f));
-if ((vfta & (1 << (vid & 0x1f))) == 0)
+if ((vfta & (1 << (vid & 0x1f))) == 0) {
 return 0;
+}
 }
 
 if (!isbcast && !ismcast && (rctl & E1000_RCTL_UPE)) { /* promiscuous 
ucast */
@@ -1220,16 +1221,16 @@ static const readops macreg_readops[] = {
 [TDFPC]   = mac_low13_read,
 [AIT] = mac_low16_read,
 
-[CRCERRS ... MPC]   = &mac_readreg,
-[IP6AT ... IP6AT+3] = &mac_readreg,[IP4AT ... IP4AT+6] = &mac_readreg,
-[FFLT ... FFLT+6]   = &mac_low11_read,
-[RA ... RA+31]  = &mac_readreg,
-[WUPM ... WUPM+31]  = &mac_readreg,
-[MTA ... MTA+127]   = &mac_readreg,
-[VFTA ... VFTA+127] = &mac_readreg,
-[FFMT ... FFMT+254] = &mac_low4_read,
-[FFVT ... FFVT+254] = &mac_readreg,
-[PBM ... PBM+16383] = &mac_readreg,
+[CRCERRS ... MPC] = &mac_readreg,
+[IP6AT ... IP6AT + 3] = &mac_readreg,[IP4AT ... IP4AT + 6] = 
&mac_readreg,
+[FFLT ... FFLT + 6]   = &mac_low11_read,
+[RA ... RA + 31]  = &mac_readreg,
+[WUPM ... WUPM + 31]  = &mac_readreg,
+[MTA ... MTA + 127]   = &mac_readreg,
+[VFTA ... VFTA + 127] = &mac_readreg,
+[FFMT ... FFMT + 254] = &mac_low4_read,
+[FFVT ... FFVT + 254] = &mac_readreg,
+[PBM ... PBM + 16383] = &mac_readreg,
 };
 enum { NREADOPS = ARRAY_SIZE(macreg_readops) };
 
@@ -1252,14 +1253,14 @@ static const writeops macreg_writeops[] = {
 [RDTR]   = set_16bit,  [RADV]   = set_16bit,  [TADV] = set_16bit,
 [ITR]= set_16bit,
 
-[IP6AT ... IP6AT+3] = &mac_writereg, [IP4AT ... IP4AT+6] = &mac_writereg,
-[FFLT ... FFLT+6]   = &mac_writereg,
-[RA ... RA+31]  = &mac_writereg,
-[WUPM ... WUPM+31]  = &mac_writereg,
-[MTA ... MTA+127]   = &mac_writereg,
-[VFTA ... VFTA+127] = &mac_writereg,
-[FFMT ... FFMT+254] = &mac_writereg, [FFVT ... FFVT+254] = &mac_writereg,
-[PBM ... PBM+16383] = &mac_writereg,
+[IP6AT ... IP6AT + 3] = &mac_writereg, [IP4AT ... IP4AT + 6] = 
&mac_writereg,
+[FFLT ... FFLT + 6]   = &mac_writereg,
+[RA ... RA + 31]  = &mac_writereg,
+[WUPM ... WUPM + 31]  = &mac_writereg,
+[MTA ... MTA + 127]   = &mac_writereg,
+[VFTA ... VFTA + 127] = &mac_writereg,
+[FFMT ... FFMT + 254] = &mac_writereg, [FFVT ... FFVT + 254] = 
&mac_writereg,
+[PBM ... PBM + 16383] = &mac_writereg,
 };
 
 enum { NWRITEOPS = ARRAY_SIZE(macreg_writeops) };
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 7523e9f5d2..8635ca16c6 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -1,37 +1,37 @@
 /*
-* QEMU INTEL 82574 GbE NIC emulation
-*
-* Software developer's manuals:
-* 
http://www.intel.com/content/dam/doc/datasheet/82574l-gbe-controller-datasheet.pdf
-*
-* Copyright (c) 2015 Ravello Systems LTD (http://ravellosystems.com)
-* Developed by Daynix Computing LTD (http://www.daynix.com)
-*
-* Authors:
-* Dmitry Fleytman 
-* Leonid Bloch 
-* Yan Vugenfirer 
-*
-* Based on work done by:
-* Nir Peleg, Tutis Systems Ltd. for Qumranet Inc.
-* Copyright (c) 2008 Qumranet
-* Based on work done by:
-* Copyright (c) 2007 Dan Aloni
-* Copyright (c) 2004 Antony T Curtis
-*
-* This library is free software; you can redistribute it and/or
-* modify it under the terms of the GNU Lesser General Public
-* License as published by the Free Software Foundation; either
-* version 2.1 of the License, or (at your option) any later version.
-*
-* This library is distributed in the hope that it will be useful,
-* but WITHOUT ANY WARRANTY; without even the implied warranty of
-* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-* Lesser General Public License for more details.
-*
-* You should have received a copy of the GNU Lesser General Public
-* License along with this library; if not, see .
-*/
+ * QEMU INTEL 82574 GbE NIC emulation
+ *
+ * Software developer's manuals:
+ * 
http://www.intel.com/content/dam/doc/datasheet/82574l-gbe-controller-datasheet.pdf
+ *
+ * Copyright (c) 2015 Ravello Systems LTD (http://ravellosystems.com)
+ * Developed by Daynix Computing LTD

[PATCH v5 03/29] fsl_etsec: Use hw/net/mii.h

2023-01-31 Thread Akihiko Odaki

hw/net/mii.h provides common definitions for MII.

Signed-off-by: Akihiko Odaki 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/net/fsl_etsec/etsec.c | 11 ++-
 hw/net/fsl_etsec/etsec.h | 17 -
 hw/net/fsl_etsec/miim.c  |  5 +++--
 include/hw/net/mii.h |  1 +
 4 files changed, 10 insertions(+), 24 deletions(-)

diff --git a/hw/net/fsl_etsec/etsec.c b/hw/net/fsl_etsec/etsec.c
index c753bfb3a8..798ea33d08 100644
--- a/hw/net/fsl_etsec/etsec.c
+++ b/hw/net/fsl_etsec/etsec.c
@@ -29,6 +29,7 @@
 #include "qemu/osdep.h"
 #include "hw/sysbus.h"
 #include "hw/irq.h"
+#include "hw/net/mii.h"
 #include "hw/ptimer.h"
 #include "hw/qdev-properties.h"
 #include "etsec.h"
@@ -339,11 +340,11 @@ static void etsec_reset(DeviceState *d)
 etsec->rx_buffer_len = 0;
 
 etsec->phy_status =
-MII_SR_EXTENDED_CAPS| MII_SR_LINK_STATUS   | MII_SR_AUTONEG_CAPS  |
-MII_SR_AUTONEG_COMPLETE | MII_SR_PREAMBLE_SUPPRESS |
-MII_SR_EXTENDED_STATUS  | MII_SR_100T2_HD_CAPS | MII_SR_100T2_FD_CAPS |
-MII_SR_10T_HD_CAPS  | MII_SR_10T_FD_CAPS   | MII_SR_100X_HD_CAPS  |
-MII_SR_100X_FD_CAPS | MII_SR_100T4_CAPS;
+MII_BMSR_EXTCAP   | MII_BMSR_LINK_ST  | MII_BMSR_AUTONEG  |
+MII_BMSR_AN_COMP  | MII_BMSR_MFPS | MII_BMSR_EXTSTAT  |
+MII_BMSR_100T2_HD | MII_BMSR_100T2_FD |
+MII_BMSR_10T_HD   | MII_BMSR_10T_FD   |
+MII_BMSR_100TX_HD | MII_BMSR_100TX_FD | MII_BMSR_100T4;
 
 etsec_update_irq(etsec);
 }
diff --git a/hw/net/fsl_etsec/etsec.h b/hw/net/fsl_etsec/etsec.h
index 3c625c955c..3860864a3f 100644
--- a/hw/net/fsl_etsec/etsec.h
+++ b/hw/net/fsl_etsec/etsec.h
@@ -76,23 +76,6 @@ typedef struct eTSEC_rxtx_bd {
 #define FCB_TX_CTU (1 << 1)
 #define FCB_TX_NPH (1 << 0)
 
-/* PHY Status Register */
-#define MII_SR_EXTENDED_CAPS 0x0001/* Extended register capabilities */
-#define MII_SR_JABBER_DETECT 0x0002/* Jabber Detected */
-#define MII_SR_LINK_STATUS   0x0004/* Link Status 1 = link */
-#define MII_SR_AUTONEG_CAPS  0x0008/* Auto Neg Capable */
-#define MII_SR_REMOTE_FAULT  0x0010/* Remote Fault Detect */
-#define MII_SR_AUTONEG_COMPLETE  0x0020/* Auto Neg Complete */
-#define MII_SR_PREAMBLE_SUPPRESS 0x0040/* Preamble may be suppressed */
-#define MII_SR_EXTENDED_STATUS   0x0100/* Ext. status info in Reg 0x0F */
-#define MII_SR_100T2_HD_CAPS 0x0200/* 100T2 Half Duplex Capable */
-#define MII_SR_100T2_FD_CAPS 0x0400/* 100T2 Full Duplex Capable */
-#define MII_SR_10T_HD_CAPS   0x0800/* 10T   Half Duplex Capable */
-#define MII_SR_10T_FD_CAPS   0x1000/* 10T   Full Duplex Capable */
-#define MII_SR_100X_HD_CAPS  0x2000/* 100X  Half Duplex Capable */
-#define MII_SR_100X_FD_CAPS  0x4000/* 100X  Full Duplex Capable */
-#define MII_SR_100T4_CAPS0x8000/* 100T4 Capable */
-
 /* eTSEC */
 
 /* Number of register in the device */
diff --git a/hw/net/fsl_etsec/miim.c b/hw/net/fsl_etsec/miim.c
index 6bba01c82a..b48d2cb57b 100644
--- a/hw/net/fsl_etsec/miim.c
+++ b/hw/net/fsl_etsec/miim.c
@@ -23,6 +23,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/net/mii.h"
 #include "etsec.h"
 #include "registers.h"
 
@@ -140,8 +141,8 @@ void etsec_miim_link_status(eTSEC *etsec, NetClientState 
*nc)
 {
 /* Set link status */
 if (nc->link_down) {
-etsec->phy_status &= ~MII_SR_LINK_STATUS;
+etsec->phy_status &= ~MII_BMSR_LINK_ST;
 } else {
-etsec->phy_status |= MII_SR_LINK_STATUS;
+etsec->phy_status |= MII_BMSR_LINK_ST;
 }
 }
diff --git a/include/hw/net/mii.h b/include/hw/net/mii.h
index c6a767a49a..ed1bb52b0f 100644
--- a/include/hw/net/mii.h
+++ b/include/hw/net/mii.h
@@ -55,6 +55,7 @@
 #define MII_BMCR_CTST   (1 << 7)  /* Collision test */
 #define MII_BMCR_SPEED1000  (1 << 6)  /* MSB of Speed (1000) */
 
+#define MII_BMSR_100T4  (1 << 15) /* Can do 100mbps T4 */
 #define MII_BMSR_100TX_FD   (1 << 14) /* Can do 100mbps, full-duplex */
 #define MII_BMSR_100TX_HD   (1 << 13) /* Can do 100mbps, half-duplex */
 #define MII_BMSR_10T_FD (1 << 12) /* Can do 10mbps, full-duplex */
-- 
2.39.1

[PATCH v5 09/29] e1000: Use memcpy to intialize registers

2023-01-31 Thread Akihiko Odaki

Use memcpy instead of memmove to initialize registers. The initial
register templates and register table instances will never overlap.
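
A minimal sketch of the reset pattern this relies on, assuming a hypothetical
device struct and template values (the demo_ names are not QEMU identifiers):
the template is a separate constant object, so the source and destination can
never overlap and memcpy() is sufficient.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_PHY_REGS = 32 };

/* Hypothetical constant template; it lives outside any device instance. */
static const uint16_t demo_phy_init[] = {
    [1] = 0x796d,
    [2] = 0x141,
};

/* Hypothetical device state holding the live register table. */
struct demo_dev {
    uint16_t phy_reg[DEMO_PHY_REGS];
};

/* The template must fit inside the register table it is copied into. */
_Static_assert(sizeof(demo_phy_init) <= sizeof(((struct demo_dev *)0)->phy_reg),
               "template larger than register table");

static void demo_phy_reset(struct demo_dev *d)
{
    assert(d != NULL);

    /* Clear everything, then overlay the defaults from the template. */
    memset(d->phy_reg, 0, sizeof d->phy_reg);
    memcpy(d->phy_reg, demo_phy_init, sizeof demo_phy_init);
}
```

memmove() is only needed when the two ranges may alias, which cannot happen
between a file-scope const array and a field of a device instance.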

Signed-off-by: Akihiko Odaki 
---
 hw/net/e1000.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index d9d048f665..3353a3752c 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -390,10 +390,10 @@ static void e1000_reset(void *opaque)
 d->mit_irq_level = 0;
 d->mit_ide = 0;
 memset(d->phy_reg, 0, sizeof d->phy_reg);
-memmove(d->phy_reg, phy_reg_init, sizeof phy_reg_init);
+memcpy(d->phy_reg, phy_reg_init, sizeof phy_reg_init);
 d->phy_reg[MII_PHYID2] = edc->phy_id2;
 memset(d->mac_reg, 0, sizeof d->mac_reg);
-memmove(d->mac_reg, mac_reg_init, sizeof mac_reg_init);
+memcpy(d->mac_reg, mac_reg_init, sizeof mac_reg_init);
 d->rxbuf_min_shift = 1;
 memset(&d->tx, 0, sizeof d->tx);
 
-- 
2.39.1

[PATCH v5 02/29] hw/net: Add more MII definitions

2023-01-31 Thread Akihiko Odaki

The definitions will be used by igb.

Signed-off-by: Akihiko Odaki 
---
 include/hw/net/mii.h | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/hw/net/mii.h b/include/hw/net/mii.h
index 4ae4dcce7e..c6a767a49a 100644
--- a/include/hw/net/mii.h
+++ b/include/hw/net/mii.h
@@ -81,20 +81,31 @@
 #define MII_ANLPAR_ACK  (1 << 14)
 #define MII_ANLPAR_PAUSEASY (1 << 11) /* can pause asymmetrically */
 #define MII_ANLPAR_PAUSE(1 << 10) /* can pause */
+#define MII_ANLPAR_T4   (1 << 9)
 #define MII_ANLPAR_TXFD (1 << 8)
 #define MII_ANLPAR_TX   (1 << 7)
 #define MII_ANLPAR_10FD (1 << 6)
 #define MII_ANLPAR_10   (1 << 5)
 #define MII_ANLPAR_CSMACD   (1 << 0)
 
-#define MII_ANER_NWAY   (1 << 0) /* Can do N-way auto-nego */
+#define MII_ANER_NP (1 << 2)  /* Next Page Able */
+#define MII_ANER_NWAY   (1 << 0)  /* Can do N-way auto-nego */
 
+#define MII_ANNP_MP (1 << 13) /* Message Page */
+
+#define MII_CTRL1000_MASTER (1 << 11) /* MASTER-SLAVE Manual Configuration 
Value */
+#define MII_CTRL1000_PORT   (1 << 10) /* T2_Repeater/DTE bit */
 #define MII_CTRL1000_FULL   (1 << 9)  /* 1000BASE-T full duplex */
 #define MII_CTRL1000_HALF   (1 << 8)  /* 1000BASE-T half duplex */
 
+#define MII_STAT1000_LOK(1 << 13) /* Local Receiver Status */
+#define MII_STAT1000_ROK(1 << 12) /* Remote Receiver Status */
 #define MII_STAT1000_FULL   (1 << 11) /* 1000BASE-T full duplex */
 #define MII_STAT1000_HALF   (1 << 10) /* 1000BASE-T half duplex */
 
+#define MII_EXTSTAT_1000T_FD (1 << 13) /* 1000BASE-T Full Duplex */
+#define MII_EXTSTAT_1000T_HD (1 << 12) /* 1000BASE-T Half Duplex */
+
 /* List of vendor identifiers */
 /* RealTek 8201 */
 #define RTL8201CP_PHYID10x
-- 
2.39.1

[PATCH v5 00/29] e1000x cleanups (preliminary for IGB)

2023-01-31 Thread Akihiko Odaki

We are adding a new device named igb, yet another Intel NIC. As the new
implementation derives from e1000e, overhaul the e1000e implementation first.
e1000 has much in common with e1000e, so we also apply the corresponding
changes to that device where possible.

This was spun off from:
https://patchew.org/QEMU/20230112095743.20123-1-akihiko.od...@daynix.com/

The changes from the series are as follows:
- Fixed code alignment in e1000.c. (Philippe Mathieu-Daudé)
- "e1000: Configure ResettableClass" and e1000e's corresponding patch were based
  on the old version so they are now updated. (Philippe Mathieu-Daudé)
- Added "e1000e: Remove extra pointer indirection"

The series is composed of patches submitted earlier for e1000e. Links to
Patchew are below:
03: https://patchew.org/QEMU/20221103060103.83363-1-akihiko.od...@daynix.com/
04: https://patchew.org/QEMU/20221125135254.54760-1-akihiko.od...@daynix.com/
05: https://patchew.org/QEMU/20221119054913.103803-1-akihiko.od...@daynix.com/
06: https://patchew.org/QEMU/20221119055304.105500-1-akihiko.od...@daynix.com/
08 includes: 
https://patchew.org/QEMU/20221119060156.110010-1-akihiko.od...@daynix.com/
10: https://patchew.org/QEMU/20221125140105.55925-1-akihiko.od...@daynix.com/
11: https://patchew.org/QEMU/20221125142608.58919-1-akihiko.od...@daynix.com/
13: https://patchew.org/QEMU/20221201095351.63392-1-akihiko.od...@daynix.com/
14: https://patchew.org/QEMU/20221201100113.64387-1-akihiko.od...@daynix.com/
15: https://patchew.org/QEMU/20230107143328.102534-1-akihiko.od...@daynix.com/
20: https://patchew.org/QEMU/20230114025339.4874-1-akihiko.od...@daynix.com/

V4 -> V5:
- Added "e1000e: Combine rx traces".

V3 -> V4:
- Fixed iov cursor update in "hw/net/net_tx_pkt: Implement TCP segmentation".
- Fixed UDP checksumming in "hw/net/net_tx_pkt: Implement TCP segmentation".
- Added "hw/net/net_tx_pkt: Check the payload length".
- Added "e1000e: Do not assert when MSI-X is disabled later".

V2 -> V3:
- List tests/qtest/libqos/e1000e.h in MAINTAINERS. (Thomas Huth)

V1 -> V2:
- Rebased to commit fcb7e040f5c69ca1f0678f991ab5354488a9e192.
- Added "net: Check L4 header size".
- Added "e1000x: Alter the signature of e1000x_is_vlan_packet".
- Added "net: Strip virtio-net header when dumping".
- Added "hw/net/net_tx_pkt: Automatically determine if virtio-net header is
  used".
- Added "hw/net/net_rx_pkt: Remove net_rx_pkt_has_virt_hdr".
- Added "e1000e: Perform software segmentation for loopback".
- Added "hw/net/net_tx_pkt: Implement TCP segmentation"
- Added "MAINTAINERS: Add Akihiko Odaki as a e1000e reviewer".
- Added "MAINTAINERS: Add e1000e test files".

Akihiko Odaki (29):
  e1000e: Fix the code style
  hw/net: Add more MII definitions
  fsl_etsec: Use hw/net/mii.h
  e1000: Use hw/net/mii.h
  e1000: Mask registers when writing
  e1000e: Mask registers when writing
  e1000: Use more constant definitions
  e1000e: Use more constant definitions
  e1000: Use memcpy to intialize registers
  e1000e: Use memcpy to intialize registers
  e1000e: Remove pending interrupt flags
  e1000e: Improve software reset
  e1000: Configure ResettableClass
  e1000e: Configure ResettableClass
  e1000e: Introduce e1000_rx_desc_union
  e1000e: Set MII_ANER_NWAY
  e1000e: Remove extra pointer indirection
  net: Check L4 header size
  e1000x: Alter the signature of e1000x_is_vlan_packet
  net: Strip virtio-net header when dumping
  hw/net/net_tx_pkt: Automatically determine if virtio-net header is
used
  hw/net/net_rx_pkt: Remove net_rx_pkt_has_virt_hdr
  e1000e: Perform software segmentation for loopback
  hw/net/net_tx_pkt: Implement TCP segmentation
  hw/net/net_tx_pkt: Check the payload length
  e1000e: Do not assert when MSI-X is disabled later
  MAINTAINERS: Add Akihiko Odaki as a e1000e reviewer
  MAINTAINERS: Add e1000e test files
  e1000e: Combine rx traces

 MAINTAINERS  |   4 +
 hw/net/e1000.c   | 254 -
 hw/net/e1000_regs.h  |  61 +---
 hw/net/e1000e.c  |  88 +++---
 hw/net/e1000e_core.c | 594 ---
 hw/net/e1000e_core.h |  68 +++--
 hw/net/e1000x_common.c   |  12 +-
 hw/net/e1000x_common.h   |  56 ++--
 hw/net/fsl_etsec/etsec.c |  11 +-
 hw/net/fsl_etsec/etsec.h |  17 --
 hw/net/fsl_etsec/miim.c  |   5 +-
 hw/net/net_rx_pkt.c  |  12 +-
 hw/net/net_rx_pkt.h  |  20 +-
 hw/net/net_tx_pkt.c  | 332 --
 hw/net/net_tx_pkt.h  |  27 +-
 hw/net/trace-events  |  10 +-
 hw/net/virtio-net.c  |   2 +-
 hw/net/vmxnet3.c |  32 +--
 include/hw/net/mii.h |  14 +-
 include/net/eth.h|   5 -
 include/net/net.h|   6 +
 net/dump.c   |  11 +-
 net/eth.c|  27 --
 net/net.c|  18 ++
 net/tap.c|  16 ++
 25 files changed, 921 insertions(+), 781 deletions(-)

-- 
2.39.1

Re: Emulating device configuration / max_virtqueue_pairs in vhost-vdpa and vhost-user

2023-01-31 Thread Jason Wang

On Wed, Feb 1, 2023 at 3:11 AM Eugenio Perez Martin  wrote:
>
> On Tue, Jan 31, 2023 at 8:10 PM Eugenio Perez Martin
>  wrote:
> >
> > Hi,
> >
> > The current approach of offering an emulated CVQ to the guest and mapping
> > the commands to vhost-user does not scale well:
> > * Some devices already offer it, so the transformation is redundant.
> > * There is no support for commands with variable length (RSS?)
> >
> > We can solve both of them by offering it through vhost-user the same
> > way as vhost-vdpa does. With this approach qemu needs to track the
> > commands, for similar reasons as vhost-vdpa: qemu needs to track the
> > device status for live migration. vhost-user should use the same SVQ
> > code for this, so we avoid duplication.
> >
> > One of the challenges here is to know what virtqueue to shadow /
> > isolate. The vhost-user device may not have the same queues as the
> > device frontend:
> > * The first depends on the actual vhost-user device, and qemu fetches
> > it with VHOST_USER_GET_QUEUE_NUM at the moment.
> > * The second is set by the netdev queues= cmdline parameter in qemu.
> >
> > For the device, the CVQ is the last one it offers, but for the guest
> > it is the last one offered in config space.
> >
> > Creating a new vhost-user command to decrease that maximum number of
> > queues may be an option. But we can do it without adding more
> > commands, by remapping the CVQ index at virtqueue setup. I think it
> > should be doable using (struct vhost_dev).vq_index and maybe a few
> > adjustments here and there.
> >
> > Thoughts?
> >
> > Thanks!
>
>
> (Starting a separate thread for the vhost-vdpa related use case)
>
> This could also work for vhost-vdpa if we ever decide to honor the netdev
> queues argument. It is totally ignored now, as opposed to the rest of
> the backends:
> * vhost-kernel, whose tap device has the requested number of queues.
> * vhost-user, which errors with ("you are asking more queues than
> supported") if the vhost-user parent device has fewer queues than
> requested (by vhost-user msg VHOST_USER_GET_QUEUE_NUM).
>
> One of the reasons for this is that the device configuration space is
> totally passthrough, with the values for mtu, rss conditions, etc.
> This is not ideal, as qemu cannot check source and destination
> equivalence and they can change under the guest's feet in the
> event of a migration.

This does not look like the responsibility of qemu but of the upper layer
(to provision the same config/features in src/dst).

> External tools are needed for this, duplicating
> part of the effort.
>
> Intercepting config space accesses and offering an emulated config space
> to the guest, with this kind of adjustment, is beneficial, as it makes
> vhost-vdpa more similar to the rest of the backends and makes a change
> far less surprising.

This probably needs more thought, since vDPA already provides a kind
of emulation in the kernel. My understanding is that it would be
sufficient to add checks to make sure the config that guests see is
consistent with what the host provisioned?

Thanks

>
> Thoughts?
>
> Thanks!
>

Re: Emulating device configuration / max_virtqueue_pairs in vhost-vdpa and vhost-user

2023-01-31 Thread Jason Wang

On Wed, Feb 1, 2023 at 3:10 AM Eugenio Perez Martin  wrote:
>
> Hi,
>
> The current approach of offering an emulated CVQ to the guest and mapping
> the commands to vhost-user does not scale well:
> * Some devices already offer it, so the transformation is redundant.
> * There is no support for commands with variable length (RSS?)
>
> We can solve both of them by offering it through vhost-user the same
> way as vhost-vdpa does. With this approach qemu needs to track the
> commands, for similar reasons as vhost-vdpa: qemu needs to track the
> device status for live migration. vhost-user should use the same SVQ
> code for this, so we avoid duplication.

Note that it really depends on the model we use. SVQ works well for
trap and emulation (without needing a new API). But if a save and
load API is invented, SVQ is not a must.

>
> One of the challenges here is to know what virtqueue to shadow /
> isolate. The vhost-user device may not have the same queues as the
> device frontend:
> * The first depends on the actual vhost-user device, and qemu fetches
> it with VHOST_USER_GET_QUEUE_NUM at the moment.
> * The second is set by the netdev queues= cmdline parameter in qemu.
>
> For the device, the CVQ is the last one it offers, but for the guest
> it is the last one offered in config space.
>
> Creating a new vhost-user command to decrease that maximum number of
> queues may be an option. But we can do it without adding more
> commands, by remapping the CVQ index at virtqueue setup. I think it
> should be doable using (struct vhost_dev).vq_index and maybe a few
> adjustments here and there.

It requires device-specific knowledge: it might work for networking
devices but not for others (or it would need new code).

Thanks

>
> Thoughts?
>
> Thanks!
>
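
For readers following the CVQ remapping idea quoted above, here is a minimal sketch of the index arithmetic. It assumes the usual virtio-net layout (rx/tx queues at indices 0..2N-1, control queue at index 2N for N queue pairs). The helper names cvq_index() and remap_vq_index() are hypothetical and are not existing QEMU functions; they only illustrate that data queues can map 1:1 while the guest's CVQ index is redirected to the backend's last queue.

```c
#include <stdint.h>

/*
 * With N queue pairs, virtio-net places rx/tx queues at 0..2N-1 and the
 * control virtqueue (CVQ) at index 2N.
 */
int cvq_index(uint16_t queue_pairs)
{
    return 2 * queue_pairs;
}

/*
 * Hypothetical helper: translate a guest-visible virtqueue index into the
 * backend's index. Data queues keep their index; only the CVQ moves.
 * Returns -1 if the index is out of range or the backend offers fewer
 * queue pairs than the guest was given.
 */
int remap_vq_index(int guest_idx, uint16_t guest_pairs,
                   uint16_t backend_pairs)
{
    if (guest_pairs == 0 || guest_pairs > backend_pairs) {
        return -1;
    }
    if (guest_idx < 0 || guest_idx > cvq_index(guest_pairs)) {
        return -1;
    }
    if (guest_idx == cvq_index(guest_pairs)) {
        /* Guest CVQ -> last queue the backend offers. */
        return cvq_index(backend_pairs);
    }
    return guest_idx;
}
```

With a guest configured for 2 queue pairs on a backend offering 4, guest index 4 (its CVQ) would be redirected to backend index 8, while indices 0..3 are unchanged. As noted in the reply, this relies on net-specific knowledge of where the CVQ lives.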

accel/tcg/translator.c question about translator_access

2023-01-31 Thread Sid Manning

There is an assert in translator_access that I hit while running on a version 
of QEMU integrated into a Virtual Platform.

Since this function can return null anyway I tried the following experiment:

--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -172,7 +172,9 @@ static void *translator_access(CPUArchState *env, 
DisasContextBase *db,
 tb_page_addr_t phys_page =
 get_page_addr_code_hostp(env, base, &db->host_addr[1]);
 /* We cannot handle MMIO as second page. */
-assert(phys_page != -1);
+if(phys_page == -1) {
+return NULL;
+}
 tb_set_page_addr1(tb, phys_page);
#ifdef CONFIG_USER_ONLY
 page_protect(end);

which avoided the issue and the test ran to completion.  What is this assert 
trying to catch?

Re: [QEMU][PATCH v5 04/10] xen-hvm: reorganize xen-hvm and move common function to xen-hvm-common

2023-01-31 Thread Stefano Stabellini

On Tue, 31 Jan 2023, Vikram Garhwal wrote:
> From: Stefano Stabellini 
> 
> This patch does the following:
> 1. creates arch_handle_ioreq() and arch_xen_set_memory(). This is done in
> preparation for moving most of xen-hvm code to an arch-neutral location,
> move the x86-specific portion of xen_set_memory to arch_xen_set_memory.
> Also, move handle_vmport_ioreq to arch_handle_ioreq.
> 
> 2. Pure code movement: move common functions to hw/xen/xen-hvm-common.c
> Extract common functionalities from hw/i386/xen/xen-hvm.c and move them to
> hw/xen/xen-hvm-common.c. These common functions are useful for creating
> an IOREQ server.
> 
> xen_hvm_init_pc() contains the architecture independent code for creating
> and mapping an IOREQ server, connecting memory and IO listeners,
> initializing a xen bus and registering backends. Moved this common xen
> code to a new function xen_register_ioreq() which can be used by both
> x86 and ARM machines.
> 
> The following functions are moved to hw/xen/xen-hvm-common.c:
> xen_vcpu_eport(), xen_vcpu_ioreq(), xen_ram_alloc(), xen_set_memory(),
> xen_region_add(), xen_region_del(), xen_io_add(), xen_io_del(),
> xen_device_realize(), xen_device_unrealize(),
> cpu_get_ioreq_from_shared_memory(), cpu_get_ioreq(), do_inp(),
> do_outp(), rw_phys_req_item(), read_phys_req_item(),
> write_phys_req_item(), cpu_ioreq_pio(), cpu_ioreq_move(),
> cpu_ioreq_config(), handle_ioreq(), handle_buffered_iopage(),
> handle_buffered_io(), cpu_handle_ioreq(), xen_main_loop_prepare(),
> xen_hvm_change_state_handler(), xen_exit_notifier(),
> xen_map_ioreq_server(), destroy_hvm_domain() and
> xen_shutdown_fatal_error()
> 
> 3. Removed the static qualifier from the functions below:
> 1. xen_region_add()
> 2. xen_region_del()
> 3. xen_io_add()
> 4. xen_io_del()
> 5. xen_device_realize()
> 6. xen_device_unrealize()
> 7. xen_hvm_change_state_handler()
> 8. cpu_ioreq_pio()
> 9. xen_exit_notifier()
> 
> 4. Replace TARGET_PAGE_SIZE with XC_PAGE_SIZE to match the page size with Xen.
> 
> Signed-off-by: Vikram Garhwal 
> Signed-off-by: Stefano Stabellini 

Reviewed-by: Stefano Stabellini 


> ---
>  hw/i386/xen/trace-events|   14 -
>  hw/i386/xen/xen-hvm.c   | 1019 ++-
>  hw/xen/meson.build  |5 +-
>  hw/xen/trace-events |   14 +
>  hw/xen/xen-hvm-common.c |  874 ++
>  include/hw/i386/xen_arch_hvm.h  |   11 +
>  include/hw/xen/arch_hvm.h   |3 +
>  include/hw/xen/xen-hvm-common.h |   98 +++
>  8 files changed, 1067 insertions(+), 971 deletions(-)
>  create mode 100644 hw/xen/xen-hvm-common.c
>  create mode 100644 include/hw/i386/xen_arch_hvm.h
>  create mode 100644 include/hw/xen/arch_hvm.h
>  create mode 100644 include/hw/xen/xen-hvm-common.h
> 
> diff --git a/hw/i386/xen/trace-events b/hw/i386/xen/trace-events
> index a0c89d91c4..5d0a8d6dcf 100644
> --- a/hw/i386/xen/trace-events
> +++ b/hw/i386/xen/trace-events
> @@ -7,17 +7,3 @@ xen_platform_log(char *s) "xen platform: %s"
>  xen_pv_mmio_read(uint64_t addr) "WARNING: read from Xen PV Device MMIO space 
> (address 0x%"PRIx64")"
>  xen_pv_mmio_write(uint64_t addr) "WARNING: write to Xen PV Device MMIO space 
> (address 0x%"PRIx64")"
>  
> -# xen-hvm.c
> -xen_ram_alloc(unsigned long ram_addr, unsigned long size) "requested: 0x%lx, 
> size 0x%lx"
> -xen_client_set_memory(uint64_t start_addr, unsigned long size, bool 
> log_dirty) "0x%"PRIx64" size 0x%lx, log_dirty %i"
> -handle_ioreq(void *req, uint32_t type, uint32_t dir, uint32_t df, uint32_t 
> data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size) 
> "I/O=%p type=%d dir=%d df=%d ptr=%d port=0x%"PRIx64" data=0x%"PRIx64" 
> count=%d size=%d"
> -handle_ioreq_read(void *req, uint32_t type, uint32_t df, uint32_t 
> data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size) 
> "I/O=%p read type=%d df=%d ptr=%d port=0x%"PRIx64" data=0x%"PRIx64" count=%d 
> size=%d"
> -handle_ioreq_write(void *req, uint32_t type, uint32_t df, uint32_t 
> data_is_ptr, uint64_t addr, uint64_t data, uint32_t count, uint32_t size) 
> "I/O=%p write type=%d df=%d ptr=%d port=0x%"PRIx64" data=0x%"PRIx64" count=%d 
> size=%d"
> -cpu_ioreq_pio(void *req, uint32_t dir, uint32_t df, uint32_t data_is_ptr, 
> uint64_t addr, uint64_t data, uint32_t count, uint32_t size) "I/O=%p pio 
> dir=%d df=%d ptr=%d port=0x%"PRIx64" data=0x%"PRIx64" count=%d size=%d"
> -cpu_ioreq_pio_read_reg(void *req, uint64_t data, uint64_t addr, uint32_t 
> size) "I/O=%p pio read reg data=0x%"PRIx64" port=0x%"PRIx64" size=%d"
> -cpu_ioreq_pio_write_reg(void *req, uint64_t data, uint64_t addr, uint32_t 
> size) "I/O=%p pio write reg data=0x%"PRIx64" port=0x%"PRIx64" size=%d"
> -cpu_ioreq_move(void *req, uint32_t dir, uint32_t df, uint32_t data_is_ptr, 
> uint64_t addr, uint64_t

Re: [QEMU][PATCH v5 09/10] hw/arm: introduce xenpvh machine

2023-01-31 Thread Stefano Stabellini

On Tue, 31 Jan 2023, Vikram Garhwal wrote:
> Add a new machine xenpvh which creates an IOREQ server to register/connect with
> Xen Hypervisor.
> 
> Optional: When CONFIG_TPM is enabled, it also creates a tpm-tis-device, adds a
> TPM emulator and connects to swtpm running on host machine via chardev socket
> to support TPM functionalities for a guest domain.
> 
> Extra command line for aarch64 xenpvh QEMU to connect to swtpm:
> -chardev socket,id=chrtpm,path=/tmp/vtpm2/swtpm-sock \
> -tpmdev emulator,id=tpm0,chardev=chrtpm \
> -machine tpm-base-addr=0x0c00 \
> 
> swtpm implements a TPM software emulator (TPM 1.2 & TPM 2) built on libtpms and
> provides access to TPM functionality over socket, chardev and CUSE interfaces.
> Github repo: https://github.com/stefanberger/swtpm
> Example for starting swtpm on host machine:
> mkdir /tmp/vtpm2
> swtpm socket --tpmstate dir=/tmp/vtpm2 \
> --ctrl type=unixio,path=/tmp/vtpm2/swtpm-sock &
> 
> Signed-off-by: Vikram Garhwal 
> Signed-off-by: Stefano Stabellini 

Reviewed-by: Stefano Stabellini 


> ---
>  docs/system/arm/xenpvh.rst|  34 +++
>  docs/system/target-arm.rst|   1 +
>  hw/arm/meson.build|   2 +
>  hw/arm/xen_arm.c  | 182 ++
>  include/hw/arm/xen_arch_hvm.h |   9 ++
>  include/hw/xen/arch_hvm.h |   2 +
>  6 files changed, 230 insertions(+)
>  create mode 100644 docs/system/arm/xenpvh.rst
>  create mode 100644 hw/arm/xen_arm.c
>  create mode 100644 include/hw/arm/xen_arch_hvm.h
> 
> diff --git a/docs/system/arm/xenpvh.rst b/docs/system/arm/xenpvh.rst
> new file mode 100644
> index 00..e1655c7ab8
> --- /dev/null
> +++ b/docs/system/arm/xenpvh.rst
> @@ -0,0 +1,34 @@
> +XENPVH (``xenpvh``)
> +=
> +This machine creates a IOREQ server to register/connect with Xen Hypervisor.
> +
> +When TPM is enabled, this machine also creates a tpm-tis-device at a user 
> input
> +tpm base address, adds a TPM emulator and connects to a swtpm application
> +running on host machine via chardev socket. This enables xenpvh to support 
> TPM
> +functionalities for a guest domain.
> +
> +More information about TPM use and installing swtpm linux application can be
> +found at: docs/specs/tpm.rst.
> +
> +Example for starting swtpm on host machine:
> +.. code-block:: console
> +
> +mkdir /tmp/vtpm2
> +swtpm socket --tpmstate dir=/tmp/vtpm2 \
> +--ctrl type=unixio,path=/tmp/vtpm2/swtpm-sock &
> +
> +Sample QEMU xenpvh commands for running and connecting with Xen:
> +.. code-block:: console
> +
> +qemu-system-aarch64 -xen-domid 1 \
> +-chardev socket,id=libxl-cmd,path=qmp-libxl-1,server=on,wait=off \
> +-mon chardev=libxl-cmd,mode=control \
> +-chardev 
> socket,id=libxenstat-cmd,path=qmp-libxenstat-1,server=on,wait=off \
> +-mon chardev=libxenstat-cmd,mode=control \
> +-xen-attach -name guest0 -vnc none -display none -nographic \
> +-machine xenpvh -m 1301 \
> +-chardev socket,id=chrtpm,path=/tmp/vtpm2/swtpm-sock \
> +-tpmdev emulator,id=tpm0,chardev=chrtpm -machine tpm-base-addr=0x0C00
> +
> +In above QEMU command, last two lines are for connecting xenpvh QEMU to swtpm
> +via chardev socket.
> diff --git a/docs/system/target-arm.rst b/docs/system/target-arm.rst
> index 91ebc26c6d..af8d7c77d6 100644
> --- a/docs/system/target-arm.rst
> +++ b/docs/system/target-arm.rst
> @@ -106,6 +106,7 @@ undocumented; you can get a complete list by running
> arm/stm32
> arm/virt
> arm/xlnx-versal-virt
> +   arm/xenpvh
>  
>  Emulated CPU architecture support
>  =
> diff --git a/hw/arm/meson.build b/hw/arm/meson.build
> index b036045603..06bddbfbb8 100644
> --- a/hw/arm/meson.build
> +++ b/hw/arm/meson.build
> @@ -61,6 +61,8 @@ arm_ss.add(when: 'CONFIG_FSL_IMX7', if_true: 
> files('fsl-imx7.c', 'mcimx7d-sabre.
>  arm_ss.add(when: 'CONFIG_ARM_SMMUV3', if_true: files('smmuv3.c'))
>  arm_ss.add(when: 'CONFIG_FSL_IMX6UL', if_true: files('fsl-imx6ul.c', 
> 'mcimx6ul-evk.c'))
>  arm_ss.add(when: 'CONFIG_NRF51_SOC', if_true: files('nrf51_soc.c'))
> +arm_ss.add(when: 'CONFIG_XEN', if_true: files('xen_arm.c'))
> +arm_ss.add_all(xen_ss)
>  
>  softmmu_ss.add(when: 'CONFIG_ARM_SMMUV3', if_true: files('smmu-common.c'))
>  softmmu_ss.add(when: 'CONFIG_EXYNOS4', if_true: files('exynos4_boards.c'))
> diff --git a/hw/arm/xen_arm.c b/hw/arm/xen_arm.c
> new file mode 100644
> index 00..eaca65af37
> --- /dev/null
> +++ b/hw/arm/xen_arm.c
> @@ -0,0 +1,182 @@
> +/*
> + * QEMU ARM Xen PVH Machine
> + *
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to

Re: [PULL 16/22] tcg/aarch64: Reorg goto_tb implementation

2023-01-31 Thread Zenghui Yu via


On 2023/1/18 7:10, Richard Henderson wrote:

+void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
+  uintptr_t jmp_rx, uintptr_t jmp_rw)
+{
+uintptr_t d_addr = tb->jmp_target_addr[n];
+ptrdiff_t d_offset = d_addr - jmp_rx;
+tcg_insn_unit insn;
+
+/* Either directly branch, or indirect branch load. */
+if (d_offset == sextract64(d_offset, 0, 28)) {
+insn = deposit32(I3206_B, 0, 26, d_offset >> 2);
+} else {
+uintptr_t i_addr = (uintptr_t)&tb->jmp_target_addr[n];
+ptrdiff_t i_offset = i_addr - jmp_rx;
+
+/* Note that we asserted this in range in tcg_out_goto_tb. */
+insn = deposit32(I3305_LDR | TCG_REG_TMP, 0, 5, i_offset >> 2);


'offset' should be bits [23:5] of LDR instruction, rather than [4:0].
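
To illustrate the point about the field position, here is a small standalone sketch. It assumes the A64 LDR (literal, 64-bit) layout: opcode 0x58000000, the 19-bit word offset (imm19) in bits [23:5] and the destination register (Rt) in bits [4:0]. deposit32() below is a local reimplementation written for this example, and encode_ldr_literal() is a hypothetical helper, not the TCG backend's code.

```c
#include <stdint.h>

/* Local stand-in for a bitfield deposit: write 'length' bits at 'start'. */
uint32_t deposit32(uint32_t value, int start, int length, uint32_t field)
{
    uint32_t mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

#define LDR_LITERAL_X 0x58000000u   /* LDR Xt, <label> */

/*
 * Hypothetical helper: encode "LDR Xt, pc + byte_offset".
 * byte_offset must be a multiple of 4 and fit in 19 signed bits once
 * divided by 4; range checking is left out of this sketch.
 */
uint32_t encode_ldr_literal(unsigned rt, int32_t byte_offset)
{
    uint32_t imm19 = (uint32_t)(byte_offset / 4);

    /* Rt sits in bits [4:0]; the offset goes into bits [23:5]. */
    return deposit32(LDR_LITERAL_X | (rt & 0x1f), 5, 19, imm19);
}
```

Depositing the offset at (0, 5), as in the quoted hunk, would overwrite the register number and leave imm19 at zero; placing it at (5, 19) keeps Rt intact.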

[PATCH 1/4] cpus: Make {start,end}_exclusive() recursive

2023-01-31 Thread Ilya Leoshkevich

Currently dying to one of the core_dump_signal()s deadlocks, because
dump_core_and_abort() calls start_exclusive() two times: first via
stop_all_tasks(), and then via preexit_cleanup() ->
qemu_plugin_user_exit().

There are a number of ways to solve this: resume after dumping core;
check cpu_in_exclusive_context() in qemu_plugin_user_exit(); or make
{start,end}_exclusive() recursive. Pick the last option, since it's
the most straightforward one.

Fixes: da91c1920242 ("linux-user: Clean up when exiting due to a signal")
Signed-off-by: Ilya Leoshkevich 
---
 cpus-common.c | 12 ++--
 include/hw/core/cpu.h |  4 ++--
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 793364dc0ed..a0c52cd187f 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -192,6 +192,11 @@ void start_exclusive(void)
 CPUState *other_cpu;
 int running_cpus;
 
+if (current_cpu->exclusive_context_count) {
+current_cpu->exclusive_context_count++;
+return;
+}
+
 qemu_mutex_lock(&qemu_cpu_list_lock);
 exclusive_idle();
 
@@ -219,13 +224,16 @@ void start_exclusive(void)
  */
 qemu_mutex_unlock(&qemu_cpu_list_lock);
 
-current_cpu->in_exclusive_context = true;
+current_cpu->exclusive_context_count++;
 }
 
 /* Finish an exclusive operation.  */
 void end_exclusive(void)
 {
-current_cpu->in_exclusive_context = false;
+current_cpu->exclusive_context_count--;
+if (current_cpu->exclusive_context_count) {
+return;
+}
 
 qemu_mutex_lock(&qemu_cpu_list_lock);
 qatomic_set(&pending_cpus, 0);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 2417597236b..671f041bec6 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -349,7 +349,7 @@ struct CPUState {
 bool unplug;
 bool crash_occurred;
 bool exit_request;
-bool in_exclusive_context;
+int exclusive_context_count;
 uint32_t cflags_next_tb;
 /* updates protected by BQL */
 uint32_t interrupt_request;
@@ -758,7 +758,7 @@ void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func 
func, run_on_cpu_data
  */
 static inline bool cpu_in_exclusive_context(const CPUState *cpu)
 {
-return cpu->in_exclusive_context;
+return cpu->exclusive_context_count;
 }
 
 /**
-- 
2.39.1
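
As a rough illustration of the counting scheme used in the patch above, the following sketch reduces it to a single thread and a plain flag. The names demo_start_exclusive(), demo_end_exclusive(), exclusive_held and outer_acquisitions are made up for this example; the real code keeps the counter per CPU and takes the CPU list lock and waits for other vCPUs where the comments indicate.

```c
#include <stdbool.h>

static bool exclusive_held;     /* stands in for the global exclusive state */
static int outer_acquisitions;  /* how often the slow path actually ran */
static int exclusive_depth;     /* per-CPU counter in the real patch */

void demo_start_exclusive(void)
{
    if (exclusive_depth) {
        /* Already exclusive: just note the nesting and return. */
        exclusive_depth++;
        return;
    }
    /* The real code would lock, wait for other CPUs, then unlock here. */
    exclusive_held = true;
    outer_acquisitions++;
    exclusive_depth++;
}

void demo_end_exclusive(void)
{
    exclusive_depth--;
    if (exclusive_depth) {
        /* Still inside an outer exclusive section. */
        return;
    }
    exclusive_held = false;
}

int demo_depth(void) { return exclusive_depth; }
bool demo_held(void) { return exclusive_held; }
int demo_outer_acquisitions(void) { return outer_acquisitions; }
```

A nested start/start/end/end sequence takes the slow path once and releases the exclusive state only on the final end call, which is the property that avoids the double start_exclusive() deadlock described in the commit message.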

[PATCH 3/4] linux-user/sparc: Handle "ta 5"

2023-01-31 Thread Ilya Leoshkevich

GCC lowers __builtin_trap() to "ta 5", which in turn generates trap
0x105. Follow what kernel's bad_trap() is doing there.

Signed-off-by: Ilya Leoshkevich 
---
 linux-user/sparc/cpu_loop.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/linux-user/sparc/cpu_loop.c b/linux-user/sparc/cpu_loop.c
index 434c90a55f8..fa36d452a51 100644
--- a/linux-user/sparc/cpu_loop.c
+++ b/linux-user/sparc/cpu_loop.c
@@ -225,6 +225,9 @@ void cpu_loop (CPUSPARCState *env)
 restore_window(env);
 break;
 #ifndef TARGET_ABI32
+case 0x105:
+force_sig_fault(TARGET_SIGILL, ILL_ILLTRP, env->pc);
+break;
 case 0x16e:
 flush_windows(env);
 sparc64_get_context(env);
-- 
2.39.1

[PATCH 2/4] linux-user/microblaze: Handle privileged exception

2023-01-31 Thread Ilya Leoshkevich

Follow what kernel's full_exception() is doing.

Signed-off-by: Ilya Leoshkevich 
---
 linux-user/microblaze/cpu_loop.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/linux-user/microblaze/cpu_loop.c b/linux-user/microblaze/cpu_loop.c
index 5ccf9e942ea..212e62d0a62 100644
--- a/linux-user/microblaze/cpu_loop.c
+++ b/linux-user/microblaze/cpu_loop.c
@@ -25,8 +25,8 @@
 
 void cpu_loop(CPUMBState *env)
 {
+int trapnr, ret, si_code, sig;
 CPUState *cs = env_cpu(env);
-int trapnr, ret, si_code;
 
 while (1) {
 cpu_exec_start(cs);
@@ -76,6 +76,7 @@ void cpu_loop(CPUMBState *env)
 env->iflags &= ~(IMM_FLAG | D_FLAG);
 switch (env->esr & 31) {
 case ESR_EC_DIVZERO:
+sig = TARGET_SIGFPE;
 si_code = TARGET_FPE_INTDIV;
 break;
 case ESR_EC_FPU:
@@ -84,6 +85,7 @@ void cpu_loop(CPUMBState *env)
  * if there's no recognized bit set.  Possibly this
  * implies that si_code is 0, but follow the structure.
  */
+sig = TARGET_SIGFPE;
 si_code = env->fsr;
 if (si_code & FSR_IO) {
 si_code = TARGET_FPE_FLTINV;
@@ -97,13 +99,17 @@ void cpu_loop(CPUMBState *env)
 si_code = TARGET_FPE_FLTRES;
 }
 break;
+case ESR_EC_PRIVINSN:
+sig = SIGILL;
+si_code = ILL_PRVOPC;
+break;
 default:
 fprintf(stderr, "Unhandled hw-exception: 0x%x\n",
 env->esr & ESR_EC_MASK);
 cpu_dump_state(cs, stderr, 0);
 exit(EXIT_FAILURE);
 }
-force_sig_fault(TARGET_SIGFPE, si_code, env->pc);
+force_sig_fault(sig, si_code, env->pc);
 break;
 
 case EXCP_DEBUG:
-- 
2.39.1

[PATCH 0/4] Fix deadlock when dying because of a signal

2023-01-31 Thread Ilya Leoshkevich

Hi,

wasmtime testsuite found a deadlock in qemu_plugin_user_exit().
I tracked it down to one of my earlier patches, which introduced
cleanup in dump_core_and_abort().

Patch 1 fixes the issue, patches 2 and 3 fix __builtin_trap()
handling in microblaze and sparc - which is needed for patch 4,
that adds a test.

Just before sending this, I noticed that a solution has already been
proposed in [1], but apparently it wasn't accepted.

Best regards,
Ilya

[1] https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg03506.html

Ilya Leoshkevich (4):
  cpus: Make {start,end}_exclusive() recursive
  linux-user/microblaze: Handle privileged exception
  linux-user/sparc: Handle "ta 5"
  tests/tcg/linux-test: Add linux-fork-trap test

 cpus-common.c   | 12 +-
 include/hw/core/cpu.h   |  4 +-
 linux-user/microblaze/cpu_loop.c| 10 -
 linux-user/sparc/cpu_loop.c |  3 ++
 tests/tcg/multiarch/linux/linux-fork-trap.c | 48 +
 5 files changed, 71 insertions(+), 6 deletions(-)
 create mode 100644 tests/tcg/multiarch/linux/linux-fork-trap.c

-- 
2.39.1

[PATCH 4/4] tests/tcg/linux-test: Add linux-fork-trap test

2023-01-31 Thread Ilya Leoshkevich

Check that dying due to a signal does not deadlock.

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/multiarch/linux/linux-fork-trap.c | 48 +
 1 file changed, 48 insertions(+)
 create mode 100644 tests/tcg/multiarch/linux/linux-fork-trap.c

diff --git a/tests/tcg/multiarch/linux/linux-fork-trap.c 
b/tests/tcg/multiarch/linux/linux-fork-trap.c
new file mode 100644
index 000..a921f875380
--- /dev/null
+++ b/tests/tcg/multiarch/linux/linux-fork-trap.c
@@ -0,0 +1,48 @@
+/*
+ * Test that a fork()ed process terminates after __builtin_trap().
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <assert.h>
+#include <stdlib.h>
+#include <sys/resource.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+int main(void)
+{
+struct rlimit nodump;
+pid_t err, pid;
+int wstatus;
+
+pid = fork();
+assert(pid != -1);
+if (pid == 0) {
+/* We are about to crash on purpose; disable core dumps. */
+if (getrlimit(RLIMIT_CORE, &nodump)) {
+return EXIT_FAILURE;
+}
+nodump.rlim_cur = 0;
+if (setrlimit(RLIMIT_CORE, &nodump)) {
+return EXIT_FAILURE;
+}
+/*
+ * An alternative would be to dereference a NULL pointer, but that
+ * would be an UB in C.
+ */
+#if defined(__MICROBLAZE__)
+/*
+ * gcc emits "bri 0", which is an endless loop.
+ * Take glibc's ABORT_INSTRUCTION.
+ */
+asm volatile("brki r0,-1");
+#else
+__builtin_trap();
+#endif
+}
+err = waitpid(pid, &wstatus, 0);
+assert(err == pid);
+assert(WIFSIGNALED(wstatus));
+
+return EXIT_SUCCESS;
+}
-- 
2.39.1

Re: [PATCH v9 1/3] hw/riscv: clear kernel_entry higher bits from load_elf_ram_sym()

2023-01-31 Thread Alistair Francis

On Fri, Jan 20, 2023 at 7:38 AM Daniel Henrique Barboza
 wrote:
>
> load_elf_ram_sym() will sign-extend 32 bit addresses. If a 32 bit
> QEMU guest happens to be running in a hypervisor that uses 64
> bits to encode its addresses, kernel_entry can be padded with '1's
> and create problems [1].
>
> Use a translate_fn() callback to be called by load_elf_ram_sym() and
> return only the 32 bits address if we're running a 32 bit CPU.
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02281.html
>
> Suggested-by: Philippe Mathieu-Daudé 
> Suggested-by: Bin Meng 
> Reviewed-by: Philippe Mathieu-Daudé 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  hw/riscv/boot.c| 20 +++-
>  hw/riscv/microchip_pfsoc.c |  3 ++-
>  hw/riscv/opentitan.c   |  3 ++-
>  hw/riscv/sifive_e.c|  3 ++-
>  hw/riscv/sifive_u.c|  3 ++-
>  hw/riscv/spike.c   |  3 ++-
>  hw/riscv/virt.c|  3 ++-
>  include/hw/riscv/boot.h|  1 +
>  8 files changed, 32 insertions(+), 7 deletions(-)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index 2594276223..46fc7adccf 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -173,7 +173,24 @@ target_ulong riscv_load_firmware(const char 
> *firmware_filename,
>  exit(1);
>  }
>
> +static uint64_t translate_kernel_address(void *opaque, uint64_t addr)
> +{
> +RISCVHartArrayState *harts = opaque;
> +
> +if (riscv_is_32bit(harts)) {
> +/*
> + * For 32 bit CPUs, kernel_load_base is sign-extended
> + * (i.e. it can be padded with '1's) by load_elf().
> + * Remove the sign extension by truncating to 32-bit.
> + */
> +return extract64(addr, 0, 32);
> +}
> +
> +return addr;

So after all that, this doesn't actually mask pentry from
load_elf_ram_sym(), so it doesn't help.

> +}
> +
>  target_ulong riscv_load_kernel(MachineState *machine,
> +   RISCVHartArrayState *harts,
> target_ulong kernel_start_addr,
> symbol_fn_t sym_cb)
>  {
> @@ -189,7 +206,8 @@ target_ulong riscv_load_kernel(MachineState *machine,
>   * the (expected) load address load address. This allows kernels to have
>   * separate SBI and ELF entry points (used by FreeBSD, for example).
>   */
> -if (load_elf_ram_sym(kernel_filename, NULL, NULL, NULL,
> +if (load_elf_ram_sym(kernel_filename, NULL,
> + translate_kernel_address, harts,
>   NULL, &kernel_load_base, NULL, NULL, 0,
>   EM_RISCV, 1, 0, NULL, true, sym_cb) > 0) {

I think we just need to add the mask here

Alistair
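
A small sketch of the sign-extension problem and the masking being discussed, for illustration only. extract64() here is a local reimplementation, and sign_extend_32() and clear_high_bits() are hypothetical names; the is_32bit flag stands in for whatever check the board code would use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in: extract 'length' bits of 'value' starting at 'start'. */
uint64_t extract64(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}

/* What happens to a 32-bit entry point when it is sign-extended to 64 bits. */
uint64_t sign_extend_32(uint32_t addr)
{
    return (uint64_t)(int64_t)(int32_t)addr;
}

/* Mask the value returned by the loader, not only the translated addresses. */
uint64_t clear_high_bits(uint64_t entry, bool is_32bit)
{
    return is_32bit ? extract64(entry, 0, 32) : entry;
}
```

For an entry point of 0x80000000, sign extension yields 0xffffffff80000000; masking to the low 32 bits restores 0x80000000, while a 64-bit CPU keeps the value unchanged.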

>  return kernel_load_base;
> diff --git a/hw/riscv/microchip_pfsoc.c b/hw/riscv/microchip_pfsoc.c
> index 82ae5e7023..bdefeb3cbb 100644
> --- a/hw/riscv/microchip_pfsoc.c
> +++ b/hw/riscv/microchip_pfsoc.c
> @@ -629,7 +629,8 @@ static void 
> microchip_icicle_kit_machine_init(MachineState *machine)
>  kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc.u_cpus,
>   firmware_end_addr);
>
> -kernel_entry = riscv_load_kernel(machine, kernel_start_addr, NULL);
> +kernel_entry = riscv_load_kernel(machine, &s->soc.u_cpus,
> + kernel_start_addr, NULL);
>
>  if (machine->initrd_filename) {
>  riscv_load_initrd(machine, kernel_entry);
> diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
> index 64d5d435b9..2731138c41 100644
> --- a/hw/riscv/opentitan.c
> +++ b/hw/riscv/opentitan.c
> @@ -101,7 +101,8 @@ static void opentitan_board_init(MachineState *machine)
>  }
>
>  if (machine->kernel_filename) {
> -riscv_load_kernel(machine, memmap[IBEX_DEV_RAM].base, NULL);
> +riscv_load_kernel(machine, &s->soc.cpus,
> +  memmap[IBEX_DEV_RAM].base, NULL);
>  }
>  }
>
> diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
> index 3e3f4b0088..1a7d381514 100644
> --- a/hw/riscv/sifive_e.c
> +++ b/hw/riscv/sifive_e.c
> @@ -114,7 +114,8 @@ static void sifive_e_machine_init(MachineState *machine)
>memmap[SIFIVE_E_DEV_MROM].base, 
> &address_space_memory);
>
>  if (machine->kernel_filename) {
> -riscv_load_kernel(machine, memmap[SIFIVE_E_DEV_DTIM].base, NULL);
> +riscv_load_kernel(machine, &s->soc.cpus,
> +  memmap[SIFIVE_E_DEV_DTIM].base, NULL);
>  }
>  }
>
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 2fb6ee231f..83dfe09877 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -598,7 +598,8 @@ static void sifive_u_machine_init(MachineState *machine)
>  kernel_start_addr = riscv_calc_kernel_start_addr(&s->soc.u_cpus,
>   firmware_end_addr);
>
> -kernel_entry = riscv_load_kernel(machine, kernel_start_addr,

Re: [PATCH v2 11/20] hw/isa/lpc_ich9: Reuse memory and io address space of PCI bus

2023-01-31 Thread Bernhard Beschow




On 31 January 2023 11:53:17 UTC, Bernhard Beschow wrote:
>In pc_q35.c the PCI host bridge's io and memory space is initialized
>with get_system_memory() and get_system_io() respectively. Therefore,
>using pci_address_space() and pci_address_space_io() is equivalent.

Self-NACK: pci_address_space() != get_system_memory().

Please ignore this patch. This patch can be omitted from the series w/o any 
syntactic or semantic conflicts. I'll omit it in v3.

>All
>in all this makes the LPC function respect whatever memory spaces the
>PCI bus was set up with.
>
>Signed-off-by: Bernhard Beschow 
>---
> hw/isa/lpc_ich9.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
>diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
>index 9ab966ef88..1b7e5585b3 100644
>--- a/hw/isa/lpc_ich9.c
>+++ b/hw/isa/lpc_ich9.c
>@@ -506,10 +506,10 @@ static void ich9_lpc_rcba_update(ICH9LPCState *lpc, 
>uint32_t rcba_old)
> uint32_t rcba = pci_get_long(lpc->d.config + ICH9_LPC_RCBA);
> 
> if (rcba_old & ICH9_LPC_RCBA_EN) {
>-memory_region_del_subregion(get_system_memory(), &lpc->rcrb_mem);
>+memory_region_del_subregion(pci_address_space(&lpc->d), 
>&lpc->rcrb_mem);
> }
> if (rcba & ICH9_LPC_RCBA_EN) {
>-memory_region_add_subregion_overlap(get_system_memory(),
>+memory_region_add_subregion_overlap(pci_address_space(&lpc->d),
> rcba & ICH9_LPC_RCBA_BA_MASK,
> &lpc->rcrb_mem, 1);
> }
>@@ -695,7 +695,7 @@ static void ich9_lpc_realize(PCIDevice *d, Error **errp)
> return;
> }
> 
>-isa_bus = isa_bus_new(DEVICE(d), get_system_memory(), get_system_io(),
>+isa_bus = isa_bus_new(dev, pci_address_space(d), pci_address_space_io(d),
>   errp);
> if (!isa_bus) {
> return;

RE: [PATCH] target/hexagon/idef-parser: Remove unused code paths

2023-01-31 Thread Taylor Simpson




> -----Original Message-----
> From: Anton Johansson 
> Sent: Tuesday, January 31, 2023 4:32 PM
> To: qemu-devel@nongnu.org
> Cc: a...@rev.ng; Taylor Simpson ; Brian Cain
> ; Michael Lambert 
> Subject: [PATCH] target/hexagon/idef-parser: Remove unused code paths
> 
> Removes code paths used by COF instructions, which are no longer
> processed by idef-parser.
> 
> Signed-off-by: Anton Johansson 
> ---
>  target/hexagon/idef-parser/idef-parser.h|  1 -
>  target/hexagon/idef-parser/idef-parser.lex  | 27 +
>  target/hexagon/idef-parser/idef-parser.y| 45 +
>  target/hexagon/idef-parser/macros.inc   | 10 -
>  target/hexagon/idef-parser/parser-helpers.c |  3 --
>  5 files changed, 4 insertions(+), 82 deletions(-)
> 

Tested-by: Taylor Simpson 
Reviewed-by: Taylor Simpson

Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support

2023-01-31 Thread Jason Gunthorpe

On Tue, Jan 31, 2023 at 03:43:01PM -0700, Alex Williamson wrote:

> How does this affect our path towards supported migration?  I'm
> thinking about a user experience where QEMU supports migration if
> device A OR device B are attached, but not devices A and B attached to
> the same VM.  We might have a device C where QEMU supports migration
> with B AND C, but not A AND C, nor A AND B AND C.  This would be the
> case if device B and device C both supported P2P states, but device A
> did not. The user has no observability of this feature, so all of this
> looks effectively random to the user.

I think qemu should just log if it encounters a device without P2P
support.

> Even in the single device case, we need to make an assumption that a
> device that does not support P2P migration states (or when QEMU doesn't
> make use of P2P states) cannot be a DMA target, or otherwise have its
> MMIO space accessed while in a STOP state.  Can we guarantee that when
> other devices have not yet transitioned to STOP?

You mean the software devices created by qemu?

> We could disable the direct map MemoryRegions when we move to a STOP
> state, which would give QEMU visibility to those accesses, but besides
> pulling an abort should such an access occur, could we queue them in
> software, add them to the migration stream, and replay them after the
> device moves to the RUNNING state?  We'd need to account for the lack of
> RESUMING_P2P states as well, trapping and queuing accesses from devices
> already RUNNING to those still in RESUMING (not _P2P).

I think any internal SW devices should just fail all accesses to the
P2P space, all the time.

qemu simply acts like a real system that doesn't support P2P.

IMHO this is generally the way forward to do multi-device as well,
remove the MMIO from all the address maps: VFIO, SW access, all of
them. Nothing can touch MMIO except for the vCPU.

> This all looks complicated.  Is it better to start with requiring P2P
> state support?  Thanks,

People have built HW without it, so I don't see this as so good..

Jason

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

2023-01-31 Thread Jason A. Donenfeld

On Mon, Jan 30, 2023 at 03:19:59PM -0500, Michael S. Tsirkin wrote:
> From: "Jason A. Donenfeld" 
> 
> The setup_data links are appended to the compressed kernel image. Since
> the kernel image is typically loaded at 0x100000, setup_data lives at
> `0x100000 + compressed_size`, which does not get relocated during the
> kernel's boot process.
> 
> The kernel typically decompresses the image starting at address
> 0x1000000 (note: there's one more zero there than the compressed image
> above). This usually is fine for most kernels.
> 
> However, if the compressed image is actually quite large, then
> setup_data will live at a `0x100000 + compressed_size` that extends into
> the decompressed zone at 0x1000000. In other words, if compressed_size
> is larger than `0x1000000 - 0x100000`, then the decompression step will
> clobber setup_data, resulting in crashes.
> 
> Visually, what happens now is that QEMU appends setup_data to the kernel
> image:
> 
>           kernel image            setup_data
>    |--------------------------||----------------|
> 0x100000                  0x100000+l1     0x100000+l1+l2
> 
> The problem is that this decompresses to 0x1000000 (one more zero). So
> if l1 is > (0x1000000-0x100000), then this winds up looking like:
> 
>           kernel image            setup_data
>    |--------------------------||----------------|
> 0x100000                  0x100000+l1     0x100000+l1+l2
> 
>                  d e c o m p r e s s e d   k e r n e l
>                      |-------------------------------------------------------------|
>                 0x1000000                                                     0x1000000+l3
> 
> The decompressed kernel seemingly overwriting the compressed kernel
> image isn't a problem, because that gets relocated to a higher address
> early on in the boot process, at the end of startup_64. setup_data,
> however, stays in the same place, since those links are self referential
> and nothing fixes them up.  So the decompressed kernel clobbers it.
> 
> Fix this by appending setup_data to the cmdline blob rather than the
> kernel image blob, which remains at a lower address that won't get
> clobbered.
> 
> This could have been done by overwriting the initrd blob instead, but
> that poses big difficulties, such as no longer being able to use memory
> mapped files for initrd, hurting performance, and, more importantly, the
> initrd address calculation is hard coded in qboot, and it always grows
> down rather than up, which means lots of brittle semantics would have to
> be changed around, incurring more complexity. In contrast, using cmdline
> is simple and doesn't interfere with anything.
> 
> The microvm machine has a gross hack where it fiddles with fw_cfg data
> after the fact. So this hack is updated to account for this appending,
> by reserving some bytes.
> 
> Fixup-by: Michael S. Tsirkin 
> Cc: x...@kernel.org
> Cc: Philippe Mathieu-Daudé 
> Cc: H. Peter Anvin 
> Cc: Borislav Petkov 
> Cc: Eric Biggers 
> Signed-off-by: Jason A. Donenfeld 
> Message-Id: <20221230220725.618763-1-ja...@zx2c4.com>
> Message-ID: <20230128061015-mutt-send-email-...@kernel.org>
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> Tested-by: Eric Biggers 
> Tested-by: Mathias Krause 
> ---
>  include/hw/i386/microvm.h |  5 ++--
>  include/hw/nvram/fw_cfg.h |  9 +++
>  hw/i386/microvm.c | 15 +++
>  hw/i386/x86.c | 52 +--
>  hw/nvram/fw_cfg.c |  9 +++
>  5 files changed, 59 insertions(+), 31 deletions(-)

Cc: qemu-sta...@nongnu.org
Fixes: 67f7e426 ("hw/i386: pass RNG seed via setup_data entry")

Re: [RFC v3 16/18] vfio/iommufd: Implement the iommufd backend

2023-01-31 Thread Jason Gunthorpe

On Tue, Jan 31, 2023 at 09:53:03PM +0100, Eric Auger wrote:
> From: Yi Liu 
> 
> Add the iommufd backend. The IOMMUFD container class is implemented
> based on the new /dev/iommu user API. This backend obviously depends
> on CONFIG_IOMMUFD.
> 
> So far, the iommufd backend doesn't support live migration and
> cache coherency yet due to missing support in the host kernel meaning
> that only a subset of the container class callbacks is implemented.

What is missing for cache coherency? I spent lots of time on that
already, I thought I got everything..

Jason

Re: [PATCH v3 0/9] virtio-gpu: Support Venus Vulkan driver

2023-01-31 Thread Dmitry Osipenko

Hello,

On 1/30/23 20:00, Alex Bennée wrote:
> 
> Antonio Caggiano  writes:
> 
>> This series of patches enables support for the Venus VirtIO-GPU Vulkan
>> driver by adding some features required by the driver:
>>
>> - CONTEXT_INIT
>> - HOSTMEM
>> - RESOURCE_UUID
>> - BLOB_RESOURCES
>>
>> In addition to these features, Venus capset support was required
>> together with the implementation for Virgl blob resource commands.
> 
> I managed to apply to current master but I needed a bunch of patches to
> get it to compile with my old virgl:

Thank you for reviewing and testing the patches! Antonio isn't working
on Venus anymore, so I'm going to continue this effort. Last year we
stabilized some of the virglrenderer Venus APIs; this year Venus may
transition to supporting per-context fences only and may require
initializing a renderserver, which will result in more changes to QEMU.
I'm going to wait a bit for Venus to settle down and then make a v4.

In the end we will either need to add more #ifdefs if we want to keep
supporting older virglrenderer versions in QEMU, or bump the minimum
required virglrenderer version.

-- 
Best regards,
Dmitry

[PATCH v5 04/14] Hexagon (target/hexagon) Add overrides for dealloc-return instructions

2023-01-31 Thread Taylor Simpson

These instructions perform a deallocframe+return (jumpr r31)

Add overrides for
L4_return
SL2_return
L4_return_t
L4_return_f
L4_return_tnew_pt
L4_return_fnew_pt
L4_return_tnew_pnt
L4_return_fnew_pnt
SL2_return_t
SL2_return_f
SL2_return_tnew
SL2_return_fnew

This patch eliminates the last helper that uses write_new_pc, so we
remove it from op_helper.c

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h   | 54 
 target/hexagon/genptr.c| 86 ++
 target/hexagon/op_helper.c | 26 +---
 3 files changed, 141 insertions(+), 25 deletions(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 6267f51ccc..8282ff3fc5 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -508,6 +508,60 @@
 #define fGEN_TCG_S2_storerinew_pcr(SHORTCODE) \
 fGEN_TCG_STORE_pcr(2, fSTORE(1, 4, EA, NtN))
 
+/*
+ * dealloc_return
+ * Assembler mapped to
+ * r31:30 = dealloc_return(r30):raw
+ */
+#define fGEN_TCG_L4_return(SHORTCODE) \
+gen_return(ctx, RddV, RsV)
+
+/*
+ * sub-instruction version (no RddV, so handle it manually)
+ */
+#define fGEN_TCG_SL2_return(SHORTCODE) \
+do { \
+TCGv_i64 RddV = tcg_temp_new_i64(); \
+gen_return(ctx, RddV, hex_gpr[HEX_REG_FP]); \
+gen_log_reg_write_pair(HEX_REG_FP, RddV); \
+tcg_temp_free_i64(RddV); \
+} while (0)
+
+/*
+ * Conditional returns follow this naming convention
+ * _t predicate true
+ * _f predicate false
+ * _tnew_pt   predicate.new true predict taken
+ * _fnew_pt   predicate.new false predict taken
+ * _tnew_pnt  predicate.new true predict not taken
+ * _fnew_pnt  predicate.new false predict not taken
+ * Predictions are not modelled in QEMU
+ *
+ * Example:
+ * if (p1) r31:30 = dealloc_return(r30):raw
+ */
+#define fGEN_TCG_L4_return_t(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvV, TCG_COND_EQ);
+#define fGEN_TCG_L4_return_f(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvV, TCG_COND_NE)
+#define fGEN_TCG_L4_return_tnew_pt(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_EQ)
+#define fGEN_TCG_L4_return_fnew_pt(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_NE)
+#define fGEN_TCG_L4_return_tnew_pnt(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_EQ)
+#define fGEN_TCG_L4_return_fnew_pnt(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_NE)
+
+#define fGEN_TCG_SL2_return_t(SHORTCODE) \
+gen_cond_return_subinsn(ctx, TCG_COND_EQ, hex_pred[0])
+#define fGEN_TCG_SL2_return_f(SHORTCODE) \
+gen_cond_return_subinsn(ctx, TCG_COND_NE, hex_pred[0])
+#define fGEN_TCG_SL2_return_tnew(SHORTCODE) \
+gen_cond_return_subinsn(ctx, TCG_COND_EQ, hex_new_pred_value[0])
+#define fGEN_TCG_SL2_return_fnew(SHORTCODE) \
+gen_cond_return_subinsn(ctx, TCG_COND_NE, hex_new_pred_value[0])
+
 /*
  * Mathematical operations with more than one definition require
  * special handling
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index e17ac93a59..efd36f760f 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -746,6 +746,92 @@ static void gen_cond_callr(DisasContext *ctx,
 gen_set_label(skip);
 }
 
+/* frame ^= (int64_t)FRAMEKEY << 32 */
+static void gen_frame_unscramble(TCGv_i64 frame)
+{
+TCGv_i64 framekey = tcg_temp_new_i64();
+tcg_gen_extu_i32_i64(framekey, hex_gpr[HEX_REG_FRAMEKEY]);
+tcg_gen_shli_i64(framekey, framekey, 32);
+tcg_gen_xor_i64(frame, frame, framekey);
+tcg_temp_free_i64(framekey);
+}
+
+static void gen_load_frame(DisasContext *ctx, TCGv_i64 frame, TCGv EA)
+{
+Insn *insn = ctx->insn;  /* Needed for CHECK_NOSHUF */
+CHECK_NOSHUF(EA, 8);
+tcg_gen_qemu_ld64(frame, EA, ctx->mem_idx);
+}
+
+static void gen_return_base(DisasContext *ctx, TCGv_i64 dst, TCGv src,
+TCGv r29)
+{
+/*
+ * frame = *src
+ * dst = frame_unscramble(frame)
+ * SP = src + 8
+ * PC = dst.w[1]
+ */
+TCGv_i64 frame = tcg_temp_new_i64();
+TCGv r31 = tcg_temp_new();
+
+gen_load_frame(ctx, frame, src);
+gen_frame_unscramble(frame);
+tcg_gen_mov_i64(dst, frame);
+tcg_gen_addi_tl(r29, src, 8);
+tcg_gen_extrh_i64_i32(r31, dst);
+gen_jumpr(ctx, r31);
+
+tcg_temp_free_i64(frame);
+tcg_temp_free(r31);
+}
+
+static void gen_return(DisasContext *ctx, TCGv_i64 dst, TCGv src)
+{
+TCGv r29 = tcg_temp_new();
+gen_return_base(ctx, dst, src, r29);
+gen_log_reg_write(HEX_REG_SP, r29);
+tcg_temp_free(r29);
+}
+
+/* if (pred) dst = dealloc_return(src):raw */
+static void gen_cond_return(DisasContext *ctx, TCGv_i64 dst, TCGv src,
+TCGv pred, TCGCond cond)
+{
+TCGv LSB = tcg_temp_new();
+TCGv mask = tcg_temp_new();
+TCGv r29 = tcg_temp_local_new();
+TCGLabel

[PATCH v5 07/14] Hexagon (target/hexagon) Analyze packet for HVX

2023-01-31 Thread Taylor Simpson

Extend the analyze_ functions for HVX vector and predicate writes
Remove calls to ctx_log_vreg_write[_pair] from gen_tcg_funcs.py
During gen_start_packet, reload the predicated HVX registers into
future_VRegs and tmp_VRegs

Signed-off-by: Taylor Simpson 
---
 target/hexagon/translate.h  | 14 --
 target/hexagon/translate.c  | 30 +
 target/hexagon/gen_analyze_funcs.py | 17 +---
 target/hexagon/gen_tcg_funcs.py | 18 -
 4 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index 34368b2186..765f2c6a22 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -54,6 +54,8 @@ typedef struct DisasContext {
 DECLARE_BITMAP(vregs_updated_tmp, NUM_VREGS);
 DECLARE_BITMAP(vregs_updated, NUM_VREGS);
 DECLARE_BITMAP(vregs_select, NUM_VREGS);
+DECLARE_BITMAP(predicated_future_vregs, NUM_VREGS);
+DECLARE_BITMAP(predicated_tmp_vregs, NUM_VREGS);
 int qreg_log[NUM_QREGS];
 bool qreg_is_predicated[NUM_QREGS];
 int qreg_log_idx;
@@ -99,12 +101,6 @@ static inline void ctx_log_reg_write_pair(DisasContext *ctx, int rnum,
 ctx_log_reg_write(ctx, rnum + 1, is_predicated);
 }
 
-static inline bool is_vreg_preloaded(DisasContext *ctx, int num)
-{
-return test_bit(num, ctx->vregs_updated) ||
-   test_bit(num, ctx->vregs_updated_tmp);
-}
-
 intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum,
  int num, bool alloc_ok);
 intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum,
@@ -120,12 +116,18 @@ static inline void ctx_log_vreg_write(DisasContext *ctx,
 ctx->vreg_log_idx++;
 
 set_bit(rnum, ctx->vregs_updated);
+if (is_predicated) {
+set_bit(rnum, ctx->predicated_future_vregs);
+}
 }
 if (type == EXT_NEW) {
 set_bit(rnum, ctx->vregs_select);
 }
 if (type == EXT_TMP) {
 set_bit(rnum, ctx->vregs_updated_tmp);
+if (is_predicated) {
+set_bit(rnum, ctx->predicated_tmp_vregs);
+}
 }
 }
 
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 8b33e6cd8f..53fd935db7 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -364,6 +364,8 @@ static void gen_start_packet(DisasContext *ctx)
 bitmap_zero(ctx->vregs_updated_tmp, NUM_VREGS);
 bitmap_zero(ctx->vregs_updated, NUM_VREGS);
 bitmap_zero(ctx->vregs_select, NUM_VREGS);
+bitmap_zero(ctx->predicated_future_vregs, NUM_VREGS);
+bitmap_zero(ctx->predicated_tmp_vregs, NUM_VREGS);
 ctx->qreg_log_idx = 0;
 for (i = 0; i < STORES_MAX; i++) {
 ctx->store_width[i] = 0;
@@ -415,6 +417,34 @@ static void gen_start_packet(DisasContext *ctx)
 }
 }
 
+/* Preload the predicated HVX registers into future_VRegs and tmp_VRegs */
+if (!bitmap_empty(ctx->predicated_future_vregs, NUM_VREGS)) {
+int i = find_first_bit(ctx->predicated_future_vregs, NUM_VREGS);
+while (i < NUM_VREGS) {
+const intptr_t VdV_off =
+ctx_future_vreg_off(ctx, i, 1, true);
+intptr_t src_off = offsetof(CPUHexagonState, VRegs[i]);
+tcg_gen_gvec_mov(MO_64, VdV_off,
+ src_off,
+ sizeof(MMVector),
+ sizeof(MMVector));
+i = find_next_bit(ctx->predicated_future_vregs, NUM_VREGS, i + 1);
+}
+}
+if (!bitmap_empty(ctx->predicated_tmp_vregs, NUM_VREGS)) {
+int i = find_first_bit(ctx->predicated_tmp_vregs, NUM_VREGS);
+while (i < NUM_VREGS) {
+const intptr_t VdV_off =
+ctx_tmp_vreg_off(ctx, i, 1, true);
+intptr_t src_off = offsetof(CPUHexagonState, VRegs[i]);
+tcg_gen_gvec_mov(MO_64, VdV_off,
+ src_off,
+ sizeof(MMVector),
+ sizeof(MMVector));
+i = find_next_bit(ctx->predicated_tmp_vregs, NUM_VREGS, i + 1);
+}
+}
+
 if (pkt->pkt_has_hvx) {
 tcg_gen_movi_tl(hex_VRegs_updated, 0);
 tcg_gen_movi_tl(hex_QRegs_updated, 0);
diff --git a/target/hexagon/gen_analyze_funcs.py b/target/hexagon/gen_analyze_funcs.py
index ff5b69978c..3a1db46ac3 100755
--- a/target/hexagon/gen_analyze_funcs.py
+++ b/target/hexagon/gen_analyze_funcs.py
@@ -83,9 +83,16 @@ def analyze_opn_old(f, tag, regtype, regid, regno):
 else:
 print("Bad register parse: ", regtype, regid)
 elif (regtype == "V"):
+newv = "EXT_DFL"
+if (hex_common.is_new_result(tag)):
+newv = "EXT_NEW"
+elif (hex_common.is_tmp_result(tag)):
+newv = "EXT_TMP"
 if (regid in {"dd", "xx"}):
-f.write("//const int %s = insn->regno[%d];\n" %\
+f.write("const int %s = insn->regno[%d];\n"

[PATCH v5 01/14] Hexagon (target/hexagon) Add overrides for jumpr31 instructions

2023-01-31 Thread Taylor Simpson

Add overrides for
SL2_jumpr31Unconditional
SL2_jumpr31_t  Predicated true (old value)
SL2_jumpr31_f  Predicated false (old value)
SL2_jumpr31_tnew   Predicated true (new value)
SL2_jumpr31_fnew   Predicated false (new value)

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h | 15 ++-
 target/hexagon/genptr.c  | 10 +-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 19697b42a5..d644e59a63 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -1015,6 +1015,19 @@
 #define fGEN_TCG_S2_asl_r_r_sat(SHORTCODE) \
 gen_asl_r_r_sat(RdV, RsV, RtV)
 
+#define fGEN_TCG_SL2_jumpr31(SHORTCODE) \
+gen_jumpr(ctx, hex_gpr[HEX_REG_LR])
+
+#define fGEN_TCG_SL2_jumpr31_t(SHORTCODE) \
+gen_cond_jumpr31(ctx, TCG_COND_EQ, hex_pred[0])
+#define fGEN_TCG_SL2_jumpr31_f(SHORTCODE) \
+gen_cond_jumpr31(ctx, TCG_COND_NE, hex_pred[0])
+
+#define fGEN_TCG_SL2_jumpr31_tnew(SHORTCODE) \
+gen_cond_jumpr31(ctx, TCG_COND_EQ, hex_new_pred_value[0])
+#define fGEN_TCG_SL2_jumpr31_fnew(SHORTCODE) \
+gen_cond_jumpr31(ctx, TCG_COND_NE, hex_new_pred_value[0])
+
 /* Floating point */
 #define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \
 gen_helper_conv_sf2df(RddV, cpu_env, RsV)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 90db99024f..23fb808e37 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -593,6 +593,14 @@ static void gen_cond_jumpr(DisasContext *ctx, TCGv dst_pc,
 gen_write_new_pc_addr(ctx, dst_pc, cond, pred);
 }
 
+static void gen_cond_jumpr31(DisasContext *ctx, TCGCond cond, TCGv pred)
+{
+TCGv LSB = tcg_temp_new();
+tcg_gen_andi_tl(LSB, pred, 1);
+gen_cond_jumpr(ctx, hex_gpr[HEX_REG_LR], cond, LSB);
+tcg_temp_free(LSB);
+}
+
 static void gen_cond_jump(DisasContext *ctx, TCGCond cond, TCGv pred,
   int pc_off)
 {
-- 
2.17.1

[PATCH v5 08/14] Hexagon (tests/tcg/hexagon) Update preg_alias.c

2023-01-31 Thread Taylor Simpson

Add control registers (c4, c5) to clobbers list
Made possible by new toolchain container

Signed-off-by: Taylor Simpson 
---
 tests/tcg/hexagon/preg_alias.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tests/tcg/hexagon/preg_alias.c b/tests/tcg/hexagon/preg_alias.c
index b44a8112b4..8798fbcaf3 100644
--- a/tests/tcg/hexagon/preg_alias.c
+++ b/tests/tcg/hexagon/preg_alias.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -65,7 +65,7 @@ static inline void creg_alias(int cval, PRegs *pregs)
   : "=r"(pregs->pregs.p0), "=r"(pregs->pregs.p1),
 "=r"(pregs->pregs.p2), "=r"(pregs->pregs.p3)
   : "r"(cval)
-  : "p0", "p1", "p2", "p3");
+  : "c4", "p0", "p1", "p2", "p3");
 }
 
 int err;
@@ -92,7 +92,7 @@ static inline void creg_alias_pair(unsigned int cval, PRegs *pregs)
: "=r"(pregs->pregs.p0), "=r"(pregs->pregs.p1),
  "=r"(pregs->pregs.p2), "=r"(pregs->pregs.p3), "=r"(c5)
: "r"(cval_pair)
-   : "p0", "p1", "p2", "p3");
+   : "c4", "c5", "p0", "p1", "p2", "p3");
 
   check(c5, 0xdeadbeef);
 }
@@ -117,7 +117,7 @@ static void test_packet(void)
  "}\n\t"
  : "+r"(result)
  : "r"(0x), "r"(0xff00), "r"(0x837ed653)
- : "p0", "p1", "p2", "p3");
+ : "c4", "p0", "p1", "p2", "p3");
 check(result, old_val);
 
 /* Test a predicated store */
@@ -129,7 +129,7 @@ static void test_packet(void)
  "}\n\t"
  :
  : "r"(0), "r"(0x), "r"()
- : "p0", "p1", "p2", "p3", "memory");
+ : "c4", "p0", "p1", "p2", "p3", "memory");
 check(result, 0x0);
 }
 
-- 
2.17.1

[PATCH v5 11/14] Hexagon (target/hexagon) Change subtract from zero to change sign

2023-01-31 Thread Taylor Simpson

The F2_sffms instruction [r0 -= sfmpy(r1, r2)] doesn't properly
handle -0.  Previously we would negate the input operand by subtracting
from zero.  Instead, we negate by changing the sign bit.

Test case added to tests/tcg/hexagon/fpstuff.c

Signed-off-by: Taylor Simpson 
---
 target/hexagon/op_helper.c  |  2 +-
 tests/tcg/hexagon/fpstuff.c | 31 ++-
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 38b8aee193..9425941c69 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -1169,7 +1169,7 @@ float32 HELPER(sffms)(CPUHexagonState *env, float32 RxV,
 {
 float32 neg_RsV;
 arch_fpop_start(env);
-neg_RsV = float32_sub(float32_zero, RsV, &env->fp_status);
+neg_RsV = float32_set_sign(RsV, float32_is_neg(RsV) ? 0 : 1);
 RxV = internal_fmafx(neg_RsV, RtV, RxV, 0, &env->fp_status);
 arch_fpop_end(env);
 return RxV;
diff --git a/tests/tcg/hexagon/fpstuff.c b/tests/tcg/hexagon/fpstuff.c
index 56bf562a40..90ce9a6ef3 100644
--- a/tests/tcg/hexagon/fpstuff.c
+++ b/tests/tcg/hexagon/fpstuff.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2020-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2020-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -40,6 +40,7 @@ const int SF_HEX_NAN =0x;
 const int SF_small_neg =  0xab98fba8;
 const int SF_denorm = 0x0001;
 const int SF_random = 0x346001d6;
+const int SF_neg_zero =   0x80000000;
 
 const long long DF_QNaN = 0x7ff8ULL;
 const long long DF_SNaN = 0x7ff7ULL;
@@ -536,6 +537,33 @@ static void check_sffixupd(void)
 check32(result, 0x146001d6);
 }
 
+static void check_sffms(void)
+{
+int result;
+
+/* Check that sffms properly deals with -0 */
+result = SF_neg_zero;
+asm ("%0 -= sfmpy(%1 , %2)\n\t"
+: "+r"(result)
+: "r"(SF_ZERO), "r"(SF_ZERO)
+: "r12", "r8");
+check32(result, SF_neg_zero);
+
+result = SF_ZERO;
+asm ("%0 -= sfmpy(%1 , %2)\n\t"
+: "+r"(result)
+: "r"(SF_neg_zero), "r"(SF_ZERO)
+: "r12", "r8");
+check32(result, SF_ZERO);
+
+result = SF_ZERO;
+asm ("%0 -= sfmpy(%1 , %2)\n\t"
+: "+r"(result)
+: "r"(SF_ZERO), "r"(SF_neg_zero)
+: "r12", "r8");
+check32(result, SF_ZERO);
+}
+
 static void check_float2int_convs()
 {
 int res32;
@@ -688,6 +716,7 @@ int main()
 check_invsqrta();
 check_sffixupn();
 check_sffixupd();
+check_sffms();
 check_float2int_convs();
 
 puts(err ? "FAIL" : "PASS");
-- 
2.17.1
