Re: [PATCH] hw/arm/musicpal: Convert to qemu_add_kbd_event_handler()
On 22.01.24 10:50, Alex Bennée wrote:
> Jan Kiszka writes:
>
>> On 19.01.24 12:24, Alex Bennée wrote:
>>> Peter Maydell writes:
>>>
>>>> Convert the musicpal key input device to use
>>>> qemu_add_kbd_event_handler(). This lets us simplify it because we no
>>>> longer need to track whether we're in the middle of a PS/2 multibyte
>>>> key sequence.
>>>
>>> Well the key input all works as intended and looks good to me. I'm a
>>> little disappointed I couldn't get audio working on the musicpal machine
>>> but that is not a problem for this patch.
>>>
>>> Tested-by: Alex Bennée
>>> Reviewed-by: Alex Bennée
>>
>> Looks good to me as well, all keys still work fine.
>>
>> No idea what's the issue with sound, though. I think I haven't run the
>> whole stuff in a decade or so; I had to search for all the pieces first.
>> The webradio service originally behind this stopped its operations, at
>> least for this device, but manually entered favourites still work on the
>> real device - I still have one, though it is starting to develop some
>> issues as well.
>
> I navigated through the favourites and after pressing some keys it seems
> to indicate there was a stream of some sort (or at least a bitrate was
> reported ;-).
>
> The main issue I was having with sound was with pipewire - this would
> eventually generate a lot of warning messages because input devices are
> created but I guess the model wasn't draining the input buffers, so
> eventually we get:
>
> qemu: 0x7f1490259500: overrun write:5859188 filled:5842804 + size:940 > max:4194304
> qemu: 0x7f14902680a0: overrun write:5860128 filled:5843744 + size:940 > max:4194304
> qemu: 0x7f1490259500: overrun write:5861068 filled:5844684 + size:940 > max:4194304
> qemu: 0x7f14902680a0: overrun write:5862008 filled:5845624 + size:940 > max:4194304

I'm getting these here:

pulseaudio: set_source_output_volume() failed
pulseaudio: Reason: Invalid argument
...
> Is your image just a hacked up version of the original firmware or
> something we have source for? I guess there was never a rockbox port for
> the device?

It's an original firmware, nothing hacked. I do have some sources here,
but only partial ones: U-Boot and the kernel, not the complete userland
and not even all kernel drivers, IIRC.

Jan
Re: [PATCH] hw/arm/musicpal: Convert to qemu_add_kbd_event_handler()
On 19.01.24 12:24, Alex Bennée wrote:
> Peter Maydell writes:
>
>> Convert the musicpal key input device to use
>> qemu_add_kbd_event_handler(). This lets us simplify it because we no
>> longer need to track whether we're in the middle of a PS/2 multibyte
>> key sequence.
>>
>> In the conversion we move the keyboard handler registration from init
>> to realize, because devices shouldn't disturb the state of the
>> simulation by doing things like registering input handlers until
>> they're realized, so that device objects can be introspected
>> safely.
>>
>> The behaviour where key-repeat is permitted for the arrow-keys only
>> is intentional (added in commit 7c6ce4baedfcd0c), so we retain it,
>> and add a comment to that effect.
>
> Well the key input all works as intended and looks good to me. I'm a
> little disappointed I couldn't get audio working on the musicpal machine
> but that is not a problem for this patch.
>
> Tested-by: Alex Bennée
> Reviewed-by: Alex Bennée

Looks good to me as well, all keys still work fine.

No idea what's the issue with sound, though. I think I haven't run the
whole stuff in a decade or so; I had to search for all the pieces first.
The webradio service originally behind this stopped its operations, at
least for this device, but manually entered favourites still work on the
real device - I still have one, though it is starting to develop some
issues as well.

Thanks,
Jan
Re: [PATCH] hw/arm: virt: Add SBSA watchdog
On 05.05.22 10:40, Peter Maydell wrote:
> On Sun, 1 May 2022 at 19:07, Jan Kiszka wrote:
>>
>> From: Jan Kiszka
>>
>> The virt machine lacks a watchdog so far, while sbsa-ref has a simple
>> model that is also supported by Linux and even U-Boot. Let's take it to
>> allow, e.g., system integration tests of A/B software updates under
>> watchdog monitoring.
>>
>> Signed-off-by: Jan Kiszka
>
> The virt board has a PCI bus, and QEMU has a model of a
> PCI watchdog device -- the i6300esb. Can you use that?

While one could do that, it implies using completely alien Intel x86
host controller bits on this platform and would thus require changes in
firmware (e.g. adding a driver to U-Boot).

> In general I much prefer it if we can support use cases in
> virt via pluggable PCI devices rather than hard-wired MMIO
> devices -- it means that only users who want the functionality
> need to have the exposed attack surface area of the device
> present in their VM.

Valid point - would making this opt-in via a machine feature be
sufficient?

Jan

--
Siemens AG, Technology
Competence Center Embedded Linux
[PATCH] hw/arm: virt: Add SBSA watchdog
From: Jan Kiszka

The virt machine lacks a watchdog so far, while sbsa-ref has a simple
model that is also supported by Linux and even U-Boot. Let's take it to
allow, e.g., system integration tests of A/B software updates under
watchdog monitoring.

Signed-off-by: Jan Kiszka
---
An alternative could be the sp805. However, QEMU has no support for
that device yet.

 hw/arm/Kconfig        |  1 +
 hw/arm/virt.c         | 35 +++
 include/hw/arm/virt.h |  3 +++
 3 files changed, 39 insertions(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 97f3b38019..8df04a3533 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -29,6 +29,7 @@ config ARM_VIRT
     select ACPI_APEI
     select ACPI_VIOT
     select VIRTIO_MEM_SUPPORTED
+    select WDT_SBSA
 
 config CHEETAH
     bool
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f94278935f..a7b85438a8 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -78,6 +78,7 @@
 #include "hw/virtio/virtio-mem-pci.h"
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
+#include "hw/watchdog/sbsa_gwdt.h"
 #include "qemu/guest-random.h"
 
 #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
@@ -154,6 +155,8 @@ static const MemMapEntry base_memmap[] = {
     [VIRT_NVDIMM_ACPI] =        { 0x0909, NVDIMM_ACPI_IO_LEN},
     [VIRT_PVTIME] =             { 0x090a, 0x0001 },
     [VIRT_SECURE_GPIO] =        { 0x090b, 0x1000 },
+    [VIRT_GWDT_CONTROL] =       { 0x090c, 0x1000 },
+    [VIRT_GWDT_REFRESH] =       { 0x090d, 0x1000 },
     [VIRT_MMIO] =               { 0x0a00, 0x0200 },
     /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
     [VIRT_PLATFORM_BUS] =       { 0x0c00, 0x0200 },
@@ -190,6 +193,7 @@ static const int a15irqmap[] = {
     [VIRT_GPIO] = 7,
     [VIRT_SECURE_UART] = 8,
     [VIRT_ACPI_GED] = 9,
+    [VIRT_GWDT_WS0] = 10,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
     [VIRT_GIC_V2M] = 48, /* ...to 48 + NUM_GICV2M_SPIS - 1 */
     [VIRT_SMMU] = 74,    /* ...to 74 + NUM_SMMU_IRQS - 1 */
@@ -902,6 +906,35 @@ static void create_rtc(const VirtMachineState *vms)
     g_free(nodename);
 }
 
+static void create_wdt(const VirtMachineState *vms)
+{
+    hwaddr cbase = vms->memmap[VIRT_GWDT_CONTROL].base;
+    hwaddr csize = vms->memmap[VIRT_GWDT_CONTROL].size;
+    hwaddr rbase = vms->memmap[VIRT_GWDT_REFRESH].base;
+    hwaddr rsize = vms->memmap[VIRT_GWDT_REFRESH].size;
+    DeviceState *dev = qdev_new(TYPE_WDT_SBSA);
+    SysBusDevice *s = SYS_BUS_DEVICE(dev);
+    int irq = vms->irqmap[VIRT_GWDT_WS0];
+    const char compat[] = "arm,sbsa-gwdt";
+    MachineState *ms = MACHINE(vms);
+    char *nodename;
+
+    sysbus_realize_and_unref(s, &error_fatal);
+    sysbus_mmio_map(s, 1, cbase);
+    sysbus_mmio_map(s, 0, rbase);
+    sysbus_connect_irq(s, 0, qdev_get_gpio_in(vms->gic, irq));
+
+    nodename = g_strdup_printf("/watchdog@%" PRIx64, cbase);
+    qemu_fdt_add_subnode(ms->fdt, nodename);
+    qemu_fdt_setprop(ms->fdt, nodename, "compatible", compat, sizeof(compat));
+    qemu_fdt_setprop_sized_cells(ms->fdt, nodename, "reg",
+                                 2, cbase, 2, csize, 2, rbase, 2, rsize);
+    qemu_fdt_setprop_cells(ms->fdt, nodename, "interrupts",
+                           GIC_FDT_IRQ_TYPE_SPI, irq,
+                           GIC_FDT_IRQ_FLAGS_LEVEL_HI);
+    g_free(nodename);
+}
+
 static DeviceState *gpio_key_dev;
 static void virt_powerdown_req(Notifier *n, void *opaque)
 {
@@ -2240,6 +2273,8 @@ static void machvirt_init(MachineState *machine)
 
     create_rtc(vms);
 
+    create_wdt(vms);
+
     create_pcie(vms);
 
     if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 15feabac63..7f295f771e 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -86,6 +86,9 @@ enum {
     VIRT_ACPI_GED,
     VIRT_NVDIMM_ACPI,
     VIRT_PVTIME,
+    VIRT_GWDT_WS0,
+    VIRT_GWDT_CONTROL,
+    VIRT_GWDT_REFRESH,
     VIRT_LOWMEMMAP_LAST,
 };
 
-- 
2.34.1
Re: [RFC PATCH 08/21] contrib/gitdm: Add Mentor Graphics to the domain map
On 06.10.20 14:41, Philippe Mathieu-Daudé wrote:
> On 10/6/20 11:44 AM, Alex Bennée wrote:
>>
>> Jan Kiszka writes:
>>
>>> On 05.10.20 22:52, Joseph Myers wrote:
>>>> On Mon, 5 Oct 2020, Alex Bennée wrote:
>>>>
>>>>> Joseph Myers writes:
>>>>>
>>>>>> On Sun, 4 Oct 2020, Philippe Mathieu-Daudé wrote:
>>>>>>
>>>>>>> There is a number of contributors from this domain,
>>>>>>> add its own entry to the gitdm domain map.
>>>>>>
>>>>>> At some point the main branding will be Siemens; not sure how you
>>>>>> want to handle that.
>>>>>
>>>>> We've already done something similar with WaveComp who have rolled up
>>>>> the various mips and imgtec contributions into
>>>>> contrib/gitdm/group-map-wavecomp.
>>>>>
>>>>> It's really up to you and which corporate entity would like internet
>>>>> bragging points. The only Siemens contributor I could find is Jan
>>>>> Kiszka but he has contributed a fair amount ;-)
>>>>
>>>> Given that the Mentor branding is going away (and the "Mentor Graphics"
>>>> version largely has gone away, "Mentor, a Siemens Business" is what's
>>>> currently used as a Mentor brand), probably it makes sense to use
>>>> Siemens for both codesourcery.com and mentor.com addresses.
>>>
>>> I think the key question is what this map is used for: Is it supposed
>>> to document the historic status, who was who at the time of
>>> contribution? Or is its purpose to help identifying the copyright
>>> holder of a contribution today?
>>
>> I don't know what others use them for but for me it was just an easy way
>> to get a survey of the companies and individuals involved over the last
>> year (2y, 3y, 5y... etc) of development.
>
> My personal interest is seeing how the corporate/academic/hobbyist
> contributions are split, and how this evolves over time.
>
> Since there were entries for some companies, I added more,
> but this is not a requisite and we can drop the patches if
> considered not useful or giving headaches to the contributors.
>
>> The consolidation of
>> contributions isn't overly distorting IMO. The biggest user is probably
>> the end of year state of the nation surveys wanting to see what impact
>> various organisations are having on a project and consolidation just
>> helps push you up the table a little more.
>>
>> The biggest counter example we have at the moment is RedHat/IBM which
>> AFAICT is because the RedHat guys are treated as a separate business
>> unit with their own unique identity.
>>
>> Either way I don't think it's a major issue - hence it is up to the
>> hackers to make the choice.
>
> Totally. In this particular case, we could even keep "Codesourcery"
> group separate, similarly to RH/IBM.

If there is a way to also express a timeline, then I would keep the
historic contributions separate and account those since the merger
(April 2017) to Siemens. You may even consider adding the phase when
CodeSourcery was owned by Mentor.

Jan

--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
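For concreteness, gitdm's employer attribution lives in plain-text map files. A hypothetical sketch of the date-split mapping Jan proposes is below; the file name, the cutoff dates, and in particular the `< date` end-of-validity syntax are assumptions to verify against gitdm's documentation, not something stated in this thread:

```
# contrib/gitdm/domain-map (illustrative sketch only)
# Attribute pre-merger work to the brand of the day, later work to Siemens.
codesourcery.com  CodeSourcery     < 2010-07-01
codesourcery.com  Mentor Graphics  < 2017-04-01
codesourcery.com  Siemens
mentor.com        Mentor Graphics  < 2017-04-01
mentor.com        Siemens
siemens.com       Siemens
```

If gitdm's domain map turns out not to support dated entries, the same split can be expressed per contributor in a group-map file, at the cost of listing individual addresses.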
Re: [RFC PATCH 08/21] contrib/gitdm: Add Mentor Graphics to the domain map
On 05.10.20 22:52, Joseph Myers wrote:
> On Mon, 5 Oct 2020, Alex Bennée wrote:
>
>> Joseph Myers writes:
>>
>>> On Sun, 4 Oct 2020, Philippe Mathieu-Daudé wrote:
>>>
>>>> There is a number of contributors from this domain,
>>>> add its own entry to the gitdm domain map.
>>>
>>> At some point the main branding will be Siemens; not sure how you want
>>> to handle that.
>>
>> We've already done something similar with WaveComp who have rolled up
>> the various mips and imgtec contributions into
>> contrib/gitdm/group-map-wavecomp.
>>
>> It's really up to you and which corporate entity would like internet
>> bragging points. The only Siemens contributor I could find is Jan Kiszka
>> but he has contributed a fair amount ;-)
>
> Given that the Mentor branding is going away (and the "Mentor Graphics"
> version largely has gone away, "Mentor, a Siemens Business" is what's
> currently used as a Mentor brand), probably it makes sense to use Siemens
> for both codesourcery.com and mentor.com addresses.

I think the key question is what this map is used for: Is it supposed to
document the historic status, who was who at the time of contribution?
Or is its purpose to help identifying the copyright holder of a
contribution today?

Jan

--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux
Re: scripts/gdb: issues when loading modules after lx-symbols
On 05.10.20 13:05, Stefano Garzarella wrote: > On Mon, Oct 05, 2020 at 11:45:41AM +0200, Jan Kiszka wrote: >> On 05.10.20 11:29, Stefano Garzarella wrote: >>> On Mon, Oct 05, 2020 at 10:33:30AM +0200, Jan Kiszka wrote: >>>> On 05.10.20 10:14, Stefano Garzarella wrote: >>>>> On Sun, Oct 04, 2020 at 08:52:37PM +0200, Jan Kiszka wrote: >>>>>> On 01.10.20 16:31, Stefano Garzarella wrote: >>>>>>> Hi, >>>>>>> I had some issues with gdb scripts and kernel modules in Linux 5.9-rc7. >>>>>>> >>>>>>> If the modules are already loaded, and I do 'lx-symbols', all work fine. >>>>>>> But, if I load a kernel module after 'lx-symbols', I had this issue: >>>>>>> >>>>>>> [ 5093.393940] invalid opcode: [#1] SMP PTI >>>>>>> [ 5093.395134] CPU: 0 PID: 576 Comm: modprobe Not tainted >>>>>>> 5.9.0-rc7-ste-00010-gf0b671d9608d-dirty #2 >>>>>>> [ 5093.397566] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS >>>>>>> 1.13.0-2.fc32 04/01/2014 >>>>>>> [ 5093.400761] RIP: 0010:do_init_module+0x1/0x270 >>>>>>> [ 5093.402553] Code: ff ff e9 cf fe ff ff 0f 0b 49 c7 c4 f2 ff ff ff e9 >>>>>>> c1 fe ff ff e8 5f b2 65 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 >>>>>>> cc <1f> 44 00 00 55 ba 10 00 00 00 be c0 0c 00 00 48 89 e5 41 56 41 55 >>>>>>> [ 5093.409505] RSP: 0018:c9563d18 EFLAGS: 00010246 >>>>>>> [ 5093.412056] RAX: RBX: c010a0c0 RCX: >>>>>>> 4ee3 >>>>>>> [ 5093.414472] RDX: 4ee2 RSI: ea0001efe188 RDI: >>>>>>> c010a0c0 >>>>>>> [ 5093.416349] RBP: c9563e50 R08: R09: >>>>>>> 0002 >>>>>>> [ 5093.418044] R10: 0096 R11: 08a4 R12: >>>>>>> 88807a0d1280 >>>>>>> [ 5093.424721] R13: c010a110 R14: 88807a0d1300 R15: >>>>>>> c9563e70 >>>>>>> [ 5093.427138] FS: 7f018f632740() GS:88807dc0() >>>>>>> knlGS: >>>>>>> [ 5093.430037] CS: 0010 DS: ES: CR0: 80050033 >>>>>>> [ 5093.432279] CR2: 55fbe282b239 CR3: 7922a006 CR4: >>>>>>> 00170ef0 >>>>>>> [ 5093.435096] DR0: DR1: DR2: >>>>>>> >>>>>>> [ 5093.436765] DR3: DR6: fffe0ff0 DR7: >>>>>>> 0400 >>>>>>> [ 5093.439689] Call Trace: >>>>>>> [ 5093.440954] ? 
load_module+0x24b6/0x27d0 >>>>>>> [ 5093.443212] ? __kernel_read+0xd6/0x150 >>>>>>> [ 5093.445140] __do_sys_finit_module+0xd3/0xf0 >>>>>>> [ 5093.446877] __x64_sys_finit_module+0x1a/0x20 >>>>>>> [ 5093.449098] do_syscall_64+0x38/0x50 >>>>>>> [ 5093.450877] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>>>>>> [ 5093.456153] RIP: 0033:0x7f018f75c43d >>>>>>> [ 5093.457728] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa >>>>>>> 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f >>>>>>> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 6a 0c 00 f7 d8 64 89 01 48 >>>>>>> [ 5093.466349] RSP: 002b:7ffd7f080368 EFLAGS: 0246 ORIG_RAX: >>>>>>> 0139 >>>>>>> [ 5093.470613] RAX: ffda RBX: 557e5c96f9c0 RCX: >>>>>>> 7f018f75c43d >>>>>>> [ 5093.474747] RDX: RSI: 557e5c964288 RDI: >>>>>>> 0003 >>>>>>> [ 5093.478049] RBP: 0004 R08: R09: >>>>>>> >>>>>>> [ 5093.481298] R10: 0003 R11: 0246 R12: >>>>>>> >>>>>>> [ 5093.483725] R13: 557e5c964288 R14: 557e5c96f950 R15: >>>>>>> 557e5c9775c0 >>>>>>> [ 5093.485778] Modules linked in: virtio_vdpa(+) vdpa sunrpc kvm_intel >>>>>>> kvm irqbypass virtio_blk virtio_rng rng_core [last
Re: scripts/gdb: issues when loading modules after lx-symbols
On 05.10.20 11:29, Stefano Garzarella wrote: > On Mon, Oct 05, 2020 at 10:33:30AM +0200, Jan Kiszka wrote: >> On 05.10.20 10:14, Stefano Garzarella wrote: >>> On Sun, Oct 04, 2020 at 08:52:37PM +0200, Jan Kiszka wrote: >>>> On 01.10.20 16:31, Stefano Garzarella wrote: >>>>> Hi, >>>>> I had some issues with gdb scripts and kernel modules in Linux 5.9-rc7. >>>>> >>>>> If the modules are already loaded, and I do 'lx-symbols', all work fine. >>>>> But, if I load a kernel module after 'lx-symbols', I had this issue: >>>>> >>>>> [ 5093.393940] invalid opcode: [#1] SMP PTI >>>>> [ 5093.395134] CPU: 0 PID: 576 Comm: modprobe Not tainted >>>>> 5.9.0-rc7-ste-00010-gf0b671d9608d-dirty #2 >>>>> [ 5093.397566] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS >>>>> 1.13.0-2.fc32 04/01/2014 >>>>> [ 5093.400761] RIP: 0010:do_init_module+0x1/0x270 >>>>> [ 5093.402553] Code: ff ff e9 cf fe ff ff 0f 0b 49 c7 c4 f2 ff ff ff e9 >>>>> c1 fe ff ff e8 5f b2 65 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 >>>>> cc <1f> 44 00 00 55 ba 10 00 00 00 be c0 0c 00 00 48 89 e5 41 56 41 55 >>>>> [ 5093.409505] RSP: 0018:c9563d18 EFLAGS: 00010246 >>>>> [ 5093.412056] RAX: RBX: c010a0c0 RCX: >>>>> 4ee3 >>>>> [ 5093.414472] RDX: 4ee2 RSI: ea0001efe188 RDI: >>>>> c010a0c0 >>>>> [ 5093.416349] RBP: c9563e50 R08: R09: >>>>> 0002 >>>>> [ 5093.418044] R10: 0096 R11: 08a4 R12: >>>>> 88807a0d1280 >>>>> [ 5093.424721] R13: c010a110 R14: 88807a0d1300 R15: >>>>> c9563e70 >>>>> [ 5093.427138] FS: 7f018f632740() GS:88807dc0() >>>>> knlGS: >>>>> [ 5093.430037] CS: 0010 DS: ES: CR0: 80050033 >>>>> [ 5093.432279] CR2: 55fbe282b239 CR3: 7922a006 CR4: >>>>> 00170ef0 >>>>> [ 5093.435096] DR0: DR1: DR2: >>>>> >>>>> [ 5093.436765] DR3: DR6: fffe0ff0 DR7: >>>>> 0400 >>>>> [ 5093.439689] Call Trace: >>>>> [ 5093.440954] ? load_module+0x24b6/0x27d0 >>>>> [ 5093.443212] ? 
__kernel_read+0xd6/0x150 >>>>> [ 5093.445140] __do_sys_finit_module+0xd3/0xf0 >>>>> [ 5093.446877] __x64_sys_finit_module+0x1a/0x20 >>>>> [ 5093.449098] do_syscall_64+0x38/0x50 >>>>> [ 5093.450877] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>>>> [ 5093.456153] RIP: 0033:0x7f018f75c43d >>>>> [ 5093.457728] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa >>>>> 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f >>>>> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 6a 0c 00 f7 d8 64 89 01 48 >>>>> [ 5093.466349] RSP: 002b:7ffd7f080368 EFLAGS: 0246 ORIG_RAX: >>>>> 0139 >>>>> [ 5093.470613] RAX: ffda RBX: 557e5c96f9c0 RCX: >>>>> 7f018f75c43d >>>>> [ 5093.474747] RDX: RSI: 557e5c964288 RDI: >>>>> 0003 >>>>> [ 5093.478049] RBP: 0004 R08: R09: >>>>> >>>>> [ 5093.481298] R10: 0003 R11: 0246 R12: >>>>> >>>>> [ 5093.483725] R13: 557e5c964288 R14: 557e5c96f950 R15: >>>>> 557e5c9775c0 >>>>> [ 5093.485778] Modules linked in: virtio_vdpa(+) vdpa sunrpc kvm_intel >>>>> kvm irqbypass virtio_blk virtio_rng rng_core [last unloaded: virtio_vdpa] >>>>> [ 5093.488695] ---[ end trace 23712ecebc11f53c ]--- >>>>> >>>>> Guest kernel: Linux 5.9-rc7 >>>>> gdb: GNU gdb (GDB) Fedora 9.1-6.fc32 >>>>> I tried with QEMU 4.2.1 and the latest master branch: same issue. >>>>> >>>>> >>>>> I did some digging, and skipping the gdb 'add-symbol-file' command in >>>>> symbol.py >>>>> avoid the issue, but of course I don't have the symbols loaded: >>>>> >>>>> diff --git a/scripts/gdb
Re: scripts/gdb: issues when loading modules after lx-symbols
On 05.10.20 10:14, Stefano Garzarella wrote: > On Sun, Oct 04, 2020 at 08:52:37PM +0200, Jan Kiszka wrote: >> On 01.10.20 16:31, Stefano Garzarella wrote: >>> Hi, >>> I had some issues with gdb scripts and kernel modules in Linux 5.9-rc7. >>> >>> If the modules are already loaded, and I do 'lx-symbols', all work fine. >>> But, if I load a kernel module after 'lx-symbols', I had this issue: >>> >>> [ 5093.393940] invalid opcode: [#1] SMP PTI >>> [ 5093.395134] CPU: 0 PID: 576 Comm: modprobe Not tainted >>> 5.9.0-rc7-ste-00010-gf0b671d9608d-dirty #2 >>> [ 5093.397566] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS >>> 1.13.0-2.fc32 04/01/2014 >>> [ 5093.400761] RIP: 0010:do_init_module+0x1/0x270 >>> [ 5093.402553] Code: ff ff e9 cf fe ff ff 0f 0b 49 c7 c4 f2 ff ff ff e9 c1 >>> fe ff ff e8 5f b2 65 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 cc >>> <1f> 44 00 00 55 ba 10 00 00 00 be c0 0c 00 00 48 89 e5 41 56 41 55 >>> [ 5093.409505] RSP: 0018:c9563d18 EFLAGS: 00010246 >>> [ 5093.412056] RAX: RBX: c010a0c0 RCX: >>> 4ee3 >>> [ 5093.414472] RDX: 4ee2 RSI: ea0001efe188 RDI: >>> c010a0c0 >>> [ 5093.416349] RBP: c9563e50 R08: R09: >>> 0002 >>> [ 5093.418044] R10: 0096 R11: 08a4 R12: >>> 88807a0d1280 >>> [ 5093.424721] R13: c010a110 R14: 88807a0d1300 R15: >>> c9563e70 >>> [ 5093.427138] FS: 7f018f632740() GS:88807dc0() >>> knlGS: >>> [ 5093.430037] CS: 0010 DS: ES: CR0: 80050033 >>> [ 5093.432279] CR2: 55fbe282b239 CR3: 7922a006 CR4: >>> 00170ef0 >>> [ 5093.435096] DR0: DR1: DR2: >>> >>> [ 5093.436765] DR3: DR6: fffe0ff0 DR7: >>> 0400 >>> [ 5093.439689] Call Trace: >>> [ 5093.440954] ? load_module+0x24b6/0x27d0 >>> [ 5093.443212] ? 
__kernel_read+0xd6/0x150 >>> [ 5093.445140] __do_sys_finit_module+0xd3/0xf0 >>> [ 5093.446877] __x64_sys_finit_module+0x1a/0x20 >>> [ 5093.449098] do_syscall_64+0x38/0x50 >>> [ 5093.450877] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>> [ 5093.456153] RIP: 0033:0x7f018f75c43d >>> [ 5093.457728] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 >>> 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 >>> <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 6a 0c 00 f7 d8 64 89 01 48 >>> [ 5093.466349] RSP: 002b:7ffd7f080368 EFLAGS: 0246 ORIG_RAX: >>> 0139 >>> [ 5093.470613] RAX: ffda RBX: 557e5c96f9c0 RCX: >>> 7f018f75c43d >>> [ 5093.474747] RDX: RSI: 557e5c964288 RDI: >>> 0003 >>> [ 5093.478049] RBP: 0004 R08: R09: >>> >>> [ 5093.481298] R10: 0003 R11: 0246 R12: >>> >>> [ 5093.483725] R13: 557e5c964288 R14: 557e5c96f950 R15: >>> 557e5c9775c0 >>> [ 5093.485778] Modules linked in: virtio_vdpa(+) vdpa sunrpc kvm_intel kvm >>> irqbypass virtio_blk virtio_rng rng_core [last unloaded: virtio_vdpa] >>> [ 5093.488695] ---[ end trace 23712ecebc11f53c ]--- >>> >>> Guest kernel: Linux 5.9-rc7 >>> gdb: GNU gdb (GDB) Fedora 9.1-6.fc32 >>> I tried with QEMU 4.2.1 and the latest master branch: same issue. >>> >>> >>> I did some digging, and skipping the gdb 'add-symbol-file' command in >>> symbol.py >>> avoid the issue, but of course I don't have the symbols loaded: >>> >>> diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py >>> index 1be9763cf8bb..eadfaa4d4907 100644 >>> --- a/scripts/gdb/linux/symbols.py >>> +++ b/scripts/gdb/linux/symbols.py >>> @@ -129,7 +129,7 @@ lx-symbols command.""" >>> filename=module_file, >>> addr=module_addr, >>> sections=self._section_arguments(module)) >>> -gdb.execute(cmdline, to_string=True) >>> +#gdb.execute(cmdline, to_string=True) >>> if module_name not in self.loaded_modules: >>>
Re: scripts/gdb: issues when loading modules after lx-symbols
On 01.10.20 16:31, Stefano Garzarella wrote:
> Hi,
> I had some issues with gdb scripts and kernel modules in Linux 5.9-rc7.
>
> If the modules are already loaded, and I do 'lx-symbols', all work fine.
> But, if I load a kernel module after 'lx-symbols', I had this issue:
>
> [ 5093.393940] invalid opcode: [#1] SMP PTI
> [ 5093.395134] CPU: 0 PID: 576 Comm: modprobe Not tainted 5.9.0-rc7-ste-00010-gf0b671d9608d-dirty #2
> [ 5093.397566] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
> [ 5093.400761] RIP: 0010:do_init_module+0x1/0x270
> [ 5093.402553] Code: ff ff e9 cf fe ff ff 0f 0b 49 c7 c4 f2 ff ff ff e9 c1 fe ff ff e8 5f b2 65 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 cc <1f> 44 00 00 55 ba 10 00 00 00 be c0 0c 00 00 48 89 e5 41 56 41 55
> [ 5093.409505] RSP: 0018:c9563d18 EFLAGS: 00010246
> [ 5093.412056] RAX: RBX: c010a0c0 RCX: 4ee3
> [ 5093.414472] RDX: 4ee2 RSI: ea0001efe188 RDI: c010a0c0
> [ 5093.416349] RBP: c9563e50 R08: R09: 0002
> [ 5093.418044] R10: 0096 R11: 08a4 R12: 88807a0d1280
> [ 5093.424721] R13: c010a110 R14: 88807a0d1300 R15: c9563e70
> [ 5093.427138] FS: 7f018f632740() GS:88807dc0() knlGS:
> [ 5093.430037] CS: 0010 DS: ES: CR0: 80050033
> [ 5093.432279] CR2: 55fbe282b239 CR3: 7922a006 CR4: 00170ef0
> [ 5093.435096] DR0: DR1: DR2:
> [ 5093.436765] DR3: DR6: fffe0ff0 DR7: 0400
> [ 5093.439689] Call Trace:
> [ 5093.440954] ? load_module+0x24b6/0x27d0
> [ 5093.443212] ? __kernel_read+0xd6/0x150
> [ 5093.445140] __do_sys_finit_module+0xd3/0xf0
> [ 5093.446877] __x64_sys_finit_module+0x1a/0x20
> [ 5093.449098] do_syscall_64+0x38/0x50
> [ 5093.450877] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 5093.456153] RIP: 0033:0x7f018f75c43d
> [ 5093.457728] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 6a 0c 00 f7 d8 64 89 01 48
> [ 5093.466349] RSP: 002b:7ffd7f080368 EFLAGS: 0246 ORIG_RAX: 0139
> [ 5093.470613] RAX: ffda RBX: 557e5c96f9c0 RCX: 7f018f75c43d
> [ 5093.474747] RDX: RSI: 557e5c964288 RDI: 0003
> [ 5093.478049] RBP: 0004 R08: R09:
> [ 5093.481298] R10: 0003 R11: 0246 R12:
> [ 5093.483725] R13: 557e5c964288 R14: 557e5c96f950 R15: 557e5c9775c0
> [ 5093.485778] Modules linked in: virtio_vdpa(+) vdpa sunrpc kvm_intel kvm irqbypass virtio_blk virtio_rng rng_core [last unloaded: virtio_vdpa]
> [ 5093.488695] ---[ end trace 23712ecebc11f53c ]---
>
> Guest kernel: Linux 5.9-rc7
> gdb: GNU gdb (GDB) Fedora 9.1-6.fc32
> I tried with QEMU 4.2.1 and the latest master branch: same issue.
>
> I did some digging, and skipping the gdb 'add-symbol-file' command in
> symbols.py avoids the issue, but of course I don't have the symbols
> loaded:
>
> diff --git a/scripts/gdb/linux/symbols.py b/scripts/gdb/linux/symbols.py
> index 1be9763cf8bb..eadfaa4d4907 100644
> --- a/scripts/gdb/linux/symbols.py
> +++ b/scripts/gdb/linux/symbols.py
> @@ -129,7 +129,7 @@ lx-symbols command."""
>                  filename=module_file,
>                  addr=module_addr,
>                  sections=self._section_arguments(module))
> -            gdb.execute(cmdline, to_string=True)
> +            #gdb.execute(cmdline, to_string=True)
>              if module_name not in self.loaded_modules:
>                  self.loaded_modules.append(module_name)
>              else:
>
> I tried several modules and this happens every time after '(gdb) lx-symbols'.
>
> Do you have any hints?

I assume you are debugging a kernel inside QEMU/KVM, right? Does it
work without -enable-kvm?

Debugging guests in KVM mode at least was unstable for a long time. I
avoided setting soft-BPs - which is what the script does for the sake
of tracking module loading -, falling back to hw-BPs, as I had no time
to debug that further. /Maybe/ that's the issue here.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
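As a quick experiment along the lines Jan suggests, software versus hardware breakpoints can be compared by hand before touching symbols.py. A sketch of such a session follows; the port, paths, and exact workflow are illustrative assumptions:

```
# Guest started e.g. with: qemu-system-x86_64 -enable-kvm -s ...
(gdb) file vmlinux
(gdb) target remote :1234
(gdb) lx-symbols
# lx-symbols internally plants a *software* breakpoint to track module
# loading; to cross-check whether soft-BPs are the problem under KVM,
# set a hardware-assisted breakpoint on the same function:
(gdb) hbreak do_init_module
(gdb) continue
```

If the hardware breakpoint fires cleanly on `modprobe` while the stock lx-symbols path produces the invalid-opcode oops at `do_init_module+0x1`, that points at a stale software-breakpoint byte left in guest memory, consistent with Jan's suspicion.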
Re: [virtio-comment] Re: [RFC] ivshmem v2: Shared memory device specification
On 27.07.20 15:56, Michael S. Tsirkin wrote:
> On Mon, Jul 27, 2020 at 03:39:32PM +0200, Jan Kiszka wrote:
>> On 27.07.20 15:20, Michael S. Tsirkin wrote:
>>> On Mon, May 25, 2020 at 09:58:28AM +0200, Jan Kiszka wrote:
>>>> Vendor Specific Capability (ID 09h)
>>>>
>>>> This capability must always be present.
>>>>
>>>> | Offset | Register            | Content                                        |
>>>> |---:|:---|:---|
>>>> | 00h | ID                  | 09h                                            |
>>>> | 01h | Next Capability     | Pointer to next capability or 00h              |
>>>> | 02h | Length              | 20h if Base Address is present, 18h otherwise  |
>>>> | 03h | Privileged Control  | Bit 0 (read/write): one-shot interrupt mode    |
>>>> |     |                     | Bits 1-7: Reserved (0 on read, writes ignored) |
>>>> | 04h | State Table Size    | 32-bit size of read-only State Table           |
>>>> | 08h | R/W Section Size    | 64-bit size of common read/write section       |
>>>> | 10h | Output Section Size | 64-bit size of output sections                 |
>>>> | 18h | Base Address        | optional: 64-bit base address of shared memory |
>>>>
>>>> All registers are read-only. Writes are ignored, except to bit 0 of
>>>> the Privileged Control register.
>>>
>>> Is there value in making this follow the virtio vendor-specific
>>> capability format? That will cost several extra bytes - do you
>>> envision having many of these in the config space?
>>
>> Of course, this could be modeled via virtio_pci_cap as well. It would
>> add 12 unused bytes and one type byte. If it helps to make the device
>> look more virtio'ish, fine, but I'm afraid there are more differences
>> at PCI level.
>
> I guess it will be useful if we ever find it handy to make an ivshmem
> device also be a virtio device. Can't say why yet but if we don't care
> it vaguely seems kind of like a good idea. I guess it will also be
> handy if you ever need another vendor specific cap: you already get a
> way to identify it without breaking drivers.

I can look into that. Those 12 wasted bytes are a bit ugly, but so far
we are not short on config space, even in the non-extended range.

More problematic is that the existing specification of virtio_pci_cap
assumes that it describes a structure in a PCI resource, rather than
being that data itself, or even a register (Privileged Control). Would
it be possible to split the types into two ranges, one for the existing
structure, one for others - like ivshmem - that will only share the
cfg_type field?

>> I do not see a use case for having multiple of those caps above per
>> device. If someone comes around with a valid use case for having
>> multiple, non-contiguous shared memory regions for one device, we
>> would need to add registers for them. But that would also only work
>> for non-BAR regions due to limited BARs.
>
> OK, I guess this answers the below too.
>
>>> Also, do we want to define an extended capability format in case this
>>> is a pci extended capability?
>>
>> What would be the practical benefit? Do you see PCIe caps that could
>> become useful in virtual setups?
>
> So if we ever have a huge number of these caps, PCIe allows many more
> caps.
>
>> We don't do that for regular virtio devices either, do we?
>
> We don't, there's a small number of these so we don't run out of
> config space.

Right. But then it would not be a problem to add PCIe (right before
adding it becomes impossible) and push new caps into the extended
space. And all that without breaking existing drivers. It's just a cap,
and the spec so far does not state that there must be no other cap,
neither in current virtio nor for this ivshmem device.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
Re: [virtio-comment] Re: [RFC] ivshmem v2: Shared memory device specification
On 27.07.20 15:20, Michael S. Tsirkin wrote:
> On Mon, May 25, 2020 at 09:58:28AM +0200, Jan Kiszka wrote:
>> Vendor Specific Capability (ID 09h)
>>
>> This capability must always be present.
>>
>> | Offset | Register            | Content                                        |
>> |---:|:---|:---|
>> | 00h | ID                  | 09h                                            |
>> | 01h | Next Capability     | Pointer to next capability or 00h              |
>> | 02h | Length              | 20h if Base Address is present, 18h otherwise  |
>> | 03h | Privileged Control  | Bit 0 (read/write): one-shot interrupt mode    |
>> |     |                     | Bits 1-7: Reserved (0 on read, writes ignored) |
>> | 04h | State Table Size    | 32-bit size of read-only State Table           |
>> | 08h | R/W Section Size    | 64-bit size of common read/write section       |
>> | 10h | Output Section Size | 64-bit size of output sections                 |
>> | 18h | Base Address        | optional: 64-bit base address of shared memory |
>>
>> All registers are read-only. Writes are ignored, except to bit 0 of
>> the Privileged Control register.
>
> Is there value in making this follow the virtio vendor-specific
> capability format? That will cost several extra bytes - do you envision
> having many of these in the config space?

Of course, this could be modeled via virtio_pci_cap as well. It would
add 12 unused bytes and one type byte. If it helps to make the device
look more virtio'ish, fine, but I'm afraid there are more differences
at PCI level.

I do not see a use case for having multiple of those caps above per
device. If someone comes around with a valid use case for having
multiple, non-contiguous shared memory regions for one device, we would
need to add registers for them. But that would also only work for
non-BAR regions due to limited BARs.

> Also, do we want to define an extended capability format in case this
> is a pci extended capability?

What would be the practical benefit? Do you see PCIe caps that could
become useful in virtual setups? We don't do that for regular virtio
devices either, do we?

Thanks,
Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
Re: [PATCH] KVM: fix CPU reset wrt HF2_GIF_MASK
On 23.07.20 16:27, Vitaly Kuznetsov wrote:

HF2_GIF_MASK is set in env->hflags2 unconditionally on CPU reset (see x86_cpu_reset()) but when calling KVM_SET_NESTED_STATE, KVM_STATE_NESTED_GIF_SET is only valid for nSVM as e.g. nVMX code looks like:

    if (kvm_state->hdr.vmx.vmxon_pa == -1ull) {
        if (kvm_state->flags & ~KVM_STATE_NESTED_EVMCS)
            return -EINVAL;
    }

Also, when adjusting the environment after KVM_GET_NESTED_STATE we need not reset HF2_GIF_MASK on VMX as e.g. x86_cpu_pending_interrupt() expects it to be set. Alternatively, we could've made env->hflags2 SVM-only.

Reported-by: Jan Kiszka
Fixes: b16c0e20c742 ("KVM: add support for AMD nested live migration")
Signed-off-by: Vitaly Kuznetsov
---
 target/i386/kvm.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 2b6b7443d273..3c4647fbfbf6 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -3883,7 +3883,9 @@ static int kvm_put_nested_state(X86CPU *cpu)
     } else {
         env->nested_state->flags &= ~KVM_STATE_NESTED_GUEST_MODE;
     }
-    if (env->hflags2 & HF2_GIF_MASK) {
+
+    /* Don't set KVM_STATE_NESTED_GIF_SET on VMX as it is illegal */
+    if (cpu_has_svm(env) && (env->hflags2 & HF2_GIF_MASK)) {
         env->nested_state->flags |= KVM_STATE_NESTED_GIF_SET;
     } else {
         env->nested_state->flags &= ~KVM_STATE_NESTED_GIF_SET;
@@ -3925,10 +3927,14 @@ static int kvm_get_nested_state(X86CPU *cpu)
     } else {
         env->hflags &= ~HF_GUEST_MASK;
     }
-    if (env->nested_state->flags & KVM_STATE_NESTED_GIF_SET) {
-        env->hflags2 |= HF2_GIF_MASK;
-    } else {
-        env->hflags2 &= ~HF2_GIF_MASK;
+
+    /* Keep HF2_GIF_MASK set on !SVM as x86_cpu_pending_interrupt() needs it */
+    if (cpu_has_svm(env)) {
+        if (env->nested_state->flags & KVM_STATE_NESTED_GIF_SET) {
+            env->hflags2 |= HF2_GIF_MASK;
+        } else {
+            env->hflags2 &= ~HF2_GIF_MASK;
+        }
     }
 
     return ret;

Tested-by: Jan Kiszka

Thanks!
Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: 5.1.0-rc1 regression: reset fails with kvm and -cpu host
On 23.07.20 14:52, Dr. David Alan Gilbert wrote: * Vitaly Kuznetsov (vkuzn...@redhat.com) wrote: Philippe Mathieu-Daudé writes: +Vitaly On 7/23/20 10:40 AM, Dr. David Alan Gilbert wrote: * Eduardo Habkost (ehabk...@redhat.com) wrote: On Wed, Jul 22, 2020 at 04:47:32PM -0400, Eduardo Habkost wrote: On Wed, Jul 22, 2020 at 08:05:01PM +0200, Jan Kiszka wrote: On 22.07.20 19:35, Eduardo Habkost wrote: Hi Jan, What was the last version where it worked for you? Does using "-cpu host,-vmx" help? Yeah, -vmx does indeed help. I didn't have the time to bisect yet. Just check my reflog, picked eb6490f544, and that works. Thanks! I could reproduce it locally[1], I will bisect it. The good news is that "-cpu host,+vmx" still works, on commit eb6490f544. [1] Linux 5.6.19-300.fc32.x86_64, Intel Core i7-8665U CPU. Bisected to: commit b16c0e20c74218f2d69710cedad11da7dd4d2190 Author: Paolo Bonzini Date: Wed May 20 10:49:22 2020 -0400 KVM: add support for AMD nested live migration Support for nested guest live migration is part of Linux 5.8, add the corresponding code to QEMU. The migration format consists of a few flags, is an opaque 4k blob. The blob is in VMCB format (the control area represents the L1 VMCB control fields, the save area represents the pre-vmentry state; KVM does not use the host save area since the AMD manual allows that) but QEMU does not really care about that. However, the flags need to be copied to hflags/hflags2 and back. In addition, support for retrieving and setting the AMD nested virtualization states allows the L1 guest to be reset while running a nested guest, but a small bug in CPU reset needs to be fixed for that to work. Signed-off-by: Paolo Bonzini Guesswork led me to try reverting the chunk in kvm_put_nested_state; without it the reset seems to work; I can't explain that code though. 
(sorry, missed the beginning of the discussion)

So one does:

(qemu) system_reset

on Intel with '-cpu host' and the result is:

(qemu) KVM: entry failed, hardware error 0x8021

Interesting; I hadn't seen that error - I just see a hard hung guest rather than a reset one.

I've seen it once or twice, maybe that was also with a more complex command line. The point is that an invalid state is very likely loaded on reset. Not all invalid states cause KVM to complain, though. Some just lock up the guest.

Jan

(i7-8650U laptop 5.7.9 fedora 32)

Dave

If you're running a guest on an Intel machine without unrestricted mode support, the failure can be most likely due to the guest entering an invalid state for Intel VT. For example, the guest maybe running in big real mode which is not supported on less recent Intel processors.

EAX=0064 EBX=91df5efe ECX= EDX=03f8 ESI= EDI=91ee32c0 EBP=90643260 ESP=00013c68 EIP=906428e6 EFL=0002 [---] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES = 9300 CS =f000 9b00 SS = 9300 DS = 9300 FS = 9300 GS = 9300 LDT= 8200 TR = 8b00 GDT= IDT=
CR0=6010 CR2= CR3= CR4= DR0= DR1= DR2= DR3= DR6=0ff0 DR7=0400 EFER=
Code=?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??

I can take a look (if no one beats me to it).

-- Vitaly

-- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
-- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [virtio-comment] [RFC] ivshmem v2: Shared memory device specification
On 23.07.20 08:54, Stefan Hajnoczi wrote: On Fri, Jul 17, 2020 at 06:15:58PM +0200, Jan Kiszka wrote: On 15.07.20 15:27, Stefan Hajnoczi wrote: On Mon, May 25, 2020 at 09:58:28AM +0200, Jan Kiszka wrote: Thanks for the responses. It would be great to update the spec with these clarifications. If BAR 2 is not present, the shared memory region is not relocatable by the user. In that case, the hypervisor has to implement the Base Address register in the vendor-specific capability. What does relocatable mean in this context? That the guest can decide (via BAR) where the resource should show up in the physical guest address space. We do not want to support this in setups like for static partitioning hypervisors, and then we use that side-channel read-only configuration. I see. I'm not sure what is vendor-specific about non-relocatable shared memory. I guess it could be added to the spec too? That "vendor-specific" comes from the PCI spec which - to my understanding - provides us no other means to introduce registers to the config space that are outside of the PCI spec. I could introduce a name for the ivshmem vendor cap and use that name here - would that be better? In any case, since "relocatable" hasn't been fully defined, I suggest making the statement more general: If BAR 2 is not present the hypervisor has to implement the Base Address Register in the vendor-specific capability. This can be used for vendor-specific shared memory functionality. Will integrate this. Thanks, Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: 5.1.0-rc1 regression: reset fails with kvm and -cpu host
On 22.07.20 19:35, Eduardo Habkost wrote: Hi Jan, What was the last version where it worked for you? Does using "-cpu host,-vmx" help? Yeah, -vmx does indeed help. I didn't have the time to bisect yet. Just check my reflog, picked eb6490f544, and that works. HTH, Jan On Wed, Jul 22, 2020 at 11:15:43AM +0200, Jan Kiszka wrote: Hi all, this locks up the guest: - qemu-system-x86_64 -enable-kvm -cpu host - trigger hard reset Host kernel: 5.7.7. Host CPU: i7-8850H Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
5.1.0-rc1 regression: reset fails with kvm and -cpu host
Hi all,

this locks up the guest:

- qemu-system-x86_64 -enable-kvm -cpu host
- trigger hard reset

Host kernel: 5.7.7. Host CPU: i7-8850H

Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: Inter-VM device emulation (call on Mon 20th July 2020)
On 21.07.20 12:49, Alex Bennée wrote: Stefan Hajnoczi writes:

Thank you everyone who joined! I didn't take notes but two things stood out:

1. The ivshmem v2 and virtio-vhost-user use cases are quite different, so combining them does not seem realistic. ivshmem v2 needs to be as simple for the hypervisor to implement as possible, even if this involves some sacrifices (e.g. not transparent to the Driver VM that is accessing the device, performance). virtio-vhost-user is more aimed at general-purpose device emulation, although support for arbitrary devices (e.g. PCI) would be important to serve all use cases.

I believe my phone gave up on the last few minutes of the call so I'll just say we are interested in being able to implement arbitrary devices in the inter-VM silos. Devices we are looking at:

- virtio-audio, virtio-video: performance-sensitive devices which provide a HAL abstraction to a common software core.
- virtio-rpmb: a secure device where the backend may need to reside in a secure virtualised world.
- virtio-scmi: a more complex device which allows the guest to make power and clock demands from the firmware.

Needless to say this starts to become complex with multiple moving parts. The flexibility of vhost-user seems to match up quite well with wanting to have a reasonably portable backend that just needs to be fed signals and a memory mapping. However we don't want daemons to automatically have a full view of the whole of the guest's system memory.

2. Alexander Graf's idea for a new Linux driver that provides an enforcing software IOMMU. This would be a character device driver that is mmapped by the device emulation process (either vhost-user-style on the host or another VMM for inter-VM device emulation). The Driver VMM can program mappings into the device and the page tables in the device emulation process will be updated.
This way the Driver VMM can share specific regions of guest RAM with the device emulation process and revoke those mappings later.

I'm wondering if there is enough plumbing on the guest side so a guest can use the virtio-iommu to mark out exactly which bits of memory the virtual device can have access to? At a minimum the virtqueues need to be accessible, and for larger transfers maybe a bounce buffer. However, for speed you want as wide a mapping as possible but no more. It would be nice, for example, if a block device could load data directly into the guest's block cache (zero-copy) but without getting a view of the kernel's internal data structures.

Welcome to a classic optimization triangle:

- speed -> direct mappings
- security -> restricted mapping
- simplicity -> static mapping

Pick two, you can't have them all. Well, you could try a little bit more of one, at the price of losing on another. But that's it. We chose the last two, ending up with probably the simplest but not fastest solution for type-1 hypervisors like Jailhouse. Specifically for non-Linux use cases - legacy RTOSes, often with limited driver stacks - having not only virtio but also even simpler channels over application-defined shared memory layouts is a requirement.

Another thing that came across in the call was quite a lot of assumptions about QEMU and Linux w.r.t. virtio. While our project will likely have Linux as a guest OS, we are looking specifically at enabling virtio for Type-1 hypervisors like Xen and the various safety-certified proprietary ones. It is unlikely that QEMU would be used as the VMM for these deployments. We want to work out what sort of common facilities hypervisors need to support to enable virtio so the daemons can be reusable and maybe set up with a minimal shim for the particular hypervisor in question.

I'm with you regarding stacks that are mappable not only on QEMU/Linux.
And also one that does not let the certification costs sky-rocket because of its mandated implementation complexity.

I'm not sure anymore if there will be only one device model. Maybe we should eventually think about a backend layer that can sit on something like virtio-vhost-user as well as on ivshmem-virtio, allowing the same device backend code to be plumbed into both transports. Why shouldn't the split that already works well under Linux - frontend device drivers vs. virtio transports - work here as well?

Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [PATCH for-5.1] target/arm: Always pass cacheattr in S1_ptw_translate
On 21.07.20 18:35, Richard Henderson wrote:

When we changed the interface of get_phys_addr_lpae to require the cacheattr parameter, this spot was missed. The compiler is unable to detect the use of NULL vs the nonnull attribute here.

Fixes: 7e98e21c098
Reported-by: Jan Kiszka
Signed-off-by: Richard Henderson
---
 target/arm/helper.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index c69a2baf1d..8ef0fb478f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10204,21 +10204,11 @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
     int s2prot;
     int ret;
     ARMCacheAttrs cacheattrs = {};
-    ARMCacheAttrs *pcacheattrs = NULL;
-
-    if (env->cp15.hcr_el2 & HCR_PTW) {
-        /*
-         * PTW means we must fault if this S1 walk touches S2 Device
-         * memory; otherwise we don't care about the attributes and can
-         * save the S2 translation the effort of computing them.
-         */
-        pcacheattrs = &cacheattrs;
-    }
 
     ret = get_phys_addr_lpae(env, addr, MMU_DATA_LOAD, ARMMMUIdx_Stage2,
                              false,
                              &s2pa, &txattrs, &s2prot, &s2size, fi,
-                             pcacheattrs);
+                             &cacheattrs);
     if (ret) {
         assert(fi->type != ARMFault_None);
         fi->s2addr = addr;
@@ -10226,8 +10216,11 @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
         fi->s1ptw = true;
         return ~0;
     }
-    if (pcacheattrs && (pcacheattrs->attrs & 0xf0) == 0) {
-        /* Access was to Device memory: generate Permission fault */
+    if ((env->cp15.hcr_el2 & HCR_PTW) && (cacheattrs.attrs & 0xf0) == 0) {
+        /*
+         * PTW set and S1 walk touched S2 Device memory:
+         * generate Permission fault.
+         */
         fi->type = ARMFault_Permission;
         fi->s2addr = addr;
         fi->stage2 = true;

Jup:

Tested-by: Jan Kiszka

Thanks,
Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
aarch64: Crash with qemu master when starting Jailhouse
Hi,

I first saw this a couple of weeks ago and ignored it, but it's still there today with master:

Thread 13 "qemu-system-aar" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f90e2ffd700 (LWP 26883)]
0x560ef0bddda7 in get_phys_addr_lpae (env=, address=address@entry=1095261192, access_type=access_type@entry=MMU_DATA_LOAD, mmu_idx=mmu_idx@entry=ARMMMUIdx_Stage2, s1_is_el0=s1_is_el0@entry=false, phys_ptr=phys_ptr@entry=0x7f90e2ffc200, txattrs=0x7f90e2ffc1ec, prot=0x7f90e2ffc1f0, page_size_ptr=0x7f90e2ffc1f8, fi=0x7f90e2ffc530, cacheattrs=0x0) at /data/qemu/target/arm/helper.c:11106
11106 cacheattrs->attrs = convert_stage2_attrs(env, extract32(attrs, 0, 4));
(gdb) bt
#0  0x560ef0bddda7 in get_phys_addr_lpae (env=, address=address@entry=1095261192, access_type=access_type@entry=MMU_DATA_LOAD, mmu_idx=mmu_idx@entry=ARMMMUIdx_Stage2, s1_is_el0=s1_is_el0@entry=false, phys_ptr=phys_ptr@entry=0x7f90e2ffc200, txattrs=0x7f90e2ffc1ec, prot=0x7f90e2ffc1f0, page_size_ptr=0x7f90e2ffc1f8, fi=0x7f90e2ffc530, cacheattrs=0x0) at /data/qemu/target/arm/helper.c:11106
#1  0x560ef0bde3c6 in S1_ptw_translate (env=env@entry=0x560ef32742b0, mmu_idx=mmu_idx@entry=ARMMMUIdx_Stage1_E1, addr=1095261192, txattrs=..., fi=fi@entry=0x7f90e2ffc530) at /data/qemu/target/arm/helper.c:10218
#2  0x560ef0bdd7f0 in arm_ldq_ptw (fi=0x7f90e2ffc530, mmu_idx=ARMMMUIdx_Stage1_E1, is_secure=false, addr=, cs=0x560ef326ac10) at /data/qemu/target/arm/helper.c:10284
#3  0x560ef0bdd7f0 in get_phys_addr_lpae (env=env@entry=0x560ef32742b0, address=address@entry=18446674270391351284, access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=mmu_idx@entry=ARMMMUIdx_Stage1_E1, s1_is_el0=s1_is_el0@entry=false, phys_ptr=phys_ptr@entry=0x7f90e2ffc490, txattrs=0x7f90e2ffc518, prot=0x7f90e2ffc514, page_size_ptr=0x7f90e2ffc528, fi=0x7f90e2ffc530, cacheattrs=0x7f90e2ffc51c) at /data/qemu/target/arm/helper.c:11014
#4  0x560ef0bdfacb in get_phys_addr (env=env@entry=0x560ef32742b0, address=, address@entry=18446674270391351284, access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=, mmu_idx@entry=ARMMMUIdx_Stage1_E1, phys_ptr=phys_ptr@entry=0x7f90e2ffc490, attrs=attrs@entry=0x7f90e2ffc518, prot=0x7f90e2ffc514, page_size=0x7f90e2ffc528, fi=0x7f90e2ffc530, cacheattrs=0x7f90e2ffc51c) at /data/qemu/target/arm/helper.c:12115
#5  0x560ef0bdf5ca in get_phys_addr (env=env@entry=0x560ef32742b0, address=address@entry=18446674270391351284, access_type=access_type@entry=MMU_INST_FETCH, mmu_idx=, phys_ptr=phys_ptr@entry=0x7f90e2ffc520, attrs=attrs@entry=0x7f90e2ffc518, prot=0x7f90e2ffc514, page_size=0x7f90e2ffc528, fi=0x7f90e2ffc530, cacheattrs=0x7f90e2ffc51c) at /data/qemu/target/arm/helper.c:11950
#6  0x560ef0bef669 in arm_cpu_tlb_fill (cs=0x560ef326ac10, address=18446674270391351284, size=, access_type=MMU_INST_FETCH, mmu_idx=2, probe=, retaddr=0) at /data/qemu/target/arm/tlb_helper.c:177
#7  0x560ef0adbd85 in tlb_fill (cpu=0x560ef326ac10, addr=18446674270391351284, size=0, access_type=MMU_INST_FETCH, mmu_idx=2, retaddr=0) at /data/qemu/accel/tcg/cputlb.c:1032
#8  0x560ef0adf216 in get_page_addr_code_hostp (env=, addr=addr@entry=18446674270391351284, hostp=hostp@entry=0x0) at /data/qemu/accel/tcg/cputlb.c:1211
#9  0x560ef0adf287 in get_page_addr_code (env=, addr=addr@entry=18446674270391351284) at /data/qemu/accel/tcg/cputlb.c:1243
#10 0x560ef0af21c4 in tb_htable_lookup (cpu=cpu@entry=0x560ef326ac10, pc=18446674270391351284, cs_base=, flags=2182107137, cf_mask=4278714368) at /data/qemu/accel/tcg/cpu-exec.c:337
#11 0x560ef0af2fd6 in tb_lookup__cpu_state (cf_mask=, flags=0x7f90e2ffc718, cs_base=0x7f90e2ffc720, pc=0x7f90e2ffc728, cpu=0x0) at /data/qemu/include/exec/tb-lookup.h:43
#12 0x560ef0af2fd6 in tb_find (cf_mask=524288, tb_exit=0, last_tb=0x0, cpu=0x0) at /data/qemu/accel/tcg/cpu-exec.c:404
#13 0x560ef0af2fd6 in cpu_exec (cpu=cpu@entry=0x560ef326ac10) at /data/qemu/accel/tcg/cpu-exec.c:748
#14 0x560ef0bb779f in tcg_cpu_exec (cpu=0x560ef326ac10) at /data/qemu/softmmu/cpus.c:1356
#15 0x560ef0bb980b in qemu_tcg_cpu_thread_fn (arg=arg@entry=0x560ef326ac10) at /data/qemu/softmmu/cpus.c:1664
#16 0x560ef10aaf76 in qemu_thread_start (args=) at /data/qemu/util/qemu-thread-posix.c:521
#17 0x7f919e9434f9 in start_thread () at /lib64/libpthread.so.0
#18 0x7f919e67bf2f in clone () at /lib64/libc.so.6

I've reproduced that with a local Jailhouse installation, but I suspect (do not have the time right now to check) that a vanilla jailhouse-images [1] build for qemu-arm64 will trigger it as well. Once time permits, I could try to generate and share such an image.

qemu 3.1.1.1 of my distro is fine, also f4d8cf148e43.

Any ideas?

Jan

[1] https://github.com/siemens/jailhouse-images

-- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [virtio-comment] [RFC] ivshmem v2: Shared memory device specification
On 15.07.20 15:27, Stefan Hajnoczi wrote: On Mon, May 25, 2020 at 09:58:28AM +0200, Jan Kiszka wrote:

IVSHMEM Device Specification

** NOTE: THIS IS WORK-IN-PROGRESS, NOT YET A STABLE INTERFACE SPECIFICATION! **

Hi Jan, Thanks for posting this! I have posted comments where I wasn't sure what the spec meant.

The goal of the Inter-VM Shared Memory (IVSHMEM) device model is to define the minimally needed building blocks a hypervisor has to provide for enabling guest-to-guest communication. The details of communication protocols shall remain opaque to the hypervisor so that guests are free to define them as simple or sophisticated as they need.

For that purpose, the IVSHMEM provides the following features to its users:

- Interconnection between up to 65536 peers
- Multi-purpose shared memory region
  - common read/writable section
  - output sections that are read/writable for one peer and only readable for the others
  - section with peer states
- Event signaling via interrupt to remote sides
- Support for life-cycle management via state value exchange and interrupt notification on changes, backed by a shared memory section
- Free choice of protocol to be used on top
- Protocol type declaration
- Register can be implemented either memory-mapped or via I/O, depending on platform support and lower VM-exit costs
- Unprivileged access to memory-mapped or I/O registers feasible
- Single discovery and configuration via standard PCI, no complexity by additionally defining a platform device model

Hypervisor Model

In order to provide a consistent link between peers, all connected instances of IVSHMEM devices need to be configured, created and run by the hypervisor according to the following requirements:

- The instances of the device shall appear as a PCI device to their users.
- The read/write shared memory section has to be of the same size for all peers. The size can be zero.
- If shared memory output sections are present (non-zero section size), there must be one reserved for each peer with exclusive write access. All output sections must have the same size and must be readable for all peers.
- The State Table must have the same size for all peers, must be large enough to hold the state values of all peers, and must be read-only for the user.

Who/what is the "user"? I guess this simply means that the State Table is read-only and only the hypervisor can update the table entries?

Read-only for the guest, right. I used the term "user" here and elsewhere, switching to "guest" might be better.

- State register changes (explicit writes, peer resets) have to be propagated to the other peers by updating the corresponding State Table entry and issuing an interrupt to all other peers if they enabled reception.
- Interrupt events triggered by a peer have to be delivered to the target peer, provided the receiving side is valid and has enabled the reception.
- All peers must have the same interrupt delivery features available, i.e. MSI-X with the same maximum number of vectors on platforms supporting this mechanism, otherwise INTx with one vector.

Guest-side Programming Model

An IVSHMEM device appears as a PCI device to its users. Unless otherwise noted, it conforms to the PCI Local Bus Specification, Revision 3.0. As such, it is discoverable via the PCI configuration space and provides a number of standard and custom PCI configuration registers.

### Shared Memory Region Layout

The shared memory region is divided into several sections.
+-----------------------------+ -
|                             | :
| Output Section for peer n-1 | : Output Section Size
|     (n = Maximum Peers)     | :
+-----------------------------+ -
:                             :
:                             :
+-----------------------------+ -
|                             | :
| Output Section for peer 1   | : Output Section Size
|                             | :
+-----------------------------+ -
|                             | :
| Output Section for peer 0   | : Output Section Size
|                             | :
+-----------------------------+ -
|                             | :
| Read/Write Section          | : R/W Section Size
|                             | :
+-----------------------------+ -
|                             | :
| State Table                 | : State Table Size
|                             | :
+-----------------------------+ <-- Shared memory base address

The first section consists of the mandatory State Table. Its size is defined by the State Table Size register and cannot be zero. This section is read-only.
Re: Inter-VM device emulation (call on Mon 20th July 2020)
On 15.07.20 13:23, Stefan Hajnoczi wrote: > Hi, > Several projects are underway to create an inter-VM device emulation > interface: > > * ivshmem v2 >https://www.mail-archive.com/qemu-devel@nongnu.org/msg706465.html > >A PCI device that provides shared-memory communication between VMs. >This device already exists but is limited in its current form. The >"v2" project updates IVSHMEM's capabilities and makes it suitable as >a VIRTIO transport. > >Jan Kiszka is working on this and has posted specs for review. > > * virtio-vhost-user >https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg06429.html > >A VIRTIO device that transports the vhost-user protocol. Allows >vhost-user device emulation to be implemented by another VM. > >Nikos Dragazis is working on this with QEMU, DPDK, and VIRTIO patches >posted. > > * VFIO-over-socket > > https://github.com/tmakatos/qemu/blob/master/docs/devel/vfio-over-socket.rst > >Similar to the vhost-user protocol in spirit but for any PCI device. >Uses the Linux VFIO ioctl API as the protocol instead of vhost. > >It doesn't have a virtio-vhost-user equivalent yet, but the same >approach could be applied to VFIO-over-socket too. > >Thanos Makatos and John G. Johnson are working on this. The draft >spec is available. > > Let's have a call to figure out: > > 1. What is unique about these approaches and how do they overlap? > 2. Can we focus development and code review efforts to get something >merged sooner? > > Jan and Nikos: do you have time to join on Monday, 20th of July at 15:00 > UTC? > https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200720T1500 > Not at that slot, but one hour earlier or later would work for me (so far). Jan > Video call URL: https://bluejeans.com/240406010 > > It would be nice if Thanos and/or JJ could join the call too. Others > welcome too (feel free to forward this email)! 
> > Stefan > -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [PATCH v2] apic: Report current_count via 'info lapic'
On 07.02.20 07:43, Jan Kiszka wrote: From: Jan Kiszka

This is helpful when debugging stuck guest timers. As we need apic_get_current_count for that, and it is really not emulation specific, move it to apic_common.c and export it. Fix its style at this chance as well.

Signed-off-by: Jan Kiszka
Reviewed-by: Philippe Mathieu-Daudé
---
Changes in v2:
- fix style of apic_get_current_count

 hw/intc/apic.c                  | 18 ------------------
 hw/intc/apic_common.c           | 19 +++++++++++++++++++
 include/hw/i386/apic_internal.h |  1 +
 target/i386/helper.c            |  5 +++--
 4 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index bd40467965..f2207d0ace 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -615,24 +615,6 @@ int apic_accept_pic_intr(DeviceState *dev)
     return 0;
 }
 
-static uint32_t apic_get_current_count(APICCommonState *s)
-{
-    int64_t d;
-    uint32_t val;
-    d = (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - s->initial_count_load_time) >>
-        s->count_shift;
-    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
-        /* periodic */
-        val = s->initial_count - (d % ((uint64_t)s->initial_count + 1));
-    } else {
-        if (d >= s->initial_count)
-            val = 0;
-        else
-            val = s->initial_count - d;
-    }
-    return val;
-}
-
 static void apic_timer_update(APICCommonState *s, int64_t current_time)
 {
     if (apic_next_timer(s, current_time)) {
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 9ec0f2deb2..fb432e83f2 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -189,6 +189,25 @@ bool apic_next_timer(APICCommonState *s, int64_t current_time)
     return true;
 }
 
+uint32_t apic_get_current_count(APICCommonState *s)
+{
+    int64_t d;
+    uint32_t val;
+    d = (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - s->initial_count_load_time) >>
+        s->count_shift;
+    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
+        /* periodic */
+        val = s->initial_count - (d % ((uint64_t)s->initial_count + 1));
+    } else {
+        if (d >= s->initial_count) {
+            val = 0;
+        } else {
+            val = s->initial_count - d;
+        }
+    }
+    return val;
+}
+
 void apic_init_reset(DeviceState *dev)
 {
     APICCommonState *s;
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index b04bdd947f..2597000e03 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -211,6 +211,7 @@ void vapic_report_tpr_access(DeviceState *dev, CPUState *cpu, target_ulong ip,
                              TPRAccess access);
 
 int apic_get_ppr(APICCommonState *s);
+uint32_t apic_get_current_count(APICCommonState *s);
 
 static inline void apic_set_bit(uint32_t *tab, int index)
 {
diff --git a/target/i386/helper.c b/target/i386/helper.c
index c3a6e4fabe..e3c3726c29 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -370,10 +370,11 @@ void x86_cpu_dump_local_apic_state(CPUState *cs, int flags)
     dump_apic_lvt("LVTTHMR", lvt[APIC_LVT_THERMAL], false);
     dump_apic_lvt("LVTT", lvt[APIC_LVT_TIMER], true);
 
-    qemu_printf("Timer\t DCR=0x%x (divide by %u) initial_count = %u\n",
+    qemu_printf("Timer\t DCR=0x%x (divide by %u) initial_count = %u"
+                " current_count = %u\n",
                 s->divide_conf & APIC_DCR_MASK,
                 divider_conf(s->divide_conf),
-                s->initial_count);
+                s->initial_count, apic_get_current_count(s));
 
     qemu_printf("SPIV\t 0x%08x APIC %s, focus=%s, spurious vec %u\n",
                 s->spurious_vec,

Ping.

Jan
Re: [RFC] ivshmem v2: Shared memory device specification
On 17.06.20 17:32, Alex Bennée wrote: > > Jan Kiszka writes: > >> Hi all, >> >> as requested by Michael, find below the current version of the Inter-VM >> Shared Memory device specification version 2 (as version 1 could be >> considered what is currently in QEMU). >> >> This posting is intended to collect feedback from the virtio community >> before officially proposing it to become part of the spec. As you can >> see, the spec format is not yet integrated with the virtio documents at >> all. >> >> IVSHMEM has value of its own, allowing unprivileged applications to >> establish links to other applications in VMs connected via this >> transport. In addition, and that is where virtio comes into play even >> more, it can be used to build virtio backend-frontend connections >> between two VMs. Prototypes have been developed, see e.g. [1], >> specifying that transport is still a to-do. I will try to reserve a few >> cycle in the upcoming weeks for a first draft. >> >> My current strategy for these two pieces, ivshmem2 and virtio-shmem, is >> propose them both to virtio but allowing virtio-shmem to also be mapped >> on other shared memory channels for virtual machines. >> >> The ivshmem2 device model is fairly stable now, also being in use in >> Jailhouse for quite a while. Still there are some aspects that could be >> worth to discuss in particular: >> >> - Do we also need a platform device model, in addition to PCI? My >>feelings are negative, but there has been at least one request. >> >> - Should we add some feature flags, or is using the PCI revision ID >>sufficient (...to do that later)? Currently, there is no need for >>communicating features this way. >> >> - Should we try to model the doorbell interface more flexibly, in way >>that may allow mapping it on hardware-provided mailboxes (i.e. VM- >>exit free channels)? Unfortunately, I'm not yet aware of any hardware >>that could provide this feature and, thus, could act as a use case to >>design against. 
>> >> Thanks, >> Jan >> >> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg668749.html >> >> --- >> >> IVSHMEM Device Specification >> >> >> ** NOTE: THIS IS WORK-IN-PROGRESS, NOT YET A STABLE INTERFACE SPECIFICATION! >> ** >> >> The goal of the Inter-VM Shared Memory (IVSHMEM) device model is to >> define the minimally needed building blocks a hypervisor has to >> provide for enabling guest-to-guest communication. The details of >> communication protocols shall remain opaque to the hypervisor so that >> guests are free to define them as simple or sophisticated as they >> need. >> >> For that purpose, the IVSHMEM provides the following features to its >> users: >> >> - Interconnection between up to 65536 peers >> >> - Multi-purpose shared memory region >> >> - common read/writable section >> >> - output sections that are read/writable for one peer and only >> readable for the others >> >> - section with peer states >> >> - Event signaling via interrupt to remote sides >> >> - Support for life-cycle management via state value exchange and >> interrupt notification on changes, backed by a shared memory >> section >> >> - Free choice of protocol to be used on top >> >> - Protocol type declaration >> >> - Register can be implemented either memory-mapped or via I/O, >> depending on platform support and lower VM-exit costs >> >> - Unprivileged access to memory-mapped or I/O registers feasible >> >> - Single discovery and configuration via standard PCI, no complexity >> by additionally defining a platform device model >> >> >> Hypervisor Model >> >> >> In order to provide a consistent link between peers, all connected >> instances of IVSHMEM devices need to be configured, created and run >> by the hypervisor according to the following requirements: >> >> - The instances of the device shall appear as a PCI device to their >> users. > > Would there ever be scope for a plain MMIO style device to be reported? 
As pointed out above, there has been a question already targeting an MMIO binding. > While I agree PCI is pretty useful from the point of view of easy > discovery, I note a number of machine types prefer device-tree-reported > devices because they are faster to bring up for cloudy fast-VM purposes
[RFC] ivshmem v2: Shared memory device specification
Hi all, as requested by Michael, find below the current version of the Inter-VM Shared Memory device specification version 2 (as version 1 could be considered what is currently in QEMU). This posting is intended to collect feedback from the virtio community before officially proposing it to become part of the spec. As you can see, the spec format is not yet integrated with the virtio documents at all. IVSHMEM has value of its own, allowing unprivileged applications to establish links to other applications in VMs connected via this transport. In addition, and that is where virtio comes into play even more, it can be used to build virtio backend-frontend connections between two VMs. Prototypes have been developed, see e.g. [1]; specifying that transport is still a to-do. I will try to reserve a few cycles in the upcoming weeks for a first draft. My current strategy for these two pieces, ivshmem2 and virtio-shmem, is to propose them both to virtio while allowing virtio-shmem to also be mapped on other shared memory channels for virtual machines. The ivshmem2 device model is fairly stable now, also being in use in Jailhouse for quite a while. Still, there are some aspects that could be worth discussing in particular: - Do we also need a platform device model, in addition to PCI? My feelings are negative, but there has been at least one request. - Should we add some feature flags, or is using the PCI revision ID sufficient (...to do that later)? Currently, there is no need for communicating features this way. - Should we try to model the doorbell interface more flexibly, in a way that may allow mapping it on hardware-provided mailboxes (i.e. VM-exit free channels)? Unfortunately, I'm not yet aware of any hardware that could provide this feature and, thus, could act as a use case to design against. 
Thanks, Jan [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg668749.html --- IVSHMEM Device Specification ** NOTE: THIS IS WORK-IN-PROGRESS, NOT YET A STABLE INTERFACE SPECIFICATION! ** The goal of the Inter-VM Shared Memory (IVSHMEM) device model is to define the minimally needed building blocks a hypervisor has to provide for enabling guest-to-guest communication. The details of communication protocols shall remain opaque to the hypervisor so that guests are free to define them as simple or sophisticated as they need. For that purpose, the IVSHMEM provides the following features to its users: - Interconnection between up to 65536 peers - Multi-purpose shared memory region - common read/writable section - output sections that are read/writable for one peer and only readable for the others - section with peer states - Event signaling via interrupt to remote sides - Support for life-cycle management via state value exchange and interrupt notification on changes, backed by a shared memory section - Free choice of protocol to be used on top - Protocol type declaration - Register can be implemented either memory-mapped or via I/O, depending on platform support and lower VM-exit costs - Unprivileged access to memory-mapped or I/O registers feasible - Single discovery and configuration via standard PCI, no complexity by additionally defining a platform device model Hypervisor Model In order to provide a consistent link between peers, all connected instances of IVSHMEM devices need to be configured, created and run by the hypervisor according to the following requirements: - The instances of the device shall appear as a PCI device to their users. - The read/write shared memory section has to be of the same size for all peers. The size can be zero. - If shared memory output sections are present (non-zero section size), there must be one reserved for each peer with exclusive write access. 
All output sections must have the same size and must be readable for all peers. - The State Table must have the same size for all peers, must be large enough to hold the state values of all peers, and must be read-only for the user. - State register changes (explicit writes, peer resets) have to be propagated to the other peers by updating the corresponding State Table entry and issuing an interrupt to all other peers if they enabled reception. - Interrupt events triggered by a peer have to be delivered to the target peer, provided the receiving side is valid and has enabled reception. - All peers must have the same interrupt delivery features available, i.e. MSI-X with the same maximum number of vectors on platforms supporting this mechanism, otherwise INTx with one vector. Guest-side Programming Model An IVSHMEM device appears as a PCI device to its users. Unless otherwise noted, it conforms to the PCI Local Bus Specification, Revision 3.0. As such, it is discoverable via the PCI configuration space and provides a number of
Re: [RFC][PATCH v2 2/3] docs/specs: Add specification of ivshmem device revision 2
On 21.05.20 20:18, Michael S. Tsirkin wrote: > On Thu, May 21, 2020 at 05:53:31PM +0100, Alex Bennée wrote: >> >> Jan Kiszka writes: >> >>> From: Jan Kiszka >>> >>> This imports the ivshmem v2 specification draft from Jailhouse where the >>> implementation is about to be merged now. The final home of the spec is >>> to be decided, this shall just simplify the review process at this >>> stage. >>> >>> Signed-off-by: Jan Kiszka >>> --- >>> docs/specs/ivshmem-2-device-spec.md | 376 >>> >>> 1 file changed, 376 insertions(+) >>> create mode 100644 docs/specs/ivshmem-2-device-spec.md >>> >>> diff --git a/docs/specs/ivshmem-2-device-spec.md >>> b/docs/specs/ivshmem-2-device-spec.md >>> new file mode 100644 >>> index 00..d93cb22b04 >>> --- /dev/null >>> +++ b/docs/specs/ivshmem-2-device-spec.md >>> @@ -0,0 +1,376 @@ >>> +IVSHMEM Device Specification >>> + >>> + >>> +** NOTE: THIS IS WORK-IN-PROGRESS, NOT YET A STABLE INTERFACE >>> SPECIFICATION! ** >> >> Has there been any proposal to the OASIS virtio spec to use this as a >> transport for VirtIO? >> Not yet, primarily because the second half - virtio over generic shared memory or even ivshmem2 in particular - has not been written yet. It only exists as prototypes. Lacking time. >> -- >> Alex Bennée > > > I think integrating this in the virtio spec need not gate the > implementation, but I'd like to see the current spec copied to the > virtio-comments list to possibly get comments from that community. > > Will do. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [RFC PATCH] hw/arm/musicpal: Map the UART devices unconditionally
On 05.05.20 11:59, Philippe Mathieu-Daudé wrote: I can't find proper documentation or a datasheet, but it is likely that an MMIO-mapped serial device in the 0x8000..0x8000 range belongs to the SoC address space and thus is always mapped on the memory bus. Map the devices on the bus regardless of whether a chardev is attached. Signed-off-by: Philippe Mathieu-Daudé --- from 2019... found while doing housekeeping --- hw/arm/musicpal.c | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/hw/arm/musicpal.c b/hw/arm/musicpal.c index b2d0cfdac8..92f33ed87e 100644 --- a/hw/arm/musicpal.c +++ b/hw/arm/musicpal.c @@ -1619,14 +1619,10 @@ static void musicpal_init(MachineState *machine) pic[MP_TIMER2_IRQ], pic[MP_TIMER3_IRQ], pic[MP_TIMER4_IRQ], NULL); -if (serial_hd(0)) { -serial_mm_init(address_space_mem, MP_UART1_BASE, 2, pic[MP_UART1_IRQ], - 1825000, serial_hd(0), DEVICE_NATIVE_ENDIAN); -} -if (serial_hd(1)) { -serial_mm_init(address_space_mem, MP_UART2_BASE, 2, pic[MP_UART2_IRQ], - 1825000, serial_hd(1), DEVICE_NATIVE_ENDIAN); -} +serial_mm_init(address_space_mem, MP_UART1_BASE, 2, pic[MP_UART1_IRQ], + 1825000, serial_hd(0), DEVICE_NATIVE_ENDIAN); +serial_mm_init(address_space_mem, MP_UART2_BASE, 2, pic[MP_UART2_IRQ], + 1825000, serial_hd(1), DEVICE_NATIVE_ENDIAN); /* Register flash */ dinfo = drive_get(IF_PFLASH, 0, 0); I don't recall the details anymore either (more than 10 years ago now...), but this looks reasonable. Reviewed-by: Jan Kiszka Jan
Re: [RFC][PATCH v2 0/3] IVSHMEM version 2 device for QEMU
On 29.04.20 06:15, Liang Yan wrote: Hi, All, Did a test for these patches, all looked fine. Test environment: Host: opensuse tumbleweed + latest upstream qemu + these three patches Guest: opensuse tumbleweed root fs + custom kernel(5.5) + related uio-ivshmem driver + ivshmem-console/ivshmem-block tools 1. lspci show 00:04.0 Unassigned class [ff80]: Siemens AG Device 4106 (prog-if 02) Subsystem: Red Hat, Inc. Device 1100 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Capabilities: [40] MSI-X: Enable+ Count=2 Masked- Vector table: BAR=1 offset= PBA: BAR=1 offset=0800 Kernel driver in use: virtio-ivshmem 2. virtio-ivshmem-console test 2.1 ivshmem2-server(host) airey:~/ivshmem/qemu/:[0]# ./ivshmem2-server -F -l 64K -n 2 -V 3 -P 0x8003 *** Example code, do not use in production *** 2.2 guest vm backend(test-01) localhost:~ # echo "110a 4106 1af4 1100 ffc003 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id [ 185.831277] uio_ivshmem :00:04.0: state_table at 0xfd80, size 0x1000 [ 185.835129] uio_ivshmem :00:04.0: rw_section at 0xfd801000, size 0x7000 localhost:~ # virtio/virtio-ivshmem-console /dev/uio0 Waiting for peer to be ready... 2.3 guest vm frontend(test-02) need to boot or reboot after backend is done 2.4 backend will serial output of frontend localhost:~ # virtio/virtio-ivshmem-console /dev/uio0 Waiting for peer to be ready... localhost:~/virtio # ./virtio-ivshmem-console /dev/uio0 Waiting for peer to be ready... 
Starting virtio device device_status: 0x0 device_status: 0x1 device_status: 0x3 device_features_sel: 1 device_features_sel: 0 driver_features_sel: 1 driver_features[1]: 0x13 driver_features_sel: 0 driver_features[0]: 0x1 device_status: 0xb queue_sel: 0 queue size: 8 queue driver vector: 1 queue desc: 0x200 queue driver: 0x280 queue device: 0x2c0 queue enable: 1 queue_sel: 1 queue size: 8 queue driver vector: 2 queue desc: 0x400 queue driver: 0x480 queue device: 0x4c0 queue enable: 1 device_status: 0xf Welcome to openSUSE Tumbleweed 20200326 - Kernel 5.5.0-rc5-1-default+ (hvc0). enp0s3: localhost login: 2.5 close backend and frontend will show localhost:~ # [ 185.685041] virtio-ivshmem :00:04.0: backend failed! 3. virtio-ivshmem-block test 3.1 ivshmem2-server(host) airey:~/ivshmem/qemu/:[0]# ./ivshmem2-server -F -l 1M -n 2 -V 2 -P 0x8002 *** Example code, do not use in production *** 3.2 guest vm backend(test-01) localhost:~ # echo "110a 4106 1af4 1100 ffc002 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id [ 77.701462] uio_ivshmem :00:04.0: state_table at 0xfd80, size 0x1000 [ 77.705231] uio_ivshmem :00:04.0: rw_section at 0xfd801000, size 0x000ff000 localhost:~ # virtio/virtio-ivshmem-block /dev/uio0 /root/disk.img Waiting for peer to be ready... 3.3 guest vm frontend(test-02) need to boot or reboot after backend is done 3.4 guest vm backend(test-01) localhost:~ # virtio/virtio-ivshmem-block /dev/uio0 /root/disk.img Waiting for peer to be ready... 
Starting virtio device device_status: 0x0 device_status: 0x1 device_status: 0x3 device_features_sel: 1 device_features_sel: 0 driver_features_sel: 1 driver_features[1]: 0x13 driver_features_sel: 0 driver_features[0]: 0x206 device_status: 0xb queue_sel: 0 queue size: 8 queue driver vector: 1 queue desc: 0x200 queue driver: 0x280 queue device: 0x2c0 queue enable: 1 device_status: 0xf 3.5 guest vm frontend(test-02), a new disk is attached: fdisk /dev/vdb Disk /dev/vdb: 192 KiB, 196608 bytes, 384 sectors Units: sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes 3.6 close backend and frontend will show localhost:~ # [ 1312.284301] virtio-ivshmem :00:04.0: backend failed! Tested-by: Liang Yan Thanks for testing this! I'll look into your patch findings. Jan On 1/7/20 9:36 AM, Jan Kiszka wrote: Overdue update of the ivshmem 2.0 device model as presented at [1]. Changes in v2: - changed PCI device ID to Siemens-granted one, adjusted PCI device revision to 0 - removed unused feature register from device - addressed feedback on specification document - rebased over master This version is now fully in sync with the implementation for Jailhouse that is currently under review [2][3], UIO and virtio-ivshmem drivers are shared. Jailhouse will very likely pick up this revision of the device in order to move forward with stressing it. More details on the usage with QEMU were in the original cover letter (with adjustements to the new device ID): If you want to play with this, the basic setup of the shared memory device is described in patch 1 and 3. UIO driver and also the virtio-ivshmem prototype can be found at http://git.kiszka.org/?p=linu
Re: [RFC][PATCH v2 0/3] IVSHMEM version 2 device for QEMU
On 09.04.20 15:52, Liang Yan wrote: On 1/7/20 9:36 AM, Jan Kiszka wrote: Overdue update of the ivshmem 2.0 device model as presented at [1]. Changes in v2: - changed PCI device ID to Siemens-granted one, adjusted PCI device revision to 0 - removed unused feature register from device - addressed feedback on specification document - rebased over master This version is now fully in sync with the implementation for Jailhouse that is currently under review [2][3], UIO and virtio-ivshmem drivers are shared. Jailhouse will very likely pick up this revision of the device in order to move forward with stressing it. More details on the usage with QEMU were in the original cover letter (with adjustments to the new device ID): If you want to play with this, the basic setup of the shared memory device is described in patch 1 and 3. UIO driver and also the virtio-ivshmem prototype can be found at http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2 Accessing the device via UIO is trivial enough. If you want to use it for virtio, this is needed on the virtio console backend side in addition to the description in patch 3: modprobe uio_ivshmem echo "110a 4106 1af4 1100 ffc003 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id linux/tools/virtio/virtio-ivshmem-console /dev/uio0 And for virtio block: echo "110a 4106 1af4 1100 ffc002 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id linux/tools/virtio/virtio-ivshmem-console /dev/uio0 /path/to/disk.img After that, you can start the QEMU frontend instance with the virtio-ivshmem driver installed, which can use the new /dev/hvc* or /dev/vda* as usual. Hi, Jan, Nice work. I did a full test of this new version. The QEMU device part looks good, virtio console worked as expected. Just had some issues with the virtio-ivshmem-block tests here. I suppose you mean "linux/tools/virtio/virtio-ivshmem-block"? 
Yes, copy mistake, had the same issue over in https://github.com/siemens/jailhouse/blob/master/Documentation/inter-cell-communication.md Noticed "ffc002" is the main difference; however, I saw no response when running the echo command here. Is there anything I need to prepare? I built the driver into the guest kernel already. Do I need a new protocol or anything for the below command line? ivshmem2-server -F -l 64K -n 2 -V 3 -P 0x8003 Yes, you need to adjust that command line - didn't I document that somewhere? Looks like I didn't: ivshmem2-server -F -l 1M -n 2 -V 2 -P 0x8002 i.e. a bit more memory is good (but this isn't speed-optimized anyway), you only need 2 vectors here (but more do no harm), and the protocol indeed needs adjustment (that is the key). Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
[PATCH] hw/i386/intel_iommu: Fix out-of-bounds access on guest IRT
From: Jan Kiszka vtd_irte_get failed to check the index against the configured table size, causing an out-of-bounds access on guest memory and potentially misinterpreting the result. Signed-off-by: Jan Kiszka --- BTW, we still miss error reporting emulation, right? Therefore, I added that simple error_report_once thing, like the other paths do. hw/i386/intel_iommu.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 204b6841ec..df7ad254ac 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -3094,6 +3094,12 @@ static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index, uint16_t mask, source_id; uint8_t bus, bus_max, bus_min; +if (index >= iommu->intr_size) { +error_report_once("%s: index too large: ind=0x%x", + __func__, index); +return -VTD_FR_IR_INDEX_OVER; +} + addr = iommu->intr_root + index * sizeof(*entry); if (dma_memory_read(&address_space_memory, addr, entry, sizeof(*entry))) { -- 2.16.4 -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [PATCH v2 0/2] ui/gtk: Fix gd_refresh_rate_millihz() when widget window is not realized
On 08.02.20 17:10, Philippe Mathieu-Daudé wrote: Fix bug report from Jan Kiszka: https://www.mail-archive.com/qemu-devel@nongnu.org/msg678130.html https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg02260.html Supersedes: <20200208143008.6157-1-f4...@amsat.org> Philippe Mathieu-Daudé (2): ui/gtk: Update gd_refresh_rate_millihz() to handle VirtualConsole ui/gtk: Fix gd_refresh_rate_millihz() when widget window is not realized ui/gtk.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) Yep. Tested-by: Jan Kiszka
Re: [PATCH] ui/gtk: Fix gd_refresh_rate_millihz() when using VirtualConsole
On 08.02.20 16:20, Jan Kiszka wrote: > On 08.02.20 15:30, Philippe Mathieu-Daudé wrote: >> Fix using virtual console under gtk 3.22.30 (mate 1.20.1): >> >> qemu-system-x86_64: Gdk: gdk_window_get_origin: assertion >> 'GDK_IS_WINDOW (window)' failed >> >> Fixes: c4c00922cc and 28b58f19d2 (display/gtk: get proper refreshrate) >> Reported-by: Jan Kiszka >> Signed-off-by: Philippe Mathieu-Daudé >> --- >> Cc: Nikola Pavlica >> Report: >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg678130.html >> --- >> ui/gtk.c | 9 + >> 1 file changed, 5 insertions(+), 4 deletions(-) >> >> diff --git a/ui/gtk.c b/ui/gtk.c >> index d18892d1de..c59297ff4d 100644 >> --- a/ui/gtk.c >> +++ b/ui/gtk.c >> @@ -1965,11 +1965,11 @@ static GtkWidget >> *gd_create_menu_machine(GtkDisplayState *s) >> * If available, return the refresh rate of the display in milli-Hertz, >> * else return 0. >> */ >> -static int gd_refresh_rate_millihz(GtkDisplayState *s) >> +static int gd_refresh_rate_millihz(GtkWidget *window) >> { >> #ifdef GDK_VERSION_3_22 >> - GdkDisplay *dpy = gtk_widget_get_display(s->window); >> - GdkWindow *win = gtk_widget_get_window(s->window); >> + GdkDisplay *dpy = gtk_widget_get_display(window); >> + GdkWindow *win = gtk_widget_get_window(window); >> GdkMonitor *monitor = gdk_display_get_monitor_at_window(dpy, win); > > Nope, no change. It's triggered right from ui/gtk.c:1973, the line above. > >> return gdk_monitor_get_refresh_rate(monitor); >> @@ -2045,7 +2045,8 @@ static GSList *gd_vc_gfx_init(GtkDisplayState >> *s, VirtualConsole *vc, >> vc->gfx.kbd = qkbd_state_init(con); >> vc->gfx.dcl.con = con; >> - refresh_rate_millihz = gd_refresh_rate_millihz(s); >> + refresh_rate_millihz = gd_refresh_rate_millihz(vc->window ? 
>> + vc->window : >> s->window); >> if (refresh_rate_millihz) { >> vc->gfx.dcl.update_interval = MILLISEC_PER_SEC / >> refresh_rate_millihz; >> } >> > > Jan Here is the full backtrace, just in case: #0 0x7496cf70 in gdk_window_get_origin () from /usr/lib64/libgdk-3.so.0 #1 0x749582a0 in gdk_display_get_monitor_at_window () from /usr/lib64/libgdk-3.so.0 #2 0x55bb73e2 in gd_refresh_rate_millihz (window=0x579d6280) at /data/qemu/ui/gtk.c:1973 #3 gd_vc_gfx_init (view_menu=0x579f0590, group=0x0, idx=0, con=, vc=0x579d4a90, s=0x579d49f0) at /data/qemu/ui/gtk.c:2048 #4 gd_create_menu_view (s=0x579d49f0) at /data/qemu/ui/gtk.c:2149 #5 gd_create_menus (s=0x579d49f0) at /data/qemu/ui/gtk.c:2188 #6 gtk_display_init (ds=, opts=0x5661ed80 ) at /data/qemu/ui/gtk.c:2256 #7 0x5583d5a0 in main (argc=, argv=, envp=) at /data/qemu/vl.c:4358 Jan
Re: [PATCH] ui/gtk: Fix gd_refresh_rate_millihz() when using VirtualConsole
On 08.02.20 15:30, Philippe Mathieu-Daudé wrote: Fix using virtual console under gtk 3.22.30 (mate 1.20.1): qemu-system-x86_64: Gdk: gdk_window_get_origin: assertion 'GDK_IS_WINDOW (window)' failed Fixes: c4c00922cc and 28b58f19d2 (display/gtk: get proper refreshrate) Reported-by: Jan Kiszka Signed-off-by: Philippe Mathieu-Daudé --- Cc: Nikola Pavlica Report: https://www.mail-archive.com/qemu-devel@nongnu.org/msg678130.html --- ui/gtk.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/ui/gtk.c b/ui/gtk.c index d18892d1de..c59297ff4d 100644 --- a/ui/gtk.c +++ b/ui/gtk.c @@ -1965,11 +1965,11 @@ static GtkWidget *gd_create_menu_machine(GtkDisplayState *s) * If available, return the refresh rate of the display in milli-Hertz, * else return 0. */ -static int gd_refresh_rate_millihz(GtkDisplayState *s) +static int gd_refresh_rate_millihz(GtkWidget *window) { #ifdef GDK_VERSION_3_22 -GdkDisplay *dpy = gtk_widget_get_display(s->window); -GdkWindow *win = gtk_widget_get_window(s->window); +GdkDisplay *dpy = gtk_widget_get_display(window); +GdkWindow *win = gtk_widget_get_window(window); GdkMonitor *monitor = gdk_display_get_monitor_at_window(dpy, win); Nope, no change. It's triggered right from ui/gtk.c:1973, the line above. return gdk_monitor_get_refresh_rate(monitor); @@ -2045,7 +2045,8 @@ static GSList *gd_vc_gfx_init(GtkDisplayState *s, VirtualConsole *vc, vc->gfx.kbd = qkbd_state_init(con); vc->gfx.dcl.con = con; -refresh_rate_millihz = gd_refresh_rate_millihz(s); +refresh_rate_millihz = gd_refresh_rate_millihz(vc->window ? + vc->window : s->window); if (refresh_rate_millihz) { vc->gfx.dcl.update_interval = MILLISEC_PER_SEC / refresh_rate_millihz; } Jan
Re: [PATCH] ui/gtk: Get display refresh rate with GDK version 3.22 or later
On 16.01.20 02:12, Philippe Mathieu-Daudé wrote: > The GdkMonitor was introduced in GTK+ 3.22: > https://developer.gnome.org/gdk3/stable/api-index-3-22.html#api-index-3.22 > > If we build with older GTK+, the build fails: > > CC ui/gtk.o >qemu/ui/gtk.c: In function ‘gd_vc_gfx_init’: >qemu/ui/gtk.c:1973:5: error: unknown type name ‘GdkMonitor’ > GdkMonitor *monitor = gdk_display_get_monitor_at_window(dpy, win); > ^ >qemu/ui/gtk.c:1973:27: error: implicit declaration of function > ‘gdk_display_get_monitor_at_window’ [-Werror=implicit-function-declaration] > GdkMonitor *monitor = gdk_display_get_monitor_at_window(dpy, win); > ^ >qemu/ui/gtk.c:1973:5: error: nested extern declaration of > ‘gdk_display_get_monitor_at_window’ [-Werror=nested-externs] > GdkMonitor *monitor = gdk_display_get_monitor_at_window(dpy, win); > ^ >qemu/ui/gtk.c:1973:27: error: initialization makes pointer from integer > without a cast [-Werror=int-conversion] > GdkMonitor *monitor = gdk_display_get_monitor_at_window(dpy, win); > ^ >qemu/ui/gtk.c:2035:28: error: implicit declaration of function > ‘gdk_monitor_get_refresh_rate’ [-Werror=implicit-function-declaration] > refresh_rate_millihz = gdk_monitor_get_refresh_rate(monitor); >^ >qemu/ui/gtk.c:2035:5: error: nested extern declaration of > ‘gdk_monitor_get_refresh_rate’ [-Werror=nested-externs] > refresh_rate_millihz = gdk_monitor_get_refresh_rate(monitor); > ^ >cc1: all warnings being treated as errors >qemu/rules.mak:69: recipe for target 'ui/gtk.o' failed >make: *** [ui/gtk.o] Error 1 > > We only use the GdkMonitor API to get the monitor refresh rate. > > GTK+ provides convenient definition in > (already include by ) to check which API are available. > > Extract this code as a new gd_refresh_rate_millihz() function, > and check GDK_VERSION_3_22 is defined before calling its API. > If it is not defined, return 0. This is safe and fixes our build > failure. 
> > Fixes: c4c00922cc (display/gtk: get proper refreshrate) > Signed-off-by: Philippe Mathieu-Daudé > --- > Sorry I missed that, I only tested Nikola's patch on my workstation > which runs Fedora 30: > >$ pkg-config --modversion gtk+-3.0 >3.24.11 > > If Gerd acks this patch, we might consider having it directly > applied as a build fix. > --- > ui/gtk.c | 23 ++- > 1 file changed, 18 insertions(+), 5 deletions(-) > > diff --git a/ui/gtk.c b/ui/gtk.c > index 7355d34fcf..d18892d1de 100644 > --- a/ui/gtk.c > +++ b/ui/gtk.c > @@ -1961,6 +1961,23 @@ static GtkWidget > *gd_create_menu_machine(GtkDisplayState *s) > return machine_menu; > } > > +/* > + * If available, return the refresh rate of the display in milli-Hertz, > + * else return 0. > + */ > +static int gd_refresh_rate_millihz(GtkDisplayState *s) > +{ > +#ifdef GDK_VERSION_3_22 > +GdkDisplay *dpy = gtk_widget_get_display(s->window); > +GdkWindow *win = gtk_widget_get_window(s->window); > +GdkMonitor *monitor = gdk_display_get_monitor_at_window(dpy, win); > + > +return gdk_monitor_get_refresh_rate(monitor); > +#else > +return 0; > +#endif > +} > + > static GSList *gd_vc_gfx_init(GtkDisplayState *s, VirtualConsole *vc, > QemuConsole *con, int idx, > GSList *group, GtkWidget *view_menu) > @@ -1968,10 +1985,6 @@ static GSList *gd_vc_gfx_init(GtkDisplayState *s, > VirtualConsole *vc, > bool zoom_to_fit = false; > int refresh_rate_millihz; > > -GdkDisplay *dpy = gtk_widget_get_display(s->window); > -GdkWindow *win = gtk_widget_get_window(s->window); > -GdkMonitor *monitor = gdk_display_get_monitor_at_window(dpy, win); > - > vc->label = qemu_console_get_label(con); > vc->s = s; > vc->gfx.scale_x = 1.0; > @@ -2032,7 +2045,7 @@ static GSList *gd_vc_gfx_init(GtkDisplayState *s, > VirtualConsole *vc, > vc->gfx.kbd = qkbd_state_init(con); > vc->gfx.dcl.con = con; > > -refresh_rate_millihz = gdk_monitor_get_refresh_rate(monitor); > +refresh_rate_millihz = gd_refresh_rate_millihz(s); > if (refresh_rate_millihz) { > 
vc->gfx.dcl.update_interval = MILLISEC_PER_SEC / > refresh_rate_millihz; > } > This (as well as c4c00922cc) gives qemu-system-x86_64: Gdk: gdk_window_get_origin: assertion 'GDK_IS_WINDOW (window)' failed on startup under gtk 3.22.30 (mate 1.20.1). Jan
[PATCH v2] apic: Report current_count via 'info lapic'
From: Jan Kiszka This is helpful when debugging stuck guest timers. As we need apic_get_current_count for that, and it is really not emulation specific, move it to apic_common.c and export it. Fix its style at this chance as well. Signed-off-by: Jan Kiszka Reviewed-by: Philippe Mathieu-Daudé --- Changes in v2: - fix style of apic_get_current_count hw/intc/apic.c | 18 -- hw/intc/apic_common.c | 19 +++ include/hw/i386/apic_internal.h | 1 + target/i386/helper.c| 5 +++-- 4 files changed, 23 insertions(+), 20 deletions(-) diff --git a/hw/intc/apic.c b/hw/intc/apic.c index bd40467965..f2207d0ace 100644 --- a/hw/intc/apic.c +++ b/hw/intc/apic.c @@ -615,24 +615,6 @@ int apic_accept_pic_intr(DeviceState *dev) return 0; } -static uint32_t apic_get_current_count(APICCommonState *s) -{ -int64_t d; -uint32_t val; -d = (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - s->initial_count_load_time) >> -s->count_shift; -if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) { -/* periodic */ -val = s->initial_count - (d % ((uint64_t)s->initial_count + 1)); -} else { -if (d >= s->initial_count) -val = 0; -else -val = s->initial_count - d; -} -return val; -} - static void apic_timer_update(APICCommonState *s, int64_t current_time) { if (apic_next_timer(s, current_time)) { diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c index 9ec0f2deb2..fb432e83f2 100644 --- a/hw/intc/apic_common.c +++ b/hw/intc/apic_common.c @@ -189,6 +189,25 @@ bool apic_next_timer(APICCommonState *s, int64_t current_time) return true; } +uint32_t apic_get_current_count(APICCommonState *s) +{ +int64_t d; +uint32_t val; +d = (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - s->initial_count_load_time) >> +s->count_shift; +if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) { +/* periodic */ +val = s->initial_count - (d % ((uint64_t)s->initial_count + 1)); +} else { +if (d >= s->initial_count) { +val = 0; +} else { +val = s->initial_count - d; +} +} +return val; +} + void apic_init_reset(DeviceState *dev) { APICCommonState 
*s; diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h index b04bdd947f..2597000e03 100644 --- a/include/hw/i386/apic_internal.h +++ b/include/hw/i386/apic_internal.h @@ -211,6 +211,7 @@ void vapic_report_tpr_access(DeviceState *dev, CPUState *cpu, target_ulong ip, TPRAccess access); int apic_get_ppr(APICCommonState *s); +uint32_t apic_get_current_count(APICCommonState *s); static inline void apic_set_bit(uint32_t *tab, int index) { diff --git a/target/i386/helper.c b/target/i386/helper.c index c3a6e4fabe..e3c3726c29 100644 --- a/target/i386/helper.c +++ b/target/i386/helper.c @@ -370,10 +370,11 @@ void x86_cpu_dump_local_apic_state(CPUState *cs, int flags) dump_apic_lvt("LVTTHMR", lvt[APIC_LVT_THERMAL], false); dump_apic_lvt("LVTT", lvt[APIC_LVT_TIMER], true); -qemu_printf("Timer\t DCR=0x%x (divide by %u) initial_count = %u\n", +qemu_printf("Timer\t DCR=0x%x (divide by %u) initial_count = %u" +" current_count = %u\n", s->divide_conf & APIC_DCR_MASK, divider_conf(s->divide_conf), -s->initial_count); +s->initial_count, apic_get_current_count(s)); qemu_printf("SPIV\t 0x%08x APIC %s, focus=%s, spurious vec %u\n", s->spurious_vec, -- 2.16.4
Re: [PATCH] apic: Report current_count via 'info lapic'
On 06.02.20 23:36, Philippe Mathieu-Daudé wrote: > On 2/6/20 8:50 PM, Jan Kiszka wrote: >> From: Jan Kiszka >> >> This is helpful when debugging stuck guest timers. >> >> As we need apic_get_current_count for that, and it is really not >> emulation specific, move it to apic_common.c and export it. >> >> Signed-off-by: Jan Kiszka >> --- >> hw/intc/apic.c | 18 -- >> hw/intc/apic_common.c | 18 ++ >> include/hw/i386/apic_internal.h | 1 + >> target/i386/helper.c | 5 +++-- >> 4 files changed, 22 insertions(+), 20 deletions(-) >> >> diff --git a/hw/intc/apic.c b/hw/intc/apic.c >> index bd40467965..f2207d0ace 100644 >> --- a/hw/intc/apic.c >> +++ b/hw/intc/apic.c >> @@ -615,24 +615,6 @@ int apic_accept_pic_intr(DeviceState *dev) >> return 0; >> } >> -static uint32_t apic_get_current_count(APICCommonState *s) >> -{ >> - int64_t d; >> - uint32_t val; >> - d = (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - >> s->initial_count_load_time) >> >> - s->count_shift; >> - if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) { >> - /* periodic */ >> - val = s->initial_count - (d % ((uint64_t)s->initial_count + 1)); >> - } else { >> - if (d >= s->initial_count) >> - val = 0; >> - else >> - val = s->initial_count - d; >> - } >> - return val; >> -} >> - >> static void apic_timer_update(APICCommonState *s, int64_t current_time) >> { >> if (apic_next_timer(s, current_time)) { >> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c >> index 9ec0f2deb2..6f4e877878 100644 >> --- a/hw/intc/apic_common.c >> +++ b/hw/intc/apic_common.c >> @@ -189,6 +189,24 @@ bool apic_next_timer(APICCommonState *s, int64_t >> current_time) >> return true; >> } >> +uint32_t apic_get_current_count(APICCommonState *s) >> +{ >> + int64_t d; >> + uint32_t val; >> + d = (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - >> s->initial_count_load_time) >> >> + s->count_shift; >> + if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) { >> + /* periodic */ >> + val = s->initial_count - (d % ((uint64_t)s->initial_count + 1)); >> 
+ } else { >> + if (d >= s->initial_count) >> + val = 0; >> + else >> + val = s->initial_count - d; > > Using QEMU style if () {} else {}: Yeah, that happens when you move old code - will address. > Reviewed-by: Philippe Mathieu-Daudé Thanks, Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
[PATCH] apic: Report current_count via 'info lapic'
From: Jan Kiszka

This is helpful when debugging stuck guest timers.

As we need apic_get_current_count for that, and it is really not
emulation specific, move it to apic_common.c and export it.

Signed-off-by: Jan Kiszka
---
 hw/intc/apic.c                  | 18 --
 hw/intc/apic_common.c           | 18 ++
 include/hw/i386/apic_internal.h |  1 +
 target/i386/helper.c            |  5 +++--
 4 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index bd40467965..f2207d0ace 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -615,24 +615,6 @@ int apic_accept_pic_intr(DeviceState *dev)
     return 0;
 }
 
-static uint32_t apic_get_current_count(APICCommonState *s)
-{
-    int64_t d;
-    uint32_t val;
-    d = (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - s->initial_count_load_time) >>
-        s->count_shift;
-    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
-        /* periodic */
-        val = s->initial_count - (d % ((uint64_t)s->initial_count + 1));
-    } else {
-        if (d >= s->initial_count)
-            val = 0;
-        else
-            val = s->initial_count - d;
-    }
-    return val;
-}
-
 static void apic_timer_update(APICCommonState *s, int64_t current_time)
 {
     if (apic_next_timer(s, current_time)) {
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index 9ec0f2deb2..6f4e877878 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -189,6 +189,24 @@ bool apic_next_timer(APICCommonState *s, int64_t current_time)
     return true;
 }
 
+uint32_t apic_get_current_count(APICCommonState *s)
+{
+    int64_t d;
+    uint32_t val;
+    d = (qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - s->initial_count_load_time) >>
+        s->count_shift;
+    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
+        /* periodic */
+        val = s->initial_count - (d % ((uint64_t)s->initial_count + 1));
+    } else {
+        if (d >= s->initial_count)
+            val = 0;
+        else
+            val = s->initial_count - d;
+    }
+    return val;
+}
+
 void apic_init_reset(DeviceState *dev)
 {
     APICCommonState *s;
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index b04bdd947f..2597000e03 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -211,6 +211,7 @@ void vapic_report_tpr_access(DeviceState *dev, CPUState *cpu, target_ulong ip,
                              TPRAccess access);
 
 int apic_get_ppr(APICCommonState *s);
+uint32_t apic_get_current_count(APICCommonState *s);
 
 static inline void apic_set_bit(uint32_t *tab, int index)
 {
diff --git a/target/i386/helper.c b/target/i386/helper.c
index c3a6e4fabe..e3c3726c29 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -370,10 +370,11 @@ void x86_cpu_dump_local_apic_state(CPUState *cs, int flags)
     dump_apic_lvt("LVTTHMR", lvt[APIC_LVT_THERMAL], false);
     dump_apic_lvt("LVTT", lvt[APIC_LVT_TIMER], true);
 
-    qemu_printf("Timer\t DCR=0x%x (divide by %u) initial_count = %u\n",
+    qemu_printf("Timer\t DCR=0x%x (divide by %u) initial_count = %u"
+                " current_count = %u\n",
                 s->divide_conf & APIC_DCR_MASK,
                 divider_conf(s->divide_conf),
-                s->initial_count);
+                s->initial_count, apic_get_current_count(s));
     qemu_printf("SPIV\t 0x%08x APIC %s, focus=%s, spurious vec %u\n",
                 s->spurious_vec,
--
2.16.4
Re: [RFC 0/9] Add an interVM memory sharing device
On 05.02.20 15:39, Stefan Hajnoczi wrote: On Tue, Feb 04, 2020 at 12:30:42PM +0100, i.kotrasi...@partner.samsung.com wrote: From: Igor Kotrasinski This patchset adds a "memory exposing" device that allows two QEMU instances to share arbitrary memory regions. Unlike ivshmem, it does not create a new region of memory that's shared between VMs, but instead allows one VM to access any memory region of the other VM we choose to share. The motivation for this device is a sort of ARM Trustzone "emulation", where a rich system running on one machine (e.g. x86_64 linux) is able to perform SMCs to a trusted system running on another (e.g. OpTEE on ARM). With a device that allows sharing arbitrary memory between VMs, this can be achieved with minimal changes to the trusted system and its linux driver while allowing the rich system to run on a speedier x86 emulator. I prepared additional patches for linux, OpTEE OS and OpTEE build system as a PoC that such emulation works and passes OpTEE tests; I'm not sure what would be the best way to share them. This patchset is my first foray into QEMU source code and I'm certain it's not yet ready to be merged in. I'm not sure whether memory sharing code has any race conditions or breaks rules of working with memory regions, or if having VMs communicate synchronously via chardevs is the right way to do it. I do believe the basic idea for sharing memory regions is sound and that it could be useful for inter-VM communication. Hi, Without having looked into the patches yet, I'm already wondering if you can use the existing -object memory-backend-file,size=512M,mem-path=/my/shared/mem feature for your use case? That's the existing mechanism for fully sharing guest RAM and if you want to share all of memory then maybe a device is not necessary - just share the memory. I suspect it's about sharing that memory in a discoverable way. Maybe it is also about the signalling channel defined in the device. 
OTOH, when it's really about sharing everything, even bidirectional, that rather looks like a pragmatic shortcut, not a generic model. The patches should clarify their use case a bit further, I think. The title suggests a generic sharing solution, but my impression is that it rather caters to a specific case under specific boundary conditions. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
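For context, the existing mechanism Stefan refers to maps a host file as guest RAM on both QEMU instances; every guest page is then shared between the two VMs. A rough sketch (id, size, and path are placeholders; `share=on` is needed so writes become visible across processes):

```shell
# Start both VMs with their RAM backed by the same host file:
qemu-system-x86_64 \
    -object memory-backend-file,id=mem0,size=512M,mem-path=/dev/shm/shared-ram,share=on \
    -numa node,memdev=mem0 \
    -m 512 ...
```

As the thread notes, this shares everything but provides neither discoverability nor a signaling channel, which is what the proposed device adds.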
[RFC][PATCH v2 1/3] hw/misc: Add implementation of ivshmem revision 2 device
From: Jan Kiszka

This adds a reimplementation of ivshmem in its new revision 2 as a
separate device. The goal of this is not to enable sharing with v1,
rather to allow exploring the properties and potential limitations of
the new version prior to discussing its integration with the existing
code.

v2 always requires a server to interconnect two or more QEMU instances
because it provides signaling between peers unconditionally. Therefore,
only the interconnecting chardev, master mode, and the usage of
ioeventfd can be configured at device level. All other parameters are
defined by the server instance.

A new server protocol is introduced along with this. Its primary
difference is the introduction of a single welcome message that
contains all peer parameters, rather than a series of single-word
messages pushing them piece by piece.

A complicating difference in interrupt handling, compared to v1, is the
auto-disable mode of v2: When this is active, interrupt delivery is
disabled by the device after each interrupt event. This prevents the
usage of irqfd on the receiving side, but it lowers the handling cost
for guests that implemented interrupt throttling this way (specifically
when exposing the device via UIO).

No changes have been made to the ivshmem device regarding migration:
Only the master can live-migrate, slave peers have to hot-unplug the
device first.

The details of the device model will be specified in a succeeding
commit. Drivers for this device can currently be found under

    http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2

To instantiate an ivshmem v2 device, just add

    ...
    -chardev socket,path=/tmp/ivshmem_socket,id=ivshmem \
    -device ivshmem,chardev=ivshmem

provided the server opened its socket under the default path.
Signed-off-by: Jan Kiszka
---
 hw/misc/Makefile.objs      |    2 +-
 hw/misc/ivshmem2.c         | 1085
 include/hw/misc/ivshmem2.h |   48 ++
 include/hw/pci/pci_ids.h   |    2 +
 4 files changed, 1136 insertions(+), 1 deletion(-)
 create mode 100644 hw/misc/ivshmem2.c
 create mode 100644 include/hw/misc/ivshmem2.h

diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index ba898a5781..90a4a6608c 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -26,7 +26,7 @@ common-obj-$(CONFIG_PUV3) += puv3_pm.o
 
 common-obj-$(CONFIG_MACIO) += macio/
 
-common-obj-$(CONFIG_IVSHMEM_DEVICE) += ivshmem.o
+common-obj-$(CONFIG_IVSHMEM_DEVICE) += ivshmem.o ivshmem2.o
 
 common-obj-$(CONFIG_REALVIEW) += arm_sysctl.o
 common-obj-$(CONFIG_NSERIES) += cbus.o
diff --git a/hw/misc/ivshmem2.c b/hw/misc/ivshmem2.c
new file mode 100644
index 00..d5f88ed0e9
--- /dev/null
+++ b/hw/misc/ivshmem2.c
@@ -0,0 +1,1085 @@
+/*
+ * Inter-VM Shared Memory PCI device, version 2.
+ *
+ * Copyright (c) Siemens AG, 2019
+ *
+ * Authors:
+ *  Jan Kiszka
+ *
+ * Based on ivshmem.c by Cam Macdonell
+ *
+ * This code is licensed under the GNU GPL v2.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu/cutils.h"
+#include "hw/hw.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "hw/qdev-properties.h"
+#include "sysemu/kvm.h"
+#include "migration/blocker.h"
+#include "migration/vmstate.h"
+#include "qemu/error-report.h"
+#include "qemu/event_notifier.h"
+#include "qemu/module.h"
+#include "qom/object_interfaces.h"
+#include "chardev/char-fe.h"
+#include "sysemu/qtest.h"
+#include "qapi/visitor.h"
+
+#include "hw/misc/ivshmem2.h"
+
+#define PCI_VENDOR_ID_IVSHMEM   PCI_VENDOR_ID_SIEMENS
+#define PCI_DEVICE_ID_IVSHMEM   0x4106
+
+#define IVSHMEM_MAX_PEERS       UINT16_MAX
+#define IVSHMEM_IOEVENTFD       0
+#define IVSHMEM_MSI             1
+
+#define IVSHMEM_REG_BAR_SIZE    0x1000
+
+#define IVSHMEM_REG_ID          0x00
+#define IVSHMEM_REG_MAX_PEERS   0x04
+#define IVSHMEM_REG_INT_CTRL    0x08
+#define IVSHMEM_REG_DOORBELL    0x0c
+#define IVSHMEM_REG_STATE       0x10
+
+#define IVSHMEM_INT_ENABLE      0x1
+
+#define IVSHMEM_ONESHOT_MODE    0x1
+
+#define IVSHMEM_DEBUG 0
+#define IVSHMEM_DPRINTF(fmt, ...)                       \
+    do {                                                \
+        if (IVSHMEM_DEBUG) {                            \
+            printf("IVSHMEM: " fmt, ## __VA_ARGS__);    \
+        }                                               \
+    } while (0)
+
+#define TYPE_IVSHMEM "ivshmem"
+#define IVSHMEM(obj) \
+    OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM)
+
+typedef struct Peer {
+    int nb_eventfds;
+    EventNotifier *eventfds;
+} Peer;
+
+typedef struct MSIVector {
+    PCIDevice *pdev;
+    int virq;
+    bool unmasked;
+} MSIVector;
+
+typedef struct IVShmemVndrCap {
+    uint8_t id;
[RFC][PATCH v2 2/3] docs/specs: Add specification of ivshmem device revision 2
From: Jan Kiszka

This imports the ivshmem v2 specification draft from Jailhouse where
the implementation is about to be merged now. The final home of the
spec is to be decided; this shall just simplify the review process at
this stage.

Signed-off-by: Jan Kiszka
---
 docs/specs/ivshmem-2-device-spec.md | 376
 1 file changed, 376 insertions(+)
 create mode 100644 docs/specs/ivshmem-2-device-spec.md

diff --git a/docs/specs/ivshmem-2-device-spec.md b/docs/specs/ivshmem-2-device-spec.md
new file mode 100644
index 00..d93cb22b04
--- /dev/null
+++ b/docs/specs/ivshmem-2-device-spec.md
@@ -0,0 +1,376 @@
+IVSHMEM Device Specification
+============================
+
+** NOTE: THIS IS WORK-IN-PROGRESS, NOT YET A STABLE INTERFACE SPECIFICATION! **
+
+The goal of the Inter-VM Shared Memory (IVSHMEM) device model is to
+define the minimally needed building blocks a hypervisor has to
+provide for enabling guest-to-guest communication. The details of
+communication protocols shall remain opaque to the hypervisor so that
+guests are free to define them as simple or sophisticated as they
+need.
+
+For that purpose, the IVSHMEM provides the following features to its
+users:
+
+- Interconnection between up to 65536 peers
+
+- Multi-purpose shared memory region
+
+    - common read/writable section
+
+    - output sections that are read/writable for one peer and only
+      readable for the others
+
+    - section with peer states
+
+- Event signaling via interrupt to remote sides
+
+- Support for life-cycle management via state value exchange and
+  interrupt notification on changes, backed by a shared memory
+  section
+
+- Free choice of protocol to be used on top
+
+    - Protocol type declaration
+
+- Registers can be implemented either memory-mapped or via I/O,
+  depending on platform support and lower VM-exit costs
+
+- Unprivileged access to memory-mapped or I/O registers feasible
+
+- Single discovery and configuration via standard PCI, no complexity
+  by additionally defining a platform device model
+
+
+Hypervisor Model
+----------------
+
+In order to provide a consistent link between peers, all connected
+instances of IVSHMEM devices need to be configured, created and run
+by the hypervisor according to the following requirements:
+
+- The instances of the device shall appear as a PCI device to their
+  users.
+
+- The read/write shared memory section has to be of the same size for
+  all peers. The size can be zero.
+
+- If shared memory output sections are present (non-zero section
+  size), there must be one reserved for each peer with exclusive
+  write access. All output sections must have the same size and must
+  be readable for all peers.
+
+- The State Table must have the same size for all peers, must be
+  large enough to hold the state values of all peers, and must be
+  read-only for the user.
+
+- State register changes (explicit writes, peer resets) have to be
+  propagated to the other peers by updating the corresponding State
+  Table entry and issuing an interrupt to all other peers if they
+  enabled reception.
+
+- Interrupt events triggered by a peer have to be delivered to the
+  target peer, provided the receiving side is valid and has enabled
+  the reception.
+
+- All peers must have the same interrupt delivery features available,
+  i.e. MSI-X with the same maximum number of vectors on platforms
+  supporting this mechanism, otherwise INTx with one vector.
+
+
+Guest-side Programming Model
+----------------------------
+
+An IVSHMEM device appears as a PCI device to its users. Unless
+otherwise noted, it conforms to the PCI Local Bus Specification,
+Revision 3.0. As such, it is discoverable via the PCI configuration
+space and provides a number of standard and custom PCI configuration
+registers.
+
+### Shared Memory Region Layout
+
+The shared memory region is divided into several sections.
+
+    +-----------------------------+   -
+    |                             |   :
+    | Output Section for peer n-1 |   : Output Section Size
+    |     (n = Maximum Peers)     |   :
+    +-----------------------------+   -
+    :                             :
+    :                             :
+    :                             :
+    +-----------------------------+   -
+    |                             |   :
+    |  Output Section for peer 1  |   : Output Section Size
+    |                             |   :
+    +-----------------------------+   -
+    |                             |   :
+    |  Output Section for peer 0  |   : Output Section Size
+    |                             |   :
+    +-----------------------------+   -
+    |                             |   :
+    |     Read/Write Section      |   : R/W Section Size
+    |                             |   :
+    +-----------------------------+   -
+    |                             |   :
+    |         State Table         |   : State Table Size
[RFC][PATCH v2 3/3] contrib: Add server for ivshmem revision 2
From: Jan Kiszka

This implements the server process for ivshmem v2 device models of
QEMU. Again, no effort has been spent yet on sharing code with the v1
server. Parts have been copied, others were rewritten.

In addition to parameters of v1, this server now also specifies

 - the maximum number of peers to be connected (required to know in
   advance because of v2's state table)
 - the size of the output sections (can be 0)
 - the protocol ID to be published to all peers

When a virtio protocol ID is chosen, only 2 peers can be connected.
Furthermore, the server will signal the backend variant of the ID to
the master instance and the frontend ID to the slave peer.

To start, e.g., a server that allows virtio console over ivshmem, call

    ivshmem2-server -F -l 64K -n 2 -V 3 -P 0x8003

TODO: specify the new server protocol.

Signed-off-by: Jan Kiszka
---
 Makefile                                  |   3 +
 Makefile.objs                             |   1 +
 configure                                 |   1 +
 contrib/ivshmem2-server/Makefile.objs     |   1 +
 contrib/ivshmem2-server/ivshmem2-server.c | 462 ++
 contrib/ivshmem2-server/ivshmem2-server.h | 158 ++
 contrib/ivshmem2-server/main.c            | 313
 7 files changed, 939 insertions(+)
 create mode 100644 contrib/ivshmem2-server/Makefile.objs
 create mode 100644 contrib/ivshmem2-server/ivshmem2-server.c
 create mode 100644 contrib/ivshmem2-server/ivshmem2-server.h
 create mode 100644 contrib/ivshmem2-server/main.c

diff --git a/Makefile b/Makefile
index 6b5ad1121b..33bb0eefdb 100644
--- a/Makefile
+++ b/Makefile
@@ -427,6 +427,7 @@ dummy := $(call unnest-vars,, \
                 elf2dmp-obj-y \
                 ivshmem-client-obj-y \
                 ivshmem-server-obj-y \
+                ivshmem2-server-obj-y \
                 rdmacm-mux-obj-y \
                 libvhost-user-obj-y \
                 vhost-user-scsi-obj-y \
@@ -655,6 +656,8 @@ ivshmem-client$(EXESUF): $(ivshmem-client-obj-y) $(COMMON_LDADDS)
 	$(call LINK, $^)
 ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) $(COMMON_LDADDS)
 	$(call LINK, $^)
+ivshmem2-server$(EXESUF): $(ivshmem2-server-obj-y) $(COMMON_LDADDS)
+	$(call LINK, $^)
 endif
 vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
 	$(call LINK, $^)
diff --git a/Makefile.objs b/Makefile.objs
index 02bf5ce11d..ce243975ef 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -115,6 +115,7 @@ qga-vss-dll-obj-y = qga/
 elf2dmp-obj-y = contrib/elf2dmp/
 ivshmem-client-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem-client/
 ivshmem-server-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem-server/
+ivshmem2-server-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem2-server/
 libvhost-user-obj-y = contrib/libvhost-user/
 vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
 vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
diff --git a/configure b/configure
index 747d3b4120..1cb1427f1b 100755
--- a/configure
+++ b/configure
@@ -6165,6 +6165,7 @@ if test "$want_tools" = "yes" ; then
 fi
 if [ "$ivshmem" = "yes" ]; then
   tools="ivshmem-client\$(EXESUF) ivshmem-server\$(EXESUF) $tools"
+  tools="ivshmem2-server\$(EXESUF) $tools"
 fi
 if [ "$curl" = "yes" ]; then
   tools="elf2dmp\$(EXESUF) $tools"
diff --git a/contrib/ivshmem2-server/Makefile.objs b/contrib/ivshmem2-server/Makefile.objs
new file mode 100644
index 00..d233e18ec8
--- /dev/null
+++ b/contrib/ivshmem2-server/Makefile.objs
@@ -0,0 +1 @@
+ivshmem2-server-obj-y = ivshmem2-server.o main.o
diff --git a/contrib/ivshmem2-server/ivshmem2-server.c b/contrib/ivshmem2-server/ivshmem2-server.c
new file mode 100644
index 00..b341f1fcd0
--- /dev/null
+++ b/contrib/ivshmem2-server/ivshmem2-server.c
@@ -0,0 +1,462 @@
+/*
+ * Copyright 6WIND S.A., 2014
+ * Copyright (c) Siemens AG, 2019
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.  See the COPYING file in the
+ * top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "qemu/host-utils.h"
+#include "qemu/sockets.h"
+#include "qemu/atomic.h"
+
+#include
+#include
+
+#include "ivshmem2-server.h"
+
+/* log a message on stdout if verbose=1 */
+#define IVSHMEM_SERVER_DEBUG(server, fmt, ...) \
+    do { \
+        if ((server)->args.verbose) { \
+            printf(fmt, ## __VA_ARGS__); \
+        } \
+    } while (0)
+
+/** maximum size of a huge page, used by ivshmem_server_ftruncate() */
+#define IVSHMEM_SERVER_MAX_HUGEPAGE_SIZE (1024 * 1024 * 1024)
+
+/** default listen backlog (number of sockets not accepted) */
+#define IVSHMEM_SERVER_LISTEN_BACKLOG 10
+
+/* send message to a client unix socket */
+static int ivshmem_server_send_msg(int sock_fd, void *payload, int len, int fd)
+{
+    in
[RFC][PATCH v2 0/3] IVSHMEM version 2 device for QEMU
Overdue update of the ivshmem 2.0 device model as presented at [1].

Changes in v2:
 - changed PCI device ID to Siemens-granted one, adjusted PCI device
   revision to 0
 - removed unused feature register from device
 - addressed feedback on specification document
 - rebased over master

This version is now fully in sync with the implementation for Jailhouse
that is currently under review [2][3]; UIO and virtio-ivshmem drivers
are shared. Jailhouse will very likely pick up this revision of the
device in order to move forward with stressing it.

More details on the usage with QEMU were in the original cover letter
(with adjustments to the new device ID):

If you want to play with this, the basic setup of the shared memory
device is described in patches 1 and 3. UIO driver and also the
virtio-ivshmem prototype can be found at

    http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2

Accessing the device via UIO is trivial enough. If you want to use it
for virtio, this is needed on the virtio console backend side in
addition to the description in patch 3:

    modprobe uio_ivshmem
    echo "110a 4106 1af4 1100 ffc003 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id
    linux/tools/virtio/virtio-ivshmem-console /dev/uio0

And for virtio block:

    echo "110a 4106 1af4 1100 ffc002 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id
    linux/tools/virtio/virtio-ivshmem-console /dev/uio0 /path/to/disk.img

After that, you can start the QEMU frontend instance with the
virtio-ivshmem driver installed which can use the new /dev/hvc* or
/dev/vda* as usual.

Any feedback welcome!

Jan

PS: Let me know if I missed someone potentially interested in this
topic on CC - or if you would like to be dropped from the list.
[1] https://kvmforum2019.sched.com/event/TmxI [2] https://groups.google.com/forum/#!topic/jailhouse-dev/ffnCcRh8LOs [3] https://groups.google.com/forum/#!topic/jailhouse-dev/HX-0AGF1cjg Jan Kiszka (3): hw/misc: Add implementation of ivshmem revision 2 device docs/specs: Add specification of ivshmem device revision 2 contrib: Add server for ivshmem revision 2 Makefile |3 + Makefile.objs |1 + configure |1 + contrib/ivshmem2-server/Makefile.objs |1 + contrib/ivshmem2-server/ivshmem2-server.c | 462 contrib/ivshmem2-server/ivshmem2-server.h | 158 + contrib/ivshmem2-server/main.c| 313 + docs/specs/ivshmem-2-device-spec.md | 376 ++ hw/misc/Makefile.objs |2 +- hw/misc/ivshmem2.c| 1085 + include/hw/misc/ivshmem2.h| 48 ++ include/hw/pci/pci_ids.h |2 + 12 files changed, 2451 insertions(+), 1 deletion(-) create mode 100644 contrib/ivshmem2-server/Makefile.objs create mode 100644 contrib/ivshmem2-server/ivshmem2-server.c create mode 100644 contrib/ivshmem2-server/ivshmem2-server.h create mode 100644 contrib/ivshmem2-server/main.c create mode 100644 docs/specs/ivshmem-2-device-spec.md create mode 100644 hw/misc/ivshmem2.c create mode 100644 include/hw/misc/ivshmem2.h -- 2.16.4
Re: [RFC][PATCH 2/3] docs/specs: Add specification of ivshmem device revision 2
On 05.12.19 12:14, Markus Armbruster wrote: > This has been on the list for more than three weeks already. I > apologize for the delay. No problem! Your feedback is highly appreciated. > > Jan Kiszka writes: > >> From: Jan Kiszka >> >> This imports the ivshmem v2 specification draft from Jailhouse. Its >> final home is to be decided, this shall just simplify the review process >> at this stage. >> >> Note that specifically the Features register (offset 08h) is still under >> consideration. In particular, its bit 0 seems useless now as its benefit >> to guests, specifically when they want to be portable, is close to zero. >> Maybe the register should still be kept, with all bits RsvdZ, for easier >> extensibility. >> >> The rest appears now rather mature and reasonably implementable, as >> proven also by a version for Jailhouse. >> >> Signed-off-by: Jan Kiszka >> --- >> docs/specs/ivshmem-2-device-spec.md | 333 >> >> 1 file changed, 333 insertions(+) >> create mode 100644 docs/specs/ivshmem-2-device-spec.md >> >> diff --git a/docs/specs/ivshmem-2-device-spec.md >> b/docs/specs/ivshmem-2-device-spec.md >> new file mode 100644 >> index 00..98cfde585a >> --- /dev/null >> +++ b/docs/specs/ivshmem-2-device-spec.md >> @@ -0,0 +1,333 @@ >> +IVSHMEM Device Specification >> + >> + >> +** NOTE: THIS IS WORK-IN-PROGRESS, NOT YET A STABLE INTERFACE >> SPECIFICATION! 
** >> + >> +The Inter-VM Shared Memory device provides the following features to its >> users: >> + >> +- Interconnection between up to 65536 peers >> + >> +- Multi-purpose shared memory region >> + >> +- common read/writable section >> + >> +- unidirectional sections that are read/writable for one peer and only >> + readable for the others >> + >> +- section with peer states >> + >> +- Event signaling via interrupt to remote sides >> + >> +- Support for life-cycle management via state value exchange and interrupt >> + notification on changes, backed by a shared memory section >> + >> +- Free choice of protocol to be used on top >> + >> +- Protocol type declaration >> + >> +- Unprivileged access to memory-mapped or I/O registers feasible >> + >> +- Discoverable and configurable via standard PCI mechanisms > > Stating requirements is much appreciated. Design rationale would be > even better :) The idea is to avoid having to model also a platform device with platform discovery mechanism. Based on our experiences with providing a generic virtual PCI host controller, such a complication of specification and implementation is simply not needed. > > As pointed out many times, shared memory is not a solution to any > communication problem, it's merely a building block for building such > solutions: you invariably have to layer some protocol on top. In your > KVM Forum talk, you mention layering virtio on top. Makes sense to me. > But why does *this* virtio transport have to be an independent device? > Other transports aren't. This device is not only a virtio transport. As you correctly point out, it is a foundation for communication, not the full solution by itself. Its goal is to define the minimal piece that a hypervisor has to provide to its guests in order to enable them building full communication solutions of their preferred flavor on top. 
It does not want to deal with those details, may they be virtio, may they be something much simpler, much older or much more sophisticated (or complicated). All that shall not be the business of a hypervisor. > > Now let me indulge in spec nitpicking :) > >> + >> + >> +Hypervisor Model >> + >> + >> +In order to provide a consistent link between peers, all connected >> instances of >> +IVSHMEM devices need to be configured, created and run by the hypervisor >> +according to the following requirements: >> + >> +- The instances of the device need to be accessible via PCI programming >> + interfaces on all sides. > > What does that mean? "From a guest point of view, this shall be a PCI device." > >> + >> +- The read/write shared memory section has to be of the same size for all >> + peers and, if non-zero, has to reflect the same memory content for them. > > Isn't "same memory content" redundant with "shared memory"? Probably. > >> + >> +- If output sections are present (non-zero section size), there must be one >>
Re: [RFC][PATCH 0/3] IVSHMEM version 2 device for QEMU
On 03.12.19 06:53, Liang Yan wrote: > > On 12/2/19 1:16 AM, Jan Kiszka wrote: >> On 27.11.19 18:19, Jan Kiszka wrote: >>> Hi Liang, >>> >>> On 27.11.19 16:28, Liang Yan wrote: >>>> >>>> >>>> On 11/11/19 7:57 AM, Jan Kiszka wrote: >>>>> To get the ball rolling after my presentation of the topic at KVM Forum >>>>> [1] and many fruitful discussions around it, this is a first concrete >>>>> code series. As discussed, I'm starting with the IVSHMEM implementation >>>>> of a QEMU device and server. It's RFC because, besides specification >>>>> and >>>>> implementation details, there will still be some decisions needed about >>>>> how to integrate the new version best into the existing code bases. >>>>> >>>>> If you want to play with this, the basic setup of the shared memory >>>>> device is described in patch 1 and 3. UIO driver and also the >>>>> virtio-ivshmem prototype can be found at >>>>> >>>>> >>>>> http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2 >>>>> >>>>> >>>>> Accessing the device via UIO is trivial enough. If you want to use it >>>>> for virtio, this is additionally to the description in patch 3 >>>>> needed on >>>>> the virtio console backend side: >>>>> >>>>> modprobe uio_ivshmem >>>>> echo "1af4 1110 1af4 1100 ffc003 ff" > >>>>> /sys/bus/pci/drivers/uio_ivshmem/new_id >>>>> linux/tools/virtio/virtio-ivshmem-console /dev/uio0 >>>>> >>>>> And for virtio block: >>>>> >>>>> echo "1af4 1110 1af4 1100 ffc002 ff" > >>>>> /sys/bus/pci/drivers/uio_ivshmem/new_id >>>>> linux/tools/virtio/virtio-ivshmem-console /dev/uio0 >>>>> /path/to/disk.img >>>>> >>>>> After that, you can start the QEMU frontend instance with the >>>>> virtio-ivshmem driver installed which can use the new /dev/hvc* or >>>>> /dev/vda* as usual. >>>>> >>>>> Any feedback welcome! >>>> >>>> Hi, Jan, >>>> >>>> I have been playing your code for last few weeks, mostly study and test, >>>> of course. Really nice work. 
I have a few questions here: >>>> >>>> First, qemu part looks good, I tried test between couple VMs, and device >>>> could pop up correctly for all of them, but I had some problems when >>>> trying load driver. For example, if set up two VMs, vm1 and vm2, start >>>> ivshmem server as you suggested. vm1 could load uio_ivshmem and >>>> virtio_ivshmem correctly, vm2 could load uio_ivshmem but could not show >>>> up "/dev/uio0", virtio_ivshmem could not be loaded at all, these still >>>> exist even I switch the load sequence of vm1 and vm2, and sometimes >>>> reset "virtio_ivshmem" could crash both vm1 and vm2. Not quite sure this >>>> is bug or "Ivshmem Mode" issue, but I went through ivshmem-server code, >>>> did not related information. >>> >>> If we are only talking about one ivshmem link and vm1 is the master, >>> there is not role for virtio_ivshmem on it as backend. That is purely >>> a frontend driver. Vice versa for vm2: If you want to use its ivshmem >>> instance as virtio frontend, uio_ivshmem plays no role. >>> > Hi, Jan, > > Sorry for the late response. Just came back from Thanksgiving holiday. > > I have a few questions here. > First, how to decide master/slave node? I used two VMs here, they did > not show same behavior even if I change the boot sequence. The current mechanism works by selecting the VM that gets ID 0 as the backend, thus sending it also a different protocol ID than the frontend gets. Could possibly be improved by allowing selection also on the VM side (QEMU command line parameter etc.). Inherently, this only affects virtio over ivshmem. Other, symmetric protocols do not need this differentiation. > > Second, in order to run virtio-ivshmem-console demo, VM1 connect to VM2 > Console. So, need to install virtio frontend driver in VM2, then install > uio frontend driver in VM1 to get "/dev/uio0", then run demo, right? > Could you share your procedure? > > Also, I could not get "/dev/uio0" all the time. OK, should have collected this earlier.
This is h
Re: [RFC][PATCH 0/3] IVSHMEM version 2 device for QEMU
On 27.11.19 18:19, Jan Kiszka wrote: Hi Liang, On 27.11.19 16:28, Liang Yan wrote: On 11/11/19 7:57 AM, Jan Kiszka wrote: To get the ball rolling after my presentation of the topic at KVM Forum [1] and many fruitful discussions around it, this is a first concrete code series. As discussed, I'm starting with the IVSHMEM implementation of a QEMU device and server. It's RFC because, besides specification and implementation details, there will still be some decisions needed about how to integrate the new version best into the existing code bases. If you want to play with this, the basic setup of the shared memory device is described in patch 1 and 3. UIO driver and also the virtio-ivshmem prototype can be found at http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2 Accessing the device via UIO is trivial enough. If you want to use it for virtio, this is additionally to the description in patch 3 needed on the virtio console backend side: modprobe uio_ivshmem echo "1af4 1110 1af4 1100 ffc003 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id linux/tools/virtio/virtio-ivshmem-console /dev/uio0 And for virtio block: echo "1af4 1110 1af4 1100 ffc002 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id linux/tools/virtio/virtio-ivshmem-console /dev/uio0 /path/to/disk.img After that, you can start the QEMU frontend instance with the virtio-ivshmem driver installed which can use the new /dev/hvc* or /dev/vda* as usual. Any feedback welcome! Hi, Jan, I have been playing your code for last few weeks, mostly study and test, of course. Really nice work. I have a few questions here: First, qemu part looks good, I tried test between couple VMs, and device could pop up correctly for all of them, but I had some problems when trying load driver. For example, if set up two VMs, vm1 and vm2, start ivshmem server as you suggested. 
vm1 could load uio_ivshmem and virtio_ivshmem correctly, vm2 could load uio_ivshmem but could not show up "/dev/uio0", virtio_ivshmem could not be loaded at all, these still exist even if I switch the load sequence of vm1 and vm2, and sometimes resetting "virtio_ivshmem" could crash both vm1 and vm2. Not quite sure whether this is a bug or an "Ivshmem Mode" issue, but I went through the ivshmem-server code and did not find related information. If we are only talking about one ivshmem link and vm1 is the master, there is no role for virtio_ivshmem on it as backend. That is purely a frontend driver. Vice versa for vm2: If you want to use its ivshmem instance as virtio frontend, uio_ivshmem plays no role. The "crash" would be interesting to understand: Do you see kernel panics of the guests? Or are they stuck? Or are the QEMU instances stuck? Do you know that you can debug the guest kernels via gdb (and gdb-scripts of the kernel)? I started some code work recently, such as fixing code style issues and some work based on the above testing, however I know you are also working on RFC V2, and the protocol between server-client and client-client is not finalized yet either, things may change, so I would much appreciate if you could squeeze me into your develop schedule and share with me some plans, :-) Maybe I could send some pull request in your github repo? I'm currently working on a refresh of the Jailhouse queue and the kernel patches to incorporate just two smaller changes: - use Siemens device ID - drop "features" register from ivshmem device I have not yet touched the QEMU code for that so far, thus no conflict yet. I would wait for your patches then. If it helps us to work on this together, I can push things to github as well. Will drop you a note when done. We should just present the outcome frequently as new series to the list. I've updated my queues, mostly small changes, mostly to the kernel bits.
Besides the already announced places, you can also find them as PR targets on

https://github.com/siemens/qemu/commits/wip/ivshmem2
https://github.com/siemens/linux/commits/queues/ivshmem2

To give the whole thing broader coverage, I will now also move forward and integrate the current state into Jailhouse - at the risk of having to rework the interface there once again. But there are a number of users already requiring the extended features (or even using them), plus this gives nice test coverage of key components and properties.

Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [RFC][PATCH 0/3] IVSHMEM version 2 device for QEMU
Hi Liang,

On 27.11.19 16:28, Liang Yan wrote:

On 11/11/19 7:57 AM, Jan Kiszka wrote:

To get the ball rolling after my presentation of the topic at KVM Forum [1] and many fruitful discussions around it, this is a first concrete code series. As discussed, I'm starting with the IVSHMEM implementation of a QEMU device and server. It's RFC because, besides specification and implementation details, there will still be some decisions needed about how to integrate the new version best into the existing code bases.

If you want to play with this, the basic setup of the shared memory device is described in patches 1 and 3. The UIO driver and also the virtio-ivshmem prototype can be found at

http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2

Accessing the device via UIO is trivial enough. If you want to use it for virtio, this is needed on the virtio console backend side, in addition to the description in patch 3:

modprobe uio_ivshmem
echo "1af4 1110 1af4 1100 ffc003 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id
linux/tools/virtio/virtio-ivshmem-console /dev/uio0

And for virtio block:

echo "1af4 1110 1af4 1100 ffc002 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id
linux/tools/virtio/virtio-ivshmem-console /dev/uio0 /path/to/disk.img

After that, you can start the QEMU frontend instance with the virtio-ivshmem driver installed, which can use the new /dev/hvc* or /dev/vda* as usual.

Any feedback welcome!

Hi, Jan,

I have been playing with your code for the last few weeks, mostly studying and testing, of course. Really nice work. I have a few questions here:

First, the qemu part looks good. I tried tests between a couple of VMs, and the device popped up correctly for all of them, but I had some problems when trying to load the driver. For example, I set up two VMs, vm1 and vm2, and started the ivshmem server as you suggested.
vm1 could load uio_ivshmem and virtio_ivshmem correctly; vm2 could load uio_ivshmem, but "/dev/uio0" did not show up, and virtio_ivshmem could not be loaded at all. These issues persisted even when I switched the load order of vm1 and vm2, and sometimes resetting "virtio_ivshmem" crashed both vm1 and vm2. I am not quite sure whether this is a bug or an "Ivshmem Mode" issue; I went through the ivshmem-server code but did not find related information.

If we are only talking about one ivshmem link and vm1 is the master, there is no role for virtio_ivshmem on it as backend. That is purely a frontend driver. Vice versa for vm2: If you want to use its ivshmem instance as virtio frontend, uio_ivshmem plays no role.

The "crash" would be interesting to understand: Do you see kernel panics of the guests? Or are they stuck? Or are the QEMU instances stuck? Do you know that you can debug the guest kernels via gdb (and the gdb scripts of the kernel)?

I started some code work recently, such as fixing code style issues and some work based on the above testing. However, I know you are also working on RFC v2, and besides, the protocol between server-client and client-client is not finalized yet either, so things may change. I would much appreciate it if you could squeeze me into your development schedule and share some plans with me :-) Maybe I could send some pull requests to your github repo?

I'm currently working on a refresh of the Jailhouse queue and the kernel patches to incorporate just two smaller changes:

- use Siemens device ID
- drop "features" register from ivshmem device

I have not yet touched the QEMU code for that, thus no conflict yet. I would wait for your patches then. If it helps us to work on this together, I can push things to github as well. Will drop you a note when done. We should just present the outcome frequently as new series to the list.

I personally like this project a lot; there is a lot of potential and there are many use cases for it, especially devices like ivshmem-net/ivshmem-block.
Anyway, thanks for adding me to the list, and looking forward to your reply. Thanks for the positive feedback. I'm looking forward to working on this together! Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [RFC][PATCH 2/3] docs/specs: Add specification of ivshmem device revision 2
On 12.11.19 09:04, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 05:38:29PM +0100, Jan Kiszka wrote: On 11.11.19 17:11, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 03:27:43PM +, Daniel P. Berrangé wrote: On Mon, Nov 11, 2019 at 10:08:20AM -0500, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 02:59:07PM +0100, Jan Kiszka wrote: On 11.11.19 14:45, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 01:57:11PM +0100, Jan Kiszka wrote:

+| Offset | Register  | Content |
+|-------:|:----------|:--------|
+|    00h | Vendor ID | 1AF4h   |
+|    02h | Device ID | 1110h   |

Given it's a virtio vendor ID, please reserve a device ID with the virtio TC. Yeah, QEMU's IVSHMEM was always using that. I'm happy to make this finally official. And I guess we will just mark it reserved or something, right? Since at least IVSHMEM 1 isn't a virtio device. And will you be reusing the same ID for IVSHMEM 2 or a new one? 1110h isn't under either of the virtio PCI device ID allowed ranges according to the spec: "Any PCI device with PCI Vendor ID 0x1AF4, and PCI Device ID 0x1000 through 0x107F inclusive is a virtio device. ... Additionally, devices MAY utilize a Transitional PCI Device ID range, 0x1000 to 0x103F depending on the device type." So there's no need to reserve 0x1110h from the virtio spec POV. Well we do have: B.3 What Device Number? Device numbers can be reserved by the OASIS committee: email virtio-...@lists.oasis-open.org to secure a unique one. Meanwhile for experimental drivers, use 65535 and work backwards. So it seems it can in theory conflict at least with experimental virtio devices. Really it's messy that people are reusing the virtio vendor ID for random stuff - getting a vendor ID is only hard for a hobbyist, any big company already has an ID - but if it is a hobbyist and they at least register then it doesn't cause much harm. Note that ivshmem came from a research environment. I do not know if there was a check for the IDs at the point the code was merged.
That said, I may get a device ID here as well, provided I can explain that not a single "product" will own it, but rather an open specification. Jan OK, up to you - if you decide you want an ID reserved, pls let us know. Turned out to be much simpler than expected: I reserved device ID 4106h under the vendor ID 110Ah (Siemens AG) for the purpose of specifying a shared memory device via the VIRTIO TC. Will update this "detail" in the next revision of the patches, also resetting the device revision ID to 0, as we no longer need to tell it apart from the current implementation this way. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [RFC][PATCH 2/3] docs/specs: Add specification of ivshmem device revision 2
On 11.11.19 17:11, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 03:27:43PM +, Daniel P. Berrangé wrote: On Mon, Nov 11, 2019 at 10:08:20AM -0500, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 02:59:07PM +0100, Jan Kiszka wrote: On 11.11.19 14:45, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 01:57:11PM +0100, Jan Kiszka wrote:

+| Offset | Register  | Content |
+|-------:|:----------|:--------|
+|    00h | Vendor ID | 1AF4h   |
+|    02h | Device ID | 1110h   |

Given it's a virtio vendor ID, please reserve a device ID with the virtio TC. Yeah, QEMU's IVSHMEM was always using that. I'm happy to make this finally official. And I guess we will just mark it reserved or something, right? Since at least IVSHMEM 1 isn't a virtio device. And will you be reusing the same ID for IVSHMEM 2 or a new one? 1110h isn't under either of the virtio PCI device ID allowed ranges according to the spec: "Any PCI device with PCI Vendor ID 0x1AF4, and PCI Device ID 0x1000 through 0x107F inclusive is a virtio device. ... Additionally, devices MAY utilize a Transitional PCI Device ID range, 0x1000 to 0x103F depending on the device type." So there's no need to reserve 0x1110h from the virtio spec POV. Well we do have: B.3 What Device Number? Device numbers can be reserved by the OASIS committee: email virtio-...@lists.oasis-open.org to secure a unique one. Meanwhile for experimental drivers, use 65535 and work backwards. So it seems it can in theory conflict at least with experimental virtio devices. Really it's messy that people are reusing the virtio vendor ID for random stuff - getting a vendor ID is only hard for a hobbyist, any big company already has an ID - but if it is a hobbyist and they at least register then it doesn't cause much harm. Note that ivshmem came from a research environment. I do not know if there was a check for the IDs at the point the code was merged. That said, I may get a device ID here as well, provided I can explain that not a single "product" will own it, but rather an open specification.
Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [RFC][PATCH 2/3] docs/specs: Add specification of ivshmem device revision 2
On 11.11.19 17:14, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 04:42:52PM +0100, Jan Kiszka wrote: On 11.11.19 16:27, Daniel P. Berrangé wrote: On Mon, Nov 11, 2019 at 10:08:20AM -0500, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 02:59:07PM +0100, Jan Kiszka wrote: On 11.11.19 14:45, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 01:57:11PM +0100, Jan Kiszka wrote:

+| Offset | Register  | Content |
+|-------:|:----------|:--------|
+|    00h | Vendor ID | 1AF4h   |
+|    02h | Device ID | 1110h   |

Given it's a virtio vendor ID, please reserve a device ID with the virtio TC. Yeah, QEMU's IVSHMEM was always using that. I'm happy to make this finally official. And I guess we will just mark it reserved or something, right? Since at least IVSHMEM 1 isn't a virtio device. And will you be reusing the same ID for IVSHMEM 2 or a new one? 1110h isn't under either of the virtio PCI device ID allowed ranges according to the spec: "Any PCI device with PCI Vendor ID 0x1AF4, and PCI Device ID 0x1000 through 0x107F inclusive is a virtio device. ... Additionally, devices MAY utilize a Transitional PCI Device ID range, 0x1000 to 0x103F depending on the device type." So there's no need to reserve 0x1110h from the virtio spec POV. Indeed. I have, however, ensured it is assigned to ivshmem from the POV of Red Hat's own internal tracking of allocated device IDs, under its vendor ID. If ivshmem 2 is now a virtio device, then it is a good thing that it will get a new/different PCI device ID, to show that it is not compatible with the old device impl. At this stage, it is just a PCI device that may be used in combination with virtio (stacked on top), but it is not designed like a normal virtio (PCI) device. That's because it lacks many properties of regular virtio devices, like queues. So, if such a device could become part of the virtio spec, it would be separate from the rest, and having an ID from the regular range would likely not be helpful in this regard.
Jan I agree it needs a separate ID not from the regular range. It's a distinct transport. Maybe even a distinct vendor ID - we could easily get another one if needed. That might be useful because I've seen the kernel's virtio-pci driver grabbing ivshmem devices from time to time. OTOH, that could likely also be improved in Linux, at least for future versions. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [RFC][PATCH 2/3] docs/specs: Add specification of ivshmem device revision 2
On 11.11.19 16:27, Daniel P. Berrangé wrote: On Mon, Nov 11, 2019 at 10:08:20AM -0500, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 02:59:07PM +0100, Jan Kiszka wrote: On 11.11.19 14:45, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 01:57:11PM +0100, Jan Kiszka wrote:

+| Offset | Register  | Content |
+|-------:|:----------|:--------|
+|    00h | Vendor ID | 1AF4h   |
+|    02h | Device ID | 1110h   |

Given it's a virtio vendor ID, please reserve a device ID with the virtio TC. Yeah, QEMU's IVSHMEM was always using that. I'm happy to make this finally official. And I guess we will just mark it reserved or something, right? Since at least IVSHMEM 1 isn't a virtio device. And will you be reusing the same ID for IVSHMEM 2 or a new one? 1110h isn't under either of the virtio PCI device ID allowed ranges according to the spec: "Any PCI device with PCI Vendor ID 0x1AF4, and PCI Device ID 0x1000 through 0x107F inclusive is a virtio device. ... Additionally, devices MAY utilize a Transitional PCI Device ID range, 0x1000 to 0x103F depending on the device type." So there's no need to reserve 0x1110h from the virtio spec POV. Indeed. I have, however, ensured it is assigned to ivshmem from the POV of Red Hat's own internal tracking of allocated device IDs, under its vendor ID. If ivshmem 2 is now a virtio device, then it is a good thing that it will get a new/different PCI device ID, to show that it is not compatible with the old device impl. At this stage, it is just a PCI device that may be used in combination with virtio (stacked on top), but it is not designed like a normal virtio (PCI) device. That's because it lacks many properties of regular virtio devices, like queues. So, if such a device could become part of the virtio spec, it would be separate from the rest, and having an ID from the regular range would likely not be helpful in this regard. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [RFC][PATCH 2/3] docs/specs: Add specification of ivshmem device revision 2
On 11.11.19 14:45, Michael S. Tsirkin wrote: On Mon, Nov 11, 2019 at 01:57:11PM +0100, Jan Kiszka wrote: +| Offset | Register | Content | +|---:|:---|:-| +|00h | Vendor ID | 1AF4h | +|02h | Device ID | 1110h | Given it's a virtio vendor ID, please reserve a device ID with the virtio TC. Yeah, QEMU's IVSHMEM was always using that. I'm happy to make this finally official. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
[RFC][PATCH 1/3] hw/misc: Add implementation of ivshmem revision 2 device
From: Jan Kiszka

This adds a reimplementation of ivshmem in its new revision 2 as a separate device. The goal of this is not to enable sharing with v1, but rather to allow exploring the properties and potential limitations of the new version prior to discussing its integration with the existing code.

v2 always requires a server to interconnect two or more QEMU instances because it provides signaling between peers unconditionally. Therefore, only the interconnecting chardev, master mode, and the usage of ioeventfd can be configured at device level. All other parameters are defined by the server instance.

A new server protocol is introduced along with this. Its primary difference is the introduction of a single welcome message that contains all peer parameters, rather than a series of single-word messages pushing them piece by piece.

A complicating difference in interrupt handling, compared to v1, is the auto-disable mode of v2: When this is active, interrupt delivery is disabled by the device after each interrupt event. This prevents the usage of irqfd on the receiving side, but it lowers the handling cost for guests that implement interrupt throttling this way (specifically when exposing the device via UIO).

No changes have been made to the ivshmem device regarding migration: Only the master can live-migrate; slave peers have to hot-unplug the device first.

The details of the device model will be specified in a succeeding commit.

Drivers for this device can currently be found under

http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2

To instantiate an ivshmem v2 device, just add

... -chardev socket,path=/tmp/ivshmem_socket,id=ivshmem \
    -device ivshmem,chardev=ivshmem

provided the server opened its socket under the default path.
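The auto-disable behavior is easiest to see as a small model. The following sketch is illustrative only — the struct and helper names are invented, not taken from the QEMU model — but it captures the semantics described above: each delivered event disarms interrupt delivery until the guest re-enables it.

```c
#include <stdbool.h>

/* Toy model of the v2 one-shot interrupt semantics: after each delivered
 * event the device clears its interrupt-enable bit, and the guest must
 * re-arm it explicitly (as a UIO driver would on unmask) before the next
 * event can be delivered. */
struct toy_ivshmem {
    bool int_enable;   /* interrupt-enable bit of the control register */
    bool one_shot;     /* auto-disable mode active */
    int  delivered;    /* events actually seen by the guest */
};

/* A peer signals an event; returns true if an interrupt was delivered. */
static bool toy_signal(struct toy_ivshmem *d)
{
    if (!d->int_enable)
        return false;
    d->delivered++;
    if (d->one_shot)
        d->int_enable = false; /* device disarms itself */
    return true;
}

/* Guest re-arms interrupt delivery from its interrupt handler. */
static void toy_rearm(struct toy_ivshmem *d)
{
    d->int_enable = true;
}

/* Demo: two back-to-back signals deliver only one interrupt; a third
 * signal after re-arming delivers the second. Returns delivered count. */
static int toy_demo(void)
{
    struct toy_ivshmem d = { true, true, 0 };
    toy_signal(&d);
    toy_signal(&d);  /* dropped: device disarmed itself */
    toy_rearm(&d);
    toy_signal(&d);
    return d.delivered;
}
```

This also shows why irqfd cannot be used on the receiving side: the device must interpose on every delivery to clear the enable bit.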
Signed-off-by: Jan Kiszka --- hw/misc/Makefile.objs |2 +- hw/misc/ivshmem2.c | 1091 include/hw/misc/ivshmem2.h | 48 ++ 3 files changed, 1140 insertions(+), 1 deletion(-) create mode 100644 hw/misc/ivshmem2.c create mode 100644 include/hw/misc/ivshmem2.h diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs index ba898a5781..90a4a6608c 100644 --- a/hw/misc/Makefile.objs +++ b/hw/misc/Makefile.objs @@ -26,7 +26,7 @@ common-obj-$(CONFIG_PUV3) += puv3_pm.o common-obj-$(CONFIG_MACIO) += macio/ -common-obj-$(CONFIG_IVSHMEM_DEVICE) += ivshmem.o +common-obj-$(CONFIG_IVSHMEM_DEVICE) += ivshmem.o ivshmem2.o common-obj-$(CONFIG_REALVIEW) += arm_sysctl.o common-obj-$(CONFIG_NSERIES) += cbus.o diff --git a/hw/misc/ivshmem2.c b/hw/misc/ivshmem2.c new file mode 100644 index 00..aeb5c40e0b --- /dev/null +++ b/hw/misc/ivshmem2.c @@ -0,0 +1,1091 @@ +/* + * Inter-VM Shared Memory PCI device, version 2. + * + * Copyright (c) Siemens AG, 2019 + * + * Authors: + * Jan Kiszka + * + * Based on ivshmem.c by Cam Macdonell + * + * This code is licensed under the GNU GPL v2. 
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu/cutils.h"
+#include "hw/hw.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "hw/qdev-properties.h"
+#include "sysemu/kvm.h"
+#include "migration/blocker.h"
+#include "migration/vmstate.h"
+#include "qemu/error-report.h"
+#include "qemu/event_notifier.h"
+#include "qemu/module.h"
+#include "qom/object_interfaces.h"
+#include "chardev/char-fe.h"
+#include "sysemu/qtest.h"
+#include "qapi/visitor.h"
+
+#include "hw/misc/ivshmem2.h"
+
+#define PCI_VENDOR_ID_IVSHMEM   PCI_VENDOR_ID_REDHAT_QUMRANET
+#define PCI_DEVICE_ID_IVSHMEM   0x1110
+
+#define IVSHMEM_MAX_PEERS       UINT16_MAX
+#define IVSHMEM_IOEVENTFD       0
+#define IVSHMEM_MSI             1
+
+#define IVSHMEM_REG_BAR_SIZE    0x1000
+
+#define IVSHMEM_REG_ID          0x00
+#define IVSHMEM_REG_MAX_PEERS   0x04
+#define IVSHMEM_REG_FEATURES    0x08
+#define IVSHMEM_REG_INT_CTRL    0x0c
+#define IVSHMEM_REG_DOORBELL    0x10
+#define IVSHMEM_REG_STATE       0x14
+
+#define IVSHMEM_INT_ENABLE      0x1
+
+#define IVSHMEM_ONESHOT_MODE    0x1
+
+#define IVSHMEM_DEBUG 0
+#define IVSHMEM_DPRINTF(fmt, ...)                    \
+    do {                                             \
+        if (IVSHMEM_DEBUG) {                         \
+            printf("IVSHMEM: " fmt, ## __VA_ARGS__); \
+        }                                            \
+    } while (0)
+
+#define TYPE_IVSHMEM "ivshmem"
+#define IVSHMEM(obj) \
+    OBJECT_CHECK(IVShmemState, (obj), TYPE_IVSHMEM)
+
+typedef struct Peer {
+    int nb_eventfds;
+    EventNotifier *eventfds;
+} Peer;
+
+typedef struct MSIVector {
+    PCIDevice *pdev;
+    int virq;
+    bool unmasked;
+} MSIVector;
+
+typedef struct IVShmemVndrCap {
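For illustration, a guest could drive the register block defined above roughly as follows. This is a hypothetical sketch: the helper names are invented, and the `(peer << 16) | vector` doorbell encoding is an assumption carried over from ivshmem v1 — the spec in patch 2 is authoritative. The self-check substitutes an ordinary memory buffer for the mmap()ed register BAR.

```c
#include <stdint.h>

/* Register offsets mirroring the model above (guest-side view). */
#define REG_INT_CTRL 0x0c
#define REG_DOORBELL 0x10
#define INT_ENABLE   0x1

/* A guest would obtain 'bar' by mmap()ing the register BAR, e.g. via
 * the uio_ivshmem driver; here it is just a pointer to 32-bit MMIO. */
static void ivshmem_write_reg(volatile uint32_t *bar, unsigned off,
                              uint32_t val)
{
    bar[off / sizeof(uint32_t)] = val;
}

/* Ring a peer's doorbell. The (peer << 16) | vector encoding is an
 * assumption based on ivshmem v1; check the v2 spec for the real layout. */
static void ivshmem_ring(volatile uint32_t *bar, uint16_t peer,
                         uint16_t vector)
{
    ivshmem_write_reg(bar, REG_DOORBELL, ((uint32_t)peer << 16) | vector);
}

/* Tiny self-check against a fake BAR backed by ordinary memory:
 * enable interrupts, ring peer 3 / vector 1, return the doorbell value. */
static uint32_t ivshmem_demo_doorbell(void)
{
    static uint32_t fake_bar[0x1000 / sizeof(uint32_t)];

    ivshmem_write_reg(fake_bar, REG_INT_CTRL, INT_ENABLE);
    ivshmem_ring(fake_bar, 3, 1);
    return fake_bar[REG_DOORBELL / sizeof(uint32_t)];
}
```

On real hardware the write to REG_DOORBELL traps into the device model (or an ioeventfd), which forwards the event to the target peer.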
[RFC][PATCH 0/3] IVSHMEM version 2 device for QEMU
To get the ball rolling after my presentation of the topic at KVM Forum [1] and many fruitful discussions around it, this is a first concrete code series. As discussed, I'm starting with the IVSHMEM implementation of a QEMU device and server. It's RFC because, besides specification and implementation details, there will still be some decisions needed about how to integrate the new version best into the existing code bases.

If you want to play with this, the basic setup of the shared memory device is described in patches 1 and 3. The UIO driver and also the virtio-ivshmem prototype can be found at

http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/ivshmem2

Accessing the device via UIO is trivial enough. If you want to use it for virtio, this is needed on the virtio console backend side, in addition to the description in patch 3:

modprobe uio_ivshmem
echo "1af4 1110 1af4 1100 ffc003 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id
linux/tools/virtio/virtio-ivshmem-console /dev/uio0

And for virtio block:

echo "1af4 1110 1af4 1100 ffc002 ff" > /sys/bus/pci/drivers/uio_ivshmem/new_id
linux/tools/virtio/virtio-ivshmem-console /dev/uio0 /path/to/disk.img

After that, you can start the QEMU frontend instance with the virtio-ivshmem driver installed, which can use the new /dev/hvc* or /dev/vda* as usual.

Any feedback welcome!

Jan

PS: Let me know if I missed someone potentially interested in this topic on CC - or if you would like to be dropped from the list.

PPS: The Jailhouse queues are currently out of sync wrt minor details of this one, primarily the device ID. Will update them when the general direction is clear.
[1] https://kvmforum2019.sched.com/event/TmxI

Jan Kiszka (3):
  hw/misc: Add implementation of ivshmem revision 2 device
  docs/specs: Add specification of ivshmem device revision 2
  contrib: Add server for ivshmem revision 2

 Makefile                                  |    3 +
 Makefile.objs                             |    1 +
 configure                                 |    1 +
 contrib/ivshmem2-server/Makefile.objs     |    1 +
 contrib/ivshmem2-server/ivshmem2-server.c |  462 ++++
 contrib/ivshmem2-server/ivshmem2-server.h |  158 +
 contrib/ivshmem2-server/main.c            |  313 +
 docs/specs/ivshmem-2-device-spec.md       |  333 +
 hw/misc/Makefile.objs                     |    2 +-
 hw/misc/ivshmem2.c                        | 1091 +
 include/hw/misc/ivshmem2.h                |   48 ++
 11 files changed, 2412 insertions(+), 1 deletion(-)
 create mode 100644 contrib/ivshmem2-server/Makefile.objs
 create mode 100644 contrib/ivshmem2-server/ivshmem2-server.c
 create mode 100644 contrib/ivshmem2-server/ivshmem2-server.h
 create mode 100644 contrib/ivshmem2-server/main.c
 create mode 100644 docs/specs/ivshmem-2-device-spec.md
 create mode 100644 hw/misc/ivshmem2.c
 create mode 100644 include/hw/misc/ivshmem2.h

-- 
2.16.4
[RFC][PATCH 2/3] docs/specs: Add specification of ivshmem device revision 2
From: Jan Kiszka

This imports the ivshmem v2 specification draft from Jailhouse. Its final home is to be decided; this shall just simplify the review process at this stage.

Note that specifically the Features register (offset 08h) is still under consideration. In particular, its bit 0 seems useless now, as its benefit to guests, specifically when they want to be portable, is close to zero. Maybe the register should still be kept, with all bits RsvdZ, for easier extensibility.

The rest appears now rather mature and reasonably implementable, as proven also by a version for Jailhouse.

Signed-off-by: Jan Kiszka
---
 docs/specs/ivshmem-2-device-spec.md | 333 ++++++++++++++++++++++++++++
 1 file changed, 333 insertions(+)
 create mode 100644 docs/specs/ivshmem-2-device-spec.md

diff --git a/docs/specs/ivshmem-2-device-spec.md b/docs/specs/ivshmem-2-device-spec.md
new file mode 100644
index 00..98cfde585a
--- /dev/null
+++ b/docs/specs/ivshmem-2-device-spec.md
@@ -0,0 +1,333 @@
+IVSHMEM Device Specification
+============================
+
+** NOTE: THIS IS WORK-IN-PROGRESS, NOT YET A STABLE INTERFACE SPECIFICATION!
**
+
+The Inter-VM Shared Memory device provides the following features to its users:
+
+- Interconnection between up to 65536 peers
+
+- Multi-purpose shared memory region
+
+  - common read/writable section
+
+  - unidirectional sections that are read/writable for one peer and only
+    readable for the others
+
+  - section with peer states
+
+- Event signaling via interrupt to remote sides
+
+- Support for life-cycle management via state value exchange and interrupt
+  notification on changes, backed by a shared memory section
+
+- Free choice of protocol to be used on top
+
+  - Protocol type declaration
+
+- Unprivileged access to memory-mapped or I/O registers feasible
+
+- Discoverable and configurable via standard PCI mechanisms
+
+
+Hypervisor Model
+----------------
+
+In order to provide a consistent link between peers, all connected instances of
+IVSHMEM devices need to be configured, created and run by the hypervisor
+according to the following requirements:
+
+- The instances of the device need to be accessible via PCI programming
+  interfaces on all sides.
+
+- The read/write shared memory section has to be of the same size for all
+  peers and, if non-zero, has to reflect the same memory content for them.
+
+- If output sections are present (non-zero section size), there must be one
+  reserved for each peer with exclusive write access. All output sections
+  must have the same size and must be readable for all peers. They have to
+  reflect the same memory content for all peers.
+
+- The State Table must have the same size for all peers, must be large enough
+  to hold the state values of all peers, and must be read-only for the user.
+
+- State register changes (explicit writes, peer resets) have to be propagated
+  to the other peers by updating the corresponding State Table entry and
+  issuing an interrupt to all other peers if they enabled reception.
+
+- Interrupt events triggered by a peer have to be delivered to the target
+  peer, provided the receiving side is valid and has enabled the reception.
+
+- All peers must have the same interrupt delivery features available, i.e.
+  MSI-X with the same maximum number of vectors on platforms supporting this
+  mechanism, otherwise INTx with one vector.
+
+
+Guest-side Programming Model
+----------------------------
+
+An IVSHMEM device appears as a PCI device to its users. Unless otherwise
+noted, it conforms to the PCI Local Bus Specification, Revision 3.0. As such,
+it is discoverable via the PCI configuration space and provides a number of
+standard and custom PCI configuration registers.
+
+### Shared Memory Region Layout
+
+The shared memory region is divided into several sections.
+
+    +-----------------------------+ -
+    |                             | :
+    | Output Section for peer n-1 | : Output Section Size
+    |     (n = Maximum Peers)     | :
+    +-----------------------------+ -
+    :                             :
+    :                             :
+    :                             :
+    +-----------------------------+ -
+    |                             | :
+    | Output Section for peer 1   | : Output Section Size
+    |                             | :
+    +-----------------------------+ -
+    |                             | :
+    | Output Section for peer 0   | : Output Section Size
+    |                             | :
+    +-----------------------------+ -
+    |                             | :
+    | Read/Write Section          | : R/W Section Size
+    |                             | :
+    +-----------------------------+ -
+    |                             | :
+    | State Table                 | : State Table Size
+    |                             | :
+    +-----------------------------+ <-- Shared memory base address
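Given the layout above, the offsets of the individual sections follow directly from the sizes the server communicates: the State Table sits at the base, followed by the read/write section, followed by one output section per peer. A sketch of the arithmetic (the helper names are invented; all sizes are assumed to come from the server's welcome message):

```c
#include <stdint.h>

/* Offset of the common read/write section: right after the State Table. */
static uint64_t ivshmem_rw_offset(uint64_t state_table_size)
{
    return state_table_size;
}

/* Offset of peer 'peer's output section: after the State Table, the R/W
 * section, and the output sections of all lower-numbered peers. */
static uint64_t ivshmem_output_offset(uint64_t state_table_size,
                                      uint64_t rw_size,
                                      uint64_t output_size,
                                      unsigned peer)
{
    return state_table_size + rw_size + (uint64_t)peer * output_size;
}

/* Total size of the shared memory region for 'max_peers' peers. */
static uint64_t ivshmem_total_size(uint64_t state_table_size,
                                   uint64_t rw_size,
                                   uint64_t output_size,
                                   unsigned max_peers)
{
    return state_table_size + rw_size + (uint64_t)output_size * max_peers;
}
```

For example, with a 4 KiB State Table, a 1 MiB R/W section, and 64 KiB output sections, peer 2's output section starts at 4096 + 1048576 + 2 * 65536 bytes from the base.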
[RFC][PATCH 3/3] contrib: Add server for ivshmem revision 2
From: Jan Kiszka This implements the server process for ivshmem v2 device models of QEMU. Again, no effort has been spent yet on sharing code with the v1 server. Parts have been copied, others were rewritten. In addition to parameters of v1, this server now also specifies - the maximum number of peers to be connected (required to know in advance because of v2's state table) - the size of the output sections (can be 0) - the protocol ID to be published to all peers When a virtio protocol ID is chosen, only 2 peers can be connected. Furthermore, the server will signal the backend variant of the ID to the master instance and the frontend ID to the slave peer. To start, e.g., a server that allows virtio console over ivshmem, call ivshmem2-server -F -l 64K -n 2 -V 3 -P 0x8003 TODO: specify the new server protocol. Signed-off-by: Jan Kiszka --- Makefile | 3 + Makefile.objs | 1 + configure | 1 + contrib/ivshmem2-server/Makefile.objs | 1 + contrib/ivshmem2-server/ivshmem2-server.c | 462 ++ contrib/ivshmem2-server/ivshmem2-server.h | 158 ++ contrib/ivshmem2-server/main.c| 313 7 files changed, 939 insertions(+) create mode 100644 contrib/ivshmem2-server/Makefile.objs create mode 100644 contrib/ivshmem2-server/ivshmem2-server.c create mode 100644 contrib/ivshmem2-server/ivshmem2-server.h create mode 100644 contrib/ivshmem2-server/main.c diff --git a/Makefile b/Makefile index aa9d1a42aa..85b5e985ac 100644 --- a/Makefile +++ b/Makefile @@ -430,6 +430,7 @@ dummy := $(call unnest-vars,, \ elf2dmp-obj-y \ ivshmem-client-obj-y \ ivshmem-server-obj-y \ +ivshmem2-server-obj-y \ rdmacm-mux-obj-y \ libvhost-user-obj-y \ vhost-user-scsi-obj-y \ @@ -662,6 +663,8 @@ ivshmem-client$(EXESUF): $(ivshmem-client-obj-y) $(COMMON_LDADDS) $(call LINK, $^) ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) $(COMMON_LDADDS) $(call LINK, $^) +ivshmem2-server$(EXESUF): $(ivshmem2-server-obj-y) $(COMMON_LDADDS) + $(call LINK, $^) endif vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a 
$(call LINK, $^) diff --git a/Makefile.objs b/Makefile.objs index 11ba1a36bd..a4a729d808 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -117,6 +117,7 @@ qga-vss-dll-obj-y = qga/ elf2dmp-obj-y = contrib/elf2dmp/ ivshmem-client-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem-client/ ivshmem-server-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem-server/ +ivshmem2-server-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem2-server/ libvhost-user-obj-y = contrib/libvhost-user/ vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS) vhost-user-scsi.o-libs := $(LIBISCSI_LIBS) diff --git a/configure b/configure index efe165edf9..1b466b2461 100755 --- a/configure +++ b/configure @@ -6184,6 +6184,7 @@ if test "$want_tools" = "yes" ; then fi if [ "$ivshmem" = "yes" ]; then tools="ivshmem-client\$(EXESUF) ivshmem-server\$(EXESUF) $tools" +tools="ivshmem2-server\$(EXESUF) $tools" fi if [ "$curl" = "yes" ]; then tools="elf2dmp\$(EXESUF) $tools" diff --git a/contrib/ivshmem2-server/Makefile.objs b/contrib/ivshmem2-server/Makefile.objs new file mode 100644 index 00..d233e18ec8 --- /dev/null +++ b/contrib/ivshmem2-server/Makefile.objs @@ -0,0 +1 @@ +ivshmem2-server-obj-y = ivshmem2-server.o main.o diff --git a/contrib/ivshmem2-server/ivshmem2-server.c b/contrib/ivshmem2-server/ivshmem2-server.c new file mode 100644 index 00..b341f1fcd0 --- /dev/null +++ b/contrib/ivshmem2-server/ivshmem2-server.c @@ -0,0 +1,462 @@ +/* + * Copyright 6WIND S.A., 2014 + * Copyright (c) Siemens AG, 2019 + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * (at your option) any later version. See the COPYING file in the + * top-level directory. + */ +#include "qemu/osdep.h" +#include "qemu/host-utils.h" +#include "qemu/sockets.h" +#include "qemu/atomic.h" + +#include +#include + +#include "ivshmem2-server.h" + +/* log a message on stdout if verbose=1 */ +#define IVSHMEM_SERVER_DEBUG(server, fmt, ...) 
do { \ +if ((server)->args.verbose) { \ +printf(fmt, ## __VA_ARGS__); \ +}\ +} while (0) + +/** maximum size of a huge page, used by ivshmem_server_ftruncate() */ +#define IVSHMEM_SERVER_MAX_HUGEPAGE_SIZE (1024 * 1024 * 1024) + +/** default listen backlog (number of sockets not accepted) */ +#define IVSHMEM_SERVER_LISTEN_BACKLOG 10 + +/* send message to a client unix socket */ +static int ivshmem_server_send_msg(int sock_fd, void *payload, int len, int fd) +{ +in
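The `ivshmem_server_send_msg()` function whose beginning appears above takes a file descriptor parameter alongside the payload — the classic pattern of passing descriptors over a Unix-domain socket via SCM_RIGHTS ancillary data, which is how the server hands shared-memory and eventfd descriptors to its clients. A minimal standalone sketch of that pattern (not the server's actual code; names are invented):

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send a payload plus one file descriptor over a Unix-domain socket
 * using SCM_RIGHTS ancillary data; pass fd = -1 to send data only. */
static int send_with_fd(int sock, const void *payload, size_t len, int fd)
{
    struct iovec iov = { .iov_base = (void *)payload, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    char ctrl[CMSG_SPACE(sizeof(int))];

    if (fd >= 0) {
        msg.msg_control = ctrl;
        msg.msg_controllen = sizeof(ctrl);
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
    }
    return sendmsg(sock, &msg, 0) == (ssize_t)len ? 0 : -1;
}

/* Round-trip self-check: pass one end of a pipe across a socketpair and
 * write through the received descriptor. Returns 0 on success. */
static int demo_fd_passing(void)
{
    int sv[2], pipefd[2], received_fd = -1;
    char buf[8] = "";

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) || pipe(pipefd))
        return -1;
    if (send_with_fd(sv[0], "hi", 2, pipefd[1]))
        return -1;

    struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
    char ctrl[CMSG_SPACE(sizeof(int))];
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = ctrl, .msg_controllen = sizeof(ctrl) };
    if (recvmsg(sv[1], &msg, 0) != 2)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&received_fd, CMSG_DATA(cmsg), sizeof(int));

    /* The received descriptor refers to the same pipe write end. */
    if (write(received_fd, "x", 1) != 1 || read(pipefd[0], buf, 1) != 1)
        return -1;
    return buf[0] == 'x' ? 0 : -1;
}
```

The kernel duplicates the descriptor into the receiving process, so the received fd number generally differs from the sent one while referring to the same open file description.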
Re: [PATCH] MAINTAINERS: slirp: Remove myself as maintainer
On 27.07.19 12:00, Marc-André Lureau wrote: Hi On Sat, Jul 27, 2019 at 10:16 AM Jan Kiszka wrote: From: Jan Kiszka I haven't been doing anything here for a long time, nor does my git repo still play a role. Signed-off-by: Jan Kiszka too bad, we could still use some help ;) thanks anyway! Reviewed-by: Marc-André Lureau --- MAINTAINERS | 2 -- 1 file changed, 2 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index d6de200453..238feb965f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2089,13 +2089,11 @@ F: include/hw/registerfields.h SLIRP M: Samuel Thibault -M: Jan Kiszka S: Maintained F: slirp/ F: net/slirp.c F: include/net/slirp.h T: git https://people.debian.org/~sthibault/qemu.git slirp -T: git git://git.kiszka.org/qemu.git queues/slirp Stubs M: Paolo Bonzini -- 2.16.4 May I point out that this one was never merged? Sorry, I really can't help in this area anymore. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [PATCH v2] ivshmem-server: Terminate also on SIGINT
On 03.08.19 15:22, Jan Kiszka wrote: From: Jan Kiszka Allows to shutdown a foreground session via ctrl-c. Signed-off-by: Jan Kiszka --- Changes in v2: - adjust error message contrib/ivshmem-server/main.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c index 197c79c57e..e4cd35f74c 100644 --- a/contrib/ivshmem-server/main.c +++ b/contrib/ivshmem-server/main.c @@ -223,8 +223,9 @@ main(int argc, char *argv[]) sa_quit.sa_handler = ivshmem_server_quit_cb; sa_quit.sa_flags = 0; if (sigemptyset(_quit.sa_mask) == -1 || -sigaction(SIGTERM, _quit, 0) == -1) { -perror("failed to add SIGTERM handler; sigaction"); +sigaction(SIGTERM, _quit, 0) == -1 || +sigaction(SIGINT, _quit, 0) == -1) { +perror("failed to add signal handler; sigaction"); goto err; } -- 2.16.4 ...and this one for you as well, Markus? Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [PATCH v2] ivshmem-server: Clean up shmem on shutdown
On 06.08.19 15:01, Claudio Fontana wrote:
> On 8/5/19 7:54 AM, Jan Kiszka wrote:
>> From: Jan Kiszka
>>
>> So far, the server leaves the posix shared memory object behind when
>> terminating, requiring the user to explicitly remove it in order to
>> start a new instance.
>>
>> Signed-off-by: Jan Kiszka
>> ---
>>
>> Changes in v2:
>>  - respect use_shm_open
>>  - also clean up in ivshmem_server_start error path
>>
>>  contrib/ivshmem-server/ivshmem-server.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/contrib/ivshmem-server/ivshmem-server.c b/contrib/ivshmem-server/ivshmem-server.c
>> index 77f97b209c..88daee812d 100644
>> --- a/contrib/ivshmem-server/ivshmem-server.c
>> +++ b/contrib/ivshmem-server/ivshmem-server.c
>> @@ -353,6 +353,9 @@ ivshmem_server_start(IvshmemServer *server)
>>  err_close_sock:
>>      close(sock_fd);
>>  err_close_shm:
>> +    if (server->use_shm_open) {
>> +        shm_unlink(server->shm_path);
>> +    }
>>      close(shm_fd);
>>      return -1;
>>  }
>> @@ -370,6 +373,9 @@ ivshmem_server_close(IvshmemServer *server)
>>      }
>>
>>      unlink(server->unix_sock_path);
>> +    if (server->use_shm_open) {
>> +        shm_unlink(server->shm_path);
>> +    }
>>      close(server->sock_fd);
>>      close(server->shm_fd);
>>      server->sock_fd = -1;
>> --
>> 2.16.4
>
> Reviewed-by: Claudio Fontana

Markus, would you take this?

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] i386/vmmouse: Properly reset state
On 29.08.19 20:00, Philippe Mathieu-Daudé wrote: Hi Jan, On 8/27/19 9:49 PM, Eduardo Habkost wrote: On Sun, Aug 25, 2019 at 04:58:18PM +0200, Jan Kiszka wrote: On 21.07.19 10:58, Jan Kiszka wrote: From: Jan Kiszka nb_queue was not zeroed so that we no longer delivered events if a previous guest left the device in an overflow state. The state of absolute does not matter as the next vmmouse_update_handler call will align it again. Signed-off-by: Jan Kiszka --- hw/i386/vmmouse.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/i386/vmmouse.c b/hw/i386/vmmouse.c index 5d2d278be4..e335bd07da 100644 --- a/hw/i386/vmmouse.c +++ b/hw/i386/vmmouse.c @@ -257,6 +257,7 @@ static void vmmouse_reset(DeviceState *d) VMMouseState *s = VMMOUSE(d); s->queue_size = VMMOUSE_QUEUE_SIZE; +s->nb_queue = 0; Don't we also need to reset the status in case vmmouse_get_status() is called directly after reset? s->status = 0; Thanks for checking. We call vmmouse_disable() here, and that sets status to 0x anyway. Jan With it: Reviewed-by: Philippe Mathieu-Daudé vmmouse_disable(s); } -- 2.16.4 Ping - or who is looking after this? Despite being in hw/i386, I think we can say vmmouse.c doesn't have a maintainer. Last time someone changed vmmouse.c in a meaningful way (not just adapting to API changes or removing duplicate code) was in 2012. Well it does has a few: $ ./scripts/get_maintainer.pl -f hw/i386/vmmouse.c "Michael S. Tsirkin" (supporter:PC) Marcel Apfelbaum (supporter:PC) Paolo Bonzini (maintainer:X86 TCG CPUs) Richard Henderson (maintainer:X86 TCG CPUs) Eduardo Habkost (maintainer:X86 TCG CPUs) However the correct section should rather be "PC Chipset". But the change makes sense to me, so: Reviewed-by: Eduardo Habkost I'll queue it.
Re: [Qemu-devel] [PATCH] i386/vmmouse: Properly reset state
On 21.07.19 10:58, Jan Kiszka wrote:
> From: Jan Kiszka
>
> nb_queue was not zeroed so that we no longer delivered events if a
> previous guest left the device in an overflow state.
>
> The state of absolute does not matter as the next vmmouse_update_handler
> call will align it again.
>
> Signed-off-by: Jan Kiszka
> ---
>  hw/i386/vmmouse.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/i386/vmmouse.c b/hw/i386/vmmouse.c
> index 5d2d278be4..e335bd07da 100644
> --- a/hw/i386/vmmouse.c
> +++ b/hw/i386/vmmouse.c
> @@ -257,6 +257,7 @@ static void vmmouse_reset(DeviceState *d)
>      VMMouseState *s = VMMOUSE(d);
>
>      s->queue_size = VMMOUSE_QUEUE_SIZE;
> +    s->nb_queue = 0;
>
>      vmmouse_disable(s);
>  }
> --
> 2.16.4

Ping - or who is looking after this?

Jan
[Qemu-devel] [PATCH] kvm: vmxcap: Enhance with latest features
Based on SDM from May 2019.

Signed-off-by: Jan Kiszka
---
 scripts/kvm/vmxcap | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/scripts/kvm/vmxcap b/scripts/kvm/vmxcap
index 99a8146aaa..d8c7d6dfb8 100755
--- a/scripts/kvm/vmxcap
+++ b/scripts/kvm/vmxcap
@@ -178,7 +178,11 @@ controls = [
             19: 'Conceal non-root operation from PT',
             20: 'Enable XSAVES/XRSTORS',
             22: 'Mode-based execute control (XS/XU)',
+            23: 'Sub-page write permissions',
+            24: 'GPA translation for PT',
             25: 'TSC scaling',
+            26: 'User wait and pause',
+            28: 'ENCLV exiting',
         },
         cap_msr = MSR_IA32_VMX_PROCBASED_CTLS2,
     ),
@@ -197,6 +201,7 @@ controls = [
             22: 'Save VMX-preemption timer value',
             23: 'Clear IA32_BNDCFGS',
             24: 'Conceal VM exits from PT',
+            25: 'Clear IA32_RTIT_CTL',
         },
         cap_msr = MSR_IA32_VMX_EXIT_CTLS,
         true_cap_msr = MSR_IA32_VMX_TRUE_EXIT_CTLS,
@@ -214,6 +219,7 @@ controls = [
             15: 'Load IA32_EFER',
             16: 'Load IA32_BNDCFGS',
             17: 'Conceal VM entries from PT',
+            18: 'Load IA32_RTIT_CTL',
         },
         cap_msr = MSR_IA32_VMX_ENTRY_CTLS,
         true_cap_msr = MSR_IA32_VMX_TRUE_ENTRY_CTLS,
@@ -227,6 +233,7 @@ controls = [
             6: 'HLT activity state',
             7: 'Shutdown activity state',
             8: 'Wait-for-SIPI activity state',
+            14: 'PT in VMX operation',
             15: 'IA32_SMBASE support',
             (16,24): 'Number of CR3-target values',
             (25,27): 'MSR-load/store count recommendation',
@@ -249,6 +256,7 @@ controls = [
             17: '1GB EPT pages',
             20: 'INVEPT supported',
             21: 'EPT accessed and dirty flags',
+            22: 'Advanced VM-exit information for EPT violations',
             25: 'Single-context INVEPT',
             26: 'All-context INVEPT',
             32: 'INVVPID supported',
--
2.16.4
Re: [Qemu-devel] [PATCH v2] ivshmem-server: Terminate also on SIGINT
On 05.08.19 10:33, Stefano Garzarella wrote:
> On Sat, Aug 03, 2019 at 03:22:04PM +0200, Jan Kiszka wrote:
>> From: Jan Kiszka
>>
>> Allows to shutdown a foreground session via ctrl-c.
>>
>> Signed-off-by: Jan Kiszka
>> ---
>>
>> Changes in v2:
>>  - adjust error message
>>
>>  contrib/ivshmem-server/main.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c
>> index 197c79c57e..e4cd35f74c 100644
>> --- a/contrib/ivshmem-server/main.c
>> +++ b/contrib/ivshmem-server/main.c
>> @@ -223,8 +223,9 @@ main(int argc, char *argv[])
>>      sa_quit.sa_handler = ivshmem_server_quit_cb;
>>      sa_quit.sa_flags = 0;
>>      if (sigemptyset(&sa_quit.sa_mask) == -1 ||
>> -        sigaction(SIGTERM, &sa_quit, 0) == -1) {
>> -        perror("failed to add SIGTERM handler; sigaction");
>> +        sigaction(SIGTERM, &sa_quit, 0) == -1 ||
>> +        sigaction(SIGINT, &sa_quit, 0) == -1) {
>> +        perror("failed to add signal handler; sigaction");
>>          goto err;
>>      }
>
> Reviewed-by: Stefano Garzarella

Thanks.

> Not related with this patch, but since I was looking at the code,
> I noticed the 'ivshmem_server_quit' variable, set in the signal handler,
> is not volatile.
> Should we define it volatile to avoid possible compiler optimizations?

Yes, would be better. I'm sure there is more, possibly also on the
current device model side. I only started to dig through it while
implementing a new revision of the protocol on top and aside of it.

Jan
[Qemu-devel] [PATCH v2] ivshmem-server: Clean up shmem on shutdown
From: Jan Kiszka

So far, the server leaves the POSIX shared memory object behind when
terminating, requiring the user to explicitly remove it in order to
start a new instance.

Signed-off-by: Jan Kiszka
---

Changes in v2:
 - respect use_shm_open
 - also clean up in ivshmem_server_start error path

 contrib/ivshmem-server/ivshmem-server.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/contrib/ivshmem-server/ivshmem-server.c b/contrib/ivshmem-server/ivshmem-server.c
index 77f97b209c..88daee812d 100644
--- a/contrib/ivshmem-server/ivshmem-server.c
+++ b/contrib/ivshmem-server/ivshmem-server.c
@@ -353,6 +353,9 @@ ivshmem_server_start(IvshmemServer *server)
 err_close_sock:
     close(sock_fd);
 err_close_shm:
+    if (server->use_shm_open) {
+        shm_unlink(server->shm_path);
+    }
     close(shm_fd);
     return -1;
 }
@@ -370,6 +373,9 @@ ivshmem_server_close(IvshmemServer *server)
     }

     unlink(server->unix_sock_path);
+    if (server->use_shm_open) {
+        shm_unlink(server->shm_path);
+    }
     close(server->sock_fd);
     close(server->shm_fd);
     server->sock_fd = -1;
--
2.16.4
[Qemu-devel] [PATCH v2] ivshmem-server: Terminate also on SIGINT
From: Jan Kiszka

Allows shutting down a foreground session via ctrl-c.

Signed-off-by: Jan Kiszka
---

Changes in v2:
 - adjust error message

 contrib/ivshmem-server/main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c
index 197c79c57e..e4cd35f74c 100644
--- a/contrib/ivshmem-server/main.c
+++ b/contrib/ivshmem-server/main.c
@@ -223,8 +223,9 @@ main(int argc, char *argv[])
     sa_quit.sa_handler = ivshmem_server_quit_cb;
     sa_quit.sa_flags = 0;
     if (sigemptyset(&sa_quit.sa_mask) == -1 ||
-        sigaction(SIGTERM, &sa_quit, 0) == -1) {
-        perror("failed to add SIGTERM handler; sigaction");
+        sigaction(SIGTERM, &sa_quit, 0) == -1 ||
+        sigaction(SIGINT, &sa_quit, 0) == -1) {
+        perror("failed to add signal handler; sigaction");
         goto err;
     }
--
2.16.4
[Qemu-devel] [PATCH] ivshmem-server: Terminate also on SIGINT
From: Jan Kiszka

Allows shutting down a foreground session via ctrl-c.

Signed-off-by: Jan Kiszka
---
 contrib/ivshmem-server/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/ivshmem-server/main.c b/contrib/ivshmem-server/main.c
index 197c79c57e..8a81cdb04c 100644
--- a/contrib/ivshmem-server/main.c
+++ b/contrib/ivshmem-server/main.c
@@ -223,7 +223,8 @@ main(int argc, char *argv[])
     sa_quit.sa_handler = ivshmem_server_quit_cb;
     sa_quit.sa_flags = 0;
     if (sigemptyset(&sa_quit.sa_mask) == -1 ||
-        sigaction(SIGTERM, &sa_quit, 0) == -1) {
+        sigaction(SIGTERM, &sa_quit, 0) == -1 ||
+        sigaction(SIGINT, &sa_quit, 0) == -1) {
         perror("failed to add SIGTERM handler; sigaction");
         goto err;
     }
--
2.16.4
[Qemu-devel] [PATCH] ivshmem-server: Clean up shmem on shutdown
From: Jan Kiszka

So far, the server leaves the POSIX shared memory object behind when
terminating, requiring the user to explicitly remove it in order to
start a new instance.

Signed-off-by: Jan Kiszka
---
 contrib/ivshmem-server/ivshmem-server.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/ivshmem-server/ivshmem-server.c b/contrib/ivshmem-server/ivshmem-server.c
index 77f97b209c..9b9dbc87ec 100644
--- a/contrib/ivshmem-server/ivshmem-server.c
+++ b/contrib/ivshmem-server/ivshmem-server.c
@@ -370,6 +370,7 @@ ivshmem_server_close(IvshmemServer *server)
     }

     unlink(server->unix_sock_path);
+    shm_unlink(server->shm_path);
     close(server->sock_fd);
     close(server->shm_fd);
     server->sock_fd = -1;
--
2.16.4
[Qemu-devel] [PATCH] MAINTAINERS: slirp: Remove myself as maintainer
From: Jan Kiszka

I haven't been doing anything here for a long time, nor does my git
repo still play a role.

Signed-off-by: Jan Kiszka
---
 MAINTAINERS | 2 --
 1 file changed, 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d6de200453..238feb965f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2089,13 +2089,11 @@ F: include/hw/registerfields.h

 SLIRP
 M: Samuel Thibault
-M: Jan Kiszka
 S: Maintained
 F: slirp/
 F: net/slirp.c
 F: include/net/slirp.h
 T: git https://people.debian.org/~sthibault/qemu.git slirp
-T: git git://git.kiszka.org/qemu.git queues/slirp

 Stubs
 M: Paolo Bonzini
--
2.16.4
Re: [Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
On 22.07.19 12:31, Liran Alon wrote: >> On 22 Jul 2019, at 13:20, Jan Kiszka wrote: >> >> On 22.07.19 11:44, Liran Alon wrote: >>> >>> >>>> On 22 Jul 2019, at 7:00, Jan Kiszka wrote: >>>> >>>> Writing the nested state e.g. after a vmport access can invalidate >>>> important parts of the kernel-internal state, and it is not needed as >>>> well. So leave this out from KVM_PUT_RUNTIME_STATE. >>>> >>>> Suggested-by: Paolo Bonzini >>>> Signed-off-by: Jan Kiszka >>> >>> As QEMU never modifies vCPU nested-state in userspace besides in vmload and >>> vCPU creation, >>> shouldn’t this be under KVM_PUT_FULL_STATE? Same as the call to >>> kvm_arch_set_tsc_khz(). >> >> Reset is a relevant modification as well. If we do not write back a state >> that >> is disabling virtualization, we break. >> >> Jan > > Currently QEMU writes to userspace maintained nested-state only at > kvm_arch_init_vcpu() and > when loading vmstate_nested_state vmstate subsection. > kvm_arch_reset_vcpu() do not modify userspace maintained nested-state. Hmm, then we probably achieve that effect by clearing the related bit in CR4. If doing that implies an invalidation of the nested state by KVM, we are fine. Otherwise, I would expect userspace to reset the state to VMCLEAR and purge any traces of prior use. > > I would expect KVM internal nested-state to be reset as part of KVM’s > vmx_vcpu_reset(). vmx_vcpu_reset is not issued on userspace initiated VM reset, only on INIT/SIPI cycles. It's misleading, and I remember myself running into that when I hacked on KVM back then. Jan > Are we missing a call to vmx_leave_nested() there? > > -Liran > -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
On 22.07.19 11:44, Liran Alon wrote:
>
>> On 22 Jul 2019, at 7:00, Jan Kiszka wrote:
>>
>> Writing the nested state e.g. after a vmport access can invalidate
>> important parts of the kernel-internal state, and it is not needed as
>> well. So leave this out from KVM_PUT_RUNTIME_STATE.
>>
>> Suggested-by: Paolo Bonzini
>> Signed-off-by: Jan Kiszka
>
> As QEMU never modifies vCPU nested-state in userspace besides in vmload
> and vCPU creation, shouldn’t this be under KVM_PUT_FULL_STATE? Same as
> the call to kvm_arch_set_tsc_khz().

Reset is a relevant modification as well. If we do not write back a state
that is disabling virtualization, we break.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
Writing the nested state e.g. after a vmport access can invalidate
important parts of the kernel-internal state, nor is it needed there.
So leave this out from KVM_PUT_RUNTIME_STATE.

Suggested-by: Paolo Bonzini
Signed-off-by: Jan Kiszka
---
 target/i386/kvm.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index ec7870c6af..da98b2cbca 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -3577,12 +3577,12 @@ int kvm_arch_put_registers(CPUState *cpu, int level)

     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));

-    ret = kvm_put_nested_state(x86_cpu);
-    if (ret < 0) {
-        return ret;
-    }
-
     if (level >= KVM_PUT_RESET_STATE) {
+        ret = kvm_put_nested_state(x86_cpu);
+        if (ret < 0) {
+            return ret;
+        }
+
         ret = kvm_put_msr_feature_control(x86_cpu);
         if (ret < 0) {
             return ret;
--
2.16.4
[Qemu-devel] [PATCH] i386/vmmouse: Properly reset state
From: Jan Kiszka

nb_queue was not zeroed so that we no longer delivered events if a
previous guest left the device in an overflow state.

The state of absolute does not matter as the next vmmouse_update_handler
call will align it again.

Signed-off-by: Jan Kiszka
---
 hw/i386/vmmouse.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/vmmouse.c b/hw/i386/vmmouse.c
index 5d2d278be4..e335bd07da 100644
--- a/hw/i386/vmmouse.c
+++ b/hw/i386/vmmouse.c
@@ -257,6 +257,7 @@ static void vmmouse_reset(DeviceState *d)
     VMMouseState *s = VMMOUSE(d);

     s->queue_size = VMMOUSE_QUEUE_SIZE;
+    s->nb_queue = 0;

     vmmouse_disable(s);
 }
--
2.16.4
Re: [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
On 03.06.19 02:36, Michael S. Tsirkin wrote:
> On Sun, Jun 02, 2019 at 01:42:13PM +0200, Jan Kiszka wrote:
>> From: Jan Kiszka
>>
>> Masked entries will not generate interrupt messages, thus do no need to
>> be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
>> the kind
>>
>> qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0,
>> high=0xff00, low=0x100)
>>
>> if the masked entry happens to reference a non-present IRTE.
>>
>> Signed-off-by: Jan Kiszka
>
> Reviewed-by: Michael S. Tsirkin
>
>> ---
>>  hw/intc/ioapic.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
>> index 7074489fdf..2fb288a22d 100644
>> --- a/hw/intc/ioapic.c
>> +++ b/hw/intc/ioapic.c
>> @@ -197,9 +197,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
>>          MSIMessage msg;
>>          struct ioapic_entry_info info;
>>          ioapic_entry_parse(s->ioredtbl[i], &info);
>> -        msg.address = info.addr;
>> -        msg.data = info.data;
>> -        kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
>> +        if (!info.masked) {
>> +            msg.address = info.addr;
>> +            msg.data = info.data;
>> +            kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
>> +        }
>>      }
>>      kvm_irqchip_commit_routes(kvm_state);
>>  }
>> --
>> 2.16.4
>
Ping. Or is this queued for 4.2?

Jan
Re: [Qemu-devel] [PULL 22/25] target/i386: kvm: Add nested migration blocker only when kernel lacks required capabilities
On 10.07.19 18:34, Paolo Bonzini wrote: > On 10/07/19 18:08, Jan Kiszka wrote: >> On 10.07.19 16:40, Paolo Bonzini wrote: >>> On 08/07/19 20:31, Jan Kiszka wrote: >>>>> -if (cpu_has_nested_virt(env) && !env->nested_state) { >>>>> +if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) { >>>>> error_report("Guest enabled nested virtualization but kernel " >>>>> "does not support saving of nested state"); >>>>> return -EINVAL; >>>>> >>>> Starting with this commit latest (bisection issue...), running Jailhouse >>>> in a >>>> guest first stalls L1 (looks like we lose interrupts), and if I try to >>>> reset >>>> that VM, I lose my host as well: >>> >>> If 5.2 is still broken, can you gather a dump of KVM_SET_NESTED_STATE's >>> payload? >> >> savevm or is there a dump function in QEMU or the kernel? > > There is qemu_hexdump. > Here are two states taken during the vmport loop: 2: nested_state: : 01 00 00 00 80 10 00 00 00 00 24 3a 00 00 00 00 ..$: 2: nested_state: 0010: 00 10 24 3a 00 00 00 00 00 00 00 00 00 00 00 00 ..$: 2: nested_state: 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0080: d0 7e e5 11 00 00 00 00 01 00 00 00 00 00 00 00 .~.. 
2: nested_state: 0090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 00a0: 00 00 00 00 00 00 00 00 00 70 25 3a 00 00 00 00 .p%: 2: nested_state: 00b0: 00 80 25 3a 00 00 00 00 00 f0 00 3a 00 00 00 00 ..%:...: 2: nested_state: 00c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 00d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 00e0: 00 00 00 00 00 00 00 00 00 50 01 3a 00 00 00 00 .P.: 2: nested_state: 00f0: 00 00 00 00 00 00 00 00 1e e0 22 3a 00 00 00 00 ..": 2: nested_state: 0100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0120: 00 00 00 00 00 00 00 00 10 00 00 00 01 00 00 00 2: nested_state: 0130: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 2: nested_state: 0140: 06 01 07 00 06 01 07 00 01 0d 00 00 00 00 00 00 2: nested_state: 0150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0180: 06 01 07 00 00 00 00 00 00 05 00 00 00 00 00 00 2: nested_state: 0190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 01a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 01b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 01c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 01d0: 00 00 00 00 00 00 00 00 30 00 00 60 ff ff ff ff 0..` 2: nested_state: 01e0: 00 00 00 00 00 00 00 00 33 00 05 80 00 00 00 00 3... 2: nested_state: 01f0: a0 26 04 00 00 00 00 00 00 00 00 00 00 00 00 00 .&.. 2: nested_state: 0200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0220: 00 00 00 00 00 00 00 00 33 00 05 80 00 00 00 00 3... 2: nested_state: 0230: 00 20 6f 32 00 00 00 00 a0 26 04 00 00 00 00 00 . o2.&.. 
2: nested_state: 0240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0250: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0260: 00 97 c4 ec b4 7f 00 00 00 00 00 00 00 00 00 00 2: nested_state: 0270: 00 00 00 00 00 00 00 00 00 90 06 00 00 fe ff ff 2: nested_state: 0280: 00 70 06 00 00 fe ff ff 00 00 00
Re: [Qemu-devel] [PULL 22/25] target/i386: kvm: Add nested migration blocker only when kernel lacks required capabilities
On 10.07.19 16:40, Paolo Bonzini wrote: > On 08/07/19 20:31, Jan Kiszka wrote: >>> -if (cpu_has_nested_virt(env) && !env->nested_state) { >>> +if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) { >>> error_report("Guest enabled nested virtualization but kernel " >>> "does not support saving of nested state"); >>> return -EINVAL; >>> >> Starting with this commit latest (bisection issue...), running Jailhouse in a >> guest first stalls L1 (looks like we lose interrupts), and if I try to reset >> that VM, I lose my host as well: > > If 5.2 is still broken, can you gather a dump of KVM_SET_NESTED_STATE's > payload? savevm or is there a dump function in QEMU or the kernel? Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PULL 22/25] target/i386: kvm: Add nested migration blocker only when kernel lacks required capabilities
On 08.07.19 20:31, Jan Kiszka wrote: > > On 21.06.19 13:30, Paolo Bonzini wrote: >> From: Liran Alon >> >> Previous commits have added support for migration of nested virtualization >> workloads. This was done by utilising two new KVM capabilities: >> KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD. Both which are >> required in order to correctly migrate such workloads. >> >> Therefore, change code to add a migration blocker for vCPUs exposed with >> Intel VMX or AMD SVM in case one of these kernel capabilities is >> missing. >> >> Signed-off-by: Liran Alon >> Reviewed-by: Maran Wilson >> Message-Id: <20190619162140.133674-11-liran.a...@oracle.com> >> Signed-off-by: Paolo Bonzini >> --- >> target/i386/kvm.c | 9 +++-- >> target/i386/machine.c | 2 +- >> 2 files changed, 8 insertions(+), 3 deletions(-) >> >> diff --git a/target/i386/kvm.c b/target/i386/kvm.c >> index c931e9d..e4b4f57 100644 >> --- a/target/i386/kvm.c >> +++ b/target/i386/kvm.c >> @@ -1640,9 +1640,14 @@ int kvm_arch_init_vcpu(CPUState *cs) >>!!(c->ecx & CPUID_EXT_SMX); >> } >> >> -if (cpu_has_nested_virt(env) && !nested_virt_mig_blocker) { >> +if (cpu_has_vmx(env) && !nested_virt_mig_blocker && >> +((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) { >> error_setg(_virt_mig_blocker, >> - "Nested virtualization does not support live migration >> yet"); >> + "Kernel do not provide required capabilities for " >> + "nested virtualization migration. 
" >> + "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)", >> + kvm_max_nested_state_length() > 0, >> + has_exception_payload); >> r = migrate_add_blocker(nested_virt_mig_blocker, _err); >> if (local_err) { >> error_report_err(local_err); >> diff --git a/target/i386/machine.c b/target/i386/machine.c >> index fc49e5a..851b249 100644 >> --- a/target/i386/machine.c >> +++ b/target/i386/machine.c >> @@ -233,7 +233,7 @@ static int cpu_pre_save(void *opaque) >> >> #ifdef CONFIG_KVM >> /* Verify we have nested virtualization state from kernel if required */ >> -if (cpu_has_nested_virt(env) && !env->nested_state) { >> +if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) { >> error_report("Guest enabled nested virtualization but kernel " >> "does not support saving of nested state"); >> return -EINVAL; >> > > Starting with this commit latest (bisection issue...), running Jailhouse in a > guest first stalls L1 (looks like we lose interrupts), and if I try to reset > that VM, I lose my host as well: > > kvm: vmptrld (null)/6eb9 failed > kvm: vmclear fail: (null)/6eb9 > > and then things start to lock up because we seem to lose the CPUs the guest > was > running on. Once I had this in the logs: > > rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 7-... } 15040 > jiffies s: 4673 root: 0x80/. > rcu: blocking rcu_node structures: > Task dump for CPU 7: > qemu-system-x86 R running task0 17413 17345 0x0008 > Call Trace: > ? x86_virt_spec_ctrl+0x7/0xe0 > ? vmx_vcpu_run.part.0+0x2a4/0xfa0 [kvm_intel] > ? vcpu_enter_guest+0x349/0xe80 [kvm] > ? kvm_arch_vcpu_ioctl_run+0xff/0x550 [kvm] > ? kvm_vcpu_ioctl+0x20d/0x590 [kvm] > ? get_futex_key+0x35d/0x3b0 > ? do_vfs_ioctl+0x447/0x640 > ? do_futex+0x157/0x1d0 > ? ksys_ioctl+0x5e/0x90 > ? __x64_sys_ioctl+0x16/0x20 > ? do_syscall_64+0x60/0x120 > ? entry_SYSCALL_64_after_hwframe+0x49/0xbe > > This was on a 5.1.16 distro kernel. Currently rebuilding 5.2 vanilla. > > Looks like we have up to two critical bugs here... 
> > Jan > It turns out it's actually patch 20 that introduces the problem. Maybe it is only the kernel, but I rather suspect a combination of userspace not providing the right state (specifically on reset) and the kernel accepting that. Continuing the search on 5.2 now. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PULL 22/25] target/i386: kvm: Add nested migration blocker only when kernel lacks required capabilities
On 21.06.19 13:30, Paolo Bonzini wrote: > From: Liran Alon > > Previous commits have added support for migration of nested virtualization > workloads. This was done by utilising two new KVM capabilities: > KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD. Both which are > required in order to correctly migrate such workloads. > > Therefore, change code to add a migration blocker for vCPUs exposed with > Intel VMX or AMD SVM in case one of these kernel capabilities is > missing. > > Signed-off-by: Liran Alon > Reviewed-by: Maran Wilson > Message-Id: <20190619162140.133674-11-liran.a...@oracle.com> > Signed-off-by: Paolo Bonzini > --- > target/i386/kvm.c | 9 +++-- > target/i386/machine.c | 2 +- > 2 files changed, 8 insertions(+), 3 deletions(-) > > diff --git a/target/i386/kvm.c b/target/i386/kvm.c > index c931e9d..e4b4f57 100644 > --- a/target/i386/kvm.c > +++ b/target/i386/kvm.c > @@ -1640,9 +1640,14 @@ int kvm_arch_init_vcpu(CPUState *cs) >!!(c->ecx & CPUID_EXT_SMX); > } > > -if (cpu_has_nested_virt(env) && !nested_virt_mig_blocker) { > +if (cpu_has_vmx(env) && !nested_virt_mig_blocker && > +((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) { > error_setg(_virt_mig_blocker, > - "Nested virtualization does not support live migration > yet"); > + "Kernel do not provide required capabilities for " > + "nested virtualization migration. 
" > + "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)", > + kvm_max_nested_state_length() > 0, > + has_exception_payload); > r = migrate_add_blocker(nested_virt_mig_blocker, _err); > if (local_err) { > error_report_err(local_err); > diff --git a/target/i386/machine.c b/target/i386/machine.c > index fc49e5a..851b249 100644 > --- a/target/i386/machine.c > +++ b/target/i386/machine.c > @@ -233,7 +233,7 @@ static int cpu_pre_save(void *opaque) > > #ifdef CONFIG_KVM > /* Verify we have nested virtualization state from kernel if required */ > -if (cpu_has_nested_virt(env) && !env->nested_state) { > +if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) { > error_report("Guest enabled nested virtualization but kernel " > "does not support saving of nested state"); > return -EINVAL; > Starting with this commit latest (bisection issue...), running Jailhouse in a guest first stalls L1 (looks like we lose interrupts), and if I try to reset that VM, I lose my host as well: kvm: vmptrld (null)/6eb9 failed kvm: vmclear fail: (null)/6eb9 and then things start to lock up because we seem to lose the CPUs the guest was running on. Once I had this in the logs: rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 7-... } 15040 jiffies s: 4673 root: 0x80/. rcu: blocking rcu_node structures: Task dump for CPU 7: qemu-system-x86 R running task0 17413 17345 0x0008 Call Trace: ? x86_virt_spec_ctrl+0x7/0xe0 ? vmx_vcpu_run.part.0+0x2a4/0xfa0 [kvm_intel] ? vcpu_enter_guest+0x349/0xe80 [kvm] ? kvm_arch_vcpu_ioctl_run+0xff/0x550 [kvm] ? kvm_vcpu_ioctl+0x20d/0x590 [kvm] ? get_futex_key+0x35d/0x3b0 ? do_vfs_ioctl+0x447/0x640 ? do_futex+0x157/0x1d0 ? ksys_ioctl+0x5e/0x90 ? __x64_sys_ioctl+0x16/0x20 ? do_syscall_64+0x60/0x120 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe This was on a 5.1.16 distro kernel. Currently rebuilding 5.2 vanilla. Looks like we have up to two critical bugs here... 
Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PULL 17/25] target/i386: kvm: Block migration for vCPUs exposed with nested virtualization
On 21.06.19 13:30, Paolo Bonzini wrote: > From: Liran Alon > > Commit d98f26073beb ("target/i386: kvm: add VMX migration blocker") > added a migration blocker for vCPU exposed with Intel VMX. > However, migration should also be blocked for vCPU exposed with > AMD SVM. > > Both cases should be blocked because QEMU should extract additional > vCPU state from KVM that should be migrated as part of vCPU VMState. > E.g. Whether vCPU is running in guest-mode or host-mode. > > Fixes: d98f26073beb ("target/i386: kvm: add VMX migration blocker") > Reviewed-by: Maran Wilson > Signed-off-by: Liran Alon > Message-Id: <20190619162140.133674-6-liran.a...@oracle.com> > Signed-off-by: Paolo Bonzini > --- > target/i386/cpu.c | 6 -- > target/i386/cpu.h | 12 > target/i386/kvm.c | 14 +++--- > 3 files changed, 19 insertions(+), 13 deletions(-) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index c330fd9..61e44cb 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -5215,12 +5215,6 @@ static int x86_cpu_filter_features(X86CPU *cpu) > return rv; > } > > -#define IS_INTEL_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_INTEL_1 && \ > - (env)->cpuid_vendor2 == CPUID_VENDOR_INTEL_2 && \ > - (env)->cpuid_vendor3 == CPUID_VENDOR_INTEL_3) > -#define IS_AMD_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_AMD_1 && \ > - (env)->cpuid_vendor2 == CPUID_VENDOR_AMD_2 && \ > - (env)->cpuid_vendor3 == CPUID_VENDOR_AMD_3) > static void x86_cpu_realizefn(DeviceState *dev, Error **errp) > { > CPUState *cs = CPU(dev); > diff --git a/target/i386/cpu.h b/target/i386/cpu.h > index 7f48136..bf0c9c2 100644 > --- a/target/i386/cpu.h > +++ b/target/i386/cpu.h > @@ -722,6 +722,13 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS]; > > #define CPUID_VENDOR_HYGON"HygonGenuine" > > +#define IS_INTEL_CPU(env) ((env)->cpuid_vendor1 == CPUID_VENDOR_INTEL_1 && \ > + (env)->cpuid_vendor2 == CPUID_VENDOR_INTEL_2 && \ > + (env)->cpuid_vendor3 == CPUID_VENDOR_INTEL_3) > +#define IS_AMD_CPU(env) 
((env)->cpuid_vendor1 == CPUID_VENDOR_AMD_1 && \ > + (env)->cpuid_vendor2 == CPUID_VENDOR_AMD_2 && \ > + (env)->cpuid_vendor3 == CPUID_VENDOR_AMD_3) > + > #define CPUID_MWAIT_IBE (1U << 1) /* Interrupts can exit capability */ > #define CPUID_MWAIT_EMX (1U << 0) /* enumeration supported */ > > @@ -1848,6 +1855,11 @@ static inline int32_t x86_get_a20_mask(CPUX86State > *env) > } > } > > +static inline bool cpu_has_vmx(CPUX86State *env) > +{ > +return env->features[FEAT_1_ECX] & CPUID_EXT_VMX; > +} > + > /* fpu_helper.c */ > void update_fp_status(CPUX86State *env); > void update_mxcsr_status(CPUX86State *env); > diff --git a/target/i386/kvm.c b/target/i386/kvm.c > index 9864aa0..f9872f1 100644 > --- a/target/i386/kvm.c > +++ b/target/i386/kvm.c > @@ -1299,7 +1299,7 @@ static int hyperv_init_vcpu(X86CPU *cpu) > } > > static Error *invtsc_mig_blocker; > -static Error *vmx_mig_blocker; > +static Error *nested_virt_mig_blocker; > > #define KVM_MAX_CPUID_ENTRIES 100 > > @@ -1597,13 +1597,13 @@ int kvm_arch_init_vcpu(CPUState *cs) >!!(c->ecx & CPUID_EXT_SMX); > } > > -if ((env->features[FEAT_1_ECX] & CPUID_EXT_VMX) && !vmx_mig_blocker) { > -error_setg(_mig_blocker, > - "Nested VMX virtualization does not support live > migration yet"); > -r = migrate_add_blocker(vmx_mig_blocker, _err); > +if (cpu_has_nested_virt(env) && !nested_virt_mig_blocker) { This broke bisection (which I currently have to do because this pull manages to lock up 5.1 host kernels): cpu_has_nested_virt will only come later in this series. 
Jan > +error_setg(&nested_virt_mig_blocker, > + "Nested virtualization does not support live migration > yet"); > +r = migrate_add_blocker(nested_virt_mig_blocker, &local_err); > if (local_err) { > error_report_err(local_err); > -error_free(vmx_mig_blocker); > +error_free(nested_virt_mig_blocker); > return r; > } > } > @@ -1674,7 +1674,7 @@ int kvm_arch_init_vcpu(CPUState *cs) > fail: > migrate_del_blocker(invtsc_mig_blocker); > fail2: > -migrate_del_blocker(vmx_mig_blocker); > +migrate_del_blocker(nested_virt_mig_blocker); > > return r; > } > -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] hw/arm/virt: Add support for Cortex-A7
From: Jan Kiszka No reason to deny this type. Signed-off-by: Jan Kiszka --- hw/arm/virt.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 431e2900fd..ed009fa447 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -176,6 +176,7 @@ static const int a15irqmap[] = { }; static const char *valid_cpus[] = { +ARM_CPU_TYPE_NAME("cortex-a7"), ARM_CPU_TYPE_NAME("cortex-a15"), ARM_CPU_TYPE_NAME("cortex-a53"), ARM_CPU_TYPE_NAME("cortex-a57"), -- 2.16.4
Re: [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
On 02.06.19 14:10, Peter Xu wrote: On Sun, Jun 02, 2019 at 01:42:13PM +0200, Jan Kiszka wrote: From: Jan Kiszka Masked entries will not generate interrupt messages, thus do not need to be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of the kind qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100) if the masked entry happens to reference a non-present IRTE. Signed-off-by: Jan Kiszka Reviewed-by: Peter Xu Thanks, Jan. The "non-cosmetic" part of clearing those entries (e.g. including when the entries were not set up correctly rather than masked) was never really implemented and that task has been on my todo list for quite a while but with a very low priority (low enough to sink...). I hope I didn't overlook its importance since AFAICT general OSs should hardly trigger those paths and so far I don't see it as a very big issue. I triggered the message during the handover phase from Linux to Jailhouse. That involves reprogramming both IOAPIC and IRTEs - likely unusual sequences, I just didn't find invalid ones. Thanks, Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
From: Jan Kiszka Masked entries will not generate interrupt messages, thus do not need to be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of the kind qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100) if the masked entry happens to reference a non-present IRTE. Signed-off-by: Jan Kiszka --- hw/intc/ioapic.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c index 7074489fdf..2fb288a22d 100644 --- a/hw/intc/ioapic.c +++ b/hw/intc/ioapic.c @@ -197,9 +197,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s) MSIMessage msg; struct ioapic_entry_info info; ioapic_entry_parse(s->ioredtbl[i], &info); -msg.address = info.addr; -msg.data = info.data; -kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL); +if (!info.masked) { +msg.address = info.addr; +msg.data = info.data; +kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL); +} } kvm_irqchip_commit_routes(kvm_state); } -- 2.16.4
Re: [Qemu-devel] [PATCH v8 05/16] gdbstub: add multiprocess support to vCont packets
On 22.03.19 15:01, Luc Michel wrote: On 3/22/19 2:29 PM, Jan Kiszka wrote: On 07.12.18 10:01, Luc Michel wrote: Add the gdb_first_attached_cpu() and gdb_next_attached_cpu() to iterate over all the CPUs in currently attached processes. Add the gdb_first_cpu_in_process() and gdb_next_cpu_in_process() to iterate over CPUs of a given process. Use them to add multiprocess extension support to vCont packets. Signed-off-by: Luc Michel Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Edgar E. Iglesias Acked-by: Alistair Francis --- gdbstub.c | 117 +++--- 1 file changed, 102 insertions(+), 15 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index 911faa225a..77b3dbb2c8 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -721,10 +721,40 @@ static CPUState *find_cpu(uint32_t thread_id) } return NULL; } +static CPUState *get_first_cpu_in_process(const GDBState *s, + GDBProcess *process) +{ +CPUState *cpu; + +CPU_FOREACH(cpu) { +if (gdb_get_cpu_pid(s, cpu) == process->pid) { +return cpu; +} +} + +return NULL; +} + +static CPUState *gdb_next_cpu_in_process(const GDBState *s, CPUState *cpu) +{ +uint32_t pid = gdb_get_cpu_pid(s, cpu); +cpu = CPU_NEXT(cpu); + +while (cpu) { +if (gdb_get_cpu_pid(s, cpu) == pid) { +break; +} + +cpu = CPU_NEXT(cpu); +} + +return cpu; +} + static CPUState *gdb_get_cpu(const GDBState *s, uint32_t pid, uint32_t tid) { GDBProcess *process; CPUState *cpu; @@ -750,10 +780,41 @@ static CPUState *gdb_get_cpu(const GDBState *s, uint32_t pid, uint32_t tid) } return cpu; } +/* Return the cpu following @cpu, while ignoring + * unattached processes. 
+ */ +static CPUState *gdb_next_attached_cpu(const GDBState *s, CPUState *cpu) +{ +cpu = CPU_NEXT(cpu); + +while (cpu) { +if (gdb_get_cpu_process(s, cpu)->attached) { +break; +} + +cpu = CPU_NEXT(cpu); +} + +return cpu; +} + +/* Return the first attached cpu */ +static CPUState *gdb_first_attached_cpu(const GDBState *s) +{ +CPUState *cpu = first_cpu; +GDBProcess *process = gdb_get_cpu_process(s, cpu); + +if (!process->attached) { +return gdb_next_attached_cpu(s, cpu); +} + +return cpu; +} + static const char *get_feature_xml(const char *p, const char **newp, CPUClass *cc) { size_t len; int i; @@ -1088,14 +1149,16 @@ static int is_query_packet(const char *p, const char *query, char separator) * returns -ENOTSUP if a command is unsupported, -EINVAL or -ERANGE if there is * a format error, 0 on success. */ static int gdb_handle_vcont(GDBState *s, const char *p) { -int res, idx, signal = 0; +int res, signal = 0; char cur_action; char *newstates; unsigned long tmp; +uint32_t pid, tid; +GDBProcess *process; CPUState *cpu; #ifdef CONFIG_USER_ONLY int max_cpus = 1; /* global variable max_cpus exists only in system mode */ CPU_FOREACH(cpu) { @@ -1134,29 +1197,52 @@ static int gdb_handle_vcont(GDBState *s, const char *p) } else if (cur_action != 'c' && cur_action != 's') { /* unknown/invalid/unsupported command */ res = -ENOTSUP; goto out; } -/* thread specification. 
special values: (none), -1 = all; 0 = any */ -if ((p[0] == ':' && p[1] == '-' && p[2] == '1') || (p[0] != ':')) { -if (*p == ':') { -p += 3; -} -for (idx = 0; idx < max_cpus; idx++) { -if (newstates[idx] == 1) { -newstates[idx] = cur_action; + +if (*p++ != ':') { +res = -ENOTSUP; +goto out; +} + +switch (read_thread_id(p, , , )) { +case GDB_READ_THREAD_ERR: +res = -EINVAL; +goto out; + +case GDB_ALL_PROCESSES: +cpu = gdb_first_attached_cpu(s); +while (cpu) { +if (newstates[cpu->cpu_index] == 1) { +newstates[cpu->cpu_index] = cur_action; } + +cpu = gdb_next_attached_cpu(s, cpu); } -} else if (*p == ':') { -p++; -res = qemu_strtoul(p, , 16, ); -if (res) { +break; + +case GDB_ALL_THREADS: +process = gdb_get_process(s, pid); + +if (!process->attached) { +res = -EINVAL; goto out; } -/* 0 means any thread, so we pick the first valid CPU */ -cpu = tmp ? find_cpu(tmp) : first_cpu; +cpu = get_first_cpu_in
Re: [Qemu-devel] [PATCH v8 05/16] gdbstub: add multiprocess support to vCont packets
On 07.12.18 10:01, Luc Michel wrote: > Add the gdb_first_attached_cpu() and gdb_next_attached_cpu() to iterate > over all the CPUs in currently attached processes. > > Add the gdb_first_cpu_in_process() and gdb_next_cpu_in_process() to > iterate over CPUs of a given process. > > Use them to add multiprocess extension support to vCont packets. > > Signed-off-by: Luc Michel > Reviewed-by: Philippe Mathieu-Daudé > Reviewed-by: Edgar E. Iglesias > Acked-by: Alistair Francis > --- > gdbstub.c | 117 +++--- > 1 file changed, 102 insertions(+), 15 deletions(-) > > diff --git a/gdbstub.c b/gdbstub.c > index 911faa225a..77b3dbb2c8 100644 > --- a/gdbstub.c > +++ b/gdbstub.c > @@ -721,10 +721,40 @@ static CPUState *find_cpu(uint32_t thread_id) > } > > return NULL; > } > > +static CPUState *get_first_cpu_in_process(const GDBState *s, > + GDBProcess *process) > +{ > +CPUState *cpu; > + > +CPU_FOREACH(cpu) { > +if (gdb_get_cpu_pid(s, cpu) == process->pid) { > +return cpu; > +} > +} > + > +return NULL; > +} > + > +static CPUState *gdb_next_cpu_in_process(const GDBState *s, CPUState *cpu) > +{ > +uint32_t pid = gdb_get_cpu_pid(s, cpu); > +cpu = CPU_NEXT(cpu); > + > +while (cpu) { > +if (gdb_get_cpu_pid(s, cpu) == pid) { > +break; > +} > + > +cpu = CPU_NEXT(cpu); > +} > + > +return cpu; > +} > + > static CPUState *gdb_get_cpu(const GDBState *s, uint32_t pid, uint32_t tid) > { > GDBProcess *process; > CPUState *cpu; > > @@ -750,10 +780,41 @@ static CPUState *gdb_get_cpu(const GDBState *s, > uint32_t pid, uint32_t tid) > } > > return cpu; > } > > +/* Return the cpu following @cpu, while ignoring > + * unattached processes. 
> + */ > +static CPUState *gdb_next_attached_cpu(const GDBState *s, CPUState *cpu) > +{ > +cpu = CPU_NEXT(cpu); > + > +while (cpu) { > +if (gdb_get_cpu_process(s, cpu)->attached) { > +break; > +} > + > +cpu = CPU_NEXT(cpu); > +} > + > +return cpu; > +} > + > +/* Return the first attached cpu */ > +static CPUState *gdb_first_attached_cpu(const GDBState *s) > +{ > +CPUState *cpu = first_cpu; > +GDBProcess *process = gdb_get_cpu_process(s, cpu); > + > +if (!process->attached) { > +return gdb_next_attached_cpu(s, cpu); > +} > + > +return cpu; > +} > + > static const char *get_feature_xml(const char *p, const char **newp, > CPUClass *cc) > { > size_t len; > int i; > @@ -1088,14 +1149,16 @@ static int is_query_packet(const char *p, const char > *query, char separator) >* returns -ENOTSUP if a command is unsupported, -EINVAL or -ERANGE if > there is >* a format error, 0 on success. >*/ > static int gdb_handle_vcont(GDBState *s, const char *p) > { > -int res, idx, signal = 0; > +int res, signal = 0; > char cur_action; > char *newstates; > unsigned long tmp; > +uint32_t pid, tid; > +GDBProcess *process; > CPUState *cpu; > #ifdef CONFIG_USER_ONLY > int max_cpus = 1; /* global variable max_cpus exists only in system > mode */ > > CPU_FOREACH(cpu) { > @@ -1134,29 +1197,52 @@ static int gdb_handle_vcont(GDBState *s, const char > *p) > } else if (cur_action != 'c' && cur_action != 's') { > /* unknown/invalid/unsupported command */ > res = -ENOTSUP; > goto out; > } > -/* thread specification. 
special values: (none), -1 = all; 0 = any */ > -if ((p[0] == ':' && p[1] == '-' && p[2] == '1') || (p[0] != ':')) { > -if (*p == ':') { > -p += 3; > -} > -for (idx = 0; idx < max_cpus; idx++) { > -if (newstates[idx] == 1) { > -newstates[idx] = cur_action; > + > +if (*p++ != ':') { > +res = -ENOTSUP; > +goto out; > +} > + > +switch (read_thread_id(p, , , )) { > +case GDB_READ_THREAD_ERR: > +res = -EINVAL; > +goto out; > + > +case GDB_ALL_PROCESSES: > +cpu = gdb_first_attached_cpu(s); > +while (cpu) { > +if (newstates[cpu->cpu_index] == 1) { > +newstates[cpu->cpu_index] = cur_action; > } > + > +cpu = gdb_next_attached_cpu(s, cpu); > } > -} else if (*p == ':') { > -p++; > -res = qemu_strtoul(p, , 16, ); > -if (res) { > +break; > + > +case GDB_ALL_THREADS: > +process = gdb_get_process(s, pid); > + > +if (!process->attached) { > +res = -EINVAL; > goto out; > } > > -
Re: [Qemu-devel] [PATCH 0/1] snip my name and email
On 21.02.19 17:48, Markus Armbruster wrote: Jan Kiszka writes: On 21.02.19 17:05, Eric Blake wrote: On 2/21/19 9:53 AM, David Kiarie wrote: the occurrence of my name and email on the files below may have led to some confusion in the reporting of a few recent bugs. i have therefore choosen to snip it. Dropping an email from the copyright line makes sense; dropping the Copyright declaration altogether is a bit odd (the GPL works only in tandem with a copyright assertion) - but as you are the author of the line and copyright holder of your contributions, I am not in a position to say you are wrong in removing it, only that it looks odd. Yeah, indeed. David, also note that you probably have been addressed because scripts/get_maintainer.pl will look into the git history of files that some patch addresses and pick up significant and/or recent contributors from there. MAINTAINERS covers these files, so the most common use of get_maintainer.pl won't list anyone not listed there: $ scripts/get_maintainer.pl -f hw/i386/amd_iommu.[ch] Paolo Bonzini (maintainer:X86) Richard Henderson (maintainer:X86) Eduardo Habkost (maintainer:X86) "Michael S. Tsirkin" (supporter:PC) Marcel Apfelbaum (supporter:PC) qemu-devel@nongnu.org (open list:All patches CC here) David's contributions have aged out of --git: $ scripts/get_maintainer.pl -f hw/i386/amd_iommu.[ch] --git "Michael S. 
Tsirkin" (supporter:PC,commit_signer:12/7=100%,commit_signer:10/5=100%) Marcel Apfelbaum (supporter:PC) Paolo Bonzini (maintainer:X86) Richard Henderson (maintainer:X86,commit_signer:1/7=14%) Eduardo Habkost (maintainer:X86) Peter Xu (commit_signer:6/7=86%,commit_signer:4/5=80%) Brijesh Singh (commit_signer:5/7=71%,commit_signer:4/5=80%) "Alex Bennée" (commit_signer:1/7=14%) Jan Kiszka (commit_signer:1/5=20%) qemu-devel@nongnu.org (open list:All patches CC here) However, --git-blame still lists him: $ scripts/get_maintainer.pl -f hw/i386/amd_iommu.[ch] --git-blame Paolo Bonzini (maintainer:X86,commits:7/24=29%) Richard Henderson (maintainer:X86) Eduardo Habkost (maintainer:X86,commits:7/24=29%) "Michael S. Tsirkin" (supporter:PC,commits:26/24=100%,commits:14/8=100%) Marcel Apfelbaum (supporter:PC) David Kiarie (authored lines:1176/1645=71%,authored lines:274/373=73%) Brijesh Singh (authored lines:403/1645=24%,authored lines:93/373=25%,commits:4/8=50%) Peter Xu (commits:10/24=42%,commits:4/8=50%) David Gibson (commits:7/24=29%) Jan Kiszka (commits:1/8=12%) Prasad J Pandit (commits:1/8=12%) qemu-devel@nongnu.org (open list:All patches CC here) --help admonishes: Using "--git-blame" is slow and may add old committers and authors that are no longer active maintainers to the output. There should be some opt-out statement from that, but I don't recall how. I don't think get_maintainer.pl supports a blacklist of people who don't want to be pestered anymore. # cat linux/.get_maintainer.ignore Christoph Hellwig That's why I remembered it vaguely. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 0/1] snip my name and email
On 21.02.19 17:05, Eric Blake wrote: On 2/21/19 9:53 AM, David Kiarie wrote: the occurrence of my name and email on the files below may have led to some confusion in the reporting of a few recent bugs. i have therefore choosen to snip it. Dropping an email from the copyright line makes sense; dropping the Copyright declaration altogether is a bit odd (the GPL works only in tandem with a copyright assertion) - but as you are the author of the line and copyright holder of your contributions, I am not in a position to say you are wrong in removing it, only that it looks odd. Yeah, indeed. David, also note that you probably have been addressed because scripts/get_maintainer.pl will look into the git history of files that some patch addresses and pick up significant and/or recent contributors from there. There should be some opt-out statement from that, but I don't recall how. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] kvm: x86: Fix kvm_arch_fixup_msi_route for remap-less case
The AMD IOMMU does not (yet) support interrupt remapping. But kvm_arch_fixup_msi_route assumes that all implementations do and crashes when the AMD IOMMU is used in KVM mode. Fixes: 8b5ed7dffa1f ("intel_iommu: add support for split irqchip") Reported-by: Christopher Goldsworthy Signed-off-by: Jan Kiszka --- target/i386/kvm.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/target/i386/kvm.c b/target/i386/kvm.c index 9313602d3d..1fe3a73a10 100644 --- a/target/i386/kvm.c +++ b/target/i386/kvm.c @@ -3677,6 +3677,10 @@ int kvm_arch_fixup_msi_route(struct kvm_irq_routing_entry *route, MSIMessage src, dst; X86IOMMUClass *class = X86_IOMMU_GET_CLASS(iommu); +if (!class->int_remap) { +return 0; +} + src.address = route->u.msi.address_hi; src.address <<= VTD_MSI_ADDR_HI_SHIFT; src.address |= route->u.msi.address_lo; -- 2.16.4
Re: [Qemu-devel] [PATCH v3 20/20] arm/virt: Add support for GICv2 virtualization extensions
On 2018-07-05 10:00, Jan Kiszka wrote: > On 2018-07-05 08:51, Jan Kiszka wrote: >> On 2018-06-29 15:29, Luc Michel wrote: >>> Add support for GICv2 virtualization extensions by mapping the necessary >>> I/O regions and connecting the maintenance IRQ lines. >>> >>> Declare those additions in the device tree and in the ACPI tables. >>> >>> Signed-off-by: Luc Michel >>> --- >>> hw/arm/virt-acpi-build.c | 4 >>> hw/arm/virt.c| 50 +--- >>> include/hw/arm/virt.h| 3 +++ >>> 3 files changed, 49 insertions(+), 8 deletions(-) >>> >>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c >>> index 6ea47e2588..3b74bf0372 100644 >>> --- a/hw/arm/virt-acpi-build.c >>> +++ b/hw/arm/virt-acpi-build.c >>> @@ -659,6 +659,8 @@ build_madt(GArray *table_data, BIOSLinker *linker, >>> VirtMachineState *vms) >>> gicc->length = sizeof(*gicc); >>> if (vms->gic_version == 2) { >>> gicc->base_address = cpu_to_le64(memmap[VIRT_GIC_CPU].base); >>> +gicc->gich_base_address = >>> cpu_to_le64(memmap[VIRT_GIC_HYP].base); >>> +gicc->gicv_base_address = >>> cpu_to_le64(memmap[VIRT_GIC_VCPU].base); >>> } >>> gicc->cpu_interface_number = cpu_to_le32(i); >>> gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity); >>> @@ -670,6 +672,8 @@ build_madt(GArray *table_data, BIOSLinker *linker, >>> VirtMachineState *vms) >>> } >>> if (vms->virt && vms->gic_version == 3) { >>> gicc->vgic_interrupt = cpu_to_le32(PPI(ARCH_GICV3_MAINT_IRQ)); >>> +} else if (vms->virt && vms->gic_version == 2) { >>> +gicc->vgic_interrupt = cpu_to_le32(PPI(ARCH_GICV2_MAINT_IRQ)); >>> } >>> } >>> >>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c >>> index 742f68afca..e45b9de3be 100644 >>> --- a/hw/arm/virt.c >>> +++ b/hw/arm/virt.c >>> @@ -131,6 +131,8 @@ static const MemMapEntry a15memmap[] = { >>> [VIRT_GIC_DIST] = { 0x0800, 0x0001 }, >>> [VIRT_GIC_CPU] ={ 0x0801, 0x0001 }, >>> [VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, >>> +[VIRT_GIC_HYP] ={ 0x0803, 0x1000 }, >>> +[VIRT_GIC_VCPU] = { 0x0804, 0x1000 }, >>> /* The space in between here 
is reserved for GICv3 CPU/vCPU/HYP */ >>> [VIRT_GIC_ITS] ={ 0x0808, 0x0002 }, >>> /* This redistributor space allows up to 2*64kB*123 CPUs */ >>> @@ -438,11 +440,26 @@ static void fdt_add_gic_node(VirtMachineState *vms) >>> /* 'cortex-a15-gic' means 'GIC v2' */ >>> qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible", >>> "arm,cortex-a15-gic"); >>> -qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", >>> - 2, vms->memmap[VIRT_GIC_DIST].base, >>> - 2, vms->memmap[VIRT_GIC_DIST].size, >>> - 2, vms->memmap[VIRT_GIC_CPU].base, >>> - 2, vms->memmap[VIRT_GIC_CPU].size); >>> +if (!vms->virt) { >>> +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", >>> + 2, >>> vms->memmap[VIRT_GIC_DIST].base, >>> + 2, >>> vms->memmap[VIRT_GIC_DIST].size, >>> + 2, vms->memmap[VIRT_GIC_CPU].base, >>> + 2, >>> vms->memmap[VIRT_GIC_CPU].size); >>> +} else { >>> +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", >>> + 2, >>> vms->memmap[VIRT_GIC_DIST].base, >>> + 2, >>> vms->memmap[VIRT_GIC_DIST].size, >>> + 2, vms->memmap[VIRT_GIC_CPU].base, >>> + 2, vms->memmap[VIRT_GIC_CPU].size, >>> +
Re: [Qemu-devel] [PATCH v3 20/20] arm/virt: Add support for GICv2 virtualization extensions
On 2018-07-05 08:51, Jan Kiszka wrote: > On 2018-06-29 15:29, Luc Michel wrote: >> Add support for GICv2 virtualization extensions by mapping the necessary >> I/O regions and connecting the maintenance IRQ lines. >> >> Declare those additions in the device tree and in the ACPI tables. >> >> Signed-off-by: Luc Michel >> --- >> hw/arm/virt-acpi-build.c | 4 >> hw/arm/virt.c| 50 +--- >> include/hw/arm/virt.h| 3 +++ >> 3 files changed, 49 insertions(+), 8 deletions(-) >> >> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c >> index 6ea47e2588..3b74bf0372 100644 >> --- a/hw/arm/virt-acpi-build.c >> +++ b/hw/arm/virt-acpi-build.c >> @@ -659,6 +659,8 @@ build_madt(GArray *table_data, BIOSLinker *linker, >> VirtMachineState *vms) >> gicc->length = sizeof(*gicc); >> if (vms->gic_version == 2) { >> gicc->base_address = cpu_to_le64(memmap[VIRT_GIC_CPU].base); >> +gicc->gich_base_address = >> cpu_to_le64(memmap[VIRT_GIC_HYP].base); >> +gicc->gicv_base_address = >> cpu_to_le64(memmap[VIRT_GIC_VCPU].base); >> } >> gicc->cpu_interface_number = cpu_to_le32(i); >> gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity); >> @@ -670,6 +672,8 @@ build_madt(GArray *table_data, BIOSLinker *linker, >> VirtMachineState *vms) >> } >> if (vms->virt && vms->gic_version == 3) { >> gicc->vgic_interrupt = cpu_to_le32(PPI(ARCH_GICV3_MAINT_IRQ)); >> +} else if (vms->virt && vms->gic_version == 2) { >> +gicc->vgic_interrupt = cpu_to_le32(PPI(ARCH_GICV2_MAINT_IRQ)); >> } >> } >> >> diff --git a/hw/arm/virt.c b/hw/arm/virt.c >> index 742f68afca..e45b9de3be 100644 >> --- a/hw/arm/virt.c >> +++ b/hw/arm/virt.c >> @@ -131,6 +131,8 @@ static const MemMapEntry a15memmap[] = { >> [VIRT_GIC_DIST] = { 0x0800, 0x0001 }, >> [VIRT_GIC_CPU] ={ 0x0801, 0x0001 }, >> [VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, >> +[VIRT_GIC_HYP] ={ 0x0803, 0x1000 }, >> +[VIRT_GIC_VCPU] = { 0x0804, 0x1000 }, >> /* The space in between here is reserved for GICv3 CPU/vCPU/HYP */ >> [VIRT_GIC_ITS] ={ 0x0808, 0x0002 }, >> /* This 
redistributor space allows up to 2*64kB*123 CPUs */ >> @@ -438,11 +440,26 @@ static void fdt_add_gic_node(VirtMachineState *vms) >> /* 'cortex-a15-gic' means 'GIC v2' */ >> qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible", >> "arm,cortex-a15-gic"); >> -qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", >> - 2, vms->memmap[VIRT_GIC_DIST].base, >> - 2, vms->memmap[VIRT_GIC_DIST].size, >> - 2, vms->memmap[VIRT_GIC_CPU].base, >> - 2, vms->memmap[VIRT_GIC_CPU].size); >> +if (!vms->virt) { >> +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", >> + 2, vms->memmap[VIRT_GIC_DIST].base, >> + 2, vms->memmap[VIRT_GIC_DIST].size, >> + 2, vms->memmap[VIRT_GIC_CPU].base, >> + 2, vms->memmap[VIRT_GIC_CPU].size); >> +} else { >> +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", >> + 2, vms->memmap[VIRT_GIC_DIST].base, >> + 2, vms->memmap[VIRT_GIC_DIST].size, >> + 2, vms->memmap[VIRT_GIC_CPU].base, >> + 2, vms->memmap[VIRT_GIC_CPU].size, >> + 2, vms->memmap[VIRT_GIC_HYP].base, >> + 2, vms->memmap[VIRT_GIC_HYP].size, >> + 2, vms->memmap[VIRT_GIC_VCPU].base, >> + 2, >> vms->memmap[VIRT_GIC_VCPU].size); >> +qemu_fdt_setprop_cells(vms->fdt, "/intc", "
Re: [Qemu-devel] [PATCH v3 20/20] arm/virt: Add support for GICv2 virtualization extensions
On 2018-06-29 15:29, Luc Michel wrote: > Add support for GICv2 virtualization extensions by mapping the necessary > I/O regions and connecting the maintenance IRQ lines. > > Declare those additions in the device tree and in the ACPI tables. > > Signed-off-by: Luc Michel > --- > hw/arm/virt-acpi-build.c | 4 > hw/arm/virt.c| 50 +--- > include/hw/arm/virt.h| 3 +++ > 3 files changed, 49 insertions(+), 8 deletions(-) > > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c > index 6ea47e2588..3b74bf0372 100644 > --- a/hw/arm/virt-acpi-build.c > +++ b/hw/arm/virt-acpi-build.c > @@ -659,6 +659,8 @@ build_madt(GArray *table_data, BIOSLinker *linker, > VirtMachineState *vms) > gicc->length = sizeof(*gicc); > if (vms->gic_version == 2) { > gicc->base_address = cpu_to_le64(memmap[VIRT_GIC_CPU].base); > +gicc->gich_base_address = cpu_to_le64(memmap[VIRT_GIC_HYP].base); > +gicc->gicv_base_address = > cpu_to_le64(memmap[VIRT_GIC_VCPU].base); > } > gicc->cpu_interface_number = cpu_to_le32(i); > gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity); > @@ -670,6 +672,8 @@ build_madt(GArray *table_data, BIOSLinker *linker, > VirtMachineState *vms) > } > if (vms->virt && vms->gic_version == 3) { > gicc->vgic_interrupt = cpu_to_le32(PPI(ARCH_GICV3_MAINT_IRQ)); > +} else if (vms->virt && vms->gic_version == 2) { > +gicc->vgic_interrupt = cpu_to_le32(PPI(ARCH_GICV2_MAINT_IRQ)); > } > } > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c > index 742f68afca..e45b9de3be 100644 > --- a/hw/arm/virt.c > +++ b/hw/arm/virt.c > @@ -131,6 +131,8 @@ static const MemMapEntry a15memmap[] = { > [VIRT_GIC_DIST] = { 0x0800, 0x0001 }, > [VIRT_GIC_CPU] ={ 0x0801, 0x0001 }, > [VIRT_GIC_V2M] ={ 0x0802, 0x1000 }, > +[VIRT_GIC_HYP] ={ 0x0803, 0x1000 }, > +[VIRT_GIC_VCPU] = { 0x0804, 0x1000 }, > /* The space in between here is reserved for GICv3 CPU/vCPU/HYP */ > [VIRT_GIC_ITS] ={ 0x0808, 0x0002 }, > /* This redistributor space allows up to 2*64kB*123 CPUs */ > @@ -438,11 +440,26 @@ static void 
fdt_add_gic_node(VirtMachineState *vms) > /* 'cortex-a15-gic' means 'GIC v2' */ > qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible", > "arm,cortex-a15-gic"); > -qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", > - 2, vms->memmap[VIRT_GIC_DIST].base, > - 2, vms->memmap[VIRT_GIC_DIST].size, > - 2, vms->memmap[VIRT_GIC_CPU].base, > - 2, vms->memmap[VIRT_GIC_CPU].size); > +if (!vms->virt) { > +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", > + 2, vms->memmap[VIRT_GIC_DIST].base, > + 2, vms->memmap[VIRT_GIC_DIST].size, > + 2, vms->memmap[VIRT_GIC_CPU].base, > + 2, vms->memmap[VIRT_GIC_CPU].size); > +} else { > +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg", > + 2, vms->memmap[VIRT_GIC_DIST].base, > + 2, vms->memmap[VIRT_GIC_DIST].size, > + 2, vms->memmap[VIRT_GIC_CPU].base, > + 2, vms->memmap[VIRT_GIC_CPU].size, > + 2, vms->memmap[VIRT_GIC_HYP].base, > + 2, vms->memmap[VIRT_GIC_HYP].size, > + 2, vms->memmap[VIRT_GIC_VCPU].base, > + 2, vms->memmap[VIRT_GIC_VCPU].size); > +qemu_fdt_setprop_cells(vms->fdt, "/intc", "interrupts", > + GIC_FDT_IRQ_TYPE_PPI, > ARCH_GICV2_MAINT_IRQ, > + GIC_FDT_IRQ_FLAGS_LEVEL_HI); > +} > } > > qemu_fdt_setprop_cell(vms->fdt, "/intc", "phandle", vms->gic_phandle); > @@ -563,6 +580,11 @@ static void create_gic(VirtMachineState *vms, qemu_irq > *pic) > qdev_prop_set_uint32(gicdev, "redist-region-count[1]", > MIN(smp_cpus - redist0_count, redist1_capacity)); > } > +} else { > +if (!kvm_irqchip_in_kernel()) { > +qdev_prop_set_bit(gicdev, "has-virtualization-extensions", > + vms->virt); > +} > } > qdev_init_nofail(gicdev); > gicbusdev = SYS_BUS_DEVICE(gicdev); > @@ -574,6 +596,10 @@ static void
Re: [Qemu-devel] [PATCH v2] e1000e: Prevent MSI/MSI-X storms
On 2018-07-02 05:40, Jason Wang wrote: > > > On 2018年06月30日 14:13, Jan Kiszka wrote: >> On 2018-04-05 19:41, Jan Kiszka wrote: >>> From: Jan Kiszka >>> >>> Only signal MSI/MSI-X events on rising edges. So far we re-triggered the >>> interrupt sources even if the guest did no consumed the pending one, >>> easily causing interrupt storms. >>> >>> Issue was observable with Linux 4.16 e1000e driver when MSI-X was used. >>> Vector 2 was causing interrupt storms after the driver activated the >>> device. >>> >>> Signed-off-by: Jan Kiszka >>> --- >>> >>> Changes in v2: >>> - also update msi_causes_pending after EIAC changes (required because >>> there is no e1000e_update_interrupt_state after that >>> >>> hw/net/e1000e_core.c | 11 +++ >>> hw/net/e1000e_core.h | 2 ++ >>> 2 files changed, 13 insertions(+) >>> >>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c >>> index c93c4661ed..d6ddd59986 100644 >>> --- a/hw/net/e1000e_core.c >>> +++ b/hw/net/e1000e_core.c >>> @@ -2027,6 +2027,7 @@ e1000e_msix_notify_one(E1000ECore *core, >>> uint32_t cause, uint32_t int_cfg) >>> } >>> core->mac[ICR] &= ~effective_eiac; >>> + core->msi_causes_pending &= ~effective_eiac; >>> if (!(core->mac[CTRL_EXT] & E1000_CTRL_EXT_IAME)) { >>> core->mac[IMS] &= ~effective_eiac; >>> @@ -2123,6 +2124,13 @@ e1000e_send_msi(E1000ECore *core, bool msix) >>> { >>> uint32_t causes = core->mac[ICR] & core->mac[IMS] & >>> ~E1000_ICR_ASSERTED; >>> + core->msi_causes_pending &= causes; >>> + causes ^= core->msi_causes_pending; >>> + if (causes == 0) { >>> + return; >>> + } >>> + core->msi_causes_pending |= causes; >>> + >>> if (msix) { >>> e1000e_msix_notify(core, causes); >>> } else { >>> @@ -2160,6 +2168,9 @@ e1000e_update_interrupt_state(E1000ECore *core) >>> core->mac[ICS] = core->mac[ICR]; >>> interrupts_pending = (core->mac[IMS] & core->mac[ICR]) ? 
true >>> : false; > +if (!interrupts_pending) { > +core->msi_causes_pending = 0; > +} > trace_e1000e_irq_pending_interrupts(core->mac[ICR] & >>> core->mac[IMS], >>> core->mac[ICR], >>> core->mac[IMS]); >>> diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h >>> index 7d8ff41890..63a15510cc 100644 >>> --- a/hw/net/e1000e_core.h >>> +++ b/hw/net/e1000e_core.h >>> @@ -109,6 +109,8 @@ struct E1000Core { >>> NICState *owner_nic; >>> PCIDevice *owner; >>> void (*owner_start_recv)(PCIDevice *d); >>> + >>> + uint32_t msi_causes_pending; >>> }; >>> void >>> >> Ping again, this one is still open. >> >> Jan > > Sorry for the late. > > Just one thing to confirm, I think the reason that we don't need to > migrate msi_cause_pending is that it can only lead at most one more > signal of MSI on destination? Right, a cleared msi_causes_pending on the destination may lead to one spurious interrupt injection per vector. That should be tolerable. Jan
Re: [Qemu-devel] [PATCH v2] e1000e: Prevent MSI/MSI-X storms
On 2018-04-05 19:41, Jan Kiszka wrote: > From: Jan Kiszka > > Only signal MSI/MSI-X events on rising edges. So far we re-triggered the > interrupt sources even if the guest did not consume the pending one, > easily causing interrupt storms. > > Issue was observable with Linux 4.16 e1000e driver when MSI-X was used. > Vector 2 was causing interrupt storms after the driver activated the > device. > > Signed-off-by: Jan Kiszka > --- > > Changes in v2: > - also update msi_causes_pending after EIAC changes (required because >there is no e1000e_update_interrupt_state after that) > > hw/net/e1000e_core.c | 11 +++ > hw/net/e1000e_core.h | 2 ++ > 2 files changed, 13 insertions(+) > > diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c > index c93c4661ed..d6ddd59986 100644 > --- a/hw/net/e1000e_core.c > +++ b/hw/net/e1000e_core.c > @@ -2027,6 +2027,7 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t > cause, uint32_t int_cfg) > } > > core->mac[ICR] &= ~effective_eiac; > +core->msi_causes_pending &= ~effective_eiac; > > if (!(core->mac[CTRL_EXT] & E1000_CTRL_EXT_IAME)) { > core->mac[IMS] &= ~effective_eiac; > @@ -2123,6 +2124,13 @@ e1000e_send_msi(E1000ECore *core, bool msix) > { > uint32_t causes = core->mac[ICR] & core->mac[IMS] & ~E1000_ICR_ASSERTED; > > +core->msi_causes_pending &= causes; > +causes ^= core->msi_causes_pending; > +if (causes == 0) { > +return; > +} > +core->msi_causes_pending |= causes; > + > if (msix) { > e1000e_msix_notify(core, causes); > } else { > @@ -2160,6 +2168,9 @@ e1000e_update_interrupt_state(E1000ECore *core) > core->mac[ICS] = core->mac[ICR]; > > interrupts_pending = (core->mac[IMS] & core->mac[ICR]) ?
true : false; > +if (!interrupts_pending) { > +core->msi_causes_pending = 0; > +} > > trace_e1000e_irq_pending_interrupts(core->mac[ICR] & core->mac[IMS], > core->mac[ICR], core->mac[IMS]); > diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h > index 7d8ff41890..63a15510cc 100644 > --- a/hw/net/e1000e_core.h > +++ b/hw/net/e1000e_core.h > @@ -109,6 +109,8 @@ struct E1000Core { > NICState *owner_nic; > PCIDevice *owner; > void (*owner_start_recv)(PCIDevice *d); > + > +uint32_t msi_causes_pending; > }; > > void > Ping again, this one is still open. Jan
[Qemu-devel] [PATCH v2] target-i386: Add NPT support
From: Jan Kiszka

This implements NPT support for SVM by hooking into x86_cpu_handle_mmu_fault
where it reads the stage-1 page table. Whether we need to perform this 2nd
stage translation, and how, is decided during vmrun and stored in hflags2,
along with nested_cr3 and nested_pg_mode.

As get_hphys performs a direct cpu_vmexit in case of NPT faults, we need
retaddr in that function. To avoid changing the signature of
cpu_handle_mmu_fault, this passes the value from tlb_fill to get_hphys via
the CPU state.

This was tested successfully via the Jailhouse hypervisor.

Signed-off-by: Jan Kiszka
---

Changes in v2:
 - use hflags2 instead of hflags
 - add conditional vmstate subsection

 target/i386/cpu.c         |   2 +-
 target/i386/cpu.h         |   6 ++
 target/i386/excp_helper.c | 216 +-
 target/i386/machine.c     |  21 +
 target/i386/mem_helper.c  |   6 +-
 target/i386/svm.h         |  14 +++
 target/i386/svm_helper.c  |  22 +
 7 files changed, 281 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1e6a7d0a75..6e1f180249 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -751,7 +751,7 @@ static void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
 #define TCG_EXT3_FEATURES (CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | \
           CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A)
 #define TCG_EXT4_FEATURES 0
-#define TCG_SVM_FEATURES 0
+#define TCG_SVM_FEATURES CPUID_SVM_NPT
 #define TCG_KVM_FEATURES 0
 #define TCG_7_0_EBX_FEATURES (CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_SMAP | \
     CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ADX | \
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 8eaefeee3e..7f33755bf5 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -211,6 +211,7 @@ typedef enum X86Seg {
 #define HF2_VINTR_SHIFT          3 /* value of V_INTR_MASKING bit */
 #define HF2_SMM_INSIDE_NMI_SHIFT 4 /* CPU serving SMI nested inside NMI */
 #define HF2_MPX_PR_SHIFT         5 /* BNDCFGx.BNDPRESERVE */
+#define HF2_NPT_SHIFT            6 /* Nested Paging enabled */
 
 #define HF2_GIF_MASK            (1 << HF2_GIF_SHIFT)
 #define HF2_HIF_MASK            (1 << HF2_HIF_SHIFT)
@@ -218,6 +219,7 @@ typedef enum X86Seg {
 #define HF2_VINTR_MASK          (1 << HF2_VINTR_SHIFT)
 #define HF2_SMM_INSIDE_NMI_MASK (1 << HF2_SMM_INSIDE_NMI_SHIFT)
 #define HF2_MPX_PR_MASK         (1 << HF2_MPX_PR_SHIFT)
+#define HF2_NPT_MASK            (1 << HF2_NPT_SHIFT)
 
 #define CR0_PE_SHIFT 0
 #define CR0_MP_SHIFT 1
@@ -1265,12 +1267,16 @@ typedef struct CPUX86State {
     uint16_t intercept_dr_read;
     uint16_t intercept_dr_write;
     uint32_t intercept_exceptions;
+    uint64_t nested_cr3;
+    uint32_t nested_pg_mode;
     uint8_t v_tpr;
 
     /* KVM states, automatically cleared on reset */
     uint8_t nmi_injected;
     uint8_t nmi_pending;
 
+    uintptr_t retaddr;
+
     /* Fields up to this point are cleared by a CPU reset */
     struct {} end_reset_fields;
diff --git a/target/i386/excp_helper.c b/target/i386/excp_helper.c
index cb4d1b7d33..37a33d5ae0 100644
--- a/target/i386/excp_helper.c
+++ b/target/i386/excp_helper.c
@@ -157,6 +157,209 @@ int x86_cpu_handle_mmu_fault(CPUState *cs, vaddr addr, int size,
 
 #else
 
+static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType access_type,
+                        int *prot)
+{
+    CPUX86State *env = &X86_CPU(cs)->env;
+    uint64_t rsvd_mask = PG_HI_RSVD_MASK;
+    uint64_t ptep, pte;
+    uint64_t exit_info_1 = 0;
+    target_ulong pde_addr, pte_addr;
+    uint32_t page_offset;
+    int page_size;
+
+    if (likely(!(env->hflags2 & HF2_NPT_MASK))) {
+        return gphys;
+    }
+
+    if (!(env->nested_pg_mode & SVM_NPT_NXE)) {
+        rsvd_mask |= PG_NX_MASK;
+    }
+
+    if (env->nested_pg_mode & SVM_NPT_PAE) {
+        uint64_t pde, pdpe;
+        target_ulong pdpe_addr;
+
+#ifdef TARGET_X86_64
+        if (env->nested_pg_mode & SVM_NPT_LMA) {
+            uint64_t pml5e;
+            uint64_t pml4e_addr, pml4e;
+
+            pml5e = env->nested_cr3;
+            ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK;
+
+            pml4e_addr = (pml5e & PG_ADDRESS_MASK) +
+                         (((gphys >> 39) & 0x1ff) << 3);
+            pml4e = x86_ldq_phys(cs, pml4e_addr);
+            if (!(pml4e & PG_PRESENT_MASK)) {
+                goto do_fault;
+            }
+            if (pml4e & (rsvd_mask | PG_PSE_MASK)) {
+                goto do_fault_rsvd;
+            }
+            if (!(pml4e & PG_ACCESSED_MASK)) {
+                pml4e |= PG_ACCESSED_MASK;
+                x86_stl_phys_notdirty(cs, pml4e_addr, pml4e);
+            }
+            ptep &= pml4e ^ PG_NX_MASK;
+            pdpe_addr = (pml4e & PG_ADDRESS_MASK) +
+                        (((gphys >> 30) & 0x1ff) << 3);
Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support
On 2018-06-30 08:05, Paolo Bonzini wrote:
> On 30/06/2018 07:25, Jan Kiszka wrote:
>> On 2018-06-27 14:14, Paolo Bonzini wrote:
>>> On 03/04/2018 17:36, Jan Kiszka wrote:
>>>>
>>>> +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType access_type,
>>>> +                        int *prot)
>>>> +{
>>>> +    CPUX86State *env = &X86_CPU(cs)->env;
>>>> +    uint64_t rsvd_mask = PG_HI_RSVD_MASK;
>>>> +    uint64_t ptep, pte;
>>>> +    uint64_t exit_info_1 = 0;
>>>> +    target_ulong pde_addr, pte_addr;
>>>> +    uint32_t page_offset;
>>>> +    int page_size;
>>>> +
>>>> +    if (likely(!(env->hflags & HF_NPT_MASK))) {
>>>> +        return gphys;
>>>> +    }
>>>
>>> hflags are a somewhat limited resource. Can this go in hflags2?
>>
>> Will have a look - I don't see why not. Or is there any special
>> semantic difference between both fields?
>
> Yes, hflags become flags of the translation block, while hflags2 are
> just random processor state. If translate.c uses it you must use
> hflags, but here hflags2 should be safe.

Indeed. We only change it at vmentry/exit, and that is already a
translation block barrier. v2 is on the way.

Jan
Re: [Qemu-devel] [PATCH v2 4/4] target-i386: Add NPT support
On 2018-06-27 14:14, Paolo Bonzini wrote:
> On 03/04/2018 17:36, Jan Kiszka wrote:
>>
>> +static hwaddr get_hphys(CPUState *cs, hwaddr gphys, MMUAccessType access_type,
>> +                        int *prot)
>> +{
>> +    CPUX86State *env = &X86_CPU(cs)->env;
>> +    uint64_t rsvd_mask = PG_HI_RSVD_MASK;
>> +    uint64_t ptep, pte;
>> +    uint64_t exit_info_1 = 0;
>> +    target_ulong pde_addr, pte_addr;
>> +    uint32_t page_offset;
>> +    int page_size;
>> +
>> +    if (likely(!(env->hflags & HF_NPT_MASK))) {
>> +        return gphys;
>> +    }
>
> hflags are a somewhat limited resource. Can this go in hflags2?

Will have a look - I don't see why not. Or is there any special
semantic difference between both fields?

>>
>> +
>> +    env->nested_pg_mode = 0;
>> +    if (env->cr[4] & CR4_PAE_MASK) {
>> +        env->nested_pg_mode |= SVM_NPT_PAE;
>> +    }
>> +    if (env->hflags & HF_LMA_MASK) {
>> +        env->nested_pg_mode |= SVM_NPT_LMA;
>> +    }
>> +    if (env->efer & MSR_EFER_NXE) {
>> +        env->nested_pg_mode |= SVM_NPT_NXE;
>> +    }
>> +}
>> +
>
> This needs to be migrated. You can put it in a subsection, conditional
> on hflags & HF_SVMI_MASK.

OK.

> Also, do you need to flush the TLB unconditionally, even if CR0.PG is zero?

Cannot follow you here yet. What flush are you referring to? Also, CR0.PG
would not reflect if NPT is on, which now also contributes to our TLB.

> Otherwise looks good. I have queued patches 1-3, but hopefully this one
> can go in the next release too. Sorry for the delayed review.

No problem.

Thanks,
Jan
Re: [Qemu-devel] [PATCH v2 0/7] arm_gic: add virtualization extensions support
On 2018-06-26 12:22, Jan Kiszka wrote: > On 2018-06-26 11:24, Luc Michel wrote: >> >> >> On 06/25/2018 06:55 AM, Jan Kiszka wrote: >>> On 2018-06-19 11:31, luc.mic...@greensocs.com wrote: >>>> From: Luc MICHEL >>>> >>>> This patch series add support for the virtualization extensions in the >>>> ARM GICv2 interrupt controller. >>>> >>>> The first two commits do some refactoring to prepare for the >>>> implementation. Commits 3 and 4 are the actual implementation. The last >>>> commit updates the ZynqMP implementation to support virtualization. >>> >>> Is it enabled for the virt machine as well? Seems we are missing some >>> mapping of GICV and GICH there. >>> >>> And what is the recommended way to get the ZynqMP model running? I've a >>> kernel that works on a ZCU102, but just passing it via -kernel to the >>> QEMU model seems insufficient. >> Hum, I think it should be enough. This is the way I run my tests with Xen: >> qemu-system-aarch64 -M xlnx-zcu102,secure=on,virtualization=on \ >> -m 1024 \ >> -kernel xen-kernel -nographic -smp 1 >> > > Just debugged this further a bit, and it seems my kernel is not getting > any device tree from qemu: > > (gdb) bt > #0 cpu_relax () at ../arch/arm64/include/asm/processor.h:203 > #1 setup_machine_fdt (dt_phys=256) at ../arch/arm64/kernel/setup.c:194 > #2 setup_arch (cmdline_p=) at ../arch/arm64/kernel/setup.c:258 > #3 0xff8008e30bc4 in start_kernel () at ../init/main.c:552 > #4 0x in ?? () > Backtrace stopped: previous frame identical to this frame (corrupt stack?) 
> (gdb) lx-dmesg
> [0.00] Booting Linux on physical CPU 0x00 [0x410fd034]
> [0.00] Linux version 4.17.0-00033-g67c2a9a37d59 (jan@md1f2u6c) (gcc
> version 7.2.1 20171011 (Linaro GCC 7.2-2017.11)) #17 SMP Tue Jun 26 12:16:56
> CEST 2018
> (gdb) l setup_machine_fdt
> 178                 pr_warn("Large number of MPIDR hash buckets detected\n");
> 179 }
> 180
> 181 static void __init setup_machine_fdt(phys_addr_t dt_phys)
> 182 {
> 183         void *dt_virt = fixmap_remap_fdt(dt_phys);
> 184         const char *name;
> 185
> 186         if (!dt_virt || !early_init_dt_scan(dt_virt)) {
> 187                 pr_crit("\n"
> 188                         "Error: invalid device tree blob at physical address %pa (virtual address 0x%p)\n"
> 189                         "The dtb must be 8-byte aligned and must not exceed 2 MB in size\n"
> 190                         "\nPlease check your bootloader.",
> 191                         &dt_phys, dt_virt);
> 192
> 193                 while (true)
> 194                         cpu_relax();
> [...]
> (gdb) p dt_virt
> $1 = (void *) 0x0
>
> Jan

OK, feeding in -dtb arch/arm64/boot/dts/xilinx/zynqmp-zcu102-revB.dtb and
setting the console to ttyPS0 now gives some output. Looks like this can
take me closer towards a testable setup.

Jan
Re: [Qemu-devel] [PATCH v2 0/7] arm_gic: add virtualization extensions support
On 2018-06-26 11:24, Luc Michel wrote: > > > On 06/25/2018 06:55 AM, Jan Kiszka wrote: >> On 2018-06-19 11:31, luc.mic...@greensocs.com wrote: >>> From: Luc MICHEL >>> >>> This patch series add support for the virtualization extensions in the >>> ARM GICv2 interrupt controller. >>> >>> The first two commits do some refactoring to prepare for the >>> implementation. Commits 3 and 4 are the actual implementation. The last >>> commit updates the ZynqMP implementation to support virtualization. >> >> Is it enabled for the virt machine as well? Seems we are missing some >> mapping of GICV and GICH there. >> >> And what is the recommended way to get the ZynqMP model running? I've a >> kernel that works on a ZCU102, but just passing it via -kernel to the >> QEMU model seems insufficient. > Hum, I think it should be enough. This is the way I run my tests with Xen: > qemu-system-aarch64 -M xlnx-zcu102,secure=on,virtualization=on \ > -m 1024 \ > -kernel xen-kernel -nographic -smp 1 > Just debugged this further a bit, and it seems my kernel is not getting any device tree from qemu: (gdb) bt #0 cpu_relax () at ../arch/arm64/include/asm/processor.h:203 #1 setup_machine_fdt (dt_phys=256) at ../arch/arm64/kernel/setup.c:194 #2 setup_arch (cmdline_p=) at ../arch/arm64/kernel/setup.c:258 #3 0xff8008e30bc4 in start_kernel () at ../init/main.c:552 #4 0x in ?? () Backtrace stopped: previous frame identical to this frame (corrupt stack?) 
(gdb) lx-dmesg [0.00] Booting Linux on physical CPU 0x00 [0x410fd034] [0.00] Linux version 4.17.0-00033-g67c2a9a37d59 (jan@md1f2u6c) (gcc version 7.2.1 20171011 (Linaro GCC 7.2-2017.11)) #17 SMP Tue Jun 26 12:16:56 CEST 2018 (gdb) l setup_machine_fdt 178 pr_warn("Large number of MPIDR hash buckets detected\n"); 179 } 180 181 static void __init setup_machine_fdt(phys_addr_t dt_phys) 182 { 183 void *dt_virt = fixmap_remap_fdt(dt_phys); 184 const char *name; 185 186 if (!dt_virt || !early_init_dt_scan(dt_virt)) { 187 pr_crit("\n" 188 "Error: invalid device tree blob at physical address %pa (virtual address 0x%p)\n" 189 "The dtb must be 8-byte aligned and must not exceed 2 MB in size\n" 190 "\nPlease check your bootloader.", 191 _phys, dt_virt); 192 193 while (true) 194 cpu_relax(); [...] (gdb) p dt_virt $1 = (void *) 0x0 Jan
Re: [Qemu-devel] [PATCH v2 0/7] arm_gic: add virtualization extensions support
On 2018-06-19 11:31, luc.mic...@greensocs.com wrote:
> From: Luc MICHEL
>
> This patch series add support for the virtualization extensions in the
> ARM GICv2 interrupt controller.
>
> The first two commits do some refactoring to prepare for the
> implementation. Commits 3 and 4 are the actual implementation. The last
> commit updates the ZynqMP implementation to support virtualization.

Is it enabled for the virt machine as well? Seems we are missing some
mapping of GICV and GICH there.

And what is the recommended way to get the ZynqMP model running? I've a
kernel that works on a ZCU102, but just passing it via -kernel to the
QEMU model seems insufficient.

Jan
Re: [Qemu-devel] [PATCH] e1000e: Do not auto-clear ICR bits which aren't set in EIAC
On 2018-06-15 04:07, Jason Wang wrote: > > > On 2018年06月13日 10:29, Jason Wang wrote: >> >> >> On 2018年06月13日 10:26, Philippe Mathieu-Daudé wrote: >>> Hi Jason, >>> >>> On 06/12/2018 11:18 PM, Jason Wang wrote: >>>> On 2018年06月13日 03:00, Philippe Mathieu-Daudé wrote: >>>>> Cc'ing Jason who is also listed as co-maintainer: >>>>> >>>>> ./scripts/get_maintainer.pl -f hw/net/e1000e_core.c >>>>> Dmitry Fleytman (maintainer:e1000e) >>>>> Jason Wang (odd fixer:Network devices) >>>>> >>>>> On 06/12/2018 03:43 PM, Jan Kiszka wrote: >>>>>> On 2018-06-12 20:38, Philippe Mathieu-Daudé wrote: >>>>>>> On 06/12/2018 03:30 PM, Jan Kiszka wrote: >>>>>>>> On 2018-06-12 20:11, Philippe Mathieu-Daudé wrote: >>>>>>>>> Hi Jan, >>>>>>>>> >>>>>>>>> On 06/12/2018 02:22 PM, Jan Kiszka wrote: >>>>>>>>>> On 2018-05-22 09:00, Jan Kiszka wrote: >>>>>>>>>>> On 2018-04-16 17:29, Peter Maydell wrote: >>>>>>>>>>>> On 16 April 2018 at 16:25, Jan Kiszka >>>>>>>>>>>> wrote: >>>>>>>>>>>>> On 2018-04-01 23:17, Jan Kiszka wrote: >>>>>>>>>>>>>> From: Jan Kiszka >>>>>>>>>>>>>> >>>>>>>>>>>>>> The spec does not justify clearing of any >>>>>>>>>>>>>> E1000_ICR_OTHER_CAUSES when >>>>>>>>>>>>>> E1000_ICR_OTHER is set in EIAC. In fact, removing this code >>>>>>>>>>>>>> fixes the >>>>>>>>>>>>>> issue the Linux driver runs into since 4aea7a5c5e94 ("e1000e: >>>>>>>>>>>>>> Avoid >>>>>>>>>>>>>> receiver overrun interrupt bursts") and was worked around by >>>>>>>>>>>>>> 745d0bd3af99 ("e1000e: Remove Other from EIAC"). >>>>>>>>>>>>>> >>>>>>>>>>>>>> Signed-off-by: Jan Kiszka >>>>>>>>>>>>>> --- >>>>>>>>>>>>>> >>>>>>>>>>>>>> This resolves the issue I reported on February 18 ("e1000e: >>>>>>>>>>>>>> MSI-X >>>>>>>>>>>>>> problem with recent Linux drivers"). 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> hw/net/e1000e_core.c | 4 >>>>>>>>>>>>>> 1 file changed, 4 deletions(-) >>>>>>>>>>>>>> >>>>>>>>>>>>>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c >>>>>>>>>>>>>> index ecf9b1..d38f025c0f 100644 >>>>>>>>>>>>>> --- a/hw/net/e1000e_core.c >>>>>>>>>>>>>> +++ b/hw/net/e1000e_core.c >>>>>>>>>>>>>> @@ -2022,10 +2022,6 @@ e1000e_msix_notify_one(E1000ECore >>>>>>>>>>>>>> *core, uint32_t cause, uint32_t int_cfg) >>>>>>>>>>>>>> >>>>>>>>>>>>>> effective_eiac = core->mac[EIAC] & cause; >>>>>>>>>>>>>> >>>>>>>>>>>>>> - if (effective_eiac == E1000_ICR_OTHER) { >>>>>>>>>>>>>> - effective_eiac |= E1000_ICR_OTHER_CAUSES; >>>>>>>>>>>>>> - } >>>>>>>>>>>>>> - >>>>>>>>>>>>>> core->mac[ICR] &= ~effective_eiac; >>>>>>>>>>>>>> >>>>>>>>>>>>>> if (!(core->mac[CTRL_EXT] & E1000_CTRL_EXT_IAME)) { >>>>>>>>>>>>>> >>>>>>>>>>>>> Ping for this - as well as >>>>>>>>>>>>> https://patchwork.ozlabs.org/patch/895476. >>>>>>>>>>>&g
Re: [Qemu-devel] [PATCH] hw/i386: Fix IVHD entry length for AMD IOMMU
On 2018-06-13 16:26, Michael S. Tsirkin wrote:
> On Tue, May 22, 2018 at 09:06:56AM +0200, Jan Kiszka wrote:
>> On 2018-03-29 14:51, Jan Kiszka wrote:
>>> From: Jan Kiszka
>>>
>>> Counting from the IVHD ID field to the all-devices entry, we have 28
>>> bytes, not 36.
>>>
>>> Signed-off-by: Jan Kiszka
>>> ---
>>>  hw/i386/acpi-build.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>>> index deb440f286..a0cda71411 100644
>>> --- a/hw/i386/acpi-build.c
>>> +++ b/hw/i386/acpi-build.c
>>> @@ -2561,7 +2561,7 @@ build_amd_iommu(GArray *table_data, BIOSLinker *linker)
>>>                               (1UL << 7), /* PPRSup */
>>>                               1);
>>>     /* IVHD length */
>>> -    build_append_int_noprefix(table_data, 0x24, 2);
>>> +    build_append_int_noprefix(table_data, 28, 2);
>>>     /* DeviceID */
>>>     build_append_int_noprefix(table_data, s->devid, 2);
>>>     /* Capability offset */
>>
>> Waiting to be merged.
>>
>> Jan
>
> I'll queue this but you really should Cc maintainers :)

Sorry, wasn't clear to me that you are also maintaining the AMD IOMMU code.

Thanks,
Jan
Re: [Qemu-devel] [PATCH] e1000e: Do not auto-clear ICR bits which aren't set in EIAC
On 2018-06-12 20:38, Philippe Mathieu-Daudé wrote: > On 06/12/2018 03:30 PM, Jan Kiszka wrote: >> On 2018-06-12 20:11, Philippe Mathieu-Daudé wrote: >>> Hi Jan, >>> >>> On 06/12/2018 02:22 PM, Jan Kiszka wrote: >>>> On 2018-05-22 09:00, Jan Kiszka wrote: >>>>> On 2018-04-16 17:29, Peter Maydell wrote: >>>>>> On 16 April 2018 at 16:25, Jan Kiszka wrote: >>>>>>> On 2018-04-01 23:17, Jan Kiszka wrote: >>>>>>>> From: Jan Kiszka >>>>>>>> >>>>>>>> The spec does not justify clearing of any E1000_ICR_OTHER_CAUSES when >>>>>>>> E1000_ICR_OTHER is set in EIAC. In fact, removing this code fixes the >>>>>>>> issue the Linux driver runs into since 4aea7a5c5e94 ("e1000e: Avoid >>>>>>>> receiver overrun interrupt bursts") and was worked around by >>>>>>>> 745d0bd3af99 ("e1000e: Remove Other from EIAC"). >>>>>>>> >>>>>>>> Signed-off-by: Jan Kiszka >>>>>>>> --- >>>>>>>> >>>>>>>> This resolves the issue I reported on February 18 ("e1000e: MSI-X >>>>>>>> problem with recent Linux drivers"). >>>>>>>> >>>>>>>> hw/net/e1000e_core.c | 4 >>>>>>>> 1 file changed, 4 deletions(-) >>>>>>>> >>>>>>>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c >>>>>>>> index ecf9b1..d38f025c0f 100644 >>>>>>>> --- a/hw/net/e1000e_core.c >>>>>>>> +++ b/hw/net/e1000e_core.c >>>>>>>> @@ -2022,10 +2022,6 @@ e1000e_msix_notify_one(E1000ECore *core, >>>>>>>> uint32_t cause, uint32_t int_cfg) >>>>>>>> >>>>>>>> effective_eiac = core->mac[EIAC] & cause; >>>>>>>> >>>>>>>> -if (effective_eiac == E1000_ICR_OTHER) { >>>>>>>> -effective_eiac |= E1000_ICR_OTHER_CAUSES; >>>>>>>> -} >>>>>>>> - >>>>>>>> core->mac[ICR] &= ~effective_eiac; >>>>>>>> >>>>>>>> if (!(core->mac[CTRL_EXT] & E1000_CTRL_EXT_IAME)) { >>>>>>>> >>>>>>> >>>>>>> Ping for this - as well as https://patchwork.ozlabs.org/patch/895476. 
>>>>>>> >>>>>>> Given that q35 uses e1000e by default and many Linux kernel versions no >>>>>>> longer work, this should likely go into upcoming and stable versions >>>>>> >>>>>> I'd rather not put it into 2.12 at this point in the release >>>>>> cycle unless it's a regression from 2.11, I think. >>>>> >>>>> Second ping - nothing hit the repo so far, nor did I receive feedback. >>>>> >>>> >>>> And another ping. For both. >>>> >>>> These days I had to help someone with a broken QEMU setup that failed >>>> installing from network. It turned out that "modprobe e1000e IntMode=0" >>>> was needed to workaround the issues my patches address. >>> >>> What about the IMS register? It is set just after. >>> >>> Looking at b38636b8372, can you test this patch? >>> >>> -- >8 -- >>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c >>> index c93c4661ed..a484b68a5a 100644 >>> --- a/hw/net/e1000e_core.c >>> +++ b/hw/net/e1000e_core.c >>> @@ -2022,13 +2022,13 @@ e1000e_msix_notify_one(E1000ECore *core, >>> uint32_t cause, uint32_t int_cfg) >>> >>> effective_eiac = core->mac[EIAC] & cause; >>> >>> -if (effective_eiac == E1000_ICR_OTHER) { >>> -effective_eiac |= E1000_ICR_OTHER_CAUSES; >>> -} >>> - >>> core->mac[ICR] &= ~effective_eiac; >>> >>> if (!(core->mac[CTRL_EXT] & E1000_CTRL_EXT_IAME)) { >>> +if (effective_eiac == E1000_ICR_OTHER) { >>> +effective_eiac |= E1000_ICR_OTHER_CAUSES; >>> +} >>> + >>> core->mac[IMS] &= ~effective_eiac; >>> } >>> } >>> >> >> Before testing this: What would be the reasoning for this change? > > Not breaking the purpose of b38636b8372 :) I disagree on this expansion of bit 31 ("other causes"). I see no indication in the spec that setting this bit for autoclear has more impact than on the very same bit itself. Therefore I'm asking for a reasoning - based on the spec. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] e1000e: Do not auto-clear ICR bits which aren't set in EIAC
On 2018-06-12 20:11, Philippe Mathieu-Daudé wrote: > Hi Jan, > > On 06/12/2018 02:22 PM, Jan Kiszka wrote: >> On 2018-05-22 09:00, Jan Kiszka wrote: >>> On 2018-04-16 17:29, Peter Maydell wrote: >>>> On 16 April 2018 at 16:25, Jan Kiszka wrote: >>>>> On 2018-04-01 23:17, Jan Kiszka wrote: >>>>>> From: Jan Kiszka >>>>>> >>>>>> The spec does not justify clearing of any E1000_ICR_OTHER_CAUSES when >>>>>> E1000_ICR_OTHER is set in EIAC. In fact, removing this code fixes the >>>>>> issue the Linux driver runs into since 4aea7a5c5e94 ("e1000e: Avoid >>>>>> receiver overrun interrupt bursts") and was worked around by >>>>>> 745d0bd3af99 ("e1000e: Remove Other from EIAC"). >>>>>> >>>>>> Signed-off-by: Jan Kiszka >>>>>> --- >>>>>> >>>>>> This resolves the issue I reported on February 18 ("e1000e: MSI-X >>>>>> problem with recent Linux drivers"). >>>>>> >>>>>> hw/net/e1000e_core.c | 4 >>>>>> 1 file changed, 4 deletions(-) >>>>>> >>>>>> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c >>>>>> index ecf9b1..d38f025c0f 100644 >>>>>> --- a/hw/net/e1000e_core.c >>>>>> +++ b/hw/net/e1000e_core.c >>>>>> @@ -2022,10 +2022,6 @@ e1000e_msix_notify_one(E1000ECore *core, uint32_t >>>>>> cause, uint32_t int_cfg) >>>>>> >>>>>> effective_eiac = core->mac[EIAC] & cause; >>>>>> >>>>>> -if (effective_eiac == E1000_ICR_OTHER) { >>>>>> -effective_eiac |= E1000_ICR_OTHER_CAUSES; >>>>>> -} >>>>>> - >>>>>> core->mac[ICR] &= ~effective_eiac; >>>>>> >>>>>> if (!(core->mac[CTRL_EXT] & E1000_CTRL_EXT_IAME)) { >>>>>> >>>>> >>>>> Ping for this - as well as https://patchwork.ozlabs.org/patch/895476. >>>>> >>>>> Given that q35 uses e1000e by default and many Linux kernel versions no >>>>> longer work, this should likely go into upcoming and stable versions >>>> >>>> I'd rather not put it into 2.12 at this point in the release >>>> cycle unless it's a regression from 2.11, I think. >>> >>> Second ping - nothing hit the repo so far, nor did I receive feedback. >>> >> >> And another ping. For both. 
>> >> These days I had to help someone with a broken QEMU setup that failed >> installing from network. It turned out that "modprobe e1000e IntMode=0" >> was needed to workaround the issues my patches address. > > What about the IMS register? It is set just after. > > Looking at b38636b8372, can you test this patch? > > -- >8 -- > diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c > index c93c4661ed..a484b68a5a 100644 > --- a/hw/net/e1000e_core.c > +++ b/hw/net/e1000e_core.c > @@ -2022,13 +2022,13 @@ e1000e_msix_notify_one(E1000ECore *core, > uint32_t cause, uint32_t int_cfg) > > effective_eiac = core->mac[EIAC] & cause; > > -if (effective_eiac == E1000_ICR_OTHER) { > -effective_eiac |= E1000_ICR_OTHER_CAUSES; > -} > - > core->mac[ICR] &= ~effective_eiac; > > if (!(core->mac[CTRL_EXT] & E1000_CTRL_EXT_IAME)) { > +if (effective_eiac == E1000_ICR_OTHER) { > +effective_eiac |= E1000_ICR_OTHER_CAUSES; > +} > + > core->mac[IMS] &= ~effective_eiac; > } > } > Before testing this: What would be the reasoning for this change? I need to refresh my caches, the debugging session is now too long ago again. Jan -- Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate Competence Center Embedded Linux