Re: [Qemu-devel] [PATCH] spapr: fix LSI interrupt specifiers in the device tree

2017-12-02 Thread Cédric Le Goater
On 12/02/2017 08:30 PM, Greg Kurz wrote:
> PAPR 2.7 C.6.9.1.2 describes the "#interrupt-cells" property of the
> PowerPC External Interrupt Source Controller node as follows:
> 
> “#interrupt-cells”
> 
>   Standard property name to define the number of cells in an interrupt-
>   specifier within an interrupt domain.
> 
>   prop-encoded-array: An integer, encoded as with encode-int, that denotes
>   the number of cells required to represent an interrupt specifier in its
>   child nodes.
> 
>   The value of this property for the PowerPC External Interrupt option shall
>   be 2. Thus all interrupt specifiers (as used in the standard “interrupts”
>   property) shall consist of two cells, each containing an integer encoded
>   as with encode-int. The first integer represents the interrupt number the
>   second integer is the trigger code: 0 for edge triggered, 1 for level
>   triggered.
> 
> This patch adds a second cell to the interrupt specifier stored in the
> "interrupts" property of PCI device nodes. This property only exists if
> the Interrupt Pin register is set, ie, the interrupt is level, the extra
> cell is hence set to 1.
> 
> This also fixes the interrupt specifiers in the "interrupt-map" property
> of the PHB node, that were setting the second cell to 8 (confusion with
> IRQ_TYPE_LEVEL_LOW ?) instead of 1.
> 
> While here, let's introduce defines for the interrupt specifier trigger
> code, and patch other users in spapr.
> 
> Signed-off-by: Greg Kurz 
> ---
> 
> This fixes /proc/interrupts in linux guests where LSIs appear as
> Edge instead of Level.

It does and also XIVE stops complaining with such warning 
when an LSI interrupt is configured :

   [   20.137390] xive: Interrupt 17 (HW 0x1004) type mismatch, Linux says 
Edge, FW says Level

and we know have :

   (initramfs) ip link set up dev enp0s0
   [   20.186717] 8139cp :00:00.0 enp0s0: link up, 100Mbps, full-duplex, 
lpa 0x05E1
   (initramfs) cat /proc/interrupts 
  CPU0   CPU1   CPU2   CPU3   
16:341635485538  XIVE-IPI0 Edge  IPI
17:  0  0  0  5  XIVE-IRQ 4100 Level 
enp0s0
18:  0  0  0  0  XIVE-IRQ 4097 Edge  
RAS_HOTPLUG
19:  0  0  0  0  XIVE-IRQ 4096 Edge  
RAS_EPOW
20:  0  0 25  0  XIVE-IRQ 4098 Edge  
hvc_console


The "interrupt-map" property is not obvious to understand 
but indeed the last 2 cells of a row are also determined by 
the #interrupt-cells property. Some comments would be 
welcomed. See device tree specification for that.

Reviewed-by: Cédric Le Goater 
Tested-by: Cédric Le Goater 

Thanks,

C.

> ---
>  hw/ppc/spapr_events.c  |2 +-
>  hw/ppc/spapr_pci.c |4 +++-
>  hw/ppc/spapr_vio.c |3 ++-
>  include/hw/ppc/spapr.h |3 +++
>  4 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index e377fc7ddea2..4bcb98f948ea 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -283,7 +283,7 @@ void spapr_dt_events(sPAPRMachineState *spapr, void *fdt)
>  }
>  
>  interrupts[0] = cpu_to_be32(source->irq);
> -interrupts[1] = 0;
> +interrupts[1] = SPAPR_DT_INTERRUPT_IDENTIFIER_EDGE;
>  
>  _FDT(node_offset = fdt_add_subnode(fdt, event_sources, source_name));
>  _FDT(fdt_setprop(fdt, node_offset, "interrupts", interrupts,
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 5a3122a9f9f9..91fedbf0929c 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1231,6 +1231,8 @@ static void spapr_populate_pci_child_dt(PCIDevice *dev, 
> void *fdt, int offset,
>  if (pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)) {
>  _FDT(fdt_setprop_cell(fdt, offset, "interrupts",
>   pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)));
> +_FDT(fdt_appendprop_cell(fdt, offset, "interrupts",
> + SPAPR_DT_INTERRUPT_IDENTIFIER_LEVEL));
>  }
>  
>  if (!is_bridge) {
> @@ -2122,7 +2124,7 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
>  irqmap[3] = cpu_to_be32(j+1);
>  irqmap[4] = cpu_to_be32(xics_phandle);
>  irqmap[5] = cpu_to_be32(phb->lsi_table[lsi_num].irq);
> -irqmap[6] = cpu_to_be32(0x8);
> +irqmap[6] = cpu_to_be32(SPAPR_DT_INTERRUPT_IDENTIFIER_LEVEL);
>  }
>  }
>  /* Write interrupt map */
> diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
> index ea3bc8bd9e21..29a17651a17c 100644
> --- a/hw/ppc/spapr_vio.c
> +++ b/hw/ppc/spapr_vio.c
> @@ -126,7 +126,8 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
>  }
>  
>  if (dev->irq) {
> -uint32_t ints_prop[] = {cpu_to_be32(dev->irq), 0};
> +uint32_t ints_prop[] = { 

[Qemu-devel] [PATCH v2] Add AVX, AVX-512, MPX support to x86_cpu_dump_state

2017-12-02 Thread Doug Gale
Signed-off-by: Doug Gale 
---
Fix MSB LSB showing when SSE is disabled
 target/i386/helper.c | 95 +---
 1 file changed, 83 insertions(+), 12 deletions(-)

diff --git a/target/i386/helper.c b/target/i386/helper.c
index f63eb3d3f4..03812b6e87 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -543,6 +543,7 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 }
 }
 cpu_fprintf(f, "EFER=%016" PRIx64 "\n", env->efer);
+cpu_fprintf(f, "XCR0=%016" PRIx64 "\n", env->xcr0);
 if (flags & CPU_DUMP_FPU) {
 int fptag;
 fptag = 0;
@@ -565,21 +566,91 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 else
 cpu_fprintf(f, " ");
 }
-if (env->hflags & HF_CS64_MASK)
-nb = 16;
-else
+
+if (env->hflags & HF_CS64_MASK) {
+if (env->xcr0 & XSTATE_Hi16_ZMM_MASK) {
+/* AVX-512 32 registers enabled */
+nb = 32;
+} else {
+/* 64-bit mode, 16 registers */
+nb = 16;
+}
+} else {
+/* 32 bit mode, 8 registers */
 nb = 8;
-for(i=0;ixmm_regs[i].ZMM_L(3),
-env->xmm_regs[i].ZMM_L(2),
-env->xmm_regs[i].ZMM_L(1),
-env->xmm_regs[i].ZMM_L(0));
-if ((i & 1) == 1)
+}
+
+/* sse register width in units of 64 bits */
+int zmm_width;
+char zmm_name;
+if (env->xcr0 & XSTATE_ZMM_Hi256_MASK) {
+/* 512-bit "ZMM" - AVX-512 registers enabled */
+zmm_width = 8;
+zmm_name = 'Z';
+} else if (env->xcr0 & XSTATE_YMM_MASK) {
+/* 256-bit "YMM" - AVX enabled */
+zmm_width = 4;
+zmm_name = 'Y';
+} else if (env->cr[4] & CR4_OSFXSR_MASK) {
+/* 128-bit "XMM" - SSE enabled */
+zmm_width = 2;
+zmm_name = 'X';
+} else {
+/* SSE not enabled */
+zmm_width = 0;
+zmm_name = 0;
+}
+
+if (zmm_width > 0) {
+cpu_fprintf(f, "  MSB%*sLSB\n",
+-(zmm_width * 16 + zmm_width - 7), "");
+}
+
+for (i = 0; i < nb; i++) {
+if (zmm_width == 0) {
+cpu_fprintf(f, "SSE not enabled\n");
+break;
+}
+
+cpu_fprintf(f, "%cMM%02d=", zmm_name, i);
+int qw;
+for (qw = zmm_width; qw > 0; qw -= 2) {
+/* ':' separator every 64 bits */
+cpu_fprintf(f, "%016" PRIx64 ":%016" PRIx64 "%s",
+env->xmm_regs[i].ZMM_Q(qw - 1),
+env->xmm_regs[i].ZMM_Q(qw - 2),
+qw > 2 ? ":" : "");
+}
+
+/* two registers per line for 128-bit registers */
+if (zmm_width > 2 || (i & 1)) {
 cpu_fprintf(f, "\n");
-else
+} else {
 cpu_fprintf(f, " ");
+}
+}
+
+if (env->xcr0 & XSTATE_OPMASK_MASK) {
+/* AVX-512 opmask registers */
+for (i = 0; i < 8; ++i) {
+cpu_fprintf(f, "K%d=%08" PRIx64 "%s", i, env->opmask_regs[i],
+   (i & 3) == 3 ? "\n" : " ");
+}
+}
+
+if (env->xcr0 & XSTATE_BNDREGS_MASK) {
+/* MPX bound registers */
+for (i = 0; i < 4; ++i) {
+cpu_fprintf(f, "BND%d=%016" PRIx64 ":%016" PRIx64 "%s",
+i, env->bnd_regs[i].ub, env->bnd_regs[i].lb,
+(i & 1) ? "\n" : " ");
+}
+}
+
+if (env->xcr0 & XSTATE_BNDCSR_MASK) {
+/* MPX bound status/config registers */
+cpu_fprintf(f, "BNDCFGU=%016" PRIx64 " BNDSTATUS=%016" PRIx64 "\n",
+env->bndcs_regs.cfgu, env->bndcs_regs.sts);
 }
 }
 if (flags & CPU_DUMP_CODE) {
-- 
2.14.1




Re: [Qemu-devel] [for-2.12 3/7] pci: Fold pci_bus.h into pci.h

2017-12-02 Thread Michael S. Tsirkin
On Sat, Dec 02, 2017 at 11:59:20AM +1100, David Gibson wrote:
> On Fri, Dec 01, 2017 at 06:29:39PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Nov 30, 2017 at 03:02:48PM +1100, David Gibson wrote:
> > > On Wed, Nov 29, 2017 at 12:38:00PM +0200, Marcel Apfelbaum wrote:
> > > > On 29/11/2017 10:46, David Gibson wrote:
> > > > > include/hw/pci/pci_bus.h is now very small and can only safely be 
> > > > > included
> > > > > after hw/pci/pci.h.  So, just fold it into pci.h.
> > > > > 
> > > > 
> > > > I don't get the benefit from merging the header files.
> > > > I would go the other way around and find stuff specific
> > > > to pci_bus and add it there, like the pci_bus_new*
> > > > you touched in the prev patch.
> > > 
> > > Hrm.  Except the point of the earlier patch was that those are
> > > actually spoecific to root buses, so would really belong in say
> > > pci-host.h, rather than pci-bus.h.
> > > 
> > > A log of PCI stuff deals with interaction between the device and bus
> > > though, so it just seemed like more trouble than it was worth to go
> > > disentangling them properly.
> > > 
> > > > Maybe if *all* code files requiring pci.h would automatically
> > > > need pci_bus.h would make sense, but I didn't check that.
> > > 
> > > Yeah, I don't think every user of pci.h needs pci_bus.h, although a
> > > fair few do as you can see from the diff.  Well, I guess it's up to
> > > Michael.  I'll be tolerably content either way - I'd say this is the
> > > least important patch of the series.
> > 
> > I'm inclined to agree with Marcel also because it's lots of noise for no
> > real win.  The next patch depends on this one. I skipped this and next
> > one, pls feel free to repost next one.
> 
> Ok, will do.  Is the tree you've merged the others to public
> somewhere, so I can rebase on it?

git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git branch is named pci

pls note this is a rebased branch, commit IDs won't be stable.

> -- 
> David Gibson  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au| minimalist, thank you.  NOT _the_ 
> _other_
>   | _way_ _around_!
> http://www.ozlabs.org/~dgibson





[Qemu-devel] [PATCH] Add AVX, AVX-512, MPX support to x86_cpu_dump_state

2017-12-02 Thread Doug Gale
Signed-off-by: Doug Gale 
---
 target/i386/helper.c | 94 +---
 1 file changed, 82 insertions(+), 12 deletions(-)

diff --git a/target/i386/helper.c b/target/i386/helper.c
index f63eb3d3f4..708fe13f2f 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -543,6 +543,7 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 }
 }
 cpu_fprintf(f, "EFER=%016" PRIx64 "\n", env->efer);
+cpu_fprintf(f, "XCR0=%016" PRIx64 "\n", env->xcr0);
 if (flags & CPU_DUMP_FPU) {
 int fptag;
 fptag = 0;
@@ -565,21 +566,90 @@ void x86_cpu_dump_state(CPUState *cs, FILE *f, 
fprintf_function cpu_fprintf,
 else
 cpu_fprintf(f, " ");
 }
-if (env->hflags & HF_CS64_MASK)
-nb = 16;
-else
+
+if (env->hflags & HF_CS64_MASK) {
+if (env->xcr0 & XSTATE_Hi16_ZMM_MASK) {
+/* AVX-512 32 registers enabled */
+nb = 32;
+} else {
+/* 64-bit mode, 16 registers */
+nb = 16;
+}
+} else {
+/* 32 bit mode, 8 registers */
 nb = 8;
-for(i=0;ixmm_regs[i].ZMM_L(3),
-env->xmm_regs[i].ZMM_L(2),
-env->xmm_regs[i].ZMM_L(1),
-env->xmm_regs[i].ZMM_L(0));
-if ((i & 1) == 1)
+}
+
+/* sse register width in units of 64 bits */
+int zmm_width;
+char zmm_name;
+if (env->xcr0 & XSTATE_ZMM_Hi256_MASK) {
+/* 512-bit "ZMM" - AVX-512 registers enabled */
+zmm_width = 8;
+zmm_name = 'Z';
+} else if (env->xcr0 & XSTATE_YMM_MASK) {
+/* 256-bit "YMM" - AVX enabled */
+zmm_width = 4;
+zmm_name = 'Y';
+} else if (env->cr[4] & CR4_OSFXSR_MASK) {
+/* 128-bit "XMM" - SSE enabled */
+zmm_width = 2;
+zmm_name = 'X';
+} else {
+/* SSE not enabled */
+zmm_width = 0;
+zmm_name = 0;
+}
+
+
+cpu_fprintf(f, "  MSB%*sLSB\n",
+-(zmm_width * 16 + zmm_width - 7), "");
+
+for (i = 0; i < nb; i++) {
+if (zmm_width == 0) {
+cpu_fprintf(f, "SSE not enabled\n");
+break;
+}
+
+cpu_fprintf(f, "%cMM%02d=", zmm_name, i);
+int qw;
+for (qw = zmm_width; qw > 0; qw -= 2) {
+/* ':' separator every 64 bits */
+cpu_fprintf(f, "%016" PRIx64 ":%016" PRIx64 "%s",
+env->xmm_regs[i].ZMM_Q(qw - 1),
+env->xmm_regs[i].ZMM_Q(qw - 2),
+qw > 2 ? ":" : "");
+}
+
+/* two registers per line for 128-bit registers */
+if (zmm_width > 2 || (i & 1)) {
 cpu_fprintf(f, "\n");
-else
+} else {
 cpu_fprintf(f, " ");
+}
+}
+
+if (env->xcr0 & XSTATE_OPMASK_MASK) {
+/* AVX-512 opmask registers */
+for (i = 0; i < 8; ++i) {
+cpu_fprintf(f, "K%d=%08" PRIx64 "%s", i, env->opmask_regs[i],
+   (i & 3) == 3 ? "\n" : " ");
+}
+}
+
+if (env->xcr0 & XSTATE_BNDREGS_MASK) {
+/* MPX bound registers */
+for (i = 0; i < 4; ++i) {
+cpu_fprintf(f, "BND%d=%016" PRIx64 ":%016" PRIx64 "%s",
+i, env->bnd_regs[i].ub, env->bnd_regs[i].lb,
+(i & 1) ? "\n" : " ");
+}
+}
+
+if (env->xcr0 & XSTATE_BNDCSR_MASK) {
+/* MPX bound status/config registers */
+cpu_fprintf(f, "BNDCFGU=%016" PRIx64 " BNDSTATUS=%016" PRIx64 "\n",
+env->bndcs_regs.cfgu, env->bndcs_regs.sts);
 }
 }
 if (flags & CPU_DUMP_CODE) {
-- 
2.14.1




Re: [Qemu-devel] [PULL 0/7] pc, pci, virtio: fixes for rc3

2017-12-02 Thread Michael S. Tsirkin
On Fri, Dec 01, 2017 at 06:05:25PM +, Peter Maydell wrote:
> On 1 December 2017 at 17:08, Michael S. Tsirkin  wrote:
> > The following changes since commit c11d61271b9e6e7a1f0479ef1ca8fb55fa457a62:
> >
> >   Update version for v2.11.0-rc3 release (2017-11-29 17:59:34 +)
> >
> > are available in the git repository at:
> >
> >   git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream
> >
> > for you to fetch changes up to 75ba2ddb188fa07c3442446766782036e3085cba:
> >
> >   pc: fix crash on attempted cpu unplug (2017-12-01 19:05:58 +0200)
> >
> > 
> > pc, pci, virtio: fixes for rc3
> >
> > A bunch of fixes all over the place.
> >
> > Signed-off-by: Michael S. Tsirkin 
> >
> > 
> 
> You can't send fixes for rc3 2 days after rc3 has been tagged :-)

Bad wording. I meant they are for bugs present in rc3.

> > Chao Gao (1):
> >   i386/msi: Correct mask of destination ID in MSI address
> >
> > Greg Kurz (1):
> >   vhost: fix error check in vhost_verify_ring_mappings()
> >
> > Igor Mammedov (1):
> >   pc: fix crash on attempted cpu unplug
> >
> > Marc-André Lureau (1):
> >   dump-guest-memory.py: fix No symbol "vmcoreinfo_find"
> >
> > Maxime Coquelin (2):
> >   virtio: Add queue interface to restore avail index from vring used 
> > index
> >   vhost: restore avail index from vring used index on disconnection
> >
> > Prasad J Pandit (1):
> >   virtio: check VirtQueue Vring object is set
> 
> Are any of these so important that we would absolutely refuse
> to release without the fixes (ie they justify rolling an rc4
> that we would otherwise not have needed) ?
> 
> thanks
> -- PMM

The msi one is less important it just happened to be queued a while ago
and I didn't want to rebase all testing. Others are crashers but they
don't affect everyone. So I wouldn't be sure, but there's also a
security fix in there, so yes, I suspect we are better off with rc4, and
if we do I think including others is justified (except maybe the msi
one, if you feel strongly I'll rebase and drop it).

-- 
MST



[Qemu-devel] [Bug 1321684] Re: block_stream command stalls

2017-12-02 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1321684

Title:
  block_stream command stalls

Status in QEMU:
  Expired

Bug description:
  block_stream command stalls near finishing.
  I tried 1.7.1, 2.0.0 and the master versions. All of them stalled.
  But the 1.1.2 could finish the job.

  Here is how to reproduce it:

  $ sudo $QEMU \
  -enable-kvm  -cpu qemu64  -m 1024  \
  -monitor stdio \
  -drive file=./i1,if=none,id=drive_0,cache=none,aio=native -device 
virtio-blk-pci,drive=drive_0,bus=pci.0,addr=0x5 \

  QEMU 2.0.50 monitor - type 'help' for more information
  (qemu) VNC server running on `127.0.0.1:5900'
  (qemu) snapshot_blkdev drive_0 s1
  Formatting 's1', fmt=qcow2 size=26843545600 backing_file='./i1' 
backing_fmt='qcow2' encryption=off cluster_size=65536 lazy_refcounts=off 
  (qemu) block_stream drive_0 
  (qemu) info block-jobs 
  Streaming device drive_0: Completed 400818176 of 26843545600 bytes, speed 
limit 0 bytes/s
  (qemu) info block-jobs 
  Streaming device drive_0: Completed 904396800 of 26843545600 bytes, speed 
limit 0 bytes/s
  (qemu) info block-jobs 
  Streaming device drive_0: Completed 23401070592 of 26843545600 bytes, speed 
limit 0 bytes/s
  (qemu) info block-jobs 
  Streaming device drive_0: Completed 26513768448 of 26843545600 bytes, speed 
limit 0 bytes/s
  (qemu) main-loop: WARNING: I/O thread spun for 1000 iterations
  info block-jobs 
  Streaming device drive_0: Completed 26841513984 of 26843545600 bytes, speed 
limit 0 bytes/s
  (qemu) info block-jobs 
  Streaming device drive_0: Completed 26841513984 of 26843545600 bytes, speed 
limit 0 bytes/s
  (qemu) info block-jobs 
  Streaming device drive_0: Completed 26841513984 of 26843545600 bytes, speed 
limit 0 bytes/s

   here, the progress stopped at 26841513984 

  
  $ qemu-img info i1 
  image: i1
  file format: qcow2
  virtual size: 25G (26843545600 bytes)
  disk size: 1.0G
  cluster_size: 2097152
  Format specific information:
  compat: 1.1
  lazy refcounts: false

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1321684/+subscriptions



[Qemu-devel] [Bug 1318474] Re: QEMU update causes Windows reactivation

2017-12-02 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1318474

Title:
  QEMU update causes Windows reactivation

Status in QEMU:
  Expired

Bug description:
  After updating QEMU the guest OS's detect new hardware. As a result
  any Windows OS sees it as a significant change in hardware and require
  a reactivation.

  Host OS: Ubuntu 14.04 64-bit

  Guest OS's:
  Windows Server 2003 R2 Enterprise
  Windows Server 2008 R2 Enterprise
  Windows Server 2008 R2 Web
  Windows Server 2008 R2 Data Center

  QEMU version: 2.0.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1318474/+subscriptions



[Qemu-devel] [Bug 700276] Re: QEMU crashed when GDB request a big size variable information

2017-12-02 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/700276

Title:
  QEMU crashed when GDB request a big size variable information

Status in QEMU:
  Expired

Bug description:
  Hello,
  My host is Fedora 13. My QEMU version is 0.13.0, I use QEMU with GDB to debug 
Linux kernel(Version 2.6.36.2).

  I use QEMU like this:"qemu -s -S -kernel build/arch/i386/boot/bzImage -hda 
/dev/zero"
  When GDB connected with QEMU, and use gdb command print to look big size 
variable, the qemu is crash down. for example, when I look a task_struct 
variable 'init_task'(print init_task ), the qemu produce the below message and 
exit.

  *** stack smashing detected ***: qemu terminated
  === Backtrace: =
  /lib/libc.so.6(__fortify_fail+0x4d)[0x78a31d]
  /lib/libc.so.6[0x78a2ca]
  qemu[0x8059e21]
  qemu[0x805a0cf]
  qemu[0x80d12a1]
  qemu[0x8189cb8]
  qemu[0x818c3b0]
  /lib/libc.so.6(__libc_start_main+0xe6)[0x6a8cc6]
  ...
  adbf7000-adbf8000 rw-p  00:00 0 
  adbf8000-ae3f8000 rw-p  00:00 0 
  ae3f8000-ae742000 rw-p  00:00 0 
  ae742000-ae762000 rw-p  00:00 0 
  ae762000-ae764000 rw-p  00:00 0 
  ae764000-ae784000 rw-p  00:00 0 
  ae784000-ae786000 rw-p  00:00 0 
  ae786000-b6786000 rw-p  00:00 0 
  b6786000-b7894000 rw-p  00:00 0 
  b78aa000-b78ab000 rw-p  00:00 0 
  bfe95000-bfeaa000 rw-p  00:00 0  [stack]
  已放弃 (core dumped)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/700276/+subscriptions



Re: [Qemu-devel] [PATCH v18 05/10] xbitmap: add more operations

2017-12-02 Thread Tetsuo Handa
Matthew Wilcox wrote:
> On Fri, Dec 01, 2017 at 03:09:08PM +, Wang, Wei W wrote:
> > On Friday, December 1, 2017 9:02 PM, Tetsuo Handa wrote:
> > > If start == end is legal,
> > > 
> > >for (; start < end; start = (start | (IDA_BITMAP_BITS - 1)) + 1) {
> > > 
> > > makes this loop do nothing because 10 < 10 is false.
> > 
> > How about "start <= end "?
> 
> Don't ask Tetsuo for his opinion, write some userspace code that uses it.
> 

Please be sure to prepare for "end == -1UL" case, for "start < end" will become
true when "start = (start | (IDA_BITMAP_BITS - 1)) + 1" made "start == 0" due to
overflow.



Re: [Qemu-devel] [PATCH v18 05/10] xbitmap: add more operations

2017-12-02 Thread Tetsuo Handa
Matthew Wilcox wrote:
> On Thu, Nov 30, 2017 at 10:35:03PM +0900, Tetsuo Handa wrote:
> > According to xb_set_bit(), it seems to me that we are trying to avoid 
> > memory allocation
> > for "struct ida_bitmap" when all set bits within a 1024-bits bitmap reside 
> > in the first
> > 61 bits.
> > 
> > But does such saving help? Is there characteristic bias that majority of 
> > set bits resides
> > in the first 61 bits, for "bit" is "unsigned long" which holds a page 
> > number (isn't it)?
> > If no such bias, wouldn't eliminating radix_tree_exception() case and 
> > always storing
> > "struct ida_bitmap" simplifies the code (and make the processing faster)?
> 
> It happens all the time.  The vast majority of users of the IDA set
> low bits.  Also, it's the first 62 bits -- going up to 63 bits with the
> XArray rewrite.

Oops, 0...61 is 62 bits.

What I meant is below (untested) patch. If correct, it can save lines and make
the code easier to read.

 lib/radix-tree.c | 20 +
 lib/xbitmap.c| 88 +---
 2 files changed, 8 insertions(+), 100 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index a039588..fb84b91 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -2152,25 +2152,7 @@ int ida_pre_get(struct ida *ida, gfp_t gfp)
  */
 __must_check bool xb_preload(gfp_t gfp)
 {
-   if (!this_cpu_read(ida_bitmap)) {
-   struct ida_bitmap *bitmap = kmalloc(sizeof(*bitmap), gfp);
-
-   if (!bitmap)
-   return false;
-   /*
-* The per-CPU variable is updated with preemption enabled.
-* If the calling task is unlucky to be scheduled to another
-* CPU which has no ida_bitmap allocation, it will be detected
-* when setting a bit (i.e. __xb_set_bit()).
-*/
-   bitmap = this_cpu_cmpxchg(ida_bitmap, NULL, bitmap);
-   kfree(bitmap);
-   }
-
-   if (__radix_tree_preload(gfp, XB_PRELOAD_SIZE) < 0)
-   return false;
-
-   return true;
+   return __radix_tree_preload(gfp, XB_PRELOAD_SIZE) == 0;
 }
 EXPORT_SYMBOL(xb_preload);
 
diff --git a/lib/xbitmap.c b/lib/xbitmap.c
index 816dd3e..426d168 100644
--- a/lib/xbitmap.c
+++ b/lib/xbitmap.c
@@ -18,7 +18,7 @@
  * This function is used to set a bit in the xbitmap. If the bitmap that @bit
  * resides in is not there, the per-cpu ida_bitmap will be taken.
  *
- * Returns: 0 on success. %-EAGAIN indicates that @bit was not set.
+ * Returns: 0 on success. Negative value on failure.
  */
 int xb_set_bit(struct xb *xb, unsigned long bit)
 {
@@ -28,46 +28,19 @@ int xb_set_bit(struct xb *xb, unsigned long bit)
struct radix_tree_node *node;
void **slot;
struct ida_bitmap *bitmap;
-   unsigned long ebit;
 
bit %= IDA_BITMAP_BITS;
-   ebit = bit + 2;
 
err = __radix_tree_create(root, index, 0, , );
if (err)
return err;
bitmap = rcu_dereference_raw(*slot);
-   if (radix_tree_exception(bitmap)) {
-   unsigned long tmp = (unsigned long)bitmap;
-
-   if (ebit < BITS_PER_LONG) {
-   tmp |= 1UL << ebit;
-   rcu_assign_pointer(*slot, (void *)tmp);
-   return 0;
-   }
-   bitmap = this_cpu_xchg(ida_bitmap, NULL);
-   if (!bitmap) {
-   __radix_tree_delete(root, node, slot);
-   return -EAGAIN;
-   }
-   memset(bitmap, 0, sizeof(*bitmap));
-   bitmap->bitmap[0] = tmp >> RADIX_TREE_EXCEPTIONAL_SHIFT;
-   rcu_assign_pointer(*slot, bitmap);
-   }
-
if (!bitmap) {
-   if (ebit < BITS_PER_LONG) {
-   bitmap = (void *)((1UL << ebit) |
-   RADIX_TREE_EXCEPTIONAL_ENTRY);
-   __radix_tree_replace(root, node, slot, bitmap, NULL);
-   return 0;
-   }
-   bitmap = this_cpu_xchg(ida_bitmap, NULL);
+   bitmap = kzalloc(sizeof(*bitmap), GFP_NOWAIT | __GFP_NOWARN);
if (!bitmap) {
__radix_tree_delete(root, node, slot);
-   return -EAGAIN;
+   return -ENOMEM;
}
-   memset(bitmap, 0, sizeof(*bitmap));
__radix_tree_replace(root, node, slot, bitmap, NULL);
}
 
@@ -82,7 +55,7 @@ int xb_set_bit(struct xb *xb, unsigned long bit)
  *  @bit: index of the bit to set
  *
  * A wrapper of the xb_preload() and xb_set_bit().
- * Returns: 0 on success; -EAGAIN or -ENOMEM on error.
+ * Returns: 0 on success; -ENOMEM on error.
  */
 int xb_preload_and_set_bit(struct xb *xb, unsigned long bit, gfp_t gfp)
 {
@@ -113,25 +86,10 @@ void xb_clear_bit(struct xb *xb, unsigned long bit)
struct 

[Qemu-devel] [PATCH v4] gdbstub: add tracing

2017-12-02 Thread Doug Gale
Signed-off-by: Doug Gale 
---
Fix usage of %c in trace output, now uses 0x%02x
Fix possible sign extended char that could cause 0xfc to say 0xfffc
Add missing traces for hitting breakpoints, continuing, stepping
Fix incorrect dynamic check for tracing being enabled in hexdump
Fix missing braces around single line if body
Fix incorrectly indented return statement
Fix order of trace-events to be more tidy
 gdbstub.c| 113 +--
 trace-events |  28 +++
 2 files changed, 106 insertions(+), 35 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 2a94030d3b..f1d51480f7 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -21,6 +21,7 @@
 #include "qemu/error-report.h"
 #include "qemu/cutils.h"
 #include "cpu.h"
+#include "trace-root.h"
 #ifdef CONFIG_USER_ONLY
 #include "qemu.h"
 #else
@@ -287,21 +288,6 @@ static int gdb_signal_to_target (int sig)
 return -1;
 }
 
-/* #define DEBUG_GDB */
-
-#ifdef DEBUG_GDB
-# define DEBUG_GDB_GATE 1
-#else
-# define DEBUG_GDB_GATE 0
-#endif
-
-#define gdb_debug(fmt, ...) do { \
-if (DEBUG_GDB_GATE) { \
-fprintf(stderr, "%s: " fmt, __func__, ## __VA_ARGS__); \
-} \
-} while (0)
-
-
 typedef struct GDBRegisterState {
 int base_reg;
 int num_regs;
@@ -410,10 +396,13 @@ int use_gdb_syscalls(void)
 /* Resume execution.  */
 static inline void gdb_continue(GDBState *s)
 {
+
 #ifdef CONFIG_USER_ONLY
 s->running_state = 1;
+trace_gdbstub_op_continue();
 #else
 if (!runstate_needs_reset()) {
+trace_gdbstub_op_continue();
 vm_start();
 }
 #endif
@@ -434,6 +423,7 @@ static int gdb_continue_partial(GDBState *s, char 
*newstates)
  */
 CPU_FOREACH(cpu) {
 if (newstates[cpu->cpu_index] == 's') {
+trace_gdbstub_op_stepping(cpu->cpu_index);
 cpu_single_step(cpu, sstep_flags);
 }
 }
@@ -452,11 +442,13 @@ static int gdb_continue_partial(GDBState *s, char 
*newstates)
 case 1:
 break; /* nothing to do here */
 case 's':
+trace_gdbstub_op_stepping(cpu->cpu_index);
 cpu_single_step(cpu, sstep_flags);
 cpu_resume(cpu);
 flag = 1;
 break;
 case 'c':
+trace_gdbstub_op_continue_cpu(cpu->cpu_index);
 cpu_resume(cpu);
 flag = 1;
 break;
@@ -538,12 +530,49 @@ static void hextomem(uint8_t *mem, const char *buf, int 
len)
 }
 }
 
+static void hexdump(const char *buf, int len,
+void (*trace_fn)(size_t ofs, char const *text))
+{
+char line_buffer[3 * 16 + 4 + 16 + 1];
+
+size_t i;
+for (i = 0; i < len || (i & 0xF); ++i) {
+size_t byte_ofs = i & 15;
+
+if (byte_ofs == 0) {
+memset(line_buffer, ' ', 3 * 16 + 4 + 16);
+line_buffer[3 * 16 + 4 + 16] = 0;
+}
+
+size_t col_group = (i >> 2) & 3;
+size_t hex_col = byte_ofs * 3 + col_group;
+size_t txt_col = 3 * 16 + 4 + byte_ofs;
+
+if (i < len) {
+char value = buf[i];
+
+line_buffer[hex_col + 0] = tohex((value >> 4) & 0xF);
+line_buffer[hex_col + 1] = tohex((value >> 0) & 0xF);
+line_buffer[txt_col + 0] = (value >= ' ' && value < 127)
+? value
+: '.';
+}
+
+if (byte_ofs == 0xF)
+trace_fn(i & -16, line_buffer);
+}
+}
+
 /* return -1 if error, 0 if OK */
-static int put_packet_binary(GDBState *s, const char *buf, int len)
+static int put_packet_binary(GDBState *s, const char *buf, int len, bool dump)
 {
 int csum, i;
 uint8_t *p;
 
+if (dump && trace_event_get_state_backends(TRACE_GDBSTUB_IO_BINARYREPLY)) {
+hexdump(buf, len, trace_gdbstub_io_binaryreply);
+}
+
 for(;;) {
 p = s->last_packet;
 *(p++) = '$';
@@ -576,9 +605,9 @@ static int put_packet_binary(GDBState *s, const char *buf, 
int len)
 /* return -1 if error, 0 if OK */
 static int put_packet(GDBState *s, const char *buf)
 {
-gdb_debug("reply='%s'\n", buf);
+trace_gdbstub_io_reply(buf);
 
-return put_packet_binary(s, buf, strlen(buf));
+return put_packet_binary(s, buf, strlen(buf), false);
 }
 
 /* Encode data using the encoding for 'x' packets.  */
@@ -975,8 +1004,7 @@ static int gdb_handle_packet(GDBState *s, const char 
*line_buf)
 uint8_t *registers;
 target_ulong addr, len;
 
-
-gdb_debug("command='%s'\n", line_buf);
+trace_gdbstub_io_command(line_buf);
 
 p = line_buf;
 ch = *p++;
@@ -999,7 +1027,7 @@ static int gdb_handle_packet(GDBState *s, const char 
*line_buf)
 }
 s->signal = 0;
 gdb_continue(s);
-   return RS_IDLE;
+return RS_IDLE;
 case 'C':
 s->signal = gdb_signal_to_target (strtoul(p, (char **), 16));
 if (s->signal == -1)
@@ -1045,7 

Re: [Qemu-devel] [PATCH 0/5] spapr: preliminary cleanups before introducing XIVE

2017-12-02 Thread David Gibson
On Fri, Dec 01, 2017 at 05:05:59PM +0100, Cédric Le Goater wrote:
> Hello,
> 
> These are initial patches of the XIVE patchset which prepare ground
> for the integration of the XIVE model.

Applied to ppc-for-2.12.

> 
> Thanks,
> 
> C.
> 
> Cédric Le Goater (5):
>   ppc/xics: introduce an icp_create() helper
>   ppc/xics: assign of the CPU 'intc' pointer under the core
>   spapr: move the IRQ allocation routines under the machine
>   spapr: introduce a spapr_irq_set_lsi() helper
>   spapr: introduce a spapr_qirq() helper
> 
>  hw/intc/trace-events|   4 --
>  hw/intc/xics.c  |  34 ++-
>  hw/intc/xics_spapr.c| 114 -
>  hw/ppc/pnv_core.c   |  10 +---
>  hw/ppc/spapr.c  | 133 
> 
>  hw/ppc/spapr_cpu_core.c |  14 +
>  hw/ppc/spapr_events.c   |  16 +++---
>  hw/ppc/spapr_pci.c  |  10 ++--
>  hw/ppc/spapr_vio.c  |   2 +-
>  hw/ppc/trace-events |   4 ++
>  include/hw/pci-host/spapr.h |   2 +-
>  include/hw/ppc/spapr.h  |   7 +++
>  include/hw/ppc/spapr_vio.h  |   2 +-
>  include/hw/ppc/xics.h   |   8 +--
>  14 files changed, 187 insertions(+), 173 deletions(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[Qemu-devel] [PATCH] qemu-img: Fixed grammatical error in dump_human_image_check

2017-12-02 Thread Shravan Rajinikanth
Signed-off-by: Shravan Rajinikanth 
---
 qemu-img.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-img.c b/qemu-img.c
index 68b375f..bea9268 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -580,7 +580,7 @@ static void dump_human_image_check(ImageCheck *check, bool 
quiet)
 if (check->leaks) {
 qprintf(quiet,
 "\n%" PRId64 " leaked clusters were found on the image.\n"
-"This means waste of disk space, but no harm to data.\n",
+"This means disk space is wasted, but data is safe.\n",
 check->leaks);
 }
 
-- 
2.7.4




[Qemu-devel] [PATCH] 9pfs: Correctly handle cancelled requests

2017-12-02 Thread Keno Fischer
 # Background

I was investigating spurious, non-deterministic EINTR returns from
various 9p file system operations in a Linux guest served from the
qemu 9p server.

 ## EINTR, ERESTARTSYS and the linux kernel

When a signal arrives that the Linux kernel needs to deliver to user-space
while a given thread is blocked (in the 9p case waiting for a reply to its
request in 9p_client_rpc -> wait_event_interruptible), it asks whatever
driver is currently running to abort its current operation (in the 9p case
causing the submission of a TFLUSH message) and return to user space.
In these situations, the error message reported is generally ERESTARTSYS.
If the userspace processes specified SA_RESTART, this means that the
system call will get restarted upon completion of the signal handler
delivery (assuming the signal handler doesn't modify the process state
in complicated ways not relevant here). If SA_RESTART is not specified,
ERESTARTSYS gets translated to EINTR and user space is expected to handle
the restart itself.

 ## The 9p TFLISH command

The 9p TFLUSH commands requests that the server abort an ongoing operation.
The man page [1] specifies:

```
If it recognizes oldtag as the tag of a pending transaction, it should abort any
pending response and discard that tag.
[...]
When the client sends a Tflush, it must wait to receive the corresponding Rflush
before reusing oldtag for subsequent messages. If a response to the flushed 
request
is received before the Rflush, the client must honor the response as if it had 
not
been flushed, since the completed request may signify a state change in the 
server
```

In particular, this means that the server must not send a reply with the orignal
tag in response to the cancellation request, because the client is obligated
to interpret such a reply as a coincidental reply to the original request.

 # The bug

When qemu receives a TFlush request, it sets the `cancelled` flag on the 
relevant
pdu. This flag is periodically checked, e.g. in `v9fs_co_name_to_path`, and if
set, the operation is aborted and the error is set to EINTR. However, the server
then violates the spec, by returning to the client an Rerror response, rather
than discarding the message entirely. As a result, the client is required
to assume that said Rerror response is a result of the original request, not
a result of the cancellation and thus passes the EINTR error back to user space.
This is not the worst thing it could do, however as discussed above, the correct
error code would have been ERESTARTSYS, such that user space programs with
SA_RESTART set get correctly restarted upon completion of the signal handler.
Instead, such programs get spurious EINTR results that they were not expecting
to handle.

It should be noted that there are plenty of user space programs that do not
set SA_RESTART and do not correctly handle EINTR either. However, that is then
a userspace bug. It should also be noted that this bug has been mitigated by
a recent commit to the Linux kernel [2], which essentially prevents the kernel
from sending Tflush requests unless the process is about to die (in which case
the process likely doesn't care about the response). Nevertheless, for older
kernels and to comply with the spec, I believe this change is beneficial.

 # Implementation

The fix is fairly simple, just skipping notification of a reply if
the pdu was previously cancelled. I also added a new trace event to distinguish
operations that caused an error reply from those that were cancelled.

One complication is that we only omit sending the message on EINTR errors in
order to avoid confusing the rest of the code (which may assume that a
client knows about a fid if it sucessfully passed it off to pud_complete
without checking for cancellation status). This does mean that if the server
acts upon the cancellation flag, it always needs to set err to EINTR. I believe
this is true of the current code.

[1] https://9fans.github.io/plan9port/man/man9/flush.html
[2] 
https://github.com/torvalds/linux/commit/9523feac272ccad2ad8186ba4fcc89103754de52

Signed-off-by: Keno Fischer 
---
 hw/9pfs/9p.c | 17 +
 hw/9pfs/trace-events |  1 +
 2 files changed, 18 insertions(+)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 710cd91..46f406b 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -648,6 +648,22 @@ static void coroutine_fn pdu_complete(V9fsPDU *pdu, 
ssize_t len)
 V9fsState *s = pdu->s;
 int ret;
 
+/*
+ * The 9p spec requires that successfully cancelled pdus receive no reply.
+ * Sending a reply would confuse clients because they would
+ * assume that any EINTR is the actual result of the operation,
+ * rather than a consequence of the cancellation. However, if
+ * the operation completed (succesfully or with an error other
+ * than caused be cancellation), we do send out that reply, both
+ * for efficiency and to avoid confusing the rest of the state 

[Qemu-devel] [PATCH] spapr: fix LSI interrupt specifiers in the device tree

2017-12-02 Thread Greg Kurz
PAPR 2.7 C.6.9.1.2 describes the "#interrupt-cells" property of the
PowerPC External Interrupt Source Controller node as follows:

“#interrupt-cells”

  Standard property name to define the number of cells in an interrupt-
  specifier within an interrupt domain.

  prop-encoded-array: An integer, encoded as with encode-int, that denotes
  the number of cells required to represent an interrupt specifier in its
  child nodes.

  The value of this property for the PowerPC External Interrupt option shall
  be 2. Thus all interrupt specifiers (as used in the standard “interrupts”
  property) shall consist of two cells, each containing an integer encoded
  as with encode-int. The first integer represents the interrupt number the
  second integer is the trigger code: 0 for edge triggered, 1 for level
  triggered.

This patch adds a second cell to the interrupt specifier stored in the
"interrupts" property of PCI device nodes. This property only exists if
the Interrupt Pin register is set, ie, the interrupt is level, the extra
cell is hence set to 1.

This also fixes the interrupt specifiers in the "interrupt-map" property
of the PHB node, that were setting the second cell to 8 (confusion with
IRQ_TYPE_LEVEL_LOW ?) instead of 1.

While here, let's introduce defines for the interrupt specifier trigger
code, and patch other users in spapr.

Signed-off-by: Greg Kurz 
---

This fixes /proc/interrupts in linux guests where LSIs appear as
Edge instead of Level.
---
 hw/ppc/spapr_events.c  |2 +-
 hw/ppc/spapr_pci.c |4 +++-
 hw/ppc/spapr_vio.c |3 ++-
 include/hw/ppc/spapr.h |3 +++
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index e377fc7ddea2..4bcb98f948ea 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -283,7 +283,7 @@ void spapr_dt_events(sPAPRMachineState *spapr, void *fdt)
 }
 
 interrupts[0] = cpu_to_be32(source->irq);
-interrupts[1] = 0;
+interrupts[1] = SPAPR_DT_INTERRUPT_IDENTIFIER_EDGE;
 
 _FDT(node_offset = fdt_add_subnode(fdt, event_sources, source_name));
 _FDT(fdt_setprop(fdt, node_offset, "interrupts", interrupts,
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 5a3122a9f9f9..91fedbf0929c 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1231,6 +1231,8 @@ static void spapr_populate_pci_child_dt(PCIDevice *dev, 
void *fdt, int offset,
 if (pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)) {
 _FDT(fdt_setprop_cell(fdt, offset, "interrupts",
  pci_default_read_config(dev, PCI_INTERRUPT_PIN, 1)));
+_FDT(fdt_appendprop_cell(fdt, offset, "interrupts",
+ SPAPR_DT_INTERRUPT_IDENTIFIER_LEVEL));
 }
 
 if (!is_bridge) {
@@ -2122,7 +2124,7 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
 irqmap[3] = cpu_to_be32(j+1);
 irqmap[4] = cpu_to_be32(xics_phandle);
 irqmap[5] = cpu_to_be32(phb->lsi_table[lsi_num].irq);
-irqmap[6] = cpu_to_be32(0x8);
+irqmap[6] = cpu_to_be32(SPAPR_DT_INTERRUPT_IDENTIFIER_LEVEL);
 }
 }
 /* Write interrupt map */
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index ea3bc8bd9e21..29a17651a17c 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -126,7 +126,8 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
 }
 
 if (dev->irq) {
-uint32_t ints_prop[] = {cpu_to_be32(dev->irq), 0};
+uint32_t ints_prop[] = { cpu_to_be32(dev->irq),
+ SPAPR_DT_INTERRUPT_IDENTIFIER_EDGE };
 
 ret = fdt_setprop(fdt, node_off, "interrupts", ints_prop,
   sizeof(ints_prop));
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 9d21ca9bde3a..8f6298bde59b 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -590,6 +590,9 @@ void spapr_load_rtas(sPAPRMachineState *spapr, void *fdt, 
hwaddr addr);
 
 #define RTAS_EVENT_SCAN_RATE1
 
+#define SPAPR_DT_INTERRUPT_IDENTIFIER_EDGE  0
+#define SPAPR_DT_INTERRUPT_IDENTIFIER_LEVEL 1
+
 typedef struct sPAPRTCETable sPAPRTCETable;
 
 #define TYPE_SPAPR_TCE_TABLE "spapr-tce-table"




Re: [Qemu-devel] [PATCH 11/25] spapr: describe the XIVE interrupt source flags

2017-12-02 Thread Cédric Le Goater
On 12/02/2017 03:48 PM, Benjamin Herrenschmidt wrote:
> On Sat, 2017-12-02 at 15:38 +0100, Cédric Le Goater wrote:
>> Hmm, yes. So, the current design for sPAPR handles all sources 
>> under the same XIVE object with a global memory region for all 
>> the ESBs. 
>>
>> The first RFC had a mechanism to register source objects into 
>> the XIVE main one, allocating the IRQs per source and mapping 
>> the ESBs in the overall region. A bit like OPAL does. I then 
>> simplified for the sake of clarity and merged everything under 
>> the same XIVE object. 
>>
>> Shall I reintroduce multiples sources support ? and provide a 
>> default one for IPIs and virtual devices of the machine. 
> 
> That or you need state bits ;-)

yeah. I started with state bits and thought I could hack the PQ 
ones but that's wrong. 

Thanks,

C. 




Re: [Qemu-devel] [PATCH 11/25] spapr: describe the XIVE interrupt source flags

2017-12-02 Thread Benjamin Herrenschmidt
On Sat, 2017-12-02 at 15:38 +0100, Cédric Le Goater wrote:
> Hmm, yes. So, the current design for sPAPR handles all sources 
> under the same XIVE object with a global memory region for all 
> the ESBs. 
> 
> The first RFC had a mechanism to register source objects into 
> the XIVE main one, allocating the IRQs per source and mapping 
> the ESBs in the overall region. A bit like OPAL does. I then 
> simplified for the sake of clarity and merged everything under 
> the same XIVE object. 
> 
> Shall I reintroduce multiples sources support ? and provide a 
> default one for IPIs and virtual devices of the machine. 

That or you need state bits ;-)

Cheers,
Ben.




Re: [Qemu-devel] [PATCH 10/25] spapr: add MMIO handlers for the XIVE interrupt sources

2017-12-02 Thread Cédric Le Goater
On 12/02/2017 03:28 PM, Benjamin Herrenschmidt wrote:
> On Wed, 2017-11-29 at 17:23 +0100, Cédric Le Goater wrote:
>> On 11/29/2017 02:56 PM, Cédric Le Goater wrote:
>>> +switch (offset) {
>>> +case 0:
>>> +spapr_xive_source_eoi(xive, lisn);
>>
>> Hrm.  I don't love that you're dealing with clearing that LSI bit
>> here, but setting it at a different level.
>>
>> The state machines are doing my head in a bit, is there any way
>> you could derive the STATUS_SENT bit from the PQ bits?
>
> Yes. I should. 
>
> I am also lacking a guest driver to exercise these LSIs so I didn't
> pay a lot of attention to level interrupts. Any idea ?

 How about an old-school emulated PCI device?  Maybe rtl8139?
>>>
>>> Perfect. The current model is working but I will see how I can 
>>> improve it to use the PQ bits instead.
>>
>> Using the PQ bits is simplifying the model but we still have to 
>> maintain an array to store the IRQ type. 
>>
>> There are 3 unused bits in the IVE descriptor, bits[1-3]:  
>>
>>   #define IVE_VALID   PPC_BIT(0)
>>   #define IVE_EQ_BLOCKPPC_BITMASK(4, 7)/* Destination EQ block# 
>> */
>>   #define IVE_EQ_INDEXPPC_BITMASK(8, 31)   /* Destination EQ index */
>>   #define IVE_MASKED  PPC_BIT(32)  /* Masked */
>>   #define IVE_EQ_DATA PPC_BITMASK(33, 63)  /* Data written to the EQ 
>> */
>>
>> We could hijack one of them to store the LSI type and get rid of 
>> the type array. Would you object to that ? 
> 
> This won't work well if/when you implement a real HW XIVE.
> 
> Another option is to have different source objects for LSIs and MSIs.

yes. Like for the PHB3 in PowerNV or in OPAL.

I will need to complexify the model a bit more with multiple source 
support like we did for PowerNV but that might be interesting for 
pass-through.

Thanks,

C.



Re: [Qemu-devel] [PATCH 14/25] spapr: push the XIVE EQ data in OS event queue

2017-12-02 Thread Benjamin Herrenschmidt
On Sat, 2017-12-02 at 08:45 -0600, Benjamin Herrenschmidt wrote:
> On Fri, 2017-12-01 at 15:10 +1100, David Gibson wrote:
> > 
> > Hm, ok.  Guest endian (or at least, not definitively host-endian) data
> > in a plain uint32_t makes me uncomfortable.  Could we use char data[4]
> > instead, to make it clear it's a byte-ordered buffer, rather than a
> > number as far as the XIVE is concerned.
> > 
> > Hm.. except that doesn't quite work, because the hardware must define
> > which end that generation bit ends up in...
> 
> It also needs to be written atomically. Just say it's big endian.

Also the guest reads it using be32_to_cpup...

> 
> Cheers,
> Ben.
> 
> > > > > +qemu_log_mask(LOG_GUEST_ERROR, "%s: failed to write EQ data 
> > > > > @0x%"
> > > > > +  HWADDR_PRIx "\n", __func__, qaddr);
> > > > > +return;
> > > > > +}
> > > > > +
> > > > > +qindex = (qindex + 1) % qentries;
> > > > > +if (qindex == 0) {
> > > > > +qgen ^= 1;
> > > > > +eq->w1 = SETFIELD(EQ_W1_GENERATION, eq->w1, qgen);
> > > > > +}
> > > > > +eq->w1 = SETFIELD(EQ_W1_PAGE_OFF, eq->w1, qindex);
> > > > > +}
> > > > > +
> > > > >  static void spapr_xive_irq(sPAPRXive *xive, int lisn)
> > > > >  {
> > > > > +XiveIVE *ive;
> > > > > +XiveEQ *eq;
> > > > > +uint32_t eq_idx;
> > > > > +uint8_t priority;
> > > > > +
> > > > > +ive = spapr_xive_get_ive(xive, lisn);
> > > > > +if (!ive || !(ive->w & IVE_VALID)) {
> > > > > +qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid LISN %d\n", 
> > > > > lisn);
> > > > 
> > > > As mentioned on other patches, I'm a little concerned by these
> > > > guest-triggerable logs.  I guess the LOG_GUEST_ERROR mask will save
> > > > us, though.
> > > 
> > > I want to track 'invalid' interrupts but I haven't seen these show up 
> > > in my tests. I agree there are a little too much and some could just 
> > > be asserts.
> > 
> > Uh.. I don't think many can be assert()s.  assert() is only
> > appropriate if it being tripped definitely indicates a bug in qemu.
> > Nearly all these qemu_log()s I've seen can be tripped by the guest
> > doing something bad, which absolutely should not assert() qemu.




Re: [Qemu-devel] [PATCH 14/25] spapr: push the XIVE EQ data in OS event queue

2017-12-02 Thread Benjamin Herrenschmidt
On Fri, 2017-12-01 at 15:10 +1100, David Gibson wrote:
> 
> Hm, ok.  Guest endian (or at least, not definitively host-endian) data
> in a plain uint32_t makes me uncomfortable.  Could we use char data[4]
> instead, to make it clear it's a byte-ordered buffer, rather than a
> number as far as the XIVE is concerned.
> 
> Hm.. except that doesn't quite work, because the hardware must define
> which end that generation bit ends up in...

It also needs to be written atomically. Just say it's big endian.

Cheers,
Ben.

> > >> +qemu_log_mask(LOG_GUEST_ERROR, "%s: failed to write EQ data 
> > >> @0x%"
> > >> +  HWADDR_PRIx "\n", __func__, qaddr);
> > >> +return;
> > >> +}
> > >> +
> > >> +qindex = (qindex + 1) % qentries;
> > >> +if (qindex == 0) {
> > >> +qgen ^= 1;
> > >> +eq->w1 = SETFIELD(EQ_W1_GENERATION, eq->w1, qgen);
> > >> +}
> > >> +eq->w1 = SETFIELD(EQ_W1_PAGE_OFF, eq->w1, qindex);
> > >> +}
> > >> +
> > >>  static void spapr_xive_irq(sPAPRXive *xive, int lisn)
> > >>  {
> > >> +XiveIVE *ive;
> > >> +XiveEQ *eq;
> > >> +uint32_t eq_idx;
> > >> +uint8_t priority;
> > >> +
> > >> +ive = spapr_xive_get_ive(xive, lisn);
> > >> +if (!ive || !(ive->w & IVE_VALID)) {
> > >> +qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid LISN %d\n", lisn);
> > > 
> > > As mentioned on other patches, I'm a little concerned by these
> > > guest-triggerable logs.  I guess the LOG_GUEST_ERROR mask will save
> > > us, though.
> > 
> > I want to track 'invalid' interrupts but I haven't seen these show up 
> > in my tests. I agree there are a little too much and some could just 
> > be asserts.
> 
> Uh.. I don't think many can be assert()s.  assert() is only
> appropriate if it being tripped definitely indicates a bug in qemu.
> Nearly all these qemu_log()s I've seen can be tripped by the guest
> doing something bad, which absolutely should not assert() qemu.



Re: [Qemu-devel] [PATCH 13/25] spapr: introduce the XIVE Event Queues

2017-12-02 Thread Benjamin Herrenschmidt
On Sat, 2017-12-02 at 08:39 -0600, Benjamin Herrenschmidt wrote:
> The IVEs and EQs are managed by the virtualization controller. The VPs
> (aka ENDs) 

typo. aka NVTs

> are managed by the presentation controller. There's a VP per
> real and virtual CPU.




Re: [Qemu-devel] [PATCH 15/25] spapr: notify the CPU when the XIVE interrupt priority is more privileged

2017-12-02 Thread Benjamin Herrenschmidt
On Thu, 2017-11-30 at 16:00 +1100, David Gibson wrote:
> 
> >  static uint64_t spapr_xive_icp_accept(sPAPRXiveICP *icp)
> >  {
> > -return 0;
> > +uint8_t nsr = icp->tima_os[TM_NSR];
> > +
> > +qemu_irq_lower(icp->output);
> > +
> > +if (icp->tima_os[TM_NSR] & TM_QW1_NSR_EO) {
> > +uint8_t cppr = icp->tima_os[TM_PIPR];
> > +
> > +icp->tima_os[TM_CPPR] = cppr;
> > +
> > +/* Reset the pending buffer bit */
> > +icp->tima_os[TM_IPB] &= ~priority_to_ipb(cppr);
> 
> What if multiple irqs of the same priority were queued?

It's the job of the OS to handle that case by consuming from the queue
until it's empty. There is an MMIO the guest can use if it wants to
that can set the IPB bits back to 1 for a given priority. Otherwise in
Linux we just have a SW way to force a replay.

> > +icp->tima_os[TM_PIPR] = ipb_to_pipr(icp->tima_os[TM_IPB]);
> > +
> > +/* Drop Exception bit for OS */
> > +icp->tima_os[TM_NSR] &= ~TM_QW1_NSR_EO;
> > +}
> > +
> > +return (nsr << 8) | icp->tima_os[TM_CPPR];
> > +}
> > +
> > +static void spapr_xive_icp_notify(sPAPRXiveICP *icp)
> > +{
> > +if (icp->tima_os[TM_PIPR] < icp->tima_os[TM_CPPR]) {
> > +icp->tima_os[TM_NSR] |= TM_QW1_NSR_EO;
> > +qemu_irq_raise(icp->output);
> > +}
> >  }
> >  
> >  static void spapr_xive_icp_set_cppr(sPAPRXiveICP *icp, uint8_t cppr)
> > @@ -51,6 +105,9 @@ static void spapr_xive_icp_set_cppr(sPAPRXiveICP *icp, 
> > uint8_t cppr)
> >  }
> >  
> >  icp->tima_os[TM_CPPR] = cppr;
> > +
> > +/* CPPR has changed, inform the ICP which might raise an exception */
> > +spapr_xive_icp_notify(icp);
> >  }
> >  
> >  /*
> > @@ -224,6 +281,8 @@ static void spapr_xive_irq(sPAPRXive *xive, int lisn)
> >  XiveEQ *eq;
> >  uint32_t eq_idx;
> >  uint8_t priority;
> > +uint32_t server;
> > +sPAPRXiveICP *icp;
> >  
> >  ive = spapr_xive_get_ive(xive, lisn);
> >  if (!ive || !(ive->w & IVE_VALID)) {
> > @@ -253,6 +312,13 @@ static void spapr_xive_irq(sPAPRXive *xive, int lisn)
> >  qemu_log_mask(LOG_UNIMP, "XIVE: !UCOND_NOTIFY not implemented\n");
> >  }
> >  
> > +server = GETFIELD(EQ_W6_NVT_INDEX, eq->w6);
> > +icp = spapr_xive_icp_get(xive, server);
> > +if (!icp) {
> > +qemu_log_mask(LOG_GUEST_ERROR, "XIVE: No ICP for server %d\n", 
> > server);
> > +return;
> > +}
> > +
> >  if (GETFIELD(EQ_W6_FORMAT_BIT, eq->w6) == 0) {
> >  priority = GETFIELD(EQ_W7_F0_PRIORITY, eq->w7);
> >  
> > @@ -260,9 +326,18 @@ static void spapr_xive_irq(sPAPRXive *xive, int lisn)
> >  if (priority == 0xff) {
> >  g_assert_not_reached();
> >  }
> > +
> > +/* Update the IPB (Interrupt Pending Buffer) with the priority
> > + * of the new notification and inform the ICP, which will
> > + * decide to raise the exception, or not, depending the CPPR.
> > + */
> > +icp->tima_os[TM_IPB] |= priority_to_ipb(priority);
> > +icp->tima_os[TM_PIPR] = ipb_to_pipr(icp->tima_os[TM_IPB]);
> >  } else {
> >  qemu_log_mask(LOG_UNIMP, "XIVE: w7 format1 not implemented\n");
> >  }
> > +
> > +spapr_xive_icp_notify(icp);
> >  }
> >  
> >  /*
> 
> 



Re: [Qemu-devel] [PATCH 13/25] spapr: introduce the XIVE Event Queues

2017-12-02 Thread Benjamin Herrenschmidt
On Thu, 2017-11-30 at 15:38 +1100, David Gibson wrote:
> On Thu, Nov 23, 2017 at 02:29:43PM +0100, Cédric Le Goater wrote:
> > The Event Queue Descriptor (EQD) table, also known as Event Notification
> > Descriptor (END), is one of the internal tables the XIVE interrupt
> > controller uses to redirect exception from event sources to CPU
> > threads.
> > 
> > The EQD specifies on which Event Queue the event data should be posted
> > when an exception occurs (later on pulled by the OS) and which server
> > (VPD in XIVE terminology) to notify. The Event Queue is a much more
> > complex structure but we start with a simple model for the sPAPR
> > machine.
> 
> Just to clarify my understanding a server / VPD in XIVE would
> typically correspond to a cpu - either real or virtual, yes?

The IVEs and EQs are managed by the virtualization controller. The VPs
(aka ENDs) are managed by the presentation controller. There's a VP per
real and virtual CPU.

You can think of the XIVE as having 3 main component types:


 - Source controller(s). There are some in the PHBs, one generic in the
XIVE itself, and one in the PSI bridge. Those contain the PQ bits and
thus the trigger & coalescing logic. They effectively shoot an MMIO to
the virtualization controller on events.

 - Virtualization controller (one per chip). This receives the above
MMIOs from the sources, manages the IVEs to get the target queue and
remap the number, and manages the queues. When a queue is enabled for
notification (or escalation) and such an event occurs, an MMIO goes to
the corresponding presentation controller.

 - Presentation controller (one per chip). This receives the above
notifications and sets as a result the IPB bits for one of the 8
priorities. Basically this guy tracks a single pending bit per priority
for each VP indicating whether there's something in the queue for that
priority and delivers interrupts to the core accordingly.


Now this is a simplified view. The PC supports groups but we don't
handle that yet, there are escalation interrupts, there are
redistribution mechanisms etc... but for now you get the basic idea.

Cheers,
Ben.




Re: [Qemu-devel] [PATCH 11/25] spapr: describe the XIVE interrupt source flags

2017-12-02 Thread Cédric Le Goater
On 12/02/2017 03:24 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2017-11-28 at 17:40 +1100, David Gibson wrote:
>>> @@ -368,6 +368,10 @@ static void spapr_xive_realize(DeviceState *dev, Error 
>>> **errp)
>>>  /* Allocate the IVT (Interrupt Virtualization Table) */
>>>  xive->ivt = g_malloc0(xive->nr_irqs * sizeof(XiveIVE));
>>>  
>>> +/* All sources are emulated under the XIVE object and share the
>>> + * same characteristic */
>>> +xive->flags = XIVE_SRC_TRIGGER;
>>
>> You never actually use this field.  And since it always has the same
>> value, is there a point to storing it?
> 
> Some HW sources don't have it, so with pass-through maybe...

Hmm, yes. So, the current design for sPAPR handles all sources 
under the same XIVE object with a global memory region for all 
the ESBs. 

The first RFC had a mechanism to register source objects into 
the XIVE main one, allocating the IRQs per source and mapping 
the ESBs in the overall region. A bit like OPAL does. I then 
simplified for the sake of clarity and merged everything under 
the same XIVE object. 

Shall I reintroduce multiples sources support ? and provide a 
default one for IPIs and virtual devices of the machine. 

Thanks,

C. 

 



Re: [Qemu-devel] [PATCH 10/25] spapr: add MMIO handlers for the XIVE interrupt sources

2017-12-02 Thread Benjamin Herrenschmidt
On Thu, 2017-11-30 at 15:28 +1100, David Gibson wrote:
> 
> How does this work at the hardware level?  Presumbly the actual
> hardware components don't communicate with the XIVE to request edge or
> level.  So how does it know?  Specific ranges for LSIs?  If that we
> should probably do the same.

So the source controller and the IVE are separate. The source
controller sends an internal MMIO to the IVE for "translating" the
event into a queue etc...

The IVE only see "events" which are effectively state transitions of
the P bit of the source.

The LSI vs MSI difference is thus entirely a property of the source
HW.

All the XIVE "Generic" built-in sources (the ones you can trigger with
an MMIO, which we use in KVM for all the IPIs and virtual interrupts)
are MSIs.

You find 2 kind of blocks of LSIs in the chip, the one PSI block which
has a handful or two of LSI sources for random "stuff" (LPC
interrupt(s), i2c interrupts etc..) and the LSI blocks which are in
each PHB.

So the PHB has basically two different bits of logic, one for LSIs and
one for MSIs. Their HW state machine is different.

In fact in the PHB and the PSI, I think, there's even an MMIO backdoor
register that allows you to see the "state" of the LSI (asserted).

Cheers,
Ben.




Re: [Qemu-devel] [PATCH 10/25] spapr: add MMIO handlers for the XIVE interrupt sources

2017-12-02 Thread Benjamin Herrenschmidt
On Wed, 2017-11-29 at 17:23 +0100, Cédric Le Goater wrote:
> On 11/29/2017 02:56 PM, Cédric Le Goater wrote:
> > > > > > +switch (offset) {
> > > > > > +case 0:
> > > > > > +spapr_xive_source_eoi(xive, lisn);
> > > > > 
> > > > > Hrm.  I don't love that you're dealing with clearing that LSI bit
> > > > > here, but setting it at a different level.
> > > > > 
> > > > > The state machines are doing my head in a bit, is there any way
> > > > > you could derive the STATUS_SENT bit from the PQ bits?
> > > > 
> > > > Yes. I should. 
> > > > 
> > > > I am also lacking a guest driver to exercise these LSIs so I didn't
> > > > pay a lot of attention to level interrupts. Any idea ?
> > > 
> > > How about an old-school emulated PCI device?  Maybe rtl8139?
> > 
> > Perfect. The current model is working but I will see how I can 
> > improve it to use the PQ bits instead.
> 
> Using the PQ bits is simplifying the model but we still have to 
> maintain an array to store the IRQ type. 
> 
> There are 3 unused bits in the IVE descriptor, bits[1-3]:  
> 
>   #define IVE_VALID   PPC_BIT(0)
>   #define IVE_EQ_BLOCKPPC_BITMASK(4, 7)/* Destination EQ block# */
>   #define IVE_EQ_INDEXPPC_BITMASK(8, 31)   /* Destination EQ index */
>   #define IVE_MASKED  PPC_BIT(32)  /* Masked */
>   #define IVE_EQ_DATA PPC_BITMASK(33, 63)  /* Data written to the EQ 
> */
> 
> We could hijack one of them to store the LSI type and get rid of 
> the type array. Would you object to that ? 

This won't work well if/when you implement a real HW XIVE.

Another option is to have different source objects for LSIs and MSIs.

Cheers,
Ben.
> 
> C.



Re: [Qemu-devel] [PATCH 09/25] spapr: introduce handlers for XIVE interrupt sources

2017-12-02 Thread Benjamin Herrenschmidt
On Tue, 2017-11-28 at 18:18 +, Cédric Le Goater wrote:
> AFAICT, it doesn't. LSI events are configured as the other XIVE interrupts. 
> The level is converted in the P bit and the Q bit should always be zero.
> So I should be able to simplify the proposed model which still is mimicking 
> XICS  ... I will take a look at it. 
> 
> There are a sort of special degenerated LSIs but these are for bringup.

Not really. So for MSIs you don't need your state flags.

For LSIs, you do need the "asserted" one that keeps track of the LSI
input state. See my other note with the actual states.

Cheers,
Ben.




Re: [Qemu-devel] [PATCH 11/25] spapr: describe the XIVE interrupt source flags

2017-12-02 Thread Benjamin Herrenschmidt
On Tue, 2017-11-28 at 17:40 +1100, David Gibson wrote:
> > @@ -368,6 +368,10 @@ static void spapr_xive_realize(DeviceState *dev, Error 
> > **errp)
> >  /* Allocate the IVT (Interrupt Virtualization Table) */
> >  xive->ivt = g_malloc0(xive->nr_irqs * sizeof(XiveIVE));
> >  
> > +/* All sources are emulated under the XIVE object and share the
> > + * same characteristic */
> > +xive->flags = XIVE_SRC_TRIGGER;
> 
> You never actually use this field.  And since it always has the same
> value, is there a point to storing it?

Some HW sources don't have it, so with pass-through maybe...

Cheers,
Ben




Re: [Qemu-devel] [PATCH 10/25] spapr: add MMIO handlers for the XIVE interrupt sources

2017-12-02 Thread Benjamin Herrenschmidt
On Tue, 2017-11-28 at 17:38 +1100, David Gibson wrote:
> Hrm.  I don't love that you're dealing with clearing that LSI bit
> here, but setting it at a different level.
> 
> The state machines are doing my head in a bit, is there any way
> you could derive the STATUS_SENT bit from the PQ bits?

Yeah it should be...

So you should normally need only one extra bit of state for LSI which
is whether it's asserted or not and no extra bit of state for MSIs.

P is basically "sent". Q is whether another event has been queued up
(and is only meaningful for MSIs though the 01 combination will mask
LSIs too).

The state logic should be for MSIs on event:

- if PQ=01 ignore (masked)
- if P=1, set Q and finish
- set P=1 and forward event to IVE

For EOI (load and store):

- if PQ=01 ignore
- P=Q, Q=0
- (storeEOI only) if new P=1, forward event to IVE
 
For LSIs, and "event" is whenever the state is asserted, and Q is
meaningless, so basically on every change of state or ESB:

- if PQ=01 ignore (masked)
- if P=1 finish
- set P=1 and forward event to IVE

For EOI (load and store):

- if PQ=01 ignore
- clear P
- re-evaluate as above if asserted

Cheers,
Ben.




Re: [Qemu-devel] [PATCH v2 6/7] maint: Fix macros with broken 'do/while(0); ' usage

2017-12-02 Thread Juan Quintela
Eric Blake  wrote:
> The point of writing a macro embedded in a 'do { ... } while (0)'
> loop (particularly if the macro has multiple statements or would
> otherwise end with an 'if' statement) is so that the macro can be
> used as a drop-in statement with the caller supplying the
> trailing ';'.  Although our coding style frowns on brace-less 'if':
>   if (cond)
> statement;
>   else
> something else;
> that is the classic case where failure to use do/while(0) wrapping
> would cause the 'else' to pair with any embedded 'if' in the macro
> rather than the intended outer 'if'.  But conversely, if the macro
> includes an embedded ';', then the same brace-less coding style
> would now have two statements, making the 'else' a syntax error
> rather than pairing with the outer 'if'.  Thus, even though our
> coding style with required braces is not impacted, ending a macro
> with ';' makes our code harder to port to projects that use
> brace-less styles.
>
> The change should have no semantic impact.  I was not able to
> fully compile-test all of the changes (as some of them are
> examples of the ugly bit-rotting debug print statements that are
> completely elided by default, and I didn't want to recompile
> with the necessary -D witnesses - cleaning those up is left as a
> bite-sized task for another day); I did, however, audit that for
> all files touched, all callers of the changed macros DID supply
> a trailing ';' at the callsite, and did not appear to be used
> as part of a brace-less conditional.
>
> Found mechanically via: $ git grep -B1 'while (0);' | grep -A1 
>
> Signed-off-by: Eric Blake 
> Acked-by: Cornelia Huck 
> Reviewed-by: Michael S. Tsirkin 
> Acked-by: Dr. David Alan Gilbert 

Reviewed-by: Juan Quintela 



Re: [Qemu-devel] [PATCH 1/1] main loop: remove useless code

2017-12-02 Thread Peter Maydell
On 2 December 2017 at 07:41, FelixYao  wrote:
> hi Paolo Bonzini:
>
> Those codes seem useless, Could it be removed?
>
> Signed-off-by: FelixYao 
> ---
>  vl.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/vl.c b/vl.c
> index 1ad1c04..5bed4c2 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2995,10 +2995,6 @@ static void set_memory_options(uint64_t *ram_slots, 
> ram_addr_t *maxram_size,
>
>  sz = QEMU_ALIGN_UP(sz, 8192);
>  ram_size = sz;
> -if (ram_size != sz) {
> -error_report("ram size too large");
> -exit(EXIT_FAILURE);
> -}

ram_size is a ramaddr_t, which may be a 32-bit variable, whereas
sz is a uint64_t. This check is making sure that the ram size
specified can actually fit in a ramaddr_t.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 1/1] vhost-scsi: add missing virtqueue_size parameter

2017-12-02 Thread Richard W.M. Jones
On Fri, Dec 01, 2017 at 04:15:38PM +0100, Eric Farman wrote:
> Commit 5c0919d02066 ("virtio-scsi: Add virtqueue_size parameter allowing
> virtqueue size to be set.") introduced a new parameter to virtio-scsi.
> Later, commit 920036106044 ("vhost-user-scsi: add missing virtqueue_size
> param") added that parameter to the new vhost-user-scsi interface but
> neglected the existing vhost-scsi interface it was built on.
> 
> Apply the same change to vhost-scsi, so that we can boot a guest with
> a device defined.  This also avoids crashing a guest when hotplugging
> a vhost-scsi device.
> 
> Signed-off-by: Eric Farman 
> ---
>  hw/scsi/vhost-scsi.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
> index cd4ab05233..9c1bea8ff3 100644
> --- a/hw/scsi/vhost-scsi.c
> +++ b/hw/scsi/vhost-scsi.c
> @@ -233,6 +233,8 @@ static Property vhost_scsi_properties[] = {
>  DEFINE_PROP_STRING("wwpn", VirtIOSCSICommon, conf.wwpn),
>  DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0),
>  DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, 1),
> +DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, 
> conf.virtqueue_size,
> +   128),
>  DEFINE_PROP_UINT32("max_sectors", VirtIOSCSICommon, conf.max_sectors,
> 0x),
>  DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSICommon, conf.cmd_per_lun, 
> 128),

Yes, this fix is fine.  Sorry for breaking it :-/

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html