date:20160628

Re: [Qemu-devel] [PATCH v4 00/24] target-sparc improvements

2016-06-28 Thread Richard Henderson


On 06/28/2016 03:44 PM, Mark Cave-Ayland wrote:

I didn't see the branch rebase onto aa8151b7df here, although I was able
to manually rebase the tgt-sparc-2 branch onto git master and build
without issues.


I pushed it to tcg-sparc if you wanted to see my branch.


r~

[Qemu-devel] [Bug 1597138] Re: Deadlock on Windows 10 pop-up

2016-06-28 Thread Shannon Barber

Removing the soundhw hda device prevents the deadlock.

Below is my QEMU start-up command line:

qemu-system-x86_64 \
-enable-kvm \
-m 8192 \
-drive if=pflash,format=raw,readonly,file=./ovmf-x64/OVMF-pure-efi.fd \
-drive if=pflash,format=raw,file=./OVMF-pure-efi-Win10.fd \
-drive file=/dev/Stuff/Windows10,format=raw,if=virtio,cache=none \
-drive file=virtio-win.iso,id=virtiocd,if=none,format=raw -device ide-cd,bus=ide.1,drive=virtiocd \
-device vfio-pci,host=01:00.0,addr=09.0,multifunction=on,x-vga=on \
-device vfio-pci,host=01:00.1,addr=09.1 \
-usb -usbdevice host:003.006 \
-cpu core2duo,+nx,kvm=off \
-vga none \
-soundhw hda

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1597138

Title:
  Deadlock on Windows 10 pop-up

Status in QEMU:
  New

Bug description:
  I was able to install and can log in, but whenever a pop-up is attempted
  the VM appears to deadlock.
  I can still kill -9 the process and recover, but the VM and the QEMU
  console both hang with no error output.

  At first I thought it was UAC, but renaming a file causes a pop-up and
  that also deadlocks.
  I rebuilt QEMU 2.6.0 with debug info and took a thread back-trace once
  the deadlock occurred.
  See the attachment for the trace.

  I am attempting to set up GPU pass-thru with a GTX 970, but this
  deadlock occurs with -vga std (and no GPU pass-thru) as well.

  (I cannot install or start Windows 7 but I am told this is a known
  bug.)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1597138/+subscriptions

Re: [Qemu-devel] [PATCH v2 5/8] ppc/xics: Make the ICSState a list

2016-06-28 Thread Benjamin Herrenschmidt

On Wed, 2016-06-29 at 13:37 +1000, David Gibson wrote:
> AFAICT xirr_owner will be lost on migration, which will break things.
> That will need to be transferred on migration, somehow.  If it can be
> recalculated from existing data in post_load() that would be ideal,
> otherwise we'll have to devise a wire encoding for it.

It should be possible to get it back from the interrupt number by
walking the list of ICS yes.
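
A minimal sketch of that lookup, assuming each ICS serves a contiguous
range of interrupt numbers starting at some offset. The struct and the
function name are simplified stand-ins for illustration, not the actual
QEMU types:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an interrupt source controller. */
struct ics {
    int offset;        /* first interrupt number served by this source */
    int nr_irqs;       /* number of interrupts it serves */
    struct ics *next;  /* next source in the list */
};

/* Walk the list and return the source that owns irq, or NULL if none does. */
static struct ics *ics_find_owner(struct ics *head, int irq)
{
    for (struct ics *ics = head; ics != NULL; ics = ics->next) {
        if (irq >= ics->offset && irq < ics->offset + ics->nr_irqs) {
            return ics;
        }
    }
    return NULL;
}
```

A post_load() hook could run such a walk on the pending interrupt number
to rebuild the owner pointer, so nothing extra would need to go on the wire.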

Cheers,
Ben.

[Qemu-devel] [RFC PATCH] armv7m_nvic: Use qemu_get_cpu(0) instead of current_cpu

2016-06-28 Thread Andrey Smirnov

Starting QEMU with -S results in current_cpu containing its initial
value of NULL. It is however possible to connect to such QEMU instance
and query various CPU registers, one example being CPUID, and doing that
results in QEMU segfaulting.

Using qemu_get_cpu(0) seems reasonable enough given that ARMv7M
architecture is a single core architecture.
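
The failure mode can be sketched in isolation. This is only a toy model
under stated assumptions: struct cpu, get_cpu() and read_cpuid() are
hypothetical names, and the register value is arbitrary. The point is that
a lookup by index works even when the "currently running CPU" pointer is
still NULL:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the CPU bookkeeping; not the QEMU structures. */
struct cpu {
    uint32_t midr;  /* value a CPUID register read should return */
};

static struct cpu cpus[1] = { { 0x12345678 } };  /* arbitrary example value */

/* Set only while a CPU is executing guest code; NULL otherwise, e.g. when
 * the machine was started paused and a debugger inspects registers. */
struct cpu *current_cpu = NULL;

/* Look a CPU up by index; valid whether or not anything is running. */
static struct cpu *get_cpu(int index)
{
    return index == 0 ? &cpus[0] : NULL;
}

/* Register read that does not depend on current_cpu being set. */
static uint32_t read_cpuid(void)
{
    struct cpu *cpu = get_cpu(0);
    return cpu->midr;
}
```

Dereferencing current_cpu in read_cpuid() instead would crash in the
paused case, which is the segfault described above.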

Signed-off-by: Andrey Smirnov 
---
 hw/intc/armv7m_nvic.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 890d5d7..06d8db6 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -187,11 +187,11 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 case 0x1c: /* SysTick Calibration Value.  */
 return 1;
 case 0xd00: /* CPUID Base.  */
-cpu = ARM_CPU(current_cpu);
+cpu = ARM_CPU(qemu_get_cpu(0));
 return cpu->midr;
 case 0xd04: /* Interrupt Control State.  */
 /* VECTACTIVE */
-cpu = ARM_CPU(current_cpu);
+cpu = ARM_CPU(qemu_get_cpu(0));
 val = cpu->env.v7m.exception;
 if (val == 1023) {
 val = 0;
@@ -222,7 +222,7 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 val |= (1 << 31);
 return val;
 case 0xd08: /* Vector Table Offset.  */
-cpu = ARM_CPU(current_cpu);
+cpu = ARM_CPU(qemu_get_cpu(0));
 return cpu->env.v7m.vecbase;
 case 0xd0c: /* Application Interrupt/Reset Control.  */
 return 0xfa05;
@@ -349,7 +349,7 @@ static void nvic_writel(nvic_state *s, uint32_t offset, uint32_t value)
 }
 break;
 case 0xd08: /* Vector Table Offset.  */
-cpu = ARM_CPU(current_cpu);
+cpu = ARM_CPU(qemu_get_cpu(0));
 cpu->env.v7m.vecbase = value & 0xff80;
 break;
 case 0xd0c: /* Application Interrupt/Reset Control.  */
-- 
2.5.5

[Qemu-devel] [RFC PATCH] exec: Support non-direct memory writes in cpu_memory_rw_debug

2016-06-28 Thread Andrey Smirnov

Add code to support writing to memory mapped peripherals via
cpu_memory_rw_debug(). The code of that function already supports
reading from such memory regions, so this commit makes that
functionality "symmetric".

One use-case for that functionality is setting various registers of a
non-running CPU. A concrete example would be starting QEMU emulating
Cortex-M with -S, connecting with GDB and modifying the value of Vector
Table Offset register.
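
As a rough illustration of what "symmetric" means here, the sketch below
dispatches a debug write either straight into RAM or through a device
callback. Everything in it (struct region, debug_write(), the example
register device) is a hypothetical model, not the QEMU memory API:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical region model: backed by RAM or by a device write callback. */
struct region {
    uint8_t *ram;                                       /* non-NULL for RAM */
    void (*write)(void *dev, size_t off, uint8_t val);  /* used for MMIO */
    void *dev;
};

/* Example device: a 4-byte register filled in by byte-wide writes. */
struct reg_dev {
    uint8_t bytes[4];
};

static void reg_dev_write(void *dev, size_t off, uint8_t val)
{
    struct reg_dev *d = dev;
    if (off < sizeof(d->bytes)) {
        d->bytes[off] = val;
    }
}

/* Debug write: copy directly into RAM, otherwise go through the device
 * callback one byte at a time. Returns 0 on success, -1 if unwritable. */
static int debug_write(struct region *r, size_t off,
                       const uint8_t *buf, size_t len)
{
    if (r->ram != NULL) {
        memcpy(r->ram + off, buf, len);
        return 0;
    }
    if (r->write == NULL) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        r->write(r->dev, off + i, buf[i]);
    }
    return 0;
}
```

With only the RAM branch, a debugger could read a peripheral register but
not set it; the callback branch is what closes that gap in this model.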

Signed-off-by: Andrey Smirnov 
---
 cpus.c  |  2 +-
 disas.c |  4 ++--
 exec.c  | 57 -
 gdbstub.c   | 10 
 hw/i386/kvmvapic.c  | 18 +++---
 hw/mips/mips_jazz.c |  2 +-
 hw/pci-host/prep.c  |  2 +-
 hw/virtio/virtio.c  |  2 +-
 include/exec/cpu-all.h  |  2 +-
 include/exec/memory.h   | 15 +---
 include/exec/softmmu-semi.h | 16 ++---
 ioport.c|  6 ++---
 monitor.c   |  2 +-
 target-arm/arm-semi.c   |  2 +-
 target-arm/kvm64.c  |  8 +++
 target-i386/helper.c|  6 ++---
 target-i386/kvm.c   |  8 +++
 target-ppc/kvm.c|  8 +++
 target-s390x/kvm.c  |  8 +++
 target-xtensa/xtensa-semi.c |  6 ++---
 20 files changed, 100 insertions(+), 84 deletions(-)

diff --git a/cpus.c b/cpus.c
index 84c3520..14f0f4f 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1691,7 +1691,7 @@ void qmp_memsave(int64_t addr, int64_t size, const char *filename,
 l = sizeof(buf);
 if (l > size)
 l = size;
-if (cpu_memory_rw_debug(cpu, addr, buf, l, 0) != 0) {
+if (cpu_memory_rw_debug(cpu, addr, buf, l, MEMTX_READ) != 0) {
 error_setg(errp, "Invalid addr 0x%016" PRIx64 "/size %" PRId64
  " specified", orig_addr, orig_size);
 goto exit;
diff --git a/disas.c b/disas.c
index 05a7a12..8ceeedb 100644
--- a/disas.c
+++ b/disas.c
@@ -39,7 +39,7 @@ target_read_memory (bfd_vma memaddr,
 {
 CPUDebug *s = container_of(info, CPUDebug, info);
 
-cpu_memory_rw_debug(s->cpu, memaddr, myaddr, length, 0);
+cpu_memory_rw_debug(s->cpu, memaddr, myaddr, length, MEMTX_READ);
 return 0;
 }
 
@@ -358,7 +358,7 @@ monitor_read_memory (bfd_vma memaddr, bfd_byte *myaddr, int length,
 if (monitor_disas_is_physical) {
 cpu_physical_memory_read(memaddr, myaddr, length);
 } else {
-cpu_memory_rw_debug(s->cpu, memaddr, myaddr, length, 0);
+cpu_memory_rw_debug(s->cpu, memaddr, myaddr, length, MEMTX_READ);
 }
 return 0;
 }
diff --git a/exec.c b/exec.c
index 0122ef7..048d3d0 100644
--- a/exec.c
+++ b/exec.c
@@ -2219,7 +2219,7 @@ static MemTxResult subpage_write(void *opaque, hwaddr addr,
 abort();
 }
 return address_space_write(subpage->as, addr + subpage->base,
-   attrs, buf, len);
+   attrs, buf, len, false);
 }
 
 static bool subpage_accepts(void *opaque, hwaddr addr,
@@ -2436,7 +2436,7 @@ MemoryRegion *get_system_io(void)
 /* physical memory access (slow version, mainly for debug) */
 #if defined(CONFIG_USER_ONLY)
 int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
-uint8_t *buf, int len, int is_write)
+uint8_t *buf, int len, MemTxType type)
 {
 int l, flags;
 target_ulong page;
@@ -2450,7 +2450,8 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
 flags = page_get_flags(page);
 if (!(flags & PAGE_VALID))
 return -1;
-if (is_write) {
+if (type == MEMTX_WRITE ||
+type == MEMTX_PROGRAM) {
 if (!(flags & PAGE_WRITE))
 return -1;
 /* XXX: this code should not depend on lock_user */
@@ -2552,7 +2553,8 @@ static MemTxResult address_space_write_continue(AddressSpace *as, hwaddr addr,
 MemTxAttrs attrs,
 const uint8_t *buf,
 int len, hwaddr addr1,
-hwaddr l, MemoryRegion *mr)
+hwaddr l, MemoryRegion *mr,
+bool force)
 {
 uint8_t *ptr;
 uint64_t val;
@@ -2560,7 +2562,14 @@ static MemTxResult address_space_write_continue(AddressSpace *as, hwaddr addr,
 bool release_lock = false;
 
 for (;;) {
-if (!memory_access_is_direct(mr, true)) {
+
+if (memory_access_is_direct(mr, true) ||
+(force && memory_region_is_romd(mr))) {
+/* RAM case */
+ptr = qemu_map_ram_ptr(mr->ram_block, addr1);
+memcpy(ptr, buf, l);
+invalidate_and_set_dirty(mr, addr1, l);
+} else {

[Qemu-devel] [Bug 1597138] [NEW] Deadlock on Windows 10 pop-up

2016-06-28 Thread Shannon Barber

Public bug reported:

I was able to install and can log in, but whenever a pop-up is attempted
the VM appears to deadlock.
I can still kill -9 the process and recover, but the VM and the QEMU
console both hang with no error output.

At first I thought it was UAC, but renaming a file causes a pop-up and
that also deadlocks.
I rebuilt QEMU 2.6.0 with debug info and took a thread back-trace once
the deadlock occurred.
See the attachment for the trace.

I am attempting to set up GPU pass-thru with a GTX 970, but this deadlock
occurs with -vga std (and no GPU pass-thru) as well.

(I cannot install or start Windows 7 but I am told this is a known bug.)

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "qemu_deadlock_bt.txt"
   
https://bugs.launchpad.net/bugs/1597138/+attachment/4691950/+files/qemu_deadlock_bt.txt

Re: [Qemu-devel] [PATCH v2 5/8] ppc/xics: Make the ICSState a list

2016-06-28 Thread David Gibson

On Wed, Jun 29, 2016 at 12:35:16AM +0530, Nikunj A Dadhania wrote:
> From: Benjamin Herrenschmidt 
> 
> Instead of an array of fixed sized blocks, use a list, as we will need
> to have sources with variable number of interrupts. SPAPR only uses
> a single entry. Native will create more. If performance becomes an
> issue we can add some hashed lookup but for now this will do fine.
> 
> Signed-off-by: Benjamin Herrenschmidt 
> [ move the initialization of list to xics_common_initfn ]
> Signed-off-by: Nikunj A Dadhania 

AFAICT xirr_owner will be lost on migration, which will break things.
That will need to be transferred on migration, somehow.  If it can be
recalculated from existing data in post_load() that would be ideal,
otherwise we'll have to devise a wire encoding for it.

> ---
>  hw/intc/trace-events  |  4 +--
>  hw/intc/xics.c| 83 
>  hw/intc/xics_kvm.c| 27 +++-
>  hw/intc/xics_spapr.c  | 88 
> +--
>  hw/ppc/spapr_events.c |  2 +-
>  hw/ppc/spapr_pci.c|  5 ++-
>  hw/ppc/spapr_vio.c|  2 +-
>  include/hw/ppc/xics.h | 13 
>  8 files changed, 138 insertions(+), 86 deletions(-)
> 
> diff --git a/hw/intc/trace-events b/hw/intc/trace-events
> index 376dd18..5f0f783 100644
> --- a/hw/intc/trace-events
> +++ b/hw/intc/trace-events
> @@ -56,8 +56,8 @@ xics_set_irq_lsi(int srcno, int nr) "set_irq_lsi: srcno %d [irq %#x]"
>  xics_ics_write_xive(int nr, int srcno, int server, uint8_t priority) "ics_write_xive: irq %#x [src %d] server %#x prio %#x"
>  xics_ics_reject(int nr, int srcno) "reject irq %#x [src %d]"
>  xics_ics_eoi(int nr) "ics_eoi: irq %#x"
> -xics_alloc(int src, int irq) "source#%d, irq %d"
> -xics_alloc_block(int src, int first, int num, bool lsi, int align) "source#%d, first irq %d, %d irqs, lsi=%d, alignnum %d"
> +xics_alloc(int irq) "irq %d"
> +xics_alloc_block(int first, int num, bool lsi, int align) "first irq %d, %d irqs, lsi=%d, alignnum %d"
>  xics_ics_free(int src, int irq, int num) "Source#%d, first irq %d, %d irqs"
>  xics_ics_free_warn(int src, int irq) "Source#%d, irq %d is already free"
>  
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index cd48f42..5148bdf 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -96,13 +96,16 @@ void xics_cpu_setup(XICSState *xics, PowerPCCPU *cpu)
>  static void xics_common_reset(DeviceState *d)
>  {
>  XICSState *xics = XICS_COMMON(d);
> +ICSState *ics;
>  int i;
>  
>  for (i = 0; i < xics->nr_servers; i++) {
>  device_reset(DEVICE(&xics->ss[i]));
>  }
>  
> -device_reset(DEVICE(xics->ics));
> +QLIST_FOREACH(ics, &xics->ics, list) {
> +device_reset(DEVICE(ics));
> +}
>  }
>  
>  static void xics_prop_get_nr_irqs(Object *obj, Visitor *v, const char *name,
> @@ -134,7 +137,6 @@ static void xics_prop_set_nr_irqs(Object *obj, Visitor *v, const char *name,
>  }
>  
>  assert(info->set_nr_irqs);
> -assert(xics->ics);
>  info->set_nr_irqs(xics, value, errp);
>  }
>  
> @@ -174,6 +176,9 @@ static void xics_prop_set_nr_servers(Object *obj, Visitor *v,
>  
>  static void xics_common_initfn(Object *obj)
>  {
> +XICSState *xics = XICS_COMMON(obj);
> +
> +QLIST_INIT(&xics->ics);
>  object_property_add(obj, "nr_irqs", "int",
>  xics_prop_get_nr_irqs, xics_prop_set_nr_irqs,
>  NULL, NULL, NULL);
> @@ -212,33 +217,35 @@ static void ics_reject(ICSState *ics, int nr);
>  static void ics_resend(ICSState *ics);
>  static void ics_eoi(ICSState *ics, int nr);
>  
> -static void icp_check_ipi(XICSState *xics, int server)
> +static void icp_check_ipi(ICPState *ss)
>  {
> -ICPState *ss = xics->ss + server;
> -
>  if (XISR(ss) && (ss->pending_priority <= ss->mfrr)) {
>  return;
>  }
>  
> -trace_xics_icp_check_ipi(server, ss->mfrr);
> +trace_xics_icp_check_ipi(ss->cs->cpu_index, ss->mfrr);
>  
> -if (XISR(ss)) {
> -ics_reject(xics->ics, XISR(ss));
> +if (XISR(ss) && ss->xirr_owner) {
> +ics_reject(ss->xirr_owner, XISR(ss));
>  }
>  
>  ss->xirr = (ss->xirr & ~XISR_MASK) | XICS_IPI;
>  ss->pending_priority = ss->mfrr;
> +ss->xirr_owner = NULL;
>  qemu_irq_raise(ss->output);
>  }
>  
>  static void icp_resend(XICSState *xics, int server)
>  {
>  ICPState *ss = xics->ss + server;
> +ICSState *ics;
>  
>  if (ss->mfrr < CPPR(ss)) {
> -icp_check_ipi(xics, server);
> +icp_check_ipi(ss);
> +}
> +QLIST_FOREACH(ics, &xics->ics, list) {
> +ics_resend(ics);
>  }
> -ics_resend(xics->ics);
>  }
>  
>  void icp_set_cppr(XICSState *xics, int server, uint8_t cppr)
> @@ -256,7 +263,10 @@ void icp_set_cppr(XICSState *xics, int server, uint8_t cppr)
>  ss->xirr &= ~XISR_MASK; /* Clear XISR */
>

Re: [Qemu-devel] [PATCH v2 0/8] sPAPR xics rework/cleanup

2016-06-28 Thread David Gibson

On Wed, Jun 29, 2016 at 12:35:11AM +0530, Nikunj A Dadhania wrote:
> sPAPR xics related changes required for powernv platform. This brings
> infrastructure to get the xics native mode for powernv. Tested pseries guests
> in KVM and TCG mode.
> 
> Changelog v1:
>  * Change XICS to XICS_SPAPR and KVM_XICS to XICS_KVM_SPAPR
>  * Added xics_ to function get_cpu_index_by_dt_id as this is a global symbol
>  * Dropped server parameter from  icp_check_ipi
>  * Send HW_ERROR when ics is NULL
>  * Remove redundant parameters in trace routines
>  * Use type ICS_SIMPLE, ICS_BASE and ICS_KVM
>  * Dropped xics-native and info pic patches for this version
> 
> ToDo:
>  + Use ICPNative and XICSNative in "native" implementation
>  + xics_spapr_alloc - getting rid of that
>  + xirr_owner - how to reassign after migration
> 
> Benjamin Herrenschmidt (8):
>   ppc/xics: Rename existing xics to xics_spapr
>   ppc/xics: Move SPAPR specific code to a separate file
>   ppc/xics: Implement H_IPOLL using an accessor
>   ppc/xics: Replace "icp" with "xics" in most places
>   ppc/xics: Make the ICSState a list
>   ppc/xics: An ICS with offset 0 is assumed to be uninitialized
>   ppc/xics: Use a helper to add a new ICS
>   ppc/xics: Split ICS into ics-base and ics class
> 
>  default-configs/ppc64-softmmu.mak |   1 +
>  hw/intc/Makefile.objs |   1 +
>  hw/intc/trace-events  |  14 +-
>  hw/intc/xics.c| 724 
> +++---
>  hw/intc/xics_kvm.c|  92 +++--
>  hw/intc/xics_spapr.c  | 460 
>  hw/ppc/spapr.c|  19 +-
>  hw/ppc/spapr_cpu_core.c   |   4 +-
>  hw/ppc/spapr_events.c |   8 +-
>  hw/ppc/spapr_pci.c|  12 +-
>  hw/ppc/spapr_vio.c|   2 +-
>  include/hw/pci-host/spapr.h   |   2 +-
>  include/hw/ppc/spapr.h|   2 +-
>  include/hw/ppc/spapr_vio.h|   2 +-
>  include/hw/ppc/xics.h |  79 +++--
>  15 files changed, 803 insertions(+), 619 deletions(-)
>  create mode 100644 hw/intc/xics_spapr.c

I've put 1-4/8 into ppc-for-2.7.  5/8, unfortunately will break
migration and 6-8/8 don't make much sense without 5/8, so I've left
them for now.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 1/2] ppc: Add proper real mode translation support

2016-06-28 Thread David Gibson

On Wed, Jun 29, 2016 at 12:59:05PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2016-06-29 at 12:41 +1000, David Gibson wrote:
> > > +    /* Actually we don't support unbounded RMA anymore since we
> > > +     * added proper emulation of HV mode. The max we can get is
> > > +     * 16G which also happens to be what we configure for PAPR
> > > +     * mode so make sure we don't do anything bigger than that
> > > +     */
> > > +    spapr->rma_size = MIN(spapr->rma_size, 0x400000000ull);
> > 
> > #1 - Instead of the various KVM / non-KVM cases here, it might be
> > simpler to just always clamp the RMA to 256MiB.
> 
> That would be sad ... we benefit from having a larger RMA..

Ah, ok.  Let's leave it as is, then.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] [PULL 10/10] mirror: fix misleading comments

2016-06-28 Thread Jeff Cody

From: Changlong Xie 

s/target bs/to_replace/. Also, the check that the to_replace bs is not
blocked happens in qmp_drive_mirror(), not here.

Signed-off-by: Changlong Xie 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Jeff Cody 
Message-id: 1466672241-22485-3-git-send-email-xiecl.f...@cn.fujitsu.com
Signed-off-by: Jeff Cody 
---
 block/mirror.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index 5bac906..8d96049 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -775,7 +775,7 @@ static void mirror_complete(BlockJob *job, Error **errp)
 }
 }
 
-/* check the target bs is not blocked and block all operations on it */
+/* block all operations on to_replace bs */
 if (s->replaces) {
 AioContext *replace_aio_context;
 
-- 
1.9.3

[Qemu-devel] [PULL 09/10] blockjob: assert(cb) when create job

2016-06-28 Thread Jeff Cody

From: Changlong Xie 

Callback for block job should always exist

Suggested-by: Paolo Bonzini 
Suggested-by: Kevin Wolf 
Signed-off-by: Changlong Xie 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Jeff Cody 
Message-id: 1466672241-22485-2-git-send-email-xiecl.f...@cn.fujitsu.com
Signed-off-by: Jeff Cody 
---
 block/backup.c | 1 -
 blockjob.c | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/backup.c b/block/backup.c
index 581269b..f87f8d5 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -489,7 +489,6 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
 
 assert(bs);
 assert(target);
-assert(cb);
 
 if (bs == target) {
 error_setg(errp, "Source and target cannot be the same");
diff --git a/blockjob.c b/blockjob.c
index 90c4e26..205da9d 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -110,6 +110,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
 BlockBackend *blk;
 BlockJob *job;
 
+assert(cb);
 if (bs->job) {
 error_setg(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
 return NULL;
-- 
1.9.3

[Qemu-devel] [PULL 07/10] mirror: limit niov to IOV_MAX elements, again

2016-06-28 Thread Jeff Cody

From: John Snow 

During the refactor of mirror_iteration in e5b43573,
we regressed the fix introduced in cae98cb8.

This patch re-adds IOV_MAX checking to cases where we
aren't checking alignment (and size) already.
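
The clamp itself is small. A minimal sketch, with illustrative parameter
names and the assumption that each chunk of the request needs one I/O
vector element:

```c
/* Clamp a request so it needs at most max_iov chunk-sized buffers.
 * All names are illustrative. */
static int clamp_to_iov_limit(int nb_sectors, int sectors_per_chunk,
                              int max_iov)
{
    int max_sectors = sectors_per_chunk * max_iov;

    return nb_sectors < max_sectors ? nb_sectors : max_sectors;
}
```

For example, with 16 sectors per chunk and an assumed limit of 1024
vector elements, a request is capped at 16 * 1024 = 16384 sectors.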

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Fam Zheng 
Message-id: 1466625064-11280-3-git-send-email-js...@redhat.com
Signed-off-by: Jeff Cody 
---
 block/mirror.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 42ebc3b..5bac906 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -231,11 +231,14 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
 int sectors_per_chunk, nb_chunks;
 int ret;
 MirrorOp *op;
+int max_sectors;
 
 sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
+max_sectors = sectors_per_chunk * s->max_iov;
 
 /* We can only handle as much as buf_size at a time. */
 nb_sectors = MIN(s->buf_size >> BDRV_SECTOR_BITS, nb_sectors);
+nb_sectors = MIN(max_sectors, nb_sectors);
 assert(nb_sectors);
 ret = nb_sectors;
 
-- 
1.9.3

[Qemu-devel] [PULL 04/10] mirror: fix trace_mirror_yield_in_flight usage in mirror_iteration()

2016-06-28 Thread Jeff Cody

From: "Denis V. Lunev" 

trace_mirror_yield_in_flight expects its 2nd argument in sectors, but here
we pass chunks instead.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Message-id: 1466518157-27140-1-git-send-email-...@openvz.org
CC: Jeff Cody 
CC: Kevin Wolf 
CC: Max Reitz 
Signed-off-by: Jeff Cody 
---
 block/mirror.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index a04ed9c..930ac96 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -327,7 +327,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 
 first_chunk = sector_num / sectors_per_chunk;
 while (test_bit(first_chunk, s->in_flight_bitmap)) {
-trace_mirror_yield_in_flight(s, first_chunk, s->in_flight);
+trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
 mirror_wait_for_io(s);
 }
 
-- 
1.9.3

[Qemu-devel] [PULL 06/10] mirror: clarify mirror_do_read return code

2016-06-28 Thread Jeff Cody

From: John Snow 

mirror_do_read intends to return the number of sectors processed after
the starting sector, without regard to how many sectors were processed
before the starting sector due to alignment.

Clean up the comments and code to hopefully illustrate this more clearly.

This also fixes an issue in initialization where if the mirror buffer size
is initialized to smaller than the number of sectors being requested for
transfer, we report back an incorrectly large number to the caller.

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Fam Zheng 
Message-id: 1466625064-11280-2-git-send-email-js...@redhat.com
Signed-off-by: Jeff Cody 
---
 block/mirror.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 930ac96..42ebc3b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -218,7 +218,9 @@ static inline void mirror_wait_for_io(MirrorBlockJob *s)
 }
 
 /* Submit async read while handling COW.
- * Returns: nb_sectors if no alignment is necessary, or
+ * Returns: The number of sectors copied after and including sector_num,
+ *  excluding any sectors copied prior to sector_num due to alignment.
+ *  This will be nb_sectors if no alignment is necessary, or
  *  (new_end - sector_num) if tail is rounded up or down due to
  *  alignment or buffer limit.
  */
@@ -227,7 +229,7 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
 {
 BlockBackend *source = s->common.blk;
 int sectors_per_chunk, nb_chunks;
-int ret = nb_sectors;
+int ret;
 MirrorOp *op;
 
 sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
@@ -235,6 +237,7 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
 /* We can only handle as much as buf_size at a time. */
 nb_sectors = MIN(s->buf_size >> BDRV_SECTOR_BITS, nb_sectors);
 assert(nb_sectors);
+ret = nb_sectors;
 
 if (s->cow_bitmap) {
 ret += mirror_cow_align(s, &sector_num, &nb_sectors);
-- 
1.9.3

[Qemu-devel] [PULL 08/10] iotests: add small-granularity mirror test

2016-06-28 Thread Jeff Cody

From: John Snow 

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Fam Zheng 
Message-id: 1466625064-11280-4-git-send-email-js...@redhat.com
Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/041 | 30 ++
 tests/qemu-iotests/041.out |  4 ++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index ed1d9d4..cbf5e0b 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -727,6 +727,36 @@ class TestUnbackedSource(iotests.QMPTestCase):
 self.complete_and_wait()
 self.assert_no_active_block_jobs()
 
+class TestGranularity(iotests.QMPTestCase):
+image_len = 10 * 1024 * 1024 # MB
+
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, test_img,
+ str(TestGranularity.image_len))
+qemu_io('-c', 'write 0 %d' % (self.image_len),
+test_img)
+self.vm = iotests.VM().add_drive(test_img)
+self.vm.launch()
+
+def tearDown(self):
+self.vm.shutdown()
+self.assertTrue(iotests.compare_images(test_img, target_img),
+'target image does not match source after mirroring')
+os.remove(test_img)
+os.remove(target_img)
+
+def test_granularity(self):
+self.assert_no_active_block_jobs()
+result = self.vm.qmp('drive-mirror', device='drive0',
+ sync='full', target=target_img,
+ mode='absolute-paths', granularity=8192)
+self.assert_qmp(result, 'return', {})
+event = self.vm.get_qmp_event(wait=60.0)
+# Failures will manifest as COMPLETED/ERROR.
+self.assert_qmp(event, 'event', 'BLOCK_JOB_READY')
+self.complete_and_wait(drive='drive0', wait_ready=False)
+self.assert_no_active_block_jobs()
+
 class TestRepairQuorum(iotests.QMPTestCase):
 """ This class test quorum file repair using drive-mirror.
 It's mostly a fork of TestSingleDrive """
diff --git a/tests/qemu-iotests/041.out b/tests/qemu-iotests/041.out
index b0cadc8..b67d050 100644
--- a/tests/qemu-iotests/041.out
+++ b/tests/qemu-iotests/041.out
@@ -1,5 +1,5 @@
-...
+
 --
-Ran 75 tests
+Ran 76 tests
 
 OK
-- 
1.9.3

[Qemu-devel] [PULL 03/10] block/nfs: add support for libnfs pagecache

2016-06-28 Thread Jeff Cody

From: Peter Lieven 

Upcoming libnfs will have support for a read cache that can
significantly help to speed up requests since libnfs by design
circumvents the kernel cache.

Example:
 qemu -cdrom nfs://127.0.0.1/iso/my.iso?pagecache=1024

The pagecache parameter takes the maximum number of pages to
cache.  A page in libnfs is always the NFS_BLKSIZE which is
4KB.
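
To make the units concrete, here is a small sketch of the page arithmetic.
It assumes the 4KB page size stated above and the 8388608-byte cap that
the patch defines; the macro and function names are made up for the
example:

```c
#define BLKSIZE        4096ul                 /* page size stated above: 4KB */
#define MAX_PAGECACHE  (8388608ul / BLKSIZE)  /* cap used by the patch, in pages */

/* Clamp a requested page count to the cap. */
static unsigned long clamp_pagecache(unsigned long pages)
{
    return pages > MAX_PAGECACHE ? MAX_PAGECACHE : pages;
}

/* Bytes of cache that a requested page count amounts to after clamping. */
static unsigned long pagecache_bytes(unsigned long pages)
{
    return clamp_pagecache(pages) * BLKSIZE;
}
```

Under these assumptions pagecache=1024 from the example corresponds to
4 MiB of cache, and anything above 2048 pages is truncated to 8 MiB.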

Signed-off-by: Peter Lieven 
Reviewed-by: Jeff Cody 
Message-id: 1463662083-20814-3-git-send-email...@kamp.de
Signed-off-by: Jeff Cody 
---
 block/nfs.c | 37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/block/nfs.c b/block/nfs.c
index 60be45e..15d6832 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -38,6 +38,7 @@
 #include <nfsc/libnfs.h>
 
 #define QEMU_NFS_MAX_READAHEAD_SIZE 1048576
+#define QEMU_NFS_MAX_PAGECACHE_SIZE (8388608 / NFS_BLKSIZE)
 #define QEMU_NFS_MAX_DEBUG_LEVEL 2
 
 typedef struct NFSClient {
@@ -342,6 +343,26 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename,
 val = QEMU_NFS_MAX_READAHEAD_SIZE;
 }
 nfs_set_readahead(client->context, val);
+#ifdef LIBNFS_FEATURE_PAGECACHE
+nfs_set_pagecache_ttl(client->context, 0);
+#endif
+client->cache_used = true;
+#endif
+#ifdef LIBNFS_FEATURE_PAGECACHE
+nfs_set_pagecache_ttl(client->context, 0);
+} else if (!strcmp(qp->p[i].name, "pagecache")) {
+if (open_flags & BDRV_O_NOCACHE) {
+error_setg(errp, "Cannot enable NFS pagecache "
+ "if cache.direct = on");
+goto fail;
+}
+if (val > QEMU_NFS_MAX_PAGECACHE_SIZE) {
+error_report("NFS Warning: Truncating NFS pagecache"
+ " size to %d pages", QEMU_NFS_MAX_PAGECACHE_SIZE);
+val = QEMU_NFS_MAX_PAGECACHE_SIZE;
+}
+nfs_set_pagecache(client->context, val);
+nfs_set_pagecache_ttl(client->context, 0);
 client->cache_used = true;
 #endif
 #ifdef LIBNFS_FEATURE_DEBUG
@@ -524,7 +545,8 @@ static int nfs_reopen_prepare(BDRVReopenState *state,
 }
 
 if ((state->flags & BDRV_O_NOCACHE) && client->cache_used) {
-error_setg(errp, "Cannot disable cache if libnfs readahead is enabled");
+error_setg(errp, "Cannot disable cache if libnfs readahead or"
+ " pagecache is enabled");
 return -EINVAL;
 }
 
@@ -542,6 +564,15 @@ static int nfs_reopen_prepare(BDRVReopenState *state,
 return 0;
 }
 
+#ifdef LIBNFS_FEATURE_PAGECACHE
+static void nfs_invalidate_cache(BlockDriverState *bs,
+ Error **errp)
+{
+NFSClient *client = bs->opaque;
+nfs_pagecache_invalidate(client->context, client->fh);
+}
+#endif
+
 static BlockDriver bdrv_nfs = {
 .format_name= "nfs",
 .protocol_name  = "nfs",
@@ -565,6 +596,10 @@ static BlockDriver bdrv_nfs = {
 
 .bdrv_detach_aio_context= nfs_detach_aio_context,
 .bdrv_attach_aio_context= nfs_attach_aio_context,
+
+#ifdef LIBNFS_FEATURE_PAGECACHE
+.bdrv_invalidate_cache  = nfs_invalidate_cache,
+#endif
 };
 
 static void nfs_block_init(void)
-- 
1.9.3

[Qemu-devel] [PULL 05/10] block/gluster: add support for selecting debug logging level

2016-06-28 Thread Jeff Cody

This adds commandline support for the logging level of the
gluster protocol driver, output to stdout.  The option is 'debug',
e.g.:

-drive filename=gluster://192.168.15.180/gv2/test.qcow2,debug=9

Debug levels are 0-9, with 9 being the most verbose, and 0 representing
no debugging output.  The default is the same as it was before, which
is a level of 4.  The current logging levels defined in the gluster
source are:

0 - None
1 - Emergency
2 - Alert
3 - Critical
4 - Error
5 - Warning
6 - Notice
7 - Info
8 - Debug
9 - Trace

(From: glusterfs/logging.h)
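
A small sketch of the clamping and of the level table above. The macro
and function names are chosen for the example only; the names in the
table are copied from the list above:

```c
#define DEBUG_DEFAULT 4  /* level used when the option is not given */
#define DEBUG_MAX     9  /* most verbose level */

/* Level names in the order listed above, indexed by level. */
static const char *const level_names[DEBUG_MAX + 1] = {
    "None", "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Info", "Debug", "Trace",
};

/* Force a requested level into the valid 0-9 range. */
static int clamp_debug_level(long requested)
{
    if (requested < 0) {
        return 0;
    }
    if (requested > DEBUG_MAX) {
        return DEBUG_MAX;
    }
    return (int)requested;
}

/* Name of the level that a request ends up selecting. */
static const char *debug_level_name(long requested)
{
    return level_names[clamp_debug_level(requested)];
}
```

So debug=42 behaves like debug=9 (Trace), a negative value behaves like 0
(None), and omitting the option selects level 4 (Error).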

Reviewed-by: Niels de Vos 
Signed-off-by: Jeff Cody 
---
 block/gluster.c | 48 +---
 1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 38fce9e..16f7778 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -25,6 +25,7 @@ typedef struct BDRVGlusterState {
 struct glfs *glfs;
 struct glfs_fd *fd;
 bool supports_seek_data;
+int debug_level;
 } BDRVGlusterState;
 
 typedef struct GlusterConf {
@@ -33,6 +34,7 @@ typedef struct GlusterConf {
 char *volname;
 char *image;
 char *transport;
+int debug_level;
 } GlusterConf;
 
 static void qemu_gluster_gconf_free(GlusterConf *gconf)
@@ -195,11 +197,7 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
 goto out;
 }
 
-/*
- * TODO: Use GF_LOG_ERROR instead of hard code value of 4 here when
- * GlusterFS makes GF_LOG_* macros available to libgfapi users.
- */
-ret = glfs_set_logging(glfs, "-", 4);
+ret = glfs_set_logging(glfs, "-", gconf->debug_level);
 if (ret < 0) {
 goto out;
 }
@@ -257,16 +255,26 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
 qemu_bh_schedule(acb->bh);
 }
 
+#define GLUSTER_OPT_FILENAME "filename"
+#define GLUSTER_OPT_DEBUG "debug"
+#define GLUSTER_DEBUG_DEFAULT 4
+#define GLUSTER_DEBUG_MAX 9
+
 /* TODO Convert to fine grained options */
 static QemuOptsList runtime_opts = {
 .name = "gluster",
 .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
 .desc = {
 {
-.name = "filename",
+.name = GLUSTER_OPT_FILENAME,
 .type = QEMU_OPT_STRING,
 .help = "URL to the gluster image",
 },
+{
+.name = GLUSTER_OPT_DEBUG,
+.type = QEMU_OPT_NUMBER,
+.help = "Gluster log level, valid range is 0-9",
+},
 { /* end of list */ }
 },
 };
@@ -329,8 +337,17 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
 goto out;
 }
 
-filename = qemu_opt_get(opts, "filename");
+filename = qemu_opt_get(opts, GLUSTER_OPT_FILENAME);
 
+s->debug_level = qemu_opt_get_number(opts, GLUSTER_OPT_DEBUG,
+ GLUSTER_DEBUG_DEFAULT);
+if (s->debug_level < 0) {
+s->debug_level = 0;
+} else if (s->debug_level > GLUSTER_DEBUG_MAX) {
+s->debug_level = GLUSTER_DEBUG_MAX;
+}
+
+gconf->debug_level = s->debug_level;
 s->glfs = qemu_gluster_init(gconf, filename, errp);
 if (!s->glfs) {
 ret = -errno;
@@ -388,6 +405,7 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState 
*state,
BlockReopenQueue *queue, Error **errp)
 {
 int ret = 0;
+BDRVGlusterState *s;
 BDRVGlusterReopenState *reop_s;
 GlusterConf *gconf = NULL;
 int open_flags = 0;
@@ -395,6 +413,8 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState 
*state,
 assert(state != NULL);
 assert(state->bs != NULL);
 
+s = state->bs->opaque;
+
 state->opaque = g_new0(BDRVGlusterReopenState, 1);
 reop_s = state->opaque;
 
@@ -402,6 +422,7 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState 
*state,
 
 gconf = g_new0(GlusterConf, 1);
 
+gconf->debug_level = s->debug_level;
 reop_s->glfs = qemu_gluster_init(gconf, state->bs->filename, errp);
 if (reop_s->glfs == NULL) {
 ret = -errno;
@@ -535,6 +556,14 @@ static int qemu_gluster_create(const char *filename,
 char *tmp = NULL;
 GlusterConf *gconf = g_new0(GlusterConf, 1);
 
+gconf->debug_level = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
+ GLUSTER_DEBUG_DEFAULT);
+if (gconf->debug_level < 0) {
+gconf->debug_level = 0;
+} else if (gconf->debug_level > GLUSTER_DEBUG_MAX) {
+gconf->debug_level = GLUSTER_DEBUG_MAX;
+}
+
 glfs = qemu_gluster_init(gconf, filename, errp);
 if (!glfs) {
 ret = -errno;
@@ -919,6 +948,11 @@ static QemuOptsList qemu_gluster_create_opts = {
 .type = QEMU_OPT_STRING,
 .help = "Preallocation mode (allowed values: off, full)"
 },
+{
+.name = GLUSTER_OPT_DEBUG,
+

[Qemu-devel] [PULL 01/10] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-06-28 Thread Jeff Cody

From: Niels de Vos 

GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
it possible to detect sparse areas in files.

Signed-off-by: Niels de Vos 
Reviewed-by: Jeff Cody 
---
 block/gluster.c | 182 
 1 file changed, 182 insertions(+)
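
The probing idea used by qemu_gluster_test_seek() can be tried against a local file. The sketch below is an assumption-laden analogue: it uses plain POSIX lseek() instead of glfs_lseek(), the function name fd_supports_seek_data() is hypothetical, and the fallback value for SEEK_DATA is the Linux one.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes SEEK_DATA on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3             /* assumption: Linux value, used as a fallback */
#endif

/*
 * Hypothetical local-file analogue of the probe: seek to EOF, then ask
 * for the next data at that position. A working implementation has to
 * fail with ENXIO because there is no data at or after EOF.
 */
static bool fd_supports_seek_data(int fd)
{
    off_t eof = lseek(fd, 0, SEEK_END);
    if (eof < 0) {
        return false;
    }

    off_t ret = lseek(fd, eof, SEEK_DATA);
    return ret < 0 && errno == ENXIO;
}
```

A descriptor that cannot be seeked at all (for example an invalid one) is reported as unsupported, which is the conservative answer.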

diff --git a/block/gluster.c b/block/gluster.c
index d361d8e..38fce9e 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -24,6 +24,7 @@ typedef struct GlusterAIOCB {
 typedef struct BDRVGlusterState {
 struct glfs *glfs;
 struct glfs_fd *fd;
+bool supports_seek_data;
 } BDRVGlusterState;
 
 typedef struct GlusterConf {
@@ -287,6 +288,28 @@ static void qemu_gluster_parse_flags(int bdrv_flags, int 
*open_flags)
 }
 }
 
+/*
+ * Do SEEK_DATA/HOLE to detect if it is functional. Older broken versions of
+ * gfapi incorrectly return the current offset when SEEK_DATA/HOLE is used.
+ * - Corrected versions return -1 and set errno to EINVAL.
+ * - Versions that support SEEK_DATA/HOLE correctly, will return -1 and set
+ *   errno to ENXIO when SEEK_DATA is called with a position of EOF.
+ */
+static bool qemu_gluster_test_seek(struct glfs_fd *fd)
+{
+off_t ret, eof;
+
+eof = glfs_lseek(fd, 0, SEEK_END);
+if (eof < 0) {
+/* this should never occur */
+return false;
+}
+
+/* this should always fail with ENXIO if SEEK_DATA is supported */
+ret = glfs_lseek(fd, eof, SEEK_DATA);
+return (ret < 0) && (errno == ENXIO);
+}
+
 static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
  int bdrv_flags, Error **errp)
 {
@@ -338,6 +361,8 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict 
*options,
 ret = -errno;
 }
 
+s->supports_seek_data = qemu_gluster_test_seek(s->fd);
+
 out:
 qemu_opts_del(opts);
 qemu_gluster_gconf_free(gconf);
@@ -727,6 +752,159 @@ static int qemu_gluster_has_zero_init(BlockDriverState 
*bs)
 return 0;
 }
 
+/*
+ * Find allocation range in @bs around offset @start.
+ * May change underlying file descriptor's file offset.
+ * If @start is not in a hole, store @start in @data, and the
+ * beginning of the next hole in @hole, and return 0.
+ * If @start is in a non-trailing hole, store @start in @hole and the
+ * beginning of the next non-hole in @data, and return 0.
+ * If @start is in a trailing hole or beyond EOF, return -ENXIO.
+ * If we can't find out, return a negative errno other than -ENXIO.
+ *
+ * (Shamefully copied from raw-posix.c, only miniscule adaptions.)
+ */
+static int find_allocation(BlockDriverState *bs, off_t start,
+   off_t *data, off_t *hole)
+{
+BDRVGlusterState *s = bs->opaque;
+off_t offs;
+
+if (!s->supports_seek_data) {
+return -ENOTSUP;
+}
+
+/*
+ * SEEK_DATA cases:
+ * D1. offs == start: start is in data
+ * D2. offs > start: start is in a hole, next data at offs
+ * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
+ *  or start is beyond EOF
+ * If the latter happens, the file has been truncated behind
+ * our back since we opened it.  All bets are off then.
+ * Treating like a trailing hole is simplest.
+ * D4. offs < 0, errno != ENXIO: we learned nothing
+ */
+offs = glfs_lseek(s->fd, start, SEEK_DATA);
+if (offs < 0) {
+return -errno;  /* D3 or D4 */
+}
+assert(offs >= start);
+
+if (offs > start) {
+/* D2: in hole, next data at offs */
+*hole = start;
+*data = offs;
+return 0;
+}
+
+/* D1: in data, end not yet known */
+
+/*
+ * SEEK_HOLE cases:
+ * H1. offs == start: start is in a hole
+ * If this happens here, a hole has been dug behind our back
+ * since the previous lseek().
+ * H2. offs > start: either start is in data, next hole at offs,
+ *   or start is in trailing hole, EOF at offs
+ * Linux treats trailing holes like any other hole: offs ==
+ * start.  Solaris seeks to EOF instead: offs > start (blech).
+ * If that happens here, a hole has been dug behind our back
+ * since the previous lseek().
+ * H3. offs < 0, errno = ENXIO: start is beyond EOF
+ * If this happens, the file has been truncated behind our
+ * back since we opened it.  Treat it like a trailing hole.
+ * H4. offs < 0, errno != ENXIO: we learned nothing
+ * Pretend we know nothing at all, i.e. "forget" about D1.
+ */
+offs = glfs_lseek(s->fd, start, SEEK_HOLE);
+if (offs < 0) {
+return -errno;  /* D1 and (H3 or H4) */
+}
+assert(offs >= start);
+
+if (offs > start) {
+/*
+ * D1 and H2: either in data, next hole at offs, or it was in
+ * data but is now in a trailing hole.  In the latter case,
+ *

[Qemu-devel] [PULL 00/10] Block patches

2016-06-28 Thread Jeff Cody

The following changes since commit d7f30403576f04f1f3a5fb5a1d18cba8dfa7a6d2:

  cputlb: don't cpu_abort() if guest tries to execute outside RAM or RAM 
(2016-06-28 18:50:53 +0100)

are available in the git repository at:

  g...@github.com:codyprime/qemu-kvm-jtc.git tags/block-pull-request

for you to fetch changes up to 15d6729850728ee49859711dd40b00d8d85d94ee:

  mirror: fix misleading comments (2016-06-28 23:08:25 -0400)





Changlong Xie (2):
  blockjob: assert(cb) when create job
  mirror: fix misleading comments

Denis V. Lunev (1):
  mirror: fix trace_mirror_yield_in_flight usage in mirror_iteration()

Jeff Cody (1):
  block/gluster: add support for selecting debug logging level

John Snow (3):
  mirror: clarify mirror_do_read return code
  mirror: limit niov to IOV_MAX elements, again
  iotests: add small-granularity mirror test

Niels de Vos (1):
  block/gluster: add support for SEEK_DATA/SEEK_HOLE

Peter Lieven (2):
  block/nfs: refuse readahead if cache.direct is on
  block/nfs: add support for libnfs pagecache

 block/backup.c |   1 -
 block/gluster.c| 230 +++--
 block/mirror.c |  14 ++-
 block/nfs.c|  55 ++-
 blockjob.c |   1 +
 tests/qemu-iotests/041 |  30 ++
 tests/qemu-iotests/041.out |   4 +-
 7 files changed, 317 insertions(+), 18 deletions(-)

-- 
1.9.3

[Qemu-devel] [PULL 02/10] block/nfs: refuse readahead if cache.direct is on

2016-06-28 Thread Jeff Cody

From: Peter Lieven 

If we open an NFS export with the cache disabled, we should refuse
the readahead feature, as it will cache data inside libnfs.

If an export was opened with readahead enabled, it should
further not be allowed to disable the cache while running.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Lieven 
Reviewed-by: Jeff Cody 
Message-id: 1463662083-20814-2-git-send-email...@kamp.de
Signed-off-by: Jeff Cody 
---
 block/nfs.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/block/nfs.c b/block/nfs.c
index 9f51cc3..60be45e 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -1,7 +1,7 @@
 /*
  * QEMU Block driver for native access to files on NFS shares
  *
- * Copyright (c) 2014 Peter Lieven 
+ * Copyright (c) 2014-2016 Peter Lieven 
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -47,6 +47,7 @@ typedef struct NFSClient {
 bool has_zero_init;
 AioContext *aio_context;
 blkcnt_t st_blocks;
+bool cache_used;
 } NFSClient;
 
 typedef struct NFSRPC {
@@ -278,7 +279,7 @@ static void nfs_file_close(BlockDriverState *bs)
 }
 
 static int64_t nfs_client_open(NFSClient *client, const char *filename,
-   int flags, Error **errp)
+   int flags, Error **errp, int open_flags)
 {
 int ret = -EINVAL, i;
 struct stat st;
@@ -330,12 +331,18 @@ static int64_t nfs_client_open(NFSClient *client, const 
char *filename,
 nfs_set_tcp_syncnt(client->context, val);
 #ifdef LIBNFS_FEATURE_READAHEAD
 } else if (!strcmp(qp->p[i].name, "readahead")) {
+if (open_flags & BDRV_O_NOCACHE) {
+error_setg(errp, "Cannot enable NFS readahead "
+ "if cache.direct = on");
+goto fail;
+}
 if (val > QEMU_NFS_MAX_READAHEAD_SIZE) {
 error_report("NFS Warning: Truncating NFS readahead"
  " size to %d", QEMU_NFS_MAX_READAHEAD_SIZE);
 val = QEMU_NFS_MAX_READAHEAD_SIZE;
 }
 nfs_set_readahead(client->context, val);
+client->cache_used = true;
 #endif
 #ifdef LIBNFS_FEATURE_DEBUG
 } else if (!strcmp(qp->p[i].name, "debug")) {
@@ -418,7 +425,7 @@ static int nfs_file_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 ret = nfs_client_open(client, qemu_opt_get(opts, "filename"),
   (flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
-  errp);
+  errp, bs->open_flags);
 if (ret < 0) {
 goto out;
 }
@@ -454,7 +461,7 @@ static int nfs_file_create(const char *url, QemuOpts *opts, 
Error **errp)
 total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
   BDRV_SECTOR_SIZE);
 
-ret = nfs_client_open(client, url, O_CREAT, errp);
+ret = nfs_client_open(client, url, O_CREAT, errp, 0);
 if (ret < 0) {
 goto out;
 }
@@ -516,6 +523,11 @@ static int nfs_reopen_prepare(BDRVReopenState *state,
 return -EACCES;
 }
 
+if ((state->flags & BDRV_O_NOCACHE) && client->cache_used) {
+error_setg(errp, "Cannot disable cache if libnfs readahead is enabled");
+return -EINVAL;
+}
+
 /* Update cache for read-only reopens */
 if (!(state->flags & BDRV_O_RDWR)) {
 ret = nfs_fstat(client->context, client->fh, &st);
-- 
1.9.3

Re: [Qemu-devel] [PATCH v2 0/2] small fix of block job

2016-06-28 Thread Jeff Cody

On Thu, Jun 23, 2016 at 04:57:19PM +0800, Changlong Xie wrote:
> V2
> p1: put assert(cb) in block_job_create
> 
> Changlong Xie (2):
>   blockjob: assert(cb) when create job
>   mirror: fix misleading comments
> 
>  block/backup.c | 1 -
>  block/mirror.c | 2 +-
>  blockjob.c | 1 +
>  3 files changed, 2 insertions(+), 2 deletions(-)
> 
> -- 
> 1.9.3
> 
> 
> 

Thanks,

Applied to my block branch:

git://github.com/codyprime/qemu-kvm-jtc.git block

-Jeff

Re: [Qemu-devel] [PATCH v2 2/2] mirror: fix misleading comments

2016-06-28 Thread Jeff Cody

On Thu, Jun 23, 2016 at 04:57:21PM +0800, Changlong Xie wrote:
> s/target bs/to_replace/, also we check to_replace bs is not
> blocked in qmp_drive_mirror() not here
> 
> Signed-off-by: Changlong Xie 

Reviewed-by: Jeff Cody 
> ---
>  block/mirror.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index a04ed9c..4420a15 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -769,7 +769,7 @@ static void mirror_complete(BlockJob *job, Error **errp)
>  }
>  }
>  
> -/* check the target bs is not blocked and block all operations on it */
> +/* block all operations on to_replace bs */
>  if (s->replaces) {
>  AioContext *replace_aio_context;
>  
> -- 
> 1.9.3
> 
> 
>

Re: [Qemu-devel] [PATCH 1/2] ppc: Add proper real mode translation support

2016-06-28 Thread Benjamin Herrenschmidt

On Wed, 2016-06-29 at 12:41 +1000, David Gibson wrote:
> > +    /* Actually we don't support unbounded RMA anymore since we
> > + * added proper emulation of HV mode. The max we can get is
> > + * 16G which also happens to be what we configure for PAPR
> > + * mode so make sure we don't do anything bigger than that
> > + */
> > +    spapr->rma_size = MIN(spapr->rma_size, 0x400000000ull);
> 
> #1 - Instead of the various KVM / non-KVM cases here, it might be
> simpler to just always clamp the RMA to 256MiB.

That would be sad ... we benefit from having a larger RMA..
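
To make the two options concrete, here is a small sketch of the 16G cap from the quoted comment. The macro and function names are hypothetical; only the 16G figure comes from the patch comment.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RMA_MAX (16ULL << 30)    /* 16 GiB, per the quoted comment */

/* Hypothetical helper: cap the RMA size without forcing it down to 256MiB. */
static uint64_t sketch_clamp_rma(uint64_t rma_size)
{
    assert(SKETCH_RMA_MAX == 0x400000000ULL);
    return rma_size < SKETCH_RMA_MAX ? rma_size : SKETCH_RMA_MAX;
}
```

With this cap a 1GiB RMA stays at 1GiB, whereas an unconditional 256MiB clamp would shrink it.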

Cheers,
Ben.

Re: [Qemu-devel] [PATCH v2 1/2] blockjob: assert(cb) when create job

2016-06-28 Thread Jeff Cody

On Thu, Jun 23, 2016 at 04:57:20PM +0800, Changlong Xie wrote:
> Callback for block job should always exist
> 
> Suggested-by: Paolo Bonzini 
> Suggested-by: Kevin Wolf 
> Signed-off-by: Changlong Xie 

Reviewed-by: Jeff Cody 

> ---
>  block/backup.c | 1 -
>  blockjob.c | 1 +
>  2 files changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 581269b..f87f8d5 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -489,7 +489,6 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
> *target,
>  
>  assert(bs);
>  assert(target);
> -assert(cb);
>  
>  if (bs == target) {
>  error_setg(errp, "Source and target cannot be the same");
> diff --git a/blockjob.c b/blockjob.c
> index 90c4e26..205da9d 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -110,6 +110,7 @@ void *block_job_create(const BlockJobDriver *driver, 
> BlockDriverState *bs,
>  BlockBackend *blk;
>  BlockJob *job;
>  
> +assert(cb);
>  if (bs->job) {
>  error_setg(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
>  return NULL;
> -- 
> 1.9.3
> 
> 
>

Re: [Qemu-devel] [PATCH 2/3] VFIO driver for mediated PCI device

2016-06-28 Thread Alex Williamson

On Wed, 29 Jun 2016 00:15:23 +0530
Kirti Wankhede  wrote:

> On 6/25/2016 1:15 AM, Alex Williamson wrote:
> > On Sat, 25 Jun 2016 00:04:27 +0530
> > Kirti Wankhede  wrote:
> >   
> 
> >>>> +
> >>>> +static int mdev_get_irq_count(struct vfio_mdev *vmdev, int irq_type)
> >>>> +{
> >>>> +/* Don't support MSIX for now */
> >>>> +if (irq_type == VFIO_PCI_MSIX_IRQ_INDEX)
> >>>> +return -1;
> >>>> +
> >>>> +return 1;
> >>>
> >>> Too much hard coding here, the mediated driver should define this.
> >>> 
> >>
> >> I'm testing INTX and MSI, I don't have a way to test MSIX for now. So we
> >> thought we can add support for MSIX later. Till then hard code it to 1.  
> > 
> > To me it screams that there needs to be an interface to the mediated
> > device here.  How do you even know that the mediated device intends to
> > support MSI?  What if it wants to emulate a VF and not support INTx?
> > This is basically just a big "TODO" flag that needs to be addressed
> > before a non-RFC.
> >   
> 
> VFIO user space app reads emulated PCI config space of mediated device.
> In PCI capability list when MSI capability (PCI_CAP_ID_MSI) is present,
> it calls VFIO_DEVICE_SET_IRQS ioctl with irq_set->index set to
> VFIO_PCI_MSI_IRQ_INDEX.
> Similarly, MSIX is identified from emulated config space of mediated
> device that checks if MSI capability is present and number of vectors
> extracted from PCI_MSI_FLAGS_QSIZE flag.
> vfio_mpci modules don't need to query it from vendor driver of mediated
> device. Depending on which interrupt to support, mediated driver should
> emulate PCI config space.

Are you suggesting that if the user can determine which interrupts are
supported and the various counts for each by querying the PCI config
space of the mediated device then this interface should do the same,
much like vfio_pci_get_irq_count(), such that it can provide results
consistent with config space?  That I'm ok with.  Having the user find
one IRQ count as they read PCI config space and another via the vfio
API, I'm not ok with.  Thanks,

Alex
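
A sketch of what "consistent with config space" could look like for MSI: the vector count is taken from the Multiple Message Capable field of the MSI Message Control word. The mask value mirrors PCI_MSI_FLAGS_QMASK from linux/pci_regs.h; the function name is hypothetical and this is not the code from the series.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors PCI_MSI_FLAGS_QMASK (Multiple Message Capable) in linux/pci_regs.h. */
#define SKETCH_MSI_FLAGS_QMASK 0x000e

/*
 * Hypothetical helper: number of MSI vectors the (emulated) device
 * advertises, decoded from the Message Control word of its config space.
 */
static int sketch_msi_irq_count(uint16_t msi_flags)
{
    int count = 1 << ((msi_flags & SKETCH_MSI_FLAGS_QMASK) >> 1);

    assert(count >= 1 && count <= 128);
    return count;
}
```

Deriving the count this way means a user reading config space and a user calling the vfio API would see the same number.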

Re: [Qemu-devel] [PATCH 0/3] drive-mirror: limit niov to MAX_IOV

2016-06-28 Thread Jeff Cody

On Wed, Jun 22, 2016 at 03:51:01PM -0400, John Snow wrote:
> e5b43573 caused a regression in the preparation of our IO vectors, such
> that if a small granularity but a large buffer size is chosen, we may
> accidentally exceed MAX_IOV and the request will fail.
> 
> This has been fixed before in cae98cb8, and now we'll fix it again.
> To keep it fixed, we'll add an iotest this time.
> 
> [Thanks to Max for finding the root cause.]
> 
> John Snow (3):
>   mirror: clarify mirror_do_read return code
>   mirror: limit niov to IOV_MAX elements, again
>   iotests: add small-granularity mirror test
> 
>  block/mirror.c | 10 --
>  tests/qemu-iotests/041 | 30 ++
>  tests/qemu-iotests/041.out |  4 ++--
>  3 files changed, 40 insertions(+), 4 deletions(-)
> 
> -- 
> 2.4.11
> 

Thanks,

Applied to my block branch:

git://github.com/codyprime/qemu-kvm-jtc.git block

-Jeff
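
For context, the failure mode from the cover letter comes down to simple arithmetic: a small granularity and a large buffer produce more chunks than one I/O vector may hold. A minimal sketch, assuming a limit of 1024 (the real limit is IOV_MAX) and a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IOV_MAX 1024     /* assumption: stands in for IOV_MAX */

/*
 * Hypothetical helper: number of granularity-sized chunks to place in
 * one request, capped so the vector never exceeds the limit.
 */
static uint64_t sketch_clamp_niov(uint64_t bytes, uint64_t granularity)
{
    assert(granularity > 0);

    uint64_t nb_chunks = (bytes + granularity - 1) / granularity;
    return nb_chunks < SKETCH_IOV_MAX ? nb_chunks : SKETCH_IOV_MAX;
}
```

For example, a 16MiB buffer at 512-byte granularity would need 32768 elements uncapped, which is why the request failed before the fix.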

Re: [Qemu-devel] [PATCH 3/3] VFIO Type1 IOMMU: Add support for mediated devices

2016-06-28 Thread Alex Williamson

On Tue, 28 Jun 2016 18:32:44 +0530
Kirti Wankhede  wrote:

> On 6/22/2016 9:16 AM, Alex Williamson wrote:
> > On Mon, 20 Jun 2016 22:01:48 +0530
> > Kirti Wankhede  wrote:
> >   
> >>  
> >>  struct vfio_iommu {
> >>    struct list_head    domain_list;
> >> +  struct vfio_domain  *mediated_domain;  
> > 
> > I'm not really a fan of how this is so often used to special case the
> > code...
> >   
> >>    struct mutex        lock;
> >>    struct rb_root      dma_list;
> >>    bool                v2;
> >> @@ -67,6 +69,13 @@ struct vfio_domain {
> >>    struct list_head    group_list;
> >>    int                 prot;           /* IOMMU_CACHE */
> >>    bool                fgsp;           /* Fine-grained super pages */
> >> +
> >> +  /* Domain for mediated device which is without physical IOMMU */
> >> +  bool                mediated_device;  
> > 
> > But sometimes we use this to special case the code and other times we
> > use domain_list being empty.  I thought the argument against pulling
> > code out to a shared file was that this approach could be made
> > maintainable.
> >   
> 
> Functions where struct vfio_domain *domain is argument which are
> intended to perform for that domain only, checked if
> (domain->mediated_device), like map_try_harder(), vfio_iommu_replay(),
> vfio_test_domain_fgsp(). Checks in these functions can be removed but
> then it would be callers responsibility to make sure that they don't
> call these functions for mediated_domain.
> Whereas functions where struct vfio_iommu *iommu is argument and
> domain_list is traversed to find domain or perform for each domain in
> domain_list, checked if (list_empty(&iommu->domain_list)), like
> vfio_unmap_unpin(), vfio_iommu_map(), vfio_dma_do_map().

My point is that we have different test elements at different points in
the data structures and they all need to be kept in sync and the right
one used at the right place, which makes the code all that much more
complex versus the alternative approach of finding commonality,
extracting it into a shared file, and creating a mediated version of
the type1 iommu that doesn't try to overload dual functionality into a
single code block. 

> >> +
> >> +  struct mm_struct    *mm;
> >> +  struct rb_root      pfn_list;       /* pinned Host pfn list */
> >> +  struct mutex        pfn_list_lock;  /* mutex for pfn_list */  
> > 
> > Seems like we could reduce overhead for the existing use cases by just
> > adding a pointer here and making these last 3 entries part of the
> > structure that gets pointed to.  Existence of the pointer would replace
> > @mediated_device.
> >  
> 
> Ok.
> 
> >>  };
> >>  
> >>  struct vfio_dma {
> >> @@ -79,10 +88,26 @@ struct vfio_dma {
> >>  
> >>  struct vfio_group {
> >>struct iommu_group  *iommu_group;
> >> +#if defined(CONFIG_MDEV) || defined(CONFIG_MDEV_MODULE)  
> > 
> > Where does CONFIG_MDEV_MODULE come from?
> > 
> > Plus, all the #ifdefs... 
> >   
> 
> Config option MDEV is tristate and when selected as module
> CONFIG_MDEV_MODULE is set in include/generated/autoconf.h.
> Symbols mdev_bus_type, mdev_get_device_by_group() and mdev_put_device()
> are only available when MDEV option is selected as built-in or modular.
> If MDEV option is not selected, vfio_iommu_type1 modules should still
> work for device direct assignment. If these #ifdefs are not there
> vfio_iommu_type1 module fails to load with undefined symbols when MDEV
> is not selected.

I guess I just hadn't seen the _MODULE define used before, but it does
appear to be fairly common.  Another option might be to provide stubs
or static inline abstractions in a header file so the #ifdefs can be
isolated.  It also seems like this is going to mean that type1 now
depends on and will autoload the mdev module even for physical
assignment.  That's not terribly desirable.
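
The stub approach might look roughly like the header fragment below. This is a sketch under assumptions: the function signatures are guessed from the names mentioned in this thread and are not taken from the series.

```c
#include <assert.h>
#include <stddef.h>

struct iommu_group;
struct mdev_device;

#if defined(CONFIG_MDEV) || defined(CONFIG_MDEV_MODULE)
/* Real implementations come from the mdev core (signatures assumed). */
struct mdev_device *mdev_get_device_by_group(struct iommu_group *group);
void mdev_put_device(struct mdev_device *mdev);
#else
/* Stubs: without mdev there are no mediated devices, so lookups miss. */
static inline struct mdev_device *
mdev_get_device_by_group(struct iommu_group *group)
{
    (void)group;
    return NULL;
}

static inline void mdev_put_device(struct mdev_device *mdev)
{
    (void)mdev;
}
#endif
```

Callers can then test the returned pointer instead of carrying their own #ifdef blocks. Note that this only isolates the #ifdefs; it does not by itself address the module dependency concern.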

> >> +  struct mdev_device  *mdev;  
> > 
> > This gets set on attach_group where we use the iommu_group to lookup
> > the mdev, so why can't we do that on the other paths that make use of
> > this?  I think this is just holding a reference.
> >   
> 
> mdev is retrieved from attach_group for 2 reasons:
> 1. to increase the ref count of mdev, mdev_get_device_by_group(), when
> its iommu_group is attached. That should be decremented, by
> mdev_put_device(), from detach while detaching its iommu_group. This is
> make sure that mdev is not freed until it's iommu_group is detached from
> the container.
> 
> 2. save reference to iommu_data so that vendor driver would use to call
> vfio_pin_pages() and vfio_unpin_pages(). More details below.
> 
> 
> 
> >> -static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
> >> +static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
> >> +   int prot, unsigned long *pfn)
> >>  {
> >>struct page *page[1];
> >>struct vm_area_struct *vma;
> >> +  struct

Re: [Qemu-devel] [PATCH v2] target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlb

2016-06-28 Thread David Gibson

On Tue, Jun 28, 2016 at 06:50:05AM -0700, Aaron Larson wrote:
> 
> Eliminate redundant and incorrect booke206_page_size_to_tlb function
> from ppce500_spin.c in preference to previously existing but newly
> exported definition from e500.c
> 
> Defect analysis:
> 
> The booke206_page_size_to_tlb function in e500.c was updated in commit
> 2bd9543 "ppc: booke206: use MAV=2.0 TSIZE definition, fix 4G pages" to
> reflect a change in the definition of MAS1_TSIZE_SHIFT from 8
> (corresponding to a min TLB page size of 4kb) to a value of 7 (TLB
> page size 2k).  The booke206_page_size_to_tlb() function defined in
> ppce500_spin.c was never updated to reflect the change in
> MAS1_TSIZE_SHIFT.
> 
> In http://lists.nongnu.org/archive/html/qemu-ppc/2016-06/msg00533.html,
> Scott Wood suggested this "root cause" explanation:
> 
> SW> The patch that changed MAS1_TSIZE_SHIFT from 8 to 7 was around the
> SW> same time as the patch that added this code, which is probably why
> SW> adjusting it got missed.  Commit 2bd9543cd3 did update the
> SW> equivalent code in ppce500_mpc8544ds.c, which now resides in
> SW> hw/ppc/e500.c and has been changed to not assume a power-of-2
> SW> size.  The ppce500_spin version should be eliminated.
> 
> Signed-off-by: Aaron Larson 

Applied to ppc-for-2.7, thanks.
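
For anyone wanting to see the defect numerically, the two conversion functions from the diff can be compared side by side. The sketch below uses the GCC/Clang builtins __builtin_clzll() and __builtin_ctz() as stand-ins for QEMU's clz64() and ctz32(); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* e500.c version: TSIZE in MAV 2.0 units, i.e. page size = 2^TSIZE KiB. */
static int sketch_tsize_mav2(uint64_t size)
{
    assert((size >> 10) != 0);          /* the builtin is undefined for 0 */
    return 63 - __builtin_clzll(size >> 10);
}

/* Removed ppce500_spin.c version: page size = 4^TSIZE KiB. */
static int sketch_tsize_legacy(uint64_t size)
{
    assert((uint32_t)(size >> 10) != 0);
    return __builtin_ctz((uint32_t)(size >> 10)) >> 1;
}
```

For a 256MiB page the legacy function yields 9 and the current one 18. Shifted by the old MAS1_TSIZE_SHIFT of 8, the legacy value lands on the same bits as 18 shifted by 7, so it was correct before the shift changed and wrong afterwards.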

> ---
>  hw/ppc/e500.c | 2 +-
>  hw/ppc/e500.h | 2 ++
>  hw/ppc/ppce500_spin.c | 7 +--
>  3 files changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
> index ee1c60b..0cd534d 100644
> --- a/hw/ppc/e500.c
> +++ b/hw/ppc/e500.c
> @@ -601,7 +601,7 @@ static int ppce500_prep_device_tree(MachineState *machine,
>  }
>  
>  /* Create -kernel TLB entries for BookE.  */
> -static inline hwaddr booke206_page_size_to_tlb(uint64_t size)
> +hwaddr booke206_page_size_to_tlb(uint64_t size)
>  {
>  return 63 - clz64(size >> 10);
>  }
> diff --git a/hw/ppc/e500.h b/hw/ppc/e500.h
> index ef224ea..70ba1d8 100644
> --- a/hw/ppc/e500.h
> +++ b/hw/ppc/e500.h
> @@ -26,4 +26,6 @@ typedef struct PPCE500Params {
>  
>  void ppce500_init(MachineState *machine, PPCE500Params *params);
>  
> +hwaddr booke206_page_size_to_tlb(uint64_t size);
> +
>  #endif
> diff --git a/hw/ppc/ppce500_spin.c b/hw/ppc/ppce500_spin.c
> index 225177b..22c584e 100644
> --- a/hw/ppc/ppce500_spin.c
> +++ b/hw/ppc/ppce500_spin.c
> @@ -32,6 +32,7 @@
>  #include "sysemu/sysemu.h"
>  #include "hw/sysbus.h"
>  #include "sysemu/kvm.h"
> +#include "e500.h"
>  
>  #define MAX_CPUS 32
>  
> @@ -72,12 +73,6 @@ static void spin_reset(void *opaque)
>  }
>  }
>  
> -/* Create -kernel TLB entries for BookE, linearly spanning 256MB.  */
> -static inline hwaddr booke206_page_size_to_tlb(uint64_t size)
> -{
> -return ctz32(size >> 10) >> 1;
> -}
> -
>  static void mmubooke_create_initial_mapping(CPUPPCState *env,
>   target_ulong va,
>   hwaddr pa,

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 1/2] ppc: Add proper real mode translation support

2016-06-28 Thread David Gibson

On Tue, Jun 28, 2016 at 08:48:33AM +0200, Cédric Le Goater wrote:
> From: Benjamin Herrenschmidt 
> 
> This adds proper support for translating real mode addresses based
> on the combination of HV and LPCR bits. This handles HRMOR offset
> for hypervisor real mode, and both RMA and VRMA modes for guest
> real mode. PAPR mode adjusts the offsets appropriately to match the
> RMA used in TCG, but we need to limit to the max supported by the
> implementation (16G).
> 
> Signed-off-by: Benjamin Herrenschmidt 
> [clg: fixed checkpatch.pl errors ]
> Signed-off-by: Cédric Le Goater 

This looks correct and I've applied it.  There are a couple of
possible cleanups which might be a good idea to follow up with though.


> ---
>  hw/ppc/spapr.c  |   7 +++
>  target-ppc/mmu-hash64.c | 146 ++--
>  target-ppc/mmu-hash64.h |   1 +
>  target-ppc/translate_init.c |  10 ++-
>  4 files changed, 144 insertions(+), 20 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d26b4c26ed10..53ab1f84fb11 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1770,6 +1770,13 @@ static void ppc_spapr_init(MachineState *machine)
>  spapr->vrma_adjust = 1;
>  spapr->rma_size = MIN(spapr->rma_size, 0x1000);
>  }
> +
> +/* Actually we don't support unbounded RMA anymore since we
> + * added proper emulation of HV mode. The max we can get is
> + * 16G which also happens to be what we configure for PAPR
> + * mode so make sure we don't do anything bigger than that
> + */
> +spapr->rma_size = MIN(spapr->rma_size, 0x400000000ull);

#1 - Instead of the various KVM / non-KVM cases here, it might be
simpler to just always clamp the RMA to 256MiB.

>  }
>  
>  if (spapr->rma_size > node0_size) {
> diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
> index 6d6f26c92957..ed353b2d1539 100644
> --- a/target-ppc/mmu-hash64.c
> +++ b/target-ppc/mmu-hash64.c
> @@ -653,13 +653,41 @@ static void ppc_hash64_set_dsi(CPUState *cs, 
> CPUPPCState *env, uint64_t dar,
>  env->error_code = 0;
>  }
>  
> +static int64_t ppc_hash64_get_rmls(CPUPPCState *env)
> +{
> +uint64_t lpcr = env->spr[SPR_LPCR];
> +
> +/*
> + * This is the full 4 bits encoding of POWER8. Previous
> + * CPUs only support a subset of these but the filtering
> + * is done when writing LPCR
> + */
> +switch ((lpcr & LPCR_RMLS) >> LPCR_RMLS_SHIFT) {
> +case 0x8: /* 32MB */
> +return 0x2000000ull;
> +case 0x3: /* 64MB */
> +return 0x4000000ull;
> +case 0x7: /* 128MB */
> +return 0x8000000ull;
> +case 0x4: /* 256MB */
> +return 0x10000000ull;
> +case 0x2: /* 1GB */
> +return 0x40000000ull;
> +case 0x1: /* 16GB */
> +return 0x400000000ull;
> +default:
> +/* What to do here ??? */
> +return 0;
> +}
> +}
>  
>  int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr,
>  int rwx, int mmu_idx)
>  {
>  CPUState *cs = CPU(cpu);
>  CPUPPCState *env = &cpu->env;
> -ppc_slb_t *slb;
> +ppc_slb_t *slb_ptr;
> +ppc_slb_t slb;
>  unsigned apshift;
>  hwaddr pte_offset;
>  ppc_hash_pte64_t pte;
> @@ -670,11 +698,53 @@ int ppc_hash64_handle_mmu_fault(PowerPCCPU *cpu, vaddr 
> eaddr,
>  
>  assert((rwx == 0) || (rwx == 1) || (rwx == 2));
>  
> +/* Note on LPCR usage: 970 uses HID4, but our special variant
> + * of store_spr copies relevant fields into env->spr[SPR_LPCR].
> + * Similarily we filter unimplemented bits when storing into
> + * LPCR depending on the MMU version. This code can thus just
> + * use the LPCR "as-is".
> + */
> +
>  /* 1. Handle real mode accesses */
>  if (((rwx == 2) && (msr_ir == 0)) || ((rwx != 2) && (msr_dr == 0))) {
> -/* Translation is off */
> -/* In real mode the top 4 effective address bits are ignored */
> +/* Translation is supposedly "off"  */
> +/* In real mode the top 4 effective address bits are (mostly) ignored */
>  raddr = eaddr & 0x0FFFFFFFFFFFFFFFULL;
> +
> +/* In HV mode, add HRMOR if top EA bit is clear */
> +if (msr_hv) {
> +if (!(eaddr >> 63)) {
> +raddr |= env->spr[SPR_HRMOR];
> +}
> +} else {
> +/* Otherwise, check VPM for RMA vs VRMA */
> +if (env->spr[SPR_LPCR] & LPCR_VPM0) {
> +uint32_t vrmasd;
> +/* VRMA, we make up an SLB entry */
> +slb.vsid = SLB_VSID_VRMA;
> +vrmasd = (env->spr[SPR_LPCR] & LPCR_VRMASD) >>
> +LPCR_VRMASD_SHIFT;
> +slb.vsid |= (vrmasd << 4) & (SLB_VSID_L | SLB_VSID_LP);
> +slb.esid = SLB_ESID_V;
> +goto skip_slb;

Re: [Qemu-devel] [PATCH 2/2] ppc: Fix 64K pages support in full emulation

2016-06-28 Thread David Gibson

On Tue, Jun 28, 2016 at 08:48:34AM +0200, Cédric Le Goater wrote:
> From: Benjamin Herrenschmidt 
> 
> We were always advertising only 4K & 16M. Additionally the code wasn't
> properly matching the page size with the PTE content, which meant we
> could potentially hit an incorrect PTE if the guest used multiple sizes.
> 
> Finally, honor the CPU capabilities when decoding the size from the SLB
> so we don't try to use 64K pages on 970.
> 
> This still doesn't add support for MPSS (Multiple Page Sizes per Segment)
> 
> Signed-off-by: Benjamin Herrenschmidt 
> [clg: fixed checkpatch.pl errors
>   commits 61a36c9b5a12 and 1114e712c998 reworked the hpte code
>   doing insertion/removal in hw/ppc/spapr_hcall.c. The hunks
>   modifying these areas were removed. ]
> Signed-off-by: Cédric Le Goater 

Applied to ppc-for-2.7.
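
The page size matching added by this patch can be exercised standalone. The sketch below restates the decode logic from the quoted ppc_hash64_pte_size_decode() under a hypothetical name, so it can be compiled outside QEMU.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone copy of the decode logic: return the page
 * shift if the PTE matches the segment's page size, or 0 on mismatch.
 * MPSS is not handled, as in the patch.
 */
static uint32_t sketch_pte_size_decode(uint64_t pte1, uint32_t slb_pshift)
{
    switch (slb_pshift) {
    case 12:                    /* 4K: always accepted */
        return 12;
    case 16:                    /* 64K: encoded in the low PTE bits */
        return (pte1 & 0xf000) == 0x1000 ? 16 : 0;
    case 24:                    /* 16M */
        return (pte1 & 0xff000) == 0 ? 24 : 0;
    }
    return 0;
}
```

A return value of 0 is what makes the PTEG search skip an entry whose size does not match the segment.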

> ---
>  target-ppc/cpu-qom.h|  3 +++
>  target-ppc/mmu-hash64.c | 39 +++
>  target-ppc/translate_init.c | 22 +++---
>  3 files changed, 57 insertions(+), 7 deletions(-)
> 
> diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
> index 0fad2def0a94..286410502f6d 100644
> --- a/target-ppc/cpu-qom.h
> +++ b/target-ppc/cpu-qom.h
> @@ -70,18 +70,21 @@ enum powerpc_mmu_t {
>  #define POWERPC_MMU_64   0x0001
>  #define POWERPC_MMU_1TSEG0x0002
>  #define POWERPC_MMU_AMR  0x0004
> +#define POWERPC_MMU_64K  0x0008
>  /* 64 bits PowerPC MMU */
>  POWERPC_MMU_64B= POWERPC_MMU_64 | 0x0001,
>  /* Architecture 2.03 and later (has LPCR) */
>  POWERPC_MMU_2_03   = POWERPC_MMU_64 | 0x0002,
>  /* Architecture 2.06 variant   */
>  POWERPC_MMU_2_06   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
> + | POWERPC_MMU_64K
>   | POWERPC_MMU_AMR | 0x0003,
>  /* Architecture 2.06 "degraded" (no 1T segments)   */
>  POWERPC_MMU_2_06a  = POWERPC_MMU_64 | POWERPC_MMU_AMR
>   | 0x0003,
>  /* Architecture 2.07 variant   */
>  POWERPC_MMU_2_07   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
> + | POWERPC_MMU_64K
>   | POWERPC_MMU_AMR | 0x0004,
>  /* Architecture 2.07 "degraded" (no 1T segments)   */
>  POWERPC_MMU_2_07a  = POWERPC_MMU_64 | POWERPC_MMU_AMR
> diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
> index ed353b2d1539..fa26ad2e875b 100644
> --- a/target-ppc/mmu-hash64.c
> +++ b/target-ppc/mmu-hash64.c
> @@ -450,9 +450,31 @@ void ppc_hash64_stop_access(PowerPCCPU *cpu, uint64_t token)
>  }
>  }
>  
> +/* Returns the effective page shift or 0. MPSS isn't supported yet so
> + * this will always be the slb_pshift or 0
> + */
> +static uint32_t ppc_hash64_pte_size_decode(uint64_t pte1, uint32_t slb_pshift)
> +{
> +switch (slb_pshift) {
> +case 12:
> +return 12;
> +case 16:
> +if ((pte1 & 0xf000) == 0x1000) {
> +return 16;
> +}
> +return 0;
> +case 24:
> +if ((pte1 & 0xff000) == 0) {
> +return 24;
> +}
> +return 0;
> +}
> +return 0;
> +}
> +
>  static hwaddr ppc_hash64_pteg_search(PowerPCCPU *cpu, hwaddr hash,
> - bool secondary, target_ulong ptem,
> - ppc_hash_pte64_t *pte)
> + uint32_t slb_pshift, bool secondary,
> + target_ulong ptem, ppc_hash_pte64_t *pte)
>  {
>  CPUPPCState *env = &cpu->env;
>  int i;
> @@ -472,6 +494,13 @@ static hwaddr ppc_hash64_pteg_search(PowerPCCPU *cpu, hwaddr hash,
>  if ((pte0 & HPTE64_V_VALID)
>  && (secondary == !!(pte0 & HPTE64_V_SECONDARY))
>  && HPTE64_V_COMPARE(pte0, ptem)) {
> +uint32_t pshift = ppc_hash64_pte_size_decode(pte1, slb_pshift);
> +if (pshift == 0) {
> +continue;
> +}
> +/* We don't do anything with pshift yet as qemu TLB only deals
> + * with 4K pages anyway
> + */
>  pte->pte0 = pte0;
>  pte->pte1 = pte1;
>  ppc_hash64_stop_access(cpu, token);
> @@ -525,7 +554,8 @@ static hwaddr ppc_hash64_htab_lookup(PowerPCCPU *cpu,
>  " vsid=" TARGET_FMT_lx " ptem=" TARGET_FMT_lx
>  " hash=" TARGET_FMT_plx "\n",
>  env->htab_base, env->htab_mask, vsid, ptem,  hash);
> -pte_offset = ppc_hash64_pteg_search(cpu, hash, 0, ptem, pte);
> +pte_offset = ppc_hash64_pteg_search(cpu, hash, slb->sps->page_shift,
> +0, ptem, pte);
>  
>  if (pte_offset == -1) {
>  /* Secondary
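
The size check added by the quoted hunk can be tried in isolation. The following is a minimal standalone sketch, not the QEMU code itself: demo_pte_size_decode is a hypothetical name, and the masks are copied from the hunk as quoted, on the assumption that they are intact.

```c
#include <assert.h>
#include <stdint.h>

/* Standalone copy of the decode logic from the quoted hunk; the name
 * demo_pte_size_decode is hypothetical.  Returns the page shift, or 0
 * when the PTE's size encoding does not match the segment's base size.
 */
static uint32_t demo_pte_size_decode(uint64_t pte1, uint32_t slb_pshift)
{
    switch (slb_pshift) {
    case 12:                              /* 4K segment: any PTE accepted */
        return 12;
    case 16:                              /* 64K segment */
        return ((pte1 & 0xf000) == 0x1000) ? 16 : 0;
    case 24:                              /* 16M segment */
        return ((pte1 & 0xff000) == 0) ? 24 : 0;
    }
    return 0;                             /* unknown segment page size */
}
```

A return value of 0 is what makes the PTEG search skip an entry, so a 64K segment never matches a PTE that lacks the 64K encoding.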

Re: [Qemu-devel] [PATCH v0] spapr: Restore support for older PowerPC CPU cores

2016-06-28 Thread David Gibson

On Tue, Jun 28, 2016 at 08:35:02PM +0530, Bharata B Rao wrote:
> Introduction of core based CPU hotplug for PowerPC sPAPR didn't
> add support for 970 and POWER5+ based core types. Add support for
> the same.
> 
> Signed-off-by: Bharata B Rao 

Applied to ppc-for-2.7

> ---
> TODO:
> - There are few other variants of 970, like 970fx etc for which I have not
>   added core types since I am not sure if they fall under sPAPR category.

Yeah, frankly I wouldn't really trust the spapr code with anything
except POWER7 or POWER8.

> - Is it time to add core type for POWER8NVL yet ?

Yes.

>  hw/ppc/spapr_cpu_core.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 8b802a6..cebeef5 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -325,7 +325,6 @@ static void spapr_cpu_core_class_init(ObjectClass *oc, void *data)
>  
>  /*
>   * instance_init routines from different flavours of sPAPR CPU cores.
> - * TODO: Add support for 'host' core type.
>   */
>  #define SPAPR_CPU_CORE_INITFN(_type, _fname) \
>  static void glue(glue(spapr_cpu_core_, _fname), _initfn(Object *obj)) \
> @@ -338,6 +337,8 @@ static void glue(glue(spapr_cpu_core_, _fname), _initfn(Object *obj)) \
>  core->cpu_class = oc; \
>  }
>  
> +SPAPR_CPU_CORE_INITFN(970_v2.2, 970);
> +SPAPR_CPU_CORE_INITFN(POWER5+_v2.1, POWER5plus);
>  SPAPR_CPU_CORE_INITFN(POWER7_v2.3, POWER7);
>  SPAPR_CPU_CORE_INITFN(POWER7+_v2.1, POWER7plus);
>  SPAPR_CPU_CORE_INITFN(POWER8_v2.0, POWER8);
> @@ -349,6 +350,12 @@ typedef struct SPAPRCoreInfo {
>  } SPAPRCoreInfo;
>  
>  static const SPAPRCoreInfo spapr_cores[] = {
> +/* 970 */
> +{ .name = "970", .initfn = spapr_cpu_core_970_initfn },
> +
> +/* POWER5 */
> +{ .name = "POWER5+", .initfn = spapr_cpu_core_POWER5plus_initfn },
> +
>  /* POWER7 and aliases */
>  { .name = "POWER7_v2.3", .initfn = spapr_cpu_core_POWER7_initfn },
>  { .name = "POWER7", .initfn = spapr_cpu_core_POWER7_initfn },
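
For readers unfamiliar with the token pasting behind SPAPR_CPU_CORE_INITFN, here is a rough standalone sketch. DEMO_CORE_INITFN and the demo_core_* functions are hypothetical, and glue()/xglue() are defined locally in the usual two-level form on the assumption that this matches QEMU's helpers. Instead of resolving an ObjectClass, each generated function just returns the stringified type name.

```c
#include <assert.h>
#include <string.h>

/* Two-level paste so that macro arguments are expanded before ## applies. */
#define xglue(x, y) x ## y
#define glue(x, y) xglue(x, y)

/* Hypothetical stand-in for SPAPR_CPU_CORE_INITFN: the generated function
 * is named demo_core_<fname>_initfn and returns the type name as a string.
 */
#define DEMO_CORE_INITFN(type_, fname_) \
static const char *glue(glue(demo_core_, fname_), _initfn)(void) \
{ \
    return #type_; \
}

/* The second argument exists because "+" and "." cannot appear in a
 * C identifier, so the type name itself cannot be pasted.
 */
DEMO_CORE_INITFN(970_v2.2, 970)
DEMO_CORE_INITFN(POWER5+_v2.1, POWER5plus)
```

This is why the patch passes both "POWER5+_v2.1" and "POWER5plus": one feeds the type lookup, the other forms a legal function name.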

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH V5 1/5] hw/ppc: realize the PCI root bus as part of mac99 init

2016-06-28 Thread David Gibson

On Tue, Jun 28, 2016 at 11:00:18AM +0300, Marcel Apfelbaum wrote:
> On 06/28/2016 05:56 AM, David Gibson wrote:
> > On Mon, Jun 27, 2016 at 06:38:31PM +0300, Marcel Apfelbaum wrote:
> > > Mac99's PCI root bus is not part of a host bridge,
> > > realize it manually.
> > 
> > Um.. how did this ever work?
> 
> Well, the only thing the PCI bus realize does is
> to register the VM migration state, so only migration was affected.
> 
> However, patch 2/5 adds to the realize function bus_master initialization code
> for all devices attached to the bridge.

Ah, ok.  In that case, ppc portions are

Acked-by: David Gibson 

> 
> Thanks,
> Marcel
> 
> > 
> > > 
> > > Signed-off-by: Marcel Apfelbaum 
> > > ---
> > >   hw/ppc/mac_newworld.c | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
> > > index 32e88b3..7d25106 100644
> > > --- a/hw/ppc/mac_newworld.c
> > > +++ b/hw/ppc/mac_newworld.c
> > > @@ -380,6 +380,7 @@ static void ppc_core99_init(MachineState *machine)
> > >   pci_bus = pci_pmac_init(pic, get_system_memory(), get_system_io());
> > >   machine_arch = ARCH_MAC99;
> > >   }
> > > +object_property_set_bool(OBJECT(pci_bus), true, "realized", &error_abort);
> > > 
> > >   machine->usb |= defaults_enabled() && !machine->usb_disabled;
> > > 
> > 
> 


Re: [Qemu-devel] [PATCH V5 5/5] machine: remove iommu property

2016-06-28 Thread David Gibson

On Tue, Jun 28, 2016 at 11:07:52AM +0300, Marcel Apfelbaum wrote:
> On 06/28/2016 05:57 AM, David Gibson wrote:
> > On Mon, Jun 27, 2016 at 06:38:35PM +0300, Marcel Apfelbaum wrote:
> > > Since iommu devices can be created with '-device' there is
> > > no need to keep iommu as machine and mch property.
> > 
> > Doesn't this break backwards compatibility?
> > 
> 
> 
> Hi David,
> 
> Intel IOMMU was a kind of POC until recent development.
> The new IOMMU features will require more command line options,
> so even if we keep the machine property around, it can't be really used.
> 
> More on this was discussed in a previous thread:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg377385.html

Ok, fair enough.


Re: [Qemu-devel] [RFC PATCH 3/3] filter-rewriter: rewrite tcp packet to keep secondary connection

2016-06-28 Thread Jason Wang




On 2016-06-28 14:33, Zhang Chen wrote:




primary guest response 
pkt(seq=primary_seq+1,ack=client_seq+1+data_len,flag=ACK)
secondary guest response 
pkt(seq=secondary_seq+1,ack=client_seq+1+data_len,flag=ACK)


Is ACK a must here?


Yes.



It looks like not; e.g., what happens if the guest does not use piggybacked ACKs?




If the guest does not use piggybacked ACKs, it will send an independent 
packet for the ACK.

We will get this packet,
like:
pkt(seq=,ack=xxx,flag=ACK). 


Right, so it looks like if the guest wants to send some data too, it can send 
a TCP packet without ACK set?

Re: [Qemu-devel] Regression: virtio-pci: convert to ioeventfd callbacks

2016-06-28 Thread Jason Wang




On 2016-06-27 17:44, Peter Lieven wrote:

Hi, with the above patch applied:

commit 9f06e71a567ba5ee8b727e65a2d5347fd331d2aa
Author: Cornelia Huck 
Date:   Fri Jun 10 11:04:12 2016 +0200

virtio-pci: convert to ioeventfd callbacks

an Ubuntu 14.04 VM freezes at startup when blk-mq is set up - even if 
there is only one queue.


Peter




In fact, I noticed that vhost-net does not work on master. It looks like we are 
trying to set the host notifier without initialization, which seems to be a bug.

Re: [Qemu-devel] [PATCH v9 00/13] Add param Error ** for msi_init()--part2

2016-06-28 Thread Cao jin


ping again...
because I got so many "The following message to  was undeliverable" replies

On 06/28/2016 07:19 PM, Cao jin wrote:

ping

On 06/20/2016 02:13 PM, Cao jin wrote:

rebased against upstream, and passed make check.

changelog:
1. vmw_pvscsi: for compatibility, leave the field msi_used alone.
2. Since the patch "msi_init: change return value to 0 on success" has
   been adopted first, the patch "megasas: Fix check for msi_init()
   failure" isn't necessary anymore, so drop it.
3. Fix the failure of make check. It is actually not a bug: the test case
   "/ahci/hba_spec" always assumes that the first capability pointed to by
   the Capabilities pointer is MSI, and the patch changed the order in
   which the capabilities are added. Since we don't pass an error object
   to msi_init() in ich9ahci and return on its error, and since the
   PCIDeviceClass->exit function is enough to free all the resources even
   if .realize() returns on msi_init() failure, revert to the position
   where we added the MSI capability to make "make check" happy.

cc: Gerd Hoffmann 
cc: John Snow 
cc: Dmitry Fleytman 
cc: Jason Wang 
cc: Michael S. Tsirkin 
cc: Hannes Reinecke 
cc: Paolo Bonzini 
cc: Alex Williamson 
cc: Markus Armbruster 
cc: Marcel Apfelbaum 

Cao jin (13):
   change pvscsi_init_msi() type to void
   mptsas: change .realize function name
   usb xhci: change msi/msix property type
   intel-hda: change msi property type
   mptsas: change msi property type
   megasas: change msi/msix property type
   pci bridge dev: change msi property type
   pci: Convert msi_init() to Error and fix callers to check it
   megasas: remove unnecessary megasas_use_msi()
   mptsas: remove unnecessary internal msi state flag
   vmxnet3: remove unnecessary internal msi state flag
   e1000e: remove unnecessary internal msi state flag
   vmw_pvscsi: remove unnecessary internal msi state flag

  hw/audio/intel-hda.c   | 29 +++
  hw/ide/ich.c   |  7 +++--
  hw/net/e1000e.c| 37 +---
  hw/net/vmxnet3.c   | 52 +++--
  hw/pci-bridge/ioh3420.c|  6 +++-
  hw/pci-bridge/pci_bridge_dev.c | 31 ++--
  hw/pci-bridge/xio3130_downstream.c |  6 +++-
  hw/pci-bridge/xio3130_upstream.c   |  6 +++-
  hw/pci/msi.c   | 11 +--
  hw/scsi/megasas.c  | 59 --
  hw/scsi/mptsas.c   | 40 +-
  hw/scsi/mptsas.h   |  5 ++--
  hw/scsi/vmw_pvscsi.c   | 15 --
  hw/usb/hcd-xhci.c  | 35 --
  hw/vfio/pci.c  |  7 +++--
  include/hw/pci/msi.h   |  3 +-
  16 files changed, 194 insertions(+), 155 deletions(-)





--
Yours Sincerely,

Cao jin

Re: [Qemu-devel] [PATCH v4 1/3] block: ignore flush requests when storage is clean

2016-06-28 Thread Fam Zheng

On Tue, 06/28 12:10, Denis V. Lunev wrote:
> On 06/28/2016 04:27 AM, Fam Zheng wrote:
> > On Mon, 06/27 17:47, Denis V. Lunev wrote:
> > > From: Evgeny Yakovlev 
> > > 
> > > Some guests (win2008 server for example) do a lot of unnecessary
> > > flushing when underlying media has not changed. This adds additional
> > > overhead on host when calling fsync/fdatasync.
> > > 
> > > This change introduces a dirty flag in BlockDriverState which is set
> > > in bdrv_set_dirty and is checked in bdrv_co_flush. This allows us to
> > > avoid unnecessary flushing when storage is clean.
> > > 
> > > The problem with excessive flushing was found by a performance test
> > > which does parallel directory tree creation (from 2 processes).
> > > Results improved from 0.424 loops/sec to 0.432 loops/sec.
> > > Each loop creates 10^3 directories with 10 files in each.
> > > 
> > > Signed-off-by: Evgeny Yakovlev 
> > > Signed-off-by: Denis V. Lunev 
> > > CC: Kevin Wolf 
> > > CC: Max Reitz 
> > > CC: Stefan Hajnoczi 
> > > CC: Fam Zheng 
> > > CC: John Snow 
> > > ---
> > >   block.c   |  1 +
> > >   block/dirty-bitmap.c  |  3 +++
> > >   block/io.c| 19 +++
> > >   include/block/block_int.h |  1 +
> > >   4 files changed, 24 insertions(+)
> > > 
> > > diff --git a/block.c b/block.c
> > > index 947df29..68ae3a0 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -2581,6 +2581,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
> > >   ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
> > >   bdrv_dirty_bitmap_truncate(bs);
> > >   bdrv_parent_cb_resize(bs);
> > > +bs->dirty = true; /* file node sync is needed after truncate */
> > >   }
> > >   return ret;
> > >   }
> > > diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> > > index 4902ca5..54e0413 100644
> > > --- a/block/dirty-bitmap.c
> > > +++ b/block/dirty-bitmap.c
> > > @@ -370,6 +370,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
> > >   }
> > >   hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
> > >   }
> > > +
> > > +/* Set global block driver dirty flag even if bitmap is disabled */
> > > +bs->dirty = true;
> > >   }
> > >   /**
> > > diff --git a/block/io.c b/block/io.c
> > > index b9e53e3..152f5a9 100644
> > > --- a/block/io.c
> > > +++ b/block/io.c
> > > @@ -2247,6 +2247,25 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
> > >   goto flush_parent;
> > >   }
> > > +/* Check if storage is actually dirty before flushing to disk */
> > > +if (!bs->dirty) {
> > > +/* Flush requests are appended to tracked request list in order so that
> > > + * most recent request is at the head of the list. Following code uses
> > > + * this ordering to wait for the most recent flush request to complete
> > > + * to ensure that requests return in order */
> > > +BdrvTrackedRequest *prev_req;
> > > +QLIST_FOREACH(prev_req, &bs->tracked_requests, list) {
> > > +if (prev_req == &req || prev_req->type != BDRV_TRACKED_FLUSH) {
> > > +continue;
> > > +}
> > > +
> > > +qemu_co_queue_wait(&prev_req->wait_queue);
> > > +break;
> > > +}
> > > +goto flush_parent;
> > Should we check bs->dirty again after qemu_co_queue_wait()? I think another
> > write request could sneak in while this coroutine yields.
> No, we do not care. Any write issued after the FLUSH is not guaranteed to
> be flushed. We only have the guarantee that all write requests completed
> prior to this flush are really flushed.

I'm not worried about subsequent requests.

A prior request can already be in progress or waiting when we check
bs->dirty. The flag is still false at that point, but it will become true
soon, because bdrv_set_dirty is only called when a request is completing.

Fam
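
The ordering in question can be modelled in a few lines of plain C. This is only a toy sketch: struct demo_bs and the demo_* helpers are made-up names, not QEMU code, and coroutines are replaced by explicit calls. It assumes, as described above, that the dirty flag is only set when a write completes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for BlockDriverState; all names here are hypothetical. */
struct demo_bs {
    bool dirty;          /* set when a write completes */
    int  in_flight;      /* writes submitted but not yet completed */
    int  disk_flushes;   /* how many times a flush reached the "disk" */
};

static void demo_write_submit(struct demo_bs *bs)
{
    bs->in_flight++;
}

static void demo_write_complete(struct demo_bs *bs)
{
    bs->in_flight--;
    bs->dirty = true;    /* mirrors bdrv_set_dirty() running at completion */
}

/* Flush that skips the disk when the flag says "clean". */
static void demo_flush(struct demo_bs *bs)
{
    if (!bs->dirty) {
        return;          /* skipped: nothing recorded as dirty */
    }
    bs->dirty = false;
    bs->disk_flushes++;
}

/* Replays submit -> flush -> complete and returns the final state. */
static struct demo_bs demo_race(void)
{
    struct demo_bs bs = { false, 0, 0 };

    demo_write_submit(&bs);
    demo_flush(&bs);
    demo_write_complete(&bs);
    return bs;
}
```

Replaying submit, flush, complete leaves the flag set with no flush having reached the disk. Whether that matters depends on what a flush is required to cover, which is exactly the point under discussion.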

> 
> 
> 
> > > +}
> > > +bs->dirty = false;
> > > +
> > >   BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
> > >   if (bs->drv->bdrv_co_flush_to_disk) {
> > >   ret = bs->drv->bdrv_co_flush_to_disk(bs);
> > > diff --git a/include/block/block_int.h b/include/block/block_int.h
> > > index 0432ba5..59a7def 100644
> > > --- a/include/block/block_int.h
> > > +++ b/include/block/block_int.h
> > > @@ -435,6 +435,7 @@ struct BlockDriverState {
> > >   bool valid_key; /* if true, a valid encryption key has been set */
> > >   bool sg;/* if true, the device is a /dev/sg* */
> > >   bool probed;/* if true, format was probed rather than specified */
> > > +bool dirty; /* if true, media is dirty and should be flushed */
> > How about renaming this to "need_flush"? The one "dirty" we had is

Re: [Qemu-devel] [RFC PATCH 07/11] introduce zynqmp_crf

2016-06-28 Thread Alistair Francis

On Mon, Jun 13, 2016 at 9:27 AM,   wrote:
> From: KONRAD Frederic 
>
> This introduces the Xilinx zynqmp-crf.
> It is extracted from the qemu xilinx tree
> (02d2f0203dd489ed30d9c8d90c14a52c57332b25) and is used as
> an example for the clock framework.

Watch out with this one: the newest register API sent to mainline is
different from the one we use internally.

This looks like it won't work with the newest version.

I think you should just leave it as is until the register API is accepted
and fix it up then.

Thanks,

Alistair

> ---
>  hw/misc/Makefile.objs   |   1 +
>  hw/misc/xilinx_zynqmp_crf.c | 532 
> 
>  2 files changed, 533 insertions(+)
>  create mode 100644 hw/misc/xilinx_zynqmp_crf.c
>
> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
> index e8b8855..c6e3c7f 100644
> --- a/hw/misc/Makefile.objs
> +++ b/hw/misc/Makefile.objs
> @@ -44,6 +44,7 @@ obj-$(CONFIG_SLAVIO) += slavio_misc.o
>  obj-$(CONFIG_ZYNQ) += zynq_slcr.o
>  obj-$(CONFIG_ZYNQ) += zynq-xadc.o
>  obj-$(CONFIG_ZYNQ) += xlnx-zynqmp-iou-slcr.o
> +obj-$(CONFIG_ZYNQ) += xilinx_zynqmp_crf.o
>  obj-$(CONFIG_STM32F2XX_SYSCFG) += stm32f2xx_syscfg.o
>  obj-$(CONFIG_MIPS_CPS) += mips_cmgcr.o
>  obj-$(CONFIG_MIPS_CPS) += mips_cpc.o
> diff --git a/hw/misc/xilinx_zynqmp_crf.c b/hw/misc/xilinx_zynqmp_crf.c
> new file mode 100644
> index 000..b1bf2a6
> --- /dev/null
> +++ b/hw/misc/xilinx_zynqmp_crf.c
> @@ -0,0 +1,532 @@
> +/*
> + * QEMU model of the CRF_APB APB control registers for clock controller. The
> + * RST_ctrl_fpd will be added to this as well
> + *
> + * Copyright (c) 2014 Xilinx Inc.
> + *
> + * Autogenerated by xregqemu.py 2014-01-22.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/sysbus.h"
> +#include "hw/register.h"
> +#include "qemu/bitops.h"
> +#include "qemu/log.h"
> +
> +#ifndef XILINX_CRF_APB_ERR_DEBUG
> +#define XILINX_CRF_APB_ERR_DEBUG 0
> +#endif
> +
> +#define TYPE_XILINX_CRF_APB "xlnx.zynqmp_crf"
> +
> +#define XILINX_CRF_APB(obj) \
> + OBJECT_CHECK(CRF_APB, (obj), TYPE_XILINX_CRF_APB)
> +
> +REG32(ERR_CTRL, 0x0)
> +FIELD(ERR_CTRL, SLVERR_ENABLE, 1, 0)
> +REG32(IR_STATUS, 0x4)
> +FIELD(IR_STATUS, ADDR_DECODE_ERR, 1, 0)
> +REG32(IR_MASK, 0x8)
> +FIELD(IR_MASK, ADDR_DECODE_ERR, 1, 0)
> +REG32(IR_ENABLE, 0xc)
> +FIELD(IR_ENABLE, ADDR_DECODE_ERR, 1, 0)
> +REG32(IR_DISABLE, 0x10)
> +FIELD(IR_DISABLE, ADDR_DECODE_ERR, 1, 0)
> +REG32(CRF_ECO, 0x18)
> +REG32(APLL_CTRL, 0x20)
> +FIELD(APLL_CTRL, POST_SRC, 3, 24)
> +FIELD(APLL_CTRL, PRE_SRC, 3, 20)
> +FIELD(APLL_CTRL, CLKOUTDIV, 1, 17)
> +FIELD(APLL_CTRL, DIV2, 1, 16)
> +FIELD(APLL_CTRL, FBDIV, 7, 8)
> +FIELD(APLL_CTRL, BYPASS, 1, 3)
> +FIELD(APLL_CTRL, RESET, 1, 0)
> +REG32(APLL_CFG, 0x24)
> +FIELD(APLL_CFG, LOCK_DLY, 7, 25)
> +FIELD(APLL_CFG, LOCK_CNT, 10, 13)
> +FIELD(APLL_CFG, LFHF, 2, 10)
> +FIELD(APLL_CFG, CP, 4, 5)
> +FIELD(APLL_CFG, RES, 4, 0)
> +REG32(APLL_FRAC_CFG, 0x28)
> +FIELD(APLL_FRAC_CFG, ENABLED, 1, 31)
> +FIELD(APLL_FRAC_CFG, SEED, 3, 22)
> +FIELD(APLL_FRAC_CFG, ALGRTHM, 1, 19)
> +FIELD(APLL_FRAC_CFG, ORDER, 1, 18)
> +FIELD(APLL_FRAC_CFG, DATA, 16, 0)
> +REG32(DPLL_CTRL, 0x2c)
> +FIELD(DPLL_CTRL, POST_SRC, 3, 24)
> +FIELD(DPLL_CTRL, PRE_SRC, 3, 20)
> +FIELD(DPLL_CTRL, CLKOUTDIV, 1, 17)
> +FIELD(DPLL_CTRL, DIV2, 1, 16)
> +FIELD(DPLL_CTRL, FBDIV, 7, 8)
> +FIELD(DPLL_CTRL, BYPASS, 1, 3)
> +FIELD(DPLL_CTRL, RESET, 1, 0)
> +REG32(DPLL_CFG, 0x30)
> +FIELD(DPLL_CFG, LOCK_DLY, 7, 25)
> +FIELD(DPLL_CFG, LOCK_CNT, 10, 13)
> +FIELD(DPLL_CFG, LFHF, 2, 10)
> +FIELD(DPLL_CFG, CP, 4, 5)
> +FIELD(DPLL_CFG, RES, 4, 0)
> +REG32(DPLL_FRAC_CFG, 0x34)
> +FIELD(DPLL_FRAC_CFG, ENABLED, 1, 31)
> +

Re: [Qemu-devel] [RFC PATCH 05/11] docs: add qemu-clock documentation

2016-06-28 Thread Alistair Francis

On Mon, Jun 13, 2016 at 9:27 AM,   wrote:
> From: KONRAD Frederic 
>
> This adds the qemu-clock documentation.
>
> Signed-off-by: KONRAD Frederic 
> ---
>  docs/clock.txt | 112 
> +
>  1 file changed, 112 insertions(+)
>  create mode 100644 docs/clock.txt
>
> diff --git a/docs/clock.txt b/docs/clock.txt
> new file mode 100644
> index 000..f4ad4c8
> --- /dev/null
> +++ b/docs/clock.txt
> @@ -0,0 +1,112 @@
> +
> +What is a QEMU_CLOCK
> +
> +
> +A QEMU_CLOCK is a QOM Object developed for the purpose of modeling a clock tree
> +with QEMU.
> +
> +It only simulates the clock by keeping a copy of the current frequency and
> +doesn't model the signal itself such as pin toggle or duty cycle.
> +
> +It allows to model the impact of badly configured PLL, clock source selection
> +or disabled clock on the models.
> +
> +Bounding the clock together to create a tree
> +
> +
> +In order to create a clock tree with QEMU_CLOCK two or more clock must be bound
> +together. Let's say there are two clocks clk_a and clk_b:
> +Using qemu_clk_bound(clk_a, clk_b) will bound clk_a and clk_b.
> +
> +Binding two qemu-clk together is a unidirectional link which means that changing
> +the rate of clk_a will propagate to clk_b and not the opposite. The bound
> +process automatically refresh clk_b rate.
> +
> +Clock can be bound and unbound during execution for modeling eg: a clock
> +selector.
> +
> +A clock can drive more than one other clock. eg with this code:
> +qemu_clk_bound(clk_a, clk_b);
> +qemu_clk_bound(clk_a, clk_c);
> +
> +A clock rate change one clk_a will propagate to clk_b and clk_c.
> +
> +Implementing a callback on a rate change
> +
> +
> +The function prototype is the following:
> +typedef float (*qemu_clk_rate_change_cb)(void *opaque, float rate);
> +
> +It's main goal is to modify the rate before it's passed to the next clocks in
> +the tree.
> +
> +eg: for a 4x PLL the function will be:
> +float qemu_clk_rate_change_cb(void *opaque, float rate)
> +{
> +return 4.0 * rate;
> +}
> +
> +To set the callback for the clock:
> +void qemu_clk_set_callback(qemu_clk clk, qemu_clk_on_rate_update_cb cb,
> +   void *opaque);
> +can be called.
> +
> +NOTE: It's not recommended that the clock is driven by more than one clock as it
> +would mean that we don't know which clock trigger the callback.

Would this not be something worth knowing?

Thanks,

Alistair

> +The rate update process
> +===
> +
> +The rate update happen in this way:
> +When a model wants to update a clock frequency (eg: based on a register change
> +or something similar) it will call qemu_clk_update_rate(..) on the clock:
> +  * The callback associated to the clock is called with the new rate.
> +  * qemu_clk_update_rate(..) is then called on all bound clock with the
> +value returned by the callback.
> +
> +NOTE: When no callback is attached the clock qemu_clk_update_rate(..) is called
> +with the unmodified rate.
> +
> +Attaching a QEMU_CLOCK to a DeviceState
> +===
> +
> +Attaching a qemu-clk to a DeviceState is required to be able to get the clock
> +outside the model through qemu_clk_get_pin(..).
> +
> +It is also required to be able to print the clock and its rate with info qtree.
> +For example:
> +
> +  type System
> +  dev: xlnx.zynqmp_crf, id ""
> +gpio-out "sysbus-irq" 1
> +gpio-out "RST_A9" 4
> +qemu-clk "dbg_trace" 0.0
> +qemu-clk "vpll_to_lpd" 62500.0
> +qemu-clk "dp_stc_ref" 0.0
> +qemu-clk "dpll_to_lpd" 1250.0
> +qemu-clk "acpu_clk" 0.0
> +qemu-clk "pcie_ref" 0.0
> +qemu-clk "topsw_main" 0.0
> +qemu-clk "topsw_lsbus" 0.0
> +qemu-clk "dp_audio_ref" 0.0
> +qemu-clk "sata_ref" 0.0
> +qemu-clk "dp_video_ref" 71428568.0
> +qemu-clk "vpll_clk" 25.0
> +qemu-clk "apll_to_lpd" 1250.0
> +qemu-clk "dpll_clk" 5000.0
> +qemu-clk "gpu_ref" 0.0
> +qemu-clk "aux_refclk" 0.0
> +qemu-clk "video_clk" 2700.0
> +qemu-clk "gdma_ref" 0.0
> +qemu-clk "gt_crx_ref_clk" 0.0
> +qemu-clk "dbg_fdp" 0.0
> +qemu-clk "apll_clk" 5000.0
> +qemu-clk "pss_alt_ref_clk" 0.0
> +qemu-clk "ddr" 0.0
> +qemu-clk "pss_ref_clk" 5000.0
> +qemu-clk "dpdma_ref" 0.0
> +qemu-clk "dbg_tstmp" 0.0
> +mmio fd1a/010c
> +
> +This way a DeviceState can have multiple clock input or output.
> +
> --
> 2.5.5
>
>
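
To make the rate update process described in the quoted document concrete, here is a small standalone model. It is only a sketch under stated assumptions: struct demo_clk and the demo_* functions are hypothetical names that mirror the documented behaviour (callback first, then propagation to every bound clock), and a fixed-size array replaces whatever list the real implementation uses.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_BOUND 4

typedef float (*demo_rate_cb)(void *opaque, float rate);

/* Hypothetical clock node: keeps the input rate, the output rate and
 * the clocks it drives.
 */
struct demo_clk {
    float in_rate;
    float out_rate;
    demo_rate_cb cb;
    void *opaque;
    struct demo_clk *bound[DEMO_MAX_BOUND];
    size_t n_bound;
};

/* Apply the callback (if any), then push the result to driven clocks. */
static void demo_update_rate(struct demo_clk *clk, float rate)
{
    clk->in_rate = rate;
    clk->out_rate = clk->cb ? clk->cb(clk->opaque, rate) : rate;
    for (size_t i = 0; i < clk->n_bound; i++) {
        demo_update_rate(clk->bound[i], clk->out_rate);
    }
}

/* One-way link: changes on @out reach @in, never the reverse. */
static void demo_bind(struct demo_clk *out, struct demo_clk *in)
{
    assert(out->n_bound < DEMO_MAX_BOUND);
    out->bound[out->n_bound++] = in;
    demo_update_rate(in, out->out_rate);
}

/* The 4x PLL callback from the document. */
static float demo_pll_x4(void *opaque, float rate)
{
    (void)opaque;
    return 4.0f * rate;
}

/* ref -> pll (x4) -> leaf; returns the rate seen at the leaf. */
static float demo_run(void)
{
    struct demo_clk ref = {0}, pll = {0}, leaf = {0};

    pll.cb = demo_pll_x4;
    demo_bind(&ref, &pll);
    demo_bind(&pll, &leaf);
    demo_update_rate(&ref, 25.0f);
    return leaf.out_rate;
}
```

With a 25.0 reference and the 4x callback on the middle clock, the leaf ends up at 100.0, and nothing ever flows back from the leaf to the reference.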

Re: [Qemu-devel] [RFC PATCH 04/11] qdev-monitor: print the device's clock with info qtree

2016-06-28 Thread Alistair Francis

On Mon, Jun 13, 2016 at 9:27 AM,   wrote:
> From: KONRAD Frederic 
>
> This prints the clock attached to a DeviceState when using the "info qtree"
> monitor command.

Can you include an example of what this will look like?

Thanks,

Alistair

>
> Signed-off-by: KONRAD Frederic 
> ---
>  include/qemu/qemu-clock.h |  9 +
>  qdev-monitor.c|  2 ++
>  qemu-clock.c  | 28 
>  3 files changed, 39 insertions(+)
>
> diff --git a/include/qemu/qemu-clock.h b/include/qemu/qemu-clock.h
> index 677de9a..265ec65 100644
> --- a/include/qemu/qemu-clock.h
> +++ b/include/qemu/qemu-clock.h
> @@ -124,4 +124,13 @@ void qemu_clk_set_callback(qemu_clk clk,
> qemu_clk_on_rate_update_cb cb,
> void *opaque);
>
> +/**
> + * qemu_clk_print:
> + * @dev: the device for which the clock need to be printed.
> + *
> + * Print the clock information for a given device.
> + *
> + */
> +void qemu_clk_print(Monitor *mon, DeviceState *dev, int indent);
> +
>  #endif /* QEMU_CLOCK_H */
> diff --git a/qdev-monitor.c b/qdev-monitor.c
> index e19617f..d6d1aa4 100644
> --- a/qdev-monitor.c
> +++ b/qdev-monitor.c
> @@ -28,6 +28,7 @@
>  #include "qemu/config-file.h"
>  #include "qemu/error-report.h"
>  #include "qemu/help_option.h"
> +#include "qemu/qemu-clock.h"
>
>  /*
>   * Aliases were a bad idea from the start.  Let's keep them
> @@ -684,6 +685,7 @@ static void qdev_print(Monitor *mon, DeviceState *dev, int indent)
>  ngl->num_out);
>  }
>  }
> +qemu_clk_print(mon, dev, indent);
>  class = object_get_class(OBJECT(dev));
>  do {
>  qdev_print_props(mon, dev, DEVICE_CLASS(class)->props, indent);
> diff --git a/qemu-clock.c b/qemu-clock.c
> index 811d6a0..378a14d 100644
> --- a/qemu-clock.c
> +++ b/qemu-clock.c
> @@ -24,6 +24,7 @@
>  #include "qemu/qemu-clock.h"
>  #include "hw/hw.h"
>  #include "qapi/error.h"
> +#include "monitor/monitor.h"
>
>  /* #define DEBUG_QEMU_CLOCK */
>
> @@ -111,6 +112,33 @@ qemu_clk qemu_clk_get_pin(DeviceState *d, const char *name)
>  return QEMU_CLOCK(clk);
>  }
>
> +struct print_opaque {
> +Monitor *mon;
> +int indent;
> +};
> +
> +static int qemu_clk_print_rec(Object *obj, void *opaque)
> +{
> +qemu_clk clk = (qemu_clk)(object_dynamic_cast(obj, TYPE_CLOCK));
> +struct print_opaque *po = opaque;
> +
> +if (clk) {
> +monitor_printf(po->mon, "%*s" "qemu-clk \"%s\" %.1f\n", po->indent,
> +   " ", clk->name, clk->out_rate);
> +}
> +
> +return 0;
> +}
> +
> +void qemu_clk_print(Monitor *mon, DeviceState *dev, int indent)
> +{
> +struct print_opaque po;
> +
> +po.indent = indent;
> +po.mon = mon;
> +object_child_foreach(OBJECT(dev), qemu_clk_print_rec, &po);
> +}
> +
>  static const TypeInfo qemu_clk_info = {
>  .name  = TYPE_CLOCK,
>  .parent= TYPE_OBJECT,
> --
> 2.5.5
>
>
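
As a rough idea of what the output looks like, the format string from the quoted qemu_clk_print_rec() can be exercised on its own. This is a sketch, not the monitor code: demo_format_clk is a hypothetical helper that uses snprintf() in place of monitor_printf(), and the clock names and rates are taken from the example listing in docs/clock.txt (patch 05/11).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats one clock line with the same format string as the quoted
 * monitor_printf() call.  demo_format_clk is a hypothetical helper that
 * writes into a caller-supplied buffer.
 */
static int demo_format_clk(char *buf, size_t len, int indent,
                           const char *name, float rate)
{
    /* "%*s" pads the single-space string to at least `indent` columns. */
    return snprintf(buf, len, "%*s" "qemu-clk \"%s\" %.1f\n",
                    indent, " ", name, rate);
}
```

With indent 4 and the clock "dbg_trace" at rate 0.0 this yields four spaces followed by: qemu-clk "dbg_trace" 0.0. Because the padded string is " " rather than "", an indent of 0 still produces one leading space.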

Re: [Qemu-devel] [RFC PATCH 03/11] qemu-clk: allow to bound two clocks together

2016-06-28 Thread Alistair Francis

On Mon, Jun 13, 2016 at 9:27 AM,   wrote:
> From: KONRAD Frederic 
>
> This introduces the clock binding and the update part.
> When the qemu_clk_rate_update(qemu_clk, int) function is called:
>   * The clock callback is called on the qemu_clk so it can change the rate.
>   * The qemu_clk_rate_update function is called on all the driven clock.
>
> Signed-off-by: KONRAD Frederic 
> ---
>  include/qemu/qemu-clock.h | 65 
> +++
>  qemu-clock.c  | 56 
>  2 files changed, 121 insertions(+)
>
> diff --git a/include/qemu/qemu-clock.h b/include/qemu/qemu-clock.h
> index a2ba105..677de9a 100644
> --- a/include/qemu/qemu-clock.h
> +++ b/include/qemu/qemu-clock.h
> @@ -27,15 +27,29 @@
>  #include "qemu/osdep.h"
>  #include "qom/object.h"
>
> +typedef float (*qemu_clk_on_rate_update_cb)(void *opaque, float rate);
> +
>  #define TYPE_CLOCK "qemu-clk"
>  #define QEMU_CLOCK(obj) OBJECT_CHECK(struct qemu_clk, (obj), TYPE_CLOCK)
>
> +typedef struct ClkList ClkList;
> +
>  typedef struct qemu_clk {
>  /*< private >*/
>  Object parent_obj;
>  char *name;/* name of this clock in the device. */
> +float in_rate; /* rate of the clock which drive this pin. */
> +float out_rate;/* rate of this clock pin. */
> +void *opaque;
> +qemu_clk_on_rate_update_cb cb;
> +QLIST_HEAD(, ClkList) bound;
>  } *qemu_clk;
>
> +struct ClkList {
> +qemu_clk clk;
> +QLIST_ENTRY(ClkList) node;
> +};
> +
>  /**
>   * qemu_clk_attach_to_device:
>   * @d: the device on which the clock need to be attached.
> @@ -59,4 +73,55 @@ void qemu_clk_attach_to_device(DeviceState *d, qemu_clk clk,
>   */
>  qemu_clk qemu_clk_get_pin(DeviceState *d, const char *name);
>
> +/**
> + * qemu_clk_bound_clock:

Maybe this should be bind and unbind clock instead of bound and unbound.

> + * @out: the clock output.
> + * @in: the clock input.
> + *
> + * Connect the clock together. This is unidirectionnal so a
> + * qemu_clk_update_rate will go from @out to @in.

s/unidirectionnal/unidirectional/g

> + *
> + */
> +void qemu_clk_bound_clock(qemu_clk out, qemu_clk in);
> +
> +/**
> + * qemu_clk_unbound:
> + * @out: the clock output.
> + * @in: the clock input.
> + *
> + * Disconnect the clock if they were bound together.

clocks

> + *
> + */
> +void qemu_clk_unbound(qemu_clk out, qemu_clk in);
> +
> +/**
> + * qemu_clk_update_rate:
> + * @clk: the clock to update.
> + * @rate: the new rate.
> + *
> + * Update the @clk to the new @rate.
> + *
> + */
> +void qemu_clk_update_rate(qemu_clk clk, float rate);
> +
> +/**
> + * qemu_clk_refresh:
> + * @clk: the clock to be refreshed.
> + *
> + * This updates all childs of a clock without changing its own rate.

Can you clarify what this does?

> + *
> + */
> +void qemu_clk_refresh(qemu_clk clk);
> +
> +/**
> + * qemu_clk_set_callback:
> + * @clk: the clock where to set the callback.
> + * @cb: the callback to associate to the callback.
> + * @opaque: the opaque data passed to the calback.
> + *
> + */
> +void qemu_clk_set_callback(qemu_clk clk,
> +   qemu_clk_on_rate_update_cb cb,
> +   void *opaque);
> +
>  #endif /* QEMU_CLOCK_H */
> diff --git a/qemu-clock.c b/qemu-clock.c
> index 81f2852..811d6a0 100644
> --- a/qemu-clock.c
> +++ b/qemu-clock.c
> @@ -34,6 +34,62 @@ do { printf("qemu-clock: " fmt , ## __VA_ARGS__); } while (0)
>  #define DPRINTF(fmt, ...) do { } while (0)
>  #endif
>
> +void qemu_clk_refresh(qemu_clk clk)
> +{
> +qemu_clk_update_rate(clk, clk->in_rate);
> +}
> +
> +void qemu_clk_update_rate(qemu_clk clk, float rate)
> +{
> +ClkList *child;
> +
> +clk->in_rate = rate;
> +clk->out_rate = rate;
> +
> +if (clk->cb) {
> +clk->out_rate = clk->cb(clk->opaque, rate);
> +}
> +
> +DPRINTF("%s output rate updated to %.1f\n",
> +object_get_canonical_path(OBJECT(clk)),
> +clk->out_rate);
> +
> +QLIST_FOREACH(child, &clk->bound, node) {
> +qemu_clk_update_rate(child->clk, clk->out_rate);
> +}
> +}
> +
> +void qemu_clk_bound_clock(qemu_clk out, qemu_clk in)
> +{
> +ClkList *child;
> +
> +child = g_malloc(sizeof(child));
> +assert(child);
> +child->clk = in;
> +QLIST_INSERT_HEAD(&out->bound, child, node);
> +qemu_clk_update_rate(in, out->out_rate);
> +}
> +
> +void qemu_clk_unbound(qemu_clk out, qemu_clk in)
> +{
> +ClkList *child, *next;
> +
> +QLIST_FOREACH_SAFE(child, &out->bound, node, next) {
> +if (child->clk == in) {
> +QLIST_REMOVE(child, node);
> +g_free(child);
> +}
> +}
> +}
> +
> +void qemu_clk_set_callback(qemu_clk clk,
> +   qemu_clk_on_rate_update_cb cb,
> +   void *opaque)
> +{
> +clk->cb = cb;
> +clk->opaque = opaque;
> +}
> +
>
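
One detail in the quoted qemu_clk_bound_clock(): g_malloc(sizeof(child)) takes the size of the pointer, not of the ClkList node it points to. A minimal standalone sketch of the bind/unbind bookkeeping with the node sized via sizeof(*node) follows. It uses plain malloc() and a hand-rolled singly linked list rather than QEMU's g_malloc() and QLIST macros, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_clk;

/* List node recording one clock driven by another. */
struct demo_node {
    struct demo_clk *clk;
    struct demo_node *next;
};

struct demo_clk {
    float out_rate;
    struct demo_node *bound;   /* clocks driven by this one */
};

static void demo_bind(struct demo_clk *out, struct demo_clk *in)
{
    /* sizeof(*node) sizes the node itself, not the pointer to it. */
    struct demo_node *node = malloc(sizeof(*node));

    assert(node);
    node->clk = in;
    node->next = out->bound;   /* insert at the head of the list */
    out->bound = node;
}

static void demo_unbind(struct demo_clk *out, struct demo_clk *in)
{
    struct demo_node **link = &out->bound;

    while (*link) {
        struct demo_node *node = *link;

        if (node->clk == in) {
            *link = node->next;    /* unlink, then release the node */
            free(node);
        } else {
            link = &node->next;
        }
    }
}

/* Number of clocks currently driven by @out. */
static int demo_count(const struct demo_clk *out)
{
    int n = 0;

    for (const struct demo_node *p = out->bound; p; p = p->next) {
        n++;
    }
    return n;
}
```

Unbinding walks the list through a pointer to the link, so removing the head needs no special case.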

Re: [Qemu-devel] [RFC PATCH 01/11] qemu-clk: introduce qemu-clk qom object

2016-06-28 Thread Alistair Francis

On Mon, Jun 13, 2016 at 9:27 AM,   wrote:
> From: KONRAD Frederic 
>
> This introduces qemu-clk qom object.
>
> Signed-off-by: KONRAD Frederic 
> ---
>  Makefile.objs |  1 +
>  include/qemu/qemu-clock.h | 40 
>  qemu-clock.c  | 47 
> +++
>  3 files changed, 88 insertions(+)
>  create mode 100644 include/qemu/qemu-clock.h
>  create mode 100644 qemu-clock.c
>
> diff --git a/Makefile.objs b/Makefile.objs
> index 61f4bf4..2284ef5 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -77,6 +77,7 @@ common-obj-y += backends/
>  common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
>
>  common-obj-$(CONFIG_FDT) += device_tree.o
> +common-obj-y += qemu-clock.o
>
>  ##
>  # qapi
> diff --git a/include/qemu/qemu-clock.h b/include/qemu/qemu-clock.h
> new file mode 100644
> index 000..e7acd68
> --- /dev/null
> +++ b/include/qemu/qemu-clock.h
> @@ -0,0 +1,40 @@
> +/*
> + * QEMU Clock
> + *
> + *  Copyright (C) 2016 : GreenSocs Ltd
> + *  http://www.greensocs.com/ , email: i...@greensocs.com
> + *
> + *  Frederic Konrad 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see .
> + *
> + */
> +
> +#ifndef QEMU_CLOCK_H
> +#define QEMU_CLOCK_H
> +
> +#include "qemu/osdep.h"
> +#include "qom/object.h"
> +
> +#define TYPE_CLOCK "qemu-clk"
> +#define QEMU_CLOCK(obj) OBJECT_CHECK(struct qemu_clk, (obj), TYPE_CLOCK)
> +
> +typedef struct qemu_clk {
> +/*< private >*/
> +Object parent_obj;
> +} *qemu_clk;
> +
> +#endif /* QEMU_CLOCK_H */
> +
> +
> diff --git a/qemu-clock.c b/qemu-clock.c
> new file mode 100644
> index 000..4a47fb4
> --- /dev/null
> +++ b/qemu-clock.c
> @@ -0,0 +1,47 @@
> +/*
> + * QEMU Clock
> + *
> + *  Copyright (C) 2016 : GreenSocs Ltd
> + *  http://www.greensocs.com/ , email: i...@greensocs.com
> + *
> + *  Frederic Konrad 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see .
> + *
> + */
> +
> +#include "qemu/qemu-clock.h"
> +#include "hw/hw.h"

I'm pretty sure every file should start with osdep now.

> +
> +/* #define DEBUG_QEMU_CLOCK */

This shouldn't be here.

Thanks,

Alistair

> +
> +#ifdef DEBUG_QEMU_CLOCK
> +#define DPRINTF(fmt, ...) \
> +do { printf("qemu-clock: " fmt , ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) do { } while (0)
> +#endif
> +
> +static const TypeInfo qemu_clk_info = {
> +.name  = TYPE_CLOCK,
> +.parent= TYPE_OBJECT,
> +.instance_size = sizeof(struct qemu_clk),
> +};
> +
> +static void qemu_clk_register_types(void)
> +{
> +type_register_static(&qemu_clk_info);
> +}
> +
> +type_init(qemu_clk_register_types);
> --
> 2.5.5
>
>

Re: [Qemu-devel] [RFC PATCH 02/11] qemu-clk: allow to attach a clock to a device

2016-06-28 Thread Alistair Francis

On Mon, Jun 13, 2016 at 9:27 AM,   wrote:
> From: KONRAD Frederic 
>
> This allows a clock to be attached to a DeviceState.
> Unlike GPIOs, the clock pins are not contained in the DeviceState but are
> added as child properties, so they can appear in the QOM tree.
>
> Signed-off-by: KONRAD Frederic 
> ---
>  include/qemu/qemu-clock.h | 24 +++-
>  qemu-clock.c  | 22 ++
>  2 files changed, 45 insertions(+), 1 deletion(-)
>
> diff --git a/include/qemu/qemu-clock.h b/include/qemu/qemu-clock.h
> index e7acd68..a2ba105 100644
> --- a/include/qemu/qemu-clock.h
> +++ b/include/qemu/qemu-clock.h
> @@ -33,8 +33,30 @@
>  typedef struct qemu_clk {
>  /*< private >*/
>  Object parent_obj;
> +char *name;/* name of this clock in the device. */
>  } *qemu_clk;
>
> -#endif /* QEMU_CLOCK_H */
> +/**
> + * qemu_clk_attach_to_device:
> + * @d: the device to which the clock needs to be attached.
> + * @clk: the clock which needs to be attached.
> + * @name: the name of the clock; must not be NULL.
> + *
> + * Attach @clk named @name to the device @d.
> + *
> + */
> +void qemu_clk_attach_to_device(DeviceState *d, qemu_clk clk,

dev instead of just d

> +   const char *name);
>
> +/**
> + * qemu_clk_get_pin:
> + * @d: the device which contains the clock.
> + * @name: the name of the clock.
> + *
> + * Get the clock named @name located in the device @d, or NULL if not found.
> + *
> + * Returns the clock named @name contained in @d.
> + */
> +qemu_clk qemu_clk_get_pin(DeviceState *d, const char *name);
>
> +#endif /* QEMU_CLOCK_H */
> diff --git a/qemu-clock.c b/qemu-clock.c
> index 4a47fb4..81f2852 100644
> --- a/qemu-clock.c
> +++ b/qemu-clock.c
> @@ -23,6 +23,7 @@
>
>  #include "qemu/qemu-clock.h"
>  #include "hw/hw.h"
> +#include "qapi/error.h"
>
>  /* #define DEBUG_QEMU_CLOCK */
>
> @@ -33,6 +34,27 @@ do { printf("qemu-clock: " fmt , ## __VA_ARGS__); } while (0)
>  #define DPRINTF(fmt, ...) do { } while (0)
>  #endif
>
> +void qemu_clk_attach_to_device(DeviceState *d, qemu_clk clk, const char *name)
> +{
> +assert(name);
> +assert(!clk->name);
> +object_property_add_child(OBJECT(d), name, OBJECT(clk), &error_abort);
> +clk->name = g_strdup(name);
> +}
> +
> +qemu_clk qemu_clk_get_pin(DeviceState *d, const char *name)
> +{
> +gchar *path = NULL;
> +Object *clk;
> +bool ambiguous;
> +
> +path = g_strdup_printf("%s/%s", object_get_canonical_path(OBJECT(d)),
> +   name);
> +clk = object_resolve_path(path, &ambiguous);

Should ambiguous be passed back to the caller?

> +g_free(path);
> +return QEMU_CLOCK(clk);

Shouldn't you check to see if you got something valid before casting?

Thanks,

Alistair

> +}
> +
>  static const TypeInfo qemu_clk_info = {
>  .name  = TYPE_CLOCK,
>  .parent= TYPE_OBJECT,
> --
> 2.5.5
>
>
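
The two review questions above (passing the ambiguity back to the caller, and validating the result before casting) can be illustrated with a small standalone model. This is only a sketch under stated assumptions: Obj, ClkObj, child_resolve and clk_get_pin are hypothetical names invented for illustration, and the string type tag stands in for QOM's real type checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature object model; not the QOM API. */
typedef struct Obj {
    const char *type;   /* type name, e.g. "clk" */
    const char *name;   /* name under the parent device */
} Obj;

typedef struct ClkObj {
    Obj parent;         /* must be first so the downcast is valid */
    double rate;
} ClkObj;

/* Find a child by name; report through *ambiguous if several match. */
static Obj *child_resolve(Obj **children, size_t n, const char *name,
                          bool *ambiguous)
{
    Obj *found = NULL;

    assert(name && ambiguous);
    *ambiguous = false;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(children[i]->name, name) == 0) {
            if (found) {
                *ambiguous = true;
            } else {
                found = children[i];
            }
        }
    }
    return found;
}

/* Checked lookup: NULL when missing, ambiguous, or not a clock. */
static ClkObj *clk_get_pin(Obj **children, size_t n, const char *name,
                           bool *ambiguous)
{
    Obj *obj = child_resolve(children, n, name, ambiguous);

    if (!obj || *ambiguous || strcmp(obj->type, "clk") != 0) {
        return NULL;
    }
    return (ClkObj *)obj;
}
```

The caller receives NULL for a missing or wrongly typed child and can inspect the ambiguity flag itself, instead of the lookup asserting inside a cast macro.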

Re: [Qemu-devel] [RFC PATCH 1/1] OpenBIOS: Switch over to official OpenBIOS git repo

2016-06-28 Thread G 3



On Jun 28, 2016, at 7:44 PM, qemu-devel-requ...@nongnu.org wrote:


On 28/06/16 14:44, Stefan Hajnoczi wrote:


On Tue, Jun 28, 2016 at 7:11 AM, Jeff Cody  wrote:

On Mon, Jun 27, 2016 at 07:48:23AM +0100, Mark Cave-Ayland wrote:

On 21/06/16 14:48, Mark Cave-Ayland wrote:


On 21/06/16 11:28, Stefan Hajnoczi wrote:


On Tue, Jun 21, 2016 at 01:40:42AM -0400, Jeff Cody wrote:
This update should preserve git history, and allow seamless switching
over to the official openbios git repo, rather than pulling from the
svn mirror.  All prior history from the svn repository should still be
preserved (i.e., commit hashes are the same for historical commits).

In the roms/openbios submodule, the branch "origin/official" is the
latest mirror of the official git repository (fetched daily).

Signed-off-by: Jeff Cody 
---
 roms/openbios | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


Assuming the git.qemu-project.org openbios.git remote and .git/config
mirror setting has been updated to use the new upstream git repo:

Reviewed-by: Stefan Hajnoczi 


Is it possible to switch this around, so that there is a legacy branch
which points to the current HEAD and master points to the new, upstream
git HEAD? Then it means if someone clones either the
git.qemu-project.org repository or the official repository then the
default master branch will point to the same HEAD.


Urgent ping? It has been another week, we're coming up to soft freeze
and the PPC guys are urgently after an OpenBIOS fix.

As per the above I'd really like the branches switched around so that
both the git.qemu-project.org master and github.com master are exactly
the same HEAD although I believe it may be technically possible to do
this part separately once the HEAD switch is in? If so, please can we
apply this and then I can line up and attempt to push the outstanding
patches to the new github master later this evening.



If we want something other than this patch, so that the openbios git repo
hosted on qemu.org has 'master' as the new github tracking, we might be able
to do that with a git-merge.  Here are the three methods I am thinking of:



A) For 'master' referencing new github hashes:
git fetch github
git merge --no-edit github/master
git push /pub/git/openbios.git master:master


B) Old, prior behavior for SVN:
git svn fetch svn
git merge git-svn
git push /pub/git/openbios.git master:master


C) Current behavior, as of the submitted patch above, this is what is being run:

git svn fetch svn
git merge git-svn
git fetch github
git push /pub/git/openbios.git master:master
git push /pub/git/openbios.git official:official
(This seemed safest to run, as old behavior remains unchanged)

If we do A), we'll have merge commits with just the auto-generated merge
message, and I'm not sure this is what you want.  Thoughts?


No, I think A is not appropriate because the mirror must have the
exact same commit IDs as github.  Only fast-forward merges are
allowed, so I would use --ff-only instead.  The first time you begin
using the github repo you'll need git reset --hard github/master to
move from the old svn commit history to the new github history.

It's important to keep the svn commits so old versions of QEMU still
work.  You can ensure that the garbage collector does not delete the
commits by tagging the latest svn head.


Yes, this is exactly what I'm thinking. Given that the repository is
already merged, is it not just as simple as:

git checkout master -b legacy
git checkout master
git reset --hard 36785d7

And then change the nightly script to "git pull origin/master" with the
origin remote set to the github.com repository. I'm also fine with
asking existing developers to switch over to the new master once we're done.



ATB,

Mark.


Now that OpenBIOS will be using git, would you be willing to accept a
patch that prints the git commit used to make the openbios binary
into the banner word? If not in the banner word, maybe into another
word like openbios_version?

[Qemu-devel] [PATCH] configure: mark qemu-ga VSS includes as system headers

2016-06-28 Thread Michael Roth

As of e4650c81, we do w32 builds with -Werror enabled. Unfortunately
for cases where we enable VSS support in qemu-ga, we still have
warnings generated by VSS includes that ship as part of the Microsoft
VSS SDK.

We can selectively address a number of these warnings using

  #pragma GCC diagnostic ignored ...

but at least one of these:

  warning: ‘typedef’ was ignored in this declaration

resulting from declarations of the form:

  typedef struct Blah { ... };

does not provide a specific command-line/pragma option to disable
warnings of the sort.

To allow VSS builds to succeed, the next-best option is disabling
these warnings on a per-file basis. pragmas like #pragma GCC
system_header can be used to declare subsequent includes/declarations
as being exempt from normal warnings, but this must be done within
a header file.

Since we don't control the VSS SDK, we'd need to rely on an
intermediate header include to accomplish this, and
since different objects in the VSS link target rely on different
headers from the VSS SDK, this would become somewhat of a rat's nest
(though not totally unmanageable).

The next step up in granularity is just marking the entire VSS
SDK include path as system headers via -isystem. This is a bit more
heavy-handed, but since this SDK hasn't changed since 2005, there's
likely little to be gained from selectively disabling warnings
anyway, so we implement that approach here.

This fixes the -Werror failures in both the configure test and the
qga build due to shared reliance on $vss_win32_include. For the
same reason, this also enforces a new dependency on -isystem support
in the C/C++ compiler when building QGA with VSS enabled.

Cc: Thomas Huth 
Cc: Stefan Weil 
Cc: Paolo Bonzini 
Signed-off-by: Michael Roth 
---
 configure | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index e14e907..2d84bc5 100755
--- a/configure
+++ b/configure
@@ -4049,13 +4049,13 @@ fi
 
 if test "$mingw32" = "yes" -a "$guest_agent" != "no" -a "$vss_win32_sdk" != "no" ; then
   case "$vss_win32_sdk" in
-"")   vss_win32_include="-I$source_path" ;;
+"")   vss_win32_include="-isystem $source_path" ;;
 *\ *) # The SDK is installed in "Program Files" by default, but we cannot
   # handle path with spaces. So we symlink the headers into ".sdk/vss".
-  vss_win32_include="-I$source_path/.sdk/vss"
+  vss_win32_include="-isystem $source_path/.sdk/vss"
  symlink "$vss_win32_sdk/inc" "$source_path/.sdk/vss/inc"
  ;;
-*)vss_win32_include="-I$vss_win32_sdk"
+*)vss_win32_include="-isystem $vss_win32_sdk"
   esac
   cat > $TMPC << EOF
 #define __MIDL_user_allocate_free_DEFINED__
-- 
1.9.1

Re: [Qemu-devel] [PATCH v4 00/24] target-sparc improvements

2016-06-28 Thread Mark Cave-Ayland

On 28/06/16 01:38, Richard Henderson wrote:

> The primary focus of this patch set is to reduce the number of
> helpers that modify TCG globals, and thus increase the lifetime
> of those globals within each TB, and thus decrease the number
> of times that tcg must spill and fill them from backing store.
> 
> As a byproduct, I also implement the bulk of the interesting v9 ASIs
> inline, thus exposing e.g. the little-endian loads and stores as
> simple tcg operations.
> 
> The patch set is relative to my outstanding tcg pull request.
> For reference, the complete tree can be found at
> 
>   git://github.com/rth7680/qemu.git tgt-sparc-2
> 
> Changes from v3 to v4:
>   * Re-do the UA2005 commit change, which apparently got lost in v3.
>   * Rebased on aa8151b7df, which contains fix for the ldstub issue.
> 
> Changes from v2 to v3:
>   * Add ASI_BLK_COMMIT_[PS] to patch 19.
> This fixes the illegal instruction that Artyom reported.
>   * Add gen_address_mask calls to all direct accesses.
> This fixes a follow-on segv that affected the debian install.
> 
> Changes from v1 to v2:
>   * Commit message refers to UA2005 instead of UA2011 when
> introducing new asi.h defines. (Artyom)
>   * Drop MMU_REAL_IDX, and inline handling of ASI_REAL_*.
> This appears to be the source of the regression that Artyom
> identified wrt ss5 emulation.
> 
> 
> r~
> 
> 
> Richard Henderson (24):
>   target-sparc: Mark more flags for helpers
>   target-sparc: Remove softint as a TCG global
>   target-sparc: Store mmu index in TB flags
>   target-sparc: Create gen_exception
>   target-sparc: Unify asi handling between 32 and 64-bit
>   target-sparc: Store %asi in TB flags
>   target-sparc: Introduce get_asi
>   target-sparc: Pass TCGMemOp to gen_ld/st_asi
>   target-sparc: Import linux/arch/sparc/include/uapi/asm/asi.h
>   target-sparc: Add UA2005 defines to asi.h
>   target-sparc: Use defines from asi.h
>   target-sparc: Directly implement easy ld/st asis
>   target-sparc: Use QT0 to return results from ldda
>   target-sparc: Introduce gen_check_align
>   target-sparc: Directly implement easy ldd/std asis
>   target-sparc: Fix obvious error in ASI_M_BFILL
>   target-sparc: Pass TCGMemOp constants to helper_ld/st_asi
>   target-sparc: Directly implement easy ldf/stf asis
>   target-sparc: Directly implement block and short ldf/stf asis
>   target-sparc: Remove helper_ldf_asi, helper_stf_asi
>   target-sparc: Use explicit writes to cpu_fsr
>   target-sparc: Use cpu_fsr in stfsr
>   target-sparc: Use cpu_loop_exit_restore from
> helper_check_ieee_exceptions
>   target-sparc: Elide duplicate updates to fprs
> 
>  target-sparc/asi.h |  311 +++
>  target-sparc/cpu.h |   28 +-
>  target-sparc/fop_helper.c  |  230 +++-
>  target-sparc/helper.h  |  168 +++---
>  target-sparc/ldst_helper.c |  696 +++-
>  target-sparc/translate.c   | 1273 
> 
>  6 files changed, 1607 insertions(+), 1099 deletions(-)
>  create mode 100644 target-sparc/asi.h

Hi Richard,

I didn't see the branch rebase onto aa8151b7df here, although I was able
to manually rebase the tgt-sparc-2 branch onto git master and build
without issues.

With that, I ran through all my OpenBIOS boot tests for SPARC32/SPARC64
and all my images booted fine without any regressions.

Tested-by: Mark Cave-Ayland 


ATB,

Mark.

Re: [Qemu-devel] [PATCH v2] slirp: Add support for stateless DHCPv6

2016-06-28 Thread Samuel Thibault

Hello,

Thomas Huth, on Tue 28 Jun 2016 12:48:31 +0200, wrote:
> Provide basic support for stateless DHCPv6 (see RFC 3736) so
> that guests can also automatically boot via IPv6 with SLIRP
> (for IPv6 network booting, see RFC 5970 for details).
> 
> Tested with:
> 
> qemu-system-ppc64 -nographic -vga none -boot n -net nic \
> -net user,ipv6=yes,ipv4=no,tftp=/path/to/tftp,bootfile=ppc64.img
> 
> Signed-off-by: Thomas Huth 

Pushed to my tree, thanks!

Samuel
> ---
>  v2:
>  - Addressed review comments from Samuel for v1
>  - Moved the ALLDHCP_MULTICAST definition to dhcpv6.h instead of ip6.h
>  - Fixed a bug in the OPTION_ORO parsing (the index was not calculated
>right)
> 
>  slirp/Makefile.objs |   2 +-
>  slirp/dhcpv6.c  | 209 
> 
>  slirp/dhcpv6.h  |  22 ++
>  slirp/udp6.c|  13 +++-
>  4 files changed, 244 insertions(+), 2 deletions(-)
>  create mode 100644 slirp/dhcpv6.c
>  create mode 100644 slirp/dhcpv6.h
> 
> diff --git a/slirp/Makefile.objs b/slirp/Makefile.objs
> index 6748e4f..1baa1f1 100644
> --- a/slirp/Makefile.objs
> +++ b/slirp/Makefile.objs
> @@ -1,5 +1,5 @@
>  common-obj-y = cksum.o if.o ip_icmp.o ip6_icmp.o ip6_input.o ip6_output.o \
> -   ip_input.o ip_output.o dnssearch.o
> +   ip_input.o ip_output.o dnssearch.o dhcpv6.o
>  common-obj-y += slirp.o mbuf.o misc.o sbuf.o socket.o tcp_input.o 
> tcp_output.o
>  common-obj-y += tcp_subr.o tcp_timer.o udp.o udp6.o bootp.o tftp.o 
> arp_table.o \
>  ndp_table.o
> diff --git a/slirp/dhcpv6.c b/slirp/dhcpv6.c
> new file mode 100644
> index 000..02c51c7
> --- /dev/null
> +++ b/slirp/dhcpv6.c
> @@ -0,0 +1,209 @@
> +/*
> + * SLIRP stateless DHCPv6
> + *
> + * We only support stateless DHCPv6, e.g. for network booting.
> + * See RFC 3315, RFC 3736, RFC 3646 and RFC 5970 for details.
> + *
> + * Copyright 2016 Thomas Huth, Red Hat Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License,
> + * or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see .
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "slirp.h"
> +#include "dhcpv6.h"
> +
> +/* DHCPv6 message types */
> +#define MSGTYPE_REPLY7
> +#define MSGTYPE_INFO_REQUEST 11
> +
> +/* DHCPv6 option types */
> +#define OPTION_CLIENTID  1
> +#define OPTION_IAADDR5
> +#define OPTION_ORO   6
> +#define OPTION_DNS_SERVERS   23
> +#define OPTION_BOOTFILE_URL  59
> +
> +struct requested_infos {
> +uint8_t *client_id;
> +int client_id_len;
> +bool want_dns;
> +bool want_boot_url;
> +};
> +
> +/**
> + * Analyze the info request message sent by the client to see what data it
> + * provided and what it wants to have. The information is gathered in the
> + * "requested_infos" struct. Note that client_id (if provided) points into
> + * the odata region, thus the caller must keep odata valid as long as it
> + * needs to access the requested_infos struct.
> + */
> +static int dhcpv6_parse_info_request(uint8_t *odata, int olen,
> + struct requested_infos *ri)
> +{
> +int i, req_opt;
> +
> +while (olen > 4) {
> +/* Parse one option */
> +int option = odata[0] << 8 | odata[1];
> +int len = odata[2] << 8 | odata[3];
> +
> +if (len + 4 > olen) {
> +qemu_log_mask(LOG_GUEST_ERROR, "Guest sent bad DHCPv6 
> packet!\n");
> +return -E2BIG;
> +}
> +
> +switch (option) {
> +case OPTION_IAADDR:
> +/* According to RFC3315, we must discard requests with IA option 
> */
> +return -EINVAL;
> +case OPTION_CLIENTID:
> +if (len > 256) {
> +/* Avoid very long IDs which could cause problems later */
> +return -E2BIG;
> +}
> +ri->client_id = odata + 4;
> +ri->client_id_len = len;
> +break;
> +case OPTION_ORO:/* Option request option */
> +if (len & 1) {
> +return -EINVAL;
> +}
> +/* Check which options the client wants to have */
> +for (i = 0; i < len; i += 2) {
> +req_opt = odata[4 + i] << 8 | odata[4 + i + 1];
> +switch (req_opt) {
> +case OPTION_DNS_SERVERS:
> +
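
The quoted patch is cut off at this point. As a standalone illustration of the option walk it performs (16-bit option code, 16-bit length, then the payload, with the Option Request Option carrying a list of 16-bit codes), here is a minimal sketch. It is not the patch's code: struct info_request, be16() and parse_options() are hypothetical names, only the option numbers are taken from the defines above, and unlike the patch this version also accepts a trailing zero-length option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Option codes as defined in the quoted patch. */
#define OPT_CLIENTID      1
#define OPT_ORO           6
#define OPT_DNS_SERVERS   23
#define OPT_BOOTFILE_URL  59

/* Hypothetical result struct, modelled on the patch's requested_infos. */
struct info_request {
    const uint8_t *client_id;   /* points into the packet buffer */
    size_t client_id_len;
    bool want_dns;
    bool want_boot_url;
};

/* Read a big-endian 16-bit value. */
static unsigned be16(const uint8_t *p)
{
    return (unsigned)p[0] << 8 | p[1];
}

/* Walk the option list. Returns 0 on success, -1 on a malformed packet. */
static int parse_options(const uint8_t *data, size_t len,
                         struct info_request *req)
{
    assert(data && req);
    while (len >= 4) {
        unsigned code = be16(data);
        size_t optlen = be16(data + 2);
        const uint8_t *body = data + 4;

        if (optlen > len - 4) {
            return -1;          /* option claims more bytes than remain */
        }
        switch (code) {
        case OPT_CLIENTID:
            req->client_id = body;
            req->client_id_len = optlen;
            break;
        case OPT_ORO:
            if (optlen & 1) {
                return -1;      /* must be a list of 16-bit codes */
            }
            for (size_t i = 0; i < optlen; i += 2) {
                switch (be16(body + i)) {
                case OPT_DNS_SERVERS:
                    req->want_dns = true;
                    break;
                case OPT_BOOTFILE_URL:
                    req->want_boot_url = true;
                    break;
                default:
                    break;      /* ignore requests we cannot answer */
                }
            }
            break;
        default:
            break;              /* skip unknown options */
        }
        data += 4 + optlen;
        len -= 4 + optlen;
    }
    return 0;
}
```

The length check is written as optlen > len - 4 (after confirming len >= 4) so the comparison cannot overflow, and the client ID is kept as a pointer into the packet, so the buffer must outlive the result struct.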

[Qemu-devel] [Bug 1588328] Re: Qemu 2.6 Solaris 9 Sparc Segmentation Fault

2016-06-28 Thread Mark Cave-Ayland

I ran all the way through the installer in order to test the patch, so
it should be working for you. Is your Spark9.disk labelled? See
http://virtuallyfun.superglobalmegacorp.com/2010/10/03/formatting-disks-for-solaris/
for more information on how to do this.

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1588328

Title:
  Qemu 2.6 Solaris 9 Sparc Segmentation Fault

Status in QEMU:
  New

Bug description:
  Hi,
  I tried the following command to boot Solaris 9 sparc:
  qemu-system-sparc -nographic -boot d -hda ./Spark9.disk -m 256 -cdrom 
sol-9-905hw-ga-sparc-dvd.iso -serial telnet:0.0.0.0:3000,server 

  It seems there are a few Segmentation Faults, one from the starting of
  the boot. Another at the beginning of the commandline installation.

  Trying 127.0.0.1...
  Connected to localhost.
  Escape character is '^]'.
  Configuration device id QEMU version 1 machine id 32
  Probing SBus slot 0 offset 0
  Probing SBus slot 1 offset 0
  Probing SBus slot 2 offset 0
  Probing SBus slot 3 offset 0
  Probing SBus slot 4 offset 0
  Probing SBus slot 5 offset 0
  Invalid FCode start byte
  CPUs: 1 x FMI,MB86904
  UUID: ----
  Welcome to OpenBIOS v1.1 built on Apr 18 2016 08:19
Type 'help' for detailed information
  Trying cdrom:d...
  Not a bootable ELF image
  Loading a.out image...
  Loaded 7680 bytes
  entry point is 0x4000
  bootpath: 
/iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@2,0:d

  Jumping to entry point 4000 for type 0005...
  switching to new context:
  SunOS Release 5.9 Version Generic_118558-34 32-bit
  Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
  Use is subject to license terms.
  WARNING: 
/iommu@0,1000/sbus@0,10001000/espdma@5,840/esp@5,880/sd@0,0 (sd0):
Corrupt label; wrong magic number

  Segmentation Fault
  Configuring /dev and /devices
  NOTICE: Couldn't set value (../../sun/io/audio/sada/drv/audiocs/audio_4231.c, 
Line #1759 0x00 0x88)
  audio may not work correctly until it is stopped and restarted
  Segmentation Fault
  Using RPC Bootparams for network configuration information.
  Skipping interface le0
  Searching for configuration file(s)...
  Search complete.

  

  What type of terminal are you using?
   1) ANSI Standard CRT
   2) DEC VT52
   3) DEC VT100
   4) Heathkit 19
   5) Lear Siegler ADM31
   6) PC Console
   7) Sun Command Tool
   8) Sun Workstation
   9) Televideo 910
   10) Televideo 925
   11) Wyse Model 50
   12) X Terminal Emulator (xterms)
   13) CDE Terminal Emulator (dtterm)
   14) Other
  Type the number of your choice and press Return: 3
  syslog service starting.
  savecore: no dump device configured
  Running in command line mode
  /sbin/disk0_install[109]: 143 Segmentation Fault
  /sbin/run_install[130]: 155 Segmentation Fault

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1588328/+subscriptions

[Qemu-devel] [PULL 8/8] trace: [*-user] Add events to trace guest syscalls in syscall emulation mode

2016-06-28 Thread Stefan Hajnoczi

From: Lluís Vilanova 

Adds two events to trace syscalls in syscall emulation mode (*-user):

* guest_user_syscall: Emitted before the syscall is emulated; contains
  the syscall number and arguments.

* guest_user_syscall_ret: Emitted after the syscall is emulated;
  contains the syscall number and return value.

Signed-off-by: Lluís Vilanova 
Message-id: 146651712411.12388.10024905980452504938.st...@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi 
---
 bsd-user/syscall.c   |  9 +
 linux-user/syscall.c |  2 ++
 trace-events | 16 
 3 files changed, 27 insertions(+)

diff --git a/bsd-user/syscall.c b/bsd-user/syscall.c
index a9fe869..66492aa 100644
--- a/bsd-user/syscall.c
+++ b/bsd-user/syscall.c
@@ -315,12 +315,14 @@ abi_long do_freebsd_syscall(void *cpu_env, int num, 
abi_long arg1,
 abi_long arg5, abi_long arg6, abi_long arg7,
 abi_long arg8)
 {
+CPUState *cpu = ENV_GET_CPU(cpu_env);
 abi_long ret;
 void *p;
 
 #ifdef DEBUG
 gemu_log("freebsd syscall %d\n", num);
 #endif
+trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 
arg7, arg8);
 if(do_strace)
 print_freebsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
 
@@ -400,6 +402,7 @@ abi_long do_freebsd_syscall(void *cpu_env, int num, 
abi_long arg1,
 #endif
 if (do_strace)
 print_freebsd_syscall_ret(num, ret);
+trace_guest_user_syscall_ret(cpu, num, ret);
 return ret;
  efault:
 ret = -TARGET_EFAULT;
@@ -410,12 +413,14 @@ abi_long do_netbsd_syscall(void *cpu_env, int num, 
abi_long arg1,
abi_long arg2, abi_long arg3, abi_long arg4,
abi_long arg5, abi_long arg6)
 {
+CPUState *cpu = ENV_GET_CPU(cpu_env);
 abi_long ret;
 void *p;
 
 #ifdef DEBUG
 gemu_log("netbsd syscall %d\n", num);
 #endif
+trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 0, 
0);
 if(do_strace)
 print_netbsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
 
@@ -472,6 +477,7 @@ abi_long do_netbsd_syscall(void *cpu_env, int num, abi_long 
arg1,
 #endif
 if (do_strace)
 print_netbsd_syscall_ret(num, ret);
+trace_guest_user_syscall_ret(cpu, num, ret);
 return ret;
  efault:
 ret = -TARGET_EFAULT;
@@ -482,12 +488,14 @@ abi_long do_openbsd_syscall(void *cpu_env, int num, 
abi_long arg1,
 abi_long arg2, abi_long arg3, abi_long arg4,
 abi_long arg5, abi_long arg6)
 {
+CPUState *cpu = ENV_GET_CPU(cpu_env);
 abi_long ret;
 void *p;
 
 #ifdef DEBUG
 gemu_log("openbsd syscall %d\n", num);
 #endif
+trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 0, 
0);
 if(do_strace)
 print_openbsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
 
@@ -544,6 +552,7 @@ abi_long do_openbsd_syscall(void *cpu_env, int num, 
abi_long arg1,
 #endif
 if (do_strace)
 print_openbsd_syscall_ret(num, ret);
+trace_guest_user_syscall_ret(cpu, num, ret);
 return ret;
  efault:
 ret = -TARGET_EFAULT;
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1c17b74..e59f16d 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6690,6 +6690,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #ifdef DEBUG
 gemu_log("syscall %d", num);
 #endif
+trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 
arg7, arg8);
 if(do_strace)
 print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
 
@@ -11182,6 +11183,7 @@ fail:
 #endif
 if(do_strace)
 print_syscall_ret(num, ret);
+trace_guest_user_syscall_ret(cpu, num, ret);
 return ret;
 efault:
 ret = -TARGET_EFAULT;
diff --git a/trace-events b/trace-events
index 9d76de8..4767059 100644
--- a/trace-events
+++ b/trace-events
@@ -156,3 +156,19 @@ memory_region_tb_write(int cpu_index, uint64_t addr, 
uint64_t value, unsigned si
 #
 # Targets: TCG(all)
 disable vcpu tcg guest_mem_before(TCGv vaddr, uint8_t info) "info=%d", 
"vaddr=0x%016"PRIx64" info=%d"
+
+# @num: System call number.
+# @arg*: System call argument value.
+#
+# Start executing a guest system call in syscall emulation mode.
+#
+# Targets: TCG(all)
+disable vcpu guest_user_syscall(uint64_t num, uint64_t arg1, uint64_t arg2, 
uint64_t arg3, uint64_t arg4, uint64_t arg5, uint64_t arg6, uint64_t arg7, 
uint64_t arg8) "num=0x%016"PRIx64" arg1=0x%016"PRIx64" arg2=0x%016"PRIx64" 
arg3=0x%016"PRIx64" arg4=0x%016"PRIx64" arg5=0x%016"PRIx64" arg6=0x%016"PRIx64" 
arg7=0x%016"PRIx64" arg8=0x%016"PRIx64
+
+# @num: System call number.
+# @ret: System call result value.
+#
+# Finish executing a guest system call in syscall emulation mode.
+#
+# Targets: TCG(all)
+disable vcpu guest_user_syscall_ret(uint64_t num, uint64_t ret) 
"num=0x%016"PRIx64" ret=0x%016"PRIx64
--
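
The pattern this patch applies to each do_*_syscall() function, one event before emulation carrying the number and arguments and one after it carrying the number and result, can be sketched in isolation. Everything below is a hypothetical model: trace_syscall(), trace_syscall_ret(), emulate() and the counters are invented for illustration, whereas the real trace_guest_user_syscall*() functions are generated from the trace-events file.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical trace sink that just counts events and keeps the last one. */
struct trace_counts {
    unsigned enter;
    unsigned ret;
    uint64_t last_num;
    int64_t last_ret;
};
static struct trace_counts counts;

static void trace_syscall(uint64_t num, uint64_t arg1)
{
    (void)arg1;
    counts.enter++;
    counts.last_num = num;
}

static void trace_syscall_ret(uint64_t num, int64_t ret)
{
    (void)num;
    counts.ret++;
    counts.last_ret = ret;
}

/* Toy emulation: syscall 1 doubles its argument, anything else fails. */
static int64_t emulate(uint64_t num, uint64_t arg1)
{
    return num == 1 ? (int64_t)(arg1 * 2) : -1;
}

static int64_t do_syscall(uint64_t num, uint64_t arg1)
{
    int64_t ret;

    trace_syscall(num, arg1);       /* before emulation: number and arguments */
    ret = emulate(num, arg1);
    trace_syscall_ret(num, ret);    /* after emulation: number and result */
    return ret;
}
```

Keeping a single exit point after emulate() means failed calls are traced as well, so every entry event in this model is paired with a return event.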

[Qemu-devel] [PULL 4/8] trace: enable tracing in qemu-io

2016-06-28 Thread Stefan Hajnoczi

From: "Denis V. Lunev" 

Moving trace_init_backends() into trace_opt_parse() is not possible, because
trace_init_backends() must be called after daemonize() in vl.c.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 1466174654-30130-5-git-send-email-...@openvz.org
CC: Paolo Bonzini 
CC: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
 qemu-io.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index d977a6e..db129ea 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -18,6 +18,7 @@
 #include "qemu/option.h"
 #include "qemu/config-file.h"
 #include "qemu/readline.h"
+#include "qemu/log.h"
 #include "qapi/qmp/qstring.h"
 #include "qom/object_interfaces.h"
 #include "sysemu/block-backend.h"
@@ -253,7 +254,9 @@ static void usage(const char *name)
 "  -k, --native-aio use kernel AIO implementation (on Linux only)\n"
 "  -t, --cache=MODE use the given cache mode for the image\n"
 "  -d, --discard=MODE   use the given discard mode for the image\n"
-"  -T, --trace FILE enable trace events listed in the given file\n"
+"  -T, --trace [[enable=]][,events=][,file=]\n"
+"   specify tracing options\n"
+"   see qemu-img(1) man page for full description\n"
 "  -h, --help   display this help and exit\n"
 "  -V, --versionoutput version information and exit\n"
 "\n"
@@ -458,6 +461,7 @@ int main(int argc, char **argv)
 Error *local_error = NULL;
 QDict *opts = NULL;
 const char *format = NULL;
+char *trace_file = NULL;
 
 #ifdef CONFIG_POSIX
 signal(SIGPIPE, SIG_IGN);
@@ -470,6 +474,7 @@ int main(int argc, char **argv)
 
 module_call_init(MODULE_INIT_QOM);
 qemu_add_opts(_object_opts);
+qemu_add_opts(_trace_opts);
 bdrv_init();
 
 while ((c = getopt_long(argc, argv, sopt, lopt, _index)) != -1) {
@@ -509,9 +514,8 @@ int main(int argc, char **argv)
 }
 break;
 case 'T':
-if (!trace_init_backends()) {
-exit(1); /* error message will have been printed */
-}
+g_free(trace_file);
+trace_file = trace_opt_parse(optarg);
 break;
 case 'V':
 printf("%s version %s\n", progname, QEMU_VERSION);
@@ -557,6 +561,12 @@ int main(int argc, char **argv)
 exit(1);
 }
 
+if (!trace_init_backends()) {
+exit(1);
+}
+trace_init_file(trace_file);
+qemu_set_log(LOG_TRACE);
+
 /* initialize commands */
 qemuio_add_command(_cmd);
 qemuio_add_command(_cmd);
-- 
2.7.4

[Qemu-devel] [PULL 5/8] trace: enable tracing in qemu-nbd

2016-06-28 Thread Stefan Hajnoczi

From: "Denis V. Lunev" 

Please note that trace_init_backends() must be called in the final process,
i.e. after daemonization. This is necessary to keep the tracing thread in
the proper process.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 1466174654-30130-6-git-send-email-...@openvz.org
CC: Paolo Bonzini 
CC: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
 Makefile  |  2 +-
 qemu-nbd.c| 19 ++-
 qemu-nbd.texi |  3 +++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index b72093f..2d31af0 100644
--- a/Makefile
+++ b/Makefile
@@ -578,7 +578,7 @@ fsdev/virtfs-proxy-helper.1: fsdev/virtfs-proxy-helper.texi
  $(POD2MAN) --section=1 --center=" " --release=" " 
fsdev/virtfs-proxy-helper.pod > $@, \
  "  GEN   $@")
 
-qemu-nbd.8: qemu-nbd.texi
+qemu-nbd.8: qemu-nbd.texi qemu-option-trace.texi
$(call quiet-command, \
  perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-nbd.pod && \
  $(POD2MAN) --section=8 --center=" " --release=" " qemu-nbd.pod > $@, \
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 9519db3..321f02b 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -27,12 +27,14 @@
 #include "qemu/error-report.h"
 #include "qemu/config-file.h"
 #include "qemu/bswap.h"
+#include "qemu/log.h"
 #include "block/snapshot.h"
 #include "qapi/util.h"
 #include "qapi/qmp/qstring.h"
 #include "qom/object_interfaces.h"
 #include "io/channel-socket.h"
 #include "crypto/init.h"
+#include "trace/control.h"
 
 #include 
 #include 
@@ -88,6 +90,8 @@ static void usage(const char *name)
 "General purpose options:\n"
 "  --object type,id=ID,...   define an object such as 'secret' for providing\n"
 "passwords and/or encryption keys\n"
+"  -T, --trace [[enable=]<pattern>][,events=<file>][,file=<file>]\n"
+"specify tracing options\n"
 #ifdef __linux__
 "Kernel NBD client support:\n"
 "  -c, --connect=DEV connect FILE to the local NBD device DEV\n"
@@ -470,7 +474,7 @@ int main(int argc, char **argv)
 off_t fd_size;
 QemuOpts *sn_opts = NULL;
 const char *sn_id_or_name = NULL;
-const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:";
+const char *sopt = "hVb:o:p:rsnP:c:dvk:e:f:tl:x:T:";
 struct option lopt[] = {
 { "help", no_argument, NULL, 'h' },
 { "version", no_argument, NULL, 'V' },
@@ -498,6 +502,7 @@ int main(int argc, char **argv)
 { "export-name", required_argument, NULL, 'x' },
 { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS },
 { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS },
+{ "trace", required_argument, NULL, 'T' },
 { NULL, 0, NULL, 0 }
 };
 int ch;
@@ -518,6 +523,7 @@ int main(int argc, char **argv)
 const char *tlscredsid = NULL;
 bool imageOpts = false;
 bool writethrough = true;
+char *trace_file = NULL;
 
 /* The client thread uses SIGTERM to interrupt the server.  A signal
  * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
@@ -531,6 +537,7 @@ int main(int argc, char **argv)
 
 module_call_init(MODULE_INIT_QOM);
 qemu_add_opts(_object_opts);
+qemu_add_opts(_trace_opts);
 qemu_init_exec_dir(argv[0]);
 
 while ((ch = getopt_long(argc, argv, sopt, lopt, _ind)) != -1) {
@@ -703,6 +710,10 @@ int main(int argc, char **argv)
 case QEMU_NBD_OPT_IMAGE_OPTS:
 imageOpts = true;
 break;
+case 'T':
+g_free(trace_file);
+trace_file = trace_opt_parse(optarg);
+break;
 }
 }
 
@@ -718,6 +729,12 @@ int main(int argc, char **argv)
 exit(EXIT_FAILURE);
 }
 
+if (!trace_init_backends()) {
+exit(1);
+}
+trace_init_file(trace_file);
+qemu_set_log(LOG_TRACE);
+
 if (tlscredsid) {
 if (sockpath) {
 error_report("TLS is only supported with IPv4/IPv6");
diff --git a/qemu-nbd.texi b/qemu-nbd.texi
index 9f23343..91ebf04 100644
--- a/qemu-nbd.texi
+++ b/qemu-nbd.texi
@@ -92,6 +92,9 @@ Display extra debugging information
 Display this help and exit
 @item -V, --version
 Display version information and exit
+@item -T, --trace 
[[enable=]@var{pattern}][,events=@var{file}][,file=@var{file}]
+@findex --trace
+@include qemu-option-trace.texi
 @end table
 
 @c man end
-- 
2.7.4
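
A note on the daemonization remark in the qemu-nbd patch above: POSIX fork()
duplicates only the calling thread, so a helper thread started before the
fork does not exist in the child. The following is a minimal standalone
sketch of that effect, not QEMU code. The names helper_thread, ticks and
helper_runs_in_forked_child are invented for this illustration, and the
sketch assumes a POSIX system with C11 atomics and pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical state for this sketch; nothing here is a QEMU symbol. */
static atomic_ulong ticks;
static atomic_bool stop_helper;

/* Stand-in for a background thread such as a trace writer. */
static void *helper_thread(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_helper)) {
        atomic_fetch_add(&ticks, 1);
    }
    return NULL;
}

static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/*
 * Start the helper thread, then fork.  The child samples the counter
 * twice; if the helper thread had been carried over, the counter would
 * keep advancing.  Returns true only if the child saw progress.
 */
static bool helper_runs_in_forked_child(void)
{
    pthread_t tid;
    pid_t pid;
    int status = 0;
    int rc;

    atomic_store(&stop_helper, false);
    rc = pthread_create(&tid, NULL, helper_thread, NULL);
    assert(rc == 0);

    pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        /* Child: only the forking thread exists here. */
        unsigned long before = atomic_load(&ticks);
        sleep_ms(50);
        _exit(atomic_load(&ticks) != before ? 1 : 0);
    }

    waitpid(pid, &status, 0);
    atomic_store(&stop_helper, true);
    pthread_join(tid, NULL);
    return WIFEXITED(status) && WEXITSTATUS(status) == 1;
}
```

Under these assumptions helper_runs_in_forked_child() should return false,
which is consistent with the commit message: a backend that owns a thread
has to be initialized in the process that keeps running after the fork.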

[Qemu-devel] [PULL 3/8] trace: move qemu_trace_opts to trace/control.c

2016-06-28 Thread Stefan Hajnoczi

From: "Denis V. Lunev" 

The patch also creates the trace_opt_parse() helper in trace/control.c so
that this code can be reused in the next patches for qemu-nbd and qemu-io.

The patch also makes trace_init_events() static, as this function is no
longer used outside the module.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 1466174654-30130-4-git-send-email-...@openvz.org
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
 trace/control.c | 42 +-
 trace/control.h | 25 ++---
 vl.c| 38 ++
 3 files changed, 57 insertions(+), 48 deletions(-)

diff --git a/trace/control.c b/trace/control.c
index e1556a3..86de8b9 100644
--- a/trace/control.c
+++ b/trace/control.c
@@ -21,11 +21,33 @@
 #endif
 #include "qapi/error.h"
 #include "qemu/error-report.h"
+#include "qemu/config-file.h"
 #include "monitor/monitor.h"
 
 int trace_events_enabled_count;
 bool trace_events_dstate[TRACE_EVENT_COUNT];
 
+QemuOptsList qemu_trace_opts = {
+.name = "trace",
+.implied_opt_name = "enable",
+.head = QTAILQ_HEAD_INITIALIZER(qemu_trace_opts.head),
+.desc = {
+{
+.name = "enable",
+.type = QEMU_OPT_STRING,
+},
+{
+.name = "events",
+.type = QEMU_OPT_STRING,
+},{
+.name = "file",
+.type = QEMU_OPT_STRING,
+},
+{ /* end of list */ }
+},
+};
+
+
 TraceEvent *trace_event_name(const char *name)
 {
 assert(name != NULL);
@@ -142,7 +164,7 @@ void trace_enable_events(const char *line_buf)
 }
 }
 
-void trace_init_events(const char *fname)
+static void trace_init_events(const char *fname)
 {
 Location loc;
 FILE *fp;
@@ -217,3 +239,21 @@ bool trace_init_backends(void)
 
 return true;
 }
+
+char *trace_opt_parse(const char *optarg)
+{
+char *trace_file;
+QemuOpts *opts = qemu_opts_parse_noisily(qemu_find_opts("trace"),
+ optarg, true);
+if (!opts) {
+exit(1);
+}
+if (qemu_opt_get(opts, "enable")) {
+trace_enable_events(qemu_opt_get(opts, "enable"));
+}
+trace_init_events(qemu_opt_get(opts, "events"));
+trace_file = g_strdup(qemu_opt_get(opts, "file"));
+qemu_opts_del(opts);
+
+return trace_file;
+}
diff --git a/trace/control.h b/trace/control.h
index e2ba6d4..a2dd3ea 100644
--- a/trace/control.h
+++ b/trace/control.h
@@ -160,17 +160,6 @@ static void trace_event_set_state_dynamic(TraceEvent *ev, 
bool state);
 bool trace_init_backends(void);
 
 /**
- * trace_init_events:
- * @events: Name of file with events to be enabled at startup; may be NULL.
- *  Corresponds to commandline option "-trace events=...".
- *
- * Read the list of enabled tracing events.
- *
- * Returns: Whether the backends could be successfully initialized.
- */
-void trace_init_events(const char *file);
-
-/**
  * trace_init_file:
  * @file:   Name of trace output file; may be NULL.
  *  Corresponds to commandline option "-trace file=...".
@@ -197,6 +186,20 @@ void trace_list_events(void);
  */
 void trace_enable_events(const char *line_buf);
 
+/**
+ * Definition of QEMU options describing trace subsystem configuration
+ */
+extern QemuOptsList qemu_trace_opts;
+
+/**
+ * trace_opt_parse:
+ * @optarg: A string argument of --trace command line argument
+ *
+ * Initialize tracing subsystem.
+ *
+ * Returns the filename to save trace to.  It must be freed with g_free().
+ */
+char *trace_opt_parse(const char *optarg);
 
 #include "trace/control-internal.h"
 
diff --git a/vl.c b/vl.c
index 4c1f9ae..90cf638 100644
--- a/vl.c
+++ b/vl.c
@@ -262,26 +262,6 @@ static QemuOptsList qemu_sandbox_opts = {
 },
 };
 
-static QemuOptsList qemu_trace_opts = {
-.name = "trace",
-.implied_opt_name = "enable",
-.head = QTAILQ_HEAD_INITIALIZER(qemu_trace_opts.head),
-.desc = {
-{
-.name = "enable",
-.type = QEMU_OPT_STRING,
-},
-{
-.name = "events",
-.type = QEMU_OPT_STRING,
-},{
-.name = "file",
-.type = QEMU_OPT_STRING,
-},
-{ /* end of list */ }
-},
-};
-
 static QemuOptsList qemu_option_rom_opts = {
 .name = "option-rom",
 .implied_opt_name = "romfile",
@@ -3864,23 +3844,9 @@ int main(int argc, char **argv, char **envp)
 xen_mode = XEN_ATTACH;
 break;
 case QEMU_OPTION_trace:
-{
-opts = qemu_opts_parse_noisily(qemu_find_opts("trace"),
-   optarg, true);
-if (!opts) {
-exit(1);
-}
-
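
The option string handled by trace_opt_parse() has the form
[[enable=]pattern][,events=file][,file=file], where a leading value without
a key is taken as "enable" because of .implied_opt_name. A rough standalone
sketch of that convention follows. It is not how QemuOpts is implemented:
parse_trace_opts and struct trace_opts are hypothetical names, and the
sketch ignores details such as escaped commas.

```c
#include <assert.h>
#include <string.h>

#define TRACE_OPT_MAX 128

/* Hypothetical container for the three sub-options (illustration only). */
struct trace_opts {
    char enable[TRACE_OPT_MAX];
    char events[TRACE_OPT_MAX];
    char file[TRACE_OPT_MAX];
};

/*
 * Split "a=b,c=d" style input on commas.  The first token may omit its
 * key, in which case it is stored as "enable".  Returns 0 on success and
 * -1 on an unknown key, a keyless later token, or an oversized value.
 */
static int parse_trace_opts(const char *arg, struct trace_opts *out)
{
    char buf[3 * TRACE_OPT_MAX];
    char *p, *next;
    int first = 1;

    assert(arg != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    if (strlen(arg) >= sizeof(buf)) {
        return -1;
    }
    strcpy(buf, arg);

    for (p = buf; p != NULL; p = next) {
        char *comma = strchr(p, ',');
        char *eq;
        const char *key, *val;
        char *dst;

        next = comma ? comma + 1 : NULL;
        if (comma) {
            *comma = '\0';
        }
        if (*p == '\0') {           /* empty token, e.g. ",file=x" */
            first = 0;
            continue;
        }

        eq = strchr(p, '=');
        if (eq) {
            *eq = '\0';
            key = p;
            val = eq + 1;
        } else if (first) {
            key = "enable";         /* implied option name */
            val = p;
        } else {
            return -1;
        }
        first = 0;

        if (strcmp(key, "enable") == 0) {
            dst = out->enable;
        } else if (strcmp(key, "events") == 0) {
            dst = out->events;
        } else if (strcmp(key, "file") == 0) {
            dst = out->file;
        } else {
            return -1;
        }
        if (strlen(val) >= TRACE_OPT_MAX) {
            return -1;
        }
        strcpy(dst, val);
    }
    return 0;
}
```

For example, parsing "qcow2*,file=trace.log" with this sketch fills enable
with "qcow2*" and file with "trace.log", and leaves events empty.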

[Qemu-devel] [PULL 6/8] qemu-img: move common options parsing before commands processing

2016-06-28 Thread Stefan Hajnoczi

From: "Denis V. Lunev" 

This is necessary to enable the creation of common qemu-img options, which
will be specified before the command.

The patch also enables '-V' as an alias for '--version' (exactly as in the
other block utilities) and documents this change.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 1466174654-30130-7-git-send-email-...@openvz.org
CC: Paolo Bonzini 
CC: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
 qemu-img.c| 41 +++--
 qemu-img.texi | 10 +-
 2 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 14e2661..2194c2d 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -91,9 +91,12 @@ static void QEMU_NORETURN help(void)
 {
 const char *help_msg =
QEMU_IMG_VERSION
-   "usage: qemu-img command [command options]\n"
+   "usage: qemu-img [standard options] command [command options]\n"
"QEMU disk image utility\n"
"\n"
+   "'-h', '--help'   display this help and exit\n"
+   "'-V', '--version'output version information and exit\n"
+   "\n"
"Command syntax:\n"
 #define DEF(option, callback, arg_string)\
"  " arg_string "\n"
@@ -3806,7 +3809,7 @@ int main(int argc, char **argv)
 int c;
 static const struct option long_options[] = {
 {"help", no_argument, 0, 'h'},
-{"version", no_argument, 0, 'v'},
+{"version", no_argument, 0, 'V'},
 {0, 0, 0, 0}
 };
 
@@ -3829,28 +3832,38 @@ int main(int argc, char **argv)
 if (argc < 2) {
 error_exit("Not enough arguments");
 }
-cmdname = argv[1];
 
 qemu_add_opts(_object_opts);
 qemu_add_opts(_source_opts);
 
+while ((c = getopt_long(argc, argv, "+hV", long_options, NULL)) != -1) {
+switch (c) {
+case 'h':
+help();
+return 0;
+case 'V':
+printf(QEMU_IMG_VERSION);
+return 0;
+}
+}
+
+cmdname = argv[optind];
+
+/* reset getopt_long scanning */
+argc -= optind;
+if (argc < 1) {
+return 0;
+}
+argv += optind;
+optind = 1;
+
 /* find the command */
 for (cmd = img_cmds; cmd->name != NULL; cmd++) {
 if (!strcmp(cmdname, cmd->name)) {
-return cmd->handler(argc - 1, argv + 1);
+return cmd->handler(argc, argv);
 }
 }
 
-c = getopt_long(argc, argv, "h", long_options, NULL);
-
-if (c == 'h') {
-help();
-}
-if (c == 'v') {
-printf(QEMU_IMG_VERSION);
-return 0;
-}
-
 /* not found */
 error_exit("Command not found: %s", cmdname);
 }
diff --git a/qemu-img.texi b/qemu-img.texi
index cbe50e9..f1b874d 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -1,6 +1,6 @@
 @example
 @c man begin SYNOPSIS
-@command{qemu-img} @var{command} [@var{command} @var{options}]
+@command{qemu-img} [@var{standard} @var{options}] @var{command} [@var{command} 
@var{options}]
 @c man end
 @end example
 
@@ -16,6 +16,14 @@ inconsistent state.
 
 @c man begin OPTIONS
 
+Standard options:
+@table @option
+@item -h, --help
+Display this help and exit
+@item -V, --version
+Display version information and exit
+@end table
+
 The following commands are supported:
 
 @include qemu-img-cmds.texi
-- 
2.7.4
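
The patch above relies on a leading '+' in the getopt_long() option string,
a GNU extension that stops scanning at the first non-option argument. That
lets global options be parsed first while the subcommand and its own
options are left untouched. A minimal sketch under that assumption follows;
struct global_opts and parse_global_opts are hypothetical names and are not
part of qemu-img.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct global_opts {
    int show_help;
    int show_version;
    int cmd_index;      /* index of the subcommand in argv, or -1 */
};

static struct global_opts parse_global_opts(int argc, char **argv)
{
    static const struct option long_options[] = {
        {"help", no_argument, NULL, 'h'},
        {"version", no_argument, NULL, 'V'},
        {NULL, 0, NULL, 0}
    };
    struct global_opts res = {0, 0, -1};
    int c;

    assert(argc >= 1 && argv != NULL);
    optind = 1;         /* start scanning right after argv[0] */

    /* '+' makes getopt_long() stop at the first non-option argument. */
    while ((c = getopt_long(argc, argv, "+hV", long_options, NULL)) != -1) {
        switch (c) {
        case 'h':
            res.show_help = 1;
            break;
        case 'V':
            res.show_version = 1;
            break;
        default:
            break;      /* unknown option: ignored in this sketch */
        }
    }

    if (optind < argc) {
        res.cmd_index = optind;     /* argv[optind] is the subcommand */
    }
    return res;
}
```

With the arguments "--version create -h", the sketch reports the version
flag and a subcommand index pointing at "create"; the trailing "-h" is left
for the subcommand handler, which can rescan after adjusting argc, argv and
optind as the patch does.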

[Qemu-devel] [PULL 7/8] trace: enable tracing in qemu-img

2016-06-28 Thread Stefan Hajnoczi

From: "Denis V. Lunev" 

The command will work this way:
qemu-img --trace "qcow2*" create -f qcow2 1.img 64G

[Quote "qcow2*" to protect against shell globbing as suggested by Eric
Blake .
--Stefan]

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 1466174654-30130-8-git-send-email-...@openvz.org
Suggested-by: Daniel P. Berrange 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
CC: Paolo Bonzini 
CC: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
 Makefile  |  2 +-
 qemu-img.c| 19 ++-
 qemu-img.texi |  3 +++
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 2d31af0..9d12dc6 100644
--- a/Makefile
+++ b/Makefile
@@ -566,7 +566,7 @@ qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi 
qemu-monitor-info.texi
  "  GEN   $@")
 qemu.1: qemu-option-trace.texi
 
-qemu-img.1: qemu-img.texi qemu-img-cmds.texi
+qemu-img.1: qemu-img.texi qemu-option-trace.texi qemu-img-cmds.texi
$(call quiet-command, \
  perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-img.pod && \
  $(POD2MAN) --section=1 --center=" " --release=" " qemu-img.pod > $@, \
diff --git a/qemu-img.c b/qemu-img.c
index 2194c2d..3322a1e 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -32,6 +32,7 @@
 #include "qemu/config-file.h"
 #include "qemu/option.h"
 #include "qemu/error-report.h"
+#include "qemu/log.h"
 #include "qom/object_interfaces.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/block-backend.h"
@@ -39,6 +40,7 @@
 #include "block/blockjob.h"
 #include "block/qapi.h"
 #include "crypto/init.h"
+#include "trace/control.h"
 #include 
 
 #define QEMU_IMG_VERSION "qemu-img version " QEMU_VERSION QEMU_PKGVERSION \
@@ -96,6 +98,8 @@ static void QEMU_NORETURN help(void)
"\n"
"'-h', '--help'   display this help and exit\n"
"'-V', '--version'output version information and exit\n"
+   "'-T', '--trace'  
[[enable=]<pattern>][,events=<file>][,file=<file>]\n"
+   " specify tracing options\n"
"\n"
"Command syntax:\n"
 #define DEF(option, callback, arg_string)\
@@ -3806,10 +3810,12 @@ int main(int argc, char **argv)
 const img_cmd_t *cmd;
 const char *cmdname;
 Error *local_error = NULL;
+char *trace_file = NULL;
 int c;
 static const struct option long_options[] = {
 {"help", no_argument, 0, 'h'},
 {"version", no_argument, 0, 'V'},
+{"trace", required_argument, NULL, 'T'},
 {0, 0, 0, 0}
 };
 
@@ -3835,8 +3841,9 @@ int main(int argc, char **argv)
 
 qemu_add_opts(_object_opts);
 qemu_add_opts(_source_opts);
+qemu_add_opts(_trace_opts);
 
-while ((c = getopt_long(argc, argv, "+hV", long_options, NULL)) != -1) {
+while ((c = getopt_long(argc, argv, "+hVT:", long_options, NULL)) != -1) {
 switch (c) {
 case 'h':
 help();
@@ -3844,6 +3851,10 @@ int main(int argc, char **argv)
 case 'V':
 printf(QEMU_IMG_VERSION);
 return 0;
+case 'T':
+g_free(trace_file);
+trace_file = trace_opt_parse(optarg);
+break;
 }
 }
 
@@ -3857,6 +3868,12 @@ int main(int argc, char **argv)
 argv += optind;
 optind = 1;
 
+if (!trace_init_backends()) {
+exit(1);
+}
+trace_init_file(trace_file);
+qemu_set_log(LOG_TRACE);
+
 /* find the command */
 for (cmd = img_cmds; cmd->name != NULL; cmd++) {
 if (!strcmp(cmdname, cmd->name)) {
diff --git a/qemu-img.texi b/qemu-img.texi
index f1b874d..449a19c 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -22,6 +22,9 @@ Standard options:
 Display this help and exit
 @item -V, --version
 Display version information and exit
+@item -T, --trace 
[[enable=]@var{pattern}][,events=@var{file}][,file=@var{file}]
+@findex --trace
+@include qemu-option-trace.texi
 @end table
 
 The following commands are supported:
-- 
2.7.4

[Qemu-devel] [PULL 0/8] Tracing patches

2016-06-28 Thread Stefan Hajnoczi

The following changes since commit d7f30403576f04f1f3a5fb5a1d18cba8dfa7a6d2:

  cputlb: don't cpu_abort() if guest tries to execute outside RAM or RAM 
(2016-06-28 18:50:53 +0100)

are available in the git repository at:

  git://github.com/stefanha/qemu.git tags/tracing-pull-request

for you to fetch changes up to 9c15e70086f3343bd810c6150d92ebfd6f346fcf:

  trace: [*-user] Add events to trace guest syscalls in syscall emulation mode 
(2016-06-28 21:14:12 +0100)





Denis V. Lunev (7):
  doc: sync help description for --trace with man for qemu.1
  doc: move text describing --trace to specific .texi file
  trace: move qemu_trace_opts to trace/control.c
  trace: enable tracing in qemu-io
  trace: enable tracing in qemu-nbd
  qemu-img: move common options parsing before commands processing
  trace: enable tracing in qemu-img

Lluís Vilanova (1):
  trace: [*-user] Add events to trace guest syscalls in syscall
emulation mode

 Makefile   |  7 +++---
 bsd-user/syscall.c |  9 
 linux-user/syscall.c   |  2 ++
 qemu-img.c | 58 ++
 qemu-img.texi  | 13 ++-
 qemu-io.c  | 18 
 qemu-nbd.c | 19 -
 qemu-nbd.texi  |  3 +++
 qemu-option-trace.texi | 25 ++
 qemu-options.hx| 29 ++---
 trace-events   | 16 ++
 trace/control.c| 42 +++-
 trace/control.h| 25 --
 vl.c   | 38 ++---
 14 files changed, 206 insertions(+), 98 deletions(-)
 create mode 100644 qemu-option-trace.texi

-- 
2.7.4

[Qemu-devel] [PULL 2/8] doc: move text describing --trace to specific .texi file

2016-06-28 Thread Stefan Hajnoczi

From: "Denis V. Lunev" 

This text will be included in the qemu-nbd/qemu-img man pages in the next
patches.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 1466174654-30130-3-git-send-email-...@openvz.org
CC: Paolo Bonzini 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
 Makefile   |  3 ++-
 qemu-option-trace.texi | 25 +
 qemu-options.hx| 27 +--
 3 files changed, 28 insertions(+), 27 deletions(-)
 create mode 100644 qemu-option-trace.texi

diff --git a/Makefile b/Makefile
index 7087fc2..b72093f 100644
--- a/Makefile
+++ b/Makefile
@@ -564,6 +564,7 @@ qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi 
qemu-monitor-info.texi
  perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu.pod && \
  $(POD2MAN) --section=1 --center=" " --release=" " qemu.pod > $@, \
  "  GEN   $@")
+qemu.1: qemu-option-trace.texi
 
 qemu-img.1: qemu-img.texi qemu-img-cmds.texi
$(call quiet-command, \
@@ -595,7 +596,7 @@ info: qemu-doc.info qemu-tech.info
 pdf: qemu-doc.pdf qemu-tech.pdf
 
 qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
-   qemu-img.texi qemu-nbd.texi qemu-options.texi \
+   qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
qemu-monitor-info.texi
 
diff --git a/qemu-option-trace.texi b/qemu-option-trace.texi
new file mode 100644
index 000..693ab5a
--- /dev/null
+++ b/qemu-option-trace.texi
@@ -0,0 +1,25 @@
+Specify tracing options.
+
+@table @option
+@item [enable=]@var{pattern}
+Immediately enable events matching @var{pattern}.
+The file must contain one event name (as listed in the @file{trace-events-all}
+file) per line; globbing patterns are accepted too.  This option is only
+available if QEMU has been compiled with the @var{simple}, @var{stderr}
+or @var{ftrace} tracing backend.  To specify multiple events or patterns,
+specify the @option{-trace} option multiple times.
+
+Use @code{-trace help} to print a list of names of trace points.
+
+@item events=@var{file}
+Immediately enable events listed in @var{file}.
+The file must contain one event name (as listed in the @file{trace-events-all}
+file) per line; globbing patterns are accepted too.  This option is only
+available if QEMU has been compiled with the @var{simple}, @var{stderr} or
+@var{ftrace} tracing backend.
+
+@item file=@var{file}
+Log output traces to @var{file}.
+This option is only available if QEMU has been compiled with
+the @var{simple} tracing backend.
+@end table
diff --git a/qemu-options.hx b/qemu-options.hx
index ab42530..a95a936 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3671,32 +3671,7 @@ HXCOMM This line is not accurate, as some sub-options 
are backend-specific but
 HXCOMM HX does not support conditional compilation of text.
 @item -trace [[enable=]@var{pattern}][,events=@var{file}][,file=@var{file}]
 @findex -trace
-
-Specify tracing options.
-
-@table @option
-@item [enable=]@var{pattern}
-Immediately enable events matching @var{pattern}.
-The file must contain one event name (as listed in the @file{trace-events-all}
-file) per line; globbing patterns are accepted too.  This option is only
-available if QEMU has been compiled with the @var{simple}, @var{stderr}
-or @var{ftrace} tracing backend.  To specify multiple events or patterns,
-specify the @option{-trace} option multiple times.
-
-Use @code{-trace help} to print a list of names of trace points.
-
-@item events=@var{file}
-Immediately enable events listed in @var{file}.
-The file must contain one event name (as listed in the @file{trace-events-all}
-file) per line; globbing patterns are accepted too.  This option is only
-available if QEMU has been compiled with the @var{simple}, @var{stderr} or
-@var{ftrace} tracing backend.
-
-@item file=@var{file}
-Log output traces to @var{file}.
-This option is only available if QEMU has been compiled with
-the @var{simple} tracing backend.
-@end table
+@include qemu-option-trace.texi
 ETEXI
 
 HXCOMM Internal use
-- 
2.7.4

[Qemu-devel] [PULL 1/8] doc: sync help description for --trace with man for qemu.1

2016-06-28 Thread Stefan Hajnoczi

From: "Denis V. Lunev" 

[s/descriprion/description/ in commit message as suggested by Eric Blake
.
--Stefan]

Signed-off-by: Denis V. Lunev 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 1466174654-30130-2-git-send-email-...@openvz.org
CC: Paolo Bonzini 
CC: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
 qemu-options.hx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 44c658f..ab42530 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3669,7 +3669,7 @@ DEF("trace", HAS_ARG, QEMU_OPTION_trace,
 STEXI
 HXCOMM This line is not accurate, as some sub-options are backend-specific but
 HXCOMM HX does not support conditional compilation of text.
-@item -trace [events=@var{file}][,file=@var{file}]
+@item -trace [[enable=]@var{pattern}][,events=@var{file}][,file=@var{file}]
 @findex -trace
 
 Specify tracing options.
-- 
2.7.4

Re: [Qemu-devel] [PATCH 1/3] block: ignore flush requests when storage is clean

2016-06-28 Thread Paolo Bonzini



On 24/06/2016 17:06, Denis V. Lunev wrote:
> From: Evgeny Yakovlev 
> 
> Some guests (win2008 server for example) do a lot of unnecessary
> flushing when underlying media has not changed. This adds additional
> overhead on host when calling fsync/fdatasync.
> 
> This change introduces a dirty flag in BlockDriverState which is set
> in bdrv_set_dirty and is checked in bdrv_co_flush. This allows us to
> avoid unnecessary flushing when storage is clean.
> 
> The problem with excessive flushing was found by a performance test
> which does parallel directory tree creation (from 2 processes).
> Results improved from 0.424 loops/sec to 0.432 loops/sec.
> Each loop creates 10^3 directories with 10 files in each.
> 
> Signed-off-by: Evgeny Yakovlev 
> Signed-off-by: Denis V. Lunev 
> CC: Kevin Wolf 
> CC: Max Reitz 
> CC: Stefan Hajnoczi 
> CC: Fam Zheng 
> CC: John Snow 
> ---
>  block.c   |  1 +
>  block/dirty-bitmap.c  |  3 +++
>  block/io.c| 19 +++
>  include/block/block_int.h |  2 ++
>  4 files changed, 25 insertions(+)
> 
> diff --git a/block.c b/block.c
> index f4648e9..e36f148 100644
> --- a/block.c
> +++ b/block.c
> @@ -2582,6 +2582,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
>  ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
>  bdrv_dirty_bitmap_truncate(bs);
>  bdrv_parent_cb_resize(bs);
> +bs->dirty = true; /* file node sync is needed after truncate */
>  }
>  return ret;
>  }
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index 4902ca5..54e0413 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -370,6 +370,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t 
> cur_sector,
>  }
>  hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
>  }
> +
> +/* Set global block driver dirty flag even if bitmap is disabled */
> +bs->dirty = true;
>  }
>  
>  /**
> diff --git a/block/io.c b/block/io.c
> index 7cf3645..8078af2 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2239,6 +2239,25 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  goto flush_parent;
>  }
>  
> +/* Check if storage is actually dirty before flushing to disk */
> +if (!bs->dirty) {
> +/* Flush requests are appended to tracked request list in order so 
> that
> + * most recent request is at the head of the list. Following code 
> uses
> + * this ordering to wait for the most recent flush request to 
> complete
> + * to ensure that requests return in order */
> +BdrvTrackedRequest *prev_req;
> +QLIST_FOREACH(prev_req, >tracked_requests, list) {
> +if (prev_req ==  || prev_req->type != BDRV_TRACKED_FLUSH) {
> +continue;
> +}
> +
> +qemu_co_queue_wait(_req->wait_queue);
> +break;
> +}
> +goto flush_parent;

Can you just have a CoQueue specific to flushes, where a completing
flush does a restart_all on the CoQueue?

Flushes are never serialising, so there's no reason for them to be in
tracked_requests (I posted patches a while ago that instead use a simple
atomic counter, but they will only be in 2.8).

Paolo

> +}
> +bs->dirty = false;
> +
>  BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
>  if (bs->drv->bdrv_co_flush_to_disk) {
>  ret = bs->drv->bdrv_co_flush_to_disk(bs);
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 2057156..616058b 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -418,6 +418,8 @@ struct BlockDriverState {
>  int sg;/* if true, the device is a /dev/sg* */
>  int copy_on_read; /* if true, copy read backing sectors into image
>   note this is a reference count */
> +
> +bool dirty;
>  bool probed;
>  
>  BlockDriver *drv; /* NULL means no media */
>

Re: [Qemu-devel] [PATCH 2/3] ide: ignore retry_unit check for non-retry operations

2016-06-28 Thread Paolo Bonzini



On 24/06/2016 17:06, Denis V. Lunev wrote:
> When doing DMA request ide/core.c will set s->retry_unit to s->unit in
> ide_start_dma. When dma completes ide_set_inactive sets retry_unit to -1.
> After that ide_flush_cache runs and fails thanks to blkdebug.
> ide_flush_cb calls ide_handle_rw_error which asserts that s->retry_unit
> == s->unit. But s->retry_unit is still -1 after previous DMA completion
> and flush does not use anything related to retry.

Wouldn't the assertion fail for a PIO read/write too?  Perhaps
retry_unit should be set to s->unit in ide_transfer_start too.

Paolo

Re: [Qemu-devel] [RFC PATCH 1/1] OpenBIOS: Switch over to official OpenBIOS git repo

2016-06-28 Thread Mark Cave-Ayland

On 28/06/16 14:44, Stefan Hajnoczi wrote:

> On Tue, Jun 28, 2016 at 7:11 AM, Jeff Cody  wrote:
>> On Mon, Jun 27, 2016 at 07:48:23AM +0100, Mark Cave-Ayland wrote:
>>> On 21/06/16 14:48, Mark Cave-Ayland wrote:
>>>
>>>> On 21/06/16 11:28, Stefan Hajnoczi wrote:
>>>>
>>>>> On Tue, Jun 21, 2016 at 01:40:42AM -0400, Jeff Cody wrote:
>>>>>> This update should preserve git history, and allow seamless switching
>>>>>> over to the official openbios git repo, rather than pulling from the
>>>>>> svn mirror.  All prior history from the svn repository should still be
>>>>>> preserved (i.e., commit hashes are the same for historical commits).
>>>>>>
>>>>>> In the roms/openbios submodule, the branch "origin/official" is the
>>>>>> latest mirror of the official git repository (fetched daily).
>>>>>>
>>>>>> Signed-off-by: Jeff Cody 
>>>>>> ---
>>>>>>  roms/openbios | 2 +-
>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> Assuming the git.qemu-project.org openbios.git remote and .git/config
>>>>> mirror setting has been updated to use the new upstream git repo:
>>>>>
>>>>> Reviewed-by: Stefan Hajnoczi 
>>>>
>>>> Is it possible to switch this around, so that there is a legacy branch
>>>> which points to the current HEAD and master points to the new, upstream
>>>> git HEAD? Then it means if someone clones either the
>>>> git.qemu-project.org repository or the official repository then the
>>>> default master branch will point to the same HEAD.
>>>
>>> Urgent ping? It has been another week, we're coming up to soft freeze
>>> and the PPC guys are urgently after an OpenBIOS fix.
>>>
>>> As per the above I'd really like the branches switched around so that
>>> both the git.qemu-project.org master and github.com master are exactly
>>> the same HEAD although I believe it may be technically possible to do
>>> this part separately once the HEAD switch is in? If so, please can we
>>> apply this and then I can line up and attempt to push the outstanding
>>> patches to the new github master later this evening.
>>>
>>
>> If we want something other than this patch, so that the openbios git repo
>> hosted on qemu.org has 'master' as the new github tracking, we might be able
>> to do that with a git-merge.  Here are the three methods I am thinking of:
>>
>>
>> A) For 'master' referencing new github hashes:
>> git fetch github
>> git merge --no-edit github/master
> 
> 
>> git push /pub/git/openbios.git master:master
>>
>>
>> B) Old, prior behavior for SVN:
>> git svn fetch svn
>> git merge git-svn
>> git push /pub/git/openbios.git master:master
>>
>>
>> C) Current behavior, as of the submitted patch above, this is what is being 
>> run:
>> git svn fetch svn
>> git merge git-svn
>> git fetch github
>> git push /pub/git/openbios.git master:master
>> git push /pub/git/openbios.git official:official
>> (This seemed safest to run, as old behavior remains unchanged)
>>
>> If we do A), we'll have merge commits with just the auto-generated merge
>> message, and I'm not sure this is what you want.  Thoughts?
> 
> No, I think A is not appropriate because the mirror must have the
> exact same commit IDs as github.  Only fast-forward merges are
> allowed, so I would use --ff-only instead.  The first time you begin
> using the github repo you'll need git reset --hard github/master to
> move from the old svn commit history to the new github history.
> 
> It's important to keep the svn commits so old versions of QEMU still
> work.  You can ensure that the garbage collector does not delete the
> commits by tagging the latest svn head.

Yes, this is exactly what I'm thinking. Given that the repository is
already merged, is it not just as simple as:

git checkout master -b legacy
git checkout master
git reset --hard 36785d7

And then change the nightly script to "git pull origin/master" with the
origin remote set to the github.com repository. I'm also fine with
asking existing developers to switch over to the new master once we're done.


ATB,

Mark.

Re: [Qemu-devel] [PATCH] vfio/pci: Hide SR-IOV capability

2016-06-28 Thread Laszlo Ersek

On 06/21/16 00:04, Alex Williamson wrote:
> The kernel currently exposes the SR-IOV capability as read-only
> through vfio-pci.  This is sufficient to protect the host kernel, but
> has the potential to confuse guests without further virtualization.
> In particular, OVMF tries to size the VF BARs and comes up with absurd
> results, ending with an assert.  There's not much point in adding
> virtualization to a read-only capability, so we simply hide it for
> now.  If the kernel ever enables SR-IOV virtualization, we should
> easily be able to test it through VF BAR sizing or explicit flags.
> 
> Testing whether we should parse extended capabilities is also pulled
> into the function to keep these assumptions in one place.
> 
> Signed-off-by: Alex Williamson 
> ---
> 
> This depends on Chen Fan's patch "vfio: add pcie extended capability
> support", which I'll pull from Zhou Jie's latest series unless there
> are comments to the contrary.  Otherwise based on Stefan's tracing
> pull request so as not to conflict.
> 
>  hw/vfio/pci.c|   49 +++--
>  hw/vfio/trace-events |1 +
>  2 files changed, 40 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index a171056b..36d5e00 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1772,6 +1772,12 @@ static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
>  uint8_t cap_ver;
>  uint8_t *config;
>  
> +/* Only add extended caps if we have them and the guest can see them */
> +if (!pci_is_express(pdev) || !pci_bus_is_express(pdev->bus) ||
> +!pci_get_long(pdev->config + PCI_CONFIG_SPACE_SIZE)) {
> +return 0;
> +}
> +
>  /*
>   * pcie_add_capability always inserts the new capability at the tail
>   * of the chain.  Therefore to end up with a chain that matches the
> @@ -1780,6 +1786,25 @@ static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
>   */
>  config = g_memdup(pdev->config, vdev->config_size);
>  
> +/*
> + * Extended capabilities are chained with each pointing to the next, so we
> + * can drop anything other than the head of the chain simply by modifying
> + * the previous next pointer.  For the head of the chain, we can modify the
> + * capability ID to something that cannot match a valid capability.  ID
> + * 0 is reserved for this since absence of capabilities is indicated by
> + * 0 for the ID, version, AND next pointer.  However, pcie_add_capability()
> + * uses ID 0 as reserved for list management and will incorrectly match and
> + * assert if we attempt to pre-load the head of the chain with this
> + * ID.  Use ID 0x temporarily since it also seems to be reserved in
> + * part for identifying absence of capabilities in a root complex register
> + * block.  If the ID still exists after adding capabilities, switch back to
> + * zero.  We'll mark this entire first dword as emulated for this purpose.
> + */
> +pci_set_long(pdev->config + PCI_CONFIG_SPACE_SIZE,
> + PCI_EXT_CAP(0x, 0, 0));
> +pci_set_long(pdev->wmask + PCI_CONFIG_SPACE_SIZE, 0);
> +pci_set_long(vdev->emulated_config_bits + PCI_CONFIG_SPACE_SIZE, ~0);
> +
>  for (next = PCI_CONFIG_SPACE_SIZE; next;
>   next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
>  header = pci_get_long(config + next);
> @@ -1794,12 +1819,23 @@ static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
>   */
>  size = vfio_ext_cap_max_size(config, next);
>  
> -pcie_add_capability(pdev, cap_id, cap_ver, next, size);
> -pci_set_long(pdev->config + next, PCI_EXT_CAP(cap_id, cap_ver, 0));
> -
>  /* Use emulated next pointer to allow dropping extended caps */
>  pci_long_test_and_set_mask(vdev->emulated_config_bits + next,
> PCI_EXT_CAP_NEXT_MASK);
> +
> +switch (cap_id) {
> +case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuses OVMF */
> +trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
> +break;
> +default:
> +pcie_add_capability(pdev, cap_id, cap_ver, next, size);
> +}
> +
> +}
> +
> +/* Cleanup chain head ID if necessary */
> +if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0x) {
> +pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
>  }
>  
>  g_free(config);
> @@ -1821,13 +1857,6 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev)
>  return ret;
>  }
>  
> -/* on PCI bus, it doesn't make sense to expose extended capabilities. */
> -if (!pci_is_express(pdev) ||
> -!pci_bus_is_express(pdev->bus) ||
> -!pci_get_long(pdev->config + PCI_CONFIG_SPACE_SIZE)) {
> -return 0;
> -}
> -
>  return vfio_add_ext_cap(vdev);
>  }
>  
> diff --git

Re: [Qemu-devel] [PATCH 03/12] vfio: add pcie extended capability support

2016-06-28 Thread Laszlo Ersek

On 05/18/16 05:31, Zhou Jie wrote:
> From: Chen Fan 
> 
> For a vfio PCIe device, we can expose the extended capabilities on a
> PCIe bus. Because each new PCIe capability is added at the tail of the
> chain, we parse the extended caps from a copy of the config space, so that
> the original is not overwritten, and rebuild the PCIe extended config space.
> 
> Signed-off-by: Chen Fan 
> ---
>  hw/vfio/pci.c | 72 ++-
>  1 file changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 1ad47ef..f697853 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1528,6 +1528,21 @@ static uint8_t vfio_std_cap_max_size(PCIDevice *pdev, uint8_t pos)
>  return next - pos;
>  }
>  
> +
> +static uint16_t vfio_ext_cap_max_size(const uint8_t *config, uint16_t pos)
> +{
> +uint16_t tmp, next = PCIE_CONFIG_SPACE_SIZE;
> +
> +for (tmp = PCI_CONFIG_SPACE_SIZE; tmp;
> +tmp = PCI_EXT_CAP_NEXT(pci_get_long(config + tmp))) {
> +if (tmp > pos && tmp < next) {
> +next = tmp;
> +}
> +}
> +
> +return next - pos;
> +}
> +
>  static void vfio_set_word_bits(uint8_t *buf, uint16_t val, uint16_t mask)
>  {
>  pci_set_word(buf, (pci_get_word(buf) & ~mask) | val);
> @@ -1862,16 +1877,71 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos)
>  return 0;
>  }
>  
> +static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
> +{
> +PCIDevice *pdev = &vdev->pdev;
> +uint32_t header;
> +uint16_t cap_id, next, size;
> +uint8_t cap_ver;
> +uint8_t *config;
> +
> +/*
> + * pcie_add_capability always inserts the new capability at the tail
> + * of the chain.  Therefore to end up with a chain that matches the
> + * physical device, we cache the config space to avoid overwriting
> + * the original config space when we parse the extended capabilities.
> + */
> +config = g_memdup(pdev->config, vdev->config_size);
> +
> +for (next = PCI_CONFIG_SPACE_SIZE; next;
> + next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
> +header = pci_get_long(config + next);
> +cap_id = PCI_EXT_CAP_ID(header);
> +cap_ver = PCI_EXT_CAP_VER(header);
> +
> +/*
> + * If it becomes important to configure extended capabilities to their
> + * actual size, use this as the default when it's something we don't
> + * recognize. Since QEMU doesn't actually handle many of the config
> + * accesses, exact size doesn't seem worthwhile.
> + */
> +size = vfio_ext_cap_max_size(config, next);
> +
> +pcie_add_capability(pdev, cap_id, cap_ver, next, size);
> +pci_set_long(pdev->config + next, PCI_EXT_CAP(cap_id, cap_ver, 0));
> +
> +/* Use emulated next pointer to allow dropping extended caps */
> +pci_long_test_and_set_mask(vdev->emulated_config_bits + next,
> +   PCI_EXT_CAP_NEXT_MASK);
> +}
> +
> +g_free(config);
> +return 0;
> +}
> +
>  static int vfio_add_capabilities(VFIOPCIDevice *vdev)
>  {
>  PCIDevice *pdev = &vdev->pdev;
> +int ret;
>  
>  if (!(pdev->config[PCI_STATUS] & PCI_STATUS_CAP_LIST) ||
>  !pdev->config[PCI_CAPABILITY_LIST]) {
>  return 0; /* Nothing to add */
>  }
>  
> -return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
> +ret = vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
> +if (ret) {
> +return ret;
> +}
> +
> +/* on PCI bus, it doesn't make sense to expose extended capabilities. */
> +if (!pci_is_express(pdev) ||
> +!pci_bus_is_express(pdev->bus) ||
> +!pci_get_long(pdev->config + PCI_CONFIG_SPACE_SIZE)) {
> +return 0;
> +}
> +
> +return vfio_add_ext_cap(vdev);
>  }
>  
>  static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
> 

Tested-by: Laszlo Ersek 

(as a prerequisite for
)
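
The sizing logic in vfio_ext_cap_max_size() quoted above can be tried outside QEMU. The following is only a sketch under stated assumptions: the macro and helper names (CFG_SPACE_SIZE, EXT_CFG_SPACE_SIZE, EXT_CAP*, get_le32, put_le32, ext_cap_max_size) are stand-ins invented for illustration, not the QEMU definitions, and the header layout assumed is ID in bits 15:0, version in bits 19:16, next pointer in bits 31:20. Like the patch, it assumes the chain is well formed and terminates with a zero next pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the QEMU/Linux PCI macros. */
#define CFG_SPACE_SIZE      0x100   /* extended config space starts here */
#define EXT_CFG_SPACE_SIZE  0x1000  /* total size of PCIe config space */

/* Assumed header layout: ID 15:0, version 19:16, next pointer 31:20. */
#define EXT_CAP_ID(h)    ((uint16_t)((h) & 0xffff))
#define EXT_CAP_VER(h)   ((uint8_t)(((h) >> 16) & 0xf))
#define EXT_CAP_NEXT(h)  ((uint16_t)(((h) >> 20) & 0xffc))
#define EXT_CAP(id, ver, next) \
    ((uint32_t)(id) | ((uint32_t)(ver) << 16) | ((uint32_t)(next) << 20))

/* Config space is little-endian regardless of the host. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Upper bound on the size of the capability at 'pos': the distance to the
 * nearest capability that starts after it, or to the end of config space.
 * The chain need not be sorted by offset, so every entry is visited.
 */
static uint16_t ext_cap_max_size(const uint8_t *config, uint16_t pos)
{
    uint16_t tmp, next = EXT_CFG_SPACE_SIZE;

    for (tmp = CFG_SPACE_SIZE; tmp;
         tmp = EXT_CAP_NEXT(get_le32(config + tmp))) {
        if (tmp > pos && tmp < next) {
            next = tmp;
        }
    }

    return next - pos;
}
```

Because the walk happens on a private copy of config space, rewriting the live next pointers while capabilities are re-added does not disturb the iteration, which is the reason the patch calls g_memdup() first.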

Re: [Qemu-devel] [RFC v3 16/19] tcg: move locking for tb_invalidate_phys_page_range up

2016-06-28 Thread Sergey Fedorov

On 28/06/16 22:43, Sergey Fedorov wrote:
> On 03/06/16 23:40, Alex Bennée wrote:
>> While we previously assumed an existing memory lock protected the page
>> look up in the MTTCG SoftMMU case the memory lock is provided by the
>> tb_lock. As a result we push the taking of this lock up the call tree.
>> This requires a slightly different entry for the SoftMMU and user-mode
>> cases from tb_invalidate_phys_range.
> Sorry, I can't understand the description for the patch :( Some
> rewording might be helpful, if you don't mind.

Well, do I understand it right that we're gonna use tb_lock to protect
'l1_map' and PageDesc structures in softmmu mode?

Regards,
Sergey

[Qemu-devel] [Bug 1356969] Re: qemu-io: the 'map' command hangs on the fuzzed image

2016-06-28 Thread T. Huth

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1356969

Title:
  qemu-io: the 'map' command hangs on the fuzzed image

Status in QEMU:
  Fix Released

Bug description:
  Sequence:
   1. Unpack the attached archive, make a copy of test.img
   2. Put copy.img and backing_img.vdi in the same directory
   3. Execute

  qemu-io copy.img -c map

  Result: qemu-io processes part of the image and then hangs loading
  100% of CPU time.

  
  qemu.git HEAD 2d591ce2aeebf

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1356969/+subscriptions

[Qemu-devel] [Bug 1353456] Re: qemu-io: Failure on a qcow2 image with the fuzzed refcount table

2016-06-28 Thread T. Huth

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353456

Title:
  qemu-io: Failure on a qcow2 image with the fuzzed refcount table

Status in QEMU:
  Fix Released

Bug description:
  'qemu-io -c write' and 'qemu-io -c aio_write' crash on a qcow2 image
  with a fuzzed refcount table.

  Sequence:
   1. Unpack the attached archive, make a copy of test.img
   2. Put copy.img and backing_img.file in the same directory
   3. Execute
  qemu-io copy.img -c write 279552 322560
or
 qemu-io copy.img -c aio_write 836608 166400

  Result: qemu-io was killed by SIGIOT with the reason:

  qemu-io: block/qcow2-cluster.c:1291: qcow2_alloc_cluster_offset:
  Assertion `*host_offset != 0' failed.

  qemu.git HEAD 69f87f713069f1f

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353456/+subscriptions

[Qemu-devel] [Bug 1355697] Re: qemu-img: Segfault on a fuzzed image with large values of L1/L2 entries

2016-06-28 Thread T. Huth

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1355697

Title:
  qemu-img: Segfault on a fuzzed image with large values of L1/L2
  entries

Status in QEMU:
  Fix Released

Bug description:
  'qemu-img check -r all/leaks' failed with a segmentation fault on the
  fuzzed image with L1/L2 entry values having UINT64 border values.

  Sequence:
   1. Unpack the attached archive, make a copy of test.img
   2. Put copy.img and backing_img.raw in the same directory
   3. Execute
 
  qemu-img check -f qcow2 -r all copy.img

  Result: qemu-img was killed by SIGSEGV.

  The qemu-img execution log can be found in the attached archive.

  
  qemu.git HEAD 2d591ce2aeebf

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1355697/+subscriptions

[Qemu-devel] [Bug 1354529] Re: qemu-io: Assert failure on the fuzzed qcow2 image

2016-06-28 Thread T. Huth

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1354529

Title:
  qemu-io: Assert failure on the fuzzed qcow2 image

Status in QEMU:
  Fix Released

Bug description:
  'qemu-io -c write' failed on the fuzzed image with missed refcount
  tables:

  Sequence:
   1. Unpack the attached archive, make a copy of test.img
   2. Put copy.img and backing_img.cow in the same directory
   3. Execute
 qemu-io copy.img -c 'write 2856960 208896'

  Result: qemu-io was killed by SIGIOT with the reason:

  qemu-io: block/qcow2-cluster.c:910: handle_copied: Assertion `*host_offset == 0
  || offset_into_cluster(s, guest_offset) == offset_into_cluster(s, *host_offset)'
  failed.

  qemu.git HEAD 2d591ce2aeebf

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1354529/+subscriptions

[Qemu-devel] [PATCH v2] i2c: Fix SMBus read transactions to avoid double events

2016-06-28 Thread minyard

From: Corey Minyard 

Change 2293c27faddf (i2c: implement broadcast write) added broadcast
capability to the I2C bus, but it broke SMBus read transactions.
An SMBus read transaction makes two i2c_start_transfer() calls
without an intervening i2c_end_transfer() call.  This will
result in i2c_start_transfer() adding the same device to the
current_devs list twice, and then the ->event() for the same
device gets called twice in the second call to i2c_start_transfer(),
resulting in the smbus code getting confused.

Note that this happens even with pure I2C devices when simulating
SMBus over I2C.

This fix only scans the bus if the current set of devices is empty.
This means that the current set of devices stays fixed until
i2c_end_transfer() is called, which is really what you want.

This also deletes the empty check from the top of i2c_end_transfer().
It's unnecessary, and it prevents the broadcast variable from being
set to false at the end of the transaction if no devices were on
the bus.

Cc: KONRAD Frederic 
Cc: Alistair Francis 
Cc: Peter Crosthwaite 
Cc: Kwon 
Cc: Peter Maydell 
Signed-off-by: Corey Minyard 
---
 hw/i2c/core.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

This fix should work with I2C devices as well as SMBus devices.

Sorry for not thinking it through all the way before.

diff --git a/hw/i2c/core.c b/hw/i2c/core.c
index abb3efb..6313d31 100644
--- a/hw/i2c/core.c
+++ b/hw/i2c/core.c
@@ -101,15 +101,21 @@ int i2c_start_transfer(I2CBus *bus, uint8_t address, int recv)
 bus->broadcast = true;
 }
 
-QTAILQ_FOREACH(kid, &bus->qbus.children, sibling) {
-DeviceState *qdev = kid->child;
-I2CSlave *candidate = I2C_SLAVE(qdev);
-if ((candidate->address == address) || (bus->broadcast)) {
-node = g_malloc(sizeof(struct I2CNode));
-node->elt = candidate;
-QLIST_INSERT_HEAD(&bus->current_devs, node, next);
-if (!bus->broadcast) {
-break;
+/*
+ * If there are already devices in the list, that means we are in
+ * the middle of a transaction and we shouldn't rescan the bus.
+ */
+if (QLIST_EMPTY(&bus->current_devs)) {
+QTAILQ_FOREACH(kid, &bus->qbus.children, sibling) {
+DeviceState *qdev = kid->child;
+I2CSlave *candidate = I2C_SLAVE(qdev);
+if ((candidate->address == address) || (bus->broadcast)) {
+node = g_malloc(sizeof(struct I2CNode));
+node->elt = candidate;
+QLIST_INSERT_HEAD(&bus->current_devs, node, next);
+if (!bus->broadcast) {
+break;
+}
 }
 }
 }
@@ -134,10 +140,6 @@ void i2c_end_transfer(I2CBus *bus)
 I2CSlaveClass *sc;
 I2CNode *node, *next;
 
-if (QLIST_EMPTY(&bus->current_devs)) {
-return;
-}
-
 QLIST_FOREACH_SAFE(node, &bus->current_devs, next, next) {
 sc = I2C_SLAVE_GET_CLASS(node->elt);
 if (sc->event) {
-- 
2.7.4
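
The behaviour described in the commit message (scan the bus only when the current device set is empty, and keep that set until the transfer ends) can be modelled in a few lines. This is a toy sketch, not the QEMU code: struct toy_dev, struct toy_bus, toy_start_transfer() and toy_end_transfer() are hypothetical names, arrays replace the QLIST/QTAILQ lists, and broadcast is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVS 8

/* Hypothetical toy device: counts how many start events it received. */
struct toy_dev {
    uint8_t address;
    int start_events;
};

/* Hypothetical toy bus: all attached devices plus the set selected for
 * the transfer in progress (empty when no transfer is active). */
struct toy_bus {
    struct toy_dev *devs[MAX_DEVS];
    size_t ndevs;
    struct toy_dev *current[MAX_DEVS];
    size_t ncurrent;
};

/* Returns 0 if a device answered, 1 if nobody is at 'address'. */
static int toy_start_transfer(struct toy_bus *bus, uint8_t address)
{
    size_t i;

    /* A non-empty set means this is a repeated start inside an ongoing
     * transfer, so the bus is not rescanned and nothing is added twice. */
    if (bus->ncurrent == 0) {
        for (i = 0; i < bus->ndevs; i++) {
            if (bus->devs[i]->address == address) {
                bus->current[bus->ncurrent++] = bus->devs[i];
                break;
            }
        }
    }

    if (bus->ncurrent == 0) {
        return 1;
    }

    /* Each selected device sees exactly one event per start call. */
    for (i = 0; i < bus->ncurrent; i++) {
        bus->current[i]->start_events++;
    }
    return 0;
}

/* Ending the transfer clears the set, so the next start rescans. */
static void toy_end_transfer(struct toy_bus *bus)
{
    bus->ncurrent = 0;
}
```

With the unconditional scan that the patch removes, a second start without an intervening end would append the same device again and deliver the event to it twice on that call, which is the double-event symptom the patch title refers to.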

Re: [Qemu-devel] [RFC 00/30] cmpxchg-based emulation of atomics

2016-06-28 Thread Emilio G. Cota

On Tue, Jun 28, 2016 at 08:48:28 -0700, Richard Henderson wrote:
> On 06/28/2016 01:45 AM, Lluís Vilanova wrote:
> >Emilio G Cota writes:
> >[...]
> >>- What to do when atomic ops are used on something other than RAM?
> >>  Should we have a "slow path" that is not atomic for these cases, or
> >>  it's OK to assume code is bogus? For now, I just wrote XXX.
> >[...]
> >
> >You mean, for example, on I/O space? In these cases, it depends on the specific
> >device you're accessing and the interconnect used to access it.

Yes, exactly. Anything non-cacheable, really.

> >TL;DR: In some cases, it makes sense to support atomics outside RAM.
> >
> >For example, PCIe has support for expressing atomic operations in its messages
> >(I'm sure other interconnects do too). But in the end it depends on whether the
> >device supports them (I'm not sure if the device can reject atomics and produce
> >an error to whomever tried to do the atomic access, or if they are simply
> >ignored).

But these messages wouldn't be generated as a result of calling cmpxchg
on the memory-mapped I/O address, right?

> Indeed.  Thankfully, that's rare.  Many cpus explicitly say that the atomic
> ops can't be used on non-cachable memory, since they use the cache coherency
> protocol to implement the atomicity.
>
> That said, I can imagine that this probably works on x86, and supporting
> this is going to require a stop-the-world kind of emulation.

I'm assuming virtually all device drivers serialize accesses so that
"read-modify-write" cycles can be implemented as a read+write
on the bus. I have written quite a few drivers and it never occurred
to me to write cmpxchg or equivalent on an I/O address.

That said, for a non-RFC submission of this patchset, what should
we do? Just abort() for now, or do a non-atomic cycle? Stop-the-world
isn't available yet, and I wouldn't want to wait for it--this is not
a huge deal-breaker for most code out there, I think.

Thanks,

Emilio

[Qemu-devel] [PULL v2 23/24] linux-user: Provide safe_syscall for s390x

2016-06-28 Thread riku . voipio

From: Richard Henderson 

Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/host/s390x/hostdep.h  | 23 
 linux-user/host/s390x/safe-syscall.inc.S | 90 
 2 files changed, 113 insertions(+)
 create mode 100644 linux-user/host/s390x/safe-syscall.inc.S

diff --git a/linux-user/host/s390x/hostdep.h b/linux-user/host/s390x/hostdep.h
index 7609bf5..e95871c 100644
--- a/linux-user/host/s390x/hostdep.h
+++ b/linux-user/host/s390x/hostdep.h
@@ -12,4 +12,27 @@
 #ifndef QEMU_HOSTDEP_H
 #define QEMU_HOSTDEP_H
 
+/* We have a safe-syscall.inc.S */
+#define HAVE_SAFE_SYSCALL
+
+#ifndef __ASSEMBLER__
+
+/* These are defined by the safe-syscall.inc.S file */
+extern char safe_syscall_start[];
+extern char safe_syscall_end[];
+
+/* Adjust the signal context to rewind out of safe-syscall if we're in it */
+static inline void rewind_if_in_safe_syscall(void *puc)
+{
+struct ucontext *uc = puc;
+unsigned long *pcreg = &uc->uc_mcontext.psw.addr;
+
+if (*pcreg > (uintptr_t)safe_syscall_start
+&& *pcreg < (uintptr_t)safe_syscall_end) {
+*pcreg = (uintptr_t)safe_syscall_start;
+}
+}
+
+#endif /* __ASSEMBLER__ */
+
 #endif
diff --git a/linux-user/host/s390x/safe-syscall.inc.S b/linux-user/host/s390x/safe-syscall.inc.S
new file mode 100644
index 000..f1b446a
--- /dev/null
+++ b/linux-user/host/s390x/safe-syscall.inc.S
@@ -0,0 +1,90 @@
+/*
+ * safe-syscall.inc.S : host-specific assembly fragment
+ * to handle signals occurring at the same time as system calls.
+ * This is intended to be included by linux-user/safe-syscall.S
+ *
+ * Written by Richard Henderson 
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+   .global safe_syscall_base
+   .global safe_syscall_start
+   .global safe_syscall_end
+   .type   safe_syscall_base, @function
+
+   /* This is the entry point for making a system call. The calling
+* convention here is that of a C varargs function with the
+* first argument an 'int *' to the signal_pending flag, the
+* second one the system call number (as a 'long'), and all further
+* arguments being syscall arguments (also 'long').
+* We return a long which is the syscall's return value, which
+* may be negative-errno on failure. Conversion to the
+* -1-and-errno-set convention is done by the calling wrapper.
+*/
+safe_syscall_base:
+   .cfi_startproc
+   stmg%r6,%r15,48(%r15)   /* save all call-saved registers */
+   .cfi_offset %r15,-40
+   .cfi_offset %r14,-48
+   .cfi_offset %r13,-56
+   .cfi_offset %r12,-64
+   .cfi_offset %r11,-72
+   .cfi_offset %r10,-80
+   .cfi_offset %r9,-88
+   .cfi_offset %r8,-96
+   .cfi_offset %r7,-104
+   .cfi_offset %r6,-112
+   lgr %r1,%r15
+   lg  %r0,8(%r15) /* load eos */
+   aghi%r15,-160
+   .cfi_adjust_cfa_offset 160
+   stg %r1,0(%r15) /* store back chain */
+   stg %r0,8(%r15) /* store eos */
+
+   /* The syscall calling convention isn't the same as the
+* C one:
+* we enter with r2 == *signal_pending
+*   r3 == syscall number
+*   r4, r5, r6, (stack) == syscall arguments
+*   and return the result in r2
+* and the syscall instruction needs
+*   r1 == syscall number
+*   r2 ... r7 == syscall arguments
+*   and returns the result in r2
+* Shuffle everything around appropriately.
+*/
+   lgr %r8,%r2 /* signal_pending pointer */
+   lgr %r1,%r3 /* syscall number */
+   lgr %r2,%r4 /* syscall args */
+   lgr %r3,%r5
+   lgr %r4,%r6
+   lmg %r5,%r7,320(%r15)
+
+   /* This next sequence of code works in conjunction with the
+* rewind_if_safe_syscall_function(). If a signal is taken
+* and the interrupted PC is anywhere between 'safe_syscall_start'
+* and 'safe_syscall_end' then we rewind it to 'safe_syscall_start'.
+* The code sequence must therefore be able to cope with this, and
+* the syscall instruction must be the final one in the sequence.
+*/
+safe_syscall_start:
+   /* if signal_pending is non-zero, don't do the call */
+   lt  %r0,0(%r8)
+   jne 2f
+   svc 0
+safe_syscall_end:
+
+1: lg  %r15,0(%r15)/* load back chain */
+   .cfi_remember_state
+   .cfi_adjust_cfa_offset -160
+   lmg %r6,%r15,48(%r15)   /* load saved registers */
+   br  %r14
+   .cfi_restore_state
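
The rewind rule implemented by rewind_if_in_safe_syscall() in the hostdep.h hunk above reduces to a comparison on the interrupted PC. Here is a minimal sketch of that rule alone, with a hypothetical helper rewind_pc() that takes plain integers instead of a ucontext and the linker symbols:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical pure-function version of the rewind rule. 'start' and 'end'
 * play the role of the safe_syscall_start and safe_syscall_end labels.
 *
 * pc == start: nothing has run yet, so no adjustment is needed.
 * start < pc < end: the pending check has started but the syscall
 *   instruction has not completed, so go back and redo the check.
 * pc == end: the syscall instruction has completed; its result must be
 *   kept, so the PC is left alone.
 */
static uintptr_t rewind_pc(uintptr_t pc, uintptr_t start, uintptr_t end)
{
    if (pc > start && pc < end) {
        return start;
    }
    return pc;
}
```

This is also why the assembly comment requires the syscall instruction to be the last one before safe_syscall_end: any later instruction inside the window would be re-executed together with the syscall after a rewind.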

[Qemu-devel] [PULL v2 16/24] linux-user: add missing return in netlink switch statement

2016-06-28 Thread riku . voipio

From: Laurent Vivier 

Reported-by: Peter Maydell 
Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
Reviewed-by: Peter Maydell 
---
 linux-user/syscall.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b8a0738..33409c0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1692,6 +1692,7 @@ static abi_long target_to_host_for_each_nlmsg(struct nlmsghdr *nlh,
 struct nlmsgerr *e = NLMSG_DATA(nlh);
 e->error = tswap32(e->error);
 tswap_nlmsghdr(&e->msg);
+return 0;
 }
 default:
 ret = target_to_host_nlmsg(nlh);
-- 
2.1.4

[Qemu-devel] [PULL v2 18/24] linux-user: don't swap NLMSG_DATA() fields

2016-06-28 Thread riku . voipio

From: Laurent Vivier 

If the structure pointed by NLMSG_DATA() is bigger
than the size of NLMSG_DATA(), don't swap its fields
to avoid memory corruption.

Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
Reviewed-by: Peter Maydell 
---
 linux-user/syscall.c | 72 ++--
 1 file changed, 42 insertions(+), 30 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 4b0d791..d3d7ee6 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1948,29 +1948,35 @@ static abi_long host_to_target_data_route(struct nlmsghdr *nlh)
 case RTM_NEWLINK:
 case RTM_DELLINK:
 case RTM_GETLINK:
-ifi = NLMSG_DATA(nlh);
-ifi->ifi_type = tswap16(ifi->ifi_type);
-ifi->ifi_index = tswap32(ifi->ifi_index);
-ifi->ifi_flags = tswap32(ifi->ifi_flags);
-ifi->ifi_change = tswap32(ifi->ifi_change);
-host_to_target_link_rtattr(IFLA_RTA(ifi),
-   nlmsg_len - NLMSG_LENGTH(sizeof(*ifi)));
+if (nlh->nlmsg_len >= NLMSG_LENGTH(sizeof(*ifi))) {
+ifi = NLMSG_DATA(nlh);
+ifi->ifi_type = tswap16(ifi->ifi_type);
+ifi->ifi_index = tswap32(ifi->ifi_index);
+ifi->ifi_flags = tswap32(ifi->ifi_flags);
+ifi->ifi_change = tswap32(ifi->ifi_change);
+host_to_target_link_rtattr(IFLA_RTA(ifi),
+   nlmsg_len - NLMSG_LENGTH(sizeof(*ifi)));
+}
 break;
 case RTM_NEWADDR:
 case RTM_DELADDR:
 case RTM_GETADDR:
-ifa = NLMSG_DATA(nlh);
-ifa->ifa_index = tswap32(ifa->ifa_index);
-host_to_target_addr_rtattr(IFA_RTA(ifa),
-   nlmsg_len - NLMSG_LENGTH(sizeof(*ifa)));
+if (nlh->nlmsg_len >= NLMSG_LENGTH(sizeof(*ifa))) {
+ifa = NLMSG_DATA(nlh);
+ifa->ifa_index = tswap32(ifa->ifa_index);
+host_to_target_addr_rtattr(IFA_RTA(ifa),
+   nlmsg_len - NLMSG_LENGTH(sizeof(*ifa)));
+}
 break;
 case RTM_NEWROUTE:
 case RTM_DELROUTE:
 case RTM_GETROUTE:
-rtm = NLMSG_DATA(nlh);
-rtm->rtm_flags = tswap32(rtm->rtm_flags);
-host_to_target_route_rtattr(RTM_RTA(rtm),
-nlmsg_len - NLMSG_LENGTH(sizeof(*rtm)));
+if (nlh->nlmsg_len >= NLMSG_LENGTH(sizeof(*rtm))) {
+rtm = NLMSG_DATA(nlh);
+rtm->rtm_flags = tswap32(rtm->rtm_flags);
+host_to_target_route_rtattr(RTM_RTA(rtm),
+nlmsg_len - NLMSG_LENGTH(sizeof(*rtm)));
+}
 break;
 default:
 return -TARGET_EINVAL;
@@ -2086,30 +2092,36 @@ static abi_long target_to_host_data_route(struct nlmsghdr *nlh)
 break;
 case RTM_NEWLINK:
 case RTM_DELLINK:
-ifi = NLMSG_DATA(nlh);
-ifi->ifi_type = tswap16(ifi->ifi_type);
-ifi->ifi_index = tswap32(ifi->ifi_index);
-ifi->ifi_flags = tswap32(ifi->ifi_flags);
-ifi->ifi_change = tswap32(ifi->ifi_change);
-target_to_host_link_rtattr(IFLA_RTA(ifi), nlh->nlmsg_len -
-   NLMSG_LENGTH(sizeof(*ifi)));
+if (nlh->nlmsg_len >= NLMSG_LENGTH(sizeof(*ifi))) {
+ifi = NLMSG_DATA(nlh);
+ifi->ifi_type = tswap16(ifi->ifi_type);
+ifi->ifi_index = tswap32(ifi->ifi_index);
+ifi->ifi_flags = tswap32(ifi->ifi_flags);
+ifi->ifi_change = tswap32(ifi->ifi_change);
+target_to_host_link_rtattr(IFLA_RTA(ifi), nlh->nlmsg_len -
+   NLMSG_LENGTH(sizeof(*ifi)));
+}
 break;
 case RTM_GETADDR:
 case RTM_NEWADDR:
 case RTM_DELADDR:
-ifa = NLMSG_DATA(nlh);
-ifa->ifa_index = tswap32(ifa->ifa_index);
-target_to_host_addr_rtattr(IFA_RTA(ifa), nlh->nlmsg_len -
-   NLMSG_LENGTH(sizeof(*ifa)));
+if (nlh->nlmsg_len >= NLMSG_LENGTH(sizeof(*ifa))) {
+ifa = NLMSG_DATA(nlh);
+ifa->ifa_index = tswap32(ifa->ifa_index);
+target_to_host_addr_rtattr(IFA_RTA(ifa), nlh->nlmsg_len -
+   NLMSG_LENGTH(sizeof(*ifa)));
+}
 break;
 case RTM_GETROUTE:
 break;
 case RTM_NEWROUTE:
 case RTM_DELROUTE:
-rtm = NLMSG_DATA(nlh);
-rtm->rtm_flags = tswap32(rtm->rtm_flags);
-target_to_host_route_rtattr(RTM_RTA(rtm), nlh->nlmsg_len -
-NLMSG_LENGTH(sizeof(*rtm)));
+if (nlh->nlmsg_len >= NLMSG_LENGTH(sizeof(*rtm))) {
+rtm = NLMSG_DATA(nlh);
+rtm->rtm_flags = tswap32(rtm->rtm_flags);
+target_to_host_route_rtattr(RTM_RTA(rtm), nlh->nlmsg_len -
+
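
The guard this patch adds before each byte-swap can be illustrated without the netlink headers. The sketch below is an assumption-laden model: struct toy_hdr, struct toy_payload, TOY_HDRLEN, TOY_LENGTH() and payload_fits() are hypothetical stand-ins for struct nlmsghdr, a payload such as struct ifinfomsg, and NLMSG_LENGTH(); alignment padding is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the message header; 'len' covers the header
 * plus whatever payload the sender actually supplied. */
struct toy_hdr {
    uint32_t len;
    uint16_t type;
    uint16_t flags;
    uint32_t seq;
    uint32_t pid;
};

/* Hypothetical stand-in for a fixed-size payload structure. */
struct toy_payload {
    uint8_t  family;
    uint8_t  pad;
    uint16_t type;
    int32_t  index;
    uint32_t flags;
    uint32_t change;
};

#define TOY_HDRLEN     sizeof(struct toy_hdr)
#define TOY_LENGTH(n)  ((n) + TOY_HDRLEN)

/* Non-zero only if the message is long enough to hold a complete payload
 * of 'payload_size' bytes after the header. Fields may be swapped in place
 * only when this holds; otherwise the swap would touch memory beyond the
 * message. */
static int payload_fits(const struct toy_hdr *h, size_t payload_size)
{
    return h->len >= TOY_LENGTH(payload_size);
}
```

A request that carries only a header, or a shorter generic payload, then falls through without any field being swapped, which matches the patch's intent of skipping the conversion rather than reporting an error.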

[Qemu-devel] [PULL v2 19/24] linux-user: fix x86_64 safe_syscall

2016-06-28 Thread riku . voipio

From: Richard Henderson 

Do what the comment says, test for signal_pending non-zero,
rather than the current code which tests for bit 0 non-zero.

Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
Reviewed-by: Peter Maydell 
---
 linux-user/host/x86_64/safe-syscall.inc.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-user/host/x86_64/safe-syscall.inc.S b/linux-user/host/x86_64/safe-syscall.inc.S
index e09368d..f36992d 100644
--- a/linux-user/host/x86_64/safe-syscall.inc.S
+++ b/linux-user/host/x86_64/safe-syscall.inc.S
@@ -67,8 +67,8 @@ safe_syscall_base:
  */
 safe_syscall_start:
 /* if signal_pending is non-zero, don't do the call */
-testl   $1, (%rbp)
-jnz return_ERESTARTSYS
+cmpl   $0, (%rbp)
+jnz 1f
 syscall
 safe_syscall_end:
 /* code path for having successfully executed the syscall */
@@ -78,7 +78,7 @@ safe_syscall_end:
 .cfi_restore rbp
 ret
 
-return_ERESTARTSYS:
+1:
 /* code path when we didn't execute the syscall */
 .cfi_restore_state
 mov $-TARGET_ERESTARTSYS, %rax
-- 
2.1.4
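
The difference between the old and new checks is easy to see in C terms. A small sketch, with hypothetical helper names, of what "testl $1" followed by jnz and "cmpl $0" followed by jnz each decide:

```c
#include <assert.h>

/* Old check: testl $1, (%rbp); jnz ...
 * Branches only when bit 0 of signal_pending is set. */
static int pending_by_bit0(int signal_pending)
{
    return (signal_pending & 1) != 0;
}

/* New check: cmpl $0, (%rbp); jnz ...
 * Branches whenever signal_pending is non-zero, as the comment says. */
static int pending_by_nonzero(int signal_pending)
{
    return signal_pending != 0;
}
```

The two agree as long as the flag only ever holds 0 or 1; they diverge for any even non-zero value, where the old test would let the syscall proceed despite a pending signal.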

Re: [Qemu-devel] [RFC v3 16/19] tcg: move locking for tb_invalidate_phys_page_range up

2016-06-28 Thread Sergey Fedorov

On 03/06/16 23:40, Alex Bennée wrote:
> While we previously assumed an existing memory lock protected the page
> look up in the MTTCG SoftMMU case the memory lock is provided by the
> tb_lock. As a result we push the taking of this lock up the call tree.
> This requires a slightly different entry for the SoftMMU and user-mode
> cases from tb_invalidate_phys_range.

Sorry, I can't understand the description for the patch :( Some
rewording might be helpful, if you don't mind.

Thanks,
Sergey

> This also means user-mode breakpoint insertion needs to take two locks
> but it hadn't taken any previously so this is an improvement.
>
> Signed-off-by: Alex Bennée 
> ---
>  exec.c  | 16 
>  translate-all.c | 37 +
>  2 files changed, 45 insertions(+), 8 deletions(-)
(snip)

[Qemu-devel] [PULL v2 12/24] linux-user: add socketcall() strace

2016-06-28 Thread riku . voipio

From: Laurent Vivier 

Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
Reviewed-by: Peter Maydell 
---
 linux-user/strace.c   | 549 ++
 linux-user/strace.list|   2 +-
 linux-user/syscall_defs.h |  22 +-
 3 files changed, 568 insertions(+), 5 deletions(-)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 6ef5d38..c8df76f 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -5,6 +5,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include "qemu.h"
 
@@ -57,10 +60,15 @@ UNUSED static void print_open_flags(abi_long, int);
 UNUSED static void print_syscall_prologue(const struct syscallname *);
 UNUSED static void print_syscall_epilogue(const struct syscallname *);
 UNUSED static void print_string(abi_long, int);
+UNUSED static void print_buf(abi_long addr, abi_long len, int last);
 UNUSED static void print_raw_param(const char *, abi_long, int);
 UNUSED static void print_timeval(abi_ulong, int);
 UNUSED static void print_number(abi_long, int);
 UNUSED static void print_signal(abi_ulong, int);
+UNUSED static void print_sockaddr(abi_ulong addr, abi_long addrlen);
+UNUSED static void print_socket_domain(int domain);
+UNUSED static void print_socket_type(int type);
+UNUSED static void print_socket_protocol(int domain, int type, int protocol);
 
 /*
  * Utility functions
@@ -146,6 +154,165 @@ print_signal(abi_ulong arg, int last)
 gemu_log("%s%s", signal_name, get_comma(last));
 }
 
+static void
+print_sockaddr(abi_ulong addr, abi_long addrlen)
+{
+struct target_sockaddr *sa;
+int i;
+int sa_family;
+
+sa = lock_user(VERIFY_READ, addr, addrlen, 1);
+if (sa) {
+sa_family = tswap16(sa->sa_family);
+switch (sa_family) {
+case AF_UNIX: {
+struct target_sockaddr_un *un = (struct target_sockaddr_un *)sa;
+int i;
+gemu_log("{sun_family=AF_UNIX,sun_path=\"");
+for (i = 0; i < addrlen -
+offsetof(struct target_sockaddr_un, sun_path) &&
+ un->sun_path[i]; i++) {
+gemu_log("%c", un->sun_path[i]);
+}
+gemu_log("\"}");
+break;
+}
+case AF_INET: {
+struct target_sockaddr_in *in = (struct target_sockaddr_in *)sa;
+uint8_t *c = (uint8_t *)&in->sin_addr.s_addr;
+gemu_log("{sin_family=AF_INET,sin_port=htons(%d),",
+ ntohs(in->sin_port));
+gemu_log("sin_addr=inet_addr(\"%d.%d.%d.%d\")",
+ c[0], c[1], c[2], c[3]);
+gemu_log("}");
+break;
+}
+case AF_PACKET: {
+struct target_sockaddr_ll *ll = (struct target_sockaddr_ll *)sa;
+uint8_t *c = (uint8_t *)&ll->sll_addr;
+gemu_log("{sll_family=AF_PACKET,"
+ "sll_protocol=htons(0x%04x),if%d,pkttype=",
+ ntohs(ll->sll_protocol), ll->sll_ifindex);
+switch (ll->sll_pkttype) {
+case PACKET_HOST:
+gemu_log("PACKET_HOST");
+break;
+case PACKET_BROADCAST:
+gemu_log("PACKET_BROADCAST");
+break;
+case PACKET_MULTICAST:
+gemu_log("PACKET_MULTICAST");
+break;
+case PACKET_OTHERHOST:
+gemu_log("PACKET_OTHERHOST");
+break;
+case PACKET_OUTGOING:
+gemu_log("PACKET_OUTGOING");
+break;
+default:
+gemu_log("%d", ll->sll_pkttype);
+break;
+}
+gemu_log(",sll_addr=%02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x",
+ c[0], c[1], c[2], c[3], c[4], c[5], c[6], c[7]);
+gemu_log("}");
+break;
+}
+default:
+gemu_log("{sa_family=%d, sa_data={", sa->sa_family);
+for (i = 0; i < 13; i++) {
+gemu_log("%02x, ", sa->sa_data[i]);
+}
+gemu_log("%02x}", sa->sa_data[i]);
+gemu_log("}");
+break;
+}
+unlock_user(sa, addr, 0);
+} else {
+print_raw_param("0x"TARGET_ABI_FMT_lx, addr, 0);
+}
+gemu_log(", "TARGET_ABI_FMT_ld, addrlen);
+}
+
+static void
+print_socket_domain(int domain)
+{
+switch (domain) {
+case PF_UNIX:
+gemu_log("PF_UNIX");
+break;
+case PF_INET:
+gemu_log("PF_INET");
+break;
+case PF_PACKET:
+gemu_log("PF_PACKET");
+break;
+default:
+gemu_log("%d", domain);
+break;
+}
+}
+
+static void
+print_socket_type(int type)
+{
+switch (type) {
+case TARGET_SOCK_DGRAM:
+gemu_log("SOCK_DGRAM");
+break;
+case TARGET_SOCK_STREAM:
+

[Qemu-devel] [PULL v2 24/24] linux-user: Provide safe_syscall for ppc64

2016-06-28 Thread riku . voipio

From: Richard Henderson 

Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/host/ppc64/hostdep.h  | 23 
 linux-user/host/ppc64/safe-syscall.inc.S | 92 
 2 files changed, 115 insertions(+)
 create mode 100644 linux-user/host/ppc64/safe-syscall.inc.S

diff --git a/linux-user/host/ppc64/hostdep.h b/linux-user/host/ppc64/hostdep.h
index 7609bf5..310e7d1 100644
--- a/linux-user/host/ppc64/hostdep.h
+++ b/linux-user/host/ppc64/hostdep.h
@@ -12,4 +12,27 @@
 #ifndef QEMU_HOSTDEP_H
 #define QEMU_HOSTDEP_H
 
+/* We have a safe-syscall.inc.S */
+#define HAVE_SAFE_SYSCALL
+
+#ifndef __ASSEMBLER__
+
+/* These are defined by the safe-syscall.inc.S file */
+extern char safe_syscall_start[];
+extern char safe_syscall_end[];
+
+/* Adjust the signal context to rewind out of safe-syscall if we're in it */
+static inline void rewind_if_in_safe_syscall(void *puc)
+{
+struct ucontext *uc = puc;
+unsigned long *pcreg = &uc->uc_mcontext.gp_regs[PT_NIP];
+
+if (*pcreg > (uintptr_t)safe_syscall_start
+&& *pcreg < (uintptr_t)safe_syscall_end) {
+*pcreg = (uintptr_t)safe_syscall_start;
+}
+}
+
+#endif /* __ASSEMBLER__ */
+
 #endif
diff --git a/linux-user/host/ppc64/safe-syscall.inc.S b/linux-user/host/ppc64/safe-syscall.inc.S
new file mode 100644
index 000..d30050a
--- /dev/null
+++ b/linux-user/host/ppc64/safe-syscall.inc.S
@@ -0,0 +1,92 @@
+/*
+ * safe-syscall.inc.S : host-specific assembly fragment
+ * to handle signals occurring at the same time as system calls.
+ * This is intended to be included by linux-user/safe-syscall.S
+ *
+ * Written by Richard Henderson 
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+   .global safe_syscall_base
+   .global safe_syscall_start
+   .global safe_syscall_end
+   .type   safe_syscall_base, @function
+
+   .text
+
+   /* This is the entry point for making a system call. The calling
+* convention here is that of a C varargs function with the
+* first argument an 'int *' to the signal_pending flag, the
+* second one the system call number (as a 'long'), and all further
+* arguments being syscall arguments (also 'long').
+* We return a long which is the syscall's return value, which
+* may be negative-errno on failure. Conversion to the
+* -1-and-errno-set convention is done by the calling wrapper.
+*/
+#if _CALL_ELF == 2
+safe_syscall_base:
+   .cfi_startproc
+   .localentry safe_syscall_base,0
+#else
+   .section ".opd","aw"
+   .align  3
+safe_syscall_base:
+   .quad   .L.safe_syscall_base,.TOC.@tocbase,0
+   .previous
+.L.safe_syscall_base:
+   .cfi_startproc
+#endif
+   /* We enter with r3 == *signal_pending
+*   r4 == syscall number
+*   r5 ... r10 == syscall arguments
+*   and return the result in r3
+* and the syscall instruction needs
+*   r0 == syscall number
+*   r3 ... r8 == syscall arguments
+*   and returns the result in r3
+* Shuffle everything around appropriately.
+*/
+   mr  11, 3   /* signal_pending */
+   mr  0, 4    /* syscall number */
+   mr  3, 5    /* syscall arguments */
+   mr  4, 6
+   mr  5, 7
+   mr  6, 8
+   mr  7, 9
+   mr  8, 10
+
+   /* This next sequence of code works in conjunction with the
+* rewind_if_safe_syscall_function(). If a signal is taken
+* and the interrupted PC is anywhere between 'safe_syscall_start'
+* and 'safe_syscall_end' then we rewind it to 'safe_syscall_start'.
+* The code sequence must therefore be able to cope with this, and
+* the syscall instruction must be the final one in the sequence.
+*/
+safe_syscall_start:
+   /* if signal_pending is non-zero, don't do the call */
+   lwz 12, 0(11)
+   cmpwi   0, 12, 0
+   bne-    0f
+   sc
+safe_syscall_end:
+   /* code path when we did execute the syscall */
+   bnslr+
+
+   /* syscall failed; return negative errno */
+   neg 3, 3
+   blr
+
+   /* code path when we didn't execute the syscall */
+0: addi    3, 0, -TARGET_ERESTARTSYS
+   blr
+   .cfi_endproc
+
+#if _CALL_ELF == 2
+   .size   safe_syscall_base, .-safe_syscall_base
+#else
+   .size   safe_syscall_base, .-.L.safe_syscall_base
+   .size   .L.safe_syscall_base, .-.L.safe_syscall_base
+#endif
-- 
2.1.4
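
The rewind rule described in the patch comment can be modelled in plain C without touching a real signal context. This is a minimal sketch, not the QEMU implementation: `rewind_pc`, `start` and `end` are hypothetical names standing in for the saved program counter and the addresses of the `safe_syscall_start` and `safe_syscall_end` labels.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the rewind rule. 'start' and 'end' stand in for
 * the addresses of the safe_syscall_start and safe_syscall_end labels.
 * A PC strictly inside the window means the syscall instruction has not
 * completed yet, so execution restarts at 'start' and the signal_pending
 * check runs again. A PC equal to 'end' means the syscall already
 * returned, so it is left alone.
 */
static uintptr_t rewind_pc(uintptr_t pc, uintptr_t start, uintptr_t end)
{
    assert(start < end);
    if (pc > start && pc < end) {
        return start;
    }
    return pc;
}
```

Because the comparison is strict at both ends, a signal that arrives after the syscall instruction has finished never causes the call to be issued twice.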

[Qemu-devel] [Bug 1355738] Re: qemu-img: Killed by SIGTRAP on check of the fuzzed image

2016-06-28 Thread T. Huth

** Changed in: qemu
   Status: Fix Committed => Fix Released

-- 
https://bugs.launchpad.net/bugs/1355738

Title:
  qemu-img: Killed by SIGTRAP on check of the fuzzed image

Status in QEMU:
  Fix Released

Bug description:
  'qemu-img check -r all' was killed by SIGTRAP.

  Sequence:
   1. Unpack the attached archive, make a copy of test.img
   2. Put copy.img and backing_img.qed in the same directory
   3. Execute

  qemu-img check -f qcow2 -r all copy.img

  Result: qemu-img was killed by SIGTRAP with the reason:

  (process:2210): GLib-ERROR **: gmem.c:140: failed to allocate
  18446744069633940288 bytes

  The qemu-img execution log can be found in the attached archive.

  qemu.git HEAD 2d591ce2aeebf


[Qemu-devel] [PULL v2 17/24] linux-user: fd_trans_host_to_target_data() must process only received data

2016-06-28 Thread riku . voipio

From: Laurent Vivier 

If we process the whole buffer rather than only the bytes actually
received, the netlink helpers can try to swap invalid data.

Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
Reviewed-by: Peter Maydell 
---
 linux-user/syscall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 33409c0..4b0d791 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -2991,7 +2991,7 @@ static abi_long do_sendrecvmsg_locked(int fd, struct target_msghdr *msgp,
 len = ret;
 if (fd_trans_host_to_target_data(fd)) {
 ret = fd_trans_host_to_target_data(fd)(msg.msg_iov->iov_base,
-   msg.msg_iov->iov_len);
+   len);
 } else {
 ret = host_to_target_cmsg(msgp, &msg);
 }
-- 
2.1.4
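
The point of this fix can be illustrated with a small stand-alone helper. This is a sketch under stated assumptions: `swap16_received` is a hypothetical stand-in for a netlink-style conversion helper, not a QEMU function, and it converts only the bytes that the receive call reported, leaving the rest of the buffer untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical conversion helper: byte-swap every complete 16-bit word
 * in the first 'len' bytes of 'buf'. 'len' is meant to be the number of
 * bytes actually received, not the capacity of the buffer. Returns the
 * number of words converted.
 */
static size_t swap16_received(uint16_t *buf, size_t len)
{
    size_t n = len / sizeof(uint16_t);

    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        buf[i] = (uint16_t)((buf[i] << 8) | (buf[i] >> 8));
    }
    return n;
}
```

Passing the buffer capacity instead of the received length would also swap whatever stale bytes follow the message, which is the failure the patch avoids.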

[Qemu-devel] [PULL v2 20/24] linux-user: Provide safe_syscall for i386

2016-06-28 Thread riku . voipio

From: Richard Henderson 

Signed-off-by: Richard Henderson 
Reviewed-by: Peter Maydell 
Signed-off-by: Riku Voipio 
---
 linux-user/host/i386/hostdep.h  |  23 +++
 linux-user/host/i386/safe-syscall.inc.S | 112 
 2 files changed, 135 insertions(+)
 create mode 100644 linux-user/host/i386/safe-syscall.inc.S

diff --git a/linux-user/host/i386/hostdep.h b/linux-user/host/i386/hostdep.h
index 7609bf5..5a12f4a 100644
--- a/linux-user/host/i386/hostdep.h
+++ b/linux-user/host/i386/hostdep.h
@@ -12,4 +12,27 @@
 #ifndef QEMU_HOSTDEP_H
 #define QEMU_HOSTDEP_H
 
+/* We have a safe-syscall.inc.S */
+#define HAVE_SAFE_SYSCALL
+
+#ifndef __ASSEMBLER__
+
+/* These are defined by the safe-syscall.inc.S file */
+extern char safe_syscall_start[];
+extern char safe_syscall_end[];
+
+/* Adjust the signal context to rewind out of safe-syscall if we're in it */
+static inline void rewind_if_in_safe_syscall(void *puc)
+{
+struct ucontext *uc = puc;
+greg_t *pcreg = &uc->uc_mcontext.gregs[REG_EIP];
+
+if (*pcreg > (uintptr_t)safe_syscall_start
+&& *pcreg < (uintptr_t)safe_syscall_end) {
+*pcreg = (uintptr_t)safe_syscall_start;
+}
+}
+
+#endif /* __ASSEMBLER__ */
+
 #endif
diff --git a/linux-user/host/i386/safe-syscall.inc.S b/linux-user/host/i386/safe-syscall.inc.S
new file mode 100644
index 000..766d0de
--- /dev/null
+++ b/linux-user/host/i386/safe-syscall.inc.S
@@ -0,0 +1,112 @@
+/*
+ * safe-syscall.inc.S : host-specific assembly fragment
+ * to handle signals occurring at the same time as system calls.
+ * This is intended to be included by linux-user/safe-syscall.S
+ *
+ * Written by Richard Henderson 
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+   .global safe_syscall_base
+   .global safe_syscall_start
+   .global safe_syscall_end
+   .type   safe_syscall_base, @function
+
+   /* This is the entry point for making a system call. The calling
+* convention here is that of a C varargs function with the
+* first argument an 'int *' to the signal_pending flag, the
+* second one the system call number (as a 'long'), and all further
+* arguments being syscall arguments (also 'long').
+* We return a long which is the syscall's return value, which
+* may be negative-errno on failure. Conversion to the
+* -1-and-errno-set convention is done by the calling wrapper.
+*/
+safe_syscall_base:
+   .cfi_startproc
+   push    %ebp
+   .cfi_adjust_cfa_offset 4
+   .cfi_rel_offset ebp, 0
+   push    %esi
+   .cfi_adjust_cfa_offset 4
+   .cfi_rel_offset esi, 0
+   push    %edi
+   .cfi_adjust_cfa_offset 4
+   .cfi_rel_offset edi, 0
+   push    %ebx
+   .cfi_adjust_cfa_offset 4
+   .cfi_rel_offset ebx, 0
+
+   /* The syscall calling convention isn't the same as the C one:
+* we enter with 0(%esp) == return address
+*   4(%esp) == *signal_pending
+*   8(%esp) == syscall number
+*   12(%esp) ... 32(%esp) == syscall arguments
+*   and return the result in eax
+* and the syscall instruction needs
+*   eax == syscall number
+*   ebx, ecx, edx, esi, edi, ebp == syscall arguments
+*   and returns the result in eax
+* Shuffle everything around appropriately.
+* Note the 16 bytes that we pushed to save registers.
+*/
+   mov 12+16(%esp), %ebx   /* the syscall arguments */
+   mov 16+16(%esp), %ecx
+   mov 20+16(%esp), %edx
+   mov 24+16(%esp), %esi
+   mov 28+16(%esp), %edi
+   mov 32+16(%esp), %ebp
+
+   /* This next sequence of code works in conjunction with the
+* rewind_if_safe_syscall_function(). If a signal is taken
+* and the interrupted PC is anywhere between 'safe_syscall_start'
+* and 'safe_syscall_end' then we rewind it to 'safe_syscall_start'.
+* The code sequence must therefore be able to cope with this, and
+* the syscall instruction must be the final one in the sequence.
+*/
+safe_syscall_start:
+   /* if signal_pending is non-zero, don't do the call */
+   mov 4+16(%esp), %eax    /* signal_pending */
+   cmp $0, (%eax)
+   jnz 1f
+   mov 8+16(%esp), %eax    /* syscall number */
+   int $0x80
+safe_syscall_end:
+   /* code path for having successfully executed the syscall */
+   pop %ebx
+   .cfi_remember_state
+   .cfi_def_cfa_offset -4
+   .cfi_restore ebx
+   pop %edi
+   .cfi_def_cfa_offset -4
+   .cfi_restore edi
+

[Qemu-devel] [PULL v2 14/24] linux-user: fix clone() strace

2016-06-28 Thread riku . voipio

From: Laurent Vivier 

Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
Reviewed-by: Peter Maydell 
---
 linux-user/strace.c | 42 --
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 95f4338..cc10dc4 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -957,33 +957,31 @@ print_chmod(const struct syscallname *name,
 #endif
 
 #ifdef TARGET_NR_clone
+static void do_print_clone(unsigned int flags, abi_ulong newsp,
+   abi_ulong parent_tidptr, target_ulong newtls,
+   abi_ulong child_tidptr)
+{
+print_flags(clone_flags, flags, 0);
+print_raw_param("child_stack=0x" TARGET_ABI_FMT_lx, newsp, 0);
+print_raw_param("parent_tidptr=0x" TARGET_ABI_FMT_lx, parent_tidptr, 0);
+print_raw_param("tls=0x" TARGET_ABI_FMT_lx, newtls, 0);
+print_raw_param("child_tidptr=0x" TARGET_ABI_FMT_lx, child_tidptr, 1);
+}
+
 static void
 print_clone(const struct syscallname *name,
-abi_long arg0, abi_long arg1, abi_long arg2,
-abi_long arg3, abi_long arg4, abi_long arg5)
+abi_long arg1, abi_long arg2, abi_long arg3,
+abi_long arg4, abi_long arg5, abi_long arg6)
 {
 print_syscall_prologue(name);
-#if defined(TARGET_M68K)
-print_flags(clone_flags, arg0, 0);
-print_raw_param("newsp=0x" TARGET_ABI_FMT_lx, arg1, 1);
-#elif defined(TARGET_SH4) || defined(TARGET_ALPHA)
-print_flags(clone_flags, arg0, 0);
-print_raw_param("child_stack=0x" TARGET_ABI_FMT_lx, arg1, 0);
-print_raw_param("parent_tidptr=0x" TARGET_ABI_FMT_lx, arg2, 0);
-print_raw_param("child_tidptr=0x" TARGET_ABI_FMT_lx, arg3, 0);
-print_raw_param("tls=0x" TARGET_ABI_FMT_lx, arg4, 1);
-#elif defined(TARGET_CRIS)
-print_raw_param("child_stack=0x" TARGET_ABI_FMT_lx, arg0, 0);
-print_flags(clone_flags, arg1, 0);
-print_raw_param("parent_tidptr=0x" TARGET_ABI_FMT_lx, arg2, 0);
-print_raw_param("tls=0x" TARGET_ABI_FMT_lx, arg3, 0);
-print_raw_param("child_tidptr=0x" TARGET_ABI_FMT_lx, arg4, 1);
+#if defined(TARGET_MICROBLAZE)
+do_print_clone(arg1, arg2, arg4, arg6, arg5);
+#elif defined(TARGET_CLONE_BACKWARDS)
+do_print_clone(arg1, arg2, arg3, arg4, arg5);
+#elif defined(TARGET_CLONE_BACKWARDS2)
+do_print_clone(arg2, arg1, arg3, arg5, arg4);
 #else
-print_flags(clone_flags, arg0, 0);
-print_raw_param("child_stack=0x" TARGET_ABI_FMT_lx, arg1, 0);
-print_raw_param("parent_tidptr=0x" TARGET_ABI_FMT_lx, arg2, 0);
-print_raw_param("tls=0x" TARGET_ABI_FMT_lx, arg3, 0);
-print_raw_param("child_tidptr=0x" TARGET_ABI_FMT_lx, arg4, 1);
+do_print_clone(arg1, arg2, arg3, arg5, arg4);
 #endif
 print_syscall_epilogue(name);
 }
-- 
2.1.4
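
The argument permutations passed to do_print_clone() are easier to compare when written as a table in code. The sketch below uses hypothetical names throughout (`clone_abi`, `clone_view`, `decode_clone`); only the four argument orders are taken from the patch, with a[0]..a[5] corresponding to arg1..arg6.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; only the argument orders come from the patch. */
enum clone_abi {
    ABI_DEFAULT,
    ABI_CLONE_BACKWARDS,
    ABI_CLONE_BACKWARDS2,
    ABI_MICROBLAZE
};

struct clone_view {
    long flags;
    long newsp;
    long parent_tidptr;
    long newtls;
    long child_tidptr;
};

/* a[0]..a[5] correspond to arg1..arg6 in print_clone(). */
static struct clone_view decode_clone(enum clone_abi abi, const long a[6])
{
    struct clone_view v;

    assert(a != NULL);
    switch (abi) {
    case ABI_MICROBLAZE:
        v = (struct clone_view){ a[0], a[1], a[3], a[5], a[4] };
        break;
    case ABI_CLONE_BACKWARDS:
        v = (struct clone_view){ a[0], a[1], a[2], a[3], a[4] };
        break;
    case ABI_CLONE_BACKWARDS2:
        v = (struct clone_view){ a[1], a[0], a[2], a[4], a[3] };
        break;
    default:
        v = (struct clone_view){ a[0], a[1], a[2], a[4], a[3] };
        break;
    }
    return v;
}
```

Normalising the raw arguments into one structure first, as do_print_clone() does with its parameter list, keeps the printing code identical for every target.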

[Qemu-devel] [PULL v2 22/24] linux-user: Provide safe_syscall for aarch64

2016-06-28 Thread riku . voipio

From: Richard Henderson 

Signed-off-by: Richard Henderson 
Reviewed-by: Peter Maydell 
Signed-off-by: Riku Voipio 
[RV] Updated syscall argument comment to match code
---
 linux-user/host/aarch64/hostdep.h  | 23 +
 linux-user/host/aarch64/safe-syscall.inc.S | 75 ++
 2 files changed, 98 insertions(+)
 create mode 100644 linux-user/host/aarch64/safe-syscall.inc.S

diff --git a/linux-user/host/aarch64/hostdep.h b/linux-user/host/aarch64/hostdep.h
index 7609bf5..b79eaf1 100644
--- a/linux-user/host/aarch64/hostdep.h
+++ b/linux-user/host/aarch64/hostdep.h
@@ -12,4 +12,27 @@
 #ifndef QEMU_HOSTDEP_H
 #define QEMU_HOSTDEP_H
 
+/* We have a safe-syscall.inc.S */
+#define HAVE_SAFE_SYSCALL
+
+#ifndef __ASSEMBLER__
+
+/* These are defined by the safe-syscall.inc.S file */
+extern char safe_syscall_start[];
+extern char safe_syscall_end[];
+
+/* Adjust the signal context to rewind out of safe-syscall if we're in it */
+static inline void rewind_if_in_safe_syscall(void *puc)
+{
+struct ucontext *uc = puc;
+__u64 *pcreg = &uc->uc_mcontext.pc;
+
+if (*pcreg > (uintptr_t)safe_syscall_start
+&& *pcreg < (uintptr_t)safe_syscall_end) {
+*pcreg = (uintptr_t)safe_syscall_start;
+}
+}
+
+#endif /* __ASSEMBLER__ */
+
 #endif
diff --git a/linux-user/host/aarch64/safe-syscall.inc.S b/linux-user/host/aarch64/safe-syscall.inc.S
new file mode 100644
index 000..58a2329
--- /dev/null
+++ b/linux-user/host/aarch64/safe-syscall.inc.S
@@ -0,0 +1,75 @@
+/*
+ * safe-syscall.inc.S : host-specific assembly fragment
+ * to handle signals occurring at the same time as system calls.
+ * This is intended to be included by linux-user/safe-syscall.S
+ *
+ * Written by Richard Henderson 
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+   .global safe_syscall_base
+   .global safe_syscall_start
+   .global safe_syscall_end
+   .type   safe_syscall_base, #function
+   .type   safe_syscall_start, #function
+   .type   safe_syscall_end, #function
+
+   /* This is the entry point for making a system call. The calling
+* convention here is that of a C varargs function with the
+* first argument an 'int *' to the signal_pending flag, the
+* second one the system call number (as a 'long'), and all further
+* arguments being syscall arguments (also 'long').
+* We return a long which is the syscall's return value, which
+* may be negative-errno on failure. Conversion to the
+* -1-and-errno-set convention is done by the calling wrapper.
+*/
+safe_syscall_base:
+   .cfi_startproc
+   /* The syscall calling convention isn't the same as the
+* C one:
+* we enter with x0 == *signal_pending
+*   x1 == syscall number
+*   x2 ... x7, (stack) == syscall arguments
+*   and return the result in x0
+* and the syscall instruction needs
+*   x8 == syscall number
+*   x0 ... x7 == syscall arguments
+*   and returns the result in x0
+* Shuffle everything around appropriately.
+*/
+   mov x9, x0  /* signal_pending pointer */
+   mov x8, x1  /* syscall number */
+   mov x0, x2  /* syscall arguments */
+   mov x1, x3
+   mov x2, x4
+   mov x3, x5
+   mov x4, x6
+   mov x6, x7
+   ldr x7, [sp]
+
+   /* This next sequence of code works in conjunction with the
+* rewind_if_safe_syscall_function(). If a signal is taken
+* and the interrupted PC is anywhere between 'safe_syscall_start'
+* and 'safe_syscall_end' then we rewind it to 'safe_syscall_start'.
+* The code sequence must therefore be able to cope with this, and
+* the syscall instruction must be the final one in the sequence.
+*/
+safe_syscall_start:
+   /* if signal_pending is non-zero, don't do the call */
+   ldr w10, [x9]
+   cbnz    w10, 0f
+   svc 0x0
+safe_syscall_end:
+   /* code path for having successfully executed the syscall */
+   ret
+
+0:
+   /* code path when we didn't execute the syscall */
+   mov x0, #-TARGET_ERESTARTSYS
+   ret
+   .cfi_endproc
+
+   .size   safe_syscall_base, .-safe_syscall_base
-- 
2.1.4

[Qemu-devel] [PULL v2 15/24] linux-user: update get_thread_area/set_thread_area strace

2016-06-28 Thread riku . voipio

From: Laurent Vivier 

   int get_thread_area(struct user_desc *u_info);
   int set_thread_area(struct user_desc *u_info);

Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
Reviewed-by: Peter Maydell 
---
 linux-user/strace.list | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/linux-user/strace.list b/linux-user/strace.list
index 7c54dc6..aa967a2 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -337,7 +337,8 @@
 { TARGET_NR_getsockopt, "getsockopt" , NULL, NULL, NULL },
 #endif
 #ifdef TARGET_NR_get_thread_area
-{ TARGET_NR_get_thread_area, "get_thread_area" , NULL, NULL, NULL },
+{ TARGET_NR_get_thread_area, "get_thread_area", "%s(0x"TARGET_ABI_FMT_lx")",
+  NULL, NULL },
 #endif
 #ifdef TARGET_NR_gettid
 { TARGET_NR_gettid, "gettid" , NULL, NULL, NULL },
@@ -1234,7 +1235,8 @@
 { TARGET_NR_setsockopt, "setsockopt" , NULL, NULL, NULL },
 #endif
 #ifdef TARGET_NR_set_thread_area
-{ TARGET_NR_set_thread_area, "set_thread_area" , NULL, NULL, NULL },
+{ TARGET_NR_set_thread_area, "set_thread_area", "%s(0x"TARGET_ABI_FMT_lx")",
+  NULL, NULL },
 #endif
 #ifdef TARGET_NR_set_tid_address
 { TARGET_NR_set_tid_address, "set_tid_address" , NULL, NULL, NULL },
-- 
2.1.4

[Qemu-devel] [PULL v2 21/24] linux-user: Provide safe_syscall for arm

2016-06-28 Thread riku . voipio

From: Richard Henderson 

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 linux-user/host/arm/hostdep.h  | 23 +
 linux-user/host/arm/safe-syscall.inc.S | 90 ++
 2 files changed, 113 insertions(+)
 create mode 100644 linux-user/host/arm/safe-syscall.inc.S

diff --git a/linux-user/host/arm/hostdep.h b/linux-user/host/arm/hostdep.h
index 7609bf5..8e1ff2f 100644
--- a/linux-user/host/arm/hostdep.h
+++ b/linux-user/host/arm/hostdep.h
@@ -12,4 +12,27 @@
 #ifndef QEMU_HOSTDEP_H
 #define QEMU_HOSTDEP_H
 
+/* We have a safe-syscall.inc.S */
+#define HAVE_SAFE_SYSCALL
+
+#ifndef __ASSEMBLER__
+
+/* These are defined by the safe-syscall.inc.S file */
+extern char safe_syscall_start[];
+extern char safe_syscall_end[];
+
+/* Adjust the signal context to rewind out of safe-syscall if we're in it */
+static inline void rewind_if_in_safe_syscall(void *puc)
+{
+struct ucontext *uc = puc;
+unsigned long *pcreg = &uc->uc_mcontext.arm_pc;
+
+if (*pcreg > (uintptr_t)safe_syscall_start
+&& *pcreg < (uintptr_t)safe_syscall_end) {
+*pcreg = (uintptr_t)safe_syscall_start;
+}
+}
+
+#endif /* __ASSEMBLER__ */
+
 #endif
diff --git a/linux-user/host/arm/safe-syscall.inc.S b/linux-user/host/arm/safe-syscall.inc.S
new file mode 100644
index 000..88c4958
--- /dev/null
+++ b/linux-user/host/arm/safe-syscall.inc.S
@@ -0,0 +1,90 @@
+/*
+ * safe-syscall.inc.S : host-specific assembly fragment
+ * to handle signals occurring at the same time as system calls.
+ * This is intended to be included by linux-user/safe-syscall.S
+ *
+ * Written by Richard Henderson 
+ * Copyright (C) 2016 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+   .global safe_syscall_base
+   .global safe_syscall_start
+   .global safe_syscall_end
+   .type   safe_syscall_base, %function
+
+   .cfi_sections   .debug_frame
+
+   .text
+   .syntax unified
+   .arm
+   .align 2
+
+   /* This is the entry point for making a system call. The calling
+* convention here is that of a C varargs function with the
+* first argument an 'int *' to the signal_pending flag, the
+* second one the system call number (as a 'long'), and all further
+* arguments being syscall arguments (also 'long').
+* We return a long which is the syscall's return value, which
+* may be negative-errno on failure. Conversion to the
+* -1-and-errno-set convention is done by the calling wrapper.
+*/
+safe_syscall_base:
+   .fnstart
+   .cfi_startproc
+   mov r12, sp /* save entry stack */
+   push    { r4, r5, r6, r7, r8, lr }
+   .save   { r4, r5, r6, r7, r8, lr }
+   .cfi_adjust_cfa_offset 24
+   .cfi_rel_offset r4, 0
+   .cfi_rel_offset r5, 4
+   .cfi_rel_offset r6, 8
+   .cfi_rel_offset r7, 12
+   .cfi_rel_offset r8, 16
+   .cfi_rel_offset lr, 20
+
+   /* The syscall calling convention isn't the same as the C one:
+* we enter with r0 == *signal_pending
+*   r1 == syscall number
+*   r2, r3, [sp+0] ... [sp+12] == syscall arguments
+*   and return the result in r0
+* and the syscall instruction needs
+*   r7 == syscall number
+*   r0 ... r6 == syscall arguments
+*   and returns the result in r0
+* Shuffle everything around appropriately.
+* Note the 16 bytes that we pushed to save registers.
+*/
+   mov r8, r0  /* copy signal_pending */
+   mov r7, r1  /* syscall number */
+   mov r0, r2  /* syscall args */
+   mov r1, r3
+   ldm r12, { r2, r3, r4, r5, r6 }
+
+   /* This next sequence of code works in conjunction with the
+* rewind_if_safe_syscall_function(). If a signal is taken
+* and the interrupted PC is anywhere between 'safe_syscall_start'
+* and 'safe_syscall_end' then we rewind it to 'safe_syscall_start'.
+* The code sequence must therefore be able to cope with this, and
+* the syscall instruction must be the final one in the sequence.
+*/
+safe_syscall_start:
+   /* if signal_pending is non-zero, don't do the call */
+   ldr r12, [r8]   /* signal_pending */
+   tst r12, r12
+   bne 1f
+   swi 0
+safe_syscall_end:
+   /* code path for having successfully executed the syscall */
+   pop { r4, r5, r6, r7, r8, pc }
+
+1:
+   /* code path when we didn't execute the syscall */
+   ldr r0, =-TARGET_ERESTARTSYS
+   pop { r4,

[Qemu-devel] [PULL v2 11/24] linux-user: Support F_GETPIPE_SZ and F_SETPIPE_SZ fcntls

2016-06-28 Thread riku . voipio

From: Peter Maydell 

Support the F_GETPIPE_SZ and F_SETPIPE_SZ fcntl operations.

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
---
 linux-user/strace.c   | 7 +++
 linux-user/syscall.c  | 6 ++
 linux-user/syscall_defs.h | 2 ++
 3 files changed, 15 insertions(+)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 4046b81..6ef5d38 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -918,6 +918,13 @@ print_fcntl(const struct syscallname *name,
 case TARGET_F_GETLEASE:
 gemu_log("F_GETLEASE");
 break;
+case TARGET_F_SETPIPE_SZ:
+gemu_log("F_SETPIPE_SZ,");
+print_raw_param(TARGET_ABI_FMT_ld, arg2, 1);
+break;
+case TARGET_F_GETPIPE_SZ:
+gemu_log("F_GETPIPE_SZ");
+break;
 case TARGET_F_DUPFD_CLOEXEC:
 gemu_log("F_DUPFD_CLOEXEC,");
 print_raw_param(TARGET_ABI_FMT_ld, arg2, 1);
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 8163ae8..b8a0738 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5585,6 +5585,10 @@ static int target_to_host_fcntl_cmd(int cmd)
case TARGET_F_SETOWN_EX:
return F_SETOWN_EX;
 #endif
+case TARGET_F_SETPIPE_SZ:
+return F_SETPIPE_SZ;
+case TARGET_F_GETPIPE_SZ:
+return F_GETPIPE_SZ;
default:
 return -TARGET_EINVAL;
 }
@@ -5822,6 +5826,8 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 case TARGET_F_GETSIG:
 case TARGET_F_SETLEASE:
 case TARGET_F_GETLEASE:
+case TARGET_F_SETPIPE_SZ:
+case TARGET_F_GETPIPE_SZ:
 ret = get_errno(safe_fcntl(fd, host_cmd, arg));
 break;
 
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 6ee9251..420463b 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -2166,6 +2166,8 @@ struct target_statfs64 {
 #define TARGET_F_SETLEASE (TARGET_F_LINUX_SPECIFIC_BASE + 0)
 #define TARGET_F_GETLEASE (TARGET_F_LINUX_SPECIFIC_BASE + 1)
 #define TARGET_F_DUPFD_CLOEXEC (TARGET_F_LINUX_SPECIFIC_BASE + 6)
+#define TARGET_F_SETPIPE_SZ (TARGET_F_LINUX_SPECIFIC_BASE + 7)
+#define TARGET_F_GETPIPE_SZ (TARGET_F_LINUX_SPECIFIC_BASE + 8)
 #define TARGET_F_NOTIFY  (TARGET_F_LINUX_SPECIFIC_BASE+2)
 
 #if defined(TARGET_ALPHA)
-- 
2.1.4
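
For reference, this is roughly how a guest program would exercise the two fcntl operations that the patch forwards to the host. It is a minimal Linux-only sketch: `resize_pipe` is a hypothetical helper, and the fallback definitions use the Linux values 1024 + 7 and 1024 + 8, which mirror the TARGET_F_LINUX_SPECIFIC_BASE offsets in the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Fallbacks in case the libc headers were included without _GNU_SOURCE. */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#endif
#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032
#endif

/*
 * Hypothetical helper: ask the kernel to resize the pipe behind 'fd' to
 * at least 'requested' bytes, then read back the capacity it actually
 * chose. Returns that capacity, or -1 if either fcntl call fails.
 */
static int resize_pipe(int fd, int requested)
{
    assert(requested > 0);
    if (fcntl(fd, F_SETPIPE_SZ, requested) < 0) {
        return -1;
    }
    return fcntl(fd, F_GETPIPE_SZ);
}
```

The kernel may round the requested size up, so callers should rely on the value read back with F_GETPIPE_SZ rather than on the value they asked for.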

[Qemu-devel] [PULL v2 10/24] linux-user: Fix wrong type used for argument to rt_sigqueueinfo

2016-06-28 Thread riku . voipio

From: Peter Maydell 

The third argument to the rt_sigqueueinfo syscall is a pointer to
a siginfo_t, not a pointer to a sigset_t. Fix the error in the
arguments to lock_user(), which meant that we would not have
detected some faults that we should.

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5166ff9..8163ae8 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7876,8 +7876,11 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 case TARGET_NR_rt_sigqueueinfo:
 {
 siginfo_t uinfo;
-if (!(p = lock_user(VERIFY_READ, arg3, sizeof(target_sigset_t), 1)))
+
+p = lock_user(VERIFY_READ, arg3, sizeof(target_siginfo_t), 1);
+if (!p) {
 goto efault;
+}
 target_to_host_siginfo(&uinfo, p);
 unlock_user(p, arg1, 0);
 ret = get_errno(sys_rt_sigqueueinfo(arg1, arg2, &uinfo));
-- 
2.1.4

[Qemu-devel] [PULL v2 08/24] user-exec: Remove unused code for OSX hosts

2016-06-28 Thread riku . voipio

From: Peter Maydell 

Since we dropped darwin-user support many years ago, the code in
user-exec to support hosts which define __APPLE__ is unused; delete it.

Reviewed-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
Signed-off-by: Riku Voipio 
Signed-off-by: Peter Maydell 
---
 user-exec.c | 47 +--
 1 file changed, 1 insertion(+), 46 deletions(-)

diff --git a/user-exec.c b/user-exec.c
index 1e2449e..95f9f97 100644
--- a/user-exec.c
+++ b/user-exec.c
@@ -117,14 +117,7 @@ static inline int handle_cpu_signal(uintptr_t pc, unsigned long address,
 
 #if defined(__i386__)
 
-#if defined(__APPLE__)
-#include 
-
-#define EIP_sig(context)  (*((unsigned long *)&(context)->uc_mcontext->ss.eip))
-#define TRAP_sig(context)((context)->uc_mcontext->es.trapno)
-#define ERROR_sig(context)   ((context)->uc_mcontext->es.err)
-#define MASK_sig(context)((context)->uc_sigmask)
-#elif defined(__NetBSD__)
+#if defined(__NetBSD__)
 #include 
 
 #define EIP_sig(context) ((context)->uc_mcontext.__gregs[_REG_EIP])
@@ -274,44 +267,6 @@ int cpu_signal_handler(int host_signum, void *pinfo,
 #define TRAP_sig(context)  ((context)->uc_mcontext.mc_exc)
 #endif /* __FreeBSD__|| __FreeBSD_kernel__ */
 
-#ifdef __APPLE__
-#include 
-typedef struct ucontext SIGCONTEXT;
-/* All Registers access - only for local access */
-#define REG_sig(reg_name, context)  \
-((context)->uc_mcontext->ss.reg_name)
-#define FLOATREG_sig(reg_name, context) \
-((context)->uc_mcontext->fs.reg_name)
-#define EXCEPREG_sig(reg_name, context) \
-((context)->uc_mcontext->es.reg_name)
-#define VECREG_sig(reg_name, context)   \
-((context)->uc_mcontext->vs.reg_name)
-/* Gpr Registers access */
-#define GPR_sig(reg_num, context)  REG_sig(r##reg_num, context)
-/* Program counter */
-#define IAR_sig(context)   REG_sig(srr0, context)
-/* Machine State Register (Supervisor) */
-#define MSR_sig(context)   REG_sig(srr1, context)
-#define CTR_sig(context)   REG_sig(ctr, context)
-/* Link register */
-#define XER_sig(context)   REG_sig(xer, context)
-/* User's integer exception register */
-#define LR_sig(context)REG_sig(lr, context)
-/* Condition register */
-#define CR_sig(context)REG_sig(cr, context)
-/* Float Registers access */
-#define FLOAT_sig(reg_num, context) \
-FLOATREG_sig(fpregs[reg_num], context)
-#define FPSCR_sig(context)  \
-((double)FLOATREG_sig(fpscr, context))
-/* Exception Registers access */
-/* Fault registers for coredump */
-#define DAR_sig(context)   EXCEPREG_sig(dar, context)
-#define DSISR_sig(context) EXCEPREG_sig(dsisr, context)
-/* number of powerpc exception taken */
-#define TRAP_sig(context)  EXCEPREG_sig(exception, context)
-#endif /* __APPLE__ */
-
 int cpu_signal_handler(int host_signum, void *pinfo,
void *puc)
 {
-- 
2.1.4

[Qemu-devel] [PULL v2 13/24] linux-user: add socket() strace

2016-06-28 Thread riku . voipio

From: Laurent Vivier 

Signed-off-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
Reviewed-by: Peter Maydell 
---
 linux-user/strace.c| 23 +++
 linux-user/strace.list |  2 +-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index c8df76f..95f4338 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -1227,6 +1227,29 @@ print__llseek(const struct syscallname *name,
 }
 #endif
 
+#if defined(TARGET_NR_socket)
+static void
+print_socket(const struct syscallname *name,
+ abi_long arg0, abi_long arg1, abi_long arg2,
+ abi_long arg3, abi_long arg4, abi_long arg5)
+{
+abi_ulong domain = arg0, type = arg1, protocol = arg2;
+
+print_syscall_prologue(name);
+print_socket_domain(domain);
+gemu_log(",");
+print_socket_type(type);
+gemu_log(",");
+if (domain == AF_PACKET ||
+(domain == AF_INET && type == TARGET_SOCK_PACKET)) {
+protocol = tswap16(protocol);
+}
+print_socket_protocol(domain, type, protocol);
+print_syscall_epilogue(name);
+}
+
+#endif
+
 #if defined(TARGET_NR_socketcall)
 
 #define get_user_ualx(x, gaddr, idx) \
diff --git a/linux-user/strace.list b/linux-user/strace.list
index b379497..7c54dc6 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -1291,7 +1291,7 @@
 { TARGET_NR_sigsuspend, "sigsuspend" , NULL, NULL, NULL },
 #endif
 #ifdef TARGET_NR_socket
-{ TARGET_NR_socket, "socket" , NULL, NULL, NULL },
+{ TARGET_NR_socket, "socket" , NULL, print_socket, NULL },
 #endif
 #ifdef TARGET_NR_socketcall
 { TARGET_NR_socketcall, "socketcall" , NULL, print_socketcall, NULL },
-- 
2.1.4

[Qemu-devel] [PULL v2 09/24] linux-user: Create a hostdep.h for each host architecture

2016-06-28 Thread riku . voipio

From: Peter Maydell 

In commit 4d330cee37a21 a new hostdep.h file was added, with the intent
that host architectures which needed one could provide it, and the
build system would automatically fall back to a generic version if
there was no version for the host architecture. Although this works,
it has a flaw: if a subsequent commit switches an architecture from
"uses generic/hostdep.h" to "uses its own hostdep.h" nothing in the
makefile dependencies notices this and so doing a rebuild without
a manual 'make clean' will fail.

So we drop the idea of having a 'generic' version in favour of
every architecture we support having its own hostdep.h, even if
it doesn't have anything in it. (There are only thirteen of these.)

If the dependency files claim that an object file depends on a
nonexistent file, our dependency system means that make will
rebuild the object file, and regenerate the dependencies in
the process. So moving between trees prior to this commit and
trees after this commit works without requiring a 'make clean'.

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 Makefile.target   |  5 +
 linux-user/host/aarch64/hostdep.h | 15 +++
 linux-user/host/arm/hostdep.h | 15 +++
 linux-user/host/generic/hostdep.h | 20 
 linux-user/host/i386/hostdep.h| 15 +++
 linux-user/host/ia64/hostdep.h| 15 +++
 linux-user/host/mips/hostdep.h| 15 +++
 linux-user/host/ppc/hostdep.h | 15 +++
 linux-user/host/ppc64/hostdep.h   | 15 +++
 linux-user/host/s390/hostdep.h| 15 +++
 linux-user/host/s390x/hostdep.h   | 15 +++
 linux-user/host/sparc/hostdep.h   | 15 +++
 linux-user/host/sparc64/hostdep.h | 15 +++
 linux-user/host/x32/hostdep.h | 15 +++
 14 files changed, 181 insertions(+), 24 deletions(-)
 create mode 100644 linux-user/host/aarch64/hostdep.h
 create mode 100644 linux-user/host/arm/hostdep.h
 delete mode 100644 linux-user/host/generic/hostdep.h
 create mode 100644 linux-user/host/i386/hostdep.h
 create mode 100644 linux-user/host/ia64/hostdep.h
 create mode 100644 linux-user/host/mips/hostdep.h
 create mode 100644 linux-user/host/ppc/hostdep.h
 create mode 100644 linux-user/host/ppc64/hostdep.h
 create mode 100644 linux-user/host/s390/hostdep.h
 create mode 100644 linux-user/host/s390x/hostdep.h
 create mode 100644 linux-user/host/sparc/hostdep.h
 create mode 100644 linux-user/host/sparc64/hostdep.h
 create mode 100644 linux-user/host/x32/hostdep.h

diff --git a/Makefile.target b/Makefile.target
index d720b3e..a440bcb 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -108,11 +108,8 @@ obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
 
 ifdef CONFIG_LINUX_USER
 
-# Note that we only add linux-user/host/$ARCH if it exists, and
-# that it must come before linux-user/host/generic in the search path.
 QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) \
- $(patsubst %,-I%,$(wildcard $(SRC_PATH)/linux-user/host/$(ARCH))) 
\
- -I$(SRC_PATH)/linux-user/host/generic \
+ -I$(SRC_PATH)/linux-user/host/$(ARCH) \
  -I$(SRC_PATH)/linux-user
 
 obj-y += linux-user/
diff --git a/linux-user/host/aarch64/hostdep.h 
b/linux-user/host/aarch64/hostdep.h
new file mode 100644
index 000..7609bf5
--- /dev/null
+++ b/linux-user/host/aarch64/hostdep.h
@@ -0,0 +1,15 @@
+/*
+ * hostdep.h : things which are dependent on the host architecture
+ *
+ *  * Written by Peter Maydell 
+ *
+ * Copyright (C) 2016 Linaro Limited
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_HOSTDEP_H
+#define QEMU_HOSTDEP_H
+
+#endif
diff --git a/linux-user/host/arm/hostdep.h b/linux-user/host/arm/hostdep.h
new file mode 100644
index 000..7609bf5
--- /dev/null
+++ b/linux-user/host/arm/hostdep.h
@@ -0,0 +1,15 @@
+/*
+ * hostdep.h : things which are dependent on the host architecture
+ *
+ *  * Written by Peter Maydell 
+ *
+ * Copyright (C) 2016 Linaro Limited
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_HOSTDEP_H
+#define QEMU_HOSTDEP_H
+
+#endif
diff --git a/linux-user/host/generic/hostdep.h 
b/linux-user/host/generic/hostdep.h
deleted file mode 100644
index cfabc35..000
--- a/linux-user/host/generic/hostdep.h
+++ /dev/null
@@ -1,20 +0,0 @@
-/*
- * hostdep.h : fallback generic version of header for things
- * which are dependent on the host architecture
- *
- *  * Written by Peter Maydell 
- *
- *

[Qemu-devel] [PULL v2 06/24] configure: Don't allow user-only targets for unknown CPU architectures

2016-06-28 Thread riku . voipio

From: Peter Maydell 

For the user-only targets, we need to know something about the host CPU
architecture even if we are using the TCI interpreter rather than TCG.
(In particular user-exec.c has code for handling signals that needs
to know about that host's context structures.)

Specifically forbid building the user-only targets on unknown CPU
architectures, rather than allowing them to configure but then fail
when building user-exec.c.

This change drops support for two configurations which were theoretically
possible before:
 * linux-user targets on M68K hosts using TCI
 * linux-user targets on HPPA hosts using TCI

We don't think anybody is actually trying to use these in practice, though:
 * interpreted TCG on a slow host CPU would be unusably slow
 * the m68k user-exec.c support is missing is_write detection so guest
   code which writes to the same page it is executing from was broken
   (will include any guest program using signals)
 * HPPA TCG backend support was dropped two and a half years ago
   with no complaints

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 configure | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/configure b/configure
index 6696316..dce20f0 100755
--- a/configure
+++ b/configure
@@ -1216,6 +1216,13 @@ esac
 QEMU_CFLAGS="$CPU_CFLAGS $QEMU_CFLAGS"
 EXTRA_CFLAGS="$CPU_CFLAGS $EXTRA_CFLAGS"
 
+# For user-mode emulation the host arch has to be one we explicitly
+# support, even if we're using TCI.
+if [ "$ARCH" = "unknown" ]; then
+  bsd_user="no"
+  linux_user="no"
+fi
+
 default_target_list=""
 
 mak_wilds=""
-- 
2.1.4

[Qemu-devel] [PULL v2 07/24] user-exec: Delete now-unused hppa and m68k cpu_signal_handler() code

2016-06-28 Thread riku . voipio

From: Peter Maydell 

Now that configure blocks attempts to build user-mode code on hppa
and m68k hosts, we can delete the cpu_signal_handler() implementations
for those architectures.

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 user-exec.c | 60 
 1 file changed, 60 deletions(-)

diff --git a/user-exec.c b/user-exec.c
index 50e95a6..1e2449e 100644
--- a/user-exec.c
+++ b/user-exec.c
@@ -494,24 +494,6 @@ int cpu_signal_handler(int host_signum, void *pinfo, void 
*puc)
  is_write, &uc->uc_sigmask);
 }
 
-#elif defined(__mc68000)
-
-int cpu_signal_handler(int host_signum, void *pinfo,
-   void *puc)
-{
-siginfo_t *info = pinfo;
-struct ucontext *uc = puc;
-unsigned long pc;
-int is_write;
-
-pc = uc->uc_mcontext.gregs[16];
-/* XXX: compute is_write */
-is_write = 0;
-return handle_cpu_signal(pc, (unsigned long)info->si_addr,
- is_write,
- &uc->uc_sigmask);
-}
-
 #elif defined(__ia64)
 
 #ifndef __ISR_VALID
@@ -616,48 +598,6 @@ int cpu_signal_handler(int host_signum, void *pinfo,
  is_write, &uc->uc_sigmask);
 }
 
-#elif defined(__hppa__)
-
-int cpu_signal_handler(int host_signum, void *pinfo,
-   void *puc)
-{
-siginfo_t *info = pinfo;
-struct ucontext *uc = puc;
-unsigned long pc = uc->uc_mcontext.sc_iaoq[0];
-uint32_t insn = *(uint32_t *)pc;
-int is_write = 0;
-
-/* XXX: need kernel patch to get write flag faster.  */
-switch (insn >> 26) {
-case 0x1a: /* STW */
-case 0x19: /* STH */
-case 0x18: /* STB */
-case 0x1b: /* STWM */
-is_write = 1;
-break;
-
-case 0x09: /* CSTWX, FSTWX, FSTWS */
-case 0x0b: /* CSTDX, FSTDX, FSTDS */
-/* Distinguish from coprocessor load ... */
-is_write = (insn >> 9) & 1;
-break;
-
-case 0x03:
-switch ((insn >> 6) & 15) {
-case 0xa: /* STWS */
-case 0x9: /* STHS */
-case 0x8: /* STBS */
-case 0xe: /* STWAS */
-case 0xc: /* STBYS */
-is_write = 1;
-}
-break;
-}
-
-return handle_cpu_signal(pc, (unsigned long)info->si_addr,
- is_write, &uc->uc_sigmask);
-}
-
 #else
 
 #error host CPU specific signal handler needed
-- 
2.1.4

[Qemu-devel] [PULL v2 02/24] linux-user: Use __get_user() and __put_user() to handle structs in do_fcntl()

2016-06-28 Thread riku . voipio

From: Peter Maydell 

Use __get_user() and __put_user() to handle reading and writing the
guest structures in do_fcntl(). This has two benefits:
 * avoids possible errors due to misaligned guest pointers
 * correctly sign extends signed fields (like l_start in struct flock)
   which might be different sizes between guest and host
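The two benefits listed above can be sketched in plain C. This is only an illustration under stated assumptions, not QEMU's __get_user() implementation: the helper names load_s32_unaligned() and guest_off_to_host() are hypothetical, and byte swapping between guest and host is left out (guest and host are assumed to share endianness).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a signed 32-bit field from a guest buffer
 * that may not be 4-byte aligned. memcpy() avoids dereferencing a
 * misaligned int32_t pointer, which is undefined behaviour in C. */
static int32_t load_s32_unaligned(const void *guest_ptr)
{
    int32_t v;
    memcpy(&v, guest_ptr, sizeof(v));
    return v;
}

/* Hypothetical helper: widen a 32-bit guest offset to a 64-bit host
 * offset. Going through a signed type sign-extends the value, so a
 * guest -1 stays -1 instead of turning into 0xffffffff. */
static int64_t guest_off_to_host(const void *guest_ptr)
{
    return (int64_t)load_s32_unaligned(guest_ptr);
}
```

A caller would pass the address of the field inside the guest structure; the address may be odd, as in `guest_off_to_host(buf + 1)`.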

To do this we abstract the conversions out into copy_from/to_user functions. We
also standardize on always using host flock64 and the F_GETLK64
etc flock commands, as this means we always have 64 bit offsets
whether the host is 64-bit or 32-bit and we don't need to support
conversion to both host struct flock and struct flock64.

In passing we fix errors in converting l_type from the host to
the target (where we were doing a byteswap of the host value
before trying to do the convert-bitmasks operation rather than
afterwards, and inexplicably shifting left by 1); these were
accidentally left over when the original simple "just shift by 1"
arm<->x86 conversion of commit 43f238d was changed to the more
general scheme of using target_to_host_bitmask() functions in 2ba7f73.
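Why the order of the two steps matters can be shown with a small sketch. Everything here is hypothetical and for illustration only: the table layout, the made-up encodings and the names host_to_target_sketch(), bswap16_sketch() and store_for_swapped_guest() are not QEMU's, and the guest is assumed to have the opposite byte order to the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical translation entry: if (value & host_mask) == host_bits,
 * the guest sees target_bits. */
struct xlat {
    unsigned host_mask;
    unsigned host_bits;
    unsigned target_bits;
};

/* Made-up encodings, chosen only so that host and guest differ. */
static const struct xlat lock_type_tbl[] = {
    { 0x3, 0x0, 0x1 },
    { 0x3, 0x1, 0x2 },
    { 0x3, 0x2, 0x3 },
};

static uint16_t bswap16_sketch(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Translate a value that is still in host byte order. */
static unsigned host_to_target_sketch(unsigned host_val)
{
    unsigned out = 0;
    size_t n = sizeof(lock_type_tbl) / sizeof(lock_type_tbl[0]);

    for (size_t i = 0; i < n; i++) {
        if ((host_val & lock_type_tbl[i].host_mask) ==
            lock_type_tbl[i].host_bits) {
            out |= lock_type_tbl[i].target_bits;
        }
    }
    return out;
}

/* Correct order: translate first, byte swap last. */
static uint16_t store_for_swapped_guest(unsigned host_val)
{
    return bswap16_sketch((uint16_t)host_to_target_sketch(host_val));
}
```

Swapping first feeds the table a value whose low bits no longer mean anything, so the lookup matches the wrong entry; translating first and swapping last gives the guest the intended encoding.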

[RV: fixed ifdef guard for eabi functions]
Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 298 ---
 1 file changed, 166 insertions(+), 132 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1c17b74..5c0d111 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5541,11 +5541,11 @@ static int target_to_host_fcntl_cmd(int cmd)
case TARGET_F_SETFL:
 return cmd;
 case TARGET_F_GETLK:
-   return F_GETLK;
-   case TARGET_F_SETLK:
-   return F_SETLK;
-   case TARGET_F_SETLKW:
-   return F_SETLKW;
+return F_GETLK64;
+case TARGET_F_SETLK:
+return F_SETLK64;
+case TARGET_F_SETLKW:
+return F_SETLKW64;
case TARGET_F_GETOWN:
return F_GETOWN;
case TARGET_F_SETOWN:
@@ -5596,12 +5596,134 @@ static const bitmask_transtbl flock_tbl[] = {
 { 0, 0, 0, 0 }
 };
 
-static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
+static inline abi_long copy_from_user_flock(struct flock64 *fl,
+abi_ulong target_flock_addr)
 {
-struct flock fl;
 struct target_flock *target_fl;
+short l_type;
+
+if (!lock_user_struct(VERIFY_READ, target_fl, target_flock_addr, 1)) {
+return -TARGET_EFAULT;
+}
+
+__get_user(l_type, &target_fl->l_type);
+fl->l_type = target_to_host_bitmask(l_type, flock_tbl);
+__get_user(fl->l_whence, &target_fl->l_whence);
+__get_user(fl->l_start, &target_fl->l_start);
+__get_user(fl->l_len, &target_fl->l_len);
+__get_user(fl->l_pid, &target_fl->l_pid);
+unlock_user_struct(target_fl, target_flock_addr, 0);
+return 0;
+}
+
+static inline abi_long copy_to_user_flock(abi_ulong target_flock_addr,
+  const struct flock64 *fl)
+{
+struct target_flock *target_fl;
+short l_type;
+
+if (!lock_user_struct(VERIFY_WRITE, target_fl, target_flock_addr, 0)) {
+return -TARGET_EFAULT;
+}
+
+l_type = host_to_target_bitmask(fl->l_type, flock_tbl);
+__put_user(l_type, &target_fl->l_type);
+__put_user(fl->l_whence, &target_fl->l_whence);
+__put_user(fl->l_start, &target_fl->l_start);
+__put_user(fl->l_len, &target_fl->l_len);
+__put_user(fl->l_pid, &target_fl->l_pid);
+unlock_user_struct(target_fl, target_flock_addr, 1);
+return 0;
+}
+
+typedef abi_long from_flock64_fn(struct flock64 *fl, abi_ulong target_addr);
+typedef abi_long to_flock64_fn(abi_ulong target_addr, const struct flock64 
*fl);
+
+#if defined(TARGET_ARM) && TARGET_ABI_BITS == 32
+static inline abi_long copy_from_user_eabi_flock64(struct flock64 *fl,
+   abi_ulong target_flock_addr)
+{
+struct target_eabi_flock64 *target_fl;
+short l_type;
+
+if (!lock_user_struct(VERIFY_READ, target_fl, target_flock_addr, 1)) {
+return -TARGET_EFAULT;
+}
+
+__get_user(l_type, &target_fl->l_type);
+fl->l_type = target_to_host_bitmask(l_type, flock_tbl);
+__get_user(fl->l_whence, &target_fl->l_whence);
+__get_user(fl->l_start, &target_fl->l_start);
+__get_user(fl->l_len, &target_fl->l_len);
+__get_user(fl->l_pid, &target_fl->l_pid);
+unlock_user_struct(target_fl, target_flock_addr, 0);
+return 0;
+}
+
+static inline abi_long copy_to_user_eabi_flock64(abi_ulong target_flock_addr,
+ const struct flock64 *fl)
+{
+struct target_eabi_flock64 *target_fl;
+short l_type;
+
+if (!lock_user_struct(VERIFY_WRITE, target_fl, target_flock_addr, 0)) {
+return -TARGET_EFAULT;
+}
+
+l_type = host_to_target_bitmask(fl->l_type, flock_tbl);
+__put_user(l_type, &target_fl->l_type);
+

[Qemu-devel] [PULL v2 05/24] configure: Don't override ARCH=unknown if enabling TCI

2016-06-28 Thread riku . voipio

From: Peter Maydell 

At the moment if configure finds an unknown CPU it will set
ARCH to 'unknown', and then later either bail out or set it
to 'tci' (depending on whether the user passed configure the
--enable-tcg-interpreter switch). This is unnecessarily
confusing, because we could be using TCI in two cases:
 * a known host architecture (in which case ARCH is set to
   the actual host architecture, like 'i386')
 * an unknown host architecture (in which case ARCH is
   set to 'tci')
so nothing can rely on ARCH=tci to mean "using TCI".
Remove the line setting ARCH, so we leave it as "unknown",
which is what the actual situation is.

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
Signed-off-by: Riku Voipio 
---
 configure | 1 -
 1 file changed, 1 deletion(-)

diff --git a/configure b/configure
index 5929aba..6696316 100755
--- a/configure
+++ b/configure
@@ -1380,7 +1380,6 @@ fi
 if test "$ARCH" = "unknown"; then
 if test "$tcg_interpreter" = "yes" ; then
 echo "Unsupported CPU = $cpu, will use TCG with TCI (experimental)"
-ARCH=tci
 else
 error_exit "Unsupported CPU = $cpu, try --enable-tcg-interpreter"
 fi
-- 
2.1.4

[Qemu-devel] [PULL v2 03/24] linux-user: Use safe_syscall wrapper for fcntl

2016-06-28 Thread riku . voipio

From: Peter Maydell 

Use the safe_syscall wrapper for fcntl. This is straightforward now
that we always use 'struct flock64' on the host, as we don't need
to select whether to call the host's fcntl64 or fcntl syscall
(a detail that the libc previously hid for us).

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
---
 linux-user/syscall.c | 34 +++---
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 5c0d111..3dfaea9 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -783,6 +783,16 @@ safe_syscall5(int, mq_timedreceive, int, mqdes, char *, 
msg_ptr,
  * the libc function.
  */
 #define safe_ioctl(...) safe_syscall(__NR_ioctl, __VA_ARGS__)
+/* Similarly for fcntl. Note that callers must always:
+ *  pass the F_GETLK64 etc constants rather than the unsuffixed F_GETLK
+ *  use the flock64 struct rather than unsuffixed flock
+ * This will then work and use a 64-bit offset for both 32-bit and 64-bit 
hosts.
+ */
+#ifdef __NR_fcntl64
+#define safe_fcntl(...) safe_syscall(__NR_fcntl64, __VA_ARGS__)
+#else
+#define safe_fcntl(...) safe_syscall(__NR_fcntl, __VA_ARGS__)
+#endif
 
 static inline int host_to_target_sock_type(int host_type)
 {
@@ -5740,7 +5750,7 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 if (ret) {
 return ret;
 }
-ret = get_errno(fcntl(fd, host_cmd, ));
+ret = get_errno(safe_fcntl(fd, host_cmd, ));
 if (ret == 0) {
 ret = copy_to_user_flock(arg, );
 }
@@ -5752,7 +5762,7 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 if (ret) {
 return ret;
 }
-ret = get_errno(fcntl(fd, host_cmd, ));
+ret = get_errno(safe_fcntl(fd, host_cmd, ));
 break;
 
 case TARGET_F_GETLK64:
@@ -5760,7 +5770,7 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 if (ret) {
 return ret;
 }
-ret = get_errno(fcntl(fd, host_cmd, ));
+ret = get_errno(safe_fcntl(fd, host_cmd, ));
 if (ret == 0) {
 ret = copy_to_user_flock64(arg, );
 }
@@ -5771,23 +5781,25 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 if (ret) {
 return ret;
 }
-ret = get_errno(fcntl(fd, host_cmd, ));
+ret = get_errno(safe_fcntl(fd, host_cmd, ));
 break;
 
 case TARGET_F_GETFL:
-ret = get_errno(fcntl(fd, host_cmd, arg));
+ret = get_errno(safe_fcntl(fd, host_cmd, arg));
 if (ret >= 0) {
 ret = host_to_target_bitmask(ret, fcntl_flags_tbl);
 }
 break;
 
 case TARGET_F_SETFL:
-ret = get_errno(fcntl(fd, host_cmd, target_to_host_bitmask(arg, 
fcntl_flags_tbl)));
+ret = get_errno(safe_fcntl(fd, host_cmd,
+   target_to_host_bitmask(arg,
+  fcntl_flags_tbl)));
 break;
 
 #ifdef F_GETOWN_EX
 case TARGET_F_GETOWN_EX:
-ret = get_errno(fcntl(fd, host_cmd, ));
+ret = get_errno(safe_fcntl(fd, host_cmd, &fox));
 if (ret >= 0) {
 if (!lock_user_struct(VERIFY_WRITE, target_fox, arg, 0))
 return -TARGET_EFAULT;
@@ -5805,7 +5817,7 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 fox.type = tswap32(target_fox->type);
 fox.pid = tswap32(target_fox->pid);
 unlock_user_struct(target_fox, arg, 0);
-ret = get_errno(fcntl(fd, host_cmd, ));
+ret = get_errno(safe_fcntl(fd, host_cmd, &fox));
 break;
 #endif
 
@@ -5815,11 +5827,11 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 case TARGET_F_GETSIG:
 case TARGET_F_SETLEASE:
 case TARGET_F_GETLEASE:
-ret = get_errno(fcntl(fd, host_cmd, arg));
+ret = get_errno(safe_fcntl(fd, host_cmd, arg));
 break;
 
 default:
-ret = get_errno(fcntl(fd, cmd, arg));
+ret = get_errno(safe_fcntl(fd, cmd, arg));
 break;
 }
 return ret;
@@ -10252,7 +10264,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 if (ret) {
 break;
 }
-ret = get_errno(fcntl(arg1, cmd, ));
+ret = get_errno(safe_fcntl(arg1, cmd, ));
break;
 default:
 ret = do_fcntl(arg1, arg2, arg3);
-- 
2.1.4

[Qemu-devel] [PULL v2 04/24] linux-user: Don't use sigfillset() on uc->uc_sigmask

2016-06-28 Thread riku . voipio

From: Peter Maydell 

The kernel and libc have different ideas about what a sigset_t
is -- for the kernel it is only _NSIG / 8 bytes in size (usually
8 bytes), but for libc it is much larger, 128 bytes. In most
situations the difference doesn't matter, because if you pass a
pointer to a libc sigset_t to the kernel it just acts on the first
8 bytes of it, but for the ucontext_t* argument to a signal handler
it trips us up. The kernel allocates this ucontext_t on the stack
according to its idea of the sigset_t type, but the type of the
ucontext_t defined by the libc headers uses the libc type, and
so do the manipulator functions like sigfillset(). This means that
 (1) sizeof(uc->uc_sigmask) is much larger than the actual
 space used on the stack
 (2) sigfillset(&uc->uc_sigmask) will write garbage 0xff bytes
 off the end of the structure, which can trash data that
 was on the stack before the signal handler was invoked,
 and may result in a crash after the handler returns

To avoid this, we use a memset() of the correct size to fill
the signal mask rather than using the libc function.
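The fix can be modelled with a small self-contained sketch. The struct below is a hypothetical stand-in for what the kernel places on the stack, and KERNEL_SIGSET_BYTES is an assumption (8 bytes, matching the "usually 8 bytes" case described above); neither name comes from QEMU.

```c
#include <assert.h>
#include <string.h>

/* Assumption: the kernel's signal mask is 8 bytes, as on typical Linux
 * hosts where _NSIG / 8 == 8. */
#define KERNEL_SIGSET_BYTES 8

/* Hypothetical stand-in for the kernel's layout: a kernel-sized mask
 * followed directly by unrelated data. */
struct kernel_frame_sketch {
    unsigned char sigmask[KERNEL_SIGSET_BYTES];
    unsigned char other_data[16];
};

/* Block every signal by filling only the bytes the kernel really
 * allocated, rather than a libc-sized (for example 128-byte) sigset_t. */
static void block_all_signals(struct kernel_frame_sketch *frame)
{
    memset(frame->sigmask, 0xff, KERNEL_SIGSET_BYTES);
}
```

Filling exactly KERNEL_SIGSET_BYTES leaves other_data untouched, which is the property that a libc-sized fill would violate.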

This fixes a problem where we would crash at least some of the
time on an i386 host when a signal was taken.

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Signed-off-by: Riku Voipio 
---
 linux-user/qemu.h|  5 +
 linux-user/signal.c  | 10 +-
 linux-user/syscall.c |  5 -
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 56f29c3..e8a5aed 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -20,6 +20,11 @@
 
 #define THREAD __thread
 
+/* This is the size of the host kernel's sigset_t, needed where we make
+ * direct system calls that take a sigset_t pointer and a size.
+ */
+#define SIGSET_T_SIZE (_NSIG / 8)
+
 /* This struct is used to hold certain information about the image.
  * Basically, it replicates in user space what would be certain
  * task_struct fields in the kernel
diff --git a/linux-user/signal.c b/linux-user/signal.c
index e2d55ff..9d98045 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -636,8 +636,16 @@ static void host_signal_handler(int host_signum, siginfo_t 
*info,
  * code in case the guest code provokes one in the window between
  * now and it getting out to the main loop. Signals will be
  * unblocked again in process_pending_signals().
+ *
+ * WARNING: we cannot use sigfillset() here because the uc_sigmask
+ * field is a kernel sigset_t, which is much smaller than the
+ * libc sigset_t which sigfillset() operates on. Using sigfillset()
+ * would write 0xff bytes off the end of the structure and trash
+ * data on the struct.
+ * We can't use sizeof(uc->uc_sigmask) either, because the libc
+ * headers define the struct field with the wrong (too large) type.
  */
-sigfillset(&uc->uc_sigmask);
+memset(&uc->uc_sigmask, 0xff, SIGSET_T_SIZE);
 sigdelset(&uc->uc_sigmask, SIGSEGV);
 sigdelset(&uc->uc_sigmask, SIGBUS);
 
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 3dfaea9..5166ff9 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -123,11 +123,6 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
 #define VFAT_IOCTL_READDIR_BOTH _IOR('r', 1, struct linux_dirent [2])
 #define VFAT_IOCTL_READDIR_SHORT _IOR('r', 2, struct linux_dirent [2])
 
-/* This is the size of the host kernel's sigset_t, needed where we make
- * direct system calls that take a sigset_t pointer and a size.
- */
-#define SIGSET_T_SIZE (_NSIG / 8)
-
 #undef _syscall0
 #undef _syscall1
 #undef _syscall2
-- 
2.1.4

[Qemu-devel] [PULL v2 01/24] linux-user: Avoid possible misalignment in host_to_target_siginfo()

2016-06-28 Thread riku . voipio

From: Peter Maydell 

host_to_target_siginfo() is implemented by a combination of
host_to_target_siginfo_noswap() followed by tswap_siginfo().
The first of these two functions assumes that the target_siginfo_t
it is writing to is correctly aligned, but the pointer passed
into host_to_target_siginfo() is directly from the guest and
might be misaligned. Use a local variable to avoid this problem.
(tswap_siginfo() does now correctly handle a misaligned destination.)
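The "local variable" pattern described above can be sketched as follows. The struct is a hypothetical, much-reduced stand-in for target_siginfo_t, the function names are invented for illustration, and a plain memcpy() stands in for the final alignment-safe write that tswap_siginfo() performs in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for target_siginfo_t. */
struct info_sketch {
    int32_t signo;
    int32_t err;
    int32_t code;
};

/* Fills the record through ordinary member stores, so 'out' must be
 * correctly aligned. */
static void fill_aligned(struct info_sketch *out, int32_t signo, int32_t code)
{
    memset(out, 0, sizeof(*out));
    out->signo = signo;
    out->err = 0;
    out->code = code;
}

/* guest_dst comes straight from the guest and may be misaligned, so the
 * record is built in an aligned local and then copied out bytewise. */
static void write_info_to_guest(void *guest_dst, int32_t signo, int32_t code)
{
    struct info_sketch tmp;

    fill_aligned(&tmp, signo, code);
    memcpy(guest_dst, &tmp, sizeof(tmp));
}
```

Only the last step touches the guest pointer, and it does so without assuming any alignment.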

We have to add a memset() to host_to_target_siginfo_noswap()
to avoid some false positive "may be used uninitialized" warnings
from gcc about subfields of the _sifields union if it chooses to
inline both tswap_siginfo() and host_to_target_siginfo_noswap()
into host_to_target_siginfo().

Signed-off-by: Peter Maydell 
Reviewed-by: Laurent Vivier 
Signed-off-by: Peter Maydell 
---
 linux-user/signal.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 1dadddf..e2d55ff 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -278,6 +278,14 @@ static inline void 
host_to_target_siginfo_noswap(target_siginfo_t *tinfo,
 tinfo->si_errno = 0;
 tinfo->si_code = info->si_code;
 
+/* This memset serves two purposes:
+ * (1) ensure we don't leak random junk to the guest later
+ * (2) placate false positives from gcc about fields
+ * being used uninitialized if it chooses to inline both this
+ * function and tswap_siginfo() into host_to_target_siginfo().
+ */
+memset(tinfo->_sifields._pad, 0, sizeof(tinfo->_sifields._pad));
+
 /* This is awkward, because we have to use a combination of
  * the si_code and si_signo to figure out which of the union's
  * members are valid. (Within the host kernel it is always possible
@@ -397,8 +405,9 @@ static void tswap_siginfo(target_siginfo_t *tinfo,
 
 void host_to_target_siginfo(target_siginfo_t *tinfo, const siginfo_t *info)
 {
-host_to_target_siginfo_noswap(tinfo, info);
-tswap_siginfo(tinfo, tinfo);
+target_siginfo_t tgt_tmp;
+host_to_target_siginfo_noswap(&tgt_tmp, info);
+tswap_siginfo(tinfo, &tgt_tmp);
 }
 
 /* XXX: we support only POSIX RT signals are used. */
-- 
2.1.4

[Qemu-devel] [PATCH v2 8/8] ppc/xics: Split ICS into ics-base and ics class

2016-06-28 Thread Nikunj A Dadhania

From: Benjamin Herrenschmidt 

The existing implementation remains the same and ics-base is introduced. The
type name "ics" is retained, and all the related functions are renamed to
ics_simple_*.

This will allow different implementations for the source controllers
such as the MSI support of PHB3 on Power8 which uses in-memory state
tables for example.
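The base/derived split can be pictured as a method table with optional hooks. The sketch below is plain C rather than QOM, and every name in it (source_sketch, source_class_sketch, simple_class, no_eoi_class) is hypothetical; it only mirrors the idea that the base layer dispatches to a hook when the concrete class provides one.

```c
#include <assert.h>
#include <stddef.h>

struct source_sketch;

/* Hypothetical per-class method table with optional hooks. */
struct source_class_sketch {
    void (*reject)(struct source_sketch *s, int nr);
    void (*eoi)(struct source_sketch *s, int nr);
};

struct source_sketch {
    const struct source_class_sketch *klass;
    int last_rejected;
    int last_eoi;
};

/* Base-layer entry points: call the hook only if the class has one. */
static void source_reject(struct source_sketch *s, int nr)
{
    if (s->klass->reject) {
        s->klass->reject(s, nr);
    }
}

static void source_eoi(struct source_sketch *s, int nr)
{
    if (s->klass->eoi) {
        s->klass->eoi(s, nr);
    }
}

/* A "simple" implementation that just records what it was asked. */
static void simple_reject(struct source_sketch *s, int nr)
{
    s->last_rejected = nr;
}

static void simple_eoi(struct source_sketch *s, int nr)
{
    s->last_eoi = nr;
}

static const struct source_class_sketch simple_class = {
    .reject = simple_reject,
    .eoi = simple_eoi,
};

/* A different implementation that leaves the eoi hook unset. */
static const struct source_class_sketch no_eoi_class = {
    .reject = simple_reject,
    .eoi = NULL,
};
```

Callers go through source_reject() and source_eoi() only, so a new source controller can be added by supplying another method table without touching them.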

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Nikunj A Dadhania 
---
 hw/intc/trace-events  |  10 ++--
 hw/intc/xics.c| 146 +++---
 hw/intc/xics_kvm.c|  10 ++--
 hw/intc/xics_spapr.c  |  28 +-
 include/hw/ppc/xics.h |  23 +---
 5 files changed, 132 insertions(+), 85 deletions(-)

diff --git a/hw/intc/trace-events b/hw/intc/trace-events
index 5f0f783..e5e7ec7 100644
--- a/hw/intc/trace-events
+++ b/hw/intc/trace-events
@@ -50,12 +50,12 @@ xics_icp_accept(uint32_t old_xirr, uint32_t new_xirr) 
"icp_accept: XIRR %#"PRIx3
 xics_icp_eoi(int server, uint32_t xirr, uint32_t new_xirr) "icp_eoi: server %d 
given XIRR %#"PRIx32" new XIRR %#"PRIx32
 xics_icp_irq(int server, int nr, uint8_t priority) "cpu %d trying to deliver 
irq %#"PRIx32" priority %#x"
 xics_icp_raise(uint32_t xirr, uint8_t pending_priority) "raising IRQ new 
XIRR=%#x new pending priority=%#x"
-xics_set_irq_msi(int srcno, int nr) "set_irq_msi: srcno %d [irq %#x]"
+xics_ics_simple_set_irq_msi(int srcno, int nr) "set_irq_msi: srcno %d [irq 
%#x]"
 xics_masked_pending(void) "set_irq_msi: masked pending"
-xics_set_irq_lsi(int srcno, int nr) "set_irq_lsi: srcno %d [irq %#x]"
-xics_ics_write_xive(int nr, int srcno, int server, uint8_t priority) 
"ics_write_xive: irq %#x [src %d] server %#x prio %#x"
-xics_ics_reject(int nr, int srcno) "reject irq %#x [src %d]"
-xics_ics_eoi(int nr) "ics_eoi: irq %#x"
+xics_ics_simple_set_irq_lsi(int srcno, int nr) "set_irq_lsi: srcno %d [irq 
%#x]"
+xics_ics_simple_write_xive(int nr, int srcno, int server, uint8_t priority) 
"ics_write_xive: irq %#x [src %d] server %#x prio %#x"
+xics_ics_simple_reject(int nr, int srcno) "reject irq %#x [src %d]"
+xics_ics_simple_eoi(int nr) "ics_eoi: irq %#x"
 xics_alloc(int irq) "irq %d"
 xics_alloc_block(int first, int num, bool lsi, int align) "first irq %d, %d 
irqs, lsi=%d, alignnum %d"
 xics_ics_free(int src, int irq, int num) "Source#%d, first irq %d, %d irqs"
diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index bbdba84..39928d9 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -112,7 +112,7 @@ void xics_add_ics(XICSState *xics)
 {
 ICSState *ics;
 
-ics = ICS(object_new(TYPE_ICS));
+ics = ICS_SIMPLE(object_new(TYPE_ICS_SIMPLE));
 object_property_add_child(OBJECT(xics), "ics", OBJECT(ics), NULL);
 ics->xics = xics;
 QLIST_INSERT_HEAD(&xics->ics, ics, list);
@@ -223,9 +223,32 @@ static const TypeInfo xics_common_info = {
 #define XISR(ss)   (((ss)->xirr) & XISR_MASK)
 #define CPPR(ss)   (((ss)->xirr) >> 24)
 
-static void ics_reject(ICSState *ics, int nr);
-static void ics_resend(ICSState *ics);
-static void ics_eoi(ICSState *ics, int nr);
+static void ics_reject(ICSState *ics, uint32_t nr)
+{
+ICSStateClass *k = ICS_GET_CLASS(ics);
+
+if (k->reject) {
+k->reject(ics, nr);
+}
+}
+
+static void ics_resend(ICSState *ics)
+{
+ICSStateClass *k = ICS_GET_CLASS(ics);
+
+if (k->resend) {
+k->resend(ics);
+}
+}
+
+static void ics_eoi(ICSState *ics, int nr)
+{
+ICSStateClass *k = ICS_GET_CLASS(ics);
+
+if (k->eoi) {
+k->eoi(ics, nr);
+}
+}
 
 static void icp_check_ipi(ICPState *ss)
 {
@@ -428,7 +451,7 @@ static const TypeInfo icp_info = {
 /*
  * ICS: Source layer
  */
-static void resend_msi(ICSState *ics, int srcno)
+static void ics_simple_resend_msi(ICSState *ics, int srcno)
 {
 ICSIRQState *irq = ics->irqs + srcno;
 
@@ -441,7 +464,7 @@ static void resend_msi(ICSState *ics, int srcno)
 }
 }
 
-static void resend_lsi(ICSState *ics, int srcno)
+static void ics_simple_resend_lsi(ICSState *ics, int srcno)
 {
 ICSIRQState *irq = ics->irqs + srcno;
 
@@ -453,11 +476,11 @@ static void resend_lsi(ICSState *ics, int srcno)
 }
 }
 
-static void set_irq_msi(ICSState *ics, int srcno, int val)
+static void ics_simple_set_irq_msi(ICSState *ics, int srcno, int val)
 {
 ICSIRQState *irq = ics->irqs + srcno;
 
-trace_xics_set_irq_msi(srcno, srcno + ics->offset);
+trace_xics_ics_simple_set_irq_msi(srcno, srcno + ics->offset);
 
 if (val) {
 if (irq->priority == 0xff) {
@@ -469,31 +492,31 @@ static void set_irq_msi(ICSState *ics, int srcno, int val)
 }
 }
 
-static void set_irq_lsi(ICSState *ics, int srcno, int val)
+static void ics_simple_set_irq_lsi(ICSState *ics, int srcno, int val)
 {
 ICSIRQState *irq = ics->irqs + srcno;
 
-trace_xics_set_irq_lsi(srcno, srcno + ics->offset);
+trace_xics_ics_simple_set_irq_lsi(srcno, srcno + ics->offset);
 if (val) {
 irq->status

[Qemu-devel] [PULL v2 00/24] linux-user changes for v2.7

2016-06-28 Thread riku . voipio

From: Riku Voipio <riku.voi...@linaro.org>

The following changes since commit c7288767523f6510cf557707d3eb5e78e519b90d:

  Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160623' into 
staging (2016-06-23 11:53:14 +0100)

are available in the git repository at:

  git://git.linaro.org/people/riku.voipio/qemu.git tags/pull-linux-user-20160628

for you to fetch changes up to 4ba92cd736a9ce0dc83c9b16a75d24d385e1cdf3:

  linux-user: Provide safe_syscall for ppc64 (2016-06-26 13:17:22 +0300)


Drop building linux-user targets on HPPA or m68k host systems
and add safe_syscall support for i386, aarch64, arm, ppc64 and
s390x.



Laurent Vivier (7):
  linux-user: add socketcall() strace
  linux-user: add socket() strace
  linux-user: fix clone() strace
  linux-user: update get_thread_area/set_thread_area strace
  linux-user: add missing return in netlink switch statement
  linux-user: fd_trans_host_to_target_data() must process only received
data
  linux-user: don't swap NLMSG_DATA() fields

Peter Maydell (11):
  linux-user: Avoid possible misalignment in host_to_target_siginfo()
  linux-user: Use __get_user() and __put_user() to handle structs in
do_fcntl()
  linux-user: Use safe_syscall wrapper for fcntl
  linux-user: Don't use sigfillset() on uc->uc_sigmask
  configure: Don't override ARCH=unknown if enabling TCI
  configure: Don't allow user-only targets for unknown CPU architectures
  user-exec: Delete now-unused hppa and m68k cpu_signal_handler() code
  user-exec: Remove unused code for OSX hosts
  linux-user: Create a hostdep.h for each host architecture
  linux-user: Fix wrong type used for argument to rt_sigqueueinfo
  linux-user: Support F_GETPIPE_SZ and F_SETPIPE_SZ fcntls

Richard Henderson (6):
  linux-user: fix x86_64 safe_syscall
  linux-user: Provide safe_syscall for i386
  linux-user: Provide safe_syscall for arm
  linux-user: Provide safe_syscall for aarch64
  linux-user: Provide safe_syscall for s390x
  linux-user: Provide safe_syscall for ppc64

 Makefile.target|   5 +-
 configure  |   8 +-
 linux-user/host/aarch64/hostdep.h  |  38 ++
 linux-user/host/aarch64/safe-syscall.inc.S |  75 
 linux-user/host/arm/hostdep.h  |  38 ++
 linux-user/host/arm/safe-syscall.inc.S |  90 +
 linux-user/host/generic/hostdep.h  |  20 -
 linux-user/host/i386/hostdep.h |  38 ++
 linux-user/host/i386/safe-syscall.inc.S| 112 ++
 linux-user/host/ia64/hostdep.h |  15 +
 linux-user/host/mips/hostdep.h |  15 +
 linux-user/host/ppc/hostdep.h  |  15 +
 linux-user/host/ppc64/hostdep.h|  38 ++
 linux-user/host/ppc64/safe-syscall.inc.S   |  92 +
 linux-user/host/s390/hostdep.h |  15 +
 linux-user/host/s390x/hostdep.h|  38 ++
 linux-user/host/s390x/safe-syscall.inc.S   |  90 +
 linux-user/host/sparc/hostdep.h|  15 +
 linux-user/host/sparc64/hostdep.h  |  15 +
 linux-user/host/x32/hostdep.h  |  15 +
 linux-user/host/x86_64/safe-syscall.inc.S  |   6 +-
 linux-user/qemu.h  |   5 +
 linux-user/signal.c|  23 +-
 linux-user/strace.c| 621 -
 linux-user/strace.list |  10 +-
 linux-user/syscall.c   | 419 ++-
 linux-user/syscall_defs.h  |  24 +-
 user-exec.c| 107 +
 28 files changed, 1657 insertions(+), 345 deletions(-)
 create mode 100644 linux-user/host/aarch64/hostdep.h
 create mode 100644 linux-user/host/aarch64/safe-syscall.inc.S
 create mode 100644 linux-user/host/arm/hostdep.h
 create mode 100644 linux-user/host/arm/safe-syscall.inc.S
 delete mode 100644 linux-user/host/generic/hostdep.h
 create mode 100644 linux-user/host/i386/hostdep.h
 create mode 100644 linux-user/host/i386/safe-syscall.inc.S
 create mode 100644 linux-user/host/ia64/hostdep.h
 create mode 100644 linux-user/host/mips/hostdep.h
 create mode 100644 linux-user/host/ppc/hostdep.h
 create mode 100644 linux-user/host/ppc64/hostdep.h
 create mode 100644 linux-user/host/ppc64/safe-syscall.inc.S
 create mode 100644 linux-user/host/s390/hostdep.h
 create mode 100644 linux-user/host/s390x/hostdep.h
 create mode 100644 linux-user/host/s390x/safe-syscall.inc.S
 create mode 100644 linux-user/host/sparc/hostdep.h
 create mode 100644 linux-user/host/sparc64/hostdep.h
 create mode 100644 linux-user/host/x32/hostdep.h

-- 
2.1.4

[Qemu-devel] [PATCH v2 6/8] ppc/xics: An ICS with offset 0 is assumed to be uninitialized

2016-06-28 Thread Nikunj A Dadhania

From: Benjamin Herrenschmidt 

This will make life easier for dealing with dynamically configured
ICSes such as PHB3.

Signed-off-by: Benjamin Herrenschmidt 
Reviewed-by: David Gibson 
Signed-off-by: Nikunj A Dadhania 
---
 include/hw/ppc/xics.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 6ad3057..8c22daf 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -150,7 +150,7 @@ struct ICSState {
 
 static inline bool ics_valid_irq(ICSState *ics, uint32_t nr)
 {
-return (nr >= ics->offset)
+return (ics->offset != 0) && (nr >= ics->offset)
 && (nr < (ics->offset + ics->nr_irqs));
 }
 
-- 
2.7.4

[Qemu-devel] [PATCH v2 5/8] ppc/xics: Make the ICSState a list

2016-06-28 Thread Nikunj A Dadhania

From: Benjamin Herrenschmidt 

Instead of an array of fixed sized blocks, use a list, as we will need
to have sources with variable number of interrupts. SPAPR only uses
a single entry. Native will create more. If performance becomes an
issue we can add some hashed lookup but for now this will do fine.
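A linear lookup over such a list might look like the sketch below. It uses a plain next pointer instead of QEMU's QLIST macros, and the names src_sketch and find_source() are hypothetical; the range check mirrors the offset/nr_irqs test used by ics_valid_irq().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interrupt source: covers the IRQ numbers
 * [offset, offset + nr_irqs). */
struct src_sketch {
    uint32_t offset;
    uint32_t nr_irqs;
    struct src_sketch *next;
};

/* Walk the list and return the source that owns 'irq', or NULL if no
 * source covers it. Sources may have different sizes. */
static struct src_sketch *find_source(struct src_sketch *head, uint32_t irq)
{
    for (struct src_sketch *s = head; s != NULL; s = s->next) {
        if (irq >= s->offset && irq < s->offset + s->nr_irqs) {
            return s;
        }
    }
    return NULL;
}
```

With only a handful of sources the walk is cheap; a hashed lookup, as the message suggests, would only be worth adding if profiling showed this loop to matter.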

Signed-off-by: Benjamin Herrenschmidt 
[ move the initialization of list to xics_common_initfn ]
Signed-off-by: Nikunj A Dadhania 
---
 hw/intc/trace-events  |  4 +--
 hw/intc/xics.c| 83 
 hw/intc/xics_kvm.c| 27 +++-
 hw/intc/xics_spapr.c  | 88 +--
 hw/ppc/spapr_events.c |  2 +-
 hw/ppc/spapr_pci.c|  5 ++-
 hw/ppc/spapr_vio.c|  2 +-
 include/hw/ppc/xics.h | 13 
 8 files changed, 138 insertions(+), 86 deletions(-)

diff --git a/hw/intc/trace-events b/hw/intc/trace-events
index 376dd18..5f0f783 100644
--- a/hw/intc/trace-events
+++ b/hw/intc/trace-events
@@ -56,8 +56,8 @@ xics_set_irq_lsi(int srcno, int nr) "set_irq_lsi: srcno %d [irq %#x]"
 xics_ics_write_xive(int nr, int srcno, int server, uint8_t priority) "ics_write_xive: irq %#x [src %d] server %#x prio %#x"
 xics_ics_reject(int nr, int srcno) "reject irq %#x [src %d]"
 xics_ics_eoi(int nr) "ics_eoi: irq %#x"
-xics_alloc(int src, int irq) "source#%d, irq %d"
-xics_alloc_block(int src, int first, int num, bool lsi, int align) "source#%d, first irq %d, %d irqs, lsi=%d, alignnum %d"
+xics_alloc(int irq) "irq %d"
+xics_alloc_block(int first, int num, bool lsi, int align) "first irq %d, %d irqs, lsi=%d, alignnum %d"
 xics_ics_free(int src, int irq, int num) "Source#%d, first irq %d, %d irqs"
 xics_ics_free_warn(int src, int irq) "Source#%d, irq %d is already free"
 
diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index cd48f42..5148bdf 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -96,13 +96,16 @@ void xics_cpu_setup(XICSState *xics, PowerPCCPU *cpu)
 static void xics_common_reset(DeviceState *d)
 {
 XICSState *xics = XICS_COMMON(d);
+ICSState *ics;
 int i;
 
 for (i = 0; i < xics->nr_servers; i++) {
 device_reset(DEVICE(&xics->ss[i]));
 }
 
-device_reset(DEVICE(xics->ics));
+QLIST_FOREACH(ics, &xics->ics, list) {
+device_reset(DEVICE(ics));
+}
 }
 
 static void xics_prop_get_nr_irqs(Object *obj, Visitor *v, const char *name,
@@ -134,7 +137,6 @@ static void xics_prop_set_nr_irqs(Object *obj, Visitor *v, const char *name,
 }
 
 assert(info->set_nr_irqs);
-assert(xics->ics);
 info->set_nr_irqs(xics, value, errp);
 }
 
@@ -174,6 +176,9 @@ static void xics_prop_set_nr_servers(Object *obj, Visitor *v,
 
 static void xics_common_initfn(Object *obj)
 {
+XICSState *xics = XICS_COMMON(obj);
+
+QLIST_INIT(&xics->ics);
 object_property_add(obj, "nr_irqs", "int",
 xics_prop_get_nr_irqs, xics_prop_set_nr_irqs,
 NULL, NULL, NULL);
@@ -212,33 +217,35 @@ static void ics_reject(ICSState *ics, int nr);
 static void ics_resend(ICSState *ics);
 static void ics_eoi(ICSState *ics, int nr);
 
-static void icp_check_ipi(XICSState *xics, int server)
+static void icp_check_ipi(ICPState *ss)
 {
-ICPState *ss = xics->ss + server;
-
 if (XISR(ss) && (ss->pending_priority <= ss->mfrr)) {
 return;
 }
 
-trace_xics_icp_check_ipi(server, ss->mfrr);
+trace_xics_icp_check_ipi(ss->cs->cpu_index, ss->mfrr);
 
-if (XISR(ss)) {
-ics_reject(xics->ics, XISR(ss));
+if (XISR(ss) && ss->xirr_owner) {
+ics_reject(ss->xirr_owner, XISR(ss));
 }
 
 ss->xirr = (ss->xirr & ~XISR_MASK) | XICS_IPI;
 ss->pending_priority = ss->mfrr;
+ss->xirr_owner = NULL;
 qemu_irq_raise(ss->output);
 }
 
 static void icp_resend(XICSState *xics, int server)
 {
 ICPState *ss = xics->ss + server;
+ICSState *ics;
 
 if (ss->mfrr < CPPR(ss)) {
-icp_check_ipi(xics, server);
+icp_check_ipi(ss);
+}
+QLIST_FOREACH(ics, &xics->ics, list) {
+ics_resend(ics);
 }
-ics_resend(xics->ics);
 }
 
 void icp_set_cppr(XICSState *xics, int server, uint8_t cppr)
@@ -256,7 +263,10 @@ void icp_set_cppr(XICSState *xics, int server, uint8_t cppr)
 ss->xirr &= ~XISR_MASK; /* Clear XISR */
 ss->pending_priority = 0xff;
 qemu_irq_lower(ss->output);
-ics_reject(xics->ics, old_xisr);
+if (ss->xirr_owner) {
+ics_reject(ss->xirr_owner, old_xisr);
+ss->xirr_owner = NULL;
+}
 }
 } else {
 if (!XISR(ss)) {
@@ -271,7 +281,7 @@ void icp_set_mfrr(XICSState *xics, int server, uint8_t mfrr)
 
 ss->mfrr = mfrr;
 if (mfrr < CPPR(ss)) {
-icp_check_ipi(xics, server);
+icp_check_ipi(ss);
 }
 }
 
@@ -282,6 +292,7 @@ uint32_t icp_accept(ICPState *ss)
 qemu_irq_lower(ss->output);
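
Note on the list approach described in this patch: a minimal sketch of
how a list of sources with different sizes can be searched by IRQ
number. It uses a hand-rolled singly linked list instead of QEMU's
QLIST macros so that it builds on its own. DemoSource, DemoController
and demo_find_source() are hypothetical names, and the lookup is an
assumption about the general shape of such a search, not a copy of
xics_find_source().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interrupt source covering [offset, offset + nr_irqs). */
typedef struct DemoSource {
    uint32_t offset;
    uint32_t nr_irqs;
    struct DemoSource *next;   /* link to the next source on the list */
} DemoSource;

/* Hypothetical controller owning the list of sources. */
typedef struct DemoController {
    DemoSource *sources;       /* list head, NULL when empty */
} DemoController;

/* Insert at the head, the same position QLIST_INSERT_HEAD would use. */
static void demo_add_source(DemoController *c, DemoSource *s)
{
    s->next = c->sources;
    c->sources = s;
}

/* Linear walk: return the source that owns irq, or NULL if none does. */
static DemoSource *demo_find_source(DemoController *c, uint32_t irq)
{
    for (DemoSource *s = c->sources; s != NULL; s = s->next) {
        if (s->offset != 0 && irq >= s->offset
            && irq < s->offset + s->nr_irqs) {
            return s;
        }
    }
    return NULL;
}

/* Two sources of different sizes, which a fixed-size array could not hold. */
static void demo_find_source_selfcheck(void)
{
    DemoController c = { .sources = NULL };
    DemoSource small = { .offset = 16, .nr_irqs = 8, .next = NULL };
    DemoSource large = { .offset = 4096, .nr_irqs = 1024, .next = NULL };

    demo_add_source(&c, &small);
    demo_add_source(&c, &large);

    assert(demo_find_source(&c, 20) == &small);
    assert(demo_find_source(&c, 5000) == &large);
    assert(demo_find_source(&c, 24) == NULL);
}
```

The walk is O(number of sources), which matches the commit message: fine
while there are only a few sources, and replaceable by a hashed lookup
later if it ever shows up in profiles.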

[Qemu-devel] [PATCH v2 7/8] ppc/xics: Use a helper to add a new ICS

2016-06-28 Thread Nikunj A Dadhania

From: Benjamin Herrenschmidt 

Signed-off-by: Benjamin Herrenschmidt 
[Move object allocation and adding child to the helper]
Signed-off-by: Nikunj A Dadhania 
Reviewed-by: David Gibson 
---
 hw/intc/xics.c| 10 ++
 hw/intc/xics_spapr.c  |  6 +-
 include/hw/ppc/xics.h |  1 +
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 5148bdf..bbdba84 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -108,6 +108,16 @@ static void xics_common_reset(DeviceState *d)
 }
 }
 
+void xics_add_ics(XICSState *xics)
+{
+ICSState *ics;
+
+ics = ICS(object_new(TYPE_ICS));
+object_property_add_child(OBJECT(xics), "ics", OBJECT(ics), NULL);
+ics->xics = xics;
+QLIST_INSERT_HEAD(&xics->ics, ics, list);
+}
+
 static void xics_prop_get_nr_irqs(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 0b0845d..270f20e 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -305,12 +305,8 @@ static void xics_spapr_realize(DeviceState *dev, Error **errp)
 static void xics_spapr_initfn(Object *obj)
 {
 XICSState *xics = XICS_SPAPR(obj);
-ICSState *ics;
 
-ics = ICS(object_new(TYPE_ICS));
-object_property_add_child(obj, "ics", OBJECT(ics), NULL);
-ics->xics = xics;
-QLIST_INSERT_HEAD(&xics->ics, ics, list);
+xics_add_ics(xics);
 }
 
 static void xics_spapr_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 8c22daf..8433bf9 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -196,5 +196,6 @@ void ics_write_xive(ICSState *ics, int nr, int server,
 void ics_set_irq_type(ICSState *ics, int srcno, bool lsi);
 
 ICSState *xics_find_source(XICSState *icp, int irq);
+void xics_add_ics(XICSState *xics);
 
 #endif /* __XICS_H__ */
-- 
2.7.4
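
Note on the helper introduced above: a minimal sketch of the same
pattern (allocate, set the back-pointer, link at the head) outside of
QOM. DemoXics, DemoIcs and the demo_* functions are hypothetical, plain
calloc() stands in for object_new(), and unlike xics_add_ics() the
sketch returns the new node so a caller can check the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoXics DemoXics;

/* Hypothetical source node with a back-pointer to its controller. */
typedef struct DemoIcs {
    DemoXics *xics;
    struct DemoIcs *next;
} DemoIcs;

/* Hypothetical controller holding the head of the source list. */
struct DemoXics {
    DemoIcs *ics_head;
};

/* One place that allocates a source and links it in; NULL on failure. */
static DemoIcs *demo_add_ics(DemoXics *xics)
{
    DemoIcs *ics = calloc(1, sizeof(*ics));

    if (ics == NULL) {
        return NULL;
    }
    ics->xics = xics;
    ics->next = xics->ics_head;   /* head insertion */
    xics->ics_head = ics;
    return ics;
}

/* Count the sources currently on the list. */
static size_t demo_count_ics(const DemoXics *xics)
{
    size_t n = 0;

    for (const DemoIcs *ics = xics->ics_head; ics != NULL; ics = ics->next) {
        n++;
    }
    return n;
}

/* Release every source and leave the list empty. */
static void demo_free_all(DemoXics *xics)
{
    while (xics->ics_head != NULL) {
        DemoIcs *ics = xics->ics_head;

        xics->ics_head = ics->next;
        free(ics);
    }
}

/* A caller that needs one source and a caller that needs more share the helper. */
static void demo_add_ics_selfcheck(void)
{
    DemoXics x = { .ics_head = NULL };
    DemoIcs *first = demo_add_ics(&x);
    DemoIcs *second = demo_add_ics(&x);

    assert(first != NULL && second != NULL);
    assert(x.ics_head == second && second->next == first);
    assert(first->xics == &x && second->xics == &x);
    assert(demo_count_ics(&x) == 2);
    demo_free_all(&x);
}
```

Keeping the allocation and the linking in one function means a second
caller cannot forget the back-pointer or the list insertion.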
