Re: [Qemu-devel] [Qemu-ppc] [PATCH] spapr/pci: populate PCI DT in reverse order

2015-12-20 Thread David Gibson
On Thu, Dec 17, 2015 at 09:43:29AM +0100, Greg Kurz wrote:
> On Thu, 3 Dec 2015 15:53:17 +0100
> Greg Kurz  wrote:
> 
> > On Tue, 1 Dec 2015 22:48:38 +0100
> > Thomas Huth  wrote:
> > 
> > > On 30/11/15 11:45, Greg Kurz wrote:
> > > > Since commit 1d2d974244c6 "spapr_pci: enumerate and add PCI device 
> > > > tree", QEMU
> > > > populates the PCI device tree in the opposite order compared to SLOF.
> > > > 
> > > > Before 1d2d974244c6:
> > > > 
> > > > Populating /pci@8002000
> > > >  00  (D) : 1af4 1000virtio [ net ]
> > > >  00 0800 (D) : 1af4 1001virtio [ block ]
> > > >  00 1000 (D) : 1af4 1009virtio [ network ]
> > > > Populating /pci@8002000/unknown-legacy-device@2
> > > > 
> > > > 
> > > > 7e5294b8 :  /pci@8002000
> > > > 7e52b998 :  |-- ethernet@0
> > > > 7e52c0c8 :  |-- scsi@1
> > > > 7e52c7e8 :  +-- unknown-legacy-device@2 ok
> > > > 
> > > > Since 1d2d974244c6:
> > > > 
> > > > Populating /pci@8002000
> > > >  00 1000 (D) : 1af4 1009virtio [ network ]
> > > > Populating /pci@8002000/unknown-legacy-device@2
> > > >  00 0800 (D) : 1af4 1001virtio [ block ]
> > > >  00  (D) : 1af4 1000virtio [ net ]
> > > > 
> > > > 
> > > > 7e5e8118 :  /pci@8002000
> > > > 7e5ea6a0 :  |-- unknown-legacy-device@2
> > > > 7e5eadb8 :  |-- scsi@1
> > > > 7e5eb4d8 :  +-- ethernet@0 ok
> > > > 
> > > > This behaviour change is not actually a bug since no assumptions should 
> > > > be
> > > > made on DT ordering. But it has no real justification either, other than
> > > > being the consequence of the way fdt_add_subnode() inserts new elements
> > > > to the front of the FDT rather than adding them to the tail.
> > > > 
> > > > This patch reverts to the historical SLOF ordering by walking PCI 
> > > > devices in
> > > > reverse order.
> > > 
> > > I've applied your patch here locally, and indeed, the device tree looks
> > > nicer to me, too, when the nodes are listed in ascending order.
> > > 
> > > Tested-by: Thomas Huth 
> > > 
> > > 
> > 
> 
> Ping ?

Sorry I didn't reply.

I'm still dubious about this.  It seems like a fair bit of effort to
restore a behaviour that the client isn't supposed to be relying on
anyway.

Plus, the version with the changed order is already released, so
applying this will mean a second behaviour change.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[Qemu-devel] [PATCH 2/2] ppc: Allow 64kiB pages for POWER8 in TCG

2015-12-20 Thread David Gibson
Now that the spapr code has been extended to support 64kiB pages, we can
allow guests to use 64kiB pages on an emulated POWER8 by adding it to the
"segment_page_sizes" structure which is advertised via the device tree.

For now we just add support for 64kiB pages in 64kiB page segments.  Real
POWER8 also supports 64kiB pages in 4kiB page segments, but that will
require more work to implement.

Real POWER7s (and maybe some other CPU models) also support 64kiB pages,
however, I don't want to add support there without double checking if they
use the same HPTE and SLB encodings (in principle these are implementation
dependent).

Signed-off-by: David Gibson 
---
 target-ppc/translate_init.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index e88dc7f..ae5a269 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8200,6 +8200,22 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
 PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
+static const struct ppc_segment_page_sizes POWER8_sps = {
+.sps = {
+{ .page_shift = 12, /* 4K */
+  .slb_enc = 0,
+  .enc = { { .page_shift = 12, .pte_enc = 0 } }
+},
+{ .page_shift = 16, /* 64K */
+  .slb_enc = 0x110,
+  .enc = { { .page_shift = 16, .pte_enc = 0x1 } }
+},
+{ .page_shift = 24, /* 16M */
+  .slb_enc = 0x100,
+  .enc = { { .page_shift = 24, .pte_enc = 0 } }
+},
+}
+};
 
 dc->fw_name = "PowerPC,POWER8";
 dc->desc = "POWER8";
@@ -8258,6 +8274,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 pcc->l1_dcache_size = 0x8000;
 pcc->l1_icache_size = 0x8000;
 pcc->interrupts_big_endian = ppc_cpu_interrupts_big_endian_lpcr;
+pcc->sps = _sps;
 }
 #endif /* defined (TARGET_PPC64) */
 
-- 
2.5.0




[Qemu-devel] [PATCH 0/2] 64kiB page support for TCG guests with POWER8 CPU

2015-12-20 Thread David Gibson
This series enables 64kiB pages when using a TCG emulated POWER8 CPU.

This gets the emulated POWER8 MMU a little bit closer to what the real
POWER8 CPU supports.

David Gibson (2):
  ppc: Move HPTE size parsing code to target-ppc helper (and add 64kiB
pages)
  ppc: Allow 64kiB pages for POWER8 in TCG

 hw/ppc/spapr_hcall.c| 26 --
 target-ppc/mmu-hash64.c | 31 +++
 target-ppc/mmu-hash64.h |  3 +++
 target-ppc/translate_init.c | 17 +
 4 files changed, 59 insertions(+), 18 deletions(-)

-- 
2.5.0




Re: [Qemu-devel] [PATCH 3/5] PXB: convert to realize()

2015-12-20 Thread Cao jin



On 12/20/2015 07:21 PM, Marcel Apfelbaum wrote:

On 12/20/2015 12:48 PM, Cao jin wrote:

Hi,

On 12/20/2015 06:22 PM, Marcel Apfelbaum wrote:

[...]

+
+err_register_bus:
+object_unref(OBJECT(ds));
+object_unref(OBJECT(bds));
+object_unref(OBJECT(bus));



The order should be in the reverse order of creation:
 bds, bus, ds



Ok, I can do that. But it seems the order here doesn`t matter? Is
there dependency among these three?


Yes, there is a dependency:
At first the pxb host (ds) is created, then the bus (bus) is created as
host's child (see pci_bus_new)
and in the end a pci bridge (bds) is attached to the bus (see qdev_create).



Yup...thanks for reminding, I did read the code trying to find the 
parent relationship...but seem I didn`t read it thoroughly:-[



By the way, indeed you should call object_unparent at least for the
pxb_host(ds), but you may want to
check the others parent relationship as well.



yes, but I think you are saying: object_unparent(bus), right? the 
relationship seems is:

  pxb host-->(child property)bus-->(link property)bds

Another question: because some of this series is CCed to 
qemu-trivial(which means: reviewed-by?) by other maintainer, so next 
time, do I need to send the whole series with "v2", or the rest?







  }

  static void pxb_dev_exitfn(PCIDevice *pci_dev)
@@ -259,7 +263,7 @@ static void pxb_dev_class_init(ObjectClass *klass,
void *data)
  DeviceClass *dc = DEVICE_CLASS(klass);
  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

-k->init = pxb_dev_initfn;
+k->realize = pxb_dev_realize;
  k->exit = pxb_dev_exitfn;


If init is converted to realize, maybe the exit should be converted to
unrealize?



Yup, I agree with you from the point that the names should be antonym.
But it seems there is no PCIDeviceClass.unrealize:(


You are right. The pci_qdev_unrealize ultimately calls exit. But the
same goes for init, pci_qdev_realize calls for pc->realize.
This is the reason I chose to use init/exit instead of the strange
realize/exit.

But since the intention is to get rid of init, I am not against it.



And I am also not aware why there is no comment for .exit while there
is for .init. It is appreciated if somebody could tell me the history
O:-)


I'll add Markus, Andreas and Michael (the PCI maintainer), maybe they
have a better insight to this.

On the other hand you should continue with the patch and leave the
"unrealize" until we'll know more :)


Got it;)



Thanks,
Marcel





Thanks,
Marcel


  k->vendor_id = PCI_VENDOR_ID_REDHAT;
  k->device_id = PCI_DEVICE_ID_REDHAT_PXB;





.







.



--
Yours Sincerely,

Cao Jin





Re: [Qemu-devel] [PATCH v2 3/3] hw/sd: use guest error logging rather than fprintf to stderr

2015-12-20 Thread Peter Crosthwaite
On Wed, Dec 16, 2015 at 11:02 AM, Andrew Baumann
 wrote:
> Some of these errors may be harmless (e.g. probing unimplemented
> commands, or issuing CMD12 in the wrong state), and may also be quite
> frequent. Spamming the standard error output isn't desirable in such
> cases.
>
> Signed-off-by: Andrew Baumann 
> ---
> It might also be desirable to have a squelch mechanism for these
> messages, but at least for my use-case, this is sufficient, since they
> only occur during boot time.
>
>  hw/sd/sd.c | 22 +-
>  1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/hw/sd/sd.c b/hw/sd/sd.c
> index 1011785..9b76b47 100644
> --- a/hw/sd/sd.c
> +++ b/hw/sd/sd.c
> @@ -1234,16 +1234,18 @@ static sd_rsp_type_t sd_normal_command(SDState *sd,
>
>  default:
>  bad_cmd:
> -fprintf(stderr, "SD: Unknown CMD%i\n", req.cmd);
> +qemu_log_mask(LOG_GUEST_ERROR, "SD: Unknown CMD%i\n", req.cmd);
>  return sd_illegal;
>
>  unimplemented_cmd:
>  /* Commands that are recognised but not yet implemented in SPI mode. 
>  */
> -fprintf(stderr, "SD: CMD%i not implemented in SPI mode\n", req.cmd);
> +qemu_log_mask(LOG_GUEST_ERROR,
> +  "SD: CMD%i not implemented in SPI mode\n",
> +  req.cmd);

Are these unimplemented by specification or unimplemented in QEMU? If
they are unimplemented in QEMU it should be a LOG_UNIMP.

Otherwise:

Reviewed-by: Peter Crosthwaite 

Regards,
Peter

>  return sd_illegal;
>  }
>
> -fprintf(stderr, "SD: CMD%i in a wrong state\n", req.cmd);
> +qemu_log_mask(LOG_GUEST_ERROR, "SD: CMD%i in a wrong state\n", req.cmd);
>  return sd_illegal;
>  }
>
> @@ -1375,7 +1377,7 @@ static sd_rsp_type_t sd_app_command(SDState *sd,
>  return sd_normal_command(sd, req);
>  }
>
> -fprintf(stderr, "SD: ACMD%i in a wrong state\n", req.cmd);
> +qemu_log_mask(LOG_GUEST_ERROR, "SD: ACMD%i in a wrong state\n", req.cmd);
>  return sd_illegal;
>  }
>
> @@ -1418,7 +1420,7 @@ int sd_do_command(SDState *sd, SDRequest *req,
>  if (!cmd_valid_while_locked(sd, req)) {
>  sd->card_status |= ILLEGAL_COMMAND;
>  sd->expecting_acmd = false;
> -fprintf(stderr, "SD: Card is locked\n");
> +qemu_log_mask(LOG_GUEST_ERROR, "SD: Card is locked\n");
>  rtype = sd_illegal;
>  goto send_response;
>  }
> @@ -1576,7 +1578,8 @@ void sd_write_data(SDState *sd, uint8_t value)
>  return;
>
>  if (sd->state != sd_receivingdata_state) {
> -fprintf(stderr, "sd_write_data: not in Receiving-Data state\n");
> +qemu_log_mask(LOG_GUEST_ERROR,
> +  "sd_write_data: not in Receiving-Data state\n");
>  return;
>  }
>
> @@ -1695,7 +1698,7 @@ void sd_write_data(SDState *sd, uint8_t value)
>  break;
>
>  default:
> -fprintf(stderr, "sd_write_data: unknown command\n");
> +qemu_log_mask(LOG_GUEST_ERROR, "sd_write_data: unknown command\n");
>  break;
>  }
>  }
> @@ -1710,7 +1713,8 @@ uint8_t sd_read_data(SDState *sd)
>  return 0x00;
>
>  if (sd->state != sd_sendingdata_state) {
> -fprintf(stderr, "sd_read_data: not in Sending-Data state\n");
> +qemu_log_mask(LOG_GUEST_ERROR,
> +  "sd_read_data: not in Sending-Data state\n");
>  return 0x00;
>  }
>
> @@ -1821,7 +1825,7 @@ uint8_t sd_read_data(SDState *sd)
>  break;
>
>  default:
> -fprintf(stderr, "sd_read_data: unknown command\n");
> +qemu_log_mask(LOG_GUEST_ERROR, "sd_read_data: unknown command\n");
>  return 0x00;
>  }
>
> --
> 2.5.3
>



Re: [Qemu-devel] Is there a particular reason why REP STOS/MOV are jitted this way?

2015-12-20 Thread Peter Maydell
On 20 December 2015 at 23:13, farmdve  wrote:
> The rep stosd instruction seems to be jitted in a really weird way and I was
> wondering what are the design choices behind this.
> Basically the code is jitted to an operation where there is a conditional
> branch that tests the ECX register to see if it's zero or not(although I
> could have gotten this part wrong). If it's not, it proceeds to prepare the
> data to be placed at es:[edi], decrements ECX and executes another
> conditional branch that ends up at a jmp to 'helper_le_stl_mmu' effectively
> exiting the translation block as execution will end up re-entering that same
> TB.
>
> My question is why isn't this jitted to a loop and a call to this helper?

helper_le_stl_mmu is part of the standard load/store code and not
visible at the 'generate i386 guest code' layer; in any case the
fast path won't call the helper but can just directly access
the guest RAM.

I suspect the reason we do it the way we do is so that we can
correctly take an interrupt midway through the rep stosd and
have the intermediate state be correct -- the comments in the
GEN_REPZ macro say we do it like Valgrind, so this isn't a
QEMU-specific oddity. Handling single-step of a rep stosd
also requires care.

Since we generate the do-another-repetition code with
gen_jmp(s, cur_eip) that should result in a jump-to-TB
which is chainable, so once we've gone round once we won't
have to go back out to the main loop while we're iterating.

thanks
-- PMM



[Qemu-devel] [PATCH v2] linux-user/mmap.c: Always zero MAP_ANONYMOUS memory in mmap_frag()

2015-12-20 Thread chengang
From: Chen Gang 

When mapping MAP_ANONYMOUS memory fragments, still need notice about to
set it zero, or it will cause issues.

Signed-off-by: Chen Gang 
---
 linux-user/mmap.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 7b459d5..29fe646 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -186,10 +186,12 @@ static int mmap_frag(abi_ulong real_start,
 if (prot_new != (prot1 | PROT_WRITE))
 mprotect(host_start, qemu_host_page_size, prot_new);
 } else {
-/* just update the protection */
 if (prot_new != prot1) {
 mprotect(host_start, qemu_host_page_size, prot_new);
 }
+if ((prot_new & PROT_WRITE) && ((flags & MAP_PRIVATE) || (fd == -1))) {
+memset(g2h(start), 0, end - start);
+}
 }
 return 0;
 }
-- 
1.7.3.4




Re: [Qemu-devel] [PULL v3 00/15] Tracing patches

2015-12-20 Thread Denis V. Lunev

On 11/13/2015 01:44 PM, Stefan Hajnoczi wrote:

On Fri, Nov 13, 2015 at 6:30 PM, Peter Maydell  wrote:

On 13 November 2015 at 07:58, Stefan Hajnoczi  wrote:

v3:
  * Include "exec/log.h" from translate-a64.c [Peter]

v2:
  * Add missing log.py file [Peter]

The following changes since commit 74fcbd22d20a2fbc1a47a7b00cce5bf98fd7be5f:

   hw/misc: Add support for ADC controller in Xilinx Zynq 7000 (2015-11-12 
21:30:42 +)

are available in the git repository at:

   git://github.com/stefanha/qemu.git tags/tracing-pull-request

for you to fetch changes up to de332788d7c626fce31f6baebab807df5a7410e1:

   log: add "-d trace:PATTERN" (2015-11-13 15:52:57 +0800)

w32 fails to build with the same 'ssize_t vs signed size_t for %zd'
thing that hit somebody else's changes earlier this week:

...

I also now get a handful of extra warnings in the 'make check' output:

As mentioned on IRC, I'm dropping this pull request.

I'll send a new pull request early next week for -rc1.  It will only
contain bug fixes.

Too many fixes are accumulating on top of the patches.  I'm going to
bounce the original series and merge it, once fixed, for QEMU 2.6.

Stefan


is it time?

Stefan, do you need some help with this serie?

Den



[Qemu-devel] Is there a particular reason why REP STOS/MOV are jitted this way?

2015-12-20 Thread farmdve
The rep stosd instruction seems to be jitted in a really weird way and I
was wondering what are the design choices behind this.
Basically the code is jitted to an operation where there is a conditional
branch that tests the ECX register to see if it's zero or not(although I
could have gotten this part wrong). If it's not, it proceeds to prepare the
data to be placed at es:[edi], decrements ECX and executes another
conditional branch that ends up at a jmp to 'helper_le_stl_mmu' effectively
exiting the translation block as execution will end up re-entering that
same TB.

My question is why isn't this jitted to a loop and a call to this helper?


Re: [Qemu-devel] [PATCH] linux-user/syscall.c: Notice about lock bitmask translation for fcntl

2015-12-20 Thread Chen Gang

On 2015年12月18日 18:47, Chen Gang wrote:
> 
> On 2015年12月18日 17:41, Laurent Vivier wrote:
>>
>> Le 18/12/2015 07:51, Chen Gang a écrit :

[...]

>>>
>>> +++ b/linux-user/mmap.c
>>> @@ -567,6 +567,10 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, 
>>> int prot,
>>>  printf("\n");
>>>  #endif
>>>  tb_invalidate_phys_range(start, start + len);
>>> +if ((prot & PROT_WRITE) && (flags & MAP_ANONYMOUS)
>>> +&& ((flags & MAP_PRIVATE) || (fd == -1))) {
>>> +memset(g2h(start), 0, len);
>>> +}
>>
>> IMHO, their kernel needs a fix, mmap(2):
>>
>> MAP_ANONYMOUS
>> The mapping is not backed by any file; its contents are initial‐
>> ized to zero.
>>

Oh, sorry, after check again, it is not kernel's issue: mmap() will
always zero the ANONYMOUS memory for kernel 3.8 in sw_64 arch.

It is still our qemu's issue in mmap_frag (do not zero ANONYMOUS memory
fragments), so I sent patch v2 for it, please help check.

Thanks.

> 
> OK, Thanks.
> 
> 

-- 
Chen Gang (陈刚)

Open, share, and attitude like air, water, and life which God blessed



Re: [Qemu-devel] [PATCH 5/5] xen-pvdevice: convert to realize()

2015-12-20 Thread Cao jin



On 12/19/2015 02:00 AM, Paolo Bonzini wrote:

CCing Stefano, who is the maintainer.



"Stefano" isn`t in the list when I use scripts/get_maintainer.pl...


Paolo

On 18/12/2015 12:03, Cao jin wrote:

Signed-off-by: Cao jin 
---
  hw/i386/xen/xen_pvdevice.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/i386/xen/xen_pvdevice.c b/hw/i386/xen/xen_pvdevice.c
index c218947..a6c93d0 100644
--- a/hw/i386/xen/xen_pvdevice.c
+++ b/hw/i386/xen/xen_pvdevice.c
@@ -69,14 +69,16 @@ static const MemoryRegionOps xen_pv_mmio_ops = {
  .endianness = DEVICE_LITTLE_ENDIAN,
  };

-static int xen_pv_init(PCIDevice *pci_dev)
+static void xen_pv_realize(PCIDevice *pci_dev, Error **errp)
  {
  XenPVDevice *d = XEN_PV_DEVICE(pci_dev);
  uint8_t *pci_conf;

  /* device-id property must always be supplied */
-if (d->device_id == 0x)
-   return -1;
+if (d->device_id == 0x) {
+error_setg(errp, "Device ID invalid, it must always be supplied")
+   return;
+}

  pci_conf = pci_dev->config;

@@ -97,8 +99,6 @@ static int xen_pv_init(PCIDevice *pci_dev)

  pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_MEM_PREFETCH,
   >mmio);
-
-return 0;
  }

  static Property xen_pv_props[] = {
@@ -114,7 +114,7 @@ static void xen_pv_class_init(ObjectClass *klass, void 
*data)
  DeviceClass *dc = DEVICE_CLASS(klass);
  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

-k->init = xen_pv_init;
+k->realize = xen_pv_realize;
  k->class_id = PCI_CLASS_SYSTEM_OTHER;
  dc->desc = "Xen PV Device";
  dc->props = xen_pv_props;




.



--
Yours Sincerely,

Cao Jin





Re: [Qemu-devel] [PATCH 04/10] hw/sd: Add QOM bus which SD cards plug in to

2015-12-20 Thread Peter Crosthwaite
On Sun, Dec 20, 2015 at 3:18 PM, Peter Maydell  wrote:
> On 20 December 2015 at 20:51, Peter Crosthwaite
>  wrote:
>> On Sun, Dec 20, 2015 at 9:10 AM, Peter Maydell  
>> wrote:
>>> On 19 December 2015 at 21:38, Peter Crosthwaite
>>>  wrote:
 On Fri, Dec 11, 2015 at 04:37:05PM +, Peter Maydell wrote:
> +bool sdbus_get_inserted(SDBus *sdbus)
> +{
> +SDState *card = get_card(sdbus);
> +
> +if (card) {
> +SDClass *sc = SD_GET_CLASS(card);
> +
> +return sc->get_inserted(card);
> +}
> +
> +return false;
> +}

 This I am not sure about. Realistically, the card has no self
 awareness of its ejection state so it can't be polled for "are
 you there". The card insertion switch is implemented as a
 physical switch on the slot itself and a property of the bus.

 The absence on presence of a device should determine this, making me
 think this should return !!card.

 Unfortunately, we have the case of the block UI being able to trigger a
 card ejection from underneath the bus level. But since the SD card is 
 already
 using qdev_get_parent_bus() the removal from the bus can be managed at the
 card level.
>>>
>>> For user-level back compat I think we need to retain "might have
>>> an sdcard object with no block backend, and that means
>>> 'no-card-present'". This is both for the user facing
>>> monitor commands to manipulate the sd card, and also
>>
>> What are the user-facing monitor commands? I tried using "change" and
>> "eject", but they don't seem to work for SD, due to the tray being
>> closed. Has this ever worked in a way that is user manipulatable for
>> SD or is it just to handle the case of unconditional SD card creation
>> (with the card never hotplugging over the system lifetime)?
>
> I admit to just assuming that this stuff worked rather than
> testing it :-)
>

Can we capitilise on the opportunity here to break the backwards
compat for the non-functional feature and defeature eject and change
for SD? File substitution for SD cards doesn't match the real world
nicely, as we use the file size to modify the SD card implementation
(SDSC vs SDHC as well as the # of blocks). It really should be a
destroy+recreate rather than a substitution.We can then deal with SD
card ejection and insertion as proper hotplug.

Regards,
Peter

> thanks
> -- PMM



Re: [Qemu-devel] [PATCH 1/2] compat: Introduce HW_COMPAT_2_5

2015-12-20 Thread David Gibson
On Fri, Dec 18, 2015 at 09:30:02AM +0200, Shmulik Ladkani wrote:
> Introduce the place-holder for 2.5 back-compat properties, and the
> accompanying PC_COMPAT_2_5, CCW_COMPAT_2_5, SPAPR_COMPAT_2_5.
> 
> Signed-off-by: Shmulik Ladkani 

spapr part

Acked-by: David Gibson 

> ---
>  hw/i386/pc_piix.c  | 1 +
>  hw/i386/pc_q35.c   | 1 +
>  hw/ppc/spapr.c | 9 +
>  hw/s390x/s390-virtio-ccw.c | 9 +
>  include/hw/compat.h| 3 +++
>  include/hw/i386/pc.h   | 4 
>  6 files changed, 27 insertions(+)
> 
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 319497e..f34b0fd 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -475,6 +475,7 @@ static void pc_i440fx_2_5_machine_options(MachineClass *m)
>  pc_i440fx_machine_options(m);
>  m->alias = "pc";
>  m->is_default = 1;
> +SET_MACHINE_COMPAT(m, PC_COMPAT_2_5);
>  }
>  
>  DEFINE_I440FX_MACHINE(v2_5, "pc-i440fx-2.5", NULL,
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 9a12068..b3585e0 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -374,6 +374,7 @@ static void pc_q35_2_5_machine_options(MachineClass *m)
>  {
>  pc_q35_machine_options(m);
>  m->alias = "q35";
> +SET_MACHINE_COMPAT(m, PC_COMPAT_2_5);
>  }
>  
>  DEFINE_Q35_MACHINE(v2_5, "pc-q35-2.5", NULL,
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 6bfb908..6a0bfd7 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2298,7 +2298,11 @@ static const TypeInfo spapr_machine_info = {
>  },
>  };
>  
> +#define SPAPR_COMPAT_2_5 \
> +HW_COMPAT_2_5
> +
>  #define SPAPR_COMPAT_2_4 \
> +SPAPR_COMPAT_2_5 \
>  HW_COMPAT_2_4
>  
>  #define SPAPR_COMPAT_2_3 \
> @@ -2434,6 +2438,10 @@ static const TypeInfo spapr_machine_2_4_info = {
>  
>  static void spapr_machine_2_5_class_init(ObjectClass *oc, void *data)
>  {
> +static GlobalProperty compat_props[] = {
> +SPAPR_COMPAT_2_5
> +{ /* end of list */ }
> +};
>  MachineClass *mc = MACHINE_CLASS(oc);
>  sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(oc);
>  
> @@ -2442,6 +2450,7 @@ static void spapr_machine_2_5_class_init(ObjectClass 
> *oc, void *data)
>  mc->alias = "pseries";
>  mc->is_default = 1;
>  smc->dr_lmb_enabled = true;
> +mc->compat_props = compat_props;
>  }
>  
>  static const TypeInfo spapr_machine_2_5_info = {
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 5a52ff2..3d79654 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -235,7 +235,11 @@ static const TypeInfo ccw_machine_info = {
>  },
>  };
>  
> +#define CCW_COMPAT_2_5 \
> +HW_COMPAT_2_5
> +
>  #define CCW_COMPAT_2_4 \
> +CCW_COMPAT_2_5 \
>  HW_COMPAT_2_4 \
>  {\
>  .driver   = TYPE_S390_SKEYS,\
> @@ -296,10 +300,15 @@ static const TypeInfo ccw_machine_2_4_info = {
>  static void ccw_machine_2_5_class_init(ObjectClass *oc, void *data)
>  {
>  MachineClass *mc = MACHINE_CLASS(oc);
> +static GlobalProperty compat_props[] = {
> +CCW_COMPAT_2_5
> +{ /* end of list */ }
> +};
>  
>  mc->alias = "s390-ccw-virtio";
>  mc->desc = "VirtIO-ccw based S390 machine v2.5";
>  mc->is_default = 1;
> +mc->compat_props = compat_props;
>  }
>  
>  static const TypeInfo ccw_machine_2_5_info = {
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index bcb36ef..3aa35c9 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -1,6 +1,9 @@
>  #ifndef HW_COMPAT_H
>  #define HW_COMPAT_H
>  
> +#define HW_COMPAT_2_5 \
> +/* empty */
> +
>  #define HW_COMPAT_2_4 \
>  {\
>  .driver   = "virtio-blk-device",\
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 4bf4faf..3b445e4 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -309,7 +309,11 @@ int e820_add_entry(uint64_t, uint64_t, uint32_t);
>  int e820_get_num_entries(void);
>  bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
>  
> +#define PC_COMPAT_2_5 \
> +HW_COMPAT_2_5
> +
>  #define PC_COMPAT_2_4 \
> +PC_COMPAT_2_5 \
>  HW_COMPAT_2_4 \
>  {\
>  .driver   = "Haswell-" TYPE_X86_CPU,\

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] hw/ppc/spapr_rtc: Remove bad class_size value

2015-12-20 Thread David Gibson
On Fri, Dec 18, 2015 at 10:06:14AM +0100, Thomas Huth wrote:
> class_size = sizeof(XICSStateClass) does not make much sense
> in the RTC code and likely was just a copy-n-paste error.
> Let's simply remove it.
> 
> Signed-off-by: Thomas Huth 

Oops, that's an embarrasing error.  Fix applied to ppc-for-2.5 and ppc-for-2.6

> ---
>  hw/ppc/spapr_rtc.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_rtc.c b/hw/ppc/spapr_rtc.c
> index 34b27db..b591a8e 100644
> --- a/hw/ppc/spapr_rtc.c
> +++ b/hw/ppc/spapr_rtc.c
> @@ -200,7 +200,6 @@ static const TypeInfo spapr_rtc_info = {
>  .name  = TYPE_SPAPR_RTC,
>  .parent= TYPE_SYS_BUS_DEVICE,
>  .instance_size = sizeof(sPAPRRTCState),
> -.class_size = sizeof(XICSStateClass),
>  .class_init= spapr_rtc_class_init,
>  };
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[Qemu-devel] performance in io virtulization

2015-12-20 Thread ??????
i use qemu 2.2 and zhe vm kernel is 3.18i use fio do experiments in host and vm:
4k randwrite/read  ioengine=libaio
host: vm:
randread   about 200M/S  randread  about   190M/S   

randwrite   about 160M/S randwrite  about   65M/S


i tried multiqueue and dataplane ,but no improvement

Re: [Qemu-devel] [PATCH 3/3] sdhci: add optional quirk property to disable card insertion/removal interrupts

2015-12-20 Thread Peter Crosthwaite
On Wed, Dec 16, 2015 at 11:47 AM, Andrew Baumann
 wrote:
> This is needed for a quirk of the Raspberry Pi (bcm2835/6) MMC
> controller, where the card insert bit is documented as unimplemented
> (always reads zero, doesn't generate interrupts) but is in fact
> observed on hardware as set at power on, but is cleared (and remains
> clear) on subsequent controller resets.
>

Yep. Thats what the doc is saying.

FWIW , that set-on-reset behaviour makes me very suspicious that this
is the SoC layer or pin mux tying off this pin incorrectly. A bug in
the pin-mux would cause exactly this symptom.

Unfortunately we (or at least I) don't have access to Arasan's doc to
tell if this this is SoC errata or SDHCI errata and the Broadcom doc
quite unhelpfully refers the reader to:

sd3.0_host_ahb_emmc4.4_usersguide_ver5.9_jan11_10.pdf

Which doesn't google.

> Signed-off-by: Andrew Baumann 

Reviewed-by: Peter Crosthwaite 

> ---
>  hw/sd/sdhci.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> index dd83e89..61f919b 100644
> --- a/hw/sd/sdhci.c
> +++ b/hw/sd/sdhci.c
> @@ -193,7 +193,9 @@ static void sdhci_reset(SDHCIState *s)
>   * initialization */
>  memset(>sdmasysad, 0, (uintptr_t)>capareg - 
> (uintptr_t)>sdmasysad);
>
> -sd_set_cb(s->card, s->ro_cb, s->eject_cb);
> +if (!s->noeject_quirk) {
> +sd_set_cb(s->card, s->ro_cb, s->eject_cb);
> +}

This will conflict with PMMs SD busification work but the resolution
should be reasonably straightforward. The Busifed SDHCI can chose to
ignore the inserted cb from the bus instead of the card.

Regards,
Peter

>  s->data_count = 0;
>  s->stopped_state = sdhc_not_stopped;
>  }
> @@ -1208,6 +1210,7 @@ const VMStateDescription sdhci_vmstate = {
>  VMSTATE_UINT16(data_count, SDHCIState),
>  VMSTATE_UINT64(admasysaddr, SDHCIState),
>  VMSTATE_UINT8(stopped_state, SDHCIState),
> +VMSTATE_BOOL(noeject_quirk, SDHCIState),
>  VMSTATE_VBUFFER_UINT32(fifo_buffer, SDHCIState, 1, NULL, 0, 
> buf_maxsz),
>  VMSTATE_TIMER_PTR(insert_timer, SDHCIState),
>  VMSTATE_TIMER_PTR(transfer_timer, SDHCIState),
> @@ -1276,6 +1279,7 @@ static Property sdhci_sysbus_properties[] = {
>  DEFINE_PROP_UINT32("capareg", SDHCIState, capareg,
>  SDHC_CAPAB_REG_DEFAULT),
>  DEFINE_PROP_UINT32("maxcurr", SDHCIState, maxcurr, 0),
> +DEFINE_PROP_BOOL("noeject-quirk", SDHCIState, noeject_quirk, false),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>
> --
> 2.5.3
>



Re: [Qemu-devel] [PATCH v2 2/3] hw/sd: model a power-up delay, as a workaround for an EDK2 bug

2015-12-20 Thread Peter Crosthwaite
On Wed, Dec 16, 2015 at 11:02 AM, Andrew Baumann
 wrote:
> The SD spec for ACMD41 says that a zero argument is an "inquiry"
> ACMD41, which does not start initialisation and is used only for
> retrieving the OCR. However, Tianocore EDK2 (UEFI) has a bug [1]: it
> first sends an inquiry (zero) ACMD41. If that first request returns an
> OCR value with the power up bit (0x8000) set, it assumes the card
> is ready and continues, leaving the card in the wrong state. (My
> assumption is that this works on hardware, because no real card is
> immediately powered up upon reset.)
>
> This change models a delay of 0.5ms from the first ACMD41 to the power
> being up. However, it also immediately sets the power on upon seeing a
> non-zero (non-enquiry) ACMD41. This speeds up UEFI boot; it should
> also account for guests that simply delay after card reset and then
> issue an ACMD41 that they expect will succeed.
>
> [1] 
> https://github.com/tianocore/edk2/blob/master/EmbeddedPkg/Universal/MmcDxe/MmcIdentification.c#L279
> (This is the loop starting with "We need to wait for the MMC or SD
> card is ready")
>
> Signed-off-by: Andrew Baumann 
> ---
> Obviously this is a bug that should be fixed in EDK2. However, this
> initialisation appears to have been around for quite a while in EDK2
> (in various forms), and the fact that it has obviously worked with so
> many real SD/MMC cards makes me think that it would be pragmatic to
> have the workaround in QEMU as well.
>
> You might argue that the delay timer should start on sd_reset(), and
> not the first ACMD41. However, that doesn't work reliably with UEFI,
> because a large delay often elapses between the two (particularly in
> debug builds that do lots of printing to the serial port). If the
> timer fires too early, we'll still hit the bug, but we also don't want
> to set a huge timeout value, because some guests may depend on it
> expiring.
>
>  hw/sd/sd.c | 45 -
>  1 file changed, 40 insertions(+), 5 deletions(-)
>
> diff --git a/hw/sd/sd.c b/hw/sd/sd.c
> index 1a24933..1011785 100644
> --- a/hw/sd/sd.c
> +++ b/hw/sd/sd.c
> @@ -33,6 +33,7 @@
>  #include "sysemu/block-backend.h"
>  #include "hw/sd/sd.h"
>  #include "qemu/bitmap.h"
> +#include "qemu/timer.h"
>
>  //#define DEBUG_SD 1
>
> @@ -43,7 +44,9 @@ do { fprintf(stderr, "SD: " fmt , ## __VA_ARGS__); } while 
> (0)
>  #define DPRINTF(fmt, ...) do {} while(0)
>  #endif
>
> -#define ACMD41_ENQUIRY_MASK 0x00ff
> +#define ACMD41_ENQUIRY_MASK 0x00ff
> +#define OCR_POWER_UP0x8000
> +#define OCR_POWER_DELAY (get_ticks_per_sec() / 2000) /* 0.5ms */
>
>  typedef enum {
>  sd_r0 = 0,/* no response */
> @@ -80,6 +83,7 @@ struct SDState {
>  uint32_t mode;/* current card mode, one of SDCardModes */
>  int32_t state;/* current card state, one of SDCardStates */
>  uint32_t ocr;
> +QEMUTimer *ocr_power_timer;
>  uint8_t scr[8];
>  uint8_t cid[16];
>  uint8_t csd[16];
> @@ -194,8 +198,17 @@ static uint16_t sd_crc16(void *message, size_t width)
>
>  static void sd_set_ocr(SDState *sd)
>  {
> -/* All voltages OK, card power-up OK, Standard Capacity SD Memory Card */
> -sd->ocr = 0x8000;
> +/* All voltages OK, Standard Capacity SD Memory Card, not yet powered up 
> */
> +sd->ocr = 0x0000;
> +}
> +
> +static void sd_ocr_powerup(void *opaque)
> +{
> +SDState *sd = opaque;
> +
> +/* Set powered up bit in OCR */
> +assert(!(sd->ocr & OCR_POWER_UP));
> +sd->ocr |= OCR_POWER_UP;
>  }
>
>  static void sd_set_scr(SDState *sd)
> @@ -449,6 +462,8 @@ static const VMStateDescription sd_vmstate = {
>  .fields = (VMStateField[]) {
>  VMSTATE_UINT32(mode, SDState),
>  VMSTATE_INT32(state, SDState),
> +VMSTATE_UINT32(ocr, SDState),
> +VMSTATE_TIMER_PTR(ocr_power_timer, SDState),
>  VMSTATE_UINT8_ARRAY(cid, SDState, 16),
>  VMSTATE_UINT8_ARRAY(csd, SDState, 16),
>  VMSTATE_UINT16(rca, SDState),
> @@ -493,6 +508,7 @@ SDState *sd_init(BlockBackend *blk, bool is_spi)
>  sd->spi = is_spi;
>  sd->enable = true;
>  sd->blk = blk;
> +sd->ocr_power_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, sd_ocr_powerup, 
> sd);
>  sd_reset(sd);
>  if (sd->blk) {
>  /* Attach dev if not already attached.  (This call ignores an
> @@ -1295,12 +1311,31 @@ static sd_rsp_type_t sd_app_command(SDState *sd,
>  }
>  switch (sd->state) {
>  case sd_idle_state:
> +/* If it's the first ACMD41 since reset, we need to decide
> + * whether to power up. If this is not an enquiry ACMD41,
> + * we immediately report power on and proceed below to the
> + * ready state, but if it is, we set a timer to model a
> + * delay for power up. This works around a bug in EDK2
> + * UEFI, which 

[Qemu-devel] [PATCH v2] ivshmem: Store file descriptor for vhost-user negotiation

2015-12-20 Thread Tetsuya Mukawa
If virtio-net driver allocates memory in ivshmem shared memory,
vhost-net will work correctly, but vhost-user will not work because
a fd of shared memory will not be sent to vhost-user backend.
This patch fixes ivshmem to store file descriptor of shared memory.
It will be used when vhost-user negotiates vhost-user backend.

v2:
* Fixed typo.

Signed-off-by: Tetsuya Mukawa 
---
 exec.c  | 10 ++
 hw/misc/ivshmem.c   |  9 +++--
 include/exec/ram_addr.h |  1 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index 8718a75..7f0ce42 100644
--- a/exec.c
+++ b/exec.c
@@ -1757,6 +1757,16 @@ int qemu_get_ram_fd(ram_addr_t addr)
 return fd;
 }
 
+void qemu_set_ram_fd(ram_addr_t addr, int fd)
+{
+RAMBlock *block;
+
+rcu_read_lock();
+block = qemu_get_ram_block(addr);
+block->fd = fd;
+rcu_read_unlock();
+}
+
 void *qemu_get_ram_block_host_ptr(ram_addr_t addr)
 {
 RAMBlock *block;
diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index f73f0c2..df585de 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -29,6 +29,7 @@
 #include "sysemu/char.h"
 #include "sysemu/hostmem.h"
 #include "qapi/visitor.h"
+#include "exec/ram_addr.h"
 
 #include "hw/misc/ivshmem.h"
 
@@ -422,6 +423,7 @@ static int create_shared_memory_BAR(IVShmemState *s, int 
fd, uint8_t attr,
 
 memory_region_init_ram_ptr(>ivshmem, OBJECT(s), "ivshmem.bar2",
s->ivshmem_size, ptr);
+qemu_set_ram_fd(s->ivshmem.ram_addr, fd);
 vmstate_register_ram(>ivshmem, DEVICE(s));
 memory_region_add_subregion(>bar, 0, >ivshmem);
 
@@ -682,6 +684,7 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 }
 memory_region_init_ram_ptr(>ivshmem, OBJECT(s),
"ivshmem.bar2", s->ivshmem_size, map_ptr);
+qemu_set_ram_fd(s->ivshmem.ram_addr, incoming_fd);
 vmstate_register_ram(>ivshmem, DEVICE(s));
 
 IVSHMEM_DPRINTF("guest h/w addr = %p, size = %" PRIu64 "\n",
@@ -689,7 +692,6 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 
 memory_region_add_subregion(>bar, 0, >ivshmem);
 
-close(incoming_fd);
 return;
 }
 
@@ -991,7 +993,6 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error 
**errp)
 }
 
 create_shared_memory_BAR(s, fd, attr, errp);
-close(fd);
 }
 }
 
@@ -1010,11 +1011,15 @@ static void pci_ivshmem_exit(PCIDevice *dev)
 if (memory_region_is_mapped(>ivshmem)) {
 if (!s->hostmem) {
 void *addr = memory_region_get_ram_ptr(>ivshmem);
+int fd;
 
 if (munmap(addr, s->ivshmem_size) == -1) {
 error_report("Failed to munmap shared memory %s",
  strerror(errno));
 }
+
+if ((fd = qemu_get_ram_fd(s->ivshmem.ram_addr)) != -1)
+close(fd);
 }
 
 vmstate_unregister_ram(>ivshmem, DEVICE(dev));
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index ba4c04d..ef1489d 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -72,6 +72,7 @@ ram_addr_t qemu_ram_alloc_resizeable(ram_addr_t size, 
ram_addr_t max_size,
  void *host),
  MemoryRegion *mr, Error **errp);
 int qemu_get_ram_fd(ram_addr_t addr);
+void qemu_set_ram_fd(ram_addr_t addr, int fd);
 void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
 void qemu_ram_free(ram_addr_t addr);
 
-- 
2.1.4




Re: [Qemu-devel] [PATCH 1/1] static checker: e1000-82540em got aliased to e1000

2015-12-20 Thread Jason Wang


On 12/18/2015 01:35 PM, Amit Shah wrote:
> Commit 8304402033e8dbe8e379017d51ed1dd8344f1dce changed the name of the
> e1000-82540em device to e1000.  This was flagged:
>
>Section "e1000-82540em" does not exist in dest
>
> Add the mapping to the changed section names dictionary so the checker
> can proceed.
>
> Signed-off-by: Amit Shah 
> ---
>  scripts/vmstate-static-checker.py | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/scripts/vmstate-static-checker.py 
> b/scripts/vmstate-static-checker.py
> index b6c0bbe..b5ecaf6 100755
> --- a/scripts/vmstate-static-checker.py
> +++ b/scripts/vmstate-static-checker.py
> @@ -99,6 +99,7 @@ def get_changed_sec_name(sec):
>  # Section names can change -- see commit 292b1634 for an example.
>  changes = {
>  "ICH9 LPC": "ICH9-LPC",
> +"e1000-82540em": "e1000",
>  }
>  
>  for item in changes:

Acked-by: Jason Wang 



Re: [Qemu-devel] [PATCH 04/10] hw/sd: Add QOM bus which SD cards plug in to

2015-12-20 Thread Peter Maydell
On 20 December 2015 at 20:51, Peter Crosthwaite
 wrote:
> On Sun, Dec 20, 2015 at 9:10 AM, Peter Maydell  
> wrote:
>> On 19 December 2015 at 21:38, Peter Crosthwaite
>>  wrote:
>>> On Fri, Dec 11, 2015 at 04:37:05PM +, Peter Maydell wrote:
 +bool sdbus_get_inserted(SDBus *sdbus)
 +{
 +SDState *card = get_card(sdbus);
 +
 +if (card) {
 +SDClass *sc = SD_GET_CLASS(card);
 +
 +return sc->get_inserted(card);
 +}
 +
 +return false;
 +}
>>>
>>> This I am not sure about. Realistically, the card has no self
>>> awareness of its ejection state so it can't be polled for "are
>>> you there". The card insertion switch is implemented as a
>>> physical switch on the slot itself and a property of the bus.
>>>
>>> The absence on presence of a device should determine this, making me
>>> think this should return !!card.
>>>
>>> Unfortunately, we have the case of the block UI being able to trigger a
>>> card ejection from underneath the bus level. But since the SD card is 
>>> already
>>> using qdev_get_parent_bus() the removal from the bus can be managed at the
>>> card level.
>>
>> For user-level back compat I think we need to retain "might have
>> an sdcard object with no block backend, and that means
>> 'no-card-present'". This is both for the user facing
>> monitor commands to manipulate the sd card, and also
>
> What are the user-facing monitor commands? I tried using "change" and
> "eject", but they don't seem to work for SD, due to the tray being
> closed. Has this ever worked in a way that is user manipulatable for
> SD or is it just to handle the case of unconditional SD card creation
> (with the card never hotplugging over the system lifetime)?

I admit to just assuming that this stuff worked rather than
testing it :-)

thanks
-- PMM



[Qemu-devel] [PATCH 1/2] ppc: Move HPTE size parsing code to target-ppc helper (and add 64kiB pages)

2015-12-20 Thread David Gibson
The h_enter() hypercall implementation in spapr_hcall.c contains some code
to determine the page size of the newly inserted hash PTE.  This is
surprisingly complex in the general case, because POWER CPUs can have
different implementation dependent ways of encoding page sizes.  The
current version assumes POWER7 or POWER8 and doesn't support all the
possibilities of even those (but this is advertised correctly in the device
tree and so works with guests).

To support the possibility of future CPUs with different options, however,
this really belongs in the core ppc mmu code, rather than the spapr
("pseries" machine type) specific code.

So, move this code to a helper in target-ppc.

At the same time, uncomment some code to allow 64kiB pages.  At the time
that was added, I believe other parts of the ppc mmu code didn't handle
64kiB pages, but that's since been fixed.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_hcall.c| 26 --
 target-ppc/mmu-hash64.c | 31 +++
 target-ppc/mmu-hash64.h |  3 +++
 3 files changed, 42 insertions(+), 18 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index cebceea..c346e97 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -93,28 +93,18 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 target_ulong pte_index = args[1];
 target_ulong pteh = args[2];
 target_ulong ptel = args[3];
-target_ulong page_shift = 12;
+int page_shift;
+uint64_t avpnm;
 target_ulong raddr;
 target_ulong index;
 uint64_t token;
 
-/* only handle 4k and 16M pages for now */
-if (pteh & HPTE64_V_LARGE) {
-#if 0 /* We don't support 64k pages yet */
-if ((ptel & 0xf000) == 0x1000) {
-/* 64k page */
-} else
-#endif
-if ((ptel & 0xff000) == 0) {
-/* 16M page */
-page_shift = 24;
-/* lowest AVA bit must be 0 for 16M pages */
-if (pteh & 0x80) {
-return H_PARAMETER;
-}
-} else {
-return H_PARAMETER;
-}
+if (ppc_hash64_hpte_shift(pteh, ptel, _shift, ) < 0) {
+return H_PARAMETER;
+}
+
+if (HPTE64_V_AVPN_VAL(pteh) & avpnm) {
+return H_PARAMETER;
 }
 
 raddr = (ptel & HPTE64_R_RPN) & ~((1ULL << page_shift) - 1);
diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index 34e20fa..d3b895d 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -399,6 +399,37 @@ static uint64_t ppc_hash64_page_shift(ppc_slb_t *slb)
 return epnshift;
 }
 
+/* Returns the page shift of the given hpte.  This is the "base"
+ * shift, used for hashing.  Although we don't implement it yet, the
+ * architecture allows the actuall mapped page to be larger, but
+ * determining that size requires information from the slb. */
+int ppc_hash64_hpte_shift(uint64_t pte0, uint64_t pte1,
+  int *shift, uint64_t *avpnm)
+{
+*shift = 12; /* 4kiB default */
+
+/* only handle 4k and 16M pages for now */
+if (pte0 & HPTE64_V_LARGE) {
+if ((pte1 & 0xf000) == 0x1000) {
+*shift = 16; /* 64 kiB */
+} else if ((pte1 & 0xff000) == 0) {
+*shift = 24; /* 16MiB */
+} else {
+return -EINVAL;
+}
+}
+
+if (avpnm) {
+if (*shift <= 23) {
+*avpnm = 0;
+} else {
+*avpnm = (1ULL << (*shift - 23)) - 1;
+}
+}
+
+return 0;
+}
+
 static hwaddr ppc_hash64_htab_lookup(CPUPPCState *env,
  ppc_slb_t *slb, target_ulong eaddr,
  ppc_hash_pte64_t *pte)
diff --git a/target-ppc/mmu-hash64.h b/target-ppc/mmu-hash64.h
index 291750f..71d2532 100644
--- a/target-ppc/mmu-hash64.h
+++ b/target-ppc/mmu-hash64.h
@@ -117,6 +117,9 @@ typedef struct {
 uint64_t pte0, pte1;
 } ppc_hash_pte64_t;
 
+int ppc_hash64_hpte_shift(uint64_t pte0, uint64_t pte1,
+  int *shift, uint64_t *avpnm);
+
 #endif /* CONFIG_USER_ONLY */
 
 #endif /* !defined (__MMU_HASH64_H__) */
-- 
2.5.0




[Qemu-devel] [PATCH 1/1] qga: guest-set-user-password - added ability to create new user

2015-12-20 Thread Denis V. Lunev
From: Yuri Pudgorodskiy 

Added optional 'create' flag to guest-set-user-password command.
When it is specified, a new user will be created if it is not
exists yet.

The option to the existing command is added as password for newly created
user should be set as specified.

Signed-off-by: Yuri Pudgorodskiy 
Signed-off-by: Denis V. Lunev 
CC: Michael Roth 
---
 qga/commands-posix.c | 36 
 qga/commands-win32.c | 25 -
 qga/qapi-schema.json |  3 ++-
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index c2ff970..5f60f08 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -1951,6 +1951,8 @@ int64_t qmp_guest_set_vcpus(GuestLogicalProcessorList 
*vcpus, Error **errp)
 void qmp_guest_set_user_password(const char *username,
  const char *password,
  bool crypted,
+ bool has_create,
+ bool create,
  Error **errp)
 {
 Error *local_err = NULL;
@@ -1993,6 +1995,40 @@ void qmp_guest_set_user_password(const char *username,
 goto out;
 }
 
+/* create new user if requested */
+if (has_create && create) {
+pid = fork();
+if (pid == 0) {
+char *str = g_shell_quote(username);
+char *cmd = g_strdup_printf("id %s || useradd -m %s", str, str);
+setsid();
+reopen_fd_to_null(0);
+reopen_fd_to_null(1);
+reopen_fd_to_null(2);
+execle("/bin/sh", "sh", "-c", cmd, NULL, environ);
+_exit(EXIT_FAILURE);
+} else if (pid < 0) {
+error_setg_errno(errp, errno, "failed to create child process");
+goto out;
+}
+
+ga_wait_child(pid, , _err);
+if (local_err) {
+error_propagate(errp, local_err);
+goto out;
+}
+
+if (!WIFEXITED(status)) {
+error_setg(errp, "child process has terminated abnormally");
+goto out;
+}
+
+if (WEXITSTATUS(status)) {
+error_setg(errp, "child process has failed to add new user");
+goto out;
+}
+}
+
 pid = fork();
 if (pid == 0) {
 close(datafd[1]);
diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 0654fe4..5269f8b 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -1281,6 +1281,8 @@ get_net_error_message(gint error)
 void qmp_guest_set_user_password(const char *username,
  const char *password,
  bool crypted,
+ bool has_create,
+ bool create,
  Error **errp)
 {
 NET_API_STATUS nas;
@@ -1301,6 +1303,27 @@ void qmp_guest_set_user_password(const char *username,
 user = g_utf8_to_utf16(username, -1, NULL, NULL, NULL);
 wpass = g_utf8_to_utf16(rawpasswddata, -1, NULL, NULL, NULL);
 
+if (has_create && create) {
+USER_INFO_1 ui = { 0 };
+
+ui.usri1_name = user;
+ui.usri1_password = wpass;
+ui.usri1_priv = USER_PRIV_USER;
+ui.usri1_flags = UF_SCRIPT | UF_DONT_EXPIRE_PASSWD;
+nas = NetUserAdd(NULL, 1, (LPBYTE), NULL);
+
+if (nas == NERR_Success) {
+goto out;
+}
+
+if (nas != NERR_UserExists) {
+gchar *msg = get_net_error_message(nas);
+error_setg(errp, "failed to add user: %s", msg);
+g_free(msg);
+goto out;
+}
+}
+
 pi1003.usri1003_password = wpass;
 nas = NetUserSetInfo(NULL, user,
  1003, (LPBYTE),
@@ -1311,7 +1334,7 @@ void qmp_guest_set_user_password(const char *username,
 error_setg(errp, "failed to set password: %s", msg);
 g_free(msg);
 }
-
+out:
 g_free(user);
 g_free(wpass);
 g_free(rawpasswddata);
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 01c9ee4..2f97862 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -787,6 +787,7 @@
 # @username: the user account whose password to change
 # @password: the new password entry string, base64 encoded
 # @crypted: true if password is already crypt()d, false if raw
+# @create: #optional user will be created if not exists (since 2.6)
 #
 # If the @crypted flag is true, it is the caller's responsibility
 # to ensure the correct crypt() encryption scheme is used. This
@@ -806,7 +807,7 @@
 # Since 2.3
 ##
 { 'command': 'guest-set-user-password',
-  'data': { 'username': 'str', 'password': 'str', 'crypted': 'bool' } }
+  'data': { 'username': 'str', 'password': 'str', 'crypted': 'bool', 
'*create': 'bool' } }
 
 # @GuestMemoryBlock:
 #
-- 
2.1.4




Re: [Qemu-devel] [PATCH 2/2] ppc: Allow 64kiB pages for POWER8 in TCG

2015-12-20 Thread Benjamin Herrenschmidt
On Mon, 2015-12-21 at 13:41 +1100, David Gibson wrote:
> Now that the spapr code has been extended to support 64kiB pages, we can
> allow guests to use 64kiB pages on an emulated POWER8 by adding it to the
> "segment_page_sizes" structure which is advertised via the device tree.
> 
> For now we just add support for 64kiB pages in 64kiB page segments.  Real
> POWER8 also supports 64kiB pages in 4kiB page segments, but that will
> require more work to implement.
> 
> Real POWER7s (and maybe some other CPU models) also support 64kiB pages,
> however, I don't want to add support there without double checking if they
> use the same HPTE and SLB encodings (in principle these are implementation
> dependent).

Don't I have a patch for that in my series already ? IIRC, I had
another fix somewhere...

https://github.com/ozbenh/qemu/commit/6c9dc80db82ed87d5ee77852930348232
f5f9db2

Cheers,
Ben.


> Signed-off-by: David Gibson 
> ---
>  target-ppc/translate_init.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index e88dc7f..ae5a269 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -8200,6 +8200,22 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(oc);
>  PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
> +static const struct ppc_segment_page_sizes POWER8_sps = {
> +.sps = {
> +{ .page_shift = 12, /* 4K */
> +  .slb_enc = 0,
> +  .enc = { { .page_shift = 12, .pte_enc = 0 } }
> +},
> +{ .page_shift = 16, /* 64K */
> +  .slb_enc = 0x110,
> +  .enc = { { .page_shift = 16, .pte_enc = 0x1 } }
> +},
> +{ .page_shift = 24, /* 16M */
> +  .slb_enc = 0x100,
> +  .enc = { { .page_shift = 24, .pte_enc = 0 } }
> +},
> +}
> +};
>  
>  dc->fw_name = "PowerPC,POWER8";
>  dc->desc = "POWER8";
> @@ -8258,6 +8274,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
>  pcc->l1_dcache_size = 0x8000;
>  pcc->l1_icache_size = 0x8000;
>  pcc->interrupts_big_endian = ppc_cpu_interrupts_big_endian_lpcr;
> +pcc->sps = _sps;
>  }
>  #endif /* defined (TARGET_PPC64) */
>  


[Qemu-devel] ?????? ?????? someconfusion on qemu i/o pocess and the qcow2format

2015-12-20 Thread ??????
I want to change the queue_size of virtio-blk ring from 128 to 1024, so I  
change here:
s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
but the vm will not work. Are any thing i missed ?

Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit

2015-12-20 Thread Pavel Fedin
 Hello!

 Replying to everything in one message.

> > As far as i understand this code, KVM_EXIT_HYPERV is called when one
> > of three MSRs are accessed. But, shouldn't we have implemented
> > instead something more generic, like KVM_EXIT_REG_IO, which would
> > work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> > code and value?
> 
> Yes, we considered that.  There were actually patches for this as well.

 Ah, i missed them, what a pity. There are lots of patches, i don't review them 
all. Actually i have noticed the change only after
it appeared in linux-next.

>  However, in this case the register is still emulated in the kernel, and
> userspace just gets informed of the new value.

 I see, but this doesn't change the semantic. All we need to do is to tell the 
userland that "register has been written".
Additionally to this we could do whatever we want, including caching the data 
in kernel, using it in kernel, and processing reads in
kernel.

> If we do get that, we will just rename KVM_EXIT_HYPERV to
> KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
> KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.

 Actually, i see this in more generic way, something like:

/* KVM_EXIT_REG_ACCESS */
struct {
__u64 reg;
__u64 data;
__u8  is_write;
} mmio;
 
 'data' and 'is_write' are self-explanatory, 'reg' would be generalized 
register code, the same as used for KVM_(GET|SET)_ONE_REG:
 - for ARM64: ARM64_SYS_REG(op0, op1, crn, crm, op2) - see
http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/kvm.h#L189
 - for x86  : to be defined (i know, we don't use ..._ONE_REG operations here 
yet), like X86_MSR_REG(id), where the macro itself is:

#define X86_MSR_REG(id) (KVM_REG_X86 | KVM_REG_X86_MSR | 
KVM_REG_SIZE_U64 | (id))

 - for other architectures: to be defined in a similar way, once needed.

> On brief inspection of Andrey's patch (I have not been following
> closely) it looks like the kvm_hyperv_exit struct that's returned to
> userspace contains more data (control, evt_page, and msg_page fields)
> than simply the value of the MSR, so would the desired SynIC exit fit
> into a general-purpose exit for MSR emulation?

 I have looked at the code too, and these three fields are nothing more than 
values of respective MSR's:

case HV_X64_MSR_SCONTROL:
synic->control = data;
if (!host)
synic_exit(synic, msr);
break;



case HV_X64_MSR_SIEFP:
if (data & HV_SYNIC_SIEFP_ENABLE)
if (kvm_clear_guest(vcpu->kvm,
data & PAGE_MASK, PAGE_SIZE)) {
ret = 1;
break;
}
synic->evt_page = data;
if (!host)
synic_exit(synic, msr);
break;
case HV_X64_MSR_SIMP:
if (data & HV_SYNIC_SIMP_ENABLE)
if (kvm_clear_guest(vcpu->kvm,
data & PAGE_MASK, PAGE_SIZE)) {
ret = 1;
break;
}
synic->msg_page = data;
if (!host)
synic_exit(synic, msr);
break;

 So, every time one of these thee MSRs is written, we get a vmexit with values 
of all three registers, and that's all. We could
easily have 'synic_exit(synic, msr, data)' in all three cases, and i think the 
userspace could easily deal with proposed
KVM_EXIT_REG_ACCESS, just cache these values internally if needed.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia





Re: [Qemu-devel] [PATCH 2/3] sdhci: don't raise a command index error for an unexpected response

2015-12-20 Thread Peter Crosthwaite
On Wed, Dec 16, 2015 at 11:47 AM, Andrew Baumann
 wrote:
> This deletes a block of code that raised a command index error if a
> command returned response data, but the guest did not set the
> appropriate bits in the response register to handle such a response. I
> cannot find any documentation that suggests the controller should
> behave in this way, the error code doesn't make sense (command index
> error is defined for the case where the index in a response does not
> match that of the issued command), and in at least one case (CMD23
> issued by UEFI on Raspberry Pi 2), actual hardware does not do this.
>
> Signed-off-by: Andrew Baumann 

Reviewed-by: Peter Crosthwaite 

All I can think of, is the original author is assuming that any
mismatch in response length can only occur on a mismatch of commands.
If we really need _CMDIDX it should be done directly as a check of the
command index in the response.

Regards,
Peter

> ---
>  hw/sd/sdhci.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
> index bc39d48..dd83e89 100644
> --- a/hw/sd/sdhci.c
> +++ b/hw/sd/sdhci.c
> @@ -243,9 +243,6 @@ static void sdhci_send_command(SDHCIState *s)
>  (s->cmdreg & SDHC_CMD_RESPONSE) == SDHC_CMD_RSP_WITH_BUSY) {
>  s->norintsts |= SDHC_NIS_TRSCMP;
>  }
> -} else if (rlen != 0 && (s->errintstsen & SDHC_EISEN_CMDIDX)) {
> -s->errintsts |= SDHC_EIS_CMDIDX;
> -s->norintsts |= SDHC_NIS_ERR;
>  }
>
>  if (s->norintstsen & SDHC_NISEN_CMDCMP) {
> --
> 2.5.3
>



Re: [Qemu-devel] [PATCH] virtio-9p: use accessor to get thread_pool

2015-12-20 Thread Michael Tokarev
20.12.2015 14:19, Greg Kurz wrote:
> The aio_context_new() function does not allocate a thread pool. This is
> deferred to the first call to the aio_get_thread_pool() accessor. It is
> hence forbidden to access the thread_pool field directly, as it may be
> NULL. The accessor *must* be used always.
> 
> Fixes: ebac1202c95a4f1b76b6ef3f0f63926fa76e753e
> Signed-off-by: Greg Kurz 

Reviewed-by: Michael Tokarev 
Tested-by: Michael Tokarev 
Cc: qemu-sta...@nongnu.org

> Michael,
> 
> I did not have time to reproduce the issue, and therefore to
> effectively test this patch fixes it. Maybe you can do it ?

Yes, it effectively fixes the issue.

Meanwhile I looked at the whole thing with how thread pools are
allocated, and now can add my R-b too.

Thank you!

/mjt



Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform

2015-12-20 Thread Gonglei (Arei)

> -Original Message-
> From: Kevin O'Connor [mailto:ke...@koconnor.net]
> Sent: Saturday, December 19, 2015 11:12 PM
> On Sat, Dec 19, 2015 at 12:03:15PM +, Gonglei (Arei) wrote:
> > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> interrupt,
> > And then execute interrupt handler, but the interrupt handler make the
> SeaBIOS
> > stack broken, so that the BSP can't execute the instruction and occur
> exception,
> > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs 
> > except
> > the surface phenomenon.
> 
> I can't see any reason why allowing interrupts at this location would
> be a problem.
> 
Does it have any relationship with *extra stack* of SeaBIOS?

> > Kevin, can we drop yield() in smp_setup() ?
> 
> It's possible to eliminate this instance of yield, but I think it
> would just push the crash to the next time interrupts are enabled.
> 
Perhaps. I'm not sure.

> > Is it really useful and allowable for SeaBIOS? Maybe for other components?
> > I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
> > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
> > the current problem.
> 
> If you apply the patches you had to prevent that NMI crash problem,
> does it also prevent the above crash?
> 
Yes, but we cannot prevent the NMI injection (though I'll submit some patches to
forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).


Regards,
-Gonglei



Re: [Qemu-devel] [PATCH 2/5] igd-passthrough-i440FX: convert to realize()

2015-12-20 Thread Cao jin



On 12/19/2015 05:18 AM, Markus Armbruster wrote:

One short remark in addition to Eduardo's review.

Eduardo Habkost  writes:


[...]


  config_fd = open(path, O_RDWR);
  if (config_fd < 0) {
+error_setg(errp, "No such device");
  return -ENODEV;
  }


Can we come up with nicer error messages?



Ok...will change the error msg to strerror(errno)


[...]


.



--
Yours Sincerely,

Cao Jin





Re: [Qemu-devel] [PATCH 02/10] hw/sd/sd.c: QOMify

2015-12-20 Thread Peter Crosthwaite
On Sun, Dec 20, 2015 at 9:07 AM, Peter Maydell  wrote:
> On 19 December 2015 at 21:36, Peter Crosthwaite
>  wrote:
>> On Fri, Dec 11, 2015 at 04:37:03PM +, Peter Maydell wrote:
>>> Turn the SD card into a QOM device.
>>> This conversion only changes the device itself; the various
>>> functions which are effectively methods on the device are not
>>> touched at this point.
>
>>> +#define TYPE_SD "sd"
>>
>> Can we get "card" in there? I think unqualified SD should be usable
>> to refer to the bus standard.
>
> Sure. I don't have strong opinions on any of the type names
> in this series, and I was wondering about sticking 'card' in.
> "sdcard" or "sd-card" ?
>

I think "sd-card"

Regards,
Peter

> thanks
> -- PMM



Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/3] ohci reset improvements

2015-12-20 Thread Mark Cave-Ayland
On 19/12/15 23:23, Hervé Poussineau wrote:

> Hi,
> 
> This small patchset adds different reset levels to OHCI USB controller.
> The idea is from Benjamin Herrenschmidt.
> Most significant change is that MacOS 9 doesn't try anymore to do some
> DMA transfer to address 0, and boots a little bit further.
> 
> Hervé
> 
> Hervé Poussineau (3):
>   ohci: split reset method in 3 parts
>   ohci: fix Host Controller USBRESET
>   ohci: fix command HostControllerReset
> 
>  hw/usb/hcd-ohci.c | 64 
> ---
>  1 file changed, 37 insertions(+), 27 deletions(-)

I've given this patchset a test against my GSOC OS 9 images and it seems
to help against USB crashes (although there are still other extensions
that crash on boot) while not regressing my other images, so:

Tested-by: Mark Cave-Ayland 


ATB,

Mark.




Re: [Qemu-devel] [PATCH v2] Use macro instead of plain text

2015-12-20 Thread Cao jin



On 12/18/2015 08:52 PM, Markus Armbruster wrote:

Cao jin  writes:


There is TYPE_ICH9_AHCI definition in ahci.h when QOMify it, seems it
is missed.

Signed-off-by: Cao jin 
---
  hw/i386/pc_q35.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 9a12068..aa34a07 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -249,7 +249,7 @@ static void pc_q35_init(MachineState *machine)
  ahci = pci_create_simple_multifunction(host_bus,
 PCI_DEVFN(ICH9_SATA1_DEV,
   ICH9_SATA1_FUNC),
-   true, "ich9-ahci");
+   true, TYPE_ICH9_AHCI);
  idebus[0] = qdev_get_child_bus(>qdev, "ide.0");
  idebus[1] = qdev_get_child_bus(>qdev, "ide.1");
  g_assert(MAX_SATA_PORTS == ICH_AHCI(ahci)->ahci.ports);


I suspect there are more instances of the same problem.

To find all similar macro definitions:

 $ git-grep -Eh '^ *# *define +TYPE_[A-Za-z0-9_]+ +"'

Extract the strings:

 $ git-grep -Eh '^ *# *define +TYPE_[A-Za-z0-9_]+ +"' | sed 
's/^[^"]*\("[^"]*"\).*/\1/'

Search the source for them:

 $ for i in `git-grep -Eh '^ *# *define +TYPE_[A-Za-z0-9_]+ +"' | sed 
's/^[^"]*\("[^"]*"\).*/\1/'`; do echo "= $i ="; git-grep -F "$i"; done

Results in more than 1000 hits for me.  Names like "device" are of
course very prone to false positives.



Yes, often seeing "plain text" style when browsing code, and after 
seeing PMM`s explanation, I finally see why there are lots of these...



Not sure how interested we are in consistent use of these macros...



I didn`t pay any attention to this before. It comes to me when I make 
the V2 "msi/msix" patch. as I said in that cover-letter: special case:)


we can forget this patch now.



.



--
Yours Sincerely,

Cao Jin





Re: [Qemu-devel] [PATCH v2 26/74] pc: acpi: memhp: move MHPD._STA method into SSDT

2015-12-20 Thread Marcel Apfelbaum

On 12/16/2015 04:47 PM, Igor Mammedov wrote:

Signed-off-by: Igor Mammedov 
---
v2:
  - add parentheses around ifctx block
Suggested-by: Marcel Apfelbaum 
---
  hw/acpi/memory_hotplug_acpi_table.c | 14 ++
  hw/i386/acpi-dsdt-mem-hotplug.dsl   |  8 
  2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/hw/acpi/memory_hotplug_acpi_table.c 
b/hw/acpi/memory_hotplug_acpi_table.c
index 9b61b1c..e1a6551 100644
--- a/hw/acpi/memory_hotplug_acpi_table.c
+++ b/hw/acpi/memory_hotplug_acpi_table.c
@@ -29,11 +29,25 @@ void build_memory_hotplug_aml(Aml *ctx, uint32_t nr_mem,
  {
  Aml *pci_scope;
  Aml *ctrl_dev;
+Aml *method;
+Aml *ifctx;
+Aml *a_zero = aml_int(0);
+Aml *a_slots_nr = aml_name(stringify(MEMORY_SLOTS_NUMBER));

  /* scope for memory hotplug controller device node */
  pci_scope = aml_scope("_SB.PCI0");
  ctrl_dev = aml_scope(stringify(MEMORY_HOTPLUG_DEVICE));
  {
+/* MHPD._STA() method */
+method = aml_method("_STA", 0, AML_NOTSERIALIZED);
+ifctx = aml_if(aml_equal(a_slots_nr, a_zero));
+{
+ aml_append(ifctx, aml_return(a_zero));
+}
+aml_append(method, ifctx);
+/* present, functioning, decoding, not shown in UI */
+aml_append(method, aml_return(aml_int(0xB)));
+aml_append(ctrl_dev, method);
  }
  aml_append(pci_scope, ctrl_dev);
  aml_append(ctx, pci_scope);
diff --git a/hw/i386/acpi-dsdt-mem-hotplug.dsl 
b/hw/i386/acpi-dsdt-mem-hotplug.dsl
index c2bb6a1..b4eacc9 100644
--- a/hw/i386/acpi-dsdt-mem-hotplug.dsl
+++ b/hw/i386/acpi-dsdt-mem-hotplug.dsl
@@ -35,14 +35,6 @@
  External(MEMORY_SLOT_OST_EVENT, FieldUnitObj) // _OST event code, 
write only
  External(MEMORY_SLOT_OST_STATUS, FieldUnitObj) // _OST status 
code, write only

-Method(_STA, 0) {
-If (LEqual(MEMORY_SLOTS_NUMBER, Zero)) {
-Return(0x0)
-}
-/* present, functioning, decoding, not shown in UI */
-Return(0xB)
-}
-
  Mutex (MEMORY_SLOT_LOCK, 0)

  Method(MEMORY_SLOT_SCAN_METHOD, 0) {




Reviewed-by: Marcel Apfelbaum 



[Qemu-devel] [PATCH] virtio-9p: use accessor to get thread_pool

2015-12-20 Thread Greg Kurz
The aio_context_new() function does not allocate a thread pool. This is
deferred to the first call to the aio_get_thread_pool() accessor. It is
hence forbidden to access the thread_pool field directly, as it may be
NULL. The accessor *must* be used always.

Fixes: ebac1202c95a4f1b76b6ef3f0f63926fa76e753e
Signed-off-by: Greg Kurz 
---

Michael,

I did not have time to reproduce the issue, and therefore to
effectively test this patch fixes it. Maybe you can do it ?

Thanks.

--
Greg

 hw/9pfs/virtio-9p-coth.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/9pfs/virtio-9p-coth.c b/hw/9pfs/virtio-9p-coth.c
index fb6e8f80e0f4..ab9425c60fd2 100644
--- a/hw/9pfs/virtio-9p-coth.c
+++ b/hw/9pfs/virtio-9p-coth.c
@@ -36,6 +36,6 @@ static int coroutine_enter_func(void *arg)
 void co_run_in_worker_bh(void *opaque)
 {
 Coroutine *co = opaque;
-thread_pool_submit_aio(qemu_get_aio_context()->thread_pool,
+thread_pool_submit_aio(aio_get_thread_pool(qemu_get_aio_context()),
coroutine_enter_func, co, coroutine_enter_cb, co);
 }




Re: [Qemu-devel] [PATCH 0/2] checkpatch: Fixing two cases of false positives in checkpatch.pl

2015-12-20 Thread Leonid Bloch
ping

http://patchwork.ozlabs.org/patch/537763
http://patchwork.ozlabs.org/patch/537762

On Tue, Nov 24, 2015 at 4:43 PM, Leonid Bloch  wrote:

> ping
>
> http://patchwork.ozlabs.org/patch/537763
> http://patchwork.ozlabs.org/patch/537762
>
> On Tue, Nov 10, 2015 at 11:48 AM, Leonid Bloch  wrote:
> > ping
> >
> > http://patchwork.ozlabs.org/patch/537763
> > http://patchwork.ozlabs.org/patch/537762
> >
> > On Thu, Oct 29, 2015 at 11:48 AM, Leonid Bloch 
> wrote:
> >> This series addresses two cases where errors were printed if whitespaces
> >> appeared in front of a square bracket in places where there should be no
> >> problem with such placements (please see messages of individual
> commits).
> >>
> >> Leonid Bloch (2):
> >>   checkpatch: Eliminate false positive in case of comma-space-square
> >> bracket
> >>   checkpatch: Eliminate false positive in case of space before square
> >> bracket in a definition
> >>
> >>  scripts/checkpatch.pl | 6 +-
> >>  1 file changed, 5 insertions(+), 1 deletion(-)
> >>
> >> --
> >> 2.4.3
> >>
>


Re: [Qemu-devel] [PATCH v2] Use macro instead of plain text

2015-12-20 Thread Cao jin

Hi Peter,

Thanks for the explanation, it make me aware that why there are lots of 
"plain_text_for_device":)


forget it please.

On 12/18/2015 09:01 PM, Peter Maydell wrote:

On 18 December 2015 at 12:52, Markus Armbruster  wrote:

Cao jin  writes:


There is TYPE_ICH9_AHCI definition in ahci.h when QOMify it, seems it
is missed.



I suspect there are more instances of the same problem.



Not sure how interested we are in consistent use of these macros...


In a lot of cases we'd need to find somewhere appropriate to put the
TYPE_* macro which might well mean creating a new header file just
for the purpose. That seems like overkill...

thanks
-- PMM





--
Yours Sincerely,

Cao Jin





Re: [Qemu-devel] [PATCH 3/5] PXB: convert to realize()

2015-12-20 Thread Marcel Apfelbaum


Hi,

On 12/18/2015 01:03 PM, Cao jin wrote:

Signed-off-by: Cao jin 
---
  hw/pci-bridge/pci_expander_bridge.c | 24 ++--
  1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index 57f8a37..cc975f6 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -145,19 +145,19 @@ static const TypeInfo pxb_host_info = {
   * Returns 0 on successs, -1 if i440fx host was not
   * found or the bus number is already in use.
   */
-static int pxb_register_bus(PCIDevice *dev, PCIBus *pxb_bus)
+static int pxb_register_bus(PCIDevice *dev, PCIBus *pxb_bus, Error **errp)


If you add an err parameter, maybe the function should return void.



  {
  PCIBus *bus = dev->bus;
  int pxb_bus_num = pci_bus_num(pxb_bus);

  if (bus->parent_dev) {
-error_report("PXB devices can be attached only to root bus.");
+error_setg(errp, "PXB devices can be attached only to root bus.");
  return -1;
  }

  QLIST_FOREACH(bus, >child, sibling) {
  if (pci_bus_num(bus) == pxb_bus_num) {
-error_report("Bus %d is already in use.", pxb_bus_num);
+error_setg(errp, "Bus %d is already in use.", pxb_bus_num);
  return -1;
  }
  }
@@ -193,7 +193,7 @@ static gint pxb_compare(gconstpointer a, gconstpointer b)
 0;
  }

-static int pxb_dev_initfn(PCIDevice *dev)
+static void pxb_dev_realize(PCIDevice *dev, Error **errp)
  {
  PXBDev *pxb = PXB_DEV(dev);
  DeviceState *ds, *bds;
@@ -202,8 +202,8 @@ static int pxb_dev_initfn(PCIDevice *dev)

  if (pxb->numa_node != NUMA_NODE_UNASSIGNED &&
  pxb->numa_node >= nb_numa_nodes) {
-error_report("Illegal numa node %d.", pxb->numa_node);
-return -EINVAL;
+error_setg(errp, "Illegal numa node %d.", pxb->numa_node);
+return;
  }

  if (dev->qdev.id && *dev->qdev.id) {
@@ -225,8 +225,8 @@ static int pxb_dev_initfn(PCIDevice *dev)

  PCI_HOST_BRIDGE(ds)->bus = bus;

-if (pxb_register_bus(dev, bus)) {
-return -EINVAL;
+if (pxb_register_bus(dev, bus, errp)) {
+goto err_register_bus;
  }

  qdev_init_nofail(ds);
@@ -237,7 +237,11 @@ static int pxb_dev_initfn(PCIDevice *dev)
  pci_config_set_class(dev->config, PCI_CLASS_BRIDGE_HOST);

  pxb_dev_list = g_list_insert_sorted(pxb_dev_list, pxb, pxb_compare);
-return 0;
+
+err_register_bus:
+object_unref(OBJECT(ds));
+object_unref(OBJECT(bds));
+object_unref(OBJECT(bus));



The order should be in the reverse order of creation:
bds, bus, ds



  }

  static void pxb_dev_exitfn(PCIDevice *pci_dev)
@@ -259,7 +263,7 @@ static void pxb_dev_class_init(ObjectClass *klass, void 
*data)
  DeviceClass *dc = DEVICE_CLASS(klass);
  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

-k->init = pxb_dev_initfn;
+k->realize = pxb_dev_realize;
  k->exit = pxb_dev_exitfn;


If init is converted to realize, maybe the exit should be converted to 
unrealize?


Thanks,
Marcel


  k->vendor_id = PCI_VENDOR_ID_REDHAT;
  k->device_id = PCI_DEVICE_ID_REDHAT_PXB;






Re: [Qemu-devel] [PATCH 3/5] PXB: convert to realize()

2015-12-20 Thread Marcel Apfelbaum

On 12/20/2015 12:48 PM, Cao jin wrote:

Hi,

On 12/20/2015 06:22 PM, Marcel Apfelbaum wrote:


Hi,

On 12/18/2015 01:03 PM, Cao jin wrote:

Signed-off-by: Cao jin 
---
  hw/pci-bridge/pci_expander_bridge.c | 24 ++--
  1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c
b/hw/pci-bridge/pci_expander_bridge.c
index 57f8a37..cc975f6 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -145,19 +145,19 @@ static const TypeInfo pxb_host_info = {
   * Returns 0 on successs, -1 if i440fx host was not
   * found or the bus number is already in use.
   */
-static int pxb_register_bus(PCIDevice *dev, PCIBus *pxb_bus)
+static int pxb_register_bus(PCIDevice *dev, PCIBus *pxb_bus, Error
**errp)


If you add an err parameter, maybe the function should return void.



Ok, will modify it in V2. Actually, both style are fine with me:)




  {
  PCIBus *bus = dev->bus;
  int pxb_bus_num = pci_bus_num(pxb_bus);

  if (bus->parent_dev) {
-error_report("PXB devices can be attached only to root bus.");
+error_setg(errp, "PXB devices can be attached only to root
bus.");
  return -1;
  }

  QLIST_FOREACH(bus, >child, sibling) {
  if (pci_bus_num(bus) == pxb_bus_num) {
-error_report("Bus %d is already in use.", pxb_bus_num);
+error_setg(errp, "Bus %d is already in use.", pxb_bus_num);
  return -1;
  }
  }
@@ -193,7 +193,7 @@ static gint pxb_compare(gconstpointer a,
gconstpointer b)
 0;
  }

-static int pxb_dev_initfn(PCIDevice *dev)
+static void pxb_dev_realize(PCIDevice *dev, Error **errp)
  {
  PXBDev *pxb = PXB_DEV(dev);
  DeviceState *ds, *bds;
@@ -202,8 +202,8 @@ static int pxb_dev_initfn(PCIDevice *dev)

  if (pxb->numa_node != NUMA_NODE_UNASSIGNED &&
  pxb->numa_node >= nb_numa_nodes) {
-error_report("Illegal numa node %d.", pxb->numa_node);
-return -EINVAL;
+error_setg(errp, "Illegal numa node %d.", pxb->numa_node);
+return;
  }

  if (dev->qdev.id && *dev->qdev.id) {
@@ -225,8 +225,8 @@ static int pxb_dev_initfn(PCIDevice *dev)

  PCI_HOST_BRIDGE(ds)->bus = bus;

-if (pxb_register_bus(dev, bus)) {
-return -EINVAL;
+if (pxb_register_bus(dev, bus, errp)) {
+goto err_register_bus;
  }

  qdev_init_nofail(ds);
@@ -237,7 +237,11 @@ static int pxb_dev_initfn(PCIDevice *dev)
  pci_config_set_class(dev->config, PCI_CLASS_BRIDGE_HOST);

  pxb_dev_list = g_list_insert_sorted(pxb_dev_list, pxb,
pxb_compare);
-return 0;
+
+err_register_bus:
+object_unref(OBJECT(ds));
+object_unref(OBJECT(bds));
+object_unref(OBJECT(bus));



The order should be in the reverse order of creation:
 bds, bus, ds



Ok, I can do that. But it seems the order here doesn`t matter? Is there 
dependency among these three?


Yes, there is a dependency:
At first the pxb host (ds) is created, then the bus (bus) is created as host's 
child (see pci_bus_new)
and in the end a pci bridge (bds) is attached to the bus (see qdev_create).

By the way, indeed you should call object_unparent at least for the 
pxb_host(ds), but you may want to
check the others parent relationship as well.






  }

  static void pxb_dev_exitfn(PCIDevice *pci_dev)
@@ -259,7 +263,7 @@ static void pxb_dev_class_init(ObjectClass *klass,
void *data)
  DeviceClass *dc = DEVICE_CLASS(klass);
  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

-k->init = pxb_dev_initfn;
+k->realize = pxb_dev_realize;
  k->exit = pxb_dev_exitfn;


If init is converted to realize, maybe the exit should be converted to
unrealize?



Yup, I agree with you from the point that the names should be antonym. But it 
seems there is no PCIDeviceClass.unrealize:(


You are right. The pci_qdev_unrealize ultimately calls exit. But the same goes for 
init, pci_qdev_realize calls for pc->realize.
This is the reason I chose to use init/exit instead of the strange realize/exit.

But since the intention is to get rid of init, I am not against it.



And I am also not aware why there is no comment for .exit while there is for 
.init. It is appreciated if somebody could tell me the history O:-)


I'll add Markus, Andreas and Michael (the PCI maintainer), maybe they have a 
better insight to this.

On the other hand you should continue with the patch and leave the "unrealize" 
until we'll know more :)

Thanks,
Marcel





Thanks,
Marcel


  k->vendor_id = PCI_VENDOR_ID_REDHAT;
  k->device_id = PCI_DEVICE_ID_REDHAT_PXB;





.








Re: [Qemu-devel] [PATCH 3/5] PXB: convert to realize()

2015-12-20 Thread Cao jin

Hi

On 12/19/2015 02:01 AM, Paolo Bonzini wrote:



On 18/12/2015 12:03, Cao jin wrote:

[...]

+
+err_register_bus:
+object_unref(OBJECT(ds));
+object_unref(OBJECT(bds));
+object_unref(OBJECT(bus));


I think these should be object_unparent, not unref.



But, it seems these 3 objects isn`t added as a child-property via 
object_property_add_child() during creation, so OBJECT(ds)->parent(so 
does the other 2) will be NULL, and so object_unparent will do nothing?


Or am I missing something?


Paolo


  }

  static void pxb_dev_exitfn(PCIDevice *pci_dev)
@@ -259,7 +263,7 @@ static void pxb_dev_class_init(ObjectClass *klass, void 
*data)
  DeviceClass *dc = DEVICE_CLASS(klass);
  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

-k->init = pxb_dev_initfn;
+k->realize = pxb_dev_realize;
  k->exit = pxb_dev_exitfn;
  k->vendor_id = PCI_VENDOR_ID_REDHAT;
  k->device_id = PCI_DEVICE_ID_REDHAT_PXB;




.



--
Yours Sincerely,

Cao Jin





Re: [Qemu-devel] [PATCH 2/5] igd-passthrough-i440FX: convert to realize()

2015-12-20 Thread Cao jin



On 12/19/2015 02:37 AM, Eduardo Habkost wrote:

On Fri, Dec 18, 2015 at 07:03:49PM +0800, Cao jin wrote:

Signed-off-by: Cao jin 
---
  hw/pci-host/piix.c | 16 +---
  1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 715208b..e3840f0 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -761,7 +761,7 @@ static const IGDHostInfo igd_host_bridge_infos[] = {
  {0xa8, 4},  /* SNB: base of GTT stolen memory */
  };

-static int host_pci_config_read(int pos, int len, uint32_t val)
+static int host_pci_config_read(int pos, int len, uint32_t val, Error **errp)


You don't need the return value anymore, if you report errors
through the errp parameter. The function can be void, now.


Ok, will modify that in next version, as many people mentioned in the 
other patch...





  {
  char path[PATH_MAX];
  int config_fd;
@@ -772,15 +772,18 @@ static int host_pci_config_read(int pos, int len, 
uint32_t val)
  int ret = 0;

  if (rc >= size || rc < 0) {
+error_setg(errp, "No such device");
  return -ENODEV;
  }

  config_fd = open(path, O_RDWR);
  if (config_fd < 0) {
+error_setg(errp, "No such device");
  return -ENODEV;
  }

  if (lseek(config_fd, pos, SEEK_SET) != pos) {
+error_setg(errp, "lseek err: %s", strerror(errno));


What about error_setg_errno()?


Ok, I am not conscious that there exist error_setg_errno() :p




  ret = -errno;
  goto out;
  }
@@ -789,13 +792,14 @@ static int host_pci_config_read(int pos, int len, 
uint32_t val)
  } while (rc < 0 && (errno == EINTR || errno == EAGAIN));
  if (rc != len) {
  ret = -errno;
+error_setg(errp, "read err: %s", strerror(errno));


Same here.

OK.



  }
  out:
  close(config_fd);
  return ret;
  }

-static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
+static void igd_pt_i440fx_realize(struct PCIDevice *pci_dev, Error **errp)
  {
  uint32_t val = 0;
  int rc, i, num;
@@ -805,14 +809,12 @@ static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
  for (i = 0; i < num; i++) {
  pos = igd_host_bridge_infos[i].offset;
  len = igd_host_bridge_infos[i].len;
-rc = host_pci_config_read(pos, len, val);
+rc = host_pci_config_read(pos, len, val, errp);
  if (rc) {


The usual pattern for error checking and propagation is:

 Error *err;
 host_pci_config_read(pos, len, val, );
 if (err) {
 error_propagate(errp, local_err);
 return;
 }



Yup. I see


-return -ENODEV;
+return;
  }
  pci_default_write_config(pci_dev, pos, val, len);
  }
-
-return 0;
  }

  static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data)
@@ -820,7 +822,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass 
*klass, void *data)
  DeviceClass *dc = DEVICE_CLASS(klass);
  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

-k->init = igd_pt_i440fx_initfn;
+k->realize = igd_pt_i440fx_realize;
  dc->desc = "IGD Passthrough Host bridge";
  }

--
2.1.0







--
Yours Sincerely,

Cao Jin





[Qemu-devel] new barrier type for paravirt (was Re: [PATCH] virtio_ring: use smp_store_mb)

2015-12-20 Thread Michael S. Tsirkin
On Thu, Dec 17, 2015 at 03:39:10PM +0100, Peter Zijlstra wrote:
> On Thu, Dec 17, 2015 at 04:33:44PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Dec 17, 2015 at 02:57:26PM +0100, Peter Zijlstra wrote:
> > > 
> > > You could of course go fix that instead of mutilating things into
> > > sort-of functional state.
> > 
> > Yes, we'd just need to touch all architectures, all for
> > the sake of UP which almost no one uses.
> > Basically, we need APIs that explicitly are
> > for talking to another kernel on a different CPU on
> > the same SMP system, and implemented identically
> > between CONFIG_SMP and !CONFIG_SMP on all architectures.
> > 
> > Do you think this is something of general usefulness,
> > outside virtio?
> 
> I'm not aware of any other case, but if there are more parts of virt
> that need this then I see no problem adding it.

When I wrote this, I assumed there are no other users, and I'm still not
sure there are other users at the moment. Do you see a problem then?

> That is, virt in general is the only use-case that I can think of,
> because this really is an artifact of interfacing with an SMP host while
> running an UP kernel.

Or another guest kernel on an SMP host.

> But I'm really not familiar with virt, so I do not know if there's more
> sites outside of virtio that could use this.
> Touching all archs is a tad tedious, but its fairly straight forward.

So I looked and I was only able to find other another possible user in Xen.

Cc Xen folks.

I noticed that drivers/xen/xenbus/xenbus_comms.c uses
full memory barriers to communicate with the other side.
For example:

/* Must write data /after/ reading the consumer index.  * */
mb();

memcpy(dst, data, avail);
data += avail;
len -= avail;

/* Other side must not see new producer until data is * there. 
*/
wmb();
intf->req_prod += avail;

/* Implies mb(): other side will see the updated producer. */
notify_remote_via_evtchn(xen_store_evtchn);

To me, it looks like for guests compiled with CONFIG_SMP, smp_wmb and smp_mb
would be sufficient, so mb() and wmb() here are only needed if
a non-SMP guest runs on an SMP host.

Is my analysis correct?

So what I'm suggesting is something like the below patch,
except instead of using virtio directly, a new set of barriers
that behaves identically for SMP and non-SMP guests will be introduced.

And of course the weak barriers flag is not needed for Xen -
that's a virtio only thing.

For example:

smp_pv_wmb()
smp_pv_rmb()
smp_pv_mb()

I'd like to get confirmation from Xen folks before posting
this patchset.

Comments/suggestions?

Signed-off-by: Michael S. Tsirkin 

--

compile-tested only.

diff --git a/drivers/xen/xenbus/xenbus_comms.c 
b/drivers/xen/xenbus/xenbus_comms.c
index fdb0f33..a28f049 100644
--- a/drivers/xen/xenbus/xenbus_comms.c
+++ b/drivers/xen/xenbus/xenbus_comms.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -123,14 +124,14 @@ int xb_write(const void *data, unsigned len)
avail = len;
 
/* Must write data /after/ reading the consumer index. */
-   mb();
+   virtio_mb(true);
 
memcpy(dst, data, avail);
data += avail;
len -= avail;
 
/* Other side must not see new producer until data is there. */
-   wmb();
+   virtio_wmb(true);
intf->req_prod += avail;
 
/* Implies mb(): other side will see the updated producer. */
@@ -180,14 +181,14 @@ int xb_read(void *data, unsigned len)
avail = len;
 
/* Must read data /after/ reading the producer index. */
-   rmb();
+   virtio_rmb(true);
 
memcpy(data, src, avail);
data += avail;
len -= avail;
 
/* Other side must not see free space until we've copied out */
-   mb();
+   virtio_mb(true);
intf->rsp_cons += avail;
 
pr_debug("Finished read of %i bytes (%i to go)\n", avail, len);

-- 
MST



Re: [Qemu-devel] [PATCH 3/5] PXB: convert to realize()

2015-12-20 Thread Cao jin

Hi,

On 12/20/2015 06:22 PM, Marcel Apfelbaum wrote:


Hi,

On 12/18/2015 01:03 PM, Cao jin wrote:

Signed-off-by: Cao jin 
---
  hw/pci-bridge/pci_expander_bridge.c | 24 ++--
  1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c
b/hw/pci-bridge/pci_expander_bridge.c
index 57f8a37..cc975f6 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -145,19 +145,19 @@ static const TypeInfo pxb_host_info = {
   * Returns 0 on successs, -1 if i440fx host was not
   * found or the bus number is already in use.
   */
-static int pxb_register_bus(PCIDevice *dev, PCIBus *pxb_bus)
+static int pxb_register_bus(PCIDevice *dev, PCIBus *pxb_bus, Error
**errp)


If you add an err parameter, maybe the function should return void.



Ok, will modify it in V2. Actually, both style are fine with me:)




  {
  PCIBus *bus = dev->bus;
  int pxb_bus_num = pci_bus_num(pxb_bus);

  if (bus->parent_dev) {
-error_report("PXB devices can be attached only to root bus.");
+error_setg(errp, "PXB devices can be attached only to root
bus.");
  return -1;
  }

  QLIST_FOREACH(bus, >child, sibling) {
  if (pci_bus_num(bus) == pxb_bus_num) {
-error_report("Bus %d is already in use.", pxb_bus_num);
+error_setg(errp, "Bus %d is already in use.", pxb_bus_num);
  return -1;
  }
  }
@@ -193,7 +193,7 @@ static gint pxb_compare(gconstpointer a,
gconstpointer b)
 0;
  }

-static int pxb_dev_initfn(PCIDevice *dev)
+static void pxb_dev_realize(PCIDevice *dev, Error **errp)
  {
  PXBDev *pxb = PXB_DEV(dev);
  DeviceState *ds, *bds;
@@ -202,8 +202,8 @@ static int pxb_dev_initfn(PCIDevice *dev)

  if (pxb->numa_node != NUMA_NODE_UNASSIGNED &&
  pxb->numa_node >= nb_numa_nodes) {
-error_report("Illegal numa node %d.", pxb->numa_node);
-return -EINVAL;
+error_setg(errp, "Illegal numa node %d.", pxb->numa_node);
+return;
  }

  if (dev->qdev.id && *dev->qdev.id) {
@@ -225,8 +225,8 @@ static int pxb_dev_initfn(PCIDevice *dev)

  PCI_HOST_BRIDGE(ds)->bus = bus;

-if (pxb_register_bus(dev, bus)) {
-return -EINVAL;
+if (pxb_register_bus(dev, bus, errp)) {
+goto err_register_bus;
  }

  qdev_init_nofail(ds);
@@ -237,7 +237,11 @@ static int pxb_dev_initfn(PCIDevice *dev)
  pci_config_set_class(dev->config, PCI_CLASS_BRIDGE_HOST);

  pxb_dev_list = g_list_insert_sorted(pxb_dev_list, pxb,
pxb_compare);
-return 0;
+
+err_register_bus:
+object_unref(OBJECT(ds));
+object_unref(OBJECT(bds));
+object_unref(OBJECT(bus));



The order should be in the reverse order of creation:
 bds, bus, ds



Ok, I can do that. But it seems the order here doesn`t matter? Is there 
dependency among these three?





  }

  static void pxb_dev_exitfn(PCIDevice *pci_dev)
@@ -259,7 +263,7 @@ static void pxb_dev_class_init(ObjectClass *klass,
void *data)
  DeviceClass *dc = DEVICE_CLASS(klass);
  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

-k->init = pxb_dev_initfn;
+k->realize = pxb_dev_realize;
  k->exit = pxb_dev_exitfn;


If init is converted to realize, maybe the exit should be converted to
unrealize?



Yup, I agree with you from the point that the names should be antonym. 
But it seems there is no PCIDeviceClass.unrealize:(


And I am also not aware why there is no comment for .exit while there is 
for .init. It is appreciated if somebody could tell me the history O:-)




Thanks,
Marcel


  k->vendor_id = PCI_VENDOR_ID_REDHAT;
  k->device_id = PCI_DEVICE_ID_REDHAT_PXB;





.



--
Yours Sincerely,

Cao Jin





[Qemu-devel] [PATCH RFC] smp_store_mb should use smp_mb

2015-12-20 Thread Michael S. Tsirkin
On some architectures smp_store_mb() calls mb() which is stronger
than implied by both the name and the documentation.

smp_store_mb is only used by core kernel code at the moment, so
we know no one mis-uses it for an MMIO barrier.
Make it call smp_mb consistently before some arch-specific
code uses it as such by mistake.

Signed-off-by: Michael S. Tsirkin 

---

Note: I'm guessing an ack from arch maintainers will be needed, but
I'm working on a bigger cleanup moving a bunch of duplicated code
into asm-generic/barrier.h which depends on this, so not Cc'ing
widely yet.

Please Ack if appropriate but do not merge yet.

diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index df896a1..425552b 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -77,7 +77,7 @@ do {  
\
___p1;  \
 })
 
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); mb(); } 
while (0)
+#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); smp_mb(); 
} while (0)
 
 /*
  * The group barrier in front of the rsm & ssm are necessary to ensure
diff --git a/arch/powerpc/include/asm/barrier.h 
b/arch/powerpc/include/asm/barrier.h
index 0eca6ef..4f0169e 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -34,7 +34,7 @@
 #define rmb()  __asm__ __volatile__ ("sync" : : : "memory")
 #define wmb()  __asm__ __volatile__ ("sync" : : : "memory")
 
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); mb(); } 
while (0)
+#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); smp_mb(); 
} while (0)
 
 #ifdef __SUBARCH_HAS_LWSYNC
 #define SMPWMB  LWSYNC
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index d68e11e..6c1d8b5 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -36,7 +36,7 @@
 #define smp_mb__before_atomic()smp_mb()
 #define smp_mb__after_atomic() smp_mb()
 
-#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); 
mb(); } while (0)
+#define smp_store_mb(var, value)   do { WRITE_ONCE(var, value); 
smp_mb(); } while (0)
 
 #define smp_store_release(p, v)
\
 do {   \
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index b42afad..0f45f93 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -93,7 +93,7 @@
 #endif /* CONFIG_SMP */
 
 #ifndef smp_store_mb
-#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); mb(); } while 
(0)
+#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); smp_mb(); } 
while (0)
 #endif
 
 #ifndef smp_mb__before_atomic



Re: [Qemu-devel] [Xen-devel] new barrier type for paravirt (was Re: [PATCH] virtio_ring: use smp_store_mb)

2015-12-20 Thread Michael S. Tsirkin
On Sun, Dec 20, 2015 at 08:59:44PM +0100, Peter Zijlstra wrote:
> On Sun, Dec 20, 2015 at 05:07:19PM +, Andrew Cooper wrote:
> > 
> > Very much +1 for fixing this.
> > 
> > Those names would be fine, but they do add yet another set of options in
> > an already-complicated area.
> > 
> > An alternative might be to have the regular smp_{w,r,}mb() not revert
> > back to nops if CONFIG_PARAVIRT, or perhaps if pvops have detected a
> > non-native environment.  (I don't know how feasible this suggestion is,
> > however.)
> 
> So a regular SMP kernel emits the LOCK prefix and will patch it out with
> a DS prefix (iirc) when it finds but a single CPU. So for those you
> could easily do this.
> 
> However an UP kernel will not emit the LOCK and do no patching.
> 
> So if you're willing to make CONFIG_PARAVIRT depend on CONFIG_SMP or
> similar, this is doable.

One of the uses for virtio is to allow testing an existing kernel on
kvm just by loading a module, and this will break this usecase.

> I don't see people going to allow emitting the LOCK prefix (and growing
> the kernel text size) for UP kernels.

Thinking about this more, maybe __smp_*mb is a better set of names.

The nice thing about it is that we can then have generic code
that does basically

#ifdef CONFIG_SMP
#define smp_mb() __smp_mb()
#else
#define smp_mb() barrier()
#endif 

and reuse this on all architectures.

So instead of a maintainance burden, we are actually
removing code duplication.

-- 
MST



Re: [Qemu-devel] [PATCH v2 26/74] pc: acpi: memhp: move MHPD._STA method into SSDT

2015-12-20 Thread Michael S. Tsirkin
On Wed, Dec 16, 2015 at 03:47:35PM +0100, Igor Mammedov wrote:
> Signed-off-by: Igor Mammedov 
> ---
> v2:
>  - add parentheses around ifctx block
>Suggested-by: Marcel Apfelbaum 
> ---
>  hw/acpi/memory_hotplug_acpi_table.c | 14 ++
>  hw/i386/acpi-dsdt-mem-hotplug.dsl   |  8 
>  2 files changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/acpi/memory_hotplug_acpi_table.c 
> b/hw/acpi/memory_hotplug_acpi_table.c
> index 9b61b1c..e1a6551 100644
> --- a/hw/acpi/memory_hotplug_acpi_table.c
> +++ b/hw/acpi/memory_hotplug_acpi_table.c
> @@ -29,11 +29,25 @@ void build_memory_hotplug_aml(Aml *ctx, uint32_t nr_mem,
>  {
>  Aml *pci_scope;
>  Aml *ctrl_dev;
> +Aml *method;
> +Aml *ifctx;
> +Aml *a_zero = aml_int(0);
> +Aml *a_slots_nr = aml_name(stringify(MEMORY_SLOTS_NUMBER));
>  
>  /* scope for memory hotplug controller device node */
>  pci_scope = aml_scope("_SB.PCI0");
>  ctrl_dev = aml_scope(stringify(MEMORY_HOTPLUG_DEVICE));
>  {
> +/* MHPD._STA() method */

You know something's wrong if you are documenting C using ASL.  In this
case, rename ctrl_dev to mem_hotplug_dev, move ifctx zero and slots nr
within {} and drop a_ prefix for zero and slots_nr and code will be
clear without this comment.

> +method = aml_method("_STA", 0, AML_NOTSERIALIZED);
> +ifctx = aml_if(aml_equal(a_slots_nr, a_zero));
> +{
> + aml_append(ifctx, aml_return(a_zero));
> +}
> +aml_append(method, ifctx);
> +/* present, functioning, decoding, not shown in UI */
> +aml_append(method, aml_return(aml_int(0xB)));
> +aml_append(ctrl_dev, method);


Simple unserialized methods that return value depending on input
being zero seem very common.  How about a helper for this case?

E.g.

aml_method_if_then_else(slots_nr, aml_int(0xB), aml_int(0));


>  }
>  aml_append(pci_scope, ctrl_dev);
>  aml_append(ctx, pci_scope);
> diff --git a/hw/i386/acpi-dsdt-mem-hotplug.dsl 
> b/hw/i386/acpi-dsdt-mem-hotplug.dsl
> index c2bb6a1..b4eacc9 100644
> --- a/hw/i386/acpi-dsdt-mem-hotplug.dsl
> +++ b/hw/i386/acpi-dsdt-mem-hotplug.dsl
> @@ -35,14 +35,6 @@
>  External(MEMORY_SLOT_OST_EVENT, FieldUnitObj) // _OST event 
> code, write only
>  External(MEMORY_SLOT_OST_STATUS, FieldUnitObj) // _OST status 
> code, write only
>  
> -Method(_STA, 0) {
> -If (LEqual(MEMORY_SLOTS_NUMBER, Zero)) {
> -Return(0x0)
> -}
> -/* present, functioning, decoding, not shown in UI */
> -Return(0xB)
> -}
> -
>  Mutex (MEMORY_SLOT_LOCK, 0)
>  
>  Method(MEMORY_SLOT_SCAN_METHOD, 0) {
> -- 
> 1.8.3.1



Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform

2015-12-20 Thread Kevin O'Connor
On Sun, Dec 20, 2015 at 09:49:54AM +, Gonglei (Arei) wrote:
> > From: Kevin O'Connor [mailto:ke...@koconnor.net]
> > Sent: Saturday, December 19, 2015 11:12 PM
> > On Sat, Dec 19, 2015 at 12:03:15PM +, Gonglei (Arei) wrote:
> > > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> > interrupt,
> > > And then execute interrupt handler, but the interrupt handler make the
> > SeaBIOS
> > > stack broken, so that the BSP can't execute the instruction and occur
> > exception,
> > > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs 
> > > except
> > > the surface phenomenon.
> > 
> > I can't see any reason why allowing interrupts at this location would
> > be a problem.
> > 
> Does it have any relationship with *extra stack* of SeaBIOS?

None that I can see.  Also, the kvm trace seems to show the code
trying to execute at rip=0x03 - that will crash long before the extra
stack is used.

> > > Kevin, can we drop yield() in smp_setup() ?
> > 
> > It's possible to eliminate this instance of yield, but I think it
> > would just push the crash to the next time interrupts are enabled.
> > 
> Perhaps. I'm not sure.
> 
> > > Is it really useful and allowable for SeaBIOS? Maybe for other components?
> > > I'm not sure. Because we found that when SeaBIOS is booting, if we inject 
> > > a
> > > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same 
> > > with
> > > the current problem.
> > 
> > If you apply the patches you had to prevent that NMI crash problem,
> > does it also prevent the above crash?
> > 
> Yes, but we cannot prevent the NMI injection (though I'll submit some patches 
> to
> forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).
> 

-Kevin



Re: [Qemu-devel] [PATCH v3 2/4] target-tilegx: Add single floating point implementation

2015-12-20 Thread Chen Gang

After tried, I guess, this way below is incorrect: float64_to_float32()
assumes the input 'd' is already a standard (packed) float64 variable.
But in fact, it is not (e.g. the input from floatsisf2).

And we have to still check TILEGX_F_CALC_CVT, for they are really two
different format: TILEGX_F_CALC_CVT has no HBIT, but TILEGX_F_CALC_NCVT
has HBIT (which we need process it specially).

For me, the way like helper_fdouble_pack2 (the double implementation) is
OK to TILEGX_F_CALC_NCVT format, too.

 - Shift left to get HBIT, and change the related vexp (use vexp instead
   of exp to process overflow cases -- like double implementation does).

 - Use (u)int32_to_float32 for the mantissa.

 - Then process exp again.


Thanks.

On 12/11/15 06:14, Chen Gang wrote:
>> In particular, if gcc decided to optimize fractional fixed-point types, it
>> > would do something very similar to the current floatsisf2 code sequence, 
>> > except
>> > that it wouldn't use 0x9e as the exponent; it would use something smaller, 
>> > so
>> > that some number of low bits of the mantessa would be below the radix 
>> > point.
>> > 
> Oh, really.
> 
>> > Therefore, I think that fsingle_pack2 should do the following: Take the
>> > (sign,exp,man) tuple and slot them into a double -- recall that a single 
>> > only
>> > has 23 bits in its mantessa, and this temp format has 32 -- then convert 
>> > the
>> > double to a single.  Pre-rounded single results from fsingle_* will be
>> > unchanged, while integer data that gcc has constructed will be properly 
>> > rounded.
>> > 
>> > E.g.
>> > 
>> >   uint32_t sign = get_fsingle_sign(sfmt);
>> >   uint32_t exp = get_fsingle_exp(sfmt);
>> >   uint32_t man = get_fsingle_man(sfmt);
>> >   uint64_t d;
>> > 
>> >   /* Adjust the exponent for double precision, preserving Inf/NaN.  */
>> >   if (exp == 0xff) {
>> > exp = 0x7ff;
>> >   } else {
>> > exp += 1023 - 127;
>> >   }
>> > 
>> >   d = (uint64_t)sign << 63;
>> >   d = deposit64(d, 53, 11, exp);
>> >   d = deposit64(d, 21, 32, man);
>> >   return float64_to_float32(d, fp_status);
>> > 
>> > Note that this does require float32_to_sfmt to store the mantissa
>> > left-justified. That is, not in bits [54-32] as you're doing now, but in 
>> > bits
>> > [63-41].
>> > 
> For me, it is a good idea! :-)
> 
> 

-- 
Chen Gang (陈刚)

Open, share, and attitude like air, water, and life which God blessed



Re: [Qemu-devel] [Xen-devel] new barrier type for paravirt (was Re: [PATCH] virtio_ring: use smp_store_mb)

2015-12-20 Thread Andrew Cooper
On 20/12/15 09:25, Michael S. Tsirkin wrote:
> On Thu, Dec 17, 2015 at 03:39:10PM +0100, Peter Zijlstra wrote:
>> On Thu, Dec 17, 2015 at 04:33:44PM +0200, Michael S. Tsirkin wrote:
>>> On Thu, Dec 17, 2015 at 02:57:26PM +0100, Peter Zijlstra wrote:
 You could of course go fix that instead of mutilating things into
 sort-of functional state.
>>> Yes, we'd just need to touch all architectures, all for
>>> the sake of UP which almost no one uses.
>>> Basically, we need APIs that explicitly are
>>> for talking to another kernel on a different CPU on
>>> the same SMP system, and implemented identically
>>> between CONFIG_SMP and !CONFIG_SMP on all architectures.
>>>
>>> Do you think this is something of general usefulness,
>>> outside virtio?
>> I'm not aware of any other case, but if there are more parts of virt
>> that need this then I see no problem adding it.
> When I wrote this, I assumed there are no other users, and I'm still not
> sure there are other users at the moment. Do you see a problem then?
>
>> That is, virt in general is the only use-case that I can think of,
>> because this really is an artifact of interfacing with an SMP host while
>> running an UP kernel.
> Or another guest kernel on an SMP host.
>
>> But I'm really not familiar with virt, so I do not know if there's more
>> sites outside of virtio that could use this.
>> Touching all archs is a tad tedious, but its fairly straight forward.
> So I looked and I was only able to find other another possible user in Xen.
>
> Cc Xen folks.
>
> I noticed that drivers/xen/xenbus/xenbus_comms.c uses
> full memory barriers to communicate with the other side.
> For example:
>
> /* Must write data /after/ reading the consumer index.  * */
> mb();
>
> memcpy(dst, data, avail);
> data += avail;
> len -= avail;
> 
> /* Other side must not see new producer until data is * 
> there. */
> wmb();
> intf->req_prod += avail;
> 
> /* Implies mb(): other side will see the updated producer. */
> notify_remote_via_evtchn(xen_store_evtchn);
>
> To me, it looks like for guests compiled with CONFIG_SMP, smp_wmb and smp_mb
> would be sufficient, so mb() and wmb() here are only needed if
> a non-SMP guest runs on an SMP host.
>
> Is my analysis correct?

Correct.  The reason full barriers are used is so non-SMP guests still
function correctly.  It is certainly inefficient.

>
> So what I'm suggesting is something like the below patch,
> except instead of using virtio directly, a new set of barriers
> that behaves identically for SMP and non-SMP guests will be introduced.
>
> And of course the weak barriers flag is not needed for Xen -
> that's a virtio only thing.
>
> For example:
>
> smp_pv_wmb()
> smp_pv_rmb()
> smp_pv_mb()
>
> I'd like to get confirmation from Xen folks before posting
> this patchset.
>
> Comments/suggestions?

Very much +1 for fixing this.

Those names would be fine, but they do add yet another set of options in
an already-complicated area.

An alternative might be to have the regular smp_{w,r,}mb() not revert
back to nops if CONFIG_PARAVIRT, or perhaps if pvops have detected a
non-native environment.  (I don't know how feasible this suggestion is,
however.)

~Andrew



Re: [Qemu-devel] [PATCH 02/10] hw/sd/sd.c: QOMify

2015-12-20 Thread Peter Maydell
On 19 December 2015 at 21:36, Peter Crosthwaite
 wrote:
> On Fri, Dec 11, 2015 at 04:37:03PM +, Peter Maydell wrote:
>> Turn the SD card into a QOM device.
>> This conversion only changes the device itself; the various
>> functions which are effectively methods on the device are not
>> touched at this point.

>> +#define TYPE_SD "sd"
>
> Can we get "card" in there? I think unqualified SD should be usable
> to refer to the bus standard.

Sure. I don't have strong opinions on any of the type names
in this series, and I was wondering about sticking 'card' in.
"sdcard" or "sd-card" ?

thanks
-- PMM



Re: [Qemu-devel] [PATCH 04/10] hw/sd: Add QOM bus which SD cards plug in to

2015-12-20 Thread Peter Maydell
On 19 December 2015 at 21:38, Peter Crosthwaite
 wrote:
> On Fri, Dec 11, 2015 at 04:37:05PM +, Peter Maydell wrote:
>> +bool sdbus_get_inserted(SDBus *sdbus)
>> +{
>> +SDState *card = get_card(sdbus);
>> +
>> +if (card) {
>> +SDClass *sc = SD_GET_CLASS(card);
>> +
>> +return sc->get_inserted(card);
>> +}
>> +
>> +return false;
>> +}
>
> This I am not sure about. Realistically, the card has no self
> awareness of its ejection state so it can't be polled for "are
> you there". The card insertion switch is implemented as a
> physical switch on the slot itself and a property of the bus.
>
> The absence on presence of a device should determine this, making me
> think this should return !!card.
>
> Unfortunately, we have the case of the block UI being able to trigger a
> card ejection from underneath the bus level. But since the SD card is already
> using qdev_get_parent_bus() the removal from the bus can be managed at the
> card level.

For user-level back compat I think we need to retain "might have
an sdcard object with no block backend, and that means
'no-card-present'". This is both for the user facing
monitor commands to manipulate the sd card, and also
for the not-yet-QOMified controllers, which will always
create an sdcard object even if there's no block backend
(which should continue to mean "no card").

thanks
-- PMM



Re: [Qemu-devel] [PATCH 05/10] hw/sd/sdhci.c: Update to use SDBus APIs

2015-12-20 Thread Peter Maydell
On 19 December 2015 at 21:39, Peter Crosthwaite
 wrote:
> On Fri, Dec 11, 2015 at 04:37:06PM +, Peter Maydell wrote:
>> +carddev = qdev_create(qdev_get_child_bus(dev, "sd-bus"), TYPE_SD);
>> +qdev_prop_set_drive(carddev, "drive", blk, errp);
>> +if (*errp) {
>
> It is generally valid for an errp to be NULL. I think we should use a
> local even if we know the caller will not pass NULL.

Agreed.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 08/10] hw/sd/pxa2xx_mmci: Update to use new SDBus APIs

2015-12-20 Thread Peter Maydell
On 19 December 2015 at 21:42, Peter Crosthwaite
 wrote:
> On Fri, Dec 11, 2015 at 04:37:09PM +, Peter Maydell wrote:
>> Now the PXA2xx MMCI device is QOMified itself, we can
>> update it to use the SDBus APIs to talk to the SD card.

>>  void pxa2xx_mmci_handlers(PXA2xxMMCIState *s, qemu_irq readonly,
>>qemu_irq coverswitch)
>>  {
>> -sd_set_cb(s->card, readonly, coverswitch);
>> +DeviceState *dev = DEVICE(s);
>> +
>> +s->readonly = readonly;
>> +s->inserted = coverswitch;
>> +
>> +pxa2xx_mmci_set_inserted(dev, sdbus_get_inserted(>sdbus));
>> +pxa2xx_mmci_set_readonly(dev, sdbus_get_readonly(>sdbus));
>
> Looking at the machine models, _mmci_handlers is only called at
> machine init time. So this should not be needed. devices should be
> triggering an initial call of _set_readonly that reliably propagates
> back up.
>
> If we do need these two lines they should be in the reset handler.

This change is just retaining current behaviour over the QOMification
(we would assert the lines in this function before this patch,
so we should continue to do so afterwards).

We have no good mechanism for dealing with lines that ought to
be asserted on device reset (of which "sd card inserted" is
typically one). If we assert these lines on reset then we're
just introducing a different wrong behaviour where there's
a dependency on reset order. I preferred not to touch this
because we don't actually care much about the pxa2xx boards
themselves except that they're getting a bit elderly and
need updating to QOM.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 09/10] hw/sd/pxa2xx_mmci: Convert to VMStateDescription

2015-12-20 Thread Peter Maydell
On 19 December 2015 at 21:45, Peter Crosthwaite
 wrote:
> On Fri, Dec 11, 2015 at 04:37:10PM +, Peter Maydell wrote:
>> Convert the pxa2xx_mmci device from manual save/load
>> functions to a VMStateDescription structure.
>>
>> This is a migration compatibility break.

>> +static bool pxa2xx_mmci_vmstate_validate(void *opaque, int version_id)
>> +{
>> +PXA2xxMMCIState *s = opaque;
>> +
>> +return s->tx_start < sizeof(s->tx_fifo)
>> +&& s->rx_start < sizeof(s->rx_fifo)
>> +&& s->tx_len <= sizeof(s->tx_fifo)
>> +&& s->rx_len <= sizeof(s->rx_fifo)
>> +&& s->resp_len <= sizeof(s->resp_fifo);
>
>
> A nit, but I wonder if ARRAY_SIZE should be generally used in this case, as
> it is a little more self documenting and would make code identical to
> non-byte FIFOs.

Yes. In particular ARRAY_SIZE(s->resp_fifo) is correct
and sizeof(s->resp_fifo) is wrong because that array is uint16_t...

>> +}
>> +
>
> Extra blank.
>
>> +
>> +static const VMStateDescription vmstate_pxa2xx_mmci = {
>
> Is the registration in class_init missing?

Does seem to be...

thanks
-- PMM



Re: [Qemu-devel] [Xen-devel] new barrier type for paravirt (was Re: [PATCH] virtio_ring: use smp_store_mb)

2015-12-20 Thread Peter Zijlstra
On Sun, Dec 20, 2015 at 05:07:19PM +, Andrew Cooper wrote:
> 
> Very much +1 for fixing this.
> 
> Those names would be fine, but they do add yet another set of options in
> an already-complicated area.
> 
> An alternative might be to have the regular smp_{w,r,}mb() not revert
> back to nops if CONFIG_PARAVIRT, or perhaps if pvops have detected a
> non-native environment.  (I don't know how feasible this suggestion is,
> however.)

So a regular SMP kernel emits the LOCK prefix and will patch it out with
a DS prefix (iirc) when it finds but a single CPU. So for those you
could easily do this.

However an UP kernel will not emit the LOCK and do no patching.

So if you're willing to make CONFIG_PARAVIRT depend on CONFIG_SMP or
similar, this is doable.

I don't see people going to allow emitting the LOCK prefix (and growing
the kernel text size) for UP kernels.



Re: [Qemu-devel] [PATCH 04/10] hw/sd: Add QOM bus which SD cards plug in to

2015-12-20 Thread Peter Crosthwaite
On Sun, Dec 20, 2015 at 9:10 AM, Peter Maydell  wrote:
> On 19 December 2015 at 21:38, Peter Crosthwaite
>  wrote:
>> On Fri, Dec 11, 2015 at 04:37:05PM +, Peter Maydell wrote:
>>> +bool sdbus_get_inserted(SDBus *sdbus)
>>> +{
>>> +SDState *card = get_card(sdbus);
>>> +
>>> +if (card) {
>>> +SDClass *sc = SD_GET_CLASS(card);
>>> +
>>> +return sc->get_inserted(card);
>>> +}
>>> +
>>> +return false;
>>> +}
>>
>> This I am not sure about. Realistically, the card has no self
>> awareness of its ejection state so it can't be polled for "are
>> you there". The card insertion switch is implemented as a
>> physical switch on the slot itself and a property of the bus.
>>
>> The absence on presence of a device should determine this, making me
>> think this should return !!card.
>>
>> Unfortunately, we have the case of the block UI being able to trigger a
>> card ejection from underneath the bus level. But since the SD card is already
>> using qdev_get_parent_bus() the removal from the bus can be managed at the
>> card level.
>
> For user-level back compat I think we need to retain "might have
> an sdcard object with no block backend, and that means
> 'no-card-present'". This is both for the user facing
> monitor commands to manipulate the sd card, and also

What are the user-facing monitor commands? I tried using "change" and
"eject", but they don't seem to work for SD, due to the tray being
closed. Has this ever worked in a way that is user manipulatable for
SD or is it just to handle the case of unconditional SD card creation
(with the card never hotplugging over the system lifetime)?

Test case below. Using ep108 with a backendless drive (-drive if=sd).

$ dd if=/dev/zero of=./sd0file bs=16M count=1
1+0 records in
1+0 records out
16777216 bytes (17 MB) copied, 0.170586 s, 98.4 MB/s
$ dd if=/dev/zero of=./sd1file bs=16M count=1
1+0 records in
1+0 records out
16777216 bytes (17 MB) copied, 0.183836 s, 91.3 MB/s
$ ./aarch64-softmmu/qemu-system-aarch64 -M xlnx-ep108 -S -kernel
/dev/null -sd sd0file -drive if=sd -nographic
WARNING: Image format was not specified for 'sd0file' and probing guessed raw.
 Automatically detecting the format is dangerous for raw
images, write operations on block 0 will be restricted.
 Specify the 'raw' format explicitly to remove the restrictions.
QEMU 2.5.50 monitor - type 'help' for more information
(qemu) info block
sd0 (#block197): sd0file (raw)
Removable device: not locked, tray closed
Cache mode:   writeback

sd1: [not inserted]
Removable device: not locked, tray closed

ide1-cd0: [not inserted]
Removable device: not locked, tray closed

floppy0: [not inserted]
Removable device: not locked, tray closed
(qemu) change sd1 sd1file raw
Tray of device 'sd1' is not open
(qemu) eject -f sd1
Tray of device 'sd1' is not open
(qemu) info block
sd0 (#block197): sd0file (raw)
Removable device: not locked, tray closed
Cache mode:   writeback

sd1: [not inserted]
Removable device: not locked, tray closed

ide1-cd0: [not inserted]
Removable device: not locked, tray closed

floppy0: [not inserted]
Removable device: not locked, tray closed
(qemu)

Regards,
Peter

> for the not-yet-QOMified controllers, which will always
> create an sdcard object even if there's no block backend
> (which should continue to mean "no card").
>
> thanks
> -- PMM