date:20180327

Re: [Qemu-devel] [PATCH for 2.13 v2 00/20] linux-user: move arch specific parts to arch directories

2018-03-27 Thread Richard Henderson

On 03/24/2018 06:57 AM, Laurent Vivier wrote:
> Some files like signal.c are really hard to read
> because all architectures are mixed in the same
> file.
> 
> This series moves from signal.c these parts to
> the architecture dedicated directories in linux-user.
> Moreover, this allows to compare easier functions
> between architectures (it helps to debug problems).
> Adding new functions for a new architecture will
> be facilitated too.
> 
> checkpatch.pl is not happy... but I only want to
> move code from a file to another. I don't want
> to change the content of the parts I move.
> 
> v2:
>   - only move parts from signal.c
>   - link them instead of including them
>   - one patch by architecture
>   - add a first patch to prepare the change.
> 
> The first patch adds signal-common.h to define
> what is needed by the signal.c of the architectures.
> It adds a "do-nothing" signal.c in each arch
> directory and the rule needed to build them
> in Makefile.objs.
> 
> Then the process is simple...
> 
> for each architecture:
>   - copy the arch specific code from signal.c
> to /signal.c
>   - add includes (including signal-common.h)
>   - export setup_rt_frame() and setup_frame()
> (remove static in /signal.c,
>  add the declaration in /target_signal.h)
> 
> When the arch has 32bit and 64bit architectures,
> it's a little bit more complicated:
>   - ppc/ppc64: nothing special to do, all is in ppc/,
> there is no ppc64 directory,
>   - arm/aarch64: one file for arm, one file for aarch64
>   - i386/x86_64, sparc/sparc64, mips/mips64:
> update each target_signal.h,
> include the 32bit signal.c file into the 64bit signal.c file
> to avoid to duplicate code (and add a guard to not include
> the 32bit target_signal.h)

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH for 2.13 00/19] linux-user: move arch specific parts from main.c to arch directories

2018-03-27 Thread Richard Henderson

On 03/27/2018 03:15 AM, Laurent Vivier wrote:
> This series moves from main.c the architecture specific parts
> to the architecture directory.
> 
> This is the continuation of my series
>   "linux-user: move arch specific parts to arch directories"
> that includes since the v2 only the signal.c parts.
> 
> For each architecture, there are two parts:
> 
>   - cpu_loop(), and the function with its
> dependencies is moved to /cpu_loop.c
> 
>   - the prologue of the cpu_loop(), that was inlined
> in main(). We move it to a new function in
> /cpu_loop.c, target_cpu_copy_regs().
> 
> The first patch adds the skeleton to move the
> parts to the architecture directories, a cpu_loop.c
> file with an empty target_cpu_copy_regs() function,
> called from main().
> 
> There is no change in the code.

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v1] RISC-V: RISC-V TCG backend work in progress

2018-03-27 Thread Michael Clark

On Tue, Mar 27, 2018 at 5:35 PM, Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 03/28/2018 01:43 AM, Michael Clark wrote:
> > > +if ((ct & TCG_CT_CONST_N12) && val >= -2047 && val <= 2047) {
> >
> > +2048?
>

Yes of course, you're right. It's safe. I just hadn't thought about it
carefully enough.


> > We use this constraint for a negatable immediate and the constraint is
> only
> > applied to sub. We have no subi, so we implement subi as addi rd, rs1,
> -imm
> >
> > case INDEX_op_sub_i32:
> > if (c2) {
> > tcg_out_opc_imm(s, is32bit ? OPC_ADDI : OPC_ADDIW, a0, a1,
> -a2);
> > } else {
> > tcg_out_opc_reg(s, is32bit ? OPC_SUB : OPC_SUBW, a0, a1, a2);
> > }
> > break;
>
> That's my point.  The "positive" range for addition is -2048...2047, so the
> "negative" range for subtraction should be -2047...2048.
>

Got it. Thanks.

Re: [Qemu-devel] [PATCH v2 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug

2018-03-27 Thread Richard Henderson

On 03/28/2018 08:42 AM, Michael Clark wrote:
> This change is a workaround for a bug where mstatus.FS
> is not correctly reporting dirty after operations that
> modify floating point registers. This a critical bug
> or RISC-V in QEMU as it results in floating point
> register file corruption when running SMP Linux due to
> task migration and possibly uniprocessor Linux if
> more than one process is using the FPU.
> 
> This workaround will return dirty if mstatus.FS is
> switched from off to initial or clean. According to
> the specification it is legal for an implementation
> to return only off, or dirty.
> 
> Cc: Palmer Dabbelt 
> Cc: Sagar Karandikar 
> Cc: Bastian Koppelmann 
> Cc: Peter Maydell 
> Cc: Alex Bennée 
> Cc: Richard Henderson 
> Cc: Philippe Mathieu-Daudé 
> Tested-by: Richard W.M. Jones 
> Signed-off-by: Michael Clark 
> ---
>  target/riscv/op_helper.c | 17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)

In case the more extensive fix waits until 2.13,

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH 2/2] target/riscv: Mark MSTATUS_FS dirty

2018-03-27 Thread Richard Henderson

On 03/28/2018 10:22 AM, Richard Henderson wrote:
> +/* Mark fp status as dirty.  */
> +env->mstatus = MSTATUS_FS;

Bah.  This should of course be |=.


r~

Re: [Qemu-devel] [PATCH for-2.12 0/2] RISC-V: Mark FP status dirty

2018-03-27 Thread Michael Clark

Hi Richard,

Thanks! I'll test this tomorrow morning and we can choose whether to
include your proper fix or the workaround.

I think we have time assuming we send out PRs tomorrow.

Given our important fixes have review including either this fix by tomorrow
or the workaround, and Philippe has reviewed our other important bugs
fixes, then we should be fine.

Then after getting the critical and important fixes out of the way, then I
perhaps make a PR for the other reviewed changes, although these might best
wait until QEMU 2.13 opens.

Thanks again,
Michael.

On Tue, Mar 27, 2018 at 7:22 PM, Richard Henderson <
richard.hender...@linaro.org> wrote:

> Since it was my patch that broke FP state tracking in the
> first place, I feel obligated to fix it again.
>
> Mark mstatus[fs] as dirty whenever we write to the file.
> This can be optimized by only doing so once within a TB
> which initially began with a clean file.
>
> I have not yet put together an environment that can test
> this, so I'll need someone else to give it a go.
>
>
> r~
>
>
> Richard Henderson (2):
>   target/riscv: Split out mstatus_fs from tb_flags during translation
>   target/riscv: Mark MSTATUS_FS dirty
>
>  target/riscv/cpu.h   |  6 +++---
>  target/riscv/op_helper.c | 25 
>  target/riscv/translate.c | 50 ++
> --
>  3 files changed, 64 insertions(+), 17 deletions(-)
>
> --
> 2.14.3
>
>

[Qemu-devel] [PULL 1/1] tcg: Mark muluh_i64 and mulsh_i64 as 64-bit ops

2018-03-27 Thread Richard Henderson

Failure to do so results in the tcg optimizer sign-extending
any constant fold from 32-bits.  This turns out to be visible
in the RISC-V testsuite using a host that emits these opcodes
(e.g. any non-x86_64).

Reported-by: Michael Clark 
Reviewed-by: Emilio G. Cota 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 tcg/tcg-opc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index d81a6c4535..e3a43aabb6 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -182,8 +182,8 @@ DEF(add2_i64, 2, 4, 0, IMPL64 | 
IMPL(TCG_TARGET_HAS_add2_i64))
 DEF(sub2_i64, 2, 4, 0, IMPL64 | IMPL(TCG_TARGET_HAS_sub2_i64))
 DEF(mulu2_i64, 2, 2, 0, IMPL64 | IMPL(TCG_TARGET_HAS_mulu2_i64))
 DEF(muls2_i64, 2, 2, 0, IMPL64 | IMPL(TCG_TARGET_HAS_muls2_i64))
-DEF(muluh_i64, 1, 2, 0, IMPL(TCG_TARGET_HAS_muluh_i64))
-DEF(mulsh_i64, 1, 2, 0, IMPL(TCG_TARGET_HAS_mulsh_i64))
+DEF(muluh_i64, 1, 2, 0, IMPL64 | IMPL(TCG_TARGET_HAS_muluh_i64))
+DEF(mulsh_i64, 1, 2, 0, IMPL64 | IMPL(TCG_TARGET_HAS_mulsh_i64))
 
 #define TLADDR_ARGS  (TARGET_LONG_BITS <= TCG_TARGET_REG_BITS ? 1 : 2)
 #define DATA64_ARGS  (TCG_TARGET_REG_BITS == 64 ? 1 : 2)
-- 
2.14.3

[Qemu-devel] [PULL for-2.12 0/1] tcg mul[us]h fix

2018-03-27 Thread Richard Henderson

This is material for stable as well.


r~


The following changes since commit fa3704d87720d7049d483ff669b9e2ff991e7658:

  Update version for v2.12.0-rc1 release (2018-03-27 22:04:23 +0100)

are available in the Git repository at:

  git://github.com/rth7680/qemu.git tags/pull-tcg-20180328

for you to fetch changes up to f2f1dde75160cac6ede330f3db50dc817d01a2d6:

  tcg: Mark muluh_i64 and mulsh_i64 as 64-bit ops (2018-03-28 12:45:16 +0800)


Fix muluh_i64 and mulsh_i64 flags


Richard Henderson (1):
  tcg: Mark muluh_i64 and mulsh_i64 as 64-bit ops

 tcg/tcg-opc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Re: [Qemu-devel] [PATCH v1 2/2] RISC-V: Fix incorrect disassembly for addiw

2018-03-27 Thread Richard Henderson

On 03/28/2018 03:55 AM, Michael Clark wrote:
> This fixes a bug in the disassembler constraints used
> to lift instructions into pseudo-instructions, whereby
> addiw instructions are always lifted to sext.w instead
> of just lifting addiw with a zero immediate.
> 
> An associated fix has been made to the metadata used to
> machine generate the disseasembler:
> 
> https://github.com/michaeljclark/riscv-meta/
> commit/4a6b2f3898430768acfe201405224d2ea31e1477
> 
> Cc: Sagar Karandikar 
> Cc: Bastian Koppelmann 
> Cc: Palmer Dabbelt 
> Cc: Peter Maydell 
> Signed-off-by: Michael Clark 
> ---
>  disas/riscv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v1 1/2] RISC-V: Convert cpu definition to future model

2018-03-27 Thread Richard Henderson

On 03/28/2018 03:55 AM, Michael Clark wrote:
> - Model borrowed from target/sh4/cpu.c
> - Rewrote riscv_cpu_list to use object_class_get_list
> - Dropped 'struct RISCVCPUInfo' and used TypeInfo array
> - Replaced riscv_cpu_register_types with DEFINE_TYPES
> - Marked base class as abstract
> - Fixes -cpu list
> 
> Cc: Igor Mammedov 
> Cc: Sagar Karandikar 
> Cc: Bastian Koppelmann 
> Cc: Palmer Dabbelt 
> Signed-off-by: Michael Clark 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Igor Mammedov 
> ---
>  target/riscv/cpu.c | 123 
> ++---
>  1 file changed, 69 insertions(+), 54 deletions(-)

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH 3/8] migration: support todetectcompression and decompression errors

2018-03-27 Thread Peter Xu

On Wed, Mar 28, 2018 at 12:08:19PM +0800, jiang.bi...@zte.com.cn wrote:
> > 
> > On Tue, Mar 27, 2018 at 10:35:29PM +0800, Xiao Guangrong wrote:
> >
> >> > > No, we can't make the assumption that "error _must_ be caused by page 
> >> > > update".
> >> > > No document/ABI about compress/decompress promised it. :)
> >
> > Indeed, I found no good documents about below errors that jiang.biao
> > pointed out.
> Hi, Peter
> The description about the errors comes from here,
> http://www.zlib.net/manual.html
> And about the error codes returned by inflate(), they are described as，
> ** inflate() returns 
> Z_OK if some progress has been made (more input processed or more output 
> produced),
> Z_STREAM_END if the end of the compressed data has been reached and all 
> uncompressed output has been produced, 
> Z_NEED_DICT if a preset dictionary is needed at this point, 
> Z_DATA_ERROR if the input data was corrupted (input stream not conforming to 
> the zlib format or incorrect check value, in which case strm->msg points to a 
> string with a more specific error), 
> Z_STREAM_ERROR if the stream structure was inconsistent (for example next_in 
> or next_out was Z_NULL, or the state was inadvertently written over by the 
> application), 
> Z_MEM_ERROR if there was not enough memory, 
> Z_BUF_ERROR if no progress was possible or if there was not enough room in 
> the output buffer when Z_FINISH is used. ... 
> **

Ah yes.  My bad to be so uncareful. :)

> According to the above description, the error caused by page update looks 
> more like tend to return Z_DATA_ERROR, but I do not have env to verify that. 
> :)
> As I understand it, the real compress/decompress error cases other than that 
> caused by page update should be rare, maybe the error code is enough to
> distinguish those if we can verify the the error codes returned by page update
> and other silent failures by test. If so, we can cut the cost of memcpy.  
> If not, I agree with Guangrong's idea too. I never read the zlib code and all 
> my
> information comes from the manual, so if anything inaccurate, pls ignore my
> option. :)

So I suppose all of us know that alternative now, we just need a solid
way to confirm the uncertainty.  I'll leave this to Guangrong.

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH 3/8] migration: support todetectcompression and decompression errors

2018-03-27 Thread jiang.biao2

> 
> On Tue, Mar 27, 2018 at 10:35:29PM +0800, Xiao Guangrong wrote:
>
>> > > No, we can't make the assumption that "error _must_ be caused by page 
>> > > update".
>> > > No document/ABI about compress/decompress promised it. :)
>
> Indeed, I found no good documents about below errors that jiang.biao
> pointed out.
Hi, Peter
The description about the errors comes from here,
http://www.zlib.net/manual.html
And about the error codes returned by inflate(), they are described as，
** inflate() returns 
Z_OK if some progress has been made (more input processed or more output 
produced),
Z_STREAM_END if the end of the compressed data has been reached and all 
uncompressed output has been produced, 
Z_NEED_DICT if a preset dictionary is needed at this point, 
Z_DATA_ERROR if the input data was corrupted (input stream not conforming to 
the zlib format or incorrect check value, in which case strm->msg points to a 
string with a more specific error), 
Z_STREAM_ERROR if the stream structure was inconsistent (for example next_in or 
next_out was Z_NULL, or the state was inadvertently written over by the 
application), 
Z_MEM_ERROR if there was not enough memory, 
Z_BUF_ERROR if no progress was possible or if there was not enough room in the 
output buffer when Z_FINISH is used. ... 
**
According to the above description, the error caused by page update looks 
more like tend to return Z_DATA_ERROR, but I do not have env to verify that. :)
As I understand it, the real compress/decompress error cases other than that 
caused by page update should be rare, maybe the error code is enough to
distinguish those if we can verify the the error codes returned by page update
and other silent failures by test. If so, we can cut the cost of memcpy.  
If not, I agree with Guangrong's idea too. I never read the zlib code and all my
information comes from the manual, so if anything inaccurate, pls ignore my
option. :)

Regards,
Jiang

Re: [Qemu-devel] [RFC 8/8] hw/arm/virt: Allow up to 512 vcpus along with KVM VGICv3

2018-03-27 Thread Shannon Zhao



On 2018/3/27 22:15, Eric Auger wrote:
> With KVM acceleration and if KVM VGICV3 supports to set multiple
> redistributor regions, we now allow up to 512 vcpus.
> 
> Signed-off-by: Eric Auger 
> ---
>  hw/arm/virt.c | 17 -
>  include/hw/arm/virt.h |  1 +
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 8258f6f..cdb1a75 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -149,8 +149,10 @@ static const MemMapEntry a15memmap[] = {
>  [VIRT_PCIE_PIO] =   { 0x3eff, 0x0001 },
>  [VIRT_PCIE_ECAM] =  { 0x3f00, 0x0100 },
>  [VIRT_MEM] ={ 0x4000, RAMLIMIT_BYTES },
> +/* Allows 512 - 123 additional vcpus (each 2x64kB) */
> +[VIRT_GIC_REDIST2] ={ 0x40ULL, 0x30ALL },
One concern that this will limit the guest ram size to RAMLIMIT_BYTES.
If we want to support larger ram size in the future, this may be a problem.

>  /* Second PCIe window, 512GB wide at the 512GB boundary */
> -[VIRT_PCIE_MMIO_HIGH] =   { 0x80ULL, 0x80ULL },
> +[VIRT_PCIE_MMIO_HIGH] = { 0x80ULL, 0x80ULL },
>  };
>  
>  static const int a15irqmap[] = {
> @@ -553,6 +555,11 @@ static void create_gic(VirtMachineState *vms, qemu_irq 
> *pic)
>  agcc->register_redist_region((GICv3State *)gicdev,
>   vms->memmap[VIRT_GIC_REDIST].base,
>   vms->memmap[VIRT_GIC_REDIST].size >> 17);
> +if (vms->smp_cpus > 123) {
> +agcc->register_redist_region((GICv3State *)gicdev,
> + vms->memmap[VIRT_GIC_REDIST2].base,
> + vms->memmap[VIRT_GIC_REDIST2].size >> 17);
> +}
>  } else {
>  sysbus_mmio_map(gicbusdev, 1, vms->memmap[VIRT_GIC_CPU].base);
>  }
> @@ -1284,6 +1291,14 @@ static void machvirt_init(MachineState *machine)
>   */
>  if (vms->gic_version == 3) {
>  virt_max_cpus = vms->memmap[VIRT_GIC_REDIST].size / 0x2;
> +if (kvm_max_vcpus(kvm_state) > 255) {
> +/*
> + * VGICv3 KVM device capability to set multiple redistributor
> + * was introduced at the same time KVM_MAX_VCPUS was bumped
> + * from 255 to 512
> + */
> +virt_max_cpus += vms->memmap[VIRT_GIC_REDIST2].size / 0x2;
> +}
>  } else {
>  virt_max_cpus = GIC_NCPU;
>  }
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index d168291..801a4ad 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -60,6 +60,7 @@ enum {
>  VIRT_GIC_V2M,
>  VIRT_GIC_ITS,
>  VIRT_GIC_REDIST,
> +VIRT_GIC_REDIST2,
>  VIRT_UART,
>  VIRT_MMIO,
>  VIRT_RTC,
> 

-- 
Shannon

Re: [Qemu-devel] [PATCH v8 14/23] monitor: separate QMP parser and dispatcher

2018-03-27 Thread Peter Xu

On Mon, Mar 26, 2018 at 11:46:13AM +0200, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Mar 26, 2018 at 11:08 AM, Peter Xu  wrote:
> > On Mon, Mar 26, 2018 at 10:33:27AM +0200, Marc-André Lureau wrote:
> >> Hi
> >>
> >> On Mon, Mar 26, 2018 at 10:07 AM, Peter Xu  wrote:
> >> > On Fri, Mar 23, 2018 at 05:18:53PM +0100, Marc-André Lureau wrote:
> >> >
> >> > [...]
> >> >
> >> >> > +/*
> >> >> > + * Dispatch one single QMP request. The function will free the 
> >> >> > req_obj
> >> >> > + * and objects inside it before return.
> >> >> > + */
> >> >> > +static void monitor_qmp_dispatch_one(QMPRequest *req_obj)
> >> >> >  {
> >> >> > -QObject *req, *rsp = NULL, *id = NULL;
> >> >> > +Monitor *mon, *old_mon;
> >> >> > +QObject *req, *rsp = NULL, *id;
> >> >> >  QDict *qdict = NULL;
> >> >> > -MonitorQMP *mon_qmp = container_of(parser, MonitorQMP, parser);
> >> >> > -Monitor *old_mon, *mon = container_of(mon_qmp, Monitor, qmp);
> >> >> > -
> >> >> > -Error *err = NULL;
> >> >> > +bool need_resume;
> >> >> >
> >> >> > -req = json_parser_parse_err(tokens, NULL, );
> >> >> > -if (!req && !err) {
> >> >> > -/* json_parser_parse_err() sucks: can fail without setting 
> >> >> > @err */
> >> >> > -error_setg(, QERR_JSON_PARSING);
> >> >> > -}
> >> >> > -if (err) {
> >> >> > -goto err_out;
> >> >> > -}
> >> >> > +req = req_obj->req;
> >> >> > +mon = req_obj->mon;
> >> >> > +id = req_obj->id;
> >> >> > +need_resume = req_obj->need_resume;
> >> >> >
> >> >> > -qdict = qobject_to_qdict(req);
> >> >> > -if (qdict) {
> >> >> > -id = qdict_get(qdict, "id");
> >> >> > -qobject_incref(id);
> >> >> > -qdict_del(qdict, "id");
> >> >> > -} /* else will fail qmp_dispatch() */
> >> >> > +g_free(req_obj);
> >> >> >
> >> >> >  if (trace_event_get_state_backends(TRACE_HANDLE_QMP_COMMAND)) {
> >> >> >  QString *req_json = qobject_to_json(req);
> >> >> > @@ -3900,7 +3932,7 @@ static void 
> >> >> > handle_qmp_command(JSONMessageParser *parser, GQueue *tokens)
> >> >> >  old_mon = cur_mon;
> >> >> >  cur_mon = mon;
> >> >>
> >> >> There is another issue with this series, since cur_mon is global (and
> >> >> not protected), an oob command may change the cur_mon while another
> >> >> command is running in the main thread with unexpected consequences. I
> >> >> don't have a clear idea what is the best way to solve it. Making the
> >> >> variable per-thread, or going all the way to get rid of cur_mon (my
> >> >> preference, but much harder)
> >> >
> >> > IMHO it is fine too.
> >> >
> >> > Note that this cur_mon operation is in monitor_qmp_dispatch_one() now,
> >> > which is still running in main thread.  So AFAICT all the cur_mon
> >> > references are in main thread, and monitor IOThread does not modify
> >> > that variable at all.  Then we should probably be safe.
> >>
> >> But monitor_qmp_dispatch_one() is called from iothread if the command
> >> is oob, so cur_mon may be updated while another command is running in
> >> main thread, or am I wrong?
> >
> > You are right. I missed that, sorry...
> >
> > Would this be a simple workaround (but hopefully efficient) solution?
> >
> > diff --git a/monitor.c b/monitor.c
> > index 77f4c41cfa..99641c0c6d 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -4023,7 +4023,7 @@ typedef struct QMPRequest QMPRequest;
> >   * Dispatch one single QMP request. The function will free the req_obj
> >   * and objects inside it before return.
> >   */
> > -static void monitor_qmp_dispatch_one(QMPRequest *req_obj)
> > +static void monitor_qmp_dispatch_one(QMPRequest *req_obj, bool hack_curmon)
> >  {
> >  Monitor *mon, *old_mon;
> >  QObject *req, *rsp = NULL, *id;
> > @@ -4043,12 +4043,16 @@ static void monitor_qmp_dispatch_one(QMPRequest 
> > *req_obj)
> >  QDECREF(req_json);
> >  }
> >
> > -old_mon = cur_mon;
> > -cur_mon = mon;
> > +if (hack_curmon) {
> > +old_mon = cur_mon;
> > +cur_mon = mon;
> > +}
> >
> >  rsp = qmp_dispatch(mon->qmp.commands, req);
> >
> > -cur_mon = old_mon;
> > +if (hack_curmon) {
> > +cur_mon = old_mon;
> > +}
> >
> >  if (mon->qmp.commands == _cap_negotiation_commands) {
> >  qdict = qdict_get_qdict(qobject_to(QDict, rsp), "error");
> > @@ -4116,7 +4120,7 @@ static void monitor_qmp_bh_dispatcher(void *data)
> >
> >  if (req_obj) {
> >  trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: 
> > "");
> > -monitor_qmp_dispatch_one(req_obj);
> > +monitor_qmp_dispatch_one(req_obj, true);
> >  /* Reschedule instead of looping so the main loop stays responsive 
> > */
> >  qemu_bh_schedule(mon_global.qmp_dispatcher_bh);
> >  }
> > @@ -4175,7 +4179,7 @@ static void handle_qmp_command(JSONMessageParser 
> > *parser, GQueue *tokens)
> >  /* Out-Of-Band

Re: [Qemu-devel] [PATCH v2] scsi-disk: Don't enlarge min_io_size to max_io_size

2018-03-27 Thread David Gibson

On Wed, Mar 28, 2018 at 12:41:41AM +0800, Fam Zheng wrote:
> Some backends report big max_io_sectors. Making min_io_size the same
> value in this case will make it impossible for guest to align memory,
> therefore the disk may not be usable at all.
> 
> Do not enlarge them when they are zero.
> 
> Reported-by: David Gibson 
> Signed-off-by: Fam Zheng 

Tested-by: David Gibson 

With this patch applied, I was able to successfully install a ppc64le
guest again.

> 
> ---
> 
> v2: Leave the values alone if zero. [Paolo]
> At least we can consult block layer for a slightly more sensible
> opt_io_size, but that's for another patch.
> ---
>  hw/scsi/scsi-disk.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index f5ab767ab5..f8ed8cf2b4 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -714,10 +714,12 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
> uint8_t *outbuf)
>  
>  /* min_io_size and opt_io_size can't be greater than
>   * max_io_sectors */
> -min_io_size =
> -MIN_NON_ZERO(min_io_size, max_io_sectors);
> -opt_io_size =
> -MIN_NON_ZERO(opt_io_size, max_io_sectors);
> +if (min_io_size) {
> +min_io_size = MIN(min_io_size, max_io_sectors);
> +}
> +if (opt_io_size) {
> +opt_io_size = MIN(opt_io_size, max_io_sectors);
> +}
>  }
>  /* required VPD size with unmap support */
>  buflen = 0x40;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 1/8] migration: stop compressing page in migration thread

2018-03-27 Thread Xiao Guangrong




On 03/28/2018 11:01 AM, Wang, Wei W wrote:

On Tuesday, March 13, 2018 3:58 PM, Xiao Guangrong wrote:


As compression is a heavy work, do not do it in migration thread, instead, we
post it out as a normal page

Signed-off-by: Xiao Guangrong 



Hi Guangrong,

Dave asked me to help review your patch, so I will just drop my 2 cents 
wherever possible, and hope that could be inspiring for your work.


Thank you both for the nice help on the work. :)





---
  migration/ram.c | 32 
  1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c index
7266351fd0..615693f180 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1132,7 +1132,7 @@ static int ram_save_compressed_page(RAMState
*rs, PageSearchStatus *pss,
  int pages = -1;
  uint64_t bytes_xmit = 0;
  uint8_t *p;
-int ret, blen;
+int ret;
  RAMBlock *block = pss->block;
  ram_addr_t offset = pss->page << TARGET_PAGE_BITS;

@@ -1162,23 +1162,23 @@ static int
ram_save_compressed_page(RAMState *rs, PageSearchStatus *pss,
  if (block != rs->last_sent_block) {
  flush_compressed_data(rs);
  pages = save_zero_page(rs, block, offset);
-if (pages == -1) {
-/* Make sure the first page is sent out before other pages */
-bytes_xmit = save_page_header(rs, rs->f, block, offset |
-  RAM_SAVE_FLAG_COMPRESS_PAGE);
-blen = qemu_put_compression_data(rs->f, p, TARGET_PAGE_SIZE,
- migrate_compress_level());
-if (blen > 0) {
-ram_counters.transferred += bytes_xmit + blen;
-ram_counters.normal++;
-pages = 1;
-} else {
-qemu_file_set_error(rs->f, blen);
-error_report("compressed data failed!");
-}
-}
  if (pages > 0) {
  ram_release_pages(block->idstr, offset, pages);
+} else {
+/*
+ * Make sure the first page is sent out before other pages.
+ *
+ * we post it as normal page as compression will take much
+ * CPU resource.
+ */
+ram_counters.transferred += save_page_header(rs, rs->f, block,
+offset | RAM_SAVE_FLAG_PAGE);
+qemu_put_buffer_async(rs->f, p, TARGET_PAGE_SIZE,
+  migrate_release_ram() &
+  migration_in_postcopy());
+ram_counters.transferred += TARGET_PAGE_SIZE;
+ram_counters.normal++;
+pages = 1;
  }
  } else {
  pages = save_zero_page(rs, block, offset);
--


I agree that this patch is an improvement for the current implementation. So 
just pile up mine here:
Reviewed-by: Wei Wang 


Thanks.




If you are interested in something more aggressive, I can share an alternative 
approach, which I think would be better. Please see below.

Actually, we can use the multi-threaded compression for the first page as well, 
which will not block the migration thread progress. The advantage is that we 
can enjoy the compression benefit for the first page and meanwhile not blocking 
the migration thread - the page is given to a compression thread and compressed 
asynchronously to the migration thread execution.



Yes, it is a good point.


The main barrier to achieving the above that is that we need to make sure the 
first page of each block is sent first in the multi-threaded environment. We 
can twist the current implementation to achieve that, which is not hard:

For example, we can add a new flag to RAMBlock - bool first_page_added. In each 
thread of compression, they need
1) check if this is the first page of the block.
2) If it is the first page, set block->first_page_added after sending the page;
3) If it is not the first the page, wait to send the page only when 
block->first_page_added is set.



So there is another barrier introduced which hurts the parallel...

Hmm, we need more deliberate consideration on this point, let me think it over 
after this work.

Thank you.

Re: [Qemu-devel] [PATCH 3/8] migration: support to detectcompression and decompression errors

2018-03-27 Thread Peter Xu

On Tue, Mar 27, 2018 at 10:35:29PM +0800, Xiao Guangrong wrote:
> 
> 
> On 03/28/2018 08:43 AM, jiang.bi...@zte.com.cn wrote:
> > > On 03/27/2018 07:17 PM, Peter Xu wrote:
> > > > On Tue, Mar 27, 2018 at 03:42:32AM +0800, Xiao Guangrong wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > > It'll be understandable to me if the problem is that the compress()
> > > > > > API does not allow the input buffer to be changed during the whole
> > > > > > period of the call.  If that is a must, this patch for sure helps.
> > > > > 
> > > > > Yes, that is exactly what i want to say. :)
> > > > 
> > > > So I think now I know what this patch is for. :) And yeah, it makes
> > > > sense.
> > > > 
> > > > Though another question would be: if the buffer is updated during
> > > > compress() and compress() returned error, would that pollute the whole
> > > > z_stream or it only fails the compress() call?
> > > > 
> > > 
> > > I guess deflateReset() can recover everything, i.e, keep z_stream as
> > > it is init'ed by deflate_init().
> > > 
> > > > (Same question applies to decompress().)
> > > > 
> > > > If it's only a compress() error and it won't pollute z_stream (or say,
> > > > it can be recovered after a deflateReset() and then we can continue to
> > > > call deflate() without problem), then we'll actually have two
> > > > alternatives to solve this "buffer update" issue:
> > > > 
> > > > 1. Use the approach of current patch: we copy the page every time, so
> > > >  deflate() never fails because update never happens.  But it's slow
> > > >  since we copy the pages every time.
> > > > 
> > > > 2. Use the old approach, and when compress() fail, we just ignore that
> > > >  page (since now we know that error _must_ be caused by page update,
> > > >  then we are 100% sure that we'll send that page again so it'll be
> > > >  perfectly fine).
> > > > 
> > > 
> > > No, we can't make the assumption that "error _must_ be caused by page 
> > > update".
> > > No document/ABI about compress/decompress promised it. :)

Indeed, I found no good documents about below errors that jiang.biao
pointed out.

> > So, as I metioned before, can we just distingush the decompress/compress 
> > errors
> > from errors caused by page update by the return code of inflate/deflate?
> > According to the zlib manual, there seems to be several error codes for 
> > different
> > cases,
> > #define Z_ERRNO(-1)
> > #define Z_STREAM_ERROR (-2)
> > #define Z_DATA_ERROR   (-3)
> > #define Z_MEM_ERROR(-4)
> > #define Z_BUF_ERROR(-5)
> > #define Z_VERSION_ERROR (-6)
> > Did you check the return code when silent failure(not caused by page update)
> > happened before? :)
> 
> I am afraid there is no such error code and i guess zlib is not designed to
> compress the data which is being modified.

So I agree with you, maybe the only right way to do now is copy the
page, until we know better about zlib and find something useful.

Thanks!

-- 
Peter Xu

Re: [Qemu-devel] [PATCH 1/8] migration: stop compressing page in migration thread

2018-03-27 Thread Wang, Wei W

On Tuesday, March 13, 2018 3:58 PM, Xiao Guangrong wrote:
> 
> As compression is a heavy work, do not do it in migration thread, instead, we
> post it out as a normal page
> 
> Signed-off-by: Xiao Guangrong 

Hi Guangrong,

Dave asked me to help review your patch, so I will just drop my 2 cents 
wherever possible, and hope that could be inspiring for your work.

> ---
>  migration/ram.c | 32 
>  1 file changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c index
> 7266351fd0..615693f180 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1132,7 +1132,7 @@ static int ram_save_compressed_page(RAMState
> *rs, PageSearchStatus *pss,
>  int pages = -1;
>  uint64_t bytes_xmit = 0;
>  uint8_t *p;
> -int ret, blen;
> +int ret;
>  RAMBlock *block = pss->block;
>  ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
> 
> @@ -1162,23 +1162,23 @@ static int
> ram_save_compressed_page(RAMState *rs, PageSearchStatus *pss,
>  if (block != rs->last_sent_block) {
>  flush_compressed_data(rs);
>  pages = save_zero_page(rs, block, offset);
> -if (pages == -1) {
> -/* Make sure the first page is sent out before other pages */
> -bytes_xmit = save_page_header(rs, rs->f, block, offset |
> -  RAM_SAVE_FLAG_COMPRESS_PAGE);
> -blen = qemu_put_compression_data(rs->f, p, TARGET_PAGE_SIZE,
> - migrate_compress_level());
> -if (blen > 0) {
> -ram_counters.transferred += bytes_xmit + blen;
> -ram_counters.normal++;
> -pages = 1;
> -} else {
> -qemu_file_set_error(rs->f, blen);
> -error_report("compressed data failed!");
> -}
> -}
>  if (pages > 0) {
>  ram_release_pages(block->idstr, offset, pages);
> +} else {
> +/*
> + * Make sure the first page is sent out before other pages.
> + *
> + * we post it as normal page as compression will take much
> + * CPU resource.
> + */
> +ram_counters.transferred += save_page_header(rs, rs->f, 
> block,
> +offset | RAM_SAVE_FLAG_PAGE);
> +qemu_put_buffer_async(rs->f, p, TARGET_PAGE_SIZE,
> +  migrate_release_ram() &
> +  migration_in_postcopy());
> +ram_counters.transferred += TARGET_PAGE_SIZE;
> +ram_counters.normal++;
> +pages = 1;
>  }
>  } else {
>  pages = save_zero_page(rs, block, offset);
> --

I agree that this patch is an improvement for the current implementation. So 
just pile up mine here:
Reviewed-by: Wei Wang 

If you are interested in something more aggressive, I can share an alternative 
approach, which I think would be better. Please see below.

Actually, we can use the multi-threaded compression for the first page as well, 
which will not block the migration thread progress. The advantage is that we 
can enjoy the compression benefit for the first page and meanwhile not blocking 
the migration thread - the page is given to a compression thread and compressed 
asynchronously to the migration thread execution.

The main barrier to achieving the above that is that we need to make sure the 
first page of each block is sent first in the multi-threaded environment. We 
can twist the current implementation to achieve that, which is not hard:

For example, we can add a new flag to RAMBlock - bool first_page_added. In each 
thread of compression, they need
1) check if this is the first page of the block.
2) If it is the first page, set block->first_page_added after sending the page;
3) If it is not the first the page, wait to send the page only when 
block->first_page_added is set.

Best,
Wei

Re: [Qemu-devel] [PATCH v3] qemu-doc: Rework the network options chapter to make "-net" less prominent

2018-03-27 Thread Jason Wang




On 2018年03月27日 21:16, Thomas Huth wrote:

   # launch vde switch
   vde_switch -F -sock /tmp/myswitch
   # launch QEMU instance
-qemu-system-i386 linux.img -net nic -net vde,sock=/tmp/myswitch
+qemu-system-i386 linux.img -nic vde,sock=/tmp/myswitch

I think we should use -netdev here?

I've had that in the original version of this patch, but Paolo suggested
to use -nic instead since this is more "user-friendly":

https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg05258.html

I personally don't mind whether we use "-netdev" or "-nic" in the
examples, as long as we finally get rid of "-net" there...


Right.


Please let me
know which way you prefer, so I can respin the patch again if necessary.

  Thomas



But I think we should be consistent: in your patch some "-net" was 
replaced by "-nic", but others were replaced by "-netdev".


Thanks

Re: [Qemu-devel] [PATCH 0/3] e1000 migration changes for 2.12

2018-03-27 Thread Jason Wang




On 2018年03月27日 22:26, Dr. David Alan Gilbert wrote:

* Jason Wang (jasow...@redhat.com) wrote:


On 2018年03月27日 19:34, Dr. David Alan Gilbert (git) wrote:

From: "Dr. David Alan Gilbert" 

Hi Ed, Jason,
This set of patches change the e1000 migration code to make
it easier to keep with compatibility with older versions in backwards
migration;  but I do need some advice whether I need to do more as well.

I think the first and second patch are fairly uncontrovercial and I
would like them for 2.12, since it'll make any future changes easier.
The third one changes the default behaviour, so again I'd prefer it but
lets see what you think.

The patches looks good to me. So for the changes of default behavior, did
you mean we can make the migration to older versions work?

Right.


My question however, without knowing the internals of the e1000, is
whether when ommitting the subsection, should the code in 2.12 be
changing the data it sends back in the main section of data?

I'm not sure I get the meaning here. But it looks to me turning it off for
old machine types makes sense, otherwise, management need to set it
explicitly.

OK, let me expand the question a bit.
If I understand correctly the d6244b description, there's two pieces of
state (TSO and non-TSO) where there used to be only one.
Now, with the new format we migrate both pieces of state, but lets think
about what happens if we have to migrate only one piece.

a) 2.11->2.11 migrated the only piece it knew about; so I guess the only
problem was really the UDP corruption mentioned in the commit.

b) 2.11->2.12 - it receives the wrongly merged piece of state and puts it
in - well which of the two states does it load it into?  What's the
effect of that?


So I think what we can do in this case is keep the (buggy) behavior 
here. E.g put the state into both props and tso_props.




c) 2.12(+my mod)->2.11 ok, so 2.12 will have filled in both sets of state
internally; but now it's only going to send one of them over to 2.11 -
which one gets sent to 2.11? Is it the one that 2.11 is expecting?


Then we can keep the behavior of 2.11 when migrate_tso_props (probably 
need a better name) is false we will use props for all contexts.




d) 2.12(+my mod)->2.12(+my mod) with an old machine type, again we're only
going to send one set of data (same as c) - but what's 2.12 going to
make of the one piece of state received?


So if we do like above, guest will see buggy device after migration. 
(But do we really care this case?).


Thanks



(b) is an existing question.

Dave


Thanks


Dave


Dr. David Alan Gilbert (3):
e1000: Convert v3 fields to subsection
e1000: wire new subsection to property
e1000: Old machine types, turn new subsection off

   hw/net/e1000.c  | 46 ++
   include/hw/compat.h |  4 
   2 files changed, 38 insertions(+), 12 deletions(-)


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [PATCH] file-posix: Support fallocate for block device

2018-03-27 Thread zhenwei.pi

since linux 4.9, block device supports fallocate. kernel issues
block device zereout request and invalidates page cache. So
ioctl(fd, FALLOC_FL_ZERO_RANGE...) is safer than ioctl(fd,
BLKZEROOUT...). try to call do_fallocate, if failing, fallback.

use new field "has_fallocate_zero_range" with default value as
true. if do_fallocate returns -ENOTSUP, it will be set false.

Signed-off-by: zhenwei.pi 
---
 block/file-posix.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index d7fb772..842e940 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -159,8 +159,9 @@ typedef struct BDRVRawState {
 bool discard_zeroes:1;
 bool use_linux_aio:1;
 bool page_cache_inconsistent:1;
-bool has_fallocate;
-bool needs_alignment;
+bool has_fallocate:1;
+bool has_fallocate_zero_range:1;
+bool needs_alignment:1;
 
 PRManager *pr_mgr;
 } BDRVRawState;
@@ -549,6 +550,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 
 s->has_discard = true;
 s->has_write_zeroes = true;
+s->has_fallocate_zero_range = true;
 if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
 s->needs_alignment = true;
 }
@@ -1365,10 +1367,6 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData 
*aiocb)
 int64_t len;
 #endif
 
-if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
-return handle_aiocb_write_zeroes_block(aiocb);
-}
-
 #ifdef CONFIG_XFS
 if (s->is_xfs) {
 return xfs_write_zeroes(s, aiocb->aio_offset, aiocb->aio_nbytes);
@@ -1376,16 +1374,25 @@ static ssize_t 
handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
 #endif
 
 #ifdef CONFIG_FALLOCATE_ZERO_RANGE
-if (s->has_write_zeroes) {
+/* since linux 4.9, block device supports fallocate. kernel issues
+ * block device zereout request and invalidates page cache. So
+ * ioctl(fd, FALLOC_FL_ZERO_RANGE...) is safer than ioctl(fd,
+ * BLKZEROOUT...). try to call do_fallocate, if failing, fallback.
+ */
+if (s->has_fallocate_zero_range) {
 int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
aiocb->aio_offset, aiocb->aio_nbytes);
-if (ret == 0 || ret != -ENOTSUP) {
+if (ret == 0) {
 return ret;
-}
-s->has_write_zeroes = false;
+} else if (ret == -ENOTSUP)
+s->has_fallocate_zero_range = false;
 }
 #endif
 
+if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
+return handle_aiocb_write_zeroes_block(aiocb);
+}
+
 #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
 if (s->has_discard && s->has_fallocate) {
 int ret = do_fallocate(s->fd,
-- 
2.7.4

Re: [Qemu-devel] [PATCH 3/8] migration: support to detectcompression and decompression errors

2018-03-27 Thread Xiao Guangrong




On 03/28/2018 08:43 AM, jiang.bi...@zte.com.cn wrote:

On 03/27/2018 07:17 PM, Peter Xu wrote:

On Tue, Mar 27, 2018 at 03:42:32AM +0800, Xiao Guangrong wrote:

[...]


It'll be understandable to me if the problem is that the compress()
API does not allow the input buffer to be changed during the whole
period of the call.  If that is a must, this patch for sure helps.


Yes, that is exactly what i want to say. :)


So I think now I know what this patch is for. :) And yeah, it makes
sense.

Though another question would be: if the buffer is updated during
compress() and compress() returned error, would that pollute the whole
z_stream or it only fails the compress() call?



I guess deflateReset() can recover everything, i.e, keep z_stream as
it is init'ed by deflate_init().


(Same question applies to decompress().)

If it's only a compress() error and it won't pollute z_stream (or say,
it can be recovered after a deflateReset() and then we can continue to
call deflate() without problem), then we'll actually have two
alternatives to solve this "buffer update" issue:

1. Use the approach of current patch: we copy the page every time, so
 deflate() never fails because update never happens.  But it's slow
 since we copy the pages every time.

2. Use the old approach, and when compress() fail, we just ignore that
 page (since now we know that error _must_ be caused by page update,
 then we are 100% sure that we'll send that page again so it'll be
 perfectly fine).



No, we can't make the assumption that "error _must_ be caused by page update".
No document/ABI about compress/decompress promised it. :)

So, as I metioned before, can we just distingush the decompress/compress errors
from errors caused by page update by the return code of inflate/deflate?
According to the zlib manual, there seems to be several error codes for 
different
cases,
#define Z_ERRNO(-1)
#define Z_STREAM_ERROR (-2)
#define Z_DATA_ERROR   (-3)
#define Z_MEM_ERROR(-4)
#define Z_BUF_ERROR(-5)
#define Z_VERSION_ERROR (-6)
Did you check the return code when silent failure(not caused by page update)
happened before? :)


I am afraid there is no such error code and i guess zlib is not designed to
compress the data which is being modified.

Re: [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug

2018-03-27 Thread Richard Henderson

On 03/28/2018 08:15 AM, Michael Clark wrote:
> On looking at this again, I think we may need to remove
> the qemu_tcg_mttcg_enabled conditional and always return dirty if the state is
> initial or clean, but not off.

Yes.

> While testing on uniprocessor worked okay, it's likely because we were lucky
> and there was no task migration or multiple FPU tasks working.

Also yes.

> > +        if (qemu_tcg_mttcg_enabled()) {
> > +            /* FP is always dirty or off */
> > +            if (mstatus & MSTATUS_FS) {
> > +                mstatus |= MSTATUS_FS;
> > +            }
> > +        }

I've just posted an alternate patch set to track MSTATUS_FS more exactly, but
if that's thought to be too much this late in the cycle, then I'll sign off on
this patch without the mttcg test.


r~

[Qemu-devel] [PATCH 2/2] target/riscv: Mark MSTATUS_FS dirty

2018-03-27 Thread Richard Henderson

Writes to the FP register file mark the register file as dirty.

Signed-off-by: Richard Henderson 
---
 target/riscv/op_helper.c | 25 +
 target/riscv/translate.c | 40 +++-
 2 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index e34715df4e..74eeef0be8 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -72,11 +72,20 @@ void helper_raise_exception(CPURISCVState *env, uint32_t 
exception)
 do_raise_exception_err(env, exception, 0);
 }
 
-static void validate_mstatus_fs(CPURISCVState *env, uintptr_t ra)
+static void validate_mstatus_fs(CPURISCVState *env, uintptr_t ra, bool write)
 {
 #ifndef CONFIG_USER_ONLY
-if (!(env->mstatus & MSTATUS_FS)) {
+switch (get_field(env->mstatus, MSTATUS_FS)) {
+case 0: /* disabled */
 do_raise_exception_err(env, RISCV_EXCP_ILLEGAL_INST, ra);
+g_assert_not_reached();
+case 1: /* initial */
+case 2: /* clean */
+if (write) {
+/* Mark fp status as dirty.  */
+env->mstatus = MSTATUS_FS;
+}
+break;
 }
 #endif
 }
@@ -96,15 +105,15 @@ void csr_write_helper(CPURISCVState *env, target_ulong 
val_to_write,
 
 switch (csrno) {
 case CSR_FFLAGS:
-validate_mstatus_fs(env, GETPC());
+validate_mstatus_fs(env, GETPC(), true);
 cpu_riscv_set_fflags(env, val_to_write & (FSR_AEXC >> FSR_AEXC_SHIFT));
 break;
 case CSR_FRM:
-validate_mstatus_fs(env, GETPC());
+validate_mstatus_fs(env, GETPC(), true);
 env->frm = val_to_write & (FSR_RD >> FSR_RD_SHIFT);
 break;
 case CSR_FCSR:
-validate_mstatus_fs(env, GETPC());
+validate_mstatus_fs(env, GETPC(), true);
 env->frm = (val_to_write & FSR_RD) >> FSR_RD_SHIFT;
 cpu_riscv_set_fflags(env, (val_to_write & FSR_AEXC) >> FSR_AEXC_SHIFT);
 break;
@@ -379,13 +388,13 @@ target_ulong csr_read_helper(CPURISCVState *env, 
target_ulong csrno)
 
 switch (csrno) {
 case CSR_FFLAGS:
-validate_mstatus_fs(env, GETPC());
+validate_mstatus_fs(env, GETPC(), false);
 return cpu_riscv_get_fflags(env);
 case CSR_FRM:
-validate_mstatus_fs(env, GETPC());
+validate_mstatus_fs(env, GETPC(), false);
 return env->frm;
 case CSR_FCSR:
-validate_mstatus_fs(env, GETPC());
+validate_mstatus_fs(env, GETPC(), false);
 return (cpu_riscv_get_fflags(env) << FSR_AEXC_SHIFT)
 | (env->frm << FSR_RD_SHIFT);
 /* rdtime/rdtimeh is trapped and emulated by bbl in system mode */
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index a30724aa90..08fc42a679 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -660,6 +660,31 @@ static void gen_store(DisasContext *ctx, uint32_t opc, int 
rs1, int rs2,
 tcg_temp_free(dat);
 }
 
+#ifndef CONFIG_USER_ONLY
+/* The states of mstatus_fs are:
+ * 0 = disabled, 1 = initial, 2 = clean, 3 = dirty
+ * We will have already diagnosed disabled state,
+ * and need to turn initial/clean into dirty.
+ */
+static void mark_fs_dirty(DisasContext *ctx)
+{
+TCGv tmp;
+if (ctx->mstatus_fs == MSTATUS_FS) {
+return;
+}
+/* Remember the state change for the rest of the TB.  */
+ctx->mstatus_fs = MSTATUS_FS;
+
+tmp = tcg_temp_new();
+tcg_gen_ld_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus));
+tcg_gen_ori_tl(tmp, tmp, MSTATUS_FS);
+tcg_gen_st_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus));
+tcg_temp_free(tmp);
+}
+#else
+static inline void mark_fs_dirty(DisasContext *ctx) { }
+#endif
+
 static void gen_fp_load(DisasContext *ctx, uint32_t opc, int rd,
 int rs1, target_long imm)
 {
@@ -688,6 +713,8 @@ static void gen_fp_load(DisasContext *ctx, uint32_t opc, 
int rd,
 break;
 }
 tcg_temp_free(t0);
+
+mark_fs_dirty(ctx);
 }
 
 static void gen_fp_store(DisasContext *ctx, uint32_t opc, int rs1,
@@ -985,6 +1012,7 @@ static void gen_fp_arith(DisasContext *ctx, uint32_t opc, 
int rd,
  int rs1, int rs2, int rm)
 {
 TCGv t0 = NULL;
+bool fp_output = true;
 
 if (ctx->mstatus_fs == 0) {
 goto do_illegal;
@@ -1047,6 +1075,7 @@ static void gen_fp_arith(DisasContext *ctx, uint32_t opc, 
int rd,
 }
 gen_set_gpr(rd, t0);
 tcg_temp_free(t0);
+fp_output = false;
 break;
 
 case OPC_RISC_FCVT_W_S:
@@ -1076,6 +1105,7 @@ static void gen_fp_arith(DisasContext *ctx, uint32_t opc, 
int rd,
 }
 gen_set_gpr(rd, t0);
 tcg_temp_free(t0);
+fp_output = false;
 break;
 
 case OPC_RISC_FCVT_S_W:
@@ -1126,6 +1156,7 @@ static void gen_fp_arith(DisasContext *ctx, uint32_t opc, 
int rd,
 }
 gen_set_gpr(rd, t0);
 tcg_temp_free(t0);
+fp_output =

[Qemu-devel] [PATCH 1/2] target/riscv: Split out mstatus_fs from tb_flags during translation

2018-03-27 Thread Richard Henderson

We will want to track changes to mstatus_fs through the TB.
As there is nothing else in tb_flags at the moment, remove
the variable from DisasContext.

Signed-off-by: Richard Henderson 
---
 target/riscv/cpu.h   |  6 +++---
 target/riscv/translate.c | 10 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 41e06ac0f9..d201dd3e90 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -269,8 +269,8 @@ void QEMU_NORETURN do_raise_exception_err(CPURISCVState 
*env,
 target_ulong cpu_riscv_get_fflags(CPURISCVState *env);
 void cpu_riscv_set_fflags(CPURISCVState *env, target_ulong);
 
-#define TB_FLAGS_MMU_MASK  3
-#define TB_FLAGS_FP_ENABLE MSTATUS_FS
+#define TB_FLAGS_MMU_MASK   3
+#define TB_FLAGS_MSTATUS_FS MSTATUS_FS
 
 static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
 target_ulong *cs_base, uint32_t *flags)
@@ -278,7 +278,7 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState *env, 
target_ulong *pc,
 *pc = env->pc;
 *cs_base = 0;
 #ifdef CONFIG_USER_ONLY
-*flags = TB_FLAGS_FP_ENABLE;
+*flags = TB_FLAGS_MSTATUS_FS;
 #else
 *flags = cpu_mmu_index(env, 0) | (env->mstatus & MSTATUS_FS);
 #endif
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 808eab7f50..a30724aa90 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -43,7 +43,7 @@ typedef struct DisasContext {
 target_ulong pc;
 target_ulong next_pc;
 uint32_t opcode;
-uint32_t flags;
+uint32_t mstatus_fs;
 uint32_t mem_idx;
 int singlestep_enabled;
 int bstate;
@@ -665,7 +665,7 @@ static void gen_fp_load(DisasContext *ctx, uint32_t opc, 
int rd,
 {
 TCGv t0;
 
-if (!(ctx->flags & TB_FLAGS_FP_ENABLE)) {
+if (ctx->mstatus_fs == 0) {
 gen_exception_illegal(ctx);
 return;
 }
@@ -695,7 +695,7 @@ static void gen_fp_store(DisasContext *ctx, uint32_t opc, 
int rs1,
 {
 TCGv t0;
 
-if (!(ctx->flags & TB_FLAGS_FP_ENABLE)) {
+if (ctx->mstatus_fs == 0) {
 gen_exception_illegal(ctx);
 return;
 }
@@ -986,7 +986,7 @@ static void gen_fp_arith(DisasContext *ctx, uint32_t opc, 
int rd,
 {
 TCGv t0 = NULL;
 
-if (!(ctx->flags & TB_FLAGS_FP_ENABLE)) {
+if (ctx->mstatus_fs == 0) {
 goto do_illegal;
 }
 
@@ -1862,8 +1862,8 @@ void gen_intermediate_code(CPUState *cs, TranslationBlock 
*tb)
 
 ctx.tb = tb;
 ctx.bstate = BS_NONE;
-ctx.flags = tb->flags;
 ctx.mem_idx = tb->flags & TB_FLAGS_MMU_MASK;
+ctx.mstatus_fs = tb->flags & TB_FLAGS_MSTATUS_FS;
 ctx.frm = -1;  /* unknown rounding mode */
 
 num_insns = 0;
-- 
2.14.3

[Qemu-devel] [PATCH for-2.12 0/2] RISC-V: Mark FP status dirty

2018-03-27 Thread Richard Henderson

Since it was my patch that broke FP state tracking in the
first place, I feel obligated to fix it again.

Mark mstatus[fs] as dirty whenever we write to the file.
This can be optimized by only doing so once within a TB
which initially began with a clean file.

I have not yet put together an environment that can test
this, so I'll need someone else to give it a go.


r~


Richard Henderson (2):
  target/riscv: Split out mstatus_fs from tb_flags during translation
  target/riscv: Mark MSTATUS_FS dirty

 target/riscv/cpu.h   |  6 +++---
 target/riscv/op_helper.c | 25 
 target/riscv/translate.c | 50 ++--
 3 files changed, 64 insertions(+), 17 deletions(-)

-- 
2.14.3

Re: [Qemu-devel] [patches] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug

2018-03-27 Thread Palmer Dabbelt


On Tue, 27 Mar 2018 12:54:47 PDT (-0700), Michael Clark wrote:

This change is a workaround for a bug where mstatus.FS
is not correctly reporting dirty when MTTCG and SMP are
enabled which results in the floating point register file
not being saved during context switches. This a critical
bug for RISC-V in QEMU as it results in floating point
register file corruption when running SMP Linux in the
RISC-V 'virt' machine.

This workaround will return dirty if mstatus.FS is
switched from off to initial or clean. We have checked
the specification and it is legal for an implementation
to return either off, or dirty, if set to initial or clean.

This workaround will result in unnecessary floating point
save restore. When mstatus.FS is off, floating point
instruction trap to indicate the process is using the FPU.
The OS can then save floating-point state of the previous
process using the FPU and set mstatus.FS to initial or
clean. With this workaround, mstatus.FS will always return
dirty if set to a non-zero value, indicating floating point
save restore is necessary, versus misreporting mstatus.FS
resulting in floating point register file corruption.

Cc: Palmer Dabbelt 
Cc: Sagar Karandikar 
Cc: Bastian Koppelmann 
Cc: Peter Maydell 
Tested-by: Richard W.M. Jones 
Signed-off-by: Michael Clark 
---
 target/riscv/op_helper.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index e34715d..7281b98 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, target_ulong 
val_to_write,
 }

 mstatus = (mstatus & ~mask) | (val_to_write & mask);
-int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
-dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
+
+/* Note: this is a workaround for an issue where mstatus.FS
+   does not report dirty when SMP and MTTCG is enabled. This
+   workaround is technically compliant with the RISC-V Privileged
+   specification as it is legal to return only off, or dirty,
+   however this may cause unnecessary saves of floating point state.
+   Without this workaround, floating point state is not saved and
+   restored correctly when SMP and MTTCG is enabled, */
+if (qemu_tcg_mttcg_enabled()) {
+/* FP is always dirty or off */
+if (mstatus & MSTATUS_FS) {
+mstatus |= MSTATUS_FS;
+}
+}
+
+int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
+((mstatus & MSTATUS_XS) == MSTATUS_XS);
 mstatus = set_field(mstatus, MSTATUS_SD, dirty);
 env->mstatus = mstatus;
 break;


FWIW, this isn't just "technically compliant with the RISC-V Privileged 
specification" but it's actually an intended design point.  We're considering 
making this a bit more explicit in the ISA manual -- well, unless Andrew 
decides I'm being too pedantic in one of my possible readings of the spec :).


Reviewed-By: Palmer Dabbelt

Re: [Qemu-devel] Regression on KVM qemu-system-aarch64 since "monitor: enable IO thread for (qmp & !mux) typed"

2018-03-27 Thread Peter Xu

On Fri, Mar 23, 2018 at 01:36:36PM +0100, Auger Eric wrote:
> Hi,
> 
> On 23/03/18 13:11, Peter Maydell wrote:
> > On 23 March 2018 at 12:01, Auger Eric  wrote:
> >> Hi,
> >>
> >> On 23/03/18 11:26, Peter Maydell wrote:
> >>> On 23 March 2018 at 10:24, Auger Eric  wrote:
>  Hi,
> 
>  I observe a regression on KVM accelerated qemu-system-aarch64:
> 
>  Unexpected error in kvm_device_access() at
>  /home/augere/UPSTREAM/qemu/accel/kvm/kvm-all.c:2164:
>  2018-03-23T09:59:59.629439Z qemu-system-aarch64: KVM_GET_DEVICE_ATTR
>  failed: Group 6 attr 0xc664: Device or resource busy
>  2018-03-23 10:00:00.085+: shutting down, reason=crashed
> >>>
> >>> Can you get a backtrace for this? (I guess you'd need to fiddle
> >>> with the kvm_device_access() code to make it assert rather
> >>> than passing back the error).
> >>
> >> OK. I will try to do so. As I could have expected, I cannot reproduce on
> >> a standalone qemu command line. The problem observed above is seen with
> >> libvirt launch which may be doing some other QMP stuff concurrently?
> > 
> > Hmm, that could be a bit painful to debug. I dunno if libvirt
> > has a "launch QEMU under gdb" option. If not, you could try
> > something like:
> >if (condition we want to get a backtrace on) {
> >printf("hit condition, attach gdb to process %d\n", (int)getpid());
> >for (;;) { }
> >}
> 
> Thanks for the hint. Here is the stack I get.
> 
> #0  kvm_device_access (fd=31, group=6, attr=50788, val=0x5937c88, 
> write=false, errp=0x16984a8 ) at 
> /home/augere/UPSTREAM/qemu/accel/kvm/kvm-all.c:2164
> #1  0x004f8ce4 in arm_gicv3_icc_reset (env=0xa1fc8330, 
> ri=0x597f910) at /home/augere/UPSTREAM/qemu/hw/intc/arm_gicv3_kvm.c:632
> #2  0x006351ac in cp_reg_reset (key=0x597f730, value=0x597f910, 
> opaque=0xa1fc0010) at /home/augere/UPSTREAM/qemu/target/arm/cpu.c:78
> #3  0xa47edce4 in g_hash_table_foreach () from /lib64/libglib-2.0.so.0
> #4  0x00635394 in arm_cpu_reset (s=0xa1fc0010) at 
> /home/augere/UPSTREAM/qemu/target/arm/cpu.c:130
> #5  0x0090c888 in cpu_reset (cpu=0xa1fc0010) at qom/cpu.c:249
> #6  0x005793d8 in do_cpu_reset (opaque=0xa1fc0010) at 
> /home/augere/UPSTREAM/qemu/hw/arm/boot.c:665
> #7  0x0073095c in qemu_devices_reset () at hw/core/reset.c:69
> #8  0x006976e0 in qemu_system_reset (reason=SHUTDOWN_CAUSE_NONE) at 
> vl.c:1731
> #9  0x0069fd60 in main (argc=69, argv=0xe877d1a8, 
> envp=0xe877d3d8) at vl.c:4697

I think current master should work fine with ARM KVM now since OOB is
now off by default. But does ARM use postcopy, and will ARM need the
coming network failure recovery feature?

If so, maybe we'll still need to have a look on this single problem
(this is the only non-testcase issue I know now with Out-Of-Band).

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH v3 1/1] hw/scsi: support SCSI-2 passthrough without PI

2018-03-27 Thread Fam Zheng

On Tue, 03/27 18:14, Daniel Henrique Barboza wrote:
> QEMU SCSI code makes assumptions about how the PROTECT and BYTCHK
> works in the protocol, denying support for PI (Protection
> Information) in case the guest OS requests it. However, in SCSI versions 2
> and older, there is no PI concept in the protocol.
> 
> This means that when dealing with such devices:
> 
> - there is no PROTECT bit in byte 5 of the standard INQUIRY response. The
> whole byte is marked as "Reserved";
> 
> - there is no RDPROTECT in byte 2 of READ. We have 'Logical Unit Number'
> in this field instead;
> 
> - there is no VRPROTECT in byte 2 of VERIFY. We have 'Logical Unit Number'
> in this field instead. This also means that the BYTCHK bit in this case
> is not related to PI.
> 
> Since QEMU does not consider these changes, a SCSI passthrough using
> a SCSI-2 device will not work. It will mistake these fields with
> PI information and return Illegal Request SCSI SENSE thinking
> that the driver is asking for PI support.
> 
> This patch fixes it by adding a new attribute called 'scsi_version'
> that is read from the standard INQUIRY response of passthrough
> devices. This allows for a version verification before applying
> conditions related to PI that doesn't apply for older versions.
> 
> Reported-by: Dac Nguyen 
> Signed-off-by: Daniel Henrique Barboza 
> ---
> 
> Changes in v3:
> - moved the scsi_version initialization from realize functions to
> reset functions, allowing the scsi_version to be redefined again after
> each reboot.

Reviewed-by: Fam Zheng

Re: [Qemu-devel] [PATCH] vfio: remove DPRINTF() definition from vfio-common.h

2018-03-27 Thread Michael S. Tsirkin

On Wed, Mar 28, 2018 at 09:19:53AM +0800, Tiwei Bie wrote:
> This macro isn't used by any VFIO code. And its name is
> too generic. The vfio-common.h (in include/hw/vfio) can
> be included by other modules in QEMU. It can introduce
> conflicts.
> 
> Signed-off-by: Tiwei Bie 

Reviewed-by: Michael S. Tsirkin 

> ---
>  include/hw/vfio/vfio-common.h | 9 -
>  1 file changed, 9 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index d9360148e6..cecbd4e386 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -34,15 +34,6 @@
>  #define ERR_PREFIX "vfio error: %s: "
>  #define WARN_PREFIX "vfio warning: %s: "
>  
> -/*#define DEBUG_VFIO*/
> -#ifdef DEBUG_VFIO
> -#define DPRINTF(fmt, ...) \
> -do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
> -#else
> -#define DPRINTF(fmt, ...) \
> -do { } while (0)
> -#endif
> -
>  enum {
>  VFIO_DEVICE_TYPE_PCI = 0,
>  VFIO_DEVICE_TYPE_PLATFORM = 1,
> -- 
> 2.11.0

[Qemu-devel] [PATCH] vfio: remove DPRINTF() definition from vfio-common.h

2018-03-27 Thread Tiwei Bie

This macro isn't used by any VFIO code. And its name is
too generic. The vfio-common.h (in include/hw/vfio) can
be included by other modules in QEMU. It can introduce
conflicts.

Signed-off-by: Tiwei Bie 
---
 include/hw/vfio/vfio-common.h | 9 -
 1 file changed, 9 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index d9360148e6..cecbd4e386 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -34,15 +34,6 @@
 #define ERR_PREFIX "vfio error: %s: "
 #define WARN_PREFIX "vfio warning: %s: "
 
-/*#define DEBUG_VFIO*/
-#ifdef DEBUG_VFIO
-#define DPRINTF(fmt, ...) \
-do { fprintf(stderr, "vfio: " fmt, ## __VA_ARGS__); } while (0)
-#else
-#define DPRINTF(fmt, ...) \
-do { } while (0)
-#endif
-
 enum {
 VFIO_DEVICE_TYPE_PCI = 0,
 VFIO_DEVICE_TYPE_PLATFORM = 1,
-- 
2.11.0

Re: [Qemu-devel] [PATCH 3/8] migration: support to detectcompression and decompression errors

2018-03-27 Thread jiang.biao2

> On 03/27/2018 07:17 PM, Peter Xu wrote:
>> On Tue, Mar 27, 2018 at 03:42:32AM +0800, Xiao Guangrong wrote:
>> 
>> [...]
>> 
 It'll be understandable to me if the problem is that the compress()
 API does not allow the input buffer to be changed during the whole
 period of the call.  If that is a must, this patch for sure helps.
>>>
>>> Yes, that is exactly what i want to say. :)
>> 
>> So I think now I know what this patch is for. :) And yeah, it makes
>> sense.
>> 
>> Though another question would be: if the buffer is updated during
>> compress() and compress() returned error, would that pollute the whole
>> z_stream or it only fails the compress() call?
>> 
>
> I guess deflateReset() can recover everything, i.e, keep z_stream as
> it is init'ed by deflate_init().
>
>> (Same question applies to decompress().)
>> 
>> If it's only a compress() error and it won't pollute z_stream (or say,
>> it can be recovered after a deflateReset() and then we can continue to
>> call deflate() without problem), then we'll actually have two
>> alternatives to solve this "buffer update" issue:
>> 
>> 1. Use the approach of current patch: we copy the page every time, so
>> deflate() never fails because update never happens.  But it's slow
>> since we copy the pages every time.
>> 
>> 2. Use the old approach, and when compress() fail, we just ignore that
>> page (since now we know that error _must_ be caused by page update,
>> then we are 100% sure that we'll send that page again so it'll be
>> perfectly fine).
>> 
>
> No, we can't make the assumption that "error _must_ be caused by page 
> update". 
> No document/ABI about compress/decompress promised it. :)
So, as I metioned before, can we just distingush the decompress/compress errors 
from errors caused by page update by the return code of inflate/deflate?
According to the zlib manual, there seems to be several error codes for 
different 
cases,
#define Z_ERRNO(-1) 
#define Z_STREAM_ERROR (-2) 
#define Z_DATA_ERROR   (-3) 
#define Z_MEM_ERROR(-4)
#define Z_BUF_ERROR(-5)
#define Z_VERSION_ERROR (-6)
Did you check the return code when silent failure(not caused by page update) 
happened before? :)

Jiang
Regards

[Qemu-devel] [PATCH v2 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug

2018-03-27 Thread Michael Clark

This change is a workaround for a bug where mstatus.FS
is not correctly reporting dirty after operations that
modify floating point registers. This a critical bug
or RISC-V in QEMU as it results in floating point
register file corruption when running SMP Linux due to
task migration and possibly uniprocessor Linux if
more than one process is using the FPU.

This workaround will return dirty if mstatus.FS is
switched from off to initial or clean. According to
the specification it is legal for an implementation
to return only off, or dirty.

Cc: Palmer Dabbelt 
Cc: Sagar Karandikar 
Cc: Bastian Koppelmann 
Cc: Peter Maydell 
Cc: Alex Bennée 
Cc: Richard Henderson 
Cc: Philippe Mathieu-Daudé 
Tested-by: Richard W.M. Jones 
Signed-off-by: Michael Clark 
---
 target/riscv/op_helper.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index e34715d..7c6068b 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -144,8 +144,21 @@ void csr_write_helper(CPURISCVState *env, target_ulong 
val_to_write,
 }
 
 mstatus = (mstatus & ~mask) | (val_to_write & mask);
-int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
-dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
+
+/* Note: this is a workaround for an issue where mstatus.FS
+   does not report dirty after floating point operations
+   that modify floating point state. This workaround is
+   technically compliant with the RISC-V Privileged
+   specification as it is legal to return only off, or dirty.
+   at the expense of extra floating point save/restore. */
+
+/* FP is always dirty or off */
+if (mstatus & MSTATUS_FS) {
+mstatus |= MSTATUS_FS;
+}
+
+int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
+((mstatus & MSTATUS_XS) == MSTATUS_XS);
 mstatus = set_field(mstatus, MSTATUS_SD, dirty);
 env->mstatus = mstatus;
 break;
-- 
2.7.0

[Qemu-devel] [PATCH v2 1/1] RISC-V: Workaround for critical mstatus.FS bug

2018-03-27 Thread Michael Clark

This change is a workaround for a bug where mstatus.FS
is not correctly reporting dirty after operations that
modify floating point registers. This a critical bug
or RISC-V in QEMU as it results in floating point
register file corruption when running SMP Linux due to
task migration and possibly uniprocessor Linux if
more than one process is using the FPU.

This workaround will return dirty if mstatus.FS is
switched from off to initial or clean. According to
the specification it is legal for an implementation
to return only off, or dirty.

Cc: Palmer Dabbelt 
Cc: Sagar Karandikar 
Cc: Bastian Koppelmann 
Cc: Peter Maydell 
Cc: Alex Bennée 
Cc: Richard Henderson 
Cc: Philippe Mathieu-Daudé 
Tested-by: Richard W.M. Jones 
Signed-off-by: Michael Clark 
---
 target/riscv/op_helper.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index e34715d..7c6068b 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -144,8 +144,21 @@ void csr_write_helper(CPURISCVState *env, target_ulong 
val_to_write,
 }
 
 mstatus = (mstatus & ~mask) | (val_to_write & mask);
-int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
-dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
+
+/* Note: this is a workaround for an issue where mstatus.FS
+   does not report dirty after floating point operations
+   that modify floating point state. This workaround is
+   technically compliant with the RISC-V Privileged
+   specification as it is legal to return only off, or dirty.
+   at the expense of extra floating point save/restore. */
+
+/* FP is always dirty or off */
+if (mstatus & MSTATUS_FS) {
+mstatus |= MSTATUS_FS;
+}
+
+int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
+((mstatus & MSTATUS_XS) == MSTATUS_XS);
 mstatus = set_field(mstatus, MSTATUS_SD, dirty);
 env->mstatus = mstatus;
 break;
-- 
2.7.0

[Qemu-devel] [PATCH v2 0/1] RISC-V: Critical fixes for QEMU 2.12

2018-03-27 Thread Michael Clark

This series includes changes that are considered release critical,
such as floating point register file corruption under SMP Linux.

v2

- reverted to Richard W.M. Jone's original, more conservative fix
- reworded comment to be more concise and more general

Michael Clark (1):
  RISC-V: Workaround for critical mstatus.FS bug

 target/riscv/op_helper.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

-- 
2.7.0

Re: [Qemu-devel] [PATCH] target/ppc: Fix reserved bit mask of dstst instruction

2018-03-27 Thread David Gibson

On Mon, Mar 26, 2018 at 01:54:28AM +0200, BALATON Zoltan wrote:
> According to the Vector/SIMD extension documentation bit 6 that is
> currently masked is valid (listed as transient bit) but bits 7 and 8
> should be reserved instead. Fix the mask to match this.
> 
> Signed-off-by: BALATON Zoltan 

Applied to ppc-for-2.13, thanks.

> ---
>  target/ppc/translate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 3457d29..b0d79a3 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -6561,7 +6561,7 @@ GEN_HANDLER(dcbtst, 0x1F, 0x16, 0x07, 0x0001, 
> PPC_CACHE),
>  GEN_HANDLER_E(dcbtls, 0x1F, 0x06, 0x05, 0x0201, PPC_BOOKE, 
> PPC2_BOOKE206),
>  GEN_HANDLER(dcbz, 0x1F, 0x16, 0x1F, 0x03C1, PPC_CACHE_DCBZ),
>  GEN_HANDLER(dst, 0x1F, 0x16, 0x0A, 0x0181, PPC_ALTIVEC),
> -GEN_HANDLER(dstst, 0x1F, 0x16, 0x0B, 0x0201, PPC_ALTIVEC),
> +GEN_HANDLER(dstst, 0x1F, 0x16, 0x0B, 0x0181, PPC_ALTIVEC),
>  GEN_HANDLER(dss, 0x1F, 0x16, 0x19, 0x019FF801, PPC_ALTIVEC),
>  GEN_HANDLER(icbi, 0x1F, 0x16, 0x1E, 0x03E1, PPC_CACHE_ICBI),
>  GEN_HANDLER(dcba, 0x1F, 0x16, 0x17, 0x03E1, PPC_CACHE_DCBA),

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC for-2.13 05/12] target/ppc: Remove fallback 64k pagesize information

2018-03-27 Thread David Gibson

On Tue, Mar 27, 2018 at 03:54:55PM +0200, Greg Kurz wrote:
> On Tue, 27 Mar 2018 15:37:34 +1100
> David Gibson  wrote:
> 
> > CPU definitions for cpus with the 64-bit hash MMU can include a table of
> > available pagesizes.  If this isn't supplied ppc_cpu_instance_init() will
> > fill it in a fallback table based on the POWERPC_MMU_64K bit in mmu_model.
> > 
> > However, it turns out all the cpus which support 64K pages already include
> > an explicit table of page sizes, so there's no point to the fallback table
> > including 64k pages.
> > 
> 
> I was thinking that 64k pages came with POWER5+. At least, this is mentioned
> in several places:
> 
> https://www.ibm.com/support/knowledgecenter/ssw_aix_72/com.ibm.aix.performance/supported_page_sizes_processor_type.htm
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a4a0f2524acc2c602cadd8e743be19d86f3a746b

Ok, I didn't know that.  However, that was already wrong - we weren't
setting the MMU_64K bit for POWER5+.

> And we do support POWER5+ with TCG and KVM PR.

Well, theoretically.  I doubt it's been tested in years, and I
strongly suspect it won't actually work.

> Shouldn't we include an explicit
> table of pages sizes there as well ?

Yeah, but I think it makes more sense to fix that later.  Or, more
likely, not, since no-one actually cares about POWER5.

> 
> > That removes the only place which tests POWERPC_MMU_64K, so we can remove
> > it.  Which in turn allows some logic to be removed from
> > kvm_fixup_page_sizes().
> > 
> > Signed-off-by: David Gibson 
> > ---
> >  target/ppc/cpu-qom.h|  4 
> >  target/ppc/kvm.c|  7 ---
> >  target/ppc/translate_init.c | 20 ++--
> >  3 files changed, 2 insertions(+), 29 deletions(-)
> > 
> > diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
> > index deaa46a14b..9bbb05cf62 100644
> > --- a/target/ppc/cpu-qom.h
> > +++ b/target/ppc/cpu-qom.h
> > @@ -70,7 +70,6 @@ enum powerpc_mmu_t {
> >  #define POWERPC_MMU_64   0x0001
> >  #define POWERPC_MMU_1TSEG0x0002
> >  #define POWERPC_MMU_AMR  0x0004
> > -#define POWERPC_MMU_64K  0x0008
> >  #define POWERPC_MMU_V3   0x0010 /* ISA V3.00 MMU Support */
> >  /* 64 bits PowerPC MMU */
> >  POWERPC_MMU_64B= POWERPC_MMU_64 | 0x0001,
> > @@ -78,15 +77,12 @@ enum powerpc_mmu_t {
> >  POWERPC_MMU_2_03   = POWERPC_MMU_64 | 0x0002,
> >  /* Architecture 2.06 variant   */
> >  POWERPC_MMU_2_06   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
> > - | POWERPC_MMU_64K
> >   | POWERPC_MMU_AMR | 0x0003,
> >  /* Architecture 2.07 variant   */
> >  POWERPC_MMU_2_07   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
> > - | POWERPC_MMU_64K
> >   | POWERPC_MMU_AMR | 0x0004,
> >  /* Architecture 3.00 variant   */
> >  POWERPC_MMU_3_00   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG
> > - | POWERPC_MMU_64K
> >   | POWERPC_MMU_AMR | POWERPC_MMU_V3
> >   | 0x0005,
> >  };
> > diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> > index 79a436a384..6160356a4a 100644
> > --- a/target/ppc/kvm.c
> > +++ b/target/ppc/kvm.c
> > @@ -425,7 +425,6 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >  static bool has_smmu_info;
> >  CPUPPCState *env = >env;
> >  int iq, ik, jq, jk;
> > -bool has_64k_pages = false;
> >  
> >  /* We only handle page sizes for 64-bit server guests for now */
> >  if (!(env->mmu_model & POWERPC_MMU_64)) {
> > @@ -471,9 +470,6 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >   ksps->enc[jk].page_shift)) {
> >  continue;
> >  }
> > -if (ksps->enc[jk].page_shift == 16) {
> > -has_64k_pages = true;
> > -}
> >  qsps->enc[jq].page_shift = ksps->enc[jk].page_shift;
> >  qsps->enc[jq].pte_enc = ksps->enc[jk].pte_enc;
> >  if (++jq >= PPC_PAGE_SIZES_MAX_SZ) {
> > @@ -488,9 +484,6 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
> >  if (!(smmu_info.flags & KVM_PPC_1T_SEGMENTS)) {
> >  env->mmu_model &= ~POWERPC_MMU_1TSEG;
> >  }
> > -if (!has_64k_pages) {
> > -env->mmu_model &= ~POWERPC_MMU_64K;
> > -}
> >  }
> >  
> >  bool kvmppc_is_mem_backend_page_size_ok(const char *obj_path)
> > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > index 29bd6f3654..99be6fcd68 100644
> > --- a/target/ppc/translate_init.c
> > +++ b/target/ppc/translate_init.c
> > @@ -10469,7 +10469,7 @@ static void ppc_cpu_instance_init(Object *obj)
> >

Re: [Qemu-devel] [PATCH] target/ppc: Fix reserved bit mask of dstst instruction

2018-03-27 Thread David Gibson

On Mon, Mar 26, 2018 at 05:24:12AM +0200, BALATON Zoltan wrote:
> On Mon, 26 Mar 2018, David Gibson wrote:
> > On Mon, Mar 26, 2018 at 01:54:28AM +0200, BALATON Zoltan wrote:
> > > According to the Vector/SIMD extension documentation bit 6 that is
> > > currently masked is valid (listed as transient bit) but bits 7 and 8
> > > should be reserved instead. Fix the mask to match this.
> > 
> > What document can I find information on dstst in?  The ISA documents I
> > have handy are either too early (the instruction didn't exist yet) or
> > too late (the instruction was considered obsolete and no details are
> > given).
> 
> I've found it in "PowerPC Microprocessor Family: Vector/SIMD Multimedia
> Extension Technology Programming Environments Manual" Version 2.06 which was
> the first one Google found. According to this document dstst should have the
> same reserved bits as dst.

Thanks.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v1] RISC-V: RISC-V TCG backend work in progress

2018-03-27 Thread Richard Henderson

On 03/28/2018 01:43 AM, Michael Clark wrote:
> > +    if ((ct & TCG_CT_CONST_N12) && val >= -2047 && val <= 2047) {
> 
> +2048?
> 
> We use this constraint for a negatable immediate and the constraint is only
> applied to sub. We have no subi, so we implement subi as addi rd, rs1, -imm
> 
>     case INDEX_op_sub_i32:
>         if (c2) {
>             tcg_out_opc_imm(s, is32bit ? OPC_ADDI : OPC_ADDIW, a0, a1, -a2);
>         } else {
>             tcg_out_opc_reg(s, is32bit ? OPC_SUB : OPC_SUBW, a0, a1, a2);
>         }
>         break;

That's my point.  The "positive" range for addition is -2048...2047, so the
"negative" range for subtraction should be -2047...2048.


r~

Re: [Qemu-devel] [PATCH 3/3] e1000: Old machine types, turn new subsection off

2018-03-27 Thread Ed Swierk

On Tue, Mar 27, 2018 at 10:28 AM, Paolo Bonzini  wrote:
> On 27/03/2018 18:47, Dr. David Alan Gilbert wrote:
>>> So if the subsection is absent you
>>> have to migrate either tx.tso_props or tx.props, depending on s->tx.cptse.
>> Do you mean when sending you have to decide which set to send in the
>> non-subsection data?  And with cptse true that means use tso_props?
>
> Yes.
>
>>> Likewise if you migrate from older versions: if s->tx.props.tse &&
>>> s->tx.cptse, you have to copy s->tx.props to s->tx.tso_props and clear
>>> s->tx.props.
>>
>> I don't see any equivalent code in the existing non-subsection postload to
>> do this; so I'm guessing there are some cases of 2.11->2.12 that will
>> break at the moment?
>
> Yes, I think so.
>
>>> My understanding is that s->tx.tso_props.tse will be 1 if
>>> and only if the source sent s->tx.tso_props.
>> I don't see anything in the current code that migrates tso_props.tse -
>> where does it come from?
>
> Ouch... The tse field is more or less dead in current code AFAICS, but
> it was used in the previous version.  What's the best way then to find
> if the subsection was transmitted?  Do we have anything like a post_load
> callback in the subsection itself?

The TSE flag in the cmd_and_length field of the context descriptor is
useful only as an indication of which context is being updated: TSO
(tso_props) or non-TSO (props). There is no reason to store it or
migrate it, and all prior uses of the stored field were based on an
incorrect understanding of its meaning. Now props.tse is always 0, and
tso_props.tse is always 1 after the first TSO context is processed.

> To find out which "props" to transmit to older QEMU, you can add a
> tp->use_tso_for_migration = tp->cptse just before "if (!(txd_lower &
> E1000_TXD_CMD_EOP))" in process_tx_desc...

tp->cptse only indicates whether the current tx data descriptor should
be segmented using parameters from the last TSO context descriptor.
It's perfectly legal for the guest to set up a TSO context and then
use it for some but not all subsequent data descriptors. tp->cptse
doesn't help in deciding what to migrate.

Whether to migrate props or tso_props back to 2.11 should be instead
based on which was updated last by a context descriptor. Something
like

if (dtype == E1000_TXD_CMD_DEXT) {/* context descriptor */
if (le32_to_cpu(xp->cmd_and_length) & E1000_TXD_CMD_TSE) {
e1000x_read_tx_ctx_descr(xp, >tso_props);
tp->use_tso_for_migration = 1;
tp->tso_frames = 0;
} else {
e1000x_read_tx_ctx_descr(xp, >props);
tp->use_tso_for_migration = 0;
}
return;

--Ed

Re: [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug

2018-03-27 Thread Michael Clark

On Tue, Mar 27, 2018 at 3:17 PM, Philippe Mathieu-Daudé 
wrote:

> Cc'ing Alex and Richard.
>
> On 03/27/2018 04:54 PM, Michael Clark wrote:
> > This change is a workaround for a bug where mstatus.FS
> > is not correctly reporting dirty when MTTCG and SMP are
> > enabled which results in the floating point register file
> > not being saved during context switches. This a critical
> > bug for RISC-V in QEMU as it results in floating point
> > register file corruption when running SMP Linux in the
> > RISC-V 'virt' machine.
> >
> > This workaround will return dirty if mstatus.FS is
> > switched from off to initial or clean. We have checked
> > the specification and it is legal for an implementation
> > to return either off, or dirty, if set to initial or clean.
> >
> > This workaround will result in unnecessary floating point
> > save restore. When mstatus.FS is off, floating point
> > instruction trap to indicate the process is using the FPU.
> > The OS can then save floating-point state of the previous
> > process using the FPU and set mstatus.FS to initial or
> > clean. With this workaround, mstatus.FS will always return
> > dirty if set to a non-zero value, indicating floating point
> > save restore is necessary, versus misreporting mstatus.FS
> > resulting in floating point register file corruption.
> >
> > Cc: Palmer Dabbelt 
> > Cc: Sagar Karandikar 
> > Cc: Bastian Koppelmann 
> > Cc: Peter Maydell 
> > Tested-by: Richard W.M. Jones 
> > Signed-off-by: Michael Clark 
> > ---
> >  target/riscv/op_helper.c | 19 +--
> >  1 file changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
> > index e34715d..7281b98 100644
> > --- a/target/riscv/op_helper.c
> > +++ b/target/riscv/op_helper.c
> > @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env,
> target_ulong val_to_write,
> >  }
> >
> >  mstatus = (mstatus & ~mask) | (val_to_write & mask);
> > -int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
> > -dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
> > +
> > +/* Note: this is a workaround for an issue where mstatus.FS
> > +   does not report dirty when SMP and MTTCG is enabled. This
> > +   workaround is technically compliant with the RISC-V
> Privileged
> > +   specification as it is legal to return only off, or dirty,
> > +   however this may cause unnecessary saves of floating point
> state.
> > +   Without this workaround, floating point state is not saved
> and
> > +   restored correctly when SMP and MTTCG is enabled, */
>

On looking at this again, I think we may need to remove the
qemu_tcg_mttcg_enabled conditional and always return dirty if the state is
initial or clean, but not off.

While testing on uniprocessor worked okay, it's likely because we were
lucky and there was no task migration or multiple FPU tasks working. This
would mean we would revert to Richard W.M. Jones initial patch.

> +if (qemu_tcg_mttcg_enabled()) {
> > +/* FP is always dirty or off */
> > +if (mstatus & MSTATUS_FS) {
> > +mstatus |= MSTATUS_FS;
> > +}
> > +}
> > +
> > +int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
> > +((mstatus & MSTATUS_XS) == MSTATUS_XS);
> >  mstatus = set_field(mstatus, MSTATUS_SD, dirty);
> >  env->mstatus = mstatus;
> >  break;
> >
>

The text from the specification that allows us to always return dirty if
set to initial or clean, is below i.e. Dirty implies state has
"potentially" been modified, so that gives us wriggle room.

"
When an extension's status is set to Off , any instruction that attempts to
read or write the corresponding
state will cause an exception. When the status is Initial, the
corresponding state should
have an initial constant value. When the status is Clean, the corresponding
state is potentially
di fferent from the initial value, but matches the last value stored on a
context swap. When the
status is Dirty, the corresponding state has potentially been modif ed
since the last context save.
"

I think the problem is Linux is setting the state to clean after saving
fpu register state [1], but we have no code in the QEMU FPU operations to
set the state to dirty, if is clean or initial, only code to cause an
exception if the floating point extension state is set to off e.g.

static void gen_fp_store(DisasContext *ctx, uint32_t opc, int rs1,
int rs2, target_long imm)
{
TCGv t0;

if (!(ctx->flags & TB_FLAGS_FP_ENABLE)) {
gen_exception_illegal(ctx);
return;
}

t0 = tcg_temp_new();
gen_get_gpr(t0, rs1);
tcg_gen_addi_tl(t0, t0, imm);

switch (opc) {
case OPC_RISC_FSW:

Re: [Qemu-devel] [PATCH] i386/kvm: add support for KVM_CAP_X86_DISABLE_EXITS

2018-03-27 Thread Michael S. Tsirkin

On Tue, Mar 27, 2018 at 06:36:46PM -0300, Eduardo Habkost wrote:
> On Tue, Mar 27, 2018 at 10:42:56PM +0300, Michael S. Tsirkin wrote:
> > On Fri, Mar 16, 2018 at 07:36:42AM -0700, Wanpeng Li wrote:
> > > From: Wanpeng Li 
> > > 
> > > This patch adds support for KVM_CAP_X86_DISABLE_EXITS. Provides userspace 
> > > with 
> > > per-VM capability(KVM_CAP_X86_DISABLE_EXITS) to not intercept 
> > > MWAIT/HLT/PAUSE 
> > > in order that to improve latency in some workloads.
> > > 
> [...]
> > > diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > > index d23fff1..95ed9eb 100644
> > > --- a/target/i386/kvm.c
> > > +++ b/target/i386/kvm.c
> > > @@ -999,6 +999,18 @@ int kvm_arch_init_vcpu(CPUState *cs)
> > >  }
> > >  }
> > >  
> > > +if (env->features[FEAT_KVM_HINTS] & KVM_HINTS_DEDICATED) {
> > > +int disable_exits = kvm_check_extension(cs->kvm_state, 
> > > KVM_CAP_X86_DISABLE_EXITS);
> > > +if (disable_exits) {
> > > +disable_exits &= (KVM_X86_DISABLE_EXITS_MWAIT |
> > > +  KVM_X86_DISABLE_EXITS_HLT |
> > > +  KVM_X86_DISABLE_EXITS_PAUSE);
> > > +}
> > > +if (kvm_vm_enable_cap(cs->kvm_state, KVM_CAP_X86_DISABLE_EXITS, 
> > > 0, disable_exits)) {
> > > +error_report("kvm: DISABLE EXITS not supported");
> > > +}
> > > +}
> > > +
> > >  qemu_add_vm_change_state_handler(cpu_update_state, env);
> > >  
> > >  c = cpuid_find_entry(_data.cpuid, 1, 0);
> > 
> > Why not a bit per capability?
> > I can see how someone might want to disable mwait exists
> > but not the rest of them.
> 
> kvm-hint-dedicated=on should be used only if the physical CPU is
> dedicated to the VCPU task.  Are there any advantages of getting
> vmexits for HLT and PAUSE if no other task is going to use the
> CPU?

No but there are advantages to using mwait even without a dedicated host
CPU (VCPUs can wake up each other without exiting to hypervisor).

Which is my point - there should be a separate flag to disable mwait
exiting only.

> -- 
> Eduardo

Re: [Qemu-devel] [PATCH 0/3] e1000 migration changes for 2.12

2018-03-27 Thread Ed Swierk

On Tue, Mar 27, 2018 at 7:26 AM, Dr. David Alan Gilbert
 wrote:
> If I understand correctly the d6244b description, there's two pieces of
> state (TSO and non-TSO) where there used to be only one.
> Now, with the new format we migrate both pieces of state, but lets think
> about what happens if we have to migrate only one piece.
>
> a) 2.11->2.11 migrated the only piece it knew about; so I guess the only
> problem was really the UDP corruption mentioned in the commit.

Right.

> b) 2.11->2.12 - it receives the wrongly merged piece of state and puts it
> in - well which of the two states does it load it into?  What's the
> effect of that?

The best we can do is copy the old props into both props and
tso_props. Copying it into just props means that e1000 would suddenly
find all zero offsets in tso_props, screwing up an ongoing TCP session
using TSO in a Windows guest. Conversely copying into just tso_props
would screw up an ongoing UDP session using non-TSO offload in a
Windows guest. Copying means we do no worse than the buggy 2.11
behavior, until both TSO and non-TSO contexts are refreshed and
everything is fine. (For Linux guests it doesn't matter since Linux
always seems to refresh both TSO and non-TSO contexts before every tx
data descriptor.)

> c) 2.12(+my mod)->2.11 ok, so 2.12 will have filled in both sets of state
> internally; but now it's only going to send one of them over to 2.11 -
> which one gets sent to 2.11? Is it the one that 2.11 is expecting?

This case is trickier. We want to copy into the old props whichever
one of props and tso_props was updated most recently by a non-TSO or
TSO context descriptor. Always copying one or the other risks screwing
up an ongoing session in a Windows guest by using outdated offsets.
(Again there's no problem for Linux guests.)

Probably the easiest solution is to add yet another flag (which
doesn't itself get migrated) that's set when tso_props is updated and
cleared when props is updated.

> d) 2.12(+my mod)->2.12(+my mod) with an old machine type, again we're only
> going to send one set of data (same as c) - but what's 2.12 going to
> make of the one piece of state received?

This is the same as (b) I think.

--Ed

[Qemu-devel] [ANNOUNCE] QEMU 2.12.0-rc1 is now available

2018-03-27 Thread Michael Roth

Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
second release candidate for the QEMU 2.12 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-2.12.0-rc1.tar.xz
  http://download.qemu-project.org/qemu-2.12.0-rc1.tar.xz.sig

You can help improve the quality of the QEMU 2.12 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well a documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/2.12

Please add entries to the ChangeLog for the 2.12 release below:

  http://wiki.qemu.org/ChangeLog/2.12

Changes since rc0:

fa3704d877: Update version for v2.12.0-rc1 release (Peter Maydell)
1bb982b8fc: gdbstub: send a termination packet instead of crashing gdb (KONRAD 
Frederic)
0dfddbb537: hmp.c: Revert hmp_info_cpus output format change (Satheesh 
Rajendran)
fa198ad9bd: tests: qmp-test: add test for new "x-oob" (Peter Xu)
ddee57e017: tests: Add parameter to qtest_init_without_qmp_handshake (Eric 
Blake)
be933ffc23: monitor: new parameter "x-oob" (Peter Xu)
6d2d563f8c: qmp: cleanup qmp queues properly (Peter Xu)
1a1b11dc0f: tests: add oob-test for qapi-schema (Peter Xu)
4bebca1e42: tests: let qapi-schema tests detect oob (Peter Xu)
9408860165: qapi: restrict allow-oob value to be "true" (Peter Xu)
9ddb7456c8: qmp: fix qmp_capabilities error regression (Peter Xu)
625eaca9e5: qdict: remove useless cast (Laurent Vivier)
710c263407: error: Remove NULL checks on error_propagate() calls (Laurent 
Vivier)
2d9178d90f: error: Strip trailing '\n' from error string arguments (again 
again) (Laurent Vivier)
fdf235ba15: tests: Silence false positive warning on generated test name (Eric 
Blake)
fa15cf8b5c: qmp-test: fix response leak (Marc-André Lureau)
f5a53faad4: MAINTAINERS: add include/block/aio-wait.h (Stefan Hajnoczi)
35111583aa: coroutine: add test-aio coroutine queue chaining test case (Stefan 
Hajnoczi)
c40a254570: coroutine: avoid co_queue_wakeup recursion (Stefan Hajnoczi)
67a74148d8: queue: add QSIMPLEQ_PREPEND() (Stefan Hajnoczi)
eb69953ecb: macio: fix NULL pointer dereference when issuing IDE trim (Mark 
Cave-Ayland)
caeadbc8ba: ide: fix invalid TRIM range abortion for macio (Anton Nefedov)
d0ce7e9cfc: target/xtensa: fix timers test (Max Filippov)
12ab0b33f1: linux-user/xtensa: remove stray syscall.h (Max Filippov)
2745c3bbf3: target/xtensa/import_core.sh: fix #include  (Max 
Filippov)
dda2441b2b: target/xtensa: add .inc. to non-top level source file names (Max 
Filippov)
a77672ea3d: vmdk: return ERROR when cluster sector is larger than vmdk 
limitation (yuchenlin)
f7640f0dbc: iotests: enable shared migration cases in 169 (Vladimir 
Sementsov-Ogievskiy)
2d949dfcef: qcow2: fix bitmaps loading when bitmaps already exist (Vladimir 
Sementsov-Ogievskiy)
b1336cc2ec: qcow2-bitmap: add qcow2_reopen_bitmaps_rw_hint() (Vladimir 
Sementsov-Ogievskiy)
ebd0e15114: target/hppa: Include priv level in user-only iaoq (Richard 
Henderson)
83d7c40c92: scripts/decodetree: Fix insnmask not marked as global in main() 
(Bastian Koppelmann)
f8e1a98964: qemu-pr-helper: Actually allow users to specify pidfile (Michal 
Privoznik)
12051d82f0: chardev/char-fe: Allow NULL chardev in qemu_chr_fe_init() (Peter 
Maydell)
90c558beca: iothread: fix breakage on windows (Peter Xu)
09c2c6ffda: scsi: turn "is this a SCSI device?" into a conditional hint (Paolo 
Bonzini)
ff82fab792: chardev-socket: remove useless if (Paolo Bonzini)
87f963be66: tcg: Really fix cpu_io_recompile (Richard Henderson)
8e029fd64e: vhost-user-test: add back memfd check (Marc-André Lureau)
642e065a15: vhost-user-test: do not hang if chardev creation failed (Marc-André 
Lureau)
6ff8d9b03a: scripts/device-crash-test: Remove fixed isapc-with-iommu entry 
(Thomas Huth)
c9073238fc: hw/audio: Fix crashes when devices are used on ISA bus without DMA 
(Thomas Huth)
b3da551389: fdc: Exit if ISA controller does not support DMA (Alexey 
Kardashevskiy)
089eac81e1: hw/net/can: Fix segfaults when using the devices without bus 
(Thomas Huth)
4e286099fe: WHPX improve vcpu_post_run perf (Justin Terry (VM))
60168541da: WHPX fix WHvSetPartitionProperty in PropertyCode (Justin Terry (VM))
3907e6318e: WHPX fix WHvGetCapability out WrittenSizeInBytes (Justin Terry (VM))
36b4cf1934: scripts/get_maintainer.pl: Print proper error message for missing 
$file (Ian Jackson)
0b7e7f6681: qemu-iotests: Test vhdx image creation with QMP (Kevin Wolf)
6f16f7c562: vhdx: Check for 4 GB maximum log size on creation (Kevin Wolf)
0fcc38e7d0: vhdx: Don't use error_setg_errno() with constant errno (Kevin Wolf)
b412f49407: vhdx: Require power-of-two block size on create (Kevin Wolf)
e8f6ea6fb6: qemu-iotests: Test parallels image creation with QMP (Kevin Wolf)
2332d82589: parallels: Check maximum cluster size on create (Kevin Wolf)
50880f25c8: qemu-iotests: Test invalid resize on luks

Re: [Qemu-devel] [Qemu-block] [PATCH] blockjob: leak fix, remove from txn when failing early

2018-03-27 Thread John Snow



On 03/27/2018 04:10 PM, Jeff Cody wrote:
> On Tue, Mar 27, 2018 at 06:07:36PM +0200, Marc-André Lureau wrote:
>> This fixes leaks found by ASAN such as:
>>   GTESTER tests/test-blockjob
>> =
>> ==31442==ERROR: LeakSanitizer: detected memory leaks
>>
>> Direct leak of 24 byte(s) in 1 object(s) allocated from:
>> #0 0x7f88483cba38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
>> #1 0x7f8845e1bd77 in g_malloc0 ../glib/gmem.c:129
>> #2 0x7f8845e1c04b in g_malloc0_n ../glib/gmem.c:360
>> #3 0x5584d2732498 in block_job_txn_new 
>> /home/elmarco/src/qemu/blockjob.c:172
>> #4 0x5584d2739b28 in block_job_create 
>> /home/elmarco/src/qemu/blockjob.c:973
>> #5 0x5584d270ae31 in mk_job 
>> /home/elmarco/src/qemu/tests/test-blockjob.c:34
>> #6 0x5584d270b1c1 in do_test_id 
>> /home/elmarco/src/qemu/tests/test-blockjob.c:57
>> #7 0x5584d270b65c in test_job_ids 
>> /home/elmarco/src/qemu/tests/test-blockjob.c:118
>> #8 0x7f8845e40b69 in test_case_run ../glib/gtestutils.c:2255
>> #9 0x7f8845e40f29 in g_test_run_suite_internal ../glib/gtestutils.c:2339
>> #10 0x7f8845e40fd2 in g_test_run_suite_internal ../glib/gtestutils.c:2351
>> #11 0x7f8845e411e9 in g_test_run_suite ../glib/gtestutils.c:2426
>> #12 0x7f8845e3fe72 in g_test_run ../glib/gtestutils.c:1692
>> #13 0x5584d270d6e2 in main 
>> /home/elmarco/src/qemu/tests/test-blockjob.c:377
>> #14 0x7f8843641f29 in __libc_start_main (/lib64/libc.so.6+0x20f29)
>>
>> Add an assert to make sure that the job doesn't have associated txn before 
>> free().
>>
>> Signed-off-by: Marc-André Lureau 
>> ---
>>  blockjob.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/blockjob.c b/blockjob.c
>> index 11c9ce124d..bb75386515 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -228,6 +228,7 @@ void block_job_unref(BlockJob *job)
>>  {
>>  if (--job->refcnt == 0) {
>>  assert(job->status == BLOCK_JOB_STATUS_NULL);
>> +assert(!job->txn);
>>  BlockDriverState *bs = blk_bs(job->blk);
>>  QLIST_REMOVE(job, job_list);
>>  bs->job = NULL;
>> @@ -479,6 +480,7 @@ static int block_job_finalize_single(BlockJob *job)
>>  
>>  QLIST_REMOVE(job, txn_list);
>>  block_job_txn_unref(job->txn);
>> +job->txn = NULL;
>>  block_job_conclude(job);
>>  return 0;
>>  }
>> @@ -994,6 +996,9 @@ void block_job_pause_all(void)
>>  void block_job_early_fail(BlockJob *job)
>>  {
>>  assert(job->status == BLOCK_JOB_STATUS_CREATED);
>> +QLIST_REMOVE(job, txn_list);
>> +block_job_txn_unref(job->txn);
>> +job->txn = NULL;
>>  block_job_decommission(job);
>>  }
>>  
>> -- 
>> 2.17.0.rc1.1.g4c4f2b46a3
>>
> 
> This patch causes a segfault/assert in iotests 031 041 055:
> 
> e.g., from 031:
> 
> test_set_speed_invalid (__main__.TestSetSpeed) ... DEBUG:QMP:>>> {'execute': 
> 'qmp_capabilities'}
> DEBUG:QMP:<<< {u'return': {}}
> DEBUG:QMP:>>> {'execute': 'query-block-jobs'}
> DEBUG:QMP:<<< {u'return': []}
> DEBUG:QMP:>>> {'execute': 'block-stream', 'arguments': {'device': 'drive0', 
> 'speed': -1}}
> DEBUG:QMP:<<< None
> WARNING:qemu:qemu received signal -11: [...]
> 
> 

Oh, because block_job_early_fail can be called from block_job_create
before we've established a transaction (even if it's the dummy transaction.)

It's a *really* early failure.

I patched it out like this; JTC: take whichever one, credit Marc-Andre
regardless of which you choose.

--js


diff --git a/blockjob.c b/blockjob.c
index ef3ed69ff1..c510a9fde5 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -204,6 +204,15 @@ void block_job_txn_add_job(BlockJobTxn *txn,
BlockJob *job)
 block_job_txn_ref(txn);
 }

+static void block_job_txn_del_job(BlockJob *job)
+{
+if (job->txn) {
+QLIST_REMOVE(job, txn_list);
+block_job_txn_unref(job->txn);
+job->txn = NULL;
+}
+}
+
 static void block_job_pause(BlockJob *job)
 {
 job->pause_count++;
@@ -232,6 +241,7 @@ void block_job_unref(BlockJob *job)
 {
 if (--job->refcnt == 0) {
 assert(job->status == BLOCK_JOB_STATUS_NULL);
+assert(!job->txn);
 BlockDriverState *bs = blk_bs(job->blk);
 QLIST_REMOVE(job, job_list);
 bs->job = NULL;
@@ -392,6 +402,7 @@ static void block_job_decommission(BlockJob *job)
 job->busy = false;
 job->paused = false;
 job->deferred_to_main_loop = true;
+block_job_txn_del_job(job);
 block_job_state_transition(job, BLOCK_JOB_STATUS_NULL);
 block_job_unref(job);
 }
@@ -481,8 +492,7 @@ static int block_job_finalize_single(BlockJob *job)
 }
 }

-QLIST_REMOVE(job, txn_list);
-block_job_txn_unref(job->txn);
+block_job_txn_del_job(job);
 block_job_conclude(job);
 return 0;
 }

Re: [Qemu-devel] [PATCH for-2.12] dump: Fix build with newer gcc

2018-03-27 Thread Marc-André Lureau

On Tue, Mar 27, 2018 at 10:21 PM, Eric Blake  wrote:
> gcc 8 on rawhide is picky enough to complain:
>
> /home/dummy/qemu/dump.c: In function 'create_header32':
> /home/dummy/qemu/dump.c:817:5: error: 'strncpy' output truncated before 
> terminating nul copying 8 bytes from a string of the same length 
> [-Werror=stringop-truncation]
>  strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
>  ^~~~
>
> But we already have SIG_LEN defined as the right length without needing
> to do a strlen(), and memcpy() is better than strncpy() when we know
> we do not want a trailing NUL byte.
>
> Signed-off-by: Eric Blake 

Reviewed-by: Marc-André Lureau 


> ---
>  dump.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/dump.c b/dump.c
> index 669f715274d..b54cd42b217 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -814,7 +814,7 @@ static void create_header32(DumpState *s, Error **errp)
>  size = sizeof(DiskDumpHeader32);
>  dh = g_malloc0(size);
>
> -strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
> +memcpy(dh->signature, KDUMP_SIGNATURE, SIG_LEN);
>  dh->header_version = cpu_to_dump32(s, 6);
>  block_size = s->dump_info.page_size;
>  dh->block_size = cpu_to_dump32(s, block_size);
> @@ -926,7 +926,7 @@ static void create_header64(DumpState *s, Error **errp)
>  size = sizeof(DiskDumpHeader64);
>  dh = g_malloc0(size);
>
> -strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
> +memcpy(dh->signature, KDUMP_SIGNATURE, SIG_LEN);
>  dh->header_version = cpu_to_dump32(s, 6);
>  block_size = s->dump_info.page_size;
>  dh->block_size = cpu_to_dump32(s, block_size);
> --
> 2.14.3
>

Re: [Qemu-devel] [PATCH for 2.13 14/19] linux-user: move alpha cpu loop to alpha directory

2018-03-27 Thread Philippe Mathieu-Daudé

On 03/26/2018 04:15 PM, Laurent Vivier wrote:
> No code change, only move code from main.c to
> alpha/cpu_loop.c.
> 
> Signed-off-by: Laurent Vivier 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  linux-user/alpha/cpu_loop.c | 199 ++
>  linux-user/main.c   | 204 
> 
>  2 files changed, 199 insertions(+), 204 deletions(-)
> 
> diff --git a/linux-user/alpha/cpu_loop.c b/linux-user/alpha/cpu_loop.c
> index b7700a5561..b87fcaea87 100644
> --- a/linux-user/alpha/cpu_loop.c
> +++ b/linux-user/alpha/cpu_loop.c
> @@ -21,6 +21,205 @@
>  #include "qemu.h"
>  #include "cpu_loop-common.h"
>  
> +void cpu_loop(CPUAlphaState *env)
> +{
> +CPUState *cs = CPU(alpha_env_get_cpu(env));
> +int trapnr;
> +target_siginfo_t info;
> +abi_long sysret;
> +
> +while (1) {
> +bool arch_interrupt = true;
> +
> +cpu_exec_start(cs);
> +trapnr = cpu_exec(cs);
> +cpu_exec_end(cs);
> +process_queued_cpu_work(cs);
> +
> +switch (trapnr) {
> +case EXCP_RESET:
> +fprintf(stderr, "Reset requested. Exit\n");
> +exit(EXIT_FAILURE);
> +break;
> +case EXCP_MCHK:
> +fprintf(stderr, "Machine check exception. Exit\n");
> +exit(EXIT_FAILURE);
> +break;
> +case EXCP_SMP_INTERRUPT:
> +case EXCP_CLK_INTERRUPT:
> +case EXCP_DEV_INTERRUPT:
> +fprintf(stderr, "External interrupt. Exit\n");
> +exit(EXIT_FAILURE);
> +break;
> +case EXCP_MMFAULT:
> +info.si_signo = TARGET_SIGSEGV;
> +info.si_errno = 0;
> +info.si_code = (page_get_flags(env->trap_arg0) & PAGE_VALID
> +? TARGET_SEGV_ACCERR : TARGET_SEGV_MAPERR);
> +info._sifields._sigfault._addr = env->trap_arg0;
> +queue_signal(env, info.si_signo, QEMU_SI_FAULT, );
> +break;
> +case EXCP_UNALIGN:
> +info.si_signo = TARGET_SIGBUS;
> +info.si_errno = 0;
> +info.si_code = TARGET_BUS_ADRALN;
> +info._sifields._sigfault._addr = env->trap_arg0;
> +queue_signal(env, info.si_signo, QEMU_SI_FAULT, );
> +break;
> +case EXCP_OPCDEC:
> +do_sigill:
> +info.si_signo = TARGET_SIGILL;
> +info.si_errno = 0;
> +info.si_code = TARGET_ILL_ILLOPC;
> +info._sifields._sigfault._addr = env->pc;
> +queue_signal(env, info.si_signo, QEMU_SI_FAULT, );
> +break;
> +case EXCP_ARITH:
> +info.si_signo = TARGET_SIGFPE;
> +info.si_errno = 0;
> +info.si_code = TARGET_FPE_FLTINV;
> +info._sifields._sigfault._addr = env->pc;
> +queue_signal(env, info.si_signo, QEMU_SI_FAULT, );
> +break;
> +case EXCP_FEN:
> +/* No-op.  Linux simply re-enables the FPU.  */
> +break;
> +case EXCP_CALL_PAL:
> +switch (env->error_code) {
> +case 0x80:
> +/* BPT */
> +info.si_signo = TARGET_SIGTRAP;
> +info.si_errno = 0;
> +info.si_code = TARGET_TRAP_BRKPT;
> +info._sifields._sigfault._addr = env->pc;
> +queue_signal(env, info.si_signo, QEMU_SI_FAULT, );
> +break;
> +case 0x81:
> +/* BUGCHK */
> +info.si_signo = TARGET_SIGTRAP;
> +info.si_errno = 0;
> +info.si_code = 0;
> +info._sifields._sigfault._addr = env->pc;
> +queue_signal(env, info.si_signo, QEMU_SI_FAULT, );
> +break;
> +case 0x83:
> +/* CALLSYS */
> +trapnr = env->ir[IR_V0];
> +sysret = do_syscall(env, trapnr,
> +env->ir[IR_A0], env->ir[IR_A1],
> +env->ir[IR_A2], env->ir[IR_A3],
> +env->ir[IR_A4], env->ir[IR_A5],
> +0, 0);
> +if (sysret == -TARGET_ERESTARTSYS) {
> +env->pc -= 4;
> +break;
> +}
> +if (sysret == -TARGET_QEMU_ESIGRETURN) {
> +break;
> +}
> +/* Syscall writes 0 to V0 to bypass error check, similar
> +   to how this is handled internal to Linux kernel.
> +   (Ab)use trapnr temporarily as boolean indicating error.  
> */
> +trapnr = (env->ir[IR_V0] != 0 && sysret < 0);
> +env->ir[IR_V0] = (trapnr ? -sysret : sysret);
> +env->ir[IR_A3] = trapnr;
> +break;
> +case 0x86:
> +

Re: [Qemu-devel] [PATCH for 2.13 18/19] linux-user: move hppa cpu loop to hppa directory

2018-03-27 Thread Philippe Mathieu-Daudé

On 03/26/2018 04:16 PM, Laurent Vivier wrote:
> No code change, only move code from main.c to
> hppa/cpu_loop.c.
> 
> Signed-off-by: Laurent Vivier 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  linux-user/hppa/cpu_loop.c | 185 ++
>  linux-user/main.c  | 194 
> +
>  2 files changed, 186 insertions(+), 193 deletions(-)
> 
> diff --git a/linux-user/hppa/cpu_loop.c b/linux-user/hppa/cpu_loop.c
> index b7700a5561..0301c766c6 100644
> --- a/linux-user/hppa/cpu_loop.c
> +++ b/linux-user/hppa/cpu_loop.c
> @@ -21,6 +21,191 @@
>  #include "qemu.h"
>  #include "cpu_loop-common.h"
>  
> +static abi_ulong hppa_lws(CPUHPPAState *env)
> +{
> +uint32_t which = env->gr[20];
> +abi_ulong addr = env->gr[26];
> +abi_ulong old = env->gr[25];
> +abi_ulong new = env->gr[24];
> +abi_ulong size, ret;
> +
> +switch (which) {
> +default:
> +return -TARGET_ENOSYS;
> +
> +case 0: /* elf32 atomic 32bit cmpxchg */
> +if ((addr & 3) || !access_ok(VERIFY_WRITE, addr, 4)) {
> +return -TARGET_EFAULT;
> +}
> +old = tswap32(old);
> +new = tswap32(new);
> +ret = atomic_cmpxchg((uint32_t *)g2h(addr), old, new);
> +ret = tswap32(ret);
> +break;
> +
> +case 2: /* elf32 atomic "new" cmpxchg */
> +size = env->gr[23];
> +if (size >= 4) {
> +return -TARGET_ENOSYS;
> +}
> +if (((addr | old | new) & ((1 << size) - 1))
> +|| !access_ok(VERIFY_WRITE, addr, 1 << size)
> +|| !access_ok(VERIFY_READ, old, 1 << size)
> +|| !access_ok(VERIFY_READ, new, 1 << size)) {
> +return -TARGET_EFAULT;
> +}
> +/* Note that below we use host-endian loads so that the cmpxchg
> +   can be host-endian as well.  */
> +switch (size) {
> +case 0:
> +old = *(uint8_t *)g2h(old);
> +new = *(uint8_t *)g2h(new);
> +ret = atomic_cmpxchg((uint8_t *)g2h(addr), old, new);
> +ret = ret != old;
> +break;
> +case 1:
> +old = *(uint16_t *)g2h(old);
> +new = *(uint16_t *)g2h(new);
> +ret = atomic_cmpxchg((uint16_t *)g2h(addr), old, new);
> +ret = ret != old;
> +break;
> +case 2:
> +old = *(uint32_t *)g2h(old);
> +new = *(uint32_t *)g2h(new);
> +ret = atomic_cmpxchg((uint32_t *)g2h(addr), old, new);
> +ret = ret != old;
> +break;
> +case 3:
> +{
> +uint64_t o64, n64, r64;
> +o64 = *(uint64_t *)g2h(old);
> +n64 = *(uint64_t *)g2h(new);
> +#ifdef CONFIG_ATOMIC64
> +r64 = atomic_cmpxchg__nocheck((uint64_t *)g2h(addr), o64, 
> n64);
> +ret = r64 != o64;
> +#else
> +start_exclusive();
> +r64 = *(uint64_t *)g2h(addr);
> +ret = 1;
> +if (r64 == o64) {
> +*(uint64_t *)g2h(addr) = n64;
> +ret = 0;
> +}
> +end_exclusive();
> +#endif
> +}
> +break;
> +}
> +break;
> +}
> +
> +env->gr[28] = ret;
> +return 0;
> +}
> +
> +void cpu_loop(CPUHPPAState *env)
> +{
> +CPUState *cs = CPU(hppa_env_get_cpu(env));
> +target_siginfo_t info;
> +abi_ulong ret;
> +int trapnr;
> +
> +while (1) {
> +cpu_exec_start(cs);
> +trapnr = cpu_exec(cs);
> +cpu_exec_end(cs);
> +process_queued_cpu_work(cs);
> +
> +switch (trapnr) {
> +case EXCP_SYSCALL:
> +ret = do_syscall(env, env->gr[20],
> + env->gr[26], env->gr[25],
> + env->gr[24], env->gr[23],
> + env->gr[22], env->gr[21], 0, 0);
> +switch (ret) {
> +default:
> +env->gr[28] = ret;
> +/* We arrived here by faking the gateway page.  Return.  */
> +env->iaoq_f = env->gr[31];
> +env->iaoq_b = env->gr[31] + 4;
> +break;
> +case -TARGET_ERESTARTSYS:
> +case -TARGET_QEMU_ESIGRETURN:
> +break;
> +}
> +break;
> +case EXCP_SYSCALL_LWS:
> +env->gr[21] = hppa_lws(env);
> +/* We arrived here by faking the gateway page.  Return.  */
> +env->iaoq_f = env->gr[31];
> +env->iaoq_b = env->gr[31] + 4;
> +break;
> +case EXCP_ITLB_MISS:
> +case EXCP_DTLB_MISS:
> +case EXCP_NA_ITLB_MISS:
> +case EXCP_NA_DTLB_MISS:
> +case EXCP_IMP:
> +case EXCP_DMP:
> +case EXCP_DMB:
> +case EXCP_PAGE_REF:
> +

Re: [Qemu-devel] [PATCH for 2.13 07/19] linux-user: move mips/mips64 cpu loop to mips directory

2018-03-27 Thread Philippe Mathieu-Daudé

On 03/26/2018 04:15 PM, Laurent Vivier wrote:
> No code change, only move code from main.c to
> mips/cpu_loop.c.
> 
> Include mips/cpu_loop.c in mips64/cpu_loop.c
> to avoid to duplicate code.
> 
> Signed-off-by: Laurent Vivier 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  linux-user/main.c| 725 
> ---
>  linux-user/mips/cpu_loop.c   | 723 ++
>  linux-user/mips64/cpu_loop.c |   8 +-
>  3 files changed, 724 insertions(+), 732 deletions(-)
> 
> diff --git a/linux-user/main.c b/linux-user/main.c
> index b5d0513b44..490733a3fb 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -149,705 +149,6 @@ void fork_end(int child)
>  }
>  }
>  
> -#ifdef TARGET_MIPS
> -
> -# ifdef TARGET_ABI_MIPSO32
> -#  define MIPS_SYS(name, args) args,
> -static const uint8_t mips_syscall_args[] = {
> - MIPS_SYS(sys_syscall, 8)/* 4000 */
> - MIPS_SYS(sys_exit   , 1)
> - MIPS_SYS(sys_fork   , 0)
> - MIPS_SYS(sys_read   , 3)
> - MIPS_SYS(sys_write  , 3)
> - MIPS_SYS(sys_open   , 3)/* 4005 */
> - MIPS_SYS(sys_close  , 1)
> - MIPS_SYS(sys_waitpid, 3)
> - MIPS_SYS(sys_creat  , 2)
> - MIPS_SYS(sys_link   , 2)
> - MIPS_SYS(sys_unlink , 1)/* 4010 */
> - MIPS_SYS(sys_execve , 0)
> - MIPS_SYS(sys_chdir  , 1)
> - MIPS_SYS(sys_time   , 1)
> - MIPS_SYS(sys_mknod  , 3)
> - MIPS_SYS(sys_chmod  , 2)/* 4015 */
> - MIPS_SYS(sys_lchown , 3)
> - MIPS_SYS(sys_ni_syscall , 0)
> - MIPS_SYS(sys_ni_syscall , 0)/* was sys_stat */
> - MIPS_SYS(sys_lseek  , 3)
> - MIPS_SYS(sys_getpid , 0)/* 4020 */
> - MIPS_SYS(sys_mount  , 5)
> - MIPS_SYS(sys_umount , 1)
> - MIPS_SYS(sys_setuid , 1)
> - MIPS_SYS(sys_getuid , 0)
> - MIPS_SYS(sys_stime  , 1)/* 4025 */
> - MIPS_SYS(sys_ptrace , 4)
> - MIPS_SYS(sys_alarm  , 1)
> - MIPS_SYS(sys_ni_syscall , 0)/* was sys_fstat */
> - MIPS_SYS(sys_pause  , 0)
> - MIPS_SYS(sys_utime  , 2)/* 4030 */
> - MIPS_SYS(sys_ni_syscall , 0)
> - MIPS_SYS(sys_ni_syscall , 0)
> - MIPS_SYS(sys_access , 2)
> - MIPS_SYS(sys_nice   , 1)
> - MIPS_SYS(sys_ni_syscall , 0)/* 4035 */
> - MIPS_SYS(sys_sync   , 0)
> - MIPS_SYS(sys_kill   , 2)
> - MIPS_SYS(sys_rename , 2)
> - MIPS_SYS(sys_mkdir  , 2)
> - MIPS_SYS(sys_rmdir  , 1)/* 4040 */
> - MIPS_SYS(sys_dup, 1)
> - MIPS_SYS(sys_pipe   , 0)
> - MIPS_SYS(sys_times  , 1)
> - MIPS_SYS(sys_ni_syscall , 0)
> - MIPS_SYS(sys_brk, 1)/* 4045 */
> - MIPS_SYS(sys_setgid , 1)
> - MIPS_SYS(sys_getgid , 0)
> - MIPS_SYS(sys_ni_syscall , 0)/* was signal(2) */
> - MIPS_SYS(sys_geteuid, 0)
> - MIPS_SYS(sys_getegid, 0)/* 4050 */
> - MIPS_SYS(sys_acct   , 0)
> - MIPS_SYS(sys_umount2, 2)
> - MIPS_SYS(sys_ni_syscall , 0)
> - MIPS_SYS(sys_ioctl  , 3)
> - MIPS_SYS(sys_fcntl  , 3)/* 4055 */
> - MIPS_SYS(sys_ni_syscall , 2)
> - MIPS_SYS(sys_setpgid, 2)
> - MIPS_SYS(sys_ni_syscall , 0)
> - MIPS_SYS(sys_olduname   , 1)
> - MIPS_SYS(sys_umask  , 1)/* 4060 */
> - MIPS_SYS(sys_chroot , 1)
> - MIPS_SYS(sys_ustat  , 2)
> - MIPS_SYS(sys_dup2   , 2)
> - MIPS_SYS(sys_getppid, 0)
> - MIPS_SYS(sys_getpgrp, 0)/* 4065 */
> - MIPS_SYS(sys_setsid , 0)
> - MIPS_SYS(sys_sigaction  , 3)
> - MIPS_SYS(sys_sgetmask   , 0)
> - MIPS_SYS(sys_ssetmask   , 1)
> - MIPS_SYS(sys_setreuid   , 2)/* 4070 */
> - MIPS_SYS(sys_setregid   , 2)
> - MIPS_SYS(sys_sigsuspend , 0)
> - MIPS_SYS(sys_sigpending , 1)
> - MIPS_SYS(sys_sethostname, 2)
> - MIPS_SYS(sys_setrlimit  , 2)/* 4075 */
> - MIPS_SYS(sys_getrlimit  , 2)
> - MIPS_SYS(sys_getrusage  , 2)
> - MIPS_SYS(sys_gettimeofday, 2)
> - MIPS_SYS(sys_settimeofday, 2)
> - MIPS_SYS(sys_getgroups  , 2)/* 4080 */
> - MIPS_SYS(sys_setgroups  , 2)
> - MIPS_SYS(sys_ni_syscall , 0)/* old_select */
> - MIPS_SYS(sys_symlink, 2)
> - MIPS_SYS(sys_ni_syscall , 0)/* was sys_lstat */
> - MIPS_SYS(sys_readlink   , 3)/* 4085 */
> - MIPS_SYS(sys_uselib , 1)
> - MIPS_SYS(sys_swapon , 2)
> - MIPS_SYS(sys_reboot , 3)
> - MIPS_SYS(old_readdir, 3)
> - MIPS_SYS(old_mmap   , 6)/* 4090 */
> - MIPS_SYS(sys_munmap , 2)
> - MIPS_SYS(sys_truncate   , 2)
> - MIPS_SYS(sys_ftruncate  , 2)
> - MIPS_SYS(sys_fchmod , 2)
> - MIPS_SYS(sys_fchown , 3)/* 4095 */
> - MIPS_SYS(sys_getpriority, 2)
> - MIPS_SYS(sys_setpriority, 3)
> -

Re: [Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug

2018-03-27 Thread Philippe Mathieu-Daudé

Cc'ing Alex and Richard.

On 03/27/2018 04:54 PM, Michael Clark wrote:
> This change is a workaround for a bug where mstatus.FS
> is not correctly reporting dirty when MTTCG and SMP are
> enabled which results in the floating point register file
> not being saved during context switches. This a critical
> bug for RISC-V in QEMU as it results in floating point
> register file corruption when running SMP Linux in the
> RISC-V 'virt' machine.
> 
> This workaround will return dirty if mstatus.FS is
> switched from off to initial or clean. We have checked
> the specification and it is legal for an implementation
> to return either off, or dirty, if set to initial or clean.
> 
> This workaround will result in unnecessary floating point
> save restore. When mstatus.FS is off, floating point
> instruction trap to indicate the process is using the FPU.
> The OS can then save floating-point state of the previous
> process using the FPU and set mstatus.FS to initial or
> clean. With this workaround, mstatus.FS will always return
> dirty if set to a non-zero value, indicating floating point
> save restore is necessary, versus misreporting mstatus.FS
> resulting in floating point register file corruption.
> 
> Cc: Palmer Dabbelt 
> Cc: Sagar Karandikar 
> Cc: Bastian Koppelmann 
> Cc: Peter Maydell 
> Tested-by: Richard W.M. Jones 
> Signed-off-by: Michael Clark 
> ---
>  target/riscv/op_helper.c | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
> index e34715d..7281b98 100644
> --- a/target/riscv/op_helper.c
> +++ b/target/riscv/op_helper.c
> @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, target_ulong 
> val_to_write,
>  }
>  
>  mstatus = (mstatus & ~mask) | (val_to_write & mask);
> -int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
> -dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
> +
> +/* Note: this is a workaround for an issue where mstatus.FS
> +   does not report dirty when SMP and MTTCG is enabled. This
> +   workaround is technically compliant with the RISC-V Privileged
> +   specification as it is legal to return only off, or dirty,
> +   however this may cause unnecessary saves of floating point state.
> +   Without this workaround, floating point state is not saved and
> +   restored correctly when SMP and MTTCG is enabled, */
> +if (qemu_tcg_mttcg_enabled()) {
> +/* FP is always dirty or off */
> +if (mstatus & MSTATUS_FS) {
> +mstatus |= MSTATUS_FS;
> +}
> +}
> +
> +int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
> +((mstatus & MSTATUS_XS) == MSTATUS_XS);
>  mstatus = set_field(mstatus, MSTATUS_SD, dirty);
>  env->mstatus = mstatus;
>  break;
>

Re: [Qemu-devel] [PATCH v1 2/2] RISC-V: Fix incorrect disassembly for addiw

2018-03-27 Thread Philippe Mathieu-Daudé

On 03/27/2018 04:55 PM, Michael Clark wrote:
> This fixes a bug in the disassembler constraints used
> to lift instructions into pseudo-instructions, whereby
> addiw instructions are always lifted to sext.w instead
> of just lifting addiw with a zero immediate.
> 
> An associated fix has been made to the metadata used to
> machine generate the disseasembler:
> 
> https://github.com/michaeljclark/riscv-meta/
> commit/4a6b2f3898430768acfe201405224d2ea31e1477
> 
> Cc: Sagar Karandikar 
> Cc: Bastian Koppelmann 
> Cc: Palmer Dabbelt 
> Cc: Peter Maydell 
> Signed-off-by: Michael Clark 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  disas/riscv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/disas/riscv.c b/disas/riscv.c
> index 3c17501..74ad16e 100644
> --- a/disas/riscv.c
> +++ b/disas/riscv.c
> @@ -600,7 +600,7 @@ static const rvc_constraint rvcc_mv[] = { 
> rvc_imm_eq_zero, rvc_end };
>  static const rvc_constraint rvcc_not[] = { rvc_imm_eq_n1, rvc_end };
>  static const rvc_constraint rvcc_neg[] = { rvc_rs1_eq_x0, rvc_end };
>  static const rvc_constraint rvcc_negw[] = { rvc_rs1_eq_x0, rvc_end };
> -static const rvc_constraint rvcc_sext_w[] = { rvc_rs2_eq_x0, rvc_end };
> +static const rvc_constraint rvcc_sext_w[] = { rvc_imm_eq_zero, rvc_end };
>  static const rvc_constraint rvcc_seqz[] = { rvc_imm_eq_p1, rvc_end };
>  static const rvc_constraint rvcc_snez[] = { rvc_rs1_eq_x0, rvc_end };
>  static const rvc_constraint rvcc_sltz[] = { rvc_rs2_eq_x0, rvc_end };
>

Re: [Qemu-devel] [PATCH for-2.12] nbd: Fix 32-bit compilation on BLOCK_STATUS

2018-03-27 Thread Paolo Bonzini

On 27/03/2018 23:05, Eric Blake wrote:
> iotests 123 and 209 fail on 32-bit platforms.  The culprit:
> sizeof(extent) is wrong; we want sizeof(*extent).  But since
> the struct is 8 bytes, it happened to work on 64-bit platforms
> where the pointer is also 8 bytes (nasty).
> 
> Fixes: 78a33ab58
> Reported-by: Max Reitz 
> Signed-off-by: Eric Blake 
> ---
>  block/nbd-client.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index e64e346d690..e7caf49fbb4 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -239,7 +239,7 @@ static int nbd_parse_blockstatus_payload(NBDClientSession 
> *client,
>  {
>  uint32_t context_id;
> 
> -if (chunk->length != sizeof(context_id) + sizeof(extent)) {
> +if (chunk->length != sizeof(context_id) + sizeof(*extent)) {
>  error_setg(errp, "Protocol error: invalid payload for "
>   "NBD_REPLY_TYPE_BLOCK_STATUS");
>  return -EINVAL;
> 

Reviewed-by: Paolo Bonzini

Re: [Qemu-devel] [PATCH v4 2/5] target/i386: Populate AMD Processor Cache Information

2018-03-27 Thread Moger, Babu



> -Original Message-
> From: Eduardo Habkost 
> Sent: Wednesday, March 21, 2018 3:30 PM
> To: Moger, Babu 
> Cc: pbonz...@redhat.com; r...@twiddle.net; rkrc...@redhat.com;
> Lendacky, Thomas ; Singh, Brijesh
> ; k...@vger.kernel.org; k...@tripleback.net;
> mtosa...@redhat.com; Hook, Gary ; qemu-
> de...@nongnu.org
> Subject: Re: [Qemu-devel] [PATCH v4 2/5] target/i386: Populate AMD
> Processor Cache Information
> 
> On Wed, Mar 21, 2018 at 08:07:54PM +, Moger, Babu wrote:
> >
> > > -Original Message-
> > > From: Eduardo Habkost 
> > > Sent: Wednesday, March 21, 2018 1:15 PM
> > > To: Moger, Babu 
> > > Cc: pbonz...@redhat.com; r...@twiddle.net; rkrc...@redhat.com;
> > > Lendacky, Thomas ; Singh, Brijesh
> > > ; k...@vger.kernel.org; k...@tripleback.net;
> > > mtosa...@redhat.com; Hook, Gary ; qemu-
> > > de...@nongnu.org
> > > Subject: Re: [Qemu-devel] [PATCH v4 2/5] target/i386: Populate AMD
> > > Processor Cache Information
> > >
> > > On Wed, Mar 21, 2018 at 05:47:28PM +, Moger, Babu wrote:
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Eduardo Habkost 
> > > > > Sent: Wednesday, March 21, 2018 12:10 PM
> > > > > To: Moger, Babu 
> > > > > Cc: pbonz...@redhat.com; r...@twiddle.net; rkrc...@redhat.com;
> > > > > Lendacky, Thomas ; Singh, Brijesh
> > > > > ; k...@vger.kernel.org;
> k...@tripleback.net;
> > > > > mtosa...@redhat.com; Hook, Gary ; qemu-
> > > > > de...@nongnu.org
> > > > > Subject: Re: [Qemu-devel] [PATCH v4 2/5] target/i386: Populate AMD
> > > > > Processor Cache Information
> > > > >
> > > > > On Wed, Mar 21, 2018 at 03:58:41PM +, Moger, Babu wrote:
> > > > > > Hi Eduardo,
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: Eduardo Habkost 
> > > > > > > Sent: Tuesday, March 20, 2018 12:54 PM
> > > > > > > To: Moger, Babu 
> > > > > > > Cc: pbonz...@redhat.com; r...@twiddle.net;
> rkrc...@redhat.com;
> > > > > > > Lendacky, Thomas ; Singh, Brijesh
> > > > > > > ; k...@vger.kernel.org;
> > > k...@tripleback.net;
> > > > > > > mtosa...@redhat.com; Hook, Gary ;
> qemu-
> > > > > > > de...@nongnu.org
> > > > > > > Subject: Re: [Qemu-devel] [PATCH v4 2/5] target/i386: Populate
> AMD
> > > > > > > Processor Cache Information
> > > > > > >
> > > > > > > On Tue, Mar 20, 2018 at 05:25:52PM +, Moger, Babu wrote:
> > > > > > > > Hi Eduardo, Thanks for the comments. Please see the response
> > > inline.
> > > > > > > >
> > > > > > > > > -Original Message-
> > > > > > > > > From: Eduardo Habkost 
> > > > > > > > > Sent: Friday, March 16, 2018 1:00 PM
> > > > > > > > > To: Moger, Babu 
> > > > > > > > > Cc: pbonz...@redhat.com; r...@twiddle.net;
> > > rkrc...@redhat.com;
> > > > > > > > > Lendacky, Thomas ; Singh,
> Brijesh
> > > > > > > > > ; k...@vger.kernel.org;
> > > > > k...@tripleback.net;
> > > > > > > > > mtosa...@redhat.com; Hook, Gary ;
> > > qemu-
> > > > > > > > > de...@nongnu.org
> > > > > > > > > Subject: Re: [Qemu-devel] [PATCH v4 2/5] target/i386:
> Populate
> > > AMD
> > > > > > > > > Processor Cache Information
> > > > > > > > >
> > > > > > > > > On Mon, Mar 12, 2018 at 05:00:46PM -0400, Babu Moger
> wrote:
> > > > > > > > > > From: Stanislav Lanci 
> > > > > > > > > >
> > > > > > > > > > Add information for cpuid 0x801D leaf. Populate cache
> > > topology
> > > > > > > > > information
> > > > > > > > > > for different cache types(Data Cache, Instruction Cache, L2
> and
> > > L3)
> > > > > > > > > supported
> > > > > > > > > > by 0x801D leaf. Please refer Processor Programming
> > > Reference
> > > > > > > (PPR)
> > > > > > > > > for AMD
> > > > > > > > > > Family 17h Model for more details.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Stanislav Lanci 
> > > > > > > > > > Signed-off-by: Babu Moger 
> > > > > > > > >
> > > > > > > > > The new CPUID leaves don't seem to match the existing AMD
> > > cache
> > > > > > > > > information
> > > > > > > > > leaves.  Is this intentional?  Why?
> > > > > > > >
> > > > > > > > It is not intentional. These values are from older family of
> > > processors.
> > > > > > > These values have changed from Family 14  or later.
> > > > > > > > The latest one is Family 17. You can see the differences here.
> > > > > > > >  https://support.amd.com/TechDocs/41131.pdf
> > > > > > > >
> > > > > > >
> > > > >
> > >
>

Re: [Qemu-devel] [PATCH] i386/kvm: add support for KVM_CAP_X86_DISABLE_EXITS

2018-03-27 Thread Eduardo Habkost

On Tue, Mar 27, 2018 at 10:42:56PM +0300, Michael S. Tsirkin wrote:
> On Fri, Mar 16, 2018 at 07:36:42AM -0700, Wanpeng Li wrote:
> > From: Wanpeng Li 
> > 
> > This patch adds support for KVM_CAP_X86_DISABLE_EXITS. Provides userspace 
> > with 
> > per-VM capability(KVM_CAP_X86_DISABLE_EXITS) to not intercept 
> > MWAIT/HLT/PAUSE 
> > in order that to improve latency in some workloads.
> > 
[...]
> > diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > index d23fff1..95ed9eb 100644
> > --- a/target/i386/kvm.c
> > +++ b/target/i386/kvm.c
> > @@ -999,6 +999,18 @@ int kvm_arch_init_vcpu(CPUState *cs)
> >  }
> >  }
> >  
> > +if (env->features[FEAT_KVM_HINTS] & KVM_HINTS_DEDICATED) {
> > +int disable_exits = kvm_check_extension(cs->kvm_state, 
> > KVM_CAP_X86_DISABLE_EXITS);
> > +if (disable_exits) {
> > +disable_exits &= (KVM_X86_DISABLE_EXITS_MWAIT |
> > +  KVM_X86_DISABLE_EXITS_HLT |
> > +  KVM_X86_DISABLE_EXITS_PAUSE);
> > +}
> > +if (kvm_vm_enable_cap(cs->kvm_state, KVM_CAP_X86_DISABLE_EXITS, 0, 
> > disable_exits)) {
> > +error_report("kvm: DISABLE EXITS not supported");
> > +}
> > +}
> > +
> >  qemu_add_vm_change_state_handler(cpu_update_state, env);
> >  
> >  c = cpuid_find_entry(_data.cpuid, 1, 0);
> 
> Why not a bit per capability?
> I can see how someone might want to disable mwait exists
> but not the rest of them.

kvm-hint-dedicated=on should be used only if the physical CPU is
dedicated to the VCPU task.  Are there any advantages of getting
vmexits for HLT and PAUSE if no other task is going to use the
CPU?

-- 
Eduardo

[Qemu-devel] [PATCH v5 8/9] i386: Enable TOPOEXT feature on AMD EPYC CPU

2018-03-27 Thread Babu Moger

Enable TOPOEXT feature on EPYC CPU. This is required to support
hyperthreading on VM guests. Also extend xlevel to 0x801E.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6f3ad44..f8e7325 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -2330,7 +2330,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .features[FEAT_8000_0001_ECX] =
 CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
 CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
-CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM,
+CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
+CPUID_EXT3_TOPOEXT,
 .features[FEAT_7_0_EBX] =
 CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
 CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_RDSEED |
@@ -2422,7 +2423,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .features[FEAT_8000_0001_ECX] =
 CPUID_EXT3_OSVW | CPUID_EXT3_3DNOWPREFETCH |
 CPUID_EXT3_MISALIGNSSE | CPUID_EXT3_SSE4A | CPUID_EXT3_ABM |
-CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM,
+CPUID_EXT3_CR8LEG | CPUID_EXT3_SVM | CPUID_EXT3_LAHF_LM |
+CPUID_EXT3_TOPOEXT,
 .features[FEAT_8000_0008_EBX] =
 CPUID_8000_0008_EBX_IBPB,
 .features[FEAT_7_0_EBX] =
@@ -4575,6 +4577,11 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error 
**errp)
 x86_cpu_adjust_level(cpu, >cpuid_min_xlevel, 0x800A);
 }
 
+/* TOPOEXT feature requires 0x801E */
+if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) {
+x86_cpu_adjust_level(cpu, >cpuid_min_xlevel, 0x801E);
+}
+
 /* SEV requires CPUID[0x801F] */
 if (sev_enabled()) {
 x86_cpu_adjust_level(cpu, >cpuid_min_xlevel, 0x801F);
-- 
1.8.3.1

[Qemu-devel] [PATCH v5 9/9] i386: Remove generic SMT thread check

2018-03-27 Thread Babu Moger

Remove generic non-intel check while validating hyperthreading support.
Certain AMD CPUs can support hyperthreading now.

CPU family with TOPOEXT feature can support hyperthreading now.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f8e7325..e216d54 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4824,17 +4824,20 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 
 qemu_init_vcpu(cs);
 
-/* Only Intel CPUs support hyperthreading. Even though QEMU fixes this
- * issue by adjusting CPUID__0001_EBX and CPUID_8000_0008_ECX
- * based on inputs (sockets,cores,threads), it is still better to gives
+/* Most Intel and certain AMD CPUs support hyperthreading. Even though QEMU
+ * fixes this issue by adjusting CPUID__0001_EBX and 
CPUID_8000_0008_ECX
+ * based on inputs (sockets,cores,threads), it is still better to give
  * users a warning.
  *
  * NOTE: the following code has to follow qemu_init_vcpu(). Otherwise
  * cs->nr_threads hasn't be populated yet and the checking is incorrect.
  */
-if (!IS_INTEL_CPU(env) && cs->nr_threads > 1 && !ht_warned) {
-error_report("AMD CPU doesn't support hyperthreading. Please configure"
- " -smp options properly.");
+ if (IS_AMD_CPU(env) &&
+ !(env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) &&
+ cs->nr_threads > 1 && !ht_warned) {
+error_report("This family of AMD CPU doesn't support "
+ "hyperthreading. Please configure -smp "
+ "options properly.");
 ht_warned = true;
 }
 
-- 
1.8.3.1

[Qemu-devel] [PATCH v5 5/9] i386: Use the statically loaded cache definitions

2018-03-27 Thread Babu Moger

Use the statically loaded cache definitions if available
and legacy-cache parameter is not set.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f4fbe3a..738927d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3938,8 +3938,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
(L1_ITLB_2M_ASSOC <<  8) | (L1_ITLB_2M_ENTRIES);
 *ebx = (L1_DTLB_4K_ASSOC << 24) | (L1_DTLB_4K_ENTRIES << 16) | \
(L1_ITLB_4K_ASSOC <<  8) | (L1_ITLB_4K_ENTRIES);
-*ecx = encode_cache_cpuid8005(_cache_amd);
-*edx = encode_cache_cpuid8005(_cache_amd);
+if (env->cache_info.valid && !cpu->legacy_cache) {
+*ecx = encode_cache_cpuid8005(>cache_info.l1d_cache);
+*edx = encode_cache_cpuid8005(>cache_info.l1i_cache);
+} else {
+*ecx = encode_cache_cpuid8005(_cache_amd);
+*edx = encode_cache_cpuid8005(_cache_amd);
+}
 break;
 case 0x8006:
 /* cache info (L2 cache) */
@@ -3955,9 +3960,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
(L2_DTLB_4K_ENTRIES << 16) | \
(AMD_ENC_ASSOC(L2_ITLB_4K_ASSOC) << 12) | \
(L2_ITLB_4K_ENTRIES);
-encode_cache_cpuid8006(_cache_amd,
-   cpu->enable_l3_cache ? _cache : NULL,
-   ecx, edx);
+if (env->cache_info.valid && !cpu->legacy_cache) {
+encode_cache_cpuid8006(>cache_info.l2_cache,
+   cpu->enable_l3_cache ?
+   >cache_info.l3_cache : NULL,
+   ecx, edx);
+} else {
+encode_cache_cpuid8006(_cache_amd,
+   cpu->enable_l3_cache ? _cache : NULL,
+   ecx, edx);
+}
 break;
 case 0x8007:
 *eax = 0;
-- 
1.8.3.1

[Qemu-devel] [PATCH v5 7/9] i386: Add support for CPUID_8000_001E for AMD

2018-03-27 Thread Babu Moger

Populate threads/core_id/apic_ids/socket_id when CPUID_EXT3_TOPOEXT
feature is supported. This is required to support hyperthreading feature
on AMD CPUs. This is supported via CPUID_8000_001E extended functions.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index f69f551..6f3ad44 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -315,6 +315,12 @@ static uint32_t encode_cache_cpuid8005(CPUCacheInfo 
*cache)
  (((CORES_IN_CMPLX - 1) * 2) + 1)  : \
  (CORES_IN_CMPLX - 1))
 
+/* Definitions used on CPUID Leaf 0x801E */
+#define EXTENDED_APIC_ID(threads, socket_id, core_id, thread_id) \
+((threads) ? \
+ ((socket_id << 6) | (core_id << 1) | thread_id) : \
+ ((socket_id << 6) | core_id))
+
 /*
  * Encode cache info for CPUID[0x8006].ECX and CPUID[0x8006].EDX
  * @l3 can be NULL.
@@ -4101,6 +4107,14 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 break;
 }
 break;
+case 0x801E:
+assert(cpu->core_id <= 255);
+*eax = EXTENDED_APIC_ID((cs->nr_threads - 1),
+   cpu->socket_id, cpu->core_id, cpu->thread_id);
+*ebx = (cs->nr_threads - 1) << 8 | cpu->core_id;
+*ecx = cpu->socket_id;
+*edx = 0;
+break;
 case 0xC000:
 *eax = env->cpuid_xlevel2;
 *ebx = 0;
-- 
1.8.3.1

[Qemu-devel] [PATCH v5 6/9] i386: Populate AMD Processor Cache Information for cpuid 0x8000001D

2018-03-27 Thread Babu Moger

Add information for cpuid 0x801D leaf. Populate cache topology information
for different cache types(Data Cache, Instruction Cache, L2 and L3) supported
by 0x801D leaf. Please refer Processor Programming Reference (PPR) for AMD
Family 17h Model for more details.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 91 +++
 target/i386/kvm.c | 29 --
 2 files changed, 117 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 738927d..f69f551 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -307,6 +307,14 @@ static uint32_t encode_cache_cpuid8005(CPUCacheInfo 
*cache)
   a == ASSOC_FULL ? 0xF : \
   0 /* invalid value */)
 
+/* Definitions used on CPUID Leaf 0x801D */
+/* Number of logical cores in a complex */
+#define CORES_IN_CMPLX  4
+/* Number of logical processors sharing cache */
+#define NUM_SHARING_CACHE(threads)   (threads ? \
+ (((CORES_IN_CMPLX - 1) * 2) + 1)  : \
+ (CORES_IN_CMPLX - 1))
+
 /*
  * Encode cache info for CPUID[0x8006].ECX and CPUID[0x8006].EDX
  * @l3 can be NULL.
@@ -336,6 +344,40 @@ static void encode_cache_cpuid8006(CPUCacheInfo *l2,
 }
 }
 
+/* Encode cache info for CPUID[801D] */
+static void encode_cache_cpuid801d(CPUCacheInfo *cache, int nr_threads,
+uint32_t *eax, uint32_t *ebx,
+uint32_t *ecx, uint32_t *edx)
+{
+assert(cache->size == cache->line_size * cache->associativity *
+  cache->partitions * cache->sets);
+
+*eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
+   (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
+
+if (CACHE_TYPE(cache->type) == UNIFIED_CACHE) {
+*eax |= (NUM_SHARING_CACHE(nr_threads - 1) << 14);
+} else {
+*eax |= ((nr_threads - 1) << 14);
+}
+
+assert(cache->line_size > 0);
+assert(cache->partitions > 0);
+assert(cache->associativity > 0);
+/* We don't implement fully-associative caches */
+assert(cache->associativity < cache->sets);
+*ebx = (cache->line_size - 1) |
+   ((cache->partitions - 1) << 12) |
+   ((cache->associativity - 1) << 22);
+
+assert(cache->sets > 0);
+*ecx = cache->sets - 1;
+
+*edx = (cache->no_invd_sharing ? CACHE_NO_INVD_SHARING : 0) |
+   (cache->inclusive ? CACHE_INCLUSIVE : 0) |
+   (cache->complex_indexing ? CACHE_COMPLEX_IDX : 0);
+}
+
 /* Definitions of the hardcoded cache entries we expose: */
 
 /* L1 data cache: */
@@ -4010,6 +4052,55 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 *edx = 0;
 }
 break;
+case 0x801D:
+*eax = 0;
+switch (count) {
+case 0: /* L1 dcache info */
+if (env->cache_info.valid && !cpu->legacy_cache) {
+encode_cache_cpuid801d(>cache_info.l1d_cache,
+   cs->nr_threads,
+   eax, ebx, ecx, edx);
+} else {
+encode_cache_cpuid801d(_cache_amd, cs->nr_threads,
+   eax, ebx, ecx, edx);
+}
+break;
+case 1: /* L1 icache info */
+if (env->cache_info.valid && !cpu->legacy_cache) {
+encode_cache_cpuid801d(>cache_info.l1i_cache,
+   cs->nr_threads,
+   eax, ebx, ecx, edx);
+} else {
+encode_cache_cpuid801d(_cache_amd,
+   cs->nr_threads,
+   eax, ebx, ecx, edx);
+}
+break;
+case 2: /* L2 cache info */
+if (env->cache_info.valid && !cpu->legacy_cache) {
+encode_cache_cpuid801d(>cache_info.l2_cache,
+   cs->nr_threads,
+   eax, ebx, ecx, edx);
+} else {
+encode_cache_cpuid801d(_cache_amd, cs->nr_threads,
+   eax, ebx, ecx, edx);
+}
+break;
+case 3: /* L3 cache info */
+if (env->cache_info.valid && !cpu->legacy_cache) {
+encode_cache_cpuid801d(>cache_info.l3_cache,
+   cs->nr_threads,
+   eax, ebx, ecx, edx);
+} else {
+encode_cache_cpuid801d(_cache, cs->nr_threads,
+   eax, ebx, ecx, edx);
+}
+break;
+default: /* end of info */
+*eax = *ebx = *ecx = *edx = 0;

[Qemu-devel] [PATCH v5 3/9] i386: Initialize cache information for EPYC family processors

2018-03-27 Thread Babu Moger

Initialize pre-determined cache information for EPYC processors.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 96 +++
 1 file changed, 96 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index eec4a97..67faa53 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -2299,6 +2299,54 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_6_EAX_ARAT,
 .xlevel = 0x800A,
 .model_id = "AMD EPYC Processor",
+.cache_info.valid = 1,
+.cache_info.l1d_cache = {
+.type = DCACHE,
+.level = 1,
+.size = 32 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 64,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.cache_info.l1i_cache = {
+.type = ICACHE,
+.level = 1,
+.size = 64 * KiB,
+.line_size = 64,
+.associativity = 4,
+.partitions = 1,
+.sets = 256,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.cache_info.l2_cache = {
+.type = UNIFIED_CACHE,
+.level = 2,
+.size = 512 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 1024,
+.lines_per_tag = 1,
+},
+.cache_info.l3_cache = {
+.type = UNIFIED_CACHE,
+.level = 3,
+.size = 16 * MiB,
+.line_size = 64,
+.associativity = 16,
+.partitions = 1,
+.sets = 16384,
+.lines_per_tag = 1,
+.self_init = true,
+.inclusive = true,
+.complex_indexing = true,
+},
 },
 {
 .name = "EPYC-IBPB",
@@ -2345,6 +2393,54 @@ static X86CPUDefinition builtin_x86_defs[] = {
 CPUID_6_EAX_ARAT,
 .xlevel = 0x800A,
 .model_id = "AMD EPYC Processor (with IBPB)",
+.cache_info.valid = 1,
+.cache_info.l1d_cache = {
+.type = DCACHE,
+.level = 1,
+.size = 32 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 64,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.cache_info.l1i_cache = {
+.type = ICACHE,
+.level = 1,
+.size = 64 * KiB,
+.line_size = 64,
+.associativity = 4,
+.partitions = 1,
+.sets = 256,
+.lines_per_tag = 1,
+.self_init = 1,
+.no_invd_sharing = true,
+},
+.cache_info.l2_cache = {
+.type = UNIFIED_CACHE,
+.level = 2,
+.size = 512 * KiB,
+.line_size = 64,
+.associativity = 8,
+.partitions = 1,
+.sets = 1024,
+.lines_per_tag = 1,
+},
+.cache_info.l3_cache = {
+.type = UNIFIED_CACHE,
+.level = 3,
+.size = 16 * MiB,
+.line_size = 64,
+.associativity = 16,
+.partitions = 1,
+.sets = 16384,
+.lines_per_tag = 1,
+.self_init = true,
+.inclusive = true,
+.complex_indexing = true,
+},
 },
 };
 
-- 
1.8.3.1

[Qemu-devel] [PATCH v5 4/9] i386: Add new property to control cache info

2018-03-27 Thread Babu Moger

This will be used to control the cache information.
By default new information will be displayed. If user
passes "-cpu legacy-cache" then older information will
be displayed even if the hardware supports new information.

Signed-off-by: Babu Moger 
---
 include/hw/i386/pc.h | 6 +-
 target/i386/cpu.c| 1 +
 target/i386/cpu.h| 5 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index ffee841..9cda1ab 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -327,7 +327,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 .driver   = "q35-pcihost",\
 .property = "x-pci-hole64-fix",\
 .value= "off",\
-},
+},{\
+.driver   = TYPE_X86_CPU,\
+.property = "legacy-cache",\
+.value= "off",\
+},\
 
 #define PC_COMPAT_2_9 \
 HW_COMPAT_2_9 \
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 67faa53..f4fbe3a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5132,6 +5132,7 @@ static Property x86_cpu_properties[] = {
  false),
 DEFINE_PROP_BOOL("vmware-cpuid-freq", X86CPU, vmware_cpuid_freq, true),
 DEFINE_PROP_BOOL("tcg-cpuid", X86CPU, expose_tcg, true),
+DEFINE_PROP_BOOL("legacy-cache", X86CPU, legacy_cache, false),
 
 /*
  * From "Requirements for Implementing the Microsoft
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 806c34b..bbe13f2 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1394,6 +1394,11 @@ struct X86CPU {
  */
 bool enable_l3_cache;
 
+/* Compatibility bits for old machine types.
+ * If true present the old cache topology information
+ */
+bool legacy_cache;
+
 /* Compatibility bits for old machine types: */
 bool enable_cpuid_0xb;
 
-- 
1.8.3.1

[Qemu-devel] [PATCH v5 2/9] i386: Add cache information in X86CPUDefinition

2018-03-27 Thread Babu Moger

Add cache information in X86CPUDefinition and CPUX86State.

Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 4 
 target/i386/cpu.h | 8 
 2 files changed, 12 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index da59dc4..eec4a97 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1102,6 +1102,7 @@ struct X86CPUDefinition {
 int stepping;
 FeatureWordArray features;
 const char *model_id;
+CPUCaches cache_info;
 };
 
 static X86CPUDefinition builtin_x86_defs[] = {
@@ -3239,6 +3240,9 @@ static void x86_cpu_load_def(X86CPU *cpu, 
X86CPUDefinition *def, Error **errp)
 env->features[w] = def->features[w];
 }
 
+/* Load Cache information from the X86CPUDefinition */
+memcpy(>cache_info, >cache_info, sizeof(CPUCaches));
+
 /* Special cases not set in the X86CPUDefinition structs: */
 /* TODO: in-kernel irqchip for hvf */
 if (kvm_enabled()) {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 79d5ccf..806c34b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1096,6 +1096,13 @@ typedef struct CPUCacheInfo {
 } CPUCacheInfo;
 
 
+typedef struct CPUCaches {
+bool valid;
+CPUCacheInfo l1d_cache;
+CPUCacheInfo l1i_cache;
+CPUCacheInfo l2_cache;
+CPUCacheInfo l3_cache;
+} CPUCaches;
 
 typedef struct CPUX86State {
 /* standard registers */
@@ -1282,6 +1289,7 @@ typedef struct CPUX86State {
 /* Features that were explicitly enabled/disabled */
 FeatureWordArray user_features;
 uint32_t cpuid_model[12];
+CPUCaches cache_info;
 
 /* MTRRs */
 uint64_t mtrr_fixed[11];
-- 
1.8.3.1

[Qemu-devel] [PATCH v5 0/9] i386: Enable TOPOEXT to support hyperthreading on AMD CPU

2018-03-27 Thread Babu Moger

This series enables the TOPOEXT feature for AMD CPUs. This is required to
support hyperthreading on kvm guests.

This addresses the issues reported in these bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1481253
https://bugs.launchpad.net/qemu/+bug/1703506 


v5:
In this series I tried to address the feedback from Eduardo Habkost.
The discussion thread is here.
https://patchwork.kernel.org/patch/10299745/
The previous thread is here.
http://patchwork.ozlabs.org/cover/884885/

Reason for these changes
The cache properties for AMD family of processors have changed from
previous releases. We don't want to display the new information on the
old family of processors as this might cause compatibility issues.

Changes:
 1. Based the patches on top of Eduardo's(patch#1) patch.
Changed few things.
Moved the Cache definitions to cpu.h file.
Changed the CPUID_4 names to generic names.
2. Added a new propery "legacy-cache" in cpu object(patch#2). This can be
   used to display the old property even if the host supports the new cache
   properties.
3. Added cache information in X86CPUDefinition and CPUX86State
4. Patch 6-7 changed quite a bit from previous version does to new approach.
5. Addressed few issues with CPUID_8000_001d and CPUID_8000_001E.


v4:
1.Removed the checks under cpuid 0x801D leaf(patch #2). These check are
  not necessary. Found this during internal review.
2.Added CPUID_EXT3_TOPOEXT feature for all the 17 family(patch #4). This was
  found by Kash Pande during his testing.
3.Removed th hardcoded cpuid xlevel and dynamically extended if 
CPUID_EXT3_TOPOEXT
  is supported(Suggested by Brijesh Singh). 

v3:
1.Removed the patch #1. Radim mentioned that original typo problem is in 
  linux kernel header. qemu is just copying those files.
2.In previous version, I used the cpuid 4 definitions for AMDs cpuid leaf
  0x801D. CPUID 4 is very intel specific and we dont want to expose those
  details under AMD. I have renamed some of these definitions as generic.
  These changes are in patch#1. Radim, let me know if this is what you intended.
3.Added assert to for core_id(Suggested by Radim Kr??m).
4.Changed the if condition under "L3 cache info"(Suggested by Gary Hook).
5.Addressed few more text correction and code cleanup(Suggested by Thomas 
Lendacky).

v2:
  Fixed few more minor issues per Gary Hook's comments. Thank you Gary.
  Removed the patch#1. We need to handle the instruction cache associativity 
  seperately. It varies based on the cpu family. I will comeback to that later.
  Added two more typo corrections in patch#1 and patch#5.

v1:
  Stanislav Lanci posted few patches earlier. 
  https://patchwork.kernel.org/patch/10040903/

Rebased his patches with few changes.
1.Spit the patches into two, separating cpuid functions 
  0x801D and 0x801E (Patch 2 and 3).
2.Removed the generic non-intel check and made a separate patch
  with some changes(Patch 5).
3.Fixed L3_N_SETS_AMD(from 4096 to 8192) based on CPUID_Fn801D_ECX_x03.

Added 2 more patches.
Patch 1. Fixes cache associativity.
Patch 4. Adds TOPOEXT feature on AMD EPYC CPU.


Babu Moger (8):
  i386: Add cache information in X86CPUDefinition
  i386: Initialize cache information for EPYC family processors
  i386: Add new property to control cache info
  i386: Use the statically loaded cache definitions
  i386: Populate AMD Processor Cache Information for cpuid 0x801D
  i386: Add support for CPUID_8000_001E for AMD
  i386: Enable TOPOEXT feature on AMD EPYC CPU
  i386: Remove generic SMT thread check

Eduardo Habkost (1):
  i386: Helpers to encode cache information consistently

 include/hw/i386/pc.h |   6 +-
 target/i386/cpu.c| 735 ++-
 target/i386/cpu.h|  66 +
 target/i386/kvm.c|  29 +-
 4 files changed, 702 insertions(+), 134 deletions(-)

-- 
1.8.3.1

[Qemu-devel] [PATCH v5 1/9] i386: Helpers to encode cache information consistently

2018-03-27 Thread Babu Moger

From: Eduardo Habkost 

Instead of having a collection of macros that need to be used in
complex expressions to build CPUID data, define a CPUCacheInfo
struct that can hold information about a given cache.  Helper
functions will take a CPUCacheInfo struct as input to encode
CPUID leaves for a cache.

This will help us ensure consistency between cache information
CPUID leaves, and make the existing inconsistencies in CPUID info
more visible.

Signed-off-by: Eduardo Habkost 
Signed-off-by: Babu Moger 
---
 target/i386/cpu.c | 495 --
 target/i386/cpu.h |  53 ++
 2 files changed, 424 insertions(+), 124 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6bb4ce8..da59dc4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -56,33 +56,240 @@
 
 #include "disas/capstone.h"
 
+/* Helpers for building CPUID[2] descriptors: */
+
+struct CPUID2CacheDescriptorInfo {
+enum CacheType type;
+int level;
+int size;
+int line_size;
+int associativity;
+};
 
-/* Cache topology CPUID constants: */
+#define KiB 1024
+#define MiB (1024 * 1024)
 
-/* CPUID Leaf 2 Descriptors */
+/*
+ * Known CPUID 2 cache descriptors.
+ * From Intel SDM Volume 2A, CPUID instruction
+ */
+struct CPUID2CacheDescriptorInfo cpuid2_cache_descriptors[] = {
+[0x06] = { .level = 1, .type = ICACHE,.size =   8 * KiB,
+   .associativity = 4,  .line_size = 32, },
+[0x08] = { .level = 1, .type = ICACHE,.size =  16 * KiB,
+   .associativity = 4,  .line_size = 32, },
+[0x09] = { .level = 1, .type = ICACHE,.size =  32 * KiB,
+   .associativity = 4,  .line_size = 64, },
+[0x0A] = { .level = 1, .type = DCACHE,.size =   8 * KiB,
+   .associativity = 2,  .line_size = 32, },
+[0x0C] = { .level = 1, .type = DCACHE,.size =  16 * KiB,
+   .associativity = 4,  .line_size = 32, },
+[0x0D] = { .level = 1, .type = DCACHE,.size =  16 * KiB,
+   .associativity = 4,  .line_size = 64, },
+[0x0E] = { .level = 1, .type = DCACHE,.size =  24 * KiB,
+   .associativity = 6,  .line_size = 64, },
+[0x1D] = { .level = 2, .type = UNIFIED_CACHE, .size = 128 * KiB,
+   .associativity = 2,  .line_size = 64, },
+[0x21] = { .level = 2, .type = UNIFIED_CACHE, .size = 256 * KiB,
+   .associativity = 8,  .line_size = 64, },
+/* lines per sector is not supported cpuid2_cache_descriptor(),
+* so descriptors 0x22, 0x23 are not included
+*/
+[0x24] = { .level = 2, .type = UNIFIED_CACHE, .size =   1 * MiB,
+   .associativity = 16, .line_size = 64, },
+/* lines per sector is not supported cpuid2_cache_descriptor(),
+* so descriptors 0x25, 0x20 are not included
+*/
+[0x2C] = { .level = 1, .type = DCACHE,.size =  32 * KiB,
+   .associativity = 8,  .line_size = 64, },
+[0x30] = { .level = 1, .type = ICACHE,.size =  32 * KiB,
+   .associativity = 8,  .line_size = 64, },
+[0x41] = { .level = 2, .type = UNIFIED_CACHE, .size = 128 * KiB,
+   .associativity = 4,  .line_size = 32, },
+[0x42] = { .level = 2, .type = UNIFIED_CACHE, .size = 256 * KiB,
+   .associativity = 4,  .line_size = 32, },
+[0x43] = { .level = 2, .type = UNIFIED_CACHE, .size = 512 * KiB,
+   .associativity = 4,  .line_size = 32, },
+[0x44] = { .level = 2, .type = UNIFIED_CACHE, .size =   1 * MiB,
+   .associativity = 4,  .line_size = 32, },
+[0x45] = { .level = 2, .type = UNIFIED_CACHE, .size =   2 * MiB,
+   .associativity = 4,  .line_size = 32, },
+[0x46] = { .level = 3, .type = UNIFIED_CACHE, .size =   4 * MiB,
+   .associativity = 4,  .line_size = 64, },
+[0x47] = { .level = 3, .type = UNIFIED_CACHE, .size =   8 * MiB,
+   .associativity = 8,  .line_size = 64, },
+[0x48] = { .level = 2, .type = UNIFIED_CACHE, .size =   3 * MiB,
+   .associativity = 12, .line_size = 64, },
+/* Descriptor 0x49 depends on CPU family/model, so it is not included */
+[0x4A] = { .level = 3, .type = UNIFIED_CACHE, .size =   6 * MiB,
+   .associativity = 12, .line_size = 64, },
+[0x4B] = { .level = 3, .type = UNIFIED_CACHE, .size =   8 * MiB,
+   .associativity = 16, .line_size = 64, },
+[0x4C] = { .level = 3, .type = UNIFIED_CACHE, .size =  12 * MiB,
+   .associativity = 12, .line_size = 64, },
+[0x4D] = { .level = 3, .type = UNIFIED_CACHE, .size =  16 * MiB,
+   .associativity = 16, .line_size = 64, },
+[0x4E] = { .level = 2, .type = UNIFIED_CACHE, .size =   6 * MiB,
+   .associativity = 24, .line_size = 64, },
+[0x60] = { .level = 1, .type = DCACHE,.size =  16 * KiB,
+

Re: [Qemu-devel] [PATCH v2 1/1] hw/scsi: support SCSI-2 passthrough without PI

2018-03-27 Thread Daniel Henrique Barboza




On 03/27/2018 02:21 PM, Fam Zheng wrote:

On Tue, 03/13 13:43, Daniel Henrique Barboza wrote:

QEMU SCSI code makes assumptions about how the PROTECT and BYTCHK
works in the protocol, denying support for PI (Protection
Information) in case the guest OS requests it. However, in SCSI versions 2
and older, there is no PI concept in the protocol.

This means that when dealing with such devices:

- there is no PROTECT bit in byte 5 of the standard INQUIRY response. The
whole byte is marked as "Reserved";

- there is no RDPROTECT in byte 2 of READ. We have 'Logical Unit Number'
in this field instead;

- there is no VRPROTECT in byte 2 of VERIFY. We have 'Logical Unit Number'
in this field instead. This also means that the BYTCHK bit in this case
is not related to PI.

Since QEMU does not consider these changes, a SCSI passthrough using
a SCSI-2 device will not work. It will mistake these fields with
PI information and return Illegal Request SCSI SENSE thinking
that the driver is asking for PI support.

This patch fixes it by adding a new attribute called 'scsi_version'
that is read from the standard INQUIRY response of passthrough
devices. This allows for a version verification before applying
conditions related to PI that doesn't apply for older versions.

Reported-by: Dac Nguyen 
Signed-off-by: Daniel Henrique Barboza 
---

Changes in v2:
- removed "scsi_version" as a property
- scsi_version is now initialized with -1 in scsi_realize (that is
used by scsi_hd_realize, scsi_cd_realize, scsi_disk_realize and
scsi_block_realize) and scsi_generic_realize


  hw/scsi/scsi-disk.c| 14 +++---
  hw/scsi/scsi-generic.c | 42 +++---
  include/hw/scsi/scsi.h |  1 +
  3 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 49d2559d93..80b1eb92ae 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2176,7 +2176,7 @@ static int32_t scsi_disk_dma_command(SCSIRequest *req, 
uint8_t *buf)
  case READ_12:
  case READ_16:
  DPRINTF("Read (sector %" PRId64 ", count %u)\n", r->req.cmd.lba, len);
-if (r->req.cmd.buf[1] & 0xe0) {
+if ((r->req.cmd.buf[1] & 0xe0) && (s->qdev.scsi_version > 2)) {
  goto illegal_request;
  }
  if (!check_lba_range(s, r->req.cmd.lba, len)) {
@@ -2206,8 +2206,12 @@ static int32_t scsi_disk_dma_command(SCSIRequest *req, 
uint8_t *buf)
  /* We get here only for BYTCHK == 0x01 and only for scsi-block.
   * As far as DMA is concerned, we can treat it the same as a write;
   * scsi_block_do_sgio will send VERIFY commands.
+ *
+ * For scsi versions 2 and older, the BYTCHK isn't related
+ * to VRPROTECT (in fact, there is no VRPROTECT). Skip
+ * this check in these versions.
   */
-if (r->req.cmd.buf[1] & 0xe0) {
+if ((r->req.cmd.buf[1] & 0xe0) && (s->qdev.scsi_version > 2)) {
  goto illegal_request;
  }
  if (!check_lba_range(s, r->req.cmd.lba, len)) {
@@ -2383,6 +2387,8 @@ static void scsi_realize(SCSIDevice *dev, Error **errp)
  return;
  }
  
+dev->scsi_version = -1;

+
  if ((s->features & (1 << SCSI_DISK_F_REMOVABLE)) &&
  !(s->features & (1 << SCSI_DISK_F_NO_REMOVABLE_DEVOPS))) {
  blk_set_dev_ops(s->qdev.conf.blk, _disk_removable_block_ops, s);
@@ -2796,6 +2802,8 @@ static bool scsi_block_is_passthrough(SCSIDiskState *s, 
uint8_t *buf)
  static int32_t scsi_block_dma_command(SCSIRequest *req, uint8_t *buf)
  {
  SCSIBlockReq *r = (SCSIBlockReq *)req;
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev);
+
  r->cmd = req->cmd.buf[0];
  switch (r->cmd >> 5) {
  case 0:
@@ -2821,7 +2829,7 @@ static int32_t scsi_block_dma_command(SCSIRequest *req, 
uint8_t *buf)
  abort();
  }
  
-if (r->cdb1 & 0xe0) {

+if ((r->cdb1 & 0xe0) && (s->qdev.scsi_version > 2)) {
  /* Protection information is not supported.  */
  scsi_check_condition(>req, SENSE_CODE(INVALID_FIELD));
  return 0;
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 7414fe2d67..5cc5598983 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -194,17 +194,35 @@ static void scsi_read_complete(void * opaque, int ret)
  r->buf[3] |= 0x80;
  }
  }
-if (s->type == TYPE_DISK &&
-r->req.cmd.buf[0] == INQUIRY &&
-r->req.cmd.buf[2] == 0xb0) {
-uint32_t max_transfer =
-blk_get_max_transfer(s->conf.blk) / s->blocksize;
-
-assert(max_transfer);
-stl_be_p(>buf[8], max_transfer);
-/* Also take care of the opt xfer len. */
-stl_be_p(>buf[12],
- MIN_NON_ZERO(max_transfer, ldl_be_p(>buf[12])));
+if (r->req.cmd.buf[0] == INQUIRY) {
+/*
+ *  EVPD set to zero returns the

[Qemu-devel] [PATCH v3 1/1] hw/scsi: support SCSI-2 passthrough without PI

2018-03-27 Thread Daniel Henrique Barboza

QEMU SCSI code makes assumptions about how the PROTECT and BYTCHK
works in the protocol, denying support for PI (Protection
Information) in case the guest OS requests it. However, in SCSI versions 2
and older, there is no PI concept in the protocol.

This means that when dealing with such devices:

- there is no PROTECT bit in byte 5 of the standard INQUIRY response. The
whole byte is marked as "Reserved";

- there is no RDPROTECT in byte 2 of READ. We have 'Logical Unit Number'
in this field instead;

- there is no VRPROTECT in byte 2 of VERIFY. We have 'Logical Unit Number'
in this field instead. This also means that the BYTCHK bit in this case
is not related to PI.

Since QEMU does not consider these changes, a SCSI passthrough using
a SCSI-2 device will not work. It will mistake these fields with
PI information and return Illegal Request SCSI SENSE thinking
that the driver is asking for PI support.

This patch fixes it by adding a new attribute called 'scsi_version'
that is read from the standard INQUIRY response of passthrough
devices. This allows for a version verification before applying
conditions related to PI that doesn't apply for older versions.

Reported-by: Dac Nguyen 
Signed-off-by: Daniel Henrique Barboza 
---

Changes in v3:
- moved the scsi_version initialization from realize functions to
reset functions, allowing the scsi_version to be redefined again after
each reboot.


 hw/scsi/scsi-disk.c| 14 +++---
 hw/scsi/scsi-generic.c | 46 +++---
 include/hw/scsi/scsi.h |  1 +
 3 files changed, 47 insertions(+), 14 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index f5ab767ab5..115fbd1e8f 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2191,7 +2191,7 @@ static int32_t scsi_disk_dma_command(SCSIRequest *req, 
uint8_t *buf)
 case READ_12:
 case READ_16:
 DPRINTF("Read (sector %" PRId64 ", count %u)\n", r->req.cmd.lba, len);
-if (r->req.cmd.buf[1] & 0xe0) {
+if ((r->req.cmd.buf[1] & 0xe0) && (s->qdev.scsi_version > 2)) {
 goto illegal_request;
 }
 if (!check_lba_range(s, r->req.cmd.lba, len)) {
@@ -2221,8 +2221,12 @@ static int32_t scsi_disk_dma_command(SCSIRequest *req, 
uint8_t *buf)
 /* We get here only for BYTCHK == 0x01 and only for scsi-block.
  * As far as DMA is concerned, we can treat it the same as a write;
  * scsi_block_do_sgio will send VERIFY commands.
+ *
+ * For scsi versions 2 and older, the BYTCHK isn't related
+ * to VRPROTECT (in fact, there is no VRPROTECT). Skip
+ * this check in these versions.
  */
-if (r->req.cmd.buf[1] & 0xe0) {
+if ((r->req.cmd.buf[1] & 0xe0) && (s->qdev.scsi_version > 2)) {
 goto illegal_request;
 }
 if (!check_lba_range(s, r->req.cmd.lba, len)) {
@@ -2268,6 +2272,8 @@ static void scsi_disk_reset(DeviceState *dev)
 /* reset tray statuses */
 s->tray_locked = 0;
 s->tray_open = 0;
+
+s->qdev.scsi_version = -1;
 }
 
 static void scsi_disk_resize_cb(void *opaque)
@@ -2812,6 +2818,8 @@ static bool scsi_block_is_passthrough(SCSIDiskState *s, 
uint8_t *buf)
 static int32_t scsi_block_dma_command(SCSIRequest *req, uint8_t *buf)
 {
 SCSIBlockReq *r = (SCSIBlockReq *)req;
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev);
+
 r->cmd = req->cmd.buf[0];
 switch (r->cmd >> 5) {
 case 0:
@@ -2837,7 +2845,7 @@ static int32_t scsi_block_dma_command(SCSIRequest *req, 
uint8_t *buf)
 abort();
 }
 
-if (r->cdb1 & 0xe0) {
+if ((r->cdb1 & 0xe0) && (s->qdev.scsi_version > 2)) {
 /* Protection information is not supported.  */
 scsi_check_condition(>req, SENSE_CODE(INVALID_FIELD));
 return 0;
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 4753f8738f..3b34f167c7 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -194,17 +194,40 @@ static void scsi_read_complete(void * opaque, int ret)
 r->buf[3] |= 0x80;
 }
 }
-if (s->type == TYPE_DISK &&
-r->req.cmd.buf[0] == INQUIRY &&
-r->req.cmd.buf[2] == 0xb0) {
-uint32_t max_transfer =
-blk_get_max_transfer(s->conf.blk) / s->blocksize;
-
-assert(max_transfer);
-stl_be_p(>buf[8], max_transfer);
-/* Also take care of the opt xfer len. */
-stl_be_p(>buf[12],
- MIN_NON_ZERO(max_transfer, ldl_be_p(>buf[12])));
+if (r->req.cmd.buf[0] == INQUIRY) {
+/*
+ *  EVPD set to zero returns the standard INQUIRY data.
+ *
+ *  Check if scsi_version is unset (-1) to avoid re-defining it
+ *  each time an INQUIRY with standard data is received.
+ *  scsi_version is initialized with -1 in scsi_generic_reset
+ *  and scsi_disk_reset, making sure that we'll set the

[Qemu-devel] [PATCH for-2.12] nbd: Fix 32-bit compilation on BLOCK_STATUS

2018-03-27 Thread Eric Blake

iotests 123 and 209 fail on 32-bit platforms.  The culprit:
sizeof(extent) is wrong; we want sizeof(*extent).  But since
the struct is 8 bytes, it happened to work on 64-bit platforms
where the pointer is also 8 bytes (nasty).

Fixes: 78a33ab58
Reported-by: Max Reitz 
Signed-off-by: Eric Blake 
---
 block/nbd-client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index e64e346d690..e7caf49fbb4 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -239,7 +239,7 @@ static int nbd_parse_blockstatus_payload(NBDClientSession 
*client,
 {
 uint32_t context_id;

-if (chunk->length != sizeof(context_id) + sizeof(extent)) {
+if (chunk->length != sizeof(context_id) + sizeof(*extent)) {
 error_setg(errp, "Protocol error: invalid payload for "
  "NBD_REPLY_TYPE_BLOCK_STATUS");
 return -EINVAL;
-- 
2.14.3

Re: [Qemu-devel] [PATCH v2 for-2.12] gdbstub: send a termination packet instead of crashing gdb

2018-03-27 Thread Peter Maydell

On 27 March 2018 at 10:31, KONRAD Frederic  wrote:
> Peter, can this be cherry-picked in 2.12-rc1?
>
> Thanks,
> Fred
>
>
> On 03/20/2018 10:39 AM, KONRAD Frederic wrote:
>>
>> Since the commit:
>> commit 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268
>> Author: Stefan Hajnoczi 
>> Date:   Wed Mar 7 14:42:05 2018 +
>>
>>  vl: introduce vm_shutdown()
>>
>> GDB crashes when qemu exits (at least on sparc-softmmu):
>> Remote communication error.  Target disconnected.: Connection reset by
>> peer.
>> Quitting: putpkt: write failed: Broken pipe.
>>
>> So send a packet to exit GDB before we exit QEMU:
>> [Inferior 1 (Thread 0) exited normally]
>>
>> Reviewed-by: Philippe Mathieu-Daudé 
>> Reviewed-by: Stefan Hajnoczi 
>> Signed-off-by: KONRAD Frederic 
>> ---

Applied to master, thanks.

-- PMM

Re: [Qemu-devel] [PATCH for-2.12] dump: Fix build with newer gcc

2018-03-27 Thread Philippe Mathieu-Daudé

On 03/27/2018 05:21 PM, Eric Blake wrote:
> gcc 8 on rawhide is picky enough to complain:
> 
> /home/dummy/qemu/dump.c: In function 'create_header32':
> /home/dummy/qemu/dump.c:817:5: error: 'strncpy' output truncated before 
> terminating nul copying 8 bytes from a string of the same length 
> [-Werror=stringop-truncation]
>  strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
>  ^~~~
> 
> But we already have SIG_LEN defined as the right length without needing
> to do a strlen(), and memcpy() is better than strncpy() when we know
> we do not want a trailing NUL byte.
> 
> Signed-off-by: Eric Blake 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  dump.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/dump.c b/dump.c
> index 669f715274d..b54cd42b217 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -814,7 +814,7 @@ static void create_header32(DumpState *s, Error **errp)
>  size = sizeof(DiskDumpHeader32);
>  dh = g_malloc0(size);
> 
> -strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
> +memcpy(dh->signature, KDUMP_SIGNATURE, SIG_LEN);
>  dh->header_version = cpu_to_dump32(s, 6);
>  block_size = s->dump_info.page_size;
>  dh->block_size = cpu_to_dump32(s, block_size);
> @@ -926,7 +926,7 @@ static void create_header64(DumpState *s, Error **errp)
>  size = sizeof(DiskDumpHeader64);
>  dh = g_malloc0(size);
> 
> -strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
> +memcpy(dh->signature, KDUMP_SIGNATURE, SIG_LEN);
>  dh->header_version = cpu_to_dump32(s, 6);
>  block_size = s->dump_info.page_size;
>  dh->block_size = cpu_to_dump32(s, block_size);
>

[Qemu-devel] [PATCH for-2.12] dump: Fix build with newer gcc

2018-03-27 Thread Eric Blake

gcc 8 on rawhide is picky enough to complain:

/home/dummy/qemu/dump.c: In function 'create_header32':
/home/dummy/qemu/dump.c:817:5: error: 'strncpy' output truncated before 
terminating nul copying 8 bytes from a string of the same length 
[-Werror=stringop-truncation]
 strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
 ^~~~

But we already have SIG_LEN defined as the right length without needing
to do a strlen(), and memcpy() is better than strncpy() when we know
we do not want a trailing NUL byte.

Signed-off-by: Eric Blake 
---
 dump.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/dump.c b/dump.c
index 669f715274d..b54cd42b217 100644
--- a/dump.c
+++ b/dump.c
@@ -814,7 +814,7 @@ static void create_header32(DumpState *s, Error **errp)
 size = sizeof(DiskDumpHeader32);
 dh = g_malloc0(size);

-strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
+memcpy(dh->signature, KDUMP_SIGNATURE, SIG_LEN);
 dh->header_version = cpu_to_dump32(s, 6);
 block_size = s->dump_info.page_size;
 dh->block_size = cpu_to_dump32(s, block_size);
@@ -926,7 +926,7 @@ static void create_header64(DumpState *s, Error **errp)
 size = sizeof(DiskDumpHeader64);
 dh = g_malloc0(size);

-strncpy(dh->signature, KDUMP_SIGNATURE, strlen(KDUMP_SIGNATURE));
+memcpy(dh->signature, KDUMP_SIGNATURE, SIG_LEN);
 dh->header_version = cpu_to_dump32(s, 6);
 block_size = s->dump_info.page_size;
 dh->block_size = cpu_to_dump32(s, block_size);
-- 
2.14.3

Re: [Qemu-devel] [PATCH 3/3] e1000: Old machine types, turn new subsection off

2018-03-27 Thread Paolo Bonzini

On 27/03/2018 20:01, Dr. David Alan Gilbert wrote:
>> New->old migration will place tx_legacy_vmstate_props in tx.props on the
>> destination; new->new will realize the subsection was transmitted and
>> ignore the tx_legacy_vmstate_props; old->new will not find data from the
>> subsection and copy the tx_legacy_vmstate_props into one of tx.props and
>> tx.tso_props.
> Yeh, adding a legacy_props field should do it; although we never need
> to transmit more than 2 copies.  
> 
> I'll look at this more tomorrow; I am a bit worried about testing it
> though.

Yes, I'm also afraid that we can't really do much better than careful
code review.

Paolo

Re: [Qemu-devel] [PULL v2 00/14] QAPI changes for 2018-03-27, 2.12-rc1

2018-03-27 Thread Peter Maydell

On 27 March 2018 at 16:21, Eric Blake  wrote:
> The following changes since commit bdc408e91b14cedfc29be8ff703408936e575721:
>
>   Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2018-03-26' 
> into staging (2018-03-27 14:11:30 +0100)
>
> are available in the Git repository at:
>
>   git://repo.or.cz/qemu/ericb.git tags/pull-qapi-2018-03-27-v2
>
> for you to fetch changes up to 0dfddbb537fcb0fbd045e1c890bc0e95f2ea5177:
>
>   hmp.c: Revert hmp_info_cpus output format change (2018-03-27 10:17:48 -0500)
>
> v2: fix some R-b tags in 6/14, 14/14 (no code changes, so no patches re-sent)
>
> 
> qapi patches for 2018-03-27, 2.12-rc1
>
> - Marc-André Lureau: qmp-test: fix response leak
> - Eric Blake: tests: Silence false positive warning on generated test name
> - Laurent Vivier: 0/4 (partial) coccinelle: re-run scripts from 
> scripst/coccinelle
> - Peter Xu: 0/8 Monitor: some oob related patches (fixes, new param, tests)
> - Satheesh Rajendran: hmp.c: Revert hmp_info_cpus output format change
>
> 

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH] blockjob: leak fix, remove from txn when failing early

2018-03-27 Thread Jeff Cody

On Tue, Mar 27, 2018 at 06:07:36PM +0200, Marc-André Lureau wrote:
> This fixes leaks found by ASAN such as:
>   GTESTER tests/test-blockjob
> =
> ==31442==ERROR: LeakSanitizer: detected memory leaks
> 
> Direct leak of 24 byte(s) in 1 object(s) allocated from:
> #0 0x7f88483cba38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
> #1 0x7f8845e1bd77 in g_malloc0 ../glib/gmem.c:129
> #2 0x7f8845e1c04b in g_malloc0_n ../glib/gmem.c:360
> #3 0x5584d2732498 in block_job_txn_new 
> /home/elmarco/src/qemu/blockjob.c:172
> #4 0x5584d2739b28 in block_job_create 
> /home/elmarco/src/qemu/blockjob.c:973
> #5 0x5584d270ae31 in mk_job 
> /home/elmarco/src/qemu/tests/test-blockjob.c:34
> #6 0x5584d270b1c1 in do_test_id 
> /home/elmarco/src/qemu/tests/test-blockjob.c:57
> #7 0x5584d270b65c in test_job_ids 
> /home/elmarco/src/qemu/tests/test-blockjob.c:118
> #8 0x7f8845e40b69 in test_case_run ../glib/gtestutils.c:2255
> #9 0x7f8845e40f29 in g_test_run_suite_internal ../glib/gtestutils.c:2339
> #10 0x7f8845e40fd2 in g_test_run_suite_internal ../glib/gtestutils.c:2351
> #11 0x7f8845e411e9 in g_test_run_suite ../glib/gtestutils.c:2426
> #12 0x7f8845e3fe72 in g_test_run ../glib/gtestutils.c:1692
> #13 0x5584d270d6e2 in main 
> /home/elmarco/src/qemu/tests/test-blockjob.c:377
> #14 0x7f8843641f29 in __libc_start_main (/lib64/libc.so.6+0x20f29)
> 
> Add an assert to make sure that the job doesn't have associated txn before 
> free().
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  blockjob.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 11c9ce124d..bb75386515 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -228,6 +228,7 @@ void block_job_unref(BlockJob *job)
>  {
>  if (--job->refcnt == 0) {
>  assert(job->status == BLOCK_JOB_STATUS_NULL);
> +assert(!job->txn);
>  BlockDriverState *bs = blk_bs(job->blk);
>  QLIST_REMOVE(job, job_list);
>  bs->job = NULL;
> @@ -479,6 +480,7 @@ static int block_job_finalize_single(BlockJob *job)
>  
>  QLIST_REMOVE(job, txn_list);
>  block_job_txn_unref(job->txn);
> +job->txn = NULL;
>  block_job_conclude(job);
>  return 0;
>  }
> @@ -994,6 +996,9 @@ void block_job_pause_all(void)
>  void block_job_early_fail(BlockJob *job)
>  {
>  assert(job->status == BLOCK_JOB_STATUS_CREATED);
> +QLIST_REMOVE(job, txn_list);
> +block_job_txn_unref(job->txn);
> +job->txn = NULL;
>  block_job_decommission(job);
>  }
>  
> -- 
> 2.17.0.rc1.1.g4c4f2b46a3
> 

This patch causes a segfault/assert in iotests 031 041 055:

e.g., from 031:

test_set_speed_invalid (__main__.TestSetSpeed) ... DEBUG:QMP:>>> {'execute': 
'qmp_capabilities'}
DEBUG:QMP:<<< {u'return': {}}
DEBUG:QMP:>>> {'execute': 'query-block-jobs'}
DEBUG:QMP:<<< {u'return': []}
DEBUG:QMP:>>> {'execute': 'block-stream', 'arguments': {'device': 'drive0', 
'speed': -1}}
DEBUG:QMP:<<< None
WARNING:qemu:qemu received signal -11: [...]

[Qemu-devel] [PATCH v1 1/2] RISC-V: Convert cpu definition to future model

2018-03-27 Thread Michael Clark

- Model borrowed from target/sh4/cpu.c
- Rewrote riscv_cpu_list to use object_class_get_list
- Dropped 'struct RISCVCPUInfo' and used TypeInfo array
- Replaced riscv_cpu_register_types with DEFINE_TYPES
- Marked base class as abstract
- Fixes -cpu list

Cc: Igor Mammedov 
Cc: Sagar Karandikar 
Cc: Bastian Koppelmann 
Cc: Palmer Dabbelt 
Signed-off-by: Michael Clark 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Igor Mammedov 
---
 target/riscv/cpu.c | 123 ++---
 1 file changed, 69 insertions(+), 54 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 9de34d7..5a527fb 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -115,6 +115,8 @@ static void riscv_any_cpu_init(Object *obj)
 set_resetvec(env, DEFAULT_RSTVEC);
 }
 
+#if defined(TARGET_RISCV32)
+
 static void rv32gcsu_priv1_09_1_cpu_init(Object *obj)
 {
 CPURISCVState *env = _CPU(obj)->env;
@@ -141,6 +143,8 @@ static void rv32imacu_nommu_cpu_init(Object *obj)
 set_resetvec(env, DEFAULT_RSTVEC);
 }
 
+#elif defined(TARGET_RISCV64)
+
 static void rv64gcsu_priv1_09_1_cpu_init(Object *obj)
 {
 CPURISCVState *env = _CPU(obj)->env;
@@ -167,20 +171,7 @@ static void rv64imacu_nommu_cpu_init(Object *obj)
 set_resetvec(env, DEFAULT_RSTVEC);
 }
 
-static const RISCVCPUInfo riscv_cpus[] = {
-{ 96, TYPE_RISCV_CPU_ANY,  riscv_any_cpu_init },
-{ 32, TYPE_RISCV_CPU_RV32GCSU_V1_09_1, rv32gcsu_priv1_09_1_cpu_init },
-{ 32, TYPE_RISCV_CPU_RV32GCSU_V1_10_0, rv32gcsu_priv1_10_0_cpu_init },
-{ 32, TYPE_RISCV_CPU_RV32IMACU_NOMMU,  rv32imacu_nommu_cpu_init },
-{ 32, TYPE_RISCV_CPU_SIFIVE_E31,   rv32imacu_nommu_cpu_init },
-{ 32, TYPE_RISCV_CPU_SIFIVE_U34,   rv32gcsu_priv1_10_0_cpu_init },
-{ 64, TYPE_RISCV_CPU_RV64GCSU_V1_09_1, rv64gcsu_priv1_09_1_cpu_init },
-{ 64, TYPE_RISCV_CPU_RV64GCSU_V1_10_0, rv64gcsu_priv1_10_0_cpu_init },
-{ 64, TYPE_RISCV_CPU_RV64IMACU_NOMMU,  rv64imacu_nommu_cpu_init },
-{ 64, TYPE_RISCV_CPU_SIFIVE_E51,   rv64imacu_nommu_cpu_init },
-{ 64, TYPE_RISCV_CPU_SIFIVE_U54,   rv64gcsu_priv1_10_0_cpu_init },
-{ 0, NULL, NULL }
-};
+#endif
 
 static ObjectClass *riscv_cpu_class_by_name(const char *cpu_model)
 {
@@ -366,28 +357,6 @@ static void riscv_cpu_class_init(ObjectClass *c, void 
*data)
 cc->vmsd = _riscv_cpu;
 }
 
-static void cpu_register(const RISCVCPUInfo *info)
-{
-TypeInfo type_info = {
-.name = info->name,
-.parent = TYPE_RISCV_CPU,
-.instance_size = sizeof(RISCVCPU),
-.instance_init = info->initfn,
-};
-
-type_register(_info);
-}
-
-static const TypeInfo riscv_cpu_type_info = {
-.name = TYPE_RISCV_CPU,
-.parent = TYPE_CPU,
-.instance_size = sizeof(RISCVCPU),
-.instance_init = riscv_cpu_init,
-.abstract = false,
-.class_size = sizeof(RISCVCPUClass),
-.class_init = riscv_cpu_class_init,
-};
-
 char *riscv_isa_string(RISCVCPU *cpu)
 {
 int i;
@@ -403,30 +372,76 @@ char *riscv_isa_string(RISCVCPU *cpu)
 return isa_str;
 }
 
-void riscv_cpu_list(FILE *f, fprintf_function cpu_fprintf)
+typedef struct RISCVCPUListState {
+fprintf_function cpu_fprintf;
+FILE *file;
+} RISCVCPUListState;
+
+static gint riscv_cpu_list_compare(gconstpointer a, gconstpointer b)
 {
-const RISCVCPUInfo *info = riscv_cpus;
+ObjectClass *class_a = (ObjectClass *)a;
+ObjectClass *class_b = (ObjectClass *)b;
+const char *name_a, *name_b;
 
-while (info->name) {
-if (info->bit_widths & TARGET_LONG_BITS) {
-(*cpu_fprintf)(f, "%s\n", info->name);
-}
-info++;
-}
+name_a = object_class_get_name(class_a);
+name_b = object_class_get_name(class_b);
+return strcmp(name_a, name_b);
 }
 
-static void riscv_cpu_register_types(void)
+static void riscv_cpu_list_entry(gpointer data, gpointer user_data)
 {
-const RISCVCPUInfo *info = riscv_cpus;
+RISCVCPUListState *s = user_data;
+const char *typename = object_class_get_name(OBJECT_CLASS(data));
+int len = strlen(typename) - strlen(RISCV_CPU_TYPE_SUFFIX);
 
-type_register_static(_cpu_type_info);
+(*s->cpu_fprintf)(s->file, "%.*s\n", len, typename);
+}
 
-while (info->name) {
-if (info->bit_widths & TARGET_LONG_BITS) {
-cpu_register(info);
-}
-info++;
-}
+void riscv_cpu_list(FILE *f, fprintf_function cpu_fprintf)
+{
+RISCVCPUListState s = {
+.cpu_fprintf = cpu_fprintf,
+.file = f,
+};
+GSList *list;
+
+list = object_class_get_list(TYPE_RISCV_CPU, false);
+list = g_slist_sort(list, riscv_cpu_list_compare);
+g_slist_foreach(list, riscv_cpu_list_entry, );
+g_slist_free(list);
 }
 
-type_init(riscv_cpu_register_types)
+#define DEFINE_CPU(type_name, initfn)

[Qemu-devel] [PATCH v1 2/2] RISC-V: Fix incorrect disassembly for addiw

2018-03-27 Thread Michael Clark

This fixes a bug in the disassembler constraints used
to lift instructions into pseudo-instructions, whereby
addiw instructions are always lifted to sext.w instead
of just lifting addiw with a zero immediate.

An associated fix has been made to the metadata used to
machine generate the disseasembler:

https://github.com/michaeljclark/riscv-meta/
commit/4a6b2f3898430768acfe201405224d2ea31e1477

Cc: Sagar Karandikar 
Cc: Bastian Koppelmann 
Cc: Palmer Dabbelt 
Cc: Peter Maydell 
Signed-off-by: Michael Clark 
---
 disas/riscv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/disas/riscv.c b/disas/riscv.c
index 3c17501..74ad16e 100644
--- a/disas/riscv.c
+++ b/disas/riscv.c
@@ -600,7 +600,7 @@ static const rvc_constraint rvcc_mv[] = { rvc_imm_eq_zero, 
rvc_end };
 static const rvc_constraint rvcc_not[] = { rvc_imm_eq_n1, rvc_end };
 static const rvc_constraint rvcc_neg[] = { rvc_rs1_eq_x0, rvc_end };
 static const rvc_constraint rvcc_negw[] = { rvc_rs1_eq_x0, rvc_end };
-static const rvc_constraint rvcc_sext_w[] = { rvc_rs2_eq_x0, rvc_end };
+static const rvc_constraint rvcc_sext_w[] = { rvc_imm_eq_zero, rvc_end };
 static const rvc_constraint rvcc_seqz[] = { rvc_imm_eq_p1, rvc_end };
 static const rvc_constraint rvcc_snez[] = { rvc_rs1_eq_x0, rvc_end };
 static const rvc_constraint rvcc_sltz[] = { rvc_rs2_eq_x0, rvc_end };
-- 
2.7.0

[Qemu-devel] [PATCH v1 0/2] RISC-V: Important fixes for QEMU 2.12

2018-03-27 Thread Michael Clark

This series includes changes that are considered important.
i.e. correct user-visible bugs that are exercised by common
operations such as -cpu list or -d in_asm

Michael Clark (2):
  RISC-V: Convert cpu definition to future model
  RISC-V: Fix incorrect disassembly for addiw

 disas/riscv.c  |   2 +-
 target/riscv/cpu.c | 123 ++---
 2 files changed, 70 insertions(+), 55 deletions(-)

-- 
2.7.0

[Qemu-devel] [PATCH v1 1/1] RISC-V: Workaround for critical mstatus.FS MTTCG bug

2018-03-27 Thread Michael Clark

This change is a workaround for a bug where mstatus.FS
is not correctly reporting dirty when MTTCG and SMP are
enabled which results in the floating point register file
not being saved during context switches. This a critical
bug for RISC-V in QEMU as it results in floating point
register file corruption when running SMP Linux in the
RISC-V 'virt' machine.

This workaround will return dirty if mstatus.FS is
switched from off to initial or clean. We have checked
the specification and it is legal for an implementation
to return either off, or dirty, if set to initial or clean.

This workaround will result in unnecessary floating point
save restore. When mstatus.FS is off, floating point
instruction trap to indicate the process is using the FPU.
The OS can then save floating-point state of the previous
process using the FPU and set mstatus.FS to initial or
clean. With this workaround, mstatus.FS will always return
dirty if set to a non-zero value, indicating floating point
save restore is necessary, versus misreporting mstatus.FS
resulting in floating point register file corruption.

Cc: Palmer Dabbelt 
Cc: Sagar Karandikar 
Cc: Bastian Koppelmann 
Cc: Peter Maydell 
Tested-by: Richard W.M. Jones 
Signed-off-by: Michael Clark 
---
 target/riscv/op_helper.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index e34715d..7281b98 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, target_ulong 
val_to_write,
 }
 
 mstatus = (mstatus & ~mask) | (val_to_write & mask);
-int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
-dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
+
+/* Note: this is a workaround for an issue where mstatus.FS
+   does not report dirty when SMP and MTTCG is enabled. This
+   workaround is technically compliant with the RISC-V Privileged
+   specification as it is legal to return only off, or dirty,
+   however this may cause unnecessary saves of floating point state.
+   Without this workaround, floating point state is not saved and
+   restored correctly when SMP and MTTCG is enabled, */
+if (qemu_tcg_mttcg_enabled()) {
+/* FP is always dirty or off */
+if (mstatus & MSTATUS_FS) {
+mstatus |= MSTATUS_FS;
+}
+}
+
+int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
+((mstatus & MSTATUS_XS) == MSTATUS_XS);
 mstatus = set_field(mstatus, MSTATUS_SD, dirty);
 env->mstatus = mstatus;
 break;
-- 
2.7.0

[Qemu-devel] [PATCH v1 0/1] RISC-V: Critical fixes for QEMU 2.12

2018-03-27 Thread Michael Clark

This series includes changes that are considered release critical,
such as floating point register file corruption under SMP Linux.

Michael Clark (1):
  RISC-V: Workaround for critical mstatus.FS MTTCG bug

 target/riscv/op_helper.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

-- 
2.7.0

Re: [Qemu-devel] [PATCH] i386/kvm: add support for KVM_CAP_X86_DISABLE_EXITS

2018-03-27 Thread Michael S. Tsirkin

On Fri, Mar 16, 2018 at 07:36:42AM -0700, Wanpeng Li wrote:
> From: Wanpeng Li 
> 
> This patch adds support for KVM_CAP_X86_DISABLE_EXITS. Provides userspace 
> with 
> per-VM capability(KVM_CAP_X86_DISABLE_EXITS) to not intercept MWAIT/HLT/PAUSE 
> in order that to improve latency in some workloads.
> 
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Cc: Eduardo Habkost 
> Signed-off-by: Wanpeng Li 
> ---
>  linux-headers/linux/kvm.h |  6 +-
>  target/i386/kvm.c | 12 
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
> index a167be8..857df15 100644
> --- a/linux-headers/linux/kvm.h
> +++ b/linux-headers/linux/kvm.h
> @@ -925,7 +925,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_S390_GS 140
>  #define KVM_CAP_S390_AIS 141
>  #define KVM_CAP_SPAPR_TCE_VFIO 142
> -#define KVM_CAP_X86_GUEST_MWAIT 143
> +#define KVM_CAP_X86_DISABLE_EXITS 143
>  #define KVM_CAP_ARM_USER_IRQ 144
>  #define KVM_CAP_S390_CMMA_MIGRATION 145
>  #define KVM_CAP_PPC_FWNMI 146
> @@ -1508,6 +1508,10 @@ struct kvm_assigned_msix_entry {
>  #define KVM_X2APIC_API_USE_32BIT_IDS(1ULL << 0)
>  #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
>  
> +#define KVM_X86_DISABLE_EXITS_MWAIT  (1 << 0)
> +#define KVM_X86_DISABLE_EXITS_HLT(1 << 1)
> +#define KVM_X86_DISABLE_EXITS_PAUSE  (1 << 2)
> +
>  /* Available with KVM_CAP_ARM_USER_IRQ */
>  
>  /* Bits for run->s.regs.device_irq_level */
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index d23fff1..95ed9eb 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -999,6 +999,18 @@ int kvm_arch_init_vcpu(CPUState *cs)
>  }
>  }
>  
> +if (env->features[FEAT_KVM_HINTS] & KVM_HINTS_DEDICATED) {
> +int disable_exits = kvm_check_extension(cs->kvm_state, 
> KVM_CAP_X86_DISABLE_EXITS);
> +if (disable_exits) {
> +disable_exits &= (KVM_X86_DISABLE_EXITS_MWAIT |
> +  KVM_X86_DISABLE_EXITS_HLT |
> +  KVM_X86_DISABLE_EXITS_PAUSE);
> +}
> +if (kvm_vm_enable_cap(cs->kvm_state, KVM_CAP_X86_DISABLE_EXITS, 0, 
> disable_exits)) {
> +error_report("kvm: DISABLE EXITS not supported");
> +}
> +}
> +
>  qemu_add_vm_change_state_handler(cpu_update_state, env);
>  
>  c = cpuid_find_entry(_data.cpuid, 1, 0);

Why not a bit per capability?
I can see how someone might want to disable mwait exists
but not the rest of them.

> -- 
> 2.7.4

Re: [Qemu-devel] [PATCH 1/8] migration: stop compressing page in migration thread

2018-03-27 Thread Dr. David Alan Gilbert

* Xiao Guangrong (guangrong.x...@gmail.com) wrote:
> 
> 
> On 03/26/2018 05:02 PM, Peter Xu wrote:
> > On Thu, Mar 22, 2018 at 07:38:07PM +0800, Xiao Guangrong wrote:
> > > 
> > > 
> > > On 03/21/2018 04:19 PM, Peter Xu wrote:
> > > > On Fri, Mar 16, 2018 at 04:05:14PM +0800, Xiao Guangrong wrote:
> > > > > 
> > > > > Hi David,
> > > > > 
> > > > > Thanks for your review.
> > > > > 
> > > > > On 03/15/2018 06:25 PM, Dr. David Alan Gilbert wrote:
> > > > > 
> > > > > > > migration/ram.c | 32 
> > > > > > 
> > > > > > Hi,
> > > > > >  Do you have some performance numbers to show this helps?  Were 
> > > > > > those
> > > > > > taken on a normal system or were they taken with one of the 
> > > > > > compression
> > > > > > accelerators (which I think the compression migration was designed 
> > > > > > for)?
> > > > > 
> > > > > Yes, i have tested it on my desktop, i7-4790 + 16G, by locally live 
> > > > > migrate
> > > > > the VM which has 8 vCPUs + 6G memory and the max-bandwidth is limited 
> > > > > to 350.
> > > > > 
> > > > > During the migration, a workload which has 8 threads repeatedly 
> > > > > written total
> > > > > 6G memory in the VM. Before this patchset, its bandwidth is ~25 mbps, 
> > > > > after
> > > > > applying, the bandwidth is ~50 mbps.
> > > > 
> > > > Hi, Guangrong,
> > > > 
> > > > Not really review comments, but I got some questions. :)
> > > 
> > > Your comments are always valuable to me! :)
> > > 
> > > > 
> > > > IIUC this patch will only change the behavior when last_sent_block
> > > > changed.  I see that the performance is doubled after the change,
> > > > which is really promising.  However I don't fully understand why it
> > > > brings such a big difference considering that IMHO current code is
> > > > sending dirty pages per-RAMBlock.  I mean, IMHO last_sent_block should
> > > > not change frequently?  Or am I wrong?
> > > 
> > > It's depends on the configuration, each memory-region which is ram or
> > > file backend has a RAMBlock.
> > > 
> > > Actually, more benefits comes from the fact that the performance & 
> > > throughput
> > > of the multithreads has been improved as the threads is fed by the
> > > migration thread and the result is consumed by the migration
> > > thread.
> > 
> > I'm not sure whether I got your points - I think you mean that the
> > compression threads and the migration thread can form a better
> > pipeline if the migration thread does not do any compression at all.
> > 
> > I think I agree with that.
> > 
> > However it does not really explain to me on why a very rare event
> > (sending the first page of a RAMBlock, considering bitmap sync is
> > rare) can greatly affect the performance (it shows a doubled boost).
> > 
> 
> I understand it is trick indeed, but it is not very hard to explain.
> Multi-threads (using 8 CPUs in our test) keep idle for a long time
> for the origin code, however, after our patch, as the normal is
> posted out async-ly that it's extremely fast as you said (the network
> is almost idle for current implementation) so it has a long time that
> the CPUs can be used effectively to generate more compressed data than
> before.

One thing to try, to explain Peter's worry, would be, for testing, to
add a counter to see how often this case triggers, and perhaps add
some debug to see when;  Peter's right that flipping between the
RAMBlocks seems odd, unless you're either doing lots of iterations or
have lots of separate RAMBlocks for some reason.

Dave

> > Btw, about the numbers: IMHO the numbers might not be really "true
> > numbers".  Or say, even the bandwidth is doubled, IMHO it does not
> > mean the performance is doubled. Becasue the data has changed.
> > 
> > Previously there were only compressed pages, and now for each cycle of
> > RAMBlock looping we'll send a normal page (then we'll get more thing
> > to send).  So IMHO we don't really know whether we sent more pages
> > with this patch, we can only know we sent more bytes (e.g., an extreme
> > case is that the extra 25Mbps/s are all caused by those normal pages,
> > and we can be sending exactly the same number of pages like before, or
> > even worse?).
> > 
> 
> Current implementation uses CPU very ineffectively (it's our next work
> to be posted out) that the network is almost idle so posting more data
> out is a better choice，further more, migration thread plays a role for
> parallel, it'd better to make it fast.
> 
> > > 
> > > > 
> > > > Another follow-up question would be: have you measured how long time
> > > > needed to compress a 4k page, and how many time to send it?  I think
> > > > "sending the page" is not really meaningful considering that we just
> > > > put a page into the buffer (which should be extremely fast since we
> > > > don't really flush it every time), however I would be curious on how
> > > > slow would compressing a page be.
> > > 
> > > I haven't benchmark the performance of zlib, i think it is CPU

Re: [Qemu-devel] [Qemu-block] [PATCH] blockjob: leak fix, remove from txn when failing early

2018-03-27 Thread John Snow



On 03/27/2018 12:07 PM, Marc-André Lureau wrote:
> This fixes leaks found by ASAN such as:
>   GTESTER tests/test-blockjob
> =
> ==31442==ERROR: LeakSanitizer: detected memory leaks
> 
> Direct leak of 24 byte(s) in 1 object(s) allocated from:
> #0 0x7f88483cba38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
> #1 0x7f8845e1bd77 in g_malloc0 ../glib/gmem.c:129
> #2 0x7f8845e1c04b in g_malloc0_n ../glib/gmem.c:360
> #3 0x5584d2732498 in block_job_txn_new 
> /home/elmarco/src/qemu/blockjob.c:172
> #4 0x5584d2739b28 in block_job_create 
> /home/elmarco/src/qemu/blockjob.c:973
> #5 0x5584d270ae31 in mk_job 
> /home/elmarco/src/qemu/tests/test-blockjob.c:34
> #6 0x5584d270b1c1 in do_test_id 
> /home/elmarco/src/qemu/tests/test-blockjob.c:57
> #7 0x5584d270b65c in test_job_ids 
> /home/elmarco/src/qemu/tests/test-blockjob.c:118
> #8 0x7f8845e40b69 in test_case_run ../glib/gtestutils.c:2255
> #9 0x7f8845e40f29 in g_test_run_suite_internal ../glib/gtestutils.c:2339
> #10 0x7f8845e40fd2 in g_test_run_suite_internal ../glib/gtestutils.c:2351
> #11 0x7f8845e411e9 in g_test_run_suite ../glib/gtestutils.c:2426
> #12 0x7f8845e3fe72 in g_test_run ../glib/gtestutils.c:1692
> #13 0x5584d270d6e2 in main 
> /home/elmarco/src/qemu/tests/test-blockjob.c:377
> #14 0x7f8843641f29 in __libc_start_main (/lib64/libc.so.6+0x20f29)
> 
> Add an assert to make sure that the job doesn't have associated txn before 
> free().
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  blockjob.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 11c9ce124d..bb75386515 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -228,6 +228,7 @@ void block_job_unref(BlockJob *job)
>  {
>  if (--job->refcnt == 0) {
>  assert(job->status == BLOCK_JOB_STATUS_NULL);
> +assert(!job->txn);
>  BlockDriverState *bs = blk_bs(job->blk);
>  QLIST_REMOVE(job, job_list);
>  bs->job = NULL;
> @@ -479,6 +480,7 @@ static int block_job_finalize_single(BlockJob *job)
>  
>  QLIST_REMOVE(job, txn_list);
>  block_job_txn_unref(job->txn);
> +job->txn = NULL;
>  block_job_conclude(job);
>  return 0;
>  }
> @@ -994,6 +996,9 @@ void block_job_pause_all(void)
>  void block_job_early_fail(BlockJob *job)
>  {
>  assert(job->status == BLOCK_JOB_STATUS_CREATED);
> +QLIST_REMOVE(job, txn_list);
> +block_job_txn_unref(job->txn);
> +job->txn = NULL;
>  block_job_decommission(job);
>  }
>  
> 

Shame on me.

I may have shuffled this into decommission, where if there is a txn we
unlink ourselves from it (especially with the assertion added), but this
patch is fine.

Reviewed-by: John Snow 

cc: Jeff Cody

Re: [Qemu-devel] [patches] Re: [PATCH v6 00/26] RISC-V: Fixes and cleanups for QEMU 2.12

2018-03-27 Thread Michael Clark

On Tue, Mar 27, 2018 at 11:39 AM, Michael Clark  wrote:

>
> I will divide the series up into 3 branches, and move through them in
> order of priority, with correctness ahead of tidyness:
>
> 1). riscv-qemu-2.12-critical-fixes
> 2). riscv-qemu-2.13-bug-fixes
> 3). riscv-qemu-2.13-tidy-ups
>

I think we need 4 categories:

1). riscv-qemu-2.12-critical-fixes
- floating point register file corruption mstatus.FS

2). riscv-qemu-2.12-important-fixes
- user visible bugs, e.g. Igor's -cpu list bug, wrong dissassembly for
sext.w/addiw bug

3). riscv-qemu-2.13-bug-fixes
- spec bugs and other innocuous bug fixes that are not /yet/ user visible.
i.e. not exercised by RISC-V Linux

4). riscv-qemu-2.13-tidy-ups
- code cleanups

Re: [Qemu-devel] [patches] Re: [PATCH v6 00/26] RISC-V: Fixes and cleanups for QEMU 2.12

2018-03-27 Thread Michael Clark

On Tue, Mar 27, 2018 at 2:42 AM, Peter Maydell 
wrote:

> On 26 March 2018 at 19:07, Michael Clark  wrote:
> > On Sun, Mar 25, 2018 at 8:03 AM, Peter Maydell  >
> > wrote:
> >> Hi. It looks to me like a fair number of these patches
> >> are already reviewed, so we don't need to wait on the
> >> rest being reviewed to get those into master.
> >>
> >> My suggestion is that you send a pullrequest now for the
> >> reviewed patches, and send a patchset for review for the
> >> new ones or the ones that still need review. (If there
> >> are patches that are reviewed but depend on earlier ones
> >> that need to go in set 2 then they go in set 2 as well.)
> >>
> >
> > Unfortunately the reviewed patches are mostly just minor cleanups. It's
> > almost not worth making a PR for them as *none* of the reviewed patches
> are
> > actually bug fixes. They are things like removing unused definitions or
> > replacing hardcoded constants with enums, removing unnesscary braces,
> etc,
> > etc
>
> No, what I'm saying is that it is very much worth it. You
> want to shorten the size of your set of uncommitted patches.
> Large pull requests increase the chances that some
> random thing in there hits a compile issue or other minor
> problem that means I have to bounce the whole thing and
> you need to respin it. Smaller ones are more likely to
> go in. This is especially true during the freeze part
> of the release cycle, when we do an RC every week -- having
> patches in earlier RCs reduces the risk. I do not want
> to still see a 26 patch set unapplied by the time we get
> to RC3 or RC4.
>
> Or if you don't think the minor cleanups are worth putting
> into 2.12, that's fine too (it's a submaintainer judgement
> you can make). In that case you can put those to one side
> and trim down the size of the patchset you're sending out
> (ie make it an 01/11...11/11 patchset or whatever).

I'm not sure whether maintainer or submaintainer is really that relavant.
Active maintainership is probably more relevant. i.e. responding to RISC-V
related emails, PRs, issues on the riscv.org qemu repo, testing PRs before
merging them, etc, etc.

I'm going to focus on getting the critical bug fixes in for QEMU 2.12 i.e.
the ones that break RISC-V Linux in QEMU 2.12 e.g. the mstatus.FS fix. User
visible bugs like the disassembler bug and the -cpu list bug. I'm going to
make a 3 patch series... possibly a 2 patch series... we can leave the
disassembler bug there and just include Igor's change and the mstatus.FS
workaround. I don't think writable ROM is really that critical, and bounds
checks for potential device-tree truncation are just nice-to-haves. Spec
conformance is nice-to-have if we are triaging against critical issues.

Once we can ensure we have a working RISC-V port for QEMU 2.12 we can then
worry about spec conformance bug fixes and tidy ups, perhaps for QEMU 2.13.

My pesonal opinion is that Tested-by: should carry more weight than
Reviewed-by: assuming Reviewed-by: only means someone has reviewed the code
versus checking out and testing that a critical bug has been resolved by
said patch. That said, if all QEMU patches need Reviewed-by: then there is
not much we can do. In GCC and Linux, subsystem maintainers are allowed to
make judgements over the inclusion of critical bug fixes. i.e. Reviewed-by:
is not mandatory if the change is a critical fix.

I will divide the series up into 3 branches, and move through them in order
of priority, with correctness ahead of tidyness:

1). riscv-qemu-2.12-critical-fixes
2). riscv-qemu-2.13-bug-fixes
3). riscv-qemu-2.13-tidy-ups

Expect to see riscv-qemu-2.12-critical-fixes very soon...

> 26 patches is a lot to still be carrying around much
> >> beyond rc1, so I would like to see the size of this set
> >> reducing rather than increasing. As the release process
> >> moves forward the bar for "can this still go in" gradually
> >> goes up -- by about rc3 it is at about "is this a
> >> really critical bug or regression from the previous
> >> release".
> >>
> >> (Also something seems to have unhelpfully decided to eat
> >> or delay about half of your emails in this patchset :-(
> >> Patchew only sees 14 of the 26. Our mailing list server
> >> does seem to do that occasionally so that would be my
> >> first guess at the culprit, but it's possible it's
> >> something at your end.)
> >>
> >
> > Phil asked that I send out only the patches that don't have review, so
> > that's what I did.
>
> I think that was a miscommunication. You should always
> send out entire patchsets, not just parts of one.
> Philippe said:
> https://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06038.html
> "You could have sent a PR of the reviewed patches, and
> respin the unreviewed patches separately.", which is the
> same thing I'm suggesting here.

My mistake.

Re: [Qemu-devel] [PULL 0/4] Block patches

2018-03-27 Thread Peter Maydell

On 27 March 2018 at 15:41, Stefan Hajnoczi  wrote:
> The following changes since commit f58d9620aa4a514b1227074ff56eefd1334a6225:
>
>   Merge remote-tracking branch 'remotes/rth/tags/pull-dt-20180326' into 
> staging (2018-03-27 10:27:34 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to f5a53faad4bfbf1b86012a13055d2a1a774a42b6:
>
>   MAINTAINERS: add include/block/aio-wait.h (2018-03-27 13:05:48 +0100)
>
> 
>
> 
>
> Stefan Hajnoczi (4):
>   queue: add QSIMPLEQ_PREPEND()
>   coroutine: avoid co_queue_wakeup recursion
>   coroutine: add test-aio coroutine queue chaining test case
>   MAINTAINERS: add include/block/aio-wait.h
>
>  MAINTAINERS  |   1 +
>  include/qemu/coroutine_int.h |   1 -
>  include/qemu/queue.h |   8 
>  block/io.c   |   3 +-
>  tests/test-aio.c |  65 -
>  util/qemu-coroutine-lock.c   |  34 -
>  util/qemu-coroutine.c| 110 
> +++
>  7 files changed, 121 insertions(+), 101 deletions(-)
>

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v1 07/14] fpu: introduce hostfloat

2018-03-27 Thread Emilio G. Cota

On Tue, Mar 27, 2018 at 12:49:48 +0100, Alex Bennée wrote:
> Emilio G. Cota  writes:
> 
> > The appended paves the way for leveraging the host FPU for a subset
> > of guest FP operations. For most guest workloads (e.g. FP flags
> > aren't ever cleared, inexact occurs often and rounding is set to the
> > default [to nearest]) this will yield sizable performance speedups.
> >
> > The approach followed here avoids checking the FP exception flags register.
> > See the comment at the top of hostfloat.c for details.
> >
> > This assumes that QEMU is running on an IEEE754-compliant FPU and
> > that the rounding is set to the default (to nearest). The
> > implementation-dependent specifics of the FPU should not matter; things
> > like tininess detection and snan representation are still dealt with in
> > soft-fp. However, this approach will break on most hosts if we compile
> > QEMU with flags such as -ffast-math. We control the flags so this should
> > be easy to enforce though.
> 
> The thing I would avoid is generating is any x87 instructions as we can
> get weird effects if the compiler ever decides to stash a signalling NaN
> in an x87 register.

We take care not to do hardfloat on operands that might result in NaNs.
So this should not be a concern.

> Anyway perhaps -fno-fast-math should be explicit when building fpu/* code?

That's a fair suggestion. There are plenty of other flags though that could
ruin this approach, so I'm not sure how effective this would be.

Also, we should be careful not to sneak in things like
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON) 
in the QEMU binary. Not sure we can guarantee this is avoided unless
we had a runtime check =)

> > The licensing in softfloat.h is complicated at best, so to keep things
> > simple I'm adding this as a separate, GPL'ed file.
> 
> I don't think we need to worry about this. It's fine to add GPL only
> stuff to softfloat.c and since the re-factoring (or before really) we
> "own" this code and are unlikely to upstream anything.
> 
> My preference would be to include this all in softfloat.c unless there
> is a very good reason not to.

Yes I did this in v2 after reading the license etc.

(snip)
> > +++ b/fpu/hostfloat.c
(snip)
> > +#define GEN_INPUT_FLUSH(soft_t) \
> > +static inline __attribute__((always_inline)) void   \
> > +soft_t ## _input_flush__nocheck(soft_t *a, float_status *s) \
(snip)
> > +soft_t ## _input_flush__nocheck(c, s);  \
> > +}
> > +
> > +GEN_INPUT_FLUSH(float32)
> > +GEN_INPUT_FLUSH(float64)
> 
> Having spent time getting rid of a bunch of macro expansions I'm wary of
> adding more in. However for these I guess it's kind of marginal.

Then you won't like v2 :-(

I don't like macros either but in this case they might be a necessary evil.
I left a lot of macros in there because it'll let us retain performance
and also easily support things like half/quad precision, if we ever want to.

Thanks,

Emilio

Re: [Qemu-devel] [PATCH v1 08/14] hostfloat: support float32/64 addition and subtraction

2018-03-27 Thread Emilio G. Cota

On Tue, Mar 27, 2018 at 12:41:18 +0100, Alex Bennée wrote:
> 
> Emilio G. Cota  writes:
> 
> > On Thu, Mar 22, 2018 at 14:41:05 +0800, Richard Henderson wrote:
> > (snip)
> >> Another thought re all of the soft_is_normal || soft_is_zero checks that 
> >> you're
> >> performing.  I think it would be nice if we could work with
> >> float*_unpack_canonical so that we don't have to duplicate work.  E.g.
> >>
> >> /* Return true for float_class_normal && float_class_zero.  */
> >> static inline bool is_finite(FloatClass c) { return c <= float_class_zero; 
> >> }
> >>
> >> float32 float32_add(float32 a, float32 b, float_status *s)
> >> {
> >>   FloatClass a_cls = float32_classify(a);
> >>   FloatClass b_cls = float32_classify(b);
> >
> > Just looked at this. It can be done, although it comes at the
> > price of some performance for fp-bench -o add:
> > 180 Mflops vs. 196 Mflops, i.e. a 8% slowdown. That is with
> > adequate inlining etc., otherwise perf is worse.
> >
> > I'm not convinced that we can gain much in simplicity to
> > justify the perf impact. Yes, we'd simplify canonicalize(),
> > but we'd probably need a float_class_denormal[*], which
> > would complicate everything else.
> >
> > I think it makes sense to keep some inlines that work on
> > the float32/64's directly.
> >
> >>   if (is_finite(a_cls) && is_finite(b_cls) && ...) {
> >>   /* do hardfp thing */
> >>   }
> >
> > [*] Taking 0, denormals and normals would be OK from correctness,
> > but we really don't want to compute ops with denormal inputs on
> > the host; it is very likely that the output will also be denormal,
> > and we'll end up deferring to soft-fp anyway to avoid
> > computing whether the underflow exception has occurred,
> > which is expensive.
> >
> >>   pa = float32_unpack(a, ca, s);
> >>   pb = float32_unpack(b, cb, s);
> >>   pr = addsub_floats(pa, pb, s, false);
> >>   return float32_round_pack(pr, s);
> >> }
> >
> > It pays off to have two separate functions (add & sub) for the
> > slow path. With soft_f32_add/sub factored out:
> >
> > $ taskset -c 0 x86_64-linux-user/qemu-x86_64 tests/fp-bench -o add
> > 197.53 MFlops
> >
> > With the above four lines (pa...return) as an else branch:
> > 169.16 MFlops
> >
> > BTW flattening makes things worse (150.63 MFlops).
> 
> That's disappointing. Did you look at the generated code? Because the
> way we are abusing __flatten__ to effectively make a compile time
> template you would hope it could pull the relevant classify bits to
> before the hard float branch and do the rest later if needed.
> 
> Everything was inline or in softfloat.c for this test right?

Yes. It's just that the classify bits are more expensive than
the alternative. This is fair enough when you look at classify().

E.

Re: [Qemu-devel] [PATCH 5.5/7] dirty-bitmap: drop unused bdrv_undo_clear_dirty_bitmap

2018-03-27 Thread no-reply

Hi,

This series failed docker-quick@centos6 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

Type: series
Message-id: 20180326115346.11939-1-vsement...@virtuozzo.com
Subject: [Qemu-devel] [PATCH 5.5/7] dirty-bitmap: drop unused 
bdrv_undo_clear_dirty_bitmap

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-quick@centos6
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
410b0c5abb dirty-bitmap: drop unused bdrv_undo_clear_dirty_bitmap

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-ruregu41/src/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
  BUILD   centos6
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-ruregu41/src'
  GEN 
/var/tmp/patchew-tester-tmp-ruregu41/src/docker-src.2018-03-26-10.19.23.32742/qemu.tar
Cloning into 
'/var/tmp/patchew-tester-tmp-ruregu41/src/docker-src.2018-03-26-10.19.23.32742/qemu.tar.vroot'...
done.
Checking out files:  47% (2851/6057)   
Checking out files:  48% (2908/6057)   
Checking out files:  49% (2968/6057)   
Checking out files:  50% (3029/6057)   
Checking out files:  51% (3090/6057)   
Checking out files:  52% (3150/6057)   
Checking out files:  53% (3211/6057)   
Checking out files:  54% (3271/6057)   
Checking out files:  55% (3332/6057)   
Checking out files:  56% (3392/6057)   
Checking out files:  57% (3453/6057)   
Checking out files:  58% (3514/6057)   
Checking out files:  59% (3574/6057)   
Checking out files:  60% (3635/6057)   
Checking out files:  61% (3695/6057)   
Checking out files:  62% (3756/6057)   
Checking out files:  63% (3816/6057)   
Checking out files:  64% (3877/6057)   
Checking out files:  65% (3938/6057)   
Checking out files:  66% (3998/6057)   
Checking out files:  67% (4059/6057)   
Checking out files:  68% (4119/6057)   
Checking out files:  69% (4180/6057)   
Checking out files:  70% (4240/6057)   
Checking out files:  71% (4301/6057)   
Checking out files:  72% (4362/6057)   
Checking out files:  73% (4422/6057)   
Checking out files:  74% (4483/6057)   
Checking out files:  75% (4543/6057)   
Checking out files:  76% (4604/6057)   
Checking out files:  77% (4664/6057)   
Checking out files:  78% (4725/6057)   
Checking out files:  79% (4786/6057)   
Checking out files:  80% (4846/6057)   
Checking out files:  81% (4907/6057)   
Checking out files:  82% (4967/6057)   
Checking out files:  83% (5028/6057)   
Checking out files:  84% (5088/6057)   
Checking out files:  85% (5149/6057)   
Checking out files:  86% (5210/6057)   
Checking out files:  87% (5270/6057)   
Checking out files:  88% (5331/6057)   
Checking out files:  89% (5391/6057)   
Checking out files:  90% (5452/6057)   
Checking out files:  91% (5512/6057)   
Checking out files:  92% (5573/6057)   
Checking out files:  93% (5634/6057)   
Checking out files:  94% (5694/6057)   
Checking out files:  95% (5755/6057)   
Checking out files:  96% (5815/6057)   
Checking out files:  97% (5876/6057)   
Checking out files:  98% (5936/6057)   
Checking out files:  99% (5997/6057)   
Checking out files: 100% (6057/6057)   
Checking out files: 100% (6057/6057), done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 
'/var/tmp/patchew-tester-tmp-ruregu41/src/docker-src.2018-03-26-10.19.23.32742/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered 
for path 'ui/keycodemapdb'
Cloning into 
'/var/tmp/patchew-tester-tmp-ruregu41/src/docker-src.2018-03-26-10.19.23.32742/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out 
'6b3d716e2b6472eb7189d3220552280ef3d832ce'
  COPYRUNNER
RUN test-quick in qemu:centos6 
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
bison-2.4.1-5.el6.x86_64
bzip2-devel-1.0.5-7.el6_0.x86_64
ccache-3.1.6-2.el6.x86_64
csnappy-devel-0-6.20150729gitd7bc683.el6.x86_64
flex-2.5.35-9.el6.x86_64
gcc-4.4.7-18.el6.x86_64
gettext-0.17-18.el6.x86_64
git-1.7.1-9.el6_9.x86_64
glib2-devel-2.28.8-9.el6.x86_64
libepoxy-devel-1.2-3.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
librdmacm-devel-1.0.21-0.el6.x86_64
lzo-devel-2.03-3.1.el6_5.1.x86_64
make-3.81-23.el6.x86_64
mesa-libEGL-devel-11.0.7-4.el6.x86_64
mesa-libgbm-devel-11.0.7-4.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
spice-glib-devel-0.26-8.el6.x86_64
spice-server-devel-0.12.4-16.el6.x86_64
tar-1.23-15.el6_8.x86_64
vte-devel-0.25.1-9.el6.x86_64
xen-devel-4.6.6-2.el6.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=bison

Re: [Qemu-devel] [PATCH v1 05/14] softfloat: add float32_is_normal and float64_is_normal

2018-03-27 Thread Emilio G. Cota

On Tue, Mar 27, 2018 at 12:34:57 +0100, Alex Bennée wrote:
> 
> Emilio G. Cota  writes:
> 
> > This paves the way for upcoming work.
> >
> > Signed-off-by: Emilio G. Cota 
> 
> Reviewed-by: Alex Bennée 

(snip)
On Tue, Mar 27, 2018 at 12:35:07 +0100, Alex Bennée wrote:
> 
> Emilio G. Cota  writes:
> 
> > This paves the way for upcoming work.
> >
> > Signed-off-by: Emilio G. Cota 
> 
> Reviewed-by: Alex Bennée 

In v2 I merged these two into the same commit, so I've added your
R-b tag to the corresponding commit in v3.

Thanks,

Emilio

Re: [Qemu-devel] [PATCH v1 04/14] fp-test: add muladd variants

2018-03-27 Thread Emilio G. Cota

On Tue, Mar 27, 2018 at 12:33:55 +0100, Alex Bennée wrote:
> Emilio G. Cota  writes:
> 
> > These are a few muladd-related operations that the original IBM syntax
> > does not specify; model files for these are in muladd.fptest.
> >
> > Signed-off-by: Emilio G. Cota 
(snip)
> > +case OP_MULADD_NEG_ADDEND:
> > +res64 = float64_muladd(a, b, c, float_muladd_negate_c, s);
> > +break;
> > +case OP_MULADD_NEG_PRODUCT:
> > +res64 = float64_muladd(a, b, c, float_muladd_negate_product, 
> > s);
> > +break;
> > +case OP_MULADD_NEG_RESULT:
> > +res64 = float64_muladd(a, b, c, float_muladd_negate_result, s);
> > +break;
> >  case OP_DIV:
> >  res64 = float64_div(a, b, s);
> >  break;
> 
> Are there any intrinsics we could use for the hard variant which would
> be useful if we want to run under translation?

I don't know of any portable way of doing this. We could add some
arch-specific code though, suitably ifdef'ed.

E.

Re: [Qemu-devel] [PATCH 3/3] e1000: Old machine types, turn new subsection off

2018-03-27 Thread Dr. David Alan Gilbert

* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 27/03/2018 18:47, Dr. David Alan Gilbert wrote:
> >> So if the subsection is absent you
> >> have to migrate either tx.tso_props or tx.props, depending on s->tx.cptse.
> > Do you mean when sending you have to decide which set to send in the
> > non-subsection data?  And with cptse true that means use tso_props?
> 
> Yes.
> 
> >> Likewise if you migrate from older versions: if s->tx.props.tse &&
> >> s->tx.cptse, you have to copy s->tx.props to s->tx.tso_props and clear
> >> s->tx.props.
> > 
> > I don't see any equivalent code in the existing non-subsection postload to
> > do this; so I'm guessing there are some cases of 2.11->2.12 that will
> > break at the moment?
> 
> Yes, I think so.

OK, so we'd better fix that for 2.12.

> >> My understanding is that s->tx.tso_props.tse will be 1 if
> >> and only if the source sent s->tx.tso_props.
> > I don't see anything in the current code that migrates tso_props.tse -
> > where does it come from?
> 
> Ouch... The tse field is more or less dead in current code AFAICS, but
> it was used in the previous version.  What's the best way then to find
> if the subsection was transmitted?  Do we have anything like a post_load
> callback in the subsection itself?

Yes, if I just add a .post_load field in the subsection it should get
called.

> To find out which "props" to transmit to older QEMU, you can add a
> tp->use_tso_for_migration = tp->cptse just before "if (!(txd_lower &
> E1000_TXD_CMD_EOP))" in process_tx_desc...

OK.

> >> This seems most easily done with a new field (e.g. vmstate_fixed_props)
> >> that is written in pre_save and set in post_load.
> > It might need a VMSTATE_WITH_TMP to be able to do the saving part;
> > when saving we can't change the current state when migrating
> > to an old destination in case the migration fails and we just continue.
> 
> Perhaps you can just copy props/tso_props to a new field, and change all the
> 
> VMSTATE_UINT8(tx.props.ipcss, E1000State),
> VMSTATE_UINT8(tx.props.ipcso, E1000State),
> VMSTATE_UINT16(tx.props.ipcse, E1000State),
> 
> to
> 
> VMSTATE_UINT8(tx_legacy_vmstate_props.ipcss, E1000State),
>   ...
> 
> and then add tx.props to the subsection together with tso.props.
> 
> New->old migration will place tx_legacy_vmstate_props in tx.props on the
> destination; new->new will realize the subsection was transmitted and
> ignore the tx_legacy_vmstate_props; old->new will not find data from the
> subsection and copy the tx_legacy_vmstate_props into one of tx.props and
> tx.tso_props.

Yeh, adding a legacy_props field should do it; although we never need
to transmit more than 2 copies.  

I'll look at this more tomorrow; I am a bit worried about testing it
though.

Dave


> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v1 02/14] tests: add fp-test, a floating point test suite

2018-03-27 Thread Emilio G. Cota

On Tue, Mar 27, 2018 at 11:13:01 +0100, Alex Bennée wrote:
> 
> Emilio G. Cota  writes:
> 
> > This will allow us to run correctness tests against our
> > FP implementation. The test can be run in two modes (called
> > "testers"): host and soft. With the former we check the results
> > and FP flags on the host machine against the model.
> > With the latter we check QEMU's fpu primitives against the
> > model. Note that in soft mode we are not instantiating any
> > particular CPU (hence the HW_POISON_H hack to avoid macro poisoning);
> > for that we need to run the test in host mode under QEMU.
> >
> > The input files are taken from IBM's FPGen test suite:
> > https://www.research.ibm.com/haifa/projects/verification/fpgen/
> >
> > I see no license file in there so I am just downloading them
> > with wget. We might want to keep a copy on a qemu server though,
> > in case IBM takes those files down in the future.
> 
> Hmm the files themselves have:
> 
>   Copyright of IBM Corp. 2005
> 
> So I'm not sure we can take them into any of our source trees.

Yes I don't think committing these would be appropriate. I guess keeping
our own copy of the tarball somewhere would be OK, though. My worry is
that at some point IBM's server will die and we'll have no more test
files.

> However what are we testing here?

fp-test allows you to test against a model. This model can be anything;
for now, I grabbed model files from IBM's fpgen, which tests for
IEEE compliance.

We can add more model files; the muladd tests are an example of that.

> Do we just want to test our implementation is IEEE compliant or should
> we generate our own test cases on validated hardware to check that our
> emulation of a guest is correct (e.g. correct modulo the valid
> variations in the standard)?

I think what we want is a large core of tests that test the standard,
plus a smaller set of tests that cover ISA particularities. I think
the IBM models are a good starting point for the former -- note
though that some operations are not covered in the models (despite
the syntax description specifying an op for them, e.g. int-to-float
conversions.)

> > The "IBM" syntax of those files (for now the only syntax supported
> > in fp-test) is documented here:
> > https://www.research.ibm.com/haifa/projects/verification/fpgen/papers/ieee-test-suite-v2.pdf
> >
> > Note that the syntax document has some inaccuracies; the appended
> > parsing code works around some of those.
> >
> > The exception flag (-e) is important: many of the optimizations
> > included in the following commits assume that the inexact flag
> > is set, so "-e x" is necessary in order to test those code paths.
> >
> > The whitelist flag (-w) points to a file with test cases to be ignored.
> > I have put some whitelist files online, but we should have them
> > on a QEMU-related server.
> >
> > Thus, a typical of fp-test is as follows:
> >
> >   $ cd qemu/build/tests/fp-test
> >   $ make -j && \
> > ./fp-test -t soft ibm/*.fptest \
> > -w whitelist.txt \
> > -e x
> 
> So this is a unit test of our code rather than a test program running
> under QEMU?

Having the -t host/soft flags allows you flexibility in what to test.

With "host" mode, you're generating a binary that knows nothing
about QEMU, i.e. all its FP operations are native. You can use this
to (1) figure out whether your host diverts from the model [hopefully
it doesn't in anything substantial], and (2) test whether QEMU mimics
the corresponding host by running the binary under *-linux-user.

With "soft" mode, you're testing QEMU's soft-fp implementation. This
allows you to check it against the model. Note that it doesn't let
you check anything specific about a target CPU (hence the HW_POISON_H
hack); for this you'd have to go to point (2) above. Here instead we're
checking QEMU's FP implementation directly against the models.

>  I noted running under x86-64-linux-user fails pretty quick.

As I wrote below, I think this is due to bugs in the i386 target.

On a host x86_64 machine:
$ ./fp-test -t host ibm/* -w whitelist.txt -w whitelist-tininess-after.txt
All tests OK.
Tests passed: 74426. Not handled: 53297, whitelisted: 2748

$ ../../x86_64-linux-user/qemu-x86_64 ./fp-test -t host \
ibm/* -w whitelist.txt -w whitelist-tininess-after.txt -n 2>/dev/null
Tests failed: 57479. Parsing: 0, result:14, flags:57465
Tests passed: 16947. Not handled: 53297, whitelisted: 2748

The results are different when run under QEMU, which means
this target is not doing things correctly (despite -t soft
passing).

> If so we really should be building this automatically in make check.

Yes, passing -soft mode would certainly be valuable and trivial
to integrate since there is nothing built that is target-dependent.

(snip)
> > --- /dev/null
> > +++ b/tests/fp-test/fp-test.c
(snip)
> > +enum precision {
> > +PREC_FLOAT,
> > +PREC_DOUBLE,
> > +PREC_QUAD,
> > +PREC_FLOAT_TO_DOUBLE,
> > +};
> 
> Again we

Re: [Qemu-devel] [PATCH v1] RISC-V: RISC-V TCG backend work in progress

2018-03-27 Thread Michael Clark

On Tue, Mar 27, 2018 at 3:52 AM, Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 03/25/2018 05:24 AM, Michael Clark wrote:
> > Running with `-d in_asm,op,op_opt,out_asm` is very helpful
> > for debugging. Note: due to a limitation in QEMU, the backend
> > disassembler is not compiled, unless the backend matches
> > the front-end, so `scripts/disas-objdump.pl` is required
> > to decode the emmitted RISC-V assembly when using the x86_64
> > front-end.
>
> Certainly not.  The configure mistake, I think, is
>
> -  riscv)
> +  riscv*)
>  disas_config "RISCV"
>
> because for host $ARCH is going to be riscv64 not riscv.


Oh my mistake. Thanks for pointing this out. I'll fix this in v2.


> > +int cpu_signal_handler(int host_signum, void *pinfo,
> > +   void *puc)
> > +{
> > +siginfo_t *info = pinfo;
> > +ucontext_t *uc = puc;
> > +greg_t pc = uc->uc_mcontext.__gregs[REG_PC];
> > +int is_write = 0;
>
> You're going to have to fill this in for many guests to work.  A data
> write to
> the same page for which we have executed code will fire here.
>
> If your host kernel does not supply the proper info via ucontext_t or
> siginfo_t
> (highly recommended, assuming the hardware reports this as part of the
> fault),
> then you'll need to do something as brute force as reading from the host
> PC and
> disassembling to see if it was a host store insn.
>

Apparently we don't have this in our ucontext and changing it would require
an ABI change. It seems siginfo_t only contains sa_addr. We have space
reserved in ucontext. If we were to add it to our ucontext, we could use 0
for unknown. It seems we'll need to use the host PC and disassemble the
instruction.


> I believe you can see this with e.g. sparc from our
> linux-user-test-0.3.tgz on
> the qemu wiki.
>
> > +/* optional instructions */
> > +#define TCG_TARGET_HAS_goto_ptr 1
> > +#define TCG_TARGET_HAS_movcond_i32  0
>
> Future: Does your real hardware do what the arch manual describes and
> predicate
> a jump across a single register move instruction?  Either way, for output
> code
> density you may wish to implement
>
> movcond_i32  out,x,y,in,out,cc
> as
> bcc x, y, .+8
> mov out, in
>
> rather than allow the tcg middle-end to expand to a 5 insn sequence.  See
> e.g.
> i386, ppc, s390 where we do exactly this when the hardware does not
> support a
> real conditional move insn.


Okay I'll implement movcond as a bcc +8 and mv.

> +if ((ct & TCG_CT_CONST_N12) && val >= -2047 && val <= 2047) {
>
> +2048?


We use this constraint for a negatable immediate and the constraint is only
applied to sub. We have no subi, so we implement subi as addi rd, rs1, -imm

case INDEX_op_sub_i32:
if (c2) {
tcg_out_opc_imm(s, is32bit ? OPC_ADDI : OPC_ADDIW, a0, a1, -a2);
} else {
tcg_out_opc_reg(s, is32bit ? OPC_SUB : OPC_SUBW, a0, a1, a2);
}
break;



> > +/* Type-S */
> > +
> > +static int32_t encode_simm12(uint32_t imm)
> > +{
> > +return ((imm << 20) >> 25) << 25 | ((imm << 27) >> 27) << 7;
>
> Probably more legible as
>
>   extract32(imm, 0, 5) << 7 | extract32(imm, 5, 7) << 25


I can change these to extract32.

I actually wrote code to generate these from instruction set metadata so
that I could avoid manual transcription errors


> > +/* Type-SB */
> > +
> > +static int32_t encode_sbimm12(uint32_t imm)
> > +{
> > +return ((imm << 19) >> 31) << 31 | ((imm << 21) >> 26) << 25 |
> > +   ((imm << 27) >> 28) << 8 | ((imm << 20) >> 31) << 7;
> > +}
>
> Similarly.
>
> > +static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
> > + tcg_target_long val)
> > +{
> > +tcg_target_long lo = sextract64(val, 0, 12);
> > +tcg_target_long hi = val - lo;
> > +
> > +RISCVInsn add32_op = TCG_TARGET_REG_BITS == 64 ? OPC_ADDIW :
> OPC_ADDI;
> > +
> > +if (val == lo) {
> > +tcg_out_opc_imm(s, OPC_ADDI, rd, TCG_REG_ZERO, val);
> > +} else if (val && !(val & (val - 1))) {
> > +/* power of 2 */
> > +tcg_out_opc_imm(s, OPC_ADDI, rd, TCG_REG_ZERO, 1);
> > +tcg_out_opc_imm(s, OPC_SLLI, rd, rd, ctz64(val));
> > +} else if (TCG_TARGET_REG_BITS == 64 &&
> > +   !(val >> 31 == 0 || val >> 31 == -1)) {
> > +int shift = 12 + ctz64(hi >> 12);
> > +hi >>= shift;
> > +tcg_out_movi(s, type, rd, hi);
> > +tcg_out_opc_imm(s, OPC_SLLI, rd, rd, shift);
> > +if (lo != 0) {
> > +tcg_out_opc_imm(s, OPC_ADDI, rd, rd, lo);
> > +}
>
> Future: The other special case that happens frequently is loading of a
> 64-bit
> host address.  E.g. for exit_tb after goto_tb, the address of the TB
> itself.
> You will want to test to see if auipc+addi can load the value before
> falling
> back to the full 64-bit constant load.
>

Good idea. I'll implement auipc+addi


> Future: I'll note that your

[Qemu-devel] [Bug 1759337] [NEW] 'Failed to get "write" lock' error when trying to run a VM with disk image file on an SMB share

2018-03-27 Thread Adam Williamson

Public bug reported:

This has been reported and discussed downstream:

https://bugzilla.redhat.com/show_bug.cgi?id=1484130

but doesn't seem to be getting a lot of traction there.

Basically, with qemu since at least 2.10, you cannot use a disk image on
an SMB share that's mounted with protocol version 3 (I think possibly 2
or higher). This is made much more serious because kernel 4.13 upstream
made version 3 the *default* for SMB mounts, because version 1 is
insecure and should not be used.

So basically, anyone with a recent qemu and kernel cannot use disk
images stored on an SMB share. This is a major inconvenience for me
because, well, an SMB share is exactly where I store my VM disk images,
usually: I have a big NAS drive where I keep them all, only now I can't
because of this bug, and I'm manually swapping them in and out of the
very limited space I have on my system drive (SSD).

The error you get is:

qemu-system-x86_64: -drive 
file=/share/data/isos/vms/desktop_test_1.qcow2,format=qcow2,if=none,id=drive-virtio-disk0:
 Failed to get "write" lock
Is another process using the image?

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1759337

Title:
  'Failed to get "write" lock' error when trying to run a VM with disk
  image file on an SMB share

Status in QEMU:
  New

Bug description:
  This has been reported and discussed downstream:

  https://bugzilla.redhat.com/show_bug.cgi?id=1484130

  but doesn't seem to be getting a lot of traction there.

  Basically, with qemu since at least 2.10, you cannot use a disk image
  on an SMB share that's mounted with protocol version 3 (I think
  possibly 2 or higher). This is made much more serious because kernel
  4.13 upstream made version 3 the *default* for SMB mounts, because
  version 1 is insecure and should not be used.

  So basically, anyone with a recent qemu and kernel cannot use disk
  images stored on an SMB share. This is a major inconvenience for me
  because, well, an SMB share is exactly where I store my VM disk
  images, usually: I have a big NAS drive where I keep them all, only
  now I can't because of this bug, and I'm manually swapping them in and
  out of the very limited space I have on my system drive (SSD).

  The error you get is:

  qemu-system-x86_64: -drive 
file=/share/data/isos/vms/desktop_test_1.qcow2,format=qcow2,if=none,id=drive-virtio-disk0:
 Failed to get "write" lock
  Is another process using the image?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1759337/+subscriptions

[Qemu-devel] [Bug 1759338] [NEW] qemu-system-sparc w/ SS-20 ROM does not add processors

2018-03-27 Thread m...@papersolve.com

Public bug reported:

When booting a SPARCstation-20 with the original ROM, qemu does not set
the number of processors in a way that this ROM can understand it, and
the ROM always reports only 1 processor installed:


 ~/qemu  /usr/local/bin/qemu-system-sparc -bios ./ss20_v2.25_rom -M SS-20 -cpu 
"TI SuperSparc 60" -smp 2 -nographic

Power-ON Reset


   SMCC SPARCstation 10/20 UP/MP POST version VRV3.45 (09/11/95)


CPU_#0   TI, TMS390Z50(3.x)   0Mb External cache

CPU_#1   *** NOT installed ***
CPU_#2   *** NOT installed ***
CPU_#3   *** NOT installed ***

<<< CPU_ on MBus Slot_ >>> IS RUNNING (MID =
0008)


...

Cpu #0 TI,TMS390Z50 
Cpu #1 Nothing there 
Cpu #2 Nothing there 
Cpu #3 Nothing there 

...

SPARCstation 20 (1 X 390Z50), No Keyboard
ROM Rev. 2.25, 128 MB memory installed, Serial #1193046.
Ethernet address 52:54:0:12:34:56, Host ID: 72123456.


(It is necessary use SS-20 since it is the only sun4m model that
supports 512MB RAM, and I can't get Solaris to install on the SS-20
using OpenBIOS.)

When booting with OpenBIOS I can't seem to boot any version of Solaris
though I had heard this did work.  Solaris 8 and 9 do work nicely with
this ROM, but I am opening this to see if it is possible to fix this to
allow the original OBP ROM to see multiple processors.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1759338

Title:
  qemu-system-sparc w/ SS-20 ROM does not add processors

Status in QEMU:
  New

Bug description:
  When booting a SPARCstation-20 with the original ROM, qemu does not
  set the number of processors in a way that this ROM can understand it,
  and the ROM always reports only 1 processor installed:

  
   ~/qemu  /usr/local/bin/qemu-system-sparc -bios ./ss20_v2.25_rom -M SS-20 
-cpu "TI SuperSparc 60" -smp 2 -nographic

  Power-ON Reset


  
 SMCC SPARCstation 10/20 UP/MP POST version VRV3.45 (09/11/95)

  
  CPU_#0   TI, TMS390Z50(3.x)   0Mb External cache

  CPU_#1   *** NOT installed ***
  CPU_#2   *** NOT installed ***
  CPU_#3   *** NOT installed ***

  <<< CPU_ on MBus Slot_ >>> IS RUNNING (MID =
  0008)

  
  ...

  Cpu #0 TI,TMS390Z50 
  Cpu #1 Nothing there 
  Cpu #2 Nothing there 
  Cpu #3 Nothing there 

  ...

  SPARCstation 20 (1 X 390Z50), No Keyboard
  ROM Rev. 2.25, 128 MB memory installed, Serial #1193046.
  Ethernet address 52:54:0:12:34:56, Host ID: 72123456.


  
  (It is necessary use SS-20 since it is the only sun4m model that supports 
512MB RAM, and I can't get Solaris to install on the SS-20 using OpenBIOS.) 

  When booting with OpenBIOS I can't seem to boot any version of Solaris
  though I had heard this did work.  Solaris 8 and 9 do work nicely with
  this ROM, but I am opening this to see if it is possible to fix this
  to allow the original OBP ROM to see multiple processors.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1759338/+subscriptions

Re: [Qemu-devel] [PATCH 3/3] e1000: Old machine types, turn new subsection off

2018-03-27 Thread Paolo Bonzini

On 27/03/2018 18:47, Dr. David Alan Gilbert wrote:
>> So if the subsection is absent you
>> have to migrate either tx.tso_props or tx.props, depending on s->tx.cptse.
> Do you mean when sending you have to decide which set to send in the
> non-subsection data?  And with cptse true that means use tso_props?

Yes.

>> Likewise if you migrate from older versions: if s->tx.props.tse &&
>> s->tx.cptse, you have to copy s->tx.props to s->tx.tso_props and clear
>> s->tx.props.
> 
> I don't see any equivalent code in the existing non-subsection postload to
> do this; so I'm guessing there are some cases of 2.11->2.12 that will
> break at the moment?

Yes, I think so.

>> My understanding is that s->tx.tso_props.tse will be 1 if
>> and only if the source sent s->tx.tso_props.
> I don't see anything in the current code that migrates tso_props.tse -
> where does it come from?

Ouch... The tse field is more or less dead in current code AFAICS, but
it was used in the previous version.  What's the best way then to find
if the subsection was transmitted?  Do we have anything like a post_load
callback in the subsection itself?

To find out which "props" to transmit to older QEMU, you can add a
tp->use_tso_for_migration = tp->cptse just before "if (!(txd_lower &
E1000_TXD_CMD_EOP))" in process_tx_desc...

>> This seems most easily done with a new field (e.g. vmstate_fixed_props)
>> that is written in pre_save and set in post_load.
> It might need a VMSTATE_WITH_TMP to be able to do the saving part;
> when saving we can't change the current state when migrating
> to an old destination in case the migration fails and we just continue.

Perhaps you can just copy props/tso_props to a new field, and change all the

VMSTATE_UINT8(tx.props.ipcss, E1000State),
VMSTATE_UINT8(tx.props.ipcso, E1000State),
VMSTATE_UINT16(tx.props.ipcse, E1000State),

to

VMSTATE_UINT8(tx_legacy_vmstate_props.ipcss, E1000State),
...

and then add tx.props to the subsection together with tso.props.

New->old migration will place tx_legacy_vmstate_props in tx.props on the
destination; new->new will realize the subsection was transmitted and
ignore the tx_legacy_vmstate_props; old->new will not find data from the
subsection and copy the tx_legacy_vmstate_props into one of tx.props and
tx.tso_props.

Paolo

Re: [Qemu-devel] [PATCH v2 1/1] hw/scsi: support SCSI-2 passthrough without PI

2018-03-27 Thread Fam Zheng

On Tue, 03/13 13:43, Daniel Henrique Barboza wrote:
> QEMU SCSI code makes assumptions about how the PROTECT and BYTCHK
> works in the protocol, denying support for PI (Protection
> Information) in case the guest OS requests it. However, in SCSI versions 2
> and older, there is no PI concept in the protocol.
> 
> This means that when dealing with such devices:
> 
> - there is no PROTECT bit in byte 5 of the standard INQUIRY response. The
> whole byte is marked as "Reserved";
> 
> - there is no RDPROTECT in byte 2 of READ. We have 'Logical Unit Number'
> in this field instead;
> 
> - there is no VRPROTECT in byte 2 of VERIFY. We have 'Logical Unit Number'
> in this field instead. This also means that the BYTCHK bit in this case
> is not related to PI.
> 
> Since QEMU does not consider these changes, a SCSI passthrough using
> a SCSI-2 device will not work. It will mistake these fields with
> PI information and return Illegal Request SCSI SENSE thinking
> that the driver is asking for PI support.
> 
> This patch fixes it by adding a new attribute called 'scsi_version'
> that is read from the standard INQUIRY response of passthrough
> devices. This allows for a version verification before applying
> conditions related to PI that doesn't apply for older versions.
> 
> Reported-by: Dac Nguyen 
> Signed-off-by: Daniel Henrique Barboza 
> ---
> 
> Changes in v2:
> - removed "scsi_version" as a property
> - scsi_version is now initialized with -1 in scsi_realize (that is
> used by scsi_hd_realize, scsi_cd_realize, scsi_disk_realize and
> scsi_block_realize) and scsi_generic_realize
> 
> 
>  hw/scsi/scsi-disk.c| 14 +++---
>  hw/scsi/scsi-generic.c | 42 +++---
>  include/hw/scsi/scsi.h |  1 +
>  3 files changed, 43 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 49d2559d93..80b1eb92ae 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -2176,7 +2176,7 @@ static int32_t scsi_disk_dma_command(SCSIRequest *req, 
> uint8_t *buf)
>  case READ_12:
>  case READ_16:
>  DPRINTF("Read (sector %" PRId64 ", count %u)\n", r->req.cmd.lba, 
> len);
> -if (r->req.cmd.buf[1] & 0xe0) {
> +if ((r->req.cmd.buf[1] & 0xe0) && (s->qdev.scsi_version > 2)) {
>  goto illegal_request;
>  }
>  if (!check_lba_range(s, r->req.cmd.lba, len)) {
> @@ -2206,8 +2206,12 @@ static int32_t scsi_disk_dma_command(SCSIRequest *req, 
> uint8_t *buf)
>  /* We get here only for BYTCHK == 0x01 and only for scsi-block.
>   * As far as DMA is concerned, we can treat it the same as a write;
>   * scsi_block_do_sgio will send VERIFY commands.
> + *
> + * For scsi versions 2 and older, the BYTCHK isn't related
> + * to VRPROTECT (in fact, there is no VRPROTECT). Skip
> + * this check in these versions.
>   */
> -if (r->req.cmd.buf[1] & 0xe0) {
> +if ((r->req.cmd.buf[1] & 0xe0) && (s->qdev.scsi_version > 2)) {
>  goto illegal_request;
>  }
>  if (!check_lba_range(s, r->req.cmd.lba, len)) {
> @@ -2383,6 +2387,8 @@ static void scsi_realize(SCSIDevice *dev, Error **errp)
>  return;
>  }
>  
> +dev->scsi_version = -1;
> +
>  if ((s->features & (1 << SCSI_DISK_F_REMOVABLE)) &&
>  !(s->features & (1 << SCSI_DISK_F_NO_REMOVABLE_DEVOPS))) {
>  blk_set_dev_ops(s->qdev.conf.blk, _disk_removable_block_ops, s);
> @@ -2796,6 +2802,8 @@ static bool scsi_block_is_passthrough(SCSIDiskState *s, 
> uint8_t *buf)
>  static int32_t scsi_block_dma_command(SCSIRequest *req, uint8_t *buf)
>  {
>  SCSIBlockReq *r = (SCSIBlockReq *)req;
> +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev);
> +
>  r->cmd = req->cmd.buf[0];
>  switch (r->cmd >> 5) {
>  case 0:
> @@ -2821,7 +2829,7 @@ static int32_t scsi_block_dma_command(SCSIRequest *req, 
> uint8_t *buf)
>  abort();
>  }
>  
> -if (r->cdb1 & 0xe0) {
> +if ((r->cdb1 & 0xe0) && (s->qdev.scsi_version > 2)) {
>  /* Protection information is not supported.  */
>  scsi_check_condition(>req, SENSE_CODE(INVALID_FIELD));
>  return 0;
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 7414fe2d67..5cc5598983 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -194,17 +194,35 @@ static void scsi_read_complete(void * opaque, int ret)
>  r->buf[3] |= 0x80;
>  }
>  }
> -if (s->type == TYPE_DISK &&
> -r->req.cmd.buf[0] == INQUIRY &&
> -r->req.cmd.buf[2] == 0xb0) {
> -uint32_t max_transfer =
> -blk_get_max_transfer(s->conf.blk) / s->blocksize;
> -
> -assert(max_transfer);
> -stl_be_p(>buf[8], max_transfer);
> -/* Also take care of the opt xfer len. */
> -stl_be_p(>buf[12],
> -

Re: [Qemu-devel] [PATCH v1 01/14] tests: add fp-bench, a collection of simple floating-point microbenchmarks

2018-03-27 Thread Emilio G. Cota

On Tue, Mar 27, 2018 at 09:45:14 +0100, Alex Bennée wrote:
> Emilio G. Cota  writes:
(snip)
> > +/*
> > + * Disable optimizations (e.g. "a OP b" outside of the inner loop) with
> > + * volatile.
> > + */
> > +#define GEN_BENCH_1OPF(NAME, FUNC, PRECISION)   \
> > +static void NAME(volatile PRECISION *res)   \
> > +{   \
> > +uint64_t ra = SEED_A;   \
> > +uint64_t i, j;  \
> > +\
> > +for (i = 0; i < n_ops; i += OPS_PER_ITER) { \
> > +volatile PRECISION a = glue(get_random_, PRECISION)();   \
> > +\
> > +for (j = 0; j < OPS_PER_ITER; j++) {\
> > +*res = FUNC(a); \
> > +}   \
> > +}   \
> > +}
> > +
> 
> Have you had a chance to look at if this will vectorise? I have a
> similar benchmark which I compile with multiple options to test normal,
> NEON/AdvSIMD and SVE enabled loops.

It does not. I'm pretty sure the volatile there prevents the compiler
from doing anything smart. In this case I don't want the compiler
to vectorise though, but I can see how that would be a nice
benchmark to have in addition to the above.

> > +case 'p':
> > +precision = optarg;
> > +if (strcmp(precision, "float") &&
> > +strcmp(precision, "single") &&
> > +strcmp(precision, "double")) {
> > +fprintf(stderr, "Unsupported precision '%s'\n", precision);
> > +exit(EXIT_FAILURE);
> 
> Supporting half-precision if the compiler does would also be useful here.

I wasn't speeding those up so didn't care to test them. But yes I can see how
that could be useful for arm/aarch64; we can add it later.

> > diff --git a/tests/Makefile.include b/tests/Makefile.include
> > index ef9b88c..f6121ee 100644
> > --- a/tests/Makefile.include
> > +++ b/tests/Makefile.include
> > @@ -587,7 +587,7 @@ test-obj-y = tests/check-qnum.o tests/check-qstring.o 
> > tests/check-qdict.o \
> > tests/rcutorture.o tests/test-rcu-list.o \
> > tests/test-qdist.o tests/test-shift128.o \
> > tests/test-qht.o tests/qht-bench.o tests/test-qht-par.o \
> > -   tests/atomic_add-bench.o
> > +   tests/atomic_add-bench.o tests/fp-bench.o
> 
> Not sure why but "make check" didn't build this. I had to explicitly
> "make tests/fp-bench". I guess along with atomic_add_bench though these
> are explicitly guest facing tests so maybe we should move them once
> tests/tcg is working again. I'll have another run at that this week.

That was intentional; these are benchmarks rather than tests so I
wouldn't expect make check to build them or run them at all. So that was 

> >  $(test-obj-y): QEMU_INCLUDES += -Itests
> >  QEMU_CFLAGS += -I$(SRC_PATH)/tests
> > @@ -639,6 +639,7 @@ tests/test-qht-par$(EXESUF): tests/test-qht-par.o 
> > tests/qht-bench$(EXESUF) $(tes
> >  tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y)
> >  tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o 
> > $(test-util-obj-y)
> >  tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o 
> > $(test-util-obj-y)
> > +tests/fp-bench$(EXESUF): tests/fp-bench.o $(test-util-obj-y)
> >
> >  tests/test-qdev-global-props$(EXESUF): tests/test-qdev-global-props.o \
> > hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\
> 
> Anyway for this version:
> 
> Reviewed-by: Alex Bennée 

Thanks! I'll keep this for v3 (I sent v2 yesterday), since not
much changed.

If I had more time to work on this I'd like to have a -t soft/host flag
like in fp-test. Right now there is no such flag so we default to "host";
IOW, we end up testing the performance of the whole sausage, i.e. guest
compiler + QEMU. This is useful because it represents real-life
scenarios. However, if we tested the functions in fpu/ directly,
we'd get benchmarking that (1) would be more sensitive to the functions
we want to benchmark, and (2) would not depend on the particular
implementation of the QEMU target (e.g. i386 does not emit fma
at all!).

Thanks,

Emilio

[Qemu-devel] [Bug 1759333] [NEW] Illegal Instruction with HVF when encountering SSE instructions in the emulator

2018-03-27 Thread Fa Bi

Public bug reported:

The latest version of QEMU doesn't seem to support emulated SSE instructions 
with HVF acceleration on macOS.
The decoder will treat SSE instructions as invalid, get the instruction sizes 
wrong and quickly crash the guest OS because of illegal instructions.
After having a quick look at target/i386/hvf/x86_decode.c, it seems that SSE 
instruction emulation isn't implemented in the current version of the x86 
emulator.

A way to reproduce the issue is to run a macOS 10.13 guest with HVF
acceleration enabled, this will crash in the guest once it's loading up
the GUI (and also print a "Unimplemented handler" warning from
target/i386/hvf/x86_emu.c).

** Affects: qemu
 Importance: Undecided
 Status: New

** Summary changed:

- Illegal Instruction with HVF when encountering SSE instructions in Privileged 
Mode
+ Illegal Instruction with HVF when encountering SSE instructions in the 
emulator

** Description changed:

  The latest version of QEMU doesn't seem to support emulated SSE instructions 
with HVF acceleration on macOS.
  The decoder will treat SSE instructions as invalid, get the instruction sizes 
wrong and quickly crash the guest OS because of illegal instructions.
  After having a quick look at target/i386/hvf/x86_decode.c, it seems that SSE 
instruction emulation isn't implemented in the current version of the x86 
emulator.
  
  A way to reproduce the issue is to run a macOS 10.13 guest with HVF
- acceleration enabled, this will crash once it's loading up the GUI.
+ acceleration enabled, this will crash in the guest once it's loading up
+ the GUI (and also print a "Unimplemented handler" warning from
+ target/i386/hvf/x86_emu.c).

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1759333

Title:
  Illegal Instruction with HVF when encountering SSE instructions in the
  emulator

Status in QEMU:
  New

Bug description:
  The latest version of QEMU doesn't seem to support emulated SSE instructions 
with HVF acceleration on macOS.
  The decoder will treat SSE instructions as invalid, get the instruction sizes 
wrong and quickly crash the guest OS because of illegal instructions.
  After having a quick look at target/i386/hvf/x86_decode.c, it seems that SSE 
instruction emulation isn't implemented in the current version of the x86 
emulator.

  A way to reproduce the issue is to run a macOS 10.13 guest with HVF
  acceleration enabled, this will crash in the guest once it's loading
  up the GUI (and also print a "Unimplemented handler" warning from
  target/i386/hvf/x86_emu.c).

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1759333/+subscriptions

[Qemu-devel] [PATCH] softfloat: rename canonicalize to sf_canonicalize

2018-03-27 Thread Emilio G. Cota

On Tue, Mar 27, 2018 at 12:06:58 +0200, Bastian Koppelmann wrote:
> On 03/27/2018 11:56 AM, Bastian Koppelmann wrote:
> > When I try to build your tree with gcc version 7.3.1 20180312 (GCC), I get:
> > 
> >   CC  tricore-softmmu/fpu/softfloat.o
> > /home/kbastian/coding/upstream-qemu/qemu-fpu/fpu/softfloat.c:417:19:
> > error: conflicting types for ‘canonicalize’
> >  static FloatParts canonicalize(FloatParts part, const FloatFmt *parm,
> >^~~~
> > In file included from /usr/include/features.h:423:0,
> >  from /usr/include/bits/libc-header-start.h:33,
> >  from /usr/include/math.h:27,
> >  from
> > /home/kbastian/coding/upstream-qemu/qemu-fpu/fpu/softfloat.c:85:
> > /usr/include/bits/mathcalls.h:385:1: note: previous declaration of
> > ‘canonicalize’ was here
> >  __MATHDECL_1 (int, canonicalize,, (_Mdouble_ *__cx, const _Mdouble_ *__x));
> > 
> 
> git bisect points to fb1514a0104df6740e4a60c1b08b5daf173f9737
> 
> Author: Emilio G. Cota 
> Date:   Sat Mar 17 02:13:59 2018 -0400
> 
> fpu: introduce hardfloat

Thanks for pointing this out.

Turns our glibc >= 2.25 defines canonicalize(), so when including math.h
we clash with theirs.

> commit eaf5ad0bc4a67bf40999e22db6f583ebc3a806ba
> Author: Joseph Myers 
> Date:   Wed Oct 26 23:14:31 2016 +
> 
> Add canonicalize, canonicalizef, canonicalizel.
> 
> TS 18661-1 defines canonicalize functions to produce a canonical
> version of a floating-point representation.  This patch implements
> these functions for glibc.

In v3 I'll rename our canonicalize() to sf_canonicalize(); patch
appended (you can apply this message as a patch with git am --scissors).
The patch should go before the one that includes math.h, i.e.
"fpu: introduce hardfloat".

Thanks,

Emilio

--- 8< ---

glibc >= 2.25 defines canonicalize in commit eaf5ad0
(Add canonicalize, canonicalizef, canonicalizel., 2016-10-26).

Given that we'll be including libc's math.h soon, prepare
for this by prefixing our canonicalize() with sf_ to avoid
clashing with the libc's canonicalize().

Signed-off-by: Emilio G. Cota 
---
 fpu/softfloat.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index d0f1f65..901c507 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -414,8 +414,8 @@ static inline float64 float64_pack_raw(FloatParts p)
 }
 
 /* Canonicalize EXP and FRAC, setting CLS.  */
-static FloatParts canonicalize(FloatParts part, const FloatFmt *parm,
-   float_status *status)
+static FloatParts sf_canonicalize(FloatParts part, const FloatFmt *parm,
+  float_status *status)
 {
 if (part.exp == parm->exp_max) {
 if (part.frac == 0) {
@@ -585,7 +585,7 @@ static FloatParts round_canonical(FloatParts p, 
float_status *s,
 
 static FloatParts float16_unpack_canonical(float16 f, float_status *s)
 {
-return canonicalize(float16_unpack_raw(f), _params, s);
+return sf_canonicalize(float16_unpack_raw(f), _params, s);
 }
 
 static float16 float16_round_pack_canonical(FloatParts p, float_status *s)
@@ -603,7 +603,7 @@ static float16 float16_round_pack_canonical(FloatParts p, 
float_status *s)
 
 static FloatParts float32_unpack_canonical(float32 f, float_status *s)
 {
-return canonicalize(float32_unpack_raw(f), _params, s);
+return sf_canonicalize(float32_unpack_raw(f), _params, s);
 }
 
 static float32 float32_round_pack_canonical(FloatParts p, float_status *s)
@@ -621,7 +621,7 @@ static float32 float32_round_pack_canonical(FloatParts p, 
float_status *s)
 
 static FloatParts float64_unpack_canonical(float64 f, float_status *s)
 {
-return canonicalize(float64_unpack_raw(f), _params, s);
+return sf_canonicalize(float64_unpack_raw(f), _params, s);
 }
 
 static float64 float64_round_pack_canonical(FloatParts p, float_status *s)
-- 
2.7.4

Re: [Qemu-devel] [PATCH v2] scsi-disk: Don't enlarge min_io_size to max_io_size

2018-03-27 Thread Daniel Henrique Barboza




On 03/27/2018 01:41 PM, Fam Zheng wrote:

Some backends report big max_io_sectors. Making min_io_size the same
value in this case will make it impossible for guest to align memory,
therefore the disk may not be usable at all.

Do not enlarge them when they are zero.

Reported-by: David Gibson 
Signed-off-by: Fam Zheng 

---

v2: Leave the values alone if zero. [Paolo]
 At least we can consult block layer for a slightly more sensible
 opt_io_size, but that's for another patch.
---
  hw/scsi/scsi-disk.c | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index f5ab767ab5..f8ed8cf2b4 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -714,10 +714,12 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)

  /* min_io_size and opt_io_size can't be greater than
   * max_io_sectors */
-min_io_size =
-MIN_NON_ZERO(min_io_size, max_io_sectors);
-opt_io_size =
-MIN_NON_ZERO(opt_io_size, max_io_sectors);
+if (min_io_size) {
+min_io_size = MIN(min_io_size, max_io_sectors);
+}
+if (opt_io_size) {
+opt_io_size = MIN(opt_io_size, max_io_sectors);
+}
  }
  /* required VPD size with unmap support */
  buflen = 0x40;


Reviewed-by: Daniel Henrique Barboza

1 2 3 >

1 - 100 of 244 matches

Mail list logo