Re: [Qemu-devel] [PATCH v5 05/46] hw: Use IEC binary prefix definitions from "qemu/units.h"

2018-06-26 Thread Richard Henderson
On 06/25/2018 05:41 AM, Philippe Mathieu-Daudé wrote:
> Code change produced with:
> 
>   $ git ls-files | egrep '\.[ch]$' | \
> xargs sed -i -e 's/\(\W[KMGTPE]\)_BYTE/\1iB/g'
> 
> Suggested-by: Stefan Weil 
> Signed-off-by: Philippe Mathieu-Daudé 
> Acked-by: David Gibson  (ppc parts)
> ---
>  include/qemu/cutils.h  |  8 +---
>  hw/arm/msf2-soc.c  |  4 ++--
>  hw/arm/msf2-som.c  |  6 +++---
>  hw/core/loader-fit.c   |  3 ++-
>  hw/core/machine.c  |  2 +-
>  hw/display/sm501.c | 14 +++---
>  hw/hppa/machine.c  |  2 +-
>  hw/mips/boston.c   | 28 ++--
>  hw/ppc/pnv.c   |  4 ++--
>  hw/ppc/ppc440_uc.c | 26 +-
>  hw/ppc/prep.c  |  2 +-
>  hw/ppc/sam460ex.c  |  2 +-
>  hw/ppc/spapr.c | 10 +-
>  hw/ppc/spapr_rtas.c|  2 +-
>  hw/sd/sd.c |  4 ++--
>  hw/sd/sdhci.c  |  2 +-
>  tests/test-cutils.c| 18 +-
>  tests/test-keyval.c|  6 +++---
>  tests/test-qemu-opts.c |  7 +++
>  19 files changed, 72 insertions(+), 78 deletions(-)

Reviewed-by: Richard Henderson 


r~
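
For readers without the tree at hand: the sed expression above turns identifiers such as `M_BYTE` into `MiB`. A minimal sketch of what such IEC binary prefix definitions could look like follows. The macro bodies and the `ram_bytes_from_mib` helper are assumptions for illustration and are not copied from the patch.

```c
#include <stdint.h>

/* Assumed shape of the IEC binary prefix macros (powers of 1024). */
#define KiB (INT64_C(1) << 10)
#define MiB (INT64_C(1) << 20)
#define GiB (INT64_C(1) << 30)
#define TiB (INT64_C(1) << 40)
#define PiB (INT64_C(1) << 50)
#define EiB (INT64_C(1) << 60)

/* Hypothetical helper: convert a size given in MiB to bytes. */
static inline int64_t ram_bytes_from_mib(int64_t mib)
{
    return mib * MiB;
}
```

Keeping the constants 64-bit means an expression like `4 * GiB` does not overflow a 32-bit int.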




Re: [Qemu-devel] [RFC PATCH 2/2] iotests: add 222 to test basic fleecing

2018-06-26 Thread John Snow



On 06/26/2018 09:49 PM, Eric Blake wrote:
> On 06/26/2018 05:22 PM, John Snow wrote:
>> Signed-off-by: John Snow 
>> ---
>>   tests/qemu-iotests/222   | 121

Probably would have helped to include 222.out, though.

>> +++
>>   tests/qemu-iotests/group |   1 +
>>   2 files changed, 122 insertions(+)
>>   create mode 100644 tests/qemu-iotests/222
>>
>> diff --git a/tests/qemu-iotests/222 b/tests/qemu-iotests/222
>> new file mode 100644
>> index 00..133d10c351
>> --- /dev/null
>> +++ b/tests/qemu-iotests/222
>> @@ -0,0 +1,121 @@
>> +#!/usr/bin/env python
>> +#
>> +# This test covers the basic fleecing workflow.
>> +#
>> +# Copyright (C) 2018 Red Hat, Inc.
>> +# John helped, too.
> 
> LOL.
> 
>> +
>> +patterns = [("0x5d", "0", "64k"),
>> +    ("0xd5", "1M", "64k"),
>> +    ("0xdc", "32M", "64k"),
>> +    ("0xcd", "67043328", "64k")]  # 64M - 64K
>> +
>> +overwrite = [("0xab", "0",    "64k"), # Full overwrite
>> + ("0xad", "1015808",  "64k"), # Partial-left (1M-32K)
>> + ("0x1d", "33587200", "64k"), # Partial-right (32M+32K)
>> + ("0xea", "64M", "64k")]  # Adjacent-right (64M)
>> +
>> +with iotests.FilePath('base.img') as base_img_path, \
>> + iotests.FilePath('fleece.img') as fleece_img_path, \
>> + iotests.FilePath('nbd.sock') as nbd_sock_path, \
>> + iotests.VM() as vm:
> 
> Does python require \ even after ','?
> 

Dunno. Cargo cult from 216.

> The test looks valid - you are definitely reading data over NBD from the
> point in time that you started the blockdev-backup job, even while the
> source image continues to be modified.
> 
>> +    for p in overwrite:
>> +    cmd = "write -P%s %s %s" % p
>> +    log(cmd)
>> +    log(vm.hmp_qemu_io(srcNode, cmd))
>> +
>> +    log('')
>> +    log('--- Verifying Data ---')
>> +    log('')
>> +
>> +    for p in patterns:
>> +    cmd = "read -P%s %s %s" % p
>> +    log(cmd)
>> +    assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0
> 
> Perhaps additional steps would be to then stop the NBD export, stop the
> block job, delete the tgtNode fleecing file, then stop qemu, and finally
> check that the overwritten patterns correctly show up in the source
> image (that is, also prove that we can tear down a job, and that the
> overwrites worked).  And we may want to enhance this test (or use it as
> a starting point to copy into a new test) to play with persistent dirty
> bitmaps thrown into the mix as well.  But what you have is already a
> great start to prevent regressions, so:
> 

Good suggestions. I'm working toward throwing bitmaps in now, but
actually cleaning up the VM properly and stopping the NBD server and
testing some of the latter-half paths would be nice. This was just a bit
of an RFC to get the bits out there sooner rather than later.

> Reviewed-by: Eric Blake 
> 

Too many changes I need to pepper in for v2, but thanks for the vote of
confidence :)
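
The byte offsets in the quoted pattern tables can be checked with a few lines of C. This is only a sketch: the `KiB`/`MiB` macros and the `ranges_overlap` helper are assumptions introduced here, not part of the test.

```c
#include <stdbool.h>
#include <stdint.h>

#define KiB (INT64_C(1) << 10)
#define MiB (INT64_C(1) << 20)

/* True if [a, a + a_len) and [b, b + b_len) share at least one byte. */
static bool ranges_overlap(int64_t a, int64_t a_len, int64_t b, int64_t b_len)
{
    return a < b + b_len && b < a + a_len;
}
```

With this, the "Partial-left" write at 1M-32K overlaps the 64k pattern at 1M, while the "Adjacent-right" write at 64M starts exactly where the last pattern (64M-64K) ends and so does not touch it.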



Re: [Qemu-devel] [PATCH v5 03/46] x86/cpu: Use definitions from "qemu/units.h"

2018-06-26 Thread Richard Henderson
On 06/25/2018 05:41 AM, Philippe Mathieu-Daudé wrote:
> Signed-off-by: Philippe Mathieu-Daudé 
> Acked-by: Eduardo Habkost 
> ---
>  target/i386/cpu.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH v5 01/46] include: Add IEC binary prefixes in "qemu/units.h"

2018-06-26 Thread Richard Henderson
On 06/25/2018 05:41 AM, Philippe Mathieu-Daudé wrote:
> Loosely based on 076b35b5a56.
> 
> Suggested-by: Stefan Weil 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/qemu/units.h | 20 
>  1 file changed, 20 insertions(+)
>  create mode 100644 include/qemu/units.h

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH v5 02/46] vdi: Use definitions from "qemu/units.h"

2018-06-26 Thread Richard Henderson
On 06/25/2018 05:41 AM, Philippe Mathieu-Daudé wrote:
> Signed-off-by: Philippe Mathieu-Daudé 
> Reviewed-by: Stefan Weil 
> ---
>  block/vdi.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH v3 5/5] tests/tcg/aarch64: userspace system register test

2018-06-26 Thread Richard Henderson
On 06/25/2018 09:00 AM, Alex Bennée wrote:
> This tests a bunch of registers that the kernel allows userspace to
> read including the CPUID registers.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/tcg/aarch64/Makefile.target |  2 +-
>  tests/tcg/aarch64/sysregs.c   | 99 +++
>  2 files changed, 100 insertions(+), 1 deletion(-)
>  create mode 100644 tests/tcg/aarch64/sysregs.c

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH v3 4/5] linux-user/elfload: enable HWCAP_CPUID for AArch64

2018-06-26 Thread Richard Henderson
On 06/25/2018 09:00 AM, Alex Bennée wrote:
> Userspace programs should (in theory) query the ELF HWCAP before
> probing these registers. Now that we have implemented them all, make
> it public.
> 
> Signed-off-by: Alex Bennée 
> ---
>  linux-user/elfload.c | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] [PATCH v3 05/49] qapi: leave the ifcond attribute undefined until check()

2018-06-26 Thread Markus Armbruster
Marc-André Lureau  writes:

> Hi
>
> On Tue, Jun 19, 2018 at 11:06 AM, Markus Armbruster  wrote:
>> Marc-André Lureau  writes:
>>
>>> We commonly initialize attributes to None in .init(), then set their
>>> real value in .check().  Accessing the attribute before .check()
>>> yields None.  If we're lucky, the code that accesses the attribute
>>> prematurely chokes on None.
>>>
>>> It won't for .ifcond, because None is a legitimate value.
>>>
>>> Leave the ifcond attribute undefined until check().
>>
>> Drawback: pylint complains.  We'll live.
>>
>>>
>>> Suggested-by: Markus Armbruster 
>>> Signed-off-by: Marc-André Lureau 
>>> Reviewed-by: Markus Armbruster 
>>
>> Shouldn't this be squashed into the previous patch?
>
> I would rather keep it separate, as it makes reviewing both a bit
> easier for me. But feel free to squash on commit.

No need to decide right now.

>>
>>> ---
>>>  scripts/qapi/common.py | 21 +
>>>  1 file changed, 17 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/scripts/qapi/common.py b/scripts/qapi/common.py
>>> index d8ab3d8f7f..eb07d641ab 100644
>>> --- a/scripts/qapi/common.py
>>> +++ b/scripts/qapi/common.py
>>> @@ -1026,13 +1026,19 @@ class QAPISchemaEntity(object):
>>>  # such place).
>>>  self.info = info
>>>  self.doc = doc
>>> -self.ifcond = listify_cond(ifcond)
>>> +self._ifcond = ifcond  # self.ifcond is set only after .check()
>>>
>>>  def c_name(self):
>>>  return c_name(self.name)
>>>
>>>  def check(self, schema):
>>> -pass
>>> +if isinstance(self._ifcond, QAPISchemaType):
>>> +# inherit the condition from a type
>>> +typ = self._ifcond
>>> +typ.check(schema)
>>> +self.ifcond = typ.ifcond
>>> +else:
>>> +self.ifcond = listify_cond(self._ifcond)
>>
>> Whenever we add a .check(), we need to prove QAPISchema.check()'s
>> recursion still terminates, and terminates the right way.
>>
>> Argument before this patch: we recurse only into types contained in
>> types, e.g. an object type's base type, and we detect and report cycles
>> as "Object %s contains itself", in QAPISchemaObjectType.check().
>>
>> The .check() added here recurses into a type.  If this creates a cycle,
>> it'll be caught and reported as "contains itself".  We still need to
>> show that the error message remains accurate.
>>
>> We .check() here to inherit .ifcond from a type.  As far as I can tell,
>> we use this inheritance feature only to inherit an array's condition
>> from its element type.  That's safe, because an array does contain its
>> elements.
>>
>> This is hardly a rigorous proof.  Just enough to make me believe your
>> code works.
>>
>> However, I suspect adding the inheritance feature at the entity level
>> complicates the correctness argument without real need.  Can we restrict
>> it to array elements?  Have QAPISchemaArrayType.check() resolve
>> type-valued ._ifcond, and all the others choke on it?
>
> There are also implicit object types.

Can you give an example?

>>>
>>>  def is_implicit(self):
>>>  return not self.info
>>> @@ -1169,6 +1175,7 @@ class QAPISchemaEnumType(QAPISchemaType):
>>>  self.prefix = prefix
>>>
>>>  def check(self, schema):
>>> +QAPISchemaType.check(self, schema)
>>>  seen = {}
>>>  for v in self.values:
>>>  v.check_clash(self.info, seen)
>>> @@ -1201,8 +1208,10 @@ class QAPISchemaArrayType(QAPISchemaType):
>>>  self.element_type = None
>>>
>>>  def check(self, schema):
>>> +QAPISchemaType.check(self, schema)
>>>  self.element_type = schema.lookup_type(self._element_type_name)
>>>  assert self.element_type
>>> +self.element_type.check(schema)
>>>  self.ifcond = self.element_type.ifcond
>>>
>>>  def is_implicit(self):
>>> @@ -1245,6 +1254,7 @@ class QAPISchemaObjectType(QAPISchemaType):
>>>  self.members = None
>>>
>>>  def check(self, schema):
>>> +QAPISchemaType.check(self, schema)
>>>  if self.members is False:   # check for cycles
>>>  raise QAPISemError(self.info,
>>> "Object %s contains itself" % self.name)
>>> @@ -1427,6 +1437,7 @@ class QAPISchemaAlternateType(QAPISchemaType):
>>>  self.variants = variants
>>>
>>>  def check(self, schema):
>>> +QAPISchemaType.check(self, schema)
>>>  self.variants.tag_member.check(schema)
>>>  # Not calling self.variants.check_clash(), because there's nothing
>>>  # to clash with
>>> @@ -1470,6 +1481,7 @@ class QAPISchemaCommand(QAPISchemaEntity):
>>>  self.allow_oob = allow_oob
>>>
>>>  def check(self, schema):
>>> +QAPISchemaEntity.check(self, schema)
>>>  if self._arg_type_name:
>>>  self.arg_type = schema.lookup_type(self._arg_type_name)
>>>  assert (isinstance(self.arg_type, 

Re: [Qemu-devel] [PATCH v3 2/5] target/arm: relax permission checks for HWCAP_CPUID registers

2018-06-26 Thread Richard Henderson
On 06/25/2018 09:00 AM, Alex Bennée wrote:
> +#ifdef CONFIG_USER_ONLY
> +/* Some AArch64 CPU ID/feature are exported to userspace
> + * by the kernel (see HWCAP_CPUID) */
> +if (r->opc0 == 3 && r->crn == 0 &&
> +(r->crm == 0 ||
> + (r->crm >= 4 && r->crm <= 7))) {
> +mask = PL0_R;
> +break;
> +}
> +#endif

Why not just set mask to PL0U_R | PL1_RW?
This mask doesn't affect the actual permissions, just the check.

Then of course merge with the next patch.


r~



[Qemu-devel] [PATCH v6 32/35] target/arm: Implement SVE dot product (vectors)

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.h|  5 +++
 target/arm/translate-sve.c | 17 ++
 target/arm/vec_helper.c| 67 ++
 target/arm/sve.decode  |  3 ++
 4 files changed, 92 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 8607077dda..e23ce7ff19 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -583,6 +583,11 @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 4f2152fb70..8a2bd1f8c5 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3423,6 +3423,23 @@ DO_ZZI(UMIN, umin)
 
 #undef DO_ZZI
 
+static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn)
+{
+static gen_helper_gvec_3 * const fns[2][2] = {
+{ gen_helper_gvec_sdot_b, gen_helper_gvec_sdot_h },
+{ gen_helper_gvec_udot_b, gen_helper_gvec_udot_h }
+};
+
+if (sve_access_check(s)) {
+unsigned vsz = vec_full_reg_size(s);
+tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+   vec_full_reg_offset(s, a->rn),
+   vec_full_reg_offset(s, a->rm),
+   vsz, vsz, 0, fns[a->u][a->sz]);
+}
+return true;
+}
+
 /*
  *** SVE Floating Point Multiply-Add Indexed Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index db5aeb9f24..c16a30c3b5 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -194,6 +194,73 @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm,
 clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
+/* Integer 8 and 16-bit dot-product.
+ *
+ * Note that for the loops herein, host endianness does not matter
+ * with respect to the ordering of data within the 64-bit lanes.
+ * All elements are treated equally, no matter where they are.
+ */
+
+void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc);
+uint32_t *d = vd;
+int8_t *n = vn, *m = vm;
+
+for (i = 0; i < opr_sz / 4; ++i) {
+d[i] += n[i * 4 + 0] * m[i * 4 + 0]
+  + n[i * 4 + 1] * m[i * 4 + 1]
+  + n[i * 4 + 2] * m[i * 4 + 2]
+  + n[i * 4 + 3] * m[i * 4 + 3];
+}
+clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc);
+uint32_t *d = vd;
+uint8_t *n = vn, *m = vm;
+
+for (i = 0; i < opr_sz / 4; ++i) {
+d[i] += n[i * 4 + 0] * m[i * 4 + 0]
+  + n[i * 4 + 1] * m[i * 4 + 1]
+  + n[i * 4 + 2] * m[i * 4 + 2]
+  + n[i * 4 + 3] * m[i * 4 + 3];
+}
+clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc);
+uint64_t *d = vd;
+int16_t *n = vn, *m = vm;
+
+for (i = 0; i < opr_sz / 8; ++i) {
+d[i] += (int64_t)n[i * 4 + 0] * m[i * 4 + 0]
+  + (int64_t)n[i * 4 + 1] * m[i * 4 + 1]
+  + (int64_t)n[i * 4 + 2] * m[i * 4 + 2]
+  + (int64_t)n[i * 4 + 3] * m[i * 4 + 3];
+}
+clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc);
+uint64_t *d = vd;
+uint16_t *n = vn, *m = vm;
+
+for (i = 0; i < opr_sz / 8; ++i) {
+d[i] += (uint64_t)n[i * 4 + 0] * m[i * 4 + 0]
+  + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1]
+  + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2]
+  + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3];
+}
+clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
 void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
  void *vfpst, uint32_t desc)
 {
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 62365ed90f..35415bfb6c 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -725,6 +725,9 @@ UMIN_zzi00100101 .. 101 011 110  .  
@rdn_i8u
 # SVE integer multiply immediate (unpredicated)
 MUL_zzi 00100101 .. 110 000 110  .  @rdn_i8s
 
+# SVE integer dot product (unpredicated)
+DOT_zzz 01000100 1 sz:1 0 rm:5 0 u:1 rn:5 rd:5
+
 # SVE 
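
As a reading aid for the byte helpers in this patch, here is a standalone sketch of the signed 8-bit dot product. Each 32-bit lane accumulates the sum of four byte products. The function name `sdot_b` and the `lanes` parameter are assumptions for illustration; the real helper derives its bounds from `desc` and clears the tail of the register afterwards.

```c
#include <stddef.h>
#include <stdint.h>

/* lanes is the number of 32-bit accumulators in d; n and m hold 4 * lanes bytes. */
static void sdot_b(uint32_t *d, const int8_t *n, const int8_t *m, size_t lanes)
{
    for (size_t i = 0; i < lanes; ++i) {
        int32_t sum = 0;
        for (size_t k = 0; k < 4; ++k) {
            /* int8_t operands promote to int, so the product cannot overflow. */
            sum += n[i * 4 + k] * m[i * 4 + k];
        }
        d[i] += (uint32_t)sum;   /* wraps modulo 2^32, like the 32-bit lane */
    }
}
```

The 16-bit variants follow the same shape with 64-bit lanes, which is why the patch casts to a 64-bit type before multiplying there.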

Re: [Qemu-devel] [PATCH v3 1/5] target/arm: support reading of CNT[VCT|FRQ]_EL0 from user-space

2018-06-26 Thread Richard Henderson
On 06/25/2018 09:00 AM, Alex Bennée wrote:
> Since kernel commit a86bd139f2 (arm64: arch_timer: Enable CNTVCT_EL0
> trap..) user-space has been able to read these system registers. As we
> can't use QEMUTimer's in linux-user mode we just directly call
> cpu_get_clock().
> 
> Signed-off-by: Alex Bennée 
> 
> ---
> v2
>   - include CNTFRQ_EL0 for PL0_R only
> v3
>   - use NANOSECONDS_PER_SECOND / GTIMER_SCALE
> ---
>  target/arm/helper.c | 27 ---
>  1 file changed, 24 insertions(+), 3 deletions(-)

Reviewed-by: Richard Henderson 


r~



[Qemu-devel] [PATCH v6 29/35] target/arm: Implement SVE fp complex multiply add

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|   4 +
 target/arm/sve_helper.c| 162 +
 target/arm/translate-sve.c |  37 +
 target/arm/sve.decode  |   4 +
 4 files changed, 207 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0bd9fe2f28..023952a9a4 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1115,6 +1115,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
 
+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index bdb7565779..790cbacd14 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3757,6 +3757,168 @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
 } while (i != 0);
 }
 
+/*
+ * FP Complex Multiply
+ */
+
+QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 22 > 32);
+
+void HELPER(sve_fcmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+intptr_t j, i = simd_oprsz(desc);
+unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2);
+bool flip = rot & 1;
+float16 neg_imag, neg_real;
+void *vd = &env->vfp.zregs[rd];
+void *vn = &env->vfp.zregs[rn];
+void *vm = &env->vfp.zregs[rm];
+void *va = &env->vfp.zregs[ra];
+uint64_t *g = vg;
+
+neg_imag = float16_set_sign(0, (rot & 2) != 0);
+neg_real = float16_set_sign(0, rot == 1 || rot == 2);
+
+do {
+uint64_t pg = g[(i - 1) >> 6];
+do {
+float16 e1, e2, e3, e4, nr, ni, mr, mi, d;
+
+/* I holds the real index; J holds the imag index.  */
+j = i - sizeof(float16);
+i -= 2 * sizeof(float16);
+
+nr = *(float16 *)(vn + H1_2(i));
+ni = *(float16 *)(vn + H1_2(j));
+mr = *(float16 *)(vm + H1_2(i));
+mi = *(float16 *)(vm + H1_2(j));
+
+e2 = (flip ? ni : nr);
+e1 = (flip ? mi : mr) ^ neg_real;
+e4 = e2;
+e3 = (flip ? mr : mi) ^ neg_imag;
+
+if (likely((pg >> (i & 63)) & 1)) {
+d = *(float16 *)(va + H1_2(i));
+d = float16_muladd(e2, e1, d, 0, &env->vfp.fp_status_f16);
+*(float16 *)(vd + H1_2(i)) = d;
+}
+if (likely((pg >> (j & 63)) & 1)) {
+d = *(float16 *)(va + H1_2(j));
+d = float16_muladd(e4, e3, d, 0, &env->vfp.fp_status_f16);
+*(float16 *)(vd + H1_2(j)) = d;
+}
+} while (i & 63);
+} while (i != 0);
+}
+
+void HELPER(sve_fcmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc)
+{
+intptr_t j, i = simd_oprsz(desc);
+unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2);
+bool flip = rot & 1;
+float32 neg_imag, neg_real;
+void *vd = &env->vfp.zregs[rd];
+void *vn = &env->vfp.zregs[rn];
+void *vm = &env->vfp.zregs[rm];
+void *va = &env->vfp.zregs[ra];
+uint64_t *g = vg;
+
+neg_imag = float32_set_sign(0, (rot & 2) != 0);
+neg_real = float32_set_sign(0, rot == 1 || rot == 2);
+
+do {
+uint64_t pg = g[(i - 1) >> 6];
+do {
+float32 e1, e2, e3, e4, nr, ni, mr, mi, d;
+
+/* I holds the real index; J holds the imag index.  */
+j = i - sizeof(float32);
+i -= 2 * sizeof(float32);
+
+nr = *(float32 *)(vn + H1_2(i));
+ni = *(float32 *)(vn + H1_2(j));
+mr = *(float32 *)(vm + H1_2(i));
+mi = *(float32 *)(vm + H1_2(j));
+
+e2 = (flip ? ni : nr);
+e1 = (flip ? mi : mr) ^ neg_real;
+e4 = e2;
+e3 = (flip ? mr : mi) ^ neg_imag;
+
+if (likely((pg >> (i & 63)) & 1)) {
+d = *(float32 *)(va + H1_2(i));
+d = float32_muladd(e2, e1, d, 0, &env->vfp.fp_status);
+*(float32 *)(vd + H1_2(i)) = d;
+}
+ 
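
The operand selection in the helpers above can be summarised per complex element. The sketch below is scalar and uses plain `float` arithmetic; the `cplx` type and `fcmla_step` name are assumptions for illustration. It multiplies by -1 where the helper XORs the sign bit, and it uses an ordinary multiply and add where the helper uses a fused softfloat muladd, so rounding and NaN sign handling are not modelled.

```c
typedef struct { float re, im; } cplx;

/* rot is the 2-bit rotation field extracted from desc in the helper. */
static cplx fcmla_step(cplx a, cplx n, cplx m, unsigned rot)
{
    int flip = rot & 1;
    float sign_real = (rot == 1 || rot == 2) ? -1.0f : 1.0f;
    float sign_imag = (rot & 2) ? -1.0f : 1.0f;

    float e2 = flip ? n.im : n.re;                  /* shared multiplicand */
    float e1 = (flip ? m.im : m.re) * sign_real;    /* feeds the real part */
    float e3 = (flip ? m.re : m.im) * sign_imag;    /* feeds the imag part */

    cplx d = { a.re + e2 * e1, a.im + e2 * e3 };
    return d;
}
```

Applying the step with rot = 0 and then rot = 1 to the same inputs accumulates a full complex product n * m into a, which is the usual way a pair of these instructions is used.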

[Qemu-devel] [PATCH v6 28/35] target/arm: Implement SVE floating-point complex add

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|   7 +++
 target/arm/sve_helper.c| 100 +
 target/arm/translate-sve.c |  24 +
 target/arm/sve.decode  |   4 ++
 4 files changed, 135 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 891346a5ac..0bd9fe2f28 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1092,6 +1092,13 @@ DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(sve_fcadd_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcadd_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcadd_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 83bd8c4269..bdb7565779 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3657,6 +3657,106 @@ void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
 }
 }
 
+/*
+ * FP Complex Add
+ */
+
+void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg,
+ void *vs, uint32_t desc)
+{
+intptr_t j, i = simd_oprsz(desc);
+uint64_t *g = vg;
+float16 neg_imag = float16_set_sign(0, simd_data(desc));
+float16 neg_real = float16_chs(neg_imag);
+
+do {
+uint64_t pg = g[(i - 1) >> 6];
+do {
+float16 e0, e1, e2, e3;
+
+/* I holds the real index; J holds the imag index.  */
+j = i - sizeof(float16);
+i -= 2 * sizeof(float16);
+
+e0 = *(float16 *)(vn + H1_2(i));
+e1 = *(float16 *)(vm + H1_2(j)) ^ neg_real;
+e2 = *(float16 *)(vn + H1_2(j));
+e3 = *(float16 *)(vm + H1_2(i)) ^ neg_imag;
+
+if (likely((pg >> (i & 63)) & 1)) {
+*(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, vs);
+}
+if (likely((pg >> (j & 63)) & 1)) {
+*(float16 *)(vd + H1_2(j)) = float16_add(e2, e3, vs);
+}
+} while (i & 63);
+} while (i != 0);
+}
+
+void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg,
+ void *vs, uint32_t desc)
+{
+intptr_t j, i = simd_oprsz(desc);
+uint64_t *g = vg;
+float32 neg_imag = float32_set_sign(0, simd_data(desc));
+float32 neg_real = float32_chs(neg_imag);
+
+do {
+uint64_t pg = g[(i - 1) >> 6];
+do {
+float32 e0, e1, e2, e3;
+
+/* I holds the real index; J holds the imag index.  */
+j = i - sizeof(float32);
+i -= 2 * sizeof(float32);
+
+e0 = *(float32 *)(vn + H1_2(i));
+e1 = *(float32 *)(vm + H1_2(j)) ^ neg_real;
+e2 = *(float32 *)(vn + H1_2(j));
+e3 = *(float32 *)(vm + H1_2(i)) ^ neg_imag;
+
+if (likely((pg >> (i & 63)) & 1)) {
+*(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, vs);
+}
+if (likely((pg >> (j & 63)) & 1)) {
+*(float32 *)(vd + H1_2(j)) = float32_add(e2, e3, vs);
+}
+} while (i & 63);
+} while (i != 0);
+}
+
+void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
+ void *vs, uint32_t desc)
+{
+intptr_t j, i = simd_oprsz(desc);
+uint64_t *g = vg;
+float64 neg_imag = float64_set_sign(0, simd_data(desc));
+float64 neg_real = float64_chs(neg_imag);
+
+do {
+uint64_t pg = g[(i - 1) >> 6];
+do {
+float64 e0, e1, e2, e3;
+
+/* I holds the real index; J holds the imag index.  */
+j = i - sizeof(float64);
+i -= 2 * sizeof(float64);
+
+e0 = *(float64 *)(vn + H1_2(i));
+e1 = *(float64 *)(vm + H1_2(j)) ^ neg_real;
+e2 = *(float64 *)(vn + H1_2(j));
+e3 = *(float64 *)(vm + H1_2(i)) ^ neg_imag;
+
+if (likely((pg >> (i & 63)) & 1)) {
+*(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, vs);
+}
+if (likely((pg >> (j & 63)) & 1)) {
+*(float64 *)(vd + H1_2(j)) = float64_add(e2, e3, vs);
+}
+} while (i & 63);
+} while (i != 0);
+}
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 4883de3fab..b1764f099b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3895,6 +3895,30 @@ DO_FPCMP(FACGT, 
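
The sign-bit trick in the complex add helpers can be tried in isolation. The sketch below works on one `float` complex pair; `cplx`, `xor_sign` and `fcadd` are hypothetical names, and `rot_flag` stands in for the value the helper reads with `simd_data(desc)`. Predication, host-endian element addressing and the softfloat status are left out.

```c
#include <stdint.h>
#include <string.h>

typedef struct { float re, im; } cplx;

/* Flip the IEEE sign bit of x when mask is 0x80000000; no change when mask is 0. */
static float xor_sign(float x, uint32_t mask)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof(bits));
    bits ^= mask;
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* rot_flag == 0 computes n + i*m; rot_flag == 1 computes n - i*m. */
static cplx fcadd(cplx n, cplx m, unsigned rot_flag)
{
    uint32_t neg_imag = rot_flag ? UINT32_C(0x80000000) : 0;
    uint32_t neg_real = neg_imag ^ UINT32_C(0x80000000);
    cplx d;

    d.re = n.re + xor_sign(m.im, neg_real);
    d.im = n.im + xor_sign(m.re, neg_imag);
    return d;
}
```

Exactly one of the two masks is set at a time, so a single flag selects which of the cross terms gets negated.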

[Qemu-devel] [PATCH v6 33/35] target/arm: Implement SVE dot product (indexed)

2018-06-26 Thread Richard Henderson
Signed-off-by: Richard Henderson 

---
v6: Rearrange the loops.  The compiler does well with this form
and hopefully they are also easier to read.
---
 target/arm/helper.h|   5 ++
 target/arm/translate-sve.c |  18 ++
 target/arm/vec_helper.c| 124 +
 target/arm/sve.decode  |   8 ++-
 4 files changed, 154 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index e23ce7ff19..59e8c3bd1b 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -588,6 +588,11 @@ DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_udot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 8a2bd1f8c5..3cff71cae8 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3440,6 +3440,24 @@ static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn)
 return true;
 }
 
+static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a, uint32_t insn)
+{
+static gen_helper_gvec_3 * const fns[2][2] = {
+{ gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h },
+{ gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h }
+};
+
+if (sve_access_check(s)) {
+unsigned vsz = vec_full_reg_size(s);
+tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+   vec_full_reg_offset(s, a->rn),
+   vec_full_reg_offset(s, a->rm),
+   vsz, vsz, a->index, fns[a->u][a->sz]);
+}
+return true;
+}
+
+
 /*
  *** SVE Floating Point Multiply-Add Indexed Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index c16a30c3b5..37f338732e 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -261,6 +261,130 @@ void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc)
 clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
+void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4;
+intptr_t index = simd_data(desc);
+uint32_t *d = vd;
+int8_t *n = vn;
+int8_t *m_indexed = (int8_t *)vm + index * 4;
+
+/* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
+ * Otherwise opr_sz is a multiple of 16.
+ */
+segend = MIN(4, opr_sz_4);
+i = 0;
+do {
+int8_t m0 = m_indexed[i * 4 + 0];
+int8_t m1 = m_indexed[i * 4 + 1];
+int8_t m2 = m_indexed[i * 4 + 2];
+int8_t m3 = m_indexed[i * 4 + 3];
+
+do {
+d[i] += n[i * 4 + 0] * m0
+  + n[i * 4 + 1] * m1
+  + n[i * 4 + 2] * m2
+  + n[i * 4 + 3] * m3;
+} while (++i < segend);
+segend = i + 4;
+} while (i < opr_sz_4);
+
+clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4;
+intptr_t index = simd_data(desc);
+uint32_t *d = vd;
+uint8_t *n = vn;
+uint8_t *m_indexed = (uint8_t *)vm + index * 4;
+
+/* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd.
+ * Otherwise opr_sz is a multiple of 16.
+ */
+segend = MIN(4, opr_sz_4);
+i = 0;
+do {
+uint8_t m0 = m_indexed[i * 4 + 0];
+uint8_t m1 = m_indexed[i * 4 + 1];
+uint8_t m2 = m_indexed[i * 4 + 2];
+uint8_t m3 = m_indexed[i * 4 + 3];
+
+do {
+d[i] += n[i * 4 + 0] * m0
+  + n[i * 4 + 1] * m1
+  + n[i * 4 + 2] * m2
+  + n[i * 4 + 3] * m3;
+} while (++i < segend);
+segend = i + 4;
+} while (i < opr_sz_4);
+
+clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8;
+intptr_t index = simd_data(desc);
+uint64_t *d = vd;
+int16_t *n = vn;
+int16_t *m_indexed = (int16_t *)vm + index * 4;
+
+/* This is supported by SVE only, so opr_sz is always a multiple of 16.
+ * Process the entire segment all at once, writing back the results
+ * only after we've consumed all of the inputs.
+ */
+for (i = 0; i < opr_sz_8 ; i += 
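
The indexed byte form differs from the vector form only in how the second operand is picked: within each 128-bit segment, one group of four bytes from m, selected by the index, is reused for all four 32-bit lanes. A standalone sketch follows. The name `sdot_idx_b` and the `lanes` parameter are assumptions, and the sketch covers only whole segments (lanes a multiple of 4), not the 8-byte AdvSIMD case the patch comment mentions.

```c
#include <stddef.h>
#include <stdint.h>

/* lanes: number of 32-bit accumulators, a multiple of 4; index: 0..3. */
static void sdot_idx_b(uint32_t *d, const int8_t *n, const int8_t *m,
                       size_t lanes, unsigned index)
{
    for (size_t seg = 0; seg < lanes; seg += 4) {
        /* The four bytes of m chosen by index inside this segment. */
        const int8_t *mi = m + (seg + index) * 4;

        for (size_t i = seg; i < seg + 4; ++i) {
            int32_t sum = 0;
            for (size_t k = 0; k < 4; ++k) {
                sum += n[i * 4 + k] * mi[k];
            }
            d[i] += (uint32_t)sum;
        }
    }
}
```

Because d has a different element type from n and m, this form reads the selected bytes of m once per segment before any lane is written, much as the patch caches m0..m3 ahead of its inner loop.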

[Qemu-devel] [PATCH v6 24/35] target/arm: Implement SVE floating-point convert to integer

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 30 +
 target/arm/helper.h| 12 +++---
 target/arm/helper.c|  2 +-
 target/arm/sve_helper.c| 88 ++
 target/arm/translate-sve.c | 70 ++
 target/arm/sve.decode  | 16 +++
 6 files changed, 211 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 4c379dbb05..37fa9eb9bb 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -955,6 +955,36 @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/helper.h b/target/arm/helper.h
index ad9cb6c7d5..8607077dda 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -134,12 +134,12 @@ DEF_HELPER_2(vfp_touid, i32, f64, ptr)
 DEF_HELPER_2(vfp_touizh, i32, f16, ptr)
 DEF_HELPER_2(vfp_touizs, i32, f32, ptr)
 DEF_HELPER_2(vfp_touizd, i32, f64, ptr)
-DEF_HELPER_2(vfp_tosih, i32, f16, ptr)
-DEF_HELPER_2(vfp_tosis, i32, f32, ptr)
-DEF_HELPER_2(vfp_tosid, i32, f64, ptr)
-DEF_HELPER_2(vfp_tosizh, i32, f16, ptr)
-DEF_HELPER_2(vfp_tosizs, i32, f32, ptr)
-DEF_HELPER_2(vfp_tosizd, i32, f64, ptr)
+DEF_HELPER_2(vfp_tosih, s32, f16, ptr)
+DEF_HELPER_2(vfp_tosis, s32, f32, ptr)
+DEF_HELPER_2(vfp_tosid, s32, f64, ptr)
+DEF_HELPER_2(vfp_tosizh, s32, f16, ptr)
+DEF_HELPER_2(vfp_tosizs, s32, f32, ptr)
+DEF_HELPER_2(vfp_tosizd, s32, f64, ptr)
 
 DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, ptr)
 DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, ptr)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 1248d84e6f..a36f5b1899 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11360,7 +11360,7 @@ ftype HELPER(name)(uint32_t x, void *fpstp) \
 }
 
 #define CONV_FTOI(name, ftype, fsz, sign, round)\
-uint32_t HELPER(name)(ftype x, void *fpstp) \
+sign##int32_t HELPER(name)(ftype x, void *fpstp)\
 {   \
 float_status *fpst = fpstp; \
 if (float##fsz##_is_any_nan(x)) {   \
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 4b36c1eecf..b6421ec19c 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3195,6 +3195,78 @@ static inline float16 sve_f64_to_f16(float64 f, float_status *fpst)
 return ret;
 }
 
+static inline int16_t vfp_float16_to_int16_rtz(float16 f, float_status *s)
+{
+if (float16_is_any_nan(f)) {
+float_raise(float_flag_invalid, s);
+return 0;
+}
+return float16_to_int16_round_to_zero(f, s);
+}
+
+static inline int64_t vfp_float16_to_int64_rtz(float16 f, float_status *s)
+{
+if (float16_is_any_nan(f)) {
+float_raise(float_flag_invalid, s);
+return 0;
+}
+return float16_to_int64_round_to_zero(f, s);
+}
+
+static inline int64_t vfp_float32_to_int64_rtz(float32 f, float_status *s)
+{
+if (float32_is_any_nan(f)) {
+float_raise(float_flag_invalid, s);
+return 0;
+}
+return float32_to_int64_round_to_zero(f, s);
+}
+
+static inline int64_t vfp_float64_to_int64_rtz(float64 f, float_status *s)
+{
+   

[Qemu-devel] [PATCH v6 35/35] target/arm: Implement ARMv8.2-DotProd

2018-06-26 Thread Richard Henderson
We've already added the helpers with an SVE patch, all that remains
is to wire up the aa64 and aa32 translators.  Enable the feature
within -cpu max for CONFIG_USER_ONLY.
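
For readers unfamiliar with the instruction, the per-lane arithmetic of
SDOT can be sketched in plain C as below.  This is an illustration only,
not the gvec helper added by the earlier SVE patch; sdot_lanes is a
hypothetical name, and host-endian element ordering is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not QEMU code: each 32-bit lane of d
 * accumulates the dot product of four signed 8-bit pairs from n and m.
 * The accumulator is unsigned so that wraparound is well defined.
 */
static void sdot_lanes(uint32_t *d, const int8_t *n, const int8_t *m,
                       size_t lanes)
{
    assert(d != NULL && n != NULL && m != NULL);
    for (size_t i = 0; i < lanes; i++) {
        int32_t sum = 0;
        for (size_t k = 0; k < 4; k++) {
            sum += (int32_t)n[4 * i + k] * (int32_t)m[4 * i + k];
        }
        d[i] += (uint32_t)sum;
    }
}
```

UDOT would follow the same shape with uint8_t inputs.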

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 

---
v6: Fix aa32 index form.
---
 target/arm/cpu.h   |  1 +
 linux-user/elfload.c   |  1 +
 target/arm/cpu.c   |  1 +
 target/arm/cpu64.c |  1 +
 target/arm/translate-a64.c | 36 +++
 target/arm/translate.c | 74 +++---
 6 files changed, 93 insertions(+), 21 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index a4507a2d6f..6a8441c2dd 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1480,6 +1480,7 @@ enum arm_features {
 ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */
 ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */
 ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */
+ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */
 ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */
 ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions.  */
 ARM_FEATURE_M_MAIN, /* M profile Main Extension */
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index d1231ad07a..942a1b661f 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -583,6 +583,7 @@ static uint32_t get_elf_hwcap(void)
 ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP);
 GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS);
 GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
+GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP);
 GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
 GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
 #undef GET_FEATURE
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 6dcc552e14..aa62315cea 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1805,6 +1805,7 @@ static void arm_max_initfn(Object *obj)
 set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
 set_feature(&cpu->env, ARM_FEATURE_CRC);
 set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
+set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
 set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
 #endif
 }
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 0360d7efc5..3b4bc73ffa 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -250,6 +250,7 @@ static void aarch64_max_initfn(Object *obj)
 set_feature(&cpu->env, ARM_FEATURE_CRC);
 set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS);
 set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
+set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD);
 set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
 set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
 set_feature(&cpu->env, ARM_FEATURE_SVE);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index eb3a4ab2f0..f986340832 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -640,6 +640,16 @@ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
vec_full_reg_size(s), gvec_op);
 }
 
+/* Expand a 3-operand operation using an out-of-line helper.  */
+static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd,
+ int rn, int rm, int data, gen_helper_gvec_3 *fn)
+{
+tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
+   vec_full_reg_offset(s, rn),
+   vec_full_reg_offset(s, rm),
+   is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
+}
+
 /* Expand a 3-operand + env pointer operation using
  * an out-of-line helper.
  */
@@ -11336,6 +11346,14 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
 }
 feature = ARM_FEATURE_V8_RDM;
 break;
+case 0x02: /* SDOT (vector) */
+case 0x12: /* UDOT (vector) */
+if (size != MO_32) {
+unallocated_encoding(s);
+return;
+}
+feature = ARM_FEATURE_V8_DOTPROD;
+break;
 case 0x8: /* FCMLA, #0 */
 case 0x9: /* FCMLA, #90 */
 case 0xa: /* FCMLA, #180 */
@@ -11389,6 +11407,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn)
 }
 return;
 
+case 0x2: /* SDOT / UDOT */
+gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0,
+ u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b);
+return;
+
 case 0x8: /* FCMLA, #0 */
 case 0x9: /* FCMLA, #90 */
 case 0xa: /* FCMLA, #180 */
@@ -12568,6 +12591,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
 return;
 }
 break;
+case 0x0e: /* SDOT */
+case 0x1e: /* UDOT */
+if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) {
+unallocated_encoding(s);
+return;
+}
+break;
 case 0x11: /* FCMLA #0 */
 case 0x13: /* FCMLA #90 */
 case 0x15: /* 

[Qemu-devel] [PATCH v6 26/35] target/arm: Implement SVE floating-point unary operations

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 14 ++
 target/arm/sve_helper.c|  8 
 target/arm/translate-sve.c | 26 ++
 target/arm/sve.decode  |  4 
 4 files changed, 52 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 36168c5bb2..891346a5ac 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -999,6 +999,20 @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index af8221c714..83bd8c4269 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3298,6 +3298,14 @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
 DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
 DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int)
 
+DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16)
+DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32)
+DO_ZPZ_FP(sve_frecpx_d, uint64_t, , helper_frecpx_f64)
+
+DO_ZPZ_FP(sve_fsqrt_h, uint16_t, H1_2, float16_sqrt)
+DO_ZPZ_FP(sve_fsqrt_s, uint32_t, H1_4, float32_sqrt)
+DO_ZPZ_FP(sve_fsqrt_d, uint64_t, , float64_sqrt)
+
 DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
 DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
 DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 270bf9101b..ff8ae67e2b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4117,6 +4117,32 @@ static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 return do_frint_mode(s, a, float_round_ties_away);
 }
 
+static bool trans_FRECPX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+static gen_helper_gvec_3_ptr * const fns[3] = {
+gen_helper_sve_frecpx_h,
+gen_helper_sve_frecpx_s,
+gen_helper_sve_frecpx_d
+};
+if (a->esz == 0) {
+return false;
+}
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+}
+
+static bool trans_FSQRT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+static gen_helper_gvec_3_ptr * const fns[3] = {
+gen_helper_sve_fsqrt_h,
+gen_helper_sve_fsqrt_s,
+gen_helper_sve_fsqrt_d
+};
+if (a->esz == 0) {
+return false;
+}
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+}
+
 static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 {
 return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index e45faaec3a..2aca9f0bb0 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -854,6 +854,10 @@ FRINTA  01100101 .. 000 100 101 ... . . @rd_pg_rn
 FRINTX  01100101 .. 000 110 101 ... . . @rd_pg_rn
 FRINTI  01100101 .. 000 111 101 ... . . @rd_pg_rn
 
+# SVE floating-point unary operations
+FRECPX  01100101 .. 001 100 101 ... . . @rd_pg_rn
+FSQRT   01100101 .. 001 101 101 ... . . @rd_pg_rn
+
 # SVE integer convert to floating-point
 SCVTF_hh01100101 01 010 01 0 101 ... . .@rd_pg_rn_e0
 SCVTF_sh01100101 01 010 10 0 101 ... . .@rd_pg_rn_e0
-- 
2.17.1




[Qemu-devel] [PATCH v6 31/35] target/arm: Implement SVE fp complex multiply add (indexed)

2018-06-26 Thread Richard Henderson
Enhance the existing helpers to support SVE, which takes the
index from each 128-bit segment.  The change has no effect
for AdvSIMD, since there is only one such segment.
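
The per-segment indexing described above can be sketched as follows.  This
is a simplified stand-in, not the FCMLA helper: it uses plain float
multiplication instead of the complex multiply-add, assumes a 4-byte
float, and the name mul_indexed_per_segment is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration, not the QEMU helper: multiply every element
 * of n by the element of m selected by `index` within the same 128-bit
 * segment (four floats per segment, assuming sizeof(float) == 4).
 */
static void mul_indexed_per_segment(float *d, const float *n, const float *m,
                                    size_t elements, size_t index)
{
    const size_t elts_per_segment = 16 / sizeof(float);

    assert(elements % elts_per_segment == 0 && index < elts_per_segment);
    for (size_t i = 0; i < elements; i += elts_per_segment) {
        /* The indexed operand is re-read for each segment. */
        float mm = m[i + index];
        for (size_t j = i; j < i + elts_per_segment; j++) {
            d[j] = n[j] * mm;
        }
    }
}
```

With elements equal to one segment, as for a 128-bit AdvSIMD vector, the
outer loop runs once and the result matches a single global index.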

Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 23 ++
 target/arm/vec_helper.c| 50 +++---
 target/arm/sve.decode  |  6 +
 3 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 7ce3222158..4f2152fb70 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4005,6 +4005,29 @@ static bool trans_FCMLA_zpzzz(DisasContext *s,
 return true;
 }
 
+static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a, uint32_t insn)
+{
+static gen_helper_gvec_3_ptr * const fns[2] = {
+gen_helper_gvec_fcmlah_idx,
+gen_helper_gvec_fcmlas_idx,
+};
+
+tcg_debug_assert(a->esz == 1 || a->esz == 2);
+tcg_debug_assert(a->rd == a->ra);
+if (sve_access_check(s)) {
+unsigned vsz = vec_full_reg_size(s);
+TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+   vec_full_reg_offset(s, a->rn),
+   vec_full_reg_offset(s, a->rm),
+   status, vsz, vsz,
+   a->index * 4 + a->rot,
+   fns[a->esz - 1]);
+tcg_temp_free_ptr(status);
+}
+return true;
+}
+
 /*
  *** SVE Floating Point Unary Operations Prediated Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 8f2dc4b989..db5aeb9f24 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -319,22 +319,27 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
 uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
 uint32_t neg_real = flip ^ neg_imag;
-uintptr_t i;
-float16 e1 = m[H2(2 * index + flip)];
-float16 e3 = m[H2(2 * index + 1 - flip)];
+intptr_t elements = opr_sz / sizeof(float16);
+intptr_t eltspersegment = 16 / sizeof(float16);
+intptr_t i, j;
 
 /* Shift boolean to the sign bit so we can xor to negate.  */
 neg_real <<= 15;
 neg_imag <<= 15;
-e1 ^= neg_real;
-e3 ^= neg_imag;
 
-for (i = 0; i < opr_sz / 2; i += 2) {
-float16 e2 = n[H2(i + flip)];
-float16 e4 = e2;
+for (i = 0; i < elements; i += eltspersegment) {
+float16 mr = m[H2(i + 2 * index + 0)];
+float16 mi = m[H2(i + 2 * index + 1)];
+float16 e1 = neg_real ^ (flip ? mi : mr);
+float16 e3 = neg_imag ^ (flip ? mr : mi);
 
-d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst);
-d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst);
+for (j = i; j < i + eltspersegment; j += 2) {
+float16 e2 = n[H2(j + flip)];
+float16 e4 = e2;
+
+d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst);
+d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst);
+}
 }
 clear_tail(d, opr_sz, simd_maxsz(desc));
 }
@@ -380,22 +385,27 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
 uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
 uint32_t neg_real = flip ^ neg_imag;
-uintptr_t i;
-float32 e1 = m[H4(2 * index + flip)];
-float32 e3 = m[H4(2 * index + 1 - flip)];
+intptr_t elements = opr_sz / sizeof(float32);
+intptr_t eltspersegment = 16 / sizeof(float32);
+intptr_t i, j;
 
 /* Shift boolean to the sign bit so we can xor to negate.  */
 neg_real <<= 31;
 neg_imag <<= 31;
-e1 ^= neg_real;
-e3 ^= neg_imag;
 
-for (i = 0; i < opr_sz / 4; i += 2) {
-float32 e2 = n[H4(i + flip)];
-float32 e4 = e2;
+for (i = 0; i < elements; i += eltspersegment) {
+float32 mr = m[H4(i + 2 * index + 0)];
+float32 mi = m[H4(i + 2 * index + 1)];
+float32 e1 = neg_real ^ (flip ? mi : mr);
+float32 e3 = neg_imag ^ (flip ? mr : mi);
 
-d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst);
-d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst);
+for (j = i; j < i + eltspersegment; j += 2) {
+float32 e2 = n[H4(j + flip)];
+float32 e4 = e2;
+
+d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst);
+d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst);
+}
 }
 clear_tail(d, opr_sz, simd_maxsz(desc));
 }
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index e342cfdf14..62365ed90f 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -733,6 +733,12 @@ FCADD   01100100 esz:2 0 rot:1 100 pg:3 rm:5 rd:5 \
 FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 

[Qemu-devel] [PATCH v6 21/35] target/arm: Implement SVE FP Compare with Zero Group

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 42 +
 target/arm/sve_helper.c| 43 ++
 target/arm/translate-sve.c | 43 ++
 target/arm/sve.decode  | 10 +
 4 files changed, 138 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index ff69d143a0..44a98440c9 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -767,6 +767,48 @@ DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG,
i64, i64, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmlt0_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 4c44d52a23..0486cb1e5e 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3362,6 +3362,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
 
 #define DO_FCMGE(TYPE, X, Y, ST)  TYPE##_compare(Y, X, ST) <= 0
 #define DO_FCMGT(TYPE, X, Y, ST)  TYPE##_compare(Y, X, ST) < 0
+#define DO_FCMLE(TYPE, X, Y, ST)  TYPE##_compare(X, Y, ST) <= 0
+#define DO_FCMLT(TYPE, X, Y, ST)  TYPE##_compare(X, Y, ST) < 0
 #define DO_FCMEQ(TYPE, X, Y, ST)  TYPE##_compare_quiet(X, Y, ST) == 0
 #define DO_FCMNE(TYPE, X, Y, ST)  TYPE##_compare_quiet(X, Y, ST) != 0
 #define DO_FCMUO(TYPE, X, Y, ST)  \
@@ -3385,6 +3387,47 @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT)
 #undef DO_FPCMP_PPZZ_H
 #undef DO_FPCMP_PPZZ
 
+/* One operand floating-point comparison against zero, controlled
+ * by a predicate.
+ */
+#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP)   \
+void HELPER(NAME)(void *vd, void *vn, void *vg,\
+  void *status, uint32_t desc) \
+{  \
+intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6;   \
+uint64_t *d = vd, *g = vg; \
+do {   \
+uint64_t out = 0, pg = g[j];   \
+do {   \
+i -= sizeof(TYPE), out <<= sizeof(TYPE);   \
+if ((pg >> (i & 63)) & 1) {\
+TYPE nn = *(TYPE *)(vn + H(i));\
+out |= OP(TYPE, nn, 0, status);\
+}  \
+} while (i & 63);  \
+d[j--] = out;  \
+} while (i > 0);   \
+}
+
+#define DO_FPCMP_PPZ0_H(NAME, OP) \
+DO_FPCMP_PPZ0(NAME##_h, float16, H1_2, OP)
+#define DO_FPCMP_PPZ0_S(NAME, OP) \
+DO_FPCMP_PPZ0(NAME##_s, float32, H1_4, OP)
+#define DO_FPCMP_PPZ0_D(NAME, OP) \
+DO_FPCMP_PPZ0(NAME##_d, float64, , OP)
+
+#define DO_FPCMP_PPZ0_ALL(NAME, OP) \
+DO_FPCMP_PPZ0_H(NAME, OP)   \
+

[Qemu-devel] [PATCH v6 34/35] target/arm: Enable SVE for aarch64-linux-user

2018-06-26 Thread Richard Henderson
Enable ARM_FEATURE_SVE for the generic "max" cpu.

Tested-by: Alex Bennée 
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 

---
v6: Set ARM_HWCAP_A64_SVE.
---
 linux-user/elfload.c | 1 +
 target/arm/cpu.c | 7 +++
 target/arm/cpu64.c   | 1 +
 3 files changed, 9 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 13bc78d0c8..d1231ad07a 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -584,6 +584,7 @@ static uint32_t get_elf_hwcap(void)
 GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS);
 GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM);
 GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA);
+GET_FEATURE(ARM_FEATURE_SVE, ARM_HWCAP_A64_SVE);
 #undef GET_FEATURE
 
 return hwcaps;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 2ae4fffafb..6dcc552e14 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -164,6 +164,13 @@ static void arm_cpu_reset(CPUState *s)
 env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE;
 /* and to the FP/Neon instructions */
 env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
+/* and to the SVE instructions */
+env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3);
+env->cp15.cptr_el[3] |= CPTR_EZ;
+/* with maximum vector length */
+env->vfp.zcr_el[1] = ARM_MAX_VQ - 1;
+env->vfp.zcr_el[2] = ARM_MAX_VQ - 1;
+env->vfp.zcr_el[3] = ARM_MAX_VQ - 1;
 #else
 /* Reset into the highest available EL */
 if (arm_feature(env, ARM_FEATURE_EL3)) {
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index c50dcd4077..0360d7efc5 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -252,6 +252,7 @@ static void aarch64_max_initfn(Object *obj)
 set_feature(&cpu->env, ARM_FEATURE_V8_RDM);
 set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
 set_feature(&cpu->env, ARM_FEATURE_V8_FCMA);
+set_feature(&cpu->env, ARM_FEATURE_SVE);
 /* For usermode -cpu max we can use a larger and more efficient DCZ
  * blocksize since we don't have to follow what the hardware does.
  */
-- 
2.17.1




[Qemu-devel] [PATCH v6 19/35] target/arm: Implement SVE FP Fast Reduction Group

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 35 ++
 target/arm/sve_helper.c| 61 ++
 target/arm/translate-sve.c | 57 +++
 target/arm/sve.decode  |  8 +
 4 files changed, 161 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 087819ec2b..ff69d143a0 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -725,6 +725,41 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG,
+   i64, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG,
i64, i64, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index bc23c66221..4c44d52a23 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2852,6 +2852,67 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
 return predtest_ones(d, oprsz, esz_mask);
 }
 
+/* Recursive reduction on a function;
+ * C.f. the ARM ARM function ReducePredicated.
+ *
+ * While it would be possible to write this without the DATA temporary,
+ * it is much simpler to process the predicate register this way.
+ * The recursion is bounded to depth 7 (128 fp16 elements), so there's
+ * little to gain with a more complex non-recursive form.
+ */
+#define DO_REDUCE(NAME, TYPE, H, FUNC, IDENT) \
+static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \
+{ \
+if (n == 1) { \
+return *data; \
+} else {  \
+uintptr_t half = n / 2;   \
+TYPE lo = NAME##_reduce(data, status, half);  \
+TYPE hi = NAME##_reduce(data + half, status, half);   \
+return TYPE##_##FUNC(lo, hi, status); \
+} \
+} \
+uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc)\
+{ \
+uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_maxsz(desc);  \
+TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)];   \
+for (i = 0; i < oprsz; ) {\
+uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));   \
+do {  \
+TYPE nn = *(TYPE *)(vn + H(i));   \
+*(TYPE *)((void *)data + i) = (pg & 1 ? nn : IDENT);  \
+i += sizeof(TYPE), pg >>= sizeof(TYPE);   \
+} while (i & 15); \
+} \
+for (; i < maxsz; i += sizeof(TYPE)) {\
+*(TYPE *)((void *)data + i) = IDENT;  \
+} \
+return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \
+}
+
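
The pairwise scheme in the DO_REDUCE comment above (copy active elements
into a temporary, substitute the identity for inactive elements and
padding, then reduce recursively by halves) can be sketched in isolation
as below.  This is an illustration only: it uses float addition with
identity 0, one byte per predicate element, and the names reduce_add,
masked_sum and MAX_ELTS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_ELTS = 64 };   /* hypothetical bound for the temporary */

/* Pairwise reduction; n must be a power of two, so the recursion
 * depth is log2(n).
 */
static float reduce_add(const float *data, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    if (n == 1) {
        return data[0];
    }
    size_t half = n / 2;
    return reduce_add(data, half) + reduce_add(data + half, half);
}

/* Fill a temporary with active elements, using the identity (0) for
 * inactive elements and for padding up to `padded`, then reduce.
 */
static float masked_sum(const float *v, const unsigned char *active,
                        size_t len, size_t padded)
{
    float data[MAX_ELTS];

    assert(len <= padded && padded <= MAX_ELTS);
    for (size_t i = 0; i < padded; i++) {
        data[i] = (i < len && active[i]) ? v[i] : 0.0f;
    }
    return reduce_add(data, padded);
}
```

For 128 elements the recursion depth is 7, which matches the bound quoted
in the comment.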

[Qemu-devel] [PATCH v6 30/35] target/arm: Pass index to AdvSIMD FCMLA (indexed)

2018-06-26 Thread Richard Henderson
For aa64 advsimd, we had been passing the pre-indexed vector.
However, sve applies the index to each 128-bit segment, so we
need to pass in the index separately.

For aa32 advsimd, the fp32 operation always has index 0, but
we failed to interpret the fp16 index correctly.
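
The data word built by the translators in this patch and decoded by the
helpers can be sketched as below: rot occupies bits [1:0] and index bits
[3:2], and the helper reads bit 0 as flip and bit 1 as neg_imag.  The
function names are hypothetical, and the SIMD_DATA_SHIFT offset applied
inside the real descriptor is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers mirroring the layout used in this patch. */
static inline uint32_t fcmla_pack(uint32_t index, uint32_t rot)
{
    assert(index < 4 && rot < 4);
    return (index << 2) | rot;
}

/* Bit 0 of rot: swap the real and imaginary parts of the m operand. */
static inline uint32_t fcmla_flip(uint32_t data)
{
    return data & 1;
}

/* Bit 1 of rot: negate the imaginary product. */
static inline uint32_t fcmla_neg_imag(uint32_t data)
{
    return (data >> 1) & 1;
}

/* Bits [3:2]: element-pair index within the segment. */
static inline uint32_t fcmla_index(uint32_t data)
{
    return (data >> 2) & 3;
}
```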

Signed-off-by: Richard Henderson 

---
v6:
  * Fix double-indexing in translate-a64.c
  * Fix non-indexing of fp16 in translate.c.
---
 target/arm/translate-a64.c | 21 -
 target/arm/translate.c | 32 +++-
 target/arm/vec_helper.c| 10 ++
 3 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 8d8a4cecb0..eb3a4ab2f0 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -12669,15 +12669,18 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn)
 case 0x13: /* FCMLA #90 */
 case 0x15: /* FCMLA #180 */
 case 0x17: /* FCMLA #270 */
-tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
-   vec_full_reg_offset(s, rn),
-   vec_reg_offset(s, rm, index, size), fpst,
-   is_q ? 16 : 8, vec_full_reg_size(s),
-   extract32(insn, 13, 2), /* rot */
-   size == MO_64
-   ? gen_helper_gvec_fcmlas_idx
-   : gen_helper_gvec_fcmlah_idx);
-tcg_temp_free_ptr(fpst);
+{
+int rot = extract32(insn, 13, 2);
+int data = (index << 2) | rot;
+tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd),
+   vec_full_reg_offset(s, rn),
+   vec_full_reg_offset(s, rm), fpst,
+   is_q ? 16 : 8, vec_full_reg_size(s), data,
+   size == MO_64
+   ? gen_helper_gvec_fcmlas_idx
+   : gen_helper_gvec_fcmlah_idx);
+tcg_temp_free_ptr(fpst);
+}
 return;
 }
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 2a3e4f5d4c..a7a980b1f2 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7826,26 +7826,42 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn)
 
 static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
 {
-int rd, rn, rm, rot, size, opr_sz;
+gen_helper_gvec_3_ptr *fn_gvec_ptr;
+int rd, rn, rm, opr_sz, data;
 TCGv_ptr fpst;
 bool q;
 
 q = extract32(insn, 6, 1);
 VFP_DREG_D(rd, insn);
 VFP_DREG_N(rn, insn);
-VFP_DREG_M(rm, insn);
 if ((rd | rn) & q) {
 return 1;
 }
 
 if ((insn & 0xff000f10) == 0xfe000800) {
 /* VCMLA (indexed) --  1110 S.RR   1000 ...0  */
-rot = extract32(insn, 20, 2);
-size = extract32(insn, 23, 1);
-if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)
-|| (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) {
+int rot = extract32(insn, 20, 2);
+int size = extract32(insn, 23, 1);
+int index;
+
+if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA)) {
 return 1;
 }
+if (size == 0) {
+if (!arm_dc_feature(s, ARM_FEATURE_V8_FP16)) {
+return 1;
+}
+/* For fp16, rm is just Vm, and index is M.  */
+rm = extract32(insn, 0, 4);
+index = extract32(insn, 5, 1);
+} else {
+/* For fp32, rm is the usual M:Vm, and index is 0.  */
+VFP_DREG_M(rm, insn);
+index = 0;
+}
+data = (index << 2) | rot;
+fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx
+   : gen_helper_gvec_fcmlah_idx);
 } else {
 return 1;
 }
@@ -7864,9 +7880,7 @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn)
 tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd),
vfp_reg_offset(1, rn),
vfp_reg_offset(1, rm), fpst,
-   opr_sz, opr_sz, rot,
-   size ? gen_helper_gvec_fcmlas_idx
-   : gen_helper_gvec_fcmlah_idx);
+   opr_sz, opr_sz, data, fn_gvec_ptr);
 tcg_temp_free_ptr(fpst);
 return 0;
 }
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 073e5c58e7..8f2dc4b989 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -317,10 +317,11 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
 float_status *fpst = vfpst;
 intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
 uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
+intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
 uint32_t neg_real = flip ^ neg_imag;
 uintptr_t i;
-float16 e1 = m[H2(flip)];
-float16 e3 = m[H2(1 - flip)];
+float16 e1 = 

[Qemu-devel] [PATCH v6 20/35] target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.h|  8 +++
 target/arm/translate-sve.c | 47 ++
 target/arm/vec_helper.c| 20 
 target/arm/sve.decode  |  5 
 4 files changed, 80 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 56439ac1e4..ad9cb6c7d5 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -601,6 +601,14 @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 3b009193a9..1dcc2d38c9 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3507,6 +3507,53 @@ DO_VPZ(FMAXNMV, fmaxnmv)
 DO_VPZ(FMINV, fminv)
 DO_VPZ(FMAXV, fmaxv)
 
+/*
+ *** SVE Floating Point Unary Operations - Unpredicated Group
+ */
+
+static void do_zz_fp(DisasContext *s, arg_rr_esz *a, gen_helper_gvec_2_ptr *fn)
+{
+unsigned vsz = vec_full_reg_size(s);
+TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+
+tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd),
+   vec_full_reg_offset(s, a->rn),
+   status, vsz, vsz, 0, fn);
+tcg_temp_free_ptr(status);
+}
+
+static bool trans_FRECPE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+static gen_helper_gvec_2_ptr * const fns[3] = {
+gen_helper_gvec_frecpe_h,
+gen_helper_gvec_frecpe_s,
+gen_helper_gvec_frecpe_d,
+};
+if (a->esz == 0) {
+return false;
+}
+if (sve_access_check(s)) {
+do_zz_fp(s, a, fns[a->esz - 1]);
+}
+return true;
+}
+
+static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn)
+{
+static gen_helper_gvec_2_ptr * const fns[3] = {
+gen_helper_gvec_frsqrte_h,
+gen_helper_gvec_frsqrte_s,
+gen_helper_gvec_frsqrte_d,
+};
+if (a->esz == 0) {
+return false;
+}
+if (sve_access_check(s)) {
+do_zz_fp(s, a, fns[a->esz - 1]);
+}
+return true;
+}
+
 /*
  *** SVE Floating Point Accumulating Reduction Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 97af75a61b..073e5c58e7 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -427,6 +427,26 @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm,
 clear_tail(d, opr_sz, simd_maxsz(desc));
 }
 
+#define DO_2OP(NAME, FUNC, TYPE) \
+void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc)  \
+{ \
+intptr_t i, oprsz = simd_oprsz(desc); \
+TYPE *d = vd, *n = vn;\
+for (i = 0; i < oprsz / sizeof(TYPE); i++) {  \
+d[i] = FUNC(n[i], stat);  \
+} \
+}
+
+DO_2OP(gvec_frecpe_h, helper_recpe_f16, float16)
+DO_2OP(gvec_frecpe_s, helper_recpe_f32, float32)
+DO_2OP(gvec_frecpe_d, helper_recpe_f64, float64)
+
+DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16)
+DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32)
+DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64)
+
+#undef DO_2OP
+
 /* Floating-point trigonometric starting value.
  * See the ARM ARM pseudocode function FPTrigSMul.
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 66b0fd0cc4..ca93bdb2b3 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -743,6 +743,11 @@ FMINNMV 01100101 .. 000 101 001 ... . . @rd_pg_rn
 FMAXV   01100101 .. 000 110 001 ... . . @rd_pg_rn
 FMINV   01100101 .. 000 111 001 ... . . @rd_pg_rn
 
+## SVE Floating Point Unary Operations - Unpredicated Group
+
+FRECPE  01100101 .. 001 110 001100 . .  @rd_rn
+FRSQRTE 01100101 .. 001 111 001100 . .  @rd_rn
+
 ### SVE FP Accumulating Reduction Group
 
 # SVE floating-point serial reduction (predicated)
-- 
2.17.1

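Note on the DO_2OP expander in the patch above: a minimal standalone sketch of the same loop shape, assuming nothing from QEMU. The names demo_status, demo_recip_f32 and DEMO_2OP are hypothetical, and the per-element function is an exact reciprocal rather than the architectural FRECPE estimate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for float_status; it only counts calls here. */
typedef struct {
    unsigned calls;
} demo_status;

/* Hypothetical per-element operation: an exact reciprocal, not the
 * estimate that FRECPE architecturally produces. */
static float demo_recip_f32(float x, demo_status *st)
{
    st->calls++;
    return 1.0f / x;
}

/* Same shape as DO_2OP: oprsz is a byte count, so the element count is
 * oprsz / sizeof(TYPE).  Every element is processed because the
 * instruction group is unpredicated. */
#define DEMO_2OP(NAME, FUNC, TYPE)                                   \
static void NAME(void *vd, const void *vn, demo_status *st,         \
                 size_t oprsz)                                       \
{                                                                    \
    TYPE *d = vd;                                                    \
    const TYPE *n = vn;                                              \
    size_t i;                                                        \
    assert(oprsz % sizeof(TYPE) == 0);                               \
    for (i = 0; i < oprsz / sizeof(TYPE); i++) {                     \
        d[i] = FUNC(n[i], st);                                       \
    }                                                                \
}

DEMO_2OP(demo_frecpe_s, demo_recip_f32, float)
```

One macro instantiation per element size keeps the loop body identical across the _h, _s and _d helpers, which is the same choice the patch makes.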



[Qemu-devel] [PATCH v6 23/35] target/arm: Implement SVE floating-point convert precision

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 

---
v6: Squish fz16 a-la vfp_fcvt_f16_to_f32
---
 target/arm/helper-sve.h| 13 +
 target/arm/sve_helper.c| 55 ++
 target/arm/translate-sve.c | 30 +
 target/arm/sve.decode  |  8 ++
 4 files changed, 106 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index aca137fc37..4c379dbb05 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -942,6 +942,19 @@ DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, i64, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_fcvt_sh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvt_dh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvt_hs, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvt_ds, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 79358c804b..4b36c1eecf 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3147,6 +3147,61 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
 } while (i != 0); \
 }
 
+/* SVE fp16 conversions always use IEEE mode.  Like AdvSIMD, they ignore
+ * FZ16.  When converting from fp16, this affects flushing input denormals;
+ * when converting to fp16, this affects flushing output denormals.
+ */
+static inline float32 sve_f16_to_f32(float16 f, float_status *fpst)
+{
+flag save = get_flush_inputs_to_zero(fpst);
+float32 ret;
+
+set_flush_inputs_to_zero(false, fpst);
+ret = float16_to_float32(f, true, fpst);
+set_flush_inputs_to_zero(save, fpst);
+return ret;
+}
+
+static inline float64 sve_f16_to_f64(float16 f, float_status *fpst)
+{
+flag save = get_flush_inputs_to_zero(fpst);
+float64 ret;
+
+set_flush_inputs_to_zero(false, fpst);
+ret = float16_to_float64(f, true, fpst);
+set_flush_inputs_to_zero(save, fpst);
+return ret;
+}
+
+static inline float16 sve_f32_to_f16(float32 f, float_status *fpst)
+{
+flag save = get_flush_to_zero(fpst);
+float16 ret;
+
+set_flush_to_zero(false, fpst);
+ret = float32_to_float16(f, true, fpst);
+set_flush_to_zero(save, fpst);
+return ret;
+}
+
+static inline float16 sve_f64_to_f16(float64 f, float_status *fpst)
+{
+flag save = get_flush_to_zero(fpst);
+float16 ret;
+
+set_flush_to_zero(false, fpst);
+ret = float64_to_float16(f, true, fpst);
+set_flush_to_zero(save, fpst);
+return ret;
+}
+
+DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16)
+DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32)
+DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16)
+DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64)
+DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32)
+DO_ZPZ_FP(sve_fcvt_sd, uint64_t, , float32_to_float64)
+
 DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
 DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
 DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index a86ebc0a91..37ad1c9459 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3940,6 +3940,36 @@ static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg,
 return true;
 }
 
+static bool trans_FCVT_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_sh);
+}
+
+static bool trans_FCVT_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hs);
+}
+
+static bool trans_FCVT_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_dh);
+}
+
+static bool trans_FCVT_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hd);
+}
+
+static bool trans_FCVT_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_ds);
+}
+
+static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd);
+}
+
 static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, 

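Note on the sve_f32_to_f16 style wrappers in the patch above: a minimal sketch of the save / clear / convert / restore pattern, assuming a hypothetical demo_status struct and a hypothetical flush threshold in place of softfloat's float_status and real fp16 denormal handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for softfloat's float_status. */
typedef struct {
    bool flush_to_zero;
} demo_status;

/* Hypothetical threshold below which a result counts as "denormal". */
#define DEMO_MIN_NORMAL (1.0f / 1024.0f)

/* Hypothetical narrowing conversion: tiny results are flushed to zero
 * only when the status flag asks for it. */
static float demo_narrow(float x, demo_status *st)
{
    float mag = x < 0 ? -x : x;

    if (st->flush_to_zero && mag != 0.0f && mag < DEMO_MIN_NORMAL) {
        return 0.0f;
    }
    return x;
}

/* Same shape as the wrappers in the patch: remember the caller's flag,
 * force IEEE behaviour for this one conversion, then put the flag back. */
static float demo_narrow_ieee(float x, demo_status *st)
{
    bool save = st->flush_to_zero;
    float ret;

    st->flush_to_zero = false;
    ret = demo_narrow(x, st);
    st->flush_to_zero = save;
    return ret;
}
```

Restoring the saved value, rather than unconditionally setting the flag afterwards, leaves the shared status unchanged for whatever operation runs next.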
[Qemu-devel] [PATCH v6 27/35] target/arm: Implement SVE MOVPRFX

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 

---
v6: Fix comment typos
---
 target/arm/translate-sve.c | 60 +-
 target/arm/sve.decode  |  7 +
 2 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index ff8ae67e2b..4883de3fab 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -351,6 +351,23 @@ static bool do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn)
 return true;
 }
 
+/* Select active elements from Zn and inactive elements from Zm,
+ * storing the result in Zd.
+ */
+static void do_sel_z(DisasContext *s, int rd, int rn, int rm, int pg, int esz)
+{
+static gen_helper_gvec_4 * const fns[4] = {
+gen_helper_sve_sel_zpzz_b, gen_helper_sve_sel_zpzz_h,
+gen_helper_sve_sel_zpzz_s, gen_helper_sve_sel_zpzz_d
+};
+unsigned vsz = vec_full_reg_size(s);
+tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd),
+   vec_full_reg_offset(s, rn),
+   vec_full_reg_offset(s, rm),
+   pred_full_reg_offset(s, pg),
+   vsz, vsz, 0, fns[esz]);
+}
+
 #define DO_ZPZZ(NAME, name) \
 static bool trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, \
 uint32_t insn)\
@@ -401,7 +418,13 @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
 return do_zpzz_ool(s, a, fns[a->esz]);
 }
 
-DO_ZPZZ(SEL, sel)
+static bool trans_SEL_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+if (sve_access_check(s)) {
+do_sel_z(s, a->rd, a->rn, a->rm, a->pg, a->esz);
+}
+return true;
+}
 
 #undef DO_ZPZZ
 
@@ -5035,3 +5058,38 @@ static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn)
 sve_access_check(s);
 return true;
 }
+
+/*
+ * Move Prefix
+ *
+ * TODO: The implementation so far could handle predicated merging movprfx.
+ * The helper functions as written take an extra source register to
+ * use in the operation, but the result is only written when predication
+ * succeeds.  For unpredicated movprfx, we need to rearrange the helpers
+ * to allow the final write back to the destination to be unconditional.
+ * For predicated zeroing movprfx, we need to rearrange the helpers to
+ * allow the final write back to zero inactives.
+ *
+ * In the meantime, just emit the moves.
+ */
+
+static bool trans_MOVPRFX(DisasContext *s, arg_MOVPRFX *a, uint32_t insn)
+{
+return do_mov_z(s, a->rd, a->rn);
+}
+
+static bool trans_MOVPRFX_m(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+if (sve_access_check(s)) {
+do_sel_z(s, a->rd, a->rn, a->rd, a->pg, a->esz);
+}
+return true;
+}
+
+static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+if (sve_access_check(s)) {
+do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz);
+}
+return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 2aca9f0bb0..c725ee2584 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -270,6 +270,10 @@ ORV 0100 .. 011 000 001 ... . . @rd_pg_rn
 EORV0100 .. 011 001 001 ... . . @rd_pg_rn
 ANDV0100 .. 011 010 001 ... . . @rd_pg_rn
 
+# SVE constructive prefix (predicated)
+MOVPRFX_z   0100 .. 010 000 001 ... . . @rd_pg_rn
+MOVPRFX_m   0100 .. 010 001 001 ... . . @rd_pg_rn
+
 # SVE integer add reduction (predicated)
 # Note that saddv requires size != 3.
 UADDV   0100 .. 000 001 001 ... . . @rd_pg_rn
@@ -418,6 +422,9 @@ ADR_p64 0100 11 1 . 1010 .. . . @rd_rn_msz_rm
 
 ### SVE Integer Misc - Unpredicated Group
 
+# SVE constructive prefix (unpredicated)
+MOVPRFX 0100 00 1 0 10 rn:5 rd:5
+
 # SVE floating-point exponential accelerator
 # Note esz != 0
 FEXPA   0100 .. 1 0 101110 . .  @rd_rn
-- 
2.17.1

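Note on do_sel_z and the two predicated MOVPRFX forms in the patch above: a minimal sketch of the element-level behaviour, assuming 32-bit elements and a predicate with one bit per vector byte (an element is active when the bit for its lowest byte is set). All demo_* names are hypothetical and nothing here is QEMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed predicate layout: one bit per vector byte. */
static int demo_active(const uint64_t *pg, size_t byte_index)
{
    return (pg[byte_index / 64] >> (byte_index % 64)) & 1;
}

/* Zd[i] = active ? Zn[i] : Zm[i], for 32-bit elements. */
static void demo_sel_s(uint32_t *zd, const uint32_t *zn,
                       const uint32_t *zm, const uint64_t *pg,
                       size_t elems)
{
    size_t i;

    for (i = 0; i < elems; i++) {
        zd[i] = demo_active(pg, i * sizeof(uint32_t)) ? zn[i] : zm[i];
    }
}

/* Merging prefix: inactive elements keep the old Zd, so Zd doubles as
 * the "inactive" source, as in trans_MOVPRFX_m. */
static void demo_movprfx_m(uint32_t *zd, const uint32_t *zn,
                           const uint64_t *pg, size_t elems)
{
    demo_sel_s(zd, zn, zd, pg, elems);
}

/* Zeroing prefix: inactive elements become zero. */
static void demo_movprfx_z(uint32_t *zd, const uint32_t *zn,
                           const uint64_t *pg, size_t elems)
{
    size_t i;

    for (i = 0; i < elems; i++) {
        zd[i] = demo_active(pg, i * sizeof(uint32_t)) ? zn[i] : 0;
    }
}
```

Passing Zd as the third operand is what lets the merging form reuse the select helper without a dedicated implementation.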



[Qemu-devel] [PATCH v6 16/35] target/arm: Implement SVE floating-point compare vectors

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 49 ++
 target/arm/sve_helper.c| 62 ++
 target/arm/translate-sve.c | 40 
 target/arm/sve.decode  | 11 +++
 4 files changed, 162 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 55e8a908d4..6089b3a53f 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -839,6 +839,55 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 81fc968087..41d8ce6b54 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3193,6 +3193,68 @@ void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc)
 do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN);
 }
 
+/* Two operand floating-point comparison controlled by a predicate.
+ * Unlike the integer version, we are not allowed to optimistically
+ * compare operands, since the comparison may have side effects wrt
+ * the FPSR.
+ */
+#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP)\
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg,   \
+  void *status, uint32_t desc)  \
+{   \
+intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6;\
+uint64_t *d = vd, *g = vg;  \
+do {\
+uint64_t out = 0, pg = g[j];\
+do {\
+i -= sizeof(TYPE), out <<= sizeof(TYPE);\
+if (likely((pg >> (i & 63)) & 1)) { \
+TYPE nn = *(TYPE *)(vn + H(i)); \
+TYPE mm = *(TYPE *)(vm + H(i)); \
+out |= OP(TYPE, nn, mm, status);\
+}   \
+} while (i & 63);   \
+d[j--] = out;   \
+} while (i > 0);  

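Note on DO_FPCMP_PPZZ in the patch above: a minimal sketch of the same backwards walk over the vector, assuming 32-bit float elements, host-order element layout (no H() swizzle) and a hypothetical demo_status that only counts comparisons. It is meant to show why inactive elements are never compared and why the result is shifted by sizeof(TYPE) per element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for float_status: counts comparisons so that a
 * caller can see that inactive elements are skipped. */
typedef struct {
    unsigned compares;
} demo_status;

static uint64_t demo_cmp_ge(float n, float m, demo_status *st)
{
    st->compares++;
    return n >= m;
}

/* oprsz is the vector size in bytes.  The predicate has one bit per
 * vector byte, so each element's result lands sizeof(float) bits away
 * from its neighbour's. */
static void demo_fcmge_s(uint64_t *d, const float *n, const float *m,
                         const uint64_t *g, demo_status *st, size_t oprsz)
{
    size_t i = oprsz, j = (i - 1) >> 6;

    assert(oprsz > 0 && oprsz % sizeof(float) == 0);
    do {
        uint64_t out = 0, pg = g[j];
        do {
            i -= sizeof(float);
            out <<= sizeof(float);
            if ((pg >> (i & 63)) & 1) {
                size_t e = i / sizeof(float);
                out |= demo_cmp_ge(n[e], m[e], st);
            }
        } while (i & 63);
        d[j--] = out;   /* j may wrap after word 0; it is not used again */
    } while (i > 0);
}
```

Walking from the top byte down means each 64-bit predicate word is assembled with plain shifts and written once.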
[Qemu-devel] [PATCH v6 17/35] target/arm: Implement SVE floating-point arithmetic with immediate

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 56 
 target/arm/sve_helper.c| 69 +++
 target/arm/translate-sve.c | 75 ++
 target/arm/sve.decode  | 14 +++
 4 files changed, 214 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 6089b3a53f..087819ec2b 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -809,6 +809,62 @@ DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, i64, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 41d8ce6b54..bc23c66221 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2997,6 +2997,75 @@ DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd)
 
 #undef DO_ZPZZ_FP
 
+/* Three-operand expander, with one scalar operand, controlled by
+ * a predicate, with the extra float_status parameter.
+ */
+#define DO_ZPZS_FP(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar,  \
+  void *status, uint32_t desc)\
+{ \
+intptr_t i = simd_oprsz(desc);\
+uint64_t *g = vg; \
+TYPE mm = scalar; \
+do {  \
+uint64_t pg = g[(i - 1) >> 6];\
+do {  \
+i -= sizeof(TYPE);\
+if (likely((pg >> (i & 63)) & 1)) {   \
+TYPE nn = *(TYPE *)(vn + H(i));   \
+*(TYPE *)(vd + H(i)) = OP(nn, mm, status);\
+} \
+} while (i & 63); \
+} while (i != 0); \
+}
+
+DO_ZPZS_FP(sve_fadds_h, float16, H1_2, 

[Qemu-devel] [PATCH v6 25/35] target/arm: Implement SVE floating-point round to integral value

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 14 +++
 target/arm/sve_helper.c|  8 
 target/arm/translate-sve.c | 77 ++
 target/arm/sve.decode  |  9 +
 4 files changed, 108 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 37fa9eb9bb..36168c5bb2 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -985,6 +985,20 @@ DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_frint_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frint_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frint_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_frintx_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index b6421ec19c..af8221c714 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3290,6 +3290,14 @@ DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t, , vfp_float32_to_uint64_rtz)
 DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t, , helper_vfp_touizd)
 DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t, , vfp_float64_to_uint64_rtz)
 
+DO_ZPZ_FP(sve_frint_h, uint16_t, H1_2, helper_advsimd_rinth)
+DO_ZPZ_FP(sve_frint_s, uint32_t, H1_4, helper_rints)
+DO_ZPZ_FP(sve_frint_d, uint64_t, , helper_rintd)
+
+DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
+DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
+DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int)
+
 DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
 DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
 DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index be589a1cf2..270bf9101b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4040,6 +4040,83 @@ static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd);
 }
 
+static gen_helper_gvec_3_ptr * const frint_fns[3] = {
+gen_helper_sve_frint_h,
+gen_helper_sve_frint_s,
+gen_helper_sve_frint_d
+};
+
+static bool trans_FRINTI(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+if (a->esz == 0) {
+return false;
+}
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16,
+  frint_fns[a->esz - 1]);
+}
+
+static bool trans_FRINTX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+static gen_helper_gvec_3_ptr * const fns[3] = {
+gen_helper_sve_frintx_h,
+gen_helper_sve_frintx_s,
+gen_helper_sve_frintx_d
+};
+if (a->esz == 0) {
+return false;
+}
+return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+}
+
+static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode)
+{
+if (a->esz == 0) {
+return false;
+}
+if (sve_access_check(s)) {
+unsigned vsz = vec_full_reg_size(s);
+TCGv_i32 tmode = tcg_const_i32(mode);
+TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+
+gen_helper_set_rmode(tmode, tmode, status);
+
+tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+   vec_full_reg_offset(s, a->rn),
+   pred_full_reg_offset(s, a->pg),
+   status, vsz, vsz, 0, frint_fns[a->esz - 1]);
+
+gen_helper_set_rmode(tmode, tmode, status);
+tcg_temp_free_i32(tmode);
+tcg_temp_free_ptr(status);
+}
+return true;
+}
+
+static bool trans_FRINTN(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_frint_mode(s, a, float_round_nearest_even);
+}
+
+static bool trans_FRINTP(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_frint_mode(s, a, float_round_up);
+}
+
+static bool trans_FRINTM(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_frint_mode(s, a, float_round_down);
+}
+
+static bool trans_FRINTZ(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_frint_mode(s, a, float_round_to_zero);
+}
+
+static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+return do_frint_mode(s, a, float_round_ties_away);
+}
+
 static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 {
 return 

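Note on do_frint_mode in the patch above: a minimal sketch of the "swap in a rounding mode, run the vector op, swap the old mode back" pattern. The demo_* names are hypothetical; demo_set_rmode is assumed to behave like a helper that installs the new mode and returns the previous one. Predication is left out and the rounding routine only handles values that fit in a long.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    DEMO_ROUND_NEAREST_EVEN,
    DEMO_ROUND_UP,
    DEMO_ROUND_DOWN,
    DEMO_ROUND_TO_ZERO
} demo_rmode;

/* Hypothetical stand-in for float_status. */
typedef struct {
    demo_rmode rmode;
} demo_status;

/* Install a new rounding mode and hand back the old one. */
static demo_rmode demo_set_rmode(demo_rmode new_mode, demo_status *st)
{
    demo_rmode old = st->rmode;
    st->rmode = new_mode;
    return old;
}

/* Round to an integral value using the mode held in the status. */
static double demo_rint(double x, demo_status *st)
{
    long t = (long)x;               /* truncates toward zero */
    double td = (double)t;
    double diff = x - td;           /* in (-1, 1), same sign as x */
    double mag = diff < 0 ? -diff : diff;

    switch (st->rmode) {
    case DEMO_ROUND_UP:
        return td < x ? td + 1 : td;
    case DEMO_ROUND_DOWN:
        return td > x ? td - 1 : td;
    case DEMO_ROUND_TO_ZERO:
        return td;
    default:                        /* nearest, ties to even */
        if (mag > 0.5 || (mag == 0.5 && (t & 1))) {
            return diff < 0 ? td - 1 : td + 1;
        }
        return td;
    }
}

/* Same shape as do_frint_mode: the mode override brackets the whole
 * vector operation and the caller's mode is restored afterwards. */
static void demo_frint_mode(double *d, const double *n, size_t elems,
                            demo_rmode mode, demo_status *st)
{
    demo_rmode saved = demo_set_rmode(mode, st);
    size_t i;

    for (i = 0; i < elems; i++) {
        d[i] = demo_rint(n[i], st);
    }
    demo_set_rmode(saved, st);
}
```

Bracketing the whole vector operation, instead of passing the mode per element, lets one set of helpers serve FRINTI and all the fixed-mode variants.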
[Qemu-devel] [PATCH v6 22/35] target/arm: Implement SVE floating-point trig multiply-add coefficient

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|  4 +++
 target/arm/sve_helper.c| 70 ++
 target/arm/translate-sve.c | 27 +++
 target/arm/sve.decode  |  3 ++
 4 files changed, 104 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 44a98440c9..aca137fc37 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1037,6 +1037,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 0486cb1e5e..79358c804b 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3428,6 +3428,76 @@ DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT)
 DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ)
 DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE)
 
+/* FP Trig Multiply-Add. */
+
+void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
+{
+static const float16 coeff[16] = {
+0x3c00, 0xb155, 0x2030, 0x, 0x, 0x, 0x, 0x,
+0x3c00, 0xb800, 0x293a, 0x, 0x, 0x, 0x, 0x,
+};
+intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float16);
+intptr_t x = simd_data(desc);
+float16 *d = vd, *n = vn, *m = vm;
+for (i = 0; i < opr_sz; i++) {
+float16 mm = m[i];
+intptr_t xx = x;
+if (float16_is_neg(mm)) {
+mm = float16_abs(mm);
+xx += 8;
+}
+d[i] = float16_muladd(n[i], mm, coeff[xx], 0, vs);
+}
+}
+
+void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
+{
+static const float32 coeff[16] = {
+0x3f80, 0xbe2b, 0x3c06, 0xb95008b9,
+0x36369d6d, 0x, 0x, 0x,
+0x3f80, 0xbf00, 0x3d26, 0xbab60705,
+0x37cd37cc, 0x, 0x, 0x,
+};
+intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float32);
+intptr_t x = simd_data(desc);
+float32 *d = vd, *n = vn, *m = vm;
+for (i = 0; i < opr_sz; i++) {
+float32 mm = m[i];
+intptr_t xx = x;
+if (float32_is_neg(mm)) {
+mm = float32_abs(mm);
+xx += 8;
+}
+d[i] = float32_muladd(n[i], mm, coeff[xx], 0, vs);
+}
+}
+
+void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc)
+{
+static const float64 coeff[16] = {
+0x3ff0ull, 0xbfc55543ull,
+0x3f80f30cull, 0xbf2a01a019b92fc6ull,
+0x3ec71de351f3d22bull, 0xbe5ae5e2b60f7b91ull,
+0x3de5d8408868552full, 0xull,
+0x3ff0ull, 0xbfe0ull,
+0x3fa55536ull, 0xbf56c16c16c13a0bull,
+0x3efa01a019b1e8d8ull, 0xbe927e4f7282f468ull,
+0x3e21ee96d2641b13ull, 0xbda8f76380fbb401ull,
+};
+intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float64);
+intptr_t x = simd_data(desc);
+float64 *d = vd, *n = vn, *m = vm;
+for (i = 0; i < opr_sz; i++) {
+float64 mm = m[i];
+intptr_t xx = x;
+if (float64_is_neg(mm)) {
+mm = float64_abs(mm);
+xx += 8;
+}
+d[i] = float64_muladd(n[i], mm, coeff[xx], 0, vs);
+}
+}
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index cfee256be9..a86ebc0a91 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3597,6 +3597,33 @@ DO_PPZ(FCMNE_ppz0, fcmne0)
 
 #undef DO_PPZ
 
+/*
+ *** SVE floating-point trig multiply-add coefficient
+ */
+
+static bool trans_FTMAD(DisasContext *s, arg_FTMAD *a, uint32_t insn)
+{
+static gen_helper_gvec_3_ptr * const fns[3] = {
+gen_helper_sve_ftmad_h,
+gen_helper_sve_ftmad_s,
+gen_helper_sve_ftmad_d,
+};
+
+if (a->esz == 0) {
+return false;
+}
+if (sve_access_check(s)) {
+unsigned vsz = vec_full_reg_size(s);
+TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+   vec_full_reg_offset(s, a->rn),
+   vec_full_reg_offset(s, a->rm),
+   status, vsz, vsz, a->imm, fns[a->esz - 1]);
+

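Note on the sve_ftmad_* helpers in the patch above: a minimal sketch of the coefficient selection only. The table contents are supplied by the caller and are placeholders, not the architectural constants; demo_is_neg and demo_ftmad_s are hypothetical names, and the multiply-add here is an ordinary unfused expression where the helper uses a fused muladd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Test the sign bit directly so that -0.0 also counts as negative. */
static int demo_is_neg(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof(bits));
    return bits >> 31;
}

/* d[i] = n[i] * |m[i]| + coeff[imm + (m[i] negative ? 8 : 0)].
 * The immediate picks one of eight coefficients; the sign of the second
 * operand picks which half of the 16-entry table is used. */
static void demo_ftmad_s(float *d, const float *n, const float *m,
                         const float coeff[16], unsigned imm, size_t elems)
{
    size_t i;

    assert(imm < 8);
    for (i = 0; i < elems; i++) {
        float mm = m[i];
        unsigned xx = imm;

        if (demo_is_neg(mm)) {
            mm = -mm;
            xx += 8;
        }
        d[i] = n[i] * mm + coeff[xx];
    }
}
```

Folding the sign into the table index keeps the loop free of a second multiply path.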
[Qemu-devel] [PATCH v6 13/35] target/arm: Implement SVE gather loads

2018-06-26 Thread Richard Henderson
Signed-off-by: Richard Henderson 

---
v6:
  * Finish esz == msz && u==1 decode in sve.decode.
  * Remove duplicate decode in trans_ST1_zprz.
  * Add xs=2 comment.
  * Reformat tables to leave room for ff helpers.
---
 target/arm/helper-sve.h|  67 +
 target/arm/sve_helper.c|  77 
 target/arm/translate-sve.c | 100 +
 target/arm/sve.decode  |  57 +
 4 files changed, 301 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 8880128f9c..aeb62afc34 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -959,6 +959,73 @@ DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
+DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
 DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 7622bb2af0..24f75a32d3 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3714,6 +3714,83 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
 }
 }
 
+/* Loads with a vector index.  */
+
+#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN)\
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm,   \
+  target_ulong base, uint32_t desc) \
+{   \
+intptr_t i, oprsz = simd_oprsz(desc);   \
+unsigned scale = simd_data(desc);

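Note on "loads with a vector index" in the patch above: the macro body is cut off in this archive, so the following is only a sketch of what a gather load does in general, under stated assumptions. It assumes 32-bit elements, unsigned 32-bit offsets, byte-sized memory accesses, a flat byte array in place of guest memory, and that inactive elements are written as zero. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed predicate layout: one bit per vector byte. */
static int demo_active(const uint64_t *pg, size_t byte_index)
{
    return (pg[byte_index / 64] >> (byte_index % 64)) & 1;
}

/* For each active element: address = base + (offset << scale), then load
 * one byte and zero-extend it.  Inactive elements are never accessed. */
static void demo_gather_ldb_s(uint32_t *zd, const uint64_t *pg,
                              const uint32_t *zm, const uint8_t *mem,
                              size_t mem_size, uint64_t base,
                              unsigned scale, size_t elems)
{
    size_t i;

    for (i = 0; i < elems; i++) {
        uint32_t val = 0;

        if (demo_active(pg, i * sizeof(uint32_t))) {
            uint64_t addr = base + ((uint64_t)zm[i] << scale);

            assert(addr < mem_size);
            val = mem[addr];
        }
        zd[i] = val;
    }
}
```

Checking the predicate before forming the address matters: an inactive element may hold an offset that points nowhere valid.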
[Qemu-devel] [PATCH v6 14/35] target/arm: Implement SVE first-fault gather loads

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 67 +
 target/arm/sve_helper.c| 88 ++
 target/arm/translate-sve.c | 40 -
 3 files changed, 193 insertions(+), 2 deletions(-)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index aeb62afc34..55e8a908d4 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1026,6 +1026,73 @@ DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG,
 DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 
+DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhsu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffssu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbss_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldffbsu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhsu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffssu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbss_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhss_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffddu_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbds_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffddu_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbds_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldffbdu_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhdu_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsdu_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffddu_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffbds_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffhds_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
 DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 24f75a32d3..81fc968087 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3791,6 +3791,94 @@ DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t,   cpu_ldub_data_ra)
 DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t,  cpu_lduw_data_ra)
 DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t,  cpu_ldl_data_ra)
 
+/* First fault loads with a vector index.  */
+
+#ifdef CONFIG_USER_ONLY
+
+#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H)  \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm,   \
+  target_ulong base, uint32_t desc) \
+{   \
+intptr_t i, oprsz = simd_oprsz(desc);   \
+unsigned 

[Qemu-devel] [PATCH v6 18/35] target/arm: Implement SVE Floating Point Multiply Indexed Group

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper.h| 14 +++
 target/arm/translate-sve.c | 50 ++
 target/arm/vec_helper.c| 48 
 target/arm/sve.decode  | 19 +++
 4 files changed, 131 insertions(+)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 879a7229e9..56439ac1e4 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -620,6 +620,20 @@ DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 499252deff..b60d47af2c 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3400,6 +3400,56 @@ DO_ZZI(UMIN, umin)
 
 #undef DO_ZZI
 
+/*
+ *** SVE Floating Point Multiply-Add Indexed Group
+ */
+
+static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a, uint32_t insn)
+{
+static gen_helper_gvec_4_ptr * const fns[3] = {
+gen_helper_gvec_fmla_idx_h,
+gen_helper_gvec_fmla_idx_s,
+gen_helper_gvec_fmla_idx_d,
+};
+
+if (sve_access_check(s)) {
+unsigned vsz = vec_full_reg_size(s);
+TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd),
+   vec_full_reg_offset(s, a->rn),
+   vec_full_reg_offset(s, a->rm),
+   vec_full_reg_offset(s, a->ra),
+   status, vsz, vsz, (a->index << 1) | a->sub,
+   fns[a->esz - 1]);
+tcg_temp_free_ptr(status);
+}
+return true;
+}
+
+/*
+ *** SVE Floating Point Multiply Indexed Group
+ */
+
+static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn)
+{
+static gen_helper_gvec_3_ptr * const fns[3] = {
+gen_helper_gvec_fmul_idx_h,
+gen_helper_gvec_fmul_idx_s,
+gen_helper_gvec_fmul_idx_d,
+};
+
+if (sve_access_check(s)) {
+unsigned vsz = vec_full_reg_size(s);
+TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+   vec_full_reg_offset(s, a->rn),
+   vec_full_reg_offset(s, a->rm),
+   status, vsz, vsz, a->index, fns[a->esz - 1]);
+tcg_temp_free_ptr(status);
+}
+return true;
+}
+
 /*
  *** SVE Floating Point Accumulating Reduction Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index f504dd53c8..97af75a61b 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -495,3 +495,51 @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64)
 
 #endif
 #undef DO_3OP
+
+/* For the indexed ops, SVE applies the index per 128-bit vector segment.
+ * For AdvSIMD, there is of course only one such vector segment.
+ */
+
+#define DO_MUL_IDX(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+{  \
+intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE);  \
+intptr_t idx = simd_data(desc);\
+TYPE *d = vd, *n = vn, *m = vm;\
+for (i = 0; i < oprsz / sizeof(TYPE); i += segment) {  \
+TYPE mm = m[H(i + idx)];   \
+for (j = 0; j < segment; j++) {\
+d[i + j] = TYPE##_mul(n[i + j], mm, stat); \
+}  \
+}  \
+}
+
+DO_MUL_IDX(gvec_fmul_idx_h, float16, H2)
+DO_MUL_IDX(gvec_fmul_idx_s, float32, H4)
+DO_MUL_IDX(gvec_fmul_idx_d, float64, )
+
+#undef DO_MUL_IDX
+
+#define DO_FMLA_IDX(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,  \
+  void *stat, uint32_t desc)
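
The DO_MUL_IDX comment above says the index is applied per 128-bit segment. A minimal host-side sketch of that behaviour, not part of the patch: it assumes plain C float in place of softfloat float32, drops the float_status argument and the H4() host-endian fixup, and uses a hypothetical name (mul_idx_f32) that does not exist in QEMU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gvec_fmul_idx_s (illustration only).
 * oprsz is the vector size in bytes and must be a multiple of 16;
 * idx selects one element within each 128-bit segment of m.
 */
void mul_idx_f32(float *d, const float *n, const float *m,
                 size_t oprsz, size_t idx)
{
    const size_t segment = 16 / sizeof(float);  /* 4 elements per segment */
    size_t i, j;

    assert(oprsz % 16 == 0 && idx < segment);
    for (i = 0; i < oprsz / sizeof(float); i += segment) {
        float mm = m[i + idx];      /* multiplier is re-read per segment */
        for (j = 0; j < segment; j++) {
            d[i + j] = n[i + j] * mm;
        }
    }
}
```

With a 32-byte vector and idx = 1, elements 0..3 are multiplied by m[1] and elements 4..7 by m[5]. With a 16-byte (AdvSIMD-sized) vector there is only one segment, so the same loop degenerates to a single indexed multiply.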

[Qemu-devel] [PATCH v6 08/35] target/arm: Implement SVE Floating Point Accumulating Reduction Group

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|  7 +
 target/arm/sve_helper.c| 56 ++
 target/arm/translate-sve.c | 45 ++
 target/arm/sve.decode  |  5 
 4 files changed, 113 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index eb0645dd43..68e55a8d03 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -720,6 +720,13 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG,
+   i64, i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG,
+   i64, i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG,
+   i64, i64, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 2f416e5e28..2d08b7dcd3 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2811,6 +2811,62 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
 return predtest_ones(d, oprsz, esz_mask);
 }
 
+uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg,
+ void *status, uint32_t desc)
+{
+intptr_t i = 0, opr_sz = simd_oprsz(desc);
+float16 result = nn;
+
+do {
+uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+do {
+if (pg & 1) {
+float16 mm = *(float16 *)(vm + H1_2(i));
+result = float16_add(result, mm, status);
+}
+i += sizeof(float16), pg >>= sizeof(float16);
+} while (i & 15);
+} while (i < opr_sz);
+
+return result;
+}
+
+uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg,
+ void *status, uint32_t desc)
+{
+intptr_t i = 0, opr_sz = simd_oprsz(desc);
+float32 result = nn;
+
+do {
+uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));
+do {
+if (pg & 1) {
+float32 mm = *(float32 *)(vm + H1_2(i));
+result = float32_add(result, mm, status);
+}
+i += sizeof(float32), pg >>= sizeof(float32);
+} while (i & 15);
+} while (i < opr_sz);
+
+return result;
+}
+
+uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg,
+ void *status, uint32_t desc)
+{
+intptr_t i = 0, opr_sz = simd_oprsz(desc) / 8;
+uint64_t *m = vm;
+uint8_t *pg = vg;
+
+for (i = 0; i < opr_sz; i++) {
+if (pg[H1(i)] & 1) {
+nn = float64_add(nn, m[i], status);
+}
+}
+
+return nn;
+}
+
 /* Fully general three-operand expander, controlled by a predicate,
  * With the extra float_status parameter.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index acad6374ef..483ad33179 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3383,6 +3383,51 @@ DO_ZZI(UMIN, umin)
 
 #undef DO_ZZI
 
+/*
+ *** SVE Floating Point Accumulating Reduction Group
+ */
+
+static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+typedef void fadda_fn(TCGv_i64, TCGv_i64, TCGv_ptr,
+  TCGv_ptr, TCGv_ptr, TCGv_i32);
+static fadda_fn * const fns[3] = {
+gen_helper_sve_fadda_h,
+gen_helper_sve_fadda_s,
+gen_helper_sve_fadda_d,
+};
+unsigned vsz = vec_full_reg_size(s);
+TCGv_ptr t_rm, t_pg, t_fpst;
+TCGv_i64 t_val;
+TCGv_i32 t_desc;
+
+if (a->esz == 0) {
+return false;
+}
+if (!sve_access_check(s)) {
+return true;
+}
+
+t_val = load_esz(cpu_env, vec_reg_offset(s, a->rn, 0, a->esz), a->esz);
+t_rm = tcg_temp_new_ptr();
+t_pg = tcg_temp_new_ptr();
+tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm));
+tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg));
+t_fpst = get_fpstatus_ptr(a->esz == MO_16);
+t_desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+
+fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc);
+
+tcg_temp_free_i32(t_desc);
+tcg_temp_free_ptr(t_fpst);
+tcg_temp_free_ptr(t_pg);
+tcg_temp_free_ptr(t_rm);
+
+write_fp_dreg(s, a->rd, t_val);
+tcg_temp_free_i64(t_val);
+return true;
+}
+
 /*
  *** SVE Floating Point Arithmetic - Unpredicated Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index e8531e28cd..675b81aaa0 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -676,6 +676,11 @@ UMIN_zzi00100101 .. 101 011 110  . 
 @rdn_i8u
 # SVE integer multiply immediate (unpredicated)
 MUL_zzi 
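
The sve_fadda_* helpers above walk the predicate one bit per vector byte, so a 4-byte element is governed by every fourth bit and the predicate is shifted by sizeof(element) per step. A minimal sketch of that loop shape, not part of the patch: it assumes plain C float instead of softfloat, a little-endian host (no H1_2() fixup), and a hypothetical name (fadda_f32).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sve_fadda_s (illustration only).
 * pred holds one bit per vector byte, 16 bits per 16-byte chunk.
 * Active elements are added strictly in element order, which matters
 * because floating-point addition is not associative.
 */
float fadda_f32(float acc, const float *m, const uint16_t *pred,
                size_t oprsz)
{
    size_t i = 0;   /* byte offset into the vector */

    assert(oprsz != 0 && oprsz % 16 == 0);
    do {
        uint16_t pg = pred[i >> 4];     /* predicate bits for this chunk */
        do {
            if (pg & 1) {
                acc += m[i / sizeof(float)];
            }
            i += sizeof(float);
            pg >>= sizeof(float);       /* skip to the next element's bit */
        } while (i & 15);
    } while (i < oprsz);

    return acc;
}
```

Only bit 0 of each 4-bit group is consulted; the other three bits of the group are ignored for 32-bit elements.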

[Qemu-devel] [PATCH v6 12/35] target/arm: Implement SVE prefetches

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 21 +
 target/arm/sve.decode  | 23 +++
 2 files changed, 44 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 65da3e633f..27854e0042 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4303,3 +4303,24 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
cpu_reg_sp(s, a->rn), fn);
 return true;
 }
+
+/*
+ * Prefetches
+ */
+
+static bool trans_PRF(DisasContext *s, arg_PRF *a, uint32_t insn)
+{
+/* Prefetch is a nop within QEMU.  */
+sve_access_check(s);
+return true;
+}
+
+static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn)
+{
+if (a->rm == 31) {
+return false;
+}
+/* Prefetch is a nop within QEMU.  */
+sve_access_check(s);
+return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 7d24c2bdc4..80b955ff84 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -794,6 +794,29 @@ LD1RQ_zprr  1010010 .. 00 . 000 ... . . \
 LD1RQ_zpri  1010010 .. 00 0 001 ... . . \
 @rpri_load_msz nreg=0
 
+# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets)
+PRF 110 00 -1 - 0-- --- - 0 
+
+# SVE 32-bit gather prefetch (vector plus immediate)
+PRF 110 -- 00 - 111 --- - 0 
+
+# SVE contiguous prefetch (scalar plus immediate)
+PRF 110 11 1- - 0-- --- - 0 
+
+# SVE contiguous prefetch (scalar plus scalar)
+PRF_rr  110 -- 00 rm:5 110 --- - 0 
+
+### SVE Memory 64-bit Gather Group
+
+# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
+PRF 1100010 00 11 - 1-- --- - 0 
+
+# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets)
+PRF 1100010 00 -1 - 0-- --- - 0 
+
+# SVE 64-bit gather prefetch (vector plus immediate)
+PRF 1100010 -- 00 - 111 --- - 0 
+
 ### SVE Memory Store Group
 
 # SVE store predicate register
-- 
2.17.1




[Qemu-devel] [PATCH v6 15/35] target/arm: Implement SVE scatter store vector immediate

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 85 ++
 target/arm/sve.decode  | 11 +
 2 files changed, 70 insertions(+), 26 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index ea4407b746..9eb2530d3b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4391,32 +4391,34 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
 return true;
 }
 
+/* Indexed by [xs][msz].  */
+static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = {
+{ gen_helper_sve_stbs_zsu,
+  gen_helper_sve_sths_zsu,
+  gen_helper_sve_stss_zsu, },
+{ gen_helper_sve_stbs_zss,
+  gen_helper_sve_sths_zss,
+  gen_helper_sve_stss_zss, },
+};
+
+/* Note that we overload xs=2 to indicate 64-bit offset.  */
+static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = {
+{ gen_helper_sve_stbd_zsu,
+  gen_helper_sve_sthd_zsu,
+  gen_helper_sve_stsd_zsu,
+  gen_helper_sve_stdd_zsu, },
+{ gen_helper_sve_stbd_zss,
+  gen_helper_sve_sthd_zss,
+  gen_helper_sve_stsd_zss,
+  gen_helper_sve_stdd_zss, },
+{ gen_helper_sve_stbd_zd,
+  gen_helper_sve_sthd_zd,
+  gen_helper_sve_stsd_zd,
+  gen_helper_sve_stdd_zd, },
+};
+
 static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
 {
-/* Indexed by [xs][msz].  */
-static gen_helper_gvec_mem_scatter * const fn32[2][3] = {
-{ gen_helper_sve_stbs_zsu,
-  gen_helper_sve_sths_zsu,
-  gen_helper_sve_stss_zsu, },
-{ gen_helper_sve_stbs_zss,
-  gen_helper_sve_sths_zss,
-  gen_helper_sve_stss_zss, },
-};
-/* Note that we overload xs=2 to indicate 64-bit offset.  */
-static gen_helper_gvec_mem_scatter * const fn64[3][4] = {
-{ gen_helper_sve_stbd_zsu,
-  gen_helper_sve_sthd_zsu,
-  gen_helper_sve_stsd_zsu,
-  gen_helper_sve_stdd_zsu, },
-{ gen_helper_sve_stbd_zss,
-  gen_helper_sve_sthd_zss,
-  gen_helper_sve_stsd_zss,
-  gen_helper_sve_stdd_zss, },
-{ gen_helper_sve_stbd_zd,
-  gen_helper_sve_sthd_zd,
-  gen_helper_sve_stsd_zd,
-  gen_helper_sve_stdd_zd, },
-};
 gen_helper_gvec_mem_scatter *fn;
 
 if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
@@ -4427,10 +4429,10 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
 }
 switch (a->esz) {
 case MO_32:
-fn = fn32[a->xs][a->msz];
+fn = scatter_store_fn32[a->xs][a->msz];
 break;
 case MO_64:
-fn = fn64[a->xs][a->msz];
+fn = scatter_store_fn64[a->xs][a->msz];
 break;
 default:
 g_assert_not_reached();
@@ -4440,6 +4442,37 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
 return true;
 }
 
+static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn)
+{
+gen_helper_gvec_mem_scatter *fn = NULL;
+TCGv_i64 imm;
+
+if (a->esz < a->msz) {
+return false;
+}
+if (!sve_access_check(s)) {
+return true;
+}
+
+switch (a->esz) {
+case MO_32:
+fn = scatter_store_fn32[0][a->msz];
+break;
+case MO_64:
+fn = scatter_store_fn64[2][a->msz];
+break;
+}
+assert(fn != NULL);
+
+/* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x])
+ * by loading the immediate into the scalar parameter.
+ */
+imm = tcg_const_i64(a->imm << a->msz);
+do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
+tcg_temp_free_i64(imm);
+return true;
+}
+
 /*
  * Prefetches
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 45016c6042..75133ce659 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -83,6 +83,7 @@
 &rprr_gather_load   rd pg rn rm esz msz u ff xs scale
 &rpri_gather_load   rd pg rn imm esz msz u ff
 &rprr_scatter_store rd pg rn rm esz msz xs scale
+&rpri_scatter_store rd pg rn imm esz msz
 
 ###
 # Named instruction formats.  These are generally used to
@@ -219,6 +220,8 @@
 _store nreg=0
 @rprr_scatter_store ... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \
 &rprr_scatter_store
+@rpri_scatter_store ... msz:2 ..imm:5 ... pg:3 rn:5 rd:5 \
+&rpri_scatter_store
 
 ###
 # Instruction patterns.  Grouped according to the SVE encodingindex.xhtml.
@@ -932,6 +935,14 @@ ST1_zprz1110010 .. 01 . 101 ... . . \
 ST1_zprz1110010 .. 00 . 101 ... . . \
 @rprr_scatter_store xs=2 esz=3 scale=0
 
+# SVE 64-bit scatter store (vector plus immediate)
+ST1_zpiz1110010 .. 10 . 101 ... . . \
+   
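
The comment in trans_ST1_zpiz above explains that the vector-plus-immediate form reuses the scalar-plus-vector helpers by passing the scaled immediate as the scalar base. A small sketch of why the two address computations agree, not part of the patch: the function names are hypothetical, and the scatter address is assumed to be base + (off << scale), as in the DO_ST1_ZPZ_* helpers from patch 11.

```c
#include <assert.h>
#include <stdint.h>

/* Address formed by the scatter helpers: base + (off << scale). */
uint64_t scatter_addr(uint64_t base, uint64_t off, unsigned scale)
{
    return base + (off << scale);
}

/* Vector-plus-immediate form: zn[x] + imm * (1 << msz).
 * Hypothetical model: the scaled immediate plays the role of the
 * scalar base and the vector element is an unscaled offset.
 */
uint64_t zpiz_addr(uint64_t zn_elem, unsigned imm5, unsigned msz)
{
    assert(imm5 < 32 && msz <= 3);
    return scatter_addr((uint64_t)imm5 << msz, zn_elem, 0);
}
```

Because addition commutes, swapping the roles of "base" and "offset" changes nothing as long as the helper's scale is 0, which is what the patch passes to do_mem_zpz.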

[Qemu-devel] [PATCH v6 11/35] target/arm: Implement SVE scatter stores

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 

---
v6:
 * Rewrite to the usual two nested loops,
 * Add comment about XS=2.
---
 target/arm/helper-sve.h| 41 +
 target/arm/sve_helper.c| 61 +++
 target/arm/translate-sve.c | 75 ++
 target/arm/sve.decode  | 39 
 4 files changed, 216 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index a5d3bb121c..8880128f9c 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -958,3 +958,44 @@ DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG,
+   void, env, ptr, ptr, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 93f2942590..7622bb2af0 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3713,3 +3713,64 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
 addr += 4 * 8;
 }
 }
+
+/* Stores with a vector index.  */
+
+#define DO_ST1_ZPZ_S(NAME, TYPEI, FN)   \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm,   \
+  target_ulong base, uint32_t desc) \
+{   \
+intptr_t i, oprsz = simd_oprsz(desc);   \
+unsigned scale = simd_data(desc);   \
+uintptr_t ra = GETPC(); \
+for (i = 0; i < oprsz; ) {  \
+uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+do {\
+if (likely(pg & 1)) {   \
+target_ulong off = *(TYPEI *)(vm + H1_4(i));\
+uint32_t d = *(uint32_t *)(vd + H1_4(i));   \
+FN(env, base + (off << scale), d, ra);  \
+}   \
+i += sizeof(uint32_t), pg >>= sizeof(uint32_t); \
+} while (i & 15);   \
+}   \
+}
+
+#define DO_ST1_ZPZ_D(NAME, TYPEI, FN)   \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm,   \
+  target_ulong base, uint32_t desc) \
+{   \
+intptr_t i, oprsz = simd_oprsz(desc) / 8;   \
+unsigned scale = simd_data(desc);   \
+uintptr_t ra = GETPC(); \
+uint64_t *d = vd, *m = vm; uint8_t *pg = vg;\
+
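
A minimal host-side sketch of the DO_ST1_ZPZ_S loop above, not part of the patch. It models a byte store from 32-bit elements (the sve_stbs_zsu case) into a flat byte array instead of guest memory. The name scatter_stb_u32 and the mem/memsz parameters are hypothetical, and a little-endian host is assumed so that the H1_2()/H1_4() fixups can be dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of sve_stbs_zsu (illustration only): store the low
 * byte of every active 32-bit element of d to mem[base + (off << scale)].
 * pred holds one bit per vector byte, 16 bits per 16-byte chunk.
 */
void scatter_stb_u32(uint8_t *mem, size_t memsz, const uint32_t *d,
                     const uint16_t *pred, const uint32_t *off,
                     uint64_t base, unsigned scale, size_t oprsz)
{
    size_t i;   /* byte offset into the vector */

    assert(oprsz % 16 == 0);
    for (i = 0; i < oprsz; ) {
        uint16_t pg = pred[i >> 4];
        do {
            if (pg & 1) {
                /* Unsigned 32-bit offset, zero-extended before scaling. */
                uint64_t addr = base + ((uint64_t)off[i / 4] << scale);
                assert(addr < memsz);
                mem[addr] = (uint8_t)d[i / 4];
            }
            i += sizeof(uint32_t);
            pg >>= sizeof(uint32_t);
        } while (i & 15);
    }
}
```

Inactive elements generate no memory access at all, so an out-of-range offset in an inactive lane is harmless.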

[Qemu-devel] [PATCH v6 05/35] target/arm: Implement SVE integer convert to floating-point

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 30 +
 target/arm/sve_helper.c| 38 
 target/arm/translate-sve.c | 90 ++
 target/arm/sve.decode  | 22 ++
 4 files changed, 180 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index b768128951..185112e1d2 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -720,6 +720,36 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_sd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index bd874e6fa2..031bec22df 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2811,6 +2811,44 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
 return predtest_ones(d, oprsz, esz_mask);
 }
 
+/* Fully general two-operand expander, controlled by a predicate,
+ * With the extra float_status parameter.
+ */
+#define DO_ZPZ_FP(NAME, TYPE, H, OP)  \
+void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \
+{ \
+intptr_t i = simd_oprsz(desc);\
+uint64_t *g = vg; \
+do {  \
+uint64_t pg = g[(i - 1) >> 6];\
+do {  \
+i -= sizeof(TYPE);\
+if (likely((pg >> (i & 63)) & 1)) {   \
+TYPE nn = *(TYPE *)(vn + H(i));   \
+*(TYPE *)(vd + H(i)) = OP(nn, status);\
+} \
+} while (i & 63); \
+} while (i != 0); \
+}
+
+DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
+DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
+DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
+DO_ZPZ_FP(sve_scvt_sd, uint64_t, , int32_to_float64)
+DO_ZPZ_FP(sve_scvt_dh, uint64_t, , int64_to_float16)
+DO_ZPZ_FP(sve_scvt_ds, uint64_t, , int64_to_float32)
+DO_ZPZ_FP(sve_scvt_dd, uint64_t, , int64_to_float64)
+
+DO_ZPZ_FP(sve_ucvt_hh, uint16_t, H1_2, uint16_to_float16)
+DO_ZPZ_FP(sve_ucvt_sh, uint32_t, H1_4, uint32_to_float16)
+DO_ZPZ_FP(sve_ucvt_ss, uint32_t, H1_4, uint32_to_float32)
+DO_ZPZ_FP(sve_ucvt_sd, uint64_t, , uint32_to_float64)
+DO_ZPZ_FP(sve_ucvt_dh, uint64_t, , uint64_to_float16)
+DO_ZPZ_FP(sve_ucvt_ds, uint64_t, , uint64_to_float32)
+DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64)
+
+#undef DO_ZPZ_FP
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 83de87ee0e..7639e589f5 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3425,6 +3425,96 @@ DO_FP3(FRSQRTS, rsqrts)
 
 #undef DO_FP3
 
+
+/*
+ *** SVE Floating 
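
DO_ZPZ_FP in the sve_helper.c hunk above iterates from the top of the vector downwards, reloading a 64-bit predicate word each time the byte offset crosses a multiple of 64. A minimal sketch of that loop shape, not part of the patch: it assumes a plain C int-to-float cast in place of the softfloat conversion, a little-endian host, and a hypothetical name (scvt_s32_f32).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sve_scvt_ss (illustration only).
 * pred holds one bit per vector byte, 64 bits per word.
 * Inactive elements of d are left unchanged (merging predication).
 */
void scvt_s32_f32(float *d, const int32_t *n, const uint64_t *pred,
                  size_t oprsz)
{
    size_t i = oprsz;   /* byte offset, counted down from the top */

    assert(oprsz != 0 && oprsz % 16 == 0);
    do {
        uint64_t pg = pred[(i - 1) >> 6];   /* word covering byte i - 1 */
        do {
            i -= sizeof(int32_t);
            if ((pg >> (i & 63)) & 1) {
                d[i / 4] = (float)n[i / 4];
            }
        } while (i & 63);
    } while (i != 0);
}
```

For an 80-byte vector the first inner loop handles bytes 76..64 from pred[1], then the outer loop switches to pred[0] for bytes 60..0.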

[Qemu-devel] [PATCH v6 10/35] target/arm: Implement SVE store vector/predicate register

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 

---
v6: Fix shift of data in 6 byte store.
---
 target/arm/translate-sve.c | 103 +
 target/arm/sve.decode  |   6 +++
 2 files changed, 109 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 954d6653d3..4116fe9904 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3762,6 +3762,89 @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
 tcg_temp_free_i64(t0);
 }
 
+/* Similarly for stores.  */
+static void do_str(DisasContext *s, uint32_t vofs, uint32_t len,
+   int rn, int imm)
+{
+uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
+uint32_t len_remain = len % 8;
+uint32_t nparts = len / 8 + ctpop8(len_remain);
+int midx = get_mem_index(s);
+TCGv_i64 addr, t0;
+
+addr = tcg_temp_new_i64();
+t0 = tcg_temp_new_i64();
+
+/* Note that unpredicated load/store of vector/predicate registers
+ * are defined as a stream of bytes, which equates to little-endian
+ * operations on larger quantities.  There is no nice way to force
+ * a little-endian store for aarch64_be-linux-user out of line.
+ *
+ * Attempt to keep code expansion to a minimum by limiting the
+ * amount of unrolling done.
+ */
+if (nparts <= 4) {
+int i;
+
+for (i = 0; i < len_align; i += 8) {
+tcg_gen_ld_i64(t0, cpu_env, vofs + i);
+tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
+tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
+}
+} else {
+TCGLabel *loop = gen_new_label();
+TCGv_ptr t2, i = tcg_const_local_ptr(0);
+
+gen_set_label(loop);
+
+t2 = tcg_temp_new_ptr();
+tcg_gen_add_ptr(t2, cpu_env, i);
+tcg_gen_ld_i64(t0, t2, vofs);
+
+/* Minimize the number of local temps that must be re-read from
+ * the stack each iteration.  Instead, re-compute values other
+ * than the loop counter.
+ */
+tcg_gen_addi_ptr(t2, i, imm);
+tcg_gen_extu_ptr_i64(addr, t2);
+tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
+tcg_temp_free_ptr(t2);
+
+tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
+
+tcg_gen_addi_ptr(i, i, 8);
+
+tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
+tcg_temp_free_ptr(i);
+}
+
+/* Predicate register stores can be any multiple of 2.  */
+if (len_remain) {
+tcg_gen_ld_i64(t0, cpu_env, vofs + len_align);
+tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
+
+switch (len_remain) {
+case 2:
+case 4:
+case 8:
+tcg_gen_qemu_st_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
+break;
+
+case 6:
+tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUL);
+tcg_gen_addi_i64(addr, addr, 4);
+tcg_gen_shri_i64(t0, t0, 32);
+tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUW);
+break;
+
+default:
+g_assert_not_reached();
+}
+}
+tcg_temp_free_i64(addr);
+tcg_temp_free_i64(t0);
+}
+
 static bool trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
 {
 if (sve_access_check(s)) {
@@ -3782,6 +3865,26 @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
 return true;
 }
 
+static bool trans_STR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+if (sve_access_check(s)) {
+int size = vec_full_reg_size(s);
+int off = vec_full_reg_offset(s, a->rd);
+do_str(s, off, size, a->rn, a->imm * size);
+}
+return true;
+}
+
+static bool trans_STR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+if (sve_access_check(s)) {
+int size = pred_full_reg_size(s);
+int off = pred_full_reg_offset(s, a->rd);
+do_str(s, off, size, a->rn, a->imm * size);
+}
+return true;
+}
+
 /*
  *** SVE Memory - Contiguous Load Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 765e7e479b..6a76010f51 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -793,6 +793,12 @@ LD1RQ_zpri  1010010 .. 00 0 001 ... . . \
 
 ### SVE Memory Store Group
 
+# SVE store predicate register
+STR_pri 1110010 11 0. . 000 ... . 0 @pd_rn_i9
+
+# SVE store vector register
+STR_zri 1110010 11 0. . 010 ... . . @rd_rn_i9
+
 # SVE contiguous store (scalar plus immediate)
 # ST1B, ST1H, ST1W, ST1D; require msz <= esz
 ST_zpri 1110010 .. esz:2  0 111 ... . . \
-- 
2.17.1
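
For reference, a minimal host-side model of what do_str in this patch emits: whole 8-byte little-endian stores, then a 2-, 4- or 6-byte tail, with the 6-byte case split into a 4-byte store and a 2-byte store of the data shifted right by 32 (the shift the v6 note refers to). This is a sketch under stated assumptions, not QEMU code: st_le and str_model are hypothetical names, and the register is modelled as an array of host uint64_t words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the low `size` bytes of val to dst in little-endian order. */
void st_le(uint8_t *dst, uint64_t val, unsigned size)
{
    unsigned k;

    for (k = 0; k < size; k++) {
        dst[k] = (uint8_t)(val >> (8 * k));
    }
}

/* Hypothetical model of do_str: src holds the register as 64-bit words,
 * len is its length in bytes and must be a multiple of 2.
 */
void str_model(uint8_t *dst, const uint64_t *src, unsigned len)
{
    unsigned len_align = len & ~7u;     /* QEMU_ALIGN_DOWN(len, 8) */
    unsigned len_remain = len % 8;
    unsigned i;

    for (i = 0; i < len_align; i += 8) {
        st_le(dst + i, src[i / 8], 8);
    }
    if (len_remain) {
        uint64_t t0 = src[len_align / 8];

        switch (len_remain) {
        case 2:
        case 4:
            st_le(dst + len_align, t0, len_remain);
            break;
        case 6:
            /* Low 4 bytes first, then bytes 4..5 from the upper half. */
            st_le(dst + len_align, t0, 4);
            st_le(dst + len_align + 4, t0 >> 32, 2);
            break;
        default:
            assert(!"length must be a multiple of 2");
        }
    }
}
```

A 14-byte predicate register (vector length of 112 bytes) exercises both paths: one full 8-byte store followed by the 6-byte tail.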




[Qemu-devel] [PATCH v6 01/35] target/arm: Implement SVE Memory Contiguous Load Group

2018-06-26 Thread Richard Henderson
Reviewed-by: Alex Bennée 
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|  35 +
 target/arm/sve_helper.c| 153 +
 target/arm/translate-sve.c | 121 +
 target/arm/sve.decode  |  34 +
 4 files changed, 343 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 2e76084992..fcc9ba5f50 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -719,3 +719,38 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 128bbf9b04..4e6ad282f9 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2810,3 +2810,156 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
 
 return predtest_ones(d, oprsz, esz_mask);
 }
+
+/*
+ * Load contiguous data, protected by a governing predicate.
+ */
+#define DO_LD1(NAME, FN, TYPEE, TYPEM, H)  \
+static void do_##NAME(CPUARMState *env, void *vd, void *vg, \
+  target_ulong addr, intptr_t oprsz,   \
+  uintptr_t ra)\
+{  \
+intptr_t i = 0;\
+do {   \
+uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\
+do {   \
+TYPEM m = 0;   \
+if (pg & 1) {  \
+m = FN(env, addr, ra); \
+}  \
+*(TYPEE *)(vd + H(i)) = m; \
+i += sizeof(TYPEE), pg >>= sizeof(TYPEE);  \
+addr += sizeof(TYPEM); \
+} while (i & 15);  \
+} while (i < oprsz);   \
+}  \
+void HELPER(NAME)(CPUARMState *env, void *vg,  \
+  target_ulong addr, uint32_t desc)\
+{  \
+do_##NAME(env, &env->vfp.zregs[simd_data(desc)], vg,   \
+  addr, simd_oprsz(desc), GETPC());\
+}
+
+#define DO_LD2(NAME, FN, TYPEE, TYPEM, H)  \
+void HELPER(NAME)(CPUARMState *env, void *vg,  \
+  target_ulong addr, uint32_t desc)\
+{  \
+intptr_t i, oprsz = simd_oprsz(desc);  
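
Note on the DO_LD1 loop above: the governing predicate carries one bit per
byte of the vector, and only the bit at an element's first byte is tested.
Inactive elements are written as zero and are never read from memory, while
the address still advances. The following is a minimal sketch of that
behaviour for 32-bit elements. It is not the QEMU helper: the names
pred_bit_model and ld1_words_model are hypothetical, guest memory is modelled
as a plain array, and the H1_2()/H() host-endian adjustments are ignored
(little-endian host assumed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: nonzero if the predicate bit that governs the
 * element starting at vector byte offset i is set. */
static int pred_bit_model(const uint64_t *pg, size_t i)
{
    return (int)((pg[i / 64] >> (i % 64)) & 1);
}

/* Hypothetical model of a predicated contiguous load of 32-bit elements.
 * oprsz is the vector size in bytes; mem stands in for guest memory. */
static void ld1_words_model(uint32_t *vd, const uint64_t *pg,
                            const uint32_t *mem, size_t oprsz)
{
    assert(oprsz % sizeof(uint32_t) == 0);
    for (size_t i = 0; i < oprsz; i += sizeof(uint32_t)) {
        size_t elt = i / sizeof(uint32_t);
        /* Inactive elements become zero and memory is not touched for
         * them; the position still advances by one element. */
        vd[elt] = pred_bit_model(pg, i) ? mem[elt] : 0;
    }
}
```

With a 16-byte vector and predicate bits 0 and 8 set, elements 0 and 2 are
loaded and elements 1 and 3 are zeroed.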

[Qemu-devel] [PATCH v6 06/35] target/arm: Implement SVE floating-point arithmetic (predicated)

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h| 77 +
 target/arm/sve_helper.c| 89 ++
 target/arm/translate-sve.c | 46 
 target/arm/sve.decode  | 17 
 4 files changed, 229 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 185112e1d2..4097b55f0e 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -720,6 +720,83 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsub_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fscalbn_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG,
+   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 031bec22df..3401662397 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2811,6 +2811,95 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc)
 return predtest_ones(d, oprsz, esz_mask);
 }
 
+/* Fully general three-operand expander, controlled by a predicate,
+ * With the extra float_status parameter.
+ */
+#define DO_ZPZZ_FP(NAME, TYPE, H, OP)   \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg,   \
+  void *status, uint32_t desc)  \
+{ 

[Qemu-devel] [PATCH v6 09/35] target/arm: Implement SVE load and broadcast element

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 

---
v6: Fix typo in comment.
---
 target/arm/helper-sve.h|  5 +++
 target/arm/sve_helper.c| 41 +
 target/arm/translate-sve.c | 62 ++
 target/arm/sve.decode  |  5 +++
 4 files changed, 113 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 68e55a8d03..a5d3bb121c 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -274,6 +274,11 @@ DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_movz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_movz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_movz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_movz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 2d08b7dcd3..93f2942590 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -995,6 +995,47 @@ void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc)
 }
 }
 
+/* Copy Zn into Zd, and store zero into inactive elements.  */
+void HELPER(sve_movz_b)(void *vd, void *vn, void *vg, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+uint64_t *d = vd, *n = vn;
+uint8_t *pg = vg;
+for (i = 0; i < opr_sz; i += 1) {
+d[i] = n[i] & expand_pred_b(pg[H1(i)]);
+}
+}
+
+void HELPER(sve_movz_h)(void *vd, void *vn, void *vg, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+uint64_t *d = vd, *n = vn;
+uint8_t *pg = vg;
+for (i = 0; i < opr_sz; i += 1) {
+d[i] = n[i] & expand_pred_h(pg[H1(i)]);
+}
+}
+
+void HELPER(sve_movz_s)(void *vd, void *vn, void *vg, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+uint64_t *d = vd, *n = vn;
+uint8_t *pg = vg;
+for (i = 0; i < opr_sz; i += 1) {
+d[i] = n[i] & expand_pred_s(pg[H1(i)]);
+}
+}
+
+void HELPER(sve_movz_d)(void *vd, void *vn, void *vg, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+uint64_t *d = vd, *n = vn;
+uint8_t *pg = vg;
+for (i = 0; i < opr_sz; i += 1) {
+d[i] = n[i] & -(uint64_t)(pg[H1(i)] & 1);
+}
+}
+
 /* Three-operand expander, immediate operand, controlled by a predicate.
  */
 #define DO_ZPZI(NAME, TYPE, H, OP)  \
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 483ad33179..954d6653d3 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -606,6 +606,20 @@ static bool do_clr_zp(DisasContext *s, int rd, int pg, int esz)
 return true;
 }
 
+/* Copy Zn into Zd, storing zeros into inactive elements.  */
+static void do_movz_zpz(DisasContext *s, int rd, int rn, int pg, int esz)
+{
+static gen_helper_gvec_3 * const fns[4] = {
+gen_helper_sve_movz_b, gen_helper_sve_movz_h,
+gen_helper_sve_movz_s, gen_helper_sve_movz_d,
+};
+unsigned vsz = vec_full_reg_size(s);
+tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd),
+   vec_full_reg_offset(s, rn),
+   pred_full_reg_offset(s, pg),
+   vsz, vsz, 0, fns[esz]);
+}
+
 static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a,
 gen_helper_gvec_3 *fn)
 {
@@ -3999,6 +4013,54 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
 return true;
 }
 
+/* Load and broadcast element.  */
+static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
+{
+if (!sve_access_check(s)) {
+return true;
+}
+
+unsigned vsz = vec_full_reg_size(s);
+unsigned psz = pred_full_reg_size(s);
+unsigned esz = dtype_esz[a->dtype];
+TCGLabel *over = gen_new_label();
+TCGv_i64 temp;
+
+/* If the guarding predicate has no bits set, no load occurs.  */
+if (psz <= 8) {
+/* Reduce the pred_esz_masks value simply to reduce the
+ * size of the code generated here.
+ */
+uint64_t psz_mask = MAKE_64BIT_MASK(0, psz * 8);
+temp = tcg_temp_new_i64();
+tcg_gen_ld_i64(temp, cpu_env, pred_full_reg_offset(s, a->pg));
+tcg_gen_andi_i64(temp, temp, pred_esz_masks[esz] & psz_mask);
+tcg_gen_brcondi_i64(TCG_COND_EQ, temp, 0, over);
+tcg_temp_free_i64(temp);
+} else {
+TCGv_i32 t32 = tcg_temp_new_i32();
+find_last_active(s, t32, esz, a->pg);
+tcg_gen_brcondi_i32(TCG_COND_LT, t32, 0, over);
+
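
Note on the sve_movz_* helpers above: they process the vector 64 bits at a
time and AND each word with a mask derived from the matching predicate byte,
so inactive elements become zero. The following is a rough sketch of the byte
case. The expansion is written as a simple loop for readability, which is an
assumption for illustration and not necessarily how expand_pred_b is
implemented in QEMU. The names expand_pred_b_model and movz_b_model are
hypothetical, and a little-endian host is assumed so that H1() can be dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: turn each of the 8 predicate bits into a full
 * byte of ones (active) or zeros (inactive). */
static uint64_t expand_pred_b_model(uint8_t byte)
{
    uint64_t mask = 0;
    for (int bit = 0; bit < 8; bit++) {
        if (byte & (1u << bit)) {
            mask |= (uint64_t)0xff << (8 * bit);
        }
    }
    return mask;
}

/* Hypothetical model of the byte-element MOVZ: copy n into d and zero
 * the inactive bytes. words is the vector size in 64-bit units. */
static void movz_b_model(uint64_t *d, const uint64_t *n,
                         const uint8_t *pg, size_t words)
{
    assert(d != NULL && n != NULL && pg != NULL);
    for (size_t i = 0; i < words; i++) {
        d[i] = n[i] & expand_pred_b_model(pg[i]);
    }
}
```

The doubleword variant needs no expansion at all: only bit 0 of each
predicate byte matters, so the mask is either all ones or all zeros.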

[Qemu-devel] [PATCH v6 04/35] target/arm: Implement SVE load and broadcast quadword

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 52 ++
 target/arm/sve.decode  |  9 +++
 2 files changed, 61 insertions(+)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index b25fe96b77..83de87ee0e 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3717,6 +3717,58 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
 return true;
 }
 
+static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
+{
+static gen_helper_gvec_mem * const fns[4] = {
+gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r,
+gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r,
+};
+unsigned vsz = vec_full_reg_size(s);
+TCGv_ptr t_pg;
+TCGv_i32 desc;
+
+/* Load the first quadword using the normal predicated load helpers.  */
+desc = tcg_const_i32(simd_desc(16, 16, zt));
+t_pg = tcg_temp_new_ptr();
+
+tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
+fns[msz](cpu_env, t_pg, addr, desc);
+
+tcg_temp_free_ptr(t_pg);
+tcg_temp_free_i32(desc);
+
+/* Replicate that first quadword.  */
+if (vsz > 16) {
+unsigned dofs = vec_full_reg_offset(s, zt);
+tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16);
+}
+}
+
+static bool trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
+{
+if (a->rm == 31) {
+return false;
+}
+if (sve_access_check(s)) {
+int msz = dtype_msz(a->dtype);
+TCGv_i64 addr = new_tmp_a64(s);
+tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz);
+tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
+do_ldrq(s, a->rd, a->pg, addr, msz);
+}
+return true;
+}
+
+static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
+{
+if (sve_access_check(s)) {
+TCGv_i64 addr = new_tmp_a64(s);
+tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16);
+do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype));
+}
+return true;
+}
+
 static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
   int msz, int esz, int nreg)
 {
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 6e159faaec..606c4f623c 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -715,6 +715,15 @@ LD_zprr 1010010 .. nreg:2 . 110 ... . 
. @rprr_load_msz
 # LD2B, LD2H, LD2W, LD2D; etc.
 LD_zpri 1010010 .. nreg:2 0 111 ... . . @rpri_load_msz
 
+# SVE load and broadcast quadword (scalar plus scalar)
+LD1RQ_zprr  1010010 .. 00 . 000 ... . . \
+@rprr_load_msz nreg=0
+
+# SVE load and broadcast quadword (scalar plus immediate)
+# LD1RQB, LD1RQH, LD1RQS, LD1RQD
+LD1RQ_zpri  1010010 .. 00 0 001 ... . . \
+@rpri_load_msz nreg=0
+
 ### SVE Memory Store Group
 
 # SVE contiguous store (scalar plus immediate)
-- 
2.17.1
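
Note on the patch above: do_ldrq loads one 16-byte quadword with the normal
predicated helpers and then copies it into the rest of the vector register.
The two translators differ only in how the address is formed: base plus index
shifted by msz, or base plus an immediate scaled by 16. The following is a
minimal sketch of both steps in plain C, assuming a flat byte buffer stands
in for the Z register. All three function names are hypothetical, and the
replicate step only models the effect of the tcg_gen_gvec_dup_mem call, not
how TCG emits it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the scalar-plus-scalar address:
 * base + (index << msz), with msz = log2 of the memory element size. */
static uint64_t ld1rq_addr_rr_model(uint64_t base, uint64_t index,
                                    unsigned msz)
{
    assert(msz <= 3);
    return base + (index << msz);
}

/* Hypothetical model of the scalar-plus-immediate address:
 * base + imm * 16, where imm may be negative. */
static uint64_t ld1rq_addr_ri_model(uint64_t base, int64_t imm)
{
    return base + (uint64_t)(imm * 16);
}

/* Hypothetical model of the replicate step: copy the first quadword
 * into every following 16-byte slot of a vsz-byte register. */
static void replicate_quadword_model(uint8_t *zreg, size_t vsz)
{
    assert(vsz >= 16 && vsz % 16 == 0);
    for (size_t ofs = 16; ofs < vsz; ofs += 16) {
        memcpy(zreg + ofs, zreg, 16);
    }
}
```

For a 16-byte vector the replicate loop does nothing, which matches the
"if (vsz > 16)" guard in the patch.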




[Qemu-devel] [PATCH v6 02/35] target/arm: Implement SVE Contiguous Load, first-fault and no-fault

2018-06-26 Thread Richard Henderson
Signed-off-by: Richard Henderson 

---
v6:
  * Remove cold attribute from record_fault, add unlikely marker
to the if that protects its call, which seems to be enough to
prevent the function being inlined.
  * Fix the set of bits masked by record_fault.
---
 target/arm/helper-sve.h|  40 ++
 target/arm/sve_helper.c| 157 +
 target/arm/translate-sve.c |  69 
 target/arm/sve.decode  |   6 ++
 4 files changed, 272 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index fcc9ba5f50..7338abbbcf 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -754,3 +754,43 @@ DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldff1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldff1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 4e6ad282f9..0d22a57a22 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2963,3 +2963,160 @@ DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, )
 #undef DO_LD2
 #undef DO_LD3
 #undef DO_LD4
+
+/*
+ * Load contiguous data, first-fault and no-fault.
+ */
+
+#ifdef CONFIG_USER_ONLY
+
+/* Fault on byte I.  All bits in FFR from I are cleared.  The vector
+ * result from I is CONSTRAINED UNPREDICTABLE; we choose the MERGE
+ * option, which leaves subsequent data unchanged.
+ */
+static void record_fault(CPUARMState *env, uintptr_t i, uintptr_t oprsz)
+{
+uint64_t *ffr = env->vfp.pregs[FFR_PRED_NUM].p;
+
+if (i & 63) {
+ffr[i / 64] &= MAKE_64BIT_MASK(0, i & 63);
+i = ROUND_UP(i, 64);
+}
+for (; i < oprsz; i += 64) {
+ffr[i / 64] = 0;
+}
+}
+
+/* Hold the mmap lock during the operation so that there is no race
+ * between page_check_range and the load operation.  We expect the
+ * usual case to have no faults at all, so we check the whole range
+ * first and if successful defer to the normal load operation.
+ *
+ * TODO: Change mmap_lock to a rwlock so that multiple readers
+ * can run simultaneously.  This will probably help other uses
+ * within QEMU as well.
+ */
+#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \
+static void do_sve_ldff1##PART(CPUARMState *env, void *vd, void *vg,\
+   target_ulong addr, intptr_t oprsz,   \
+   bool first, uintptr_t ra)   
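
Note on record_fault above: a fault at byte I clears every FFR bit from I
upward and leaves the bits below I alone. The following is a small
stand-alone sketch of that masking, assuming the FFR is an array of 64-bit
words with one bit per vector byte. The name record_fault_model is
hypothetical, and the mask and round-up are written out by hand instead of
using QEMU's MAKE_64BIT_MASK and ROUND_UP macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: clear FFR bits [i, oprsz), keep bits [0, i). */
static void record_fault_model(uint64_t *ffr, size_t i, size_t oprsz)
{
    assert(i < oprsz);
    if (i & 63) {
        /* Partial word: keep only bits [0, i % 64) of the word that
         * contains bit i, then step to the next word boundary. */
        ffr[i / 64] &= (UINT64_C(1) << (i & 63)) - 1;
        i = (i + 63) & ~(size_t)63;
    }
    /* Every remaining whole word is cleared. */
    for (; i < oprsz; i += 64) {
        ffr[i / 64] = 0;
    }
}
```

With a 256-byte vector and a fault at byte 70, word 0 is untouched, word 1
keeps only its low six bits, and words 2 and 3 are cleared.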

[Qemu-devel] [PATCH v6 07/35] target/arm: Implement SVE FP Multiply-Add Group

2018-06-26 Thread Richard Henderson
Signed-off-by: Richard Henderson 

---
v6: Add some decode commentary.
---
 target/arm/helper-sve.h|  16 
 target/arm/sve_helper.c| 158 +
 target/arm/translate-sve.c |  49 
 target/arm/sve.decode  |  18 +
 4 files changed, 241 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 4097b55f0e..eb0645dd43 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -827,6 +827,22 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 3401662397..2f416e5e28 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2938,6 +2938,164 @@ DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64)
 
 #undef DO_ZPZ_FP
 
+/* 4-operand predicated multiply-add.  This requires 7 operands to pass
+ * "properly", so we need to encode some of the registers into DESC.
+ */
+QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32);
+
+static void do_fmla_zpzzz_h(CPUARMState *env, void *vg, uint32_t desc,
+uint16_t neg1, uint16_t neg3)
+{
+intptr_t i = simd_oprsz(desc);
+unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+void *vd = &env->vfp.zregs[rd];
+void *vn = &env->vfp.zregs[rn];
+void *vm = &env->vfp.zregs[rm];
+void *va = &env->vfp.zregs[ra];
+uint64_t *g = vg;
+
+do {
+uint64_t pg = g[(i - 1) >> 6];
+do {
+i -= 2;
+if (likely((pg >> (i & 63)) & 1)) {
+float16 e1, e2, e3, r;
+
+e1 = *(uint16_t *)(vn + H1_2(i)) ^ neg1;
+e2 = *(uint16_t *)(vm + H1_2(i));
+e3 = *(uint16_t *)(va + H1_2(i)) ^ neg3;
+r = float16_muladd(e1, e2, e3, 0, &env->vfp.fp_status);
+*(uint16_t *)(vd + H1_2(i)) = r;
+}
+} while (i & 63);
+} while (i != 0);
+}
+
+void HELPER(sve_fmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+do_fmla_zpzzz_h(env, vg, desc, 0, 0);
+}
+
+void HELPER(sve_fmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0);
+}
+
+void HELPER(sve_fnmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0x8000);
+}
+
+void HELPER(sve_fnmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc)
+{
+do_fmla_zpzzz_h(env, vg, desc, 0, 0x8000);
+}
+
+static void do_fmla_zpzzz_s(CPUARMState *env, void *vg, uint32_t desc,
+uint32_t neg1, uint32_t neg3)
+{
+intptr_t i = simd_oprsz(desc);
+unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5);
+unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5);
+unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5);
+unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5);
+void *vd = &env->vfp.zregs[rd];
+void *vn = &env->vfp.zregs[rn];
+void *vm = &env->vfp.zregs[rm];
+void *va = &env->vfp.zregs[ra];
+uint64_t *g = vg;
+
+do {
+uint64_t pg = g[(i - 1) >> 6];
+do {
+i -= 4;
+if (likely((pg >> (i & 63)) & 1)) {
+float32 e1, e2, e3, r;
+
+e1 = *(uint32_t *)(vn + H1_4(i)) ^ neg1;
+e2 = *(uint32_t *)(vm + H1_4(i));
+e3 = *(uint32_t *)(va + H1_4(i)) ^ neg3;
+r = float32_muladd(e1, e2, e3, 0, &env->vfp.fp_status);
+*(uint32_t *)(vd + H1_4(i)) = r;
+}
+} while (i & 63);
+} while (i != 0);
+}
+
+void 
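
Note on the multiply-add helpers above: the four register numbers travel in
the data field of the descriptor, five bits each, and the FMLS/FNMLA/FNMLS
variants are obtained by XORing the sign bit into the first and/or third
operand before the fused multiply-add. The following is a minimal sketch of
the packing and the sign flip. MODEL_DATA_SHIFT is an assumed stand-in for
SIMD_DATA_SHIFT (its real value is not shown in this patch), and the
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration only; the patch merely requires that
 * SIMD_DATA_SHIFT + 20 does not exceed 32. */
#define MODEL_DATA_SHIFT 10

/* Hypothetical model of a bit-field extract: length bits from start. */
static uint32_t extract32_model(uint32_t value, unsigned start,
                                unsigned length)
{
    assert(length > 0 && length <= 32 - start);
    return (value >> start) & (~UINT32_C(0) >> (32 - length));
}

/* Hypothetical model: pack rd, rn, rm, ra (5 bits each) into the data
 * field, in the order the helper extracts them. */
static uint32_t pack_regs_model(unsigned rd, unsigned rn,
                                unsigned rm, unsigned ra)
{
    assert(rd < 32 && rn < 32 && rm < 32 && ra < 32);
    uint32_t data = rd | (rn << 5) | (rm << 10) | (ra << 15);
    return data << MODEL_DATA_SHIFT;
}

/* Flipping bit 15 of an IEEE half-precision bit pattern negates it,
 * which is what XORing with neg1/neg3 = 0x8000 does in the helper. */
static uint16_t f16_negate_model(uint16_t bits)
{
    return (uint16_t)(bits ^ 0x8000);
}
```

As a usage example, packing (1, 2, 3, 31) and extracting at offsets 0, 5, 10
and 15 above MODEL_DATA_SHIFT returns the same four numbers, and negating
0x3c00 (1.0 in half precision) gives 0xbc00 (-1.0).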

[Qemu-devel] [PATCH v6 03/35] target/arm: Implement SVE Memory Contiguous Store Group

2018-06-26 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/helper-sve.h|  29 +
 target/arm/sve_helper.c| 211 +
 target/arm/translate-sve.c |  65 
 target/arm/sve.decode  |  38 +++
 4 files changed, 343 insertions(+)

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 7338abbbcf..b768128951 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -794,3 +794,32 @@ DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 0d22a57a22..bd874e6fa2 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3120,3 +3120,214 @@ DO_LDNF1(sds_r)
 DO_LDNF1(dd_r)
 
 #undef DO_LDNF1
+
+/*
+ * Store contiguous data, protected by a governing predicate.
+ */
+#define DO_ST1(NAME, FN, TYPEE, TYPEM, H)  \
+void HELPER(NAME)(CPUARMState *env, void *vg,  \
+  target_ulong addr, uint32_t desc)\
+{  \
+intptr_t i, oprsz = simd_oprsz(desc);  \
+intptr_t ra = GETPC(); \
+unsigned rd = simd_data(desc); \
+void *vd = &env->vfp.zregs[rd];\
+for (i = 0; i < oprsz; ) { \
+uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3));\
+do {   \
+if (pg & 1) {  \
+TYPEM m = *(TYPEE *)(vd + H(i));   \
+FN(env, addr, m, ra);  \
+}  \
+i += sizeof(TYPEE), pg >>= sizeof(TYPEE);  \
+addr += sizeof(TYPEM); \
+} while (i & 15);  \
+}  \
+}
+
+#define DO_ST1_D(NAME, FN, TYPEM)  \
+void HELPER(NAME)(CPUARMState *env, void *vg,  \
+  target_ulong addr, uint32_t desc)\
+{  \
+intptr_t i, oprsz = simd_oprsz(desc) / 8;  \
+intptr_t ra = GETPC(); \
+unsigned rd = simd_data(desc); \
+uint64_t *d = &env->vfp.zregs[rd].d[0];\
+uint8_t *pg = vg;  \
+for (i = 0; i < oprsz; i += 1) {   \
+if (pg[H1(i)] & 1) {   \
+FN(env, addr, d[i], ra);   \
+}  \
+addr += sizeof(TYPEM); \
+}  \
+}
+
+#define DO_ST2(NAME, FN, TYPEE, TYPEM, H)  \
+void HELPER(NAME)(CPUARMState *env, void *vg,  \
+  target_ulong addr, uint32_t desc)\
+{ 

[Qemu-devel] [PATCH v6 00/35] target/arm SVE patches

2018-06-26 Thread Richard Henderson
This is the remainder of the SVE enablement patches,
with an extra bonus patch to enable ARMv8.2-DotProd.

V6 updates based on review.

Patches with changes:
  0002-target-arm-Implement-SVE-Contiguous-Load-first-fa.patch
  0007-target-arm-Implement-SVE-FP-Multiply-Add-Group.patch
  0009-target-arm-Implement-SVE-load-and-broadcast-eleme.patch
  0010-target-arm-Implement-SVE-store-vector-predicate-r.patch
  0011-target-arm-Implement-SVE-scatter-stores.patch
  0013-target-arm-Implement-SVE-gather-loads.patch
  0023-target-arm-Implement-SVE-floating-point-convert-p.patch
  0027-target-arm-Implement-SVE-MOVPRFX.patch
  0030-target-arm-Pass-index-to-AdvSIMD-FCMLA-indexed.patch
  0033-target-arm-Implement-SVE-dot-product-indexed.patch
  0034-target-arm-Enable-SVE-for-aarch64-linux-user.patch
  0035-target-arm-Implement-ARMv8.2-DotProd.patch

Patches lacking reviews:
  0002-target-arm-Implement-SVE-Contiguous-Load-first-fa.patch
  0007-target-arm-Implement-SVE-FP-Multiply-Add-Group.patch
  0013-target-arm-Implement-SVE-gather-loads.patch
  0030-target-arm-Pass-index-to-AdvSIMD-FCMLA-indexed.patch
  0031-target-arm-Implement-SVE-fp-complex-multiply-add-.patch
  0033-target-arm-Implement-SVE-dot-product-indexed.patch


r~


Richard Henderson (35):
  target/arm: Implement SVE Memory Contiguous Load Group
  target/arm: Implement SVE Contiguous Load, first-fault and no-fault
  target/arm: Implement SVE Memory Contiguous Store Group
  target/arm: Implement SVE load and broadcast quadword
  target/arm: Implement SVE integer convert to floating-point
  target/arm: Implement SVE floating-point arithmetic (predicated)
  target/arm: Implement SVE FP Multiply-Add Group
  target/arm: Implement SVE Floating Point Accumulating Reduction Group
  target/arm: Implement SVE load and broadcast element
  target/arm: Implement SVE store vector/predicate register
  target/arm: Implement SVE scatter stores
  target/arm: Implement SVE prefetches
  target/arm: Implement SVE gather loads
  target/arm: Implement SVE first-fault gather loads
  target/arm: Implement SVE scatter store vector immediate
  target/arm: Implement SVE floating-point compare vectors
  target/arm: Implement SVE floating-point arithmetic with immediate
  target/arm: Implement SVE Floating Point Multiply Indexed Group
  target/arm: Implement SVE FP Fast Reduction Group
  target/arm: Implement SVE Floating Point Unary Operations -
Unpredicated Group
  target/arm: Implement SVE FP Compare with Zero Group
  target/arm: Implement SVE floating-point trig multiply-add coefficient
  target/arm: Implement SVE floating-point convert precision
  target/arm: Implement SVE floating-point convert to integer
  target/arm: Implement SVE floating-point round to integral value
  target/arm: Implement SVE floating-point unary operations
  target/arm: Implement SVE MOVPRFX
  target/arm: Implement SVE floating-point complex add
  target/arm: Implement SVE fp complex multiply add
  target/arm: Pass index to AdvSIMD FCMLA (indexed)
  target/arm: Implement SVE fp complex multiply add (indexed)
  target/arm: Implement SVE dot product (vectors)
  target/arm: Implement SVE dot product (indexed)
  target/arm: Enable SVE for aarch64-linux-user
  target/arm: Implement ARMv8.2-DotProd

 target/arm/cpu.h   |1 +
 target/arm/helper-sve.h|  682 +
 target/arm/helper.h|   44 +-
 linux-user/elfload.c   |2 +
 target/arm/cpu.c   |8 +
 target/arm/cpu64.c |2 +
 target/arm/helper.c|2 +-
 target/arm/sve_helper.c| 1855 
 target/arm/translate-a64.c |   57 +-
 target/arm/translate-sve.c | 1688 +++-
 target/arm/translate.c |  102 +-
 target/arm/vec_helper.c|  311 +-
 target/arm/sve.decode  |  427 +
 13 files changed, 5116 insertions(+), 65 deletions(-)

-- 
2.17.1




[Qemu-devel] [Bug 1767200] Re: Kernel Panic Unable to mount root fs on unknown-block(31, 3)

2018-06-26 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1767200

Title:
  Kernel Panic Unable to mount root fs on unknown-block(31,3)

Status in QEMU:
  Expired

Bug description:
  Using the latest qemu:
  qemu-system-arm.exe -kernel C:\Users\a\Downloads\kernel-qemu-4.4.34-jessie -cpu arm1176 -m 256 -machine versatilepb -cdrom C:\Users\a\Downloads\picore-9.0.3.img

  Gives error:
  Kernel Panic Unable to mount root fs on unknown-block(31,3)

  I have tried different ARMv6 ARMv7 images/kernels with the same
  result.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1767200/+subscriptions



Re: [Qemu-devel] [virtio-dev] Re: [PATCH v2 0/4] Use of unique identifier for pairing virtio and passthrough devices...

2018-06-26 Thread Venu Busireddy
On 2018-06-27 07:06:42 +0300, Michael S. Tsirkin wrote:
> On Tue, Jun 26, 2018 at 10:49:30PM -0500, Venu Busireddy wrote:
> > The patch set "Enable virtio_net to act as a standby for a passthru
> > device" [1] deals with live migration of guests that use passthrough
> > devices. However, that scheme uses the MAC address for pairing
> > the virtio device and the passthrough device. The thread "netvsc:
> > refactor notifier/event handling code to use the failover framework"
> > [2] discusses an alternate mechanism, such as using an UUID, for pairing
> > the devices. Based on that discussion, proposals "Add "Group Identifier"
> > to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
> > store pairing information..." [4] were made.
> > 
> > The current patch set includes all the feedback received for proposals [3]
> > and [4]. For the sake of completeness, patch for the virtio specification
> > is also included here. Following is the updated proposal.
> > 
> > 1. Extend the virtio specification to include a new virtio PCI capability
> >"VIRTIO_PCI_CAP_GROUP_ID_CFG".
> 
> There's still discussion around whether it should be
> a virtio pci capability, a virtio net config field or
> a new kind of capability.
> 
> > 2. Enhance the QEMU CLI to include a "uuid" option to the virtio device.
> >The "uuid" is a string in UUID format.
> > 
> > 3. Enhance the QEMU CLI to include a "uuid" option to the bridge device.
> >The "uuid" is a string in UUID format. Currently, PCIe bridge for
> >the Q35 model is supported.
> > 
> > 4. The operator creates a unique identifier string using 'uuidgen'.
> > 
> > 5. When the virtio device is created, the operator uses the "uuid" option
> >(for example, '-device virtio-net-pci,uuid="string"') and specifies
> >the UUID created in step 4.
> > 
> >QEMU stores the UUID in the virtio device's configuration space
> >in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> > 
> > 6. When assigning a PCI device to the guest in passthrough mode, the
> >operator first creates a bridge using the "uuid" option (for example,
> >'-device pcie-downstream,uuid="string"') to specify the UUID created
> >in step 4, and then attaches the passthrough device to the bridge.
> > 
> >QEMU stores the UUID in the configuration space of the bridge as
> >Vendor-Specific capability (0x09). The "Vendor" here is not to be
> >confused with a specific organization. Instead, the vendor of the
> >bridge is QEMU. To avoid mixing up with other bridges, the bridge
> >will be created with vendor ID 0x1b36 (PCI_VENDOR_ID_REDHAT) and
> >device ID 0x000e (PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE) if the "uuid"
> >option is specified. Otherwise, current defaults are used.
> > 
> > 7. Patch 4 in patch series "Enable virtio_net to act as a standby for
> >a passthru device" [1] needs to be modified to use the UUID values
> >present in the bridge's configuration space and the virtio device's
> >configuration space instead of the MAC address for pairing the devices.
> > 
> > Thanks!
> > 
> > Venu
> 
> The part where the visibility of a vfio device is controlled by the
> virtio driver acknowledging the backup feature is missing here.

Could you please elaborate?

Thanks,

Venu

>  
> 
> > [1] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00156.html
> > [2] https://www.spinics.net/lists/netdev/msg499011.html
> > [3] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00118.html
> > [4] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00204.html
> > 
> > Changes in v2:
> >   - As Michael Tsirkin suggested, changed the virtio specification
> > to restrict the group identifier to be a 16-byte field, presented
> > entirely in the virtio device's configuration space.
> >   - As Michael Tsirkin suggested, instead of tweaking the ioh3420
> > device with Red Hat vendor ID, create a new PCIe bridge device
> > named "pcie-downstream" with Red Hat Vendor ID, and include the
> > group identifier in this device.
> >   - Added a new patch to enhance the "pci-bridge" device to support
> > the group identifier (for the i440FX model).
> > 
> > Venu Busireddy (4):
> >   Add a true or false option to the DEFINE_PROP_UUID macro.
> >   Add "Group Identifier" support to virtio devices.
> >   Add "Group Identifier" support to Red Hat PCI bridge.
> >   Add "Group Identifier" support to Red Hat PCI Express bridge.
> > 
> >  default-configs/arm-softmmu.mak |   1 +
> >  default-configs/i386-softmmu.mak|   1 +
> >  default-configs/x86_64-softmmu.mak  |   1 +
> >  hw/acpi/vmgenid.c   |   2 +-
> >  hw/pci-bridge/Makefile.objs |   1 +
> >  hw/pci-bridge/pci_bridge_dev.c  |   8 +
> >  hw/pci-bridge/pcie_downstream.c | 215 
> >  hw/pci-bridge/pcie_downstream.h |  10 +
> >  hw/pci/pci_bridge.c

Re: [Qemu-devel] [virtio-dev] Re: [PATCH v2 3/4] Add "Group Identifier" support to Red Hat PCI bridge.

2018-06-26 Thread Venu Busireddy
On 2018-06-27 07:02:36 +0300, Michael S. Tsirkin wrote:
> On Tue, Jun 26, 2018 at 10:49:33PM -0500, Venu Busireddy wrote:
> > Add the "Vendor-Specific" capability to the Red Hat PCI bridge device
> > "pci-bridge", to contain the "Group Identifier" (UUID) that will be
> > used to pair a virtio device with the passthrough device attached to
> > that bridge.
> > 
> > This capability is added to the bridge iff the "uuid" option is specified
> > for the bridge.
> 
> I think the name should be more explicit. How about "failover-group-id"?

I can change it. But don't you think it is a bit long?

> > 
> > Signed-off-by: Venu Busireddy 
> 
> I'd like to also tweak the device id in this case,
> to make it easier for guests to know it's a grouping bridge.

Could you please recommend a name for the new ID definition? Something
along the lines of PCI_DEVICE_ID_REDHAT_.

Thanks,

Venu

> 
> > ---
> >  hw/pci-bridge/pci_bridge_dev.c |  8 
> >  hw/pci/pci_bridge.c| 26 ++
> >  include/hw/pci/pcie.h  |  1 +
> >  3 files changed, 35 insertions(+)
> > 
> > diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
> > index b2d861d216..bbbc6fa1c6 100644
> > --- a/hw/pci-bridge/pci_bridge_dev.c
> > +++ b/hw/pci-bridge/pci_bridge_dev.c
> > @@ -71,6 +71,12 @@ static void pci_bridge_dev_realize(PCIDevice *dev, Error 
> > **errp)
> >  bridge_dev->msi = ON_OFF_AUTO_OFF;
> >  }
> >  
> > +err = pci_bridge_vendor_init(dev, 0, errp);
> > +if (err < 0) {
> > +error_append_hint(errp, "Can't init group ID, error %d\n", err);
> > +goto vendor_cap_err;
> > +}
> > +
> >  err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0, errp);
> >  if (err) {
> >  goto slotid_error;
> > @@ -109,6 +115,7 @@ slotid_error:
> >  if (shpc_present(dev)) {
> >  shpc_cleanup(dev, &bridge_dev->bar);
> >  }
> > +vendor_cap_err:
> >  shpc_error:
> >  pci_bridge_exitfn(dev);
> >  }
> > @@ -162,6 +169,7 @@ static Property pci_bridge_dev_properties[] = {
> >  ON_OFF_AUTO_AUTO),
> >  DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_SHPC, PCIBridgeDev, flags,
> >  PCI_BRIDGE_DEV_F_SHPC_REQ, true),
> > +DEFINE_PROP_UUID(COMPAT_PROP_UUID, PCIDevice, uuid, false),
> >  DEFINE_PROP_END_OF_LIST(),
> >  };
> >  
> > diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
> > index 40a39f57cb..cb8b3dad2a 100644
> > --- a/hw/pci/pci_bridge.c
> > +++ b/hw/pci/pci_bridge.c
> > @@ -34,12 +34,17 @@
> >  #include "hw/pci/pci_bus.h"
> >  #include "qemu/range.h"
> >  #include "qapi/error.h"
> > +#include "qemu/uuid.h"
> >  
> >  /* PCI bridge subsystem vendor ID helper functions */
> >  #define PCI_SSVID_SIZEOF        8
> >  #define PCI_SSVID_SVID          4
> >  #define PCI_SSVID_SSID          6
> >  
> > +#define PCI_VENDOR_SIZEOF 20
> > +#define PCI_VENDOR_CAP_LEN_OFFSET  2
> > +#define PCI_VENDOR_GROUP_ID_OFFSET 4
> > +
> >  int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
> >uint16_t svid, uint16_t ssid,
> >Error **errp)
> > @@ -57,6 +62,27 @@ int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
> >  return pos;
> >  }
> >  
> > +int pci_bridge_vendor_init(PCIDevice *d, uint8_t offset, Error **errp)
> > +{
> > +int pos;
> > +
> > +if (qemu_uuid_is_null(&d->uuid)) {
> > +return 0;
> > +}
> > +
> > +pos = pci_add_capability(d, PCI_CAP_ID_VNDR, offset, PCI_VENDOR_SIZEOF,
> > +errp);
> > +if (pos < 0) {
> > +return pos;
> > +}
> > +
> > +pci_set_word(d->config + pos + PCI_VENDOR_CAP_LEN_OFFSET,
> > +PCI_VENDOR_SIZEOF);
> > +memcpy(d->config + pos + PCI_VENDOR_GROUP_ID_OFFSET, &d->uuid,
> > +sizeof(QemuUUID));
> > +return pos;
> > +}
> > +
> >  /* Accessor function to get parent bridge device from pci bus. */
> >  PCIDevice *pci_bridge_get_device(PCIBus *bus)
> >  {
> > diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> > index b71e369703..b4189d0ce3 100644
> > --- a/include/hw/pci/pcie.h
> > +++ b/include/hw/pci/pcie.h
> > @@ -82,6 +82,7 @@ struct PCIExpressDevice {
> >  };
> >  
> >  #define COMPAT_PROP_PCP "power_controller_present"
> > +#define COMPAT_PROP_UUID "uuid"
> >  
> >  /* PCI express capability helper functions */
> >  int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type,
> 
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
> 



Re: [Qemu-devel] [PATCH v2 0/4] Use of unique identifier for pairing virtio and passthrough devices...

2018-06-26 Thread Michael S. Tsirkin
On Tue, Jun 26, 2018 at 10:49:30PM -0500, Venu Busireddy wrote:
> The patch set "Enable virtio_net to act as a standby for a passthru
> device" [1] deals with live migration of guests that use passthrough
> devices. However, that scheme uses the MAC address for pairing
> the virtio device and the passthrough device. The thread "netvsc:
> refactor notifier/event handling code to use the failover framework"
> [2] discusses an alternate mechanism, such as using an UUID, for pairing
> the devices. Based on that discussion, proposals "Add "Group Identifier"
> to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
> store pairing information..." [4] were made.
> 
> The current patch set includes all the feedback received for proposals [3]
> and [4]. For the sake of completeness, patch for the virtio specification
> is also included here. Following is the updated proposal.
> 
> 1. Extend the virtio specification to include a new virtio PCI capability
>"VIRTIO_PCI_CAP_GROUP_ID_CFG".

There's still discussion around whether it should be
a virtio pci capability, a virtio net config field or
a new kind of capability.

> 2. Enhance the QEMU CLI to include a "uuid" option to the virtio device.
>The "uuid" is a string in UUID format.
> 
> 3. Enhance the QEMU CLI to include a "uuid" option to the bridge device.
>The "uuid" is a string in UUID format. Currently, PCIe bridge for
>the Q35 model is supported.
> 
> 4. The operator creates a unique identifier string using 'uuidgen'.
> 
> 5. When the virtio device is created, the operator uses the "uuid" option
>(for example, '-device virtio-net-pci,uuid="string"') and specifies
>the UUID created in step 4.
> 
>QEMU stores the UUID in the virtio device's configuration space
>in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> 
> 6. When assigning a PCI device to the guest in passthrough mode, the
>operator first creates a bridge using the "uuid" option (for example,
>'-device pcie-downstream,uuid="string"') to specify the UUID created
>in step 4, and then attaches the passthrough device to the bridge.
> 
>QEMU stores the UUID in the configuration space of the bridge as
>Vendor-Specific capability (0x09). The "Vendor" here is not to be
>confused with a specific organization. Instead, the vendor of the
>bridge is QEMU. To avoid mixing up with other bridges, the bridge
>will be created with vendor ID 0x1b36 (PCI_VENDOR_ID_REDHAT) and
>device ID 0x000e (PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE) if the "uuid"
>option is specified. Otherwise, current defaults are used.
> 
> 7. Patch 4 in patch series "Enable virtio_net to act as a standby for
>a passthru device" [1] needs to be modified to use the UUID values
>present in the bridge's configuration space and the virtio device's
>configuration space instead of the MAC address for pairing the devices.
> 
> Thanks!
> 
> Venu

The part where the visibility of a vfio device is controlled by the
virtio driver acknowledging the backup feature is missing here.
 

> [1] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00156.html
> [2] https://www.spinics.net/lists/netdev/msg499011.html
> [3] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00118.html
> [4] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00204.html
> 
> Changes in v2:
>   - As Michael Tsirkin suggested, changed the virtio specification
> to restrict the group identifier to be a 16-byte field, presented
> entirely in the virtio device's configuration space.
>   - As Michael Tsirkin suggested, instead of tweaking the ioh3420
> device with Red Hat vendor ID, create a new PCIe bridge device
> named "pcie-downstream" with Red Hat Vendor ID, and include the
> group identifier in this device.
>   - Added a new patch to enhance the "pci-bridge" device to support
> the group identifier (for the i440FX model).
> 
> Venu Busireddy (4):
>   Add a true or false option to the DEFINE_PROP_UUID macro.
>   Add "Group Identifier" support to virtio devices.
>   Add "Group Identifier" support to Red Hat PCI bridge.
>   Add "Group Identifier" support to Red Hat PCI Express bridge.
> 
>  default-configs/arm-softmmu.mak |   1 +
>  default-configs/i386-softmmu.mak|   1 +
>  default-configs/x86_64-softmmu.mak  |   1 +
>  hw/acpi/vmgenid.c   |   2 +-
>  hw/pci-bridge/Makefile.objs |   1 +
>  hw/pci-bridge/pci_bridge_dev.c  |   8 +
>  hw/pci-bridge/pcie_downstream.c | 215 
>  hw/pci-bridge/pcie_downstream.h |  10 +
>  hw/pci/pci_bridge.c |  26 +++
>  hw/virtio/virtio-pci.c  |  15 ++
>  hw/virtio/virtio-pci.h  |   3 +-
>  include/hw/pci/pci.h|   3 +
>  include/hw/pci/pcie.h   |   1 +
>  include/hw/qdev-properties.h|  

Re: [Qemu-devel] [PATCH v5 23/35] target/arm: Implement SVE floating-point convert precision

2018-06-26 Thread Richard Henderson
On 06/26/2018 03:44 AM, Peter Maydell wrote:
> A comment to the effect that the SVE fp-to-fp conversion
> routines always use IEEE format halfprec (ie ignore FPCR.AHP)
> would be helpful.

Ok.

> Are you sure we have the FPCR.FZ16 handling right here? That
> is, do we need the same "use the not-fp16 fpstatus pointer,
> and temporarily clear the flush flag for the fp16 end of
> the conversion" behaviour that we have in vfp_fcvt_f16_to_f32
> and friends ? The pseudocode FPConvertSVE() calls FPConvert(),
> which is the "ignore FZ16" codepath I think. The test case would
> be (eg) a conversion where the input f16 is denormal and
> FPCR.FZ == 1: this should not do the flush-input-to-zero, right?

Yes, I read it the same way.  I guess both my and Alex's RISU
test cases didn't exercise this?
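
For illustration only, here is a minimal sketch of the test case Peter
describes. It is not QEMU's softfloat code: f16_to_f32() and its
flush_inputs flag are hypothetical names, and the sketch assumes that
float is IEEE binary32. With flush_inputs == 0 a denormal half-precision
input converts exactly to a normal single-precision value, which is the
result the "ignore FZ16" reading expects. With flush_inputs == 1 the
same input collapses to zero, which is the behaviour being questioned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: decode an IEEE binary16 bit pattern into a float.
 * If flush_inputs is non-zero, denormal inputs are treated as zero.
 * Assumes float is IEEE binary32.
 */
static float f16_to_f32(uint16_t h, int flush_inputs)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp = (h >> 10) & 0x1f;
    uint32_t frac = h & 0x3ff;
    uint32_t bits;
    float out;

    if (exp == 0) {
        /* Zero or denormal: the value is frac * 2^-24, exact in binary32. */
        float mag = flush_inputs ? 0.0f : (float)frac * 0x1p-24f;
        memcpy(&bits, &mag, sizeof(bits));
        bits |= sign;
    } else if (exp == 0x1f) {
        /* Infinity or NaN: all-ones exponent, payload shifted up. */
        bits = sign | 0x7f800000u | (frac << 13);
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112) << 23) | (frac << 13);
    }
    memcpy(&out, &bits, sizeof(out));
    return out;
}
```

Calling f16_to_f32(0x0001, 0) yields 2^-24, while f16_to_f32(0x0001, 1)
yields 0. Normal inputs such as 0x3c00 (1.0) are unaffected by the flag.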


r~



Re: [Qemu-devel] [PATCH v2 3/4] Add "Group Identifier" support to Red Hat PCI bridge.

2018-06-26 Thread Michael S. Tsirkin
On Tue, Jun 26, 2018 at 10:49:33PM -0500, Venu Busireddy wrote:
> Add the "Vendor-Specific" capability to the Red Hat PCI bridge device
> "pci-bridge", to contain the "Group Identifier" (UUID) that will be
> used to pair a virtio device with the passthrough device attached to
> that bridge.
> 
> This capability is added to the bridge iff the "uuid" option is specified
> for the bridge.

I think the name should be more explicit. How about "failover-group-id"?

> 
> Signed-off-by: Venu Busireddy 

I'd like to also tweak the device id in this case,
to make it easier for guests to know it's a grouping bridge.

> ---
>  hw/pci-bridge/pci_bridge_dev.c |  8 
>  hw/pci/pci_bridge.c| 26 ++
>  include/hw/pci/pcie.h  |  1 +
>  3 files changed, 35 insertions(+)
> 
> diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
> index b2d861d216..bbbc6fa1c6 100644
> --- a/hw/pci-bridge/pci_bridge_dev.c
> +++ b/hw/pci-bridge/pci_bridge_dev.c
> @@ -71,6 +71,12 @@ static void pci_bridge_dev_realize(PCIDevice *dev, Error 
> **errp)
>  bridge_dev->msi = ON_OFF_AUTO_OFF;
>  }
>  
> +err = pci_bridge_vendor_init(dev, 0, errp);
> +if (err < 0) {
> +error_append_hint(errp, "Can't init group ID, error %d\n", err);
> +goto vendor_cap_err;
> +}
> +
>  err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0, errp);
>  if (err) {
>  goto slotid_error;
> @@ -109,6 +115,7 @@ slotid_error:
>  if (shpc_present(dev)) {
>  shpc_cleanup(dev, &bridge_dev->bar);
>  }
> +vendor_cap_err:
>  shpc_error:
>  pci_bridge_exitfn(dev);
>  }
> @@ -162,6 +169,7 @@ static Property pci_bridge_dev_properties[] = {
>  ON_OFF_AUTO_AUTO),
>  DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_SHPC, PCIBridgeDev, flags,
>  PCI_BRIDGE_DEV_F_SHPC_REQ, true),
> +DEFINE_PROP_UUID(COMPAT_PROP_UUID, PCIDevice, uuid, false),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
> index 40a39f57cb..cb8b3dad2a 100644
> --- a/hw/pci/pci_bridge.c
> +++ b/hw/pci/pci_bridge.c
> @@ -34,12 +34,17 @@
>  #include "hw/pci/pci_bus.h"
>  #include "qemu/range.h"
>  #include "qapi/error.h"
> +#include "qemu/uuid.h"
>  
>  /* PCI bridge subsystem vendor ID helper functions */
>  #define PCI_SSVID_SIZEOF        8
>  #define PCI_SSVID_SVID          4
>  #define PCI_SSVID_SSID          6
>  
> +#define PCI_VENDOR_SIZEOF 20
> +#define PCI_VENDOR_CAP_LEN_OFFSET  2
> +#define PCI_VENDOR_GROUP_ID_OFFSET 4
> +
>  int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
>uint16_t svid, uint16_t ssid,
>Error **errp)
> @@ -57,6 +62,27 @@ int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
>  return pos;
>  }
>  
> +int pci_bridge_vendor_init(PCIDevice *d, uint8_t offset, Error **errp)
> +{
> +int pos;
> +
> +if (qemu_uuid_is_null(&d->uuid)) {
> +return 0;
> +}
> +
> +pos = pci_add_capability(d, PCI_CAP_ID_VNDR, offset, PCI_VENDOR_SIZEOF,
> +errp);
> +if (pos < 0) {
> +return pos;
> +}
> +
> +pci_set_word(d->config + pos + PCI_VENDOR_CAP_LEN_OFFSET,
> +PCI_VENDOR_SIZEOF);
> +memcpy(d->config + pos + PCI_VENDOR_GROUP_ID_OFFSET, &d->uuid,
> +sizeof(QemuUUID));
> +return pos;
> +}
> +
>  /* Accessor function to get parent bridge device from pci bus. */
>  PCIDevice *pci_bridge_get_device(PCIBus *bus)
>  {
> diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
> index b71e369703..b4189d0ce3 100644
> --- a/include/hw/pci/pcie.h
> +++ b/include/hw/pci/pcie.h
> @@ -82,6 +82,7 @@ struct PCIExpressDevice {
>  };
>  
>  #define COMPAT_PROP_PCP "power_controller_present"
> +#define COMPAT_PROP_UUID "uuid"
>  
>  /* PCI express capability helper functions */
>  int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type,



[Qemu-devel] [PATCH 1/2] qcow2: Remove dead check on !ret

2018-06-26 Thread Fam Zheng
In the beginning of the function, we initialize the local variable to 0,
and in the body of the function, we check the assigned values and exit
the loop immediately. So here it can never be non-zero.

Reported-by: Kevin Wolf 
Signed-off-by: Fam Zheng 
---
 block/qcow2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index a3a3aa2a97..ff23063616 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1772,7 +1772,7 @@ static coroutine_fn int 
qcow2_handle_l2meta(BlockDriverState *bs,
 while (l2meta != NULL) {
 QCowL2Meta *next;
 
-if (!ret && link_l2) {
+if (link_l2) {
 ret = qcow2_alloc_cluster_link_l2(bs, l2meta);
 if (ret) {
 goto out;
-- 
2.17.1
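
For illustration only, the reasoning in the commit message can be
checked with a stripped-down model of the loop. This is a sketch, not
the qcow2 code: Meta, link_one() and handle_all() are hypothetical
stand-ins for QCowL2Meta, qcow2_alloc_cluster_link_l2() and
qcow2_handle_l2meta(), and the counter exists only to show that ret is
never non-zero at the top of an iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QCowL2Meta: only the list link matters here. */
typedef struct Meta {
    struct Meta *next;
    int fail;                       /* non-zero makes the link step fail */
} Meta;

/* Counts loop iterations that start with ret already non-zero. */
static int checks_with_nonzero_ret;

/* Hypothetical stand-in for qcow2_alloc_cluster_link_l2(). */
static int link_one(const Meta *m)
{
    return m->fail ? -5 : 0;
}

static int handle_all(Meta *m, int link_l2)
{
    int ret = 0;                    /* initialised to 0 on entry */

    while (m != NULL) {
        if (ret != 0) {
            checks_with_nonzero_ret++;   /* never reached */
        }
        if (link_l2) {
            ret = link_one(m);
            if (ret) {
                goto out;           /* a failure leaves the loop at once */
            }
        }
        m = m->next;
    }
out:
    return ret;
}
```

Running handle_all() over a list whose second element fails returns the
error from that element and leaves checks_with_nonzero_ret at 0, so a
"!ret &&" guard at the top of the loop body would always be true.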




[Qemu-devel] [PATCH 0/2] block: Two copy offloading corrections

2018-06-26 Thread Fam Zheng



Fam Zheng (2):
  qcow2: Remove dead check on !ret
  block: Move request tracking to children in copy offloading

 block/io.c| 59 ---
 block/qcow2.c |  2 +-
 2 files changed, 29 insertions(+), 32 deletions(-)

-- 
2.17.1




[Qemu-devel] [PATCH 2/2] block: Move request tracking to children in copy offloading

2018-06-26 Thread Fam Zheng
in_flight and tracked requests need to be tracked in every layer during
recursion. For now the only user is qemu-img convert where overlapping
requests and IOThreads don't exist, therefore this change doesn't make
much difference from the user's point of view, but it is incorrect as part of
the API. Fix it.

Reported-by: Kevin Wolf 
Signed-off-by: Fam Zheng 
---
 block/io.c | 59 ++
 1 file changed, 28 insertions(+), 31 deletions(-)

diff --git a/block/io.c b/block/io.c
index ef4fedd364..585008a6fb 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2932,6 +2932,9 @@ static int coroutine_fn 
bdrv_co_copy_range_internal(BdrvChild *src,
 BdrvRequestFlags flags,
 bool recurse_src)
 {
+BdrvTrackedRequest src_req, dst_req;
+BlockDriverState *src_bs = src->bs;
+BlockDriverState *dst_bs = dst->bs;
 int ret;
 
 if (!src || !dst || !src->bs || !dst->bs) {
@@ -2955,17 +2958,31 @@ static int coroutine_fn 
bdrv_co_copy_range_internal(BdrvChild *src,
 || src->bs->encrypted || dst->bs->encrypted) {
 return -ENOTSUP;
 }
+bdrv_inc_in_flight(src_bs);
+bdrv_inc_in_flight(dst_bs);
+tracked_request_begin(&src_req, src_bs, src_offset,
+  bytes, BDRV_TRACKED_READ);
+tracked_request_begin(&dst_req, dst_bs, dst_offset,
+  bytes, BDRV_TRACKED_WRITE);
+
+wait_serialising_requests(&src_req);
+wait_serialising_requests(&dst_req);
 if (recurse_src) {
-return src->bs->drv->bdrv_co_copy_range_from(src->bs,
- src, src_offset,
- dst, dst_offset,
- bytes, flags);
+ret = src->bs->drv->bdrv_co_copy_range_from(src->bs,
+src, src_offset,
+dst, dst_offset,
+bytes, flags);
 } else {
-return dst->bs->drv->bdrv_co_copy_range_to(dst->bs,
-   src, src_offset,
-   dst, dst_offset,
-   bytes, flags);
+ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs,
+  src, src_offset,
+  dst, dst_offset,
+  bytes, flags);
 }
+tracked_request_end(&src_req);
+tracked_request_end(&dst_req);
+bdrv_dec_in_flight(src_bs);
+bdrv_dec_in_flight(dst_bs);
+return ret;
 }
 
 /* Copy range from @src to @dst.
@@ -2996,27 +3013,7 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, 
uint64_t src_offset,
 BdrvChild *dst, uint64_t dst_offset,
 uint64_t bytes, BdrvRequestFlags flags)
 {
-BdrvTrackedRequest src_req, dst_req;
-BlockDriverState *src_bs = src->bs;
-BlockDriverState *dst_bs = dst->bs;
-int ret;
-
-bdrv_inc_in_flight(src_bs);
-bdrv_inc_in_flight(dst_bs);
-tracked_request_begin(&src_req, src_bs, src_offset,
-  bytes, BDRV_TRACKED_READ);
-tracked_request_begin(&dst_req, dst_bs, dst_offset,
-  bytes, BDRV_TRACKED_WRITE);
-
-wait_serialising_requests(&src_req);
-wait_serialising_requests(&dst_req);
-ret = bdrv_co_copy_range_from(src, src_offset,
-  dst, dst_offset,
-  bytes, flags);
-
-tracked_request_end(&src_req);
-tracked_request_end(&dst_req);
-bdrv_dec_in_flight(src_bs);
-bdrv_dec_in_flight(dst_bs);
-return ret;
+return bdrv_co_copy_range_from(src, src_offset,
+   dst, dst_offset,
+   bytes, flags);
 }
-- 
2.17.1
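
For illustration only, the difference between tracking at the outermost
entry point and tracking in every layer can be shown with a toy model.
This is a sketch under simplifying assumptions, not the block layer
code: Node is a hypothetical stand-in for a BlockDriverState, only one
chain is modelled instead of separate source and destination chains,
and tracked requests are reduced to a single in_flight counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BlockDriverState in a chain of layers. */
typedef struct Node {
    struct Node *child;   /* next layer down, NULL at the bottom */
    int in_flight;        /* requests currently active on this node */
    int seen_busy;        /* was the node marked busy while the copy ran? */
} Node;

/* At the moment the bottom layer does the work, record which layers
 * are marked busy. */
static void snapshot(Node *top)
{
    for (Node *n = top; n != NULL; n = n->child) {
        n->seen_busy = n->in_flight > 0;
    }
}

/* Tracking only in the outermost entry point: lower layers never
 * count the request. */
static void copy_tracked_at_top(Node *top)
{
    top->in_flight++;
    snapshot(top);
    top->in_flight--;
}

/* Tracking in the recursive helper: every layer counts the request
 * for as long as the layers below it are working. */
static void copy_tracked_per_layer(Node *n, Node *top)
{
    n->in_flight++;
    if (n->child != NULL) {
        copy_tracked_per_layer(n->child, top);
    } else {
        snapshot(top);
    }
    n->in_flight--;
}
```

With a three-layer chain, copy_tracked_at_top() leaves the two lower
layers looking idle during the copy, while copy_tracked_per_layer()
marks all three busy and returns every counter to zero afterwards.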




[Qemu-devel] [PATCH v2 0/4] Use of unique identifier for pairing virtio and passthrough devices...

2018-06-26 Thread Venu Busireddy
The patch set "Enable virtio_net to act as a standby for a passthru
device" [1] deals with live migration of guests that use passthrough
devices. However, that scheme uses the MAC address for pairing
the virtio device and the passthrough device. The thread "netvsc:
refactor notifier/event handling code to use the failover framework"
[2] discusses an alternate mechanism, such as using an UUID, for pairing
the devices. Based on that discussion, proposals "Add "Group Identifier"
to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
store pairing information..." [4] were made.

The current patch set includes all the feedback received for proposals [3]
and [4]. For the sake of completeness, patch for the virtio specification
is also included here. Following is the updated proposal.

1. Extend the virtio specification to include a new virtio PCI capability
   "VIRTIO_PCI_CAP_GROUP_ID_CFG".

2. Enhance the QEMU CLI to include a "uuid" option to the virtio device.
   The "uuid" is a string in UUID format.

3. Enhance the QEMU CLI to include a "uuid" option to the bridge device.
   The "uuid" is a string in UUID format. Currently, PCIe bridge for
   the Q35 model is supported.

4. The operator creates a unique identifier string using 'uuidgen'.

5. When the virtio device is created, the operator uses the "uuid" option
   (for example, '-device virtio-net-pci,uuid="string"') and specifies
   the UUID created in step 4.

   QEMU stores the UUID in the virtio device's configuration space
   in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".

6. When assigning a PCI device to the guest in passthrough mode, the
   operator first creates a bridge using the "uuid" option (for example,
   '-device pcie-downstream,uuid="string"') to specify the UUID created
   in step 4, and then attaches the passthrough device to the bridge.

   QEMU stores the UUID in the configuration space of the bridge as
   Vendor-Specific capability (0x09). The "Vendor" here is not to be
   confused with a specific organization. Instead, the vendor of the
   bridge is QEMU. To avoid mixing up with other bridges, the bridge
   will be created with vendor ID 0x1b36 (PCI_VENDOR_ID_REDHAT) and
   device ID 0x000e (PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE) if the "uuid"
   option is specified. Otherwise, current defaults are used.

7. Patch 4 in patch series "Enable virtio_net to act as a standby for
   a passthru device" [1] needs to be modified to use the UUID values
   present in the bridge's configuration space and the virtio device's
   configuration space instead of the MAC address for pairing the devices.

Thanks!

Venu

[1] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00156.html
[2] https://www.spinics.net/lists/netdev/msg499011.html
[3] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00118.html
[4] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00204.html

Changes in v2:
  - As Michael Tsirkin suggested, changed the virtio specification
to restrict the group identifier to be a 16-byte field, presented
entirely in the virtio device's configuration space.
  - As Michael Tsirkin suggested, instead of tweaking the ioh3420
device with Red Hat vendor ID, create a new PCIe bridge device
named "pcie-downstream" with Red Hat Vendor ID, and include the
group identifier in this device.
  - Added a new patch to enhance the "pci-bridge" device to support
the group identifier (for the i440FX model).

Venu Busireddy (4):
  Add a true or false option to the DEFINE_PROP_UUID macro.
  Add "Group Identifier" support to virtio devices.
  Add "Group Identifier" support to Red Hat PCI bridge.
  Add "Group Identifier" support to Red Hat PCI Express bridge.

 default-configs/arm-softmmu.mak |   1 +
 default-configs/i386-softmmu.mak|   1 +
 default-configs/x86_64-softmmu.mak  |   1 +
 hw/acpi/vmgenid.c   |   2 +-
 hw/pci-bridge/Makefile.objs |   1 +
 hw/pci-bridge/pci_bridge_dev.c  |   8 +
 hw/pci-bridge/pcie_downstream.c | 215 
 hw/pci-bridge/pcie_downstream.h |  10 +
 hw/pci/pci_bridge.c |  26 +++
 hw/virtio/virtio-pci.c  |  15 ++
 hw/virtio/virtio-pci.h  |   3 +-
 include/hw/pci/pci.h|   3 +
 include/hw/pci/pcie.h   |   1 +
 include/hw/qdev-properties.h|   4 +-
 include/standard-headers/linux/virtio_pci.h |   8 +
 15 files changed, 295 insertions(+), 4 deletions(-)
 create mode 100644 hw/pci-bridge/pcie_downstream.c
 create mode 100644 hw/pci-bridge/pcie_downstream.h
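
For illustration only, steps 5 and 6 of the proposal can be sketched as
a 20-byte Vendor-Specific capability carrying the 16-byte UUID. The
PCI_VENDOR_* offsets and the 0xCC position come from patches 3/4 and
4/4 of this series. Everything else is an assumption: put_group_id_cap()
and find_group_id() are hypothetical helpers, the guest-side walk is not
part of the series, the length is written as a single byte, and byte 3
of the capability is left as zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values used by the "pci-bridge" patch in this series. */
#define PCI_CAP_ID_VNDR             0x09
#define PCI_VENDOR_SIZEOF           20
#define PCI_VENDOR_CAP_LEN_OFFSET   2
#define PCI_VENDOR_GROUP_ID_OFFSET  4

/* Standard PCI configuration-space locations. */
#define PCI_STATUS                  0x06
#define PCI_STATUS_CAP_LIST         0x10
#define PCI_CAPABILITY_LIST         0x34
#define PCI_CAP_LIST_NEXT           1
#define PCI_CONFIG_SIZE             256

/* Hypothetical helper: place the capability at 'pos' and make it the
 * head of the capability list. */
static void put_group_id_cap(uint8_t *config, uint8_t pos,
                             const uint8_t uuid[16])
{
    config[pos] = PCI_CAP_ID_VNDR;
    config[pos + PCI_CAP_LIST_NEXT] = config[PCI_CAPABILITY_LIST];
    config[pos + PCI_VENDOR_CAP_LEN_OFFSET] = PCI_VENDOR_SIZEOF;
    memcpy(config + pos + PCI_VENDOR_GROUP_ID_OFFSET, uuid, 16);
    config[PCI_CAPABILITY_LIST] = pos;
    config[PCI_STATUS] |= PCI_STATUS_CAP_LIST;
}

/* Hypothetical guest-side lookup: returns 1 and fills 'uuid' if a
 * matching capability is found, 0 otherwise. */
static int find_group_id(const uint8_t *config, uint8_t uuid[16])
{
    unsigned pos;
    int guard = 48;   /* bound the walk against malformed lists */

    if (!(config[PCI_STATUS] & PCI_STATUS_CAP_LIST)) {
        return 0;
    }
    pos = config[PCI_CAPABILITY_LIST] & 0xfc;
    while (pos >= 0x40 && guard-- > 0) {
        if (config[pos] == PCI_CAP_ID_VNDR &&
            pos + PCI_VENDOR_SIZEOF <= PCI_CONFIG_SIZE &&
            config[pos + PCI_VENDOR_CAP_LEN_OFFSET] == PCI_VENDOR_SIZEOF) {
            memcpy(uuid, config + pos + PCI_VENDOR_GROUP_ID_OFFSET, 16);
            return 1;
        }
        pos = config[pos + PCI_CAP_LIST_NEXT] & 0xfc;
    }
    return 0;
}
```

A guest driver pairing the devices would compare the 16 bytes returned
here for the bridge with the group identifier exposed by the virtio
device, instead of comparing MAC addresses.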




[Qemu-devel] [PATCH v2 4/4] Add "Group Identifier" support to Red Hat PCI Express bridge.

2018-06-26 Thread Venu Busireddy
Add a new bridge device "pcie-downstream" with a Vendor ID of
PCI_VENDOR_ID_REDHAT and Device ID of PCI_DEVICE_ID_REDHAT_DOWNSTREAM.
Also add the "Vendor-Specific" capability to the bridge to contain the
"Group Identifier" (UUID) that will be used to pair a virtio device with
the passthrough device attached to that bridge.

This capability is added to the bridge iff the "uuid" option is specified
for the bridge.

Signed-off-by: Venu Busireddy 
---
 default-configs/arm-softmmu.mak|   1 +
 default-configs/i386-softmmu.mak   |   1 +
 default-configs/x86_64-softmmu.mak |   1 +
 hw/pci-bridge/Makefile.objs|   1 +
 hw/pci-bridge/pcie_downstream.c| 215 +
 hw/pci-bridge/pcie_downstream.h|  10 ++
 include/hw/pci/pci.h   |   1 +
 7 files changed, 230 insertions(+)
 create mode 100644 hw/pci-bridge/pcie_downstream.c
 create mode 100644 hw/pci-bridge/pcie_downstream.h

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 834d45cfaf..b86c6fb122 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
 CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
+CONFIG_PCIE_DOWNSTREAM=y
 CONFIG_I82801B11=y
 CONFIG_ACPI=y
 CONFIG_SMBIOS=y
diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 8c7d4a0fa0..a900c8f052 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
 CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
+CONFIG_PCIE_DOWNSTREAM=y
 CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
diff --git a/default-configs/x86_64-softmmu.mak 
b/default-configs/x86_64-softmmu.mak
index 0390b4303c..481e4764be 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
 CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
+CONFIG_PCIE_DOWNSTREAM=y
 CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
index 47065f87d9..5b42212edc 100644
--- a/hw/pci-bridge/Makefile.objs
+++ b/hw/pci-bridge/Makefile.objs
@@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o 
gen_pcie_root_port.o pcie_pci
 common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
 common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
 common-obj-$(CONFIG_IOH3420) += ioh3420.o
+common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
 common-obj-$(CONFIG_I82801B11) += i82801b11.o
 # NewWorld PowerMac
 common-obj-$(CONFIG_DEC_PCI) += dec.o
diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
new file mode 100644
index 00..78604504ea
--- /dev/null
+++ b/hw/pci-bridge/pcie_downstream.c
@@ -0,0 +1,215 @@
+/*
+ * Red Hat PCI Express downstream port.
+ *
+ * pcie_downstream.c
+ * Most of this code is copied from xio3130_downstream.c
+ *
+ * Copyright (c) 2018 Oracle and/or its affiliates.
+ * Author: Venu Busireddy 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/pci/pci_ids.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/pcie.h"
+#include "pcie_downstream.h"
+#include "qapi/error.h"
+
+#define REDHAT_PCIE_DS_REVISION            0x1
+#define REDHAT_PCIE_DS_MSI_OFFSET          0x70
+#define REDHAT_PCIE_DS_MSI_SUPPORTED_FLAGS PCI_MSI_FLAGS_64BIT
+#define REDHAT_PCIE_DS_MSI_NR_VECTOR       1
+#define REDHAT_PCIE_DS_SSVID_OFFSET        0x80
+#define REDHAT_PCIE_DS_SSVID_SVID          0
+#define REDHAT_PCIE_DS_SSVID_SSID          0
+#define REDHAT_PCIE_DS_EXP_OFFSET          0x90
+#define REDHAT_PCIE_DS_VENDOR_OFFSET       0xCC
+#define REDHAT_PCIE_DS_AER_OFFSET          0x100
+
+static void pcie_ds_write_config(PCIDevice *d, uint32_t address,
+ uint32_t val, int len)
+{
+pci_bridge_write_config(d, address, val, len);
+pcie_cap_flr_write_config(d, address, val, len);
+pcie_cap_slot_write_config(d, address, val, len);
+pcie_aer_write_config(d, address, val, len);
+}
+
+static void pcie_ds_reset(DeviceState *qdev)
+{
+PCIDevice *d = PCI_DEVICE(qdev);
+
+pcie_cap_deverr_reset(d);
+pcie_cap_slot_reset(d);
+pcie_cap_arifwd_reset(d);

[Qemu-devel] [PATCH v2 2/4] Add "Group Identifier" support to virtio devices.

2018-06-26 Thread Venu Busireddy
Use the virtio PCI capability "VIRTIO_PCI_CAP_GROUP_ID_CFG" to store the
"Group Identifier" (UUID) specified via the command line option "uuid"
for the virtio device. The capability will be present in the virtio
device's configuration space iff the "uuid" option is specified.

Group Identifier is used to pair a virtio device with a passthrough
device.

Signed-off-by: Venu Busireddy 
---
 hw/virtio/virtio-pci.c  | 15 +++
 hw/virtio/virtio-pci.h  |  3 ++-
 include/hw/pci/pci.h|  2 ++
 include/standard-headers/linux/virtio_pci.h |  8 
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 3a01fe90f0..42703a5567 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -36,6 +36,7 @@
 #include "qemu/range.h"
 #include "hw/virtio/virtio-bus.h"
 #include "qapi/visitor.h"
+#include "qemu/uuid.h"
 
 #define VIRTIO_PCI_REGION_SIZE(dev) 
VIRTIO_PCI_CONFIG_OFF(msix_present(dev))
 
@@ -1638,6 +1639,10 @@ static void virtio_pci_device_plugged(DeviceState *d, 
Error **errp)
 .cap.cap_len = sizeof cfg,
 .cap.cfg_type = VIRTIO_PCI_CAP_PCI_CFG,
 };
+struct virtio_pci_group_id_cap group = {
+.cap.cap_len = sizeof group,
+.cap.cfg_type = VIRTIO_PCI_CAP_GROUP_ID_CFG,
+};
 struct virtio_pci_notify_cap notify_pio = {
 .cap.cap_len = sizeof notify,
 .notify_off_multiplier = cpu_to_le32(0x0),
@@ -1647,6 +1652,11 @@ static void virtio_pci_device_plugged(DeviceState *d, 
Error **errp)
 
 virtio_pci_modern_regions_init(proxy);
 
+if (!qemu_uuid_is_null(&proxy->pci_dev.uuid)) {
+memcpy(group.uuid, &proxy->pci_dev.uuid, sizeof(QemuUUID));
+virtio_pci_modern_mem_region_map(proxy, &proxy->group, &group.cap);
+}
+
 virtio_pci_modern_mem_region_map(proxy, &proxy->common, &cap);
 virtio_pci_modern_mem_region_map(proxy, &proxy->isr, &cap);
 virtio_pci_modern_mem_region_map(proxy, &proxy->device, &cap);
@@ -1763,6 +1773,10 @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error 
**errp)
 proxy->device.size = 0x1000;
 proxy->device.type = VIRTIO_PCI_CAP_DEVICE_CFG;
 
+proxy->group.offset = 0;
+proxy->group.size = 0;
+proxy->group.type = VIRTIO_PCI_CAP_GROUP_ID_CFG;
+
 proxy->notify.offset = 0x3000;
 proxy->notify.size = virtio_pci_queue_mem_mult(proxy) * VIRTIO_QUEUE_MAX;
 proxy->notify.type = VIRTIO_PCI_CAP_NOTIFY_CFG;
@@ -1898,6 +1912,7 @@ static Property virtio_pci_properties[] = {
 VIRTIO_PCI_FLAG_INIT_LNKCTL_BIT, true),
 DEFINE_PROP_BIT("x-pcie-pm-init", VirtIOPCIProxy, flags,
 VIRTIO_PCI_FLAG_INIT_PM_BIT, true),
+DEFINE_PROP_UUID("uuid", PCIDevice, uuid, false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 813082b0d7..e4592e90bf 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -164,10 +164,11 @@ struct VirtIOPCIProxy {
 VirtIOPCIRegion common;
 VirtIOPCIRegion isr;
 VirtIOPCIRegion device;
+VirtIOPCIRegion group;
 VirtIOPCIRegion notify;
 VirtIOPCIRegion notify_pio;
 };
-VirtIOPCIRegion regs[5];
+VirtIOPCIRegion regs[6];
 };
 MemoryRegion modern_bar;
 MemoryRegion io_bar;
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 990d6fcbde..ee234c5a6f 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -4,6 +4,7 @@
 #include "hw/qdev.h"
 #include "exec/memory.h"
 #include "sysemu/dma.h"
+#include "qemu/uuid.h"
 
 /* PCI includes legacy ISA access.  */
 #include "hw/isa/isa.h"
@@ -343,6 +344,7 @@ struct PCIDevice {
 bool has_rom;
 MemoryRegion rom;
 uint32_t rom_bar;
+QemuUUID uuid;
 
 /* INTx routing notifier */
 PCIINTxRoutingNotifier intx_routing_notifier;
diff --git a/include/standard-headers/linux/virtio_pci.h 
b/include/standard-headers/linux/virtio_pci.h
index 9262acd130..f6de333f1d 100644
--- a/include/standard-headers/linux/virtio_pci.h
+++ b/include/standard-headers/linux/virtio_pci.h
@@ -113,6 +113,8 @@
 #define VIRTIO_PCI_CAP_DEVICE_CFG  4
 /* PCI configuration access */
 #define VIRTIO_PCI_CAP_PCI_CFG 5
+/* Group Identifier */
+#define VIRTIO_PCI_CAP_GROUP_ID_CFG    6
 
 /* This is the PCI capability header: */
 struct virtio_pci_cap {
@@ -163,6 +165,12 @@ struct virtio_pci_cfg_cap {
uint8_t pci_cfg_data[4]; /* Data for BAR access. */
 };
 
+/* Fields in VIRTIO_PCI_CAP_GROUP_ID_CFG: */
+struct virtio_pci_group_id_cap {
+   struct virtio_pci_cap cap;
+   uint8_t uuid[16];
+};
+
 /* Macro versions of offsets for the Old Timers! */
 #define VIRTIO_PCI_CAP_VNDR    0
 #define VIRTIO_PCI_CAP_NEXT    1



[Qemu-devel] [PATCH v2 3/4] Add "Group Identifier" support to Red Hat PCI bridge.

2018-06-26 Thread Venu Busireddy
Add the "Vendor-Specific" capability to the Red Hat PCI bridge device
"pci-bridge", to contain the "Group Identifier" (UUID) that will be
used to pair a virtio device with the passthrough device attached to
that bridge.

This capability is added to the bridge iff the "uuid" option is specified
for the bridge.

Signed-off-by: Venu Busireddy 
---
 hw/pci-bridge/pci_bridge_dev.c |  8 
 hw/pci/pci_bridge.c| 26 ++
 include/hw/pci/pcie.h  |  1 +
 3 files changed, 35 insertions(+)

diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index b2d861d216..bbbc6fa1c6 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -71,6 +71,12 @@ static void pci_bridge_dev_realize(PCIDevice *dev, Error 
**errp)
 bridge_dev->msi = ON_OFF_AUTO_OFF;
 }
 
+err = pci_bridge_vendor_init(dev, 0, errp);
+if (err < 0) {
+error_append_hint(errp, "Can't init group ID, error %d\n", err);
+goto vendor_cap_err;
+}
+
 err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0, errp);
 if (err) {
 goto slotid_error;
@@ -109,6 +115,7 @@ slotid_error:
 if (shpc_present(dev)) {
 shpc_cleanup(dev, &bridge_dev->bar);
 }
+vendor_cap_err:
 shpc_error:
 pci_bridge_exitfn(dev);
 }
@@ -162,6 +169,7 @@ static Property pci_bridge_dev_properties[] = {
 ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_SHPC, PCIBridgeDev, flags,
 PCI_BRIDGE_DEV_F_SHPC_REQ, true),
+DEFINE_PROP_UUID(COMPAT_PROP_UUID, PCIDevice, uuid, false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index 40a39f57cb..cb8b3dad2a 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -34,12 +34,17 @@
 #include "hw/pci/pci_bus.h"
 #include "qemu/range.h"
 #include "qapi/error.h"
+#include "qemu/uuid.h"
 
 /* PCI bridge subsystem vendor ID helper functions */
 #define PCI_SSVID_SIZEOF        8
 #define PCI_SSVID_SVID  4
 #define PCI_SSVID_SSID  6
 
+#define PCI_VENDOR_SIZEOF 20
+#define PCI_VENDOR_CAP_LEN_OFFSET  2
+#define PCI_VENDOR_GROUP_ID_OFFSET 4
+
 int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
   uint16_t svid, uint16_t ssid,
   Error **errp)
@@ -57,6 +62,27 @@ int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
 return pos;
 }
 
+int pci_bridge_vendor_init(PCIDevice *d, uint8_t offset, Error **errp)
+{
+int pos;
+
+if (qemu_uuid_is_null(&d->uuid)) {
+return 0;
+}
+
+pos = pci_add_capability(d, PCI_CAP_ID_VNDR, offset, PCI_VENDOR_SIZEOF,
+errp);
+if (pos < 0) {
+return pos;
+}
+
+pci_set_word(d->config + pos + PCI_VENDOR_CAP_LEN_OFFSET,
+PCI_VENDOR_SIZEOF);
+memcpy(d->config + pos + PCI_VENDOR_GROUP_ID_OFFSET, &d->uuid,
+sizeof(QemuUUID));
+return pos;
+}
+
 /* Accessor function to get parent bridge device from pci bus. */
 PCIDevice *pci_bridge_get_device(PCIBus *bus)
 {
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index b71e369703..b4189d0ce3 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -82,6 +82,7 @@ struct PCIExpressDevice {
 };
 
 #define COMPAT_PROP_PCP "power_controller_present"
+#define COMPAT_PROP_UUID "uuid"
 
 /* PCI express capability helper functions */
 int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type,



[Qemu-devel] [PATCH v2 virtio 1/1] Add "Group Identifier" to virtio PCI capabilities.

2018-06-26 Thread Venu Busireddy
Add VIRTIO_PCI_CAP_GROUP_ID_CFG (Group Identifier) capability to the
virtio PCI capabilities to allow for the grouping of devices.

Signed-off-by: Venu Busireddy 
---
 content.tex | 36 
 1 file changed, 36 insertions(+)

diff --git a/content.tex b/content.tex
index be18234..27581c1 100644
--- a/content.tex
+++ b/content.tex
@@ -599,6 +599,8 @@ The fields are interpreted as follows:
 #define VIRTIO_PCI_CAP_DEVICE_CFG    4
 /* PCI configuration access */
 #define VIRTIO_PCI_CAP_PCI_CFG   5
+/* Group Identifier */
+#define VIRTIO_PCI_CAP_GROUP_ID_CFG  6
 \end{lstlisting}
 
 Any other value is reserved for future use.
@@ -997,6 +999,40 @@ address \field{cap.length} bytes within a BAR range
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Group Identifier capability}\label{sec:Virtio Transport Options 
/ Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The VIRTIO_PCI_CAP_GROUP_ID_CFG capability provides means for grouping devices 
together.
+
+The capability is immediately followed by an identifier of arbitrary size as 
below:
+
+\begin{lstlisting}
+struct virtio_pci_group_id_cap {
+struct virtio_pci_cap cap;
+u8 group_id[]; /* Group Identifier */
+};
+\end{lstlisting}
+
+The fields \field{cap.bar}, \field{cap.length}, \field{cap.offset}
+and \field{group_id} are read-only for the driver.
+
+The specification does not impose any restrictions on the structure
+or size of group_id[], except that the size must be a multiple of 4.
+Devices are free to declare this array as large as needed, as long as
+the combined size of all capabilities can be accommodated within the
+PCI configuration space.
+
+The field \field{cap.cap_len} indicates the length of the group identifier
+\field{group_id}. The fields \field{cap.bar}, \field{cap.offset} and
+\field{cap.length} should be set to 0.
+
+\devicenormative{\paragraph}{Group Identifier capability}{Virtio Transport 
Options / Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The device MAY present the VIRTIO_PCI_CAP_GROUP_ID_CFG capability.
+
+\drivernormative{\paragraph}{Group Identifier capability}{Virtio Transport 
Options / Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The driver MUST NOT write to group_id[] area.
+
 \subsubsection{Legacy Interfaces: A Note on PCI Device 
Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device 
Layout / Legacy Interfaces: A Note on PCI Device Layout}
 
 Transitional devices MUST present part of configuration
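
For readers who want to see the byte layout the proposed capability
implies, here is a minimal Python 3 sketch. It is not part of the patch.
The helper name pack_group_id_cap is hypothetical, and the sketch assumes
that cap_len covers the virtio_pci_cap header plus group_id[] (which is
how the companion QEMU patch sets it via "sizeof group"); the spec text
above could also be read as counting only the identifier.

```python
import struct
import uuid

# struct virtio_pci_cap, little-endian:
# cap_vndr, cap_next, cap_len, cfg_type, bar, padding[3], offset, length
VIRTIO_PCI_CAP_FMT = "<BBBBB3xII"
PCI_CAP_ID_VNDR = 0x09            # PCI vendor-specific capability ID
VIRTIO_PCI_CAP_GROUP_ID_CFG = 6   # value proposed by this patch


def pack_group_id_cap(group_id, cap_next=0):
    """Serialize a Group Identifier capability followed by group_id[].

    Hypothetical helper for illustration only.
    """
    if len(group_id) % 4 != 0:
        raise ValueError("group_id size must be a multiple of 4")
    # Assumption: cap_len counts the header plus the identifier.
    cap_len = struct.calcsize(VIRTIO_PCI_CAP_FMT) + len(group_id)
    if cap_len > 0xFF:
        raise ValueError("capability does not fit in an 8-bit cap_len")
    header = struct.pack(VIRTIO_PCI_CAP_FMT,
                         PCI_CAP_ID_VNDR, cap_next, cap_len,
                         VIRTIO_PCI_CAP_GROUP_ID_CFG,
                         0,      # cap.bar
                         0,      # cap.offset
                         0)      # cap.length
    return header + bytes(group_id)


# A 16-byte UUID is one possible identifier; the spec allows other sizes.
example = pack_group_id_cap(uuid.UUID(int=1).bytes)
```

With a 16-byte UUID the header and identifier together occupy 32 bytes,
which matches the struct virtio_pci_group_id_cap used on the QEMU side.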



[Qemu-devel] [PATCH v2 1/4] Add a true or false option to the DEFINE_PROP_UUID macro.

2018-06-26 Thread Venu Busireddy
It may not always be desirable to have a random UUID stuffed into the
'_field' member. Add a new boolean option '_default' that will allow
the caller to specify whether a random UUID needs to be generated.

Also modified the instance where this macro is used.

Signed-off-by: Venu Busireddy 
---
 hw/acpi/vmgenid.c| 2 +-
 include/hw/qdev-properties.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/vmgenid.c b/hw/acpi/vmgenid.c
index d78b579a20..6d53757ee5 100644
--- a/hw/acpi/vmgenid.c
+++ b/hw/acpi/vmgenid.c
@@ -215,7 +215,7 @@ static void vmgenid_realize(DeviceState *dev, Error **errp)
 }
 
 static Property vmgenid_device_properties[] = {
-DEFINE_PROP_UUID(VMGENID_GUID, VmGenIdState, guid),
+DEFINE_PROP_UUID(VMGENID_GUID, VmGenIdState, guid, true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 4f60cc88f3..7d39a4bdcd 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -218,12 +218,12 @@ extern const PropertyInfo qdev_prop_off_auto_pcibar;
 DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_off_auto_pcibar, \
 OffAutoPCIBAR)
 
-#define DEFINE_PROP_UUID(_name, _state, _field) {  \
+#define DEFINE_PROP_UUID(_name, _state, _field, _default) {\
 .name  = (_name),  \
 .info  = &qdev_prop_uuid,  \
 .offset= offsetof(_state, _field)  \
 + type_check(QemuUUID, typeof_field(_state, _field)),  \
-.set_default = true,   \
+.set_default = _default,   \
 }
 
 #define DEFINE_PROP_END_OF_LIST()   \



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Michael S. Tsirkin
On Wed, Jun 27, 2018 at 11:38:17AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-06-27 at 03:35 +0300, Michael S. Tsirkin wrote:
> > 
> > > +
> > > +/* Extract field fname from val */
> > > +#define GETFIELD(fname, val)\
> > > +(((val) & fname##_MASK) >> fname##_LSH)
> > > +
> > > +/* Set field fname of oval to fval
> > > + * NOTE: oval isn't modified, the combined result is returned
> > > + */
> > > +#define SETFIELD(fname, oval, fval) \
> > > +(((oval) & ~fname##_MASK) | \
> > > + ((((typeof(oval))(fval)) << fname##_LSH) & fname##_MASK))
> > > +
> > 
> > Pls don't make up macros like these. We can't have each device do it.
> 
> So what ? We move the macros in a generic place ? These are MUCH better
> than open-coding the masks & shifts and much less error prone.

include/qemu/bitops.h has a ton of handy macros.
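
To make the mask-and-shift semantics under discussion concrete, here is a
small Python sketch of what the quoted GETFIELD/SETFIELD macros compute.
The function names and the example mask are hypothetical and only mirror
the macros; in QEMU itself, extract32()/deposit32() from
include/qemu/bitops.h take a start bit and a length rather than a
mask/shift pair.

```python
def get_field(mask, shift, val):
    """Extract the field selected by mask, shifted down to bit 0."""
    return (val & mask) >> shift


def set_field(mask, shift, oval, fval):
    """Return oval with the field replaced by fval; oval is not modified."""
    return (oval & ~mask) | ((fval << shift) & mask)


# Hypothetical 8-bit field occupying bits 8..15 of a register.
EXAMPLE_MASK = 0x0000FF00
EXAMPLE_LSH = 8

reg = set_field(EXAMPLE_MASK, EXAMPLE_LSH, 0x12345678, 0xAB)
field = get_field(EXAMPLE_MASK, EXAMPLE_LSH, reg)
```

Values of fval that are wider than the field are silently truncated by
the final mask, which is the same behaviour as the C macro.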

> > > @@ -1031,6 +1110,7 @@ static Property pnv_chip_properties[] = {
> > >  DEFINE_PROP_UINT64("ram-size", PnvChip, ram_size, 0),
> > >  DEFINE_PROP_UINT32("nr-cores", PnvChip, nr_cores, 1),
> > >  DEFINE_PROP_UINT64("cores-mask", PnvChip, cores_mask, 0x0),
> > > +DEFINE_PROP_UINT32("num-phbs", PnvChip, num_phbs, 1),
> > >  DEFINE_PROP_END_OF_LIST(),
> > >  };
> > 
> > How about instantiating each extra phb using -device?
> > Seems better than teaching everyone about platform-specific
> > options.
> 
> It's about which PHBs are enabled, not which are instantiated, if I
> understand Cedric changes ...
> 
> This aims at implementing the POWER8 chip correctly, it has a fixed
> number of PHBs per-chip at very specific XSCOM addresses, that the
> firmware knows about.
> 
> Cheers,
> Ben.



Re: [Qemu-devel] [PATCH] translate-all: fix locking of TBs whose two pages share the same physical page

2018-06-26 Thread Richard Henderson
On 06/25/2018 09:31 AM, Emilio G. Cota wrote:
> +} else if (page1 == page2) {
> +page_lock(p1);
> +if (ret_p2) {
> +*ret_p2 = p1;

I think you should set NULL here...

> @@ -1623,7 +1641,7 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t 
> phys_pc,
>  tb = existing_tb;
>  }
>  
> -if (p2) {
> +if (p2 && p2 != p) {
>  page_unlock(p2);

... so that you need no change here.
Otherwise it looks good.


r~




Re: [Qemu-devel] [PATCH 3/3] i.mx7d: Change IRQ number type from hwaddr to int

2018-06-26 Thread Philippe Mathieu-Daudé
On 06/26/2018 07:00 PM, Jean-Christophe Dubois wrote:
> The qdev_get_gpio_in() function accepts an int as its second parameter.
> 
> Signed-off-by: Jean-Christophe Dubois 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  hw/arm/fsl-imx7.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
> index e15aadb587..44fde03cbe 100644
> --- a/hw/arm/fsl-imx7.c
> +++ b/hw/arm/fsl-imx7.c
> @@ -324,7 +324,7 @@ static void fsl_imx7_realize(DeviceState *dev, Error 
> **errp)
>  FSL_IMX7_ECSPI4_ADDR,
>  };
>  
> -static const hwaddr FSL_IMX7_SPIn_IRQ[FSL_IMX7_NUM_ECSPIS] = {
> +static const int FSL_IMX7_SPIn_IRQ[FSL_IMX7_NUM_ECSPIS] = {
>  FSL_IMX7_ECSPI1_IRQ,
>  FSL_IMX7_ECSPI2_IRQ,
>  FSL_IMX7_ECSPI3_IRQ,
> @@ -349,7 +349,7 @@ static void fsl_imx7_realize(DeviceState *dev, Error 
> **errp)
>  FSL_IMX7_I2C4_ADDR,
>  };
>  
> -static const hwaddr FSL_IMX7_I2Cn_IRQ[FSL_IMX7_NUM_I2CS] = {
> +static const int FSL_IMX7_I2Cn_IRQ[FSL_IMX7_NUM_I2CS] = {
>  FSL_IMX7_I2C1_IRQ,
>  FSL_IMX7_I2C2_IRQ,
>  FSL_IMX7_I2C3_IRQ,
> @@ -515,7 +515,7 @@ static void fsl_imx7_realize(DeviceState *dev, Error 
> **errp)
>  FSL_IMX7_USB3_ADDR,
>  };
>  
> -static const hwaddr FSL_IMX7_USBn_IRQ[FSL_IMX7_NUM_USBS] = {
> +static const int FSL_IMX7_USBn_IRQ[FSL_IMX7_NUM_USBS] = {
>  FSL_IMX7_USB1_IRQ,
>  FSL_IMX7_USB2_IRQ,
>  FSL_IMX7_USB3_IRQ,
> 



Re: [Qemu-devel] [PATCH 4/6] docker: Use os.environ.items() instead of .iteritems()

2018-06-26 Thread Philippe Mathieu-Daudé
On 06/26/2018 11:14 PM, Eduardo Habkost wrote:
> Mapping.iteritems() doesn't exist in Python 3.
> 
> Note that Mapping.items() exists in both Python 3 and Python 2,
> but it returns a list (and not an iterator) in Python 2.  The
> existing code will work in both cases, though.
> 
> Signed-off-by: Eduardo Habkost 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  tests/docker/docker.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/docker/docker.py b/tests/docker/docker.py
> index db6b463b92..bc34bd872b 100755
> --- a/tests/docker/docker.py
> +++ b/tests/docker/docker.py
> @@ -355,7 +355,7 @@ class BuildCommand(SubCommand):
>  cksum += [(filename, _file_checksum(filename))]
>  
>  argv += ["--build-arg=" + k.lower() + "=" + v
> -for k, v in os.environ.iteritems()
> +for k, v in os.environ.items()
>  if k.lower() in FILTERED_ENV_NAMES]
>  dkr.build_image(tag, docker_dir, dockerfile,
>  quiet=args.quiet, user=args.user, argv=argv,
> 



[Qemu-devel] [PATCH 6/6] docker: Open dockerfiles in text mode

2018-06-26 Thread Eduardo Habkost
Instead of treating dockerfile contents as byte sequences, always
open dockerfiles in text mode and treat them as text.

This is not strictly required to make the script compatible with
Python 3, but it's a simpler and safer way than opening
dockerfiles in binary mode and decoding the data later.

To make the code compatible with both Python 2 and 3, use
io.open(), which accepts an 'encoding' argument on both versions.

Signed-off-by: Eduardo Habkost 
---
 tests/docker/docker.py | 46 --
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/tests/docker/docker.py b/tests/docker/docker.py
index f58af8e894..412a031c1c 100755
--- a/tests/docker/docker.py
+++ b/tests/docker/docker.py
@@ -23,6 +23,7 @@ import argparse
 import tempfile
 import re
 import signal
+import io
 from tarfile import TarFile, TarInfo
 from io import BytesIO
 from shutil import copy, rmtree
@@ -30,7 +31,7 @@ from pwd import getpwuid
 from datetime import datetime,timedelta
 
 try:
-from typing import List, Union, Tuple
+from typing import List, Union, Tuple, Text
 except ImportError:
 # needed only to make type annotations work
 pass
@@ -52,13 +53,13 @@ def _fsdecode(name):
 return name # type: ignore
 
 def _text_checksum(text):
-# type: (bytes) -> str
+# type: (Text) -> str
 """Calculate a digest string unique to the text content"""
-return hashlib.sha1(text).hexdigest()
+return hashlib.sha1(text.encode('utf-8')).hexdigest()
 
 def _file_checksum(filename):
 # type: (str) -> str
-return _text_checksum(open(filename, 'rb').read())
+return _text_checksum(io.open(filename, 'r', encoding='utf-8').read())
 
 def _guess_docker_command():
 # type: () -> List[str]
@@ -129,14 +130,14 @@ def _copy_binary_with_libs(src, dest_dir):
 _copy_with_mkdir(l , dest_dir, so_path)
 
 def _read_qemu_dockerfile(img_name):
-# type: (str) -> str
+# type: (Text) -> str
 df = os.path.join(os.path.dirname(__file__), "dockerfiles",
   img_name + ".docker")
-return open(df, "r").read()
+return io.open(df, "r", encoding='utf-8').read()
 
 def _dockerfile_preprocess(df):
-# type: (str) -> str
-out = ""
+# type: (Text) -> Text
+out = u""
 for l in df.splitlines():
 if len(l.strip()) == 0 or l.startswith("#"):
 continue
@@ -149,7 +150,7 @@ def _dockerfile_preprocess(df):
 inlining = _read_qemu_dockerfile(l[len(from_pref):])
 out += _dockerfile_preprocess(inlining)
 continue
-out += l + "\n"
+out += l + u"\n"
 return out
 
 class Docker(object):
@@ -220,32 +221,37 @@ class Docker(object):
 def build_image(self,
 tag, # type: str
 docker_dir,  # type: str
-dockerfile,  # type: str
+dockerfile,  # type: Text
 quiet=True,  # type: bool
 user=False,  # type: bool
 argv=[], # type: List[str]
 extra_files_cksum=[] # List[Tuple[str, bytes]]
 ):
 # type(...) -> None
-tmp_df = tempfile.NamedTemporaryFile(dir=docker_dir, suffix=".docker")
+tmp_ndf = tempfile.NamedTemporaryFile(dir=docker_dir, suffix=".docker")
+# on Python 2.7, NamedTemporaryFile doesn't support encoding parameter,
+# so reopen it in text mode:
+tmp_df = io.open(tmp_ndf.name, mode='w+t', encoding='utf-8')
 tmp_df.write(dockerfile)
 
 if user:
 uid = os.getuid()
 uname = getpwuid(uid).pw_name
-tmp_df.write("\n")
-tmp_df.write("RUN id %s 2>/dev/null || useradd -u %d -U %s" %
+tmp_df.write(u"\n")
+tmp_df.write(u"RUN id %s 2>/dev/null || useradd -u %d -U %s" %
  (uname, uid, uname))
 
-tmp_df.write("\n")
-tmp_df.write("LABEL com.qemu.dockerfile-checksum=%s" %
- _text_checksum(_dockerfile_preprocess(dockerfile)))
+dockerfile = _dockerfile_preprocess(dockerfile)
+
+tmp_df.write(u"\n")
+tmp_df.write(u"LABEL com.qemu.dockerfile-checksum=%s" %
+ _text_checksum(dockerfile))
 for f, c in extra_files_cksum:
-tmp_df.write("LABEL com.qemu.%s-checksum=%s" % (f, c))
+tmp_df.write(u"LABEL com.qemu.%s-checksum=%s" % (f, c))
 
 tmp_df.flush()
 
-self._do_check(["build", "-t", tag, "-f", tmp_df.name] + argv + \
+self._do_check(["build", "-t", tag, "-f", tmp_ndf.name] + argv + \
[docker_dir],
quiet=quiet)
 
@@ -326,7 +332,7 @@ class BuildCommand(SubCommand):
 
 def run(self, args, argv):
 # type: (argparse.Namespace, List[str]) -> int
-dockerfile = open(args.dockerfile, 

[Qemu-devel] [PATCH 4/6] docker: Use os.environ.items() instead of .iteritems()

2018-06-26 Thread Eduardo Habkost
Mapping.iteritems() doesn't exist in Python 3.

Note that Mapping.items() exists in both Python 3 and Python 2,
but it returns a list (and not an iterator) in Python 2.  The
existing code will work in both cases, though.

Signed-off-by: Eduardo Habkost 
---
 tests/docker/docker.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/docker/docker.py b/tests/docker/docker.py
index db6b463b92..bc34bd872b 100755
--- a/tests/docker/docker.py
+++ b/tests/docker/docker.py
@@ -355,7 +355,7 @@ class BuildCommand(SubCommand):
 cksum += [(filename, _file_checksum(filename))]
 
 argv += ["--build-arg=" + k.lower() + "=" + v
-for k, v in os.environ.iteritems()
+for k, v in os.environ.items()
 if k.lower() in FILTERED_ENV_NAMES]
 dkr.build_image(tag, docker_dir, dockerfile,
 quiet=args.quiet, user=args.user, argv=argv,
-- 
2.18.0.rc1.1.g3f1ff2140
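
As a standalone illustration of the point made in the commit message,
the sketch below runs the same comprehension over a plain dict. The
function name build_args_from_env is hypothetical; only
FILTERED_ENV_NAMES and the comprehension itself come from docker.py.

```python
FILTERED_ENV_NAMES = ['ftp_proxy', 'http_proxy', 'https_proxy']


def build_args_from_env(environ):
    """Turn proxy-related environment variables into --build-arg options.

    .items() returns a list on Python 2 and a view on Python 3; a
    comprehension can iterate over either, so no compatibility shim
    is needed.
    """
    return ["--build-arg=" + k.lower() + "=" + v
            for k, v in environ.items()
            if k.lower() in FILTERED_ENV_NAMES]


args = build_args_from_env({"HTTP_PROXY": "http://proxy:3128",
                            "HOME": "/root"})
```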




[Qemu-devel] [PATCH 3/6] docker: Add type annotations

2018-06-26 Thread Eduardo Habkost
Add type annotations that indicate how the code works today, to
make the conversion to Python 3 easier and safer.

With these type annotations, "mypy -2" is not reporting any
issues, but "mypy" in Python 3 mode reports a few problems:

tests/docker/docker.py:233: error: Argument 1 to "_text_checksum" has 
incompatible type "str"; expected "bytes"
tests/docker/docker.py:358: error: "_Environ[str]" has no attribute "iteritems"
tests/docker/docker.py:360: error: Argument 3 to "build_image" of "Docker" has 
incompatible type "bytes"; expected "str"

These problems will be addressed by the following commits.

Signed-off-by: Eduardo Habkost 
---
 tests/docker/docker.py | 44 +++---
 1 file changed, 37 insertions(+), 7 deletions(-)

diff --git a/tests/docker/docker.py b/tests/docker/docker.py
index e3bfa1cc9e..db6b463b92 100755
--- a/tests/docker/docker.py
+++ b/tests/docker/docker.py
@@ -29,6 +29,12 @@ from shutil import copy, rmtree
 from pwd import getpwuid
 from datetime import datetime,timedelta
 
+try:
+from typing import List, Union, Tuple
+except ImportError:
+# needed only to make type annotations work
+pass
+
 
 FILTERED_ENV_NAMES = ['ftp_proxy', 'http_proxy', 'https_proxy']
 
@@ -37,13 +43,16 @@ DEVNULL = open(os.devnull, 'wb')
 
 
 def _text_checksum(text):
+# type: (bytes) -> str
 """Calculate a digest string unique to the text content"""
 return hashlib.sha1(text).hexdigest()
 
 def _file_checksum(filename):
+# type: (str) -> str
 return _text_checksum(open(filename, 'rb').read())
 
 def _guess_docker_command():
+# type: () -> List[str]
 """ Guess a working docker command or raise exception if not found"""
 commands = [["docker"], ["sudo", "-n", "docker"]]
 for cmd in commands:
@@ -60,6 +69,7 @@ def _guess_docker_command():
 commands_txt)
 
 def _copy_with_mkdir(src, root_dir, sub_path='.'):
+# type: (str, str, str) -> None
 """Copy src into root_dir, creating sub_path as needed."""
 dest_dir = os.path.normpath("%s/%s" % (root_dir, sub_path))
 try:
@@ -73,6 +83,7 @@ def _copy_with_mkdir(src, root_dir, sub_path='.'):
 
 
 def _get_so_libs(executable):
+# type: (str) -> List[str]
 """Return a list of libraries associated with an executable.
 
 The paths may be symbolic links which would need to be resolved to
@@ -94,6 +105,7 @@ def _get_so_libs(executable):
 return libs
 
 def _copy_binary_with_libs(src, dest_dir):
+# type: (str, str) -> None
 """Copy a binary executable and all its dependant libraries.
 
 This does rely on the host file-system being fairly multi-arch
@@ -108,11 +120,13 @@ def _copy_binary_with_libs(src, dest_dir):
 _copy_with_mkdir(l , dest_dir, so_path)
 
 def _read_qemu_dockerfile(img_name):
+# type: (str) -> str
 df = os.path.join(os.path.dirname(__file__), "dockerfiles",
   img_name + ".docker")
 return open(df, "r").read()
 
 def _dockerfile_preprocess(df):
+# type: (str) -> str
 out = ""
 for l in df.splitlines():
 if len(l.strip()) == 0 or l.startswith("#"):
@@ -194,11 +208,16 @@ class Docker(object):
 labels = json.loads(resp)[0]["Config"].get("Labels", {})
 return labels.get("com.qemu.dockerfile-checksum", "")
 
-def build_image(self, tag, docker_dir, dockerfile,
-quiet=True, user=False, argv=None, extra_files_cksum=[]):
-if argv == None:
-argv = []
-
+def build_image(self,
+tag, # type: str
+docker_dir,  # type: str
+dockerfile,  # type: str
+quiet=True,  # type: bool
+user=False,  # type: bool
+argv=[], # type: List[str]
+extra_files_cksum=[] # List[Tuple[str, bytes]]
+):
+# type(...) -> None
 tmp_df = tempfile.NamedTemporaryFile(dir=docker_dir, suffix=".docker")
 tmp_df.write(dockerfile)
 
@@ -249,7 +268,8 @@ class Docker(object):
 
 class SubCommand(object):
 """A SubCommand template base class"""
-name = None # Subcommand name
+# Subcommand name
+name = None # type: str
 def shared_args(self, parser):
 parser.add_argument("--quiet", action="store_true",
 help="Run quietly unless an error occured")
@@ -258,6 +278,7 @@ class SubCommand(object):
 """Setup argument parser"""
 pass
 def run(self, args, argv):
+# type: (argparse.Namespace, List[str]) -> int
 """Run command.
 args: parsed argument by argument parser.
 argv: remaining arguments from sys.argv.
@@ -271,6 +292,7 @@ class RunCommand(SubCommand):
 parser.add_argument("--keep", action="store_true",
 help="Don't remove image when command completes")
 def 

[Qemu-devel] [PATCH 5/6] docker: Make _get_so_libs() work on Python 3

2018-06-26 Thread Eduardo Habkost
The "ldd" output is a byte sequence, not a string.  Use bytes
literals while handling the output, and use os.fsdecode() on the
resulting file paths before returning.

Signed-off-by: Eduardo Habkost 
---
 tests/docker/docker.py | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/tests/docker/docker.py b/tests/docker/docker.py
index bc34bd872b..f58af8e894 100755
--- a/tests/docker/docker.py
+++ b/tests/docker/docker.py
@@ -41,6 +41,15 @@ FILTERED_ENV_NAMES = ['ftp_proxy', 'http_proxy', 
'https_proxy']
 
 DEVNULL = open(os.devnull, 'wb')
 
+def _fsdecode(name):
+# type: (bytes) -> str
+"""Decode filename to str, try to use os.fsdecode() if available"""
+if hasattr(os, 'fsdecode'):
+# Python 3
+return os.fsdecode(name) # type: ignore
+else:
+# Python 2.7
+return name # type: ignore
 
 def _text_checksum(text):
 # type: (bytes) -> str
@@ -90,15 +99,15 @@ def _get_so_libs(executable):
 ensure theright data is copied."""
 
 libs = []
-ldd_re = re.compile(r"(/.*/)(\S*)")
+ldd_re = re.compile(b"(/.*/)(\S*)")
 try:
 ldd_output = subprocess.check_output(["ldd", executable])
-for line in ldd_output.split("\n"):
+for line in ldd_output.split(b"\n"):
 search = ldd_re.search(line)
 if search and len(search.groups()) == 2:
 so_path = search.groups()[0]
 so_lib = search.groups()[1]
-libs.append("%s/%s" % (so_path, so_lib))
+libs.append(_fsdecode(b"%s/%s" % (so_path, so_lib)))
 except subprocess.CalledProcessError:
 print("%s had no associated libraries (static build?)" % (executable))
 
-- 
2.18.0.rc1.1.g3f1ff2140
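
A minimal Python 3 sketch of the bytes-based parsing described above,
run against canned "ldd" output so it does not depend on the host. The
function name parse_ldd_output and the sample text are made up for the
example. Unlike the patch, the sketch concatenates the two regex groups
directly (the first group already ends in "/") and calls os.fsdecode()
without the Python 2.7 fallback.

```python
import os
import re

# Same pattern as in the patch, written as a raw bytes literal.
LDD_RE = re.compile(br"(/.*/)(\S*)")


def parse_ldd_output(ldd_output):
    """Return library paths (as str) found in raw ldd output (bytes)."""
    libs = []
    for line in ldd_output.split(b"\n"):
        search = LDD_RE.search(line)
        if search:
            so_path, so_lib = search.groups()
            # Decode only at the boundary, once the path is complete.
            libs.append(os.fsdecode(so_path + so_lib))
    return libs


# Canned sample resembling typical ldd output (illustrative only).
SAMPLE = (b"\tlinux-vdso.so.1 (0x00007ffd)\n"
          b"\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f00)\n"
          b"\t/lib64/ld-linux-x86-64.so.2 (0x00007f01)\n")

libs = parse_ldd_output(SAMPLE)
```

Keeping the data as bytes until the final decode avoids guessing the
encoding of the ldd output itself; only file names are decoded, using
the filesystem encoding.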




[Qemu-devel] [PATCH 1/6] docker: Use BytesIO instead of StringIO

2018-06-26 Thread Eduardo Habkost
The file passed as argument to TarFile.addfile() must be a binary
file, so BytesIO is more appropriate than StringIO.

This is necessary to make the code work on Python 3.

Signed-off-by: Eduardo Habkost 
---
 tests/docker/docker.py | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/tests/docker/docker.py b/tests/docker/docker.py
index 8e13f18e6c..0de7662146 100755
--- a/tests/docker/docker.py
+++ b/tests/docker/docker.py
@@ -24,10 +24,7 @@ import tempfile
 import re
 import signal
 from tarfile import TarFile, TarInfo
-try:
-from StringIO import StringIO
-except ImportError:
-from io import StringIO
+from io import BytesIO
 from shutil import copy, rmtree
 from pwd import getpwuid
 from datetime import datetime,timedelta
@@ -372,13 +369,13 @@ class UpdateCommand(SubCommand):
 tmp_tar.add(os.path.realpath(l), arcname=l)
 
 # Create a Docker buildfile
-df = StringIO()
-df.write("FROM %s\n" % args.tag)
-df.write("ADD . /\n")
-df.seek(0)
+df = BytesIO()
+df.write(b"FROM %s\n" % args.tag.encode())
+df.write(b"ADD . /\n")
 
 df_tar = TarInfo(name="Dockerfile")
-df_tar.size = len(df.buf)
+df_tar.size = df.tell()
+df.seek(0)
 tmp_tar.addfile(df_tar, fileobj=df)
 
 tmp_tar.close()
-- 
2.18.0.rc1.1.g3f1ff2140
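
The pattern used in this patch can be tried in isolation. The sketch
below builds a one-member tar archive entirely in memory; the function
name make_dockerfile_tar is hypothetical, and writing to a BytesIO
instead of a temporary file is a simplification for the example.

```python
from io import BytesIO
from tarfile import TarFile, TarInfo


def make_dockerfile_tar(tag):
    """Return tar archive bytes containing a generated Dockerfile."""
    df = BytesIO()
    df.write(b"FROM " + tag.encode("utf-8") + b"\n")
    df.write(b"ADD . /\n")

    df_tar = TarInfo(name="Dockerfile")
    # tell() gives the number of bytes written so far; it must be read
    # before rewinding, and addfile() reads exactly df_tar.size bytes.
    df_tar.size = df.tell()
    df.seek(0)

    out = BytesIO()
    tmp_tar = TarFile(fileobj=out, mode="w")
    tmp_tar.addfile(df_tar, fileobj=df)
    tmp_tar.close()
    return out.getvalue()


archive = make_dockerfile_tar("debian")
```

The order of tell() and seek(0) matters: if the buffer is rewound first,
tell() reports 0 and an empty member is written to the archive.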




[Qemu-devel] [PATCH 2/6] docker: Always return int on run()

2018-06-26 Thread Eduardo Habkost
We'll add type annotations to the run() methods, so add 'return'
statements to all the functions so the type checker won't
complain.

Signed-off-by: Eduardo Habkost 
---
 tests/docker/docker.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/docker/docker.py b/tests/docker/docker.py
index 0de7662146..e3bfa1cc9e 100755
--- a/tests/docker/docker.py
+++ b/tests/docker/docker.py
@@ -418,7 +418,7 @@ class ProbeCommand(SubCommand):
 except Exception:
 print("no")
 
-return
+return 0
 
 
 class CcCommand(SubCommand):
@@ -503,6 +503,7 @@ class CheckCommand(SubCommand):
 print ("Image less than %d minutes old" % (args.olderthan))
 return 0
 
+return 0
 
 def main():
 parser = argparse.ArgumentParser(description="A Docker helper",
-- 
2.18.0.rc1.1.g3f1ff2140




[Qemu-devel] [PATCH 0/6] docker: Port to Python 3

2018-06-26 Thread Eduardo Habkost
This series makes tests/docker/docker.py compatible with both
Python 2 and Python 3, and adds type annotations to make
maintenance easier in the future.

A note about dockerfile encoding


One decision I made while working on this was to open dockerfiles
in text mode instead of binary mode, to make the code simpler and
safer.

This means we won't support dockerfiles that are not valid utf-8
data, but I see that as a feature and not a bug.  :)

Opening dockerfiles in binary mode and treating their contents as
byte sequences instead of text is possible if we really want to,
but I don't think it would be worth the extra code complexity.
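
To illustrate the text-mode approach, here is a minimal sketch of the
two checksum helpers as they look after the series. The names drop the
leading underscore used in docker.py, and the "with" block is an
addition of the sketch, not something the patches do.

```python
import hashlib
import io


def text_checksum(text):
    """Digest of text content; the text is encoded as UTF-8 first."""
    return hashlib.sha1(text.encode('utf-8')).hexdigest()


def file_checksum(filename):
    """Digest of a file that is required to be valid UTF-8 text."""
    # io.open() accepts 'encoding' on both Python 2 and Python 3.
    # A file that is not valid UTF-8 raises UnicodeDecodeError here.
    with io.open(filename, 'r', encoding='utf-8') as f:
        return text_checksum(f.read())


digest = text_checksum(u"FROM debian\n")
```

Because the text is always re-encoded as UTF-8 before hashing, the
digest of a file with Unix line endings is the same as hashing its raw
bytes.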

Eduardo Habkost (6):
  docker: Use BytesIO instead of StringIO
  docker: Always return int on run()
  docker: Add type annotations
  docker: Use os.environ.items() instead of .iteritems()
  docker: Make _get_so_libs() work on Python 3
  docker: Open dockerfiles in text mode

 tests/docker/docker.py | 115 -
 1 file changed, 79 insertions(+), 36 deletions(-)


base-commit: bd4e4a387aa733e40270a7406c7d111f2292de65
prerequisite-patch-id: 83051ebcf718afae38540902b60a0f8e9f91c174
prerequisite-patch-id: 1a35c71f2a58523de78e3ea2e44c5ef1f84bcc4a
prerequisite-patch-id: 5206b4c5a6797ea17eb763da6203e1881d379f2c
-- 
2.18.0.rc1.1.g3f1ff2140




Re: [Qemu-devel] [RFC PATCH 2/2] iotests: add 222 to test basic fleecing

2018-06-26 Thread Eric Blake

On 06/26/2018 05:22 PM, John Snow wrote:

Signed-off-by: John Snow 
---
  tests/qemu-iotests/222   | 121 +++
  tests/qemu-iotests/group |   1 +
  2 files changed, 122 insertions(+)
  create mode 100644 tests/qemu-iotests/222

diff --git a/tests/qemu-iotests/222 b/tests/qemu-iotests/222
new file mode 100644
index 00..133d10c351
--- /dev/null
+++ b/tests/qemu-iotests/222
@@ -0,0 +1,121 @@
+#!/usr/bin/env python
+#
+# This test covers the basic fleecing workflow.
+#
+# Copyright (C) 2018 Red Hat, Inc.
+# John helped, too.


LOL.


+
+patterns = [("0x5d", "0", "64k"),
+("0xd5", "1M", "64k"),
+("0xdc", "32M", "64k"),
+("0xcd", "67043328", "64k")]  # 64M - 64K
+
+overwrite = [("0xab", "0","64k"), # Full overwrite
+ ("0xad", "1015808",  "64k"), # Partial-left (1M-32K)
+ ("0x1d", "33587200", "64k"), # Partial-right (32M+32K)
+ ("0xea", "64M", "64k")]  # Adjacent-right (64M)
+
+with iotests.FilePath('base.img') as base_img_path, \
+ iotests.FilePath('fleece.img') as fleece_img_path, \
+ iotests.FilePath('nbd.sock') as nbd_sock_path, \
+ iotests.VM() as vm:


Does python require \ even after ','?
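
As far as I can tell, yes: a newline is only ignored inside (), [] or {},
and the header of an unparenthesized `with` is not inside any bracket, so
the comma alone does not continue the line. A small sketch of the
difference (tag is a made-up context manager, nothing from iotests):

```python
from contextlib import contextmanager

events = []


@contextmanager
def tag(name):
    """Record when the context is entered and left."""
    events.append("enter " + name)
    yield name
    events.append("exit " + name)


# No enclosing bracket here, so each line break needs a backslash,
# even though the line already ends with a comma.
with tag("a") as a, \
     tag("b") as b:
    events.append("body %s %s" % (a, b))

# Inside brackets the trailing comma is enough.
pairs = [("0x5d", "0"),
         ("0xd5", "1M")]
```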

The test looks valid - you are definitely reading data over NBD from the 
point in time that you started the blockdev-backup job, even while the 
source image continues to be modified.



+for p in overwrite:
+cmd = "write -P%s %s %s" % p
+log(cmd)
+log(vm.hmp_qemu_io(srcNode, cmd))
+
+log('')
+log('--- Verifying Data ---')
+log('')
+
+for p in patterns:
+cmd = "read -P%s %s %s" % p
+log(cmd)
+assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0


Perhaps additional steps would be to then stop the NBD export, stop the 
block job, delete the tgtNode fleecing file, then stop qemu, and finally 
check that the overwritten patterns correctly show up in the source 
image (that is, also prove that we can tear down a job, and that the 
overwrites worked).  And we may want to enhance this test (or use it as 
a starting point to copy into a new test) to play with persistent dirty 
bitmaps thrown into the mix as well.  But what you have is already a 
great start to prevent regressions, so:


Reviewed-by: Eric Blake 
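
A follow-up check on the source image would need to know which byte is
expected where once both write passes have landed. A minimal, QEMU-free
sketch of that bookkeeping (parse_size and byte_at are made up for
illustration and are not part of iotests; the model knows nothing about
the image size):

```python
UNITS = {"k": 1024, "M": 1024 ** 2}


def parse_size(text):
    """Convert qemu-io style sizes such as '64k' or '32M' to bytes."""
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


patterns = [("0x5d", "0", "64k"), ("0xd5", "1M", "64k"),
            ("0xdc", "32M", "64k"), ("0xcd", "67043328", "64k")]
overwrite = [("0xab", "0", "64k"), ("0xad", "1015808", "64k"),
             ("0x1d", "33587200", "64k"), ("0xea", "64M", "64k")]


def byte_at(pos, writes):
    """Return the pattern byte visible at pos after applying writes in order."""
    for byte, offset, size in reversed(writes):
        start = parse_size(offset)
        if start <= pos < start + parse_size(size):
            return int(byte, 16)
    return 0  # never written


writes = patterns + overwrite
K32 = 32 * 1024
M1 = parse_size("1M")
M32 = parse_size("32M")

# Offsets at the edges of the partial overwrites.
left_new = byte_at(M1 - K32, writes)    # start of the 0xad overwrite
left_old = byte_at(M1 + K32, writes)    # surviving tail of 0xd5
right_old = byte_at(M32, writes)        # surviving head of 0xdc
right_new = byte_at(M32 + K32, writes)  # start of the 0x1d overwrite
```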

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Benjamin Herrenschmidt
On Wed, 2018-06-27 at 03:35 +0300, Michael S. Tsirkin wrote:
> 
> > +
> > +/* Extract field fname from val */
> > +#define GETFIELD(fname, val)\
> > +(((val) & fname##_MASK) >> fname##_LSH)
> > +
> > +/* Set field fname of oval to fval
> > + * NOTE: oval isn't modified, the combined result is returned
> > + */
> > +#define SETFIELD(fname, oval, fval) \
> > +(((oval) & ~fname##_MASK) | \
> > + ((((typeof(oval))(fval)) << fname##_LSH) & fname##_MASK))
> > +
> 
> Pls don't make up macros like these. We can't have each device do it.

So what? Do we move the macros to a generic place? These are MUCH better
than open-coding the masks & shifts and much less error prone.
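
To make the comparison concrete, here is a rough sketch of what the two
macros compute, written in Python only to keep it short. It assumes
64-bit IBM bit numbering (bit 0 is the most significant bit); the helper
names and the example field are made up and are not QEMU definitions.

```python
MASK64 = (1 << 64) - 1


def ppc_bit(bit):
    """IBM bit numbering: bit 0 is the most significant of 64."""
    return 0x8000000000000000 >> bit


def ppc_bitmask(bs, be):
    """Mask covering IBM bits bs..be inclusive (bs <= be)."""
    return (ppc_bit(bs) - ppc_bit(be)) | ppc_bit(bs)


def ppc_bitlshift(be):
    """Left shift that aligns a value with a field ending at IBM bit be."""
    return 63 - be


def getfield(mask, lsh, val):
    """Extract the field described by (mask, lsh) from val."""
    return (val & mask) >> lsh


def setfield(mask, lsh, oval, fval):
    """Return oval with the field replaced by fval; oval is not modified."""
    return (oval & ~mask & MASK64) | ((fval << lsh) & mask)


# Hypothetical field occupying IBM bits 8..15 of a 64-bit register.
EXAMPLE_MASK = ppc_bitmask(8, 15)
EXAMPLE_LSH = ppc_bitlshift(15)

reg = setfield(EXAMPLE_MASK, EXAMPLE_LSH, 0xFFFFFFFFFFFFFFFF, 0x5A)
field = getfield(EXAMPLE_MASK, EXAMPLE_LSH, reg)
```

The open-coded equivalent repeats the mask and the shift at every call
site, which is where the two tend to drift apart.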

> > @@ -1031,6 +1110,7 @@ static Property pnv_chip_properties[] = {
> >  DEFINE_PROP_UINT64("ram-size", PnvChip, ram_size, 0),
> >  DEFINE_PROP_UINT32("nr-cores", PnvChip, nr_cores, 1),
> >  DEFINE_PROP_UINT64("cores-mask", PnvChip, cores_mask, 0x0),
> > +DEFINE_PROP_UINT32("num-phbs", PnvChip, num_phbs, 1),
> >  DEFINE_PROP_END_OF_LIST(),
> >  };
> 
> How about instantiating each extra phb using -device?
> Seems better than teaching everyone about platform-specific
> options.

It's about which PHBs are enabled, not which are instantiated, if I
understand Cedric's changes ...

This aims at implementing the POWER8 chip correctly: it has a fixed
number of PHBs per chip at very specific XSCOM addresses that the
firmware knows about.

Cheers,
Ben.




Re: [Qemu-devel] [RFC PATCH 1/2] block: allow blockdev-backup from any source

2018-06-26 Thread Eric Blake

On 06/26/2018 05:22 PM, John Snow wrote:

In the case of image fleecing, the node we choose as the source
for a blockdev-backup is going to be both a root node AND the
backing node for the exported image. It does not qualify as a root
image in this case.

Loosen the restriction.


Did we regress (and if so, when), or has this never worked?  But you are 
right: visually, we are starting with:


                  Device
Base (backing) <- Top (active)

then want to add nodes for an NBD export:

                  Device           NBD
Base (backing) <- Top           <- Tmp

with a blockdev-backup "sync":"none" from Top to Tmp (any writes from 
the Device first copy the old data to Tmp; the NBD export sees a 
read-only view of Tmp that is unchanging from the time the backup job 
started, regardless of what the Device does in the meantime).


Then when the fleece job ends, the NBD export is stopped, the 
blockdev-backup job is canceled, and Tmp is thrown away as unneeded.
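
The copy-before-write behaviour described above can be modelled in a few
lines. This is a toy sketch, not how the block layer is implemented; the
class and method names are made up, and clusters are just dictionary keys.

```python
class FleeceModel(object):
    """Toy model of blockdev-backup sync=none from Top to Tmp."""

    def __init__(self, top):
        self.top = dict(top)  # cluster index -> data, the active image
        self.tmp = {}         # clusters preserved by the backup job

    def guest_write(self, cluster, data):
        # Copy-before-write: save the old contents once, then write.
        if cluster not in self.tmp:
            self.tmp[cluster] = self.top.get(cluster, b"\0")
        self.top[cluster] = data

    def nbd_read(self, cluster):
        # Tmp is backed by Top: clusters not yet copied are read through.
        if cluster in self.tmp:
            return self.tmp[cluster]
        return self.top.get(cluster, b"\0")


model = FleeceModel({0: b"A", 1: b"B"})
model.guest_write(0, b"X")
model.guest_write(0, b"Y")  # second write must not clobber the saved copy

export_view = [model.nbd_read(0), model.nbd_read(1)]
device_view = [model.top[0], model.top[1]]
```

Throwing the model's tmp dictionary away corresponds to discarding Tmp
once the fleece job ends; Top keeps every guest write.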




Signed-off-by: John Snow 
---
  blockdev.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 58d7570932..526f8b60be 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3517,7 +3517,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, 
JobTxn *txn,
  backup->compress = false;
  }
  
-bs = qmp_get_root_bs(backup->device, errp);

+bs = bdrv_lookup_bs(backup->device, backup->device, errp);


Reviewed-by: Eric Blake 


  if (!bs) {
  return NULL;
  }



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH] ppc/pnv: fix pnv_core_realize() error handling

2018-06-26 Thread David Gibson
On Tue, Jun 26, 2018 at 04:22:14PM +0200, Cédric Le Goater wrote:
> commit d35aefa9ae15 ("ppc/pnv: introduce a new intc_create() operation
> to the chip model") changed the object link in the pnv_core_realize()
> routine but a return was forgotten in case of error, which can lead to
> more problems afterwards (segv)
> 
> Signed-off-by: Cédric Le Goater 

Applied to ppc-for-3.0, thanks.

> ---
>  hw/ppc/pnv_core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ppc/pnv_core.c b/hw/ppc/pnv_core.c
> index a9f129fc2c5f..9750464bf4a1 100644
> --- a/hw/ppc/pnv_core.c
> +++ b/hw/ppc/pnv_core.c
> @@ -150,6 +150,7 @@ static void pnv_core_realize(DeviceState *dev, Error 
> **errp)
>  if (!chip) {
>  error_propagate(errp, local_err);
>  error_prepend(errp, "required link 'chip' not found: ");
> +return;
>  }
>  
>  pc->threads = g_new(PowerPCCPU *, cc->nr_threads);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Michael S. Tsirkin
On Tue, Jun 26, 2018 at 03:59:28PM +0200, Cédric Le Goater wrote:
> diff --git a/include/hw/pci-host/pnv_phb3_regs.h 
> b/include/hw/pci-host/pnv_phb3_regs.h
> new file mode 100644
> index ..a1672726b908
> --- /dev/null
> +++ b/include/hw/pci-host/pnv_phb3_regs.h
> @@ -0,0 +1,510 @@
> +/* Copyright (c) 2013-2018, IBM Corporation.
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at
> + *
> + *  http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
> + * implied.
> + *
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +#ifndef PCI_HOST_PNV_PHB3_REGS_H
> +#define PCI_HOST_PNV_PHB3_REGS_H
> +
> +/*
> + * Duplicated from target/ppc/cpu.h
> + */
> +#define PPC_BIT(bit)            (0x8000000000000000UL >> (bit))
> +#define PPC_BIT32(bit)          (0x80000000UL >> (bit))
> +#define PPC_BIT8(bit)   (0x80UL >> (bit))
> +#define PPC_BITMASK(bs, be) ((PPC_BIT(bs) - PPC_BIT(be)) | PPC_BIT(bs))
> +#define PPC_BITMASK32(bs, be)   ((PPC_BIT32(bs) - PPC_BIT32(be)) | \
> + PPC_BIT32(bs))
> +#define PPC_BITLSHIFT(be)   (63 - (be))
> +#define PPC_BITLSHIFT32(be) (31 - (be))



> +
> +/* Extract field fname from val */
> +#define GETFIELD(fname, val)\
> +(((val) & fname##_MASK) >> fname##_LSH)
> +
> +/* Set field fname of oval to fval
> + * NOTE: oval isn't modified, the combined result is returned
> + */
> +#define SETFIELD(fname, oval, fval) \
> +(((oval) & ~fname##_MASK) | \
> + ((((typeof(oval))(fval)) << fname##_LSH) & fname##_MASK))
> +

Pls don't make up macros like these. We can't have each device do it.


> @@ -1031,6 +1110,7 @@ static Property pnv_chip_properties[] = {
>  DEFINE_PROP_UINT64("ram-size", PnvChip, ram_size, 0),
>  DEFINE_PROP_UINT32("nr-cores", PnvChip, nr_cores, 1),
>  DEFINE_PROP_UINT64("cores-mask", PnvChip, cores_mask, 0x0),
> +DEFINE_PROP_UINT32("num-phbs", PnvChip, num_phbs, 1),
>  DEFINE_PROP_END_OF_LIST(),
>  };

How about instantiating each extra phb using -device?
Seems better than teaching everyone about platform-specific
options.

-- 
MST



Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 0/5] rework the ICS classes inheritance tree

2018-06-26 Thread David Gibson
On Tue, Jun 26, 2018 at 06:37:12PM +0200, Cédric Le Goater wrote:
> On 06/26/2018 03:27 PM, Greg Kurz wrote:
> > On Mon, 25 Jun 2018 11:17:13 +0200
> > Cédric Le Goater  wrote:
> > 
> >> Hello,
> >>
> > 
> > Hello,
> > 
> > Sorry I didn't manage to look at this before it got merged :)
> > 
> >> It makes the class hierarchy much cleaner and removes duplicated
> >> code. As we are touching the location of the objects states, migration
> >> compatibility was checked and the following tests were performed under
> >> KVM :
> >>
> >>   qemu-3.0 (pseries-3.0)   -> qemu-3.0  (pseries-3.0)   OK
> >>   qemu-3.0 (pseries-2.12)  -> qemu-2.12 (pseries-2.12)  OK
> >>   qemu-3.0 (pseries-2.11)  -> qemu-2.11 (pseries-2.11)  OK
> >>   qemu-3.0 (pseries-2.10)  -> qemu-2.10 (pseries-2.10)  OK
> >>   qemu-3.0 (pseries-2.9)   -> qemu-2.9  (pseries-2.9)   OK
> >>   qemu-3.0 (pseries-2.8)   -> qemu-2.8  (pseries-2.8)   OK
> >>   qemu-3.0 (pseries-2.7)   -> qemu-2.7  (pseries-2.7)   FAIL
> > 
> > What's the failure ?
> 
> qemu-system-ppc64: error while loading state for instance 0x0 of device 'cpu'
> qemu-system-ppc64: load of migration failed: Invalid argument
> 
> and to be more precise :
> 
>qemu-3.0  (pseries-2.7)   -> qemu-2.7  (pseries-2.7)   FAIL
>qemu-2.12 (pseries-2.7)   -> qemu-2.7  (pseries-2.7)   FAIL
>qemu-2.11 (pseries-2.7)   -> qemu-2.7  (pseries-2.7)   FAIL
>qemu-2.10 (pseries-2.7)   -> qemu-2.7  (pseries-2.7)   FAIL
>qemu-2.9  (pseries-2.7)   -> qemu-2.7  (pseries-2.7)   FAIL
>qemu-2.8  (pseries-2.7)   -> qemu-2.7  (pseries-2.7)   FAIL
>qemu-2.7  (pseries-2.7)   -> qemu-2.7  (pseries-2.7)   OK
> 
> 
> So it has been a while.

Yeah, IIRC that's a known problem.  If you try 2.7.1, I think it will work.

> 
> C. 
> 
> 
> > 
> >>
> >> and back :
> >>
> >>   qemu-3.0 (pseries-3.0)  <-  qemu-3.0  (pseries-3.0)   OK
> >>   qemu-3.0 (pseries-2.12) <-  qemu-2.12 (pseries-2.12)  OK
> >>   qemu-3.0 (pseries-2.11) <-  qemu-2.11 (pseries-2.11)  OK
> >>   qemu-3.0 (pseries-2.10) <-  qemu-2.10 (pseries-2.10)  OK
> >>   qemu-3.0 (pseries-2.9)  <-  qemu-2.9  (pseries-2.9)   OK
> >>   qemu-3.0 (pseries-2.8)  <-  qemu-2.8  (pseries-2.8)   OK
> >>   qemu-3.0 (pseries-2.7)  <-  qemu-2.7  (pseries-2.7)   OK
> >>
> >> under TCG, same scenarios were run but up to 2.10 only, in which case
> >> the migration fails for other reasons.
> >>
> >> I wouldn't mind some extra cross checking from someone else.
> >>
> >> Thanks,
> >>
> >> C.
> >>
> >> Changes since v2:
> >>
> >>  - split the patch in smaller units. The migration tests were not
> >>rerun because the code is very much the same. make check was run on
> >>each patch.
> >>
> >>
> >> Cédric Le Goater (5):
> >>   ppc/xics: introduce a parent_realize in ICSStateClass
> >>   ppc/xics: move the instance_init handler under the ics-base class
> >>   ppx/xics: introduce a parent_reset in ICSStateClass
> >>   ppc/xics: move the vmstate structures under the ics-base class
> >>   ppc/xics: rework the ICS classes inheritance tree
> >>
> >>  include/hw/ppc/xics.h |   4 +-
> >>  hw/intc/xics.c| 164 
> >> --
> >>  hw/intc/xics_kvm.c|  46 +++---
> >>  hw/ppc/spapr.c|   2 +-
> >>  4 files changed, 121 insertions(+), 95 deletions(-)
> >>
> > 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [virtio-dev] Re: [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net

2018-06-26 Thread Michael S. Tsirkin
On Tue, Jun 26, 2018 at 04:38:26PM -0700, Siwei Liu wrote:
> On Mon, Jun 25, 2018 at 6:50 PM, Michael S. Tsirkin  wrote:
> > On Mon, Jun 25, 2018 at 10:54:09AM -0700, Samudrala, Sridhar wrote:
> >> > > > > Might not necessarily be something wrong, but it's very limited to
> >> > > > > prohibit the MAC of VF from changing when enslaved by failover.
> >> > > > You mean guest changing MAC? I'm not sure why we prohibit that.
> >> > > I think Sridhar and Jiri might be better person to answer it. My
> >> > > impression was that sync'ing the MAC address change between all 3
> >> > > devices is challenging, as the failover driver uses MAC address to
> >> > > match net_device internally.
> >>
> >> Yes. The MAC address is assigned by the hypervisor and it needs to manage 
> >> the movement
> >> of the MAC between the PF and VF.  Allowing the guest to change the MAC 
> >> will require
> >> synchronization between the hypervisor and the PF/VF drivers. Most of the 
> >> VF drivers
> >> don't allow changing guest MAC unless it is a trusted VF.
> >
> > OK but it's a policy thing. Maybe it's a trusted VF. Who knows?
> > For example I can see host just
> > failing VIRTIO_NET_CTRL_MAC_ADDR_SET if it wants to block it.
> > I'm not sure why VIRTIO_NET_F_STANDBY has to block it in the guest.
> 
> That's why I think pairing using MAC is fragile IMHO. When VF's MAC
> got changed before virtio attempts to match and pair the device, it
> ends up with no pairing found out at all.

Guest seems to match on the hardware mac and ignore whatever
is set by user. Makes sense to me and should not be fragile.
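
A toy sketch of what I mean, assuming each device keeps its permanent
address separately from the currently configured one (the class, field
and function names here are illustrative only, not driver code):

```python
class NetDev(object):
    def __init__(self, name, perm_addr):
        self.name = name
        self.perm_addr = perm_addr  # address the device came up with
        self.dev_addr = perm_addr   # address currently configured


def find_standby(vf, standby_devs):
    """Pair a VF with the standby device sharing its permanent MAC."""
    for dev in standby_devs:
        if dev.perm_addr == vf.perm_addr:
            return dev
    return None


virtio = NetDev("virtio0", "52:54:00:12:34:56")
vf = NetDev("vf0", "52:54:00:12:34:56")
vf.dev_addr = "02:00:00:00:00:01"  # user changes the MAC before pairing

paired = find_standby(vf, [virtio])
```

Matching on dev_addr instead would miss the pair in this example, which
is the failure mode described above.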


> UUID is better.
> 
> -Siwei
> 
> >
> > --
> > MST



Re: [Qemu-devel] [virtio-dev] Re: [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net

2018-06-26 Thread Siwei Liu
On Mon, Jun 25, 2018 at 6:50 PM, Michael S. Tsirkin  wrote:
> On Mon, Jun 25, 2018 at 10:54:09AM -0700, Samudrala, Sridhar wrote:
>> > > > > Might not necessarily be something wrong, but it's very limited to
>> > > > > prohibit the MAC of VF from changing when enslaved by failover.
>> > > > You mean guest changing MAC? I'm not sure why we prohibit that.
>> > > I think Sridhar and Jiri might be better person to answer it. My
>> > > impression was that sync'ing the MAC address change between all 3
>> > > devices is challenging, as the failover driver uses MAC address to
>> > > match net_device internally.
>>
>> Yes. The MAC address is assigned by the hypervisor and it needs to manage 
>> the movement
>> of the MAC between the PF and VF.  Allowing the guest to change the MAC will 
>> require
>> synchronization between the hypervisor and the PF/VF drivers. Most of the VF 
>> drivers
>> don't allow changing guest MAC unless it is a trusted VF.
>
> OK but it's a policy thing. Maybe it's a trusted VF. Who knows?
> For example I can see host just
> failing VIRTIO_NET_CTRL_MAC_ADDR_SET if it wants to block it.
> I'm not sure why VIRTIO_NET_F_STANDBY has to block it in the guest.

That's why I think pairing using MAC is fragile IMHO. When VF's MAC
got changed before virtio attempts to match and pair the device, it
ends up with no pairing found out at all. UUID is better.

-Siwei

>
> --
> MST



Re: [Qemu-devel] [PATCH v2 13/22] target/openrisc: Fix cpu_mmu_index

2018-06-26 Thread Richard Henderson
On 06/26/2018 03:07 PM, Stafford Horne wrote:
> Hello,
> 
> I think I found out something.
> 
> in: target/openrisc/sys_helper.c:92
> 
> When we write to `env->tlb.dtlb[idx].tr`  in helper_mtspr():
>   93  case TO_SPR(1, 640) ... TO_SPR(1, 640 + TLB_SIZE - 1):
> /* DTLBW0TR 0-127 */
>   94  idx = spr - TO_SPR(1, 640);
>   95  env->tlb.dtlb[idx].tr = rb;
> 
> 
> Somehow we are overlapping with `cpu->tb_jmp_cache`,  these are both
> pointing to the same spot in memory.
> 
> (gdb) p &cpu->tb_jmp_cache[3014]
> $9 = (struct TranslationBlock **) 0x5608b300
> (gdb) p &env->tlb.dtlb[idx].tr
> $10 = (uint32_t *) 0x5608b304

That is definitely weird.  How about

(gdb) p openrisc_env_get_cpu(env)
$1 = 
(gdb) p &$1->parent_obj
(gdb) p &$1->env
(gdb) p cs->env_ptr

There should be 4096 entries in tb_jmp_cache, so there should
be no way that overlaps.  I can only imagine either CS or ENV
is incorrect somehow.  How that would be, I don't know...
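
Taking the two addresses printed above at face value, the arithmetic can
be sketched like this (a sketch only; the 8-byte pointer size is an
assumption about the host, and the variable names are made up):

```python
PTR_SIZE = 8              # assumption: 64-bit host pointers
JMP_CACHE_ENTRIES = 4096  # number of tb_jmp_cache entries

entry_addr = 0x5608b300   # address of tb_jmp_cache[3014] as printed by gdb
tr_addr = 0x5608b304      # address of dtlb[idx].tr as printed by gdb

# Work back from entry 3014 to the bounds of the whole array.
cache_start = entry_addr - 3014 * PTR_SIZE
cache_end = cache_start + JMP_CACHE_ENTRIES * PTR_SIZE

inside = cache_start <= tr_addr < cache_end
index, byte_in_entry = divmod(tr_addr - cache_start, PTR_SIZE)
```

Under those assumptions the 32-bit store to tr lands four bytes into
entry 3014, so the overlap would be real rather than a gdb artifact.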


r~



[Qemu-devel] [RFC PATCH 2/2] iotests: add 222 to test basic fleecing

2018-06-26 Thread John Snow
Signed-off-by: John Snow 
---
 tests/qemu-iotests/222   | 121 +++
 tests/qemu-iotests/group |   1 +
 2 files changed, 122 insertions(+)
 create mode 100644 tests/qemu-iotests/222

diff --git a/tests/qemu-iotests/222 b/tests/qemu-iotests/222
new file mode 100644
index 00..133d10c351
--- /dev/null
+++ b/tests/qemu-iotests/222
@@ -0,0 +1,121 @@
+#!/usr/bin/env python
+#
+# This test covers the basic fleecing workflow.
+#
+# Copyright (C) 2018 Red Hat, Inc.
+# John helped, too.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+# Creator/Owner: John Snow 
+
+import iotests
+from iotests import log, qemu_img, qemu_io, qemu_io_silent
+
+iotests.verify_platform(['linux'])
+
+patterns = [("0x5d", "0", "64k"),
+("0xd5", "1M", "64k"),
+("0xdc", "32M", "64k"),
+("0xcd", "67043328", "64k")]  # 64M - 64K
+
+overwrite = [("0xab", "0","64k"), # Full overwrite
+ ("0xad", "1015808",  "64k"), # Partial-left (1M-32K)
+ ("0x1d", "33587200", "64k"), # Partial-right (32M+32K)
+ ("0xea", "64M", "64k")]  # Adjacent-right (64M)
+
+with iotests.FilePath('base.img') as base_img_path, \
+ iotests.FilePath('fleece.img') as fleece_img_path, \
+ iotests.FilePath('nbd.sock') as nbd_sock_path, \
+ iotests.VM() as vm:
+
+log('--- Setting up images ---')
+log('')
+
+assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
+assert qemu_img('create', '-f', iotests.imgfmt, fleece_img_path, '64M') == 0
+
+for p in patterns:
+qemu_io('-c', 'write -P%s %s %s' % p, base_img_path)
+
+log('Done')
+
+log('')
+log('--- Launching VM ---')
+log('')
+
+vm.add_drive(base_img_path)
+vm.launch()
+
+log('')
+log('--- Setting up Fleecing Graph ---')
+log('')
+
+srcNode = "drive0"
+tgtNode = "fleeceNode"
+
+# create tgtNode backed by srcNode
+log(vm.qmp("blockdev-add", **{
+"driver": "qcow2",
+"node-name": tgtNode,
+"file": {
+"driver": "file",
+"filename": fleece_img_path,
+},
+"backing": srcNode,
+}))
+
+# Establish COW from source to fleecing node
+log(vm.qmp("blockdev-backup",
+   device=srcNode,
+   target=tgtNode,
+   sync="none"))
+
+log('')
+log('--- Setting up NBD Export ---')
+log('')
+
+nbd_uri = 'nbd+unix:///%s?socket=%s' % (tgtNode, nbd_sock_path)
+log(vm.qmp("nbd-server-start",
+   **{"addr": { "type": "unix",
+"data": { "path": nbd_sock_path } } }))
+
+log(vm.qmp("nbd-server-add", device=tgtNode))
+
+log('')
+log('--- Sanity Check ---')
+log('')
+
+for p in patterns:
+cmd = "read -P%s %s %s" % p
+log(cmd)
+assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0
+
+log('')
+log('--- Testing COW ---')
+log('')
+
+for p in overwrite:
+cmd = "write -P%s %s %s" % p
+log(cmd)
+log(vm.hmp_qemu_io(srcNode, cmd))
+
+log('')
+log('--- Verifying Data ---')
+log('')
+
+for p in patterns:
+cmd = "read -P%s %s %s" % p
+log(cmd)
+assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index eea75819d2..8019a9f721 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -220,3 +220,4 @@
 218 rw auto quick
 219 rw auto
 221 rw auto quick
+222 rw auto quick
-- 
2.14.4




[Qemu-devel] [RFC PATCH 0/2] iotests: fleecing test

2018-06-26 Thread John Snow
A simple, hastily-written example of image fleecing over NBD.

John Snow (2):
  block: allow blockdev-backup from any source
  iotests: add 222 to test basic fleecing

 blockdev.c   |   2 +-
 tests/qemu-iotests/222   | 121 +++
 tests/qemu-iotests/group |   1 +
 3 files changed, 123 insertions(+), 1 deletion(-)
 create mode 100644 tests/qemu-iotests/222

-- 
2.14.4




[Qemu-devel] [RFC PATCH 1/2] block: allow blockdev-backup from any source

2018-06-26 Thread John Snow
In the case of image fleecing, the node we choose as the source
for a blockdev-backup is going to be both a root node AND the
backing node for the exported image. It does not qualify as a root
image in this case.

Loosen the restriction.

Signed-off-by: John Snow 
---
 blockdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 58d7570932..526f8b60be 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3517,7 +3517,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, 
JobTxn *txn,
 backup->compress = false;
 }
 
-bs = qmp_get_root_bs(backup->device, errp);
+bs = bdrv_lookup_bs(backup->device, backup->device, errp);
 if (!bs) {
 return NULL;
 }
-- 
2.14.4




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Benjamin Herrenschmidt
On Tue, 2018-06-26 at 17:57 +0200, Andrea Bolognani wrote:
> On Tue, 2018-06-26 at 15:59 +0200, Cédric Le Goater wrote:
> > This is a model of the PCIe host bridge found on Power8 chips,
> > including PowerBus logic interface, IOMMU support, PCIe root complex,
> > XICS MSI and LSI interrupt sources.
> > 
> > 4 PHBs are provisioned under the Power8 chip model to fit hardware but
> > only one is currently initialized.
> 
> What's the advantage in creating 4 PHBs instead of a single one,
> like we already do for pSeries guests? As it is, this will confuse
> the heck out of libvirt's PCI address allocation algorithm :)

This matches the actual HW. POWER9 will have 6 per chip :-)

The goal of the "powernv" platform in qemu is to closely match the
actual HW.

Note that pseries guests can (and will under some circumstances) have
multiple PHBs as well.

Cheers,
Ben.




Re: [Qemu-devel] [PATCH v2 13/22] target/openrisc: Fix cpu_mmu_index

2018-06-26 Thread Stafford Horne
Hello,

I think I found out something.

in: target/openrisc/sys_helper.c:92

When we write to `env->tlb.dtlb[idx].tr`  in helper_mtspr():
  93  case TO_SPR(1, 640) ... TO_SPR(1, 640 + TLB_SIZE - 1):
/* DTLBW0TR 0-127 */
  94  idx = spr - TO_SPR(1, 640);
  95  env->tlb.dtlb[idx].tr = rb;


Somehow we are overlapping with `cpu->tb_jmp_cache`,  these are both
pointing to the same spot in memory.

(gdb) p &cpu->tb_jmp_cache[3014]
$9 = (struct TranslationBlock **) 0x5608b300
(gdb) p &env->tlb.dtlb[idx].tr
$10 = (uint32_t *) 0x5608b304


I can't see why yet, but it should be something simple.  Still looking.

-Stafford
On Sun, Jun 24, 2018 at 12:44 PM Stafford Horne  wrote:
>
> On Tue, Jun 19, 2018 at 3:41 AM Richard Henderson
>  wrote:
> >
> > The code in cpu_mmu_index does not properly honor SR_DME.
> > This bug has workarounds elsewhere in that we flush the
> > tlb more often than necessary, on the state changes that
> > should be reflected in a change of mmu_index.
> >
> > Fixing this means that we can respect the mmu_index that
> > is given to tlb_flush.
> >
> > Signed-off-by: Richard Henderson 
> > ---
> >  target/openrisc/cpu.h  | 23 +
> >  target/openrisc/interrupt.c|  4 
> >  target/openrisc/interrupt_helper.c | 15 +++---
> >  target/openrisc/mmu.c  | 33 +++---
> >  target/openrisc/sys_helper.c   |  4 
> >  target/openrisc/translate.c|  2 +-
> >  6 files changed, 49 insertions(+), 32 deletions(-)
>
>
> Hello,
>
> I am trying to test these patches running a linux kernel.
>
> For some reason this is causing a strange failure with SMP but not
> single core, I see an OpenRISC target pointer is making its way into
> the tb_jmp_cache.  I don't think this is right and I am trying to
> figure out why this happens and why this patch triggers it.
>
> When bisecting to this commit I get:
>
> [New Thread 0x7fffe9f11700 (LWP 4210)]
>
> [0.00] Compiled-in FDT at (ptrval)
> [0.00] Linux version
> 4.18.0-rc1-simple-smp-6-gd5d0782e3db9-dirty
> (sho...@lianli.shorne-pla.net) (gcc version 9.0.0 20180426
> (experimental) (GCC)) #1013 SMP Sat Jun 23 17:11:42 JST 2018
> [0.00] CPU: OpenRISC-0 (revision 0) @20 MHz
> [0.00] -- dcache disabled
> [0.00] -- icache disabled
> [0.00] -- dmmu:   64 entries, 1 way(s)
> [0.00] -- immu:   64 entries, 1 way(s)
> [0.00] -- additional features:
> [0.00] -- power management
> [0.00] -- PIC
> [0.00] -- timer
> [0.00] setup_memory: Memory: 0x0-0x200
> [0.00] Setting up paging and PTEs.
> [0.00] map_ram: Memory: 0x0-0x200
> [0.00] itlb_miss_handler (ptrval)
> [0.00] dtlb_miss_handler (ptrval)
> [0.00] OpenRISC Linux -- http://openrisc.io
> [0.00] percpu: Embedded 6 pages/cpu @(ptrval) s18880 r8192 d22080 
> u49152
> [0.00] Built 1 zonelists, mobility grouping off.  Total pages: 4080
> [0.00] Kernel command line: earlycon
> [0.00] earlycon: ns16550a0 at MMIO 0x9000 (options '115200')
> [0.00] bootconsole [ns16550a0] enabled
> [0.00] Dentry cache hash table entries: 4096 (order: 1, 16384 bytes)
> [0.00] Inode-cache hash table entries: 2048 (order: 0, 8192 bytes)
> [0.00] Sorting __ex_table...
> [0.00] Memory: 22336K/32768K available (3309K kernel code, 96K
> rwdata, 736K rodata, 5898K init, 91K bss, 10432K reserved, 0K
> cma-reserved)
> [0.00] mem_init_done ...
> [0.00] Hierarchical RCU implementation.
> [0.00] NR_IRQS: 32, nr_irqs: 32, preallocated irqs: 0
> [0.00] clocksource: openrisc_timer: mask: 0x
> max_cycles: 0x, max_idle_ns: 95563022313 ns
> [0.00] 40.00 BogoMIPS (lpj=20)
> [0.00] pid_max: default: 32768 minimum: 301
> [0.00] Mount-cache hash table entries: 2048 (order: 0, 8192 bytes)
> [0.00] Mountpoint-cache hash table entries: 2048 (order: 0, 8192 
> bytes)
>
>
> (gdb) bt
> #0  0x556d3e59 in tb_lookup__cpu_state (cf_mask=0,
> flags=, cs_base=, pc= pointer>, cpu=0x55f81300)
> at /home/shorne/work/openrisc/qemu/include/exec/tb-lookup.h:31
> #1  0x556d3e59 in tb_find (cf_mask=0, tb_exit=0,
> last_tb=0x7fffe223ff00 , cpu=0x55f81300)
> at /home/shorne/work/openrisc/qemu/accel/tcg/cpu-exec.c:390
> #2  0x556d3e59 in cpu_exec (cpu=cpu@entry=0x55f81300) at
> /home/shorne/work/openrisc/qemu/accel/tcg/cpu-exec.c:735
> #3  0x556a0d2b in tcg_cpu_exec (cpu=cpu@entry=0x55f81300)
> at /home/shorne/work/openrisc/qemu/cpus.c:1362
> #4  0x556a238e in qemu_tcg_rr_cpu_thread_fn (arg=<optimized out>) at /home/shorne/work/openrisc/qemu/cpus.c:1461
> #5  0x55886005 in qemu_thread_start (args=0x55f93ef0) at
> /home/shorne/work/openrisc/qemu/util/qemu-thread-posix.c:507
> 

[Qemu-devel] [PATCH v2 5/7] sm501: Log unimplemented raster operation modes

2018-06-26 Thread BALATON Zoltan
From: Sebastian Bauer 

The sm501 currently implements only a very limited set of raster operation
modes. After this change, unknown raster operation modes are logged so
these can be easily spotted.
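
For reference, the decision the new checks make can be summarised with
the sketch below. It mirrors the conditions in the diff that follows, but
it is written in Python purely as an illustration; decode_2d_control and
is_implemented are made-up names that do not exist in sm501.c.

```python
def decode_2d_control(twoD_control):
    """Split out the fields sm501_2d_operation() reads from the register."""
    return {
        "rop_mode": (twoD_control >> 15) & 0x1,  # 1 for rop2, else rop3
        "rop2_source_is_pattern": (twoD_control >> 14) & 0x1,
        "rop": twoD_control & 0xFF,
    }


def is_implemented(twoD_control):
    """Return False where the patch would emit a LOG_UNIMP message."""
    f = decode_2d_control(twoD_control)
    if f["rop_mode"] == 0:
        # rop3: only plain copies
        return f["rop"] == 0xcc
    if f["rop2_source_is_pattern"] and f["rop"] != 0x5:
        # rop2 with pattern source: only inverse dest
        return False
    # rop2 otherwise: plain copy or inverse dest
    return f["rop"] in (0x5, 0xc)


ROP2 = 1 << 15
PATTERN = 1 << 14

results = [
    is_implemented(0x00cc),                # rop3 copy
    is_implemented(0x00aa),                # some other rop3
    is_implemented(ROP2 | 0xc),            # rop2 copy from bitmap
    is_implemented(ROP2 | PATTERN | 0xc),  # rop2 copy from pattern
    is_implemented(ROP2 | PATTERN | 0x5),  # rop2 inverse dest, pattern
]
```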

Signed-off-by: Sebastian Bauer 
Signed-off-by: BALATON Zoltan 
---
 hw/display/sm501.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 08631d5..7404035 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -706,6 +706,8 @@ static void sm501_2d_operation(SM501State *s)
 int format_flags = (s->twoD_stretch >> 20) & 0x3;
 int addressing = (s->twoD_stretch >> 16) & 0xF;
 int rop_mode = (s->twoD_control >> 15) & 0x1; /* 1 for rop2, else rop3 */
+/* 1 if rop2 source is the pattern, otherwise the source is the bitmap */
+int rop2_source_is_pattern = (s->twoD_control >> 14) & 0x1;
 int rop = s->twoD_control & 0xFF;
 
 /* get frame buffer info */
@@ -719,6 +721,27 @@ static void sm501_2d_operation(SM501State *s)
 abort();
 }
 
+if (rop_mode == 0) {
+if (rop != 0xcc) {
+/* Anything other than plain copies are not supported */
+qemu_log_mask(LOG_UNIMP, "sm501: rop3 mode with rop %x is not "
+  "supported.\n", rop);
+}
+} else {
+if (rop2_source_is_pattern && rop != 0x5) {
+/* For pattern source, we support only inverse dest */
+qemu_log_mask(LOG_UNIMP, "sm501: rop2 source being the pattern and 
"
+  "rop %x is not supported.\n", rop);
+} else {
+if (rop != 0x5 && rop != 0xc) {
+/* Anything other than plain copies or inverse dest is not
+ * supported */
+qemu_log_mask(LOG_UNIMP, "sm501: rop mode %x is not "
+  "supported.\n", rop);
+}
+}
+}
+
 if ((s->twoD_source_base & 0x0800) ||
 (s->twoD_destination_base & 0x0800)) {
 printf("%s: only local memory is supported.\n", __func__);
-- 
2.7.6




[Qemu-devel] [PATCH 1/3] i.mx7d: Remove unused header files

2018-06-26 Thread Jean-Christophe Dubois
Signed-off-by: Jean-Christophe Dubois 
---
 hw/arm/mcimx7d-sabre.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/arm/mcimx7d-sabre.c b/hw/arm/mcimx7d-sabre.c
index 95fb409d9c..9c5f0e70c3 100644
--- a/hw/arm/mcimx7d-sabre.c
+++ b/hw/arm/mcimx7d-sabre.c
@@ -18,10 +18,8 @@
 #include "hw/arm/fsl-imx7.h"
 #include "hw/boards.h"
 #include "sysemu/sysemu.h"
-#include "sysemu/device_tree.h"
 #include "qemu/error-report.h"
 #include "sysemu/qtest.h"
-#include "net/net.h"
 
 typedef struct {
 FslIMX7State soc;
-- 
2.17.1




[Qemu-devel] [PATCH v2 6/7] sm501: Fix support for non-zero frame buffer start address

2018-06-26 Thread BALATON Zoltan
Display updates and drawing hardware cursor did not work when frame
buffer address was non-zero. Fix this by taking the frame buffer
address into account in these cases. This fixes screen dragging on
AmigaOS. Based on patch by Sebastian Bauer.

Signed-off-by: Sebastian Bauer 
Signed-off-by: BALATON Zoltan 
---
v2: fixed crash with Linux setting extra bits and log unimplemented case

 hw/display/sm501.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 7404035..6e78f73 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -586,6 +586,11 @@ static uint32_t get_local_mem_size_index(uint32_t size)
 return index;
 }
 
+static ram_addr_t get_fb_addr(SM501State *s, int crt)
+{
+return (crt ? s->dc_crt_fb_addr : s->dc_panel_fb_addr) & 0x3F0;
+}
+
 static inline int get_width(SM501State *s, int crt)
 {
 int width = crt ? s->dc_crt_h_total : s->dc_panel_h_total;
@@ -688,7 +693,8 @@ static inline void hwc_invalidate(SM501State *s, int crt)
 start *= w * bpp;
 end *= w * bpp;
 
-memory_region_set_dirty(&s->local_mem_region, start, end - start);
+memory_region_set_dirty(&s->local_mem_region,
+get_fb_addr(s, crt) + start, end - start);
 }
 
 static void sm501_2d_operation(SM501State *s)
@@ -1213,6 +1219,9 @@ static void sm501_disp_ctrl_write(void *opaque, hwaddr 
addr,
 break;
 case SM501_DC_PANEL_FB_ADDR:
 s->dc_panel_fb_addr = value & 0x8FF0;
+if (value & 0x800) {
+qemu_log_mask(LOG_UNIMP, "Panel external memory not supported\n");
+}
 break;
 case SM501_DC_PANEL_FB_OFFSET:
 s->dc_panel_fb_offset = value & 0x3FF03FF0;
@@ -1273,6 +1282,9 @@ static void sm501_disp_ctrl_write(void *opaque, hwaddr 
addr,
 break;
 case SM501_DC_CRT_FB_ADDR:
 s->dc_crt_fb_addr = value & 0x8FF0;
+if (value & 0x800) {
+qemu_log_mask(LOG_UNIMP, "CRT external memory not supported\n");
+}
 break;
 case SM501_DC_CRT_FB_OFFSET:
 s->dc_crt_fb_offset = value & 0x3FF03FF0;
@@ -1615,7 +1627,7 @@ static void sm501_update_display(void *opaque)
 draw_hwc_line_func *draw_hwc_line = NULL;
 int full_update = 0;
 int y_start = -1;
-ram_addr_t offset = 0;
+ram_addr_t offset;
 uint32_t *palette;
 uint8_t hwc_palette[3 * 3];
 uint8_t *hwc_src = NULL;
@@ -1672,9 +1684,10 @@ static void sm501_update_display(void *opaque)
 }
 
 /* draw each line according to conditions */
+offset = get_fb_addr(s, crt);
 snap = memory_region_snapshot_and_clear_dirty(&s->local_mem_region,
   offset, width * height * src_bpp, DIRTY_MEMORY_VGA);
-for (y = 0, offset = 0; y < height; y++, offset += width * src_bpp) {
+for (y = 0; y < height; y++, offset += width * src_bpp) {
 int update, update_hwc;
 
 /* check if hardware cursor is enabled and we're within its range */
-- 
2.7.6
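
The core of this change can be illustrated with a small sketch: the
per-line offset into local memory now starts at the frame buffer base
instead of at 0. The helper name line_offset() is hypothetical and not
part of sm501.c; it takes the already masked base address as a
parameter, so it makes no assumption about the register mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of sm501.c: byte offset of scanline y
 * inside local memory when the frame buffer starts at fb_addr.
 */
static uint64_t line_offset(uint64_t fb_addr, int width, int src_bpp, int y)
{
    assert(width >= 0 && src_bpp > 0 && y >= 0);
    /* Each line occupies width * src_bpp bytes, counted from the base. */
    return fb_addr + (uint64_t)y * width * src_bpp;
}
```

With fb_addr == 0 this reduces to the old behaviour, which is presumably
why the problem only showed up with guests that move the frame buffer.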




[Qemu-devel] [PATCH v2 1/7] sm501: Implement i2c part for reading monitor EDID

2018-06-26 Thread BALATON Zoltan
Emulate the i2c part of SM501 which is used to access the EDID info
from a monitor.

The vmstate structure is changed and its version is increased but
SM501 is only used on SH and PPC sam460ex machines that don't support
cross-version migration.

Signed-off-by: BALATON Zoltan 
---
v2:
- added constants for register bits
- fix clearing error bit in reset reg
- set max access size to 1
v1: fixed build with SH

 default-configs/ppc-softmmu.mak|   1 +
 default-configs/ppcemb-softmmu.mak |   1 +
 default-configs/sh4-softmmu.mak|   2 +
 default-configs/sh4eb-softmmu.mak  |   2 +
 hw/display/sm501.c | 146 -
 5 files changed, 148 insertions(+), 4 deletions(-)

diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak
index b8b0526..e131e24 100644
--- a/default-configs/ppc-softmmu.mak
+++ b/default-configs/ppc-softmmu.mak
@@ -24,6 +24,7 @@ CONFIG_ETSEC=y
 # For Sam460ex
 CONFIG_USB_EHCI_SYSBUS=y
 CONFIG_SM501=y
+CONFIG_DDC=y
 CONFIG_IDE_SII3112=y
 CONFIG_I2C=y
 CONFIG_BITBANG_I2C=y
diff --git a/default-configs/ppcemb-softmmu.mak b/default-configs/ppcemb-softmmu.mak
index 37af193..ac44f15 100644
--- a/default-configs/ppcemb-softmmu.mak
+++ b/default-configs/ppcemb-softmmu.mak
@@ -17,6 +17,7 @@ CONFIG_XILINX=y
 CONFIG_XILINX_ETHLITE=y
 CONFIG_USB_EHCI_SYSBUS=y
 CONFIG_SM501=y
+CONFIG_DDC=y
 CONFIG_IDE_SII3112=y
 CONFIG_I2C=y
 CONFIG_BITBANG_I2C=y
diff --git a/default-configs/sh4-softmmu.mak b/default-configs/sh4-softmmu.mak
index 546d855..caeccd5 100644
--- a/default-configs/sh4-softmmu.mak
+++ b/default-configs/sh4-softmmu.mak
@@ -9,6 +9,8 @@ CONFIG_PFLASH_CFI02=y
 CONFIG_SH4=y
 CONFIG_IDE_MMIO=y
 CONFIG_SM501=y
+CONFIG_I2C=y
+CONFIG_DDC=y
 CONFIG_ISA_TESTDEV=y
 CONFIG_I82378=y
 CONFIG_I8259=y
diff --git a/default-configs/sh4eb-softmmu.mak b/default-configs/sh4eb-softmmu.mak
index 2d3fd49..53b9cd7 100644
--- a/default-configs/sh4eb-softmmu.mak
+++ b/default-configs/sh4eb-softmmu.mak
@@ -9,6 +9,8 @@ CONFIG_PFLASH_CFI02=y
 CONFIG_SH4=y
 CONFIG_IDE_MMIO=y
 CONFIG_SM501=y
+CONFIG_I2C=y
+CONFIG_DDC=y
 CONFIG_ISA_TESTDEV=y
 CONFIG_I82378=y
 CONFIG_I8259=y
diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 8206ae8..273495e 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -26,6 +26,7 @@
 #include "qemu/osdep.h"
 #include "qemu/cutils.h"
 #include "qapi/error.h"
+#include "qemu/log.h"
 #include "qemu-common.h"
 #include "cpu.h"
 #include "hw/hw.h"
@@ -34,6 +35,8 @@
 #include "hw/devices.h"
 #include "hw/sysbus.h"
 #include "hw/pci/pci.h"
+#include "hw/i2c/i2c.h"
+#include "hw/i2c/i2c-ddc.h"
 #include "qemu/range.h"
 #include "ui/pixel_ops.h"
 
@@ -216,6 +219,14 @@
 #define SM501_I2C_SLAVE_ADDRESS (0x03)
 #define SM501_I2C_DATA  (0x04)
 
+#define SM501_I2C_CONTROL_START (1 << 2)
+#define SM501_I2C_CONTROL_ENABLE(1 << 0)
+
+#define SM501_I2C_STATUS_COMPLETE   (1 << 3)
+#define SM501_I2C_STATUS_ERROR  (1 << 2)
+
+#define SM501_I2C_RESET_ERROR   (1 << 2)
+
 /* SSP base */
 #define SM501_SSP   (0x02)
 
@@ -471,10 +482,12 @@ typedef struct SM501State {
 MemoryRegion local_mem_region;
 MemoryRegion mmio_region;
 MemoryRegion system_config_region;
+MemoryRegion i2c_region;
 MemoryRegion disp_ctrl_region;
 MemoryRegion twoD_engine_region;
 uint32_t last_width;
 uint32_t last_height;
+I2CBus *i2c_bus;
 
 /* mmio registers */
 uint32_t system_control;
@@ -487,6 +500,11 @@ typedef struct SM501State {
 uint32_t misc_timing;
 uint32_t power_mode_control;
 
+uint8_t i2c_byte_count;
+uint8_t i2c_status;
+uint8_t i2c_addr;
+uint8_t i2c_data[16];
+
 uint32_t uart0_ier;
 uint32_t uart0_lcr;
 uint32_t uart0_mcr;
@@ -897,6 +915,109 @@ static const MemoryRegionOps sm501_system_config_ops = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+static uint64_t sm501_i2c_read(void *opaque, hwaddr addr, unsigned size)
+{
+SM501State *s = (SM501State *)opaque;
+uint8_t ret = 0;
+
+switch (addr) {
+case SM501_I2C_BYTE_COUNT:
+ret = s->i2c_byte_count;
+break;
+case SM501_I2C_STATUS:
+ret = s->i2c_status;
+break;
+case SM501_I2C_SLAVE_ADDRESS:
+ret = s->i2c_addr;
+break;
+case SM501_I2C_DATA ... SM501_I2C_DATA + 15:
+ret = s->i2c_data[addr - SM501_I2C_DATA];
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "sm501 i2c : not implemented register read."
+  " addr=0x%" HWADDR_PRIx "\n", addr);
+}
+
+SM501_DPRINTF("sm501 i2c regs : read addr=%" HWADDR_PRIx " val=%x\n",
+  addr, ret);
+return ret;
+}
+
+static void sm501_i2c_write(void *opaque, hwaddr addr, uint64_t value,
+unsigned size)
+{
+SM501State *s = (SM501State *)opaque;
+SM501_DPRINTF("sm501 i2c regs : write addr=%" HWADDR_PRIx
+  " val=%" PRIx64 

[Qemu-devel] [PATCH 3/3] i.mx7d: Change IRQ number type from hwaddr to int

2018-06-26 Thread Jean-Christophe Dubois
The qdev_get_gpio_in() function accepts an int as its second parameter.

Signed-off-by: Jean-Christophe Dubois 
---
 hw/arm/fsl-imx7.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index e15aadb587..44fde03cbe 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -324,7 +324,7 @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 FSL_IMX7_ECSPI4_ADDR,
 };
 
-static const hwaddr FSL_IMX7_SPIn_IRQ[FSL_IMX7_NUM_ECSPIS] = {
+static const int FSL_IMX7_SPIn_IRQ[FSL_IMX7_NUM_ECSPIS] = {
 FSL_IMX7_ECSPI1_IRQ,
 FSL_IMX7_ECSPI2_IRQ,
 FSL_IMX7_ECSPI3_IRQ,
@@ -349,7 +349,7 @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 FSL_IMX7_I2C4_ADDR,
 };
 
-static const hwaddr FSL_IMX7_I2Cn_IRQ[FSL_IMX7_NUM_I2CS] = {
+static const int FSL_IMX7_I2Cn_IRQ[FSL_IMX7_NUM_I2CS] = {
 FSL_IMX7_I2C1_IRQ,
 FSL_IMX7_I2C2_IRQ,
 FSL_IMX7_I2C3_IRQ,
@@ -515,7 +515,7 @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 FSL_IMX7_USB3_ADDR,
 };
 
-static const hwaddr FSL_IMX7_USBn_IRQ[FSL_IMX7_NUM_USBS] = {
+static const int FSL_IMX7_USBn_IRQ[FSL_IMX7_NUM_USBS] = {
 FSL_IMX7_USB1_IRQ,
 FSL_IMX7_USB2_IRQ,
 FSL_IMX7_USB3_IRQ,
-- 
2.17.1




[Qemu-devel] [PATCH v2 2/7] sm501: Perform a full update after palette change

2018-06-26 Thread BALATON Zoltan
From: Sebastian Bauer 

Changing the palette of a color index has an immediate effect on
all pixels with the corresponding index on real hardware. Performing a
full update after a palette change is a simple way to emulate this
effect.

Signed-off-by: Sebastian Bauer 
Signed-off-by: BALATON Zoltan 
---
v2: change type to bool

 hw/display/sm501.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 273495e..2fbb10e 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -487,6 +487,7 @@ typedef struct SM501State {
 MemoryRegion twoD_engine_region;
 uint32_t last_width;
 uint32_t last_height;
+bool do_full_update; /* perform a full update next time */
 I2CBus *i2c_bus;
 
 /* mmio registers */
@@ -1042,6 +1043,7 @@ static void sm501_palette_write(void *opaque, hwaddr addr,
 
 assert(range_covers_byte(0, 0x400 * 3, addr));
 *(uint32_t *)&s->dc_palette[addr] = value;
+s->do_full_update = true;
 }
 
 static uint64_t sm501_disp_ctrl_read(void *opaque, hwaddr addr,
@@ -1630,6 +1632,12 @@ static void sm501_update_display(void *opaque)
 full_update = 1;
 }
 
+/* someone else requested a full update */
+if (s->do_full_update) {
+s->do_full_update = false;
+full_update = 1;
+}
+
 /* draw each line according to conditions */
 snap = memory_region_snapshot_and_clear_dirty(&s->local_mem_region,
   offset, width * height * src_bpp, DIRTY_MEMORY_VGA);
-- 
2.7.6
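
The one-shot flag pattern used here can be sketched in isolation as
follows. This is a toy model, not the device code: ToyDisplay and the
toy_* functions are hypothetical names, and only the do_full_update
field mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy device state; only do_full_update mirrors the patch. */
typedef struct {
    uint32_t palette[256];
    bool do_full_update;   /* set by writers, consumed by the next refresh */
} ToyDisplay;

static void toy_palette_write(ToyDisplay *d, unsigned index, uint32_t value)
{
    assert(index < 256);
    d->palette[index] = value;
    /* Every pixel using this index changes, so request a full redraw. */
    d->do_full_update = true;
}

/* Returns true when the whole screen must be redrawn on this refresh. */
static bool toy_needs_full_update(ToyDisplay *d)
{
    bool full_update = d->do_full_update;

    d->do_full_update = false;   /* one-shot: clear after consuming */
    return full_update;
}
```

Several palette writes between two refreshes still cost only one full
redraw, since the flag is merely set again.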




[Qemu-devel] [PATCH v2 3/7] sm501: Use values from the pitch register for 2D operations

2018-06-26 Thread BALATON Zoltan
From: Sebastian Bauer 

Before, crt_h_total was used for src_width and dst_width. This is a
property of the current display setting and not relevant for the 2D
operation, which can also be done off-screen. The pitch register's
purpose is to describe the line pitch relevant to the 2D operation.

Signed-off-by: Sebastian Bauer 
Signed-off-by: BALATON Zoltan 
---
 hw/display/sm501.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 2fbb10e..8522042 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -709,8 +709,8 @@ static void sm501_2d_operation(SM501State *s)
 /* get frame buffer info */
 uint8_t *src = s->local_mem + (s->twoD_source_base & 0x03FF);
 uint8_t *dst = s->local_mem + (s->twoD_destination_base & 0x03FF);
-int src_width = (s->dc_crt_h_total & 0x0FFF) + 1;
-int dst_width = (s->dc_crt_h_total & 0x0FFF) + 1;
+int src_width = s->twoD_pitch & 0x1FFF;
+int dst_width = (s->twoD_pitch >> 16) & 0x1FFF;
 
 if (addressing != 0x0) {
 printf("%s: only XY addressing is supported.\n", __func__);
-- 
2.7.6
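
As a worked example of the decoding above, here is a small sketch that
splits a pitch register value using the same masks and shift as the
patch. ToyPitch and decode_pitch() are hypothetical names, and the
sample value is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int src_width;
    int dst_width;
} ToyPitch;

/* Split a pitch register value the same way the patch does. */
static ToyPitch decode_pitch(uint32_t twoD_pitch)
{
    ToyPitch p;

    p.src_width = twoD_pitch & 0x1FFF;          /* low half, 13 bits  */
    p.dst_width = (twoD_pitch >> 16) & 0x1FFF;  /* high half, 13 bits */
    return p;
}
```

For a made-up register value of 0x04000320 this gives a source pitch of
0x320 (800) and a destination pitch of 0x400 (1024), independent of
whatever mode the CRT is currently set to.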




[Qemu-devel] [PATCH v2 0/7] Misc sm501 improvements

2018-06-26 Thread BALATON Zoltan
Version 2 of the sm501 changes, with the fixes needed to get
AmigaOS 4.1FE to boot and produce graphics.

The strange blue-white colors that first appear are actually correct:
AmigaOS selects a low resolution PAL mode by default instead of a
board specific mode. To work around this, select the last option to
boot the live CD and then choose a better board specific mode from
ScreenMode Prefs. It takes a while for the ScreenMode window to appear
and graphics operations are slow, which could use some improvement,
but at least it seems to work correctly now apart from some
unimplemented drawing modes for compositing.

If this could be merged before the freeze with the sam460ex patches
and Sebastian's ehci patch then QEMU 3.0 could be the first version
that can boot AmigaOS.

BALATON Zoltan (3):
  sm501: Implement i2c part for reading monitor EDID
  sm501: Fix support for non-zero frame buffer start address
  sm501: Set updated region dirty after 2D operation

Sebastian Bauer (4):
  sm501: Perform a full update after palette change
  sm501: Use values from the pitch register for 2D operations
  sm501: Implement negated destination raster operation mode
  sm501: Log unimplemented raster operation modes

 default-configs/ppc-softmmu.mak|   1 +
 default-configs/ppcemb-softmmu.mak |   1 +
 default-configs/sh4-softmmu.mak|   2 +
 default-configs/sh4eb-softmmu.mak  |   2 +
 hw/display/sm501.c | 229 +++--
 5 files changed, 223 insertions(+), 12 deletions(-)

-- 
2.7.6




[Qemu-devel] [PATCH 2/3] i.mx7d: Change SRC unimpleted device name from sdma to src

2018-06-26 Thread Jean-Christophe Dubois
Signed-off-by: Jean-Christophe Dubois 
---
 hw/arm/fsl-imx7.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index 26c1d27f7c..e15aadb587 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -459,7 +459,7 @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 /*
  * SRC
  */
-create_unimplemented_device("sdma", FSL_IMX7_SRC_ADDR, FSL_IMX7_SRC_SIZE);
+create_unimplemented_device("src", FSL_IMX7_SRC_ADDR, FSL_IMX7_SRC_SIZE);
 
 /*
  * Watchdog
-- 
2.17.1




[Qemu-devel] [PATCH 0/3] i.mx7d fixes

2018-06-26 Thread Jean-Christophe Dubois
Small fixes in the i.mx7d code.

Jean-Christophe Dubois (3):
  i.mx7d: Remove unused header files
  i.mx7d: Change SRC unimpleted device name from sdma to src
  i.mx7d: Change IRQ number type from hwaddr to int

 hw/arm/fsl-imx7.c  | 8 
 hw/arm/mcimx7d-sabre.c | 2 --
 2 files changed, 4 insertions(+), 6 deletions(-)

-- 
2.17.1




[Qemu-devel] [PATCH v2 4/7] sm501: Implement negated destination raster operation mode

2018-06-26 Thread BALATON Zoltan
From: Sebastian Bauer 

Add support for the negated destination operation mode. This is used e.g.
by AmigaOS for the INVERSEVID drawing mode. With this change, the cursor
in the shell and non-immediate window adjustment are working now.

Signed-off-by: Sebastian Bauer 
Signed-off-by: BALATON Zoltan 
---
 hw/display/sm501.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 8522042..08631d5 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -705,6 +705,8 @@ static void sm501_2d_operation(SM501State *s)
 uint32_t color = s->twoD_foreground;
 int format_flags = (s->twoD_stretch >> 20) & 0x3;
 int addressing = (s->twoD_stretch >> 16) & 0xF;
+int rop_mode = (s->twoD_control >> 15) & 0x1; /* 1 for rop2, else rop3 */
+int rop = s->twoD_control & 0xFF;
 
 /* get frame buffer info */
 uint8_t *src = s->local_mem + (s->twoD_source_base & 0x03FF);
@@ -729,6 +731,8 @@ static void sm501_2d_operation(SM501State *s)
 int y, x, index_d, index_s;   \
 for (y = 0; y < operation_height; y++) {  \
 for (x = 0; x < operation_width; x++) {   \
+_pixel_type val;  \
+  \
 if (rtl) {\
 index_s = ((src_y - y) * src_width + src_x - x) * _bpp;   \
 index_d = ((dst_y - y) * dst_width + dst_x - x) * _bpp;   \
@@ -736,7 +740,13 @@ static void sm501_2d_operation(SM501State *s)
 index_s = ((src_y + y) * src_width + src_x + x) * _bpp;   \
 index_d = ((dst_y + y) * dst_width + dst_x + x) * _bpp;   \
 } \
-*(_pixel_type *)&dst[index_d] = *(_pixel_type *)&src[index_s];\
+if (rop_mode == 1 && rop == 5) {  \
+/* Invert dest */ \
+val = ~*(_pixel_type *)&dst[index_d]; \
+} else {  \
+val = *(_pixel_type *)&src[index_s];  \
+} \
+*(_pixel_type *)&dst[index_d] = val;  \
 } \
 } \
 }
-- 
2.7.6
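
Outside the macro, the per-pixel decision added by this patch can be
sketched for the 16 bpp case as below. The function toy_rop16() is
hypothetical and works on flat arrays rather than on XY coordinates;
the bit positions for rop_mode and rop are the ones used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 16 bpp copy-or-invert over n pixels; not the device code. */
static void toy_rop16(uint16_t *dst, const uint16_t *src, size_t n,
                      uint32_t twoD_control)
{
    int rop_mode = (twoD_control >> 15) & 0x1; /* 1 for rop2, else rop3 */
    int rop = twoD_control & 0xFF;

    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        uint16_t val;

        if (rop_mode == 1 && rop == 5) {
            val = (uint16_t)~dst[i];  /* invert destination */
        } else {
            val = src[i];             /* plain copy, as before the patch */
        }
        dst[i] = val;
    }
}
```

Applying the inverting operation twice restores the original pixels,
which fits its use for a blinking cursor or a rubber-band outline.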




[Qemu-devel] [PATCH v2 7/7] sm501: Set updated region dirty after 2D operation

2018-06-26 Thread BALATON Zoltan
Set the changed memory region dirty after performing a 2D operation to
ensure that the screen is updated properly.

Signed-off-by: BALATON Zoltan 
---
v2: fixed to work with non-zero fb_addr

 hw/display/sm501.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/display/sm501.c b/hw/display/sm501.c
index 6e78f73..090bc6a 100644
--- a/hw/display/sm501.c
+++ b/hw/display/sm501.c
@@ -715,12 +715,16 @@ static void sm501_2d_operation(SM501State *s)
 /* 1 if rop2 source is the pattern, otherwise the source is the bitmap */
 int rop2_source_is_pattern = (s->twoD_control >> 14) & 0x1;
 int rop = s->twoD_control & 0xFF;
+uint32_t src_base = s->twoD_source_base & 0x03FF;
+uint32_t dst_base = s->twoD_destination_base & 0x03FF;
 
 /* get frame buffer info */
-uint8_t *src = s->local_mem + (s->twoD_source_base & 0x03FF);
-uint8_t *dst = s->local_mem + (s->twoD_destination_base & 0x03FF);
+uint8_t *src = s->local_mem + src_base;
+uint8_t *dst = s->local_mem + dst_base;
 int src_width = s->twoD_pitch & 0x1FFF;
 int dst_width = (s->twoD_pitch >> 16) & 0x1FFF;
+int crt = (s->dc_crt_control & SM501_DC_CRT_CONTROL_SEL) ? 1 : 0;
+int fb_len = get_width(s, crt) * get_height(s, crt) * get_bpp(s, crt);
 
 if (addressing != 0x0) {
 printf("%s: only XY addressing is supported.\n", __func__);
@@ -821,6 +825,15 @@ static void sm501_2d_operation(SM501State *s)
 abort();
 break;
 }
+
+if (dst_base >= get_fb_addr(s, crt) &&
+dst_base <= get_fb_addr(s, crt) + fb_len) {
+int dst_len = MIN(fb_len, ((dst_y + operation_height - 1) * dst_width +
+   dst_x + operation_width) * (1 << format_flags));
+if (dst_len) {
+memory_region_set_dirty(&s->local_mem_region, dst_base, dst_len);
+}
+}
 }
 
 static uint64_t sm501_system_config_read(void *opaque, hwaddr addr,
-- 
2.7.6
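
The length calculation in the last hunk can be checked with a small
standalone sketch. toy_dirty_len() is a hypothetical helper that
repeats the arithmetic of the patch; the numbers in the note below are
made up for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper repeating the patch arithmetic: number of bytes,
 * counted from dst_base, up to the last pixel the operation touches,
 * clamped to the frame buffer length.
 */
static int toy_dirty_len(int fb_len, int dst_x, int dst_y, int dst_width,
                         int op_width, int op_height, int format_flags)
{
    int span;

    assert(format_flags >= 0 && format_flags <= 2);
    span = ((dst_y + op_height - 1) * dst_width + dst_x + op_width)
           * (1 << format_flags);
    return span < fb_len ? span : fb_len;
}
```

For an 800x600 screen at 16 bpp (fb_len = 960000, format_flags = 1), a
100x50 operation at (10, 20) with a destination pitch of 800 gives
((20 + 50 - 1) * 800 + 10 + 100) * 2 = 110620 bytes. The range starts at
dst_base rather than at the first touched pixel, so it is conservative:
it marks more than strictly necessary but never less.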




Re: [Qemu-devel] [PATCH] fix fdiv instruction

2018-06-26 Thread Richard Henderson
On 06/26/2018 12:50 PM, G 3 wrote:
> 
> If FPSCR[ZE] is set or not set, answer = 0x7ff0. This indicates to
> me that the fdiv instruction needs a little work. This is what I think should
> happen. If division by zero takes place and the FPSCR[ZE] bit is set, then the
> value in the destination register should not be altered (rather than returning
> zero).

I have not tested, but I suspect the same will be true for all other
floating-point exceptions.

E.g. try fmul of DBL_MAX * DBL_MAX with FPSCR[OE] set.

To my eye we would need to rearrange all of the fp operations:

(1) Remove helper_reset_fpstatus.
Every fp operation should leave 0 in the exception_flags.

Failure to do so indicates we're missing the post-operation
processing of exceptions via float_check_status.  Which is
in fact exactly the bug here for fdiv.  And based on a quick
browse, also fmul, fsub, and fadd.

(2) float_check_status should be re-organized.

  (a) if status == 0, early exit,
  (b) otherwise, set_float_exception_flags(&env->fp_status, 0) immediately.

(3) I suspect that all of the exception special cases can be
reordered such that we test them after the operation, as
they should all be unlikely.

A good example is target/tricore/fpu_helper.c, in which
we test the exception flags, do special cases when we
find e.g. float_flags_invalid set, and then process the
exceptions that were raised.
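
The ordering proposed in (1) to (3) can be sketched with a toy model.
This is not QEMU code: ToyFpu, toy_div(), toy_check_status() and
toy_fdiv() are hypothetical names, and only the zero-divide case is
modelled. The point is that the operation runs first, the accumulated
flags are tested afterwards, and the destination is written only if no
enabled exception was raised.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

enum { TOY_FLAG_DIVBYZERO = 1 << 0 };

typedef struct {
    int exception_flags;  /* accumulated by arithmetic, must be 0 between ops */
    bool ze;              /* models FPSCR[ZE]: zero-divide exception enabled */
    bool zx;              /* models FPSCR[ZX]: sticky zero-divide flag */
    double fpr[32];
} ToyFpu;

/* Toy arithmetic: computes a result and records what went wrong. */
static double toy_div(double a, double b, ToyFpu *fpu)
{
    if (b == 0.0 && a != 0.0 && !isnan(a) && !isinf(a)) {
        fpu->exception_flags |= TOY_FLAG_DIVBYZERO;
        return (!signbit(a) != !signbit(b)) ? -INFINITY : INFINITY;
    }
    return a / b;
}

/* Returns true if the result may be written to the destination register. */
static bool toy_check_status(ToyFpu *fpu)
{
    int flags = fpu->exception_flags;

    if (flags == 0) {
        return true;              /* (a) common case: early exit */
    }
    fpu->exception_flags = 0;     /* (b) clear immediately */

    if (flags & TOY_FLAG_DIVBYZERO) {
        fpu->zx = true;
        if (fpu->ze) {
            return false;         /* enabled exception: keep old destination */
        }
    }
    return true;
}

static void toy_fdiv(ToyFpu *fpu, int rt, int ra, int rb)
{
    double result;

    assert(rt >= 0 && rt < 32 && ra >= 0 && ra < 32 && rb >= 0 && rb < 32);
    result = toy_div(fpu->fpr[ra], fpu->fpr[rb], fpu);
    if (toy_check_status(fpu)) {
        fpu->fpr[rt] = result;
    }
}
```

Because every operation leaves exception_flags at 0 on exit, no separate
reset step is needed before the next one, which is the motivation given
for removing helper_reset_fpstatus.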


r~



Re: [Qemu-devel] [PATCH 4/5] pr-manager: add query-pr-managers QMP command

2018-06-26 Thread Eric Blake

On 06/26/2018 10:40 AM, Paolo Bonzini wrote:

This command lets you query the connection status of each pr-manager-helper
object.

Signed-off-by: Paolo Bonzini 
---



+++ b/qapi/block.json
@@ -77,6 +77,33 @@
  { 'struct': 'BlockdevSnapshotInternal',
'data': { 'device': 'str', 'name': 'str' } }
  
+##

+# @PRManagerInfo:
+#
+# Information about a persistent reservation manager
+#
+# @id: the identifier of the persistent reservation manager
+#
+# @is-connected: whether the persistent reservation manager is connected to
+#the underlying storage or helper
+#
+# Since: 3.0
+##
+{ 'struct': 'PRManagerInfo',
+  'data': {'id': 'str', 'is-connected': 'bool'} }


Bike-shedding: I think 'connected' is a reasonable (and shorter) name 
for this member



+
+##
+# @query-pr-managers:
+#
+# Returns a list of information about each persistent reservation manager.
+#
+# Returns: a list of @PRManagerInfo for each persistent reservation manager
+#
+# Since: 3.0
+##
+{ 'command': 'query-pr-managers', 'returns': ['PRManagerInfo'] }
+


As a query command, does it make sense to consider whether this command 
could be provided during preconfig?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


