Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc: Three floating point fixes

2019-08-21 Thread David Gibson
On Tue, Aug 20, 2019 at 11:35:29AM +0100, Peter Maydell wrote:
> On Tue, 20 Aug 2019 at 08:36, David Gibson wrote:
> > On Mon, Aug 19, 2019 at 12:13:34PM -0500, Paul Clarke wrote:
> > > These issues were found while running Glibc's test suite for "math",
> > > and there are still a *LOT* of QEMU-only FAILs, so I may be back
> > > again with suggested fixes or questions.  :-)
> >
> > That doesn't greatly surprise me, TCG's ppc target stuff is only so-so
> > tested, TBH.
> 
> You might also consider using/extending the risu test cases for
> ppc64 -- individual checks of each insn against a known-good
> implementation can be easier to track down bugs than trying
> to figure out why a higher-level test suite like the glibc one
> has reported a failure, IME. (There are already risu patterns
> for XSCVDPSP and XSCVDPSPN, so I think that bug at least ought
> to be found by risu if you run it against the right h/w as
> known-good reference...)

Oh, I'd love to.  I tried running risu once, it failed cryptically and
I haven't had time to investigate deeper.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v7 04/13] vfio: Add save and load functions for VFIO PCI devices

2019-08-21 Thread Kirti Wankhede
Sorry for the delayed response.

On 7/11/2019 5:37 PM, Dr. David Alan Gilbert wrote:
> * Kirti Wankhede (kwankh...@nvidia.com) wrote:
>> These functions save and restore PCI device specific data - config
>> space of PCI device.
>> Tested save and restore with MSI and MSIX type.
>>
>> Signed-off-by: Kirti Wankhede 
>> Reviewed-by: Neo Jia 
>> ---
>>  hw/vfio/pci.c | 114 
>> ++
>>  include/hw/vfio/vfio-common.h |   2 +
>>  2 files changed, 116 insertions(+)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index de0d286fc9dd..5fe4f8076cac 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -2395,11 +2395,125 @@ static Object *vfio_pci_get_object(VFIODevice 
>> *vbasedev)
>>  return OBJECT(vdev);
>>  }
>>  
>> +static void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f)
>> +{
>> +VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>> +PCIDevice *pdev = &vdev->pdev;
>> +uint16_t pci_cmd;
>> +int i;
>> +
>> +for (i = 0; i < PCI_ROM_SLOT; i++) {
>> +uint32_t bar;
>> +
>> +bar = pci_default_read_config(pdev, PCI_BASE_ADDRESS_0 + i * 4, 4);
>> +qemu_put_be32(f, bar);
>> +}
>> +
>> +qemu_put_be32(f, vdev->interrupt);
>> +if (vdev->interrupt == VFIO_INT_MSI) {
>> +uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, msi_data;
>> +bool msi_64bit;
>> +
>> +msi_flags = pci_default_read_config(pdev, pdev->msi_cap + 
>> PCI_MSI_FLAGS,
>> +2);
>> +msi_64bit = (msi_flags & PCI_MSI_FLAGS_64BIT);
>> +
>> +msi_addr_lo = pci_default_read_config(pdev,
>> + pdev->msi_cap + 
>> PCI_MSI_ADDRESS_LO, 4);
>> +qemu_put_be32(f, msi_addr_lo);
>> +
>> +if (msi_64bit) {
>> +msi_addr_hi = pci_default_read_config(pdev,
>> + pdev->msi_cap + 
>> PCI_MSI_ADDRESS_HI,
>> + 4);
>> +}
>> +qemu_put_be32(f, msi_addr_hi);
>> +
>> +msi_data = pci_default_read_config(pdev,
>> +pdev->msi_cap + (msi_64bit ? PCI_MSI_DATA_64 : 
>> PCI_MSI_DATA_32),
>> +2);
>> +qemu_put_be32(f, msi_data);
>> +} else if (vdev->interrupt == VFIO_INT_MSIX) {
>> +uint16_t offset;
>> +
>> +/* save enable bit and maskall bit */
>> +offset = pci_default_read_config(pdev,
>> +   pdev->msix_cap + PCI_MSIX_FLAGS + 1, 
>> 2);
>> +qemu_put_be16(f, offset);
>> +msix_save(pdev, f);
>> +}
>> +pci_cmd = pci_default_read_config(pdev, PCI_COMMAND, 2);
>> +qemu_put_be16(f, pci_cmd);
>> +}
>> +
>> +static void vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
>> +{
>> +VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>> +PCIDevice *pdev = &vdev->pdev;
>> +uint32_t interrupt_type;
>> +uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, msi_data;
>> +uint16_t pci_cmd;
>> +bool msi_64bit;
>> +int i;
>> +
>> +/* restore pci bar configuration */
>> +pci_cmd = pci_default_read_config(pdev, PCI_COMMAND, 2);
>> +vfio_pci_write_config(pdev, PCI_COMMAND,
>> +pci_cmd & (!(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)), 
>> 2);
>> +for (i = 0; i < PCI_ROM_SLOT; i++) {
>> +uint32_t bar = qemu_get_be32(f);
>> +
>> +vfio_pci_write_config(pdev, PCI_BASE_ADDRESS_0 + i * 4, bar, 4);
>> +}
> 
> Is it possible to validate the bar's at all?  We just had a bug on a
> virtual device where one version was asking for a larger bar than the
> other; our validation caught this in some cases so we could tell that
> the guest had a BAR that was aligned at the wrong alignment.
> 

By "validate the BARs", do you mean validating the size of the BARs?

>> +vfio_pci_write_config(pdev, PCI_COMMAND,
>> +  pci_cmd | PCI_COMMAND_IO | PCI_COMMAND_MEMORY, 2);
> 
> Can you explain what this is for?  You write the command register at the
> end of the function with the original value; there's no guarantee that
> the device is using IO for example, so ORing it seems odd.
> 

IO space and memory space accesses are disabled before writing BAR
addresses, only those are enabled here.
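
For illustration, the disable-then-enable pattern described above can be sketched in isolation. This is a minimal sketch, not the patch's code: the helper names are hypothetical, and only the PCI_COMMAND_IO (0x1) and PCI_COMMAND_MEMORY (0x2) bit values are the standard ones. Note that clearing just those two bits needs the bitwise complement `~`; a logical `!` applied to a non-zero mask yields 0, so ANDing with it would clear every bit of the register.

```c
#include <stdint.h>

/* Command register bits, with their standard PCI values. */
#define PCI_COMMAND_IO      0x1  /* enable response in I/O space */
#define PCI_COMMAND_MEMORY  0x2  /* enable response in memory space */

/* Hypothetical helper: command value with I/O and memory decoding off. */
static uint16_t cmd_decode_off(uint16_t cmd)
{
    /* Bitwise complement keeps all other command bits untouched. */
    return cmd & (uint16_t)~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
}

/* Hypothetical helper: command value with I/O and memory decoding on. */
static uint16_t cmd_decode_on(uint16_t cmd)
{
    return cmd | PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
}
```

With a command value of 0x0407, `cmd_decode_off()` gives 0x0404 (bus mastering and the interrupt-disable bit survive), and `cmd_decode_on()` on that result gives 0x0407 again.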

> Also, are the other flags in COMMAND safe at this point - e.g. what
> about interrupts and stuff?
> 

The COMMAND register is saved during the stop-and-copy phase, when interrupts
should already be disabled; it is restored here while the vCPUs are not yet running.

>> +interrupt_type = qemu_get_be32(f);
>> +
>> +if (interrupt_type == VFIO_INT_MSI) {
>> +/* restore msi configuration */
>> +msi_flags = pci_default_read_config(pdev,
>> +pdev->msi_cap + PCI_MSI_FLAGS, 
>> 2);
>> +msi_64bit = (msi_flags & PCI_MSI_FLAGS_64BIT);
>> +
>> +vfio_pci_write_config(pdev, 

[Qemu-devel] [PULL 0/2] Ui 20190822 patches

2019-08-21 Thread Gerd Hoffmann
The following changes since commit 17dc57990320edaad52ac9ea808be9719c91cea6:

  Merge remote-tracking branch 
'remotes/huth-gitlab/tags/pull-request-2019-08-20' into staging (2019-08-20 
14:14:20 +0100)

are available in the Git repository at:

  git://git.kraxel.org/qemu tags/ui-20190822-pull-request

for you to fetch changes up to a923b471fc59389e49575f38f4db3cd622619bf5:

  input-linux: add shift+shift as a grab toggle (2019-08-21 12:25:46 +0200)


curses: assert get_wch return value is okay
input-linux: add shift+shift as a grab toggle



Niklas Haas (1):
  input-linux: add shift+shift as a grab toggle

Paolo Bonzini (1):
  curses: assert get_wch return value is okay

 ui/curses.c  | 2 ++
 ui/input-linux.c | 4 
 qapi/ui.json | 3 ++-
 3 files changed, 8 insertions(+), 1 deletion(-)

-- 
2.18.1




[Qemu-devel] [PULL 2/2] input-linux: add shift+shift as a grab toggle

2019-08-21 Thread Gerd Hoffmann
From: Niklas Haas 

We have ctrl-ctrl and alt-alt; why not shift-shift? That's my preferred
grab binding, personally.

Signed-off-by: Niklas Haas 
Message-id: 20190818105038.19520-1-q...@haasn.xyz
Signed-off-by: Gerd Hoffmann 
---
 ui/input-linux.c | 4 
 qapi/ui.json | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/ui/input-linux.c b/ui/input-linux.c
index 59456fe7658b..a7b280b25b98 100644
--- a/ui/input-linux.c
+++ b/ui/input-linux.c
@@ -113,6 +113,10 @@ static bool input_linux_check_toggle(InputLinux *il)
 return il->keydown[KEY_LEFTALT] &&
 il->keydown[KEY_RIGHTALT];
 
+case GRAB_TOGGLE_KEYS_SHIFT_SHIFT:
+return il->keydown[KEY_LEFTSHIFT] &&
+il->keydown[KEY_RIGHTSHIFT];
+
 case GRAB_TOGGLE_KEYS_META_META:
 return il->keydown[KEY_LEFTMETA] &&
 il->keydown[KEY_RIGHTMETA];
diff --git a/qapi/ui.json b/qapi/ui.json
index 59e412139adc..e04525d8b44b 100644
--- a/qapi/ui.json
+++ b/qapi/ui.json
@@ -1025,7 +1025,8 @@
 #
 ##
 { 'enum': 'GrabToggleKeys',
-  'data': [ 'ctrl-ctrl', 'alt-alt', 'meta-meta', 'scrolllock', 
'ctrl-scrolllock' ] }
+  'data': [ 'ctrl-ctrl', 'alt-alt', 'shift-shift','meta-meta', 'scrolllock',
+'ctrl-scrolllock' ] }
 
 ##
 # @DisplayGTK:
-- 
2.18.1




[Qemu-devel] [PULL 1/2] curses: assert get_wch return value is okay

2019-08-21 Thread Gerd Hoffmann
From: Paolo Bonzini 

This prevents the compiler from reporting a possible uninitialized use
of maybe_keycode in function curses_refresh.

Cc: Gerd Hoffmann 
Signed-off-by: Paolo Bonzini 
Message-id: 1563451264-46176-1-git-send-email-pbonz...@redhat.com

[ kraxel: whitespace fixup ]

Signed-off-by: Gerd Hoffmann 
---
 ui/curses.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ui/curses.c b/ui/curses.c
index a6e260eb964d..ec281125acbd 100644
--- a/ui/curses.c
+++ b/ui/curses.c
@@ -225,6 +225,8 @@ static wint_t console_getch(enum maybe_keycode 
*maybe_keycode)
 case ERR:
 ret = -1;
 break;
+default:
+abort();
 }
 return ret;
 }
-- 
2.18.1




[Qemu-devel] [PATCH] i386: Fix legacy guest with xsave panic on host kvm without update cpuid.

2019-08-21 Thread Bingsong Si
Without kvm commit 412a3c41, CPUID(EAX=0xd,ECX=0).EBX is always equal to 0, even
after the guest updates xcr0. This crashes legacy guests (e.g., CentOS 6).
Below is the call trace on the guest.

[0.00] kernel BUG at mm/bootmem.c:469!
[0.00] invalid opcode:  [#1] SMP
[0.00] last sysfs file:
[0.00] CPU 0
[0.00] Modules linked in:
[0.00]
[0.00] Pid: 0, comm: swapper Tainted: G   --- H  
2.6.32-279#2 Red Hat KVM
[0.00] RIP: 0010:[]  [] 
alloc_bootmem_core+0x7b/0x29e
[0.00] RSP: 0018:81a01cd8  EFLAGS: 00010046
[0.00] RAX: 81cb1748 RBX: 81cb1720 RCX: 0100
[0.00] RDX: 0040 RSI:  RDI: 81cb1720
[0.00] RBP: 81a01d38 R08:  R09: 1000
[0.00] R10: 02008921da802087 R11: 8800 R12: 
[0.00] R13:  R14:  R15: 0100
[0.00] FS:  () GS:88000220() 
knlGS:
[0.00] CS:  0010 DS: 0018 ES: 0018 CR0: 80050033
[0.00] CR2:  CR3: 01a85000 CR4: 001406b0
[0.00] DR0:  DR1:  DR2: 
[0.00] DR3:  DR6: 0ff0 DR7: 0400
[0.00] Process swapper (pid: 0, threadinfo 81a0, task 
81a8d020)
[0.00] Stack:
[0.00]  0002 81a01dd881eaf060 7e5fe227 
1001
[0.00]  0040 0001 006c 
0100
[0.00]  81cb1720   

[0.00] Call Trace:
[0.00]  [] ___alloc_bootmem_nopanic+0x8d/0xca
[0.00]  [] ___alloc_bootmem+0x11/0x39
[0.00]  [] __alloc_bootmem+0xb/0xd
[0.00]  [] xsave_cntxt_init+0x249/0x2c0
[0.00]  [] init_thread_xstate+0x17/0x25
[0.00]  [] fpu_init+0x79/0xaa
[0.00]  [] cpu_init+0x301/0x344
[0.00]  [] ? sort+0x155/0x230
[0.00]  [] trap_init+0x24e/0x25f
[0.00]  [] start_kernel+0x21c/0x430
[0.00]  [] x86_64_start_reservations+0x125/0x129
[0.00]  [] x86_64_start_kernel+0xfa/0x109
[0.00] Code: 03 48 89 f1 49 c1 e8 0c 48 0f af d0 48 c7 c6 00 a6 61 81 
48 c7 c7 00 e5 79 81 31 c0 4c 89 74 24 08 e8 f2 d7 89 ff 4d 85 e4 75 04 <0f> 0b 
eb fe 48 8b 45 c0 48 83 e8 01 48 85 45
c0 74 04 0f 0b eb

Signed-off-by: Bingsong Si 
---
 target/i386/cpu.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ff65e11008..77510cdacd 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4416,7 +4416,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 *ecx = xsave_area_size(x86_cpu_xsave_components(cpu));
 *eax = env->features[FEAT_XSAVE_COMP_LO];
 *edx = env->features[FEAT_XSAVE_COMP_HI];
-*ebx = xsave_area_size(env->xcr0);
+/*
+ * The initial values of xcr0 and ebx are 0. On a host without kvm
+ * commit 412a3c41 (e.g., CentOS 6), ebx stays 0 even after the
+ * guest updates xcr0, which crashes some legacy guests
+ * (e.g., CentOS 6). So set ebx = ecx to work around it.
+ */
+*ebx = kvm_enabled() ? *ecx : xsave_area_size(env->xcr0);
 } else if (count == 1) {
 *eax = env->features[FEAT_XSAVE];
 } else if (count < ARRAY_SIZE(x86_ext_save_areas)) {
-- 
2.22.0
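
The difference between the two size fields of CPUID leaf 0xD, sub-leaf 0 that this workaround relies on can be sketched as follows. EBX reports the save-area size needed for the components currently enabled in XCR0, while ECX reports the size needed for all components the CPU supports. This is a minimal sketch under stated assumptions: the one-entry component table is hypothetical (it lists only AVX at its standard offset 576 and size 256), and `xsave_size_for` is an illustrative name, not QEMU's `xsave_area_size`.

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy x87/SSE region (512 bytes) plus the XSAVE header (64 bytes). */
#define XSAVE_LEGACY_AND_HEADER 576u

/* Hypothetical table entry: one extended state component. */
struct xsave_component {
    unsigned bit;      /* bit position in XCR0 */
    uint32_t offset;   /* offset in the standard-format save area */
    uint32_t size;     /* size in bytes */
};

/* Only AVX (XCR0 bit 2) is listed here; real CPUs report more via CPUID. */
static const struct xsave_component components[] = {
    { 2, 576, 256 },
};

/* Size of the standard-format area needed for the components set in mask. */
static uint32_t xsave_size_for(uint64_t mask)
{
    uint32_t size = XSAVE_LEGACY_AND_HEADER;

    for (size_t i = 0; i < sizeof(components) / sizeof(components[0]); i++) {
        const struct xsave_component *c = &components[i];

        if (((mask >> c->bit) & 1) && c->offset + c->size > size) {
            size = c->offset + c->size;
        }
    }
    return size;
}
```

In this sketch, a mask of 0x3 (x87 and SSE only) gives 576 bytes, while a mask of 0x7 (AVX added) gives 832 bytes. The first corresponds to an EBX-style query before the guest enables AVX, the second to an ECX-style query over everything supported.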




Re: [Qemu-devel] [RFC PATCH v4 31/75] target/i386: introduce code generators

2019-08-21 Thread Aleksandar Markovic
On 21.08.2019 at 20:12, "Jan Bobek" wrote:
>
> In this context, "code generators" are functions that receive decoded
> instruction operands and emit TCG ops implementing the correct
> instruction functionality. Introduce the naming macros first, actual
> generator macros will be added later.
>
> Signed-off-by: Jan Bobek 
> ---

I advise some caution here. Before adopting a coding approach that relies
heavily on the preprocessor, you should seriously evaluate the
not-always-so-obvious aspects of debuggability and readability of the end
result. In other words, you should provide a clear and objective answer to
this: What is gained and what is lost by using macros?

Thanks,
Aleksandar

>  target/i386/translate.c | 46 +
>  1 file changed, 46 insertions(+)
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 2e78bed78f..603a5b80a1 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -5331,6 +5331,52 @@ INSNOP_LDST(xmm, Mhq)
>  tcg_temp_free_i64(r64);
>  }
>
> +/*
> + * Code generators
> + */
> +#define gen_insn(mnem, argc, ...)   \
> +glue(gen_insn, argc)(mnem, ## __VA_ARGS__)
> +#define gen_insn0(mnem) \
> +gen_ ## mnem ## _0
> +#define gen_insn1(mnem, opT1)   \
> +gen_ ## mnem ## _1 ## opT1
> +#define gen_insn2(mnem, opT1, opT2) \
> +gen_ ## mnem ## _2 ## opT1 ## opT2
> +#define gen_insn3(mnem, opT1, opT2, opT3)   \
> +gen_ ## mnem ## _3 ## opT1 ## opT2 ## opT3
> +#define gen_insn4(mnem, opT1, opT2, opT3, opT4) \
> +gen_ ## mnem ## _4 ## opT1 ## opT2 ## opT3 ## opT4
> +#define gen_insn5(mnem, opT1, opT2, opT3, opT4, opT5)   \
> +gen_ ## mnem ## _5 ## opT1 ## opT2 ## opT3 ## opT4 ## opT5
> +
> +#define GEN_INSN0(mnem) \
> +static void gen_insn0(mnem)(\
> +CPUX86State *env, DisasContext *s)
> +#define GEN_INSN1(mnem, opT1)   \
> +static void gen_insn1(mnem, opT1)(  \
> +CPUX86State *env, DisasContext *s,  \
> +insnop_arg_t(opT1) arg1)
> +#define GEN_INSN2(mnem, opT1, opT2) \
> +static void gen_insn2(mnem, opT1, opT2)(\
> +CPUX86State *env, DisasContext *s,  \
> +insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2)
> +#define GEN_INSN3(mnem, opT1, opT2, opT3)   \
> +static void gen_insn3(mnem, opT1, opT2, opT3)(  \
> +CPUX86State *env, DisasContext *s,  \
> +insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,   \
> +insnop_arg_t(opT3) arg3)
> +#define GEN_INSN4(mnem, opT1, opT2, opT3, opT4) \
> +static void gen_insn4(mnem, opT1, opT2, opT3, opT4)(\
> +CPUX86State *env, DisasContext *s,  \
> +insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,   \
> +insnop_arg_t(opT3) arg3, insnop_arg_t(opT4) arg4)
> +#define GEN_INSN5(mnem, opT1, opT2, opT3, opT4, opT5)   \
> +static void gen_insn5(mnem, opT1, opT2, opT3, opT4, opT5)(  \
> +CPUX86State *env, DisasContext *s,  \
> +insnop_arg_t(opT1) arg1, insnop_arg_t(opT2) arg2,   \
> +insnop_arg_t(opT3) arg3, insnop_arg_t(opT4) arg4,   \
> +insnop_arg_t(opT5) arg5)
> +
>  static void gen_sse_ng(CPUX86State *env, DisasContext *s, int b)
>  {
>  enum {
> --
> 2.20.1
>
>
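
To make the trade-off raised above concrete, here is a small sketch of what the naming macros do. It copies the `gen_insn1`/`gen_insn2` token-pasting definitions from the quoted patch; the `addps` generator, its operand type names, and its `int` signature are purely hypothetical stand-ins so that the block compiles on its own.

```c
/* Naming macros in the style of the quoted patch. */
#define gen_insn1(mnem, opT1)        gen_ ## mnem ## _1 ## opT1
#define gen_insn2(mnem, opT1, opT2)  gen_ ## mnem ## _2 ## opT1 ## opT2

/*
 * Hypothetical generator. The preprocessor pastes the tokens together,
 * so the symbol actually defined here is gen_addps_2VdqWdq.
 */
static int gen_insn2(addps, Vdq, Wdq)(int a, int b)
{
    return a + b;
}

/* The function can be reached through the macro... */
static int call_via_macro(int a, int b)
{
    return gen_insn2(addps, Vdq, Wdq)(a, b);
}

/* ...or by the pasted name, which appears nowhere in the definition. */
static int call_by_name(int a, int b)
{
    return gen_addps_2VdqWdq(a, b);
}
```

What is gained is a uniform naming scheme that other macros can compute. What is lost is that a search for `gen_addps_2VdqWdq`, the name a debugger or backtrace would show, does not find the definition in the source text.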


Re: [Qemu-devel] [PATCH] vhost-user-scsi: prevent using uninitialized vqs

2019-08-21 Thread Raphael Norwitz
On Fri, Jun 14, 2019 at 10:18:41AM +0100, Stefan Hajnoczi wrote:
> On Tue, Jun 11, 2019 at 05:35:17PM -0700, Raphael Norwitz wrote:
> > Of the 3 virtqueues, seabios only sets cmd, leaving ctrl
> > and event without a physical address. This can cause
> > vhost_verify_ring_part_mapping to return ENOMEM, causing
> > the following logs:
> > 
> > qemu-system-x86_64: Unable to map available ring for ring 0
> > qemu-system-x86_64: Verify ring failure on region 0
> > 
> > The qemu commit e6cc11d64fc998c11a4dfcde8fda3fc33a74d844
> > has already resolved the issue for vhost scsi devices but
> > the fix was never applied to vhost-user scsi devices.
> > 
> > Signed-off-by: Raphael Norwitz 
> > ---
> >  hw/scsi/vhost-user-scsi.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Reviewed-by: Stefan Hajnoczi 

Ping on this. Any reason it has not been merged?



Re: [Qemu-devel] [RFC PATCH v4 02/75] target/i386: Push rex_w into DisasContext

2019-08-21 Thread Aleksandar Markovic
On 21.08.2019 at 19:41, "Jan Bobek" wrote:
>
> From: Richard Henderson 
>
> Treat this the same as we already do for other rex bits.
>
> Signed-off-by: Richard Henderson 
> ---

I maintain my previous opinion that this is an example of a low-quality commit
message that is needlessly unclear.

Thanks,
Aleksandar

>  target/i386/translate.c | 19 +++
>  1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 3aac84e5b0..9ec1c79371 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -44,11 +44,13 @@
>  #define REX_X(s) ((s)->rex_x)
>  #define REX_B(s) ((s)->rex_b)
>  #define REX_R(s) ((s)->rex_r)
> +#define REX_W(s) ((s)->rex_w)
>  #else
>  #define CODE64(s) 0
>  #define REX_X(s) 0
>  #define REX_B(s) 0
>  #define REX_R(s) 0
> +#define REX_W(s) -1
>  #endif
>
>  #ifdef TARGET_X86_64
> @@ -100,7 +102,7 @@ typedef struct DisasContext {
>  #ifdef TARGET_X86_64
>  int lma;/* long mode active */
>  int code64; /* 64 bit code segment */
> -int rex_x, rex_b, rex_r;
> +int rex_x, rex_b, rex_r, rex_w;
>  #endif
>  int vex_l;  /* vex vector length */
>  int vex_v;  /* vex  register, without 1's complement.  */
> @@ -4495,7 +4497,6 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  int modrm, reg, rm, mod, op, opreg, val;
>  target_ulong next_eip, tval;
>  target_ulong pc_start = s->base.pc_next;
> -int rex_w;
>
>  s->pc_start = s->pc = pc_start;
>  s->override = -1;
> @@ -4503,6 +4504,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  s->rex_x = 0;
>  s->rex_b = 0;
>  s->rex_r = 0;
> +s->rex_w = -1;
>  s->x86_64_hregs = false;
>  #endif
>  s->rip_offset = 0; /* for relative ip address */
> @@ -4514,7 +4516,6 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  }
>
>  prefixes = 0;
> -rex_w = -1;
>
>   next_byte:
>  b = x86_ldub_code(env, s);
> @@ -4557,7 +4558,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  case 0x40 ... 0x4f:
>  if (CODE64(s)) {
>  /* REX prefix */
> -rex_w = (b >> 3) & 1;
> +s->rex_w = (b >> 3) & 1;
>  s->rex_r = (b & 0x4) << 1;
>  s->rex_x = (b & 0x2) << 2;
>  s->rex_b = (b & 0x1) << 3;
> @@ -4606,7 +4607,9 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  s->rex_b = (~vex2 >> 2) & 8;
>  #endif
>  vex3 = x86_ldub_code(env, s);
> -rex_w = (vex3 >> 7) & 1;
> +#ifdef TARGET_X86_64
> +s->rex_w = (vex3 >> 7) & 1;
> +#endif
>  switch (vex2 & 0x1f) {
>  case 0x01: /* Implied 0f leading opcode bytes.  */
>  b = x86_ldub_code(env, s) | 0x100;
> @@ -4631,9 +4634,9 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  /* Post-process prefixes.  */
>  if (CODE64(s)) {
>  /* In 64-bit mode, the default data size is 32-bit.  Select
64-bit
> -   data with rex_w, and 16-bit data with 0x66; rex_w takes
precedence
> +   data with REX_W, and 16-bit data with 0x66; REX_W takes
precedence
> over 0x66 if both are present.  */
> -dflag = (rex_w > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 :
MO_32);
> +dflag = (REX_W(s) > 0 ? MO_64 : prefixes & PREFIX_DATA ? MO_16 :
MO_32);
>  /* In 64-bit mode, 0x67 selects 32-bit addressing.  */
>  aflag = (prefixes & PREFIX_ADR ? MO_32 : MO_64);
>  } else {
> @@ -5029,7 +5032,7 @@ static target_ulong disas_insn(DisasContext *s,
CPUState *cpu)
>  /* operand size for jumps is 64 bit */
>  ot = MO_64;
>  } else if (op == 3 || op == 5) {
> -ot = dflag != MO_16 ? MO_32 + (rex_w == 1) : MO_16;
> +ot = dflag != MO_16 ? MO_32 + (REX_W(s) == 1) : MO_16;
>  } else if (op == 6) {
>  /* default push size is 64 bit */
>  ot = mo_pushpop(s, dflag);
> --
> 2.20.1
>
>


Re: [Qemu-devel] [RFC PATCH v4 58/75] target/i386: introduce AES and PCLMULQDQ vector instructions to sse-opcode.inc.h

2019-08-21 Thread Aleksandar Markovic
On 21.08.2019 at 20:37, "Jan Bobek" wrote:
>
> Add all the AES and PCLMULQDQ vector instruction entries to
> sse-opcode.inc.h.
>

Why only pclmulqdq, and not the entire CLMUL instruction set?

> Signed-off-by: Jan Bobek 
> ---
>  target/i386/sse-opcode.inc.h | 34 ++
>  1 file changed, 34 insertions(+)
>
> diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
> index f43436213e..1359508424 100644
> --- a/target/i386/sse-opcode.inc.h
> +++ b/target/i386/sse-opcode.inc.h
> @@ -449,6 +449,26 @@
>   * 66 0F 3A 61 /r imm8 PCMPESTRI xmm1, xmm2/m128, imm8
>   * 66 0F 3A 62 /r imm8 PCMPISTRM xmm1, xmm2/m128, imm8
>   * 66 0F 3A 63 /r imm8 PCMPISTRI xmm1, xmm2/m128, imm8
> + *
> + * AES Instructions
> + * -
> + * 66 0F 38 DE /r  AESDEC xmm1, xmm2/m128
> + * VEX.128.66.0F38.WIG DE /r   VAESDEC xmm1, xmm2, xmm3/m128
> + * 66 0F 38 DF /r  AESDECLAST xmm1, xmm2/m128
> + * VEX.128.66.0F38.WIG DF /r   VAESDECLAST xmm1, xmm2, xmm3/m128
> + * 66 0F 38 DC /r  AESENC xmm1, xmm2/m128
> + * VEX.128.66.0F38.WIG DC /r   VAESENC xmm1, xmm2, xmm3/m128
> + * 66 0F 38 DD /r  AESENCLAST xmm1, xmm2/m128
> + * VEX.128.66.0F38.WIG DD /r   VAESENCLAST xmm1, xmm2, xmm3/m128
> + * 66 0F 38 DB /r  AESIMC xmm1, xmm2/m128
> + * VEX.128.66.0F38.WIG DB /r   VAESIMC xmm1, xmm2/m128
> + * 66 0F 3A DF /r ib   AESKEYGENASSIST xmm1, xmm2/m128, imm8
> + * VEX.128.66.0F3A.WIG DF /r ibVAESKEYGENASSIST xmm1, xmm2/m128, imm8
> + *
> + * PCLMULQDQ Instructions
> + * ---
> + * 66 0F 3A 44 /r ib   PCLMULQDQ xmm1, xmm2/m128, imm8
> + * VEX.128.66.0F3A.WIG 44 /r ibVPCLMULQDQ xmm1, xmm2, xmm3/m128, imm8
>   */
>
>  OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
> @@ -641,6 +661,20 @@ OPCODE(roundps, LEG(66, 0F3A, 0, 0x08), SSE4_1, WRR,
Vdq, Wdq, Ib)
>  OPCODE(roundpd, LEG(66, 0F3A, 0, 0x09), SSE4_1, WRR, Vdq, Wdq, Ib)
>  OPCODE(roundss, LEG(66, 0F3A, 0, 0x0a), SSE4_1, WRR, Vd, Wd, Ib)
>  OPCODE(roundsd, LEG(66, 0F3A, 0, 0x0b), SSE4_1, WRR, Vq, Wq, Ib)
> +OPCODE(aesdec, LEG(66, 0F38, 0, 0xde), AES, WRR, Vdq, Vdq, Wdq)
> +OPCODE(vaesdec, VEX(128, 66, 0F38, IG, 0xde), AES_AVX, WRR, Vdq, Hdq,
Wdq)
> +OPCODE(aesdeclast, LEG(66, 0F38, 0, 0xdf), AES, WRR, Vdq, Vdq, Wdq)
> +OPCODE(vaesdeclast, VEX(128, 66, 0F38, IG, 0xdf), AES_AVX, WRR, Vdq,
Hdq, Wdq)
> +OPCODE(aesenc, LEG(66, 0F38, 0, 0xdc), AES, WRR, Vdq, Vdq, Wdq)
> +OPCODE(vaesenc, VEX(128, 66, 0F38, IG, 0xdc), AES_AVX, WRR, Vdq, Hdq,
Wdq)
> +OPCODE(aesenclast, LEG(66, 0F38, 0, 0xdd), AES, WRR, Vdq, Vdq, Wdq)
> +OPCODE(vaesenclast, VEX(128, 66, 0F38, IG, 0xdd), AES_AVX, WRR, Vdq,
Hdq, Wdq)
> +OPCODE(aesimc, LEG(66, 0F38, 0, 0xdb), AES, WR, Vdq, Wdq)
> +OPCODE(vaesimc, VEX(128, 66, 0F38, IG, 0xdb), AES_AVX, WR, Vdq, Wdq)
> +OPCODE(aeskeygenassist, LEG(66, 0F3A, 0, 0xdf), AES, WRR, Vdq, Wdq, Ib)
> +OPCODE(vaeskeygenassist, VEX(128, 66, 0F3A, IG, 0xdf), AES_AVX, WRR,
Vdq, Wdq, Ib)
> +OPCODE(pclmulqdq, LEG(66, 0F3A, 0, 0x44), PCLMULQDQ, WRRR, Vdq, Vdq,
Wdq, Ib)
> +OPCODE(vpclmulqdq, VEX(128, 66, 0F3A, IG, 0x44), PCLMULQDQ_AVX, WRRR,
Vdq, Hdq, Wdq, Ib)
>  OPCODE(pcmpeqb, LEG(NP, 0F, 0, 0x74), MMX, WRR, Pq, Pq, Qq)
>  OPCODE(pcmpeqb, LEG(66, 0F, 0, 0x74), SSE2, WRR, Vdq, Vdq, Wdq)
>  OPCODE(pcmpeqw, LEG(NP, 0F, 0, 0x75), MMX, WRR, Pq, Pq, Qq)
> --
> 2.20.1
>
>


Re: [Qemu-devel] [RFC PATCH v4 64/75] target/i386: introduce AVX2 vector instructions to sse-opcode.inc.h

2019-08-21 Thread Aleksandar Markovic
On 21.08.2019 at 20:49, "Jan Bobek" wrote:
>
> Add all the AVX2 vector instruction entries to sse-opcode.inc.h.
>

Why is AVX-related code inserted in a file whose name says SSE? Perhaps the
file should be named vector-opcode.inc.h?

Also, some vector extensions contain non-vector instructions. Even if you
don't intend to implement emulation of such instructions, you should
mention them in places like this file, since this file mainly deals with
decoding, which is known to you. Without it, you leave the file unfinished
without a good reason.

Thanks,
Aleksandar

> Signed-off-by: Jan Bobek 
> ---
>  target/i386/sse-opcode.inc.h | 362 ++-
>  1 file changed, 359 insertions(+), 3 deletions(-)
>
> diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
> index c3c0ec4f89..abbb0a15d7 100644
> --- a/target/i386/sse-opcode.inc.h
> +++ b/target/i386/sse-opcode.inc.h
> @@ -855,6 +855,181 @@
>   * VEX.128.66.0F.WIG 73 /3 ib  VPSRLDQ xmm1, xmm2, imm8
>   * VEX.LZ.0F.WIG AE /2 VLDMXCSR m32
>   * VEX.LZ.0F.WIG AE /3 VSTMXCSR m32
> + *
> + * AVX2 Instructions
> + * --
> + * VEX.256.66.0F.W0 D7 /r  VPMOVMSKB r32, ymm1
> + * VEX.256.66.0F.W1 D7 /r  VPMOVMSKB r64, ymm1
> + * VEX.256.66.0F.WIG FC /r VPADDB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG FD /r VPADDW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG FE /r VPADDD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG D4 /r VPADDQ ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG EC /r VPADDSB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG ED /r VPADDSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG DC /r VPADDUSB ymm1,ymm2,ymm3/m256
> + * VEX.256.66.0F.WIG DD /r VPADDUSW ymm1,ymm2,ymm3/m256
> + * VEX.256.66.0F38.WIG 01 /r   VPHADDW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 02 /r   VPHADDD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 03 /r   VPHADDSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG F8 /r VPSUBB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG F9 /r VPSUBW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG FA /r VPSUBD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG FB /r VPSUBQ ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG E8 /r VPSUBSB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG E9 /r VPSUBSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG D8 /r VPSUBUSB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG D9 /r VPSUBUSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 05 /r   VPHSUBW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 06 /r   VPHSUBD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 07 /r   VPHSUBSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG D5 /r VPMULLW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 40 /r   VPMULLD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG E5 /r VPMULHW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG E4 /r VPMULHUW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 28 /r   VPMULDQ ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG F4 /r VPMULUDQ ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 0B /r   VPMULHRSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG F5 /r VPMADDWD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 04 /r   VPMADDUBSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F DA /r VPMINUB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38 3A /r   VPMINUW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 3B /r   VPMINUD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38 38 /r   VPMINSB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F EA /r VPMINSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 39 /r   VPMINSD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F DE /r VPMAXUB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38 3E /r   VPMAXUW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 3F /r   VPMAXUD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 3C /r   VPMAXSB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG EE /r VPMAXSW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 3D /r   VPMAXSD ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG E0 /r VPAVGB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG E3 /r VPAVGW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG F6 /r VPSADBW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F3A.WIG 42 /r ibVMPSADBW ymm1, ymm2, ymm3/m256, imm8
> + * VEX.256.66.0F38.WIG 1C /r   VPABSB ymm1, ymm2/m256
> + * VEX.256.66.0F38.WIG 1D /r   VPABSW ymm1, ymm2/m256
> + * VEX.256.66.0F38.WIG 1E /r   VPABSD ymm1, ymm2/m256
> + * VEX.256.66.0F38.WIG 08 /r   VPSIGNB ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 09 /r   VPSIGNW ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F38.WIG 0A /r   VPSIGND ymm1, ymm2, ymm3/m256
> + * VEX.256.66.0F.WIG 74 /r VPCMPEQB ymm1,ymm2,ymm3/m256
> + * VEX.256.66.0F.WIG 75 /r VPCMPEQW 

Re: [Qemu-devel] [PATCH] tests: make filemonitor test more robust to event ordering

2019-08-21 Thread Peter Xu
On Wed, Aug 21, 2019 at 04:53:27PM +0100, Daniel P. Berrangé wrote:
> The ordering of events that are emitted during the rmdir
> test has changed with kernel >= 5.3. Semantically both
> new & old orderings are correct, so we must be able to
> cope with either.
> 
> To cope with this, when we see an unexpected event, we
> push it back onto the queue and look at the subsequent
> event to see if that matches instead.
> 
> Signed-off-by: Daniel P. Berrangé 

Thanks for fixing it!

Tested-by: Peter Xu 

-- 
Peter Xu
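
The order-tolerant matching described in the commit message can be sketched roughly as below. This is not the code from the filemonitor test: the array-based queue, the integer event IDs, and the `take_expected` helper are all hypothetical, chosen only to show the idea of leaving one unexpected event queued and matching the next one instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: consume `expected` from the front of `queue`,
 * tolerating one out-of-order event ahead of it.
 */
static bool take_expected(int *queue, size_t *len, int expected)
{
    size_t pos;

    if (*len > 0 && queue[0] == expected) {
        pos = 0;
    } else if (*len > 1 && queue[1] == expected) {
        pos = 1;    /* queue[0] stays queued for a later match */
    } else {
        return false;
    }

    /* Close the gap left by the consumed event. */
    memmove(&queue[pos], &queue[pos + 1],
            (*len - pos - 1) * sizeof(queue[0]));
    (*len)--;
    return true;
}
```

With a queue holding events 2 then 1, asking for 1 and then 2 succeeds in this sketch, as does asking for 2 and then 1, so either ordering is accepted.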



Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

2019-08-21 Thread Alexey Kardashevskiy




On 22/08/2019 11:33, Eric Blake wrote:
> On 8/21/19 8:16 PM, Alexey Kardashevskiy wrote:
>> This returns MD5 checksum of all RAM blocks for migration debugging
>> as this is way faster than saving the entire RAM to a file and checking
>> that.
>>
>> Signed-off-by: Alexey Kardashevskiy 
>> ---
>>
>> I am actually wondering if there is an easier way of getting these
>> checksums and I just do not see it, it cannot be that we fixed all
>> memory migration bugs :)
>
> I'm not sure whether the command itself makes sense, but for the interface:
>
>> +++ b/qapi/misc.json
>> @@ -1194,6 +1194,33 @@
>>   ##
>>   { 'command': 'query-memory-size-summary', 'returns': 'MemoryInfo' }
>>   
>> +##
>> +# @MemoryChecksum:
>> +#
>> +# A string with MD5 checksum of all RAMBlocks.
>> +#
>> +# @checksum: the checksum.
>> +#
>> +# Since: 3.2.0
>
> This should be 4.2, not 3.2.
>
>> +##
>> +{ 'struct': 'MemoryChecksum',
>> +  'data'  : { 'checksum': 'str' } }
>> +
>> +##
>> +# @query-memory-checksum:
>> +#
>> +# Return the MD5 checksum of all RAMBlocks.
>> +#
>> +# Example:
>> +#
>> +# -> { "execute": "query-memory-checksum" }
>> +# <- { "return": { "checksum": "a0880304994f64cb2edad77b9a1cd58f" } }
>> +#
>> +# Since: 3.2.0
>
> and again
>
>> +##
>> +{ 'command': 'query-memory-checksum',
>> +  'returns': 'MemoryChecksum' }
>> +
>>   
>
>> +++ b/exec.c
>> @@ -2050,6 +2050,22 @@ void *qemu_ram_get_host_addr(RAMBlock *rb)
>>   return rb->host;
>>   }
>>   
>> +gchar *qemu_ram_chksum(void)
>
> gchar is a pointless glib type.  Use 'char' instead.
>
>> +{
>> +struct RAMBlock *rb;
>> +GChecksum *chksum = g_checksum_new(G_CHECKSUM_MD5);
>> +gchar *ret;
>> +
>> +RAMBLOCK_FOREACH(rb) {
>> +g_checksum_update(chksum, qemu_ram_get_host_addr(rb),
>> +  qemu_ram_get_used_length(rb));
>> +}
>> +ret = g_strdup(g_checksum_get_string(chksum));
>> +g_checksum_free(chksum);
>> +
>> +return ret;
>> +}
>
> How long does this take to run?  Is it something where you really want
> to block the guest while chewing over the guest's entire memory?



10-20 times faster than "pmemsave" and blocking the guest is not a 
problem here as both - source and destination - guests are stopped 
(otherwise the checksum does not make sense).
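
The shape of the proposed helper, one running checksum updated block by block, can be sketched without glib. To stay dependency-free, this sketch swaps MD5 for the much simpler 64-bit FNV-1a hash, so it is not equivalent to the patch; `struct mem_block` and the function names are hypothetical stand-ins for RAMBlock and its accessors.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard 64-bit FNV-1a parameters. */
#define FNV1A_OFFSET_BASIS 0xcbf29ce484222325ULL
#define FNV1A_PRIME        0x100000001b3ULL

/* Hypothetical stand-in for a RAM block: a host pointer and a used length. */
struct mem_block {
    const uint8_t *host;
    size_t used_length;
};

/* Fold one buffer into a running FNV-1a state. */
static uint64_t fnv1a_update(uint64_t state, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        state ^= buf[i];
        state *= FNV1A_PRIME;
    }
    return state;
}

/* Checksum all blocks in order, as if they were one contiguous buffer. */
static uint64_t blocks_checksum(const struct mem_block *blocks, size_t count)
{
    uint64_t state = FNV1A_OFFSET_BASIS;

    for (size_t i = 0; i < count; i++) {
        state = fnv1a_update(state, blocks[i].host, blocks[i].used_length);
    }
    return state;
}
```

In this sketch the result depends on the order in which the blocks are visited, so comparing source and destination only makes sense if both sides walk their blocks in the same order.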




--
Alexey



Re: [Qemu-devel] [PATCH 0/2] tests/acceptance: Update MIPS Malta ssh test

2019-08-21 Thread Aleksandar Markovic
On 21.08.2019 at 23:00, "Eduardo Habkost" wrote:
>
> On Wed, Aug 21, 2019 at 10:27:11PM +0200, Aleksandar Markovic wrote:
> > On 02.08.2019 at 17:37, "Aleksandar Markovic" wrote:
> > >
> > > From: Aleksandar Markovic 
> > >
> > > This little series improves linux_ssh_mips_malta.py, both in the sense
> > > of code organization and in the sense of quantity of executed tests.
> > >
> >
> > Hello, all.
> >
> > I am going to send a new version in a few days, and I have a question for
> > the test team:
> >
> > Currently, the outcome of the script execution is either PASS:1 FAIL:0 or
> > PASS:0 FAIL:1. But the test actually consists of several subtests. Is there
> > any way that this single Python script considers these subtests as separate
> > tests (test cases), reporting something like PASS:12 FAIL:7? If yes, what
> > would be the best way to achieve that?
> If you are talking about each test_*() method, they are already
> treated like separate tests.  If you mean treating each
> ssh_command_output_contains() call as a separate test, this might
> be difficult.
>

Yes, I meant the latter one, individual code segments involving an
invocation of ssh_command_output_contains() instance being treated as
separate tests.

> Cleber, is there something already available in the Avocado API
> that would help us report more fine-grained results inside each
> test case?
>

Thanks, that would be a better way of expressing my question.

>
> >
> > Thanks in advance,
> > Aleksandar
> >
> > > Aleksandar Markovic (2):
> > >   tests/acceptance: Refactor and improve reporting in
> > > linux_ssh_mips_malta.py
> > >   tests/acceptance: Add new test cases in linux_ssh_mips_malta.py
> > >
> > >  tests/acceptance/linux_ssh_mips_malta.py | 81
> > ++--
> > >  1 file changed, 66 insertions(+), 15 deletions(-)
> > >
> > > --
> > > 2.7.4
> > >
> > >
>
> --
> Eduardo


Re: [Qemu-devel] [PATCH] target/alpha: fix tlb_fill trap_arg2 value for instruction fetch

2019-08-21 Thread Richard Henderson
On 8/21/19 6:52 AM, Peter Maydell wrote:
> On Wed, 21 Aug 2019 at 14:42, Aurelien Jarno  wrote:
>>
>> Commit e41c94529740cc26 ("target/alpha: Convert to CPUClass::tlb_fill")
>> slightly changed the way the trap_arg2 value is computed in case of TLB
>> fill. The type of the variable used in the ternary operator has been
>> changed from an int to an enum. This causes the -1 value to not be
>> sign-extended to 64-bit in case of an instruction fetch. The trap_arg2
>> ends up with 0x instead of 0x. Fix that by
>> changing the -1 into -1LL.
>>
>> This fixes the execution of user space processes in qemu-system-alpha.
>>
>> Fixes: e41c94529740cc26
>> Cc: qemu-sta...@nongnu.org
>> Signed-off-by: Aurelien Jarno 
>> ---
>>  target/alpha/helper.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/target/alpha/helper.c b/target/alpha/helper.c
>> index 93b8e788b1..9e9d880c1a 100644
>> --- a/target/alpha/helper.c
>> +++ b/target/alpha/helper.c
>> @@ -283,7 +283,7 @@ bool alpha_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
>>      cs->exception_index = EXCP_MMFAULT;
>>      env->trap_arg0 = addr;
>>      env->trap_arg1 = fail;
>> -    env->trap_arg2 = (access_type == MMU_INST_FETCH ? -1 : access_type);
>> +    env->trap_arg2 = (access_type == MMU_INST_FETCH ? -1LL : access_type);
>>      cpu_loop_exit_restore(cs, retaddr);
>>  }
> 
> Oops. Thanks for the catch.
> 
> Maybe we should not rely directly on the value of the access_type
> enum to set trap_arg2 at all (ie just go for a switch on access_type and
> set env->trap_arg2 to the right h/w value in the three cases)?

Yes, I'll do that.  I'm somewhat embarrassed that I haven't tested Alpha in a
while, and more so because we just did a release.


r~
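
For readers following the thread, the conversion issue described in the
commit message can be reproduced in isolation. The sketch below is not the
QEMU code: `unsigned int` stands in for the enum (an assumption, since the
underlying type of an enum is implementation-defined), and the function
names are hypothetical. It assumes a platform with 32-bit int and 64-bit
long long.

```c
#include <stdint.h>

/*
 * Sketch only: `unsigned int` plays the role of an enum whose
 * underlying type is unsigned. In the conditional expression the
 * int -1 is converted to unsigned int (all 32 bits set) and is then
 * zero-extended when widened to 64 bits.
 */
static uint64_t trap_arg_plain(int is_fetch, unsigned int access_type)
{
    return is_fetch ? -1 : access_type;
}

/*
 * With -1LL the common type of the two branches is long long, so the
 * -1 keeps its value and converts to a 64-bit all-ones pattern.
 */
static uint64_t trap_arg_ll(int is_fetch, unsigned int access_type)
{
    return is_fetch ? -1LL : access_type;
}
```

Under those assumptions trap_arg_plain(1, x) yields only the low 32 bits
set, while trap_arg_ll(1, x) yields all 64 bits set; both return x
unchanged for the non-fetch case.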



Re: [Qemu-devel] RISC-V: Vector && DSP Extension

2019-08-21 Thread liuzhiwei



On 2019/8/22 3:31 AM, Palmer Dabbelt wrote:

On Thu, 15 Aug 2019 14:37:52 PDT (-0700), alistai...@gmail.com wrote:
On Thu, Aug 15, 2019 at 2:07 AM Peter Maydell wrote:

On Thu, 15 Aug 2019 at 09:53, Aleksandar Markovic wrote:
>
> > We can accept draft
> > extensions in QEMU as long as they are disabled by default.

> Hi, Alistair, Palmer,
>
> Is this an official stance of the QEMU community, or perhaps Alistair's
> personal judgement, or maybe a rule within the RISC-V subcommunity?

Alistair asked on a previous thread; my view was:
https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg03364.html
and nobody else spoke up disagreeing (summary: should at least be
disabled-by-default and only enabled by setting an explicit
property whose name should start with the 'x-' prefix).


Agreed!



In general QEMU does sometimes introduce experimental extensions
(we've had them in the block layer, for example) and so the 'x-'
property to enable them is a reasonably established convention.
I think it's a reasonable compromise to allow this sort of work
to start and not have to live out-of-tree for a long time, without
confusing users or getting into a situation where some QEMU
versions behave differently or to obsolete drafts of a spec
without it being clear from the command line that experimental
extensions are being enabled.

There is also an element of "submaintainer judgement" to be applied
here -- upstream is probably not the place for a draft extension
to be implemented if it is:
 * still fast moving or subject to major changes of design direction
 * major changes to the codebase (especially if it requires
   changes to core code) that might later need to be redone
   entirely differently
 * still experimental


Yep, agreed. For RISC-V I think this would extend to only allowing
extensions that have backing from the foundation and are under active
discussion.


My general philosophy here is that we'll take anything written down in 
an official RISC-V ISA manual (ie, the ones actually released by the 
foundation).  This provides a single source of truth for what an 
extension name / version means, which is important to avoid 
confusion.  If it's a ratified extension then I see no reason not to 
support it on my end.  For frozen extensions we should probably just 
wait the 45 days until they go up for a ratification vote, but I'd be 
happy to start reviewing patches then (or earlier :)).


If the spec is a draft in the ISA manual then we need to worry about 
the support burden, for which I don't have fixed criteria -- 
generally there shouldn't be issues here, but early drafts can be in a 
state where they're going to change extensively and are unlikely to be 
used by anyone.  There's also the question of "what is an official 
release of a draft specification?".
That's a bit awkward right now: the current ratified ISA manual 
contains version 0.3 of the hypervisor extension, but I just talked to 
Andrew and the plan is to remove the draft extensions from the 
ratified manuals because these drafts are old and the official manuals 
update slowly.  For now I guess we'll need an ad-hoc way of 
determining if a draft extension has been officially versioned or not, 
which is a bit of a headache.


We already have examples of supporting draft extensions, including 
priv-1.9.1.  This does cause some pain for us on the QEMU side (CSR 
bits have different semantics between the specs), but there's 1.9.1 
hardware out there and the port continues to be useful so I'd be in 
favor of keeping it around for now.  I suppose there is an implicit 
risk that draft extensions will be deprecated, but the "x-" prefix, 
draft status, and long deprecation period should be sufficient to 
inform users of the risk.  I wouldn't be opposed to adding a "this is 
a draft ISA" warning, but I feel like it might be a bit overkill.



Hi, Palmer

Maybe that is the headache of open source hardware: everyone cooperates to
build a better architecture.

In my opinion, we should focus on the future. The code in QEMU mainline
should evolve toward the ratified extension step by step, and in the end
support only the best extension.

At that point, even if a lot of hardware supports only the deprecated draft
extension, the draft code should live in the wild and be maintained by the
hardware manufacturers.

But before that, it is better to have a draft implementation, so that we
can work step by step to speed up the arrival of the ratified extension.

Even if the draft extension implementations are deprecated in the end, they
are not meaningless. The manufacturers may use the commit history to support
their hardware that only supports the draft extension.

Best Regards,

Zhiwei



Alistair



thanks
-- PMM






Re: [Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

2019-08-21 Thread Eric Blake
On 8/21/19 8:16 PM, Alexey Kardashevskiy wrote:
> This returns MD5 checksum of all RAM blocks for migration debugging
> as this is way faster than saving the entire RAM to a file and checking
> that.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> 
> 
> I am actually wondering if there is an easier way of getting these
> checksums and I just do not see it, it cannot be that we fixed all
> memory migration bugs :)

I'm not sure whether the command itself makes sense, but for the interface:


> +++ b/qapi/misc.json
> @@ -1194,6 +1194,33 @@
>  ##
>  { 'command': 'query-memory-size-summary', 'returns': 'MemoryInfo' }
>  
> +##
> +# @MemoryChecksum:
> +#
> +# A string with MD5 checksum of all RAMBlocks.
> +#
> +# @checksum: the checksum.
> +#
> +# Since: 3.2.0

This should be 4.2, not 3.2.

> +##
> +{ 'struct': 'MemoryChecksum',
> +  'data'  : { 'checksum': 'str' } }
> +
> +##
> +# @query-memory-checksum:
> +#
> +# Return the MD5 checksum of all RAMBlocks.
> +#
> +# Example:
> +#
> +# -> { "execute": "query-memory-checksum" }
> +# <- { "return": { "checksum": "a0880304994f64cb2edad77b9a1cd58f" } }
> +#
> +# Since: 3.2.0

and again

> +##
> +{ 'command': 'query-memory-checksum',
> +  'returns': 'MemoryChecksum' }
> +
>  

> +++ b/exec.c
> @@ -2050,6 +2050,22 @@ void *qemu_ram_get_host_addr(RAMBlock *rb)
>  return rb->host;
>  }
>  
> +gchar *qemu_ram_chksum(void)

gchar is a pointless glib type.  Use 'char' instead.

> +{
> +struct RAMBlock *rb;
> +GChecksum *chksum = g_checksum_new(G_CHECKSUM_MD5);
> +gchar *ret;
> +
> +RAMBLOCK_FOREACH(rb) {
> +g_checksum_update(chksum, qemu_ram_get_host_addr(rb),
> +  qemu_ram_get_used_length(rb));
> +}
> +ret = g_strdup(g_checksum_get_string(chksum));
> +g_checksum_free(chksum);
> +
> +return ret;
> +}

How long does this take to run?  Is it something where you really want
to block the guest while chewing over the guest's entire memory?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





[Qemu-devel] [Bug 1819289] Re: Windows 95 and Windows 98 will not install or run

2019-08-21 Thread Brad Parker
Here is the exact working command line I used for Windows 95C (OSR2.5):

qemu-system-i386 -cpu pentium -m 128 -vga std -no-kvm -hda
~/Win95C.qcow2 -nodefaults -no-hpet -no-acpi -nodefaults -monitor stdio
-sdl -boot menu=on,order=c,splash-time=2000 -accel tcg,thread=single

To install the OS I simply added -cdrom and -fda, but everything else
stayed the same.

This was using the latest master (33f18cf, after v4.1.0) and its
included bios binaries.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1819289

Title:
  Windows 95 and Windows 98 will not install or run

Status in QEMU:
  New

Bug description:
  The last version of QEMU I have been able to run Windows 95 or Windows
  98 on was 2.7 or 2.8. Recent versions since then even up to 3.1 will
  either not install or will not run 95 or 98 at all. I have tried every
  combination of options like isapc or no isapc, cpu pentium  or cpu as
  486. Tried different memory configurations, but they just don't work
  anymore.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1819289/+subscriptions



[Qemu-devel] [RFC PATCH qemu] qapi: Add query-memory-checksum

2019-08-21 Thread Alexey Kardashevskiy
This returns MD5 checksum of all RAM blocks for migration debugging
as this is way faster than saving the entire RAM to a file and checking
that.

Signed-off-by: Alexey Kardashevskiy 
---


I am actually wondering if there is an easier way of getting these
checksums and I just do not see it, it cannot be that we fixed all
memory migration bugs :)


---
 qapi/misc.json| 27 +++
 include/exec/cpu-common.h |  1 +
 exec.c| 16 
 monitor/qmp-cmds.c|  9 +
 4 files changed, 53 insertions(+)

diff --git a/qapi/misc.json b/qapi/misc.json
index a7fba7230cfa..e7475f30a844 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1194,6 +1194,33 @@
 ##
 { 'command': 'query-memory-size-summary', 'returns': 'MemoryInfo' }
 
+##
+# @MemoryChecksum:
+#
+# A string with MD5 checksum of all RAMBlocks.
+#
+# @checksum: the checksum.
+#
+# Since: 3.2.0
+##
+{ 'struct': 'MemoryChecksum',
+  'data'  : { 'checksum': 'str' } }
+
+##
+# @query-memory-checksum:
+#
+# Return the MD5 checksum of all RAMBlocks.
+#
+# Example:
+#
+# -> { "execute": "query-memory-checksum" }
+# <- { "return": { "checksum": "a0880304994f64cb2edad77b9a1cd58f" } }
+#
+# Since: 3.2.0
+##
+{ 'command': 'query-memory-checksum',
+  'returns': 'MemoryChecksum' }
+
 
 ##
 # @AddfdInfo:
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index f7dbe75fbc38..15dbf18c2d5d 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -57,6 +57,7 @@ void qemu_ram_set_idstr(RAMBlock *block, const char *name, 
DeviceState *dev);
 void qemu_ram_unset_idstr(RAMBlock *block);
 const char *qemu_ram_get_idstr(RAMBlock *rb);
 void *qemu_ram_get_host_addr(RAMBlock *rb);
+gchar *qemu_ram_chksum(void);
 ram_addr_t qemu_ram_get_offset(RAMBlock *rb);
 ram_addr_t qemu_ram_get_used_length(RAMBlock *rb);
 bool qemu_ram_is_shared(RAMBlock *rb);
diff --git a/exec.c b/exec.c
index 3e78de3b8f8b..76f7f63cf71b 100644
--- a/exec.c
+++ b/exec.c
@@ -2050,6 +2050,22 @@ void *qemu_ram_get_host_addr(RAMBlock *rb)
 return rb->host;
 }
 
+gchar *qemu_ram_chksum(void)
+{
+struct RAMBlock *rb;
+GChecksum *chksum = g_checksum_new(G_CHECKSUM_MD5);
+gchar *ret;
+
+RAMBLOCK_FOREACH(rb) {
+g_checksum_update(chksum, qemu_ram_get_host_addr(rb),
+  qemu_ram_get_used_length(rb));
+}
+ret = g_strdup(g_checksum_get_string(chksum));
+g_checksum_free(chksum);
+
+return ret;
+}
+
 ram_addr_t qemu_ram_get_offset(RAMBlock *rb)
 {
 return rb->offset;
diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
index b9ae40eec751..ec52bd82588e 100644
--- a/monitor/qmp-cmds.c
+++ b/monitor/qmp-cmds.c
@@ -413,3 +413,12 @@ MemoryInfo *qmp_query_memory_size_summary(Error **errp)
 
 return mem_info;
 }
+
+MemoryChecksum *qmp_query_memory_checksum(Error **errp)
+{
+MemoryChecksum *chk = g_malloc0(sizeof(MemoryChecksum));
+
+chk->checksum = qemu_ram_chksum();
+
+return chk;
+}
-- 
2.17.1




[Qemu-devel] [Bug 1840865] Re: qemu crashes when doing iotest on virtio-9p filesystem

2019-08-21 Thread fangying
** Description changed:

  Qemu crashes when doing avocado-vt test on virtio-9p filesystem.
- This bug can be reproduced running 
https://github.com/autotest/tp-qemu/blob/master/qemu/tests/9p.py.
+ This bug can be reproduced running 
https://github.com/autotest/tp-qemu/blob/master/qemu/tests/9p.py with the 
latest qemu-4.0.0.
  The crash stack goes like:
  
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  v9fs_mark_fids_unreclaim (pdu=pdu@entry=0xaaab00046868, 
path=path@entry=0x851e2fa8)
- at hw/9pfs/9p.c:505
+ at hw/9pfs/9p.c:505
  #1  0xe3585acc in v9fs_unlinkat (opaque=0xaaab00046868) at 
hw/9pfs/9p.c:2590
  #2  0xe3811c10 in coroutine_trampoline (i0=, 
i1=)
- at util/coroutine-ucontext.c:116
+ at util/coroutine-ucontext.c:116
  #3  0xa13ddb20 in ?? () from /lib64/libc.so.6
  Backtrace stopped: not enough registers or memory available to unwind further
  
  A segmentation fault is triggered at hw/9pfs/9p.c line 505
  
- for (fidp = s->fid_list; fidp; fidp = fidp->next) {
- if (fidp->path.size != path->size) { # fidp is invalid 
- continue;
- }
+ for (fidp = s->fid_list; fidp; fidp = fidp->next) {
+ if (fidp->path.size != path->size) { # fidp is invalid
+ continue;
+ }
  
  (gdb) p path
  $10 = (V9fsPath *) 0x851e2fa8
  (gdb) p *path
  $11 = {size = 21, data = 0xfed6f420 "./9p_test/p2a1/d0/f1"}
  (gdb) p *fidp
  Cannot access memory at address 0x101010101010101
  (gdb) p *pdu
  $12 = {size = 19, tag = 54, id = 76 'L', cancelled = 0 '\000', complete = 
{entries = {
-   sqh_first = 0x0, sqh_last = 0xaaab00046870}}, s = 0xaaab000454b8, next 
= {
- le_next = 0xaaab000467c0, le_prev = 0xaaab00046f88}, idx = 88}
- (gdb) 
+   sqh_first = 0x0, sqh_last = 0xaaab00046870}}, s = 0xaaab000454b8, next 
= {
+ le_next = 0xaaab000467c0, le_prev = 0xaaab00046f88}, idx = 88}
+ (gdb)
  
  Address Sanitizer reports a heap-use-after-free on *fidp*.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1840865

Title:
  qemu crashes when doing iotest on  virtio-9p filesystem

Status in QEMU:
  New

Bug description:
  Qemu crashes when doing avocado-vt test on virtio-9p filesystem.
  This bug can be reproduced running 
https://github.com/autotest/tp-qemu/blob/master/qemu/tests/9p.py with the 
latest qemu-4.0.0.
  The crash stack goes like:

  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  v9fs_mark_fids_unreclaim (pdu=pdu@entry=0xaaab00046868, 
path=path@entry=0x851e2fa8)
  at hw/9pfs/9p.c:505
  #1  0xe3585acc in v9fs_unlinkat (opaque=0xaaab00046868) at 
hw/9pfs/9p.c:2590
  #2  0xe3811c10 in coroutine_trampoline (i0=, 
i1=)
  at util/coroutine-ucontext.c:116
  #3  0xa13ddb20 in ?? () from /lib64/libc.so.6
  Backtrace stopped: not enough registers or memory available to unwind further

  A segmentation fault is triggered at hw/9pfs/9p.c line 505

  for (fidp = s->fid_list; fidp; fidp = fidp->next) {
  if (fidp->path.size != path->size) { # fidp is invalid
  continue;
  }

  (gdb) p path
  $10 = (V9fsPath *) 0x851e2fa8
  (gdb) p *path
  $11 = {size = 21, data = 0xfed6f420 "./9p_test/p2a1/d0/f1"}
  (gdb) p *fidp
  Cannot access memory at address 0x101010101010101
  (gdb) p *pdu
  $12 = {size = 19, tag = 54, id = 76 'L', cancelled = 0 '\000', complete = 
{entries = {
    sqh_first = 0x0, sqh_last = 0xaaab00046870}}, s = 0xaaab000454b8, next 
= {
  le_next = 0xaaab000467c0, le_prev = 0xaaab00046f88}, idx = 88}
  (gdb)

  Address Sanitizer reports a heap-use-after-free on *fidp*.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1840865/+subscriptions



Re: [Qemu-devel] [PATCH] tests: make filemonitor test more robust to event ordering

2019-08-21 Thread Wei Yang
On Wed, Aug 21, 2019 at 04:53:27PM +0100, Daniel P. Berrangé wrote:
>The ordering of events that are emitted during the rmdir
>test have changed with kernel >= 5.3. Semantically both
>new & old orderings are correct, so we must be able to
>cope with either.
>
>To cope with this, when we see an unexpected event, we
>push it back onto the queue and look at the subsequent
>event to see if that matches instead.
>
>Signed-off-by: Daniel P. Berrangé 

Tested-by: Wei Yang 

>---
> tests/test-util-filemonitor.c | 43 +++
> 1 file changed, 34 insertions(+), 9 deletions(-)
>
>diff --git a/tests/test-util-filemonitor.c b/tests/test-util-filemonitor.c
>index 46e781c022..301cd2db61 100644
>--- a/tests/test-util-filemonitor.c
>+++ b/tests/test-util-filemonitor.c
>@@ -45,6 +45,11 @@ typedef struct {
> const char *filedst;
> int64_t *watchid;
> int eventid;
>+/*
>+ * Only valid with OP_EVENT - this event might be
>+ * swapped with the next OP_EVENT
>+ */
>+bool swapnext;
> } QFileMonitorTestOp;
> 
> typedef struct {
>@@ -98,6 +103,10 @@ qemu_file_monitor_test_handler(int64_t id,
> QFileMonitorTestData *data = opaque;
> QFileMonitorTestRecord *rec = g_new0(QFileMonitorTestRecord, 1);
> 
>+if (debug) {
>+g_printerr("Queue event id %" PRIx64 " event %d file %s\n",
>+   id, event, filename);
>+}
> rec->id = id;
> rec->event = event;
> rec->filename = g_strdup(filename);
>@@ -125,7 +134,8 @@ qemu_file_monitor_test_record_free(QFileMonitorTestRecord 
>*rec)
>  * to wait for the event to be queued for us.
>  */
> static QFileMonitorTestRecord *
>-qemu_file_monitor_test_next_record(QFileMonitorTestData *data)
>+qemu_file_monitor_test_next_record(QFileMonitorTestData *data,
>+   QFileMonitorTestRecord *pushback)
> {
> GTimer *timer = g_timer_new();
> QFileMonitorTestRecord *record = NULL;
>@@ -139,9 +149,15 @@ qemu_file_monitor_test_next_record(QFileMonitorTestData 
>*data)
> }
> if (data->records) {
> record = data->records->data;
>-tmp = data->records;
>-data->records = g_list_remove_link(data->records, tmp);
>-g_list_free(tmp);
>+if (pushback) {
>+data->records->data = pushback;
>+} else {
>+tmp = data->records;
>+data->records = g_list_remove_link(data->records, tmp);
>+g_list_free(tmp);
>+}
>+} else if (pushback) {
>+qemu_file_monitor_test_record_free(pushback);
> }
> qemu_mutex_unlock(>lock);
> 
>@@ -158,13 +174,15 @@ static bool
> qemu_file_monitor_test_expect(QFileMonitorTestData *data,
>   int64_t id,
>   QFileMonitorEvent event,
>-  const char *filename)
>+  const char *filename,
>+  bool swapnext)
> {
> QFileMonitorTestRecord *rec;
> bool ret = false;
> 
>-rec = qemu_file_monitor_test_next_record(data);
>+rec = qemu_file_monitor_test_next_record(data, NULL);
> 
>+ retry:
> if (!rec) {
> g_printerr("Missing event watch id %" PRIx64 " event %d file %s\n",
>id, event, filename);
>@@ -172,6 +190,11 @@ qemu_file_monitor_test_expect(QFileMonitorTestData *data,
> }
> 
> if (id != rec->id) {
>+if (swapnext) {
>+rec = qemu_file_monitor_test_next_record(data, rec);
>+swapnext = false;
>+goto retry;
>+}
> g_printerr("Expected watch id %" PRIx64 " but got %" PRIx64 "\n",
>id, rec->id);
> goto cleanup;
>@@ -347,7 +370,8 @@ test_file_monitor_events(void)
>   .filesrc = "fish", },
> { .type = QFILE_MONITOR_TEST_OP_EVENT,
>   .filesrc = "", .watchid = ,
>-  .eventid = QFILE_MONITOR_EVENT_IGNORED },
>+  .eventid = QFILE_MONITOR_EVENT_IGNORED,
>+  .swapnext = true },
> { .type = QFILE_MONITOR_TEST_OP_EVENT,
>   .filesrc = "fish", .watchid = ,
>   .eventid = QFILE_MONITOR_EVENT_DELETED },
>@@ -493,8 +517,9 @@ test_file_monitor_events(void)
> g_printerr("Event id=%" PRIx64 " event=%d file=%s\n",
>*op->watchid, op->eventid, op->filesrc);
> }
>-if (!qemu_file_monitor_test_expect(
>-, *op->watchid, op->eventid, op->filesrc))
>+if (!qemu_file_monitor_test_expect(, *op->watchid,
>+   op->eventid, op->filesrc,
>+   op->swapnext))
> goto cleanup;
> break;
> case QFILE_MONITOR_TEST_OP_CREATE:
>-- 
>2.21.0
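
The push-back idea in the commit message can be illustrated outside the
test harness. This is a minimal sketch, not the code from the patch: the
EventQueue type and expect_event() function are hypothetical, and events
are plain integers instead of QFileMonitorTestRecord entries.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event queue: an array of event ids plus a read cursor. */
typedef struct {
    int *events;
    size_t len;
    size_t pos;
} EventQueue;

/*
 * Consume the next event if it equals `expected`. If it does not and
 * `swapnext` is set, look one event further: when that one matches,
 * consume it and push the unexpected event back so that it becomes
 * the head of the queue for the following expectation.
 */
static bool expect_event(EventQueue *q, int expected, bool swapnext)
{
    if (q->pos >= q->len) {
        return false;               /* missing event */
    }
    if (q->events[q->pos] != expected) {
        if (!swapnext || q->pos + 1 >= q->len ||
            q->events[q->pos + 1] != expected) {
            return false;           /* genuinely unexpected */
        }
        /* Push back: the unexpected event takes the consumed slot. */
        q->events[q->pos + 1] = q->events[q->pos];
    }
    q->pos++;
    return true;
}
```

With this shape, a test that expects event A then B passes for either
arrival order as long as the expectation for A is marked swappable, and it
still fails if neither of the next two events matches.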

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH] ppc: Fix xsmaddmdp and friends

2019-08-21 Thread David Gibson
On Wed, Aug 21, 2019 at 10:28:41AM -0500, Paul A. Clarke wrote:
> From: "Paul A. Clarke" 
> 
> A class of instructions of the form:
>   op Target,A,B
> which operate like:
>   Target = Target * A + B
> have a bit set which distinguishes them from instructions that operate as:
>   Target = Target * B + A
> 
> This bit is not being checked properly (using PPC_BIT macro), so all
> instructions in this class are operating incorrectly as the second form
> above.  The bit was being checked as if it were part of a 64-bit
> instruction opcode, rather than a proper 32-bit opcode.  Fix by using the
> macro (PPC_BIT32) which treats the opcode as a 32-bit quantity.
> 
> Signed-off-by: Paul A. Clarke 

Applied to ppc-for-4.2, thanks.

> ---
>  target/ppc/translate/vsx-impl.inc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
> index 3922686..8287e27 100644
> --- a/target/ppc/translate/vsx-impl.inc.c
> +++ b/target/ppc/translate/vsx-impl.inc.c
> @@ -1308,7 +1308,7 @@ static void gen_##name(DisasContext *ctx)                    \
>      }                                                                         \
>      xt = gen_vsr_ptr(xT(ctx->opcode));                                        \
>      xa = gen_vsr_ptr(xA(ctx->opcode));                                        \
> -    if (ctx->opcode & PPC_BIT(25)) {                                          \
> +    if (ctx->opcode & PPC_BIT32(25)) {                                        \
>          /*                                                                    \
>           * AxT + B                                                            \
>           */                                                                   \
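
To see why the word width matters here, the following sketch uses two
hypothetical macros, SKETCH_BIT64 and SKETCH_BIT32, written on the
assumption that bits are numbered IBM-style (bit 0 is the most significant
bit). They are stand-ins for illustration, not the definitions from the
QEMU headers.

```c
#include <stdint.h>

/*
 * Hypothetical MSB-0 bit masks: the same bit index selects a different
 * mask depending on the assumed width of the word.
 */
#define SKETCH_BIT64(bit) (UINT64_C(0x8000000000000000) >> (bit))
#define SKETCH_BIT32(bit) (UINT32_C(0x80000000) >> (bit))

/* Correct check: the opcode is a 32-bit quantity. */
static int opcode_bit_set32(uint32_t opcode, unsigned bit)
{
    return (opcode & SKETCH_BIT32(bit)) != 0;
}

/* Mistaken check: treats the opcode as if it were 64 bits wide. */
static int opcode_bit_set64(uint32_t opcode, unsigned bit)
{
    return ((uint64_t)opcode & SKETCH_BIT64(bit)) != 0;
}
```

For bit 25 the 64-bit mask lies above the top of a 32-bit opcode, so the
mistaken check can never be true, which matches the symptom described in
the commit message: every instruction takes the same branch.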

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH] ppc: Fix xscvdpspn for SNAN

2019-08-21 Thread David Gibson
On Tue, Aug 20, 2019 at 12:26:04PM -0500, Paul A. Clarke wrote:
> From: "Paul A. Clarke" 
> 
> helper_xscvdpspn() uses float64_to_float32() to convert double-precision
> floating-point to single-precision.  Unfortunately, float64_to_float32()
> converts SNAN to QNAN, which should not happen with xscvdpspn.
> 
> float64_to_float32() is also used by other instruction implementations
> for conversions which _should_ convert SNAN to QNAN.
> 
> Rather than trying to wedge code to preserve SNAN in float64_to_float32()
> just for this one case, I instead embed an embodiment of the
> conversion code outlined in the POWER ISA for xscvdpspn.
> 
> Signed-off-by: Paul A. Clarke 

Applied to ppc-for-4.2.  I used rth's description rather than the one
above, since I found it clearer.

> ---
>  target/ppc/fpu_helper.c | 32 ++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 07bc905..c8e7192 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -2887,12 +2887,40 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t 
> opcode,
>  
>  uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
>  {
> -uint64_t result;
> +uint64_t result, sign, exp, frac;
>  
>  float_status tstat = env->fp_status;
>  set_float_exception_flags(0, );
>  
> -result = (uint64_t)float64_to_float32(xb, );
> +sign = extract64(xb, 63,  1);
> +exp  = extract64(xb, 52, 11);
> +frac = extract64(xb,  0, 52) | 0x10ULL;
> +
> +if (unlikely(exp == 0 && extract64(frac, 0, 52) != 0)) {
> +/* DP denormal operand.  */
> +/* Exponent override to DP min exp.  */
> +exp = 1;
> +/* Implicit bit override to 0.  */
> +frac = deposit64(frac, 53, 1, 0);
> +}
> +
> +if (unlikely(exp < 897 && frac != 0)) {
> +/* SP tiny operand.  */
> +if (897 - exp > 63) {
> +frac = 0;
> +} else {
> +/* Denormalize until exp = SP min exp.  */
> +frac >>= (897 - exp);
> +}
> +/* Exponent override to SP min exp - 1.  */
> +exp = 896;
> +}
> +
> +result = sign << 31;
> +result |= extract64(exp, 10, 1) << 30;
> +result |= extract64(exp, 0, 7) << 23;
> +result |= extract64(frac, 29, 23);
> +
>  /* hardware replicates result to both words of the doubleword result.  */
>  return (result << 32) | result;
>  }
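
The field-repacking step at the end of the helper can be tried on its own.
The sketch below is a simplified illustration, not the QEMU helper:
field64() and dp_to_sp_bits() are hypothetical names, and the denormal and
tiny-operand handling from the patch is deliberately left out, so it only
applies to zero, infinity, NaN, and values in the normal single-precision
range.

```c
#include <stdint.h>

/* Hypothetical helper: return `len` bits of `v` starting at bit `start`. */
static uint64_t field64(uint64_t v, unsigned start, unsigned len)
{
    return (v >> start) & (~UINT64_C(0) >> (64 - len));
}

/*
 * Repack a double-precision bit pattern as single precision by moving
 * bit fields only. No arithmetic conversion takes place, so a
 * signalling NaN keeps its quiet bit clear.
 */
static uint32_t dp_to_sp_bits(uint64_t xb)
{
    uint64_t sign = field64(xb, 63, 1);
    uint64_t exp  = field64(xb, 52, 11);
    uint64_t frac = field64(xb, 0, 52);
    uint64_t result;

    result  = sign << 31;
    result |= field64(exp, 10, 1) << 30;  /* top exponent bit */
    result |= field64(exp, 0, 7) << 23;   /* low seven exponent bits */
    result |= field64(frac, 29, 23);      /* top 23 fraction bits */
    return (uint32_t)result;
}
```

Because only bits are moved, an input whose fraction has the quiet bit
clear and a lower bit set comes out as a single-precision pattern with the
same property, which is the behaviour the patch description asks for.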

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v2] ppc/pnv: Set default ram size to 1.75GB

2019-08-21 Thread David Gibson
On Wed, Aug 21, 2019 at 12:39:45PM +0930, Joel Stanley wrote:
> This makes the powernv machine easier for end users as the default
> initrd address (1.5GB) is now within RAM.
> 
> This uses less than 2GB of RAM to ensure 32 bit Qemu still works.
> 
> Signed-off-by: Joel Stanley 

Applied to ppc-for-4.2, in place of the earlier version.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v8 01/21] configure: Define TARGET_ALIGNED_ONLY in configure

2019-08-21 Thread Richard Henderson
On 8/21/19 8:08 AM, Tony Nguyen wrote:
> Rename ALIGNED_ONLY to TARGET_ALIGNED_ONLY for clarity and move
> defines out of target/foo/cpu.h into configure, as we do with
> TARGET_WORDS_BIGENDIAN, so that it is always defined early.
> 
> Poisoned TARGET_ALIGNED_ONLY to prevent use in common code.
> 
> Signed-off-by: Tony Nguyen 
> Reviewed-by: Philippe Mathieu-Daudé 
> Reviewed-by: Richard Henderson 
> Reviewed-by: Aleksandar Markovic 
> Reviewed-by: Cornelia Huck 
> ---
>  configure | 10 +-
>  include/exec/poison.h |  1 +
>  include/qom/cpu.h |  2 +-
>  target/alpha/cpu.h|  2 --
>  target/hppa/cpu.h |  1 -
>  target/mips/cpu.h |  2 --
>  target/sh4/cpu.h  |  2 --
>  target/sparc/cpu.h|  2 --
>  target/xtensa/cpu.h   |  2 --
>  tcg/tcg.c |  2 +-
>  tcg/tcg.h |  8 +---
>  11 files changed, 17 insertions(+), 17 deletions(-)

You are going to have to fix your patch submission.

Applying: configure: Define TARGET_ALIGNED_ONLY in configure
error: patch failed: configure:7431
error: configure: patch does not apply
error: patch failed: include/exec/poison.h:35
error: include/exec/poison.h: patch does not apply
error: patch failed: include/qom/cpu.h:89
error: include/qom/cpu.h: patch does not apply
error: patch failed: target/alpha/cpu.h:23
error: target/alpha/cpu.h: patch does not apply
error: patch failed: target/hppa/cpu.h:30
error: target/hppa/cpu.h: patch does not apply
error: patch failed: target/mips/cpu.h:1
error: target/mips/cpu.h: patch does not apply
error: patch failed: target/sh4/cpu.h:23
error: target/sh4/cpu.h: patch does not apply
error: patch failed: target/sparc/cpu.h:5
error: target/sparc/cpu.h: patch does not apply
error: patch failed: target/xtensa/cpu.h:32
error: target/xtensa/cpu.h: patch does not apply
error: patch failed: tcg/tcg.c:1925
error: tcg/tcg.c: patch does not apply
error: patch failed: tcg/tcg.h:333
error: tcg/tcg.h: patch does not apply
Patch failed at 0001 configure: Define TARGET_ALIGNED_ONLY in configure

There are far too many errors for me to want to fix them up by hand.


r~



[Qemu-devel] [Bug 1819289] Re: Windows 95 and Windows 98 will not install or run

2019-08-21 Thread Philippe Mathieu-Daudé
After hours bisecting various QEMU/SeaBIOS combinations, Brad figured
out a new commit:

0a7fa00a13f0852ec6fa83ab987a5ee7978d9867 is the first bad commit
Author: Emilio G. Cota 
Date:   Mon Aug 13 20:52:26 2018 -0400

configure: enable mttcg for i386 and x86_64

Note 1: Brad was not using '-M isapc'.
Note 2: Brad was using the pc-bios/ folder checked out at v4.1.0 or 33f18cf7dc
to avoid the SeaBIOS issues reported previously.

Brad could succeed booting QEMU using '-accel thread=single' on
0a7fa00a13.


** Tags added: mttcg




Re: [Qemu-devel] [PATCH 01/13] block-crypto: misc refactoring

2019-08-21 Thread Maxim Levitsky
On Wed, 2019-08-21 at 16:39 +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 14, 2019 at 11:22:07PM +0300, Maxim Levitsky wrote:
> > * rename the write_func to create_write_func,
> >   and init_func to create_init_func
> >   this is  preparation for other write_func that will
> >   be used to update the encryption keys.
> > 
> > No functional changes
> > 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/crypto.c | 15 ---
> >  1 file changed, 8 insertions(+), 7 deletions(-)
> > 
> > diff --git a/block/crypto.c b/block/crypto.c
> > index 8237424ae6..42a3f0898b 100644
> > --- a/block/crypto.c
> > +++ b/block/crypto.c
> > @@ -51,7 +51,6 @@ static int block_crypto_probe_generic(QCryptoBlockFormat 
> > format,
> >  }
> >  }
> >  
> > -
> 
> Unrelated whitespace change
> 
> >  static ssize_t block_crypto_read_func(QCryptoBlock *block,
> >size_t offset,
> >uint8_t *buf,
> > @@ -77,7 +76,7 @@ struct BlockCryptoCreateData {
> >  };
> >  
> >  
> > -static ssize_t block_crypto_write_func(QCryptoBlock *block,
> > +static ssize_t block_crypto_create_write_func(QCryptoBlock *block,
> > size_t offset,
> > const uint8_t *buf,
> > size_t buflen,
> 
> Re-indent.
> 
> > @@ -95,8 +94,7 @@ static ssize_t block_crypto_write_func(QCryptoBlock 
> > *block,
> >  return ret;
> >  }
> >  
> > -
> 
> Unrelated whitespace
> 
> > -static ssize_t block_crypto_init_func(QCryptoBlock *block,
> > +static ssize_t block_crypto_create_init_func(QCryptoBlock *block,
> >size_t headerlen,
> >void *opaque,
> >Error **errp)
> 
> Re-indent.
> 
> > @@ -108,7 +106,8 @@ static ssize_t block_crypto_init_func(QCryptoBlock 
> > *block,
> >  return -EFBIG;
> >  }
> >  
> > -/* User provided size should reflect amount of space made
> > +/*
> > + * User provided size should reflect amount of space made
> 
> Unrelated whitespace
> 
> >   * available to the guest, so we must take account of that
> >   * which will be used by the crypto header
> >   */
> > @@ -117,6 +116,8 @@ static ssize_t block_crypto_init_func(QCryptoBlock 
> > *block,
> >  }
> >  
> >  
> > +
> > +
> 
> Unrelated whitespace
> 
> >  static QemuOptsList block_crypto_runtime_opts_luks = {
> >  .name = "crypto",
> >  .head = QTAILQ_HEAD_INITIALIZER(block_crypto_runtime_opts_luks.head),
> > @@ -272,8 +273,8 @@ static int 
> > block_crypto_co_create_generic(BlockDriverState *bs,
> >  };
> >  
> >  crypto = qcrypto_block_create(opts, NULL,
> > -  block_crypto_init_func,
> > -  block_crypto_write_func,
> > +  block_crypto_create_init_func,
> > +  block_crypto_create_write_func,
> >,
> >errp);
> 
> With the whitespace changes removed & indent fixed
> 
> Reviewed-by: Daniel P. Berrangé 
> 
> 
> Regards,
> Daniel

Thank you!

Best regards,
Maxim Levitsky




Re: [Qemu-devel] [PATCH 01/13] block-crypto: misc refactoring

2019-08-21 Thread Maxim Levitsky
On Tue, 2019-08-20 at 18:38 +0200, Max Reitz wrote:
> On 14.08.19 22:22, Maxim Levitsky wrote:
> > * rename the write_func to create_write_func,
> >   and init_func to create_init_func
> >   this is  preparation for other write_func that will
> >   be used to update the encryption keys.
> > 
> > No functional changes
> > 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/crypto.c | 15 ---
> >  1 file changed, 8 insertions(+), 7 deletions(-)
> > 
> 
> I’m not quite sure why you remove or add blank lines seemingly at random...

Basically to have consistent two space separation between all functions.
A bit of OCD I confess :-)

> 
> > diff --git a/block/crypto.c b/block/crypto.c
> > index 8237424ae6..42a3f0898b 100644
> > --- a/block/crypto.c
> > +++ b/block/crypto.c
> 
> [...]
> 
> > @@ -77,7 +76,7 @@ struct BlockCryptoCreateData {
> >  };
> >  
> >  
> > -static ssize_t block_crypto_write_func(QCryptoBlock *block,
> > +static ssize_t block_crypto_create_write_func(QCryptoBlock *block,
> > size_t offset,
> > const uint8_t *buf,
> > size_t buflen,
> 
> Alignment should be kept at the opening parentheses.
Oops. I am still trying to learn that rule. Fixed.

> 
> But other than those two things, why not.
> 
> Max
> 

Best regards,
Thanks for the review
Maxim Levitsky





Re: [Qemu-devel] [PATCH 02/13] qcrypto-luks: misc refactoring

2019-08-21 Thread Maxim Levitsky
On Tue, 2019-08-20 at 19:36 +0200, Max Reitz wrote:
> On 14.08.19 22:22, Maxim Levitsky wrote:
> > This is also a preparation for key read/write/erase functions
> > 
> > * use master key len from the header
> > * prefer to use crypto params in the QCryptoBlockLUKS
> >   over passing them as function arguments
> > * define QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME
> > * Add comments to various crypto parameters in the QCryptoBlockLUKS
> > 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  crypto/block-luks.c | 213 ++--
> >  1 file changed, 105 insertions(+), 108 deletions(-)
> > 
> > diff --git a/crypto/block-luks.c b/crypto/block-luks.c
> > index 409ab50f20..48213abde7 100644
> > --- a/crypto/block-luks.c
> > +++ b/crypto/block-luks.c
> 
> [...]
> 
> > @@ -199,13 +201,25 @@ QEMU_BUILD_BUG_ON(sizeof(struct 
> > QCryptoBlockLUKSHeader) != 592);
> >  struct QCryptoBlockLUKS {
> >  QCryptoBlockLUKSHeader header;
> >  
> > -/* Cache parsed versions of what's in header fields,
> > - * as we can't rely on QCryptoBlock.cipher being
> > - * non-NULL */
> 
> Hm, why remove this comment?

Because the intended use of these fields has changed.
As Daniel explained to me, initially none of the crypto parameters
were stored in the QCryptoBlockLUKS, and thus they were all passed
as function arguments.
Later, when qemu-img info was implemented, there was a need to 'cache' them
in the header so that the info command could use them after the image was opened.

Now, after my changes, this is no longer true: these crypto parameters are
set early on create/load and used everywhere, to avoid passing them over
and over to each function.

> 
> > +/* Main encryption algorithm used for encryption*/
> >  QCryptoCipherAlgorithm cipher_alg;
> > +
> > +/* Mode of encryption for the selected encryption algorithm */
> >  QCryptoCipherMode cipher_mode;
> > +
> > +/* Initialization vector generation algorithm */
> >  QCryptoIVGenAlgorithm ivgen_alg;
> > +
> > +/* Hash algorithm used for IV generation*/
> >  QCryptoHashAlgorithm ivgen_hash_alg;
> > +
> > +/*
> > + * Encryption algorithm used for IV generation.
> > + * Usually the same as main encryption algorithm
> > + */
> > +QCryptoCipherAlgorithm ivgen_cipher_alg;
> > +
> > +/* Hash algorithm used in pbkdf2 function */
> >  QCryptoHashAlgorithm hash_alg;
> >  };
> >  
> > @@ -397,6 +411,12 @@ qcrypto_block_luks_essiv_cipher(QCryptoCipherAlgorithm 
> > cipher,
> >  }
> >  }
> >  
> > +static int masterkeylen(QCryptoBlockLUKS *luks)
> 
> This should be a const pointer.
I haven't had const in mind while writing this but you are right.
Fixed.


> 
> > +{
> > +return luks->header.key_bytes;
> > +}
> > +
> > +
> >  /*
> >   * Given a key slot, and user password, this will attempt to unlock
> >   * the master encryption key from the key slot.
> > @@ -410,21 +430,15 @@ 
> > qcrypto_block_luks_essiv_cipher(QCryptoCipherAlgorithm cipher,
> >   */
> >  static int
> >  qcrypto_block_luks_load_key(QCryptoBlock *block,
> > -QCryptoBlockLUKSKeySlot *slot,
> > +uint slot_idx,
> 
> Did you use uint on purpose or do you mean a plain “unsigned”?
Well, there are just 8 slots, but yeah, I don't mind making this an unsigned int.


> 
> >  const char *password,
> > -QCryptoCipherAlgorithm cipheralg,
> > -QCryptoCipherMode ciphermode,
> > -QCryptoHashAlgorithm hash,
> > -QCryptoIVGenAlgorithm ivalg,
> > -QCryptoCipherAlgorithm ivcipheralg,
> > -QCryptoHashAlgorithm ivhash,
> >  uint8_t *masterkey,
> > -size_t masterkeylen,
> >  QCryptoBlockReadFunc readfunc,
> >  void *opaque,
> >  Error **errp)
> >  {
> >  QCryptoBlockLUKS *luks = block->opaque;
> > > +QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[slot_idx];
> 
> I think this is a great opportunity to make this a const pointer.
Agree. Done.
> 
> >  uint8_t *splitkey;
> >  size_t splitkeylen;
> >  uint8_t *possiblekey;
> 
> [...]
> 
> > @@ -710,6 +696,8 @@ qcrypto_block_luks_open(QCryptoBlock *block,
> >  goto fail;
> >  }
> >  
> > +cipher_mode = g_strdup(luks->header.cipher_mode);
> > +
> 
> This should be freed under the fail label.
> 
> (And maybe the fact that this no longer modifies
> luks->header.cipher_mode should be mentioned in the commit message, I
> don’t know.)

I swear I documented that in the commit message... yea in the next commit (:-()
Fixed that now.

> 
> >  /*
> >   * The cipher_mode header contains a string that we have
> >   * to further parse, of the format
> 
> [...]
> 
> > @@ -730,13 +718,13 @@ 

Re: [Qemu-devel] [RFC PATCH v4 75/75] target/i386: convert pmovmskb/movmskps/movmskpd helpers to gvec style

2019-08-21 Thread Richard Henderson
On 8/21/19 10:29 AM, Jan Bobek wrote:
> +for (intptr_t i = 0; i * sizeof(uint8_t) < oprsz; ++i) {
> +const uint8_t t = a->B(i) & (1 << 7);
> +ret |= i < 8 ? t >> (7 - i) : t << (i - 7);

You can avoid this variable shift by doing

  uint32_t t = a->B(i) >> 7;
  ret |= t << i;
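
To make the suggestion concrete, here is a minimal stand-alone sketch of
the byte-mask gathering with a constant shift by 7 followed by a shift by
the loop index. It is not the helper from the patch: the function name
movmskb_model and the plain byte-array argument are assumptions made for
illustration, standing in for the Reg accessor a->B(i).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of pmovmskb: collect the most significant bit of
 * each input byte into bit i of the result. The name and the byte-array
 * interface are illustrative only, not QEMU's helper signature.
 */
static uint32_t movmskb_model(const uint8_t *bytes, size_t n)
{
    uint32_t ret = 0;

    /* A 32-bit result can hold at most 32 mask bits (one YMM register). */
    if (n > 32) {
        n = 32;
    }

    for (size_t i = 0; i < n; ++i) {
        /* Constant shift isolates the sign bit as 0 or 1 ... */
        uint32_t t = bytes[i] >> 7;
        /* ... and a single left shift places it at position i. */
        ret |= t << i;
    }
    return ret;
}
```

Since t is either 0 or 1, no conditional on the index is needed, and 32
bytes map exactly onto the 32 bits of the return value.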

> +uint64_t glue(helper_pmovmskbq, SUFFIX)(Reg *a, uint32_t desc)
> +{
> +return glue(helper_pmovmskbd, SUFFIX)(a, desc);
>  }
...
> +DEF_GEN_INSN2_GVEC(vpmovmskb, Gd, Uqq, sd1_ool, XMM_OPRSZ, XMM_MAXSZ, 
> pmovmskbd_xmm)
> +DEF_GEN_INSN2_GVEC(vpmovmskb, Gq, Uqq, sq1_ool, XMM_OPRSZ, XMM_MAXSZ, 
> pmovmskbq_xmm)

What is the difference between these two?

Given that we aren't attempting avx512, uint32_t is sufficient for all of the
bytes of a YMM register.

I have a feeling that some of this should simply use target_ulong, so that a
direct assignment to the general register can be done without extra extensions
within the generated code.


r~



Re: [Qemu-devel] [PATCH v2 0/5] tricore: Convert to translate_loop (resend)

2019-08-21 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20190821122315.18015-1-kbast...@mail.uni-paderborn.de/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [PATCH v2 0/5] tricore: Convert to translate_loop (resend)
Message-id: 20190821122315.18015-1-kbast...@mail.uni-paderborn.de

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]  
patchew/20190821122315.18015-1-kbast...@mail.uni-paderborn.de -> 
patchew/20190821122315.18015-1-kbast...@mail.uni-paderborn.de
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for 
path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) 
registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 
'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for 
path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for 
path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) 
registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for 
path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for 
path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) 
registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 
'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' 
(https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 
'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' 
(https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 
'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) 
registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out 
'22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out 
'90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'7bfe584e321946771692711ff83ad2b5850daca7'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out 
'20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) 
registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' 
(https://github.com/openssl/openssl) registered for path 
'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': 
checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out 
'50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered 
for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) 
registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': 
checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked 
out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/pyca-cryptography': checked out 

Re: [Qemu-devel] [Qemu-block] [PATCH 05/13] qcrypto-luks: clear the masterkey and password before freeing them always

2019-08-21 Thread Maxim Levitsky
On Thu, 2019-08-22 at 02:01 +0300, Nir Soffer wrote:
> On Wed, Aug 14, 2019, 23:23 Maxim Levitsky  wrote:
> 
> > While there are other places where these are still stored in memory,
> > this is still one less key material area that can be sniffed with
> > various side channel attacks
> > 
> > 
> > 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  crypto/block-luks.c | 52 ++---
> >  1 file changed, 44 insertions(+), 8 deletions(-)
> > 
> > diff --git a/crypto/block-luks.c b/crypto/block-luks.c
> > index e1a4df94b7..336e633df4 100644
> > --- a/crypto/block-luks.c
> > +++ b/crypto/block-luks.c
> > @@ -1023,8 +1023,18 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
> >   cleanup:
> >  qcrypto_ivgen_free(ivgen);
> >  qcrypto_cipher_free(cipher);
> > -g_free(splitkey);
> > -g_free(possiblekey);
> > +
> > +if (splitkey) {
> > +memset(splitkey, 0, splitkeylen);
> > 
> 
> I think we need memset_s() or similar replacement, see
> 
> https://www.cryptologie.net/article/419/zeroing-memory-compiler-optimizations-and-memset_s/

You raise a very very good point here! Thanks!!

Best regards,
Maxim Levitsky






Re: [Qemu-devel] [Qemu-riscv] RISC-V: Vector && DSP Extension

2019-08-21 Thread Jonathan Behrens
Is there a reason to guarantee support of a particular draft extension
version once it has been superseded by a subsequent version? I understand
why it was done for priv-1.9.1, but going forward I'm skeptical there will
be much/any code out in the wild that depends on old draft versions of
extensions. The main reason people seem interested in implementing
extensions in QEMU is to test them before going through the trouble of
manufacturing hardware, and I don't really see why anyone would want to
test a design that is no longer under consideration.

Jonathan

On Wed, Aug 21, 2019 at 3:31 PM Palmer Dabbelt  wrote:

> On Thu, 15 Aug 2019 14:37:52 PDT (-0700), alistai...@gmail.com wrote:
> > On Thu, Aug 15, 2019 at 2:07 AM Peter Maydell 
> wrote:
> >>
> >> On Thu, 15 Aug 2019 at 09:53, Aleksandar Markovic
> >>  wrote:
> >> >
> >> > > We can accept draft
> >> > > extensions in QEMU as long as they are disabled by default.
> >>
> >> > Hi, Alistair, Palmer,
> >> >
> >> > Is this an official stance of QEMU community, or perhaps Alistair's
> >> > personal judgement, or maybe a rule within risv subcomunity?
> >>
> >> Alistair asked on a previous thread; my view was:
> >> https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg03364.html
> >> and nobody else spoke up disagreeing (summary: should at least be
> >> disabled-by-default and only enabled by setting an explicit
> >> property whose name should start with the 'x-' prefix).
> >
> > Agreed!
> >
> >>
> >> In general QEMU does sometimes introduce experimental extensions
> >> (we've had them in the block layer, for example) and so the 'x-'
> >> property to enable them is a reasonably established convention.
> >> I think it's a reasonable compromise to allow this sort of work
> >> to start and not have to live out-of-tree for a long time, without
> >> confusing users or getting into a situation where some QEMU
> >> versions behave differently or to obsolete drafts of a spec
> >> without it being clear from the command line that experimental
> >> extensions are being enabled.
> >>
> >> There is also an element of "submaintainer judgement" to be applied
> >> here -- upstream is probably not the place for a draft extension
> >> to be implemented if it is:
> >>  * still fast moving or subject to major changes of design direction
> >>  * major changes to the codebase (especially if it requires
> >>changes to core code) that might later need to be redone
> >>entirely differently
> >>  * still experimental
> >
> > Yep, agreed. For RISC-V I think this would extend to only allowing
> > extensions that have backing from the foundation and are under active
> > discussion.
>
> My general philosophy here is that we'll take anything written down in an
> official RISC-V ISA manual (ie, the ones actually released by the
> foundation).
> This provides a single source of truth for what an extension name /
> version
> means, which is important to avoid confusion.  If it's a ratified
> extension
> then I see no reason not to support it on my end.  For frozen extensions
> we
> should probably just wait the 45 days until they go up for a ratification
> vote,
> but I'd be happy to start reviewing patches then (or earlier :)).
>
> If the spec is a draft in the ISA manual then we need to worry about the
> support burden, which I don't have a fixed criteria for -- generally there
> shouldn't be issues here, but early drafts can be in a state where they're
> going to change extensively and are unlikely to be used by anyone.
> There's
> also the question of "what is an official release of a draft
> specification?".
>
> That's a bit awkward right now: the current ratified ISA manual contains
> version 0.3 of the hypervisor extension, but I just talked to Andrew and
> the
> plan is to remove the draft extensions from the ratified manuals because
> these
> drafts are old and the official manuals update slowly.  For now I guess
> we'll
> need an an-hoc way of determining if a draft extension has been officially
> versioned or not, which is a bit of a headache.
>
> We already have examples of supporting draft extensions, including
> priv-1.9.1.
> This does cause some pain for us on the QEMU side (CSR bits have different
> semantics between the specs), but there's 1.9.1 hardware out there and the
> port
> continues to be useful so I'd be in favor of keeping it around for now.  I
> suppose there is an implicit risk that draft extensions will be
> deprecated, but
> the "x-" prefix, draft status, and long deprecation period should be
> sufficient
> to inform users of the risk.  I wouldn't be opposed to adding a "this is a
> draft ISA" warning, but I feel like it might be a bit overkill.
>
> >
> > Alistair
> >
> >>
> >> thanks
> >> -- PMM
>
>


Re: [Qemu-devel] [PATCH v2 4/5] target/tricore: Implement a qemu excptions helper

2019-08-21 Thread Richard Henderson
On 8/21/19 4:05 PM, Richard Henderson wrote:
> On 8/21/19 5:23 AM, Bastian Koppelmann wrote:
>> @@ -3928,7 +3937,7 @@ static void decode_sr_system(DisasContext *ctx)
>>  ctx->base.is_jmp = DISAS_NORETURN;
>>  break;
>>  case OPC2_16_SR_DEBUG:
>> -/* raise EXCP_DEBUG */
>> +generate_qemu_excp(ctx, EXCP_DEBUG);
>>  break;
>>  case OPC2_16_SR_FRET:
>>  gen_fret(ctx);
>> @@ -8354,7 +8363,7 @@ static void decode_sys_interrupts(DisasContext *ctx)
>>  
>>  switch (op2) {
>>  case OPC2_32_SYS_DEBUG:
>> -/* raise EXCP_DEBUG */
>> +generate_qemu_excp(ctx, EXCP_DEBUG);
>>  break;
>>  case OPC2_32_SYS_DISABLE:
>>  tcg_gen_andi_tl(cpu_ICR, cpu_ICR, ~MASK_ICR_IE_1_3);
> 
> This is not correct -- EXCP_DEBUG is an internal qemu exception.
> 
> The manual I have only describes the ISA and does not describe what a "Debug
> Event" would be.  I note that you're missing the DBGSR.DE check.  I also note
> that whatever a "Debug Event" is, RFM appears to be the return from it.  So 
> one
> can deduce some things about what it should be.

Anyway, remove these hunks and the rest of the patch is ok.
Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH v2 3/5] target/tricore: Use translate_loop

2019-08-21 Thread Richard Henderson
On 8/21/19 5:23 AM, Bastian Koppelmann wrote:
> Signed-off-by: Bastian Koppelmann 
> ---
> v1 -> v2:
> - save hflags in tricore_tr_init_disas_context()
> 
>  target/tricore/translate.c | 118 +++--
>  1 file changed, 74 insertions(+), 44 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH v2 4/5] target/tricore: Implement a qemu excptions helper

2019-08-21 Thread Richard Henderson
On 8/21/19 5:23 AM, Bastian Koppelmann wrote:
> @@ -3928,7 +3937,7 @@ static void decode_sr_system(DisasContext *ctx)
>  ctx->base.is_jmp = DISAS_NORETURN;
>  break;
>  case OPC2_16_SR_DEBUG:
> -/* raise EXCP_DEBUG */
> +generate_qemu_excp(ctx, EXCP_DEBUG);
>  break;
>  case OPC2_16_SR_FRET:
>  gen_fret(ctx);
> @@ -8354,7 +8363,7 @@ static void decode_sys_interrupts(DisasContext *ctx)
>  
>  switch (op2) {
>  case OPC2_32_SYS_DEBUG:
> -/* raise EXCP_DEBUG */
> +generate_qemu_excp(ctx, EXCP_DEBUG);
>  break;
>  case OPC2_32_SYS_DISABLE:
>  tcg_gen_andi_tl(cpu_ICR, cpu_ICR, ~MASK_ICR_IE_1_3);

This is not correct -- EXCP_DEBUG is an internal qemu exception.

The manual I have only describes the ISA and does not describe what a "Debug
Event" would be.  I note that you're missing the DBGSR.DE check.  I also note
that whatever a "Debug Event" is, RFM appears to be the return from it.  So one
can deduce some things about what it should be.


r~



Re: [Qemu-devel] [Qemu-block] [PATCH 05/13] qcrypto-luks: clear the masterkey and password before freeing them always

2019-08-21 Thread Nir Soffer
On Wed, Aug 14, 2019, 23:23 Maxim Levitsky  wrote:

> While there are other places where these are still stored in memory,
> this is still one less key material area that can be sniffed with
> various side channel attacks
>
>
>
> Signed-off-by: Maxim Levitsky 
> ---
>  crypto/block-luks.c | 52 ++---
>  1 file changed, 44 insertions(+), 8 deletions(-)
>
> diff --git a/crypto/block-luks.c b/crypto/block-luks.c
> index e1a4df94b7..336e633df4 100644
> --- a/crypto/block-luks.c
> +++ b/crypto/block-luks.c
> @@ -1023,8 +1023,18 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
>   cleanup:
>  qcrypto_ivgen_free(ivgen);
>  qcrypto_cipher_free(cipher);
> -g_free(splitkey);
> -g_free(possiblekey);
> +
> +if (splitkey) {
> +memset(splitkey, 0, splitkeylen);
>

I think we need memset_s() or similar replacement, see

https://www.cryptologie.net/article/419/zeroing-memory-compiler-optimizations-and-memset_s/

> +g_free(splitkey);
> +}
> +
> +if (possiblekey) {
> +memset(possiblekey, 0, masterkeylen(luks));
> +g_free(possiblekey);
> +
> +}
> +
>  return ret;
>  }
>
> @@ -1161,16 +1171,34 @@ qcrypto_block_luks_open(QCryptoBlock *block,
>  block->sector_size = QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
>  block->payload_offset = luks->header.payload_offset *
> block->sector_size;
>
> -g_free(masterkey);
> -g_free(password);
> +if (masterkey) {
> +memset(masterkey, 0, masterkeylen(luks));
> +g_free(masterkey);
> +}
> +
> +if (password) {
> +memset(password, 0, strlen(password));
> +g_free(password);
> +}
> +
>  return 0;
>
>   fail:
> -g_free(masterkey);
> +
> +if (masterkey) {
> +memset(masterkey, 0, masterkeylen(luks));
> +g_free(masterkey);
> +}
> +
> +if (password) {
> +memset(password, 0, strlen(password));
> +g_free(password);
> +}
> +
>  qcrypto_block_free_cipher(block);
>  qcrypto_ivgen_free(block->ivgen);
> +
>  g_free(luks);
> -g_free(password);
>  return ret;
>  }
>
> @@ -1459,7 +1487,10 @@ qcrypto_block_luks_create(QCryptoBlock *block,
>
>  memset(masterkey, 0, luks->header.key_bytes);
>  g_free(masterkey);
> +
> +memset(password, 0, strlen(password));
>  g_free(password);
> +
>  g_free(cipher_mode_spec);
>
>  return 0;
> @@ -1467,9 +1498,14 @@ qcrypto_block_luks_create(QCryptoBlock *block,
>   error:
>  if (masterkey) {
>  memset(masterkey, 0, luks->header.key_bytes);
> +g_free(masterkey);
>  }
> -g_free(masterkey);
> -g_free(password);
> +
> +if (password) {
> +memset(password, 0, strlen(password));
> +g_free(password);
> +}
> +
>  g_free(cipher_mode_spec);
>
>  qcrypto_block_free_cipher(block);
> --
> 2.17.2
>
>
>


Re: [Qemu-devel] [PATCH v2 5/5] target/tricore: Fix tricore_tr_translate_insn

2019-08-21 Thread Richard Henderson
On 8/21/19 5:23 AM, Bastian Koppelmann wrote:
> we now fetch 2 bytes first, check whether we have a 32 bit insn, and only then
> fetch another 2 bytes. We also make sure that a 16 bit insn that still fits
> into the current page does not end up in the next page.
> 
> Signed-off-by: Bastian Koppelmann 
> ---
>  target/tricore/translate.c | 47 +++---
>  1 file changed, 34 insertions(+), 13 deletions(-)

Reviewed-by: Richard Henderson 
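
For readers following along, the fetch order described in the commit
message can be sketched roughly as below. This is a stand-alone model,
not the code from the patch: fetch_insn, load16_le and the flat byte
buffer are hypothetical, and the length test assumes that bit 0 of the
first halfword distinguishes 32 bit from 16 bit encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 0 of the first halfword is set for 32 bit insns. */
static int insn_is_32bit(uint16_t first_half)
{
    return first_half & 1;
}

/* Little-endian 16 bit load from a flat buffer (illustrative only). */
static uint16_t load16_le(const uint8_t *mem, size_t off)
{
    return (uint16_t)(mem[off] | (mem[off + 1] << 8));
}

/*
 * Fetch one instruction at pc. Only the first two bytes are read
 * unconditionally; the second halfword is read only when the first one
 * says the insn is 32 bits wide. Returns the insn length in bytes.
 */
static size_t fetch_insn(const uint8_t *mem, size_t pc, uint32_t *insn)
{
    uint16_t lo = load16_le(mem, pc);

    if (!insn_is_32bit(lo)) {
        *insn = lo;
        return 2;
    }

    uint16_t hi = load16_le(mem, pc + 2);
    *insn = ((uint32_t)hi << 16) | lo;
    return 4;
}
```

The point of this ordering is that a 16 bit insn sitting in the last two
bytes of a page never causes a read from the following page.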


r~



Re: [Qemu-devel] [PATCH 03/13] qcrypto-luks: refactoring: extract load/store/check/parse header functions

2019-08-21 Thread Maxim Levitsky
On Tue, 2019-08-20 at 20:01 +0200, Max Reitz wrote:
> On 14.08.19 22:22, Maxim Levitsky wrote:
> > With upcoming key management, the header will
> > need to be stored after the image is created.
> > 
> > Extracting load header isn't strictly needed, but
> > do this anyway for the symmetry.
> > 
> > Also I extracted a function that does basic sanity
> > checks on the just read header, and a function
> > which parses all the crypto format to make the
> > code a bit more readable, plus now the code
> > doesn't destruct the in-header cipher-mode string,
> > so that the header now can be stored many times,
> > which is needed for the key management.
> > 
> > Also this allows to contain the endianess conversions
> > in these functions alone
> > 
> > The header is no longer endian swapped in place,
> > to prevent (mostly theoretical races I think)
> > races where someone could see the header in the
> > process of beeing byteswapped.
> 
> The formatting looks weird, it doesn’t look quite 72 characters wide...
>  (what commit messages normally use)
Could you elaborate on that? I thought that lines should not
exceed the 80-character limit.

> 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  crypto/block-luks.c | 756 ++--
> >  1 file changed, 440 insertions(+), 316 deletions(-)
> 
> Also, this commit is just too big.

Yea, but it has no functional changes.
I can split it further, but that won't help much IMHO.

Best regards,
Maxim Levitsky





Re: [Qemu-devel] [PATCH v1 2/4] s390x/tcg: Introduce probe_read_access()

2019-08-21 Thread Richard Henderson
On 8/21/19 3:31 PM, Richard Henderson wrote:
>> Yes, that's what I mean, TARGET_PAGE_SIZE, but eventually crossing a
>> page boundary. The longer I stare at the MVCL code, the more broken it
>> is. There are more nice things buried in the PoP. MVCL does not detect
>> access exceptions beyond the next 2k. So we have to limit it there
>> differently.
> That language is indeed odd.
> 
> The only reading of that paragraph that makes sense to me is that the hardware
> *must* interrupt MVCL after every 2k bytes processed.  The idea that the user
> can magically write to a read-only page simply by providing length = 2MB and
> page that is initially writable is dumb.  I cannot imagine that is a correct
> reading.
> 
> Getting clarification from an IBM engineer on that would be good; otherwise I
> would just ignore that and proceed as if all access checks are performed.
> 

FWIW, splitting the operation at every aligned 2k boundary is exactly what the
Hercules emulator does:

len3 = NOCROSS2KL(addr1,len1) ? len1 : (int)(0x800 - (addr1 & 0x7FF));
len4 = NOCROSS2KL(addr2,len2) ? len2 : (int)(0x800 - (addr2 & 0x7FF));
len = len3 < len4 ? len3 : len4;
/* Use concpy to ensure Concurrent block update consistency */
concpy (regs, dest, source, len);

After this it writes back the lengths and addresses to the
register file, and then if necessary loops back to the address
translation step.
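
As a rough illustration of the length computation in that snippet, here is a minimal standalone sketch. It is not Hercules or QEMU code; the helper names bytes_to_2k_boundary and mvcl_chunk_len are hypothetical, and it assumes the only goal is to pick a chunk that crosses no aligned 2k boundary on either operand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 0x800u /* 2k, the boundary used in the snippet above */

/* Bytes from addr up to the next aligned 2k boundary, capped at len. */
static size_t bytes_to_2k_boundary(uint64_t addr, size_t len)
{
    size_t room = CHUNK_SIZE - (size_t)(addr & (CHUNK_SIZE - 1));

    assert(room >= 1 && room <= CHUNK_SIZE);
    return len < room ? len : room;
}

/*
 * Length of the next unit of work: the largest run that crosses a 2k
 * boundary on neither the destination nor the source operand.
 */
static size_t mvcl_chunk_len(uint64_t dst, size_t dst_len,
                             uint64_t src, size_t src_len)
{
    size_t a = bytes_to_2k_boundary(dst, dst_len);
    size_t b = bytes_to_2k_boundary(src, src_len);

    return a < b ? a : b;
}
```

A caller would copy mvcl_chunk_len() bytes, advance both addresses and lengths by that amount, and loop, which leaves a natural point after each chunk to recognize interrupts.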


r~



Re: [Qemu-devel] [PATCH 05/13] qcrypto-luks: clear the masterkey and password before freeing them always

2019-08-21 Thread Maxim Levitsky
On Tue, 2019-08-20 at 20:12 +0200, Max Reitz wrote:
> On 14.08.19 22:22, Maxim Levitsky wrote:
> > While there are other places where these are still stored in memory,
> > this is still one less key material area that can be sniffed with
> > various side channel attacks
> > 
> > 
> > 
> 
> (Many empty lines here)
> 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  crypto/block-luks.c | 52 ++---
> >  1 file changed, 44 insertions(+), 8 deletions(-)
> 
> Wouldn’t it make sense to introduce a dedicated function for this?

Absolutely. I was mostly focused on fixing all the cases first.
I usually refactor such ugly code at the end, but this time I forgot
to do so.

Plus I need to pick a place to put such a function (it could be useful anywhere
in qemu), and first check whether glib already has such a free+scrub function
implemented somewhere.
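
A minimal sketch of what such a free+scrub helper could look like, assuming plain libc only. The names scrub_buffer and scrub_and_free are hypothetical; they are not existing QEMU or glib functions, and this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Overwrite len bytes with zeros. The stores go through a volatile
 * pointer so the compiler is not free to drop them as dead writes
 * just because the buffer is about to be freed.
 */
static void scrub_buffer(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    while (len--) {
        *p++ = 0;
    }
}

/* Scrub a heap buffer, then free it. A NULL buffer is a no-op. */
static void scrub_and_free(void *buf, size_t len)
{
    if (buf == NULL) {
        return;
    }
    scrub_buffer(buf, len);
    free(buf);
}
```

Having one helper means every exit path, including error paths, can call the same function instead of repeating a memset + free pair.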

Best regards,
Maxim Levitsky




Re: [Qemu-devel] [PATCH 12/13] qemu-img: implement key management

2019-08-21 Thread Maxim Levitsky
On Tue, 2019-08-20 at 20:29 +0200, Max Reitz wrote:
> On 14.08.19 22:22, Maxim Levitsky wrote:
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/crypto.c   |  16 ++
> >  block/crypto.h   |   3 +
> >  qemu-img-cmds.hx |  13 +
> >  qemu-img.c   | 140 +++
> >  4 files changed, 172 insertions(+)
> 
> Yes, this seems a bit weird.  Putting it under amend seems like the
> natural thing if that works; if not, I think it should be a single
> qemu-img subcommand instead of two.
> 
> Max
> 
Agreed, that's why this is an RFC.

Best regards,
Maxim Levitsky




Re: [Qemu-devel] [PATCH 07/13] block: add manage-encryption command (qmp and blockdev)

2019-08-21 Thread Maxim Levitsky
On Tue, 2019-08-20 at 20:27 +0200, Max Reitz wrote:
> On 14.08.19 22:22, Maxim Levitsky wrote:
> > This adds:
> > 
> > * x-blockdev-update-encryption and x-blockdev-erase-encryption qmp commands
> >   Both commands take the QCryptoKeyManageOptions
> >   the x-blockdev-update-encryption is meant for non destructive addition
> >   of key slots / whatever the encryption driver supports in the future
> > 
> >   x-blockdev-erase-encryption is meant for destructive encryption key erase,
> >   in some cases even without way to recover the data.
> > 
> > 
> > * bdrv_setup_encryption callback in the block driver
> >   This callback does both the above functions with 'action' parameter
> > 
> > * QCryptoKeyManageOptions with set of options that drivers can use for 
> > encryption managment
> >   Currently it has all the options that LUKS needs, and later it can be 
> > extended
> >   (via union) to support more encryption drivers if needed
> > 
> > * blk_setup_encryption / bdrv_setup_encryption - the usual block layer 
> > wrappers.
> >   Note that bdrv_setup_encryption takes BlockDriverState and not BdrvChild,
> >   for the ease of use from the qmp code. It is not expected that this 
> > function
> >   will be used by anything but qmp and qemu-img code
> > 
> > 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/block-backend.c  |  9 
> >  block/io.c | 24 
> >  blockdev.c | 40 ++
> >  include/block/block.h  | 12 ++
> >  include/block/block_int.h  | 11 ++
> >  include/sysemu/block-backend.h |  7 ++
> >  qapi/block-core.json   | 36 ++
> >  qapi/crypto.json   | 26 ++
> >  8 files changed, 165 insertions(+)
> 
> Now I don’t know whether you want to keep this interface at all, because
> the cover letter seemed to imply you’d prefer a QMP amend.  But let it
> be said that a QMP amend is no trivial task.  I think the most difficult
> bit is that the qcow2 implementation currently is inherently an offline
> operation.  It isn’t a good idea to use it on a live image.  (Maybe it
> works, but it’s definitely not what I had in mind when I wrote it.)
> 
> So I’ll still take a quick glance at the interface here.
> 
> [...]
> 
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 0d43d4f37c..53ed411eed 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -5327,3 +5327,39 @@
> >'data' : { 'node-name': 'str',
> >   'iothread': 'StrOrNull',
> >   '*force': 'bool' } }
> > +
> > +
> > +##
> > +# @x-blockdev-update-encryption:
> > +#
> > +# Update the encryption keys for an encrypted block device
> > +#
> > +# @node-name:Name of the blockdev to operate on
> > +# @force: Disable safety checks (use with care)
> > +# @options:   Driver specific options
> > +#
> > +
> > +# Since: 4.2
> > +##
> > +{ 'command': 'x-blockdev-update-encryption',
> > +  'data': { 'node-name' : 'str',
> > +'*force' : 'bool',
> > +'options': 'QCryptoEncryptionSetupOptions' } }
> > +
> > +##
> > +# @x-blockdev-erase-encryption:
> > +#
> > +# Erase the encryption keys for an encrypted block device
> > +#
> > +# @node-name:Name of the blockdev to operate on
> 
> Why the tab?
Because checkpatch.pl doesn't warn about this :-)

> 
> > +# @force: Disable safety checks (use with care)
> 
> I think being a bit more verbose wouldn’t hurt.
> 
> (Same above.)
True, and this is another reason this is an RFC.

I honestly didn't finish the documentation
because of the sudden decision to drop this whole interface.


> 
> > +# @options:   Driver specific options
> > +#
> > +# Returns: @QCryptoKeyManageResult
> > +#
> > +# Since: 4.2
> > +##
> > +{ 'command': 'x-blockdev-erase-encryption',
> > +  'data': { 'node-name' : 'str',
> > +'*force' : 'bool',
> > +'options': 'QCryptoEncryptionSetupOptions' } }
> > diff --git a/qapi/crypto.json b/qapi/crypto.json
> > index b2a4cff683..69e8b086db 100644
> > --- a/qapi/crypto.json
> > +++ b/qapi/crypto.json
> > @@ -309,3 +309,29 @@
> >'base': 'QCryptoBlockInfoBase',
> >'discriminator': 'format',
> >'data': { 'luks': 'QCryptoBlockInfoLUKS' } }
> > +
> > +
> > +##
> > +# @QCryptoEncryptionSetupOptions:
> > +#
> > +# Driver specific options for encryption key management.
> 
> The options do seem LUKS-specific, but the name of this structure does not.
This is because, to not be LUKS-specific, we must use some kind of union,
which means that the user has to specify the driver, and I didn't want that.
Now all of you have convinced me ( :-) ) to do this, so it will be done when I
switch to the amend interface.

> 
> > +# @key-secret: the ID of a QCryptoSecret object providing the password
> > +#  to add or to erase (optional for erase)
> > +#
> > +# @old-key-secret: 

Re: [Qemu-devel] [PATCH v1 2/4] s390x/tcg: Introduce probe_read_access()

2019-08-21 Thread Richard Henderson
On 8/21/19 2:33 PM, David Hildenbrand wrote:
>> NOTDIRTY cannot fault at all.  The associated rcu critical section is ugly
>> enough to make me not want to do anything except continue to go through the
>> regular MMIO path.
>>
>> In any case, so long as we eliminate *access* faults by probing the page 
>> table,
>> then falling back to the byte-by-byte loop is, AFAICS, sufficient to 
>> implement
>> the instructions correctly.
> 
> "In any case, so long as we eliminate *access* faults by probing the
> page table" - that's what I'm doing in this patch (and even more correct
> in the prototype patch I shared), no? (besides the watchpoint madness below)

Correct.

My main objection to your current patch is that you perform the access checks
within MVC, and then do some more tlb lookups in fast_memmove.

I think that fast_memmove is where the access checks should live.  That allows
incremental improvement to combine access checks + host address lookup, which
cannot currently be done in one step with existing interfaces.

I guess you would still want access checks within MVC for the case in which you
must fall back to byte-by-byte because of destructive overlap.

> "falling back to the byte-by-byte loop is, AFAICS, sufficient"
> 
> I don't think this is sufficient. E.g., LAP protected pages
> (PAGE_WRITE_INV which immediately requires a new MMU walk on the next
> access) will trigger a new MMU walk on every byte access (that's why I
> chose to pre-translate in my prototype).

LAP protected pages is exactly why probe_write should return the host address,
so that we can do the access check + host address lookup in one step.

But in the meantime...

> If another CPU modified the
> page tables in between, we could suddenly get a fault - although we
> checked early. What am I missing?

You're concerned with a bare write to the page table by cpu B, while cpu A is
executing, and before cpu B issues the cross-cpu tlb flush?

The tlb victim cache should prevent having to re-read a tlb entry from memory,
at least for MVC.  The unlimited size we currently support for MVCL and MVCLE
could act weird, but would be fixed by limiting the execution as discussed.

Honestly, the os has to make sure that the page remains valid until after the
flush completes, otherwise it's an os bug.  The cross-cpu tlb flush happens via
async_run_on_cpu, and of course never occurs while we are executing a TB.  Yet
another reason to limit the amount of work any one instruction does.  ;-)


> I see that we use BP_STOP_BEFORE_ACCESS for PER (Program Event
> Recording) on s390x. I don't think that's correct. We want to get
> notified after the values were changed.
> 
> "A storage-alteration event occurs whenever a CPU,
> by using a logical or virtual address, makes a store
> access without an access exception to the storage
> area designated by control registers 10 and 11. ..."
> 
> "For a PER instruction-fetching nullification event, the
> unit of operation is nullified. For other PER events,
> the unit of operation is completed"
> 
> Oh man, why is everything I take a look at broken.

Heh.

>> In the latter case, if the instruction has had any side effects prior to the
>> longjmp, they will be re-done when we re-start the current instruction.
>>
>> To me this seems like a rather large bug in our implementation of 
>> watchpoints,
>> as it only really works properly for simple load/store/load-op-store type
>> instructions.  Anything that works on many addresses and doesn't delay side
>> effects until all accesses are complete will Do The Wrong Thing.
>>
>> The fix, AFAICS, is for probe_write to call check_watchpoint(), so that we
>> take the debug exit early.
> 
> Indeed. I see what you mean now. (I was ignoring the "before access"
> because I was assuming we don't need it on s390x)
> 
> probe_write() would have to check for all BP_STOP_BEFORE_ACCESS watchpoints.

!BP_STOP_BEFORE_ACCESS watchpoints exit to the main loop as well, so that it
can restart and then single-step the current instruction.

We need the check in probe_write for all cases.

> Yes, that's what I mean, TARGET_PAGE_SIZE, but eventually crossing a
> page boundary. The longer I stare at the MVCL code, the more broken it
> is. There are more nice things buried in the PoP. MVCL does not detect
> access exceptions beyond the next 2k. So we have to limit it there
> differently.

That language is indeed odd.

The only reading of that paragraph that makes sense to me is that the hardware
*must* interrupt MVCL after every 2k bytes processed.  The idea that the user
can magically write to a read-only page simply by providing length = 2MB and
page that is initially writable is dumb.  I cannot imagine that is a correct
reading.

Getting clarification from an IBM engineer on that would be good; otherwise I
would just ignore that and proceed as if all access checks are performed.

> So what I understand is that
> 
> - we should handle watchpoints in probe_write()
> - not bypass IO memory 

Re: [Qemu-devel] [PATCH 07/13] block: add manage-encryption command (qmp and blockdev)

2019-08-21 Thread Maxim Levitsky
On Wed, 2019-08-21 at 13:47 +0200, Markus Armbruster wrote:
> Maxim Levitsky  writes:
> 
> > This adds:
> > 
> > * x-blockdev-update-encryption and x-blockdev-erase-encryption qmp commands
> >   Both commands take the QCryptoKeyManageOptions
> >   the x-blockdev-update-encryption is meant for non destructive addition
> >   of key slots / whatever the encryption driver supports in the future
> > 
> >   x-blockdev-erase-encryption is meant for destructive encryption key erase,
> >   in some cases even without way to recover the data.
> > 
> > 
> > * bdrv_setup_encryption callback in the block driver
> >   This callback does both the above functions with 'action' parameter
> > 
> > * QCryptoKeyManageOptions with set of options that drivers can use for 
> > encryption managment
> >   Currently it has all the options that LUKS needs, and later it can be 
> > extended
> >   (via union) to support more encryption drivers if needed
> > 
> > * blk_setup_encryption / bdrv_setup_encryption - the usual block layer 
> > wrappers.
> >   Note that bdrv_setup_encryption takes BlockDriverState and not BdrvChild,
> >   for the ease of use from the qmp code. It is not expected that this 
> > function
> >   will be used by anything but qmp and qemu-img code
> > 
> > 
> > Signed-off-by: Maxim Levitsky 
> 
> [...]
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 0d43d4f37c..53ed411eed 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -5327,3 +5327,39 @@
> >'data' : { 'node-name': 'str',
> >   'iothread': 'StrOrNull',
> >   '*force': 'bool' } }
> > +
> > +
> > +##
> > +# @x-blockdev-update-encryption:
> > +#
> > +# Update the encryption keys for an encrypted block device
> > +#
> > +# @node-name:Name of the blockdev to operate on
> > +# @force: Disable safety checks (use with care)
> 
> What checks excactly are disabled?
The ability to overwrite a used slot with a different password.
If the overwrite fails, the image won't be recoverable.

The safe way is to add a new slot and then erase the old
one, but this changes the slot where the password
is stored, unless the procedure is used twice.
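
To make the add-then-erase order concrete, here is a toy model, not the LUKS driver code. The struct and the functions slot_add, slot_erase and change_password are hypothetical, and passwords are kept in plain text purely for illustration (a real key slot stores encrypted key material).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_SLOTS 8 /* LUKS1 has eight key slots */
#define PW_MAX 32

struct keyslot {
    bool active;
    char password[PW_MAX];
};

/* Store pw in the first free slot; return its index, or -1 on failure. */
static int slot_add(struct keyslot *slots, const char *pw)
{
    if (strlen(pw) >= PW_MAX) {
        return -1;
    }
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (!slots[i].active) {
            strcpy(slots[i].password, pw);
            slots[i].active = true;
            return i;
        }
    }
    return -1;
}

/* Wipe one slot and mark it inactive. */
static void slot_erase(struct keyslot *slots, int idx)
{
    assert(idx >= 0 && idx < NUM_SLOTS);
    memset(&slots[idx], 0, sizeof(slots[idx]));
}

/*
 * Safe password change: add the new slot first and erase the old one
 * only after the add succeeded. On failure the old slot is untouched.
 */
static int change_password(struct keyslot *slots, int old_idx,
                           const char *new_pw)
{
    int new_idx = slot_add(slots, new_pw);

    if (new_idx < 0) {
        return -1;
    }
    slot_erase(slots, old_idx);
    return new_idx;
}
```

In this model, starting with the password in slot 0, one call to change_password() moves it to slot 1, and a second call moves it back to slot 0, which is the "used twice" case.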

> 
> > +# @options:   Driver specific options
> > +#
> > +
> > +# Since: 4.2
> > +##
> > +{ 'command': 'x-blockdev-update-encryption',
> > +  'data': { 'node-name' : 'str',
> > +'*force' : 'bool',
> > +'options': 'QCryptoEncryptionSetupOptions' } }
> > +
> > +##
> > +# @x-blockdev-erase-encryption:
> > +#
> > +# Erase the encryption keys for an encrypted block device
> > +#
> > +# @node-name:Name of the blockdev to operate on
> > +# @force: Disable safety checks (use with care)
> 
> Likewise.
1. Erasing a slot which is already marked as
erased. Mostly harmless, but pointless as well.

2. Erasing the last keyslot. This irreversibly destroys
any ability to read the data from the device,
unless a backup of the header and the key material was
made beforehand. It can still be useful when it is desired to
erase the data fast.


> 
> > +# @options:   Driver specific options
> > +#
> > +# Returns: @QCryptoKeyManageResult
> 
> Doc comment claims the command returns something, even though it
> doesn't.  Please fix.  Sadly, the doc generator fails to flag that.
This is a leftover; fixed now, although most likely this interface will die.
I was initially planning to return
information on which slot was allocated when the user left that
decision to the driver.

> 
> > +#
> > +# Since: 4.2
> > +##
> > +{ 'command': 'x-blockdev-erase-encryption',
> > +  'data': { 'node-name' : 'str',
> > +'*force' : 'bool',
> > +'options': 'QCryptoEncryptionSetupOptions' } }
> > diff --git a/qapi/crypto.json b/qapi/crypto.json
> > index b2a4cff683..69e8b086db 100644
> > --- a/qapi/crypto.json
> > +++ b/qapi/crypto.json
> > @@ -309,3 +309,29 @@
> >'base': 'QCryptoBlockInfoBase',
> >'discriminator': 'format',
> >'data': { 'luks': 'QCryptoBlockInfoLUKS' } }
> > +
> > +
> > +##
> > +# @QCryptoEncryptionSetupOptions:
> > +#
> > +# Driver specific options for encryption key management.
> 
> Specific to which driver?

This is the same issue of not being able to detect a union.

I was planning to have a union here, where we could add
the driver-specific options if we need another crypto driver.
However, since I discovered that a union needs the user to pass the driver name,
I just placed it in a struct.

So this struct is supposed to represent driver-specific options, but
currently contains only LUKS options.

> 
> > +#
> > +# @key-secret: the ID of a QCryptoSecret object providing the password
> > +#  to add or to erase (optional for erase)
> > +#
> > +# @old-key-secret: the ID of a QCryptoSecret object providing the password
> > +#  that can currently unlock the image
> > +#
> > +# @slot: Key slot to update/erase
> > +#(optional, for update will select a free slot,
> > +#for erase will 

Re: [Qemu-devel] [Qemu-block] [PATCH 0/3] block/io_uring: fix EINTR and resubmit short reads

2019-08-21 Thread John Snow



On 7/15/19 4:19 PM, Stefan Hajnoczi wrote:
> Short reads are possible with cache=writeback (see Patch 3 for details).
> Handle this by resubmitting requests until the read is completed.
> 
> Patch 1 adds trace events useful for debugging io_uring.
> 
> Patch 2 fixes EINTR.  This lays the groundwork for resubmitting requests in
> Patch 3.
> 
> Aarushi: Feel free to squash this into your patch series if you are happy with
> the code, I don't mind if the authorship information is lost.  After applying
> these patches I can successfully boot a Fedora 30 guest qcow2 file with
> cache=writeback.
> 
> Based-on: <20190610134905.22294-1-mehta.aar...@gmail.com>
> 
> Stefan Hajnoczi (3):
>   block/io_uring: add submission and completion trace events
>   block/io_uring: fix EINTR request resubmission
>   block/io_uring: resubmit short buffered reads
> 
>  block/io_uring.c   | 136 ++---
>  block/trace-events |   6 +-
>  2 files changed, 108 insertions(+), 34 deletions(-)
> 

Since this is over the 30 days mark, I'm going to assume this WAS
squashed into Aarushi's patchset, and it's safe to drop this from the
review queue for now?

--js



Re: [Qemu-devel] [PATCH 00/13] RFC: luks/encrypted qcow2 key management

2019-08-21 Thread Maxim Levitsky
On Tue, 2019-08-20 at 19:59 +0200, Max Reitz wrote:
> On 14.08.19 22:22, Maxim Levitsky wrote:
> 
> [...]
> 
> > Testing. This was lightly tested with manual testing and with few iotests 
> > that I prepared.
> > I haven't yet tested fully the write sharing behavior, nor did I run the 
> > whole iotests
> > suite to see if this code causes some regressions. Since I will need 
> > probably
> > to rewrite some chunks of it to change to 'amend' interface, I decided to 
> > post it now,
> > to see if you have other ideas/comments to add.
> 
> I can see that, because half of the qcow2 tests that contain the string
> “secret” break:
> 
> Failures: 087 134 158 178 188 198 206
> Failed 7 of 13 tests
> 
> Also, 210 when run with -luks.
> 
> Some are just due to different test outputs (because you change
> _filter_img_create to filter some encrypt.* parameters), but some of
> them are due to aborts.  All of them look like different kinds of heap
> corruptions.
> 
> 
> I can fully understand not running all iotests (because only the
> maintainers do that before pull requests), but just running the iotests
> that immediately concern a series seems prudent to me (unless the series
> is trivial).
> 
> (Just “(cd tests/qemu-iotests && grep -l secret ???)” tells you which
> tests to run that may concern themselves with qcow2 encryption, for
> example.)
> 
> 
> So I suppose I’ll stop reviewing the series in detail and just give a
> more cursory glance from now on.

Sorry about that! I posted this as an RFC. The reason it is mostly done, as
opposed to a typical RFC which might not even contain any code, is that for
most of the time I was sure that the API for this was straightforward and
wouldn't need any significant discussion. Then at the last minute, after I
discussed with Kevin on IRC one obscure case of block backend permissions that
was failing, he told me about the amend interface.
Next time, I guess, when a new API is involved, I will always post an API RFC
first and then start the implementation.

I fixed both issues that the iotests uncovered locally; now all luks and most
qcow2 tests pass. (118 and 194 sometimes fail with qcow2, and this happens
regardless of my patches; the same goes for 162, which now seems to fail
always, also regardless of my patches.
My git head is from after the merge window opened, so probably some bugs were
added, and maybe already fixed.)


The first issue was in 'qcrypto-luks: implement the encryption key management',
where I accidentally stored the name of the secret without strdup'ing it in the
create flow, so I got a double free,
which indeed caused the heap corruptions you have seen.

Basically this line:
luks->secret = options->u.luks.key_secret;
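
For illustration only, the ownership problem behind that line can be sketched outside QEMU as follows. The structs and the functions dup_string, luks_state_init and luks_state_cleanup are hypothetical stand-ins, not the real QEMU types; in QEMU itself g_strdup() would be the natural call instead of the hand-written copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the create options and the driver state. */
struct create_opts {
    char *key_secret; /* owned and later freed by the options object */
};

struct luks_state {
    char *secret; /* owned and later freed by the state object */
};

/* Return a heap copy of s, or NULL if s is NULL or allocation fails. */
static char *dup_string(const char *s)
{
    size_t n;
    char *copy;

    if (s == NULL) {
        return NULL;
    }
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

static void luks_state_init(struct luks_state *luks,
                            const struct create_opts *opts)
{
    assert(luks != NULL && opts != NULL);
    /*
     * Buggy form: luks->secret = opts->key_secret;
     * That leaves two owners for one buffer, so both cleanup paths
     * free it. Taking a private copy gives each owner its own buffer.
     */
    luks->secret = dup_string(opts->key_secret);
}

static void luks_state_cleanup(struct luks_state *luks)
{
    free(luks->secret);
    luks->secret = NULL;
}
```

With the copy in place, freeing the state and freeing the options touch different allocations, so the order of the two cleanups no longer matters.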

The second issue, as you mention, is indeed the change I made to the filters. Do
you agree with that change, btw?
If you ask me, I would even change the filter further and filter all the image
options from the qemu command line, since these are test inputs anyway.
This allowed me to have the same test for both luks and luks-encrypted qcow2
images.

Also, I didn't even expect you to run the iotests for me, but
rather just wanted general RFC-level feedback on the whole thing; that is why
I even mentioned that I didn't run them.
So sorry for the trouble I caused!

Btw, I don't agree with you that only maintainers need to run all the iotests
fully.
I think that the patch submitter should run all the tests that he can, to catch
as many problems as he can,
_unless_ of course this is an RFC.


Best regards,
Thanks for the review,
Sorry again for the trouble,

Maxim Levitsky






Re: [Qemu-devel] [PATCH v1 2/4] s390x/tcg: Introduce probe_read_access()

2019-08-21 Thread David Hildenbrand
On 21.08.19 22:38, Richard Henderson wrote:
> On 8/21/19 12:36 PM, David Hildenbrand wrote:
>>>> There are certain cases where we can't get access to the raw host
>>>> page. Namely, cpu watchpoints, LAP, NODIRTY. In summary: this won't
>>>> as you describe. (my first approach did exactly this)
>>>
>>> NODIRTY and LAP are automatically handled via probe_write
>>> faulting instead of returning the address.  I think there
>>> may be a bug in probe_write at present not checking
>>
>> For LAP pages we immediately set TLB_INVALID_MASK again, to trigger a
>> new fault on the next write access (only). The could be handled in
>> tlb_vaddr_to_host(), simply returning the address to the page after
>> trying to fill the tlb and succeeding (I implemented that, that's the
>> easy part).
> 
> Yes.
> 
>> TLB_NOTDIRTY and TLB_MMIO are the real issue. We don't want to refault,
>> we want to treat that memory like IO memory and route it via
>> MemoryRegionOps() - e.g., watch_mem_ops() in qemu/exec.c. So it's not a
>> fault but IO memory.
> 
> Watchpoints are not *really* i/o memory (unless of course it's a watchpoint on
> a device, which is a different matter), that's merely how we've chosen to
> implement it to force the memory operation through the slow path.  We can, and
> probably should, implement this differently wrt probe_write.

Yes, I agree wrt probe_write.

> 
> Real MMIO can only fault via cc->transaction_failed(), for some sort of bus
> error.  Which s390x does not currently implement.  In the meantime, a
> probe_write proves that the page is at least mapped correctly, so we have
> eliminated the normal access fault.

Yes, and that's all we care about on s390x.

> 
> NOTDIRTY cannot fault at all.  The associated rcu critical section is ugly
> enough to make me not want to do anything except continue to go through the
> regular MMIO path.
> 
> In any case, so long as we eliminate *access* faults by probing the page 
> table,
> then falling back to the byte-by-byte loop is, AFAICS, sufficient to implement
> the instructions correctly.

"In any case, so long as we eliminate *access* faults by probing the
page table" - that's what I'm doing in this patch (and even more correct
in the prototype patch I shared), no? (besides the watchpoint madness below)

"falling back to the byte-by-byte loop is, AFAICS, sufficient"

I don't think this is sufficient. E.g., LAP protected pages
(PAGE_WRITE_INV which immediately requires a new MMU walk on the next
access) will trigger a new MMU walk on every byte access (that's why I
chose to pre-translate in my prototype). If another CPU modified the
page tables in between, we could suddenly get a fault - although we
checked early. What am I missing?

> 
>> probe_write() performs the MMU translation. If that succeeds, there is
>> no fault. If there are watchpoints, the memory is treated like IO and
>> memory access is rerouted. I think this works as designed.
> 
> Well, if BP_STOP_BEFORE_ACCESS, then we want to raise a debug exception before
> any changes are made.  If !BP_STOP_BEFORE_ACCESS, then we longjmp back to the
> main cpu loop and single-step the current instruction.

I see that we use BP_STOP_BEFORE_ACCESS for PER (Program Event
Recording) on s390x. I don't think that's correct. We want to get
notified after the values were changed.

"A storage-alteration event occurs whenever a CPU,
by using a logical or virtual address, makes a store
access without an access exception to the storage
area designated by control registers 10 and 11. ..."

"For a PER instruction-fetching nullification event, the
unit of operation is nullified. For other PER events,
the unit of operation is completed"

Oh man, why is everything I take a look at broken.

> 
> In the latter case, if the instruction has had any side effects prior to the
> longjmp, they will be re-done when we re-start the current instruction.
> 
> To me this seems like a rather large bug in our implementation of watchpoints,
> as it only really works properly for simple load/store/load-op-store type
> instructions.  Anything that works on many addresses and doesn't delay side
> effects until all accesses are complete will Do The Wrong Thing.
> 
> The fix, AFAICS, is for probe_write to call check_watchpoint(), so that we
> take the debug exit early.

Indeed. I see what you mean now. (I was ignoring the "before access"
because I was assuming we don't need it on s390x)

probe_write() would have to check for all BP_STOP_BEFORE_ACCESS watchpoints.

> 
>> You mean two pages I assume. Yeah, I could certainly simplify the
>> prototype patch I have here quite a lot. 2 pages should be enough for
>> everybody ;)
> 
> Heh.  But, seriously, TARGET_PAGE_SIZE bytes is enough at one go, without
> releasing control so that interrupts etc may be recognized.

Yes, that's what I mean, TARGET_PAGE_SIZE, but eventually crossing a
page boundary. The longer I stare at the MVCL code, the more broken it
is. There are more nice things buried in the PoP. MVCL 

[Qemu-devel] [Bug 1840922] Re: qemu-arm for cortex-m33 aborts with unhandled CPU exception 0x8

2019-08-21 Thread Richard Henderson
This happens because we're applying a loose test for the v8m magic
exception return address.

There are two possible fixes for this, and perhaps we should
apply both of them:

(1) Unset ARM_FEATURE_M_SECURITY for arm-linux-user.
This would disable the FNC_RETURN_MIN_MAGIC test,
which, unlike EXC_RETURN_MIN_MAGIC, is not protected
by a condition that linux-user cannot satisfy (Handler mode).

(2) Limit the address space to 0x7ff, the normal end of
write-back cached ram.  Since M-profile doesn't have an MMU,
this would make linux-user addresses more like what we'd see
running the same test under system mode.

** Patch added: "Hack to work around the problem; not a proper solution."
   
https://bugs.launchpad.net/qemu/+bug/1840922/+attachment/5283854/+files/z.patch

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1840922

Title:
  qemu-arm for cortex-m33 aborts with unhandled CPU exception 0x8

Status in QEMU:
  Confirmed

Bug description:
  Hi,

  While experimenting with running the GCC testsuite with cortex-m33 as target 
(to exercise v8-m code), I came across this failure:
  qemu: unhandled CPU exception 0x8 - aborting
  R00=fffeaf58 R01=fffeaf58 R02= R03=fffeaf5d
  R04=fffeaf5c R05=fffeaf9c R06= R07=fffeaf80
  R08= R09= R10=00019dbc R11=
  R12=00f0 R13=fffeaf58 R14=81f3 R15=fffeaf5c
  XPSR=6100 -ZC- T NS priv-thread
  qemu:handle_cpu_signal received signal outside vCPU context @ pc=0x6033c908

  I'm using arm-eabi-gcc, so it targets bare-metal, not linux.

  The testcase is GCC's
  gcc/testsuite/gcc.c-torture/execute/2822-1.c; it works when
  compiled at -O2, but crashes when compiled at -Os. The test uses
  nested functions, so it creates a trampoline on the stack, whose
  address may be a problem. But since the stack address seems to be in
  the same range in the O2 and Os cases, it's not that clear.

  I'm attaching the C source, asm, binary executables and qemu traces
  with in_asm,cpu.

  I execute the binaries with:
  qemu-arm --cpu cortex-m33  ./2822-1.exe.Os

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1840922/+subscriptions



[Qemu-devel] [Bug 1840922] Re: qemu-arm for cortex-m33 aborts with unhandled CPU exception 0x8

2019-08-21 Thread Richard Henderson
** Changed in: qemu
   Status: New => Confirmed




Re: [Qemu-devel] QEMU bitmap backup usability FAQ

2019-08-21 Thread John Snow



On 8/21/19 10:21 AM, Vladimir Sementsov-Ogievskiy wrote:
> [CC Nikolay]
> 
> 21.08.2019 1:25, John Snow wrote:
>> Hi, downstream here at Red Hat I've been fielding some questions about
>> the usability and feature readiness of Bitmaps (and related features) in
>> QEMU.
>>
>> Here are some questions I answered internally that I am copying to the
>> list for two reasons:
>>
>> (1) To make sure my answers are actually correct, and
>> (2) To share this pseudo-reference with the community at large.
>>
>> This is long, and mostly for reference. There's a summary at the bottom
>> with some todo items and observations about the usability of the feature
>> as it exists in QEMU.
>>
>> Before too long, I intend to send a more summarized "roadmap" mail which
>> details all of the current and remaining work to be done in and around
>> the bitmaps feature in QEMU.
>>
>>
>> Questions:
>>
>>> "What format(s) is/are required for this functionality?"
>>
>>  From the QEMU API, any format can be used to create and author
>> incremental backups. The only known format limitations are:
>>
>> 1. Persistent bitmaps cannot be created on any format except qcow2,
>> although there are hooks to add support to other formats at a later date
>> if desired.
>>
>> DANGER CAVEAT #1: Adding bitmaps to QEMU by default creates transient
>> bitmaps instead of persistent ones.
>>
>> Possible TODO: Allow users to 'upgrade' transient bitmaps to persistent
>> ones in case they made a mistake.
> 
> I doubt, as in my opinion real users of Qemu are not people but libvirt, which
> should never make such mistake.
> 

Right, that's largely been the consensus here; but there is some concern
that libvirt might not be the only user of QEMU, so I felt it was worth
documenting some of the crucial moments where it was "easy" to get it wrong.

I think, like it or not, the API that QEMU presents has to be considered
on its own without libvirt because it's not a given that libvirt will
forever and always be the only user of QEMU.

I do think that any problems of this kind that can be solved in libvirt
are not immediate, crucial action items. libvirt WILL be the major user
of these features.

However, try as we might, releasing a set of primitive operations that
offer 998 ways to corrupt your data and 2 ways to manage it correctly
is going to provoke some questions from people who are trying to work
with that API, including from libvirt developers.

It might be the conclusion that it's libvirt's job to safeguard the user
from themselves, but we at least need to present consistent and clear
information about the way we expect/anticipate people to use the APIs,
because people DO keep asking me about several of these issues and the
usability problems they perceive with the QEMU API.

So this thread was largely an attempt to explore what some "solutions"
to perceived problems look like, mostly to come to the conclusion that
the actual "must-haves" list in QEMU is not very long compared to the
"nice-to-haves?" list.

>>
>>
>> 2. When using push backups (blockdev-backup, drive-backup), you may use
>> any format as a target format.
>>
>> DANGER CAVEAT #2: without backing file and/or filesystem-less sparse
>> support, these images will be unusable.
> 
> You mean incremental backups of course, as the whole document is about 
> bitmaps.
> 

Ah, yes, incremental push backups. Full backups are of course not a
problem. :)

>>
>> EXAMPLE: Backing up to a raw file loses allocation information, so we
>> can no longer distinguish between zeroes and unallocated regions. The
>> cluster size is also lost. This file will not be usable without
>> additional metadata recorded elsewhere.*
>>
>> (* This is complicated, but it is in theory possible to do a push backup
>> to e.g. an NBD target with custom server code that saves allocation
>> information to a metadata file, which would allow you to reconstruct
>> backups. For instance, recording in a .json file which extents were
>> written out would allow you to -- with a custom binary -- write this
>> information on top of a base file to reconstruct a backup.)
>>
>>
>> 3. Any format can be used for either shared storage or live storage
>> migrations. There are TWO distinct mechanisms for migrating bitmaps:
>>
>> A) The bitmap is flushed to storage and re-opened on the destination.
>> This is only supported for qcow2 and shared-storage migrations.
> 
> cons: flushing/reopening is done during migration downtime, so if you have
> a lot of bitmap data (for example, 64k granulared bitmap for 16tb disk is
> ~30MB, and there may be several bitmaps) downtime will become long.
> 

Worth documenting the drawback, yes.
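
For reference, the ~30MB figure quoted above follows from one bit per
granularity-sized chunk. A minimal sketch of that arithmetic, assuming only
the raw bit array is counted and no serialization overhead (bitmap_bytes()
is a hypothetical helper for illustration, not a QEMU function):

```c
#include <stdint.h>

/*
 * Hypothetical helper: bytes needed for a dirty bitmap that tracks
 * disk_bytes of storage with one bit per granularity bytes.
 * Both divisions round up so a partial chunk still gets a bit.
 */
static uint64_t bitmap_bytes(uint64_t disk_bytes, uint64_t granularity)
{
    uint64_t bits = (disk_bytes + granularity - 1) / granularity;

    return (bits + 7) / 8;
}
```

With a 16 TiB disk and 64 KiB granularity this gives 2^28 bits, i.e. 32 MiB
per bitmap, which matches the estimate and is multiplied by the number of
bitmaps that have to be flushed during downtime.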

>>
>> B) The bitmap is live-migrated to the destination. This is supported for
>> any format and can be used for either shared storage or live storage
>> migrations.
>>
>> DANGER CAVEAT #3: The second bitmap migration technique there is an
>> optional migration capability that must be enabled explicitly.
>> Otherwise, some 

Re: [Qemu-devel] [PATCH 0/2] tests/acceptance: Update MIPS Malta ssh test

2019-08-21 Thread Eduardo Habkost
On Wed, Aug 21, 2019 at 10:27:11PM +0200, Aleksandar Markovic wrote:
> 02.08.2019. 17.37, "Aleksandar Markovic" wrote:
> >
> > From: Aleksandar Markovic 
> >
> > This little series improves linux_ssh_mips_malta.py, both in the sense
> > of code organization and in the sense of quantity of executed tests.
> >
> 
> Hello, all.
> 
> I am going to send a new version in a few days, and I have a question for
> test team:
> 
> Currently, the outcome of the script execution is either PASS:1 FAIL:0 or
> PASS:0 FAIL:1. But the test actually consists of several subtests. Is there
> any way that this single Python script considers these subtests as separate
> tests (test cases), reporting something like PASS:12 FAIL:7? If yes, what
> would be the best way to achieve that?

If you are talking about each test_*() method, they are already
treated like separate tests.  If you mean treating each
ssh_command_output_contains() call as a separate test, this might
be difficult.

Cleber, is there something already available in the Avocado API
that would help us report more fine-grained results inside each
test case?


> 
> Thanks in advance,
> Aleksandar
> 
> > Aleksandar Markovic (2):
> >   tests/acceptance: Refactor and improve reporting in
> > linux_ssh_mips_malta.py
> >   tests/acceptance: Add new test cases in linux_ssh_mips_malta.py
> >
> >  tests/acceptance/linux_ssh_mips_malta.py | 81
> ++--
> >  1 file changed, 66 insertions(+), 15 deletions(-)
> >
> > --
> > 2.7.4
> >
> >

-- 
Eduardo



Re: [Qemu-devel] [PATCH] Revert "i386: correct cpu_x86_cpuid(0xd)"

2019-08-21 Thread Eduardo Habkost
On Wed, Aug 21, 2019 at 07:54:17PM +0800, owen...@ucloud.cn wrote:
> It is CentOS 6.3 with kernel version 2.6.32-279. Actually all CentOS 6 
> releases have this issue.

We stopped supporting CentOS 6 in July 2016 (2 years after CentOS
7 was released).  Be aware that even if we work around that
specific bug, there are no guarantees that QEMU will still build
on a CentOS 6 host in the future.

That said, I probably wouldn't reject a patch that works around
that CentOS 6 bug, if it's conditional on kvm_enabled() and has a
comment explaining why the workaround exists.

> 
> 
> 
> owen...@ucloud.cn
>  
> From: Eduardo Habkost
> Date: 2019-08-21 19:19
> To: owen...@ucloud.cn
> CC: qemu-devel
> Subject: Re: Re: [Qemu-devel] [PATCH] Revert "i386: correct 
> cpu_x86_cpuid(0xd)"
> On Wed, Aug 21, 2019 at 11:04:46AM +0800, owen...@ucloud.cn wrote:
> > Thanks for your reply, we have some hosts running with legacy kernel, 
> > difficult to upgrade, and i want to run the latest qemu.
> > Does QEMU support running with legacy kernel(kvm) in design?
>  
> For KVM, QEMU requires Linux 4.5 or newer.  See "System
> requirements" / "KVM kernel module" section on qemu-doc.  We also
> aim to support the latest version of Linux distributions with
> long term support (e.g. RHEL, Debian, Ubuntu LTS, SLES).
>  
> Do you have more details on the kernel you are using?  Is it
> built and distributed by a third party?
>  
>  
> > 
> > 
> > 
> > owen...@ucloud.cn
> >  
> > From: Eduardo Habkost
> > Date: 2019-08-21 05:23
> > To: Bingsong Si
> > CC: qemu-devel
> > Subject: Re: [Qemu-devel] [PATCH] Revert "i386: correct cpu_x86_cpuid(0xd)"
> > On Mon, Aug 19, 2019 at 06:09:24PM +0800, Bingsong Si wrote:
> > > This reverts commit de2e68c902f7b6e438b0fa3cfedd74a06a20704f.
> > > 
> > > Initial value of env->xcr0 == 0, then CPUID(EAX=0xd,ECX=0).EBX == 0, 
> > > after kvm
> > > upstream commit 412a3c41, It is ok.
> > > On host before commit 412a3c41, some legacy guest, i.e. CentOS 6, get
> > > xstate_size == 0, will crash the guest.
> > > 
> > > Signed-off-by: Bingsong Si 
> >  
> > cpu_x86_cpuid() is also used by TCG, and needs to return the
> > correct data depending on xcr0.  If you want to work around a KVM
> > bug by ignoring xcr0, it needs to be conditional on
> > kvm_enabled().
> >  
> > But even if you make this conditional on kvm_enabled(), I
> > don't understand why QEMU would need a workaround for a KVM bug
> > that was fixed more than 4 years ago.
> >  
> > > ---
> > >  target/i386/cpu.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > > index ff65e11008..69562e21ed 100644
> > > --- a/target/i386/cpu.c
> > > +++ b/target/i386/cpu.c
> > > @@ -4416,7 +4416,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t 
> > > index, uint32_t count,
> > >  *ecx = xsave_area_size(x86_cpu_xsave_components(cpu));
> > >  *eax = env->features[FEAT_XSAVE_COMP_LO];
> > >  *edx = env->features[FEAT_XSAVE_COMP_HI];
> > > -*ebx = xsave_area_size(env->xcr0);
> > > +*ebx = *ecx;
> > >  } else if (count == 1) {
> > >  *eax = env->features[FEAT_XSAVE];
> > >  } else if (count < ARRAY_SIZE(x86_ext_save_areas)) {
> > > -- 
> > > 2.22.0
> > > 
> > > 
> >  
> > -- 
> > Eduardo
>  
> -- 
> Eduardo

-- 
Eduardo



Re: [Qemu-devel] [PATCH v1 2/4] s390x/tcg: Introduce probe_read_access()

2019-08-21 Thread Richard Henderson
On 8/21/19 12:36 PM, David Hildenbrand wrote:
>>> There are certain cases where we can't get access to the raw host
>>> page. Namely, cpu watchpoints, LAP, NODIRTY. In summary: this won't
>>> work as you describe. (my first approach did exactly this)
>>
>> NODIRTY and LAP are automatically handled via probe_write
>> faulting instead of returning the address.  I think there
>> may be a bug in probe_write at present not checking
> 
> For LAP pages we immediately set TLB_INVALID_MASK again, to trigger a
> new fault on the next write access (only). That could be handled in
> tlb_vaddr_to_host(), simply returning the address to the page after
> trying to fill the tlb and succeeding (I implemented that, that's the
> easy part).

Yes.

> TLB_NOTDIRTY and TLB_MMIO are the real issue. We don't want to refault,
> we want to treat that memory like IO memory and route it via
> MemoryRegionOps() - e.g., watch_mem_ops() in qemu/exec.c. So it's not a
> fault but IO memory.

Watchpoints are not *really* i/o memory (unless of course it's a watchpoint on
a device, which is a different matter), that's merely how we've chosen to
implement it to force the memory operation through the slow path.  We can, and
probably should, implement this differently wrt probe_write.

Real MMIO can only fault via cc->transaction_failed(), for some sort of bus
error.  Which s390x does not currently implement.  In the meantime, a
probe_write proves that the page is at least mapped correctly, so we have
eliminated the normal access fault.

NOTDIRTY cannot fault at all.  The associated rcu critical section is ugly
enough to make me not want to do anything except continue to go through the
regular MMIO path.

In any case, so long as we eliminate *access* faults by probing the page table,
then falling back to the byte-by-byte loop is, AFAICS, sufficient to implement
the instructions correctly.

> probe_write() performs the MMU translation. If that succeeds, there is
> no fault. If there are watchpoints, the memory is treated like IO and
> memory access is rerouted. I think this works as designed.

Well, if BP_STOP_BEFORE_ACCESS, then we want to raise a debug exception before
any changes are made.  If !BP_STOP_BEFORE_ACCESS, then we longjmp back to the
main cpu loop and single-step the current instruction.

In the latter case, if the instruction has had any side effects prior to the
longjmp, they will be re-done when we re-start the current instruction.

To me this seems like a rather large bug in our implementation of watchpoints,
as it only really works properly for simple load/store/load-op-store type
instructions.  Anything that works on many addresses and doesn't delay side
effects until all accesses are complete will Do The Wrong Thing.

The fix, AFAICS, is for probe_write to call check_watchpoint(), so that we
take the debug exit early.
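
To make the restart problem concrete, here is a toy model of it. This is
only a sketch under stated assumptions: none of these names exist in QEMU,
the "watchpoint" is a single address, and the "debug exit" is modelled as a
false return rather than a longjmp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 16

/* Toy CPU state; purely illustrative, not a QEMU structure. */
struct toy_cpu {
    uint8_t mem[MEM_SIZE];
    size_t watch_addr;  /* watched address, or MEM_SIZE for none */
    unsigned writes;    /* bytes stored so far: the visible side effect */
};

static bool range_ok(size_t addr, size_t len)
{
    return addr <= MEM_SIZE && len <= MEM_SIZE - addr;
}

/*
 * Store-as-you-go: the watchpoint is noticed in the middle of the
 * operation, after some bytes were already written. Restarting the
 * instruction from the top would store those bytes a second time.
 */
static bool fill_naive(struct toy_cpu *cpu, size_t addr, size_t len,
                       uint8_t val)
{
    if (!range_ok(addr, len)) {
        return false;
    }
    for (size_t i = 0; i < len; i++) {
        if (addr + i == cpu->watch_addr) {
            return false;   /* "debug exit" with partial side effects */
        }
        cpu->mem[addr + i] = val;
        cpu->writes++;
    }
    return true;
}

/*
 * Probe-first: every address is checked before any store, so the
 * "debug exit" is taken while the instruction has no side effects.
 */
static bool fill_probed(struct toy_cpu *cpu, size_t addr, size_t len,
                        uint8_t val)
{
    if (!range_ok(addr, len)) {
        return false;
    }
    for (size_t i = 0; i < len; i++) {
        if (addr + i == cpu->watch_addr) {
            return false;   /* nothing has been written yet */
        }
    }
    for (size_t i = 0; i < len; i++) {
        cpu->mem[addr + i] = val;
        cpu->writes++;
    }
    return true;
}
```

The second variant corresponds to checking the watchpoint as part of the
probe, so re-executing the instruction afterwards cannot repeat any store.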

> You mean two pages I assume. Yeah, I could certainly simplify the
> prototype patch I have here quite a lot. 2 pages should be enough for
> everybody ;)

Heh.  But, seriously, TARGET_PAGE_SIZE bytes is enough at one go, without
releasing control so that interrupts etc may be recognized.


r~



Re: [Qemu-devel] [PATCH] linux-user: hijack open() for thread directories

2019-08-21 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20190821201921.106902-1-...@google.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [PATCH] linux-user: hijack open() for thread directories
Message-id: 20190821201921.106902-1-...@google.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20190821201921.106902-1-...@google.com -> 
patchew/20190821201921.106902-1-...@google.com
Submodule 'capstone' (https://git.qemu.org/git/capstone.git) registered for 
path 'capstone'
Submodule 'dtc' (https://git.qemu.org/git/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (https://git.qemu.org/git/QemuMacDrivers.git) 
registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (https://git.qemu.org/git/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/edk2' (https://git.qemu.org/git/edk2.git) registered for path 
'roms/edk2'
Submodule 'roms/ipxe' (https://git.qemu.org/git/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (https://git.qemu.org/git/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (https://git.qemu.org/git/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/opensbi' (https://git.qemu.org/git/opensbi.git) registered for 
path 'roms/opensbi'
Submodule 'roms/qemu-palcode' (https://git.qemu.org/git/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (https://git.qemu.org/git/seabios.git/) registered for 
path 'roms/seabios'
Submodule 'roms/seabios-hppa' (https://git.qemu.org/git/seabios-hppa.git) 
registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (https://git.qemu.org/git/sgabios.git) registered for 
path 'roms/sgabios'
Submodule 'roms/skiboot' (https://git.qemu.org/git/skiboot.git) registered for 
path 'roms/skiboot'
Submodule 'roms/u-boot' (https://git.qemu.org/git/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/u-boot-sam460ex' (https://git.qemu.org/git/u-boot-sam460ex.git) 
registered for path 'roms/u-boot-sam460ex'
Submodule 'slirp' (https://git.qemu.org/git/libslirp.git) registered for path 
'slirp'
Submodule 'tests/fp/berkeley-softfloat-3' 
(https://git.qemu.org/git/berkeley-softfloat-3.git) registered for path 
'tests/fp/berkeley-softfloat-3'
Submodule 'tests/fp/berkeley-testfloat-3' 
(https://git.qemu.org/git/berkeley-testfloat-3.git) registered for path 
'tests/fp/berkeley-testfloat-3'
Submodule 'ui/keycodemapdb' (https://git.qemu.org/git/keycodemapdb.git) 
registered for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out 
'22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '88f18909db731a627456f26d779445f84e449536'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out 
'90c488d5f4a407342247b9ea869df1c2d9c8e266'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'7bfe584e321946771692711ff83ad2b5850daca7'
Cloning into 'roms/edk2'...
Submodule path 'roms/edk2': checked out 
'20d2e5a125e34fc8501026613a71549b2a1a3e54'
Submodule 'SoftFloat' (https://github.com/ucb-bar/berkeley-softfloat-3.git) 
registered for path 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'
Submodule 'CryptoPkg/Library/OpensslLib/openssl' 
(https://github.com/openssl/openssl) registered for path 
'CryptoPkg/Library/OpensslLib/openssl'
Cloning into 'ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3'...
Submodule path 'roms/edk2/ArmPkg/Library/ArmSoftFloatLib/berkeley-softfloat-3': 
checked out 'b64af41c3276f97f0e181920400ee056b9c88037'
Cloning into 'CryptoPkg/Library/OpensslLib/openssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl': checked out 
'50eaac9f3337667259de725451f201e784599687'
Submodule 'boringssl' (https://boringssl.googlesource.com/boringssl) registered 
for path 'boringssl'
Submodule 'krb5' (https://github.com/krb5/krb5) registered for path 'krb5'
Submodule 'pyca.cryptography' (https://github.com/pyca/cryptography.git) 
registered for path 'pyca-cryptography'
Cloning into 'boringssl'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/boringssl': 
checked out '2070f8ad9151dc8f3a73bffaa146b5e6937a583f'
Cloning into 'krb5'...
Submodule path 'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/krb5': checked 
out 'b9ad6c49505c96a088326b62a52568e3484f2168'
Cloning into 'pyca-cryptography'...
Submodule path 
'roms/edk2/CryptoPkg/Library/OpensslLib/openssl/pyca-cryptography': checked out 
'09403100de2f6f1cdd0d484dcb8e620f1c335c8f'
Cloning into 'roms/ipxe'...
Submodule path 

Re: [Qemu-devel] [PATCH v7 01/13] vfio: KABI for migration interface

2019-08-21 Thread Kirti Wankhede



On 7/23/2019 5:43 PM, Cornelia Huck wrote:
> On Tue, 16 Jul 2019 14:56:32 -0600
> Alex Williamson  wrote:
> 
>> On Tue, 9 Jul 2019 15:19:08 +0530
>> Kirti Wankhede  wrote:
> 
> I'm still a bit unsure about the device_state bit handling as well.
> 
>>> + * device_state: (read/write)
>>> + *  To indicate vendor driver the state VFIO device should be 
>>> transitioned
>>> + *  to. If device state transition fails, write on this field return 
>>> error.
> 
> Does 'device state transition fails' include 'the device state written
> was invalid'?
> 

Yes.

>>> + *  It consists of 3 bits:
>>> + *  - If bit 0 set, indicates _RUNNING state. When its reset, that 
>>> indicates
>>> + *_STOPPED state. When device is changed to _STOPPED, driver 
>>> should stop
>>> + *device before write() returns.
> 
> So _STOPPED is always !_RUNNING, regardless of which other bits are set?
>

Yes.

>>> + *  - If bit 1 set, indicates _SAVING state.
>>> + *  - If bit 2 set, indicates _RESUMING state.
>>> + *  _SAVING and _RESUMING set at the same time is invalid state.  
> 
> What about _RUNNING | _RESUMING -- does that make sense?
>

I think this will be a valid state in the postcopy case, though I'm not very sure.


>>
>> I think in the previous version there was a question of how we handle
>> yet-to-be-defined bits.  For instance, if we defined a
>> SUBTYPE_MIGRATIONv2 with the intention of making it backwards
>> compatible with this version, do we declare the undefined bits as
>> preserved so that the user should do a read-modify-write operation?
> 
> Or can we state that undefined bits are ignored, and may or may not
> be preserved, so that we can skip the read-modify-write requirement? v1
> and v2 can hopefully be distinguished in a different way.
> 

Updating comment in next version.

Thanks,
Kirti

> (...)
> 
>>> +struct vfio_device_migration_info {
>>> +__u32 device_state; /* VFIO device state */
>>> +#define VFIO_DEVICE_STATE_RUNNING   (1 << 0)
>>> +#define VFIO_DEVICE_STATE_SAVING    (1 << 1)
>>> +#define VFIO_DEVICE_STATE_RESUMING  (1 << 2)
>>> +#define VFIO_DEVICE_STATE_MASK  (VFIO_DEVICE_STATE_RUNNING | \
>>> + VFIO_DEVICE_STATE_SAVING | \
>>> + VFIO_DEVICE_STATE_RESUMING)  
>>
>> Yes, we have the mask in here now, but no mention above how the user
>> should handle undefined bits.  Thanks,
>>
>> Alex
>>
>>> +#define VFIO_DEVICE_STATE_INVALID   (VFIO_DEVICE_STATE_SAVING | \
>>> + VFIO_DEVICE_STATE_RESUMING)
> 
> As mentioned above, does _RESUMING | _RUNNING make sense?
> 
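
The bit semantics discussed above can be sketched as a pair of checks. The
macros are copied from the quoted patch; the two helper functions are
hypothetical names used only for illustration, and masking off undefined
bits before checking is an assumption, since how to treat those bits is
exactly the open question in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define VFIO_DEVICE_STATE_RUNNING   (1 << 0)
#define VFIO_DEVICE_STATE_SAVING    (1 << 1)
#define VFIO_DEVICE_STATE_RESUMING  (1 << 2)
#define VFIO_DEVICE_STATE_MASK      (VFIO_DEVICE_STATE_RUNNING | \
                                     VFIO_DEVICE_STATE_SAVING | \
                                     VFIO_DEVICE_STATE_RESUMING)
#define VFIO_DEVICE_STATE_INVALID   (VFIO_DEVICE_STATE_SAVING | \
                                     VFIO_DEVICE_STATE_RESUMING)

/* Hypothetical: _STOPPED is simply "bit 0 clear", whatever else is set. */
static bool vfio_state_is_stopped(uint32_t state)
{
    return !(state & VFIO_DEVICE_STATE_RUNNING);
}

/*
 * Hypothetical: a state is rejected only when _SAVING and _RESUMING are
 * both set. Bits outside the mask are ignored here (an assumption).
 */
static bool vfio_state_is_valid(uint32_t state)
{
    state &= VFIO_DEVICE_STATE_MASK;
    return (state & VFIO_DEVICE_STATE_INVALID) != VFIO_DEVICE_STATE_INVALID;
}
```

Under this reading _RUNNING | _RESUMING passes the check, so whether it
should be allowed has to be decided separately.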



Re: [Qemu-devel] [PATCH v7 01/13] vfio: KABI for migration interface

2019-08-21 Thread Kirti Wankhede


Sorry for the delay.

On 7/17/2019 2:26 AM, Alex Williamson wrote:
> On Tue, 9 Jul 2019 15:19:08 +0530
> Kirti Wankhede  wrote:
> 
>> - Defined MIGRATION region type and sub-type.
>> - Used 3 bits to define VFIO device states.
>> Bit 0 => _RUNNING
>> Bit 1 => _SAVING
>> Bit 2 => _RESUMING
>> Combination of these bits defines VFIO device's state during migration
>> _STOPPED => All bits 0 indicates VFIO device stopped.
>> _RUNNING => Normal VFIO device running state.
>> _SAVING | _RUNNING => vCPUs are running, VFIO device is running but start
>>   saving state of device i.e. pre-copy state
>> _SAVING  => vCPUs are stopped, VFIO device should be stopped, and
>>   save device state, i.e. stop-n-copy state
>> _RESUMING => VFIO device resuming state.
>> _SAVING | _RESUMING => Invalid state if _SAVING and _RESUMING bits are 
>> set
>> - Defined vfio_device_migration_info structure which will be placed at 0th
>>   offset of migration region to get/set VFIO device related information.
>>   Defined members of structure and usage on read/write access:
>> * device_state: (read/write)
>> To convey VFIO device state to be transitioned to. Only 3 bits are 
>> used
>> as of now.
>> * pending bytes: (read only)
>> To get pending bytes yet to be migrated for VFIO device.
>> * data_offset: (read only)
>> To get data offset in migration from where data exist during _SAVING
>> and from where data should be written by user space application 
>> during
>>  _RESUMING state
>> * data_size: (read/write)
>> To get and set size of data copied in migration region during _SAVING
>> and _RESUMING state.
>> * start_pfn, page_size, total_pfns: (write only)
>> To get bitmap of dirty pages from vendor driver from given
>> start address for total_pfns.
>> * copied_pfns: (read only)
>> To get number of pfns bitmap copied in migration region.
>> Vendor driver should copy the bitmap with bits set only for
>> pages to be marked dirty in migration region. Vendor driver
>> should return 0 if there are 0 pages dirty in requested
>> range. Vendor driver should return -1 to mark all pages in the 
>> section
>> as dirty
>>
>> Migration region looks like:
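>>  ------------------------------------------------------------------
>> |vfio_device_migration_info|    data section                      |
>> |                          |     ///////////////////////////////  |
>>  ------------------------------------------------------------------
>>  ^                              ^                              ^
>>  offset 0-trapped part        data_offset                 data_size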
>>
>> Data section always follows the vfio_device_migration_info
>> structure in the region, so data_offset will always be non-0.
>> Offset from where data is copied is decided by kernel driver, data
>> section can be trapped or mapped depending on how kernel driver
>> defines data section. If mmapped, then data_offset should be page
>> aligned, whereas the initial section which contains the
>> vfio_device_migration_info structure might not end at an offset which
>> is page aligned.
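
One way a driver could pick a page-aligned data_offset is to round the end
of the header up to the next page boundary. This is a sketch only:
align_up() is a hypothetical helper, and the patch itself does not say how
the vendor driver chooses the offset.

```c
#include <stdint.h>

/*
 * Hypothetical helper: round value up to the next multiple of align.
 * align must be non-zero; it does not need to be a power of two.
 */
static uint64_t align_up(uint64_t value, uint64_t align)
{
    return (value + align - 1) / align * align;
}
```

For example, with 4096-byte pages a header that ends at byte 64 would put
the mmap-able data section at offset 4096.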
>>
>> Signed-off-by: Kirti Wankhede 
>> Reviewed-by: Neo Jia 
>> ---
>>  linux-headers/linux/vfio.h | 166 
>> +
>>  1 file changed, 166 insertions(+)
>>
>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
>> index 24f505199f83..6696a4600545 100644
>> --- a/linux-headers/linux/vfio.h
>> +++ b/linux-headers/linux/vfio.h
>> @@ -372,6 +372,172 @@ struct vfio_region_gfx_edid {
>>   */
>>  #define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD	(1)
>>  
>> +/* Migration region type and sub-type */
>> +#define VFIO_REGION_TYPE_MIGRATION  (2)
> 
> Region type #2 is already claimed by VFIO_REGION_TYPE_CCW, so this would
> need to be #3 or greater (we should have a reference table somewhere in
> this header as it gets easier to miss claimed entries as the sprawl
> grows).
> 
>> +#define VFIO_REGION_SUBTYPE_MIGRATION   (1)
>> +
>> +/**
>> + * Structure vfio_device_migration_info is placed at 0th offset of
>> + * VFIO_REGION_SUBTYPE_MIGRATION region to get/set VFIO device related 
>> migration
>> + * information. Field accesses from this structure are only supported at 
>> their
>> + * native width and alignment, otherwise should return error.
> 
> This seems like a good unit test, a userspace driver that performs
> unaligned accesses to this space.  I'm afraid the wording above might
> suggest that if there's no error it must work though, which might put
> us in sticky support situations.  Should we say:
> 
> s/should return error/the result is undefined and vendor drivers should
> return an error/
> 
>> + *
>> + * device_state: (read/write)
>> + *  To indicate vendor driver the state VFIO device should 

Re: [Qemu-devel] [PATCH 0/2] tests/acceptance: Update MIPS Malta ssh test

2019-08-21 Thread Aleksandar Markovic
02.08.2019. 17.37, "Aleksandar Markovic" wrote:
>
> From: Aleksandar Markovic 
>
> This little series improves linux_ssh_mips_malta.py, both in the sense
> of code organization and in the sense of quantity of executed tests.
>

Hello, all.

I am going to send a new version in a few days, and I have a question for
test team:

Currently, the outcome of the script execution is either PASS:1 FAIL:0 or
PASS:0 FAIL:1. But the test actually consists of several subtests. Is there
any way that this single Python script considers these subtests as separate
tests (test cases), reporting something like PASS:12 FAIL:7? If yes, what
would be the best way to achieve that?

Thanks in advance,
Aleksandar

> Aleksandar Markovic (2):
>   tests/acceptance: Refactor and improve reporting in
> linux_ssh_mips_malta.py
>   tests/acceptance: Add new test cases in linux_ssh_mips_malta.py
>
>  tests/acceptance/linux_ssh_mips_malta.py | 81
++--
>  1 file changed, 66 insertions(+), 15 deletions(-)
>
> --
> 2.7.4
>
>


Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

2019-08-21 Thread Kinney, Michael D
Paolo,

It makes sense to match real HW.  That puts us back to
the reset vector and handling the initial SMI at
3000:8000.  That is all workable from a FW implementation
perspective.  It looks like the only issue left is DMA.

DMA protection of memory ranges is a chipset feature.
For the current QEMU implementation, what ranges of
memory are guaranteed to be protected from DMA?  Is
it only A/B seg and TSEG?

Thanks,

Mike

> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Wednesday, August 21, 2019 10:40 AM
> To: Kinney, Michael D ;
> r...@edk2.groups.io; Yao, Jiewen 
> Cc: Alex Williamson ; Laszlo
> Ersek ; de...@edk2.groups.io; qemu
> devel list ; Igor Mammedov
> ; Chen, Yingwen
> ; Nakajima, Jun
> ; Boris Ostrovsky
> ; Joao Marcal Lemos Martins
> ; Phillip Goerl
> 
> Subject: Re: [edk2-rfc] [edk2-devel] CPU hotplug using
> SMM with QEMU+OVMF
> 
> On 21/08/19 19:25, Kinney, Michael D wrote:
> > Could we have an initial SMBASE that is within TSEG?
> >
> > If we bring in hot plug CPUs one at a time, then
> initial SMBASE in
> > TSEG can reprogram the SMBASE to the correct value for
> that CPU.
> >
> > Can we add a register to the hot plug controller that
> allows the BSP
> > to set the initial SMBASE value for a hot added CPU?
> The default can
> > be 3000:8000 for compatibility.
> >
> > Another idea is when the SMI handler runs for a hot
> add CPU event, the
> > SMM monarch programs the hot plug controller register
> with the SMBASE
> > to use for the CPU that is being added.  As each CPU
> is added, a
> > different SMBASE value can be programmed by the SMM
> Monarch.
> 
> Yes, all of these would work.  Again, I'm interested in
> having something that has a hope of being implemented in
> real hardware.
> 
> Another, far easier to implement possibility could be a
> lockable MSR (could be the existing
> MSR_SMM_FEATURE_CONTROL) that allows programming the
> SMBASE outside SMM.  It would be nice if such a bit
> could be defined by Intel.
> 
> Paolo



[Qemu-devel] [PATCH] linux-user: hijack open() for thread directories

2019-08-21 Thread Shu-Chun Weng via Qemu-devel
Besides /proc/self|<pid>, files under /proc/thread-self and
/proc/self|<pid>/task/<tid> also expose host information to the guest
program. This patch adds them to the hijack infrastructure. Note that
is_proc_myself() does not check if the <tid> matches the current thread
and is thus only suitable for procfs files that are identical for all
threads in the same process.

Behavior verified with guest program:

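#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

long main_thread_tid;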

long gettid() {
  return syscall(SYS_gettid);
}

void print_info(const char* cxt, const char* dir) {
  char buf[1024];
  FILE* fp;

  snprintf(buf, sizeof(buf), "%s/cmdline", dir);
  fp = fopen(buf, "r");

  if (fp == NULL) {
printf("%s: can't open %s\n", cxt, buf);
  } else {
fgets(buf, sizeof(buf), fp);
printf("%s %s cmd: %s\n", cxt, dir, buf);
fclose(fp);
  }

  snprintf(buf, sizeof(buf), "%s/maps", dir);
  fp = fopen(buf, "r");

  if (fp == NULL) {
printf("%s: can't open %s\n", cxt, buf);
  } else {
char seen[128][128];
int n = 0, is_new = 0;
while(fgets(buf, sizeof(buf), fp) != NULL) {
  const char* p = strrchr(buf, ' ');
  if (p == NULL || *(p + 1) == '\n') {
continue;
  }
  ++p;
  is_new = 1;
  for (int i = 0; i < n; ++i) {
if (strncmp(p, seen[i], sizeof(seen[i])) == 0) {
  is_new = 0;
  break;
}
  }
  if (is_new) {
printf("%s %s map: %s", cxt, dir, p);
if (n < 128) {
  strncpy(seen[n], p, sizeof(seen[n]));
  seen[n][sizeof(seen[n]) - 1] = '\0';
  ++n;
}
  }
}
fclose(fp);
  }
}

void* thread_main(void* _) {
  char buf[1024];

  print_info("Child", "/proc/thread-self");

  snprintf(buf, sizeof(buf), "/proc/%ld/task/%ld", (long) getpid(), 
main_thread_tid);
  print_info("Child", buf);

  snprintf(buf, sizeof(buf), "/proc/%ld/task/%ld", (long) getpid(), (long) 
gettid());
  print_info("Child", buf);

  return NULL;
}

int main() {
  char buf[1024];
  pthread_t thread;
  int ret;

  print_info("Main", "/proc/thread-self");
  print_info("Main", "/proc/self");

  snprintf(buf, sizeof(buf), "/proc/%ld", (long) getpid());
  print_info("Main", buf);

  main_thread_tid = gettid();
  snprintf(buf, sizeof(buf), "/proc/self/task/%ld", main_thread_tid);
  print_info("Main", buf);

  snprintf(buf, sizeof(buf), "/proc/%ld/task/%ld", (long) getpid(), 
main_thread_tid);
  print_info("Main", buf);

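  /* pthread_create() returns 0 on success and a positive errno on failure. */
  if ((ret = pthread_create(&thread, NULL, &thread_main, NULL)) != 0) {
    printf("pthread_create failed: %s (%d)\n", strerror(ret), ret);
    return 1;
  }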

  pthread_join(thread, NULL);
  return 0;
}

Signed-off-by: Shu-Chun Weng 
---
 linux-user/syscall.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 8367cb138d..73fe82bcc7 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6968,17 +6968,57 @@ static int open_self_auxv(void *cpu_env, int fd)
 return 0;
 }
 
+static int consume_task_directories(const char **filename)
+{
+if (!strncmp(*filename, "task/", strlen("task/"))) {
+*filename += strlen("task/");
+if (**filename < '1' || **filename > '9') {
+return 0;
+}
+/*
+ * Don't care about the exact tid.
+ * XXX: this allows opening files under /proc/self|<pid>/task/<tid> where
+ *  <tid> is not a valid thread id. Consider checking if the file
+ *  actually exists.
+ */
+const char *p = *filename + 1;
+while (*p >= '0' && *p <= '9') {
+++p;
+}
+if (*p == '/') {
+*filename = p + 1;
+return 1;
+} else {
+return 0;
+}
+}
+return 1;
+}
+
+/*
+ * Determines if filename refers to a procfs file for the current process or
+ * any thread within the current process. This function should only be used to
+ * check for files that have identical contents in all threads, e.g. exec,
+ * maps, etc.
+ */
 static int is_proc_myself(const char *filename, const char *entry)
 {
 if (!strncmp(filename, "/proc/", strlen("/proc/"))) {
 filename += strlen("/proc/");
 if (!strncmp(filename, "self/", strlen("self/"))) {
 filename += strlen("self/");
+if (!consume_task_directories()) {
+return 0;
+}
+} else if (!strncmp(filename, "thread-self/", strlen("thread-self/"))) {
+filename += strlen("thread-self/");
 } else if (*filename >= '1' && *filename <= '9') {
 char myself[80];
 snprintf(myself, sizeof(myself), "%d/", getpid());
 if (!strncmp(filename, myself, strlen(myself))) {
 filename += strlen(myself);
+if (!consume_task_directories(&filename)) {
+return 0;
+}
 } else {
 return 0;
 }
-- 
2.23.0.rc1.153.gdeed80330f-goog
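
For readers following the matching logic, here is a minimal standalone sketch of the path forms the patch is meant to accept. It is an illustration under assumptions, not the code in linux-user/syscall.c: `skip_task_dir` and `proc_myself_suffix` are hypothetical names, and, like the patch, the sketch does not check that the tid actually exists.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: if *p starts with "task/<digits>/", advance past it.
 * Returns 0 only when "task/" is followed by something that is not a tid. */
static int skip_task_dir(const char **p)
{
    const char *s = *p;

    if (strncmp(s, "task/", 5) != 0) {
        return 1;               /* no task/ component: nothing to skip */
    }
    s += 5;
    if (*s < '1' || *s > '9') {
        return 0;
    }
    while (*s >= '0' && *s <= '9') {
        s++;
    }
    if (*s != '/') {
        return 0;
    }
    *p = s + 1;
    return 1;
}

/* Hypothetical helper: returns the part of path that follows the
 * "current process or thread" prefix, or NULL if path is something else. */
static const char *proc_myself_suffix(const char *path)
{
    char pid[32];
    size_t n;

    if (strncmp(path, "/proc/", 6) != 0) {
        return NULL;
    }
    path += 6;
    if (strncmp(path, "thread-self/", 12) == 0) {
        return path + 12;
    }
    n = (size_t)snprintf(pid, sizeof(pid), "%ld/", (long)getpid());
    assert(n < sizeof(pid));
    if (strncmp(path, "self/", 5) == 0) {
        path += 5;
    } else if (strncmp(path, pid, n) == 0) {
        path += n;
    } else {
        return NULL;
    }
    return skip_task_dir(&path) ? path : NULL;
}
```

With this shape, "/proc/self/maps", "/proc/self/task/123/maps" and "/proc/thread-self/maps" all reduce to "maps", while "/proc/self/task/abc/maps" is rejected.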




Re: [Qemu-devel] [PATCH] block/backup: install notifier during creation

2019-08-21 Thread John Snow



On 8/21/19 10:41 AM, Vladimir Sementsov-Ogievskiy wrote:
> 09.08.2019 23:13, John Snow wrote:
>> Backup jobs may yield prior to installing their handler, because of the
>> job_co_entry shim which guarantees that a job won't begin work until
>> we are ready to start an entire transaction.
>>
>> Unfortunately, this makes proving correctness about transactional
>> points-in-time for backup hard to reason about. Make it explicitly clear
>> by moving the handler registration to creation time, and changing the
>> write notifier to a no-op until the job is started.
>>
>> Reported-by: Vladimir Sementsov-Ogievskiy 
>> Signed-off-by: John Snow 
>> ---
>>   block/backup.c | 32 +++-
>>   include/qemu/job.h |  5 +
>>   job.c  |  2 +-
>>   3 files changed, 29 insertions(+), 10 deletions(-)
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index 07d751aea4..4df5b95415 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -344,6 +344,13 @@ static int coroutine_fn backup_before_write_notify(
>>   assert(QEMU_IS_ALIGNED(req->offset, BDRV_SECTOR_SIZE));
>>   assert(QEMU_IS_ALIGNED(req->bytes, BDRV_SECTOR_SIZE));
>>   
>> +/* The handler is installed at creation time; the actual point-in-time
>> + * starts at job_start(). Transactions guarantee those two points are
>> + * the same point in time. */
>> +if (!job_started(>common.job)) {
>> +return 0;
>> +}
> 
> Hmm, sorry if it is a stupid question, I'm not good in multiprocessing and in
> Qemu iothreads..
> 
> job_started just reads job->co. If bs runs in iothread, and therefore
> write-notifier is in iothread, when job_start is called from main thread.. Is
> it guaranteed that write-notifier will see job->co variable change early
> enough to not miss guest write?
> Should not job->co be volatile for example or something like this?
> 
> If not think about this patch looks good for me.
> 

You know, it's a really good question.
So good, in fact, that I have no idea.

¯\_(ツ)_/¯

I'm fairly certain that IO will not come in until the .clean phase of a
qmp_transaction, because bdrv_drained_begin(bs) is called during
.prepare, and we activate the handler (by starting the job) in .commit.
We do not end the drained section until .clean.

I'm not fully clear on what threading guarantees we have otherwise,
though; is it possible that "Thread A" would somehow lift the bdrv_drain
on an IO thread ("Thread B") and, after that, "Thread B" would somehow
still be able to see an outdated version of job->co that was set by
"Thread A"?

I doubt it; but I can't prove it.
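
To make the visibility question concrete: in C, `volatile` by itself gives no cross-thread ordering, and the usual tool for publishing a pointer from one thread to another is a release store paired with an acquire load. The sketch below uses plain C11 `<stdatomic.h>` rather than QEMU's own helpers in include/qemu/atomic.h, and `ToyJob`, `toy_job_start` and `toy_job_started` are hypothetical names, not the job API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a job; "co" plays the role of job->co. */
typedef struct ToyJob {
    int payload;            /* state that must be visible once started */
    _Atomic(void *) co;     /* NULL until the job is started */
} ToyJob;

/* Starting thread: set up state, then publish it with release semantics. */
static void toy_job_start(ToyJob *job, void *co)
{
    assert(co != NULL);
    job->payload = 42;
    atomic_store_explicit(&job->co, co, memory_order_release);
}

/* Notifier thread: the acquire load pairs with the release store above,
 * so a non-NULL result also makes the earlier payload write visible. */
static bool toy_job_started(ToyJob *job)
{
    return atomic_load_explicit(&job->co, memory_order_acquire) != NULL;
}
```

This only shows what an explicit guarantee would look like; it does not settle whether the drained section already provides equivalent ordering.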

Paolo, may I please ask you for a consult here as our resident
volatility expert?

--js

>> +
>>   return backup_do_cow(job, req->offset, req->bytes, NULL, true);
>>   }
>>   
>> @@ -398,6 +405,12 @@ static void backup_clean(Job *job)
>>   BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
>>   BlockDriverState *bs = blk_bs(s->common.blk);
>>   
>> +/* cancelled before job_start: remove write_notifier */
>> +if (s->before_write.notify) {
>> +notifier_with_return_remove(>before_write);
>> +s->before_write.notify = NULL;
>> +}
>> +
>>   if (s->copy_bitmap) {
>>   bdrv_release_dirty_bitmap(bs, s->copy_bitmap);
>>   s->copy_bitmap = NULL;
>> @@ -527,17 +540,8 @@ static void backup_init_copy_bitmap(BackupBlockJob *job)
>>   static int coroutine_fn backup_run(Job *job, Error **errp)
>>   {
>>   BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
>> -BlockDriverState *bs = blk_bs(s->common.blk);
>>   int ret = 0;
>>   
>> -QLIST_INIT(>inflight_reqs);
>> -qemu_co_rwlock_init(>flush_rwlock);
>> -
>> -backup_init_copy_bitmap(s);
>> -
>> -s->before_write.notify = backup_before_write_notify;
>> -bdrv_add_before_write_notifier(bs, >before_write);
>> -
>>   if (s->sync_mode == MIRROR_SYNC_MODE_TOP) {
>>   int64_t offset = 0;
>>   int64_t count;
>> @@ -572,6 +576,7 @@ static int coroutine_fn backup_run(Job *job, Error **errp)
>>   
>>out:
>>   notifier_with_return_remove(>before_write);
>> +s->before_write.notify = NULL;
>>   
>>   /* wait until pending backup_do_cow() calls have completed */
>>   qemu_co_rwlock_wrlock(>flush_rwlock);
>> @@ -767,6 +772,15 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>  _abort);
>>   job->len = len;
>>   
>> +/* Finally, install a write notifier that takes effect after job_start() */
>> +backup_init_copy_bitmap(job);
>> +
>> +QLIST_INIT(>inflight_reqs);
>> +qemu_co_rwlock_init(>flush_rwlock);
>> +
>> +job->before_write.notify = backup_before_write_notify;
>> +bdrv_add_before_write_notifier(bs, >before_write);
>> +
>>   return >common;
>>   
>>error:
>> diff --git a/include/qemu/job.h b/include/qemu/job.h
>> index 9e7cd1e4a0..733afb696b 100644
>> --- 

Re: [Qemu-devel] [PATCH v1 2/4] s390x/tcg: Introduce probe_read_access()

2019-08-21 Thread David Hildenbrand
On 21.08.19 21:19, Richard Henderson wrote:
> On 8/21/19 10:37 AM, David Hildenbrand wrote:
>> Hah, guess what, I implemented a similar variant of "fetch all
>> of the host addresses" *but* it is not that easy as you might
>> think (sorry for the bad news).
> 
> I think it is, because I didn't think it *that* easy.  :-)

:) hehe

> 
>> There are certain cases where we can't get access to the raw host
>> page. Namely, cpu watchpoints, LAP, NODIRTY. In summary: this won't work
>> as you describe. (my first approach did exactly this)
> 
> NODIRTY and LAP are automatically handled via probe_write
> faulting instead of returning the address.  I think there
> may be a bug in probe_write at present not checking

For LAP pages we immediately set TLB_INVALID_MASK again, to trigger a
new fault on the next write access (only). That could be handled in
tlb_vaddr_to_host(), simply returning the address to the page after
trying to fill the tlb and succeeding (I implemented that, that's the
easy part).

TLB_NOTDIRTY and TLB_MMIO are the real issue. We don't want to refault,
we want to treat that memory like IO memory and route it via
MemoryRegionOps() - e.g., watch_mem_ops() in qemu/exec.c. So it's not a
fault but IO memory.

That's why we don't expose that memory via tlb_vaddr_to_host(). Faulting
in case of TLB_NOTDIRTY or TLB_MMIO would be bad.

> 
> Watchpoints could be handled the same way, if we were to
> export check_watchpoint from exec.c.  Indeed, I see no way
> to handle watchpoints correctly if we don't.  I think that's
> an outstanding bug with probe_write.

probe_write() performs the MMU translation. If that succeeds, there is
no fault. If there are watchpoints, the memory is treated like IO and
memory access is rerouted. I think this works as designed.

> 
> Any other objections?  I certainly think that restricting the
> size of such operations to one page is a large simplification
> over the S390Access array thing that you create in this patch.

You mean two pages I assume. Yeah, I could certainly simplify the
prototype patch I have here quite a lot. 2 pages should be enough for
everybody ;)
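
As a rough illustration of why two pages suffice for an access no larger than one page (MVC moves at most 256 bytes), here is a sketch of splitting a virtual range at the page boundary. Everything in it is an assumption made for illustration: `ToyChunk`, `toy_split_access` and the 4 KiB `TOY_PAGE_SIZE` are hypothetical, and address wrap-around is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Hypothetical description of one page-bounded piece of an access. */
typedef struct ToyChunk {
    uint64_t vaddr;
    uint32_t size;
} ToyChunk;

/* Split [vaddr, vaddr + size) at the page boundary.
 * Requires 0 < size <= TOY_PAGE_SIZE, so at most two chunks result.
 * Returns the number of chunks written (1 or 2). */
static int toy_split_access(uint64_t vaddr, uint32_t size, ToyChunk out[2])
{
    uint32_t to_page_end =
        TOY_PAGE_SIZE - (uint32_t)(vaddr & (TOY_PAGE_SIZE - 1));

    assert(size > 0 && size <= TOY_PAGE_SIZE);
    out[0].vaddr = vaddr;
    if (size <= to_page_end) {
        out[0].size = size;         /* fits entirely in the first page */
        return 1;
    }
    out[0].size = to_page_end;      /* bytes up to the end of page one */
    out[1].vaddr = vaddr + to_page_end;
    out[1].size = size - to_page_end;
    return 2;
}
```

Each chunk could then be resolved once, up front, to either a host address or a guest physical address, which is the per-page decision being discussed here.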

The basic question is: Should we try to somehow work around IO memory
access (including NOTDIRTY and watchpoints) in tlb_vaddr_to_host() or
perform access in these cases via cpu_physical_memory_write() ?

It feels somewhat wrong to me to tune tlb_vaddr_to_host() to always
return the address of a page although we are dealing with
MemoryRegionOps()  that want a more controlled access mechanism.

> 
> 
> r~
> 
>>
>> The following patch requires another re-factoring
>> (tcg_s390_cpu_mmu_translate), but you should get the idea.
>>
>>
>>
>> From 0cacd2aea3dbc25e93492cca04f6c866b86d7f8a Mon Sep 17 00:00:00 2001
>> From: David Hildenbrand 
>> Date: Tue, 20 Aug 2019 09:37:11 +0200
>> Subject: [PATCH v1] s390x/tcg: Fault-safe MVC (MOVE) implementation
>>
>> MVC can cross page boundaries. In case we fault on the second page, we
>> already partially copied data. If we have overlaps, we would
>> trigger a fault after having partially moved data, eventually having
>> our original data already overwritten. When continuing after the fault,
>> we would try to move already modified data, not the original data -
>> very bad.
>>
>> glibc started to use MVC for forward memmove() and is able to trigger
>> exactly this corruption (via rpmbuild and rpm). Fedora 31 (rawhide)
>> currently fails to install as we trigger rpm database corruptions due to
>> this bug.
>>
>> We need a way to translate a virtual address range to individual pages that
>> we can access later on without triggering faults. Probing all virtual
>> addresses once before the read/write is not sufficient - the guest could
>> have modified the page tables (e.g., write-protect, map out) in between,
>> so on we could fault on any new tlb_fill() - we have to skip any new MMU
>> translations.
>>
>> Unfortunately, there are TLB entries for which we cannot get a host address
>> for (esp., watchpoints, LAP, NOTDIRTY) - in these cases we cannot avoid
>> a new MMU translation using the ordinary ld/st helpers. Let's fallback
>> to guest physical addresses in these cases, that we access via
>> cpu_physical_memory_(read|write),
>>
>> This change reduced the boottime for s390x guests (to prompt) from ~1:29
>> min to ~1:19 min in my tests. For example, LAP protected pages are now only
>> translated once when writing to them using MVC and we don't always fallback
>> to byte-based copies.
>>
>> We will want to use the same mechanism for other accesses as well (e.g.,
>> mvcl), prepare for that right away.
>>
>> Signed-off-by: David Hildenbrand 
>> ---
>>  target/s390x/mem_helper.c | 213 +++---
>>  1 file changed, 200 insertions(+), 13 deletions(-)
>>
>> diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
>> index 91ba2e03d9..1ca293e00d 100644
>> --- a/target/s390x/mem_helper.c
>> +++ b/target/s390x/mem_helper.c
>> @@ -24,8 +24,10 @@
>>  #include 

Re: [Qemu-devel] RISC-V: Vector && DSP Extension

2019-08-21 Thread Palmer Dabbelt

On Thu, 15 Aug 2019 14:37:52 PDT (-0700), alistai...@gmail.com wrote:

On Thu, Aug 15, 2019 at 2:07 AM Peter Maydell  wrote:


On Thu, 15 Aug 2019 at 09:53, Aleksandar Markovic
 wrote:
>
> > We can accept draft
> > extensions in QEMU as long as they are disabled by default.

> Hi, Alistair, Palmer,
>
> Is this an official stance of QEMU community, or perhaps Alistair's
> personal judgement, or maybe a rule within the RISC-V subcommunity?

Alistair asked on a previous thread; my view was:
https://lists.gnu.org/archive/html/qemu-devel/2019-07/msg03364.html
and nobody else spoke up disagreeing (summary: should at least be
disabled-by-default and only enabled by setting an explicit
property whose name should start with the 'x-' prefix).


Agreed!



In general QEMU does sometimes introduce experimental extensions
(we've had them in the block layer, for example) and so the 'x-'
property to enable them is a reasonably established convention.
I think it's a reasonable compromise to allow this sort of work
to start and not have to live out-of-tree for a long time, without
confusing users or getting into a situation where some QEMU
versions behave differently or to obsolete drafts of a spec
without it being clear from the command line that experimental
extensions are being enabled.

There is also an element of "submaintainer judgement" to be applied
here -- upstream is probably not the place for a draft extension
to be implemented if it is:
 * still fast moving or subject to major changes of design direction
 * major changes to the codebase (especially if it requires
   changes to core code) that might later need to be redone
   entirely differently
 * still experimental


Yep, agreed. For RISC-V I think this would extend to only allowing
extensions that have backing from the foundation and are under active
discussion.


My general philosophy here is that we'll take anything written down in an 
official RISC-V ISA manual (ie, the ones actually released by the foundation).  
This provides a single source of truth for what an extension name / version 
means, which is important to avoid confusion.  If it's a ratified extension 
then I see no reason not to support it on my end.  For frozen extensions we 
should probably just wait the 45 days until they go up for a ratification vote, 
but I'd be happy to start reviewing patches then (or earlier :)).


If the spec is a draft in the ISA manual then we need to worry about the 
support burden, which I don't have fixed criteria for -- generally there
shouldn't be issues here, but early drafts can be in a state where they're 
going to change extensively and are unlikely to be used by anyone.  There's 
also the question of "what is an official release of a draft specification?".  

That's a bit awkward right now: the current ratified ISA manual contains 
version 0.3 of the hypervisor extension, but I just talked to Andrew and the 
plan is to remove the draft extensions from the ratified manuals because these 
drafts are old and the official manuals update slowly.  For now I guess we'll 
need an ad-hoc way of determining if a draft extension has been officially
versioned or not, which is a bit of a headache.


We already have examples of supporting draft extensions, including priv-1.9.1.  
This does cause some pain for us on the QEMU side (CSR bits have different 
semantics between the specs), but there's 1.9.1 hardware out there and the port 
continues to be useful so I'd be in favor of keeping it around for now.  I 
suppose there is an implicit risk that draft extensions will be deprecated, but 
the "x-" prefix, draft status, and long deprecation period should be sufficient 
to inform users of the risk.  I wouldn't be opposed to adding a "this is a 
draft ISA" warning, but I feel like it might be a bit overkill.




Alistair



thanks
-- PMM




Re: [Qemu-devel] [PATCH v1 2/4] s390x/tcg: Introduce probe_read_access()

2019-08-21 Thread Richard Henderson
On 8/21/19 10:37 AM, David Hildenbrand wrote:
> Hah, guess what, I implemented a similar variant of "fetch all
> of the host addresses" *but* it is not that easy as you might
> think (sorry for the bad news).

I think it is, because I didn't think it *that* easy.  :-)

> There are certain cases where we can't get access to the raw host
> page. Namely, cpu watchpoints, LAP, NODIRTY. In summary: this won't work
> as you describe. (my first approach did exactly this)

NODIRTY and LAP are automatically handled via probe_write
faulting instead of returning the address.  I think there
may be a bug in probe_write at present not checking

Watchpoints could be handled the same way, if we were to
export check_watchpoint from exec.c.  Indeed, I see no way
to handle watchpoints correctly if we don't.  I think that's
an outstanding bug with probe_write.

Any other objections?  I certainly think that restricting the
size of such operations to one page is a large simplification
over the S390Access array thing that you create in this patch.


r~

> 
> The following patch requires another re-factoring
> (tcg_s390_cpu_mmu_translate), but you should get the idea.
> 
> 
> 
> From 0cacd2aea3dbc25e93492cca04f6c866b86d7f8a Mon Sep 17 00:00:00 2001
> From: David Hildenbrand 
> Date: Tue, 20 Aug 2019 09:37:11 +0200
> Subject: [PATCH v1] s390x/tcg: Fault-safe MVC (MOVE) implementation
> 
> MVC can cross page boundaries. In case we fault on the second page, we
> already partially copied data. If we have overlaps, we would
> trigger a fault after having partially moved data, eventually having
> our original data already overwritten. When continuing after the fault,
> we would try to move already modified data, not the original data -
> very bad.
> 
> glibc started to use MVC for forward memmove() and is able to trigger
> exactly this corruption (via rpmbuild and rpm). Fedora 31 (rawhide)
> currently fails to install as we trigger rpm database corruptions due to
> this bug.
> 
> We need a way to translate a virtual address range to individual pages that
> we can access later on without triggering faults. Probing all virtual
> addresses once before the read/write is not sufficient - the guest could
> have modified the page tables (e.g., write-protect, map out) in between,
> so on we could fault on any new tlb_fill() - we have to skip any new MMU
> translations.
> 
> Unfortunately, there are TLB entries for which we cannot get a host address
> for (esp., watchpoints, LAP, NOTDIRTY) - in these cases we cannot avoid
> a new MMU translation using the ordinary ld/st helpers. Let's fallback
> to guest physical addresses in these cases, that we access via
> cpu_physical_memory_(read|write),
> 
> This change reduced the boottime for s390x guests (to prompt) from ~1:29
> min to ~1:19 min in my tests. For example, LAP protected pages are now only
> translated once when writing to them using MVC and we don't always fallback
> to byte-based copies.
> 
> We will want to use the same mechanism for other accesses as well (e.g.,
> mvcl), prepare for that right away.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  target/s390x/mem_helper.c | 213 +++---
>  1 file changed, 200 insertions(+), 13 deletions(-)
> 
> diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
> index 91ba2e03d9..1ca293e00d 100644
> --- a/target/s390x/mem_helper.c
> +++ b/target/s390x/mem_helper.c
> @@ -24,8 +24,10 @@
>  #include "exec/helper-proto.h"
>  #include "exec/exec-all.h"
>  #include "exec/cpu_ldst.h"
> +#include "exec/cpu-common.h"
>  #include "qemu/int128.h"
>  #include "qemu/atomic128.h"
> +#include "tcg_s390x.h"
>  
>  #if !defined(CONFIG_USER_ONLY)
>  #include "hw/s390x/storage-keys.h"
> @@ -104,6 +106,181 @@ static inline void cpu_stsize_data_ra(CPUS390XState *env, uint64_t addr,
>  }
>  }
>  
> +/*
> + * An access covers one page, except for the start/end of the translated
> + * virtual address range.
> + */
> +typedef struct S390Access {
> +union {
> +char *haddr;
> +hwaddr paddr;
> +};
> +uint16_t size;
> +bool isHaddr;
> +} S390Access;
> +
> +/*
> + * Prepare access to a virtual address range, guaranteeing we won't trigger
> + * faults during the actual access. Sometimes we can't get access to the
> + * host address (e.g., LAP, cpu watchpoints/PER, clean pages, ...). Then, we
> + * translate to guest physical addresses instead. We'll have to perform
> + * slower, indirect, access to these physical addresses then.
> + */
> +static void access_prepare_idx(CPUS390XState *env, S390Access access[],
> +   int nb_access, vaddr vaddr, target_ulong size,
> +   MMUAccessType access_type, int mmu_idx,
> +   uintptr_t ra)
> +{
> +int i = 0;
> +int cur_size;
> +
> +/*
> + * After we obtained the host address of a TLB entry that entry might
> + * be invalidated again - e.g., via tlb_set_dirty(), via 

Re: [Qemu-devel] [PATCH 1/2] virtio: add vhost-user-fs base device

2019-08-21 Thread Dr. David Alan Gilbert
* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Fri, Aug 16, 2019 at 03:33:20PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" 
> > 
> > The virtio-fs virtio device provides shared file system access using
> > the FUSE protocol carried over virtio.
> > The actual file server is implemented in an external vhost-user-fs device
> > backend process.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > Signed-off-by: Sebastien Boeuf 
> > Signed-off-by: Dr. David Alan Gilbert 
> > ---
> >  configure   |  13 +
> >  hw/virtio/Makefile.objs |   1 +
> >  hw/virtio/vhost-user-fs.c   | 297 
> >  include/hw/virtio/vhost-user-fs.h   |  45 +++
> >  include/standard-headers/linux/virtio_fs.h  |  41 +++
> >  include/standard-headers/linux/virtio_ids.h |   1 +
> >  6 files changed, 398 insertions(+)
> >  create mode 100644 hw/virtio/vhost-user-fs.c
> >  create mode 100644 include/hw/virtio/vhost-user-fs.h
> >  create mode 100644 include/standard-headers/linux/virtio_fs.h
> > 
> > diff --git a/configure b/configure
> > index 714e7fb6a1..e7e33ee783 100755
> > --- a/configure
> > +++ b/configure
> > @@ -382,6 +382,7 @@ vhost_crypto=""
> >  vhost_scsi=""
> >  vhost_vsock=""
> >  vhost_user=""
> > +vhost_user_fs=""
> >  kvm="no"
> >  hax="no"
> >  hvf="no"
> > @@ -1316,6 +1317,10 @@ for opt do
> >;;
> >--enable-vhost-vsock) vhost_vsock="yes"
> >;;
> > +  --disable-vhost-user-fs) vhost_user_fs="no"
> > +  ;;
> > +  --enable-vhost-user-fs) vhost_user_fs="yes"
> > +  ;;
> >--disable-opengl) opengl="no"
> >;;
> >--enable-opengl) opengl="yes"
> > @@ -2269,6 +2274,10 @@ test "$vhost_crypto" = "" && vhost_crypto=$vhost_user
> >  if test "$vhost_crypto" = "yes" && test "$vhost_user" = "no"; then
> >error_exit "--enable-vhost-crypto requires --enable-vhost-user"
> >  fi
> > +test "$vhost_user_fs" = "" && vhost_user_fs=$vhost_user
> > +if test "$vhost_user_fs" = "yes" && test "$vhost_user" = "no"; then
> > +  error_exit "--enable-vhost-user-fs requires --enable-vhost-user"
> > +fi
> >  
> >  # OR the vhost-kernel and vhost-user values for simplicity
> >  if test "$vhost_net" = ""; then
> > @@ -6425,6 +6434,7 @@ echo "vhost-crypto support $vhost_crypto"
> >  echo "vhost-scsi support $vhost_scsi"
> >  echo "vhost-vsock support $vhost_vsock"
> >  echo "vhost-user support $vhost_user"
> > +echo "vhost-user-fs support $vhost_user_fs"
> >  echo "Trace backends$trace_backends"
> >  if have_backend "simple"; then
> >  echo "Trace output file $trace_file-"
> > @@ -6921,6 +6931,9 @@ fi
> >  if test "$vhost_user" = "yes" ; then
> >echo "CONFIG_VHOST_USER=y" >> $config_host_mak
> >  fi
> > +if test "$vhost_user_fs" = "yes" ; then
> > +  echo "CONFIG_VHOST_USER_FS=y" >> $config_host_mak
> > +fi
> >  if test "$blobs" = "yes" ; then
> >echo "INSTALL_BLOBS=yes" >> $config_host_mak
> >  fi
> > diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > index 964ce78607..47ffbf22c4 100644
> > --- a/hw/virtio/Makefile.objs
> > +++ b/hw/virtio/Makefile.objs
> > @@ -11,6 +11,7 @@ common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
> >  common-obj-$(CONFIG_VIRTIO_MMIO) += virtio-mmio.o
> >  obj-$(CONFIG_VIRTIO_BALLOON) += virtio-balloon.o
> >  obj-$(CONFIG_VIRTIO_CRYPTO) += virtio-crypto.o
> > +obj-$(CONFIG_VHOST_USER_FS) += vhost-user-fs.o
> >  obj-$(call land,$(CONFIG_VIRTIO_CRYPTO),$(CONFIG_VIRTIO_PCI)) += 
> > virtio-crypto-pci.o
> >  obj-$(CONFIG_VIRTIO_PMEM) += virtio-pmem.o
> >  common-obj-$(call land,$(CONFIG_VIRTIO_PMEM),$(CONFIG_VIRTIO_PCI)) += 
> > virtio-pmem-pci.o
> > diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> > new file mode 100644
> > index 00..2753c2c07a
> > --- /dev/null
> > +++ b/hw/virtio/vhost-user-fs.c
> > @@ -0,0 +1,297 @@
> > +/*
> > + * Vhost-user filesystem virtio device
> > + *
> > + * Copyright 2018 Red Hat, Inc.
> > + *
> > + * Authors:
> > + *  Stefan Hajnoczi 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> > + * (at your option) any later version.  See the COPYING file in the
> > + * top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include 
> > +#include "standard-headers/linux/virtio_fs.h"
> > +#include "qapi/error.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "qemu/error-report.h"
> > +#include "hw/virtio/vhost-user-fs.h"
> > +#include "monitor/monitor.h"
> > +
> > +static void vuf_get_config(VirtIODevice *vdev, uint8_t *config)
> > +{
> > +VHostUserFS *fs = VHOST_USER_FS(vdev);
> > +struct virtio_fs_config fscfg = {};
> > +
> > +memcpy((char *)fscfg.tag, fs->conf.tag,
> > +   MIN(strlen(fs->conf.tag) + 1, sizeof(fscfg.tag)));
> > +
> > +virtio_stl_p(vdev, _queues, fs->conf.num_queues);
> > +
> > +memcpy(config, , sizeof(fscfg));
> > +}
> > +
> > +static void vuf_start(VirtIODevice *vdev)
> > +{

Re: [Qemu-devel] [PATCH v1 1/2] accel/tcg: adding integration with linux perf

2019-08-21 Thread Vanderson Martins do Rosario
On Thu, Aug 15, 2019 at 11:40 AM Stefan Hajnoczi  wrote:

> On Wed, Aug 14, 2019 at 11:37:24PM -0300, vandersonmr wrote:
> > This commit adds support to Linux Perf in order
> > to be able to analyze qemu jitted code and
> > also to be able to see the TBs PC in it.
>
> Is there any reference to the file format?  Please include it in a code
> comment, if such a thing exists.
>
> > diff --git a/accel/tcg/perf/jitdump.c b/accel/tcg/perf/jitdump.c
> > new file mode 100644
> > index 00..6f4c0911c2
> > --- /dev/null
> > +++ b/accel/tcg/perf/jitdump.c
> > @@ -0,0 +1,180 @@
>
> License header?
>
> > +#ifdef __linux__
>
> If the entire source file is #ifdef __linux__ then Makefile.objs should
> probably contain obj-$(CONFIG_LINUX) += jitdump.o instead.  Letting the
> build system decide what to compile is cleaner than ifdeffing large
> amounts of code.
>
> > +
> > +#include 
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "jitdump.h"
> > +#include "qemu-common.h"
>
> Please follow QEMU's header ordering conventions.  See ./HACKING "1.2.
> Include directives".
>
> > +void start_jitdump_file(void)
> > +{
> > +GString *dumpfile_name = g_string_new(NULL);
> > +g_string_printf(dumpfile_name, "./jit-%d.dump", getpid());
>
> Simpler:
>
>   gchar *dumpfile_name = g_strdup_printf("./jit-%d.dump", getpid());
>   ...
>   g_free(dumpfile_name);
>
> > +dumpfile = fopen(dumpfile_name->str, "w+");
>
> getpid() and the global dumpfile variable make me wonder what happens
> with multi-threaded TCG?
>

I did some tests and it appears to be working fine with multi-threaded TCG.
tcg_exec_init should execute only once even in multi-threaded TCG, right?
If so, we are going to create only one jitdump file. Also, both in Windows
and Linux/POSIX fwrite() is thread-safe, thus we would be safely updating
the jitdump file. Does it make sense?
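
One caveat worth sketching: POSIX stdio locks the stream around each individual call, so a single fwrite() is safe, but a record emitted with several fwrite() calls could still interleave with a record from another thread. Holding the stream lock across the whole record avoids that. The record layout and the name `toy_write_record` below are hypothetical and are not the jitdump format.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout: a length prefix followed by a name.
 * Returns 0 on success, -1 if either write failed. */
static int toy_write_record(FILE *f, const char *name)
{
    uint32_t len = (uint32_t)strlen(name) + 1;   /* include the NUL */
    int ok;

    assert(f != NULL);
    flockfile(f);                 /* hold the stream lock across both writes */
    ok = fwrite(&len, sizeof(len), 1, f) == 1 &&
         fwrite(name, len, 1, f) == 1;
    funlockfile(f);
    return ok ? 0 : -1;
}
```

If every record goes out in one fwrite() call from a single buffer, the explicit locking is not needed.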


>
> > +
> > +perf_marker = mmap(NULL, sysconf(_SC_PAGESIZE),
>
> Please mention the point of this mmap in a comment.  My best guess is
> that perf stores the /proc/$PID/maps and this is how it finds the
> jitdump file?
>
> > +  PROT_READ | PROT_EXEC,
> > +  MAP_PRIVATE,
> > +  fileno(dumpfile), 0);
> > +
> > +if (perf_marker == MAP_FAILED) {
> > +printf("Failed to create mmap marker file for perf %d\n", fileno(dumpfile));
> > +fclose(dumpfile);
> > +return;
> > +}
> > +
> > +g_string_free(dumpfile_name, TRUE);
> > +
> > +struct jitheader *header = g_new0(struct jitheader, 1);
>
> Why g_new this struct?  It's small and can be declared on the stack.
>
> Please use g_free() with g_malloc/new/etc().  It's not safe to mismatch
> glib and libc memory allocation functions.
>
> > +header->magic = 0x4A695444;
> > +header->version = 1;
> > +header->elf_mach = get_e_machine();
> > +header->total_size = sizeof(struct jitheader);
> > +header->pid = getpid();
> > +header->timestamp = get_timestamp();
> > +
> > +fwrite(header, header->total_size, 1, dumpfile);
> > +
> > +free(header);
> > +fflush(dumpfile);
> > +}
> > +
> > +void append_load_in_jitdump_file(TranslationBlock *tb)
> > +{
> > +GString *func_name = g_string_new(NULL);
> > +g_string_printf(func_name, "TB virt:0x"TARGET_FMT_lx"%c", tb->pc, '\0');
>
> The explicit NUL character looks strange to me.  I think the idea is to
> avoid func_name->len + 1?  Adding NUL characters to C strings can be a
> source of bugs, I would stick to convention and do len + 1 instead of
> putting NUL characters into the GString.  This is a question of style
> though.
>
> > +
> > +struct jr_code_load *load_event = g_new0(struct jr_code_load, 1);
>
> No need to allocate load_event on the heap.
>
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 9621e934c0..1c26eeeb9c 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -4147,6 +4147,18 @@ STEXI
> >  Enable FIPS 140-2 compliance mode.
> >  ETEXI
> >
> > +#ifdef __linux__
> > +DEF("perf", 0, QEMU_OPTION_perf,
> > +"-perfdump jitdump files to help linux perf JIT code
> visualization\n",
> > +QEMU_ARCH_ALL)
> > +#endif
> > +STEXI
> > +@item -perf
> > +@findex -perf
> > +Dumps jitdump files to help linux perf JIT code visualization
>
> Suggestions on expanding the documentation:
>
> Where are the jitdump files dumped?  The current working directory?
>
> Anything to say about the naming scheme for these files?
>
> Can you include an example of how to load them into perf(1)?
>


Re: [Qemu-devel] [PATCH 1/2] virtio: add vhost-user-fs base device

2019-08-21 Thread Dr. David Alan Gilbert
* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Fri, Aug 16, 2019 at 03:33:20PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" 

> > +/* Hiprio queue */
> > +virtio_add_queue(vdev, fs->conf.queue_size, vuf_handle_output);
> >
> 
> Weird, spec patch v6 says:
> 
> +\item[0] hiprio
> +\item[1\ldots n] request queues
> 
> where's the Notifications queue coming from?

Oops, that's a left over from when we used to have a notification queue;
all the other parts of it are gone.

Dave

> > +/* Request queues */
> > +for (i = 0; i < fs->conf.num_queues; i++) {
> > +virtio_add_queue(vdev, fs->conf.queue_size, vuf_handle_output);
> > +}
> > +
> > +/* 1 high prio queue, plus the number configured */
> > +fs->vhost_dev.nvqs = 1 + fs->conf.num_queues;
> > +fs->vhost_dev.vqs = g_new0(struct vhost_virtqueue, fs->vhost_dev.nvqs);
> > +ret = vhost_dev_init(>vhost_dev, >vhost_user,
> > + VHOST_BACKEND_TYPE_USER, 0);
> > +if (ret < 0) {
> > +error_setg_errno(errp, -ret, "vhost_dev_init failed");
> > +goto err_virtio;
> > +}
> > +
> > +return;
> > +
> > +err_virtio:
> > +vhost_user_cleanup(>vhost_user);
> > +virtio_cleanup(vdev);
> > +g_free(fs->vhost_dev.vqs);
> > +return;
> > +}
> > +
> > +static void vuf_device_unrealize(DeviceState *dev, Error **errp)
> > +{
> > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +VHostUserFS *fs = VHOST_USER_FS(dev);
> > +
> > +/* This will stop vhost backend if appropriate. */
> > +vuf_set_status(vdev, 0);
> > +
> > +vhost_dev_cleanup(>vhost_dev);
> > +
> > +vhost_user_cleanup(>vhost_user);
> > +
> > +virtio_cleanup(vdev);
> > +g_free(fs->vhost_dev.vqs);
> > +fs->vhost_dev.vqs = NULL;
> > +}
> > +
> > +static const VMStateDescription vuf_vmstate = {
> > +.name = "vhost-user-fs",
> > +.unmigratable = 1,
> > +};
> > +
> > +static Property vuf_properties[] = {
> > +DEFINE_PROP_CHR("chardev", VHostUserFS, conf.chardev),
> > +DEFINE_PROP_STRING("tag", VHostUserFS, conf.tag),
> > +DEFINE_PROP_UINT16("num-queues", VHostUserFS, conf.num_queues, 1),
> > +DEFINE_PROP_UINT16("queue-size", VHostUserFS, conf.queue_size, 128),
> > +DEFINE_PROP_STRING("vhostfd", VHostUserFS, conf.vhostfd),
> > +DEFINE_PROP_END_OF_LIST(),
> > +};
> > +
> > +static void vuf_class_init(ObjectClass *klass, void *data)
> > +{
> > +DeviceClass *dc = DEVICE_CLASS(klass);
> > +VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> > +
> > +dc->props = vuf_properties;
> > +dc->vmsd = _vmstate;
> > +set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
> > +vdc->realize = vuf_device_realize;
> > +vdc->unrealize = vuf_device_unrealize;
> > +vdc->get_features = vuf_get_features;
> > +vdc->get_config = vuf_get_config;
> > +vdc->set_status = vuf_set_status;
> > +vdc->guest_notifier_mask = vuf_guest_notifier_mask;
> > +vdc->guest_notifier_pending = vuf_guest_notifier_pending;
> > +}
> > +
> > +static const TypeInfo vuf_info = {
> > +.name = TYPE_VHOST_USER_FS,
> > +.parent = TYPE_VIRTIO_DEVICE,
> > +.instance_size = sizeof(VHostUserFS),
> > +.class_init = vuf_class_init,
> > +};
> > +
> > +static void vuf_register_types(void)
> > +{
> > +type_register_static(_info);
> > +}
> > +
> > +type_init(vuf_register_types)
> > diff --git a/include/hw/virtio/vhost-user-fs.h 
> > b/include/hw/virtio/vhost-user-fs.h
> > new file mode 100644
> > index 00..d07ab134b9
> > --- /dev/null
> > +++ b/include/hw/virtio/vhost-user-fs.h
> > @@ -0,0 +1,45 @@
> > +/*
> > + * Vhost-user filesystem virtio device
> > + *
> > + * Copyright 2018 Red Hat, Inc.
> > + *
> > + * Authors:
> > + *  Stefan Hajnoczi 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> > + * (at your option) any later version.  See the COPYING file in the
> > + * top-level directory.
> > + */
> > +
> > +#ifndef _QEMU_VHOST_USER_FS_H
> > +#define _QEMU_VHOST_USER_FS_H
> > +
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/vhost.h"
> > +#include "hw/virtio/vhost-user.h"
> > +#include "chardev/char-fe.h"
> > +
> > +#define TYPE_VHOST_USER_FS "x-vhost-user-fs-device"
> > +#define VHOST_USER_FS(obj) \
> > +OBJECT_CHECK(VHostUserFS, (obj), TYPE_VHOST_USER_FS)
> > +
> > +typedef struct {
> > +CharBackend chardev;
> > +char *tag;
> > +uint16_t num_queues;
> > +uint16_t queue_size;
> > +char *vhostfd;
> > +} VHostUserFSConf;
> > +
> > +typedef struct {
> > +/*< private >*/
> > +VirtIODevice parent;
> > +VHostUserFSConf conf;
> > +struct vhost_virtqueue *vhost_vqs;
> > +struct vhost_dev vhost_dev;
> > +VhostUserState vhost_user;
> > +
> > +/*< public >*/
> > +} VHostUserFS;
> > +
> > +#endif /* _QEMU_VHOST_USER_FS_H */
> > diff --git a/include/standard-headers/linux/virtio_fs.h 
> > b/include/standard-headers/linux/virtio_fs.h

Re: [Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas [v2]

2019-08-21 Thread Max Reitz
On 21.08.19 16:14, Lukáš Doktor wrote:
> Hello guys,
> 
> First attempt was rejected due to zip attachment, let's try it again with 
> just Avocado-vt debug.log and serial console log files attached.
> 
> I bisected a regression on aarch64 all the way to this commit: "qcow2: skip 
> writing zero buffers to empty COW areas" 
> c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a. Would you please have a look at it?

I think I can see the issue on my x64 system (I don’t see the XFS
corruption, but the installation fails because of some segfaults).

I haven’t found a simpler way to reproduce the problem yet, though,
which is a pain... :-/

It looks like the problem disappears when I configure qemu with
“--disable-xfsctl”.  Can you try that?

Max



[Qemu-devel] [PATCH V2 0/2] Fix bug in nios2 and m68k semihosting

2019-08-21 Thread Sandra Loosemore
I noticed recently that the exit semihosting call on nios2 was
ignoring its parameter and always returning status 0 instead.  It
turns out the handler was retrieving the value of the wrong register.
Since the nios2 semihosting implementation was basically
cut-and-pasted from that for m68k, I checked m68k also and it had the
same bug.  This set of patches fixes both of them.

There are no changes to the actual patches from V1, only more
informative commit messages with links to the respective semihosting
protocol documents in newlib.

Sandra Loosemore (2):
  target/nios2: Fix bug in semihosted exit handling
  target/m68k: Fix bug in semihosted exit handling

 target/m68k/m68k-semi.c   | 4 ++--
 target/nios2/nios2-semi.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.8.1




Re: [Qemu-devel] [PATCH v2] target/riscv: Hardwire mcounter.TM and upper bits of [m|s]counteren

2019-08-21 Thread Palmer Dabbelt

On Wed, 14 Aug 2019 20:19:39 PDT (-0700), jonat...@fintelia.io wrote:

Ping! What is the status of this patch?


Sorry, I must have lost track of it.  I've added it to my patch queue.



On Wed, Jul 3, 2019 at 2:02 PM Jonathan Behrens 
wrote:


Bin, that proposal proved to be somewhat more controversial than I was
expecting, since it was different than how currently available hardware
worked. This option seemed much more likely to be accepted in the short
term.

Jonathan

On Mon, Jul 1, 2019 at 9:26 PM Bin Meng  wrote:


On Tue, Jul 2, 2019 at 8:20 AM Alistair Francis 
wrote:
>
> On Mon, Jul 1, 2019 at 8:56 AM  wrote:
> >
> > From: Jonathan Behrens 
> >
> > QEMU currently always triggers an illegal instruction exception when
> > > code attempts to read the time CSR. This is valid behavior, but only if
> > the TM bit in mcounteren is hardwired to zero. This change also
> > corrects mcounteren and scounteren CSRs to be 32-bits on both 32-bit
> > and 64-bit targets.
> >
> > Signed-off-by: Jonathan Behrens 
>
> Reviewed-by: Alistair Francis 
>

I am a little bit lost here. I think we agreed to allow direct reads
of the time CSR when mcounteren.TM is set, no?

Regards,
Bin
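
The behaviour discussed in this thread can be sketched as a small model. This is only an illustration, not the code from the patch: the bit positions (CY = bit 0, TM = bit 1, IR = bit 2) follow the RISC-V privileged specification, while the function names and the choice to keep only CY and IR writable are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counter-enable bits as laid out in the RISC-V privileged spec. */
#define COUNTEREN_CY (1u << 0)
#define COUNTEREN_TM (1u << 1)
#define COUNTEREN_IR (1u << 2)

/*
 * Hypothetical write handler: TM and the upper bits are treated as
 * hardwired to zero, so only CY and IR survive a write.
 */
static uint32_t counteren_write(uint32_t value)
{
    return value & (COUNTEREN_CY | COUNTEREN_IR);
}

/*
 * Hypothetical check for a read of the time CSR from a mode below M:
 * with TM clear, the read raises an illegal instruction exception.
 */
static bool time_csr_read_traps(uint32_t mcounteren)
{
    return (mcounteren & COUNTEREN_TM) == 0;
}
```

With TM hardwired to zero, a guest that writes all ones still reads back TM = 0, so always trapping on reads of the time CSR stays consistent with what the CSR reports.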







Re: [Qemu-devel] [PATCH v2 0/3] colo: Add support for continious replication

2019-08-21 Thread Dr. David Alan Gilbert
* Lukas Straub (lukasstra...@web.de) wrote:
> On Fri, 16 Aug 2019 01:51:20 +
> "Zhang, Chen"  wrote:
> 
> > > -Original Message-
> > > From: Lukas Straub [mailto:lukasstra...@web.de]
> > > Sent: Friday, August 16, 2019 3:48 AM
> > > To: Dr. David Alan Gilbert 
> > > Cc: qemu-devel ; Zhang, Chen
> > > ; Jason Wang ; Xie
> > > Changlong ; Wen Congyang
> > > 
> > > Subject: Re: [Qemu-devel] [PATCH v2 0/3] colo: Add support for continious
> > > replication
> > >
> > > On Thu, 15 Aug 2019 19:57:37 +0100
> > > "Dr. David Alan Gilbert"  wrote:
> > >
> > > > * Lukas Straub (lukasstra...@web.de) wrote:
> > > > > Hello Everyone,
> > > > > These patches add support for continuous replication to COLO.
> > > > > Please review.
> > > >
> > > >
> > > > OK, for those who haven't followed COLO for so long; 'continuous
> > > > replication' is when after the first primary fails, you can promote
> > > > the original secondary to a new primary and start replicating again;
> > > >
> > > > i.e. current COLO gives you
> > > >
> > > > p<->s
> > > > 
> > > > s
> > > >
> > > > with your patches you can do
> > > >
> > > > s becomes p2
> > > > p2<->s2
> > > >
> > > > and you're back to being resilient again.
> > > >
> > > > Which is great; because that was always an important missing piece.
> > > >
> > > > Do you have some test scripts/setup for this - it would be great to
> > > > automate some testing.
> > >
> > > My plan is to write a Pacemaker Resource Agent[1] for qemu-colo and then
> > > do some long-term testing in my small cluster here. Writing standalone
> > > tests using that Resource Agent should be easy; it just needs to be
> > > provided with the right arguments and environment variables.

Could you update tests/test-replication.c to test the extra steps?

Dave

> > Thanks for Dave's explanation.
> > It looks good to me and I will test this series on my side.
> >
> > Another question: is the "Pacemaker Resource Agent[1]" like a heartbeat
> > module?
> 
> It's a bit more than that. Pacemaker itself is a Cluster Resource Manager;
> you can think of it like sysvinit but for clusters. It controls where in the
> cluster Resources run, in what state (master/slave), and what to do in case
> of a Node or Resource failure. Resources can be anything, like an SQL server,
> web server, VM, etc., and Pacemaker itself doesn't directly control them;
> that's the job of the Resource Agents. So a Resource Agent is like an
> init-script, but cluster-aware and with more actions like start, stop,
> monitor, promote (to master) or migrate-to.
> 
> > I have written an internal heartbeat module that runs in QEMU. It lets COLO
> > detect failures and trigger failover automatically, with no need for an
> > external application to call the QMP command "x-colo-lost-heartbeat". If you
> > need it, I can send an RFC version soon.
> 
> Cool, this should be faster to failover than with Pacemaker.
> What is the plan with cases like Primary-failover, which need to issue 
> multiple commands?
> 
> > Thanks
> > Zhang Chen
> > >
> > > Regards,
> > > Lukas Straub
> > >
> > > [1] 
> > > https://github.com/ClusterLabs/resource-agents/blob/master/doc/dev-guides/ra-dev-guide.asc#what-is-a-resource-agent
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [RFC PATCH v4 64/75] target/i386: introduce AVX2 vector instructions to sse-opcode.inc.h

2019-08-21 Thread Jan Bobek
Add all the AVX2 vector instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek 
---
 target/i386/sse-opcode.inc.h | 362 ++-
 1 file changed, 359 insertions(+), 3 deletions(-)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index c3c0ec4f89..abbb0a15d7 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -855,6 +855,181 @@
  * VEX.128.66.0F.WIG 73 /3 ib  VPSRLDQ xmm1, xmm2, imm8
  * VEX.LZ.0F.WIG AE /2 VLDMXCSR m32
  * VEX.LZ.0F.WIG AE /3 VSTMXCSR m32
+ *
+ * AVX2 Instructions
+ * --
+ * VEX.256.66.0F.W0 D7 /r  VPMOVMSKB r32, ymm1
+ * VEX.256.66.0F.W1 D7 /r  VPMOVMSKB r64, ymm1
+ * VEX.256.66.0F.WIG FC /r VPADDB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG FD /r VPADDW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG FE /r VPADDD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG D4 /r VPADDQ ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG EC /r VPADDSB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG ED /r VPADDSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG DC /r VPADDUSB ymm1,ymm2,ymm3/m256
+ * VEX.256.66.0F.WIG DD /r VPADDUSW ymm1,ymm2,ymm3/m256
+ * VEX.256.66.0F38.WIG 01 /r   VPHADDW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 02 /r   VPHADDD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 03 /r   VPHADDSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG F8 /r VPSUBB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG F9 /r VPSUBW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG FA /r VPSUBD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG FB /r VPSUBQ ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG E8 /r VPSUBSB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG E9 /r VPSUBSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG D8 /r VPSUBUSB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG D9 /r VPSUBUSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 05 /r   VPHSUBW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 06 /r   VPHSUBD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 07 /r   VPHSUBSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG D5 /r VPMULLW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 40 /r   VPMULLD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG E5 /r VPMULHW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG E4 /r VPMULHUW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 28 /r   VPMULDQ ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG F4 /r VPMULUDQ ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 0B /r   VPMULHRSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG F5 /r VPMADDWD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 04 /r   VPMADDUBSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F DA /r VPMINUB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38 3A /r   VPMINUW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 3B /r   VPMINUD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38 38 /r   VPMINSB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F EA /r VPMINSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 39 /r   VPMINSD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F DE /r VPMAXUB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38 3E /r   VPMAXUW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 3F /r   VPMAXUD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 3C /r   VPMAXSB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG EE /r VPMAXSW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 3D /r   VPMAXSD ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG E0 /r VPAVGB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG E3 /r VPAVGW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG F6 /r VPSADBW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F3A.WIG 42 /r ibVMPSADBW ymm1, ymm2, ymm3/m256, imm8
+ * VEX.256.66.0F38.WIG 1C /r   VPABSB ymm1, ymm2/m256
+ * VEX.256.66.0F38.WIG 1D /r   VPABSW ymm1, ymm2/m256
+ * VEX.256.66.0F38.WIG 1E /r   VPABSD ymm1, ymm2/m256
+ * VEX.256.66.0F38.WIG 08 /r   VPSIGNB ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 09 /r   VPSIGNW ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F38.WIG 0A /r   VPSIGND ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG 74 /r VPCMPEQB ymm1,ymm2,ymm3/m256
+ * VEX.256.66.0F.WIG 75 /r VPCMPEQW ymm1,ymm2,ymm3/m256
+ * VEX.256.66.0F.WIG 76 /r VPCMPEQD ymm1,ymm2,ymm3/m256
+ * VEX.256.66.0F38.WIG 29 /r   VPCMPEQQ ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG 64 /r VPCMPGTB ymm1,ymm2,ymm3/m256
+ * VEX.256.66.0F.WIG 65 /r VPCMPGTW ymm1,ymm2,ymm3/m256
+ * VEX.256.66.0F.WIG 66 /r VPCMPGTD ymm1,ymm2,ymm3/m256
+ * VEX.256.66.0F38.WIG 37 /r   VPCMPGTQ ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG DB /r VPAND ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG DF /r VPANDN ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG EB /r VPOR ymm1, ymm2, ymm3/m256
+ * VEX.256.66.0F.WIG EF /r VPXOR ymm1, ymm2, 

Re: [Qemu-devel] [PATCH v1 2/4] s390x/tcg: Introduce probe_read_access()

2019-08-21 Thread David Hildenbrand
On 21.08.19 19:26, Richard Henderson wrote:
> On 8/21/19 2:22 AM, David Hildenbrand wrote:
>> +/*
>> + * Make sure the read access is permitted and TLB entries are created. In
>> + * very rare cases it might happen that the actual accesses might need
>> + * new MMU translations. If the page tables were changed in between, we
>> + * might still trigger a fault. However, this seems to barely happen, so we
>> + * can ignore this for now.
>> + */
>> +void probe_read_access(CPUS390XState *env, uint64_t addr, uint64_t len,
>> +   uintptr_t ra)
>> +{
>> +#ifdef CONFIG_USER_ONLY
>> +if (!guest_addr_valid(addr) || !guest_addr_valid(addr + len - 1) ||
>> +page_check_range(addr, len, PAGE_READ) < 0) {
>> +s390_program_interrupt(env, PGM_ADDRESSING, ILEN_AUTO, ra);
>> +}
>> +#else
>> +while (len) {
>> +const uint64_t pagelen = -(addr | TARGET_PAGE_MASK);
>> +const uint64_t curlen = MIN(pagelen, len);
>> +
>> +cpu_ldub_data_ra(env, addr, ra);
>> +addr = wrap_address(env, addr + curlen);
>> +len -= curlen;
>> +}
>> +#endif
>> +}
> 
> I don't think this is really the right approach, precisely because of the
> comment above.
> 
> I think we should
> 
> (1) Modify the generic probe_write to return the host address,
> akin to tlb_vaddr_to_host except it *will* fault.
> 
> (2) Create a generic version of probe_write for CONFIG_USER_ONLY,
> much like the one you have done for target/s390x.
> 
> (3) Create generic version of probe_read that does the same.
> 
> (4) Rewrite fast_memset and fast_memmove to fetch all of the host
> addresses before doing any modifications.  The functions are
> currently written as if len can be very large, handling any
> number of pages.  Except that's not true.  While there are
> several kinds of users apart from MVC, two pages are sufficient
> for all users.
> 
> Well, should be.  We would need to adjust do_mvcl to limit the
> operation to TARGET_PAGE_SIZE (CC=3, cpu-determined number of
> bytes moved without reaching end of first operand).
> Which is probably a good idea anyway.  System mode should not
> spend forever executing one instruction, as it would if you
> pass in a 64-bit length from MVCLE.
> 

Related to that: yes, that's what I mentioned in the cover letter, the
MOVE variants are full of issues. MVCLE should be limited to 4096 bytes,
and we should return cc=3. However, MVCL (which also uses do_mvcl) is
also semi-broken (it's an interruptible instruction - not cc=3, we have
to update the register step by step). MVCL should return cc=3 in case
it's a destructive copy - also not implemented properly. mem_helpers.c
is full of issues.

-- 

Thanks,

David / dhildenb
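
For readers following the page-splitting loop in the quoted probe_read_access(), here is a minimal standalone sketch of the "bytes until the end of the page" idiom. It assumes a 4 KiB page and a mask defined as ~(page size - 1); the SKETCH_* macros and both function names are made up for this example and are not QEMU identifiers.

```c
#include <stdint.h>

/* Assumed page geometry for the sketch: 4 KiB pages. */
#define SKETCH_PAGE_SIZE UINT64_C(4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Number of bytes from addr up to and including the last byte of its page. */
static uint64_t bytes_to_page_end(uint64_t addr)
{
    /* OR-ing in the mask sets all page-number bits; negating the result
     * leaves the distance to the next page boundary. */
    return -(addr | SKETCH_PAGE_MASK);
}

/* Count how many per-page pieces an access of len bytes at addr splits into. */
static unsigned count_page_chunks(uint64_t addr, uint64_t len)
{
    unsigned chunks = 0;

    while (len) {
        const uint64_t pagelen = bytes_to_page_end(addr);
        const uint64_t curlen = pagelen < len ? pagelen : len;

        addr += curlen;
        len -= curlen;
        chunks++;
    }
    return chunks;
}
```

Each iteration stays within one page, which is what lets a per-page probe or load stand in for the whole range. Address wrapping, which the quoted code handles with wrap_address(), is left out of the sketch.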



[Qemu-devel] [RFC PATCH v4 63/75] target/i386: introduce AVX2 code generators

2019-08-21 Thread Jan Bobek
Introduce code generators required by AVX2 instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 407 ++--
 1 file changed, 395 insertions(+), 12 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 3f4bb40932..3149989d68 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4946,6 +4946,11 @@ DEF_INSNOP_ALIAS(Mhq, M)
 DEF_INSNOP_ALIAS(Mdq, M)
 DEF_INSNOP_ALIAS(Mqq, M)
 
+DEF_INSNOP_ALIAS(MDdq, M)
+DEF_INSNOP_ALIAS(MDqq, M)
+DEF_INSNOP_ALIAS(MQdq, M)
+DEF_INSNOP_ALIAS(MQqq, M)
+
 /*
  * 32-bit general register operands
  */
@@ -5907,6 +5912,14 @@ GEN_INSN2(vpmovmskb, Gq, Udq)
 tcg_gen_extu_i32_i64(arg1, arg1_r32);
 tcg_temp_free_i32(arg1_r32);
 }
+DEF_GEN_INSN2_HELPER_DEP(vpmovmskb, pmovmskb_xmm, Gd, Uqq)
+GEN_INSN2(vpmovmskb, Gq, Uqq)
+{
+const TCGv_i32 arg1_r32 = tcg_temp_new_i32();
+gen_insn2(vpmovmskb, Gd, Uqq)(env, s, arg1_r32, arg2);
+tcg_gen_extu_i32_i64(arg1, arg1_r32);
+tcg_temp_free_i32(arg1_r32);
+}
 
 DEF_GEN_INSN2_HELPER_DEP(movmskps, movmskps, Gd, Udq)
 GEN_INSN2(movmskps, Gq, Udq)
@@ -6049,27 +6062,35 @@ GEN_INSN2(vmovddup, Vqq, Wqq)
 DEF_GEN_INSN3_GVEC(paddb, Pq, Pq, Qq, add, MM_OPRSZ, MM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(paddb, Vdq, Vdq, Wdq, add, XMM_OPRSZ, XMM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(vpaddb, Vdq, Hdq, Wdq, add, XMM_OPRSZ, XMM_MAXSZ, MO_8)
+DEF_GEN_INSN3_GVEC(vpaddb, Vqq, Hqq, Wqq, add, XMM_OPRSZ, XMM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(paddw, Pq, Pq, Qq, add, MM_OPRSZ, MM_MAXSZ, MO_16)
 DEF_GEN_INSN3_GVEC(paddw, Vdq, Vdq, Wdq, add, XMM_OPRSZ, XMM_MAXSZ, MO_16)
 DEF_GEN_INSN3_GVEC(vpaddw, Vdq, Hdq, Wdq, add, XMM_OPRSZ, XMM_MAXSZ, MO_16)
+DEF_GEN_INSN3_GVEC(vpaddw, Vqq, Hqq, Wqq, add, XMM_OPRSZ, XMM_MAXSZ, MO_16)
 DEF_GEN_INSN3_GVEC(paddd, Pq, Pq, Qq, add, MM_OPRSZ, MM_MAXSZ, MO_32)
 DEF_GEN_INSN3_GVEC(paddd, Vdq, Vdq, Wdq, add, XMM_OPRSZ, XMM_MAXSZ, MO_32)
 DEF_GEN_INSN3_GVEC(vpaddd, Vdq, Hdq, Wdq, add, XMM_OPRSZ, XMM_MAXSZ, MO_32)
+DEF_GEN_INSN3_GVEC(vpaddd, Vqq, Hqq, Wqq, add, XMM_OPRSZ, XMM_MAXSZ, MO_32)
 DEF_GEN_INSN3_GVEC(paddq, Pq, Pq, Qq, add, MM_OPRSZ, MM_MAXSZ, MO_64)
 DEF_GEN_INSN3_GVEC(paddq, Vdq, Vdq, Wdq, add, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN3_GVEC(vpaddq, Vdq, Hdq, Wdq, add, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN3_GVEC(vpaddq, Vqq, Hqq, Wqq, add, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN3_GVEC(paddsb, Pq, Pq, Qq, ssadd, MM_OPRSZ, MM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(paddsb, Vdq, Vdq, Wdq, ssadd, XMM_OPRSZ, XMM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(vpaddsb, Vdq, Hdq, Wdq, ssadd, XMM_OPRSZ, XMM_MAXSZ, MO_8)
+DEF_GEN_INSN3_GVEC(vpaddsb, Vqq, Hqq, Wqq, ssadd, XMM_OPRSZ, XMM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(paddsw, Pq, Pq, Qq, ssadd, MM_OPRSZ, MM_MAXSZ, MO_16)
 DEF_GEN_INSN3_GVEC(paddsw, Vdq, Vdq, Wdq, ssadd, XMM_OPRSZ, XMM_MAXSZ, MO_16)
 DEF_GEN_INSN3_GVEC(vpaddsw, Vdq, Hdq, Wdq, ssadd, XMM_OPRSZ, XMM_MAXSZ, MO_16)
+DEF_GEN_INSN3_GVEC(vpaddsw, Vqq, Hqq, Wqq, ssadd, XMM_OPRSZ, XMM_MAXSZ, MO_16)
 DEF_GEN_INSN3_GVEC(paddusb, Pq, Pq, Qq, usadd, MM_OPRSZ, MM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(paddusb, Vdq, Vdq, Wdq, usadd, XMM_OPRSZ, XMM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(vpaddusb, Vdq, Hdq, Wdq, usadd, XMM_OPRSZ, XMM_MAXSZ, MO_8)
+DEF_GEN_INSN3_GVEC(vpaddusb, Vqq, Hqq, Wqq, usadd, XMM_OPRSZ, XMM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(paddusw, Pq, Pq, Qq, usadd, MM_OPRSZ, MM_MAXSZ, MO_16)
 DEF_GEN_INSN3_GVEC(paddusw, Vdq, Vdq, Wdq, usadd, XMM_OPRSZ, XMM_MAXSZ, MO_16)
 DEF_GEN_INSN3_GVEC(vpaddusw, Vdq, Hdq, Wdq, usadd, XMM_OPRSZ, XMM_MAXSZ, MO_16)
+DEF_GEN_INSN3_GVEC(vpaddusw, Vqq, Hqq, Wqq, usadd, XMM_OPRSZ, XMM_MAXSZ, MO_16)
 DEF_GEN_INSN3_HELPER_EPP(addps, addps, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vaddps, addps, Vdq, Hdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vaddps, addps, Vqq, Hqq, Wqq)
@@ -6083,12 +6104,15 @@ DEF_GEN_INSN3_HELPER_EPP(vaddsd, addsd, Vq, Hq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(phaddw, phaddw_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(phaddw, phaddw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vphaddw, phaddw_xmm, Vdq, Hdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(vphaddw, phaddw_xmm, Vqq, Hqq, Wqq)
 DEF_GEN_INSN3_HELPER_EPP(phaddd, phaddd_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(phaddd, phaddd_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vphaddd, phaddd_xmm, Vdq, Hdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(vphaddd, phaddd_xmm, Vqq, Hqq, Wqq)
 DEF_GEN_INSN3_HELPER_EPP(phaddsw, phaddsw_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(phaddsw, phaddsw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vphaddsw, phaddsw_xmm, Vdq, Hdq, Wdq)
+DEF_GEN_INSN3_HELPER_EPP(vphaddsw, phaddsw_xmm, Vqq, Hqq, Wqq)
 DEF_GEN_INSN3_HELPER_EPP(haddps, haddps, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vhaddps, haddps, Vdq, Hdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vhaddps, haddps, Vqq, Hqq, Wqq)
@@ -6099,27 +6123,35 @@ DEF_GEN_INSN3_HELPER_EPP(vhaddpd, haddpd, Vqq, Hqq, Wqq)
 DEF_GEN_INSN3_GVEC(psubb, Pq, Pq, Qq, sub, MM_OPRSZ, MM_MAXSZ, MO_8)
 DEF_GEN_INSN3_GVEC(psubb, Vdq, Vdq, Wdq, sub, 

[Qemu-devel] [RFC PATCH v4 73/75] target/i386: remove obsoleted helper_mov(l, q)_mm_T0

2019-08-21 Thread Jan Bobek
This helper has been obsoleted by the new code.

Signed-off-by: Jan Bobek 
---
 target/i386/ops_sse.h| 19 ---
 target/i386/ops_sse_header.h |  4 
 target/i386/translate.c  | 33 -
 3 files changed, 56 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index b866ead1c8..8172324e34 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -550,25 +550,6 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg 
*a, Reg *b,
 }
 }
 
-void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val)
-{
-d->L(0) = val;
-d->L(1) = 0;
-#if SHIFT == 1
-d->Q(1) = 0;
-#endif
-}
-
-#ifdef TARGET_X86_64
-void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val)
-{
-d->Q(0) = val;
-#if SHIFT == 1
-d->Q(1) = 0;
-#endif
-}
-#endif
-
 #if SHIFT == 0
 void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order)
 {
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index ec7d1fc686..ee8bd4c1af 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -76,10 +76,6 @@ DEF_HELPER_4(glue(pmaddwd, SUFFIX), void, Reg, Reg, Reg, i32)
 
 DEF_HELPER_4(glue(psadbw, SUFFIX), void, Reg, Reg, Reg, i32)
 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl)
-DEF_HELPER_2(glue(movl_mm_T0, SUFFIX), void, Reg, i32)
-#ifdef TARGET_X86_64
-DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64)
-#endif
 
 #if SHIFT == 0
 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int)
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 6bffbaee4c..3554086336 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -3079,39 +3079,6 @@ static void gen_sse(CPUX86State *env, DisasContext *s, 
int b)
 gen_op_st_v(s, MO_32, s->T0, s->A0);
 }
 break;
-case 0x6e: /* movd mm, ea */
-#ifdef TARGET_X86_64
-if (s->dflag == MO_64) {
-gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
-tcg_gen_st_tl(s->T0, cpu_env,
-  offsetof(CPUX86State, fpregs[reg].mmx));
-} else
-#endif
-{
-gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 0);
-tcg_gen_addi_ptr(s->ptr0, cpu_env,
- offsetof(CPUX86State,fpregs[reg].mmx));
-tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
-gen_helper_movl_mm_T0_mmx(s->ptr0, s->tmp2_i32);
-}
-break;
-case 0x16e: /* movd xmm, ea */
-#ifdef TARGET_X86_64
-if (s->dflag == MO_64) {
-gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
-tcg_gen_addi_ptr(s->ptr0, cpu_env,
- offsetof(CPUX86State,xmm_regs[reg]));
-gen_helper_movq_mm_T0_xmm(s->ptr0, s->T0);
-} else
-#endif
-{
-gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 0);
-tcg_gen_addi_ptr(s->ptr0, cpu_env,
- offsetof(CPUX86State,xmm_regs[reg]));
-tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
-gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32);
-}
-break;
 case 0x6f: /* movq mm, ea */
 if (mod != 3) {
 gen_lea_modrm(env, s, modrm);
-- 
2.20.1




[Qemu-devel] [PATCH V2 1/2] target/nios2: Fix bug in semihosted exit handling

2019-08-21 Thread Sandra Loosemore
This patch fixes a bug that caused semihosted exit to always return
status 0; it was incorrectly using the value of register R_ARG0 (which
contains the HOSTED_EXIT request number) instead of register R_ARG1.

Note that per the newlib documentation for the nios2 semihosting protocol

https://www.sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;a=blob;f=libgloss/nios2/nios2-semi.txt;h=ded3a093c03dbae84cb95b4cd45bc3e0d751eda2;hb=HEAD

for the HOSTED_EXIT syscall the parameter is passed directly in the register
instead of in a parameter block pointed to by the register.

Signed-off-by: Sandra Loosemore 
Reviewed-by: Laurent Vivier 
---
 target/nios2/nios2-semi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/nios2/nios2-semi.c b/target/nios2/nios2-semi.c
index d7a80dd..06c0861 100644
--- a/target/nios2/nios2-semi.c
+++ b/target/nios2/nios2-semi.c
@@ -215,8 +215,8 @@ void do_nios2_semihosting(CPUNios2State *env)
 args = env->regs[R_ARG1];
 switch (nr) {
 case HOSTED_EXIT:
-gdb_exit(env, env->regs[R_ARG0]);
-exit(env->regs[R_ARG0]);
+gdb_exit(env, env->regs[R_ARG1]);
+exit(env->regs[R_ARG1]);
 case HOSTED_OPEN:
 GET_ARG(0);
 GET_ARG(1);
-- 
2.8.1
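
The effect of the fix can be shown with a small standalone model. This is only a sketch: the register indices, the array-based register file and the value chosen for HOSTED_EXIT are assumptions for illustration, not the definitions from the QEMU sources.

```c
#include <stdint.h>

/* Hypothetical register indices standing in for R_ARG0 and R_ARG1. */
enum { SKETCH_ARG0 = 0, SKETCH_ARG1 = 1, SKETCH_NUM_REGS = 2 };

/* Illustrative request number for the exit call (assumed value). */
enum { SKETCH_HOSTED_EXIT = 0 };

/*
 * Return the status a semihosted exit should report. The first argument
 * register carries the request number, so the status has to come from the
 * second one, which holds the value itself rather than a pointer to a
 * parameter block.
 */
static uint32_t semihosted_exit_status(const uint32_t regs[SKETCH_NUM_REGS])
{
    return regs[SKETCH_ARG1];
}
```

Reading the first argument register instead would report the request number as the exit status, whatever value the guest passed.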




[Qemu-devel] [RFC PATCH v4 72/75] target/i386: convert psadbw helper to gvec style

2019-08-21 Thread Jan Bobek
Make these helpers suitable for use with tcg_gen_gvec_* functions.

Signed-off-by: Jan Bobek 
---
 target/i386/ops_sse.h| 64 +++-
 target/i386/ops_sse_header.h |  2 +-
 target/i386/translate.c  |  9 +++--
 3 files changed, 32 insertions(+), 43 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 384a835662..b866ead1c8 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -412,6 +412,15 @@ static inline int satsw(int x)
 }
 }
 
+static inline int abs1(int x)
+{
+if (x < 0) {
+return -x;
+} else {
+return x;
+}
+}
+
 #define FMULHRW(a, b) (((int16_t)(a) * (int16_t)(b) + 0x8000) >> 16)
 #endif
 
@@ -510,52 +519,33 @@ void glue(helper_pmaddwd, SUFFIX)(Reg *d, Reg *a, Reg *b, 
uint32_t desc)
 glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-#if SHIFT == 0
-static inline int abs1(int a)
+void glue(helper_psadbw, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
 {
-if (a < 0) {
-return -a;
-} else {
-return a;
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+
+for (intptr_t i = 0; i * sizeof(uint64_t) < oprsz; ++i) {
+const uint64_t t0 = abs1(a->B(8 * i + 0) - b->B(8 * i + 0));
+const uint64_t t1 = abs1(a->B(8 * i + 1) - b->B(8 * i + 1));
+const uint64_t t2 = abs1(a->B(8 * i + 2) - b->B(8 * i + 2));
+const uint64_t t3 = abs1(a->B(8 * i + 3) - b->B(8 * i + 3));
+const uint64_t t4 = abs1(a->B(8 * i + 4) - b->B(8 * i + 4));
+const uint64_t t5 = abs1(a->B(8 * i + 5) - b->B(8 * i + 5));
+const uint64_t t6 = abs1(a->B(8 * i + 6) - b->B(8 * i + 6));
+const uint64_t t7 = abs1(a->B(8 * i + 7) - b->B(8 * i + 7));
+d->Q(i) = t0 + t1 + t2 + t3 + t4 + t5 + t6 + t7;
 }
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
-#endif
-void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-unsigned int val;
 
-val = 0;
-val += abs1(d->B(0) - s->B(0));
-val += abs1(d->B(1) - s->B(1));
-val += abs1(d->B(2) - s->B(2));
-val += abs1(d->B(3) - s->B(3));
-val += abs1(d->B(4) - s->B(4));
-val += abs1(d->B(5) - s->B(5));
-val += abs1(d->B(6) - s->B(6));
-val += abs1(d->B(7) - s->B(7));
-d->Q(0) = val;
-#if SHIFT == 1
-val = 0;
-val += abs1(d->B(8) - s->B(8));
-val += abs1(d->B(9) - s->B(9));
-val += abs1(d->B(10) - s->B(10));
-val += abs1(d->B(11) - s->B(11));
-val += abs1(d->B(12) - s->B(12));
-val += abs1(d->B(13) - s->B(13));
-val += abs1(d->B(14) - s->B(14));
-val += abs1(d->B(15) - s->B(15));
-d->Q(1) = val;
-#endif
-}
-
-void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *a, Reg *b,
   target_ulong a0)
 {
 int i;
 
 for (i = 0; i < (8 << SHIFT); i++) {
-if (s->B(i) & 0x80) {
-cpu_stb_data_ra(env, a0 + i, d->B(i), GETPC());
+if (b->B(i) & 0x80) {
+cpu_stb_data_ra(env, a0 + i, a->B(i), GETPC());
 }
 }
 }
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 18d39ca649..ec7d1fc686 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -74,7 +74,7 @@ DEF_HELPER_4(glue(pavgw, SUFFIX), void, Reg, Reg, Reg, i32)
 DEF_HELPER_4(glue(pmuludq, SUFFIX), void, Reg, Reg, Reg, i32)
 DEF_HELPER_4(glue(pmaddwd, SUFFIX), void, Reg, Reg, Reg, i32)
 
-DEF_HELPER_3(glue(psadbw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(psadbw, SUFFIX), void, Reg, Reg, Reg, i32)
 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl)
 DEF_HELPER_2(glue(movl_mm_T0, SUFFIX), void, Reg, i32)
 #ifdef TARGET_X86_64
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 55607db09c..6bffbaee4c 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2806,7 +2806,6 @@ static const SSEFunc_0_epp sse_op_table1[256][4] = {
 [0xe6] = { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, 
gen_helper_cvtpd2dq },
 [0xe7] = { SSE_SPECIAL , SSE_SPECIAL },  /* movntq, movntq */
 [0xf0] = { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */
-[0xf6] = MMX_OP2(psadbw),
 [0xf7] = { (SSEFunc_0_epp)gen_helper_maskmov_mmx,
(SSEFunc_0_epp)gen_helper_maskmov_xmm }, /* XXX: casts */
 };
@@ -6256,10 +6255,10 @@ DEF_GEN_INSN3_GVEC(pavgw, Pq, Pq, Qq, 3_ool, MM_OPRSZ, 
MM_MAXSZ, pavgw_mmx)
 DEF_GEN_INSN3_GVEC(pavgw, Vdq, Vdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, 
pavgw_xmm)
 DEF_GEN_INSN3_GVEC(vpavgw, Vdq, Hdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, 
pavgw_xmm)
 DEF_GEN_INSN3_GVEC(vpavgw, Vqq, Hqq, Wqq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, 
pavgw_xmm)
-DEF_GEN_INSN3_HELPER_EPP(psadbw, psadbw_mmx, Pq, Pq, Qq)
-DEF_GEN_INSN3_HELPER_EPP(psadbw, psadbw_xmm, Vdq, Vdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpsadbw, psadbw_xmm, Vdq, Hdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpsadbw, psadbw_xmm, Vqq, Hqq, 
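
As an aside, the per-lane computation performed by the converted psadbw helper can be modelled outside QEMU. The sketch below is an assumption-laden illustration: plain arrays replace the Reg accessors, the operation size is passed directly instead of being decoded from a descriptor, and the clearing of high bytes is omitted. The function name is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * For every 8-byte lane inside the first oprsz bytes, sum the absolute
 * differences of the unsigned bytes of a and b and store the sum as a
 * 64-bit result in d.
 */
static void sad_u8_lanes(uint64_t *d, const uint8_t *a, const uint8_t *b,
                         size_t oprsz)
{
    for (size_t i = 0; i * sizeof(uint64_t) < oprsz; ++i) {
        uint64_t sum = 0;

        for (size_t j = 0; j < 8; ++j) {
            /* Widen to int first so the subtraction cannot wrap. */
            const int diff = (int)a[8 * i + j] - (int)b[8 * i + j];

            sum += (uint64_t)(diff < 0 ? -diff : diff);
        }
        d[i] = sum;
    }
}
```

Looping over lanes up to the operation size is what allows a single helper body to serve 8-byte, 16-byte and wider operands.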

Re: [Qemu-devel] [PATCH 1/2] target/nios2: Fix bug in semihosted exit handling

2019-08-21 Thread Sandra Loosemore

On 8/21/19 9:41 AM, Laurent Vivier wrote:


Could you add this information to the commit message of each patch?


Sure.  V2 of the patches coming up shortly.

-Sandra



[Qemu-devel] [RFC PATCH v4 71/75] target/i386: convert pmuludq/pmaddwd helpers to gvec style

2019-08-21 Thread Jan Bobek
Make these helpers suitable for use with tcg_gen_gvec_* functions.
---
 target/i386/ops_sse.h| 27 +--
 target/i386/ops_sse_header.h |  4 ++--
 target/i386/translate.c  | 18 --
 3 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 1661bd7c64..384a835662 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -485,22 +485,29 @@ void glue(helper_pavgw, SUFFIX)(Reg *d, Reg *a, Reg *b, 
uint32_t desc)
 glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmuludq, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
 {
-d->Q(0) = (uint64_t)s->L(0) * (uint64_t)d->L(0);
-#if SHIFT == 1
-d->Q(1) = (uint64_t)s->L(2) * (uint64_t)d->L(2);
-#endif
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+
+for (intptr_t i = 0; i * sizeof(uint64_t) < oprsz; ++i) {
+const uint64_t t = (uint64_t)a->L(2 * i) * (uint64_t)b->L(2 * i);
+d->Q(i) = t;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmaddwd, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
 {
-int i;
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
 
-for (i = 0; i < (2 << SHIFT); i++) {
-d->L(i) = (int16_t)s->W(2 * i) * (int16_t)d->W(2 * i) +
-(int16_t)s->W(2 * i + 1) * (int16_t)d->W(2 * i + 1);
+for (intptr_t i = 0; i * sizeof(uint32_t) < oprsz; ++i) {
+const int32_t t0 = (int16_t)a->W(2 * i + 0) * (int16_t)b->W(2 * i + 0);
+const int32_t t1 = (int16_t)a->W(2 * i + 1) * (int16_t)b->W(2 * i + 1);
+d->L(i) = t0 + t1;
 }
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
 #if SHIFT == 0
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index b5e8aae897..18d39ca649 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -71,8 +71,8 @@ DEF_HELPER_3(glue(pavgusb, SUFFIX), void, env, Reg, Reg)
 #endif
 DEF_HELPER_4(glue(pavgw, SUFFIX), void, Reg, Reg, Reg, i32)
 
-DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pmuludq, SUFFIX), void, Reg, Reg, Reg, i32)
+DEF_HELPER_4(glue(pmaddwd, SUFFIX), void, Reg, Reg, Reg, i32)
 
 DEF_HELPER_3(glue(psadbw, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl)
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 77b2e18f34..55607db09c 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2806,8 +2806,6 @@ static const SSEFunc_0_epp sse_op_table1[256][4] = {
 [0xe6] = { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_helper_cvtpd2dq },
 [0xe7] = { SSE_SPECIAL , SSE_SPECIAL },  /* movntq, movntq */
 [0xf0] = { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */
-[0xf4] = MMX_OP2(pmuludq),
-[0xf5] = MMX_OP2(pmaddwd),
 [0xf6] = MMX_OP2(psadbw),
 [0xf7] = { (SSEFunc_0_epp)gen_helper_maskmov_mmx,
(SSEFunc_0_epp)gen_helper_maskmov_xmm }, /* XXX: casts */
@@ -6129,10 +6127,10 @@ DEF_GEN_INSN3_GVEC(vpmulhuw, Vqq, Hqq, Wqq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pmulhuw
 DEF_GEN_INSN3_HELPER_EPP(pmuldq, pmuldq_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vpmuldq, pmuldq_xmm, Vdq, Hdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vpmuldq, pmuldq_xmm, Vqq, Hqq, Wqq)
-DEF_GEN_INSN3_HELPER_EPP(pmuludq, pmuludq_mmx, Pq, Pq, Qq)
-DEF_GEN_INSN3_HELPER_EPP(pmuludq, pmuludq_xmm, Vdq, Vdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpmuludq, pmuludq_xmm, Vdq, Hdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpmuludq, pmuludq_xmm, Vqq, Hqq, Wqq)
+DEF_GEN_INSN3_GVEC(pmuludq, Pq, Pq, Qq, 3_ool, MM_OPRSZ, MM_MAXSZ, pmuludq_mmx)
+DEF_GEN_INSN3_GVEC(pmuludq, Vdq, Vdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pmuludq_xmm)
+DEF_GEN_INSN3_GVEC(vpmuludq, Vdq, Hdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pmuludq_xmm)
+DEF_GEN_INSN3_GVEC(vpmuludq, Vqq, Hqq, Wqq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pmuludq_xmm)
 DEF_GEN_INSN3_HELPER_EPP(pmulhrsw, pmulhrsw_mmx, Pq, Pq, Qq)
 DEF_GEN_INSN3_HELPER_EPP(pmulhrsw, pmulhrsw_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vpmulhrsw, pmulhrsw_xmm, Vdq, Hdq, Wdq)
@@ -6147,10 +6145,10 @@ DEF_GEN_INSN3_HELPER_EPP(mulss, mulss, Vd, Vd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(vmulss, mulss, Vd, Hd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(mulsd, mulsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(vmulsd, mulsd, Vq, Hq, Wq)
-DEF_GEN_INSN3_HELPER_EPP(pmaddwd, pmaddwd_mmx, Pq, Pq, Qq)
-DEF_GEN_INSN3_HELPER_EPP(pmaddwd, pmaddwd_xmm, Vdq, Vdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpmaddwd, pmaddwd_xmm, Vdq, Hdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpmaddwd, pmaddwd_xmm, Vqq, Hqq, Wqq)
+DEF_GEN_INSN3_GVEC(pmaddwd, Pq, Pq, Qq, 3_ool, MM_OPRSZ, MM_MAXSZ, pmaddwd_mmx)
+DEF_GEN_INSN3_GVEC(pmaddwd, Vdq, Vdq, Wdq, 3_ool, XMM_OPRSZ, 

Re: [Qemu-devel] [PATCH v3 0/4] iotests: use python logging

2019-08-21 Thread John Snow



On 8/20/19 8:10 PM, no-re...@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20190820235243.26092-1-js...@redhat.com/
> 
> 
> 
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Type: series
> Subject: [Qemu-devel] [PATCH v3 0/4] iotests: use python logging
> Message-id: 20190820235243.26092-1-js...@redhat.com
> 

I have to remember that git-publish apparently does not "save" my
setting for adding my Signed-off-by unless I explicitly request it.

Sorry about that. You may assume:

Signed-off-by: John Snow 

for all patches in this series.



[Qemu-devel] [RFC PATCH v4 66/75] target/i386: cleanup leftovers in ops_sse_header.h

2019-08-21 Thread Jan Bobek
Get rid of unused macro definitions that have been left over after
removal of obsoleted helpers.
---
 target/i386/ops_sse_header.h | 28 ++--
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index d8e33dff6b..afa0ad0938 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -48,27 +48,15 @@ DEF_HELPER_3(glue(psrldq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pslldq, SUFFIX), void, env, Reg, Reg)
 #endif
 
-#define SSE_HELPER_B(name, F)\
-DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
-
-#define SSE_HELPER_W(name, F)\
-DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
-
-#define SSE_HELPER_L(name, F)\
-DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
-
-#define SSE_HELPER_Q(name, F)\
-DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
-
-SSE_HELPER_W(pmullw, FMULLW)
+DEF_HELPER_3(glue(pmullw, SUFFIX), void, env, Reg, Reg)
 #if SHIFT == 0
-SSE_HELPER_W(pmulhrw, FMULHRW)
+DEF_HELPER_3(glue(pmulhrw, SUFFIX), void, env, Reg, Reg)
 #endif
-SSE_HELPER_W(pmulhuw, FMULHUW)
-SSE_HELPER_W(pmulhw, FMULHW)
+DEF_HELPER_3(glue(pmulhuw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(pmulhw, SUFFIX), void, env, Reg, Reg)
 
-SSE_HELPER_B(pavgb, FAVG)
-SSE_HELPER_W(pavgw, FAVG)
+DEF_HELPER_3(glue(pavgb, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_3(glue(pavgw, SUFFIX), void, env, Reg, Reg)
 
 DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg)
@@ -311,10 +299,6 @@ DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32)
 #undef Reg
 #undef SUFFIX
 
-#undef SSE_HELPER_B
-#undef SSE_HELPER_W
-#undef SSE_HELPER_L
-#undef SSE_HELPER_Q
 #undef SSE_HELPER_S
 #undef SSE_HELPER_CMP
 #undef UNPCK_OP
-- 
2.20.1




[Qemu-devel] [PATCH V2 2/2] target/m68k: Fix bug in semihosted exit handling

2019-08-21 Thread Sandra Loosemore
This patch fixes a bug that caused semihosted exit to always return
status 0; it was incorrectly using the value of D0 (which
contains the HOSTED_EXIT request number) instead of D1.

Note that per the newlib documentation for the m68k semihosting protocol

https://www.sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;a=blob;f=libgloss/m68k/m68k-semi.txt;h=50520c15292aa7edf7eef28e09fd9202ce75b153;hb=HEAD

for the HOSTED_EXIT syscall the parameter is passed directly in the register
instead of in a parameter block pointed to by the register.

Signed-off-by: Sandra Loosemore 
Reviewed-by: Laurent Vivier 
---
 target/m68k/m68k-semi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/m68k/m68k-semi.c b/target/m68k/m68k-semi.c
index 8e5fbfc..f189c92 100644
--- a/target/m68k/m68k-semi.c
+++ b/target/m68k/m68k-semi.c
@@ -194,8 +194,8 @@ void do_m68k_semihosting(CPUM68KState *env, int nr)
 args = env->dregs[1];
 switch (nr) {
 case HOSTED_EXIT:
-gdb_exit(env, env->dregs[0]);
-exit(env->dregs[0]);
+gdb_exit(env, env->dregs[1]);
+exit(env->dregs[1]);
 case HOSTED_OPEN:
 GET_ARG(0);
 GET_ARG(1);
-- 
2.8.1
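
For readers following along, the register convention described in the
commit message can be sketched in isolation. This is only an
illustration, not the QEMU code: ToyM68kState, TOY_HOSTED_EXIT and both
helper functions are hypothetical names, and the request number 0 is an
assumption inferred from the "always return status 0" symptom.

```c
#include <stdint.h>

/* Hypothetical stand-in for the data registers kept in the CPU state. */
typedef struct {
    uint32_t dregs[8];
} ToyM68kState;

/* Assumed request number for the exit call; D0 holds this value. */
enum { TOY_HOSTED_EXIT = 0 };

/* Old behaviour: reads D0, which only ever holds the request number. */
static inline int toy_exit_status_from_d0(const ToyM68kState *env)
{
    return (int)env->dregs[0];
}

/*
 * Fixed behaviour: for the exit call, D1 carries the status directly
 * rather than pointing at a parameter block in guest memory.
 */
static inline int toy_exit_status_from_d1(const ToyM68kState *env)
{
    return (int)env->dregs[1];
}
```

With D0 set to the request number and D1 set to the status the guest
program wants to report, only the D1 variant hands that status back to
the host.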




[Qemu-devel] [RFC PATCH v4 61/75] target/i386: introduce AVX vector instructions to sse-opcode.inc.h

2019-08-21 Thread Jan Bobek
Add all the AVX vector instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek 
---
 target/i386/sse-opcode.inc.h | 779 +++
 1 file changed, 779 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index 1359508424..c3c0ec4f89 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -469,198 +469,767 @@
  * ---
  * 66 0F 3A 44 /r ib   PCLMULQDQ xmm1, xmm2/m128, imm8
  * VEX.128.66.0F3A.WIG 44 /r ibVPCLMULQDQ xmm1, xmm2, xmm3/m128, imm8
+ *
+ * AVX Instructions
+ * -
+ * VEX.128.66.0F.W0 6E /r  VMOVD xmm1,r32/m32
+ * VEX.128.66.0F.W0 7E /r  VMOVD r32/m32,xmm1
+ * VEX.128.66.0F.W1 6E /r  VMOVQ xmm1,r64/m64
+ * VEX.128.66.0F.W1 7E /r  VMOVQ r64/m64,xmm1
+ * VEX.128.F3.0F.WIG 7E /r VMOVQ xmm1, xmm2/m64
+ * VEX.128.66.0F.WIG D6 /r VMOVQ xmm1/m64, xmm2
+ * VEX.128.0F.WIG 28 /rVMOVAPS xmm1, xmm2/m128
+ * VEX.128.0F.WIG 29 /rVMOVAPS xmm2/m128, xmm1
+ * VEX.256.0F.WIG 28 /rVMOVAPS ymm1, ymm2/m256
+ * VEX.256.0F.WIG 29 /rVMOVAPS ymm2/m256, ymm1
+ * VEX.128.66.0F.WIG 28 /r VMOVAPD xmm1, xmm2/m128
+ * VEX.128.66.0F.WIG 29 /r VMOVAPD xmm2/m128, xmm1
+ * VEX.256.66.0F.WIG 28 /r VMOVAPD ymm1, ymm2/m256
+ * VEX.256.66.0F.WIG 29 /r VMOVAPD ymm2/m256, ymm1
+ * VEX.128.66.0F.WIG 6F /r VMOVDQA xmm1, xmm2/m128
+ * VEX.128.66.0F.WIG 7F /r VMOVDQA xmm2/m128, xmm1
+ * VEX.256.66.0F.WIG 6F /r VMOVDQA ymm1, ymm2/m256
+ * VEX.256.66.0F.WIG 7F /r VMOVDQA ymm2/m256, ymm1
+ * VEX.128.0F.WIG 10 /rVMOVUPS xmm1, xmm2/m128
+ * VEX.128.0F.WIG 11 /rVMOVUPS xmm2/m128, xmm1
+ * VEX.256.0F.WIG 10 /rVMOVUPS ymm1, ymm2/m256
+ * VEX.256.0F.WIG 11 /rVMOVUPS ymm2/m256, ymm1
+ * VEX.128.66.0F.WIG 10 /r VMOVUPD xmm1, xmm2/m128
+ * VEX.128.66.0F.WIG 11 /r VMOVUPD xmm2/m128, xmm1
+ * VEX.256.66.0F.WIG 10 /r VMOVUPD ymm1, ymm2/m256
+ * VEX.256.66.0F.WIG 11 /r VMOVUPD ymm2/m256, ymm1
+ * VEX.128.F3.0F.WIG 6F /r VMOVDQU xmm1,xmm2/m128
+ * VEX.128.F3.0F.WIG 7F /r VMOVDQU xmm2/m128,xmm1
+ * VEX.256.F3.0F.WIG 6F /r VMOVDQU ymm1,ymm2/m256
+ * VEX.256.F3.0F.WIG 7F /r VMOVDQU ymm2/m256,ymm1
+ * VEX.LIG.F3.0F.WIG 10 /r VMOVSS xmm1, xmm2, xmm3
+ * VEX.LIG.F3.0F.WIG 10 /r VMOVSS xmm1, m32
+ * VEX.LIG.F3.0F.WIG 11 /r VMOVSS xmm1, xmm2, xmm3
+ * VEX.LIG.F3.0F.WIG 11 /r VMOVSS m32, xmm1
+ * VEX.LIG.F2.0F.WIG 10 /r VMOVSD xmm1, xmm2, xmm3
+ * VEX.LIG.F2.0F.WIG 10 /r VMOVSD xmm1, m64
+ * VEX.LIG.F2.0F.WIG 11 /r VMOVSD xmm1, xmm2, xmm3
+ * VEX.LIG.F2.0F.WIG 11 /r VMOVSD m64, xmm1
+ * VEX.128.0F.WIG 12 /rVMOVHLPS xmm1, xmm2, xmm3
+ * VEX.128.0F.WIG 12 /rVMOVLPS xmm2, xmm1, m64
+ * VEX.128.0F.WIG 13 /rVMOVLPS m64, xmm1
+ * VEX.128.66.0F.WIG 12 /r VMOVLPD xmm2,xmm1,m64
+ * VEX.128.66.0F.WIG 13 /r VMOVLPD m64,xmm1
+ * VEX.128.0F.WIG 16 /rVMOVLHPS xmm1, xmm2, xmm3
+ * VEX.128.0F.WIG 16 /rVMOVHPS xmm2, xmm1, m64
+ * VEX.128.0F.WIG 17 /rVMOVHPS m64, xmm1
+ * VEX.128.66.0F.WIG 16 /r VMOVHPD xmm2, xmm1, m64
+ * VEX.128.66.0F.WIG 17 /r VMOVHPD m64, xmm1
+ * VEX.128.66.0F.W0 D7 /r  VPMOVMSKB r32, xmm1
+ * VEX.128.66.0F.W1 D7 /r  VPMOVMSKB r64, xmm1
+ * VEX.128.0F.W0 50 /r VMOVMSKPS r32, xmm2
+ * VEX.128.0F.W1 50 /r VMOVMSKPS r64, xmm2
+ * VEX.256.0F.W0 50 /r VMOVMSKPS r32, ymm2
+ * VEX.256.0F.W1 50 /r VMOVMSKPS r64, ymm2
+ * VEX.128.66.0F.W0 50 /r  VMOVMSKPD r32, xmm2
+ * VEX.128.66.0F.W1 50 /r  VMOVMSKPD r64, xmm2
+ * VEX.256.66.0F.W0 50 /r  VMOVMSKPD r32, ymm2
+ * VEX.256.66.0F.W1 50 /r  VMOVMSKPD r64, ymm2
+ * VEX.128.F2.0F.WIG F0 /r VLDDQU xmm1, m128
+ * VEX.256.F2.0F.WIG F0 /r VLDDQU ymm1, m256
+ * VEX.128.F3.0F.WIG 16 /r VMOVSHDUP xmm1, xmm2/m128
+ * VEX.256.F3.0F.WIG 16 /r VMOVSHDUP ymm1, ymm2/m256
+ * VEX.128.F3.0F.WIG 12 /r VMOVSLDUP xmm1, xmm2/m128
+ * VEX.256.F3.0F.WIG 12 /r VMOVSLDUP ymm1, ymm2/m256
+ * VEX.128.F2.0F.WIG 12 /r VMOVDDUP xmm1, xmm2/m64
+ * VEX.256.F2.0F.WIG 12 /r VMOVDDUP ymm1, ymm2/m256
+ * VEX.128.66.0F.WIG FC /r VPADDB xmm1, xmm2, xmm3/m128
+ * VEX.128.66.0F.WIG FD /r VPADDW xmm1, xmm2, xmm3/m128
+ * VEX.128.66.0F.WIG FE /r VPADDD xmm1, xmm2, xmm3/m128
+ * VEX.128.66.0F.WIG D4 /r VPADDQ xmm1, xmm2, xmm3/m128
+ * VEX.128.66.0F.WIG EC /r VPADDSB xmm1, xmm2, xmm3/m128
+ * VEX.128.66.0F.WIG ED /r VPADDSW xmm1, xmm2, xmm3/m128
+ * VEX.128.66.0F.WIG DC /r VPADDUSB xmm1,xmm2,xmm3/m128
+ * VEX.128.66.0F.WIG DD /r VPADDUSW xmm1,xmm2,xmm3/m128
+ * 

Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

2019-08-21 Thread Paolo Bonzini
On 21/08/19 19:25, Kinney, Michael D wrote:
> Could we have an initial SMBASE that is within TSEG?
> 
> If we bring in hot plug CPUs one at a time, then initial
> SMBASE in TSEG can reprogram the SMBASE to the correct 
> value for that CPU.
> 
> Can we add a register to the hot plug controller that
> allows the BSP to set the initial SMBASE value for 
> a hot added CPU?  The default can be 3000:8000 for
> compatibility.
> 
> Another idea is when the SMI handler runs for a hot add
> CPU event, the SMM monarch programs the hot plug controller
> register with the SMBASE to use for the CPU that is being
> added.  As each CPU is added, a different SMBASE value can
> be programmed by the SMM Monarch.

Yes, all of these would work.  Again, I'm interested in having something
that has a hope of being implemented in real hardware.

Another, far easier to implement possibility could be a lockable MSR
(could be the existing MSR_SMM_FEATURE_CONTROL) that allows programming
the SMBASE outside SMM.  It would be nice if such a bit could be defined
by Intel.

Paolo



Re: [Qemu-devel] [PATCH v3 3/8] iotests: Allow skipping test cases

2019-08-21 Thread Andrey Shinkevich


On 19/08/2019 23:18, Max Reitz wrote:
> case_notrun() does not actually skip the current test case.  It just
> adds a "notrun" note and then returns to the caller, who manually has to
> skip the test.  Generally, skipping a test case is as simple as
> returning from the current function, but not always: For example, this
> model does not allow skipping tests already in the setUp() function.
> 
> Thus, add a QMPTestCase.case_skip() function that invokes case_notrun()
> and then self.skipTest().  To make this work, we need to filter the
> information on how many test cases were skipped from the unittest
> output.
> 
> Signed-off-by: Max Reitz 
> ---
>   tests/qemu-iotests/iotests.py | 21 ++---
>   1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 84438e837c..2f53baf633 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -802,6 +802,11 @@ class QMPTestCase(unittest.TestCase):
>   return self.pause_wait(job_id)
>   return result
>   
> +def case_skip(self, reason):
> +'''Skip this test case'''
> +case_notrun(reason)
> +self.skipTest(reason)
> +
>   
>   def notrun(reason):
>   '''Skip this test suite'''
> @@ -813,7 +818,10 @@ def notrun(reason):
>   sys.exit(0)
>   
>   def case_notrun(reason):
> -'''Skip this test case'''
> +'''Mark this test case as not having been run, but do not actually
> +skip it; that is left to the caller.  See QMPTestCase.case_skip()

The clause "do not actually skip it" sounds like a prescription. I would
like the comment to make it clearer to the reader that the method only
records a notification.

Andrey

> +for a variant that actually skips the current test case.'''
> +
>   # Each test in qemu-iotests has a number ("seq")
>   seq = os.path.basename(sys.argv[0])
>   
> @@ -904,8 +912,15 @@ def execute_unittest(output, verbosity, debug):
>   unittest.main(testRunner=runner)
>   finally:
>   if not debug:
> -sys.stderr.write(re.sub(r'Ran (\d+) tests? in [\d.]+s',
> -r'Ran \1 tests', output.getvalue()))
> +out = output.getvalue()
> +out = re.sub(r'Ran (\d+) tests? in [\d.]+s', r'Ran \1 tests', out)
> +
> +# Hide skipped tests from the reference output
> +out = re.sub(r'OK \(skipped=\d+\)', 'OK', out)
> +out_first_line, out_rest = out.split('\n', 1)
> +out = out_first_line.replace('s', '.') + '\n' + out_rest
> +
> +sys.stderr.write(out)
>   
>   def execute_test(test_function=None,
>supported_fmts=[], supported_oses=['linux'],
> 

Reviewed-by: Andrey Shinkevich 
-- 
With the best regards,
Andrey Shinkevich


[Qemu-devel] [RFC PATCH v4 58/75] target/i386: introduce AES and PCLMULQDQ vector instructions to sse-opcode.inc.h

2019-08-21 Thread Jan Bobek
Add all the AES and PCLMULQDQ vector instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek 
---
 target/i386/sse-opcode.inc.h | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index f43436213e..1359508424 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -449,6 +449,26 @@
  * 66 0F 3A 61 /r imm8 PCMPESTRI xmm1, xmm2/m128, imm8
  * 66 0F 3A 62 /r imm8 PCMPISTRM xmm1, xmm2/m128, imm8
  * 66 0F 3A 63 /r imm8 PCMPISTRI xmm1, xmm2/m128, imm8
+ *
+ * AES Instructions
+ * -
+ * 66 0F 38 DE /r  AESDEC xmm1, xmm2/m128
+ * VEX.128.66.0F38.WIG DE /r   VAESDEC xmm1, xmm2, xmm3/m128
+ * 66 0F 38 DF /r  AESDECLAST xmm1, xmm2/m128
+ * VEX.128.66.0F38.WIG DF /r   VAESDECLAST xmm1, xmm2, xmm3/m128
+ * 66 0F 38 DC /r  AESENC xmm1, xmm2/m128
+ * VEX.128.66.0F38.WIG DC /r   VAESENC xmm1, xmm2, xmm3/m128
+ * 66 0F 38 DD /r  AESENCLAST xmm1, xmm2/m128
+ * VEX.128.66.0F38.WIG DD /r   VAESENCLAST xmm1, xmm2, xmm3/m128
+ * 66 0F 38 DB /r  AESIMC xmm1, xmm2/m128
+ * VEX.128.66.0F38.WIG DB /r   VAESIMC xmm1, xmm2/m128
+ * 66 0F 3A DF /r ib   AESKEYGENASSIST xmm1, xmm2/m128, imm8
+ * VEX.128.66.0F3A.WIG DF /r ibVAESKEYGENASSIST xmm1, xmm2/m128, imm8
+ *
+ * PCLMULQDQ Instructions
+ * ---
+ * 66 0F 3A 44 /r ib   PCLMULQDQ xmm1, xmm2/m128, imm8
+ * VEX.128.66.0F3A.WIG 44 /r ibVPCLMULQDQ xmm1, xmm2, xmm3/m128, imm8
  */
 
 OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
@@ -641,6 +661,20 @@ OPCODE(roundps, LEG(66, 0F3A, 0, 0x08), SSE4_1, WRR, Vdq, Wdq, Ib)
 OPCODE(roundpd, LEG(66, 0F3A, 0, 0x09), SSE4_1, WRR, Vdq, Wdq, Ib)
 OPCODE(roundss, LEG(66, 0F3A, 0, 0x0a), SSE4_1, WRR, Vd, Wd, Ib)
 OPCODE(roundsd, LEG(66, 0F3A, 0, 0x0b), SSE4_1, WRR, Vq, Wq, Ib)
+OPCODE(aesdec, LEG(66, 0F38, 0, 0xde), AES, WRR, Vdq, Vdq, Wdq)
+OPCODE(vaesdec, VEX(128, 66, 0F38, IG, 0xde), AES_AVX, WRR, Vdq, Hdq, Wdq)
+OPCODE(aesdeclast, LEG(66, 0F38, 0, 0xdf), AES, WRR, Vdq, Vdq, Wdq)
+OPCODE(vaesdeclast, VEX(128, 66, 0F38, IG, 0xdf), AES_AVX, WRR, Vdq, Hdq, Wdq)
+OPCODE(aesenc, LEG(66, 0F38, 0, 0xdc), AES, WRR, Vdq, Vdq, Wdq)
+OPCODE(vaesenc, VEX(128, 66, 0F38, IG, 0xdc), AES_AVX, WRR, Vdq, Hdq, Wdq)
+OPCODE(aesenclast, LEG(66, 0F38, 0, 0xdd), AES, WRR, Vdq, Vdq, Wdq)
+OPCODE(vaesenclast, VEX(128, 66, 0F38, IG, 0xdd), AES_AVX, WRR, Vdq, Hdq, Wdq)
+OPCODE(aesimc, LEG(66, 0F38, 0, 0xdb), AES, WR, Vdq, Wdq)
+OPCODE(vaesimc, VEX(128, 66, 0F38, IG, 0xdb), AES_AVX, WR, Vdq, Wdq)
+OPCODE(aeskeygenassist, LEG(66, 0F3A, 0, 0xdf), AES, WRR, Vdq, Wdq, Ib)
+OPCODE(vaeskeygenassist, VEX(128, 66, 0F3A, IG, 0xdf), AES_AVX, WRR, Vdq, Wdq, Ib)
+OPCODE(pclmulqdq, LEG(66, 0F3A, 0, 0x44), PCLMULQDQ, WRRR, Vdq, Vdq, Wdq, Ib)
+OPCODE(vpclmulqdq, VEX(128, 66, 0F3A, IG, 0x44), PCLMULQDQ_AVX, WRRR, Vdq, Hdq, Wdq, Ib)
 OPCODE(pcmpeqb, LEG(NP, 0F, 0, 0x74), MMX, WRR, Pq, Pq, Qq)
 OPCODE(pcmpeqb, LEG(66, 0F, 0, 0x74), SSE2, WRR, Vdq, Vdq, Wdq)
 OPCODE(pcmpeqw, LEG(NP, 0F, 0, 0x75), MMX, WRR, Pq, Pq, Qq)
-- 
2.20.1




[Qemu-devel] [RFC PATCH v4 68/75] target/i386: convert ps((l, r)l(w, d, q), ra(w, d)) to helpers to gvec style

2019-08-21 Thread Jan Bobek
Make these helpers suitable for use with tcg_gen_gvec_* functions.
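
As a rough illustration of what "gvec style" means for a shift helper,
here is a standalone sketch. It is not the patch code: ToyReg,
toy_clear_high and toy_psllw are hypothetical, and the operation and
maximum sizes are passed as plain arguments instead of being decoded
from a descriptor word. It shows the two ideas the converted helpers
rely on: bytes between the operation size and the maximum size are
zeroed, and an out-of-range shift count is handled by treating the
operation size as 0 so that the zeroing step clears everything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 256-bit register viewed as 64-bit or 16-bit lanes. */
typedef union {
    uint64_t q[4];
    uint16_t w[16];
} ToyReg;

/* Zero every 64-bit lane from byte offset oprsz up to maxsz. */
static inline void toy_clear_high(ToyReg *d, size_t oprsz, size_t maxsz)
{
    assert(oprsz % sizeof(uint64_t) == 0);
    assert(maxsz % sizeof(uint64_t) == 0);

    for (size_t i = oprsz / sizeof(uint64_t);
         i * sizeof(uint64_t) < maxsz; ++i) {
        d->q[i] = 0;
    }
}

/*
 * Shift each 16-bit lane left by count. A count above 15 shrinks the
 * active size to 0, so the loop is skipped and toy_clear_high zeroes
 * the whole destination.
 */
static inline void toy_psllw(ToyReg *d, const ToyReg *a, uint64_t count,
                             size_t oprsz, size_t maxsz)
{
    const size_t active = count > 15 ? 0 : oprsz;

    for (size_t i = 0; i * sizeof(uint16_t) < active; ++i) {
        d->w[i] = (uint16_t)(a->w[i] << count);
    }
    toy_clear_high(d, active, maxsz);
}
```

Calling toy_psllw with oprsz 16 and maxsz 32 models a 128-bit operation
on a 256-bit register: the low eight lanes are shifted and the upper
half is cleared.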

Signed-off-by: Jan Bobek 
---
 target/i386/ops_sse.h| 357 +--
 target/i386/ops_sse_header.h |  30 ++-
 target/i386/translate.c  | 259 +++--
 3 files changed, 306 insertions(+), 340 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index aca6b50f23..168e581c0c 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -19,6 +19,7 @@
  */
 
 #include "crypto/aes.h"
+#include "tcg-gvec-desc.h"
 
 #if SHIFT == 0
 #define Reg MMXReg
@@ -38,199 +39,273 @@
 #define SUFFIX _xmm
 #endif
 
-void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+static inline void glue(clear_high, SUFFIX)(Reg *d, intptr_t oprsz,
+intptr_t maxsz)
 {
-int shift;
+intptr_t i;
 
-if (s->Q(0) > 15) {
-d->Q(0) = 0;
-#if SHIFT == 1
-d->Q(1) = 0;
-#endif
-} else {
-shift = s->B(0);
-d->W(0) >>= shift;
-d->W(1) >>= shift;
-d->W(2) >>= shift;
-d->W(3) >>= shift;
-#if SHIFT == 1
-d->W(4) >>= shift;
-d->W(5) >>= shift;
-d->W(6) >>= shift;
-d->W(7) >>= shift;
-#endif
+assert(oprsz % sizeof(uint64_t) == 0);
+assert(maxsz % sizeof(uint64_t) == 0);
+
+if (oprsz < maxsz) {
+i = oprsz / sizeof(uint64_t);
+for (; i * sizeof(uint64_t) < maxsz; ++i) {
+d->Q(i) = 0;
+}
 }
 }
 
-void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllw, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
 {
-int shift;
+const uint64_t count = b->Q(0);
+const intptr_t oprsz = count > 15 ? 0 : simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
 
-if (s->Q(0) > 15) {
-shift = 15;
-} else {
-shift = s->B(0);
+for (intptr_t i = 0; i * sizeof(uint16_t) < oprsz; ++i) {
+d->W(i) = a->W(i) << count;
 }
-d->W(0) = (int16_t)d->W(0) >> shift;
-d->W(1) = (int16_t)d->W(1) >> shift;
-d->W(2) = (int16_t)d->W(2) >> shift;
-d->W(3) = (int16_t)d->W(3) >> shift;
-#if SHIFT == 1
-d->W(4) = (int16_t)d->W(4) >> shift;
-d->W(5) = (int16_t)d->W(5) >> shift;
-d->W(6) = (int16_t)d->W(6) >> shift;
-d->W(7) = (int16_t)d->W(7) >> shift;
-#endif
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pslld, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
 {
-int shift;
+const uint64_t count = b->Q(0);
+const intptr_t oprsz = count > 31 ? 0 : simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
 
-if (s->Q(0) > 15) {
-d->Q(0) = 0;
-#if SHIFT == 1
-d->Q(1) = 0;
-#endif
-} else {
-shift = s->B(0);
-d->W(0) <<= shift;
-d->W(1) <<= shift;
-d->W(2) <<= shift;
-d->W(3) <<= shift;
-#if SHIFT == 1
-d->W(4) <<= shift;
-d->W(5) <<= shift;
-d->W(6) <<= shift;
-d->W(7) <<= shift;
-#endif
+for (intptr_t i = 0; i * sizeof(uint32_t) < oprsz; ++i) {
+d->L(i) = a->L(i) << count;
 }
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllq, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
 {
-int shift;
+const uint64_t count = b->Q(0);
+const intptr_t oprsz = count > 63 ? 0 : simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
 
-if (s->Q(0) > 31) {
-d->Q(0) = 0;
-#if SHIFT == 1
-d->Q(1) = 0;
-#endif
-} else {
-shift = s->B(0);
-d->L(0) >>= shift;
-d->L(1) >>= shift;
-#if SHIFT == 1
-d->L(2) >>= shift;
-d->L(3) >>= shift;
-#endif
+for (intptr_t i = 0; i * sizeof(uint64_t) < oprsz; ++i) {
+d->Q(i) = a->Q(i) << count;
 }
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllwi, SUFFIX)(Reg *d, Reg *a, uint32_t desc)
 {
-int shift;
+const uint64_t count = simd_data(desc);
+const intptr_t oprsz = count > 15 ? 0 : simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
 
-if (s->Q(0) > 31) {
-shift = 31;
-} else {
-shift = s->B(0);
+for (intptr_t i = 0; i * sizeof(uint16_t) < oprsz; ++i) {
+d->W(i) = a->W(i) << count;
 }
-d->L(0) = (int32_t)d->L(0) >> shift;
-d->L(1) = (int32_t)d->L(1) >> shift;
-#if SHIFT == 1
-d->L(2) = (int32_t)d->L(2) >> shift;
-d->L(3) = (int32_t)d->L(3) >> shift;
-#endif
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pslldi, SUFFIX)(Reg *d, Reg *a, uint32_t desc)
 {
-int shift;
+const uint64_t count = simd_data(desc);
+const intptr_t oprsz = count > 31 ? 0 

Re: [Qemu-devel] [PATCH v1 2/4] s390x/tcg: Introduce probe_read_access()

2019-08-21 Thread David Hildenbrand
On 21.08.19 19:26, Richard Henderson wrote:
> On 8/21/19 2:22 AM, David Hildenbrand wrote:
>> +/*
>> + * Make sure the read access is permitted and TLB entries are created. In
>> + * very rare cases it might happen that the actual accesses might need
>> + * new MMU translations. If the page tables were changed in between, we
>> + * might still trigger a fault. However, this seems to barely happen, so we
>> + * can ignore this for now.
>> + */
>> +void probe_read_access(CPUS390XState *env, uint64_t addr, uint64_t len,
>> +   uintptr_t ra)
>> +{
>> +#ifdef CONFIG_USER_ONLY
>> +if (!guest_addr_valid(addr) || !guest_addr_valid(addr + len - 1) ||
>> +page_check_range(addr, len, PAGE_READ) < 0) {
>> +s390_program_interrupt(env, PGM_ADDRESSING, ILEN_AUTO, ra);
>> +}
>> +#else
>> +while (len) {
>> +const uint64_t pagelen = -(addr | -TARGET_PAGE_MASK);
>> +const uint64_t curlen = MIN(pagelen, len);
>> +
>> +cpu_ldub_data_ra(env, addr, ra);
>> +addr = wrap_address(env, addr + curlen);
>> +len -= curlen;
>> +}
>> +#endif
>> +}
> 
> I don't think this is really the right approach, precisely because of the
> comment above.
> 
> I think we should
> 
> (1) Modify the generic probe_write to return the host address,
> akin to tlb_vaddr_to_host except it *will* fault.
> 
> (2) Create a generic version of probe_write for CONFIG_USER_ONLY,
> much like the one you have done for target/s390x.
> 
> (3) Create generic version of probe_read that does the same.
> 
> (4) Rewrite fast_memset and fast_memmove to fetch all of the host
> addresses before doing any modifications.  The functions are
> currently written as if len can be very large, handling any
> number of pages.  Except that's not true.  While there are
> several kinds of users apart from MVC, two pages are sufficient
> for all users.
> 
> Well, should be.  We would need to adjust do_mvcl to limit the
> operation to TARGET_PAGE_SIZE (CC=3, cpu-determined number of
> bytes moved without reaching end of first operand).
> Which is probably a good idea anyway.  System mode should not
> spend forever executing one instruction, as it would if you
> pass in a 64-bit length from MVCLE.


Hah, guess what, I implemented a similar variant of "fetch all
of the host addresses" *but* it is not as easy as you might
think (sorry for the bad news).

There are certain cases where we can't get access to the raw host
page. Namely, cpu watchpoints, LAP, NOTDIRTY. In summary: this won't
work as you describe. (my first approach did exactly this)

The following patch requires another re-factoring
(tcg_s390_cpu_mmu_translate), but you should get the idea.



>From 0cacd2aea3dbc25e93492cca04f6c866b86d7f8a Mon Sep 17 00:00:00 2001
From: David Hildenbrand 
Date: Tue, 20 Aug 2019 09:37:11 +0200
Subject: [PATCH v1] s390x/tcg: Fault-safe MVC (MOVE) implementation

MVC can cross page boundaries. In case we fault on the second page, we
already partially copied data. If we have overlaps, we would
trigger a fault after having partially moved data, eventually having
our original data already overwritten. When continuing after the fault,
we would try to move already modified data, not the original data -
very bad.

glibc started to use MVC for forward memmove() and is able to trigger
exactly this corruption (via rpmbuild and rpm). Fedora 31 (rawhide)
currently fails to install as we trigger rpm database corruptions due to
this bug.

We need a way to translate a virtual address range to individual pages that
we can access later on without triggering faults. Probing all virtual
addresses once before the read/write is not sufficient - the guest could
have modified the page tables (e.g., write-protect, map out) in between,
so we could fault on any new tlb_fill() - we have to skip any new MMU
translations.

Unfortunately, there are TLB entries for which we cannot get a host
address (esp., watchpoints, LAP, NOTDIRTY) - in these cases we cannot avoid
a new MMU translation using the ordinary ld/st helpers. Let's fall back
to guest physical addresses in these cases, which we access via
cpu_physical_memory_(read|write).

This change reduced the boot time for s390x guests (to prompt) from ~1:29
min to ~1:19 min in my tests. For example, LAP protected pages are now only
translated once when writing to them using MVC and we don't always fall back
to byte-based copies.

We will want to use the same mechanism for other accesses as well (e.g.,
mvcl), prepare for that right away.

Signed-off-by: David Hildenbrand 
---
 target/s390x/mem_helper.c | 213 +++---
 1 file changed, 200 insertions(+), 13 deletions(-)

diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
index 91ba2e03d9..1ca293e00d 100644
--- a/target/s390x/mem_helper.c
+++ b/target/s390x/mem_helper.c
@@ -24,8 +24,10 @@
 #include "exec/helper-proto.h"
 #include "exec/exec-all.h"
 

[Qemu-devel] [RFC PATCH v4 59/75] target/i386: introduce AVX translators

2019-08-21 Thread Jan Bobek
Use the translator macros to define translators required by AVX
instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 14117c2993..9b9f0d4ed1 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -6708,10 +6708,12 @@ DEF_TRANSLATE_INSN2(Eq, Pq)
 DEF_TRANSLATE_INSN2(Eq, Vdq)
 DEF_TRANSLATE_INSN2(Gd, Nq)
 DEF_TRANSLATE_INSN2(Gd, Udq)
+DEF_TRANSLATE_INSN2(Gd, Uqq)
 DEF_TRANSLATE_INSN2(Gd, Wd)
 DEF_TRANSLATE_INSN2(Gd, Wq)
 DEF_TRANSLATE_INSN2(Gq, Nq)
 DEF_TRANSLATE_INSN2(Gq, Udq)
+DEF_TRANSLATE_INSN2(Gq, Uqq)
 DEF_TRANSLATE_INSN2(Gq, Wd)
 DEF_TRANSLATE_INSN2(Gq, Wq)
 DEF_TRANSLATE_INSN2(Md, Gd)
@@ -6720,6 +6722,7 @@ DEF_TRANSLATE_INSN2(Mq, Gq)
 DEF_TRANSLATE_INSN2(Mq, Pq)
 DEF_TRANSLATE_INSN2(Mq, Vdq)
 DEF_TRANSLATE_INSN2(Mq, Vq)
+DEF_TRANSLATE_INSN2(Mqq, Vqq)
 DEF_TRANSLATE_INSN2(Pq, Ed)
 DEF_TRANSLATE_INSN2(Pq, Eq)
 DEF_TRANSLATE_INSN2(Pq, Nq)
@@ -6735,6 +6738,7 @@ DEF_TRANSLATE_INSN2(Vd, Wd)
 DEF_TRANSLATE_INSN2(Vd, Wq)
 DEF_TRANSLATE_INSN2(Vdq, Ed)
 DEF_TRANSLATE_INSN2(Vdq, Eq)
+DEF_TRANSLATE_INSN2(Vdq, Md)
 DEF_TRANSLATE_INSN2(Vdq, Mdq)
 DEF_TRANSLATE_INSN2(Vdq, Nq)
 DEF_TRANSLATE_INSN2(Vdq, Qq)
@@ -6742,14 +6746,22 @@ DEF_TRANSLATE_INSN2(Vdq, Udq)
 DEF_TRANSLATE_INSN2(Vdq, Wd)
 DEF_TRANSLATE_INSN2(Vdq, Wdq)
 DEF_TRANSLATE_INSN2(Vdq, Wq)
+DEF_TRANSLATE_INSN2(Vdq, Wqq)
 DEF_TRANSLATE_INSN2(Vdq, Ww)
 DEF_TRANSLATE_INSN2(Vq, Ed)
 DEF_TRANSLATE_INSN2(Vq, Eq)
 DEF_TRANSLATE_INSN2(Vq, Wd)
 DEF_TRANSLATE_INSN2(Vq, Wq)
+DEF_TRANSLATE_INSN2(Vqq, Md)
+DEF_TRANSLATE_INSN2(Vqq, Mdq)
+DEF_TRANSLATE_INSN2(Vqq, Mq)
+DEF_TRANSLATE_INSN2(Vqq, Mqq)
+DEF_TRANSLATE_INSN2(Vqq, Wdq)
+DEF_TRANSLATE_INSN2(Vqq, Wqq)
 DEF_TRANSLATE_INSN2(Wd, Vd)
 DEF_TRANSLATE_INSN2(Wdq, Vdq)
 DEF_TRANSLATE_INSN2(Wq, Vq)
+DEF_TRANSLATE_INSN2(Wqq, Vqq)
 DEF_TRANSLATE_INSN2(modrm_mod, modrm)
 
 #define DEF_TRANSLATE_INSN3(opT1, opT2, opT3)   \
@@ -6796,6 +6808,9 @@ DEF_TRANSLATE_INSN3(Gd, Nq, Ib)
 DEF_TRANSLATE_INSN3(Gd, Udq, Ib)
 DEF_TRANSLATE_INSN3(Gq, Nq, Ib)
 DEF_TRANSLATE_INSN3(Gq, Udq, Ib)
+DEF_TRANSLATE_INSN3(Hdq, Udq, Ib)
+DEF_TRANSLATE_INSN3(Mdq, Hdq, Vdq)
+DEF_TRANSLATE_INSN3(Mqq, Hqq, Vqq)
 DEF_TRANSLATE_INSN3(Nq, Nq, Ib)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qd)
 DEF_TRANSLATE_INSN3(Pq, Pq, Qq)
@@ -6803,17 +6818,34 @@ DEF_TRANSLATE_INSN3(Pq, Qq, Ib)
 DEF_TRANSLATE_INSN3(RdMb, Vdq, Ib)
 DEF_TRANSLATE_INSN3(RdMw, Vdq, Ib)
 DEF_TRANSLATE_INSN3(Udq, Udq, Ib)
+DEF_TRANSLATE_INSN3(Vd, Hd, Ed)
+DEF_TRANSLATE_INSN3(Vd, Hd, Eq)
+DEF_TRANSLATE_INSN3(Vd, Hd, Wd)
+DEF_TRANSLATE_INSN3(Vd, Hd, Wq)
 DEF_TRANSLATE_INSN3(Vd, Vd, Wd)
 DEF_TRANSLATE_INSN3(Vd, Wd, Ib)
+DEF_TRANSLATE_INSN3(Vdq, Hdq, Mdq)
+DEF_TRANSLATE_INSN3(Vdq, Hdq, Mq)
+DEF_TRANSLATE_INSN3(Vdq, Hdq, UdqMhq)
 DEF_TRANSLATE_INSN3(Vdq, Hdq, Wdq)
+DEF_TRANSLATE_INSN3(Vdq, Hq, Mq)
+DEF_TRANSLATE_INSN3(Vdq, Hq, Wq)
 DEF_TRANSLATE_INSN3(Vdq, Vdq, Mq)
 DEF_TRANSLATE_INSN3(Vdq, Vdq, UdqMhq)
 DEF_TRANSLATE_INSN3(Vdq, Vdq, Wdq)
 DEF_TRANSLATE_INSN3(Vdq, Vq, Mq)
 DEF_TRANSLATE_INSN3(Vdq, Vq, Wq)
 DEF_TRANSLATE_INSN3(Vdq, Wdq, Ib)
+DEF_TRANSLATE_INSN3(Vq, Hq, Ed)
+DEF_TRANSLATE_INSN3(Vq, Hq, Eq)
+DEF_TRANSLATE_INSN3(Vq, Hq, Wd)
+DEF_TRANSLATE_INSN3(Vq, Hq, Wq)
 DEF_TRANSLATE_INSN3(Vq, Vq, Wq)
 DEF_TRANSLATE_INSN3(Vq, Wq, Ib)
+DEF_TRANSLATE_INSN3(Vqq, Hqq, Mqq)
+DEF_TRANSLATE_INSN3(Vqq, Hqq, Wqq)
+DEF_TRANSLATE_INSN3(Vqq, Wqq, Ib)
+DEF_TRANSLATE_INSN3(Wdq, Vqq, Ib)
 
 #define DEF_TRANSLATE_INSN4(opT1, opT2, opT3, opT4) \
 static void translate_insn4(opT1, opT2, opT3, opT4)(\
@@ -6861,8 +6893,15 @@ DEF_TRANSLATE_INSN3(Vq, Wq, Ib)
 
 DEF_TRANSLATE_INSN4(Pq, Pq, Qq, Ib)
 DEF_TRANSLATE_INSN4(Pq, Pq, RdMw, Ib)
+DEF_TRANSLATE_INSN4(Vd, Hd, Wd, Ib)
 DEF_TRANSLATE_INSN4(Vd, Vd, Wd, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Hdq, Ed, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Hdq, Eq, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Hdq, RdMb, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Hdq, RdMw, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Hdq, Wd, Ib)
 DEF_TRANSLATE_INSN4(Vdq, Hdq, Wdq, Ib)
+DEF_TRANSLATE_INSN4(Vdq, Hdq, Wdq, Ldq)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Ed, Ib)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Eq, Ib)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, RdMb, Ib)
@@ -6871,7 +6910,11 @@ DEF_TRANSLATE_INSN4(Vdq, Vdq, Wd, Ib)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Wd, modrm_mod)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Wdq, Ib)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, Wq, modrm_mod)
+DEF_TRANSLATE_INSN4(Vq, Hq, Wq, Ib)
 DEF_TRANSLATE_INSN4(Vq, Vq, Wq, Ib)
+DEF_TRANSLATE_INSN4(Vqq, Hqq, Wdq, Ib)
+DEF_TRANSLATE_INSN4(Vqq, Hqq, Wqq, Ib)
+DEF_TRANSLATE_INSN4(Vqq, Hqq, Wqq, Lqq)
 
 #define DEF_TRANSLATE_INSN5(opT1, opT2, opT3, opT4, opT5)   \
 static void translate_insn5(opT1, opT2, opT3, opT4, opT5)(  \
@@ -6924,6 +6967,11 @@ DEF_TRANSLATE_INSN4(Vq, Vq, Wq, Ib)
 }   \
 }
 
+DEF_TRANSLATE_INSN5(Vdq, Hdq, 

[Qemu-devel] [RFC PATCH v4 74/75] target/i386: convert pshuf(w, lw, hw, d), shuf(pd, ps) helpers to gvec style

2019-08-21 Thread Jan Bobek
Make these helpers suitable for use with tcg_gen_gvec_* functions.

Signed-off-by: Jan Bobek 
---
 target/i386/ops_sse.h| 141 ---
 target/i386/ops_sse_header.h |  12 +--
 target/i386/translate.c  |  34 -
 3 files changed, 119 insertions(+), 68 deletions(-)
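
For readers unfamiliar with the shuffle control byte, here is a minimal standalone sketch of the per-lane operation the new helpers perform. The name shuffle4_u16 is hypothetical and plain arrays stand in for QEMU's Reg accessors; it is not code from this patch.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for one lane of four 16-bit words.
 * Each 2-bit field of ctrl selects the source word for one destination slot.
 */
static void shuffle4_u16(uint16_t dst[4], const uint16_t src[4], uint8_t ctrl)
{
    /* Read every source word first so that dst may alias src. */
    const uint16_t t0 = src[(ctrl >> 0) & 3];
    const uint16_t t1 = src[(ctrl >> 2) & 3];
    const uint16_t t2 = src[(ctrl >> 4) & 3];
    const uint16_t t3 = src[(ctrl >> 6) & 3];

    dst[0] = t0;
    dst[1] = t1;
    dst[2] = t2;
    dst[3] = t3;
}
```

Reading into temporaries before writing mirrors the helpers in the diff, which presumably need the same property when the destination and source registers coincide.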

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 8172324e34..2e50d91a25 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -551,70 +551,123 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *a, Reg *b,
 }
 
 #if SHIFT == 0
-void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order)
+void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *a, uint32_t desc)
 {
-Reg r;
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+const uint8_t ctrl = simd_data(desc);
 
-r.W(0) = s->W(order & 3);
-r.W(1) = s->W((order >> 2) & 3);
-r.W(2) = s->W((order >> 4) & 3);
-r.W(3) = s->W((order >> 6) & 3);
-*d = r;
+for (intptr_t i = 0; 4 * i * sizeof(uint16_t) < oprsz; ++i) {
+const uint16_t t0 = a->W(4 * i + ((ctrl >> 0) & 3));
+const uint16_t t1 = a->W(4 * i + ((ctrl >> 2) & 3));
+const uint16_t t2 = a->W(4 * i + ((ctrl >> 4) & 3));
+const uint16_t t3 = a->W(4 * i + ((ctrl >> 6) & 3));
+
+d->W(4 * i + 0) = t0;
+d->W(4 * i + 1) = t1;
+d->W(4 * i + 2) = t2;
+d->W(4 * i + 3) = t3;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 #else
-void helper_shufps(Reg *d, Reg *s, int order)
+void glue(helper_pshuflw, SUFFIX)(Reg *d, Reg *a, uint32_t desc)
 {
-Reg r;
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+const uint8_t ctrl = simd_data(desc);
 
-r.L(0) = d->L(order & 3);
-r.L(1) = d->L((order >> 2) & 3);
-r.L(2) = s->L((order >> 4) & 3);
-r.L(3) = s->L((order >> 6) & 3);
-*d = r;
+for (intptr_t i = 0; 8 * i * sizeof(uint16_t) < oprsz; ++i) {
+const uint16_t t0 = a->W(8 * i + ((ctrl >> 0) & 3));
+const uint16_t t1 = a->W(8 * i + ((ctrl >> 2) & 3));
+const uint16_t t2 = a->W(8 * i + ((ctrl >> 4) & 3));
+const uint16_t t3 = a->W(8 * i + ((ctrl >> 6) & 3));
+
+d->W(8 * i + 0) = t0;
+d->W(8 * i + 1) = t1;
+d->W(8 * i + 2) = t2;
+d->W(8 * i + 3) = t3;
+d->Q(2 * i + 1) = a->Q(2 * i + 1);
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void helper_shufpd(Reg *d, Reg *s, int order)
+void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *a, uint32_t desc)
 {
-Reg r;
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+const uint8_t ctrl = simd_data(desc);
+
+for (intptr_t i = 0; 8 * i * sizeof(uint16_t) < oprsz; ++i) {
+const uint16_t t0 = a->W(8 * i + 4 + ((ctrl >> 0) & 3));
+const uint16_t t1 = a->W(8 * i + 4 + ((ctrl >> 2) & 3));
+const uint16_t t2 = a->W(8 * i + 4 + ((ctrl >> 4) & 3));
+const uint16_t t3 = a->W(8 * i + 4 + ((ctrl >> 6) & 3));
 
-r.Q(0) = d->Q(order & 1);
-r.Q(1) = s->Q((order >> 1) & 1);
-*d = r;
+d->Q(2 * i + 0) = a->Q(2 * i + 0);
+d->W(8 * i + 4) = t0;
+d->W(8 * i + 5) = t1;
+d->W(8 * i + 6) = t2;
+d->W(8 * i + 7) = t3;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void glue(helper_pshufd, SUFFIX)(Reg *d, Reg *s, int order)
+void glue(helper_pshufd, SUFFIX)(Reg *d, Reg *a, uint32_t desc)
 {
-Reg r;
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+const uint8_t ctrl = simd_data(desc);
+
+for (intptr_t i = 0; 4 * i * sizeof(uint32_t) < oprsz; ++i) {
+const uint32_t t0 = a->L(4 * i + ((ctrl >> 0) & 3));
+const uint32_t t1 = a->L(4 * i + ((ctrl >> 2) & 3));
+const uint32_t t2 = a->L(4 * i + ((ctrl >> 4) & 3));
+const uint32_t t3 = a->L(4 * i + ((ctrl >> 6) & 3));
+
+d->L(4 * i + 0) = t0;
+d->L(4 * i + 1) = t1;
+d->L(4 * i + 2) = t2;
+d->L(4 * i + 3) = t3;
 
-r.L(0) = s->L(order & 3);
-r.L(1) = s->L((order >> 2) & 3);
-r.L(2) = s->L((order >> 4) & 3);
-r.L(3) = s->L((order >> 6) & 3);
-*d = r;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-void glue(helper_pshuflw, SUFFIX)(Reg *d, Reg *s, int order)
+void glue(helper_shufps, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
 {
-Reg r;
-
-r.W(0) = s->W(order & 3);
-r.W(1) = s->W((order >> 2) & 3);
-r.W(2) = s->W((order >> 4) & 3);
-r.W(3) = s->W((order >> 6) & 3);
-r.Q(1) = s->Q(1);
-*d = r;
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+const uint8_t ctrl = simd_data(desc);
+
+for (intptr_t i = 0; 4 * i * sizeof(uint32_t) < oprsz; ++i) {
+const uint32_t t0 = a->L(4 * i + ((ctrl >> 0) & 3));
+const uint32_t t1 = a->L(4 * i + ((ctrl >> 2) & 3));
+  

[Qemu-devel] [RFC PATCH v4 43/75] target/i386: introduce SSE2 code generators

2019-08-21 Thread Jan Bobek
Introduce code generators required by SSE2 instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 427 +++-
 1 file changed, 425 insertions(+), 2 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 43917edc76..3445b4cff1 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5563,6 +5563,7 @@ INSNOP_LDST(xmm, Mhq)
 }
 
 GEN_INSN2(movq, Pq, Eq);/* forward declaration */
+GEN_INSN2(movq, Vdq, Eq);   /* forward declaration */
 GEN_INSN2(movd, Pq, Ed)
 {
 const insnop_arg_t(Eq) arg2_r64 = tcg_temp_new_i64();
@@ -5574,6 +5575,17 @@ GEN_INSN2(movd, Ed, Pq)
 {
 tcg_gen_ld_i32(arg1, cpu_env, arg2 + offsetof(MMXReg, MMX_L(0)));
 }
+GEN_INSN2(movd, Vdq, Ed)
+{
+const insnop_arg_t(Eq) arg2_r64 = tcg_temp_new_i64();
+tcg_gen_extu_i32_i64(arg2_r64, arg2);
+gen_insn2(movq, Vdq, Eq)(env, s, arg1, arg2_r64);
+tcg_temp_free_i64(arg2_r64);
+}
+GEN_INSN2(movd, Ed, Vdq)
+{
+tcg_gen_ld_i32(arg1, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(0)));
+}
 
 GEN_INSN2(movq, Pq, Eq)
 {
@@ -5583,13 +5595,45 @@ GEN_INSN2(movq, Eq, Pq)
 {
 tcg_gen_ld_i64(arg1, cpu_env, arg2 + offsetof(MMXReg, MMX_Q(0)));
 }
+GEN_INSN2(movq, Vdq, Eq)
+{
+const TCGv_i64 r64 = tcg_temp_new_i64();
+tcg_gen_st_i64(arg2, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(0)));
+tcg_gen_movi_i64(r64, 0);
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+tcg_temp_free_i64(r64);
+}
+GEN_INSN2(movq, Eq, Vdq)
+{
+tcg_gen_ld_i64(arg1, cpu_env, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
+}
 
 DEF_GEN_INSN2_GVEC(movq, Pq, Qq, mov, MM_OPRSZ, MM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movq, Qq, Pq, mov, MM_OPRSZ, MM_MAXSZ, MO_64)
+GEN_INSN2(movq, Vdq, Wq)
+{
+const TCGv_i64 r64 = tcg_temp_new_i64();
+tcg_gen_ld_i64(r64, cpu_env, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
+gen_insn2(movq, Vdq, Eq)(env, s, arg1, r64);
+tcg_temp_free_i64(r64);
+}
+GEN_INSN2(movq, UdqMq, Vq)
+{
+gen_insn2(movq, Vdq, Wq)(env, s, arg1, arg2);
+}
+
 DEF_GEN_INSN2_GVEC(movaps, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movaps, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(movapd, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(movapd, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(movdqa, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(movdqa, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movups, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movups, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(movupd, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(movupd, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(movdqu, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(movdqu, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 
 GEN_INSN4(movss, Vdq, Vdq, Wd, modrm_mod)
 {
@@ -5621,6 +5665,46 @@ GEN_INSN2(movss, Wd, Vd)
 gen_insn4(movss, Vdq, Vdq, Wd, modrm_mod)(env, s, arg1, arg1, arg2, 3);
 }
 
+GEN_INSN4(movsd, Vdq, Vdq, Wq, modrm_mod)
+{
+const TCGv_i64 r64 = tcg_temp_new_i64();
+tcg_gen_ld_i64(r64, cpu_env, arg3 + offsetof(ZMMReg, ZMM_Q(0)));
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(0)));
+if (arg4 == 3) {
+/* merging movsd */
+if (arg1 != arg2) {
+tcg_gen_ld_i64(r64, cpu_env, arg2 + offsetof(ZMMReg, ZMM_Q(1)));
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+}
+} else {
+/* zero-extending movsd */
+tcg_gen_movi_i64(r64, 0);
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+}
+tcg_temp_free_i64(r64);
+}
+GEN_INSN2(movsd, Wq, Vq)
+{
+gen_insn4(movsd, Vdq, Vdq, Wq, modrm_mod)(env, s, arg1, arg1, arg2, 3);
+}
+
+GEN_INSN2(movq2dq, Vdq, Nq)
+{
+const insnop_arg_t(Vdq) dofs = offsetof(ZMMReg, ZMM_Q(0));
+const insnop_arg_t(Nq) aofs = offsetof(MMXReg, MMX_Q(0));
+gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+
+const TCGv_i64 r64z = tcg_const_i64(0);
+tcg_gen_st_i64(r64z, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+tcg_temp_free_i64(r64z);
+}
+GEN_INSN2(movdq2q, Pq, Uq)
+{
+const insnop_arg_t(Pq) dofs = offsetof(MMXReg, MMX_Q(0));
+const insnop_arg_t(Uq) aofs = offsetof(ZMMReg, ZMM_Q(0));
+gen_op_movq(s, arg1 + dofs, arg2 + aofs);
+}
+
 GEN_INSN3(movhlps, Vdq, Vdq, UdqMhq)
 {
 const TCGv_i64 r64 = tcg_temp_new_i64();
@@ -5637,6 +5721,21 @@ GEN_INSN2(movlps, Mq, Vq)
 insnop_ldst(xmm, Mq)(env, s, 1, arg2, arg1);
 }
 
+GEN_INSN3(movlpd, Vdq, Vdq, Mq)
+{
+insnop_ldst(xmm, Mq)(env, s, 0, arg1, arg3);
+if (arg1 != arg2) {
+const TCGv_i64 r64 = tcg_temp_new_i64();
+tcg_gen_ld_i64(r64, cpu_env, arg2 + offsetof(ZMMReg, ZMM_Q(1)));
+tcg_gen_st_i64(r64, cpu_env, arg1 + offsetof(ZMMReg, ZMM_Q(1)));
+tcg_temp_free_i64(r64);
+

Re: [Qemu-devel] [PATCH v8 2/3] block/nbd: nbd reconnect

2019-08-21 Thread Eric Blake
On 8/21/19 11:52 AM, Vladimir Sementsov-Ogievskiy wrote:
> Implement reconnect. To achieve this:
> 
> 1. add new modes:
>connecting-wait: means, that reconnecting is in progress, and there
>  were small number of reconnect attempts, so all requests are
>  waiting for the connection.
>connecting-nowait: reconnecting is in progress, there were a lot of
>  attempts of reconnect, all requests will return errors.
> 
>two old modes are used too:
>connected: normal state
>quit: exiting after fatal error or on close
> 
> Possible transitions are:
> 
>* -> quit
>connecting-* -> connected
>connecting-wait -> connecting-nowait (transition is done after
>   reconnect-delay seconds in connecting-wait mode)
>connected -> connecting-wait
> 
> 2. Implement reconnect in connection_co. So, in connecting-* mode,
> connection_co, tries to reconnect unlimited times.
> 
> 3. Retry nbd queries on channel error, if we are in connecting-wait
> state.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---

> +static bool nbd_client_connecting(BDRVNBDState *s)
> +{
> +return s->state == NBD_CLIENT_CONNECTING_WAIT ||
> +s->state == NBD_CLIENT_CONNECTING_NOWAIT;


Indentation looks unusual. I might have done:

return (s->state == NBD_CLIENT_CONNECTING_WAIT ||
s->state == NBD_CLIENT_CONNECTING_NOWAIT);

Or even exploit the enum encoding:

return s->state <= NBD_CLIENT_CONNECTING_NOWAIT;

Is s->state updated atomically, or do we risk the case where we might
see two different values of s->state across the || sequence point?  Does
that matter?
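
To make the enum-encoding variant concrete, here is a small sketch. The enum below is a hypothetical mirror of the client states, and its ordering (both connecting states first) is an assumption, since the declaration is not quoted here:

```c
#include <stdbool.h>

/* Hypothetical state enum; the ordering is assumed, not taken from the patch. */
typedef enum {
    CONNECTING_WAIT,
    CONNECTING_NOWAIT,
    CONNECTED,
    QUIT,
} ClientState;

/* Two comparisons, as in the quoted code. */
static bool connecting_explicit(ClientState st)
{
    return st == CONNECTING_WAIT || st == CONNECTING_NOWAIT;
}

/* One comparison that relies on the connecting states sorting first. */
static bool connecting_by_order(ClientState st)
{
    return st <= CONNECTING_NOWAIT;
}
```

With that ordering the two predicates agree for every valid state, and the single-comparison form inspects the state only once.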

> +}
> +
> +static bool nbd_client_connecting_wait(BDRVNBDState *s)
> +{
> +return s->state == NBD_CLIENT_CONNECTING_WAIT;
> +}
> +
> +static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
> +{
> +Error *local_err = NULL;
> +
> +if (!nbd_client_connecting(s)) {
> +return;
> +}
> +assert(nbd_client_connecting(s));

This assert adds nothing given the condition beforehand.

> +
> +/* Wait for completion of all in-flight requests */
> +
> +qemu_co_mutex_lock(&s->send_mutex);
> +
> +while (s->in_flight > 0) {
> +qemu_co_mutex_unlock(&s->send_mutex);
> +nbd_recv_coroutines_wake_all(s);
> +s->wait_in_flight = true;
> +qemu_coroutine_yield();
> +s->wait_in_flight = false;
> +qemu_co_mutex_lock(&s->send_mutex);
> +}
> +
> +qemu_co_mutex_unlock(&s->send_mutex);
> +
> +if (!nbd_client_connecting(s)) {
> +return;
> +}
> +
> +/*
> + * Now we are sure that nobody is accessing the channel, and no one will
> + * try until we set the state to CONNECTED.
> + */
> +
> +/* Finalize previous connection if any */
> +if (s->ioc) {
> +nbd_client_detach_aio_context(s->bs);
> +object_unref(OBJECT(s->sioc));
> +s->sioc = NULL;
> +object_unref(OBJECT(s->ioc));
> +s->ioc = NULL;
> +}
> +
> +s->connect_status = nbd_client_connect(s->bs, &local_err);
> +error_free(s->connect_err);
> +s->connect_err = NULL;
> +error_propagate(&s->connect_err, local_err);
> +local_err = NULL;
> +
> +if (s->connect_status < 0) {
> +/* failed attempt */
> +return;
> +}
> +
> +/* successfully connected */
> +s->state = NBD_CLIENT_CONNECTED;
> +qemu_co_queue_restart_all(&s->free_sema);
> +}
> +
> +static coroutine_fn void nbd_reconnect_loop(BDRVNBDState *s)
> +{
> +uint64_t start_time_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> +uint64_t delay_ns = s->reconnect_delay * NANOSECONDS_PER_SECOND;
> +uint64_t timeout = 1 * NANOSECONDS_PER_SECOND;
> +uint64_t max_timeout = 16 * NANOSECONDS_PER_SECOND;
> +
> +nbd_reconnect_attempt(s);
> +
> +while (nbd_client_connecting(s)) {
> +if (s->state == NBD_CLIENT_CONNECTING_WAIT &&
> +qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time_ns > delay_ns)
> +{
> +s->state = NBD_CLIENT_CONNECTING_NOWAIT;
> +qemu_co_queue_restart_all(&s->free_sema);
> +}
> +
> +qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, timeout,
> + &s->connection_co_sleep_ns_state);
> +if (s->drained) {
> +bdrv_dec_in_flight(s->bs);
> +s->wait_drained_end = true;
> +while (s->drained) {
> +/*
> + * We may be entered once from nbd_client_attach_aio_context_bh
> + * and then from nbd_client_co_drain_end. So here is a loop.
> + */
> +qemu_coroutine_yield();
> +}
> +bdrv_inc_in_flight(s->bs);
> +}
> +if (timeout < max_timeout) {
> +timeout *= 2;
> +}
> +
> +nbd_reconnect_attempt(s);
> +}
>  }
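
The sleep interval in the loop above doubles until it reaches the cap, which can be sketched in isolation as follows. The helper name next_timeout and the NS_PER_SEC macro are hypothetical stand-ins, not identifiers from the patch:

```c
#include <stdint.h>

/* Stand-in for NANOSECONDS_PER_SECOND. */
#define NS_PER_SEC UINT64_C(1000000000)

/* Next sleep interval: double the current one until the cap is reached. */
static uint64_t next_timeout(uint64_t timeout, uint64_t max_timeout)
{
    return timeout < max_timeout ? timeout * 2 : timeout;
}
```

Starting from 1 second with a 16 second cap, successive intervals would be 1, 2, 4, 8 and 16 seconds, and every later retry waits 16 seconds.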
>  
>  static coroutine_fn void nbd_connection_entry(void *opaque)
>  {
> -BDRVNBDState *s = opaque;
> +BDRVNBDState *s = 

[Qemu-devel] [RFC PATCH v4 55/75] target/i386: introduce SSE4.2 vector instructions to sse-opcode.inc.h

2019-08-21 Thread Jan Bobek
Add all the SSE4.2 vector instruction entries to sse-opcode.inc.h.

Signed-off-by: Jan Bobek 
---
 target/i386/sse-opcode.inc.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/target/i386/sse-opcode.inc.h b/target/i386/sse-opcode.inc.h
index 9682cce7ef..f43436213e 100644
--- a/target/i386/sse-opcode.inc.h
+++ b/target/i386/sse-opcode.inc.h
@@ -441,6 +441,14 @@
  * 66 0f 38 34 /r  PMOVZXWQ xmm1, xmm2/m32
  * 66 0f 38 35 /r  PMOVZXDQ xmm1, xmm2/m64
  * 66 0F 38 2A /r  MOVNTDQA xmm1, m128
+ *
+ * SSE4.2 Instructions
+ * 
+ * 66 0F 38 37 /r  PCMPGTQ xmm1,xmm2/m128
+ * 66 0F 3A 60 /r imm8 PCMPESTRM xmm1, xmm2/m128, imm8
+ * 66 0F 3A 61 /r imm8 PCMPESTRI xmm1, xmm2/m128, imm8
+ * 66 0F 3A 62 /r imm8 PCMPISTRM xmm1, xmm2/m128, imm8
+ * 66 0F 3A 63 /r imm8 PCMPISTRI xmm1, xmm2/m128, imm8
  */
 
 OPCODE(movd, LEG(NP, 0F, 0, 0x6e), MMX, WR, Pq, Ed)
@@ -646,6 +654,11 @@ OPCODE(pcmpgtw, LEG(NP, 0F, 0, 0x65), MMX, WRR, Pq, Pq, Qq)
 OPCODE(pcmpgtw, LEG(66, 0F, 0, 0x65), SSE2, WRR, Vdq, Vdq, Wdq)
 OPCODE(pcmpgtd, LEG(NP, 0F, 0, 0x66), MMX, WRR, Pq, Pq, Qq)
 OPCODE(pcmpgtd, LEG(66, 0F, 0, 0x66), SSE2, WRR, Vdq, Vdq, Wdq)
+OPCODE(pcmpgtq, LEG(66, 0F38, 0, 0x37), SSE4_2, WRR, Vdq, Vdq, Wdq)
+OPCODE(pcmpestrm, LEG(66, 0F3A, 0, 0x60), SSE4_2, RRR, Vdq, Wdq, Ib)
+OPCODE(pcmpestri, LEG(66, 0F3A, 0, 0x61), SSE4_2, RRR, Vdq, Wdq, Ib)
+OPCODE(pcmpistrm, LEG(66, 0F3A, 0, 0x62), SSE4_2, RRR, Vdq, Wdq, Ib)
+OPCODE(pcmpistri, LEG(66, 0F3A, 0, 0x63), SSE4_2, RRR, Vdq, Wdq, Ib)
 OPCODE(ptest, LEG(66, 0F38, 0, 0x17), SSE4_1, RR, Vdq, Wdq)
 OPCODE(cmpps, LEG(NP, 0F, 0, 0xc2), SSE, WRRR, Vdq, Vdq, Wdq, Ib)
 OPCODE(cmppd, LEG(66, 0F, 0, 0xc2), SSE2, WRRR, Vdq, Vdq, Wdq, Ib)
-- 
2.20.1




[Qemu-devel] [RFC PATCH v4 70/75] target/i386: convert pavgb/pavgw helpers to gvec style

2019-08-21 Thread Jan Bobek
Make these helpers suitable for use with tcg_gen_gvec_* functions.

Signed-off-by: Jan Bobek 
---
 target/i386/ops_sse.h| 33 +
 target/i386/ops_sse_header.h |  7 +--
 target/i386/translate.c  | 20 +---
 3 files changed, 43 insertions(+), 17 deletions(-)
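
As a reference for the element-wise operation, a minimal sketch of the rounding average follows. The function names are hypothetical and this is not code from the patch; it only illustrates why the open-coded expression does not overflow.

```c
#include <stdint.h>

/*
 * Rounding average of two unsigned bytes.
 * Integer promotion widens both operands to int, so a + b + 1 cannot
 * overflow before the shift.
 */
static uint8_t avg_round_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Same operation on 16-bit words; int is assumed to be wider than 16 bits. */
static uint16_t avg_round_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)((a + b + 1) >> 1);
}
```

The "+ 1" rounds halves upward, so averaging 1 and 2 yields 2 rather than 1.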

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 6ec116573b..1661bd7c64 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -413,8 +413,6 @@ static inline int satsw(int x)
 }
 
 #define FMULHRW(a, b) (((int16_t)(a) * (int16_t)(b) + 0x8000) >> 16)
-
-#define FAVG(a, b) (((a) + (b) + 1) >> 1)
 #endif
 
 void glue(helper_pmullw, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
@@ -457,8 +455,35 @@ void glue(helper_pmulhw, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
 glue(clear_high, SUFFIX)(d, oprsz, maxsz);
 }
 
-SSE_HELPER_B(helper_pavgb, FAVG)
-SSE_HELPER_W(helper_pavgw, FAVG)
+void glue(helper_pavgb, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
+{
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+
+for (intptr_t i = 0; i * sizeof(uint8_t) < oprsz; ++i) {
+d->B(i) = (a->B(i) + b->B(i) + 1) >> 1;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
+}
+
+#if SHIFT == 0
+void glue(helper_pavgusb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+const uint32_t desc = simd_desc(sizeof(Reg), sizeof(Reg), 0);
+glue(helper_pavgb, SUFFIX)(d, d, s, desc);
+}
+#endif
+
+void glue(helper_pavgw, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
+{
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+
+for (intptr_t i = 0; i * sizeof(uint16_t) < oprsz; ++i) {
+d->W(i) = (a->W(i) + b->W(i) + 1) >> 1;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
+}
 
 void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 7e6411fc82..b5e8aae897 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -65,8 +65,11 @@ DEF_HELPER_3(glue(pmulhrw, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_4(glue(pmulhuw, SUFFIX), void, Reg, Reg, Reg, i32)
 DEF_HELPER_4(glue(pmulhw, SUFFIX), void, Reg, Reg, Reg, i32)
 
-DEF_HELPER_3(glue(pavgb, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pavgw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pavgb, SUFFIX), void, Reg, Reg, Reg, i32)
+#if SHIFT == 0
+DEF_HELPER_3(glue(pavgusb, SUFFIX), void, env, Reg, Reg)
+#endif
+DEF_HELPER_4(glue(pavgw, SUFFIX), void, Reg, Reg, Reg, i32)
 
 DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg)
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 79f8c1ddac..77b2e18f34 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2803,8 +2803,6 @@ static const SSEFunc_0_epp sse_op_table1[256][4] = {
 [0xd0] = { NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps },
 [0xd6] = { NULL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },
 [0xd7] = { SSE_SPECIAL, SSE_SPECIAL }, /* pmovmskb */
-[0xe0] = MMX_OP2(pavgb),
-[0xe3] = MMX_OP2(pavgw),
 [0xe6] = { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_helper_cvtpd2dq },
 [0xe7] = { SSE_SPECIAL , SSE_SPECIAL },  /* movntq, movntq */
 [0xf0] = { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */
@@ -2878,7 +2876,7 @@ static const SSEFunc_0_epp sse_op_table5[256] = {
 [0xb6] = gen_helper_movq, /* pfrcpit2 */
 [0xb7] = gen_helper_pmulhrw_mmx,
 [0xbb] = gen_helper_pswapd,
-[0xbf] = gen_helper_pavgb_mmx /* pavgusb */
+[0xbf] = gen_helper_pavgusb_mmx
 };
 
 struct SSEOpHelper_epp {
@@ -6252,14 +6250,14 @@ DEF_GEN_INSN3_HELPER_EPP(maxss, maxss, Vd, Vd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(vmaxss, maxss, Vd, Hd, Wd)
 DEF_GEN_INSN3_HELPER_EPP(maxsd, maxsd, Vq, Vq, Wq)
 DEF_GEN_INSN3_HELPER_EPP(vmaxsd, maxsd, Vq, Hq, Wq)
-DEF_GEN_INSN3_HELPER_EPP(pavgb, pavgb_mmx, Pq, Pq, Qq)
-DEF_GEN_INSN3_HELPER_EPP(pavgb, pavgb_xmm, Vdq, Vdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpavgb, pavgb_xmm, Vdq, Hdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpavgb, pavgb_xmm, Vqq, Hqq, Wqq)
-DEF_GEN_INSN3_HELPER_EPP(pavgw, pavgw_mmx, Pq, Pq, Qq)
-DEF_GEN_INSN3_HELPER_EPP(pavgw, pavgw_xmm, Vdq, Vdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpavgw, pavgw_xmm, Vdq, Hdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpavgw, pavgw_xmm, Vqq, Hqq, Wqq)
+DEF_GEN_INSN3_GVEC(pavgb, Pq, Pq, Qq, 3_ool, MM_OPRSZ, MM_MAXSZ, pavgb_mmx)
+DEF_GEN_INSN3_GVEC(pavgb, Vdq, Vdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pavgb_xmm)
+DEF_GEN_INSN3_GVEC(vpavgb, Vdq, Hdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pavgb_xmm)
+DEF_GEN_INSN3_GVEC(vpavgb, Vqq, Hqq, Wqq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pavgb_xmm)
+DEF_GEN_INSN3_GVEC(pavgw, Pq, Pq, Qq, 3_ool, MM_OPRSZ, MM_MAXSZ, pavgw_mmx)
+DEF_GEN_INSN3_GVEC(pavgw, Vdq, Vdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pavgw_xmm)
+DEF_GEN_INSN3_GVEC(vpavgw, Vdq, Hdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, 

[Qemu-devel] [RFC PATCH v4 69/75] target/i386: convert pmullw/pmulhw/pmulhuw helpers to gvec style

2019-08-21 Thread Jan Bobek
Make these helpers suitable for use with tcg_gen_gvec_* functions.

Signed-off-by: Jan Bobek 
---
 target/i386/ops_sse.h| 42 ++--
 target/i386/ops_sse_header.h |  6 +++---
 target/i386/translate.c  | 27 +++
 3 files changed, 51 insertions(+), 24 deletions(-)
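
The difference between the unsigned and signed high-half multiplies can be illustrated with the following standalone sketch. The function names are hypothetical and the code is not taken from the patch; the signed variant assumes the operands have already been reinterpreted as int16_t.

```c
#include <stdint.h>

/* High 16 bits of the unsigned 16x16 -> 32 bit product. */
static uint16_t mulhi_u16(uint16_t a, uint16_t b)
{
    const uint32_t t = (uint32_t)a * (uint32_t)b;
    return (uint16_t)(t >> 16);
}

/*
 * High 16 bits of the signed 16x16 -> 32 bit product.
 * Right-shifting a negative value is implementation-defined in C; an
 * arithmetic shift (as GCC and Clang provide) is assumed here.
 */
static int16_t mulhi_s16(int16_t a, int16_t b)
{
    const int32_t t = (int32_t)a * (int32_t)b;
    return (int16_t)(t >> 16);
}
```

Note that the signed result depends on the inputs being sign-extended before the multiply; widening the raw unsigned 16-bit words straight to int32_t would compute the unsigned product instead.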

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 168e581c0c..6ec116573b 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -412,20 +412,50 @@ static inline int satsw(int x)
 }
 }
 
-#define FMULLW(a, b) ((a) * (b))
 #define FMULHRW(a, b) (((int16_t)(a) * (int16_t)(b) + 0x8000) >> 16)
-#define FMULHUW(a, b) ((a) * (b) >> 16)
-#define FMULHW(a, b) ((int16_t)(a) * (int16_t)(b) >> 16)
 
 #define FAVG(a, b) (((a) + (b) + 1) >> 1)
 #endif
 
-SSE_HELPER_W(helper_pmullw, FMULLW)
+void glue(helper_pmullw, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
+{
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+
+for (intptr_t i = 0; i * sizeof(uint16_t) < oprsz; ++i) {
+const uint32_t t = (uint32_t)a->W(i) * (uint32_t)b->W(i);
+d->W(i) = t;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
+}
+
 #if SHIFT == 0
 SSE_HELPER_W(helper_pmulhrw, FMULHRW)
 #endif
-SSE_HELPER_W(helper_pmulhuw, FMULHUW)
-SSE_HELPER_W(helper_pmulhw, FMULHW)
+
+void glue(helper_pmulhuw, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
+{
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+
+for (intptr_t i = 0; i * sizeof(uint16_t) < oprsz; ++i) {
+const uint32_t t = (uint32_t)a->W(i) * (uint32_t)b->W(i);
+d->W(i) = t >> 16;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
+}
+
+void glue(helper_pmulhw, SUFFIX)(Reg *d, Reg *a, Reg *b, uint32_t desc)
+{
+const intptr_t oprsz = simd_oprsz(desc);
+const intptr_t maxsz = simd_maxsz(desc);
+
+for (intptr_t i = 0; i * sizeof(uint16_t) < oprsz; ++i) {
+const int32_t t = (int16_t)a->W(i) * (int16_t)b->W(i);
+d->W(i) = t >> 16;
+}
+glue(clear_high, SUFFIX)(d, oprsz, maxsz);
+}
 
 SSE_HELPER_B(helper_pavgb, FAVG)
 SSE_HELPER_W(helper_pavgw, FAVG)
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 724692a689..7e6411fc82 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -58,12 +58,12 @@ DEF_HELPER_3(glue(pslldqi, SUFFIX), void, Reg, Reg, i32)
 DEF_HELPER_3(glue(psrldqi, SUFFIX), void, Reg, Reg, i32)
 #endif
 
-DEF_HELPER_3(glue(pmullw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pmullw, SUFFIX), void, Reg, Reg, Reg, i32)
 #if SHIFT == 0
 DEF_HELPER_3(glue(pmulhrw, SUFFIX), void, env, Reg, Reg)
 #endif
-DEF_HELPER_3(glue(pmulhuw, SUFFIX), void, env, Reg, Reg)
-DEF_HELPER_3(glue(pmulhw, SUFFIX), void, env, Reg, Reg)
+DEF_HELPER_4(glue(pmulhuw, SUFFIX), void, Reg, Reg, Reg, i32)
+DEF_HELPER_4(glue(pmulhw, SUFFIX), void, Reg, Reg, Reg, i32)
 
 DEF_HELPER_3(glue(pavgb, SUFFIX), void, env, Reg, Reg)
 DEF_HELPER_3(glue(pavgw, SUFFIX), void, env, Reg, Reg)
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 03f7c6e450..79f8c1ddac 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2801,13 +2801,10 @@ static const SSEFunc_0_epp sse_op_table1[256][4] = {
 [0xc4] = { SSE_SPECIAL, SSE_SPECIAL }, /* pinsrw */
 [0xc5] = { SSE_SPECIAL, SSE_SPECIAL }, /* pextrw */
 [0xd0] = { NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps },
-[0xd5] = MMX_OP2(pmullw),
 [0xd6] = { NULL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL },
 [0xd7] = { SSE_SPECIAL, SSE_SPECIAL }, /* pmovmskb */
 [0xe0] = MMX_OP2(pavgb),
 [0xe3] = MMX_OP2(pavgw),
-[0xe4] = MMX_OP2(pmulhuw),
-[0xe5] = MMX_OP2(pmulhw),
 [0xe6] = { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_helper_cvtpd2dq },
 [0xe7] = { SSE_SPECIAL , SSE_SPECIAL },  /* movntq, movntq */
 [0xf0] = { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */
@@ -6116,21 +6113,21 @@ DEF_GEN_INSN3_HELPER_EPP(addsubpd, addsubpd, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vaddsubpd, addsubpd, Vdq, Hdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vaddsubpd, addsubpd, Vqq, Hqq, Wqq)
 
-DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_mmx, Pq, Pq, Qq)
-DEF_GEN_INSN3_HELPER_EPP(pmullw, pmullw_xmm, Vdq, Vdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpmullw, pmullw_xmm, Vdq, Hdq, Wdq)
-DEF_GEN_INSN3_HELPER_EPP(vpmullw, pmullw_xmm, Vqq, Hqq, Wqq)
+DEF_GEN_INSN3_GVEC(pmullw, Pq, Pq, Qq, 3_ool, MM_OPRSZ, MM_MAXSZ, pmullw_mmx)
+DEF_GEN_INSN3_GVEC(pmullw, Vdq, Vdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pmullw_xmm)
+DEF_GEN_INSN3_GVEC(vpmullw, Vdq, Hdq, Wdq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pmullw_xmm)
+DEF_GEN_INSN3_GVEC(vpmullw, Vqq, Hqq, Wqq, 3_ool, XMM_OPRSZ, XMM_MAXSZ, pmullw_xmm)
 DEF_GEN_INSN3_HELPER_EPP(pmulld, pmulld_xmm, Vdq, Vdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vpmulld, pmulld_xmm, Vdq, Hdq, Wdq)
 DEF_GEN_INSN3_HELPER_EPP(vpmulld, pmulld_xmm, Vqq, Hqq, Wqq)

[Qemu-devel] [RFC PATCH v4 60/75] target/i386: introduce AVX code generators

2019-08-21 Thread Jan Bobek
Introduce code generators required by AVX instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 954 
 1 file changed, 954 insertions(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 9b9f0d4ed1..50eab9181c 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -5586,6 +5586,14 @@ GEN_INSN2(movd, Ed, Vdq)
 {
 tcg_gen_ld_i32(arg1, cpu_env, arg2 + offsetof(ZMMReg, ZMM_L(0)));
 }
+GEN_INSN2(vmovd, Vdq, Ed)
+{
+gen_insn2(movd, Vdq, Ed)(env, s, arg1, arg2);
+}
+GEN_INSN2(vmovd, Ed, Vdq)
+{
+gen_insn2(movd, Ed, Vdq)(env, s, arg1, arg2);
+}
 
 GEN_INSN2(movq, Pq, Eq)
 {
@@ -5607,6 +5615,14 @@ GEN_INSN2(movq, Eq, Vdq)
 {
 tcg_gen_ld_i64(arg1, cpu_env, arg2 + offsetof(ZMMReg, ZMM_Q(0)));
 }
+GEN_INSN2(vmovq, Vdq, Eq)
+{
+gen_insn2(movq, Vdq, Eq)(env, s, arg1, arg2);
+}
+GEN_INSN2(vmovq, Eq, Vdq)
+{
+gen_insn2(movq, Eq, Vdq)(env, s, arg1, arg2);
+}
 
 DEF_GEN_INSN2_GVEC(movq, Pq, Qq, mov, MM_OPRSZ, MM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movq, Qq, Pq, mov, MM_OPRSZ, MM_MAXSZ, MO_64)
@@ -5621,19 +5637,51 @@ GEN_INSN2(movq, UdqMq, Vq)
 {
 gen_insn2(movq, Vdq, Wq)(env, s, arg1, arg2);
 }
+GEN_INSN2(vmovq, Vdq, Wq)
+{
+gen_insn2(movq, Vdq, Wq)(env, s, arg1, arg2);
+}
+GEN_INSN2(vmovq, UdqMq, Vq)
+{
+gen_insn2(movq, UdqMq, Vq)(env, s, arg1, arg2);
+}
 
 DEF_GEN_INSN2_GVEC(movaps, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movaps, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovaps, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovaps, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovaps, Vqq, Wqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovaps, Wqq, Vqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movapd, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movapd, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovapd, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovapd, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovapd, Vqq, Wqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovapd, Wqq, Vqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movdqa, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movdqa, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovdqa, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovdqa, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovdqa, Vqq, Wqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovdqa, Wqq, Vqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movups, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movups, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovups, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovups, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovups, Vqq, Wqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovups, Wqq, Vqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movupd, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movupd, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovupd, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovupd, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovupd, Vqq, Wqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovupd, Wqq, Vqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movdqu, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 DEF_GEN_INSN2_GVEC(movdqu, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovdqu, Vdq, Wdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovdqu, Wdq, Vdq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovdqu, Vqq, Wqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
+DEF_GEN_INSN2_GVEC(vmovdqu, Wqq, Vqq, mov, XMM_OPRSZ, XMM_MAXSZ, MO_64)
 
 GEN_INSN4(movss, Vdq, Vdq, Wd, modrm_mod)
 {
@@ -5664,6 +5712,24 @@ GEN_INSN2(movss, Wd, Vd)
 {
 gen_insn4(movss, Vdq, Vdq, Wd, modrm_mod)(env, s, arg1, arg1, arg2, 3);
 }
+GEN_INSN5(vmovss, Vdq, Hdq, Wd, modrm_mod, vex_v)
+{
+if (arg4 == 3 || arg5 == 0) {
+gen_insn4(movss, Vdq, Vdq, Wd, modrm_mod)(env, s, arg1,
+  arg2, arg3, arg4);
+} else {
+gen_unknown_opcode(env, s);
+}
+}
+GEN_INSN5(vmovss, Wdq, Hdq, Vd, modrm_mod, vex_v)
+{
+if (arg4 == 3 || arg5 == 0) {
+gen_insn4(movss, Vdq, Vdq, Wd, modrm_mod)(env, s, arg1,
+  arg2, arg3, 3);
+} else {
+gen_unknown_opcode(env, s);
+}
+}
 
 GEN_INSN4(movsd, Vdq, Vdq, Wq, modrm_mod)
 {
@@ -5687,6 +5753,24 @@ GEN_INSN2(movsd, Wq, Vq)
 {
 gen_insn4(movsd, Vdq, Vdq, Wq, modrm_mod)(env, s, arg1, arg1, arg2, 3);
 }
+GEN_INSN5(vmovsd, Vdq, Hdq, Wq, modrm_mod, vex_v)
+{
+if (arg4 == 

[Qemu-devel] [RFC PATCH v4 48/75] target/i386: introduce SSSE3 translators

2019-08-21 Thread Jan Bobek
Use the translator macros to define translators required by SSSE3
instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index d449a64464..25d3b969b1 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -6695,6 +6695,7 @@ DEF_TRANSLATE_INSN3(Vq, Vq, Wq)
 }   \
 }
 
+DEF_TRANSLATE_INSN4(Pq, Pq, Qq, Ib)
 DEF_TRANSLATE_INSN4(Pq, Pq, RdMw, Ib)
 DEF_TRANSLATE_INSN4(Vd, Vd, Wd, Ib)
 DEF_TRANSLATE_INSN4(Vdq, Vdq, RdMw, Ib)
-- 
2.20.1




[Qemu-devel] [RFC PATCH v4 65/75] target/i386: remove obsoleted helpers

2019-08-21 Thread Jan Bobek
A number of helpers have been obsoleted by the use of tcg_gen_gvec_*
functions; remove all of them.

Signed-off-by: Jan Bobek 
---
 target/i386/ops_sse.h| 65 
 target/i386/ops_sse_header.h | 39 --
 target/i386/translate.c  | 38 -
 3 files changed, 142 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index ec1ec745d0..aca6b50f23 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -337,32 +337,6 @@ static inline int satsw(int x)
 }
 }
 
-#define FADD(a, b) ((a) + (b))
-#define FADDUB(a, b) satub((a) + (b))
-#define FADDUW(a, b) satuw((a) + (b))
-#define FADDSB(a, b) satsb((int8_t)(a) + (int8_t)(b))
-#define FADDSW(a, b) satsw((int16_t)(a) + (int16_t)(b))
-
-#define FSUB(a, b) ((a) - (b))
-#define FSUBUB(a, b) satub((a) - (b))
-#define FSUBUW(a, b) satuw((a) - (b))
-#define FSUBSB(a, b) satsb((int8_t)(a) - (int8_t)(b))
-#define FSUBSW(a, b) satsw((int16_t)(a) - (int16_t)(b))
-#define FMINUB(a, b) ((a) < (b)) ? (a) : (b)
-#define FMINSW(a, b) ((int16_t)(a) < (int16_t)(b)) ? (a) : (b)
-#define FMAXUB(a, b) ((a) > (b)) ? (a) : (b)
-#define FMAXSW(a, b) ((int16_t)(a) > (int16_t)(b)) ? (a) : (b)
-
-#define FAND(a, b) ((a) & (b))
-#define FANDN(a, b) ((~(a)) & (b))
-#define FOR(a, b) ((a) | (b))
-#define FXOR(a, b) ((a) ^ (b))
-
-#define FCMPGTB(a, b) ((int8_t)(a) > (int8_t)(b) ? -1 : 0)
-#define FCMPGTW(a, b) ((int16_t)(a) > (int16_t)(b) ? -1 : 0)
-#define FCMPGTL(a, b) ((int32_t)(a) > (int32_t)(b) ? -1 : 0)
-#define FCMPEQ(a, b) ((a) == (b) ? -1 : 0)
-
 #define FMULLW(a, b) ((a) * (b))
 #define FMULHRW(a, b) (((int16_t)(a) * (int16_t)(b) + 0x8000) >> 16)
 #define FMULHUW(a, b) ((a) * (b) >> 16)
@@ -371,45 +345,6 @@ static inline int satsw(int x)
 #define FAVG(a, b) (((a) + (b) + 1) >> 1)
 #endif
 
-SSE_HELPER_B(helper_paddb, FADD)
-SSE_HELPER_W(helper_paddw, FADD)
-SSE_HELPER_L(helper_paddl, FADD)
-SSE_HELPER_Q(helper_paddq, FADD)
-
-SSE_HELPER_B(helper_psubb, FSUB)
-SSE_HELPER_W(helper_psubw, FSUB)
-SSE_HELPER_L(helper_psubl, FSUB)
-SSE_HELPER_Q(helper_psubq, FSUB)
-
-SSE_HELPER_B(helper_paddusb, FADDUB)
-SSE_HELPER_B(helper_paddsb, FADDSB)
-SSE_HELPER_B(helper_psubusb, FSUBUB)
-SSE_HELPER_B(helper_psubsb, FSUBSB)
-
-SSE_HELPER_W(helper_paddusw, FADDUW)
-SSE_HELPER_W(helper_paddsw, FADDSW)
-SSE_HELPER_W(helper_psubusw, FSUBUW)
-SSE_HELPER_W(helper_psubsw, FSUBSW)
-
-SSE_HELPER_B(helper_pminub, FMINUB)
-SSE_HELPER_B(helper_pmaxub, FMAXUB)
-
-SSE_HELPER_W(helper_pminsw, FMINSW)
-SSE_HELPER_W(helper_pmaxsw, FMAXSW)
-
-SSE_HELPER_Q(helper_pand, FAND)
-SSE_HELPER_Q(helper_pandn, FANDN)
-SSE_HELPER_Q(helper_por, FOR)
-SSE_HELPER_Q(helper_pxor, FXOR)
-
-SSE_HELPER_B(helper_pcmpgtb, FCMPGTB)
-SSE_HELPER_W(helper_pcmpgtw, FCMPGTW)
-SSE_HELPER_L(helper_pcmpgtl, FCMPGTL)
-
-SSE_HELPER_B(helper_pcmpeqb, FCMPEQ)
-SSE_HELPER_W(helper_pcmpeqw, FCMPEQ)
-SSE_HELPER_L(helper_pcmpeql, FCMPEQ)
-
 SSE_HELPER_W(helper_pmullw, FMULLW)
 #if SHIFT == 0
 SSE_HELPER_W(helper_pmulhrw, FMULHRW)
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index 094aafc573..d8e33dff6b 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -60,45 +60,6 @@ DEF_HELPER_3(glue(pslldq, SUFFIX), void, env, Reg, Reg)
 #define SSE_HELPER_Q(name, F)\
 DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg)
 
-SSE_HELPER_B(paddb, FADD)
-SSE_HELPER_W(paddw, FADD)
-SSE_HELPER_L(paddl, FADD)
-SSE_HELPER_Q(paddq, FADD)
-
-SSE_HELPER_B(psubb, FSUB)
-SSE_HELPER_W(psubw, FSUB)
-SSE_HELPER_L(psubl, FSUB)
-SSE_HELPER_Q(psubq, FSUB)
-
-SSE_HELPER_B(paddusb, FADDUB)
-SSE_HELPER_B(paddsb, FADDSB)
-SSE_HELPER_B(psubusb, FSUBUB)
-SSE_HELPER_B(psubsb, FSUBSB)
-
-SSE_HELPER_W(paddusw, FADDUW)
-SSE_HELPER_W(paddsw, FADDSW)
-SSE_HELPER_W(psubusw, FSUBUW)
-SSE_HELPER_W(psubsw, FSUBSW)
-
-SSE_HELPER_B(pminub, FMINUB)
-SSE_HELPER_B(pmaxub, FMAXUB)
-
-SSE_HELPER_W(pminsw, FMINSW)
-SSE_HELPER_W(pmaxsw, FMAXSW)
-
-SSE_HELPER_Q(pand, FAND)
-SSE_HELPER_Q(pandn, FANDN)
-SSE_HELPER_Q(por, FOR)
-SSE_HELPER_Q(pxor, FXOR)
-
-SSE_HELPER_B(pcmpgtb, FCMPGTB)
-SSE_HELPER_W(pcmpgtw, FCMPGTW)
-SSE_HELPER_L(pcmpgtl, FCMPGTL)
-
-SSE_HELPER_B(pcmpeqb, FCMPEQ)
-SSE_HELPER_W(pcmpeqw, FCMPEQ)
-SSE_HELPER_L(pcmpeql, FCMPEQ)
-
 SSE_HELPER_W(pmullw, FMULLW)
 #if SHIFT == 0
 SSE_HELPER_W(pmulhrw, FMULHRW)
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 3149989d68..78c91a85c9 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2756,19 +2756,11 @@ static const SSEFunc_0_epp sse_op_table1[256][4] = {
 [0x51] = SSE_FOP(sqrt),
 [0x52] = { gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL },
 [0x53] = { gen_helper_rcpps, NULL, gen_helper_rcpss, NULL },
-[0x54] = { gen_helper_pand_xmm, gen_helper_pand_xmm }, /* andps, andpd */
-[0x55] = { gen_helper_pandn_xmm, gen_helper_pandn_xmm }, /* andnps, andnpd */
-[0x56] = { gen_helper_por_xmm, gen_helper_por_xmm }, /* orps, orpd */

[Qemu-devel] [RFC PATCH v4 62/75] target/i386: introduce AVX2 translators

2019-08-21 Thread Jan Bobek
Use the translator macros to define translators required by AVX2
instructions.

Signed-off-by: Jan Bobek 
---
 target/i386/translate.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 50eab9181c..3f4bb40932 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -7692,11 +7692,11 @@ DEF_TRANSLATE_INSN2(Vd, Wd)
 DEF_TRANSLATE_INSN2(Vd, Wq)
 DEF_TRANSLATE_INSN2(Vdq, Ed)
 DEF_TRANSLATE_INSN2(Vdq, Eq)
-DEF_TRANSLATE_INSN2(Vdq, Md)
 DEF_TRANSLATE_INSN2(Vdq, Mdq)
 DEF_TRANSLATE_INSN2(Vdq, Nq)
 DEF_TRANSLATE_INSN2(Vdq, Qq)
 DEF_TRANSLATE_INSN2(Vdq, Udq)
+DEF_TRANSLATE_INSN2(Vdq, Wb)
 DEF_TRANSLATE_INSN2(Vdq, Wd)
 DEF_TRANSLATE_INSN2(Vdq, Wdq)
 DEF_TRANSLATE_INSN2(Vdq, Wq)
@@ -7706,12 +7706,14 @@ DEF_TRANSLATE_INSN2(Vq, Ed)
 DEF_TRANSLATE_INSN2(Vq, Eq)
 DEF_TRANSLATE_INSN2(Vq, Wd)
 DEF_TRANSLATE_INSN2(Vq, Wq)
-DEF_TRANSLATE_INSN2(Vqq, Md)
 DEF_TRANSLATE_INSN2(Vqq, Mdq)
-DEF_TRANSLATE_INSN2(Vqq, Mq)
 DEF_TRANSLATE_INSN2(Vqq, Mqq)
+DEF_TRANSLATE_INSN2(Vqq, Wb)
+DEF_TRANSLATE_INSN2(Vqq, Wd)
 DEF_TRANSLATE_INSN2(Vqq, Wdq)
+DEF_TRANSLATE_INSN2(Vqq, Wq)
 DEF_TRANSLATE_INSN2(Vqq, Wqq)
+DEF_TRANSLATE_INSN2(Vqq, Ww)
 DEF_TRANSLATE_INSN2(Wd, Vd)
 DEF_TRANSLATE_INSN2(Wdq, Vdq)
 DEF_TRANSLATE_INSN2(Wq, Vq)
@@ -7763,6 +7765,7 @@ DEF_TRANSLATE_INSN3(Gd, Udq, Ib)
 DEF_TRANSLATE_INSN3(Gq, Nq, Ib)
 DEF_TRANSLATE_INSN3(Gq, Udq, Ib)
 DEF_TRANSLATE_INSN3(Hdq, Udq, Ib)
+DEF_TRANSLATE_INSN3(Hqq, Uqq, Ib)
 DEF_TRANSLATE_INSN3(Mdq, Hdq, Vdq)
 DEF_TRANSLATE_INSN3(Mqq, Hqq, Vqq)
 DEF_TRANSLATE_INSN3(Nq, Nq, Ib)
@@ -7789,6 +7792,7 @@ DEF_TRANSLATE_INSN3(Vdq, Vdq, UdqMhq)
 DEF_TRANSLATE_INSN3(Vdq, Vdq, Wdq)
 DEF_TRANSLATE_INSN3(Vdq, Vq, Mq)
 DEF_TRANSLATE_INSN3(Vdq, Vq, Wq)
+DEF_TRANSLATE_INSN3(Vdq, Wd, modrm_mod)
 DEF_TRANSLATE_INSN3(Vdq, Wdq, Ib)
 DEF_TRANSLATE_INSN3(Vq, Hq, Ed)
 DEF_TRANSLATE_INSN3(Vq, Hq, Eq)
@@ -7797,7 +7801,10 @@ DEF_TRANSLATE_INSN3(Vq, Hq, Wq)
 DEF_TRANSLATE_INSN3(Vq, Vq, Wq)
 DEF_TRANSLATE_INSN3(Vq, Wq, Ib)
 DEF_TRANSLATE_INSN3(Vqq, Hqq, Mqq)
+DEF_TRANSLATE_INSN3(Vqq, Hqq, Wdq)
 DEF_TRANSLATE_INSN3(Vqq, Hqq, Wqq)
+DEF_TRANSLATE_INSN3(Vqq, Wd, modrm_mod)
+DEF_TRANSLATE_INSN3(Vqq, Wq, modrm_mod)
 DEF_TRANSLATE_INSN3(Vqq, Wqq, Ib)
 DEF_TRANSLATE_INSN3(Wdq, Vqq, Ib)
 
@@ -7921,8 +7928,14 @@ DEF_TRANSLATE_INSN4(Vqq, Hqq, Wqq, Lqq)
 }   \
 }
 
+DEF_TRANSLATE_INSN5(Vdq, Hdq, Vdq, MDdq, Hdq)
+DEF_TRANSLATE_INSN5(Vdq, Hdq, Vdq, MQdq, Hdq)
+DEF_TRANSLATE_INSN5(Vdq, Hdq, Vdq, MQqq, Hdq)
 DEF_TRANSLATE_INSN5(Vdq, Hdq, Wd, modrm_mod, vex_v)
 DEF_TRANSLATE_INSN5(Vdq, Hdq, Wq, modrm_mod, vex_v)
+DEF_TRANSLATE_INSN5(Vqq, Hqq, Vqq, MDdq, Hqq)
+DEF_TRANSLATE_INSN5(Vqq, Hqq, Vqq, MDqq, Hqq)
+DEF_TRANSLATE_INSN5(Vqq, Hqq, Vqq, MQqq, Hqq)
 DEF_TRANSLATE_INSN5(Wdq, Hdq, Vd, modrm_mod, vex_v)
 DEF_TRANSLATE_INSN5(Wdq, Hdq, Vq, modrm_mod, vex_v)
 
-- 
2.20.1



