date:20210603

Re: [PATCH v3 33/33] block/nbd: drop connection_co

2021-06-03 Thread Vladimir Sementsov-Ogievskiy


04.06.2021 00:27, Eric Blake wrote:

On Fri, Apr 16, 2021 at 11:09:11AM +0300, Vladimir Sementsov-Ogievskiy wrote:

OK, that's a big rewrite of the logic.

Pre-patch we have an always running coroutine - connection_co. It does
reply receiving and reconnecting. And it leads to a lot of difficult
and unobvious code around drained sections and context switch. We also
abuse bs->in_flight counter which is increased for connection_co and
temporary decreased in points where we want to allow drained section to
begin. One of these place is in another file: in nbd_read_eof() in
nbd/client.c.

We also cancel reconnect and requests waiting for reconnect on drained
begin which is not correct.

Let's finally drop this always running coroutine and go another way:

1. reconnect_attempt() goes to nbd_co_send_request and called under
send_mutex

2. We do receive headers in request coroutine. But we also should
dispatch replies for another pending requests. So,
nbd_connection_entry() is turned into nbd_receive_replies(), which
does reply dispatching until it receive another request headers, and
returns when it receive the requested header.

3. All old staff around drained sections and context switch is dropped.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c  | 376 ---
  nbd/client.c |   2 -
  2 files changed, 119 insertions(+), 259 deletions(-)




-static coroutine_fn void nbd_connection_entry(void *opaque)
+static coroutine_fn void nbd_receive_replies(BDRVNBDState *s, uint64_t handle)
  {
-BDRVNBDState *s = opaque;
  uint64_t i;
  int ret = 0;
  Error *local_err = NULL;
  
-while (qatomic_load_acquire(>state) != NBD_CLIENT_QUIT) {

-/*
- * The NBD client can only really be considered idle when it has
- * yielded from qio_channel_readv_all_eof(), waiting for data. This is
- * the point where the additional scheduled coroutine entry happens
- * after nbd_client_attach_aio_context().
- *
- * Therefore we keep an additional in_flight reference all the time and
- * only drop it temporarily here.
- */
+i = HANDLE_TO_INDEX(s, handle);
+if (s->receive_co) {
+assert(s->receive_co != qemu_coroutine_self());
  
-if (nbd_client_connecting(s)) {

-nbd_co_reconnect_loop(s);
-}
+/* Another request coroutine is receiving now */
+s->requests[i].receiving = true;
+qemu_coroutine_yield();
+assert(!s->requests[i].receiving);
  
-if (!nbd_client_connected(s)) {

-continue;
+if (s->receive_co != qemu_coroutine_self()) {
+/*
+ * We are either failed or done, caller uses nbd_client_connected()
+ * to distinguish.
+ */
+return;
  }
+}
+
+assert(s->receive_co == 0 || s->receive_co == qemu_coroutine_self());


s/0/NULL/ here


+s->receive_co = qemu_coroutine_self();
  
+while (nbd_client_connected(s)) {

  assert(s->reply.handle == 0);
  ret = nbd_receive_reply(s->bs, s->ioc, >reply, _err);
  
@@ -522,8 +380,21 @@ static coroutine_fn void nbd_connection_entry(void *opaque)

  local_err = NULL;
  }
  if (ret <= 0) {
-nbd_channel_error(s, ret ? ret : -EIO);
-continue;
+ret = ret ? ret : -EIO;
+nbd_channel_error(s, ret);
+goto out;
+}
+
+if (!nbd_client_connected(s)) {
+ret = -EIO;
+goto out;
+}
+
+i = HANDLE_TO_INDEX(s, s->reply.handle);
+
+if (s->reply.handle == handle) {
+ret = 0;
+goto out;
  }
  
  /*


I know your followup said there is more work to do before v4, but I
look forward to seeing it.




Great thanks for reviewing this huge series! Now is my turn to make v4.

--
Best regards,
Vladimir

Re: [PATCH v3 32/33] block/nbd: safer transition to receiving request

2021-06-03 Thread Vladimir Sementsov-Ogievskiy


04.06.2021 00:11, Eric Blake wrote:

On Fri, Apr 16, 2021 at 11:09:10AM +0300, Vladimir Sementsov-Ogievskiy wrote:

req->receiving is a flag of request being in one concrete yield point
in nbd_co_do_receive_one_chunk().

Such kind of boolean flag is always better to unset before scheduling
the coroutine, to avoid double scheduling. So, let's be more careful.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)




@@ -614,7 +616,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
  if (qiov) {
  qio_channel_set_cork(s->ioc, true);
  rc = nbd_send_request(s->ioc, request);
-if (nbd_clinet_connected(s) && rc >= 0) {
+if (nbd_client_connected(s) && rc >= 0) {


Ouch - typo fix in clinet seems unrelated in this fix; please hoist it
into the correct point in the series so that we don't have the typo in
the first place.


That also means that I didn't made my favorite "git rebase -x 'make -j9' 
master".. Not good, will fix of course.



Otherwise,
Reviewed-by: Eric Blake 




--
Best regards,
Vladimir

Re: [PATCH v3 28/33] nbd/client-connection: do qio_channel_set_delay(false)

2021-06-03 Thread Vladimir Sementsov-Ogievskiy


03.06.2021 23:48, Eric Blake wrote:

On Fri, Apr 16, 2021 at 11:09:06AM +0300, Vladimir Sementsov-Ogievskiy wrote:

nbd_open() does it (through nbd_establish_connection()).
Actually we lost that call on reconnect path in 1dc4718d849e1a1fe
"block/nbd: use non-blocking connect: fix vm hang on connect()"
when we have introduced reconnect thread.


s/have introduced/introduced the/



Fixes: 1dc4718d849e1a1fe665ce5241ed79048cfa2cfc
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  nbd/client-connection.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/nbd/client-connection.c b/nbd/client-connection.c
index 36d2c7c693..00efff293f 100644
--- a/nbd/client-connection.c
+++ b/nbd/client-connection.c
@@ -126,6 +126,8 @@ static int nbd_connect(QIOChannelSocket *sioc, 
SocketAddress *addr,
  return ret;
  }
  
+qio_channel_set_delay(QIO_CHANNEL(sioc), false);

+
  if (!info) {
  return 0;
  }


Reviewed-by: Eric Blake 

Is this bug fix something that can be cherry-picked in isolation, or
does it depend too much on the rest of the series?



Will try to move to start of the series

--
Best regards,
Vladimir

Re: [PATCH v3 27/33] block/nbd: split nbd_co_do_establish_connection out of nbd_reconnect_attempt

2021-06-03 Thread Vladimir Sementsov-Ogievskiy


03.06.2021 23:04, Eric Blake wrote:

On Fri, Apr 16, 2021 at 11:09:05AM +0300, Vladimir Sementsov-Ogievskiy wrote:

Split out part, that we want to reuse for nbd_open().


Split out the part that we want to reuse for nbd_open().



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 79 +++--
  1 file changed, 41 insertions(+), 38 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 15b5921725..59971bfba8 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -361,11 +361,49 @@ static int nbd_handle_updated_info(BlockDriverState *bs, 
Error **errp)
  return 0;
  }
  
-static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)

+static int nbd_co_do_establish_connection(BlockDriverState *bs, Error **errp)


Given the _co_ in the name, don't you need a coroutine_fn marker?


Yes, strange that I've dropped it



Otherwise looks sane.




--
Best regards,
Vladimir

Re: [PATCH v3 26/33] block-coroutine-wrapper: allow non bdrv_ prefix

2021-06-03 Thread Vladimir Sementsov-Ogievskiy


03.06.2021 23:53, Eric Blake wrote:

On Fri, Apr 16, 2021 at 11:09:04AM +0300, Vladimir Sementsov-Ogievskiy wrote:

We are going to reuse the script to generate a qcow2_ function in


Is qcow2_ right here?  Looking ahead to patch 30/33, it looks like you
meant nbd_, at least in the context of this series.



will change. This because the patch was taken from my parallel qcow2 series

--
Best regards,
Vladimir

Re: [PATCH_V3] Adding ifdefs to call the respective routines only when their configs are enabled

2021-06-03 Thread Swetha Joshi

Hello, I have tested this patch with our qemu and it works, thank you.

What are the next steps for this patch? So is it approved and ready to go
in mainline?

Thanks,
Swetha.

On Thu, Jun 3, 2021 at 10:30 AM Swetha Joshi 
wrote:

>
> Oh okay, thank you. I will test this by eod today!
>
>
> On Thu, Jun 3, 2021 at 10:22 AM Peter Maydell 
> wrote:
>
>> On Fri, 28 May 2021 at 20:41, Swetha Joshi 
>> wrote:
>> >
>> > I apologize for the delay, here are the repro steps:
>> > 1. Remove CONFIG_ARM_VIRT=y from arm-softmmu.mak
>> > 2. In .gitlab-ci.yml, crossbuild.yml and in tests/vm/Makefile.include,
>> in all the places where we disable kvm using -disable-kvm, replace this
>> with -enable-kvm
>> > 3. Build
>>
>> Thanks; I reproduced the link errors and have sent a patchset
>> that I hope fixes this:
>> https://patchew.org/QEMU/20210603171259.27962-1-peter.mayd...@linaro.org/
>>
>> If you could test that it works for you that would be great.
>>
>> -- PMM
>>
> --
> Regards
>
> Swetha Joshi.
>
-- 
Regards

Swetha Joshi.

Re: [PATCH v1 2/3] hw/timer: Initial commit of Ibex Timer

2021-06-03 Thread Bin Meng

On Fri, Jun 4, 2021 at 10:38 AM Alistair Francis  wrote:
>
> On Fri, Jun 4, 2021 at 12:34 PM Bin Meng  wrote:
> >
> > On Fri, Jun 4, 2021 at 10:33 AM Alistair Francis  
> > wrote:
> > >
> > > On Fri, Jun 4, 2021 at 12:11 PM Bin Meng  wrote:
> > > >
> > > > On Fri, Jun 4, 2021 at 7:21 AM Alistair Francis  
> > > > wrote:
> > > > >
> > > > > On Tue, Jun 1, 2021 at 11:05 PM Bin Meng  wrote:
> > > > > >
> > > > > > On Mon, May 31, 2021 at 12:33 PM Alistair Francis
> > > > > >  wrote:
> > > > > >
> > > > > > Please write some commit message, for example, what is supported in
> > > > > > this initial version, and what is not.
> > > > >
> > > > > I'll add something.
> > > > >
> > > > > >
> > > > > > >
> > > > > > > Signed-off-by: Alistair Francis 
> > > > > > > ---
> > > > > > >  include/hw/timer/ibex_timer.h |  52 ++
> > > > > > >  hw/timer/ibex_timer.c | 305 
> > > > > > > ++
> > > > > > >  MAINTAINERS   |   6 +-
> > > > > > >  hw/timer/meson.build  |   1 +
> > > > > > >  4 files changed, 360 insertions(+), 4 deletions(-)
> > > > > > >  create mode 100644 include/hw/timer/ibex_timer.h
> > > > > > >  create mode 100644 hw/timer/ibex_timer.c
> > > > > > >
> > > > > > > diff --git a/include/hw/timer/ibex_timer.h 
> > > > > > > b/include/hw/timer/ibex_timer.h
> > > > > > > new file mode 100644
> > > > > > > index 00..6a43537003
> > > > > > > --- /dev/null
> > > > > > > +++ b/include/hw/timer/ibex_timer.h
> > > > > > > @@ -0,0 +1,52 @@
> > > > > > > +/*
> > > > > > > + * QEMU lowRISC Ibex Timer device
> > > > > > > + *
> > > > > > > + * Copyright (c) 2021 Western Digital
> > > > > > > + *
> > > > > > > + * Permission is hereby granted, free of charge, to any person 
> > > > > > > obtaining a copy
> > > > > > > + * of this software and associated documentation files (the 
> > > > > > > "Software"), to deal
> > > > > > > + * in the Software without restriction, including without 
> > > > > > > limitation the rights
> > > > > > > + * to use, copy, modify, merge, publish, distribute, sublicense, 
> > > > > > > and/or sell
> > > > > > > + * copies of the Software, and to permit persons to whom the 
> > > > > > > Software is
> > > > > > > + * furnished to do so, subject to the following conditions:
> > > > > > > + *
> > > > > > > + * The above copyright notice and this permission notice shall 
> > > > > > > be included in
> > > > > > > + * all copies or substantial portions of the Software.
> > > > > > > + *
> > > > > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY 
> > > > > > > KIND, EXPRESS OR
> > > > > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> > > > > > > MERCHANTABILITY,
> > > > > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO 
> > > > > > > EVENT SHALL
> > > > > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, 
> > > > > > > DAMAGES OR OTHER
> > > > > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
> > > > > > > OTHERWISE, ARISING FROM,
> > > > > > > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > > > > > > DEALINGS IN
> > > > > > > + * THE SOFTWARE.
> > > > > > > + */
> > > > > > > +
> > > > > > > +#ifndef HW_IBEX_TIMER_H
> > > > > > > +#define HW_IBEX_TIMER_H
> > > > > > > +
> > > > > > > +#include "hw/sysbus.h"
> > > > > > > +
> > > > > > > +#define TYPE_IBEX_TIMER "ibex-timer"
> > > > > > > +OBJECT_DECLARE_SIMPLE_TYPE(IbexTimerState, IBEX_TIMER)
> > > > > > > +
> > > > > > > +struct IbexTimerState {
> > > > > > > +/*  */
> > > > > > > +SysBusDevice parent_obj;
> > > > > > > +
> > > > > > > +/*  */
> > > > > > > +MemoryRegion mmio;
> > > > > > > +
> > > > > > > +uint32_t timer_ctrl;
> > > > > > > +uint32_t timer_cfg0;
> > > > > > > +uint32_t timer_compare_lower0;
> > > > > > > +uint32_t timer_compare_upper0;
> > > > > > > +uint32_t timer_intr_enable;
> > > > > > > +uint32_t timer_intr_state;
> > > > > > > +uint32_t timer_intr_test;
> > > > > > > +
> > > > > > > +uint32_t timebase_freq;
> > > > > > > +
> > > > > > > +qemu_irq irq;
> > > > > > > +};
> > > > > > > +#endif /* HW_IBEX_TIMER_H */
> > > > > > > diff --git a/hw/timer/ibex_timer.c b/hw/timer/ibex_timer.c
> > > > > > > new file mode 100644
> > > > > > > index 00..0a1030b15f
> > > > > > > --- /dev/null
> > > > > > > +++ b/hw/timer/ibex_timer.c
> > > > > > > @@ -0,0 +1,305 @@
> > > > > > > +/*
> > > > > > > + * QEMU lowRISC Ibex Timer device
> > > > > > > + *
> > > > > > > + * Copyright (c) 2021 Western Digital
> > > > > > > + *
> > > > > > > + * For details check the documentation here:
> > > > > > > + *https://docs.opentitan.org/hw/ip/rv_timer/doc/
> > > > > > > + *
> > > > > > > + * Permission is hereby granted, free of charge, to any person 
> > > > > > > obtaining a copy
> > > > > > > + * of this software and associated documentation files (the 
> > > > > > > "Software"), to deal
> > > > > > > + *

Re: [PATCH v1 2/3] hw/timer: Initial commit of Ibex Timer

2021-06-03 Thread Alistair Francis

On Fri, Jun 4, 2021 at 12:34 PM Bin Meng  wrote:
>
> On Fri, Jun 4, 2021 at 10:33 AM Alistair Francis  wrote:
> >
> > On Fri, Jun 4, 2021 at 12:11 PM Bin Meng  wrote:
> > >
> > > On Fri, Jun 4, 2021 at 7:21 AM Alistair Francis  
> > > wrote:
> > > >
> > > > On Tue, Jun 1, 2021 at 11:05 PM Bin Meng  wrote:
> > > > >
> > > > > On Mon, May 31, 2021 at 12:33 PM Alistair Francis
> > > > >  wrote:
> > > > >
> > > > > Please write some commit message, for example, what is supported in
> > > > > this initial version, and what is not.
> > > >
> > > > I'll add something.
> > > >
> > > > >
> > > > > >
> > > > > > Signed-off-by: Alistair Francis 
> > > > > > ---
> > > > > >  include/hw/timer/ibex_timer.h |  52 ++
> > > > > >  hw/timer/ibex_timer.c | 305 
> > > > > > ++
> > > > > >  MAINTAINERS   |   6 +-
> > > > > >  hw/timer/meson.build  |   1 +
> > > > > >  4 files changed, 360 insertions(+), 4 deletions(-)
> > > > > >  create mode 100644 include/hw/timer/ibex_timer.h
> > > > > >  create mode 100644 hw/timer/ibex_timer.c
> > > > > >
> > > > > > diff --git a/include/hw/timer/ibex_timer.h 
> > > > > > b/include/hw/timer/ibex_timer.h
> > > > > > new file mode 100644
> > > > > > index 00..6a43537003
> > > > > > --- /dev/null
> > > > > > +++ b/include/hw/timer/ibex_timer.h
> > > > > > @@ -0,0 +1,52 @@
> > > > > > +/*
> > > > > > + * QEMU lowRISC Ibex Timer device
> > > > > > + *
> > > > > > + * Copyright (c) 2021 Western Digital
> > > > > > + *
> > > > > > + * Permission is hereby granted, free of charge, to any person 
> > > > > > obtaining a copy
> > > > > > + * of this software and associated documentation files (the 
> > > > > > "Software"), to deal
> > > > > > + * in the Software without restriction, including without 
> > > > > > limitation the rights
> > > > > > + * to use, copy, modify, merge, publish, distribute, sublicense, 
> > > > > > and/or sell
> > > > > > + * copies of the Software, and to permit persons to whom the 
> > > > > > Software is
> > > > > > + * furnished to do so, subject to the following conditions:
> > > > > > + *
> > > > > > + * The above copyright notice and this permission notice shall be 
> > > > > > included in
> > > > > > + * all copies or substantial portions of the Software.
> > > > > > + *
> > > > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> > > > > > EXPRESS OR
> > > > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> > > > > > MERCHANTABILITY,
> > > > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO 
> > > > > > EVENT SHALL
> > > > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, 
> > > > > > DAMAGES OR OTHER
> > > > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> > > > > > ARISING FROM,
> > > > > > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > > > > > DEALINGS IN
> > > > > > + * THE SOFTWARE.
> > > > > > + */
> > > > > > +
> > > > > > +#ifndef HW_IBEX_TIMER_H
> > > > > > +#define HW_IBEX_TIMER_H
> > > > > > +
> > > > > > +#include "hw/sysbus.h"
> > > > > > +
> > > > > > +#define TYPE_IBEX_TIMER "ibex-timer"
> > > > > > +OBJECT_DECLARE_SIMPLE_TYPE(IbexTimerState, IBEX_TIMER)
> > > > > > +
> > > > > > +struct IbexTimerState {
> > > > > > +/*  */
> > > > > > +SysBusDevice parent_obj;
> > > > > > +
> > > > > > +/*  */
> > > > > > +MemoryRegion mmio;
> > > > > > +
> > > > > > +uint32_t timer_ctrl;
> > > > > > +uint32_t timer_cfg0;
> > > > > > +uint32_t timer_compare_lower0;
> > > > > > +uint32_t timer_compare_upper0;
> > > > > > +uint32_t timer_intr_enable;
> > > > > > +uint32_t timer_intr_state;
> > > > > > +uint32_t timer_intr_test;
> > > > > > +
> > > > > > +uint32_t timebase_freq;
> > > > > > +
> > > > > > +qemu_irq irq;
> > > > > > +};
> > > > > > +#endif /* HW_IBEX_TIMER_H */
> > > > > > diff --git a/hw/timer/ibex_timer.c b/hw/timer/ibex_timer.c
> > > > > > new file mode 100644
> > > > > > index 00..0a1030b15f
> > > > > > --- /dev/null
> > > > > > +++ b/hw/timer/ibex_timer.c
> > > > > > @@ -0,0 +1,305 @@
> > > > > > +/*
> > > > > > + * QEMU lowRISC Ibex Timer device
> > > > > > + *
> > > > > > + * Copyright (c) 2021 Western Digital
> > > > > > + *
> > > > > > + * For details check the documentation here:
> > > > > > + *https://docs.opentitan.org/hw/ip/rv_timer/doc/
> > > > > > + *
> > > > > > + * Permission is hereby granted, free of charge, to any person 
> > > > > > obtaining a copy
> > > > > > + * of this software and associated documentation files (the 
> > > > > > "Software"), to deal
> > > > > > + * in the Software without restriction, including without 
> > > > > > limitation the rights
> > > > > > + * to use, copy, modify, merge, publish, distribute, sublicense, 
> > > > > > and/or sell
> > > > > > + * copies of the Software, and to permit persons to whom the 
> > > > > > Software is
> > > > > > + *

Re: [PATCH v1 2/3] hw/timer: Initial commit of Ibex Timer

2021-06-03 Thread Bin Meng

On Fri, Jun 4, 2021 at 10:33 AM Alistair Francis  wrote:
>
> On Fri, Jun 4, 2021 at 12:11 PM Bin Meng  wrote:
> >
> > On Fri, Jun 4, 2021 at 7:21 AM Alistair Francis  
> > wrote:
> > >
> > > On Tue, Jun 1, 2021 at 11:05 PM Bin Meng  wrote:
> > > >
> > > > On Mon, May 31, 2021 at 12:33 PM Alistair Francis
> > > >  wrote:
> > > >
> > > > Please write some commit message, for example, what is supported in
> > > > this initial version, and what is not.
> > >
> > > I'll add something.
> > >
> > > >
> > > > >
> > > > > Signed-off-by: Alistair Francis 
> > > > > ---
> > > > >  include/hw/timer/ibex_timer.h |  52 ++
> > > > >  hw/timer/ibex_timer.c | 305 
> > > > > ++
> > > > >  MAINTAINERS   |   6 +-
> > > > >  hw/timer/meson.build  |   1 +
> > > > >  4 files changed, 360 insertions(+), 4 deletions(-)
> > > > >  create mode 100644 include/hw/timer/ibex_timer.h
> > > > >  create mode 100644 hw/timer/ibex_timer.c
> > > > >
> > > > > diff --git a/include/hw/timer/ibex_timer.h 
> > > > > b/include/hw/timer/ibex_timer.h
> > > > > new file mode 100644
> > > > > index 00..6a43537003
> > > > > --- /dev/null
> > > > > +++ b/include/hw/timer/ibex_timer.h
> > > > > @@ -0,0 +1,52 @@
> > > > > +/*
> > > > > + * QEMU lowRISC Ibex Timer device
> > > > > + *
> > > > > + * Copyright (c) 2021 Western Digital
> > > > > + *
> > > > > + * Permission is hereby granted, free of charge, to any person 
> > > > > obtaining a copy
> > > > > + * of this software and associated documentation files (the 
> > > > > "Software"), to deal
> > > > > + * in the Software without restriction, including without limitation 
> > > > > the rights
> > > > > + * to use, copy, modify, merge, publish, distribute, sublicense, 
> > > > > and/or sell
> > > > > + * copies of the Software, and to permit persons to whom the 
> > > > > Software is
> > > > > + * furnished to do so, subject to the following conditions:
> > > > > + *
> > > > > + * The above copyright notice and this permission notice shall be 
> > > > > included in
> > > > > + * all copies or substantial portions of the Software.
> > > > > + *
> > > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> > > > > EXPRESS OR
> > > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> > > > > MERCHANTABILITY,
> > > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
> > > > > SHALL
> > > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES 
> > > > > OR OTHER
> > > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> > > > > ARISING FROM,
> > > > > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > > > > DEALINGS IN
> > > > > + * THE SOFTWARE.
> > > > > + */
> > > > > +
> > > > > +#ifndef HW_IBEX_TIMER_H
> > > > > +#define HW_IBEX_TIMER_H
> > > > > +
> > > > > +#include "hw/sysbus.h"
> > > > > +
> > > > > +#define TYPE_IBEX_TIMER "ibex-timer"
> > > > > +OBJECT_DECLARE_SIMPLE_TYPE(IbexTimerState, IBEX_TIMER)
> > > > > +
> > > > > +struct IbexTimerState {
> > > > > +/*  */
> > > > > +SysBusDevice parent_obj;
> > > > > +
> > > > > +/*  */
> > > > > +MemoryRegion mmio;
> > > > > +
> > > > > +uint32_t timer_ctrl;
> > > > > +uint32_t timer_cfg0;
> > > > > +uint32_t timer_compare_lower0;
> > > > > +uint32_t timer_compare_upper0;
> > > > > +uint32_t timer_intr_enable;
> > > > > +uint32_t timer_intr_state;
> > > > > +uint32_t timer_intr_test;
> > > > > +
> > > > > +uint32_t timebase_freq;
> > > > > +
> > > > > +qemu_irq irq;
> > > > > +};
> > > > > +#endif /* HW_IBEX_TIMER_H */
> > > > > diff --git a/hw/timer/ibex_timer.c b/hw/timer/ibex_timer.c
> > > > > new file mode 100644
> > > > > index 00..0a1030b15f
> > > > > --- /dev/null
> > > > > +++ b/hw/timer/ibex_timer.c
> > > > > @@ -0,0 +1,305 @@
> > > > > +/*
> > > > > + * QEMU lowRISC Ibex Timer device
> > > > > + *
> > > > > + * Copyright (c) 2021 Western Digital
> > > > > + *
> > > > > + * For details check the documentation here:
> > > > > + *https://docs.opentitan.org/hw/ip/rv_timer/doc/
> > > > > + *
> > > > > + * Permission is hereby granted, free of charge, to any person 
> > > > > obtaining a copy
> > > > > + * of this software and associated documentation files (the 
> > > > > "Software"), to deal
> > > > > + * in the Software without restriction, including without limitation 
> > > > > the rights
> > > > > + * to use, copy, modify, merge, publish, distribute, sublicense, 
> > > > > and/or sell
> > > > > + * copies of the Software, and to permit persons to whom the 
> > > > > Software is
> > > > > + * furnished to do so, subject to the following conditions:
> > > > > + *
> > > > > + * The above copyright notice and this permission notice shall be 
> > > > > included in
> > > > > + * all copies or substantial portions of the Software.
> > > > > + *
> > > > > + * THE SOFTWARE IS PROVIDED "AS IS",

Re: [PATCH v1 2/3] hw/timer: Initial commit of Ibex Timer

2021-06-03 Thread Alistair Francis

On Fri, Jun 4, 2021 at 12:11 PM Bin Meng  wrote:
>
> On Fri, Jun 4, 2021 at 7:21 AM Alistair Francis  wrote:
> >
> > On Tue, Jun 1, 2021 at 11:05 PM Bin Meng  wrote:
> > >
> > > On Mon, May 31, 2021 at 12:33 PM Alistair Francis
> > >  wrote:
> > >
> > > Please write some commit message, for example, what is supported in
> > > this initial version, and what is not.
> >
> > I'll add something.
> >
> > >
> > > >
> > > > Signed-off-by: Alistair Francis 
> > > > ---
> > > >  include/hw/timer/ibex_timer.h |  52 ++
> > > >  hw/timer/ibex_timer.c | 305 ++
> > > >  MAINTAINERS   |   6 +-
> > > >  hw/timer/meson.build  |   1 +
> > > >  4 files changed, 360 insertions(+), 4 deletions(-)
> > > >  create mode 100644 include/hw/timer/ibex_timer.h
> > > >  create mode 100644 hw/timer/ibex_timer.c
> > > >
> > > > diff --git a/include/hw/timer/ibex_timer.h 
> > > > b/include/hw/timer/ibex_timer.h
> > > > new file mode 100644
> > > > index 00..6a43537003
> > > > --- /dev/null
> > > > +++ b/include/hw/timer/ibex_timer.h
> > > > @@ -0,0 +1,52 @@
> > > > +/*
> > > > + * QEMU lowRISC Ibex Timer device
> > > > + *
> > > > + * Copyright (c) 2021 Western Digital
> > > > + *
> > > > + * Permission is hereby granted, free of charge, to any person 
> > > > obtaining a copy
> > > > + * of this software and associated documentation files (the 
> > > > "Software"), to deal
> > > > + * in the Software without restriction, including without limitation 
> > > > the rights
> > > > + * to use, copy, modify, merge, publish, distribute, sublicense, 
> > > > and/or sell
> > > > + * copies of the Software, and to permit persons to whom the Software 
> > > > is
> > > > + * furnished to do so, subject to the following conditions:
> > > > + *
> > > > + * The above copyright notice and this permission notice shall be 
> > > > included in
> > > > + * all copies or substantial portions of the Software.
> > > > + *
> > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> > > > EXPRESS OR
> > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> > > > MERCHANTABILITY,
> > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
> > > > SHALL
> > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES 
> > > > OR OTHER
> > > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> > > > ARISING FROM,
> > > > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > > > DEALINGS IN
> > > > + * THE SOFTWARE.
> > > > + */
> > > > +
> > > > +#ifndef HW_IBEX_TIMER_H
> > > > +#define HW_IBEX_TIMER_H
> > > > +
> > > > +#include "hw/sysbus.h"
> > > > +
> > > > +#define TYPE_IBEX_TIMER "ibex-timer"
> > > > +OBJECT_DECLARE_SIMPLE_TYPE(IbexTimerState, IBEX_TIMER)
> > > > +
> > > > +struct IbexTimerState {
> > > > +/*  */
> > > > +SysBusDevice parent_obj;
> > > > +
> > > > +/*  */
> > > > +MemoryRegion mmio;
> > > > +
> > > > +uint32_t timer_ctrl;
> > > > +uint32_t timer_cfg0;
> > > > +uint32_t timer_compare_lower0;
> > > > +uint32_t timer_compare_upper0;
> > > > +uint32_t timer_intr_enable;
> > > > +uint32_t timer_intr_state;
> > > > +uint32_t timer_intr_test;
> > > > +
> > > > +uint32_t timebase_freq;
> > > > +
> > > > +qemu_irq irq;
> > > > +};
> > > > +#endif /* HW_IBEX_TIMER_H */
> > > > diff --git a/hw/timer/ibex_timer.c b/hw/timer/ibex_timer.c
> > > > new file mode 100644
> > > > index 00..0a1030b15f
> > > > --- /dev/null
> > > > +++ b/hw/timer/ibex_timer.c
> > > > @@ -0,0 +1,305 @@
> > > > +/*
> > > > + * QEMU lowRISC Ibex Timer device
> > > > + *
> > > > + * Copyright (c) 2021 Western Digital
> > > > + *
> > > > + * For details check the documentation here:
> > > > + *https://docs.opentitan.org/hw/ip/rv_timer/doc/
> > > > + *
> > > > + * Permission is hereby granted, free of charge, to any person 
> > > > obtaining a copy
> > > > + * of this software and associated documentation files (the 
> > > > "Software"), to deal
> > > > + * in the Software without restriction, including without limitation 
> > > > the rights
> > > > + * to use, copy, modify, merge, publish, distribute, sublicense, 
> > > > and/or sell
> > > > + * copies of the Software, and to permit persons to whom the Software 
> > > > is
> > > > + * furnished to do so, subject to the following conditions:
> > > > + *
> > > > + * The above copyright notice and this permission notice shall be 
> > > > included in
> > > > + * all copies or substantial portions of the Software.
> > > > + *
> > > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> > > > EXPRESS OR
> > > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> > > > MERCHANTABILITY,
> > > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
> > > > SHALL
> > > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES 
> > > > OR OTHER
> >

Re: [PATCH v1 2/3] hw/timer: Initial commit of Ibex Timer

2021-06-03 Thread Bin Meng

On Fri, Jun 4, 2021 at 7:21 AM Alistair Francis  wrote:
>
> On Tue, Jun 1, 2021 at 11:05 PM Bin Meng  wrote:
> >
> > On Mon, May 31, 2021 at 12:33 PM Alistair Francis
> >  wrote:
> >
> > Please write some commit message, for example, what is supported in
> > this initial version, and what is not.
>
> I'll add something.
>
> >
> > >
> > > Signed-off-by: Alistair Francis 
> > > ---
> > >  include/hw/timer/ibex_timer.h |  52 ++
> > >  hw/timer/ibex_timer.c | 305 ++
> > >  MAINTAINERS   |   6 +-
> > >  hw/timer/meson.build  |   1 +
> > >  4 files changed, 360 insertions(+), 4 deletions(-)
> > >  create mode 100644 include/hw/timer/ibex_timer.h
> > >  create mode 100644 hw/timer/ibex_timer.c
> > >
> > > diff --git a/include/hw/timer/ibex_timer.h b/include/hw/timer/ibex_timer.h
> > > new file mode 100644
> > > index 00..6a43537003
> > > --- /dev/null
> > > +++ b/include/hw/timer/ibex_timer.h
> > > @@ -0,0 +1,52 @@
> > > +/*
> > > + * QEMU lowRISC Ibex Timer device
> > > + *
> > > + * Copyright (c) 2021 Western Digital
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining 
> > > a copy
> > > + * of this software and associated documentation files (the "Software"), 
> > > to deal
> > > + * in the Software without restriction, including without limitation the 
> > > rights
> > > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
> > > sell
> > > + * copies of the Software, and to permit persons to whom the Software is
> > > + * furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice shall be 
> > > included in
> > > + * all copies or substantial portions of the Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> > > EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> > > MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
> > > SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > > OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> > > ARISING FROM,
> > > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > > DEALINGS IN
> > > + * THE SOFTWARE.
> > > + */
> > > +
> > > +#ifndef HW_IBEX_TIMER_H
> > > +#define HW_IBEX_TIMER_H
> > > +
> > > +#include "hw/sysbus.h"
> > > +
> > > +#define TYPE_IBEX_TIMER "ibex-timer"
> > > +OBJECT_DECLARE_SIMPLE_TYPE(IbexTimerState, IBEX_TIMER)
> > > +
> > > +struct IbexTimerState {
> > > +/*  */
> > > +SysBusDevice parent_obj;
> > > +
> > > +/*  */
> > > +MemoryRegion mmio;
> > > +
> > > +uint32_t timer_ctrl;
> > > +uint32_t timer_cfg0;
> > > +uint32_t timer_compare_lower0;
> > > +uint32_t timer_compare_upper0;
> > > +uint32_t timer_intr_enable;
> > > +uint32_t timer_intr_state;
> > > +uint32_t timer_intr_test;
> > > +
> > > +uint32_t timebase_freq;
> > > +
> > > +qemu_irq irq;
> > > +};
> > > +#endif /* HW_IBEX_TIMER_H */
> > > diff --git a/hw/timer/ibex_timer.c b/hw/timer/ibex_timer.c
> > > new file mode 100644
> > > index 00..0a1030b15f
> > > --- /dev/null
> > > +++ b/hw/timer/ibex_timer.c
> > > @@ -0,0 +1,305 @@
> > > +/*
> > > + * QEMU lowRISC Ibex Timer device
> > > + *
> > > + * Copyright (c) 2021 Western Digital
> > > + *
> > > + * For details check the documentation here:
> > > + *https://docs.opentitan.org/hw/ip/rv_timer/doc/
> > > + *
> > > + * Permission is hereby granted, free of charge, to any person obtaining 
> > > a copy
> > > + * of this software and associated documentation files (the "Software"), 
> > > to deal
> > > + * in the Software without restriction, including without limitation the 
> > > rights
> > > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
> > > sell
> > > + * copies of the Software, and to permit persons to whom the Software is
> > > + * furnished to do so, subject to the following conditions:
> > > + *
> > > + * The above copyright notice and this permission notice shall be 
> > > included in
> > > + * all copies or substantial portions of the Software.
> > > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
> > > EXPRESS OR
> > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> > > MERCHANTABILITY,
> > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
> > > SHALL
> > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > > OTHER
> > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
> > > ARISING FROM,
> > > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> > > DEALINGS IN
> > > + * THE SOFTWARE.
> > > + */
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "qemu/log.h"
> > > +#include "qemu/timer.h"
> > > +#include

Re: [PATCH 2/6] target/microblaze: Extract FPU helpers to fpu_helper.c

2021-06-03 Thread Alistair Francis

On Thu, Jun 3, 2021 at 7:05 PM Philippe Mathieu-Daudé  wrote:
>
> Extract FPU helpers to their own file: fpu_helper.c,
> so it is easier to focus on the generic helpers in
> op_helper.c.
>
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/microblaze/fpu_helper.c | 308 +
>  target/microblaze/op_helper.c  | 287 +-
>  target/microblaze/meson.build  |   1 +
>  3 files changed, 310 insertions(+), 286 deletions(-)
>  create mode 100644 target/microblaze/fpu_helper.c
>
> diff --git a/target/microblaze/fpu_helper.c b/target/microblaze/fpu_helper.c
> new file mode 100644
> index 000..ce729947079
> --- /dev/null
> +++ b/target/microblaze/fpu_helper.c
> @@ -0,0 +1,308 @@
> +/*
> + *  Microblaze FPU helper routines.
> + *
> + *  Copyright (c) 2009 Edgar E. Iglesias .
> + *  Copyright (c) 2009-2012 PetaLogix Qld Pty Ltd.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> .
> + */
> +
> +#include "qemu/osdep.h"
> +#include "cpu.h"
> +#include "exec/helper-proto.h"
> +#include "exec/exec-all.h"
> +#include "fpu/softfloat.h"
> +
> +static bool check_divz(CPUMBState *env, uint32_t a, uint32_t b, uintptr_t ra)
> +{
> +if (unlikely(b == 0)) {
> +env->msr |= MSR_DZ;
> +
> +if ((env->msr & MSR_EE) &&
> +env_archcpu(env)->cfg.div_zero_exception) {
> +CPUState *cs = env_cpu(env);
> +
> +env->esr = ESR_EC_DIVZERO;
> +cs->exception_index = EXCP_HW_EXCP;
> +cpu_loop_exit_restore(cs, ra);
> +}
> +return false;
> +}
> +return true;
> +}
> +
> +uint32_t helper_divs(CPUMBState *env, uint32_t a, uint32_t b)
> +{
> +if (!check_divz(env, a, b, GETPC())) {
> +return 0;
> +}
> +return (int32_t)a / (int32_t)b;
> +}
> +
> +uint32_t helper_divu(CPUMBState *env, uint32_t a, uint32_t b)
> +{
> +if (!check_divz(env, a, b, GETPC())) {
> +return 0;
> +}
> +return a / b;
> +}
> +
> +/* raise FPU exception.  */
> +static void raise_fpu_exception(CPUMBState *env, uintptr_t ra)
> +{
> +CPUState *cs = env_cpu(env);
> +
> +env->esr = ESR_EC_FPU;
> +cs->exception_index = EXCP_HW_EXCP;
> +cpu_loop_exit_restore(cs, ra);
> +}
> +
> +static void update_fpu_flags(CPUMBState *env, int flags, uintptr_t ra)
> +{
> +int raise = 0;
> +
> +if (flags & float_flag_invalid) {
> +env->fsr |= FSR_IO;
> +raise = 1;
> +}
> +if (flags & float_flag_divbyzero) {
> +env->fsr |= FSR_DZ;
> +raise = 1;
> +}
> +if (flags & float_flag_overflow) {
> +env->fsr |= FSR_OF;
> +raise = 1;
> +}
> +if (flags & float_flag_underflow) {
> +env->fsr |= FSR_UF;
> +raise = 1;
> +}
> +if (raise
> +&& (env_archcpu(env)->cfg.pvr_regs[2] & PVR2_FPU_EXC_MASK)
> +&& (env->msr & MSR_EE)) {
> +raise_fpu_exception(env, ra);
> +}
> +}
> +
> +uint32_t helper_fadd(CPUMBState *env, uint32_t a, uint32_t b)
> +{
> +CPU_FloatU fd, fa, fb;
> +int flags;
> +
> +set_float_exception_flags(0, >fp_status);
> +fa.l = a;
> +fb.l = b;
> +fd.f = float32_add(fa.f, fb.f, >fp_status);
> +
> +flags = get_float_exception_flags(>fp_status);
> +update_fpu_flags(env, flags, GETPC());
> +return fd.l;
> +}
> +
> +uint32_t helper_frsub(CPUMBState *env, uint32_t a, uint32_t b)
> +{
> +CPU_FloatU fd, fa, fb;
> +int flags;
> +
> +set_float_exception_flags(0, >fp_status);
> +fa.l = a;
> +fb.l = b;
> +fd.f = float32_sub(fb.f, fa.f, >fp_status);
> +flags = get_float_exception_flags(>fp_status);
> +update_fpu_flags(env, flags, GETPC());
> +return fd.l;
> +}
> +
> +uint32_t helper_fmul(CPUMBState *env, uint32_t a, uint32_t b)
> +{
> +CPU_FloatU fd, fa, fb;
> +int flags;
> +
> +set_float_exception_flags(0, >fp_status);
> +fa.l = a;
> +fb.l = b;
> +fd.f = float32_mul(fa.f, fb.f, >fp_status);
> +flags = get_float_exception_flags(>fp_status);
> +update_fpu_flags(env, flags, GETPC());
> +
> +return fd.l;
> +}
> +
> +uint32_t helper_fdiv(CPUMBState *env, uint32_t a, uint32_t b)
> +{
> +CPU_FloatU fd, fa, fb;
> +int flags;
> +
> +set_float_exception_flags(0, >fp_status);
> +

Re: [PATCH 1/6] target/microblaze: Use the IEC binary prefix definitions

2021-06-03 Thread Alistair Francis

On Thu, Jun 3, 2021 at 7:03 PM Philippe Mathieu-Daudé  wrote:
>
> IEC binary prefixes ease code review: the unit is explicit.
>
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/microblaze/mmu.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/target/microblaze/mmu.c b/target/microblaze/mmu.c
> index cc40f275eaf..1481e2769f1 100644
> --- a/target/microblaze/mmu.c
> +++ b/target/microblaze/mmu.c
> @@ -19,14 +19,15 @@
>   */
>
>  #include "qemu/osdep.h"
> +#include "qemu/units.h"
>  #include "cpu.h"
>  #include "exec/exec-all.h"
>
>  static unsigned int tlb_decode_size(unsigned int f)
>  {
>  static const unsigned int sizes[] = {
> -1 * 1024, 4 * 1024, 16 * 1024, 64 * 1024, 256 * 1024,
> -1 * 1024 * 1024, 4 * 1024 * 1024, 16 * 1024 * 1024
> +1 * KiB, 4 * KiB, 16 * KiB, 64 * KiB, 256 * KiB,
> +1 * MiB, 4 * MiB, 16 * MiB
>  };
>  assert(f < ARRAY_SIZE(sizes));
>  return sizes[f];
> --
> 2.26.3
>
>

Re: [PATCH v1 3/3] hw/riscv: OpenTitan: Connect the mtime and mtimecmp timer

2021-06-03 Thread Alistair Francis

On Tue, Jun 1, 2021 at 11:10 PM Bin Meng  wrote:
>
> On Mon, May 31, 2021 at 12:33 PM Alistair Francis
>  wrote:
> >
>
> Please write some commit message here

Done.

>
> > Signed-off-by: Alistair Francis 
> > ---
> >  include/hw/riscv/opentitan.h |  5 -
> >  hw/riscv/opentitan.c | 14 +++---
> >  2 files changed, 15 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
> > index aab9bc9245..86cceef698 100644
> > --- a/include/hw/riscv/opentitan.h
> > +++ b/include/hw/riscv/opentitan.h
> > @@ -22,6 +22,7 @@
> >  #include "hw/riscv/riscv_hart.h"
> >  #include "hw/intc/ibex_plic.h"
> >  #include "hw/char/ibex_uart.h"
> > +#include "hw/timer/ibex_timer.h"
> >  #include "qom/object.h"
> >
> >  #define TYPE_RISCV_IBEX_SOC "riscv.lowrisc.ibex.soc"
> > @@ -35,6 +36,7 @@ struct LowRISCIbexSoCState {
> >  RISCVHartArrayState cpus;
> >  IbexPlicState plic;
> >  IbexUartState uart;
> > +IbexTimerState timer;
> >
> >  MemoryRegion flash_mem;
> >  MemoryRegion rom;
> > @@ -57,7 +59,7 @@ enum {
> >  IBEX_DEV_SPI,
> >  IBEX_DEV_I2C,
> >  IBEX_DEV_PATTGEN,
> > -IBEX_DEV_RV_TIMER,
> > +IBEX_DEV_TIMER,
> >  IBEX_DEV_SENSOR_CTRL,
> >  IBEX_DEV_OTP_CTRL,
> >  IBEX_DEV_PWRMGR,
> > @@ -82,6 +84,7 @@ enum {
> >  };
> >
> >  enum {
> > +IBEX_TIMER_TIMEREXPIRED0_0 = 125,
>
> So this timer is connected to PLIC, instead of a dedicated exception
> code in the *cause CSR?

It is connected to both. It triggers the bit in MIE and can also
trigger an interrupt via the PLIC.

Alistair

>
> >  IBEX_UART0_RX_PARITY_ERR_IRQ = 8,
> >  IBEX_UART0_RX_TIMEOUT_IRQ = 7,
> >  IBEX_UART0_RX_BREAK_ERR_IRQ = 6,
> > diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
> > index 7545dcda9c..c5a7e3bacb 100644
> > --- a/hw/riscv/opentitan.c
> > +++ b/hw/riscv/opentitan.c
> > @@ -36,7 +36,7 @@ static const MemMapEntry ibex_memmap[] = {
> >  [IBEX_DEV_SPI] ={  0x4005,  0x1000  },
> >  [IBEX_DEV_I2C] ={  0x4008,  0x1000  },
> >  [IBEX_DEV_PATTGEN] ={  0x400e,  0x1000  },
> > -[IBEX_DEV_RV_TIMER] =   {  0x4010,  0x1000  },
> > +[IBEX_DEV_TIMER] =  {  0x4010,  0x1000  },
> >  [IBEX_DEV_SENSOR_CTRL] ={  0x4011,  0x1000  },
> >  [IBEX_DEV_OTP_CTRL] =   {  0x4013,  0x4000  },
> >  [IBEX_DEV_PWRMGR] = {  0x4040,  0x1000  },
> > @@ -106,6 +106,8 @@ static void lowrisc_ibex_soc_init(Object *obj)
> >  object_initialize_child(obj, "plic", >plic, TYPE_IBEX_PLIC);
> >
> >  object_initialize_child(obj, "uart", >uart, TYPE_IBEX_UART);
> > +
> > +object_initialize_child(obj, "timer", >timer, TYPE_IBEX_TIMER);
> >  }
> >
> >  static void lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp)
> > @@ -159,6 +161,14 @@ static void lowrisc_ibex_soc_realize(DeviceState 
> > *dev_soc, Error **errp)
> > 3, qdev_get_gpio_in(DEVICE(>plic),
> > IBEX_UART0_RX_OVERFLOW_IRQ));
> >
> > +if (!sysbus_realize(SYS_BUS_DEVICE(>timer), errp)) {
> > +return;
> > +}
> > +sysbus_mmio_map(SYS_BUS_DEVICE(>timer), 0, 
> > memmap[IBEX_DEV_TIMER].base);
> > +sysbus_connect_irq(SYS_BUS_DEVICE(>timer),
> > +   0, qdev_get_gpio_in(DEVICE(>plic),
> > +   IBEX_TIMER_TIMEREXPIRED0_0));
> > +
> >  create_unimplemented_device("riscv.lowrisc.ibex.gpio",
> >  memmap[IBEX_DEV_GPIO].base, memmap[IBEX_DEV_GPIO].size);
> >  create_unimplemented_device("riscv.lowrisc.ibex.spi",
> > @@ -167,8 +177,6 @@ static void lowrisc_ibex_soc_realize(DeviceState 
> > *dev_soc, Error **errp)
> >  memmap[IBEX_DEV_I2C].base, memmap[IBEX_DEV_I2C].size);
> >  create_unimplemented_device("riscv.lowrisc.ibex.pattgen",
> >  memmap[IBEX_DEV_PATTGEN].base, memmap[IBEX_DEV_PATTGEN].size);
> > -create_unimplemented_device("riscv.lowrisc.ibex.rv_timer",
> > -memmap[IBEX_DEV_RV_TIMER].base, memmap[IBEX_DEV_RV_TIMER].size);
> >  create_unimplemented_device("riscv.lowrisc.ibex.sensor_ctrl",
> >  memmap[IBEX_DEV_SENSOR_CTRL].base, 
> > memmap[IBEX_DEV_SENSOR_CTRL].size);
> >  create_unimplemented_device("riscv.lowrisc.ibex.otp_ctrl",
>
> Regards,
> Bin

Re: [PATCH v1 2/3] hw/timer: Initial commit of Ibex Timer

2021-06-03 Thread Alistair Francis

On Tue, Jun 1, 2021 at 11:05 PM Bin Meng  wrote:
>
> On Mon, May 31, 2021 at 12:33 PM Alistair Francis
>  wrote:
>
> Please write some commit message, for example, what is supported in
> this initial version, and what is not.

I'll add something.

>
> >
> > Signed-off-by: Alistair Francis 
> > ---
> >  include/hw/timer/ibex_timer.h |  52 ++
> >  hw/timer/ibex_timer.c | 305 ++
> >  MAINTAINERS   |   6 +-
> >  hw/timer/meson.build  |   1 +
> >  4 files changed, 360 insertions(+), 4 deletions(-)
> >  create mode 100644 include/hw/timer/ibex_timer.h
> >  create mode 100644 hw/timer/ibex_timer.c
> >
> > diff --git a/include/hw/timer/ibex_timer.h b/include/hw/timer/ibex_timer.h
> > new file mode 100644
> > index 00..6a43537003
> > --- /dev/null
> > +++ b/include/hw/timer/ibex_timer.h
> > @@ -0,0 +1,52 @@
> > +/*
> > + * QEMU lowRISC Ibex Timer device
> > + *
> > + * Copyright (c) 2021 Western Digital
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a 
> > copy
> > + * of this software and associated documentation files (the "Software"), 
> > to deal
> > + * in the Software without restriction, including without limitation the 
> > rights
> > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
> > sell
> > + * copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included 
> > in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
> > OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> > FROM,
> > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
> > IN
> > + * THE SOFTWARE.
> > + */
> > +
> > +#ifndef HW_IBEX_TIMER_H
> > +#define HW_IBEX_TIMER_H
> > +
> > +#include "hw/sysbus.h"
> > +
> > +#define TYPE_IBEX_TIMER "ibex-timer"
> > +OBJECT_DECLARE_SIMPLE_TYPE(IbexTimerState, IBEX_TIMER)
> > +
> > +struct IbexTimerState {
> > +/*  */
> > +SysBusDevice parent_obj;
> > +
> > +/*  */
> > +MemoryRegion mmio;
> > +
> > +uint32_t timer_ctrl;
> > +uint32_t timer_cfg0;
> > +uint32_t timer_compare_lower0;
> > +uint32_t timer_compare_upper0;
> > +uint32_t timer_intr_enable;
> > +uint32_t timer_intr_state;
> > +uint32_t timer_intr_test;
> > +
> > +uint32_t timebase_freq;
> > +
> > +qemu_irq irq;
> > +};
> > +#endif /* HW_IBEX_TIMER_H */
> > diff --git a/hw/timer/ibex_timer.c b/hw/timer/ibex_timer.c
> > new file mode 100644
> > index 00..0a1030b15f
> > --- /dev/null
> > +++ b/hw/timer/ibex_timer.c
> > @@ -0,0 +1,305 @@
> > +/*
> > + * QEMU lowRISC Ibex Timer device
> > + *
> > + * Copyright (c) 2021 Western Digital
> > + *
> > + * For details check the documentation here:
> > + *https://docs.opentitan.org/hw/ip/rv_timer/doc/
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a 
> > copy
> > + * of this software and associated documentation files (the "Software"), 
> > to deal
> > + * in the Software without restriction, including without limitation the 
> > rights
> > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or 
> > sell
> > + * copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included 
> > in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
> > OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
> > OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> > FROM,
> > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS 
> > IN
> > + * THE SOFTWARE.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "qemu/timer.h"
> > +#include "hw/timer/ibex_timer.h"
> > +#include "hw/irq.h"
> > +#include "hw/qdev-properties.h"
> > +#include "target/riscv/cpu.h"
> > +#include "migration/vmstate.h"
> > +
> > +REG32(CTRL, 0x00)
> > +FIELD(CTRL, ACTIVE, 0, 1)
> > +REG32(CFG0, 0x100)
> > +FIELD(CFG0, PRESCALE, 0, 12)
> > +FIELD(CFG0, STEP, 16, 8)
> > +REG32(LOWER0, 0x104)
> >

Re: [PATCH v7 9/9] qtest/hyperv: Introduce a simple hyper-v test

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 01:48:35PM +0200, Vitaly Kuznetsov wrote:
> For the beginning, just test 'hv-passthrough' and a couple of custom
> Hyper-V  enlightenments configurations through QMP. Later, it would
> be great to complement this by checking CPUID values from within the
> guest.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  MAINTAINERS   |   1 +
>  tests/qtest/hyperv-test.c | 225 ++
>  tests/qtest/meson.build   |   3 +-
>  3 files changed, 228 insertions(+), 1 deletion(-)
>  create mode 100644 tests/qtest/hyperv-test.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5f55404f2fae..862720016b3a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1537,6 +1537,7 @@ F: hw/isa/apm.c
>  F: include/hw/isa/apm.h
>  F: tests/unit/test-x86-cpuid.c
>  F: tests/qtest/test-x86-cpuid-compat.c
> +F: tests/qtest/hyperv-test.c

Maybe it makes sense to keep it here by now, but I believe we
should eventually create a section for hyperv in MAINTAINERS.

CCing Michael and Marcel, who are the people listed in this
MAINTAINERS section.


>  
>  PC Chipset
>  M: Michael S. Tsirkin 
[...]
> +int main(int argc, char **argv)
> +{
> +const char *arch = qtest_get_arch();
> +
> +g_test_init(, , NULL);
> +
> +if (!strcmp(arch, "i386") || !strcmp(arch, "x86_64")) {

Is this necessary when the test is already being added to
qtests_i386/qtests_x86_64 only?

> +qtest_add_data_func("/hyperv/hv-all-but-evmcs",
> +NULL, test_query_cpu_hv_all_but_evmcs);
> +qtest_add_data_func("/hyperv/hv-custom",
> +NULL, test_query_cpu_hv_custom);
> +if (kvm_has_sys_hyperv_cpuid()) {
> +qtest_add_data_func("/hyperv/hv-passthrough",
> +NULL, test_query_cpu_hv_passthrough);
> +   }
> +}
> +
> +return g_test_run();
> +}
> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> index c3a223a83d6a..958a88d0c8b4 100644
> --- a/tests/qtest/meson.build
> +++ b/tests/qtest/meson.build
> @@ -83,7 +83,8 @@ qtests_i386 = \
> 'vmgenid-test',
> 'migration-test',
> 'test-x86-cpuid-compat',
> -   'numa-test']
> +   'numa-test',
> +   'hyperv-test']
>  
>  dbus_daemon = find_program('dbus-daemon', required: false)
>  if dbus_daemon.found() and config_host.has_key('GDBUS_CODEGEN')
> -- 
> 2.31.1
> 

-- 
Eduardo

Re: [PATCH v7 8/9] i386: Hyper-V SynIC requires POST_MESSAGES/SIGNAL_EVENTS priviliges

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 01:48:34PM +0200, Vitaly Kuznetsov wrote:
> When Hyper-V SynIC is enabled, we may need to allow Windows guests to make
> hypercalls (POST_MESSAGES/SIGNAL_EVENTS). No issue is currently observed
> because KVM is very permissive, allowing these hypercalls regarding of
> guest visible CPUid bits.
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  target/i386/kvm/hyperv-proto.h | 6 ++
>  target/i386/kvm/kvm.c  | 6 ++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/target/i386/kvm/hyperv-proto.h b/target/i386/kvm/hyperv-proto.h
> index e30d64b4ade4..5fbb385cc136 100644
> --- a/target/i386/kvm/hyperv-proto.h
> +++ b/target/i386/kvm/hyperv-proto.h
> @@ -38,6 +38,12 @@
>  #define HV_ACCESS_FREQUENCY_MSRS (1u << 11)
>  #define HV_ACCESS_REENLIGHTENMENTS_CONTROL  (1u << 13)
>  
> +/*
> + * HV_CPUID_FEATURES.EBX bits
> + */
> +#define HV_POST_MESSAGES (1u << 4)
> +#define HV_SIGNAL_EVENTS (1u << 5)
> +
>  /*
>   * HV_CPUID_FEATURES.EDX bits
>   */
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index a3897d4d8788..6a32d43e6ec1 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1343,6 +1343,12 @@ static int hyperv_fill_cpuids(CPUState *cs,
>  /* Unconditionally required with any Hyper-V enlightenment */
>  c->eax |= HV_HYPERCALL_AVAILABLE;
>  
> +/* SynIC and Vmbus devices require messages/signals hypercalls */
> +if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNIC) &&
> +!cpu->hyperv_synic_kvm_only) {
> +c->ebx |= HV_POST_MESSAGES | HV_SIGNAL_EVENTS;

Why exactly is the hyperv_synic_kvm_only check needed?

Is the hyperv_synic_kvm_only check the only reason this was done
here and not at kvm_hyperv_properties?


> +}
> +
>  /* Not exposed by KVM but needed to make CPU hotplug in Windows work */
>  c->edx |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
>  
> -- 
> 2.31.1
> 

-- 
Eduardo

Re: [PATCH v7 7/9] i386: HV_HYPERCALL_AVAILABLE privilege bit is always needed

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 01:48:33PM +0200, Vitaly Kuznetsov wrote:
> According to TLFS, Hyper-V guest is supposed to check
> HV_HYPERCALL_AVAILABLE privilege bit before accessing
> HV_X64_MSR_GUEST_OS_ID/HV_X64_MSR_HYPERCALL MSRs but at least some
> Windows versions ignore that. As KVM is very permissive and allows
> accessing these MSRs unconditionally, no issue is observed. We may,
> however, want to tighten the checks eventually. Conforming to the
> spec is probably also a good idea.
> 
> Enable HV_HYPERCALL_AVAILABLE bit unconditionally.
> 
> Signed-off-by: Vitaly Kuznetsov 

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

Re: [PATCH v7 5/9] i386: expand Hyper-V features during CPU feature expansion time

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 01:48:31PM +0200, Vitaly Kuznetsov wrote:
> To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we
> need to expand and set the corresponding CPUID leaves early. Modify
> x86_cpu_get_supported_feature_word() to call newly intoduced Hyper-V
> specific kvm_hv_get_supported_cpuid() instead of
> kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid()
> as Hyper-V specific CPUID leaves intersect with KVM's.
> 
> Note, early expansion will only happen when KVM supports system wide
> KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID).
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  target/i386/cpu.c  |  4 
>  target/i386/kvm/kvm-stub.c |  5 +
>  target/i386/kvm/kvm.c  | 24 
>  target/i386/kvm/kvm_i386.h |  1 +
>  4 files changed, 30 insertions(+), 4 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b2d8e5713911..159b7aa8f073 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5990,6 +5990,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>  if (env->cpuid_xlevel2 == UINT32_MAX) {
>  env->cpuid_xlevel2 = env->cpuid_min_xlevel2;
>  }
> +
> +if (kvm_enabled()) {
> +kvm_hyperv_expand_features(cpu, errp);
> +}
>  }
>  
>  /*
> diff --git a/target/i386/kvm/kvm-stub.c b/target/i386/kvm/kvm-stub.c
> index 92f49121b8fa..f6e7e4466e1a 100644
> --- a/target/i386/kvm/kvm-stub.c
> +++ b/target/i386/kvm/kvm-stub.c
> @@ -39,3 +39,8 @@ bool kvm_hv_vpindex_settable(void)
>  {
>  return false;
>  }
> +
> +bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
> +{
> +abort();
> +}
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index a5f8553af921..650fdd970d6e 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1217,13 +1217,22 @@ static uint32_t hv_build_cpuid_leaf(CPUState *cs, 
> uint32_t func, int reg)
>   * of 'hv_passthrough' mode and fills the environment with all supported
>   * Hyper-V features.
>   */
> -static bool hyperv_expand_features(CPUState *cs, Error **errp)
> +bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp)
>  {
> -X86CPU *cpu = X86_CPU(cs);
> +CPUState *cs = CPU(cpu);
>  
>  if (!hyperv_enabled(cpu))
>  return true;
>  
> +/*
> + * When kvm_hyperv_expand_features is called at CPU feature expansion
> + * time per-CPU kvm_state is not available yet so we can only proceed
> + * when KVM_CAP_SYS_HYPERV_CPUID is supported.
> + */
> +if (!cs->kvm_state &&
> +!kvm_check_extension(kvm_state, KVM_CAP_SYS_HYPERV_CPUID))
> +return true;
> +
>  if (cpu->hyperv_passthrough) {
>  cpu->hyperv_vendor_id[0] =
>  hv_cpuid_get_host(cs, HV_CPUID_VENDOR_AND_MAX_FUNCTIONS, R_EBX);
> @@ -1565,8 +1574,15 @@ int kvm_arch_init_vcpu(CPUState *cs)
>  
>  env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
>  
> -/* Paravirtualization CPUIDs */
> -if (!hyperv_expand_features(cs, _err)) {
> +/*
> + * kvm_hyperv_expand_features() is called here for the second time in 
> case
> + * KVM_CAP_SYS_HYPERV_CPUID is not supported. While we can't possibly 
> handle
> + * 'query-cpu-model-expansion' in this case as we don't have a KVM vCPU 
> to
> + * check which Hyper-V enlightenments are supported and which are not, we
> + * can still proceed and check/expand Hyper-V enlightenments here so 
> legacy
> + * behavior is preserved.
> + */

Issue we might need to solve later: how can management software
differentiate "Hyper-V enlightenments are not supported at all"
from "we can't tell you which Hyper-V enlightenments are
available because KVM_CAP_SYS_HYPERV_CPUID is unavailable"?


> +if (!kvm_hyperv_expand_features(cpu, _err)) {
>  error_report_err(local_err);
>  return -ENOSYS;
>  }
> diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
> index dc725083891c..54667b35f09c 100644
> --- a/target/i386/kvm/kvm_i386.h
> +++ b/target/i386/kvm/kvm_i386.h
> @@ -47,6 +47,7 @@ bool kvm_has_x2apic_api(void);
>  bool kvm_has_waitpkg(void);
>  
>  bool kvm_hv_vpindex_settable(void);
> +bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp);
>  
>  uint64_t kvm_swizzle_msi_ext_dest_id(uint64_t address);

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

Re: [PATCH v7 4/9] i386: make hyperv_expand_features() return bool

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 01:48:30PM +0200, Vitaly Kuznetsov wrote:
> Return 'false' when hyperv_expand_features() sets an error.
> 
> No functional change intended.
> 
> Signed-off-by: Vitaly Kuznetsov 

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

Re: [PATCH v7 3/9] i386: hardcode supported eVMCS version to '1'

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 01:48:29PM +0200, Vitaly Kuznetsov wrote:
> Currently, the only eVMCS version, supported by KVM (and described in TLFS)
> is '1'. When Enlightened VMCS feature is enabled, QEMU takes the supported
> eVMCS version range (from KVM_CAP_HYPERV_ENLIGHTENED_VMCS enablement) and
> puts it to guest visible CPUIDs. When (and if) eVMCS ver.2 appears a
> problem on migration is expected: it doesn't seem to be possible to migrate
> from a host supporting eVMCS ver.2 to a host, which only support eVMCS
> ver.1.

Isn't it possible and safe to expose eVMCS ver.1 to the guest on
a host that supports ver.2?

> 
> Hardcode eVMCS ver.1 as the result of 'hv-evmcs' enablement for now. Newer
> eVMCS versions will have to have their own enablement options (e.g.
> 'hv-evmcs=2').
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  docs/hyperv.txt   |  2 +-
>  target/i386/kvm/kvm.c | 16 +++-
>  2 files changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/docs/hyperv.txt b/docs/hyperv.txt
> index a51953daa833..000638a2fd38 100644
> --- a/docs/hyperv.txt
> +++ b/docs/hyperv.txt
> @@ -170,7 +170,7 @@ Recommended: hv-frequencies
>  3.16. hv-evmcs
>  ===
>  The enlightenment is nested specific, it targets Hyper-V on KVM guests. When
> -enabled, it provides Enlightened VMCS feature to the guest. The feature
> +enabled, it provides Enlightened VMCS version 1 feature to the guest. The 
> feature
>  implements paravirtualized protocol between L0 (KVM) and L1 (Hyper-V)
>  hypervisors making L2 exits to the hypervisor faster. The feature is 
> Intel-only.
>  Note: some virtualization features (e.g. Posted Interrupts) are disabled when
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index c676ee8b38a7..d57eede5dc81 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1490,13 +1490,19 @@ static int hyperv_init_vcpu(X86CPU *cpu)
>  ret = kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENLIGHTENED_VMCS, 0,
>(uintptr_t)_version);
>  
> -if (ret < 0) {
> -fprintf(stderr, "Hyper-V %s is not supported by kernel\n",
> -kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
> +/*
> + * KVM is required to support EVMCS ver.1. as that's what 'hv-evmcs'
> + * option sets. Note: we hardcode the maximum supported eVMCS version
> + * to '1' as well so 'hv-evmcs' feature is migratable even when (and 
> if)
> + * ver.2 is implemented. A new option (e.g. 'hv-evmcs=2') will then 
> have
> + * to be added.
> + */
> +if (ret < 0 || (uint8_t)evmcs_version > 1) {

Wait, do you really want to get a fatal error every time, after a
kernel upgrade?

I was expecting this:

  vcpu_evmcs_version = 1; /* hardcoded, but can become configurable later */
  ...
  kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_ENLIGHTENED_VMCS, 0, 
(uintptr_t)_evmcs_version);
  if (ret < 0 || supported_evmcs_version < vcpu_evmcs_version) {
error_setg(...);
return;
  }
  cpu->hyperv_nested[0] = vcpu_evmcs_version;


> +error_report("Hyper-V %s verson 1 is not supported by kernel",
> + kvm_hyperv_properties[HYPERV_FEAT_EVMCS].desc);
>  return ret;
>  }
> -
> -cpu->hyperv_nested[0] = evmcs_version;
> +cpu->hyperv_nested[0] = (1 << 8) | 1;
>  }
>  
>  return 0;
> -- 
> 2.31.1
> 

-- 
Eduardo

Re: [PATCH v7 2/9] i386: clarify 'hv-passthrough' behavior

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 01:48:28PM +0200, Vitaly Kuznetsov wrote:
> Clarify the fact that 'hv-passthrough' only enables features which are
> already known to QEMU and that it overrides all other 'hv-*' settings.
> 
> Signed-off-by: Vitaly Kuznetsov 

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

Re: [PATCH v7 1/9] i386: avoid hardcoding '12' as 'hyperv_vendor_id' length

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 01:48:27PM +0200, Vitaly Kuznetsov wrote:
> While this is very unlikely to change, let's avoid hardcoding '12' as
> 'hyperv_vendor_id' length.
> 
> No functional change intended.
> 
> Signed-off-by: Vitaly Kuznetsov 

Reviewed-by: Eduardo Habkost 

-- 
Eduardo

[PULL 20/29] softfloat: Convert integer to floatx80 to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 58 +++--
 1 file changed, 13 insertions(+), 45 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 9caf1ecf9c..be7583780d 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3345,6 +3345,19 @@ float128 int32_to_float128(int32_t a, float_status 
*status)
 return int64_to_float128(a, status);
 }
 
+floatx80 int64_to_floatx80(int64_t a, float_status *status)
+{
+FloatParts128 p;
+
+parts_sint_to_float(, a, 0, status);
+return floatx80_round_pack_canonical(, status);
+}
+
+floatx80 int32_to_floatx80(int32_t a, float_status *status)
+{
+return int64_to_floatx80(a, status);
+}
+
 /*
  * Unsigned Integer to floating-point conversions
  */
@@ -5033,51 +5046,6 @@ static float128 normalizeRoundAndPackFloat128(bool 
zSign, int32_t zExp,
 
 }
 
-
-/*
-| Returns the result of converting the 32-bit two's complement integer `a'
-| to the extended double-precision floating-point format.  The conversion
-| is performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic.
-**/
-
-floatx80 int32_to_floatx80(int32_t a, float_status *status)
-{
-bool zSign;
-uint32_t absA;
-int8_t shiftCount;
-uint64_t zSig;
-
-if ( a == 0 ) return packFloatx80( 0, 0, 0 );
-zSign = ( a < 0 );
-absA = zSign ? - a : a;
-shiftCount = clz32(absA) + 32;
-zSig = absA;
-return packFloatx80( zSign, 0x403E - shiftCount, zSig<

[PULL 25/29] softfloat: Convert float32_exp2 to FloatParts

2021-06-03 Thread Richard Henderson

Keep the intermediate results in FloatParts instead of
converting back and forth between float64.  Use muladd
instead of separate mul+add.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 53 +
 1 file changed, 23 insertions(+), 30 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index c32b1c7113..27306d6a93 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -5210,47 +5210,40 @@ static const float64 float32_exp2_coefficients[15] =
 
 float32 float32_exp2(float32 a, float_status *status)
 {
-bool aSign;
-int aExp;
-uint32_t aSig;
-float64 r, x, xn;
+FloatParts64 xp, xnp, tp, rp;
 int i;
-a = float32_squash_input_denormal(a, status);
 
-aSig = extractFloat32Frac( a );
-aExp = extractFloat32Exp( a );
-aSign = extractFloat32Sign( a );
-
-if ( aExp == 0xFF) {
-if (aSig) {
-return propagateFloat32NaN(a, float32_zero, status);
+float32_unpack_canonical(, a, status);
+if (unlikely(xp.cls != float_class_normal)) {
+switch (xp.cls) {
+case float_class_snan:
+case float_class_qnan:
+parts_return_nan(, status);
+return float32_round_pack_canonical(, status);
+case float_class_inf:
+return xp.sign ? float32_zero : a;
+case float_class_zero:
+return float32_one;
+default:
+break;
 }
-return (aSign) ? float32_zero : a;
-}
-if (aExp == 0) {
-if (aSig == 0) return float32_one;
+g_assert_not_reached();
 }
 
 float_raise(float_flag_inexact, status);
 
-/* *** */
-/* using float64 for approximation */
-/* *** */
-x = float32_to_float64(a, status);
-x = float64_mul(x, float64_ln2, status);
+float64_unpack_canonical(, float64_ln2, status);
+xp = *parts_mul(, , status);
+xnp = xp;
 
-xn = x;
-r = float64_one;
+float64_unpack_canonical(, float64_one, status);
 for (i = 0 ; i < 15 ; i++) {
-float64 f;
-
-f = float64_mul(xn, float32_exp2_coefficients[i], status);
-r = float64_add(r, f, status);
-
-xn = float64_mul(xn, x, status);
+float64_unpack_canonical(, float32_exp2_coefficients[i], status);
+rp = *parts_muladd(, , , 0, status);
+xnp = *parts_mul(, , status);
 }
 
-return float64_to_float32(r, status);
+return float32_round_pack_canonical(, status);
 }
 
 /*
-- 
2.25.1

[PULL 22/29] softfloat: Convert floatx80 to integer to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 336 ++--
 1 file changed, 42 insertions(+), 294 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index acaab6a127..5a2a872408 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2829,6 +2829,28 @@ static int64_t float128_to_int64_scalbn(float128 a, 
FloatRoundMode rmode,
 return parts_float_to_sint(, rmode, scale, INT64_MIN, INT64_MAX, s);
 }
 
+static int32_t floatx80_to_int32_scalbn(floatx80 a, FloatRoundMode rmode,
+int scale, float_status *s)
+{
+FloatParts128 p;
+
+if (!floatx80_unpack_canonical(, a, s)) {
+parts_default_nan(, s);
+}
+return parts_float_to_sint(, rmode, scale, INT32_MIN, INT32_MAX, s);
+}
+
+static int64_t floatx80_to_int64_scalbn(floatx80 a, FloatRoundMode rmode,
+int scale, float_status *s)
+{
+FloatParts128 p;
+
+if (!floatx80_unpack_canonical(, a, s)) {
+parts_default_nan(, s);
+}
+return parts_float_to_sint(, rmode, scale, INT64_MIN, INT64_MAX, s);
+}
+
 int8_t float16_to_int8(float16 a, float_status *s)
 {
 return float16_to_int8_scalbn(a, s->float_rounding_mode, 0, s);
@@ -2889,6 +2911,16 @@ int64_t float128_to_int64(float128 a, float_status *s)
 return float128_to_int64_scalbn(a, s->float_rounding_mode, 0, s);
 }
 
+int32_t floatx80_to_int32(floatx80 a, float_status *s)
+{
+return floatx80_to_int32_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int64_t floatx80_to_int64(floatx80 a, float_status *s)
+{
+return floatx80_to_int64_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
 int16_t float16_to_int16_round_to_zero(float16 a, float_status *s)
 {
 return float16_to_int16_scalbn(a, float_round_to_zero, 0, s);
@@ -2944,6 +2976,16 @@ int64_t float128_to_int64_round_to_zero(float128 a, 
float_status *s)
 return float128_to_int64_scalbn(a, float_round_to_zero, 0, s);
 }
 
+int32_t floatx80_to_int32_round_to_zero(floatx80 a, float_status *s)
+{
+return floatx80_to_int32_scalbn(a, float_round_to_zero, 0, s);
+}
+
+int64_t floatx80_to_int64_round_to_zero(floatx80 a, float_status *s)
+{
+return floatx80_to_int64_scalbn(a, float_round_to_zero, 0, s);
+}
+
 int16_t bfloat16_to_int16(bfloat16 a, float_status *s)
 {
 return bfloat16_to_int16_scalbn(a, s->float_rounding_mode, 0, s);
@@ -4160,127 +4202,6 @@ bfloat16 bfloat16_squash_input_denormal(bfloat16 a, 
float_status *status)
 return a;
 }
 
-/*
-| Takes a 64-bit fixed-point value `absZ' with binary point between bits 6
-| and 7, and returns the properly rounded 32-bit integer corresponding to the
-| input.  If `zSign' is 1, the input is negated before being converted to an
-| integer.  Bit 63 of `absZ' must be zero.  Ordinarily, the fixed-point input
-| is simply rounded to an integer, with the inexact exception raised if the
-| input cannot be represented exactly as an integer.  However, if the fixed-
-| point input is too large, the invalid exception is raised and the largest
-| positive or negative integer is returned.
-**/
-
-static int32_t roundAndPackInt32(bool zSign, uint64_t absZ,
- float_status *status)
-{
-int8_t roundingMode;
-bool roundNearestEven;
-int8_t roundIncrement, roundBits;
-int32_t z;
-
-roundingMode = status->float_rounding_mode;
-roundNearestEven = ( roundingMode == float_round_nearest_even );
-switch (roundingMode) {
-case float_round_nearest_even:
-case float_round_ties_away:
-roundIncrement = 0x40;
-break;
-case float_round_to_zero:
-roundIncrement = 0;
-break;
-case float_round_up:
-roundIncrement = zSign ? 0 : 0x7f;
-break;
-case float_round_down:
-roundIncrement = zSign ? 0x7f : 0;
-break;
-case float_round_to_odd:
-roundIncrement = absZ & 0x80 ? 0 : 0x7f;
-break;
-default:
-abort();
-}
-roundBits = absZ & 0x7F;
-absZ = ( absZ + roundIncrement )>>7;
-if (!(roundBits ^ 0x40) && roundNearestEven) {
-absZ &= ~1;
-}
-z = absZ;
-if ( zSign ) z = - z;
-if ( ( absZ>>32 ) || ( z && ( ( z < 0 ) ^ zSign ) ) ) {
-float_raise(float_flag_invalid, status);
-return zSign ? INT32_MIN : INT32_MAX;
-}
-if (roundBits) {
-float_raise(float_flag_inexact, status);
-}
-return z;
-
-}
-
-/*
-| Takes the 128-bit fixed-point value formed by concatenating `absZ0' and
-| `absZ1', with binary point between bits 63 and 64 (between the input words),
-| and returns the properly rounded 64-bit integer corresponding to the input.
-| If `zSign' is 1, the input

[PULL 29/29] softfloat: Use hard-float for {u}int64_to_float{32,64}

2021-06-03 Thread Richard Henderson

For the normal case of no additional scaling, this reduces the
profile contribution of int64_to_float64 to the testcase in the
linked issue from 0.81% to 0.04%.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/134
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 5026f518b0..1cb162882b 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3559,6 +3559,13 @@ float32 int64_to_float32_scalbn(int64_t a, int scale, 
float_status *status)
 {
 FloatParts64 p;
 
+/* Without scaling, there are no overflow concerns. */
+if (likely(scale == 0) && can_use_fpu(status)) {
+union_float32 ur;
+ur.h = a;
+return ur.s;
+}
+
 parts64_sint_to_float(, a, scale, status);
 return float32_round_pack_canonical(, status);
 }
@@ -3592,6 +3599,13 @@ float64 int64_to_float64_scalbn(int64_t a, int scale, 
float_status *status)
 {
 FloatParts64 p;
 
+/* Without scaling, there are no overflow concerns. */
+if (likely(scale == 0) && can_use_fpu(status)) {
+union_float64 ur;
+ur.h = a;
+return ur.s;
+}
+
 parts_sint_to_float(, a, scale, status);
 return float64_round_pack_canonical(, status);
 }
@@ -3726,6 +3740,13 @@ float32 uint64_to_float32_scalbn(uint64_t a, int scale, 
float_status *status)
 {
 FloatParts64 p;
 
+/* Without scaling, there are no overflow concerns. */
+if (likely(scale == 0) && can_use_fpu(status)) {
+union_float32 ur;
+ur.h = a;
+return ur.s;
+}
+
 parts_uint_to_float(, a, scale, status);
 return float32_round_pack_canonical(, status);
 }
@@ -3759,6 +3780,13 @@ float64 uint64_to_float64_scalbn(uint64_t a, int scale, 
float_status *status)
 {
 FloatParts64 p;
 
+/* Without scaling, there are no overflow concerns. */
+if (likely(scale == 0) && can_use_fpu(status)) {
+union_float64 ur;
+ur.h = a;
+return ur.s;
+}
+
 parts_uint_to_float(, a, scale, status);
 return float64_round_pack_canonical(, status);
 }
-- 
2.25.1

[PULL 26/29] softfloat: Move floatN_log2 to softfloat-parts.c.inc

2021-06-03 Thread Richard Henderson

Rename to parts$N_log2.  Though this is partly a ruse, since I do not
believe the code will succeed for float128 without work.  Which is ok
for now, because we do not need this for more than float32 and float64.

Since berkeley-testfloat-3 doesn't support log2, compare float64_log2
vs the system log2.  Fix the errors for inputs near 1.0:

test: 3ff000b0  +0x1.000b0p+0
  sf: 3d2fa000  +0x1.fa000p-45
libm: 3d2fbd422b1bd36f  +0x1.fbd422b1bd36fp-45
Error in fraction: 32170028290927 ulp

test: 3feec24f6770b100  +0x1.ec24f6770b100p-1
  sf: bfad3740d13c9ec0  -0x1.d3740d13c9ec0p-5
libm: bfad3740d13c9e98  -0x1.d3740d13c9e98p-5
Error in fraction: 40 ulp

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 126 --
 tests/fp/fp-test-log2.c   | 118 +++
 fpu/softfloat-parts.c.inc | 125 +
 tests/fp/meson.build  |  11 
 4 files changed, 281 insertions(+), 99 deletions(-)
 create mode 100644 tests/fp/fp-test-log2.c

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 27306d6a93..c0fe191f4d 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -927,6 +927,12 @@ static void parts128_scalbn(FloatParts128 *a, int n, 
float_status *s);
 #define parts_scalbn(A, N, S) \
 PARTS_GENERIC_64_128(scalbn, A)(A, N, S)
 
+static void parts64_log2(FloatParts64 *a, float_status *s, const FloatFmt *f);
+static void parts128_log2(FloatParts128 *a, float_status *s, const FloatFmt 
*f);
+
+#define parts_log2(A, S, F) \
+PARTS_GENERIC_64_128(log2, A)(A, S, F)
+
 /*
  * Helper functions for softfloat-parts.c.inc, per-size operations.
  */
@@ -4060,6 +4066,27 @@ floatx80 floatx80_sqrt(floatx80 a, float_status *s)
 return floatx80_round_pack_canonical(, s);
 }
 
+/*
+ * log2
+ */
+float32 float32_log2(float32 a, float_status *status)
+{
+FloatParts64 p;
+
+float32_unpack_canonical(, a, status);
+parts_log2(, status, _params);
+return float32_round_pack_canonical(, status);
+}
+
+float64 float64_log2(float64 a, float_status *status)
+{
+FloatParts64 p;
+
+float64_unpack_canonical(, a, status);
+parts_log2(, status, _params);
+return float64_round_pack_canonical(, status);
+}
+
 /*
 | The pattern for a default generated NaN.
 **/
@@ -5246,56 +5273,6 @@ float32 float32_exp2(float32 a, float_status *status)
 return float32_round_pack_canonical(, status);
 }
 
-/*
-| Returns the binary log of the single-precision floating-point value `a'.
-| The operation is performed according to the IEC/IEEE Standard for Binary
-| Floating-Point Arithmetic.
-**/
-float32 float32_log2(float32 a, float_status *status)
-{
-bool aSign, zSign;
-int aExp;
-uint32_t aSig, zSig, i;
-
-a = float32_squash_input_denormal(a, status);
-aSig = extractFloat32Frac( a );
-aExp = extractFloat32Exp( a );
-aSign = extractFloat32Sign( a );
-
-if ( aExp == 0 ) {
-if ( aSig == 0 ) return packFloat32( 1, 0xFF, 0 );
-normalizeFloat32Subnormal( aSig, ,  );
-}
-if ( aSign ) {
-float_raise(float_flag_invalid, status);
-return float32_default_nan(status);
-}
-if ( aExp == 0xFF ) {
-if (aSig) {
-return propagateFloat32NaN(a, float32_zero, status);
-}
-return a;
-}
-
-aExp -= 0x7F;
-aSig |= 0x0080;
-zSign = aExp < 0;
-zSig = aExp << 23;
-
-for (i = 1 << 22; i > 0; i >>= 1) {
-aSig = ( (uint64_t)aSig * aSig ) >> 23;
-if ( aSig & 0x0100 ) {
-aSig >>= 1;
-zSig |= i;
-}
-}
-
-if ( zSign )
-zSig = -zSig;
-
-return normalizeRoundAndPackFloat32(zSign, 0x85, zSig, status);
-}
-
 /*
 | Returns the remainder of the double-precision floating-point value `a'
 | with respect to the corresponding value `b'.  The operation is performed
@@ -5384,55 +5361,6 @@ float64 float64_rem(float64 a, float64 b, float_status 
*status)
 
 }
 
-/*
-| Returns the binary log of the double-precision floating-point value `a'.
-| The operation is performed according to the IEC/IEEE Standard for Binary
-| Floating-Point Arithmetic.
-**/
-float64 float64_log2(float64 a, float_status *status)
-{
-bool aSign, zSign;
-int aExp;
-uint64_t aSig, aSig0, aSig1, zSig, i;
-a = float64_squash_input_denormal(a, status);
-
-aSig = extractFloat64Frac( a );
-

[PULL 15/29] softfloat: Convert floatx80_mul to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 76 +
 1 file changed, 14 insertions(+), 62 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 737f5d7701..ae2e7aa274 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1944,6 +1944,20 @@ float128_mul(float128 a, float128 b, float_status 
*status)
 return float128_round_pack_canonical(pr, status);
 }
 
+floatx80 QEMU_FLATTEN
+floatx80_mul(floatx80 a, floatx80 b, float_status *status)
+{
+FloatParts128 pa, pb, *pr;
+
+if (!floatx80_unpack_canonical(, a, status) ||
+!floatx80_unpack_canonical(, b, status)) {
+return floatx80_default_nan(status);
+}
+
+pr = parts_mul(, , status);
+return floatx80_round_pack_canonical(pr, status);
+}
+
 /*
  * Fused multiply-add
  */
@@ -5863,68 +5877,6 @@ floatx80 floatx80_round_to_int(floatx80 a, float_status 
*status)
 
 }
 
-/*
-| Returns the result of multiplying the extended double-precision floating-
-| point values `a' and `b'.  The operation is performed according to the
-| IEC/IEEE Standard for Binary Floating-Point Arithmetic.
-**/
-
-floatx80 floatx80_mul(floatx80 a, floatx80 b, float_status *status)
-{
-bool aSign, bSign, zSign;
-int32_t aExp, bExp, zExp;
-uint64_t aSig, bSig, zSig0, zSig1;
-
-if (floatx80_invalid_encoding(a) || floatx80_invalid_encoding(b)) {
-float_raise(float_flag_invalid, status);
-return floatx80_default_nan(status);
-}
-aSig = extractFloatx80Frac( a );
-aExp = extractFloatx80Exp( a );
-aSign = extractFloatx80Sign( a );
-bSig = extractFloatx80Frac( b );
-bExp = extractFloatx80Exp( b );
-bSign = extractFloatx80Sign( b );
-zSign = aSign ^ bSign;
-if ( aExp == 0x7FFF ) {
-if ((uint64_t) ( aSig<<1 )
- || ( ( bExp == 0x7FFF ) && (uint64_t) ( bSig<<1 ) ) ) {
-return propagateFloatx80NaN(a, b, status);
-}
-if ( ( bExp | bSig ) == 0 ) goto invalid;
-return packFloatx80(zSign, floatx80_infinity_high,
-   floatx80_infinity_low);
-}
-if ( bExp == 0x7FFF ) {
-if ((uint64_t)(bSig << 1)) {
-return propagateFloatx80NaN(a, b, status);
-}
-if ( ( aExp | aSig ) == 0 ) {
- invalid:
-float_raise(float_flag_invalid, status);
-return floatx80_default_nan(status);
-}
-return packFloatx80(zSign, floatx80_infinity_high,
-   floatx80_infinity_low);
-}
-if ( aExp == 0 ) {
-if ( aSig == 0 ) return packFloatx80( zSign, 0, 0 );
-normalizeFloatx80Subnormal( aSig, ,  );
-}
-if ( bExp == 0 ) {
-if ( bSig == 0 ) return packFloatx80( zSign, 0, 0 );
-normalizeFloatx80Subnormal( bSig, ,  );
-}
-zExp = aExp + bExp - 0x3FFE;
-mul64To128( aSig, bSig, ,  );
-if ( 0 < (int64_t) zSig0 ) {
-shortShift128Left( zSig0, zSig1, 1, ,  );
---zExp;
-}
-return roundAndPackFloatx80(status->floatx80_rounding_precision,
-zSign, zExp, zSig0, zSig1, status);
-}
-
 /*
 | Returns the result of dividing the extended double-precision floating-point
 | value `a' by the corresponding value `b'.  The operation is performed
-- 
2.25.1

[PULL 28/29] tests/fp: Enable more tests

2021-06-03 Thread Richard Henderson

From: Alex Bennée 

Fix the trivial typo in extF80_lt_quiet, and re-enable
all of the floatx80 tests that are now fixed.

Signed-off-by: Alex Bennée 
Message-ID: <87bl9iyahr@linaro.org>
[rth: Squash the fix for lt_quiet, and enable that too.]
Signed-off-by: Richard Henderson 
---
 tests/fp/wrap.c.inc  |  2 +-
 tests/fp/meson.build | 16 +++-
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tests/fp/wrap.c.inc b/tests/fp/wrap.c.inc
index cb1bb77e4c..9ff884c140 100644
--- a/tests/fp/wrap.c.inc
+++ b/tests/fp/wrap.c.inc
@@ -643,7 +643,7 @@ WRAP_CMP80(qemu_extF80M_eq, floatx80_eq_quiet)
 WRAP_CMP80(qemu_extF80M_le, floatx80_le)
 WRAP_CMP80(qemu_extF80M_lt, floatx80_lt)
 WRAP_CMP80(qemu_extF80M_le_quiet, floatx80_le_quiet)
-WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_le_quiet)
+WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_lt_quiet)
 #undef WRAP_CMP80
 
 #define WRAP_CMP128(name, func) \
diff --git a/tests/fp/meson.build b/tests/fp/meson.build
index 9218bfd3b0..07e2cdc8d2 100644
--- a/tests/fp/meson.build
+++ b/tests/fp/meson.build
@@ -556,7 +556,9 @@ softfloat_conv_tests = {
   'extF80_to_f64 extF80_to_f128 ' +
   'f128_to_f16',
 'int-to-float': 'i32_to_f16 i64_to_f16 i32_to_f32 i64_to_f32 ' +
-'i32_to_f64 i64_to_f64 i32_to_f128 i64_to_f128',
+'i32_to_f64 i64_to_f64 ' +
+'i32_to_extF80 i64_to_extF80 ' +
+'i32_to_f128 i64_to_f128',
 'uint-to-float': 'ui32_to_f16 ui64_to_f16 ui32_to_f32 ui64_to_f32 ' +
  'ui32_to_f64 ui64_to_f64 ui64_to_f128 ' +
  'ui32_to_extF80 ui64_to_extF80',
@@ -581,7 +583,7 @@ softfloat_conv_tests = {
  'extF80_to_ui64 extF80_to_ui64_r_minMag ' +
  'f128_to_ui64 f128_to_ui64_r_minMag',
 'round-to-integer': 'f16_roundToInt f32_roundToInt ' +
-'f64_roundToInt f128_roundToInt'
+'f64_roundToInt extF80_roundToInt f128_roundToInt'
 }
 softfloat_tests = {
 'eq_signaling' : 'compare',
@@ -602,24 +604,20 @@ fptest_args = ['-s', '-l', '1']
 fptest_rounding_args = ['-r', 'all']
 
 # Conversion Routines:
-# FIXME: i32_to_extF80 (broken), i64_to_extF80 (broken)
-#extF80_roundToInt (broken)
 foreach k, v : softfloat_conv_tests
   test('fp-test-' + k, fptest,
args: fptest_args + fptest_rounding_args + v.split(),
suite: ['softfloat', 'softfloat-conv'])
 endforeach
 
-# FIXME: extF80_{lt_quiet, rem} (broken),
-#extF80_{mulAdd} (missing)
 foreach k, v : softfloat_tests
-  extF80_broken = ['lt_quiet', 'rem'].contains(k)
   test('fp-test-' + k, fptest,
args: fptest_args + fptest_rounding_args +
- ['f16_' + k, 'f32_' + k, 'f64_' + k, 'f128_' + k] +
- (extF80_broken ? [] : ['extF80_' + k]),
+ ['f16_' + k, 'f32_' + k, 'f64_' + k, 'f128_' + k, 'extF80_' + k],
suite: ['softfloat', 'softfloat-' + v])
 endforeach
+
+# FIXME: extF80_{mulAdd} (missing)
 test('fp-test-mulAdd', fptest,
  # no fptest_rounding_args
  args: fptest_args +
-- 
2.25.1

[PULL 23/29] softfloat: Convert floatx80_scalbn to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 50 +++--
 1 file changed, 11 insertions(+), 39 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 5a2a872408..770badd447 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3911,6 +3911,17 @@ float128 float128_scalbn(float128 a, int n, float_status 
*status)
 return float128_round_pack_canonical(, status);
 }
 
+floatx80 floatx80_scalbn(floatx80 a, int n, float_status *status)
+{
+FloatParts128 p;
+
+if (!floatx80_unpack_canonical(, a, status)) {
+return floatx80_default_nan(status);
+}
+parts_scalbn(, n, status);
+return floatx80_round_pack_canonical(, status);
+}
+
 /*
  * Square Root
  */
@@ -5745,45 +5756,6 @@ FloatRelation floatx80_compare_quiet(floatx80 a, 
floatx80 b,
 return floatx80_compare_internal(a, b, 1, status);
 }
 
-floatx80 floatx80_scalbn(floatx80 a, int n, float_status *status)
-{
-bool aSign;
-int32_t aExp;
-uint64_t aSig;
-
-if (floatx80_invalid_encoding(a)) {
-float_raise(float_flag_invalid, status);
-return floatx80_default_nan(status);
-}
-aSig = extractFloatx80Frac( a );
-aExp = extractFloatx80Exp( a );
-aSign = extractFloatx80Sign( a );
-
-if ( aExp == 0x7FFF ) {
-if ( aSig<<1 ) {
-return propagateFloatx80NaN(a, a, status);
-}
-return a;
-}
-
-if (aExp == 0) {
-if (aSig == 0) {
-return a;
-}
-aExp++;
-}
-
-if (n > 0x1) {
-n = 0x1;
-} else if (n < -0x1) {
-n = -0x1;
-}
-
-aExp += n;
-return normalizeRoundAndPackFloatx80(status->floatx80_rounding_precision,
- aSign, aExp, aSig, 0, status);
-}
-
 static void __attribute__((constructor)) softfloat_init(void)
 {
 union_float64 ua, ub, uc, ur;
-- 
2.25.1

[PULL 16/29] softfloat: Convert floatx80_div to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 100 +++-
 1 file changed, 13 insertions(+), 87 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index ae2e7aa274..9c26ba5960 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2294,6 +2294,19 @@ float128_div(float128 a, float128 b, float_status 
*status)
 return float128_round_pack_canonical(pr, status);
 }
 
+floatx80 floatx80_div(floatx80 a, floatx80 b, float_status *status)
+{
+FloatParts128 pa, pb, *pr;
+
+if (!floatx80_unpack_canonical(, a, status) ||
+!floatx80_unpack_canonical(, b, status)) {
+return floatx80_default_nan(status);
+}
+
+pr = parts_div(, , status);
+return floatx80_round_pack_canonical(pr, status);
+}
+
 /*
  * Float to Float conversions
  *
@@ -5877,93 +5890,6 @@ floatx80 floatx80_round_to_int(floatx80 a, float_status 
*status)
 
 }
 
-/*
-| Returns the result of dividing the extended double-precision floating-point
-| value `a' by the corresponding value `b'.  The operation is performed
-| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
-**/
-
-floatx80 floatx80_div(floatx80 a, floatx80 b, float_status *status)
-{
-bool aSign, bSign, zSign;
-int32_t aExp, bExp, zExp;
-uint64_t aSig, bSig, zSig0, zSig1;
-uint64_t rem0, rem1, rem2, term0, term1, term2;
-
-if (floatx80_invalid_encoding(a) || floatx80_invalid_encoding(b)) {
-float_raise(float_flag_invalid, status);
-return floatx80_default_nan(status);
-}
-aSig = extractFloatx80Frac( a );
-aExp = extractFloatx80Exp( a );
-aSign = extractFloatx80Sign( a );
-bSig = extractFloatx80Frac( b );
-bExp = extractFloatx80Exp( b );
-bSign = extractFloatx80Sign( b );
-zSign = aSign ^ bSign;
-if ( aExp == 0x7FFF ) {
-if ((uint64_t)(aSig << 1)) {
-return propagateFloatx80NaN(a, b, status);
-}
-if ( bExp == 0x7FFF ) {
-if ((uint64_t)(bSig << 1)) {
-return propagateFloatx80NaN(a, b, status);
-}
-goto invalid;
-}
-return packFloatx80(zSign, floatx80_infinity_high,
-   floatx80_infinity_low);
-}
-if ( bExp == 0x7FFF ) {
-if ((uint64_t)(bSig << 1)) {
-return propagateFloatx80NaN(a, b, status);
-}
-return packFloatx80( zSign, 0, 0 );
-}
-if ( bExp == 0 ) {
-if ( bSig == 0 ) {
-if ( ( aExp | aSig ) == 0 ) {
- invalid:
-float_raise(float_flag_invalid, status);
-return floatx80_default_nan(status);
-}
-float_raise(float_flag_divbyzero, status);
-return packFloatx80(zSign, floatx80_infinity_high,
-   floatx80_infinity_low);
-}
-normalizeFloatx80Subnormal( bSig, ,  );
-}
-if ( aExp == 0 ) {
-if ( aSig == 0 ) return packFloatx80( zSign, 0, 0 );
-normalizeFloatx80Subnormal( aSig, ,  );
-}
-zExp = aExp - bExp + 0x3FFE;
-rem1 = 0;
-if ( bSig <= aSig ) {
-shift128Right( aSig, 0, 1, ,  );
-++zExp;
-}
-zSig0 = estimateDiv128To64( aSig, rem1, bSig );
-mul64To128( bSig, zSig0, ,  );
-sub128( aSig, rem1, term0, term1, ,  );
-while ( (int64_t) rem0 < 0 ) {
---zSig0;
-add128( rem0, rem1, 0, bSig, ,  );
-}
-zSig1 = estimateDiv128To64( rem1, 0, bSig );
-if ( (uint64_t) ( zSig1<<1 ) <= 8 ) {
-mul64To128( bSig, zSig1, ,  );
-sub128( rem1, 0, term1, term2, ,  );
-while ( (int64_t) rem1 < 0 ) {
---zSig1;
-add128( rem1, rem2, 0, bSig, ,  );
-}
-zSig1 |= ( ( rem1 | rem2 ) != 0 );
-}
-return roundAndPackFloatx80(status->floatx80_rounding_precision,
-zSign, zExp, zSig0, zSig1, status);
-}
-
 /*
 | Returns the remainder of the extended double-precision floating-point value
 | `a' with respect to the corresponding value `b'.  The operation is performed
-- 
2.25.1

[PULL 14/29] softfloat: Convert floatx80_add/sub to FloatParts

2021-06-03 Thread Richard Henderson

Since this is the first such, this includes all of the
packing and unpacking routines as well.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 339 +++-
 1 file changed, 136 insertions(+), 203 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index b6a50e5e95..737f5d7701 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -578,14 +578,14 @@ typedef struct {
 } FloatFmt;
 
 /* Expand fields based on the size of exponent and fraction */
-#define FLOAT_PARAMS_(E, F) \
+#define FLOAT_PARAMS_(E)\
 .exp_size   = E,\
 .exp_bias   = ((1 << E) - 1) >> 1,  \
-.exp_max= (1 << E) - 1, \
-.frac_size  = F
+.exp_max= (1 << E) - 1
 
 #define FLOAT_PARAMS(E, F)  \
-FLOAT_PARAMS_(E, F),\
+FLOAT_PARAMS_(E),   \
+.frac_size  = F,\
 .frac_shift = (-F - 1) & 63,\
 .round_mask = (1ull << ((-F - 1) & 63)) - 1
 
@@ -614,6 +614,18 @@ static const FloatFmt float128_params = {
 FLOAT_PARAMS(15, 112)
 };
 
+#define FLOATX80_PARAMS(R)  \
+FLOAT_PARAMS_(15),  \
+.frac_size = R == 64 ? 63 : R,  \
+.frac_shift = 0,\
+.round_mask = R == 64 ? -1 : (1ull << ((-R - 1) & 63)) - 1
+
+static const FloatFmt floatx80_params[3] = {
+[floatx80_precision_s] = { FLOATX80_PARAMS(23) },
+[floatx80_precision_d] = { FLOATX80_PARAMS(52) },
+[floatx80_precision_x] = { FLOATX80_PARAMS(64) },
+};
+
 /* Unpack a float to parts, but do not canonicalize.  */
 static void unpack_raw64(FloatParts64 *r, const FloatFmt *fmt, uint64_t raw)
 {
@@ -648,6 +660,16 @@ static inline void float64_unpack_raw(FloatParts64 *p, 
float64 f)
 unpack_raw64(p, _params, f);
 }
 
+static void floatx80_unpack_raw(FloatParts128 *p, floatx80 f)
+{
+*p = (FloatParts128) {
+.cls = float_class_unclassified,
+.sign = extract32(f.high, 15, 1),
+.exp = extract32(f.high, 0, 15),
+.frac_hi = f.low
+};
+}
+
 static void float128_unpack_raw(FloatParts128 *p, float128 f)
 {
 const int f_size = float128_params.frac_size - 64;
@@ -1536,6 +1558,92 @@ static float128 
float128_round_pack_canonical(FloatParts128 *p,
 return float128_pack_raw(p);
 }
 
+/* Returns false if the encoding is invalid. */
+static bool floatx80_unpack_canonical(FloatParts128 *p, floatx80 f,
+  float_status *s)
+{
+/* Ensure rounding precision is set before beginning. */
+switch (s->floatx80_rounding_precision) {
+case floatx80_precision_x:
+case floatx80_precision_d:
+case floatx80_precision_s:
+break;
+default:
+g_assert_not_reached();
+}
+
+if (unlikely(floatx80_invalid_encoding(f))) {
+float_raise(float_flag_invalid, s);
+return false;
+}
+
+floatx80_unpack_raw(p, f);
+
+if (likely(p->exp != floatx80_params[floatx80_precision_x].exp_max)) {
+parts_canonicalize(p, s, _params[floatx80_precision_x]);
+} else {
+/* The explicit integer bit is ignored, after invalid checks. */
+p->frac_hi &= MAKE_64BIT_MASK(0, 63);
+p->cls = (p->frac_hi == 0 ? float_class_inf
+  : parts_is_snan_frac(p->frac_hi, s)
+  ? float_class_snan : float_class_qnan);
+}
+return true;
+}
+
+static floatx80 floatx80_round_pack_canonical(FloatParts128 *p,
+  float_status *s)
+{
+const FloatFmt *fmt = _params[s->floatx80_rounding_precision];
+uint64_t frac;
+int exp;
+
+switch (p->cls) {
+case float_class_normal:
+if (s->floatx80_rounding_precision == floatx80_precision_x) {
+parts_uncanon_normal(p, s, fmt);
+frac = p->frac_hi;
+exp = p->exp;
+} else {
+FloatParts64 p64;
+
+p64.sign = p->sign;
+p64.exp = p->exp;
+frac_truncjam(, p);
+parts_uncanon_normal(, s, fmt);
+frac = p64.frac;
+exp = p64.exp;
+}
+if (exp != fmt->exp_max) {
+break;
+}
+/* rounded to inf -- fall through to set frac correctly */
+
+case float_class_inf:
+/* x86 and m68k differ in the setting of the integer bit. */
+frac = floatx80_infinity_low;
+exp = fmt->exp_max;
+break;
+
+case float_class_zero:
+frac = 0;
+exp = 0;
+break;
+
+case float_class_snan:
+case float_class_qnan:
+/* NaNs have the integer bit set. */
+frac = p->frac_hi | (1ull << 63);
+exp = fmt->exp_max;
+break;
+
+default:
+

[PULL 27/29] softfloat: Convert modrem operations to FloatParts

2021-06-03 Thread Richard Henderson

Rename to parts$N_modrem.  This was the last use of a lot
of the legacy infrastructure, so remove it as required.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/fpu/softfloat-macros.h |   34 +
 fpu/softfloat.c| 1339 +++-
 fpu/softfloat-parts.c.inc  |   34 +
 fpu/softfloat-specialize.c.inc |  165 
 4 files changed, 329 insertions(+), 1243 deletions(-)

diff --git a/include/fpu/softfloat-macros.h b/include/fpu/softfloat-macros.h
index ec4e27a595..81c3fe8256 100644
--- a/include/fpu/softfloat-macros.h
+++ b/include/fpu/softfloat-macros.h
@@ -745,4 +745,38 @@ static inline bool ne128(uint64_t a0, uint64_t a1, 
uint64_t b0, uint64_t b1)
 return a0 != b0 || a1 != b1;
 }
 
+/*
+ * Similarly, comparisons of 192-bit values.
+ */
+
+static inline bool eq192(uint64_t a0, uint64_t a1, uint64_t a2,
+ uint64_t b0, uint64_t b1, uint64_t b2)
+{
+return ((a0 ^ b0) | (a1 ^ b1) | (a2 ^ b2)) == 0;
+}
+
+static inline bool le192(uint64_t a0, uint64_t a1, uint64_t a2,
+ uint64_t b0, uint64_t b1, uint64_t b2)
+{
+if (a0 != b0) {
+return a0 < b0;
+}
+if (a1 != b1) {
+return a1 < b1;
+}
+return a2 <= b2;
+}
+
+static inline bool lt192(uint64_t a0, uint64_t a1, uint64_t a2,
+ uint64_t b0, uint64_t b1, uint64_t b2)
+{
+if (a0 != b0) {
+return a0 < b0;
+}
+if (a1 != b1) {
+return a1 < b1;
+}
+return a2 < b2;
+}
+
 #endif
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index c0fe191f4d..5026f518b0 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -401,60 +401,6 @@ float64_gen2(float64 xa, float64 xb, float_status *s,
 return soft(ua.s, ub.s, s);
 }
 
-/*
-| Returns the fraction bits of the single-precision floating-point value `a'.
-**/
-
-static inline uint32_t extractFloat32Frac(float32 a)
-{
-return float32_val(a) & 0x007F;
-}
-
-/*
-| Returns the exponent bits of the single-precision floating-point value `a'.
-**/
-
-static inline int extractFloat32Exp(float32 a)
-{
-return (float32_val(a) >> 23) & 0xFF;
-}
-
-/*
-| Returns the sign bit of the single-precision floating-point value `a'.
-**/
-
-static inline bool extractFloat32Sign(float32 a)
-{
-return float32_val(a) >> 31;
-}
-
-/*
-| Returns the fraction bits of the double-precision floating-point value `a'.
-**/
-
-static inline uint64_t extractFloat64Frac(float64 a)
-{
-return float64_val(a) & UINT64_C(0x000F);
-}
-
-/*
-| Returns the exponent bits of the double-precision floating-point value `a'.
-**/
-
-static inline int extractFloat64Exp(float64 a)
-{
-return (float64_val(a) >> 52) & 0x7FF;
-}
-
-/*
-| Returns the sign bit of the double-precision floating-point value `a'.
-**/
-
-static inline bool extractFloat64Sign(float64 a)
-{
-return float64_val(a) >> 63;
-}
-
 /*
  * Classify a floating point number. Everything above float_class_qnan
  * is a NaN so cls >= float_class_qnan is any NaN.
@@ -845,6 +791,14 @@ static FloatParts128 *parts128_div(FloatParts128 *a, 
FloatParts128 *b,
 #define parts_div(A, B, S) \
 PARTS_GENERIC_64_128(div, A)(A, B, S)
 
+static FloatParts64 *parts64_modrem(FloatParts64 *a, FloatParts64 *b,
+uint64_t *mod_quot, float_status *s);
+static FloatParts128 *parts128_modrem(FloatParts128 *a, FloatParts128 *b,
+  uint64_t *mod_quot, float_status *s);
+
+#define parts_modrem(A, B, Q, S) \
+PARTS_GENERIC_64_128(modrem, A)(A, B, Q, S)
+
 static void parts64_sqrt(FloatParts64 *a, float_status *s, const FloatFmt *f);
 static void parts128_sqrt(FloatParts128 *a, float_status *s, const FloatFmt 
*f);
 
@@ -1229,6 +1183,186 @@ static int frac256_normalize(FloatParts256 *a)
 
 #define frac_normalize(A)  FRAC_GENERIC_64_128_256(normalize, A)(A)
 
+static void frac64_modrem(FloatParts64 *a, FloatParts64 *b, uint64_t *mod_quot)
+{
+uint64_t a0, a1, b0, t0, t1, q, quot;
+int exp_diff = a->exp - b->exp;
+

[PULL 24/29] softfloat: Convert floatx80 compare to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 82 +
 1 file changed, 22 insertions(+), 60 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 770badd447..c32b1c7113 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3862,6 +3862,28 @@ FloatRelation float128_compare_quiet(float128 a, 
float128 b, float_status *s)
 return float128_do_compare(a, b, s, true);
 }
 
+static FloatRelation QEMU_FLATTEN
+floatx80_do_compare(floatx80 a, floatx80 b, float_status *s, bool is_quiet)
+{
+FloatParts128 pa, pb;
+
+if (!floatx80_unpack_canonical(, a, s) ||
+!floatx80_unpack_canonical(, b, s)) {
+return float_relation_unordered;
+}
+return parts_compare(, , s, is_quiet);
+}
+
+FloatRelation floatx80_compare(floatx80 a, floatx80 b, float_status *s)
+{
+return floatx80_do_compare(a, b, s, false);
+}
+
+FloatRelation floatx80_compare_quiet(floatx80 a, floatx80 b, float_status *s)
+{
+return floatx80_do_compare(a, b, s, true);
+}
+
 /*
  * Scale by 2**N
  */
@@ -5696,66 +5718,6 @@ float128 float128_rem(float128 a, float128 b, 
float_status *status)
 return normalizeRoundAndPackFloat128(aSign ^ zSign, bExp - 4, aSig0, aSig1,
  status);
 }
-
-static inline FloatRelation
-floatx80_compare_internal(floatx80 a, floatx80 b, bool is_quiet,
-  float_status *status)
-{
-bool aSign, bSign;
-
-if (floatx80_invalid_encoding(a) || floatx80_invalid_encoding(b)) {
-float_raise(float_flag_invalid, status);
-return float_relation_unordered;
-}
-if (( ( extractFloatx80Exp( a ) == 0x7fff ) &&
-  ( extractFloatx80Frac( a )<<1 ) ) ||
-( ( extractFloatx80Exp( b ) == 0x7fff ) &&
-  ( extractFloatx80Frac( b )<<1 ) )) {
-if (!is_quiet ||
-floatx80_is_signaling_nan(a, status) ||
-floatx80_is_signaling_nan(b, status)) {
-float_raise(float_flag_invalid, status);
-}
-return float_relation_unordered;
-}
-aSign = extractFloatx80Sign( a );
-bSign = extractFloatx80Sign( b );
-if ( aSign != bSign ) {
-
-if ( ( ( (uint16_t) ( ( a.high | b.high ) << 1 ) ) == 0) &&
- ( ( a.low | b.low ) == 0 ) ) {
-/* zero case */
-return float_relation_equal;
-} else {
-return 1 - (2 * aSign);
-}
-} else {
-/* Normalize pseudo-denormals before comparison.  */
-if ((a.high & 0x7fff) == 0 && a.low & UINT64_C(0x8000)) {
-++a.high;
-}
-if ((b.high & 0x7fff) == 0 && b.low & UINT64_C(0x8000)) {
-++b.high;
-}
-if (a.low == b.low && a.high == b.high) {
-return float_relation_equal;
-} else {
-return 1 - 2 * (aSign ^ ( lt128( a.high, a.low, b.high, b.low ) ));
-}
-}
-}
-
-FloatRelation floatx80_compare(floatx80 a, floatx80 b, float_status *status)
-{
-return floatx80_compare_internal(a, b, 0, status);
-}
-
-FloatRelation floatx80_compare_quiet(floatx80 a, floatx80 b,
- float_status *status)
-{
-return floatx80_compare_internal(a, b, 1, status);
-}
-
 static void __attribute__((constructor)) softfloat_init(void)
 {
 union_float64 ua, ub, uc, ur;
-- 
2.25.1

[PULL 11/29] softfloat: Introduce Floatx80RoundPrec

2021-06-03 Thread Richard Henderson

Use an enumeration instead of raw 32/64/80 values.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/fpu/softfloat-helpers.h |  5 +-
 include/fpu/softfloat-types.h   | 10 +++-
 include/fpu/softfloat.h |  4 +-
 fpu/softfloat.c | 32 ++--
 linux-user/arm/nwfpe/fpa11.c| 41 +++
 target/i386/tcg/fpu_helper.c| 79 +
 target/m68k/fpu_helper.c| 50 +-
 target/m68k/softfloat.c | 90 -
 tests/fp/fp-test.c  |  5 +-
 9 files changed, 182 insertions(+), 134 deletions(-)

diff --git a/include/fpu/softfloat-helpers.h b/include/fpu/softfloat-helpers.h
index 2f0674fbdd..34f4cf92ae 100644
--- a/include/fpu/softfloat-helpers.h
+++ b/include/fpu/softfloat-helpers.h
@@ -69,7 +69,7 @@ static inline void set_float_exception_flags(int val, 
float_status *status)
 status->float_exception_flags = val;
 }
 
-static inline void set_floatx80_rounding_precision(int val,
+static inline void set_floatx80_rounding_precision(FloatX80RoundPrec val,
float_status *status)
 {
 status->floatx80_rounding_precision = val;
@@ -120,7 +120,8 @@ static inline int get_float_exception_flags(float_status 
*status)
 return status->float_exception_flags;
 }
 
-static inline int get_floatx80_rounding_precision(float_status *status)
+static inline FloatX80RoundPrec
+get_floatx80_rounding_precision(float_status *status)
 {
 return status->floatx80_rounding_precision;
 }
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index 3b757c3d6a..5bcbd041f7 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -154,6 +154,14 @@ enum {
 float_flag_output_denormal = 128
 };
 
+/*
+ * Rounding precision for floatx80.
+ */
+typedef enum __attribute__((__packed__)) {
+floatx80_precision_x,
+floatx80_precision_d,
+floatx80_precision_s,
+} FloatX80RoundPrec;
 
 /*
  * Floating Point Status. Individual architectures may maintain
@@ -165,7 +173,7 @@ enum {
 typedef struct float_status {
 FloatRoundMode float_rounding_mode;
 uint8_t float_exception_flags;
-signed char floatx80_rounding_precision;
+FloatX80RoundPrec floatx80_rounding_precision;
 bool tininess_before_rounding;
 /* should denormalised results go to zero and set the inexact flag? */
 bool flush_to_zero;
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index ed32040aa9..ec7dca0960 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -1152,7 +1152,7 @@ floatx80 propagateFloatx80NaN(floatx80 a, floatx80 b, 
float_status *status);
 | Floating-Point Arithmetic.
 **/
 
-floatx80 roundAndPackFloatx80(int8_t roundingPrecision, bool zSign,
+floatx80 roundAndPackFloatx80(FloatX80RoundPrec roundingPrecision, bool zSign,
   int32_t zExp, uint64_t zSig0, uint64_t zSig1,
   float_status *status);
 
@@ -1165,7 +1165,7 @@ floatx80 roundAndPackFloatx80(int8_t roundingPrecision, 
bool zSign,
 | normalized.
 **/
 
-floatx80 normalizeRoundAndPackFloatx80(int8_t roundingPrecision,
+floatx80 normalizeRoundAndPackFloatx80(FloatX80RoundPrec roundingPrecision,
bool zSign, int32_t zExp,
uint64_t zSig0, uint64_t zSig1,
float_status *status);
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 741480c568..b6a50e5e95 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -4342,10 +4342,10 @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t 
*zExpPtr,
 | a subnormal number, and the underflow and inexact exceptions are raised if
 | the abstract input cannot be represented exactly as a subnormal extended
 | double-precision floating-point number.
-| If `roundingPrecision' is 32 or 64, the result is rounded to the same
-| number of bits as single or double precision, respectively.  Otherwise, the
-| result is rounded to the full precision of the extended double-precision
-| format.
+| If `roundingPrecision' is floatx80_precision_s or floatx80_precision_d,
+| the result is rounded to the same number of bits as single or double
+| precision, respectively.  Otherwise, the result is rounded to the full
+| precision of the extended double-precision format.
 | The input significand must be normalized or smaller.  If the input
 | significand is not normalized, `zExp' must be 0; in that case, the result
 | returned is a subnormal number, and it must not require rounding.  The
@@ -4353,27 +4353,29 @@ void normalizeFloatx80Subnormal(uint64_t aSig, int32_t 
*zExpPtr,
 | Floating-Point Arithmetic.

[PULL 13/29] tests/fp/fp-test: Reverse order of floatx80 precision tests

2021-06-03 Thread Richard Henderson

Many qemu softfloat will check floatx80_rounding_precision
even when berkeley testfloat will not.  So begin with
floatx80_precision_x, so that's the one we use
when !FUNC_EFF_ROUNDINGPRECISION.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 tests/fp/fp-test.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
index 1be3a9788a..352dd71c44 100644
--- a/tests/fp/fp-test.c
+++ b/tests/fp/fp-test.c
@@ -963,16 +963,16 @@ static void QEMU_NORETURN run_test(void)
 verCases_usesExact = !!(attrs & FUNC_ARG_EXACT);
 
 for (k = 0; k < 3; k++) {
-FloatX80RoundPrec qsf_prec80 = floatx80_precision_s;
-int prec80 = 32;
+FloatX80RoundPrec qsf_prec80 = floatx80_precision_x;
+int prec80 = 80;
 int l;
 
 if (k == 1) {
 prec80 = 64;
 qsf_prec80 = floatx80_precision_d;
 } else if (k == 2) {
-prec80 = 80;
-qsf_prec80 = floatx80_precision_x;
+prec80 = 32;
+qsf_prec80 = floatx80_precision_s;
 }
 
 verCases_roundingPrecision = 0;
-- 
2.25.1

[PULL 21/29] softfloat: Convert floatx80 float conversions to FloatParts

2021-06-03 Thread Richard Henderson

This is the last use of commonNaNT and all of the routines
that use it, so remove all of them for Werror.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c| 276 -
 fpu/softfloat-specialize.c.inc | 175 -
 2 files changed, 67 insertions(+), 384 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index be7583780d..acaab6a127 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2561,6 +2561,73 @@ float128 float64_to_float128(float64 a, float_status *s)
 return float128_round_pack_canonical(, s);
 }
 
+float32 floatx80_to_float32(floatx80 a, float_status *s)
+{
+FloatParts64 p64;
+FloatParts128 p128;
+
+if (floatx80_unpack_canonical(, a, s)) {
+parts_float_to_float_narrow(, , s);
+} else {
+parts_default_nan(, s);
+}
+return float32_round_pack_canonical(, s);
+}
+
+float64 floatx80_to_float64(floatx80 a, float_status *s)
+{
+FloatParts64 p64;
+FloatParts128 p128;
+
+if (floatx80_unpack_canonical(, a, s)) {
+parts_float_to_float_narrow(, , s);
+} else {
+parts_default_nan(, s);
+}
+return float64_round_pack_canonical(, s);
+}
+
+float128 floatx80_to_float128(floatx80 a, float_status *s)
+{
+FloatParts128 p;
+
+if (floatx80_unpack_canonical(, a, s)) {
+parts_float_to_float(, s);
+} else {
+parts_default_nan(, s);
+}
+return float128_round_pack_canonical(, s);
+}
+
+floatx80 float32_to_floatx80(float32 a, float_status *s)
+{
+FloatParts64 p64;
+FloatParts128 p128;
+
+float32_unpack_canonical(, a, s);
+parts_float_to_float_widen(, , s);
+return floatx80_round_pack_canonical(, s);
+}
+
+floatx80 float64_to_floatx80(float64 a, float_status *s)
+{
+FloatParts64 p64;
+FloatParts128 p128;
+
+float64_unpack_canonical(, a, s);
+parts_float_to_float_widen(, , s);
+return floatx80_round_pack_canonical(, s);
+}
+
+floatx80 float128_to_floatx80(float128 a, float_status *s)
+{
+FloatParts128 p;
+
+float128_unpack_canonical(, a, s);
+parts_float_to_float(, s);
+return floatx80_round_pack_canonical(, s);
+}
+
 /*
  * Round to integral value
  */
@@ -5046,42 +5113,6 @@ static float128 normalizeRoundAndPackFloat128(bool 
zSign, int32_t zExp,
 
 }
 
-/*
-| Returns the result of converting the single-precision floating-point value
-| `a' to the extended double-precision floating-point format.  The conversion
-| is performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic.
-**/
-
-floatx80 float32_to_floatx80(float32 a, float_status *status)
-{
-bool aSign;
-int aExp;
-uint32_t aSig;
-
-a = float32_squash_input_denormal(a, status);
-aSig = extractFloat32Frac( a );
-aExp = extractFloat32Exp( a );
-aSign = extractFloat32Sign( a );
-if ( aExp == 0xFF ) {
-if (aSig) {
-floatx80 res = commonNaNToFloatx80(float32ToCommonNaN(a, status),
-   status);
-return floatx80_silence_nan(res, status);
-}
-return packFloatx80(aSign,
-floatx80_infinity_high,
-floatx80_infinity_low);
-}
-if ( aExp == 0 ) {
-if ( aSig == 0 ) return packFloatx80( aSign, 0, 0 );
-normalizeFloat32Subnormal( aSig, ,  );
-}
-aSig |= 0x0080;
-return packFloatx80( aSign, aExp + 0x3F80, ( (uint64_t) aSig )<<40 );
-
-}
-
 /*
 | Returns the remainder of the single-precision floating-point value `a'
 | with respect to the corresponding value `b'.  The operation is performed
@@ -5318,43 +5349,6 @@ float32 float32_log2(float32 a, float_status *status)
 return normalizeRoundAndPackFloat32(zSign, 0x85, zSig, status);
 }
 
-/*
-| Returns the result of converting the double-precision floating-point value
-| `a' to the extended double-precision floating-point format.  The conversion
-| is performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic.
-**/
-
-floatx80 float64_to_floatx80(float64 a, float_status *status)
-{
-bool aSign;
-int aExp;
-uint64_t aSig;
-
-a = float64_squash_input_denormal(a, status);
-aSig = extractFloat64Frac( a );
-aExp = extractFloat64Exp( a );
-aSign = extractFloat64Sign( a );
-if ( aExp == 0x7FF ) {
-if (aSig) {
-floatx80 res = commonNaNToFloatx80(float64ToCommonNaN(a, status),
-   status);
-return floatx80_silence_nan(res,

[PULL 17/29] softfloat: Convert floatx80_sqrt to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 82 +++--
 1 file changed, 11 insertions(+), 71 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 9c26ba5960..5a320e5302 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3881,6 +3881,17 @@ float128 QEMU_FLATTEN float128_sqrt(float128 a, 
float_status *status)
 return float128_round_pack_canonical(, status);
 }
 
+floatx80 floatx80_sqrt(floatx80 a, float_status *s)
+{
+FloatParts128 p;
+
+if (!floatx80_unpack_canonical(, a, s)) {
+return floatx80_default_nan(s);
+}
+parts_sqrt(, s, _params[s->floatx80_rounding_precision]);
+return floatx80_round_pack_canonical(, s);
+}
+
 /*
 | The pattern for a default generated NaN.
 **/
@@ -6044,77 +6055,6 @@ floatx80 floatx80_mod(floatx80 a, floatx80 b, 
float_status *status)
 return floatx80_modrem(a, b, true, , status);
 }
 
-/*
-| Returns the square root of the extended double-precision floating-point
-| value `a'.  The operation is performed according to the IEC/IEEE Standard
-| for Binary Floating-Point Arithmetic.
-**/
-
-floatx80 floatx80_sqrt(floatx80 a, float_status *status)
-{
-bool aSign;
-int32_t aExp, zExp;
-uint64_t aSig0, aSig1, zSig0, zSig1, doubleZSig0;
-uint64_t rem0, rem1, rem2, rem3, term0, term1, term2, term3;
-
-if (floatx80_invalid_encoding(a)) {
-float_raise(float_flag_invalid, status);
-return floatx80_default_nan(status);
-}
-aSig0 = extractFloatx80Frac( a );
-aExp = extractFloatx80Exp( a );
-aSign = extractFloatx80Sign( a );
-if ( aExp == 0x7FFF ) {
-if ((uint64_t)(aSig0 << 1)) {
-return propagateFloatx80NaN(a, a, status);
-}
-if ( ! aSign ) return a;
-goto invalid;
-}
-if ( aSign ) {
-if ( ( aExp | aSig0 ) == 0 ) return a;
- invalid:
-float_raise(float_flag_invalid, status);
-return floatx80_default_nan(status);
-}
-if ( aExp == 0 ) {
-if ( aSig0 == 0 ) return packFloatx80( 0, 0, 0 );
-normalizeFloatx80Subnormal( aSig0, ,  );
-}
-zExp = ( ( aExp - 0x3FFF )>>1 ) + 0x3FFF;
-zSig0 = estimateSqrt32( aExp, aSig0>>32 );
-shift128Right( aSig0, 0, 2 + ( aExp & 1 ), ,  );
-zSig0 = estimateDiv128To64( aSig0, aSig1, zSig0<<32 ) + ( zSig0<<30 );
-doubleZSig0 = zSig0<<1;
-mul64To128( zSig0, zSig0, ,  );
-sub128( aSig0, aSig1, term0, term1, ,  );
-while ( (int64_t) rem0 < 0 ) {
---zSig0;
-doubleZSig0 -= 2;
-add128( rem0, rem1, zSig0>>63, doubleZSig0 | 1, ,  );
-}
-zSig1 = estimateDiv128To64( rem1, 0, doubleZSig0 );
-if ( ( zSig1 & UINT64_C(0x3FFF) ) <= 5 ) {
-if ( zSig1 == 0 ) zSig1 = 1;
-mul64To128( doubleZSig0, zSig1, ,  );
-sub128( rem1, 0, term1, term2, ,  );
-mul64To128( zSig1, zSig1, ,  );
-sub192( rem1, rem2, 0, 0, term2, term3, , ,  );
-while ( (int64_t) rem1 < 0 ) {
---zSig1;
-shortShift128Left( 0, zSig1, 1, ,  );
-term3 |= 1;
-term2 |= doubleZSig0;
-add192( rem1, rem2, rem3, 0, term2, term3, , ,  );
-}
-zSig1 |= ( ( rem1 | rem2 | rem3 ) != 0 );
-}
-shortShift128Left( 0, zSig1, 1, ,  );
-zSig0 |= doubleZSig0;
-return roundAndPackFloatx80(status->floatx80_rounding_precision,
-0, zExp, zSig0, zSig1, status);
-}
-
 /*
 | Returns the result of converting the quadruple-precision floating-point
 | value `a' to the extended double-precision floating-point format.  The
-- 
2.25.1

[PULL 12/29] softfloat: Adjust parts_uncanon_normal for floatx80

2021-06-03 Thread Richard Henderson

With floatx80_precision_x, the rounding happens across
the break between words.  Notice this case with

  frac_lsb = round_mask + 1 -> 0

and check the bits in frac_hi as needed.

In addition, since frac_shift == 0, we won't implicitly clear
round_mask via the right-shift, so explicitly clear those bits.
This fixes rounding for floatx80_precision_[sd].

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat-parts.c.inc | 36 ++--
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index a026581c33..efb81bbebe 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -155,7 +155,13 @@ static void partsN(uncanon_normal)(FloatPartsN *p, 
float_status *s,
 
 switch (s->float_rounding_mode) {
 case float_round_nearest_even:
-inc = ((p->frac_lo & roundeven_mask) != frac_lsbm1 ? frac_lsbm1 : 0);
+if (N > 64 && frac_lsb == 0) {
+inc = ((p->frac_hi & 1) || (p->frac_lo & round_mask) != frac_lsbm1
+   ? frac_lsbm1 : 0);
+} else {
+inc = ((p->frac_lo & roundeven_mask) != frac_lsbm1
+   ? frac_lsbm1 : 0);
+}
 break;
 case float_round_ties_away:
 inc = frac_lsbm1;
@@ -176,7 +182,11 @@ static void partsN(uncanon_normal)(FloatPartsN *p, 
float_status *s,
 overflow_norm = true;
 /* fall through */
 case float_round_to_odd_inf:
-inc = p->frac_lo & frac_lsb ? 0 : round_mask;
+if (N > 64 && frac_lsb == 0) {
+inc = p->frac_hi & 1 ? 0 : round_mask;
+} else {
+inc = p->frac_lo & frac_lsb ? 0 : round_mask;
+}
 break;
 default:
 g_assert_not_reached();
@@ -191,8 +201,8 @@ static void partsN(uncanon_normal)(FloatPartsN *p, 
float_status *s,
 p->frac_hi |= DECOMPOSED_IMPLICIT_BIT;
 exp++;
 }
+p->frac_lo &= ~round_mask;
 }
-frac_shr(p, frac_shift);
 
 if (fmt->arm_althp) {
 /* ARM Alt HP eschews Inf and NaN for a wider exponent.  */
@@ -201,18 +211,21 @@ static void partsN(uncanon_normal)(FloatPartsN *p, 
float_status *s,
 flags = float_flag_invalid;
 exp = exp_max;
 frac_allones(p);
+p->frac_lo &= ~round_mask;
 }
 } else if (unlikely(exp >= exp_max)) {
 flags |= float_flag_overflow | float_flag_inexact;
 if (overflow_norm) {
 exp = exp_max - 1;
 frac_allones(p);
+p->frac_lo &= ~round_mask;
 } else {
 p->cls = float_class_inf;
 exp = exp_max;
 frac_clear(p);
 }
 }
+frac_shr(p, frac_shift);
 } else if (s->flush_to_zero) {
 flags |= float_flag_output_denormal;
 p->cls = float_class_zero;
@@ -232,18 +245,29 @@ static void partsN(uncanon_normal)(FloatPartsN *p, 
float_status *s,
 /* Need to recompute round-to-even/round-to-odd. */
 switch (s->float_rounding_mode) {
 case float_round_nearest_even:
-inc = ((p->frac_lo & roundeven_mask) != frac_lsbm1
-   ? frac_lsbm1 : 0);
+if (N > 64 && frac_lsb == 0) {
+inc = ((p->frac_hi & 1) ||
+   (p->frac_lo & round_mask) != frac_lsbm1
+   ? frac_lsbm1 : 0);
+} else {
+inc = ((p->frac_lo & roundeven_mask) != frac_lsbm1
+   ? frac_lsbm1 : 0);
+}
 break;
 case float_round_to_odd:
 case float_round_to_odd_inf:
-inc = p->frac_lo & frac_lsb ? 0 : round_mask;
+if (N > 64 && frac_lsb == 0) {
+inc = p->frac_hi & 1 ? 0 : round_mask;
+} else {
+inc = p->frac_lo & frac_lsb ? 0 : round_mask;
+}
 break;
 default:
 break;
 }
 flags |= float_flag_inexact;
 frac_addi(p, p, inc);
+p->frac_lo &= ~round_mask;
 }
 
 exp = (p->frac_hi & DECOMPOSED_IMPLICIT_BIT) != 0;
-- 
2.25.1

[PULL 06/29] softfloat: Move compare_floats to softfloat-parts.c.inc

2021-06-03 Thread Richard Henderson

Rename to parts$N_compare.  Rename all of the intermediate
functions to ftype_do_compare.  Rename the hard-float functions
to ftype_hs_compare.  Convert float128 to FloatParts128.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 208 ++
 fpu/softfloat-parts.c.inc |  57 +++
 2 files changed, 133 insertions(+), 132 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 4fee5a6cb7..6f1bbbe6cf 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -882,6 +882,14 @@ static FloatParts128 *parts128_minmax(FloatParts128 *a, 
FloatParts128 *b,
 #define parts_minmax(A, B, S, F) \
 PARTS_GENERIC_64_128(minmax, A)(A, B, S, F)
 
+static int parts64_compare(FloatParts64 *a, FloatParts64 *b,
+   float_status *s, bool q);
+static int parts128_compare(FloatParts128 *a, FloatParts128 *b,
+float_status *s, bool q);
+
+#define parts_compare(A, B, S, Q) \
+PARTS_GENERIC_64_128(compare, A)(A, B, S, Q)
+
 /*
  * Helper functions for softfloat-parts.c.inc, per-size operations.
  */
@@ -3357,92 +3365,42 @@ MINMAX_2(float128)
 #undef MINMAX_1
 #undef MINMAX_2
 
-/* Floating point compare */
-static FloatRelation compare_floats(FloatParts64 a, FloatParts64 b, bool 
is_quiet,
-float_status *s)
+/*
+ * Floating point compare
+ */
+
+static FloatRelation QEMU_FLATTEN
+float16_do_compare(float16 a, float16 b, float_status *s, bool is_quiet)
 {
-if (is_nan(a.cls) || is_nan(b.cls)) {
-if (!is_quiet ||
-a.cls == float_class_snan ||
-b.cls == float_class_snan) {
-float_raise(float_flag_invalid, s);
-}
-return float_relation_unordered;
-}
+FloatParts64 pa, pb;
 
-if (a.cls == float_class_zero) {
-if (b.cls == float_class_zero) {
-return float_relation_equal;
-}
-return b.sign ? float_relation_greater : float_relation_less;
-} else if (b.cls == float_class_zero) {
-return a.sign ? float_relation_less : float_relation_greater;
-}
-
-/* The only really important thing about infinity is its sign. If
- * both are infinities the sign marks the smallest of the two.
- */
-if (a.cls == float_class_inf) {
-if ((b.cls == float_class_inf) && (a.sign == b.sign)) {
-return float_relation_equal;
-}
-return a.sign ? float_relation_less : float_relation_greater;
-} else if (b.cls == float_class_inf) {
-return b.sign ? float_relation_greater : float_relation_less;
-}
-
-if (a.sign != b.sign) {
-return a.sign ? float_relation_less : float_relation_greater;
-}
-
-if (a.exp == b.exp) {
-if (a.frac == b.frac) {
-return float_relation_equal;
-}
-if (a.sign) {
-return a.frac > b.frac ?
-float_relation_less : float_relation_greater;
-} else {
-return a.frac > b.frac ?
-float_relation_greater : float_relation_less;
-}
-} else {
-if (a.sign) {
-return a.exp > b.exp ? float_relation_less : 
float_relation_greater;
-} else {
-return a.exp > b.exp ? float_relation_greater : 
float_relation_less;
-}
-}
+float16_unpack_canonical(, a, s);
+float16_unpack_canonical(, b, s);
+return parts_compare(, , s, is_quiet);
 }
 
-#define COMPARE(name, attr, sz) \
-static int attr \
-name(float ## sz a, float ## sz b, bool is_quiet, float_status *s)  \
-{   \
-FloatParts64 pa, pb;\
-float ## sz ## _unpack_canonical(, a, s);\
-float ## sz ## _unpack_canonical(, b, s);\
-return compare_floats(pa, pb, is_quiet, s); \
-}
-
-COMPARE(soft_f16_compare, QEMU_FLATTEN, 16)
-COMPARE(soft_f32_compare, QEMU_SOFTFLOAT_ATTR, 32)
-COMPARE(soft_f64_compare, QEMU_SOFTFLOAT_ATTR, 64)
-
-#undef COMPARE
-
 FloatRelation float16_compare(float16 a, float16 b, float_status *s)
 {
-return soft_f16_compare(a, b, false, s);
+return float16_do_compare(a, b, s, false);
 }
 
 FloatRelation float16_compare_quiet(float16 a, float16 b, float_status *s)
 {
-return soft_f16_compare(a, b, true, s);
+return float16_do_compare(a, b, s, true);
+}
+
+static FloatRelation QEMU_SOFTFLOAT_ATTR
+float32_do_compare(float32 a, float32 b, float_status *s, bool is_quiet)
+{
+FloatParts64 pa, pb;
+
+float32_unpack_canonical(, a, s);
+float32_unpack_canonical(, b, s);
+return parts_compare(, , s, is_quiet);
 }
 
 static FloatRelation QEMU_FLATTEN
-f32_compare(float32 xa, float32 xb, bool is_quiet, float_status *s)

[PULL 19/29] softfloat: Convert floatx80_round_to_int to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 116 ++--
 1 file changed, 13 insertions(+), 103 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 74787d5a6e..9caf1ecf9c 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2610,6 +2610,19 @@ float128 float128_round_to_int(float128 a, float_status 
*s)
 return float128_round_pack_canonical(, s);
 }
 
+floatx80 floatx80_round_to_int(floatx80 a, float_status *status)
+{
+FloatParts128 p;
+
+if (!floatx80_unpack_canonical(, a, status)) {
+return floatx80_default_nan(status);
+}
+
+parts_round_to_int(, status->float_rounding_mode, 0, status,
+   _params[status->floatx80_rounding_precision]);
+return floatx80_round_pack_canonical(, status);
+}
+
 /*
  * Floating-point to signed integer conversions
  */
@@ -5800,109 +5813,6 @@ floatx80 floatx80_round(floatx80 a, float_status 
*status)
 return floatx80_round_pack_canonical(, status);
 }
 
-/*
-| Rounds the extended double-precision floating-point value `a' to an integer,
-| and returns the result as an extended quadruple-precision floating-point
-| value.  The operation is performed according to the IEC/IEEE Standard for
-| Binary Floating-Point Arithmetic.
-**/
-
-floatx80 floatx80_round_to_int(floatx80 a, float_status *status)
-{
-bool aSign;
-int32_t aExp;
-uint64_t lastBitMask, roundBitsMask;
-floatx80 z;
-
-if (floatx80_invalid_encoding(a)) {
-float_raise(float_flag_invalid, status);
-return floatx80_default_nan(status);
-}
-aExp = extractFloatx80Exp( a );
-if ( 0x403E <= aExp ) {
-if ( ( aExp == 0x7FFF ) && (uint64_t) ( extractFloatx80Frac( a )<<1 ) 
) {
-return propagateFloatx80NaN(a, a, status);
-}
-return a;
-}
-if ( aExp < 0x3FFF ) {
-if (( aExp == 0 )
- && ( (uint64_t) ( extractFloatx80Frac( a ) ) == 0 ) ) {
-return a;
-}
-float_raise(float_flag_inexact, status);
-aSign = extractFloatx80Sign( a );
-switch (status->float_rounding_mode) {
- case float_round_nearest_even:
-if ( ( aExp == 0x3FFE ) && (uint64_t) ( extractFloatx80Frac( a 
)<<1 )
-   ) {
-return
-packFloatx80( aSign, 0x3FFF, UINT64_C(0x8000));
-}
-break;
-case float_round_ties_away:
-if (aExp == 0x3FFE) {
-return packFloatx80(aSign, 0x3FFF, 
UINT64_C(0x8000));
-}
-break;
- case float_round_down:
-return
-  aSign ?
-  packFloatx80( 1, 0x3FFF, UINT64_C(0x8000))
-: packFloatx80( 0, 0, 0 );
- case float_round_up:
-return
-  aSign ? packFloatx80( 1, 0, 0 )
-: packFloatx80( 0, 0x3FFF, UINT64_C(0x8000));
-
-case float_round_to_zero:
-break;
-default:
-g_assert_not_reached();
-}
-return packFloatx80( aSign, 0, 0 );
-}
-lastBitMask = 1;
-lastBitMask <<= 0x403E - aExp;
-roundBitsMask = lastBitMask - 1;
-z = a;
-switch (status->float_rounding_mode) {
-case float_round_nearest_even:
-z.low += lastBitMask>>1;
-if ((z.low & roundBitsMask) == 0) {
-z.low &= ~lastBitMask;
-}
-break;
-case float_round_ties_away:
-z.low += lastBitMask >> 1;
-break;
-case float_round_to_zero:
-break;
-case float_round_up:
-if (!extractFloatx80Sign(z)) {
-z.low += roundBitsMask;
-}
-break;
-case float_round_down:
-if (extractFloatx80Sign(z)) {
-z.low += roundBitsMask;
-}
-break;
-default:
-abort();
-}
-z.low &= ~ roundBitsMask;
-if ( z.low == 0 ) {
-++z.high;
-z.low = UINT64_C(0x8000);
-}
-if (z.low != a.low) {
-float_raise(float_flag_inexact, status);
-}
-return z;
-
-}
-
 /*
 | Returns the remainder of the extended double-precision floating-point value
 | `a' with respect to the corresponding value `b'.  The operation is performed
-- 
2.25.1

[PULL 18/29] softfloat: Convert floatx80_round to FloatParts

2021-06-03 Thread Richard Henderson

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 5a320e5302..74787d5a6e 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -5792,10 +5792,12 @@ float128 floatx80_to_float128(floatx80 a, float_status 
*status)
 
 floatx80 floatx80_round(floatx80 a, float_status *status)
 {
-return roundAndPackFloatx80(status->floatx80_rounding_precision,
-extractFloatx80Sign(a),
-extractFloatx80Exp(a),
-extractFloatx80Frac(a), 0, status);
+FloatParts128 p;
+
+if (!floatx80_unpack_canonical(, a, status)) {
+return floatx80_default_nan(status);
+}
+return floatx80_round_pack_canonical(, status);
 }
 
 /*
-- 
2.25.1

[PULL 05/29] softfloat: Implement float128_(min|minnum|minnummag|max|maxnum|maxnummag)

2021-06-03 Thread Richard Henderson

From: David Hildenbrand 

The float128 implementation is straight-forward.
Unfortuantely, we don't have any tests we can simply adjust/unlock.

Signed-off-by: David Hildenbrand 
Message-Id: <20210517142739.38597-24-da...@redhat.com>
[rth: Update for changed parts_minmax return value]
Signed-off-by: Richard Henderson 
---
 include/fpu/softfloat.h |  6 ++
 fpu/softfloat.c | 13 +
 2 files changed, 19 insertions(+)

diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 53f2c2ea3c..ed32040aa9 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -1204,6 +1204,12 @@ float128 float128_rem(float128, float128, float_status 
*status);
 float128 float128_sqrt(float128, float_status *status);
 FloatRelation float128_compare(float128, float128, float_status *status);
 FloatRelation float128_compare_quiet(float128, float128, float_status *status);
+float128 float128_min(float128, float128, float_status *status);
+float128 float128_max(float128, float128, float_status *status);
+float128 float128_minnum(float128, float128, float_status *status);
+float128 float128_maxnum(float128, float128, float_status *status);
+float128 float128_minnummag(float128, float128, float_status *status);
+float128 float128_maxnummag(float128, float128, float_status *status);
 bool float128_is_quiet_nan(float128, float_status *status);
 bool float128_is_signaling_nan(float128, float_status *status);
 float128 float128_silence_nan(float128, float_status *status);
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index ef750e1e95..4fee5a6cb7 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3324,6 +3324,18 @@ static float64 float64_minmax(float64 a, float64 b, 
float_status *s, int flags)
 return float64_round_pack_canonical(pr, s);
 }
 
+static float128 float128_minmax(float128 a, float128 b,
+float_status *s, int flags)
+{
+FloatParts128 pa, pb, *pr;
+
+float128_unpack_canonical(, a, s);
+float128_unpack_canonical(, b, s);
+pr = parts_minmax(, , s, flags);
+
+return float128_round_pack_canonical(pr, s);
+}
+
 #define MINMAX_1(type, name, flags) \
 type type##_##name(type a, type b, float_status *s) \
 { return type##_minmax(a, b, s, flags); }
@@ -3340,6 +3352,7 @@ MINMAX_2(float16)
 MINMAX_2(bfloat16)
 MINMAX_2(float32)
 MINMAX_2(float64)
+MINMAX_2(float128)
 
 #undef MINMAX_1
 #undef MINMAX_2
-- 
2.25.1

[PULL 03/29] softfloat: Move uint_to_float to softfloat-parts.c.inc

2021-06-03 Thread Richard Henderson

Rename to parts$N_uint_to_float.
Reimplement uint64_to_float128 with FloatParts128.

Reviewed-by: Alex Bennée 
Reviewed-by: David Hildenbrand 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 83 ---
 fpu/softfloat-parts.c.inc | 23 +++
 2 files changed, 56 insertions(+), 50 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 6404a2997f..db14bd09aa 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -857,6 +857,14 @@ static void parts128_sint_to_float(FloatParts128 *p, 
int64_t a,
 #define parts_sint_to_float(P, I, Z, S) \
 PARTS_GENERIC_64_128(sint_to_float, P)(P, I, Z, S)
 
+static void parts64_uint_to_float(FloatParts64 *p, uint64_t a,
+  int scale, float_status *s);
+static void parts128_uint_to_float(FloatParts128 *p, uint64_t a,
+   int scale, float_status *s);
+
+#define parts_uint_to_float(P, I, Z, S) \
+PARTS_GENERIC_64_128(uint_to_float, P)(P, I, Z, S)
+
 /*
  * Helper functions for softfloat-parts.c.inc, per-size operations.
  */
@@ -3102,35 +3110,15 @@ float128 int32_to_float128(int32_t a, float_status 
*status)
 }
 
 /*
- * Unsigned Integer to float conversions
- *
- * Returns the result of converting the unsigned integer `a' to the
- * floating-point format. The conversion is performed according to the
- * IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+ * Unsigned Integer to floating-point conversions
  */
 
-static FloatParts64 uint_to_float(uint64_t a, int scale, float_status *status)
-{
-FloatParts64 r = { .sign = false };
-int shift;
-
-if (a == 0) {
-r.cls = float_class_zero;
-} else {
-scale = MIN(MAX(scale, -0x1), 0x1);
-shift = clz64(a);
-r.cls = float_class_normal;
-r.exp = DECOMPOSED_BINARY_POINT - shift + scale;
-r.frac = a << shift;
-}
-
-return r;
-}
-
 float16 uint64_to_float16_scalbn(uint64_t a, int scale, float_status *status)
 {
-FloatParts64 pa = uint_to_float(a, scale, status);
-return float16_round_pack_canonical(, status);
+FloatParts64 p;
+
+parts_uint_to_float(, a, scale, status);
+return float16_round_pack_canonical(, status);
 }
 
 float16 uint32_to_float16_scalbn(uint32_t a, int scale, float_status *status)
@@ -3165,8 +3153,10 @@ float16 uint8_to_float16(uint8_t a, float_status *status)
 
 float32 uint64_to_float32_scalbn(uint64_t a, int scale, float_status *status)
 {
-FloatParts64 pa = uint_to_float(a, scale, status);
-return float32_round_pack_canonical(, status);
+FloatParts64 p;
+
+parts_uint_to_float(, a, scale, status);
+return float32_round_pack_canonical(, status);
 }
 
 float32 uint32_to_float32_scalbn(uint32_t a, int scale, float_status *status)
@@ -3196,8 +3186,10 @@ float32 uint16_to_float32(uint16_t a, float_status 
*status)
 
 float64 uint64_to_float64_scalbn(uint64_t a, int scale, float_status *status)
 {
-FloatParts64 pa = uint_to_float(a, scale, status);
-return float64_round_pack_canonical(, status);
+FloatParts64 p;
+
+parts_uint_to_float(, a, scale, status);
+return float64_round_pack_canonical(, status);
 }
 
 float64 uint32_to_float64_scalbn(uint32_t a, int scale, float_status *status)
@@ -3225,15 +3217,12 @@ float64 uint16_to_float64(uint16_t a, float_status 
*status)
 return uint64_to_float64_scalbn(a, 0, status);
 }
 
-/*
- * Returns the result of converting the unsigned integer `a' to the
- * bfloat16 format.
- */
-
 bfloat16 uint64_to_bfloat16_scalbn(uint64_t a, int scale, float_status *status)
 {
-FloatParts64 pa = uint_to_float(a, scale, status);
-return bfloat16_round_pack_canonical(, status);
+FloatParts64 p;
+
+parts_uint_to_float(, a, scale, status);
+return bfloat16_round_pack_canonical(, status);
 }
 
 bfloat16 uint32_to_bfloat16_scalbn(uint32_t a, int scale, float_status *status)
@@ -3261,6 +3250,14 @@ bfloat16 uint16_to_bfloat16(uint16_t a, float_status 
*status)
 return uint64_to_bfloat16_scalbn(a, 0, status);
 }
 
+float128 uint64_to_float128(uint64_t a, float_status *status)
+{
+FloatParts128 p;
+
+parts_uint_to_float(, a, 0, status);
+return float128_round_pack_canonical(, status);
+}
+
 /* Float Min/Max */
 /* min() and max() functions. These can't be implemented as
  * 'compare and pick one input' because that would mishandle
@@ -4972,20 +4969,6 @@ floatx80 int64_to_floatx80(int64_t a, float_status 
*status)
 
 }
 
-/*
-| Returns the result of converting the 64-bit unsigned integer `a'
-| to the quadruple-precision floating-point format.  The conversion is 
performed
-| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
-**/
-
-float128 uint64_to_float128(uint64_t a, float_status *status)
-{
-if (a == 0) {
-

[PULL 10/29] softfloat: Reduce FloatFmt

2021-06-03 Thread Richard Henderson

Remove frac_lsb, frac_lsbm1, roundeven_mask.  Compute
these from round_mask in parts$N_uncanon_normal.

With floatx80, round_mask will not be tied to frac_shift.
Everything else is easily computable.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 29 -
 fpu/softfloat-parts.c.inc |  6 +++---
 2 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index ea7ee13201..741480c568 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -563,9 +563,7 @@ typedef struct {
  *   frac_size: the size of the fraction field
  *   frac_shift: shift to normalise the fraction with DECOMPOSED_BINARY_POINT
  * The following are computed based the size of fraction
- *   frac_lsb: least significant bit of fraction
- *   frac_lsbm1: the bit below the least significant bit (for rounding)
- *   round_mask/roundeven_mask: masks used for rounding
+ *   round_mask: bits below lsb which must be rounded
  * The following optional modifiers are available:
  *   arm_althp: handle ARM Alternative Half Precision
  */
@@ -575,24 +573,21 @@ typedef struct {
 int exp_max;
 int frac_size;
 int frac_shift;
-uint64_t frac_lsb;
-uint64_t frac_lsbm1;
-uint64_t round_mask;
-uint64_t roundeven_mask;
 bool arm_althp;
+uint64_t round_mask;
 } FloatFmt;
 
 /* Expand fields based on the size of exponent and fraction */
-#define FLOAT_PARAMS(E, F)   \
-.exp_size   = E, \
-.exp_bias   = ((1 << E) - 1) >> 1,   \
-.exp_max= (1 << E) - 1,  \
-.frac_size  = F, \
-.frac_shift = (-F - 1) & 63, \
-.frac_lsb   = 1ull << ((-F - 1) & 63),   \
-.frac_lsbm1 = 1ull << ((-F - 2) & 63),   \
-.round_mask = (1ull << ((-F - 1) & 63)) - 1, \
-.roundeven_mask = (2ull << ((-F - 1) & 63)) - 1
+#define FLOAT_PARAMS_(E, F) \
+.exp_size   = E,\
+.exp_bias   = ((1 << E) - 1) >> 1,  \
+.exp_max= (1 << E) - 1, \
+.frac_size  = F
+
+#define FLOAT_PARAMS(E, F)  \
+FLOAT_PARAMS_(E, F),\
+.frac_shift = (-F - 1) & 63,\
+.round_mask = (1ull << ((-F - 1) & 63)) - 1
 
 static const FloatFmt float16_params = {
 FLOAT_PARAMS(5, 10)
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index e05909db8c..a026581c33 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -145,10 +145,10 @@ static void partsN(uncanon_normal)(FloatPartsN *p, 
float_status *s,
 {
 const int exp_max = fmt->exp_max;
 const int frac_shift = fmt->frac_shift;
-const uint64_t frac_lsb = fmt->frac_lsb;
-const uint64_t frac_lsbm1 = fmt->frac_lsbm1;
 const uint64_t round_mask = fmt->round_mask;
-const uint64_t roundeven_mask = fmt->roundeven_mask;
+const uint64_t frac_lsb = round_mask + 1;
+const uint64_t frac_lsbm1 = round_mask ^ (round_mask >> 1);
+const uint64_t roundeven_mask = round_mask | frac_lsb;
 uint64_t inc;
 bool overflow_norm = false;
 int exp, flags = 0;
-- 
2.25.1

[PULL 08/29] softfloat: Move sqrt_float to softfloat-parts.c.inc

2021-06-03 Thread Richard Henderson

Rename to parts$N_sqrt.
Reimplement float128_sqrt with FloatParts128.

Reimplement with the inverse sqrt newton-raphson algorithm from musl.
This is significantly faster than even the berkeley sqrt n-r algorithm,
because it does not use division instructions, only multiplication.

Ordinarily, changing algorithms at the same time as migrating code is
a bad idea, but this is the only way I found that didn't break one of
the routines at the same time.

Tested-by: Alex Bennée 
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 207 ++
 fpu/softfloat-parts.c.inc | 206 +
 2 files changed, 261 insertions(+), 152 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 666b5a25d6..0f2eed8d29 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -820,6 +820,12 @@ static FloatParts128 *parts128_div(FloatParts128 *a, 
FloatParts128 *b,
 #define parts_div(A, B, S) \
 PARTS_GENERIC_64_128(div, A)(A, B, S)
 
+static void parts64_sqrt(FloatParts64 *a, float_status *s, const FloatFmt *f);
+static void parts128_sqrt(FloatParts128 *a, float_status *s, const FloatFmt 
*f);
+
+#define parts_sqrt(A, S, F) \
+PARTS_GENERIC_64_128(sqrt, A)(A, S, F)
+
 static bool parts64_round_to_int_normal(FloatParts64 *a, FloatRoundMode rm,
 int scale, int frac_size);
 static bool parts128_round_to_int_normal(FloatParts128 *a, FloatRoundMode r,
@@ -1386,6 +1392,30 @@ static void frac128_widen(FloatParts256 *r, 
FloatParts128 *a)
 
 #define frac_widen(A, B)  FRAC_GENERIC_64_128(widen, B)(A, B)
 
+/*
+ * Reciprocal sqrt table.  1 bit of exponent, 6-bits of mantessa.
+ * From https://git.musl-libc.org/cgit/musl/tree/src/math/sqrt_data.c
+ * and thus MIT licenced.
+ */
+static const uint16_t rsqrt_tab[128] = {
+0xb451, 0xb2f0, 0xb196, 0xb044, 0xaef9, 0xadb6, 0xac79, 0xab43,
+0xaa14, 0xa8eb, 0xa7c8, 0xa6aa, 0xa592, 0xa480, 0xa373, 0xa26b,
+0xa168, 0xa06a, 0x9f70, 0x9e7b, 0x9d8a, 0x9c9d, 0x9bb5, 0x9ad1,
+0x99f0, 0x9913, 0x983a, 0x9765, 0x9693, 0x95c4, 0x94f8, 0x9430,
+0x936b, 0x92a9, 0x91ea, 0x912e, 0x9075, 0x8fbe, 0x8f0a, 0x8e59,
+0x8daa, 0x8cfe, 0x8c54, 0x8bac, 0x8b07, 0x8a64, 0x89c4, 0x8925,
+0x8889, 0x87ee, 0x8756, 0x86c0, 0x862b, 0x8599, 0x8508, 0x8479,
+0x83ec, 0x8361, 0x82d8, 0x8250, 0x81c9, 0x8145, 0x80c2, 0x8040,
+0xff02, 0xfd0e, 0xfb25, 0xf947, 0xf773, 0xf5aa, 0xf3ea, 0xf234,
+0xf087, 0xeee3, 0xed47, 0xebb3, 0xea27, 0xe8a3, 0xe727, 0xe5b2,
+0xe443, 0xe2dc, 0xe17a, 0xe020, 0xdecb, 0xdd7d, 0xdc34, 0xdaf1,
+0xd9b3, 0xd87b, 0xd748, 0xd61a, 0xd4f1, 0xd3cd, 0xd2ad, 0xd192,
+0xd07b, 0xcf69, 0xce5b, 0xcd51, 0xcc4a, 0xcb48, 0xca4a, 0xc94f,
+0xc858, 0xc764, 0xc674, 0xc587, 0xc49d, 0xc3b7, 0xc2d4, 0xc1f4,
+0xc116, 0xc03c, 0xbf65, 0xbe90, 0xbdbe, 0xbcef, 0xbc23, 0xbb59,
+0xba91, 0xb9cc, 0xb90a, 0xb84a, 0xb78c, 0xb6d0, 0xb617, 0xb560,
+};
+
 #define partsN(NAME)   glue(glue(glue(parts,N),_),NAME)
 #define FloatPartsNglue(FloatParts,N)
 #define FloatPartsWglue(FloatParts,W)
@@ -3586,103 +3616,35 @@ float128 float128_scalbn(float128 a, int n, 
float_status *status)
 
 /*
  * Square Root
- *
- * The old softfloat code did an approximation step before zeroing in
- * on the final result. However for simpleness we just compute the
- * square root by iterating down from the implicit bit to enough extra
- * bits to ensure we get a correctly rounded result.
- *
- * This does mean however the calculation is slower than before,
- * especially for 64 bit floats.
  */
 
-static FloatParts64 sqrt_float(FloatParts64 a, float_status *s, const FloatFmt 
*p)
-{
-uint64_t a_frac, r_frac, s_frac;
-int bit, last_bit;
-
-if (is_nan(a.cls)) {
-parts_return_nan(, s);
-return a;
-}
-if (a.cls == float_class_zero) {
-return a;  /* sqrt(+-0) = +-0 */
-}
-if (a.sign) {
-float_raise(float_flag_invalid, s);
-parts_default_nan(, s);
-return a;
-}
-if (a.cls == float_class_inf) {
-return a;  /* sqrt(+inf) = +inf */
-}
-
-assert(a.cls == float_class_normal);
-
-/* We need two overflow bits at the top. Adding room for that is a
- * right shift. If the exponent is odd, we can discard the low bit
- * by multiplying the fraction by 2; that's a left shift. Combine
- * those and we shift right by 1 if the exponent is odd, otherwise 2.
- */
-a_frac = a.frac >> (2 - (a.exp & 1));
-a.exp >>= 1;
-
-/* Bit-by-bit computation of sqrt.  */
-r_frac = 0;
-s_frac = 0;
-
-/* Iterate from implicit bit down to the 3 extra bits to compute a
- * properly rounded result. Remember we've inserted two more bits
- * at the top, so these positions are two less.
- */
-bit = DECOMPOSED_BINARY_POINT - 2;
-last_bit = MAX(p->frac_shift - 4, 0);
-do {
-uint64_t q = 1ULL << bit;
-uint64_t t_frac =

[PULL 07/29] softfloat: Move scalbn_decomposed to softfloat-parts.c.inc

2021-06-03 Thread Richard Henderson

Rename to parts$N_scalbn.
Reimplement float128_scalbn with FloatParts128.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 103 +-
 fpu/softfloat-parts.c.inc |  21 
 2 files changed, 55 insertions(+), 69 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 6f1bbbe6cf..666b5a25d6 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -890,6 +890,12 @@ static int parts128_compare(FloatParts128 *a, 
FloatParts128 *b,
 #define parts_compare(A, B, S, Q) \
 PARTS_GENERIC_64_128(compare, A)(A, B, S, Q)
 
+static void parts64_scalbn(FloatParts64 *a, int n, float_status *s);
+static void parts128_scalbn(FloatParts128 *a, int n, float_status *s);
+
+#define parts_scalbn(A, N, S) \
+PARTS_GENERIC_64_128(scalbn, A)(A, N, S)
+
 /*
  * Helper functions for softfloat-parts.c.inc, per-size operations.
  */
@@ -3529,58 +3535,53 @@ FloatRelation float128_compare_quiet(float128 a, 
float128 b, float_status *s)
 return float128_do_compare(a, b, s, true);
 }
 
-/* Multiply A by 2 raised to the power N.  */
-static FloatParts64 scalbn_decomposed(FloatParts64 a, int n, float_status *s)
-{
-if (unlikely(is_nan(a.cls))) {
-parts_return_nan(, s);
-}
-if (a.cls == float_class_normal) {
-/* The largest float type (even though not supported by FloatParts64)
- * is float128, which has a 15 bit exponent.  Bounding N to 16 bits
- * still allows rounding to infinity, without allowing overflow
- * within the int32_t that backs FloatParts64.exp.
- */
-n = MIN(MAX(n, -0x1), 0x1);
-a.exp += n;
-}
-return a;
-}
+/*
+ * Scale by 2**N
+ */
 
 float16 float16_scalbn(float16 a, int n, float_status *status)
 {
-FloatParts64 pa, pr;
+FloatParts64 p;
 
-float16_unpack_canonical(, a, status);
-pr = scalbn_decomposed(pa, n, status);
-return float16_round_pack_canonical(, status);
+float16_unpack_canonical(, a, status);
+parts_scalbn(, n, status);
+return float16_round_pack_canonical(, status);
 }
 
 float32 float32_scalbn(float32 a, int n, float_status *status)
 {
-FloatParts64 pa, pr;
+FloatParts64 p;
 
-float32_unpack_canonical(, a, status);
-pr = scalbn_decomposed(pa, n, status);
-return float32_round_pack_canonical(, status);
+float32_unpack_canonical(, a, status);
+parts_scalbn(, n, status);
+return float32_round_pack_canonical(, status);
 }
 
 float64 float64_scalbn(float64 a, int n, float_status *status)
 {
-FloatParts64 pa, pr;
+FloatParts64 p;
 
-float64_unpack_canonical(, a, status);
-pr = scalbn_decomposed(pa, n, status);
-return float64_round_pack_canonical(, status);
+float64_unpack_canonical(, a, status);
+parts_scalbn(, n, status);
+return float64_round_pack_canonical(, status);
 }
 
 bfloat16 bfloat16_scalbn(bfloat16 a, int n, float_status *status)
 {
-FloatParts64 pa, pr;
+FloatParts64 p;
 
-bfloat16_unpack_canonical(, a, status);
-pr = scalbn_decomposed(pa, n, status);
-return bfloat16_round_pack_canonical(, status);
+bfloat16_unpack_canonical(, a, status);
+parts_scalbn(, n, status);
+return bfloat16_round_pack_canonical(, status);
+}
+
+float128 float128_scalbn(float128 a, int n, float_status *status)
+{
+FloatParts128 p;
+
+float128_unpack_canonical(, a, status);
+parts_scalbn(, n, status);
+return float128_round_pack_canonical(, status);
 }
 
 /*
@@ -6638,42 +6639,6 @@ floatx80 floatx80_scalbn(floatx80 a, int n, float_status 
*status)
  aSign, aExp, aSig, 0, status);
 }
 
-float128 float128_scalbn(float128 a, int n, float_status *status)
-{
-bool aSign;
-int32_t aExp;
-uint64_t aSig0, aSig1;
-
-aSig1 = extractFloat128Frac1( a );
-aSig0 = extractFloat128Frac0( a );
-aExp = extractFloat128Exp( a );
-aSign = extractFloat128Sign( a );
-if ( aExp == 0x7FFF ) {
-if ( aSig0 | aSig1 ) {
-return propagateFloat128NaN(a, a, status);
-}
-return a;
-}
-if (aExp != 0) {
-aSig0 |= UINT64_C(0x0001);
-} else if (aSig0 == 0 && aSig1 == 0) {
-return a;
-} else {
-aExp++;
-}
-
-if (n > 0x1) {
-n = 0x1;
-} else if (n < -0x1) {
-n = -0x1;
-}
-
-aExp += n - 1;
-return normalizeRoundAndPackFloat128( aSign, aExp, aSig0, aSig1
- , status);
-
-}
-
 static void __attribute__((constructor)) softfloat_init(void)
 {
 union_float64 ua, ub, uc, ur;
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index 3dacb5b4f0..bf935c4fc2 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -1075,3 +1075,24 @@ static FloatRelation partsN(compare)(FloatPartsN *a, 
FloatPartsN *b,
  b_sign:
 return b->sign ? float_relation_greater :

[PULL 01/29] softfloat: Move round_to_uint_and_pack to softfloat-parts.c.inc

2021-06-03 Thread Richard Henderson

Rename to parts$N_float_to_uint.  Reimplement
float128_to_uint{32,64}{_round_to_zero} with FloatParts128.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 357 +-
 fpu/softfloat-parts.c.inc |  68 +++-
 2 files changed, 147 insertions(+), 278 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 0dc2203477..3181678ea9 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -839,6 +839,16 @@ static int64_t parts128_float_to_sint(FloatParts128 *p, 
FloatRoundMode rmode,
 #define parts_float_to_sint(P, R, Z, MN, MX, S) \
 PARTS_GENERIC_64_128(float_to_sint, P)(P, R, Z, MN, MX, S)
 
+static uint64_t parts64_float_to_uint(FloatParts64 *p, FloatRoundMode rmode,
+  int scale, uint64_t max,
+  float_status *s);
+static uint64_t parts128_float_to_uint(FloatParts128 *p, FloatRoundMode rmode,
+   int scale, uint64_t max,
+   float_status *s);
+
+#define parts_float_to_uint(P, R, Z, M, S) \
+PARTS_GENERIC_64_128(float_to_uint, P)(P, R, Z, M, S)
+
 /*
  * Helper functions for softfloat-parts.c.inc, per-size operations.
  */
@@ -2646,80 +2656,16 @@ int64_t bfloat16_to_int64_round_to_zero(bfloat16 a, 
float_status *s)
 }
 
 /*
- *  Returns the result of converting the floating-point value `a' to
- *  the unsigned integer format. The conversion is performed according
- *  to the IEC/IEEE Standard for Binary Floating-Point
- *  Arithmetic---which means in particular that the conversion is
- *  rounded according to the current rounding mode. If `a' is a NaN,
- *  the largest unsigned integer is returned. Otherwise, if the
- *  conversion overflows, the largest unsigned integer is returned. If
- *  the 'a' is negative, the result is rounded and zero is returned;
- *  values that do not round to zero will raise the inexact exception
- *  flag.
+ * Floating-point to unsigned integer conversions
  */
 
-static uint64_t round_to_uint_and_pack(FloatParts64 p, FloatRoundMode rmode,
-   int scale, uint64_t max,
-   float_status *s)
-{
-int flags = 0;
-uint64_t r;
-
-switch (p.cls) {
-case float_class_snan:
-case float_class_qnan:
-flags = float_flag_invalid;
-r = max;
-break;
-
-case float_class_inf:
-flags = float_flag_invalid;
-r = p.sign ? 0 : max;
-break;
-
-case float_class_zero:
-return 0;
-
-case float_class_normal:
-/* TODO: 62 = N - 2, frac_size for rounding */
-if (parts_round_to_int_normal(, rmode, scale, 62)) {
-flags = float_flag_inexact;
-if (p.cls == float_class_zero) {
-r = 0;
-break;
-}
-}
-
-if (p.sign) {
-flags = float_flag_invalid;
-r = 0;
-} else if (p.exp > DECOMPOSED_BINARY_POINT) {
-flags = float_flag_invalid;
-r = max;
-} else {
-r = p.frac >> (DECOMPOSED_BINARY_POINT - p.exp);
-if (r > max) {
-flags = float_flag_invalid;
-r = max;
-}
-}
-break;
-
-default:
-g_assert_not_reached();
-}
-
-float_raise(flags, s);
-return r;
-}
-
 uint8_t float16_to_uint8_scalbn(float16 a, FloatRoundMode rmode, int scale,
 float_status *s)
 {
 FloatParts64 p;
 
 float16_unpack_canonical(, a, s);
-return round_to_uint_and_pack(p, rmode, scale, UINT8_MAX, s);
+return parts_float_to_uint(, rmode, scale, UINT8_MAX, s);
 }
 
 uint16_t float16_to_uint16_scalbn(float16 a, FloatRoundMode rmode, int scale,
@@ -2728,7 +2674,7 @@ uint16_t float16_to_uint16_scalbn(float16 a, 
FloatRoundMode rmode, int scale,
 FloatParts64 p;
 
 float16_unpack_canonical(, a, s);
-return round_to_uint_and_pack(p, rmode, scale, UINT16_MAX, s);
+return parts_float_to_uint(, rmode, scale, UINT16_MAX, s);
 }
 
 uint32_t float16_to_uint32_scalbn(float16 a, FloatRoundMode rmode, int scale,
@@ -2737,7 +2683,7 @@ uint32_t float16_to_uint32_scalbn(float16 a, 
FloatRoundMode rmode, int scale,
 FloatParts64 p;
 
 float16_unpack_canonical(, a, s);
-return round_to_uint_and_pack(p, rmode, scale, UINT32_MAX, s);
+return parts_float_to_uint(, rmode, scale, UINT32_MAX, s);
 }
 
 uint64_t float16_to_uint64_scalbn(float16 a, FloatRoundMode rmode, int scale,
@@ -2746,7 +2692,7 @@ uint64_t float16_to_uint64_scalbn(float16 a, 
FloatRoundMode rmode, int scale,
 FloatParts64 p;
 
 float16_unpack_canonical(, a, s);
-return round_to_uint_and_pack(p, rmode, scale, UINT64_MAX, s);
+return parts_float_to_uint(, rmode, scale, UINT64_MAX, s);
 }
 
 uint16_t float32_to_uint16_scalbn(float32 a, FloatRoundMode rmode, int scale,
@@

[PULL 04/29] softfloat: Move minmax_flags to softfloat-parts.c.inc

2021-06-03 Thread Richard Henderson

Rename to parts$N_minmax.  Combine 3 bool arguments to a bitmask.
Introduce ftype_minmax functions as a common optimization point.
Fold bfloat16 expansions into the same macro as the other types.

Reviewed-by: David Hildenbrand 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 198 ++
 fpu/softfloat-parts.c.inc |  80 +++
 2 files changed, 152 insertions(+), 126 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index db14bd09aa..ef750e1e95 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -482,6 +482,15 @@ enum {
 float_cmask_anynan  = float_cmask_qnan | float_cmask_snan,
 };
 
+/* Flags for parts_minmax. */
+enum {
+/* Set for minimum; clear for maximum. */
+minmax_ismin = 1,
+/* Set for the IEEE 754-2008 minNum() and maxNum() operations. */
+minmax_isnum = 2,
+/* Set for the IEEE 754-2008 minNumMag() and minNumMag() operations. */
+minmax_ismag = 4,
+};
 
 /* Simple helpers for checking if, or what kind of, NaN we have */
 static inline __attribute__((unused)) bool is_nan(FloatClass c)
@@ -865,6 +874,14 @@ static void parts128_uint_to_float(FloatParts128 *p, 
uint64_t a,
 #define parts_uint_to_float(P, I, Z, S) \
 PARTS_GENERIC_64_128(uint_to_float, P)(P, I, Z, S)
 
+static FloatParts64 *parts64_minmax(FloatParts64 *a, FloatParts64 *b,
+float_status *s, int flags);
+static FloatParts128 *parts128_minmax(FloatParts128 *a, FloatParts128 *b,
+  float_status *s, int flags);
+
+#define parts_minmax(A, B, S, F) \
+PARTS_GENERIC_64_128(minmax, A)(A, B, S, F)
+
 /*
  * Helper functions for softfloat-parts.c.inc, per-size operations.
  */
@@ -3258,145 +3275,74 @@ float128 uint64_to_float128(uint64_t a, float_status 
*status)
 return float128_round_pack_canonical(, status);
 }
 
-/* Float Min/Max */
-/* min() and max() functions. These can't be implemented as
- * 'compare and pick one input' because that would mishandle
- * NaNs and +0 vs -0.
- *
- * minnum() and maxnum() functions. These are similar to the min()
- * and max() functions but if one of the arguments is a QNaN and
- * the other is numerical then the numerical argument is returned.
- * SNaNs will get quietened before being returned.
- * minnum() and maxnum correspond to the IEEE 754-2008 minNum()
- * and maxNum() operations. min() and max() are the typical min/max
- * semantics provided by many CPUs which predate that specification.
- *
- * minnummag() and maxnummag() functions correspond to minNumMag()
- * and minNumMag() from the IEEE-754 2008.
+/*
+ * Minimum and maximum
  */
-static FloatParts64 minmax_floats(FloatParts64 a, FloatParts64 b, bool ismin,
-bool ieee, bool ismag, float_status *s)
+
+static float16 float16_minmax(float16 a, float16 b, float_status *s, int flags)
 {
-if (unlikely(is_nan(a.cls) || is_nan(b.cls))) {
-if (ieee) {
-/* Takes two floating-point values `a' and `b', one of
- * which is a NaN, and returns the appropriate NaN
- * result. If either `a' or `b' is a signaling NaN,
- * the invalid exception is raised.
- */
-if (is_snan(a.cls) || is_snan(b.cls)) {
-return *parts_pick_nan(, , s);
-} else if (is_nan(a.cls) && !is_nan(b.cls)) {
-return b;
-} else if (is_nan(b.cls) && !is_nan(a.cls)) {
-return a;
-}
-}
-return *parts_pick_nan(, , s);
-} else {
-int a_exp, b_exp;
+FloatParts64 pa, pb, *pr;
 
-switch (a.cls) {
-case float_class_normal:
-a_exp = a.exp;
-break;
-case float_class_inf:
-a_exp = INT_MAX;
-break;
-case float_class_zero:
-a_exp = INT_MIN;
-break;
-default:
-g_assert_not_reached();
-break;
-}
-switch (b.cls) {
-case float_class_normal:
-b_exp = b.exp;
-break;
-case float_class_inf:
-b_exp = INT_MAX;
-break;
-case float_class_zero:
-b_exp = INT_MIN;
-break;
-default:
-g_assert_not_reached();
-break;
-}
+float16_unpack_canonical(, a, s);
+float16_unpack_canonical(, b, s);
+pr = parts_minmax(, , s, flags);
 
-if (ismag && (a_exp != b_exp || a.frac != b.frac)) {
-bool a_less = a_exp < b_exp;
-if (a_exp == b_exp) {
-a_less = a.frac < b.frac;
-}
-return a_less ^ ismin ? b : a;
-}
-
-if (a.sign == b.sign) {
-bool a_less = a_exp < b_exp;
-if (a_exp == b_exp) {
-a_less = a.frac < b.frac;
-}
-return a.sign ^ a_less ^ ismin ? b : a;
-} else {
-

[PULL 09/29] softfloat: Split out parts_uncanon_normal

2021-06-03 Thread Richard Henderson

We will need to treat the non-normal cases of floatx80 specially,
so split out the normal case that we can reuse.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   |  8 ++
 fpu/softfloat-parts.c.inc | 59 +--
 2 files changed, 40 insertions(+), 27 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 0f2eed8d29..ea7ee13201 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -764,6 +764,14 @@ static void parts128_canonicalize(FloatParts128 *p, 
float_status *status,
 #define parts_canonicalize(A, S, F) \
 PARTS_GENERIC_64_128(canonicalize, A)(A, S, F)
 
+static void parts64_uncanon_normal(FloatParts64 *p, float_status *status,
+   const FloatFmt *fmt);
+static void parts128_uncanon_normal(FloatParts128 *p, float_status *status,
+const FloatFmt *fmt);
+
+#define parts_uncanon_normal(A, S, F) \
+PARTS_GENERIC_64_128(uncanon_normal, A)(A, S, F)
+
 static void parts64_uncanon(FloatParts64 *p, float_status *status,
 const FloatFmt *fmt);
 static void parts128_uncanon(FloatParts128 *p, float_status *status,
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index d69f357352..e05909db8c 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -140,8 +140,8 @@ static void partsN(canonicalize)(FloatPartsN *p, 
float_status *status,
  * fraction; these bits will be removed. The exponent will be biased
  * by EXP_BIAS and must be bounded by [EXP_MAX-1, 0].
  */
-static void partsN(uncanon)(FloatPartsN *p, float_status *s,
-const FloatFmt *fmt)
+static void partsN(uncanon_normal)(FloatPartsN *p, float_status *s,
+   const FloatFmt *fmt)
 {
 const int exp_max = fmt->exp_max;
 const int frac_shift = fmt->frac_shift;
@@ -150,33 +150,9 @@ static void partsN(uncanon)(FloatPartsN *p, float_status 
*s,
 const uint64_t round_mask = fmt->round_mask;
 const uint64_t roundeven_mask = fmt->roundeven_mask;
 uint64_t inc;
-bool overflow_norm;
+bool overflow_norm = false;
 int exp, flags = 0;
 
-if (unlikely(p->cls != float_class_normal)) {
-switch (p->cls) {
-case float_class_zero:
-p->exp = 0;
-frac_clear(p);
-return;
-case float_class_inf:
-g_assert(!fmt->arm_althp);
-p->exp = fmt->exp_max;
-frac_clear(p);
-return;
-case float_class_qnan:
-case float_class_snan:
-g_assert(!fmt->arm_althp);
-p->exp = fmt->exp_max;
-frac_shr(p, fmt->frac_shift);
-return;
-default:
-break;
-}
-g_assert_not_reached();
-}
-
-overflow_norm = false;
 switch (s->float_rounding_mode) {
 case float_round_nearest_even:
 inc = ((p->frac_lo & roundeven_mask) != frac_lsbm1 ? frac_lsbm1 : 0);
@@ -284,6 +260,35 @@ static void partsN(uncanon)(FloatPartsN *p, float_status 
*s,
 float_raise(flags, s);
 }
 
+static void partsN(uncanon)(FloatPartsN *p, float_status *s,
+const FloatFmt *fmt)
+{
+if (likely(p->cls == float_class_normal)) {
+parts_uncanon_normal(p, s, fmt);
+} else {
+switch (p->cls) {
+case float_class_zero:
+p->exp = 0;
+frac_clear(p);
+return;
+case float_class_inf:
+g_assert(!fmt->arm_althp);
+p->exp = fmt->exp_max;
+frac_clear(p);
+return;
+case float_class_qnan:
+case float_class_snan:
+g_assert(!fmt->arm_althp);
+p->exp = fmt->exp_max;
+frac_shr(p, fmt->frac_shift);
+return;
+default:
+break;
+}
+g_assert_not_reached();
+}
+}
+
 /*
  * Returns the result of adding or subtracting the values of the
  * floating-point values `a' and `b'. The operation is performed
-- 
2.25.1

[PULL 02/29] softfloat: Move int_to_float to softfloat-parts.c.inc

2021-06-03 Thread Richard Henderson

Rename to parts$N_sint_to_float.
Reimplement int{32,64}_to_float128 with FloatParts128.

Reviewed-by: Alex Bennée 
Reviewed-by: David Hildenbrand 
Signed-off-by: Richard Henderson 
---
 fpu/softfloat.c   | 136 +++---
 fpu/softfloat-parts.c.inc |  32 +
 2 files changed, 70 insertions(+), 98 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 3181678ea9..6404a2997f 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -849,6 +849,14 @@ static uint64_t parts128_float_to_uint(FloatParts128 *p, 
FloatRoundMode rmode,
 #define parts_float_to_uint(P, R, Z, M, S) \
 PARTS_GENERIC_64_128(float_to_uint, P)(P, R, Z, M, S)
 
+static void parts64_sint_to_float(FloatParts64 *p, int64_t a,
+  int scale, float_status *s);
+static void parts128_sint_to_float(FloatParts128 *p, int64_t a,
+   int scale, float_status *s);
+
+#define parts_sint_to_float(P, I, Z, S) \
+PARTS_GENERIC_64_128(sint_to_float, P)(P, I, Z, S)
+
 /*
  * Helper functions for softfloat-parts.c.inc, per-size operations.
  */
@@ -2940,42 +2948,15 @@ uint64_t bfloat16_to_uint64_round_to_zero(bfloat16 a, 
float_status *s)
 }
 
 /*
- * Integer to float conversions
- *
- * Returns the result of converting the two's complement integer `a'
- * to the floating-point format. The conversion is performed according
- * to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+ * Signed integer to floating-point conversions
  */
 
-static FloatParts64 int_to_float(int64_t a, int scale, float_status *status)
-{
-FloatParts64 r = { .sign = false };
-
-if (a == 0) {
-r.cls = float_class_zero;
-} else {
-uint64_t f = a;
-int shift;
-
-r.cls = float_class_normal;
-if (a < 0) {
-f = -f;
-r.sign = true;
-}
-shift = clz64(f);
-scale = MIN(MAX(scale, -0x1), 0x1);
-
-r.exp = DECOMPOSED_BINARY_POINT - shift + scale;
-r.frac = f << shift;
-}
-
-return r;
-}
-
 float16 int64_to_float16_scalbn(int64_t a, int scale, float_status *status)
 {
-FloatParts64 pa = int_to_float(a, scale, status);
-return float16_round_pack_canonical(, status);
+FloatParts64 p;
+
+parts_sint_to_float(, a, scale, status);
+return float16_round_pack_canonical(, status);
 }
 
 float16 int32_to_float16_scalbn(int32_t a, int scale, float_status *status)
@@ -3010,8 +2991,10 @@ float16 int8_to_float16(int8_t a, float_status *status)
 
 float32 int64_to_float32_scalbn(int64_t a, int scale, float_status *status)
 {
-FloatParts64 pa = int_to_float(a, scale, status);
-return float32_round_pack_canonical(, status);
+FloatParts64 p;
+
+parts64_sint_to_float(, a, scale, status);
+return float32_round_pack_canonical(, status);
 }
 
 float32 int32_to_float32_scalbn(int32_t a, int scale, float_status *status)
@@ -3041,8 +3024,10 @@ float32 int16_to_float32(int16_t a, float_status *status)
 
 float64 int64_to_float64_scalbn(int64_t a, int scale, float_status *status)
 {
-FloatParts64 pa = int_to_float(a, scale, status);
-return float64_round_pack_canonical(, status);
+FloatParts64 p;
+
+parts_sint_to_float(, a, scale, status);
+return float64_round_pack_canonical(, status);
 }
 
 float64 int32_to_float64_scalbn(int32_t a, int scale, float_status *status)
@@ -3070,15 +3055,12 @@ float64 int16_to_float64(int16_t a, float_status 
*status)
 return int64_to_float64_scalbn(a, 0, status);
 }
 
-/*
- * Returns the result of converting the two's complement integer `a'
- * to the bfloat16 format.
- */
-
 bfloat16 int64_to_bfloat16_scalbn(int64_t a, int scale, float_status *status)
 {
-FloatParts64 pa = int_to_float(a, scale, status);
-return bfloat16_round_pack_canonical(, status);
+FloatParts64 p;
+
+parts_sint_to_float(, a, scale, status);
+return bfloat16_round_pack_canonical(, status);
 }
 
 bfloat16 int32_to_bfloat16_scalbn(int32_t a, int scale, float_status *status)
@@ -3106,6 +3088,19 @@ bfloat16 int16_to_bfloat16(int16_t a, float_status 
*status)
 return int64_to_bfloat16_scalbn(a, 0, status);
 }
 
+float128 int64_to_float128(int64_t a, float_status *status)
+{
+FloatParts128 p;
+
+parts_sint_to_float(, a, 0, status);
+return float128_round_pack_canonical(, status);
+}
+
+float128 int32_to_float128(int32_t a, float_status *status)
+{
+return int64_to_float128(a, status);
+}
+
 /*
  * Unsigned Integer to float conversions
  *
@@ -4956,28 +4951,6 @@ floatx80 int32_to_floatx80(int32_t a, float_status 
*status)
 
 }
 
-/*
-| Returns the result of converting the 32-bit two's complement integer `a' to
-| the quadruple-precision floating-point format.  The conversion is performed
-| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.

[PULL 00/29] softfloat patch queue

2021-06-03 Thread Richard Henderson

The following changes since commit 453d9c61dd5681159051c6e4d07e7b2633de2e70:

  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210603' 
into staging (2021-06-03 16:59:46 +0100)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-fpu-20210603

for you to fetch changes up to 5d0204b82ade0ea0630d6add894954135ee54ab1:

  softfloat: Use hard-float for {u}int64_to_float{32,64} (2021-06-03 14:09:03 
-0700)


Finish conversion of float128 and floatx80 to FloatParts.
Implement float128_muladd and float128_{min,max}*.
Optimize int-to-float conversion with hard-float.


Alex Bennée (1):
  tests/fp: Enable more tests

David Hildenbrand (1):
  softfloat: Implement float128_(min|minnum|minnummag|max|maxnum|maxnummag)

Richard Henderson (27):
  softfloat: Move round_to_uint_and_pack to softfloat-parts.c.inc
  softfloat: Move int_to_float to softfloat-parts.c.inc
  softfloat: Move uint_to_float to softfloat-parts.c.inc
  softfloat: Move minmax_flags to softfloat-parts.c.inc
  softfloat: Move compare_floats to softfloat-parts.c.inc
  softfloat: Move scalbn_decomposed to softfloat-parts.c.inc
  softfloat: Move sqrt_float to softfloat-parts.c.inc
  softfloat: Split out parts_uncanon_normal
  softfloat: Reduce FloatFmt
  softfloat: Introduce Floatx80RoundPrec
  softfloat: Adjust parts_uncanon_normal for floatx80
  tests/fp/fp-test: Reverse order of floatx80 precision tests
  softfloat: Convert floatx80_add/sub to FloatParts
  softfloat: Convert floatx80_mul to FloatParts
  softfloat: Convert floatx80_div to FloatParts
  softfloat: Convert floatx80_sqrt to FloatParts
  softfloat: Convert floatx80_round to FloatParts
  softfloat: Convert floatx80_round_to_int to FloatParts
  softfloat: Convert integer to floatx80 to FloatParts
  softfloat: Convert floatx80 float conversions to FloatParts
  softfloat: Convert floatx80 to integer to FloatParts
  softfloat: Convert floatx80_scalbn to FloatParts
  softfloat: Convert floatx80 compare to FloatParts
  softfloat: Convert float32_exp2 to FloatParts
  softfloat: Move floatN_log2 to softfloat-parts.c.inc
  softfloat: Convert modrem operations to FloatParts
  softfloat: Use hard-float for {u}int64_to_float{32,64}

 include/fpu/softfloat-helpers.h |5 +-
 include/fpu/softfloat-macros.h  |   34 +
 include/fpu/softfloat-types.h   |   10 +-
 include/fpu/softfloat.h |   10 +-
 fpu/softfloat.c | 4429 ++-
 linux-user/arm/nwfpe/fpa11.c|   41 +-
 target/i386/tcg/fpu_helper.c|   79 +-
 target/m68k/fpu_helper.c|   50 +-
 target/m68k/softfloat.c |   90 +-
 tests/fp/fp-test-log2.c |  118 ++
 tests/fp/fp-test.c  |9 +-
 fpu/softfloat-parts.c.inc   |  903 +++-
 fpu/softfloat-specialize.c.inc  |  340 ---
 tests/fp/wrap.c.inc |2 +-
 tests/fp/meson.build|   27 +-
 15 files changed, 2249 insertions(+), 3898 deletions(-)
 create mode 100644 tests/fp/fp-test-log2.c

Re: [PATCH v3 33/33] block/nbd: drop connection_co

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:11AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> OK, that's a big rewrite of the logic.
> 
> Pre-patch we have an always running coroutine - connection_co. It does
> reply receiving and reconnecting. And it leads to a lot of difficult
> and unobvious code around drained sections and context switch. We also
> abuse bs->in_flight counter which is increased for connection_co and
> temporary decreased in points where we want to allow drained section to
> begin. One of these place is in another file: in nbd_read_eof() in
> nbd/client.c.
> 
> We also cancel reconnect and requests waiting for reconnect on drained
> begin which is not correct.
> 
> Let's finally drop this always running coroutine and go another way:
> 
> 1. reconnect_attempt() goes to nbd_co_send_request and called under
>send_mutex
> 
> 2. We do receive headers in request coroutine. But we also should
>dispatch replies for another pending requests. So,
>nbd_connection_entry() is turned into nbd_receive_replies(), which
>does reply dispatching until it receive another request headers, and
>returns when it receive the requested header.
> 
> 3. All old staff around drained sections and context switch is dropped.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c  | 376 ---
>  nbd/client.c |   2 -
>  2 files changed, 119 insertions(+), 259 deletions(-)
> 

> -static coroutine_fn void nbd_connection_entry(void *opaque)
> +static coroutine_fn void nbd_receive_replies(BDRVNBDState *s, uint64_t 
> handle)
>  {
> -BDRVNBDState *s = opaque;
>  uint64_t i;
>  int ret = 0;
>  Error *local_err = NULL;
>  
> -while (qatomic_load_acquire(>state) != NBD_CLIENT_QUIT) {
> -/*
> - * The NBD client can only really be considered idle when it has
> - * yielded from qio_channel_readv_all_eof(), waiting for data. This 
> is
> - * the point where the additional scheduled coroutine entry happens
> - * after nbd_client_attach_aio_context().
> - *
> - * Therefore we keep an additional in_flight reference all the time 
> and
> - * only drop it temporarily here.
> - */
> +i = HANDLE_TO_INDEX(s, handle);
> +if (s->receive_co) {
> +assert(s->receive_co != qemu_coroutine_self());
>  
> -if (nbd_client_connecting(s)) {
> -nbd_co_reconnect_loop(s);
> -}
> +/* Another request coroutine is receiving now */
> +s->requests[i].receiving = true;
> +qemu_coroutine_yield();
> +assert(!s->requests[i].receiving);
>  
> -if (!nbd_client_connected(s)) {
> -continue;
> +if (s->receive_co != qemu_coroutine_self()) {
> +/*
> + * We are either failed or done, caller uses 
> nbd_client_connected()
> + * to distinguish.
> + */
> +return;
>  }
> +}
> +
> +assert(s->receive_co == 0 || s->receive_co == qemu_coroutine_self());

s/0/NULL/ here

> +s->receive_co = qemu_coroutine_self();
>  
> +while (nbd_client_connected(s)) {
>  assert(s->reply.handle == 0);
>  ret = nbd_receive_reply(s->bs, s->ioc, >reply, _err);
>  
> @@ -522,8 +380,21 @@ static coroutine_fn void nbd_connection_entry(void 
> *opaque)
>  local_err = NULL;
>  }
>  if (ret <= 0) {
> -nbd_channel_error(s, ret ? ret : -EIO);
> -continue;
> +ret = ret ? ret : -EIO;
> +nbd_channel_error(s, ret);
> +goto out;
> +}
> +
> +if (!nbd_client_connected(s)) {
> +ret = -EIO;
> +goto out;
> +}
> +
> +i = HANDLE_TO_INDEX(s, s->reply.handle);
> +
> +if (s->reply.handle == handle) {
> +ret = 0;
> +goto out;
>  }
>  
>  /*

I know your followup said there is more work to do before v4, but I
look forward to seeing it.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 32/33] block/nbd: safer transition to receiving request

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:10AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> req->receiving is a flag of request being in one concrete yield point
> in nbd_co_do_receive_one_chunk().
> 
> Such kind of boolean flag is always better to unset before scheduling
> the coroutine, to avoid double scheduling. So, let's be more careful.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 

> @@ -614,7 +616,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
>  if (qiov) {
>  qio_channel_set_cork(s->ioc, true);
>  rc = nbd_send_request(s->ioc, request);
> -if (nbd_clinet_connected(s) && rc >= 0) {
> +if (nbd_client_connected(s) && rc >= 0) {

Ouch - typo fix in clinet seems unrelated in this fix; please hoist it
into the correct point in the series so that we don't have the typo in
the first place.

Otherwise,
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 31/33] block/nbd: add nbd_clinent_connected() helper

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:09AM +0300, Vladimir Sementsov-Ogievskiy wrote:

In the subject: s/clinent/client/

> We already have two similar helpers for other state. Let's add another
> one for convenience.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 25 ++---
>  1 file changed, 14 insertions(+), 11 deletions(-)
>

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 30/33] block/nbd: reuse nbd_co_do_establish_connection() in nbd_open()

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:08AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> The only last step we need to reuse the function is coroutine-wrapper.
> nbd_open() may be called from non-coroutine context. So, generate the
> wrapper and use it.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/coroutines.h |   6 +++
>  block/nbd.c| 101 ++---
>  2 files changed, 10 insertions(+), 97 deletions(-)
> 
> diff --git a/block/coroutines.h b/block/coroutines.h
> index 4cfb4946e6..514d169d23 100644
> --- a/block/coroutines.h
> +++ b/block/coroutines.h
> @@ -66,4 +66,10 @@ int coroutine_fn bdrv_co_readv_vmstate(BlockDriverState 
> *bs,
>  int coroutine_fn bdrv_co_writev_vmstate(BlockDriverState *bs,
>  QEMUIOVector *qiov, int64_t pos);
>  
> +int generated_co_wrapper
> +nbd_do_establish_connection(BlockDriverState *bs, Error **errp);
> +int coroutine_fn
> +nbd_co_do_establish_connection(BlockDriverState *bs, Error **errp);

Tagged coroutine_fn here,...

> +++ b/block/nbd.c
>  
> -static int nbd_co_do_establish_connection(BlockDriverState *bs, Error **errp)
> +int nbd_co_do_establish_connection(BlockDriverState *bs, Error **errp)

...but not here.  Is it worth being consistent?

> @@ -2056,22 +1974,11 @@ static int nbd_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  s->x_dirty_bitmap, s->tlscreds,
>  monitor_cur());
>  
> -/*
> - * establish TCP connection, return error if it fails
> - * TODO: Configurable retry-until-timeout behaviour.
> - */
> -sioc = nbd_establish_connection(bs, s->saddr, errp);
> -if (!sioc) {
> -ret = -ECONNREFUSED;
> -goto fail;
> -}
> -
> -ret = nbd_client_handshake(bs, sioc, errp);
> +/* TODO: Configurable retry-until-timeout behaviour.*/

Space before */

Nice diffstat, proving the refactoring worked.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 26/33] block-coroutine-wrapper: allow non bdrv_ prefix

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:04AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We are going to reuse the script to generate a qcow2_ function in

Is qcow2_ right here?  Looking ahead to patch 30/33, it looks like you
meant nbd_, at least in the context of this series.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 29/33] nbd/client-connection: add option for non-blocking connection attempt

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:07AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We'll need a possibility of non-blocking nbd_co_establish_connection(),
> so that it returns immediately, and it returns success only if
> connections was previously established in background.

only if a connection was

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/nbd.h | 2 +-
>  block/nbd.c | 2 +-
>  nbd/client-connection.c | 8 +++-
>  3 files changed, 9 insertions(+), 3 deletions(-)
>

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 28/33] nbd/client-connection: do qio_channel_set_delay(false)

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:06AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> nbd_open() does it (through nbd_establish_connection()).
> Actually we lost that call on reconnect path in 1dc4718d849e1a1fe
> "block/nbd: use non-blocking connect: fix vm hang on connect()"
> when we have introduced reconnect thread.

s/have introduced/introduced the/

> 
> Fixes: 1dc4718d849e1a1fe665ce5241ed79048cfa2cfc
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/client-connection.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/nbd/client-connection.c b/nbd/client-connection.c
> index 36d2c7c693..00efff293f 100644
> --- a/nbd/client-connection.c
> +++ b/nbd/client-connection.c
> @@ -126,6 +126,8 @@ static int nbd_connect(QIOChannelSocket *sioc, 
> SocketAddress *addr,
>  return ret;
>  }
>  
> +qio_channel_set_delay(QIO_CHANNEL(sioc), false);
> +
>  if (!info) {
>  return 0;
>  }

Reviewed-by: Eric Blake 

Is this bug fix something that can be cherry-picked in isolation, or
does it depend too much on the rest of the series?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH V3 00/22] Live Update [restart]

2021-06-03 Thread Daniel P . Berrangé

On Thu, Jun 03, 2021 at 08:36:42PM +0100, Dr. David Alan Gilbert wrote:
> * Steven Sistare (steven.sist...@oracle.com) wrote:
> > On 5/24/2021 6:39 AM, Dr. David Alan Gilbert wrote:
> > > * Steven Sistare (steven.sist...@oracle.com) wrote:
> > >> On 5/20/2021 9:13 AM, Dr. David Alan Gilbert wrote:
> > >>> On the 'restart' branch of questions; can you explain,
> > >>> other than the passing of the fd's, why the outgoing side of
> > >>> qemu's 'migrate exec:' doesn't work for you?
> > >>
> > >> I'm not sure what I should describe.  Can you be more specific?
> > >> Do you mean: can we add the cpr specific bits to the migrate exec code?
> > > 
> > > Yes; if possible I'd prefer to just keep the one exec mechanism.
> > > It has an advantage of letting you specify the new command line; that
> > > avoids the problems I'd pointed out with needing to change the command
> > > line if a hotplug had happened.  It also means we only need one chunk of
> > > exec code.
> > 
> > How/where would you specify a new command line?  Are you picturing the 
> > usual migration
> > setup where you start a second qemu with its own arguments, plus a 
> > migrate_incoming
> > option or command?  That does not work for live update restart; the old 
> > qemu must exec
> > the new qemu.  Or something else?
> 
> The existing migration path allows an exec - originally intended to exec
> something like a compressor or a store to file rather than a real
> migration; i.e. you can do:
> 
>   migrate "exec:gzip > mig"
> 
> and that will save the migration stream to a compressed file called mig.
> Now, I *think* we can already do:
> 
>   migrate "exec:path-to-qemu command line parameters -incoming 'hm'"
> (That's probably cleaner via the QMP interface).
> 
> I'm not quite sure what I want in the incoming there, but that is
> already the source execing the destination qemu - although I think we'd
> probably need to check if that's actually via an intermediary.

I don't think you can dirctly exec  qemu in that way, because the
source QEMU migration code is going to wait for completion of the
QEMU you exec'd and that'll never come on success. So you'll end
up with both QEMU's running forever. If you pass the daemonize
option to the new QEMU then it will immediately detach itself,
and the source QEMU will think the migration command has finished
or failed.

I think you can probably do it if you use a wrapper script though.
The wrapper would have to fork QEMU in the backend, and then the
wrapper would have to monitor the new QEMU to see when the incoming
migration has finished/aborted, at which point the wrapper can
exit, so the source QEMU sees a successful cleanup of the exec'd
command. 

> > We could shoehorn cpr restart into the migrate exec path by defining a new 
> > migration 
> > capability that the client would set before issuing the migrate command.  
> > However, the
> > implementation code would be sprinkled with conditionals to suppress 
> > migrate-only bits
> > and call cpr-only bits.  IMO that would be less maintainable than having a 
> > separate
> > cprsave function.  Currently cprsave does not duplicate any migration 
> > functionality.
> > cprsave calls qemu_save_device_state() which is used by xen.
> 
> To me it feels like cprsave in particular is replicating more code.
> 
> It's also jumping through hoops in places to avoid changing the
> commandline;  that's going to cause more pain for a lot of people - not
> just because it's hacks all over for that, but because a lot of people
> are actually going to need to change the commandline even in a cpr like
> case (e.g. due to hotplug or changing something else about the
> environment, like auth data or route to storage or networking that
> changed).

Management apps that already support migration, will almost certainly
know how to start up a new QEMU with a different command line that
takes account of hotplugged/unplugged devices. IOW avoiding changing
the command line only really addresses the simple case, and the hard
case is likely already solved for purposes of handling regular live
migration.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v3 1/2] GitLab: Add "Bug" issue reporting template

2021-06-03 Thread Daniel P . Berrangé

On Thu, Jun 03, 2021 at 03:11:28PM -0400, John Snow wrote:
> On 6/3/21 3:26 AM, Philippe Mathieu-Daudé wrote:
> > I haven't reviewed earlier version, but I wonder about the "build from
> > sources" use case (this is not a template for distributions but for the
> > mainstream project), so maybe add:
> > 
> >## Build environment (in case you built QEMU from source)
> >- configure script command line: (e.g. ./configure --enable-nettle
> > --disable-glusterfs --disable-user)
> >- configure script summary output
> 
> Maybe just a little bit too much information. Even though I am pushing
> people to build from source, I actually think more reports are likely not to
> have done so.

We need to bear in mind what % of bug reports do we actually need
this information for.  We don't want to be asking for information
that we know is going to be irrelevant for (say) 60% of bug reports,
because that wastes time of users.

My gut feeling is that while configure args are useful to know in
many cases, I doubt it genuinely needed for more than a relatively
small percentage of bug reports.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PULL 00/45] target-arm queue

2021-06-03 Thread Peter Maydell

On Thu, 3 Jun 2021 at 16:59, Peter Maydell  wrote:
>
> The following changes since commit a97978bcc2d1f650c7d411428806e5b03082b8c7:
>
>   Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.1-20210603' 
> into staging (2021-06-03 10:00:35 +0100)
>
> are available in the Git repository at:
>
>   https://git.linaro.org/people/pmaydell/qemu-arm.git 
> tags/pull-target-arm-20210603
>
> for you to fetch changes up to 1c861885894d840235954060050d240259f5340b:
>
>   tests/unit/test-vmstate: Assert that dup() and mkstemp() succeed 
> (2021-06-03 16:43:27 +0100)
>
> 
> target-arm queue:
>  * Some not-yet-enabled preliminaries for M-profile MVE support
>  * Consistently use "Cortex-Axx", not "Cortex Axx" in docs, comments
>  * docs: Fix installation of man pages with Sphinx 4.x
>  * Mark LDS{MIN,MAX} as signed operations
>  * Fix missing syndrome value for DAIF and PAC check exceptions
>  * Implement BFloat16 extensions
>  * Refactoring of hvf accelerator code in preparation for aarch64 support
>  * Fix some coverity nits in test code


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.1
for any user-visible changes.

-- PMM

Re: [RFC] Adding the A64FX's HPC funtions.

2021-06-03 Thread Richard Henderson


On 6/3/21 1:17 AM, ishii.shuuic...@fujitsu.com wrote:

Hi, Richard.

Thank you for your comment.


My first thought is that -cpu max can simply enable the extensions, without
extra flags.  The max cpu has all of the features that we can enable, and as I
see it this is just one more.


Let me confirm a few things about the above comment.
Does it mean that I don't need to explicitly enable individual extensions
such as a64fx-hpc-sec, a64fx-hpc-hwpf, and a64fx-hpc-hwb,
since all extensions can be enabled by specifying -cpu max?


Well, Peter disagreed with having them enabled by default in -cpu max, so we 
might need at least one extra property.  I see no reason to have three 
properties -- one property a64fx-hpc should be sufficient.  But we might not 
want any command-line properties, see below...





The microarchitectural document provided does not list all of the system
register reset values for the A64FX, and I would be surprised if there were an
architectural id register that specified a non-standard extension like this.
Thus I would expect to add ARM_FEATURE_A64FX with which to enable these
extensions in helper.c.


As you said,
some of the published specifications do not describe the reset values of the 
registers.
I would like to implement this in QEMU by referring to a real machine with 
A64FX.


I presume there exists some documentation for this somewhere, though possibly 
only internal to Fujitsu so far.


For comparison, in the Arm Cortex-A76 manual,
  https://developer.arm.com/documentation/100798/0301/
section B2.4 "AArch64 registers by functional group", there is a concise 
listing of all of the system registers and their reset values.


The most important of these for QEMU to create '-cpu a64fx' are the 
ID_AA64{ISAR,MMFR,PFR} and MIDR values.  These values determine all of the 
standard architectural features, and from them we can tell what QEMU may (or 
may not) be missing for proper emulation of the cpu.  For comparison, look at 
aarch64_a72_initfn in target/arm/cpu64.c.


Peter is suggesting that if full support for -cpu a64fx apart from the hpc 
extensions is close, then we shouldn't implementing a property for -cpu max, 
but only implement -cpu a64fx.  (Because how does the OS detect the hpc 
feature, apart from the MIDR value?)



r~

Re: [PATCH v3 27/33] block/nbd: split nbd_co_do_establish_connection out of nbd_reconnect_attempt

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:05AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Split out part, that we want to reuse for nbd_open().

Split out the part that we want to reuse for nbd_open().

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 79 +++--
>  1 file changed, 41 insertions(+), 38 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index 15b5921725..59971bfba8 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -361,11 +361,49 @@ static int nbd_handle_updated_info(BlockDriverState 
> *bs, Error **errp)
>  return 0;
>  }
>  
> -static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
> +static int nbd_co_do_establish_connection(BlockDriverState *bs, Error **errp)

Given the _co_ in the name, don't you need a coroutine_fn marker?

Otherwise looks sane.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 26/33] block-coroutine-wrapper: allow non bdrv_ prefix

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:04AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We are going to reuse the script to generate a qcow2_ function in
> further commit. Prepare the script now.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  scripts/block-coroutine-wrapper.py | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 25/33] nbd/client-connection: return only one io channel

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:03AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> block/nbd doesn't need underlying sioc channel anymore. So, we can
> update nbd/client-connection interface to return only one top-most io
> channel, which is more straight forward.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/nbd.h |  4 ++--
>  block/nbd.c | 13 ++---
>  nbd/client-connection.c | 33 +
>  3 files changed, 29 insertions(+), 21 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 1/3] doc: Fix some mistakes in the SEV documentation

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 10:29:35AM +0200, Laszlo Ersek wrote:
> On 06/02/21 21:19, Tom Lendacky wrote:
> > Just a quick ping on this series...
> 
> Right, I'm unsure who is supposed to merge this... Do you remember who
> usually merges the SEV-related patch series (plural)?

I'm queueing this series by now, but we truly need somebody to
volunteer as maintainer of the SEV code.

-- 
Eduardo

Re: [PATCH V3 00/22] Live Update [restart]

2021-06-03 Thread Dr. David Alan Gilbert

* Steven Sistare (steven.sist...@oracle.com) wrote:
> On 5/24/2021 6:39 AM, Dr. David Alan Gilbert wrote:
> > * Steven Sistare (steven.sist...@oracle.com) wrote:
> >> On 5/20/2021 9:13 AM, Dr. David Alan Gilbert wrote:
> >>> On the 'restart' branch of questions; can you explain,
> >>> other than the passing of the fd's, why the outgoing side of
> >>> qemu's 'migrate exec:' doesn't work for you?
> >>
> >> I'm not sure what I should describe.  Can you be more specific?
> >> Do you mean: can we add the cpr specific bits to the migrate exec code?
> > 
> > Yes; if possible I'd prefer to just keep the one exec mechanism.
> > It has an advantage of letting you specify the new command line; that
> > avoids the problems I'd pointed out with needing to change the command
> > line if a hotplug had happened.  It also means we only need one chunk of
> > exec code.
> 
> How/where would you specify a new command line?  Are you picturing the usual 
> migration
> setup where you start a second qemu with its own arguments, plus a 
> migrate_incoming
> option or command?  That does not work for live update restart; the old qemu 
> must exec
> the new qemu.  Or something else?

The existing migration path allows an exec - originally intended to exec
something like a compressor or a store to file rather than a real
migration; i.e. you can do:

  migrate "exec:gzip > mig"

and that will save the migration stream to a compressed file called mig.
Now, I *think* we can already do:

  migrate "exec:path-to-qemu command line parameters -incoming 'hm'"
(That's probably cleaner via the QMP interface).

I'm not quite sure what I want in the incoming there, but that is
already the source execing the destination qemu - although I think we'd
probably need to check if that's actually via an intermediary.

> We could shoehorn cpr restart into the migrate exec path by defining a new 
> migration 
> capability that the client would set before issuing the migrate command.  
> However, the
> implementation code would be sprinkled with conditionals to suppress 
> migrate-only bits
> and call cpr-only bits.  IMO that would be less maintainable than having a 
> separate
> cprsave function.  Currently cprsave does not duplicate any migration 
> functionality.
> cprsave calls qemu_save_device_state() which is used by xen.

To me it feels like cprsave in particular is replicating more code.

It's also jumping through hoops in places to avoid changing the
commandline;  that's going to cause more pain for a lot of people - not
just because it's hacks all over for that, but because a lot of people
are actually going to need to change the commandline even in a cpr like
case (e.g. due to hotplug or changing something else about the
environment, like auth data or route to storage or networking that
changed).

There are hooks for early parameter parsing, so if we need to add extra
commandline args we can; but for example the case of QEMU_START_FREEZE
to add -S just isn't needed as soon as you let go of the idea of needing
an identical commandline.

Dave

> - Steve
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH v3 24/33] block/nbd: drop BDRVNBDState::sioc

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:02AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Currently sioc pointer is used just to pass from socket-connection to
> nbd negotiation. Drop the field, and use local variables instead. With
> next commit we'll update nbd/client-connection.c to behave
> appropriately (return only top-most ioc, not two channels).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 98 ++---
>  1 file changed, 48 insertions(+), 50 deletions(-)
>

It's a bit hard to review 's->ioc' vs. 'sioc' when reading aloud ;)

But the change looks reasonable, and I'm not spotting any memory leak
in the refactoring.

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v3 1/2] GitLab: Add "Bug" issue reporting template

2021-06-03 Thread John Snow


On 6/3/21 3:26 AM, Philippe Mathieu-Daudé wrote:

I haven't reviewed earlier version, but I wonder about the "build from
sources" use case (this is not a template for distributions but for the
mainstream project), so maybe add:

   ## Build environment (in case you built QEMU from source)
   - configure script command line: (e.g. ./configure --enable-nettle
--disable-glusterfs --disable-user)
   - configure script summary output


Maybe just a little bit too much information. Even though I am pushing 
people to build from source, I actually think more reports are likely 
not to have done so.


We can always ask a follow-up question if we can't reproduce without the 
specific configuration and a good reporter will reply ;)


--js

Re: [PATCH] Update Linux headers to 5.13-rc4

2021-06-03 Thread Eduardo Habkost

On Tue, Jun 01, 2021 at 04:17:41PM -0400, Eduardo Habkost wrote:
> Signed-off-by: Eduardo Habkost 
> ---
> KVM_RUN_X86_BUS_LOCK is a requirement for:
>   [PATCH v4] i386: Add ratelimit for bus locks acquired in guest
>   Message-Id: <20210521043820.29678-1-chenyi.qi...@intel.com>

I forgot to git-add the new virtio_bt.h and virtio_snd.h files,
and this now conflicts with commit 3ea1a80243d5
("target/i386/sev: add support to query the attestation report").
I will send v2.

-- 
Eduardo

Re: [RFC PATCH] accel/tcg: change default codegen buffer size for i386-softmmu

2021-06-03 Thread Richard Henderson


On 5/25/21 9:45 AM, Alex Bennée wrote:

There are two justifications for making this change. The first is that
i386 emulation is typically for smaller machines where having a 1gb of
generated code is overkill for basic emulation. The second is the
propensity of self-modifying code (c.f. Doom/edit) utilised on i386
systems can trigger a rapid growth in invalidated and re-translated
buffers. This is seen in bug #283. Execution is still inefficient but
at least the host memory isn't so aggressively used up.

That said it's still really just a sticking plaster for user
convenience.

Signed-off-by: Alex Bennée 
Cc: Thomas Huth 
Cc: 1896...@bugs.launchpad.net
---
  accel/tcg/translate-all.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 640ff6e3e7..f442165674 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -951,9 +951,13 @@ static void page_lock_pair(PageDesc **ret_p1, 
tb_page_addr_t phys1,
   * Users running large scale system emulation may want to tweak their
   * runtime setup via the tb-size control on the command line.
   */
+#ifdef TARGET_I386
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
+#else
  #define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
  #endif
  #endif
+#endif
  
  #define DEFAULT_CODE_GEN_BUFFER_SIZE \

(DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \



I'm not thrilled, as it is ultra-hacky.

(1) I've got a re-org of this code out for review: 
https://patchew.org/QEMU/20210502231844.1977630-1-richard.hender...@linaro.org/


(2) I'm keen to reorg TCG such that it gets compiled once.  There's currently 
nothing standing in the way of that except work.  But this would introduce a 
use of a target-specific define for the first time into tcg/.  I guess I could 
leave the default sizing back in accel/tcg/ and pass in the default.


Other options?


r~

Re: [PATCH v3 23/33] block/nbd: nbd_teardown_connection() don't touch s->sioc

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:01AM +0300, Vladimir Sementsov-Ogievskiy wrote:

For the subject line, might read better as:

block/nbd: don't touch s->sioc in nbd_teardown_connection()

> Negotiation during reconnect is now done in thread, and s->sioc is not

in a thread

> available during negotiation. Negotiation in thread will be cancelled
> by nbd_client_connection_release() called from nbd_clear_bdrvstate().
> So, we don't need this code chunk anymore.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 4 
>  1 file changed, 4 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v4 01/15] python: qemu: add timer parameter for qmp.accept socket

2021-06-03 Thread John Snow


On 6/3/21 4:06 AM, Emanuele Giuseppe Esposito wrote:



On 03/06/2021 01:23, John Snow wrote:

On 5/20/21 3:52 AM, Emanuele Giuseppe Esposito wrote:

Alsp add a new _qmp_timer field to the QEMUMachine class.

Let's change the default socket timeout to None, so that if
a subclass needs to add a timer, it can be done by modifying
this private field.

At the same time, restore the timer to be 15 seconds in iotests.py, to
give an upper bound to qemu-iotests execution.

Signed-off-by: Emanuele Giuseppe Esposito 


Hi Emanuele: I'm sorry, but with the recent Python PR this no longer 
applies to origin/master -- the python files got shuffled around a bit 
when I added the new CI tests.


May I please ask you to rebase? You don't have to re-spin just yet, 
just pointing me to the rebase would help me out.


Hi John, no problem. I rebased here:
https://gitlab.com/eesposit/qemu/-/commits/qemu_iotests_io

Let me know if I need to do anything else.
I will re-spin later today.

Thank you,
Emanuele



Hi Emanuele:

https://gitlab.com/jsnow/qemu/-/commits/review

I added in a pylint ignore for you and these patches are clean now.

--js

Re: [PATCH 3/3] target/arm: Use acpi_ghes_present() to see if we report ACPI memory errors

2021-06-03 Thread Richard Henderson


On 6/3/21 10:12 AM, Peter Maydell wrote:

The virt_is_acpi_enabled() function is specific to the virt board, as
is the check for its 'ras' property.  Use the new acpi_ghes_present()
function to check whether we should report memory errors via
acpi_ghes_record_errors().

This avoids a link error if QEMU was built without support for the
virt board, and provides a mechanism that can be used by any future
board models that want to add ACPI memory error reporting support
(they only need to call acpi_ghes_add_fw_cfg()).

Signed-off-by: Peter Maydell
---
  target/arm/kvm64.c | 6 +-
  1 file changed, 1 insertion(+), 5 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH 2/3] hw/acpi: Provide function acpi_ghes_present()

2021-06-03 Thread Richard Henderson


On 6/3/21 10:12 AM, Peter Maydell wrote:

Allow code elsewhere in the system to check whether the ACPI GHES
table is present, so it can determine whether it is OK to try to
record an error by calling acpi_ghes_record_errors().

(We don't need to migrate the new 'present' field in AcpiGhesState,
because it is set once at system initialization and doesn't change.)

Signed-off-by: Peter Maydell
---
  include/hw/acpi/ghes.h |  9 +
  hw/acpi/ghes-stub.c|  5 +
  hw/acpi/ghes.c | 17 +
  3 files changed, 31 insertions(+)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH 1/3] hw/acpi: Provide stub version of acpi_ghes_record_errors()

2021-06-03 Thread Richard Henderson


On 6/3/21 10:12 AM, Peter Maydell wrote:

+softmmu_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 
'aml-build-stub.c', 'ghes-stub.c'))
  softmmu_ss.add_all(when: 'CONFIG_ACPI', if_true: acpi_ss)
  softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('acpi-stub.c', 
'aml-build-stub.c',
-  'acpi-x86-stub.c', 
'ipmi-stub.c'))
+  'acpi-x86-stub.c', 
'ipmi-stub.c', 'ghes-stub.c'))


Gosh that last line is confusing.  I see it's documented in build-system.rst, 
but yeesh.


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v3 22/33] block/nbd: pass monitor directly to connection thread

2021-06-03 Thread Vladimir Sementsov-Ogievskiy


03.06.2021 21:16, Eric Blake wrote:

On Fri, Apr 16, 2021 at 11:09:00AM +0300, Vladimir Sementsov-Ogievskiy wrote:

monitor_cur() is used by socket_get_fd, but it doesn't work in
connection thread. Let's monitor directly to cover this thing. We are
going to unify connection establishing path in nbd_open and reconnect,
so we should support fd-passing.


Grammar suggestion:

Let's pass in the monitor directly to work around this.  This gets us
closer to unifing the path for establishing a connection in nbd_open
and reconnect, by supporting fd-passing.


But given Dan's review on 21/33, I suspect you won't be using this
patch in this form after all (instead, the caller of
nbd_client_connection_new will use the new monitor_resolve_fd or
whatever we call that, so that nbd_client_connection_new remains
oblivious to the monitor).



Yes.. I even have some patches for it locally. Seems I didn't send them, don't 
remember why :/ Will check tomorrow.

--
Best regards,
Vladimir

Re: [PATCH v3 22/33] block/nbd: pass monitor directly to connection thread

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:09:00AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> monitor_cur() is used by socket_get_fd, but it doesn't work in
> connection thread. Let's monitor directly to cover this thing. We are
> going to unify connection establishing path in nbd_open and reconnect,
> so we should support fd-passing.

Grammar suggestion:

Let's pass in the monitor directly to work around this.  This gets us
closer to unifing the path for establishing a connection in nbd_open
and reconnect, by supporting fd-passing.

But given Dan's review on 21/33, I suspect you won't be using this
patch in this form after all (instead, the caller of
nbd_client_connection_new will use the new monitor_resolve_fd or
whatever we call that, so that nbd_client_connection_new remains
oblivious to the monitor).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 1/2] i386: reorder call to cpu_exec_realizefn in x86_cpu_realizefn

2021-06-03 Thread Eduardo Habkost

On Thu, Jun 03, 2021 at 10:13:30AM +0200, Claudio Fontana wrote:
> On 6/1/21 8:48 PM, Eduardo Habkost wrote:
> > +Vitaly
> > 
> > On Sat, May 29, 2021 at 11:13:12AM +0200, Claudio Fontana wrote:
> >> we need to expand features first, before we attempt to check them
> >> in the accel realizefn code called by cpu_exec_realizefn().
> >>
> >> At the same time we need checks for code_urev and host_cpuid_required,
> >> and modifications to cpu->mwait to happen after the initial setting
> >> of them inside the accel realizefn code.
> > 
> > I miss an explanation why those 3 steps need to happen after
> > accel realizefn.
> > 
> > I'm worried by the fragility of the ordering.  If there are
> > specific things that must be done before/after
> > cpu_exec_realizefn(), this needs to be clear in the code.
> 
> Hi Eduardo,
> 
> indeed the initialization and realization code for x86 appears to be very 
> sensitive to ordering.
> This continues to be the case after the fix.
> 
> cpu_exec_realizefn
> 
> 
> 
> > 
> >>
> >> Partial Fix.
> >>
> >> Fixes: 48afe6e4eabf ("i386: split cpu accelerators from cpu.c, using 
> >> AccelCPUClass")
> > 
> > Shouldn't this be
> >   f5cc5a5c1686 ("i386: split cpu accelerators from cpu.c, using 
> > AccelCPUClass")
> > ?
> > 
> > Also, it looks like part of the ordering change was made in
> > commit 30565f10e907 ("cpu: call AccelCPUClass::cpu_realizefn in
> > cpu_exec_realizefn").
> > 
> > 
> > 
> >> Signed-off-by: Claudio Fontana 
> >> ---
> >>  target/i386/cpu.c | 56 +++
> >>  1 file changed, 28 insertions(+), 28 deletions(-)
> >>
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index 9e211ac2ce..6bcb7dbc2c 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -6133,34 +6133,6 @@ static void x86_cpu_realizefn(DeviceState *dev, 
> >> Error **errp)
> >>  Error *local_err = NULL;
> >>  static bool ht_warned;
> >>  
> >> -/* Process Hyper-V enlightenments */
> >> -x86_cpu_hyperv_realize(cpu);
> > 
> > Vitaly, is this reordering going to affect the Hyper-V cleanup
> > work you are doing?  It seems harmless and it makes sense to keep
> > the "realize" functions close together, but I'd like to confirm.
> > 
> >> -
> >> -cpu_exec_realizefn(cs, _err);
> >> -if (local_err != NULL) {
> >> -error_propagate(errp, local_err);
> >> -return;
> >> -}
> >> -
> >> -if (xcc->host_cpuid_required && !accel_uses_host_cpuid()) {
> >> -g_autofree char *name = x86_cpu_class_get_model_name(xcc);
> >> -error_setg(_err, "CPU model '%s' requires KVM or HVF", 
> >> name);
> >> -goto out;
> >> -}
> >> -
> >> -if (cpu->ucode_rev == 0) {
> >> -/* The default is the same as KVM's.  */
> >> -if (IS_AMD_CPU(env)) {
> >> -cpu->ucode_rev = 0x0165;
> >> -} else {
> >> -cpu->ucode_rev = 0x1ULL;
> >> -}
> >> -}
> >> -
> >> -/* mwait extended info: needed for Core compatibility */
> >> -/* We always wake on interrupt even if host does not have the 
> >> capability */
> >> -cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
> >> -
> >>  if (cpu->apic_id == UNASSIGNED_APIC_ID) {
> >>  error_setg(errp, "apic-id property was not initialized properly");
> >>  return;
> >> @@ -6190,6 +6162,34 @@ static void x86_cpu_realizefn(DeviceState *dev, 
> >> Error **errp)
> >> & CPUID_EXT2_AMD_ALIASES);
> >>  }
> >>  
> >> +/* Process Hyper-V enlightenments */
> >> +x86_cpu_hyperv_realize(cpu);
> >> +
> >> +cpu_exec_realizefn(cs, _err);
> > 
> > I'm worried by the reordering of cpu_exec_realizefn().  That
> > function does a lot, and reordering it might have even more
> > unwanted side effects.
> > 
> > I wonder if it wouldn't be easier to revert commit 30565f10e907
> > ("cpu: call AccelCPUClass::cpu_realizefn in cpu_exec_realizefn").
> 
> the part that is wrong in that commit does not I think have to do with where 
> the accel_cpu::cpu_realizefn() is called, but rather
> the move of the call to cpu_exec_realizefn (now including the 
> accel_cpu::cpu_realizefn) to the very beginning of the function.

Oh, I didn't notice commit 30565f10e907 also moved the
cpu_exec_realizefn() call to the beginning of the function.  So
moving it back (like you do her) actually seems to be a good
idea.

> 
> This was done due to the fact that the accel-specific code needs to be called 
> before the code:
> 
> * if (cpu->ucode_rev == 0) {
> 
> (meaning "use default if nothing accelerator specific has been set"), as this 
> could be set by accel-specific code,
> 
> * cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
> 
> (as the mwait is written to by cpu "host" kvm/hvf-specific code when enabling 
> cpu pm),
> 
> * if (cpu->phys_bits && ...
> 
> (as the phys bits can be set by calling the host CPUID)
> 
> But I missed there that we cannot move the call before the expansion of cpu

Re: [PATCH v2 24/26] s390x/tcg: Implement VECTOR FP (MAXIMUM|MINIMUM)

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

+if (unlikely(nan_a || nan_b)) {


Perhaps better as (dcmask_a | dcmask_b) & DCMASK_NAN ?


+if (sig_a || sig_b) {


Similarly.


+} else if (unlikely(inf_a && inf_b)) {
+switch (type) {
+case S390_MINMAX_TYPE_JAVA:
+return neg_a && !neg_b ? S390_MINMAX_RES_A : S390_MINMAX_RES_B;
+case S390_MINMAX_TYPE_C_MACRO:
+case S390_MINMAX_TYPE_CPP:
+return neg_b ? S390_MINMAX_RES_B : S390_MINMAX_RES_A;
+case S390_MINMAX_TYPE_F:
+return !neg_a && neg_b ? S390_MINMAX_RES_B : S390_MINMAX_RES_A;
+default:
+g_assert_not_reached();
+}


I don't understand why inf_a && inf_b gets a special case.  Inf is 
well-ordered. If the arguments are equal you can't tell the difference between 
them, so it doesn't matter whether A or B is returned.


I would pass this case along to S390_MINMAX_RES_MINMAX.


+} else if (unlikely(zero_a && zero_b)) {
+switch (type) {
+case S390_MINMAX_TYPE_JAVA:
+return neg_a && !neg_b ? S390_MINMAX_RES_A : S390_MINMAX_RES_B;


If neg_a && neg_b, both A and B are -0, and you can't distinguish them.  So 
this would seem to simplify to


neg_a ? S390_MINMAX_RES_A : S390_MINMAX_RES_B


+case S390_MINMAX_TYPE_C_MACRO:
+return S390_MINMAX_RES_B;
+case S390_MINMAX_TYPE_F:
+return !neg_a && neg_b ? S390_MINMAX_RES_B : S390_MINMAX_RES_A;


Similarly if !neg_a && !neg_b, both A and B are +0.

Otherwise this looks good.


r~

Re: [PATCH v3 20/33] block/nbd: use negotiation of NBDClientConnection

2021-06-03 Thread Eric Blake

On Fri, Apr 16, 2021 at 11:08:58AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Use a new possibility of negotiation in connect thread. Now we are on
> the way of simplifying connection_co. We want to move the whole
> reconnect code to NBDClientConnection. NBDClientConnection already
> updated to support negotiation and retry. Use now the first thing.

Maybe:

Now that we can opt in to negotiation as part of the client connection
thread, use that to simplify connection_co.  This is another step on
the way to moving all reconnect code into NBDClientConnection.

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 44 ++--
>  1 file changed, 30 insertions(+), 14 deletions(-)
>

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [RFC PATCH 2/2] migration/rdma: Enable use of g_autoptr with struct rdma_cm_event

2021-06-03 Thread Dr. David Alan Gilbert

* Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
> Since 00f2cfbbec6 ("glib: bump min required glib library version to
> 2.48") we can use g_auto/g_autoptr to have the compiler automatically
> free an allocated variable when it goes out of scope, removing this
> burden on the developers.
> 
> Per rdma_cm(7) and rdma_ack_cm_event(3) man pages:
> 
>   "rdma_ack_cm_event() - Free a communication event.
> 
>All events which are allocated by rdma_get_cm_event() must be
>released, there should be a one-to-one correspondence between
>successful gets and acks. This call frees the event structure
>and any memory that it references."
> 
> Since the 'ack' description doesn't explicit the event is also
> released (free'd), it is safer to use the GLib g_autoptr feature.
> The G_DEFINE_AUTOPTR_CLEANUP_FUNC() macro expects a single word
> for the type name, so add a type definition to achieve this.
> Convert to use g_autoptr and remove the rdma_ack_cm_event() calls.
> 
> Inspired-by: Li Zhijian 
> Signed-off-by: Philippe Mathieu-Daudé 

It's unclear to me whether the changes in qemu_rdma_accept are legal
rdma or not.
If we look at the err_rdma_dest_wait: path, it does a
qemu_rdma_cleanup(rdma) before the exit; that does rdma_disconnect's and
calls loads of other rdma/ibv cleanup calls to destroy state - is there
any ordering requirement saying you're supposed to clean up your
cm_event's before you nuke the rest of the channel? It feels like you
probably should - but I have no idea if it's a requirement.

Dave

> ---
> RFC: build-tested only
> ---
>  migration/rdma.c | 27 ++-
>  1 file changed, 10 insertions(+), 17 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index b50ebb9183a..b703bf1b918 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -38,6 +38,9 @@
>  #include "qom/object.h"
>  #include 
>  
> +typedef struct rdma_cm_event rdma_cm_event;
> +G_DEFINE_AUTOPTR_CLEANUP_FUNC(rdma_cm_event, rdma_ack_cm_event)
> +
>  /*
>   * Print and error on both the Monitor and the Log file.
>   */
> @@ -939,7 +942,7 @@ static int qemu_rdma_resolve_host(RDMAContext *rdma, 
> Error **errp)
>  int ret;
>  struct rdma_addrinfo *res;
>  char port_str[16];
> -struct rdma_cm_event *cm_event;
> +g_autoptr(rdma_cm_event) cm_event = NULL;
>  char ip[40] = "unknown";
>  struct rdma_addrinfo *e;
>  
> @@ -1007,11 +1010,11 @@ route:
>  ERROR(errp, "result not equal to event_addr_resolved %s",
>  rdma_event_str(cm_event->event));
>  perror("rdma_resolve_addr");
> -rdma_ack_cm_event(cm_event);
>  ret = -EINVAL;
>  goto err_resolve_get_addr;
>  }
>  rdma_ack_cm_event(cm_event);
> +cm_event = NULL;
>  
>  /* resolve route */
>  ret = rdma_resolve_route(rdma->cm_id, RDMA_RESOLVE_TIMEOUT_MS);
> @@ -1028,11 +1031,9 @@ route:
>  if (cm_event->event != RDMA_CM_EVENT_ROUTE_RESOLVED) {
>  ERROR(errp, "result not equal to event_route_resolved: %s",
>  rdma_event_str(cm_event->event));
> -rdma_ack_cm_event(cm_event);
>  ret = -EINVAL;
>  goto err_resolve_get_addr;
>  }
> -rdma_ack_cm_event(cm_event);
>  rdma->verbs = rdma->cm_id->verbs;
>  qemu_rdma_dump_id("source_resolve_host", rdma->cm_id->verbs);
>  qemu_rdma_dump_gid("source_resolve_host", rdma->cm_id);
> @@ -1501,7 +1502,7 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, 
> uint64_t *wr_id_out,
>   */
>  static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
>  {
> -struct rdma_cm_event *cm_event;
> +g_autoptr(rdma_cm_event) cm_event = NULL;
>  int ret = -1;
>  
>  /*
> @@ -2503,7 +2504,7 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error 
> **errp, bool return_path)
>.private_data = ,
>.private_data_len = sizeof(cap),
>  };
> -struct rdma_cm_event *cm_event;
> +g_autoptr(rdma_cm_event) cm_event = NULL;
>  int ret;
>  
>  /*
> @@ -2544,7 +2545,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error 
> **errp, bool return_path)
>  if (cm_event->event != RDMA_CM_EVENT_ESTABLISHED) {
>  perror("rdma_get_cm_event != EVENT_ESTABLISHED after rdma_connect");
>  ERROR(errp, "connecting to destination!");
> -rdma_ack_cm_event(cm_event);
>  goto err_rdma_source_connect;
>  }
>  rdma->connected = true;
> @@ -2564,8 +2564,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error 
> **errp, bool return_path)
>  
>  trace_qemu_rdma_connect_pin_all_outcome(rdma->pin_all);
>  
> -rdma_ack_cm_event(cm_event);
> -
>  rdma->control_ready_expected = 1;
>  rdma->nb_sent = 0;
>  return 0;
> @@ -3279,7 +3277,7 @@ static void rdma_cm_poll_handler(void *opaque)
>  {
>  RDMAContext *rdma = opaque;
>  int ret;
> -struct

Re: [PATCH v3 17/33] nbd/client-connection: implement connection retry

2021-06-03 Thread Vladimir Sementsov-Ogievskiy


03.06.2021 19:17, Eric Blake wrote:

On Fri, Apr 16, 2021 at 11:08:55AM +0300, Vladimir Sementsov-Ogievskiy wrote:

Add an option for thread to retry connection until success. We'll use


for a thread to retry connection until it succeeds.


nbd/client-connection both for reconnect and for initial connection in
nbd_open(), so we need a possibility to use same NBDClientConnection
instance to connect once in nbd_open() and then use retry semantics for
reconnect.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/nbd.h |  2 ++
  nbd/client-connection.c | 55 +
  2 files changed, 41 insertions(+), 16 deletions(-)

+++ b/nbd/client-connection.c
@@ -36,6 +36,8 @@ struct NBDClientConnection {
  NBDExportInfo initial_info;
  bool do_negotiation;
  
+bool do_retry;

+
  /*
   * Result of last attempt. Valid in FAIL and SUCCESS states.
   * If you want to steal error, don't forget to set pointer to NULL.
@@ -52,6 +54,15 @@ struct NBDClientConnection {
  Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
  };
  
+/*

+ * The function isn't protected by any mutex, so call it when thread is not


so only call it when the thread is not yet running

or maybe even

only call it when the client connection attempt has not yet started


+ * running.
+ */
+void nbd_client_connection_enable_retry(NBDClientConnection *conn)
+{
+conn->do_retry = true;
+}
+
  NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr,
 bool do_negotiation,
 const char *export_name,
@@ -144,24 +155,37 @@ static void *connect_thread_func(void *opaque)
  NBDClientConnection *conn = opaque;
  bool do_free;
  int ret;
+uint64_t timeout = 1;
+uint64_t max_timeout = 16;
+
+while (true) {
+conn->sioc = qio_channel_socket_new();
+
+error_free(conn->err);
+conn->err = NULL;
+conn->updated_info = conn->initial_info;
+
+ret = nbd_connect(conn->sioc, conn->saddr,
+  conn->do_negotiation ? >updated_info : NULL,
+  conn->tlscreds, >ioc, >err);
+conn->updated_info.x_dirty_bitmap = NULL;
+conn->updated_info.name = NULL;


I'm not quite sure I follow the allocation here: if we passed in
>updated_info which got modified in-place by nbd_connect, then
are we risking a memory leak by ignoring the x_dirty_bitmap and name
set by that call?


Yes, that looks strange :\. Will check when prepare new version and fix or 
leave a comment here.




+
+if (ret < 0) {
+object_unref(OBJECT(conn->sioc));
+conn->sioc = NULL;
+if (conn->do_retry) {
+sleep(timeout);


This is a bare sleep in a function not marked as coroutine_fn.  Do we
need to instead use coroutine sleep for better response to an early
exit if initialization is taking too long?


We are in a separate, by-hand created thread, which knows nothing about 
coroutines, iothreads, aio contexts etc.. I think bare sleep is what should be 
here.




+if (timeout < max_timeout) {
+timeout *= 2;
+}
+continue;
+}
+}
  
-conn->sioc = qio_channel_socket_new();

-
-error_free(conn->err);
-conn->err = NULL;
-conn->updated_info = conn->initial_info;
-
-ret = nbd_connect(conn->sioc, conn->saddr,
-  conn->do_negotiation ? >updated_info : NULL,
-  conn->tlscreds, >ioc, >err);
-if (ret < 0) {
-object_unref(OBJECT(conn->sioc));
-conn->sioc = NULL;
+break;
  }
  
-conn->updated_info.x_dirty_bitmap = NULL;

-conn->updated_info.name = NULL;
-
  WITH_QEMU_LOCK_GUARD(>mutex) {
  assert(conn->running);
  conn->running = false;
@@ -172,7 +196,6 @@ static void *connect_thread_func(void *opaque)
  do_free = conn->detached;
  }
  
-

  if (do_free) {
  nbd_client_connection_do_free(conn);


Spurious hunk?



wull drop

--
Best regards,
Vladimir

Re: [PATCH v2 22/26] s390x/tcg: Implement VECTOR FP NEGATIVE MULTIPLY AND (ADD|SUBTRACT)

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  6 +
  target/s390x/insn-data.def  |  4 
  target/s390x/translate_vx.c.inc | 39 +++--
  target/s390x/vec_fpu_helper.c   |  2 ++
  4 files changed, 49 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 21/26] s390x/tcg: Implement 32/128 bit for VECTOR FP MULTIPLY AND (ADD|SUBTRACT)

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  4 +++
  target/s390x/translate_vx.c.inc | 47 -
  target/s390x/vec_fpu_helper.c   | 44 +-
  3 files changed, 87 insertions(+), 8 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 20/26] s390x/tcg: Implement 32/128 bit for VECTOR FP TEST DATA CLASS IMMEDIATE

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  2 ++
  target/s390x/translate_vx.c.inc | 23 ++--
  target/s390x/vec_fpu_helper.c   | 47 +
  3 files changed, 70 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 16/26] s390x/tcg: Implement 32/128 bit for VECTOR FP COMPARE (AND SIGNAL) SCALAR

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  4 +++
  target/s390x/translate_vx.c.inc | 38 ++--
  target/s390x/vec_fpu_helper.c   | 44 -
  3 files changed, 77 insertions(+), 9 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 15/26] s390x/tcg: Implement 32/128 bit for VECTOR FP COMPARE *

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

In addition to 32/128bit variants, we also have to support the
"Signal-on-QNaN (SQ)" bit.

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   | 12 +++
  target/s390x/translate_vx.c.inc | 57 -
  target/s390x/vec_fpu_helper.c   | 64 +++--
  3 files changed, 121 insertions(+), 12 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 14/26] s390x/tcg: Implement 32/128 bit for VECTOR (LOAD FP INTEGER|FP SQUARE ROOT)

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  4 ++
  target/s390x/translate_vx.c.inc | 74 ++---
  target/s390x/vec_fpu_helper.c   | 46 +++-
  3 files changed, 109 insertions(+), 15 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits

2021-06-03 Thread Eric Blake

On Thu, Jun 03, 2021 at 03:37:18PM +0200, Paolo Bonzini wrote:
> For block host devices, I/O can happen through either the kernel file
> descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
> or the SCSI passthrough ioctl SG_IO.
> 
> In the latter case, the size of each transfer can be limited by the
> HBA, while for file descriptor I/O the kernel is able to split and
> merge I/O in smaller pieces as needed.  Applying the HBA limits to
> file descriptor I/O results in more system calls and suboptimal
> performance, so this patch splits the max_transfer limit in two:
> max_transfer remains valid and is used in general, while max_hw_transfer
> is limited to the maximum hardware size.  max_hw_transfer can then be
> included by the scsi-generic driver in the block limits page, to ensure
> that the stricter hardware limit is used.
> 

> +/* Returns the maximum hardware transfer length, in bytes; guaranteed 
> nonzero */
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
> +{
> +BlockDriverState *bs = blk_bs(blk);
> +uint64_t max = INT_MAX;

This is an unaligned value; should we instead round it down to the
request_alignment granularity?

> +
> +if (bs) {
> +max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
> +}
> +return max;
> +}
> +
>  /* Returns the maximum transfer length, in bytes; guaranteed nonzero */
>  uint32_t blk_get_max_transfer(BlockBackend *blk)
>  {

> +++ b/include/block/block_int.h
> @@ -695,6 +695,13 @@ typedef struct BlockLimits {
>   * clamped down. */
>  uint32_t max_transfer;
>  
> +/* Maximal hardware transfer length in bytes.  Applies whenever

Leading /* on its own line, per our style.

> + * transfers to the device bypass the kernel I/O scheduler, for
> + * example with SG_IO.  If larger than max_transfer or if zero,
> + * blk_get_max_hw_transfer will fall back to max_transfer.
> + */

Should we mandate any additional requirements on this value such as
multiple of request_alignment or even power-of-2?

> +uint64_t max_hw_transfer;
> +
>  /* memory alignment, in bytes so that no bounce buffer is needed */
>  size_t min_mem_alignment;
>  

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 13/26] s390x/tcg: Implement 32/128 bit for VECTOR FP (ADD|DIVIDE|MULTIPLY|SUBTRACT)

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

In case of 128bit, we always have a single element. Add new helpers for
reading/writing 32/128 bit floats.

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  8 
  target/s390x/translate_vx.c.inc | 85 +
  target/s390x/vec_fpu_helper.c   | 74 ++--
  3 files changed, 153 insertions(+), 14 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH_V3] Adding ifdefs to call the respective routines only when their configs are enabled

2021-06-03 Thread Swetha Joshi

Oh okay, thank you. I will test this by eod today!


On Thu, Jun 3, 2021 at 10:22 AM Peter Maydell 
wrote:

> On Fri, 28 May 2021 at 20:41, Swetha Joshi 
> wrote:
> >
> > I apologize for the delay, here are the repro steps:
> > 1. Remove CONFIG_ARM_VIRT=y from arm-softmmu.mak
> > 2. In .gitlab-ci.yml, crossbuild.yml and in tests/vm/Makefile.include,
> in all the places where we disable kvm using -disable-kvm, replace this
> with -enable-kvm
> > 3. Build
>
> Thanks; I reproduced the link errors and have sent a patchset
> that I hope fixes this:
> https://patchew.org/QEMU/20210603171259.27962-1-peter.mayd...@linaro.org/
>
> If you could test that it works for you that would be great.
>
> -- PMM
>
-- 
Regards

Swetha Joshi.

Re: [PATCH v2 10/26] s390x/tcg: Simplify wfc64() handling

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

... and prepare for 32/128 bit support.

Signed-off-by: David Hildenbrand
---
  target/s390x/vec_fpu_helper.c | 23 ---
  1 file changed, 12 insertions(+), 11 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 09/26] s390x/tcg: Simplify vflr64() handling

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  1 -
  target/s390x/translate_vx.c.inc |  3 +--
  target/s390x/vec_fpu_helper.c   | 29 +++--
  3 files changed, 8 insertions(+), 25 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 08/26] s390x/tcg: Simplify vfll32() handling

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  1 -
  target/s390x/translate_vx.c.inc |  6 +-
  target/s390x/vec_fpu_helper.c   | 21 +
  3 files changed, 6 insertions(+), 22 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH_V3] Adding ifdefs to call the respective routines only when their configs are enabled

2021-06-03 Thread Peter Maydell

On Fri, 28 May 2021 at 20:41, Swetha Joshi  wrote:
>
> I apologize for the delay, here are the repro steps:
> 1. Remove CONFIG_ARM_VIRT=y from arm-softmmu.mak
> 2. In .gitlab-ci.yml, crossbuild.yml and in tests/vm/Makefile.include, in all 
> the places where we disable kvm using -disable-kvm, replace this with 
> -enable-kvm
> 3. Build

Thanks; I reproduced the link errors and have sent a patchset
that I hope fixes this:
https://patchew.org/QEMU/20210603171259.27962-1-peter.mayd...@linaro.org/

If you could test that it works for you that would be great.

-- PMM

Re: [PATCH v2 07/26] s390x/tcg: Simplify vfma64() handling

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  2 --
  target/s390x/translate_vx.c.inc |  8 +++
  target/s390x/vec_fpu_helper.c   | 42 +
  3 files changed, 20 insertions(+), 32 deletions(-)


Reviewed-by: Richard Henderson 

r~

Re: [PATCH v2 06/26] s390x/tcg: Simplify vftci64() handling

2021-06-03 Thread Richard Henderson


On 5/17/21 7:27 AM, David Hildenbrand wrote:

Signed-off-by: David Hildenbrand
---
  target/s390x/helper.h   |  1 -
  target/s390x/translate_vx.c.inc |  7 ++-
  target/s390x/vec_fpu_helper.c   | 29 +++--
  3 files changed, 13 insertions(+), 24 deletions(-)


Reviewed-by: Richard Henderson 

r~

1 2 3 4 >

1 - 100 of 378 matches

Mail list logo