Re: [PATCH 2/4] vhost: toggle device callbacks using used event idx

2022-10-24 Thread Jason Wang
On Tue, Oct 25, 2022 at 1:36 PM Michael S. Tsirkin  wrote:
>
> On Tue, Oct 25, 2022 at 10:26:35AM +0800, Jason Wang wrote:
> > On Mon, Oct 24, 2022 at 10:05 PM Michael S. Tsirkin  wrote:
> > >
> > > On Mon, Oct 24, 2022 at 04:00:37PM +0200, Eugenio Perez Martin wrote:
> > > > > > It's generally a waste that we don't use endian-ness annotations
> > > > > > the way linux does.
> > > > >
> > > > > Yes, it's worth doing something similar sometime.
> > > > >
> > > >
> > > > Maybe we could wrap them in some struct like virtio_le16 or virtio_16,
> > > > avoiding at least integer direct assignment? Wrappers like
> > > > cpu_to_virtio16 could return these structs and I think all compilers
> > > > should emit the same code as direct assignment.
> > > >
> > > > Thanks!
> > > >
> > >
> > > This will break bitwise operations such as | and &.
> > > Generally Linux has solved the problem and I don't think
> > > we should go look for another solution.
> >
> > Yes, but it should not block this series (we can do that in the future
> > if we had bandwidth).
> >
> > Thanks
>
> Sorry I don't get what you are saying. Which series?

I meant this series.

> Making LE tags structs is not going to fly, this is why sparse
> implements the __bitwise__ tag and in fact this is where the name comes
> from - bitwise operations need to work.

Yes but do we want to add sparse checks in this series only for shadow
virtqueue code?

Thanks

>
>
>
> > >
> > >
> > > >
> > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > > +} else {
> > > > > > > > > +svq->vring.avail->flags &= 
> > > > > > > > > ~cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +/* Make sure the event is enabled before the read of 
> > > > > > > > > used_idx */
> > > > > > > > >  smp_mb();
> > > > > > > > >  return !vhost_svq_more_used(svq);
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > >  static void 
> > > > > > > > > vhost_svq_disable_notification(VhostShadowVirtqueue *svq)
> > > > > > > > >  {
> > > > > > > > > -svq->vring.avail->flags |= 
> > > > > > > > > cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > > > +/*
> > > > > > > > > + * No need to disable notification in the event idx 
> > > > > > > > > case, since used event
> > > > > > > > > + * index is already an index too far away.
> > > > > > > > > + */
> > > > > > > > > +if (!virtio_vdev_has_feature(svq->vdev, 
> > > > > > > > > VIRTIO_RING_F_EVENT_IDX)) {
> > > > > > > > > +svq->vring.avail->flags |= 
> > > > > > > > > cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > > > +}
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > >  static uint16_t vhost_svq_last_desc_of_chain(const 
> > > > > > > > > VhostShadowVirtqueue *svq,
> > > > > > > > > --
> > > > > > > > > 2.31.1
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > >
>




Re: [PATCH 2/4] vhost: toggle device callbacks using used event idx

2022-10-24 Thread Michael S. Tsirkin
On Tue, Oct 25, 2022 at 10:26:35AM +0800, Jason Wang wrote:
> On Mon, Oct 24, 2022 at 10:05 PM Michael S. Tsirkin  wrote:
> >
> > On Mon, Oct 24, 2022 at 04:00:37PM +0200, Eugenio Perez Martin wrote:
> > > > > It's generally a waste that we don't use endian-ness annotations
> > > > > the way linux does.
> > > >
> > > > Yes, it's worth doing something similar sometime.
> > > >
> > >
> > > Maybe we could wrap them in some struct like virtio_le16 or virtio_16,
> > > avoiding at least integer direct assignment? Wrappers like
> > > cpu_to_virtio16 could return these structs and I think all compilers
> > > should emit the same code as direct assignment.
> > >
> > > Thanks!
> > >
> >
> > This will break bitwise operations such as | and &.
> > Generally Linux has solved the problem and I don't think
> > we should go look for another solution.
> 
> Yes, but it should not block this series (we can do that in the future
> if we had bandwidth).
> 
> Thanks

Sorry I don't get what you are saying. Which series?
Making LE tags structs is not going to fly, this is why sparse
implements the __bitwise__ tag and in fact this is where the name comes
from - bitwise operations need to work.



> >
> >
> > >
> > >
> > > > Thanks
> > > >
> > > > >
> > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > > +} else {
> > > > > > > > +svq->vring.avail->flags &= 
> > > > > > > > ~cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +/* Make sure the event is enabled before the read of 
> > > > > > > > used_idx */
> > > > > > > >  smp_mb();
> > > > > > > >  return !vhost_svq_more_used(svq);
> > > > > > > >  }
> > > > > > > >
> > > > > > > >  static void 
> > > > > > > > vhost_svq_disable_notification(VhostShadowVirtqueue *svq)
> > > > > > > >  {
> > > > > > > > -svq->vring.avail->flags |= 
> > > > > > > > cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > > +/*
> > > > > > > > + * No need to disable notification in the event idx case, 
> > > > > > > > since used event
> > > > > > > > + * index is already an index too far away.
> > > > > > > > + */
> > > > > > > > +if (!virtio_vdev_has_feature(svq->vdev, 
> > > > > > > > VIRTIO_RING_F_EVENT_IDX)) {
> > > > > > > > +svq->vring.avail->flags |= 
> > > > > > > > cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > > +}
> > > > > > > >  }
> > > > > > > >
> > > > > > > >  static uint16_t vhost_svq_last_desc_of_chain(const 
> > > > > > > > VhostShadowVirtqueue *svq,
> > > > > > > > --
> > > > > > > > 2.31.1
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> >




Re: [PATCH v3] linux-user: Add guest memory layout to exception dump

2022-10-24 Thread Richard Henderson

On 10/25/22 15:24, Richard Henderson wrote:
It would also fix a bug in that the host page permissions do not exactly match guest page 
permissions, and you're reporting host page permissions.


Gah, not true, we've already probed page_flags.

Reviewed-by: Richard Henderson 


r~



Re: [PATCH v5] linux-user: Add close_range() syscall

2022-10-24 Thread Richard Henderson

On 10/25/22 12:34, Helge Deller wrote:

Signed-off-by: Helge Deller
---
Changes:
v5: Simplify check of arg2 against target_fd_max even more
v4: Fix check of arg2
v3: fd_trans_unregister() only called if close_range() doesn't fail
v2: consider CLOSE_RANGE_CLOEXEC flag


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v3] linux-user: Add guest memory layout to exception dump

2022-10-24 Thread Richard Henderson

On 10/25/22 12:51, Helge Deller wrote:

On 10/25/22 04:25, Richard Henderson wrote:

On 10/25/22 11:57, Helge Deller wrote:

On 10/25/22 00:35, Richard Henderson wrote:

On 10/25/22 06:18, Helge Deller wrote:

When the emulation stops with a hard exception it's very useful for
debugging purposes to dump the current guest memory layout (for an
example see /proc/self/maps) beside the CPU registers.

The open_self_maps() function provides such a memory dump, but since
it's located in the syscall.c file, various changes (add #includes, make
this function externally visible, ...) are needed to be able to call it
from the existing EXCP_DUMP() macro.


/proc/self/maps has all of the qemu mappings in it as well.


I'm not quite sure on how to understand your comments above.
Just comments or NAK to the patch?


A question.

Did you really wanted the host mappings included?


No.
I wanted just the guest mappings.


If so, fine.
If not, pointing out there's a better function to use.


I'm not sure if it's the better choice.
It depends on the targetted audience of such output.

This is linux-user, so if someone runs a program he would expect
output of crash dumps like as he would see them on a native machine.
Showing "external host emulation mappings" seems strange.


Oh, I see.  My comments above confused read_self_maps (host) with open_self_maps (filtered 
guest).


I'll note that the output of page_dump() could be improved to exactly match 
/proc/self/maps format (which would probably be less confusing), and then re-implement 
open_self_maps in terms of that.  This would avoid read_self_maps entirely.


It would also fix a bug in that the host page permissions do not exactly match guest page 
permissions, and you're reporting host page permissions.



r~



[RFC PATCH 0/4] MultiFD zero-copy improvements

2022-10-24 Thread Leonardo Bras
RFC for an improvement suggested by Juan during the KVM Forum and a few
optimizations I found in the way.

Patch #1 is just moving code to a helper, should have no impact.

Patch #2 is my implementation of Juan's suggestion. I implemented the
simplest way I thought on the array size: a fixed defined value.
I am not sure if this is fine, or if the array size should be either
informed by the user either via QMP or cmdline.
That's an important point I really need feedback on.

Patch #3: Improve the qio_channel_flush() interface to accept flush
waiting for some writes finished instead of all of them. This reduces
the waiting time, since most recent writes/sends will take more time to
finish, while the older ones are probably finished by the first recvmsg()
syscall return.

Patch #4 uses #3 in multifd zero-copy. It flushes the LRU half of the
header array, allowing more writes to happen while the most recent ones
are ongoing, instead of waiting for everything to finish before sending
more.

It all works fine in my tests, but maybe I missed some cornercase.
Please provide any feedback you find fit.

Thank you all!

Best regards,
Leo

Leonardo Bras (4):
  migration/multifd/zero-copy: Create helper function for flushing
  migration/multifd/zero-copy: Merge header & pages send in a single
write
  QIOChannel: Add max_pending parameter to qio_channel_flush()
  migration/multifd/zero-copy: Flush only the LRU half of the header
array

 include/io/channel.h |  7 +++-
 migration/multifd.h  |  5 ++-
 io/channel-socket.c  |  5 ++-
 io/channel.c |  5 ++-
 migration/multifd.c  | 88 ++--
 5 files changed, 68 insertions(+), 42 deletions(-)

-- 
2.38.0




[RFC PATCH 1/4] migration/multifd/zero-copy: Create helper function for flushing

2022-10-24 Thread Leonardo Bras
Move flushing code from multifd_send_sync_main() to a new helper, and call
it in multifd_send_sync_main().

Signed-off-by: Leonardo Bras 
---
 migration/multifd.c | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 586ddc9d65..509bbbe3bf 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -566,6 +566,23 @@ void multifd_save_cleanup(void)
 multifd_send_state = NULL;
 }
 
+static int multifd_zero_copy_flush(QIOChannel *c)
+{
+int ret;
+Error *err = NULL;
+
+ret = qio_channel_flush(c, );
+if (ret < 0) {
+error_report_err(err);
+return -1;
+}
+if (ret == 1) {
+dirty_sync_missed_zero_copy();
+}
+
+return ret;
+}
+
 int multifd_send_sync_main(QEMUFile *f)
 {
 int i;
@@ -616,17 +633,8 @@ int multifd_send_sync_main(QEMUFile *f)
 qemu_mutex_unlock(>mutex);
 qemu_sem_post(>sem);
 
-if (flush_zero_copy && p->c) {
-int ret;
-Error *err = NULL;
-
-ret = qio_channel_flush(p->c, );
-if (ret < 0) {
-error_report_err(err);
-return -1;
-} else if (ret == 1) {
-dirty_sync_missed_zero_copy();
-}
+if (flush_zero_copy && p->c && (multifd_zero_copy_flush(p->c) < 0)) {
+return -1;
 }
 }
 for (i = 0; i < migrate_multifd_channels(); i++) {
-- 
2.38.0




[RFC PATCH 4/4] migration/multifd/zero-copy: Flush only the LRU half of the header array

2022-10-24 Thread Leonardo Bras
Zero-copy multifd migration sends both the header and the memory pages in a
single syscall. Since it's necessary to flush before reusing the header, a
header array was implemented, so each write call uses a different
array, and flushing only take place after all headers have been used,
meaning 1 flush for each N writes.

This method has a bottleneck, though: After the last write, a flush will
have to wait for all writes to finish, which will be a lot, meaning the
recvmsg() syscall called in qio_channel_socket_flush() will be called a
lot. On top of that, it will create a time period when the I/O queue is
empty and nothing is getting send: between the flush and the next write.

To avoid that, use qio_channel_flush()'s new max_pending parameter to wait
until at most half of the array is still in use. (i.e. the LRU half of the
array can be reused)

Flushing for the LRU half of the array is much faster, since it does not
have to wait for the most recent writes to finish, making up for having
to flush twice per array.

As a main benefit, this approach keeps the I/O queue from being empty while
there are still data to be sent, making it easier to keep the I/O maximum
throughput while consuming less cpu time.

Signed-off-by: Leonardo Bras 
---
 migration/multifd.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index c5d1f911a4..fe9df460f6 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -569,12 +569,13 @@ void multifd_save_cleanup(void)
 multifd_send_state = NULL;
 }
 
-static int multifd_zero_copy_flush(QIOChannel *c)
+static int multifd_zero_copy_flush(QIOChannel *c,
+   int max_remaining)
 {
 int ret;
 Error *err = NULL;
 
-ret = qio_channel_flush(c, 0, );
+ret = qio_channel_flush(c, max_remaining, );
 if (ret < 0) {
 error_report_err(err);
 return -1;
@@ -636,7 +637,7 @@ int multifd_send_sync_main(QEMUFile *f)
 qemu_mutex_unlock(>mutex);
 qemu_sem_post(>sem);
 
-if (flush_zero_copy && p->c && (multifd_zero_copy_flush(p->c) < 0)) {
+if (flush_zero_copy && p->c && (multifd_zero_copy_flush(p->c, 0) < 0)) 
{
 return -1;
 }
 }
@@ -719,12 +720,17 @@ static void *multifd_send_thread(void *opaque)
 
 if (use_zero_copy_send) {
 p->packet_idx = (p->packet_idx + 1) % HEADER_ARR_SZ;
-
-if (!p->packet_idx && (multifd_zero_copy_flush(p->c) < 0)) {
+/*
+ * When half the array have been used, flush to make sure the
+ * next half is available
+ */
+if (!(p->packet_idx % (HEADER_ARR_SZ / 2)) &&
+(multifd_zero_copy_flush(p->c, HEADER_ARR_SZ / 2) < 0)) {
 break;
 }
 header = (void *)p->packet + p->packet_idx * p->packet_len;
 }
+
 qemu_mutex_lock(>mutex);
 p->pending_job--;
 qemu_mutex_unlock(>mutex);
-- 
2.38.0




[RFC PATCH 3/4] QIOChannel: Add max_pending parameter to qio_channel_flush()

2022-10-24 Thread Leonardo Bras
Zero-copy write in Linux is an asynchronous type of write, meaning the send
process is not finished by the time the function returns. Since it's also
zero-copy, it means that incorrect data may be sent if the write buffer
gets modified after write returns.

To check if a zero-copy write is finished, qio_channel has implemented a
flush operation: qio_channel_flush(). As of today, by the time the flush
returns, user code knows for sure all previous zero-copy write have
finished, and it's safe to modify any write buffer.

While this kind of flush is necessary, it may take a while to flush if
there has been a write recently, as the OS has to wait for everything to be
sent before returning and allowing reuse / free of the write buffers.

An option that can improve performance in some scenarios is to modify flush
so it guarantees less than N zero-copy writes are pending, allowing some of
the write buffers to be reused. This allows flush to return much faster,
since it does not need to wait for the more recent writes to complete.

If (N == 0), then it replicates the previous flushing behavior.

Add max_pending parameter to qio_channel_flush() so caller can decide
what is the maximum number of pending writes remaining before the function
returns. Also, implement this change in qio_channel_socket_flush().

Change current calls of qio_channel_flush() so (max_pending == 0), and the
flush-all behavior is maintained.

Signed-off-by: Leonardo Bras 
---
 include/io/channel.h | 7 +--
 io/channel-socket.c  | 5 +++--
 io/channel.c | 5 +++--
 migration/multifd.c  | 2 +-
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..9ec1978a26 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -141,6 +141,7 @@ struct QIOChannelClass {
   IOHandler *io_write,
   void *opaque);
 int (*io_flush)(QIOChannel *ioc,
+int max_pending,
 Error **errp);
 };
 
@@ -875,11 +876,12 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
 /**
  * qio_channel_flush:
  * @ioc: the channel object
+ * @max_pending: Maximum remaining writes allowed in queue upon returning
  * @errp: pointer to a NULL-initialized error object
  *
- * Will block until every packet queued with
+ * Will block until there are at most max_pending writes called with
  * qio_channel_writev_full() + QIO_CHANNEL_WRITE_FLAG_ZERO_COPY
- * is sent, or return in case of any error.
+ * pending, or return in case of any error.
  *
  * If not implemented, acts as a no-op, and returns 0.
  *
@@ -889,6 +891,7 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
  */
 
 int qio_channel_flush(QIOChannel *ioc,
+  int max_pending,
   Error **errp);
 
 #endif /* QIO_CHANNEL_H */
diff --git a/io/channel-socket.c b/io/channel-socket.c
index b76dca9cc1..3d0c2b8a14 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -708,6 +708,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
 
 #ifdef QEMU_MSG_ZEROCOPY
 static int qio_channel_socket_flush(QIOChannel *ioc,
+int max_pending,
 Error **errp)
 {
 QIOChannelSocket *sioc = QIO_CHANNEL_SOCKET(ioc);
@@ -718,7 +719,7 @@ static int qio_channel_socket_flush(QIOChannel *ioc,
 int received;
 int ret;
 
-if (sioc->zero_copy_queued == sioc->zero_copy_sent) {
+if (sioc->zero_copy_queued - sioc->zero_copy_sent <= max_pending) {
 return 0;
 }
 
@@ -728,7 +729,7 @@ static int qio_channel_socket_flush(QIOChannel *ioc,
 
 ret = 1;
 
-while (sioc->zero_copy_sent < sioc->zero_copy_queued) {
+while (sioc->zero_copy_queued - sioc->zero_copy_sent > max_pending) {
 received = recvmsg(sioc->fd, , MSG_ERRQUEUE);
 if (received < 0) {
 switch (errno) {
diff --git a/io/channel.c b/io/channel.c
index 0640941ac5..9d9f15af50 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -490,7 +490,8 @@ off_t qio_channel_io_seek(QIOChannel *ioc,
 }
 
 int qio_channel_flush(QIOChannel *ioc,
-Error **errp)
+  int max_pending,
+  Error **errp)
 {
 QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
 
@@ -499,7 +500,7 @@ int qio_channel_flush(QIOChannel *ioc,
 return 0;
 }
 
-return klass->io_flush(ioc, errp);
+return klass->io_flush(ioc, max_pending, errp);
 }
 
 
diff --git a/migration/multifd.c b/migration/multifd.c
index aa18c47141..c5d1f911a4 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -574,7 +574,7 @@ static int multifd_zero_copy_flush(QIOChannel *c)
 int ret;
 Error *err = NULL;
 
-ret = qio_channel_flush(c, );
+ret = qio_channel_flush(c, 0, );
 if (ret < 0) {
 error_report_err(err);
 return -1;
-- 
2.38.0




[RFC PATCH 2/4] migration/multifd/zero-copy: Merge header & pages send in a single write

2022-10-24 Thread Leonardo Bras
When zero-copy-send is enabled, each loop iteration of the
multifd_send_thread will calls for qio_channel_write_*() twice: The first
one for sending the header without zero-copy flag and the second one for
sending the memory pages, with zero-copy flag enabled.

This ends up calling two syscalls per loop iteration, where one should be
enough.

Also, since the behavior for non-zero-copy write is synchronous, and the
behavior for zero-copy write is asynchronous, it ends up interleaving
synchronous and asynchronous writes, hurting performance that could
otherwise be improved.

The choice of sending the header without the zero-copy flag in a separated
write happened because the header memory area would be reused in the next
write, so it was almost certain to have changed before the kernel could
send the packet.

To send the packet with zero-copy, create an array of header area instead
of a single one, and use a different header area after each write. Also,
flush the sending queue after all the headers have been used.

To avoid adding a zero-copy conditional in multifd_send_fill_packet(),
add a packet parameter (the packet that should be filled). This way it's
simpler to pick which element of the array will be used as a header.

Suggested-by: Juan Quintela 
Signed-off-by: Leonardo Bras 
---
 migration/multifd.h |  5 -
 migration/multifd.c | 52 -
 2 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 519f498643..7f200c0286 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -90,6 +90,7 @@ typedef struct {
 
 /* this mutex protects the following parameters */
 QemuMutex mutex;
+
 /* is this channel thread running */
 bool running;
 /* should this thread finish */
@@ -109,8 +110,10 @@ typedef struct {
 
 /* thread local variables. No locking required */
 
-/* pointer to the packet */
+/* pointer to the packet array */
 MultiFDPacket_t *packet;
+/* index of the packet array */
+uint32_t packet_idx;
 /* size of the next packet that contains pages */
 uint32_t next_packet_size;
 /* packets sent through this channel */
diff --git a/migration/multifd.c b/migration/multifd.c
index 509bbbe3bf..aa18c47141 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -34,6 +34,8 @@
 #define MULTIFD_MAGIC 0x11223344U
 #define MULTIFD_VERSION 1
 
+#define HEADER_ARR_SZ 1024
+
 typedef struct {
 uint32_t magic;
 uint32_t version;
@@ -255,9 +257,9 @@ static void multifd_pages_clear(MultiFDPages_t *pages)
 g_free(pages);
 }
 
-static void multifd_send_fill_packet(MultiFDSendParams *p)
+static void multifd_send_fill_packet(MultiFDSendParams *p,
+ MultiFDPacket_t *packet)
 {
-MultiFDPacket_t *packet = p->packet;
 int i;
 
 packet->flags = cpu_to_be32(p->flags);
@@ -547,6 +549,7 @@ void multifd_save_cleanup(void)
 p->packet_len = 0;
 g_free(p->packet);
 p->packet = NULL;
+p->packet_idx = 0;
 g_free(p->iov);
 p->iov = NULL;
 g_free(p->normal);
@@ -651,6 +654,7 @@ int multifd_send_sync_main(QEMUFile *f)
 static void *multifd_send_thread(void *opaque)
 {
 MultiFDSendParams *p = opaque;
+MultiFDPacket_t *header = p->packet;
 Error *local_err = NULL;
 int ret = 0;
 bool use_zero_copy_send = migrate_use_zero_copy_send();
@@ -676,13 +680,9 @@ static void *multifd_send_thread(void *opaque)
 if (p->pending_job) {
 uint64_t packet_num = p->packet_num;
 uint32_t flags = p->flags;
-p->normal_num = 0;
 
-if (use_zero_copy_send) {
-p->iovs_num = 0;
-} else {
-p->iovs_num = 1;
-}
+p->normal_num = 0;
+p->iovs_num = 1;
 
 for (int i = 0; i < p->pages->num; i++) {
 p->normal[p->normal_num] = p->pages->offset[i];
@@ -696,7 +696,8 @@ static void *multifd_send_thread(void *opaque)
 break;
 }
 }
-multifd_send_fill_packet(p);
+
+multifd_send_fill_packet(p, header);
 p->flags = 0;
 p->num_packets++;
 p->total_normal_pages += p->normal_num;
@@ -707,18 +708,8 @@ static void *multifd_send_thread(void *opaque)
 trace_multifd_send(p->id, packet_num, p->normal_num, flags,
p->next_packet_size);
 
-if (use_zero_copy_send) {
-/* Send header first, without zerocopy */
-ret = qio_channel_write_all(p->c, (void *)p->packet,
-p->packet_len, _err);
-if (ret != 0) {
-break;
-}
-} else {
-/* Send header using the same writev call */
-p->iov[0].iov_len = p->packet_len;
- 

[PATCH v1 1/2] hw/riscv/opentitan: bump opentitan

2022-10-24 Thread Wilfred Mallawa
From: Wilfred Mallawa 

This patch updates the OpenTitan model to match
the specified register layout as per [1]. Which is also the latest
commit of OpenTitan supported by TockOS.

Note: Pinmux and Padctrl has been merged into Pinmux [2][3], this patch removes
any references to Padctrl. Note: OpenTitan doc [2] has not yet specified
much detail regarding this, except for a note that states `TODO: this
section needs to be updated to reflect the pinmux/padctrl merger`

[1] 
https://github.com/lowRISC/opentitan/blob/d072ac505f82152678d6e04be95c72b728a347b8/hw/top_earlgrey/sw/autogen/top_earlgrey_memory.h
[2] https://docs.opentitan.org/hw/top_earlgrey/doc/design/
[3] https://docs.opentitan.org/hw/ip/pinmux/doc/#overview

Signed-off-by: Wilfred Mallawa 
Reviewed-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
 hw/riscv/opentitan.c | 21 +
 include/hw/riscv/opentitan.h |  9 -
 2 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index be7ff1eea0..92493c629d 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -28,8 +28,16 @@
 #include "qemu/units.h"
 #include "sysemu/sysemu.h"
 
+/*
+ * This version of the OpenTitan machine currently supports
+ * OpenTitan RTL version:
+ * 
+ *
+ * MMIO mapping as per (specified commit):
+ * lowRISC/opentitan: hw/top_earlgrey/sw/autogen/top_earlgrey_memory.h
+ */
 static const MemMapEntry ibex_memmap[] = {
-[IBEX_DEV_ROM] ={  0x8000,   0x8000 },
+[IBEX_DEV_ROM] ={  0x8000,  0x8000 },
 [IBEX_DEV_RAM] ={  0x1000,  0x2 },
 [IBEX_DEV_FLASH] =  {  0x2000,  0x10 },
 [IBEX_DEV_UART] =   {  0x4000,  0x1000  },
@@ -38,17 +46,17 @@ static const MemMapEntry ibex_memmap[] = {
 [IBEX_DEV_I2C] ={  0x4008,  0x1000  },
 [IBEX_DEV_PATTGEN] ={  0x400e,  0x1000  },
 [IBEX_DEV_TIMER] =  {  0x4010,  0x1000  },
-[IBEX_DEV_SENSOR_CTRL] ={  0x4011,  0x1000  },
 [IBEX_DEV_OTP_CTRL] =   {  0x4013,  0x4000  },
 [IBEX_DEV_LC_CTRL] ={  0x4014,  0x1000  },
-[IBEX_DEV_USBDEV] = {  0x4015,  0x1000  },
+[IBEX_DEV_ALERT_HANDLER] =  {  0x4015,  0x1000  },
 [IBEX_DEV_SPI_HOST0] =  {  0x4030,  0x1000  },
 [IBEX_DEV_SPI_HOST1] =  {  0x4031,  0x1000  },
+[IBEX_DEV_USBDEV] = {  0x4032,  0x1000  },
 [IBEX_DEV_PWRMGR] = {  0x4040,  0x1000  },
 [IBEX_DEV_RSTMGR] = {  0x4041,  0x1000  },
 [IBEX_DEV_CLKMGR] = {  0x4042,  0x1000  },
 [IBEX_DEV_PINMUX] = {  0x4046,  0x1000  },
-[IBEX_DEV_PADCTRL] ={  0x4047,  0x1000  },
+[IBEX_DEV_SENSOR_CTRL] ={  0x4049,  0x1000  },
 [IBEX_DEV_FLASH_CTRL] = {  0x4100,  0x1000  },
 [IBEX_DEV_AES] ={  0x4110,  0x1000  },
 [IBEX_DEV_HMAC] =   {  0x4111,  0x1000  },
@@ -59,10 +67,9 @@ static const MemMapEntry ibex_memmap[] = {
 [IBEX_DEV_ENTROPY] ={  0x4116,  0x1000  },
 [IBEX_DEV_EDNO] =   {  0x4117,  0x1000  },
 [IBEX_DEV_EDN1] =   {  0x4118,  0x1000  },
-[IBEX_DEV_ALERT_HANDLER] =  {  0x411b,  0x1000  },
 [IBEX_DEV_NMI_GEN] ={  0x411c,  0x1000  },
 [IBEX_DEV_PERI] =   {  0x411f,  0x1 },
-[IBEX_DEV_PLIC] =   {  0x4800,  0x4005000  },
+[IBEX_DEV_PLIC] =   {  0x4800,  0x4005000 },
 [IBEX_DEV_FLASH_VIRTUAL] =  {  0x8000,  0x8 },
 };
 
@@ -265,8 +272,6 @@ static void lowrisc_ibex_soc_realize(DeviceState *dev_soc, 
Error **errp)
 memmap[IBEX_DEV_CLKMGR].base, memmap[IBEX_DEV_CLKMGR].size);
 create_unimplemented_device("riscv.lowrisc.ibex.pinmux",
 memmap[IBEX_DEV_PINMUX].base, memmap[IBEX_DEV_PINMUX].size);
-create_unimplemented_device("riscv.lowrisc.ibex.padctrl",
-memmap[IBEX_DEV_PADCTRL].base, memmap[IBEX_DEV_PADCTRL].size);
 create_unimplemented_device("riscv.lowrisc.ibex.usbdev",
 memmap[IBEX_DEV_USBDEV].base, memmap[IBEX_DEV_USBDEV].size);
 create_unimplemented_device("riscv.lowrisc.ibex.flash_ctrl",
diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
index 6665cd5794..1fc055cdff 100644
--- a/include/hw/riscv/opentitan.h
+++ b/include/hw/riscv/opentitan.h
@@ -81,7 +81,6 @@ enum {
 IBEX_DEV_RSTMGR,
 IBEX_DEV_CLKMGR,
 IBEX_DEV_PINMUX,
-IBEX_DEV_PADCTRL,
 IBEX_DEV_USBDEV,
 IBEX_DEV_FLASH_CTRL,
 IBEX_DEV_PLIC,
@@ -109,10 +108,10 @@ enum {
 IBEX_UART0_RX_TIMEOUT_IRQ = 7,
 IBEX_UART0_RX_PARITY_ERR_IRQ  = 8,
 IBEX_TIMER_TIMEREXPIRED0_0= 127,
-IBEX_SPI_HOST0_ERR_IRQ= 151,
-IBEX_SPI_HOST0_SPI_EVENT_IRQ  = 152,
-IBEX_SPI_HOST1_ERR_IRQ= 153,
-IBEX_SPI_HOST1_SPI_EVENT_IRQ  = 154,
+IBEX_SPI_HOST0_ERR_IRQ= 134,
+

Re: [RFC PATCH] main-loop: introduce WITH_QEMU_IOTHREAD_LOCK

2022-10-24 Thread Akihiko Odaki

Hi,

Thanks for your proposal. I always wanted to see the helper functions 
with block, which is really specific to Apple-related code are replaced 
with more QEMU-standard GLib infrastructure.


What about returning IoThreadLocked with qemu_iothread_auto_lock() and 
assign it to g_auto(IoThreadLocked)? I don't think the indirection by 
pointer is necessary in this case.


Regards,
Akihiko Odaki

On 2022/10/25 2:19, Alex Bennée wrote:

This helper intends to ape our other auto-unlocking helpers with
WITH_QEMU_LOCK_GUARD. The principle difference is the iothread lock
is often nested needs a little extra book keeping to ensure we don't
double lock or unlock a lock taken higher up the call chain.

Convert some of the common routines that follow this pattern to use
the new wrapper.

Signed-off-by: Alex Bennée 
---
  include/qemu/main-loop.h | 41 
  hw/core/cpu-common.c | 10 ++
  util/rcu.c   | 40 ---
  ui/cocoa.m   | 18 --
  4 files changed, 63 insertions(+), 46 deletions(-)

diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index aac707d073..604e1823da 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -341,6 +341,47 @@ void qemu_mutex_lock_iothread_impl(const char *file, int 
line);
   */
  void qemu_mutex_unlock_iothread(void);
  
+/**

+ * WITH_QEMU_IOTHREAD_LOCK - nested lock of iothread
+ *
+ * This is a specialised form of WITH_QEMU_LOCK_GUARD which is used to
+ * safely encapsulate code that needs the BQL. The main difference is
+ * the BQL is often nested so we need to save the state of it on entry
+ * so we know if we need to free it once we leave the scope of the gaurd.
+ */
+
+typedef struct {
+bool taken;
+} IoThreadLocked;
+
+static inline IoThreadLocked * qemu_iothread_auto_lock(IoThreadLocked *x)
+{
+bool locked = qemu_mutex_iothread_locked();
+if (!locked) {
+qemu_mutex_lock_iothread();
+x->taken = true;
+}
+return x;
+}
+
+static inline void qemu_iothread_auto_unlock(IoThreadLocked *x)
+{
+if (x->taken) {
+qemu_mutex_unlock_iothread();
+}
+}
+
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(IoThreadLocked, qemu_iothread_auto_unlock)
+
+#define WITH_QEMU_IOTHREAD_LOCK_(var) \
+for (g_autoptr(IoThreadLocked) var = \
+ qemu_iothread_auto_lock(&(IoThreadLocked) {}); \
+ var; \
+ qemu_iothread_auto_unlock(var), var = NULL)
+
+#define WITH_QEMU_IOTHREAD_LOCK \
+WITH_QEMU_IOTHREAD_LOCK_(glue(qemu_lockable_auto, __COUNTER__))
+
  /*
   * qemu_cond_wait_iothread: Wait on condition for the main loop mutex
   *
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index f9fdd46b9d..0a60f916a9 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -70,14 +70,8 @@ CPUState *cpu_create(const char *typename)
   * BQL here if we need to.  cpu_interrupt assumes it is held.*/
  void cpu_reset_interrupt(CPUState *cpu, int mask)
  {
-bool need_lock = !qemu_mutex_iothread_locked();
-
-if (need_lock) {
-qemu_mutex_lock_iothread();
-}
-cpu->interrupt_request &= ~mask;
-if (need_lock) {
-qemu_mutex_unlock_iothread();
+WITH_QEMU_IOTHREAD_LOCK {
+cpu->interrupt_request &= ~mask;
  }
  }
  
diff --git a/util/rcu.c b/util/rcu.c

index b6d6c71cff..02e7491de1 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -320,35 +320,27 @@ static void drain_rcu_callback(struct rcu_head *node)
  void drain_call_rcu(void)
  {
  struct rcu_drain rcu_drain;
-bool locked = qemu_mutex_iothread_locked();
  
  memset(_drain, 0, sizeof(struct rcu_drain));

  qemu_event_init(_drain.drain_complete_event, false);
  
-if (locked) {

-qemu_mutex_unlock_iothread();
-}
-
-
-/*
- * RCU callbacks are invoked in the same order as in which they
- * are registered, thus we can be sure that when 'drain_rcu_callback'
- * is called, all RCU callbacks that were registered on this thread
- * prior to calling this function are completed.
- *
- * Note that since we have only one global queue of the RCU callbacks,
- * we also end up waiting for most of RCU callbacks that were registered
- * on the other threads, but this is a side effect that shoudn't be
- * assumed.
- */
-
-qatomic_inc(_drain_call_rcu);
-call_rcu1(_drain.rcu, drain_rcu_callback);
-qemu_event_wait(_drain.drain_complete_event);
-qatomic_dec(_drain_call_rcu);
+WITH_QEMU_IOTHREAD_LOCK {
+/*
+ * RCU callbacks are invoked in the same order as in which they
+ * are registered, thus we can be sure that when 'drain_rcu_callback'
+ * is called, all RCU callbacks that were registered on this thread
+ * prior to calling this function are completed.
+ *
+ * Note that since we have only one global queue of the RCU callbacks,
+ * we also end up waiting for most of 

[PATCH v1 2/2] hw/riscv/opentitan: add aon_timer base unimpl

2022-10-24 Thread Wilfred Mallawa
From: Wilfred Mallawa 

Adds the updated `aon_timer` base as an unimplemented device. This is
used by TockOS, patch ensures the guest doesn't hit load faults.

Signed-off-by: Wilfred Mallawa 
Reviewed-by: Bin Meng 
Reviewed-by: Alistair Francis 
---
 hw/riscv/opentitan.c | 3 +++
 include/hw/riscv/opentitan.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index 92493c629d..78f895d773 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -56,6 +56,7 @@ static const MemMapEntry ibex_memmap[] = {
 [IBEX_DEV_RSTMGR] = {  0x4041,  0x1000  },
 [IBEX_DEV_CLKMGR] = {  0x4042,  0x1000  },
 [IBEX_DEV_PINMUX] = {  0x4046,  0x1000  },
+[IBEX_DEV_AON_TIMER] =  {  0x4047,  0x1000  },
 [IBEX_DEV_SENSOR_CTRL] ={  0x4049,  0x1000  },
 [IBEX_DEV_FLASH_CTRL] = {  0x4100,  0x1000  },
 [IBEX_DEV_AES] ={  0x4110,  0x1000  },
@@ -272,6 +273,8 @@ static void lowrisc_ibex_soc_realize(DeviceState *dev_soc, 
Error **errp)
 memmap[IBEX_DEV_CLKMGR].base, memmap[IBEX_DEV_CLKMGR].size);
 create_unimplemented_device("riscv.lowrisc.ibex.pinmux",
 memmap[IBEX_DEV_PINMUX].base, memmap[IBEX_DEV_PINMUX].size);
+create_unimplemented_device("riscv.lowrisc.ibex.aon_timer",
+memmap[IBEX_DEV_AON_TIMER].base, memmap[IBEX_DEV_AON_TIMER].size);
 create_unimplemented_device("riscv.lowrisc.ibex.usbdev",
 memmap[IBEX_DEV_USBDEV].base, memmap[IBEX_DEV_USBDEV].size);
 create_unimplemented_device("riscv.lowrisc.ibex.flash_ctrl",
diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
index 1fc055cdff..7659d1bc5b 100644
--- a/include/hw/riscv/opentitan.h
+++ b/include/hw/riscv/opentitan.h
@@ -81,6 +81,7 @@ enum {
 IBEX_DEV_RSTMGR,
 IBEX_DEV_CLKMGR,
 IBEX_DEV_PINMUX,
+IBEX_DEV_AON_TIMER,
 IBEX_DEV_USBDEV,
 IBEX_DEV_FLASH_CTRL,
 IBEX_DEV_PLIC,
-- 
2.37.3




[PATCH v1 0/2] hw/riscv/opentitan: bump opentitan version

2022-10-24 Thread Wilfred Mallawa
From: Wilfred Mallawa 

This patch provides updates to the OpenTitan model to bump to RTL
version .

A unique change here is the merger of hwip `padctrl` into `pinmux`, to
reflect this change, any references to `padctrl` are removed.
Additionally, an unimplemented device for `aon_timer` is added and IRQ
numbers are updated. 

Patch was tested by running the latest master of
TockOS as of this patch.

Changelog V1:
  - Added a comment specifying what
git SHA of OT we are targeting in [1/2].

Wilfred Mallawa (2):
  hw/riscv/opentitan: bump opentitan
  hw/riscv/opentitan: add aon_timer base unimpl

 hw/riscv/opentitan.c | 24 
 include/hw/riscv/opentitan.h | 10 +-
 2 files changed, 21 insertions(+), 13 deletions(-)

-- 
2.37.3




Re: [PATCH v0 2/2] hw/riscv/opentitan: add aon_timer base unimpl

2022-10-24 Thread Alistair Francis
On Tue, Oct 25, 2022 at 11:19 AM Wilfred Mallawa
 wrote:
>
> From: Wilfred Mallawa 
>
> Adds the updated `aon_timer` base as an unimplemented device. This is
> used by TockOS, patch ensures the guest doesn't hit load faults.
>
> Signed-off-by: Wilfred Mallawa 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/opentitan.c | 3 +++
>  include/hw/riscv/opentitan.h | 1 +
>  2 files changed, 4 insertions(+)
>
> diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
> index 373fed36b6..50452f792a 100644
> --- a/hw/riscv/opentitan.c
> +++ b/hw/riscv/opentitan.c
> @@ -48,6 +48,7 @@ static const MemMapEntry ibex_memmap[] = {
>  [IBEX_DEV_RSTMGR] = {  0x4041,  0x1000  },
>  [IBEX_DEV_CLKMGR] = {  0x4042,  0x1000  },
>  [IBEX_DEV_PINMUX] = {  0x4046,  0x1000  },
> +[IBEX_DEV_AON_TIMER] =  {  0x4047,  0x1000  },
>  [IBEX_DEV_SENSOR_CTRL] ={  0x4049,  0x1000  },
>  [IBEX_DEV_FLASH_CTRL] = {  0x4100,  0x1000  },
>  [IBEX_DEV_AES] ={  0x4110,  0x1000  },
> @@ -264,6 +265,8 @@ static void lowrisc_ibex_soc_realize(DeviceState 
> *dev_soc, Error **errp)
>  memmap[IBEX_DEV_CLKMGR].base, memmap[IBEX_DEV_CLKMGR].size);
>  create_unimplemented_device("riscv.lowrisc.ibex.pinmux",
>  memmap[IBEX_DEV_PINMUX].base, memmap[IBEX_DEV_PINMUX].size);
> +create_unimplemented_device("riscv.lowrisc.ibex.aon_timer",
> +memmap[IBEX_DEV_AON_TIMER].base, memmap[IBEX_DEV_AON_TIMER].size);
>  create_unimplemented_device("riscv.lowrisc.ibex.usbdev",
>  memmap[IBEX_DEV_USBDEV].base, memmap[IBEX_DEV_USBDEV].size);
>  create_unimplemented_device("riscv.lowrisc.ibex.flash_ctrl",
> diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
> index 1fc055cdff..7659d1bc5b 100644
> --- a/include/hw/riscv/opentitan.h
> +++ b/include/hw/riscv/opentitan.h
> @@ -81,6 +81,7 @@ enum {
>  IBEX_DEV_RSTMGR,
>  IBEX_DEV_CLKMGR,
>  IBEX_DEV_PINMUX,
> +IBEX_DEV_AON_TIMER,
>  IBEX_DEV_USBDEV,
>  IBEX_DEV_FLASH_CTRL,
>  IBEX_DEV_PLIC,
> --
> 2.37.3
>
>



Re: [PATCH v0 1/2] hw/riscv/opentitan: bump opentitan

2022-10-24 Thread Alistair Francis
On Tue, Oct 25, 2022 at 11:18 AM Wilfred Mallawa
 wrote:
>
> From: Wilfred Mallawa 
>
> This patch updates the OpenTitan model to match
> the specified register layout as per [1]. Which is also the latest
> commit of OpenTitan supported by TockOS.
>
> Note: Pinmux and Padctrl has been merged into Pinmux [2][3], this patch 
> removes
> any references to Padctrl. Note: OpenTitan doc [2] has not yet specified
> much detail regarding this, except for a note that states `TODO: this
> section needs to be updated to reflect the pinmux/padctrl merger`
>
> [1] 
> https://github.com/lowRISC/opentitan/blob/d072ac505f82152678d6e04be95c72b728a347b8/hw/top_earlgrey/sw/autogen/top_earlgrey_memory.h
> [2] https://docs.opentitan.org/hw/top_earlgrey/doc/design/
> [3] https://docs.opentitan.org/hw/ip/pinmux/doc/#overview
>
> Signed-off-by: Wilfred Mallawa 

Can you add a comment to the OpenTitan machine in QEMU specifying what
git SHA of OT we are targeting?

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/opentitan.c | 13 +
>  include/hw/riscv/opentitan.h |  9 -
>  2 files changed, 9 insertions(+), 13 deletions(-)
>
> diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
> index be7ff1eea0..373fed36b6 100644
> --- a/hw/riscv/opentitan.c
> +++ b/hw/riscv/opentitan.c
> @@ -29,7 +29,7 @@
>  #include "sysemu/sysemu.h"
>
>  static const MemMapEntry ibex_memmap[] = {
> -[IBEX_DEV_ROM] ={  0x8000,   0x8000 },
> +[IBEX_DEV_ROM] ={  0x8000,  0x8000 },
>  [IBEX_DEV_RAM] ={  0x1000,  0x2 },
>  [IBEX_DEV_FLASH] =  {  0x2000,  0x10 },
>  [IBEX_DEV_UART] =   {  0x4000,  0x1000  },
> @@ -38,17 +38,17 @@ static const MemMapEntry ibex_memmap[] = {
>  [IBEX_DEV_I2C] ={  0x4008,  0x1000  },
>  [IBEX_DEV_PATTGEN] ={  0x400e,  0x1000  },
>  [IBEX_DEV_TIMER] =  {  0x4010,  0x1000  },
> -[IBEX_DEV_SENSOR_CTRL] ={  0x4011,  0x1000  },
>  [IBEX_DEV_OTP_CTRL] =   {  0x4013,  0x4000  },
>  [IBEX_DEV_LC_CTRL] ={  0x4014,  0x1000  },
> -[IBEX_DEV_USBDEV] = {  0x4015,  0x1000  },
> +[IBEX_DEV_ALERT_HANDLER] =  {  0x4015,  0x1000  },
>  [IBEX_DEV_SPI_HOST0] =  {  0x4030,  0x1000  },
>  [IBEX_DEV_SPI_HOST1] =  {  0x4031,  0x1000  },
> +[IBEX_DEV_USBDEV] = {  0x4032,  0x1000  },
>  [IBEX_DEV_PWRMGR] = {  0x4040,  0x1000  },
>  [IBEX_DEV_RSTMGR] = {  0x4041,  0x1000  },
>  [IBEX_DEV_CLKMGR] = {  0x4042,  0x1000  },
>  [IBEX_DEV_PINMUX] = {  0x4046,  0x1000  },
> -[IBEX_DEV_PADCTRL] ={  0x4047,  0x1000  },
> +[IBEX_DEV_SENSOR_CTRL] ={  0x4049,  0x1000  },
>  [IBEX_DEV_FLASH_CTRL] = {  0x4100,  0x1000  },
>  [IBEX_DEV_AES] ={  0x4110,  0x1000  },
>  [IBEX_DEV_HMAC] =   {  0x4111,  0x1000  },
> @@ -59,10 +59,9 @@ static const MemMapEntry ibex_memmap[] = {
>  [IBEX_DEV_ENTROPY] ={  0x4116,  0x1000  },
>  [IBEX_DEV_EDNO] =   {  0x4117,  0x1000  },
>  [IBEX_DEV_EDN1] =   {  0x4118,  0x1000  },
> -[IBEX_DEV_ALERT_HANDLER] =  {  0x411b,  0x1000  },
>  [IBEX_DEV_NMI_GEN] ={  0x411c,  0x1000  },
>  [IBEX_DEV_PERI] =   {  0x411f,  0x1 },
> -[IBEX_DEV_PLIC] =   {  0x4800,  0x4005000  },
> +[IBEX_DEV_PLIC] =   {  0x4800,  0x4005000 },
>  [IBEX_DEV_FLASH_VIRTUAL] =  {  0x8000,  0x8 },
>  };
>
> @@ -265,8 +264,6 @@ static void lowrisc_ibex_soc_realize(DeviceState 
> *dev_soc, Error **errp)
>  memmap[IBEX_DEV_CLKMGR].base, memmap[IBEX_DEV_CLKMGR].size);
>  create_unimplemented_device("riscv.lowrisc.ibex.pinmux",
>  memmap[IBEX_DEV_PINMUX].base, memmap[IBEX_DEV_PINMUX].size);
> -create_unimplemented_device("riscv.lowrisc.ibex.padctrl",
> -memmap[IBEX_DEV_PADCTRL].base, memmap[IBEX_DEV_PADCTRL].size);
>  create_unimplemented_device("riscv.lowrisc.ibex.usbdev",
>  memmap[IBEX_DEV_USBDEV].base, memmap[IBEX_DEV_USBDEV].size);
>  create_unimplemented_device("riscv.lowrisc.ibex.flash_ctrl",
> diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
> index 6665cd5794..1fc055cdff 100644
> --- a/include/hw/riscv/opentitan.h
> +++ b/include/hw/riscv/opentitan.h
> @@ -81,7 +81,6 @@ enum {
>  IBEX_DEV_RSTMGR,
>  IBEX_DEV_CLKMGR,
>  IBEX_DEV_PINMUX,
> -IBEX_DEV_PADCTRL,
>  IBEX_DEV_USBDEV,
>  IBEX_DEV_FLASH_CTRL,
>  IBEX_DEV_PLIC,
> @@ -109,10 +108,10 @@ enum {
>  IBEX_UART0_RX_TIMEOUT_IRQ = 7,
>  IBEX_UART0_RX_PARITY_ERR_IRQ  = 8,
>  IBEX_TIMER_TIMEREXPIRED0_0= 127,
> -IBEX_SPI_HOST0_ERR_IRQ= 151,
> -IBEX_SPI_HOST0_SPI_EVENT_IRQ  = 152,
> -IBEX_SPI_HOST1_ERR_IRQ= 153,
> -

Re: [RFC 6/8] target/riscv: delete redundant check for zcd instructions in decode_opc

2022-10-24 Thread Alistair Francis
On Fri, Sep 30, 2022 at 11:28 AM Weiwei Li  wrote:
>
> All the check for Zcd instructions have been done in their trans function
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 
> ---
>  target/riscv/translate.c | 7 ---
>  1 file changed, 7 deletions(-)
>
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 347bc913eb..a55b4a7849 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -1087,13 +1087,6 @@ static void decode_opc(CPURISCVState *env, 
> DisasContext *ctx, uint16_t opcode)
>   ((opcode & 0xe003) == 0xe000) ||
>   ((opcode & 0xe003) == 0xe002))) {
>  gen_exception_illegal(ctx);
> -} else if (!(has_ext(ctx, RVC) || ctx->cfg_ptr->ext_zcd ||
> - ctx->cfg_ptr->ext_zcmp || ctx->cfg_ptr->ext_zcmt) &&
> -   (((opcode & 0xe003) == 0x2000) ||
> -((opcode & 0xe003) == 0x2002) ||
> -((opcode & 0xe003) == 0xa000) ||
> -((opcode & 0xe003) == 0xa002))) {
> -gen_exception_illegal(ctx);

It's probably best to never add this in the first place.

Remember that the extension can't be enabled until the last patch, so
it's ok if we don't support it all in one go

Alistair

>  } else {
>  ctx->opcode = opcode;
>  ctx->pc_succ_insn = ctx->base.pc_next + 2;
> --
> 2.25.1
>
>



Re: [RFC 7/8] target/riscv: expose properties for Zc* extension

2022-10-24 Thread Alistair Francis
On Fri, Sep 30, 2022 at 11:29 AM Weiwei Li  wrote:
>
> Expose zca,zcb,zcf,zcd,zcmp,zcmt properties
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 7da3de1725..e6f722278c 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -80,6 +80,12 @@ static const struct isa_ext_data isa_edata_arr[] = {
>  ISA_EXT_DATA_ENTRY(zfhmin, true, PRIV_VERSION_1_12_0, ext_zfhmin),
>  ISA_EXT_DATA_ENTRY(zfinx, true, PRIV_VERSION_1_12_0, ext_zfinx),
>  ISA_EXT_DATA_ENTRY(zdinx, true, PRIV_VERSION_1_12_0, ext_zdinx),
> +ISA_EXT_DATA_ENTRY(zca, true, PRIV_VERSION_1_12_0, ext_zca),
> +ISA_EXT_DATA_ENTRY(zcb, true, PRIV_VERSION_1_12_0, ext_zcb),
> +ISA_EXT_DATA_ENTRY(zcf, true, PRIV_VERSION_1_12_0, ext_zcf),
> +ISA_EXT_DATA_ENTRY(zcd, true, PRIV_VERSION_1_12_0, ext_zcd),
> +ISA_EXT_DATA_ENTRY(zcmp, true, PRIV_VERSION_1_12_0, ext_zcmp),
> +ISA_EXT_DATA_ENTRY(zcmt, true, PRIV_VERSION_1_12_0, ext_zcmt),
>  ISA_EXT_DATA_ENTRY(zba, true, PRIV_VERSION_1_12_0, ext_zba),
>  ISA_EXT_DATA_ENTRY(zbb, true, PRIV_VERSION_1_12_0, ext_zbb),
>  ISA_EXT_DATA_ENTRY(zbc, true, PRIV_VERSION_1_12_0, ext_zbc),
> @@ -1070,6 +1076,13 @@ static Property riscv_cpu_extensions[] = {
>
>  /* These are experimental so mark with 'x-' */
>  DEFINE_PROP_BOOL("x-j", RISCVCPU, cfg.ext_j, false),
> +
> +DEFINE_PROP_BOOL("x-zca", RISCVCPU, cfg.ext_zca, false),
> +DEFINE_PROP_BOOL("x-zcb", RISCVCPU, cfg.ext_zcb, false),
> +DEFINE_PROP_BOOL("x-zcd", RISCVCPU, cfg.ext_zcd, false),
> +DEFINE_PROP_BOOL("x-zcf", RISCVCPU, cfg.ext_zcf, false),
> +DEFINE_PROP_BOOL("x-zcmp", RISCVCPU, cfg.ext_zcmp, false),
> +DEFINE_PROP_BOOL("x-zcmt", RISCVCPU, cfg.ext_zcmt, false),
>  /* ePMP 0.9.3 */
>  DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false),
>  DEFINE_PROP_BOOL("x-smaia", RISCVCPU, cfg.ext_smaia, false),
> --
> 2.25.1
>
>



Re: [RFC 1/8] target/riscv: add cfg properties for Zc* extension

2022-10-24 Thread Alistair Francis
On Fri, Sep 30, 2022 at 11:28 AM Weiwei Li  wrote:
>
> Add properties for Zca,Zcb,Zcf,Zcd,Zcmp,Zcmt extension
> Add check for these properties
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu.c | 24 
>  target/riscv/cpu.h |  6 ++
>  2 files changed, 30 insertions(+)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index b29c88b9f0..7da3de1725 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -782,6 +782,30 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> **errp)
>  }
>  }
>
> +if (env->misa_mxl_max != MXL_RV32 && cpu->cfg.ext_zcf) {
> +error_setg(errp, "Zcf is only relevant to RV32");
> +return;
> +}
> +
> +if ((cpu->cfg.ext_zcf || cpu->cfg.ext_zcd || cpu->cfg.ext_zcb ||
> + cpu->cfg.ext_zcmp || cpu->cfg.ext_zcmt) && !cpu->cfg.ext_zca) {
> +error_setg(errp, "Zcf/Zcd/Zcb/Zcmp/Zcmt require Zca extension");
> +return;
> +}
> +
> +if env->misa_ext & RVD) && (env->misa_ext & RVC)) ||
> + cpu->cfg.ext_zcd) && (cpu->cfg.ext_zcmp || cpu->cfg.ext_zcmt)) {
> +error_setg(errp,
> +   "Zcmp/Zcmt are incompatible with Zcd, which is "
> +   "included when C and D extensions are both present");
> +return;
> +}
> +
> +if (cpu->cfg.ext_zcmt && !cpu->cfg.ext_icsr) {
> +error_setg(errp, "Zcmt extension requires Zicsr");
> +return;
> +}
> +
>  if (cpu->cfg.ext_zk) {
>  cpu->cfg.ext_zkn = true;
>  cpu->cfg.ext_zkr = true;
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index b131fa8c8e..db3eca1d8a 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -428,6 +428,12 @@ struct RISCVCPUConfig {
>  bool ext_zbkc;
>  bool ext_zbkx;
>  bool ext_zbs;
> +bool ext_zca;
> +bool ext_zcb;
> +bool ext_zcd;
> +bool ext_zcf;
> +bool ext_zcmp;
> +bool ext_zcmt;
>  bool ext_zk;
>  bool ext_zkn;
>  bool ext_zknd;
> --
> 2.25.1
>
>



Re: [PATCH] tcg/riscv: Fix reg overlap case in tcg_out_addsub2

2022-10-24 Thread Alistair Francis
On Fri, Oct 21, 2022 at 9:47 AM Richard Henderson
 wrote:
>
> There was a typo using opc_addi instead of opc_add with the
> two registers.  While we're at it, simplify the gating test
> to al == bl to improve dynamic scheduling even when the
> output register does not overlap the inputs.
>
> Reported-by: LIU Zhiwei 
> Signed-off-by: Richard Henderson 

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
> Supersedes: 20221020104154.4276-4-zhiwei_...@linux.alibaba.com
> ("[RFC PATCH 3/3] tcg/riscv: Remove a wrong optimization for addsub2")
> ---
>  tcg/riscv/tcg-target.c.inc | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 81a83e45b1..1cdaf7b57b 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -687,9 +687,15 @@ static void tcg_out_addsub2(TCGContext *s,
>  if (cbl) {
>  tcg_out_opc_imm(s, opc_addi, rl, al, bl);
>  tcg_out_opc_imm(s, OPC_SLTIU, TCG_REG_TMP0, rl, bl);
> -} else if (rl == al && rl == bl) {
> +} else if (al == bl) {
> +/*
> + * If the input regs overlap, this is a simple doubling
> + * and carry-out is the input msb.  This special case is
> + * required when the output reg overlaps the input,
> + * but we might as well use it always.
> + */
>  tcg_out_opc_imm(s, OPC_SLTI, TCG_REG_TMP0, al, 0);
> -tcg_out_opc_reg(s, opc_addi, rl, al, bl);
> +tcg_out_opc_reg(s, opc_add, rl, al, al);
>  } else {
>  tcg_out_opc_reg(s, opc_add, rl, al, bl);
>  tcg_out_opc_reg(s, OPC_SLTU, TCG_REG_TMP0,
> --
> 2.34.1
>
>



Re: [PATCH v2] kset: fix memory leak when kset_register() returns error

2022-10-24 Thread Luben Tuikov
On 2022-10-24 08:19, Yang Yingliang wrote:
> Inject fault while loading module, kset_register() may fail.
> If it fails, the name allocated by kobject_set_name() which
> is called before kset_register() is leaked, because refcount
> of kobject is hold in kset_init().
> 
> As a kset may be embedded in a larger structure which needs
> be freed in release() function or error path in callers, we
> can not call kset_put() in kset_register(), or it will cause
> double free, so just call kfree_const() to free the name and
> set it to NULL.
> 
> With this fix, the callers don't need to care about the name
> freeing and call an extra kset_put() if kset_register() fails.
> 
> Suggested-by: Luben Tuikov 
> Signed-off-by: Yang Yingliang 
> ---
> v1 -> v2:
>   Free name inside of kset_register() instead of calling kset_put()
>   in drivers.
> ---
>  lib/kobject.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/kobject.c b/lib/kobject.c
> index a0b2dbfcfa23..3409a89c81e5 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -834,6 +834,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
>  /**
>   * kset_register() - Initialize and add a kset.
>   * @k: kset.
> + *
> + * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
> + * which is called before kset_register() in caller need be freed.
>   */

The "need be freed" is confusing here because it is not clear if the user
needs to do this or if it is done by the code. Since it is the latter,
it should read "_is_ freed". Like this (no "NOTE"):

"On error, the kset.kobj.name allocated by kobj_set_name(),
 which must be called before kset_register() is called, is freed
 by this function."

Regards,
Luben

>  int kset_register(struct kset *k)
>  {
> @@ -844,8 +847,11 @@ int kset_register(struct kset *k)
>  
>   kset_init(k);
>   err = kobject_add_internal(>kobj);
> - if (err)
> + if (err) {
> + kfree_const(k->kobj.name);
> + k->kobj.name = NULL;
>   return err;
> + }
>   kobject_uevent(>kobj, KOBJ_ADD);
>   return 0;
>  }




Re: [PATCH v3] linux-user: Add guest memory layout to exception dump

2022-10-24 Thread Helge Deller

On 10/25/22 04:25, Richard Henderson wrote:

On 10/25/22 11:57, Helge Deller wrote:

On 10/25/22 00:35, Richard Henderson wrote:

On 10/25/22 06:18, Helge Deller wrote:

When the emulation stops with a hard exception it's very useful for
debugging purposes to dump the current guest memory layout (for an
example see /proc/self/maps) beside the CPU registers.

The open_self_maps() function provides such a memory dump, but since
it's located in the syscall.c file, various changes (add #includes, make
this function externally visible, ...) are needed to be able to call it
from the existing EXCP_DUMP() macro.


/proc/self/maps has all of the qemu mappings in it as well.


I'm not quite sure on how to understand your comments above.
Just comments or NAK to the patch?


A question.

Did you really wanted the host mappings included?


No.
I wanted just the guest mappings.


If so, fine.
If not, pointing out there's a better function to use.


I'm not sure if it's the better choice.
It depends on the targetted audience of such output.

This is linux-user, so if someone runs a program he would expect
output of crash dumps like as he would see them on a native machine.
Showing "external host emulation mappings" seems strange.

I'm running a debian hppa buildd server with linux-user.
I've seen many guest crashes like SEGVs, out-of-memory (in guest),
glibc's ABORT() calls [which e.g. triggers a CPU exception] or other
things. In all those cases the guest mapping was relevant, not
the host mapping.

Helge



Re: [RFC PATCH v2 2/8] vdpa: Save emulated features list in vhost_vdpa

2022-10-24 Thread Jason Wang
On Mon, Oct 24, 2022 at 5:26 PM Eugenio Perez Martin
 wrote:
>
> On Mon, Oct 24, 2022 at 4:14 AM Jason Wang  wrote:
> >
> > On Fri, Oct 21, 2022 at 4:56 PM Eugenio Perez Martin
> >  wrote:
> > >
> > > On Fri, Oct 21, 2022 at 4:57 AM Jason Wang  wrote:
> > > >
> > > > On Thu, Oct 20, 2022 at 2:34 PM Eugenio Perez Martin
> > > >  wrote:
> > > > >
> > > > > On Thu, Oct 20, 2022 at 6:23 AM Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Oct 19, 2022 at 8:52 PM Eugenio Pérez  
> > > > > > wrote:
> > > > > > >
> > > > > > > At this moment only _F_LOG is added there.
> > > > > > >
> > > > > > > However future patches add features that depend on the kind of 
> > > > > > > device.
> > > > > > > In particular, only net devices can add VIRTIO_F_GUEST_ANNOUNCE. 
> > > > > > > So
> > > > > > > let's allow vhost_vdpa creator to set custom emulated device 
> > > > > > > features.
> > > > > > >
> > > > > > > Signed-off-by: Eugenio Pérez 
> > > > > > > ---
> > > > > > >  include/hw/virtio/vhost-vdpa.h | 2 ++
> > > > > > >  hw/virtio/vhost-vdpa.c | 8 
> > > > > > >  net/vhost-vdpa.c   | 4 
> > > > > > >  3 files changed, 10 insertions(+), 4 deletions(-)
> > > > > > >
> > > > > > > diff --git a/include/hw/virtio/vhost-vdpa.h 
> > > > > > > b/include/hw/virtio/vhost-vdpa.h
> > > > > > > index d85643..50083e1e3b 100644
> > > > > > > --- a/include/hw/virtio/vhost-vdpa.h
> > > > > > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > > > > > @@ -31,6 +31,8 @@ typedef struct vhost_vdpa {
> > > > > > >  bool iotlb_batch_begin_sent;
> > > > > > >  MemoryListener listener;
> > > > > > >  struct vhost_vdpa_iova_range iova_range;
> > > > > > > +/* VirtIO device features that can be emulated by qemu */
> > > > > > > +uint64_t added_features;
> > > > > >
> > > > > > Any reason we need a per vhost_vdpa storage for this? Or is there a
> > > > > > chance that this field could be different among the devices?
> > > > > >
> > > > >
> > > > > Yes, one device could support SVQ and the other one could not support
> > > > > it because of different feature sets for example.
> > > >
> > > > Right, but for those devices that don't support SVQ, we don't even
> > > > need mediation for feature like F_LOG and _F_STATUS?
> > > >
> > >
> > > No, and we cannot offer it to the guest either.
> >
> > Just to make sure we are on the same page, what I meant is, consider
> > in the future SVQ get the support of all features, so we can remove
> > this field? This is because _F_STATUS can be mediated unconditionally
> > anyhow.
> >
>
> For _F_STATUS that is right. But we cannot handle full
> _F_GUEST_ANNOUNCE since control SVQ (will) needs features from the
> device that cannot be emulated, like ASID.
>
> I think your point is "Since qemu cannot migrate these devices it will
> never set VIRTIO_NET_S_ANNOUNCE, so the guest will never send
> VIRTIO_NET_CTRL_ANNOUNCE messages". And I think that is totally right,
> but I still feel it is weird to expose it if we cannot handle it.
>
> Maybe a good first step is to move added_features to vhost_net, or
> maybe to convert it to "bool guest_announce_emulated" or something
> similar?  This way hw/virtio/vhost-vdpa is totally unaware of this and
> changes are more self contained.

This reminds me of something. For vhost, if Qemu can handle some
feature bits, we don't need to validate if vhost has such support.

E.g we don't have _F_SATAUS and _F_GUEST_ANNOUNCE in kernel_feature_bits.

I wonder if we can do something similar for vhost-vDPA? Then we don't
need to bother new fields.

Thanks

>
> Thanks!
>
>
>
> > Thanks
> >
> > >
> > > > Thanks
> > > >
> > > > >
> > > > > Thanks!
> > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >  uint64_t acked_features;
> > > > > > >  bool shadow_vqs_enabled;
> > > > > > >  /* IOVA mapping used by the Shadow Virtqueue */
> > > > > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > > > > > index 7468e44b87..ddb5e29288 100644
> > > > > > > --- a/hw/virtio/vhost-vdpa.c
> > > > > > > +++ b/hw/virtio/vhost-vdpa.c
> > > > > > > @@ -660,8 +660,8 @@ static int vhost_vdpa_set_features(struct 
> > > > > > > vhost_dev *dev,
> > > > > > >
> > > > > > >  v->acked_features = features;
> > > > > > >
> > > > > > > -/* We must not ack _F_LOG if SVQ is enabled */
> > > > > > > -features &= ~BIT_ULL(VHOST_F_LOG_ALL);
> > > > > > > +/* Do not ack features emulated by qemu */
> > > > > > > +features &= ~v->added_features;
> > > > > > >  }
> > > > > > >
> > > > > > >  trace_vhost_vdpa_set_features(dev, features);
> > > > > > > @@ -1244,8 +1244,8 @@ static int vhost_vdpa_get_features(struct 
> > > > > > > vhost_dev *dev,
> > > > > > >  int ret = vhost_vdpa_get_dev_features(dev, features);
> > > > > > >
> > > > > > >  if (ret == 0 && v->shadow_vqs_enabled) {
> > > > > > > -/* Add SVQ logging capabilities */
> > > > > > > -*features |= 

[PATCH v5] linux-user: Add close_range() syscall

2022-10-24 Thread Helge Deller
Signed-off-by: Helge Deller 
---
Changes:
v5: Simplify check of arg2 against target_fd_max even more
v4: Fix check of arg2
v3: fd_trans_unregister() only called if close_range() doesn't fail
v2: consider CLOSE_RANGE_CLOEXEC flag

diff --git a/linux-user/strace.list b/linux-user/strace.list
index a87415bf3d..78796266e8 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -103,6 +103,9 @@
 #ifdef TARGET_NR_close
 { TARGET_NR_close, "close" , "%s(%d)", NULL, NULL },
 #endif
+#ifdef TARGET_NR_close_range
+{ TARGET_NR_close_range, "close_range" , "%s(%u,%u,%u)", NULL, NULL },
+#endif
 #ifdef TARGET_NR_connect
 { TARGET_NR_connect, "connect" , "%s(%d,%#x,%d)", NULL, NULL },
 #endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 2e954d8dbd..c51d619a5c 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -338,6 +338,13 @@ _syscall3(int,sys_syslog,int,type,char*,bufp,int,len)
 #ifdef __NR_exit_group
 _syscall1(int,exit_group,int,error_code)
 #endif
+#if defined(__NR_close_range) && defined(TARGET_NR_close_range)
+#define __NR_sys_close_range __NR_close_range
+_syscall3(int,sys_close_range,int,first,int,last,int,flags)
+#ifndef CLOSE_RANGE_CLOEXEC
+#define CLOSE_RANGE_CLOEXEC (1U << 2)
+#endif
+#endif
 #if defined(__NR_futex)
 _syscall6(int,sys_futex,int *,uaddr,int,op,int,val,
   const struct timespec *,timeout,int *,uaddr2,int,val3)
@@ -8699,6 +8706,18 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
 case TARGET_NR_close:
 fd_trans_unregister(arg1);
 return get_errno(close(arg1));
+#if defined(__NR_close_range) && defined(TARGET_NR_close_range)
+case TARGET_NR_close_range:
+ret = get_errno(sys_close_range(arg1, arg2, arg3));
+if (ret == 0 && !(arg3 & CLOSE_RANGE_CLOEXEC)) {
+abi_long fd, maxfd;
+maxfd = MIN(arg2, target_fd_max);
+for (fd = arg1; fd < maxfd; fd++) {
+fd_trans_unregister(fd);
+}
+}
+return ret;
+#endif

 case TARGET_NR_brk:
 return do_brk(arg1);



Re: [PATCH 2/4] vhost: toggle device callbacks using used event idx

2022-10-24 Thread Jason Wang
On Mon, Oct 24, 2022 at 10:05 PM Michael S. Tsirkin  wrote:
>
> On Mon, Oct 24, 2022 at 04:00:37PM +0200, Eugenio Perez Martin wrote:
> > > > It's generally a waste that we don't use endian-ness annotations
> > > > the way linux does.
> > >
> > > Yes, it's worth doing something similar sometime.
> > >
> >
> > Maybe we could wrap them in some struct like virtio_le16 or virtio_16,
> > avoiding at least integer direct assignment? Wrappers like
> > cpu_to_virtio16 could return these structs and I think all compilers
> > should emit the same code as direct assignment.
> >
> > Thanks!
> >
>
> This will break bitwise operations such as | and &.
> Generally Linux has solved the problem and I don't think
> we should go look for another solution.

Yes, but it should not block this series (we can do that in the future
if we had bandwidth).

Thanks

>
>
> >
> >
> > > Thanks
> > >
> > > >
> > > >
> > > > > > Thanks
> > > > > >
> > > > > > > +} else {
> > > > > > > +svq->vring.avail->flags &= 
> > > > > > > ~cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > +}
> > > > > > > +
> > > > > > > +/* Make sure the event is enabled before the read of 
> > > > > > > used_idx */
> > > > > > >  smp_mb();
> > > > > > >  return !vhost_svq_more_used(svq);
> > > > > > >  }
> > > > > > >
> > > > > > >  static void vhost_svq_disable_notification(VhostShadowVirtqueue 
> > > > > > > *svq)
> > > > > > >  {
> > > > > > > -svq->vring.avail->flags |= 
> > > > > > > cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > +/*
> > > > > > > + * No need to disable notification in the event idx case, 
> > > > > > > since used event
> > > > > > > + * index is already an index too far away.
> > > > > > > + */
> > > > > > > +if (!virtio_vdev_has_feature(svq->vdev, 
> > > > > > > VIRTIO_RING_F_EVENT_IDX)) {
> > > > > > > +svq->vring.avail->flags |= 
> > > > > > > cpu_to_le16(VRING_AVAIL_F_NO_INTERRUPT);
> > > > > > > +}
> > > > > > >  }
> > > > > > >
> > > > > > >  static uint16_t vhost_svq_last_desc_of_chain(const 
> > > > > > > VhostShadowVirtqueue *svq,
> > > > > > > --
> > > > > > > 2.31.1
> > > > > > >
> > > > > >
> > > >
> > >
>




Re: [PATCH v3] linux-user: Add guest memory layout to exception dump

2022-10-24 Thread Richard Henderson

On 10/25/22 11:57, Helge Deller wrote:

On 10/25/22 00:35, Richard Henderson wrote:

On 10/25/22 06:18, Helge Deller wrote:

When the emulation stops with a hard exception it's very useful for
debugging purposes to dump the current guest memory layout (for an
example see /proc/self/maps) beside the CPU registers.

The open_self_maps() function provides such a memory dump, but since
it's located in the syscall.c file, various changes (add #includes, make
this function externally visible, ...) are needed to be able to call it
from the existing EXCP_DUMP() macro.


/proc/self/maps has all of the qemu mappings in it as well.


I'm not quite sure on how to understand your comments above.
Just comments or NAK to the patch?


A question.

Did you really wanted the host mappings included?
If so, fine.
If not, pointing out there's a better function to use.


r~



Re: [PATCH] target/riscv/pmp: fix non-translated page size address checks w/ MPU

2022-10-24 Thread Alistair Francis
On Thu, Oct 20, 2022 at 7:29 AM Leon Schuermann
 wrote:
>
> Alistair Francis  writes:
> >> @@ -310,10 +311,17 @@ bool pmp_hart_has_privs(CPURISCVState *env, 
> >> target_ulong addr,
> >>  }
> >>
> >>  if (size == 0) {
> >> -if (riscv_feature(env, RISCV_FEATURE_MMU)) {
> >> +if (riscv_cpu_mxl(env) == MXL_RV32) {
> >> +satp_mode = SATP32_MODE;
> >> +} else {
> >> +satp_mode = SATP64_MODE;
> >> +}
> >> +
> >> +if (riscv_feature(env, RISCV_FEATURE_MMU)
> >> +&& get_field(env->satp, satp_mode)) {
> >>  /*
> >> - * If size is unknown (0), assume that all bytes
> >> - * from addr to the end of the page will be accessed.
> >> + * If size is unknown (0) and virtual memory is enabled, 
> >> assume that
> >> + * all bytes from addr to the end of the page will be 
> >> accessed.
> >>   */
> >>  pmp_size = -(addr | TARGET_PAGE_MASK);
> >
> > I'm not sure if we need this at all.
> >
> > This function is only called from get_physical_address_pmp() which
> > then calculates the maximum size using pmp_is_range_in_tlb().
>
> I'm by no means an expert on QEMU and the TCG, so I've spun up GDB to
> trace down why exactly this function is called with a `size = 0`
> argument. It seems that there are, generally, two code paths to this
> function for instruction fetching:
>
> 1. From `get_page_addr_code`: this will invoke `tlb_fill` with
>`size = 0` to check whether an entire page is accessible and can be
>translated given the current MMU / PMP configuration. In my
>particular example, it may rightfully fail then. `get_page_addr_code`
>can handle this and will subsequently cause an MMU protection check
>to be run for each instruction translated.
>
> 2. From `riscv_tr_translate_insn` through `cpu_lduw_code`, which will
>execute `tlb_fill` with `size = 2` (to try and decode a compressed
>instruction), assuming that the above check failed.
>
> So far, so good. In this context, it actually makes sense for
> `pmp_hart_has_privs` to interpret `size = 0` to mean whether the entire
> page is allowed to be accessed.
>
> > I suspect that we could just use sizeof(target_ulong) as the fallback
> > for every time size == 0. Then pmp_is_range_in_tlb() will set the
> > tlb_size to the maximum possible size of the PMP region.
>
> Given the above, I don't think that this is correct either. The PMP
> check would pass even for non-page sized regions, but the entire page
> would be accessible through the TCG's TLB, as a consequence of
> `get_page_addr_code`.

If we pass a size smaller than the page, it won't be cached in the
TLB, so that should be ok.

A few PMP improvements have gone into
https://github.com/alistair23/qemu/tree/riscv-to-apply.next. It might
be worth checking to see if that fixes the issue you are seeing.
Otherwise I think defaulting to the xlen should be ok.

Alistair

>
>
> In the current implementation, `get_page_addr_code_hostp` calls
> `tlb_fill`, which in turn invokes the RISC-V TCG op `tlb_fill` with the
> parameter `probe = false`. This in turn raises a PMP exception in the
> CPU, whereas `get_page_addr_code` would seem to expect this a failing
> `tlb_fill` to be side-effect free, such that the MMU protection checks
> can be re-run per instruction in the TCG code generation phase.
>
> I think that this is sufficient evidence to conclude that my initial
> patch is actually incorrect, however I am unsure as to how this issue
> can be solved proper. One approach which seems to work is to change
> `get_page_addr_code_hostp` to use a non-faulting page-table read
> instead:
>
> @@ -1510,11 +1510,15 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState 
> *env, target_ulong addr,
>  uintptr_t mmu_idx = cpu_mmu_index(env, true);
>  uintptr_t index = tlb_index(env, mmu_idx, addr);
>  CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
> +CPUState *cs = env_cpu(env);
> +CPUClass *cc = CPU_GET_CLASS(cs);
>  void *p;
>
>  if (unlikely(!tlb_hit(entry->addr_code, addr))) {
>  if (!VICTIM_TLB_HIT(addr_code, addr)) {
> -tlb_fill(env_cpu(env), addr, 0, MMU_INST_FETCH, mmu_idx, 0);
> +// Nonfaulting page-table read:
> +cc->tcg_ops->tlb_fill(cs, addr, 0, MMU_INST_FETCH, mmu_idx, true,
> +  0);
>  index = tlb_index(env, mmu_idx, addr);
>  entry = tlb_entry(env, mmu_idx, addr);
>
>
> However, given this touches the generic TCG implementation, I cannot
> judge whether this is correct or has any unintended side effects for
> other targets. If this is correct, I'd be happy to send a proper patch.
>
> -Leon



Re: [PATCH v4] linux-user: Add close_range() syscall

2022-10-24 Thread Richard Henderson

On 10/25/22 11:39, Helge Deller wrote:

On 10/25/22 00:39, Richard Henderson wrote:

On 10/25/22 06:43, Helge Deller wrote:

+    abi_long maxfd = arg2;
+
+    if ((sizeof(abi_long) == 4 && arg2 == (abi_long)0x7FFFUL) ||
+    (sizeof(abi_long) == 8 && arg2 == 
(abi_long)0x7FFFULL)) {
+    maxfd = target_fd_max;
+    }
+
+    for (fd = arg1; fd < maxfd; fd++) {


Why do we need explicit checks for INT32/64_MAX?
If the guest passes 0x7FFEULL,


A 32-bit guest (on a 64bit host) will pass 0x7FFFUL...


do we really need to iterate over all of those impossible values?


The compiler will optimize one of those checks away, so it's effectively
just one expression.


By impossible values, I mean all descriptors above target_fd_max.
The compiler will most certainly not optimize the number of loop iterations.


r~




Re: [PATCH] tcg/riscv: Fix reg overlap case in tcg_out_addsub2

2022-10-24 Thread Alistair Francis
On Fri, Oct 21, 2022 at 9:47 AM Richard Henderson
 wrote:
>
> There was a typo using opc_addi instead of opc_add with the
> two registers.  While we're at it, simplify the gating test
> to al == bl to improve dynamic scheduling even when the
> output register does not overlap the inputs.
>
> Reported-by: LIU Zhiwei 
> Signed-off-by: Richard Henderson 

Reviewed-by: Alistair Francis 

Alistair

> ---
> Supersedes: 20221020104154.4276-4-zhiwei_...@linux.alibaba.com
> ("[RFC PATCH 3/3] tcg/riscv: Remove a wrong optimization for addsub2")
> ---
>  tcg/riscv/tcg-target.c.inc | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 81a83e45b1..1cdaf7b57b 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -687,9 +687,15 @@ static void tcg_out_addsub2(TCGContext *s,
>  if (cbl) {
>  tcg_out_opc_imm(s, opc_addi, rl, al, bl);
>  tcg_out_opc_imm(s, OPC_SLTIU, TCG_REG_TMP0, rl, bl);
> -} else if (rl == al && rl == bl) {
> +} else if (al == bl) {
> +/*
> + * If the input regs overlap, this is a simple doubling
> + * and carry-out is the input msb.  This special case is
> + * required when the output reg overlaps the input,
> + * but we might as well use it always.
> + */
>  tcg_out_opc_imm(s, OPC_SLTI, TCG_REG_TMP0, al, 0);
> -tcg_out_opc_reg(s, opc_addi, rl, al, bl);
> +tcg_out_opc_reg(s, opc_add, rl, al, al);
>  } else {
>  tcg_out_opc_reg(s, opc_add, rl, al, bl);
>  tcg_out_opc_reg(s, OPC_SLTU, TCG_REG_TMP0,
> --
> 2.34.1
>
>



Re: [PATCH v2] kset: fix memory leak when kset_register() returns error

2022-10-24 Thread Yang Yingliang via

Hi,

On 2022/10/25 5:25, Luben Tuikov wrote:

On 2022-10-24 17:06, Luben Tuikov wrote:

On 2022-10-24 08:19, Yang Yingliang wrote:

Inject fault while loading module, kset_register() may fail.
If it fails, the name allocated by kobject_set_name() which
is called before kset_register() is leaked, because refcount
of kobject is hold in kset_init().

"is hold" --> "was set".

Also, I'd say "which must be called" instead of "is", since
we cannot register kobj/kset without a name--the kobj code crashes,
and we want to make this clear. IOW, a novice user may wonder
where "is" it called, as opposed to learning that they "must"
call it to allocate/set a name, before calling kset_register().

So, I'd say this:

"If it fails, the name allocated by kobject_set_name() which must
  be called before a call to kset_regsiter() is leaked, since
  refcount of kobj was set in kset_init()."

Actually, to be a bit more clear:

"If kset_register() fails, the name allocated by kobject_set_name(),
  namely kset.kobj.name, which must be called before a call to kset_register(),
  may be leaked, if the caller doesn't explicitly free it, say by calling 
kset_put().

  To mitigate this, we free the name in kset_register() when an error is 
encountered,
  i.e. when kset_register() returns an error."

Thanks for you suggestion.



As a kset may be embedded in a larger structure which needs
be freed in release() function or error path in callers, we

Drop "As", start with "A kset". "which needs _to_ be".
Also please specify that the release is part of the ktype,
like this:

"A kset may be embedded in a larger structure which needs to be
  freed in ktype.release() or error path in callers, we ..."


can not call kset_put() in kset_register(), or it will cause
double free, so just call kfree_const() to free the name and
set it to NULL.

With this fix, the callers don't need to care about the name
freeing and call an extra kset_put() if kset_register() fails.

This is unclear because you're *missing* a verb:
"and call an extra kset_put()".
Please add the proper verb _between_ "and call", something like,

"With this fix, the callers don't need to care about freeing
  the name of the kset, and _can_ call kset_put() if kset_register() fails."

I was mean
the callers don't need to care about freeing the name of the kset and
the callers don't need to care about calling kset_put()

Thanks,
Yang


Choose a proper verb here: can, should, cannot, should not, etc.

We can do this because you set "kset.kobj.name to NULL, and this
is checked for in kobject_cleanup(). We just need to stipulate
whether they should/shouldn't have to call kset_put(), or can free the kset
and/or the embedding object themselves. This really depends
on how we want kset_register() to behave in the future, and on
user's own ktype.release implementation...

Forgot "may", "may not".

So, do we want to say "may call kset_put()", like:

"With this fix, the callers need not care about freeing
  the name of the kset, and _may_ call kset_put() if kset_register() fails."

Or do we want to say "should" or even "must"--it really depends on
what else is (would be) going on in kobj registration.

Although, the user may have additional work to be done in the ktype.release()
callback for the embedding object. It would be good to give them the freedom,
i.e. "may", to call kset_put(). If that's not the case, this must be explicitly
stipulated with the proper verb.

Regards,
Luben

.




Re: [PATCH v4] linux-user: Add close_range() syscall

2022-10-24 Thread Helge Deller

On 10/25/22 03:39, Helge Deller wrote:

On 10/25/22 00:39, Richard Henderson wrote:

On 10/25/22 06:43, Helge Deller wrote:

+    abi_long maxfd = arg2;
+
+    if ((sizeof(abi_long) == 4 && arg2 == (abi_long)0x7FFFUL) ||
+    (sizeof(abi_long) == 8 && arg2 == 
(abi_long)0x7FFFULL)) {
+    maxfd = target_fd_max;
+    }
+
+    for (fd = arg1; fd < maxfd; fd++) {


Why do we need explicit checks for INT32/64_MAX?
If the guest passes 0x7FFEULL,


A 32-bit guest (on a 64bit host) will pass 0x7FFFUL...


do we really need to iterate over all of those impossible values?


The compiler will optimize one of those checks away, so it's effectively
just one expression.


My above comments are correct, but


I should think some expression involving MIN() is in order.


that's even better.

Will resend v5 patch.

Helge



Re: [PATCH v3] linux-user: Add guest memory layout to exception dump

2022-10-24 Thread Helge Deller

On 10/25/22 00:35, Richard Henderson wrote:

On 10/25/22 06:18, Helge Deller wrote:

When the emulation stops with a hard exception it's very useful for
debugging purposes to dump the current guest memory layout (for an
example see /proc/self/maps) beside the CPU registers.

The open_self_maps() function provides such a memory dump, but since
it's located in the syscall.c file, various changes (add #includes, make
this function externally visible, ...) are needed to be able to call it
from the existing EXCP_DUMP() macro.


/proc/self/maps has all of the qemu mappings in it as well.


I'm not quite sure on how to understand your comments above.
Just comments or NAK to the patch?

*Main* feature of this patch is that output like /proc/self/maps
ends up on stdout and in the log file (if qemu log was enabled)
at all, before the program exits and /proc/self/maps is gone.
Quite useful for bug reports from other users too...


The page_dump() function provides exclusively the guest mappings.


Which is usually sufficient in this case, and has the advantage that it
shows the guest-stack and -heap areas.

Helge



Re: [PATCH 4/4] include/qemu/atomic128: Avoid __sync_val_compare_and_swap_16

2022-10-24 Thread Richard Henderson

On 10/25/22 09:24, Richard Henderson wrote:

Merge the CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 cases
with respect to atomic16_cmpxchg and use
__atomic_compare_exchange_nomic (via qatomic_cmpxchg)
instead of the "legacy" __sync_val_compare_and_swap_16.

Update the meson has_cmpxchg128 test to match.

Signed-off-by: Richard Henderson 
---
  include/qemu/atomic128.h | 8 +---
  meson.build  | 3 ++-
  2 files changed, 3 insertions(+), 8 deletions(-)


Ho hum.  Must drop this one since for reasons that I cannot fathom, x86_64 does not 
implement the __atomic version.



r~



Re: [PATCH v0 1/2] hw/riscv/opentitan: bump opentitan

2022-10-24 Thread Bin Meng
On Tue, Oct 25, 2022 at 9:17 AM Wilfred Mallawa
 wrote:
>
> From: Wilfred Mallawa 
>
> This patch updates the OpenTitan model to match
> the specified register layout as per [1]. Which is also the latest
> commit of OpenTitan supported by TockOS.
>
> Note: Pinmux and Padctrl has been merged into Pinmux [2][3], this patch 
> removes
> any references to Padctrl. Note: OpenTitan doc [2] has not yet specified
> much detail regarding this, except for a note that states `TODO: this
> section needs to be updated to reflect the pinmux/padctrl merger`
>
> [1] 
> https://github.com/lowRISC/opentitan/blob/d072ac505f82152678d6e04be95c72b728a347b8/hw/top_earlgrey/sw/autogen/top_earlgrey_memory.h
> [2] https://docs.opentitan.org/hw/top_earlgrey/doc/design/
> [3] https://docs.opentitan.org/hw/ip/pinmux/doc/#overview
>
> Signed-off-by: Wilfred Mallawa 
> ---
>  hw/riscv/opentitan.c | 13 +
>  include/hw/riscv/opentitan.h |  9 -
>  2 files changed, 9 insertions(+), 13 deletions(-)
>

Reviewed-by: Bin Meng 



Re: [PATCH v0 2/2] hw/riscv/opentitan: add aon_timer base unimpl

2022-10-24 Thread Bin Meng
On Tue, Oct 25, 2022 at 9:19 AM Wilfred Mallawa
 wrote:
>
> From: Wilfred Mallawa 
>
> Adds the updated `aon_timer` base as an unimplemented device. This is
> used by TockOS, patch ensures the guest doesn't hit load faults.
>
> Signed-off-by: Wilfred Mallawa 
> ---
>  hw/riscv/opentitan.c | 3 +++
>  include/hw/riscv/opentitan.h | 1 +
>  2 files changed, 4 insertions(+)
>

Reviewed-by: Bin Meng 



Re: [PATCH v4] linux-user: Add close_range() syscall

2022-10-24 Thread Helge Deller

On 10/25/22 00:39, Richard Henderson wrote:

On 10/25/22 06:43, Helge Deller wrote:

+    abi_long maxfd = arg2;
+
+    if ((sizeof(abi_long) == 4 && arg2 == (abi_long)0x7FFFUL) ||
+    (sizeof(abi_long) == 8 && arg2 == 
(abi_long)0x7FFFULL)) {
+    maxfd = target_fd_max;
+    }
+
+    for (fd = arg1; fd < maxfd; fd++) {


Why do we need explicit checks for INT32/64_MAX?
If the guest passes 0x7FFEULL,


A 32-bit guest (on a 64bit host) will pass 0x7FFFUL...


do we really need to iterate over all of those impossible values?


The compiler will optimize one of those checks away, so it's effectively
just one expression.


I should think some expression involving MIN() is in order.


Helge



Re: [PATCH v4 02/11] device-tree: add re-randomization helper function

2022-10-24 Thread Alistair Francis
On Tue, Oct 25, 2022 at 10:51 AM Jason A. Donenfeld  wrote:
>
> When the system reboots, the rng-seed that the FDT has should be
> re-randomized, so that the new boot gets a new seed. Several
> architectures require this functionality, so export a function for
> injecting a new seed into the given FDT.
>
> Cc: Alistair Francis 
> Cc: David Gibson 
> Signed-off-by: Jason A. Donenfeld 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  include/sysemu/device_tree.h |  9 +
>  softmmu/device_tree.c| 21 +
>  2 files changed, 30 insertions(+)
>
> diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
> index e7c5441f56..ca5339beae 100644
> --- a/include/sysemu/device_tree.h
> +++ b/include/sysemu/device_tree.h
> @@ -197,6 +197,15 @@ int qemu_fdt_setprop_sized_cells_from_array(void *fdt,
>  qdt_tmp); \
>  })
>
> +
> +/**
> + * qemu_fdt_randomize_seeds:
> + * @fdt: device tree blob
> + *
> + * Re-randomize all "rng-seed" properties with new seeds.
> + */
> +void qemu_fdt_randomize_seeds(void *fdt);
> +
>  #define FDT_PCI_RANGE_RELOCATABLE  0x8000
>  #define FDT_PCI_RANGE_PREFETCHABLE 0x4000
>  #define FDT_PCI_RANGE_ALIASED  0x2000
> diff --git a/softmmu/device_tree.c b/softmmu/device_tree.c
> index ce74f3d48d..30aa3aea9f 100644
> --- a/softmmu/device_tree.c
> +++ b/softmmu/device_tree.c
> @@ -22,6 +22,7 @@
>  #include "qemu/option.h"
>  #include "qemu/bswap.h"
>  #include "qemu/cutils.h"
> +#include "qemu/guest-random.h"
>  #include "sysemu/device_tree.h"
>  #include "hw/loader.h"
>  #include "hw/boards.h"
> @@ -680,3 +681,23 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
>
>  info_report("dtb dumped to %s", filename);
>  }
> +
> +void qemu_fdt_randomize_seeds(void *fdt)
> +{
> +int noffset, poffset, len;
> +const char *name;
> +uint8_t *data;
> +
> +for (noffset = fdt_next_node(fdt, 0, NULL);
> + noffset >= 0;
> + noffset = fdt_next_node(fdt, noffset, NULL)) {
> +for (poffset = fdt_first_property_offset(fdt, noffset);
> + poffset >= 0;
> + poffset = fdt_next_property_offset(fdt, poffset)) {
> +data = (uint8_t *)fdt_getprop_by_offset(fdt, poffset, , 
> );
> +if (!data || strcmp(name, "rng-seed"))
> +continue;
> +qemu_guest_getrandom_nofail(data, len);
> +}
> +}
> +}
> --
> 2.38.1
>
>



Re: [PATCH v4 05/11] riscv: re-randomize rng-seed on reboot

2022-10-24 Thread Alistair Francis
On Tue, Oct 25, 2022 at 10:47 AM Jason A. Donenfeld  wrote:
>
> When the system reboots, the rng-seed that the FDT has should be
> re-randomized, so that the new boot gets a new seed. Since the FDT is in
> the ROM region at this point, we add a hook right after the ROM has been
> added, so that we have a pointer to that copy of the FDT.
>
> Cc: Palmer Dabbelt 
> Cc: Alistair Francis 
> Cc: Bin Meng 
> Cc: qemu-ri...@nongnu.org
> Signed-off-by: Jason A. Donenfeld 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/boot.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
> index e82bf27338..ebd351c840 100644
> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -30,6 +30,7 @@
>  #include "sysemu/device_tree.h"
>  #include "sysemu/qtest.h"
>  #include "sysemu/kvm.h"
> +#include "sysemu/reset.h"
>
>  #include 
>
> @@ -241,6 +242,8 @@ uint64_t riscv_load_fdt(hwaddr dram_base, uint64_t 
> mem_size, void *fdt)
>
>  rom_add_blob_fixed_as("fdt", fdt, fdtsize, fdt_addr,
>_space_memory);
> +qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
> +rom_ptr_for_as(_space_memory, fdt_addr, 
> fdtsize));
>
>  return fdt_addr;
>  }
> --
> 2.38.1
>
>



Re: [PATCH] tcg/riscv: Fix range matched by TCG_CT_CONST_M12

2022-10-24 Thread Alistair Francis
On Sat, Oct 22, 2022 at 8:19 PM Richard Henderson
 wrote:
>
> We were matching a signed 13-bit range, not a 12-bit range.
> Expand the commentary within the function and be explicit
> about all of the ranges.
>
> Reported-by: LIU Zhiwei 
> Signed-off-by: Richard Henderson 

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
>  tcg/riscv/tcg-target.c.inc | 19 ---
>  1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 1cdaf7b57b..2a84c57bec 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -154,13 +154,26 @@ static bool tcg_target_const_match(int64_t val, TCGType 
> type, int ct)
>  if ((ct & TCG_CT_CONST_ZERO) && val == 0) {
>  return 1;
>  }
> -if ((ct & TCG_CT_CONST_S12) && val == sextreg(val, 0, 12)) {
> +/*
> + * Sign extended from 12 bits: [-0x800, 0x7ff].
> + * Used for most arithmetic, as this is the isa field.
> + */
> +if ((ct & TCG_CT_CONST_S12) && val >= -0x800 && val <= 0x7ff) {
>  return 1;
>  }
> -if ((ct & TCG_CT_CONST_N12) && -val == sextreg(-val, 0, 12)) {
> +/*
> + * Sign extended from 12 bits, negated: [-0x7ff, 0x800].
> + * Used for subtraction, where a constant must be handled by ADDI.
> + */
> +if ((ct & TCG_CT_CONST_N12) && val >= -0x7ff && val <= 0x800) {
>  return 1;
>  }
> -if ((ct & TCG_CT_CONST_M12) && val >= -0xfff && val <= 0xfff) {
> +/*
> + * Sign extended from 12 bits, +/- matching: [-0x7ff, 0x7ff].
> + * Used by addsub2, which may need the negative operation,
> + * and requires the modified constant to be representable.
> + */
> +if ((ct & TCG_CT_CONST_M12) && val >= -0x7ff && val <= 0x7ff) {
>  return 1;
>  }
>  return 0;
> --
> 2.34.1
>
>



[PATCH v0 2/2] hw/riscv/opentitan: add aon_timer base unimpl

2022-10-24 Thread Wilfred Mallawa
From: Wilfred Mallawa 

Adds the updated `aon_timer` base as an unimplemented device. This is
used by TockOS, patch ensures the guest doesn't hit load faults.

Signed-off-by: Wilfred Mallawa 
---
 hw/riscv/opentitan.c | 3 +++
 include/hw/riscv/opentitan.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index 373fed36b6..50452f792a 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -48,6 +48,7 @@ static const MemMapEntry ibex_memmap[] = {
 [IBEX_DEV_RSTMGR] = {  0x4041,  0x1000  },
 [IBEX_DEV_CLKMGR] = {  0x4042,  0x1000  },
 [IBEX_DEV_PINMUX] = {  0x4046,  0x1000  },
+[IBEX_DEV_AON_TIMER] =  {  0x4047,  0x1000  },
 [IBEX_DEV_SENSOR_CTRL] ={  0x4049,  0x1000  },
 [IBEX_DEV_FLASH_CTRL] = {  0x4100,  0x1000  },
 [IBEX_DEV_AES] ={  0x4110,  0x1000  },
@@ -264,6 +265,8 @@ static void lowrisc_ibex_soc_realize(DeviceState *dev_soc, 
Error **errp)
 memmap[IBEX_DEV_CLKMGR].base, memmap[IBEX_DEV_CLKMGR].size);
 create_unimplemented_device("riscv.lowrisc.ibex.pinmux",
 memmap[IBEX_DEV_PINMUX].base, memmap[IBEX_DEV_PINMUX].size);
+create_unimplemented_device("riscv.lowrisc.ibex.aon_timer",
+memmap[IBEX_DEV_AON_TIMER].base, memmap[IBEX_DEV_AON_TIMER].size);
 create_unimplemented_device("riscv.lowrisc.ibex.usbdev",
 memmap[IBEX_DEV_USBDEV].base, memmap[IBEX_DEV_USBDEV].size);
 create_unimplemented_device("riscv.lowrisc.ibex.flash_ctrl",
diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
index 1fc055cdff..7659d1bc5b 100644
--- a/include/hw/riscv/opentitan.h
+++ b/include/hw/riscv/opentitan.h
@@ -81,6 +81,7 @@ enum {
 IBEX_DEV_RSTMGR,
 IBEX_DEV_CLKMGR,
 IBEX_DEV_PINMUX,
+IBEX_DEV_AON_TIMER,
 IBEX_DEV_USBDEV,
 IBEX_DEV_FLASH_CTRL,
 IBEX_DEV_PLIC,
-- 
2.37.3




[PATCH v0 1/2] hw/riscv/opentitan: bump opentitan

2022-10-24 Thread Wilfred Mallawa
From: Wilfred Mallawa 

This patch updates the OpenTitan model to match
the specified register layout as per [1]. Which is also the latest
commit of OpenTitan supported by TockOS.

Note: Pinmux and Padctrl has been merged into Pinmux [2][3], this patch removes
any references to Padctrl. Note: OpenTitan doc [2] has not yet specified
much detail regarding this, except for a note that states `TODO: this
section needs to be updated to reflect the pinmux/padctrl merger`

[1] 
https://github.com/lowRISC/opentitan/blob/d072ac505f82152678d6e04be95c72b728a347b8/hw/top_earlgrey/sw/autogen/top_earlgrey_memory.h
[2] https://docs.opentitan.org/hw/top_earlgrey/doc/design/
[3] https://docs.opentitan.org/hw/ip/pinmux/doc/#overview

Signed-off-by: Wilfred Mallawa 
---
 hw/riscv/opentitan.c | 13 +
 include/hw/riscv/opentitan.h |  9 -
 2 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c
index be7ff1eea0..373fed36b6 100644
--- a/hw/riscv/opentitan.c
+++ b/hw/riscv/opentitan.c
@@ -29,7 +29,7 @@
 #include "sysemu/sysemu.h"
 
 static const MemMapEntry ibex_memmap[] = {
-[IBEX_DEV_ROM] ={  0x8000,   0x8000 },
+[IBEX_DEV_ROM] ={  0x8000,  0x8000 },
 [IBEX_DEV_RAM] ={  0x1000,  0x2 },
 [IBEX_DEV_FLASH] =  {  0x2000,  0x10 },
 [IBEX_DEV_UART] =   {  0x4000,  0x1000  },
@@ -38,17 +38,17 @@ static const MemMapEntry ibex_memmap[] = {
 [IBEX_DEV_I2C] ={  0x4008,  0x1000  },
 [IBEX_DEV_PATTGEN] ={  0x400e,  0x1000  },
 [IBEX_DEV_TIMER] =  {  0x4010,  0x1000  },
-[IBEX_DEV_SENSOR_CTRL] ={  0x4011,  0x1000  },
 [IBEX_DEV_OTP_CTRL] =   {  0x4013,  0x4000  },
 [IBEX_DEV_LC_CTRL] ={  0x4014,  0x1000  },
-[IBEX_DEV_USBDEV] = {  0x4015,  0x1000  },
+[IBEX_DEV_ALERT_HANDLER] =  {  0x4015,  0x1000  },
 [IBEX_DEV_SPI_HOST0] =  {  0x4030,  0x1000  },
 [IBEX_DEV_SPI_HOST1] =  {  0x4031,  0x1000  },
+[IBEX_DEV_USBDEV] = {  0x4032,  0x1000  },
 [IBEX_DEV_PWRMGR] = {  0x4040,  0x1000  },
 [IBEX_DEV_RSTMGR] = {  0x4041,  0x1000  },
 [IBEX_DEV_CLKMGR] = {  0x4042,  0x1000  },
 [IBEX_DEV_PINMUX] = {  0x4046,  0x1000  },
-[IBEX_DEV_PADCTRL] ={  0x4047,  0x1000  },
+[IBEX_DEV_SENSOR_CTRL] ={  0x4049,  0x1000  },
 [IBEX_DEV_FLASH_CTRL] = {  0x4100,  0x1000  },
 [IBEX_DEV_AES] ={  0x4110,  0x1000  },
 [IBEX_DEV_HMAC] =   {  0x4111,  0x1000  },
@@ -59,10 +59,9 @@ static const MemMapEntry ibex_memmap[] = {
 [IBEX_DEV_ENTROPY] ={  0x4116,  0x1000  },
 [IBEX_DEV_EDNO] =   {  0x4117,  0x1000  },
 [IBEX_DEV_EDN1] =   {  0x4118,  0x1000  },
-[IBEX_DEV_ALERT_HANDLER] =  {  0x411b,  0x1000  },
 [IBEX_DEV_NMI_GEN] ={  0x411c,  0x1000  },
 [IBEX_DEV_PERI] =   {  0x411f,  0x1 },
-[IBEX_DEV_PLIC] =   {  0x4800,  0x4005000  },
+[IBEX_DEV_PLIC] =   {  0x4800,  0x4005000 },
 [IBEX_DEV_FLASH_VIRTUAL] =  {  0x8000,  0x8 },
 };
 
@@ -265,8 +264,6 @@ static void lowrisc_ibex_soc_realize(DeviceState *dev_soc, 
Error **errp)
 memmap[IBEX_DEV_CLKMGR].base, memmap[IBEX_DEV_CLKMGR].size);
 create_unimplemented_device("riscv.lowrisc.ibex.pinmux",
 memmap[IBEX_DEV_PINMUX].base, memmap[IBEX_DEV_PINMUX].size);
-create_unimplemented_device("riscv.lowrisc.ibex.padctrl",
-memmap[IBEX_DEV_PADCTRL].base, memmap[IBEX_DEV_PADCTRL].size);
 create_unimplemented_device("riscv.lowrisc.ibex.usbdev",
 memmap[IBEX_DEV_USBDEV].base, memmap[IBEX_DEV_USBDEV].size);
 create_unimplemented_device("riscv.lowrisc.ibex.flash_ctrl",
diff --git a/include/hw/riscv/opentitan.h b/include/hw/riscv/opentitan.h
index 6665cd5794..1fc055cdff 100644
--- a/include/hw/riscv/opentitan.h
+++ b/include/hw/riscv/opentitan.h
@@ -81,7 +81,6 @@ enum {
 IBEX_DEV_RSTMGR,
 IBEX_DEV_CLKMGR,
 IBEX_DEV_PINMUX,
-IBEX_DEV_PADCTRL,
 IBEX_DEV_USBDEV,
 IBEX_DEV_FLASH_CTRL,
 IBEX_DEV_PLIC,
@@ -109,10 +108,10 @@ enum {
 IBEX_UART0_RX_TIMEOUT_IRQ = 7,
 IBEX_UART0_RX_PARITY_ERR_IRQ  = 8,
 IBEX_TIMER_TIMEREXPIRED0_0= 127,
-IBEX_SPI_HOST0_ERR_IRQ= 151,
-IBEX_SPI_HOST0_SPI_EVENT_IRQ  = 152,
-IBEX_SPI_HOST1_ERR_IRQ= 153,
-IBEX_SPI_HOST1_SPI_EVENT_IRQ  = 154,
+IBEX_SPI_HOST0_ERR_IRQ= 134,
+IBEX_SPI_HOST0_SPI_EVENT_IRQ  = 135,
+IBEX_SPI_HOST1_ERR_IRQ= 136,
+IBEX_SPI_HOST1_SPI_EVENT_IRQ  = 137,
 };
 
 #endif
-- 
2.37.3




[PATCH v0 0/2] hw/riscv/opentitan: bump opentitan version

2022-10-24 Thread Wilfred Mallawa
From: Wilfred Mallawa 

This patch provides updates to the OpenTitan model to bump to RTL
version .

A unique change here is the merger of hwip `padctrl` into `pinmux`, to
reflect this change, any references to `padctrl` are removed.
Additionally, an unimplemented device for `aon_timer` is added and IRQ
numbers are updated. 

Patch was tested by running the latest master of
TockOS as of this patch.

Wilfred Mallawa (2):
  hw/riscv/opentitan: bump opentitan
  hw/riscv/opentitan: add aon_timer base unimpl

 hw/riscv/opentitan.c | 16 
 include/hw/riscv/opentitan.h | 10 +-
 2 files changed, 13 insertions(+), 13 deletions(-)

-- 
2.37.3




Re: [PATCH v3 1/8] reset: allow registering handlers that aren't called by snapshot loading

2022-10-24 Thread Jason A. Donenfeld
On Mon, Oct 24, 2022 at 7:40 PM Markus Armbruster  wrote:
>
> Peter Maydell  writes:
>
> > On Mon, 24 Oct 2022 at 14:20, Markus Armbruster  wrote:
> >>
> >> Peter Maydell  writes:
> >>
> >> > On Mon, 24 Oct 2022 at 13:28, Markus Armbruster  
> >> > wrote:
> >> >>
> >> >> Peter Maydell  writes:
> >> >> > Markus: if we add a new value to the ShutdownCause enumeration,
> >> >> > how annoying is it if we decide we don't want it later? I guess
> >> >> > we can just leave it in the enum unused... (In this case we're
> >> >> > using it for purely internal purposes and it won't ever actually
> >> >> > wind up in any QMP events.)
> >> >>
> >> >> Deleting enumeration values is a compatibility issue only if the value
> >> >> is usable in QMP input.
> >> >>
> >> >> "Purely internal" means it cannot occur in QMP output, and any attempt
> >> >> to use it in input fails.  Aside: feels a bit fragile.
> >> >
> >> > In this case there are as far as I can see no QMP input commands
> >> > which use the enum at all -- it's only used in events, which are
> >> > always output, I think.
> >>
> >> They are.
> >>
> >> Ascertaining "not used in QMP input" is pretty easy, and "not used in
> >> CLI" isn't hard.  My point is that uses could creep in later.  This is
> >> what makes "purely internal" fragile.
> >
> > True. But otoh if there's a meaningful use of the enum constant in
> > input in future we'll want to keep it around anyway.
> >
> > I guess what I'm asking is: do you specifically want this patch
> > done some other way, or to require that "mark some values as
> > internal-only" feature in the QAPI generator, or are you OK with
> > it as-is?  QMP/QAPI is your area, so your call...
>
> QAPI was designed to specify QMP.  We pretty soon discovered that QAPI
> types can be useful elsewhere, and added some to the schema without
> marking them.  Introspection doesn't show these.  Generated
> documentation does.  Known shortcoming of the doc generator.
>
> This case differs in that we're adding an internal-only member to a type
> that is used by QMP.  Both introspection and documentation will show it.
>
> Interface introspection and (especially!) documentation showing stuff
> that doesn't exist in the interface is certainly less than ideal.
> However, I don't want to hold this patch hostage to getting QAPI
> shortcomings addressed.
>
> Instead, I want QMP documentation to make clear that this value cannot
> actually occur.
>
> Fair?

Made mention of it in v4, just posted.

https://lore.kernel.org/all/20221025004327.568476-2-ja...@zx2c4.com/#Z31qapi:run-state.json



[PATCH v4 10/11] openrisc: re-randomize rng-seed on reboot

2022-10-24 Thread Jason A. Donenfeld
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Since the FDT is in
the ROM region at this point, we add a hook right after the ROM has been
added, so that we have a pointer to that copy of the FDT.

Cc: Stafford Horne 
Signed-off-by: Jason A. Donenfeld 
---
 hw/openrisc/boot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/openrisc/boot.c b/hw/openrisc/boot.c
index 128ccbcba2..007e80cd5a 100644
--- a/hw/openrisc/boot.c
+++ b/hw/openrisc/boot.c
@@ -14,6 +14,7 @@
 #include "hw/openrisc/boot.h"
 #include "sysemu/device_tree.h"
 #include "sysemu/qtest.h"
+#include "sysemu/reset.h"
 
 #include 
 
@@ -111,6 +112,8 @@ uint32_t openrisc_load_fdt(void *fdt, hwaddr load_start,
 
 rom_add_blob_fixed_as("fdt", fdt, fdtsize, fdt_addr,
   _space_memory);
+qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
+rom_ptr_for_as(_space_memory, fdt_addr, 
fdtsize));
 
 return fdt_addr;
 }
-- 
2.38.1




[PATCH v4 08/11] mips/boston: re-randomize rng-seed on reboot

2022-10-24 Thread Jason A. Donenfeld
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Since the FDT is in
the ROM region at this point, we add a hook right after the ROM has been
added, so that we have a pointer to that copy of the FDT.

Cc: Aleksandar Rikalo 
Cc: Paul Burton 
Cc: Philippe Mathieu-Daudé 
Signed-off-by: Jason A. Donenfeld 
---
 hw/mips/boston.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/mips/boston.c b/hw/mips/boston.c
index d2ab9da1a0..cab63f43bf 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -41,6 +41,7 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/qtest.h"
 #include "sysemu/runstate.h"
+#include "sysemu/reset.h"
 
 #include 
 #include "qom/object.h"
@@ -810,6 +811,8 @@ static void boston_mach_init(MachineState *machine)
 /* Calculate real fdt size after filter */
 dt_size = fdt_totalsize(dtb_load_data);
 rom_add_blob_fixed("dtb", dtb_load_data, dt_size, dtb_paddr);
+qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
+rom_ptr(dtb_paddr, dt_size));
 } else {
 /* Try to load file as FIT */
 fit_err = load_fit(_fit_loader, machine->kernel_filename, 
s);
-- 
2.38.1




[PATCH v4 04/11] arm: re-randomize rng-seed on reboot

2022-10-24 Thread Jason A. Donenfeld
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Since the FDT is in
the ROM region at this point, we add a hook right after the ROM has been
added, so that we have a pointer to that copy of the FDT.

Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Signed-off-by: Jason A. Donenfeld 
---
 hw/arm/boot.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index b0b92af188..b106f31468 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -683,6 +683,8 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
  * the DTB is copied again upon reset, even if addr points into RAM.
  */
 rom_add_blob_fixed_as("dtb", fdt, size, addr, as);
+qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
+   rom_ptr_for_as(as, addr, size));
 
 g_free(fdt);
 
-- 
2.38.1




[PATCH v4 11/11] rx: re-randomize rng-seed on reboot

2022-10-24 Thread Jason A. Donenfeld
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Since the FDT is in
the ROM region at this point, we add a hook right after the ROM has been
added, so that we have a pointer to that copy of the FDT.

Cc: Yoshinori Sato 
Signed-off-by: Jason A. Donenfeld 
---
 hw/rx/rx-gdbsim.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/rx/rx-gdbsim.c b/hw/rx/rx-gdbsim.c
index 8ffe1b8035..47c17026c7 100644
--- a/hw/rx/rx-gdbsim.c
+++ b/hw/rx/rx-gdbsim.c
@@ -25,6 +25,7 @@
 #include "hw/rx/rx62n.h"
 #include "sysemu/qtest.h"
 #include "sysemu/device_tree.h"
+#include "sysemu/reset.h"
 #include "hw/boards.h"
 #include "qom/object.h"
 
@@ -148,6 +149,8 @@ static void rx_gdbsim_init(MachineState *machine)
 dtb_offset = ROUND_DOWN(machine->ram_size - dtb_size, 16);
 rom_add_blob_fixed("dtb", dtb, dtb_size,
SDRAM_BASE + dtb_offset);
+qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
+rom_ptr(SDRAM_BASE + dtb_offset, dtb_size));
 /* Set dtb address to R1 */
 RX_CPU(first_cpu)->env.regs[1] = SDRAM_BASE + dtb_offset;
 }
-- 
2.38.1




[PATCH v4 03/11] x86: do not re-randomize RNG seed on snapshot load

2022-10-24 Thread Jason A. Donenfeld
Snapshot loading is supposed to be deterministic, so we shouldn't
re-randomize the various seeds used.

Signed-off-by: Jason A. Donenfeld 
---
 hw/i386/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 1148f70c03..bd50a064a3 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -,7 +,7 @@ void x86_load_linux(X86MachineState *x86ms,
 setup_data->type = cpu_to_le32(SETUP_RNG_SEED);
 setup_data->len = cpu_to_le32(RNG_SEED_LENGTH);
 qemu_guest_getrandom_nofail(setup_data->data, RNG_SEED_LENGTH);
-qemu_register_reset(reset_rng_seed, setup_data);
+qemu_register_reset_nosnapshotload(reset_rng_seed, setup_data);
 fw_cfg_add_bytes_callback(fw_cfg, FW_CFG_KERNEL_DATA, reset_rng_seed, 
NULL,
   setup_data, kernel, kernel_size, true);
 } else {
-- 
2.38.1




[PATCH v4 06/11] m68k/virt: do not re-randomize RNG seed on snapshot load

2022-10-24 Thread Jason A. Donenfeld
Snapshot loading is supposed to be deterministic, so we shouldn't
re-randomize the various seeds used.

Signed-off-by: Jason A. Donenfeld 
---
 hw/m68k/virt.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/hw/m68k/virt.c b/hw/m68k/virt.c
index 89c4108eb5..da5eafd275 100644
--- a/hw/m68k/virt.c
+++ b/hw/m68k/virt.c
@@ -89,7 +89,6 @@ typedef struct {
 M68kCPU *cpu;
 hwaddr initial_pc;
 hwaddr initial_stack;
-struct bi_record *rng_seed;
 } ResetInfo;
 
 static void main_cpu_reset(void *opaque)
@@ -98,16 +97,18 @@ static void main_cpu_reset(void *opaque)
 M68kCPU *cpu = reset_info->cpu;
 CPUState *cs = CPU(cpu);
 
-if (reset_info->rng_seed) {
-qemu_guest_getrandom_nofail((void *)reset_info->rng_seed->data + 2,
-be16_to_cpu(*(uint16_t *)reset_info->rng_seed->data));
-}
-
 cpu_reset(cs);
 cpu->env.aregs[7] = reset_info->initial_stack;
 cpu->env.pc = reset_info->initial_pc;
 }
 
+static void rerandomize_rng_seed(void *opaque)
+{
+struct bi_record *rng_seed = opaque;
+qemu_guest_getrandom_nofail((void *)rng_seed->data + 2,
+be16_to_cpu(*(uint16_t *)rng_seed->data));
+}
+
 static void virt_init(MachineState *machine)
 {
 M68kCPU *cpu = NULL;
@@ -289,9 +290,10 @@ static void virt_init(MachineState *machine)
 BOOTINFO0(param_ptr, BI_LAST);
 rom_add_blob_fixed_as("bootinfo", param_blob, param_ptr - param_blob,
   parameters_base, cs->as);
-reset_info->rng_seed = rom_ptr_for_as(cs->as, parameters_base,
-  param_ptr - param_blob) +
-   (param_rng_seed - param_blob);
+qemu_register_reset_nosnapshotload(rerandomize_rng_seed,
+rom_ptr_for_as(cs->as, parameters_base,
+   param_ptr - param_blob) +
+(param_rng_seed - param_blob));
 g_free(param_blob);
 }
 }
-- 
2.38.1




[PATCH v4 02/11] device-tree: add re-randomization helper function

2022-10-24 Thread Jason A. Donenfeld
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Several
architectures require this functionality, so export a function for
injecting a new seed into the given FDT.

Cc: Alistair Francis 
Cc: David Gibson 
Signed-off-by: Jason A. Donenfeld 
---
 include/sysemu/device_tree.h |  9 +
 softmmu/device_tree.c| 21 +
 2 files changed, 30 insertions(+)

diff --git a/include/sysemu/device_tree.h b/include/sysemu/device_tree.h
index e7c5441f56..ca5339beae 100644
--- a/include/sysemu/device_tree.h
+++ b/include/sysemu/device_tree.h
@@ -197,6 +197,15 @@ int qemu_fdt_setprop_sized_cells_from_array(void *fdt,
 qdt_tmp); \
 })
 
+
+/**
+ * qemu_fdt_randomize_seeds:
+ * @fdt: device tree blob
+ *
+ * Re-randomize all "rng-seed" properties with new seeds.
+ */
+void qemu_fdt_randomize_seeds(void *fdt);
+
 #define FDT_PCI_RANGE_RELOCATABLE  0x8000
 #define FDT_PCI_RANGE_PREFETCHABLE 0x4000
 #define FDT_PCI_RANGE_ALIASED  0x2000
diff --git a/softmmu/device_tree.c b/softmmu/device_tree.c
index ce74f3d48d..30aa3aea9f 100644
--- a/softmmu/device_tree.c
+++ b/softmmu/device_tree.c
@@ -22,6 +22,7 @@
 #include "qemu/option.h"
 #include "qemu/bswap.h"
 #include "qemu/cutils.h"
+#include "qemu/guest-random.h"
 #include "sysemu/device_tree.h"
 #include "hw/loader.h"
 #include "hw/boards.h"
@@ -680,3 +681,23 @@ void hmp_dumpdtb(Monitor *mon, const QDict *qdict)
 
 info_report("dtb dumped to %s", filename);
 }
+
+void qemu_fdt_randomize_seeds(void *fdt)
+{
+int noffset, poffset, len;
+const char *name;
+uint8_t *data;
+
+for (noffset = fdt_next_node(fdt, 0, NULL);
+ noffset >= 0;
+ noffset = fdt_next_node(fdt, noffset, NULL)) {
+for (poffset = fdt_first_property_offset(fdt, noffset);
+ poffset >= 0;
+ poffset = fdt_next_property_offset(fdt, poffset)) {
+data = (uint8_t *)fdt_getprop_by_offset(fdt, poffset, , );
+if (!data || strcmp(name, "rng-seed"))
+continue;
+qemu_guest_getrandom_nofail(data, len);
+}
+}
+}
-- 
2.38.1




[PATCH v4 09/11] mips/malta: pass RNG seed via env var and re-randomize on reboot

2022-10-24 Thread Jason A. Donenfeld
As of the kernel commit linked below, Linux ingests an RNG seed
passed as part of the environment block by the bootloader or firmware.
This mechanism works across all different environment block types,
generically, which pass some block via the second firmware argument. On
malta, this has been tested to work when passed as an argument from
U-Boot's linux_env_set.

As is the case on most other architectures (such as boston), when
booting with `-kernel`, QEMU, acting as the bootloader, should pass the
RNG seed, so that the machine has good entropy for Linux to consume. So
this commit implements that quite simply by using the guest random API,
which is what is used on nearly all other archs too. It also
reinitializes the seed on reboot, so that it is always fresh.

Link: https://git.kernel.org/torvalds/c/056a68cea01
Cc: Aleksandar Rikalo 
Cc: Paul Burton 
Cc: Philippe Mathieu-Daudé 
Signed-off-by: Jason A. Donenfeld 
---
 hw/mips/malta.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/hw/mips/malta.c b/hw/mips/malta.c
index 0e932988e0..d337de920c 100644
--- a/hw/mips/malta.c
+++ b/hw/mips/malta.c
@@ -26,6 +26,7 @@
 #include "qemu/units.h"
 #include "qemu/bitops.h"
 #include "qemu/datadir.h"
+#include "qemu/guest-random.h"
 #include "hw/clock.h"
 #include "hw/southbridge/piix.h"
 #include "hw/isa/superio.h"
@@ -1017,6 +1018,17 @@ static void G_GNUC_PRINTF(3, 4) prom_set(uint32_t 
*prom_buf, int index,
 va_end(ap);
 }
 
+static void reinitialize_rng_seed(void *opaque)
+{
+char *rng_seed_hex = opaque;
+uint8_t rng_seed[32];
+
+qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
+for (size_t i = 0; i < sizeof(rng_seed); ++i) {
+sprintf(rng_seed_hex + i * 2, "%02x", rng_seed[i]);
+}
+}
+
 /* Kernel */
 static uint64_t load_kernel(void)
 {
@@ -1028,6 +1040,8 @@ static uint64_t load_kernel(void)
 long prom_size;
 int prom_index = 0;
 uint64_t (*xlate_to_kseg0) (void *opaque, uint64_t addr);
+uint8_t rng_seed[32];
+char rng_seed_hex[sizeof(rng_seed) * 2 + 1];
 
 #if TARGET_BIG_ENDIAN
 big_endian = 1;
@@ -1115,9 +1129,20 @@ static uint64_t load_kernel(void)
 
 prom_set(prom_buf, prom_index++, "modetty0");
 prom_set(prom_buf, prom_index++, "38400n8r");
+
+qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
+for (size_t i = 0; i < sizeof(rng_seed); ++i) {
+sprintf(rng_seed_hex + i * 2, "%02x", rng_seed[i]);
+}
+prom_set(prom_buf, prom_index++, "rngseed");
+prom_set(prom_buf, prom_index++, "%s", rng_seed_hex);
+
 prom_set(prom_buf, prom_index++, NULL);
 
 rom_add_blob_fixed("prom", prom_buf, prom_size, ENVP_PADDR);
+qemu_register_reset_nosnapshotload(reinitialize_rng_seed,
+memmem(rom_ptr(ENVP_PADDR, prom_size), prom_size,
+   rng_seed_hex, sizeof(rng_seed_hex)));
 
 g_free(prom_buf);
 return kernel_entry;
-- 
2.38.1




[PATCH v4 01/11] reset: allow registering handlers that aren't called by snapshot loading

2022-10-24 Thread Jason A. Donenfeld
Snapshot loading only expects to call deterministic handlers, not
non-deterministic ones. So introduce a way of registering handlers that
won't be called when reseting for snapshots.

Signed-off-by: Jason A. Donenfeld 
---
 hw/arm/aspeed.c|  4 ++--
 hw/arm/mps2-tz.c   |  4 ++--
 hw/core/reset.c| 15 ++-
 hw/hppa/machine.c  |  4 ++--
 hw/i386/microvm.c  |  4 ++--
 hw/i386/pc.c   |  6 +++---
 hw/ppc/pegasos2.c  |  4 ++--
 hw/ppc/pnv.c   |  4 ++--
 hw/ppc/spapr.c |  4 ++--
 hw/s390x/s390-virtio-ccw.c |  4 ++--
 include/hw/boards.h|  2 +-
 include/sysemu/reset.h |  5 -
 migration/savevm.c |  2 +-
 qapi/run-state.json|  5 -
 softmmu/runstate.c | 11 ---
 15 files changed, 51 insertions(+), 27 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index bc3ecdb619..69cadb1c37 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -1349,12 +1349,12 @@ static void 
aspeed_machine_bletchley_class_init(ObjectClass *oc, void *data)
 aspeed_soc_num_cpus(amc->soc_name);
 }
 
-static void fby35_reset(MachineState *state)
+static void fby35_reset(MachineState *state, ShutdownCause reason)
 {
 AspeedMachineState *bmc = ASPEED_MACHINE(state);
 AspeedGPIOState *gpio = >soc.gpio;
 
-qemu_devices_reset();
+qemu_devices_reset(reason);
 
 /* Board ID: 7 (Class-1, 4 slots) */
 object_property_set_bool(OBJECT(gpio), "gpioV4", true, _fatal);
diff --git a/hw/arm/mps2-tz.c b/hw/arm/mps2-tz.c
index 394192b9b2..284c09c91d 100644
--- a/hw/arm/mps2-tz.c
+++ b/hw/arm/mps2-tz.c
@@ -1239,7 +1239,7 @@ static void mps2_set_remap(Object *obj, const char 
*value, Error **errp)
 }
 }
 
-static void mps2_machine_reset(MachineState *machine)
+static void mps2_machine_reset(MachineState *machine, ShutdownCause reason)
 {
 MPS2TZMachineState *mms = MPS2TZ_MACHINE(machine);
 
@@ -1249,7 +1249,7 @@ static void mps2_machine_reset(MachineState *machine)
  * reset see the correct mapping.
  */
 remap_memory(mms, mms->remap);
-qemu_devices_reset();
+qemu_devices_reset(reason);
 }
 
 static void mps2tz_class_init(ObjectClass *oc, void *data)
diff --git a/hw/core/reset.c b/hw/core/reset.c
index 36be82c491..bcf323d6dd 100644
--- a/hw/core/reset.c
+++ b/hw/core/reset.c
@@ -33,6 +33,7 @@ typedef struct QEMUResetEntry {
 QTAILQ_ENTRY(QEMUResetEntry) entry;
 QEMUResetHandler *func;
 void *opaque;
+bool skip_on_snapshot_load;
 } QEMUResetEntry;
 
 static QTAILQ_HEAD(, QEMUResetEntry) reset_handlers =
@@ -47,6 +48,16 @@ void qemu_register_reset(QEMUResetHandler *func, void 
*opaque)
 QTAILQ_INSERT_TAIL(_handlers, re, entry);
 }
 
+void qemu_register_reset_nosnapshotload(QEMUResetHandler *func, void *opaque)
+{
+QEMUResetEntry *re = g_new0(QEMUResetEntry, 1);
+
+re->func = func;
+re->opaque = opaque;
+re->skip_on_snapshot_load = true;
+QTAILQ_INSERT_TAIL(_handlers, re, entry);
+}
+
 void qemu_unregister_reset(QEMUResetHandler *func, void *opaque)
 {
 QEMUResetEntry *re;
@@ -60,12 +71,14 @@ void qemu_unregister_reset(QEMUResetHandler *func, void 
*opaque)
 }
 }
 
-void qemu_devices_reset(void)
+void qemu_devices_reset(ShutdownCause reason)
 {
 QEMUResetEntry *re, *nre;
 
 /* reset all devices */
 QTAILQ_FOREACH_SAFE(re, _handlers, entry, nre) {
+if (reason == SHUTDOWN_CAUSE_SNAPSHOT_LOAD && 
re->skip_on_snapshot_load)
+continue;
 re->func(re->opaque);
 }
 }
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index e53d5f0fa7..19ea7c2c66 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -411,12 +411,12 @@ static void machine_hppa_init(MachineState *machine)
 cpu[0]->env.gr[19] = FW_CFG_IO_BASE;
 }
 
-static void hppa_machine_reset(MachineState *ms)
+static void hppa_machine_reset(MachineState *ms, ShutdownCause reason)
 {
 unsigned int smp_cpus = ms->smp.cpus;
 int i;
 
-qemu_devices_reset();
+qemu_devices_reset(reason);
 
 /* Start all CPUs at the firmware entry point.
  *  Monarch CPU will initialize firmware, secondary CPUs
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 52f9aa9d8c..ffd1884100 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -467,7 +467,7 @@ static void microvm_machine_state_init(MachineState 
*machine)
 microvm_devices_init(mms);
 }
 
-static void microvm_machine_reset(MachineState *machine)
+static void microvm_machine_reset(MachineState *machine, ShutdownCause reason)
 {
 MicrovmMachineState *mms = MICROVM_MACHINE(machine);
 CPUState *cs;
@@ -480,7 +480,7 @@ static void microvm_machine_reset(MachineState *machine)
 mms->kernel_cmdline_fixed = true;
 }
 
-qemu_devices_reset();
+qemu_devices_reset(reason);
 
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 768982ae9a..3e86083db3 100644
--- a/hw/i386/pc.c
+++ 

[PATCH v4 05/11] riscv: re-randomize rng-seed on reboot

2022-10-24 Thread Jason A. Donenfeld
When the system reboots, the rng-seed that the FDT has should be
re-randomized, so that the new boot gets a new seed. Since the FDT is in
the ROM region at this point, we add a hook right after the ROM has been
added, so that we have a pointer to that copy of the FDT.

Cc: Palmer Dabbelt 
Cc: Alistair Francis 
Cc: Bin Meng 
Cc: qemu-ri...@nongnu.org
Signed-off-by: Jason A. Donenfeld 
---
 hw/riscv/boot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
index e82bf27338..ebd351c840 100644
--- a/hw/riscv/boot.c
+++ b/hw/riscv/boot.c
@@ -30,6 +30,7 @@
 #include "sysemu/device_tree.h"
 #include "sysemu/qtest.h"
 #include "sysemu/kvm.h"
+#include "sysemu/reset.h"
 
 #include 
 
@@ -241,6 +242,8 @@ uint64_t riscv_load_fdt(hwaddr dram_base, uint64_t 
mem_size, void *fdt)
 
 rom_add_blob_fixed_as("fdt", fdt, fdtsize, fdt_addr,
   _space_memory);
+qemu_register_reset_nosnapshotload(qemu_fdt_randomize_seeds,
+rom_ptr_for_as(_space_memory, fdt_addr, 
fdtsize));
 
 return fdt_addr;
 }
-- 
2.38.1




[PATCH v4 00/11] rerandomize RNG seeds on reboot and handle record

2022-10-24 Thread Jason A. Donenfeld
When the system reboots, the rng seed that QEMU passes should be
re-randomized, so that the new boot gets a new seed. This series wires
that up for FDT.

Then, since the record subsystem makes use of reset as well, we
add a new reset cause for record, so that we can avoid
re-randomizing in these cases.

Version 4 prevents the new reset type from leaking over QAPI, documents
this alongside which version it was introduced, and adds mips and m68k
machine types to the queue.

Jason A. Donenfeld (11):
  reset: allow registering handlers that aren't called by snapshot
loading
  device-tree: add re-randomization helper function
  x86: do not re-randomize RNG seed on snapshot load
  arm: re-randomize rng-seed on reboot
  riscv: re-randomize rng-seed on reboot
  m68k/virt: do not re-randomize RNG seed on snapshot load
  m68k/q800: do not re-randomize RNG seed on snapshot load
  mips/boston: re-randomize rng-seed on reboot
  mips/malta: pass RNG seed via env var and re-randomize on reboot
  openrisc: re-randomize rng-seed on reboot
  rx: re-randomize rng-seed on reboot

 hw/arm/aspeed.c  |  4 ++--
 hw/arm/boot.c|  2 ++
 hw/arm/mps2-tz.c |  4 ++--
 hw/core/reset.c  | 15 ++-
 hw/hppa/machine.c|  4 ++--
 hw/i386/microvm.c|  4 ++--
 hw/i386/pc.c |  6 +++---
 hw/i386/x86.c|  2 +-
 hw/m68k/q800.c   | 33 +
 hw/m68k/virt.c   | 20 +++-
 hw/mips/boston.c |  3 +++
 hw/mips/malta.c  | 25 +
 hw/openrisc/boot.c   |  3 +++
 hw/ppc/pegasos2.c|  4 ++--
 hw/ppc/pnv.c |  4 ++--
 hw/ppc/spapr.c   |  4 ++--
 hw/riscv/boot.c  |  3 +++
 hw/rx/rx-gdbsim.c|  3 +++
 hw/s390x/s390-virtio-ccw.c   |  4 ++--
 include/hw/boards.h  |  2 +-
 include/sysemu/device_tree.h |  9 +
 include/sysemu/reset.h   |  5 -
 migration/savevm.c   |  2 +-
 qapi/run-state.json  |  5 -
 softmmu/device_tree.c| 21 +
 softmmu/runstate.c   | 11 ---
 26 files changed, 145 insertions(+), 57 deletions(-)

-- 
2.38.1




[PATCH v4 07/11] m68k/q800: do not re-randomize RNG seed on snapshot load

2022-10-24 Thread Jason A. Donenfeld
Snapshot loading is supposed to be deterministic, so we shouldn't
re-randomize the various seeds used.

Signed-off-by: Jason A. Donenfeld 
---
 hw/m68k/q800.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/hw/m68k/q800.c b/hw/m68k/q800.c
index e09e244ddc..9d52ca6613 100644
--- a/hw/m68k/q800.c
+++ b/hw/m68k/q800.c
@@ -321,27 +321,23 @@ static const TypeInfo glue_info = {
 },
 };
 
-typedef struct {
-M68kCPU *cpu;
-struct bi_record *rng_seed;
-} ResetInfo;
-
 static void main_cpu_reset(void *opaque)
 {
-ResetInfo *reset_info = opaque;
-M68kCPU *cpu = reset_info->cpu;
+M68kCPU *cpu = opaque;
 CPUState *cs = CPU(cpu);
 
-if (reset_info->rng_seed) {
-qemu_guest_getrandom_nofail((void *)reset_info->rng_seed->data + 2,
-be16_to_cpu(*(uint16_t *)reset_info->rng_seed->data));
-}
-
 cpu_reset(cs);
 cpu->env.aregs[7] = ldl_phys(cs->as, 0);
 cpu->env.pc = ldl_phys(cs->as, 4);
 }
 
+static void rerandomize_rng_seed(void *opaque)
+{
+struct bi_record *rng_seed = opaque;
+qemu_guest_getrandom_nofail((void *)rng_seed->data + 2,
+be16_to_cpu(*(uint16_t *)rng_seed->data));
+}
+
 static uint8_t fake_mac_rom[] = {
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 
@@ -397,7 +393,6 @@ static void q800_init(MachineState *machine)
 NubusBus *nubus;
 DeviceState *glue;
 DriveInfo *dinfo;
-ResetInfo *reset_info;
 uint8_t rng_seed[32];
 
 linux_boot = (kernel_filename != NULL);
@@ -408,12 +403,9 @@ static void q800_init(MachineState *machine)
 exit(1);
 }
 
-reset_info = g_new0(ResetInfo, 1);
-
 /* init CPUs */
 cpu = M68K_CPU(cpu_create(machine->cpu_type));
-reset_info->cpu = cpu;
-qemu_register_reset(main_cpu_reset, reset_info);
+qemu_register_reset(main_cpu_reset, cpu);
 
 /* RAM */
 memory_region_add_subregion(get_system_memory(), 0, machine->ram);
@@ -687,9 +679,10 @@ static void q800_init(MachineState *machine)
 BOOTINFO0(param_ptr, BI_LAST);
 rom_add_blob_fixed_as("bootinfo", param_blob, param_ptr - param_blob,
   parameters_base, cs->as);
-reset_info->rng_seed = rom_ptr_for_as(cs->as, parameters_base,
-  param_ptr - param_blob) +
-   (param_rng_seed - param_blob);
+qemu_register_reset_nosnapshotload(rerandomize_rng_seed,
+rom_ptr_for_as(cs->as, parameters_base,
+   param_ptr - param_blob) +
+(param_rng_seed - param_blob));
 g_free(param_blob);
 } else {
 uint8_t *ptr;
-- 
2.38.1




Re: [PATCH v2] hw/acpi/erst.c: Fix memory handling issues

2022-10-24 Thread Alexander Bulekov
On 221024 1224, Alexander Bulekov wrote:
> On 221024 1742, Christian A. Ehrhardt wrote:
> > - Fix memset argument order: The second argument is
> >   the value, the length goes last.
> > - Fix an integer overflow reported by Alexander Bulekov.
> > 
> > Both issues allow the guest to overrun the host buffer
> > allocated for the ERST memory device.
> > 
> > Cc: Eric DeVolder  > Cc: Alexander Bulekov 
> > Cc: qemu-sta...@nongnu.org
> > Fixes: f7e26ffa590 ("ACPI ERST: support for ACPI ERST feature")
> > Tested-by: Alexander Bulekov 
> 

Also:
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1268

> Reviewed-by: Alexander Bulekov 
> 
> > Signed-off-by: Christian A. Ehrhardt 
> > ---
> >  hw/acpi/erst.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> > index df856b2669..aefcc03ad6 100644
> > --- a/hw/acpi/erst.c
> > +++ b/hw/acpi/erst.c
> > @@ -635,7 +635,7 @@ static unsigned read_erst_record(ERSTDeviceState *s)
> >  if (record_length < UEFI_CPER_RECORD_MIN_SIZE) {
> >  rc = STATUS_FAILED;
> >  }
> > -if ((s->record_offset + record_length) > exchange_length) {
> > +if (record_length > exchange_length - s->record_offset) {
> >  rc = STATUS_FAILED;
> >  }
> >  /* If all is ok, copy the record to the exchange buffer */
> > @@ -684,7 +684,7 @@ static unsigned write_erst_record(ERSTDeviceState *s)
> >  if (record_length < UEFI_CPER_RECORD_MIN_SIZE) {
> >  return STATUS_FAILED;
> >  }
> > -if ((s->record_offset + record_length) > exchange_length) {
> > +if (record_length > exchange_length - s->record_offset) {
> >  return STATUS_FAILED;
> >  }
> >  
> > @@ -716,7 +716,7 @@ static unsigned write_erst_record(ERSTDeviceState *s)
> >  if (nvram) {
> >  /* Write the record into the slot */
> >  memcpy(nvram, exchange, record_length);
> > -memset(nvram + record_length, exchange_length - record_length, 
> > 0xFF);
> > +memset(nvram + record_length, 0xFF, exchange_length - 
> > record_length);
> >  /* If a new record, increment the record_count */
> >  if (!record_found) {
> >  uint32_t record_count;
> > -- 
> > 2.34.1
> > 
> > 



Re: [PATCH] tcg/riscv: Fix range matched by TCG_CT_CONST_M12

2022-10-24 Thread Alistair Francis
On Sat, Oct 22, 2022 at 8:19 PM Richard Henderson
 wrote:
>
> We were matching a signed 13-bit range, not a 12-bit range.
> Expand the commentary within the function and be explicit
> about all of the ranges.
>
> Reported-by: LIU Zhiwei 
> Signed-off-by: Richard Henderson 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target.c.inc | 19 ---
>  1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 1cdaf7b57b..2a84c57bec 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -154,13 +154,26 @@ static bool tcg_target_const_match(int64_t val, TCGType 
> type, int ct)
>  if ((ct & TCG_CT_CONST_ZERO) && val == 0) {
>  return 1;
>  }
> -if ((ct & TCG_CT_CONST_S12) && val == sextreg(val, 0, 12)) {
> +/*
> + * Sign extended from 12 bits: [-0x800, 0x7ff].
> + * Used for most arithmetic, as this is the isa field.
> + */
> +if ((ct & TCG_CT_CONST_S12) && val >= -0x800 && val <= 0x7ff) {
>  return 1;
>  }
> -if ((ct & TCG_CT_CONST_N12) && -val == sextreg(-val, 0, 12)) {
> +/*
> + * Sign extended from 12 bits, negated: [-0x7ff, 0x800].
> + * Used for subtraction, where a constant must be handled by ADDI.
> + */
> +if ((ct & TCG_CT_CONST_N12) && val >= -0x7ff && val <= 0x800) {
>  return 1;
>  }
> -if ((ct & TCG_CT_CONST_M12) && val >= -0xfff && val <= 0xfff) {
> +/*
> + * Sign extended from 12 bits, +/- matching: [-0x7ff, 0x7ff].
> + * Used by addsub2, which may need the negative operation,
> + * and requires the modified constant to be representable.
> + */
> +if ((ct & TCG_CT_CONST_M12) && val >= -0x7ff && val <= 0x7ff) {
>  return 1;
>  }
>  return 0;
> --
> 2.34.1
>
>



[PATCH v2 3/8] Hexagon (target/hexagon) Remove next_PC from runtime state

2022-10-24 Thread Taylor Simpson
The imported files don't properly mark all CONDEXEC instructions, so
we add some logic to hex_common.py to add the attribute.

Signed-off-by: Taylor Simpson 
---
 target/hexagon/cpu.h|  1 -
 target/hexagon/gen_tcg.h|  6 ++
 target/hexagon/macros.h |  2 +-
 target/hexagon/translate.h  |  2 +-
 target/hexagon/op_helper.c  |  6 +++---
 target/hexagon/translate.c  | 27 +--
 target/hexagon/gen_helper_funcs.py  |  4 
 target/hexagon/gen_helper_protos.py |  3 +++
 target/hexagon/gen_tcg_funcs.py |  3 +++
 target/hexagon/hex_common.py| 20 
 10 files changed, 62 insertions(+), 12 deletions(-)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 2a65a57bab..ff8c26272d 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -78,7 +78,6 @@ typedef struct CPUArchState {
 target_ulong gpr[TOTAL_PER_THREAD_REGS];
 target_ulong pred[NUM_PREGS];
 target_ulong branch_taken;
-target_ulong next_PC;
 
 /* For comparing with LLDB on target - see adjust_stack_ptrs function */
 target_ulong last_pc_dumped;
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 7f0ba27eb6..e6fc7d97d2 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -612,6 +612,12 @@
 tcg_temp_free(tmp); \
 } while (0)
 
+#define fGEN_TCG_J2_pause(SHORTCODE) \
+do { \
+uiV = uiV; \
+tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->next_PC); \
+} while (0)
+
 /* Floating point */
 #define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \
 gen_helper_conv_sf2df(RddV, cpu_env, RsV)
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 469dfa5571..2fc549c37e 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -400,7 +400,7 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int 
shift)
 #endif
 #define fREAD_PC() (PC)
 
-#define fREAD_NPC() (env->next_PC & (0xfffe))
+#define fREAD_NPC() (next_PC & (0xfffe))
 
 #define fREAD_P0() (READ_PREG(0))
 #define fREAD_P3() (READ_PREG(3))
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index a245172827..eae358cf33 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -27,6 +27,7 @@
 
 typedef struct DisasContext {
 DisasContextBase base;
+uint32_t next_PC;
 uint32_t mem_idx;
 uint32_t num_packets;
 uint32_t num_insns;
@@ -125,7 +126,6 @@ static inline void ctx_log_qreg_write(DisasContext *ctx,
 
 extern TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
 extern TCGv hex_pred[NUM_PREGS];
-extern TCGv hex_next_PC;
 extern TCGv hex_this_PC;
 extern TCGv hex_slot_cancelled;
 extern TCGv hex_branch_taken;
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 84391e25eb..aad0195eb6 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -119,12 +119,12 @@ static void write_new_pc(CPUHexagonState *env, bool 
pkt_has_multi_cof,
   "ignoring the second one\n");
 } else {
 fCHECK_PCALIGN(addr);
-env->next_PC = addr;
+env->gpr[HEX_REG_PC] = addr;
 env->branch_taken = 1;
 }
 } else {
 fCHECK_PCALIGN(addr);
-env->next_PC = addr;
+env->gpr[HEX_REG_PC] = addr;
 }
 }
 
@@ -299,7 +299,7 @@ void HELPER(debug_commit_end)(CPUHexagonState *env, int 
has_st0, int has_st1)
 }
 }
 
-HEX_DEBUG_LOG("Next PC = " TARGET_FMT_lx "\n", env->next_PC);
+HEX_DEBUG_LOG("Next PC = " TARGET_FMT_lx "\n", env->gpr[HEX_REG_PC]);
 HEX_DEBUG_LOG("Exec counters: pkt = " TARGET_FMT_lx
   ", insn = " TARGET_FMT_lx
   ", hvx = " TARGET_FMT_lx "\n",
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index fd4f0efa26..71ad2da682 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -31,7 +31,6 @@
 
 TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
 TCGv hex_pred[NUM_PREGS];
-TCGv hex_next_PC;
 TCGv hex_this_PC;
 TCGv hex_slot_cancelled;
 TCGv hex_branch_taken;
@@ -120,7 +119,6 @@ static void gen_exec_counters(DisasContext *ctx)
 static void gen_end_tb(DisasContext *ctx)
 {
 gen_exec_counters(ctx);
-tcg_gen_mov_tl(hex_gpr[HEX_REG_PC], hex_next_PC);
 tcg_gen_exit_tb(NULL, 0);
 ctx->base.is_jmp = DISAS_NORETURN;
 }
@@ -128,7 +126,7 @@ static void gen_end_tb(DisasContext *ctx)
 static void gen_exception_end_tb(DisasContext *ctx, int excp)
 {
 gen_exec_counters(ctx);
-tcg_gen_mov_tl(hex_gpr[HEX_REG_PC], hex_next_PC);
+tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->next_PC);
 gen_exception_raw(excp);
 ctx->base.is_jmp = DISAS_NORETURN;
 
@@ -204,12 +202,29 @@ static bool need_pred_written(Packet *pkt)
 return check_for_attrib(pkt, A_WRITES_PRED_REG);
 }
 
+static bool need_next_PC(Packet *pkt)
+{
+/* Check for conditional control flow or HW loop end */
+for (int i = 0; i < 

[PATCH v2 1/8] Hexagon (target/hexagon) Only use branch_taken when packet has multi cof

2022-10-24 Thread Taylor Simpson
When a packet has more than one change-of-flow instruction, only the first
one to branch is considered.  We use the branch_taken variable to keep
track of this.

However, when there is a single cof instruction, we don't need the same
amount of bookkeeping.

We add the pkt_has_multi_cof member to the Packet structure, and pass this
information to the needed functions.

When there is a generated helper function with cof, the generator will
pass this pkt_has_multi_cof as a runtime value.

Signed-off-by: Taylor Simpson 
---
 target/hexagon/insn.h   |  1 +
 target/hexagon/macros.h |  2 +-
 target/hexagon/decode.c | 15 +--
 target/hexagon/op_helper.c  | 24 +++-
 target/hexagon/translate.c  |  4 +++-
 target/hexagon/gen_helper_funcs.py  |  3 +++
 target/hexagon/gen_helper_protos.py |  6 +-
 target/hexagon/gen_tcg_funcs.py |  5 +
 target/hexagon/hex_common.py|  3 +++
 9 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/target/hexagon/insn.h b/target/hexagon/insn.h
index aa26389147..857a7ceb75 100644
--- a/target/hexagon/insn.h
+++ b/target/hexagon/insn.h
@@ -60,6 +60,7 @@ struct Packet {
 
 /* Pre-decodes about COF */
 bool pkt_has_cof;  /* Has any change-of-flow */
+bool pkt_has_multi_cof;/* Has more than one change-of-flow */
 bool pkt_has_endloop;
 
 bool pkt_has_dczeroa;
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index c8805bdaeb..e908405d82 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -407,7 +407,7 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int 
shift)
 
 #define fCHECK_PCALIGN(A)
 
-#define fWRITE_NPC(A) write_new_pc(env, A)
+#define fWRITE_NPC(A) write_new_pc(env, pkt_has_multi_cof != 0, A)
 
 #define fBRANCH(LOC, TYPE)  fWRITE_NPC(LOC)
 #define fJUMPR(REGNO, TARGET, TYPE) fBRANCH(TARGET, COF_TYPE_JUMPR)
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 6b73b5c60c..041c8de751 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -388,6 +388,7 @@ static void decode_set_insn_attr_fields(Packet *pkt)
 uint16_t opcode;
 
 pkt->pkt_has_cof = false;
+pkt->pkt_has_multi_cof = false;
 pkt->pkt_has_endloop = false;
 pkt->pkt_has_dczeroa = false;
 
@@ -412,13 +413,23 @@ static void decode_set_insn_attr_fields(Packet *pkt)
 }
 }
 
-pkt->pkt_has_cof |= decode_opcode_can_jump(opcode);
+if (decode_opcode_can_jump(opcode)) {
+if (pkt->pkt_has_cof) {
+pkt->pkt_has_multi_cof = true;
+}
+pkt->pkt_has_cof = true;
+}
 
 pkt->insn[i].is_endloop = decode_opcode_ends_loop(opcode);
 
 pkt->pkt_has_endloop |= pkt->insn[i].is_endloop;
 
-pkt->pkt_has_cof |= pkt->pkt_has_endloop;
+if (pkt->pkt_has_endloop) {
+if (pkt->pkt_has_cof) {
+pkt->pkt_has_multi_cof = true;
+}
+pkt->pkt_has_cof = true;
+}
 }
 }
 
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 085afc3274..84391e25eb 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -104,20 +104,26 @@ static void log_store64(CPUHexagonState *env, 
target_ulong addr,
 env->mem_log_stores[slot].data64 = val;
 }
 
-static void write_new_pc(CPUHexagonState *env, target_ulong addr)
+static void write_new_pc(CPUHexagonState *env, bool pkt_has_multi_cof,
+ target_ulong addr)
 {
 HEX_DEBUG_LOG("write_new_pc(0x" TARGET_FMT_lx ")\n", addr);
 
-/*
- * If more than one branch is taken in a packet, only the first one
- * is actually done.
- */
-if (env->branch_taken) {
-HEX_DEBUG_LOG("INFO: multiple branches taken in same packet, "
-  "ignoring the second one\n");
+if (pkt_has_multi_cof) {
+/*
+ * If more than one branch is taken in a packet, only the first one
+ * is actually done.
+ */
+if (env->branch_taken) {
+HEX_DEBUG_LOG("INFO: multiple branches taken in same packet, "
+  "ignoring the second one\n");
+} else {
+fCHECK_PCALIGN(addr);
+env->next_PC = addr;
+env->branch_taken = 1;
+}
 } else {
 fCHECK_PCALIGN(addr);
-env->branch_taken = 1;
 env->next_PC = addr;
 }
 }
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 2329177537..2e46cc0680 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -247,7 +247,9 @@ static void gen_start_packet(DisasContext *ctx, Packet *pkt)
 tcg_gen_movi_tl(hex_slot_cancelled, 0);
 }
 if (pkt->pkt_has_cof) {
-tcg_gen_movi_tl(hex_branch_taken, 0);
+if (pkt->pkt_has_multi_cof) {
+tcg_gen_movi_tl(hex_branch_taken, 0);
+}
 

[PATCH v2 5/8] Hexagon (target/hexagon) Add overrides for compound compare and jump

2022-10-24 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h | 177 +++
 target/hexagon/genptr.c  |  74 
 2 files changed, 251 insertions(+)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index ad149adbe1..b56b216110 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -620,6 +620,183 @@
 #define fGEN_TCG_J2_callf(SHORTCODE) \
 gen_cond_call(ctx, pkt, PuV, false, riV)
 
+/*
+ * Compound compare and jump instructions
+ * Here is a primer to understand the tag names
+ *
+ * Comparison
+ *  cmpeqi   compare equal to an immediate
+ *  cmpgti   compare greater than an immediate
+ *  cmpgtiu  compare greater than an unsigned immediate
+ *  cmpeqn1  compare equal to negative 1
+ *  cmpgtn1  compare greater than negative 1
+ *  cmpeqcompare equal (two registers)
+ *  cmpgtu   compare greater than unsigned (two registers)
+ *  tstbit0  test bit zero
+ *
+ * Condition
+ *  tp0  p0 is true p0 = cmp.eq(r0,#5); if (p0.new) jump:nt address
+ *  fp0  p0 is falsep0 = cmp.eq(r0,#5); if (!p0.new) jump:nt 
address
+ *  tp1  p1 is true p1 = cmp.eq(r0,#5); if (p1.new) jump:nt address
+ *  fp1  p1 is falsep1 = cmp.eq(r0,#5); if (!p1.new) jump:nt 
address
+ *
+ * Prediction (not modelled in qemu)
+ *  _nt  not taken
+ *  _t   taken
+ */
+#define fGEN_TCG_J4_cmpeq_tp0_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_EQ, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_tp0_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_EQ, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_fp0_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_EQ, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_fp0_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_EQ, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_tp1_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_EQ, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_tp1_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_EQ, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_fp1_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_EQ, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_fp1_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_EQ, false, RsV, RtV, riV)
+
+#define fGEN_TCG_J4_cmpgt_tp0_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_GT, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_tp0_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_GT, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_fp0_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_GT, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_fp0_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_GT, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_tp1_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_GT, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_tp1_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_GT, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_fp1_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_GT, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_fp1_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_GT, false, RsV, RtV, riV)
+
+#define fGEN_TCG_J4_cmpgtu_tp0_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_GTU, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgtu_tp0_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_GTU, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgtu_fp0_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_GTU, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgtu_fp0_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 0, TCG_COND_GTU, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgtu_tp1_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_GTU, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgtu_tp1_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_GTU, true, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgtu_fp1_jump_t(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_GTU, false, RsV, RtV, riV)
+#define fGEN_TCG_J4_cmpgtu_fp1_jump_nt(SHORTCODE) \
+gen_cmpnd_cmp_jmp(ctx, pkt, insn, 1, TCG_COND_GTU, false, RsV, RtV, riV)
+
+#define fGEN_TCG_J4_cmpeqi_tp0_jump_t(SHORTCODE) \
+gen_cmpnd_cmpi_jmp(ctx, pkt, insn, 0, TCG_COND_EQ, true, RsV, UiV, riV)
+#define fGEN_TCG_J4_cmpeqi_tp0_jump_nt(SHORTCODE) \
+gen_cmpnd_cmpi_jmp(ctx, pkt, insn, 0, TCG_COND_EQ, true, RsV, UiV, riV)
+#define fGEN_TCG_J4_cmpeqi_fp0_jump_t(SHORTCODE) \
+gen_cmpnd_cmpi_jmp(ctx, pkt, insn, 0, TCG_COND_EQ, false, RsV, UiV, riV)
+#define fGEN_TCG_J4_cmpeqi_fp0_jump_nt(SHORTCODE) \
+gen_cmpnd_cmpi_jmp(ctx, pkt, 

[PATCH v2 2/8] Hexagon (target/hexagon) Remove PC from the runtime state

2022-10-24 Thread Taylor Simpson
Add pc field to Packet structure
For helpers that need PC, pass an extra argument
Remove slot arg from conditional jump helpers
On a trap0, copy pkt->pc into hex_gpr[HEX_REG_PC]

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h| 7 +++
 target/hexagon/insn.h   | 1 +
 target/hexagon/macros.h | 2 +-
 target/hexagon/translate.c  | 9 +
 target/hexagon/gen_helper_funcs.py  | 4 
 target/hexagon/gen_helper_protos.py | 3 +++
 target/hexagon/gen_tcg_funcs.py | 3 +++
 target/hexagon/hex_common.py| 6 +-
 8 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 50634ac459..7f0ba27eb6 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -742,4 +742,11 @@
 RsV = RsV; \
 } while (0)
 
+#define fGEN_TCG_J2_trap0(SHORTCODE) \
+do { \
+uiV = uiV; \
+tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], pkt->pc); \
+TCGv excp = tcg_constant_tl(HEX_EXCP_TRAP0); \
+gen_helper_raise_exception(cpu_env, excp); \
+} while (0)
 #endif
diff --git a/target/hexagon/insn.h b/target/hexagon/insn.h
index 857a7ceb75..b3260d1f0b 100644
--- a/target/hexagon/insn.h
+++ b/target/hexagon/insn.h
@@ -57,6 +57,7 @@ typedef struct Instruction Insn;
 struct Packet {
 uint16_t num_insns;
 uint16_t encod_pkt_size_in_bytes;
+uint32_t pc;
 
 /* Pre-decodes about COF */
 bool pkt_has_cof;  /* Has any change-of-flow */
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index e908405d82..469dfa5571 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -398,7 +398,7 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int 
shift)
 #else
 #define fREAD_GP() READ_REG(HEX_REG_GP)
 #endif
-#define fREAD_PC() (READ_REG(HEX_REG_PC))
+#define fREAD_PC() (PC)
 
 #define fREAD_NPC() (env->next_PC & (0xfffe))
 
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 2e46cc0680..fd4f0efa26 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -194,11 +194,6 @@ static bool check_for_attrib(Packet *pkt, int attrib)
 return false;
 }
 
-static bool need_pc(Packet *pkt)
-{
-return check_for_attrib(pkt, A_IMPLICIT_READS_PC);
-}
-
 static bool need_slot_cancelled(Packet *pkt)
 {
 return check_for_attrib(pkt, A_CONDEXEC);
@@ -240,9 +235,6 @@ static void gen_start_packet(DisasContext *ctx, Packet *pkt)
 }
 
 /* Initialize the runtime state for packet semantics */
-if (need_pc(pkt)) {
-tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], ctx->base.pc_next);
-}
 if (need_slot_cancelled(pkt)) {
 tcg_gen_movi_tl(hex_slot_cancelled, 0);
 }
@@ -768,6 +760,7 @@ static void decode_and_translate_packet(CPUHexagonState 
*env, DisasContext *ctx)
 }
 
 if (decode_packet(nwords, words, , false) > 0) {
+pkt.pc = ctx->base.pc_next;
 HEX_DEBUG_PRINT_PKT();
 gen_start_packet(ctx, );
 for (i = 0; i < pkt.num_insns; i++) {
diff --git a/target/hexagon/gen_helper_funcs.py 
b/target/hexagon/gen_helper_funcs.py
index f7c1a82e9f..8ab144b20a 100755
--- a/target/hexagon/gen_helper_funcs.py
+++ b/target/hexagon/gen_helper_funcs.py
@@ -241,6 +241,10 @@ def gen_helper_function(f, tag, tagregs, tagimms):
 if (hex_common.need_pkt_has_multi_cof(tag)):
 f.write(", uint32_t pkt_has_multi_cof")
 
+if hex_common.need_PC(tag):
+if i > 0: f.write(", ")
+f.write("target_ulong PC")
+i += 1
 if hex_common.need_slot(tag):
 if i > 0: f.write(", ")
 f.write("uint32_t slot")
diff --git a/target/hexagon/gen_helper_protos.py 
b/target/hexagon/gen_helper_protos.py
index 4530d7ba8d..2385717dda 100755
--- a/target/hexagon/gen_helper_protos.py
+++ b/target/hexagon/gen_helper_protos.py
@@ -85,6 +85,7 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
 if hex_common.need_pkt_has_multi_cof(tag): def_helper_size += 1
 if hex_common.need_part1(tag): def_helper_size += 1
 if hex_common.need_slot(tag): def_helper_size += 1
+if hex_common.need_PC(tag): def_helper_size += 1
 f.write('DEF_HELPER_%s(%s' % (def_helper_size, tag))
 ## The return type is void
 f.write(', void' )
@@ -93,6 +94,7 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
 if hex_common.need_pkt_has_multi_cof(tag): def_helper_size += 1
 if hex_common.need_part1(tag): def_helper_size += 1
 if hex_common.need_slot(tag): def_helper_size += 1
+if hex_common.need_PC(tag): def_helper_size += 1
 f.write('DEF_HELPER_%s(%s' % (def_helper_size, tag))
 
 ## Generate the qemu DEF_HELPER type for each result
@@ -131,6 +133,7 @@ def gen_helper_prototype(f, tag, tagregs, tagimms):
 ## Add the arguments for the instruction 

[PATCH v2 8/8] Hexagon (target/hexagon) Use direct block chaining for tight loops

2022-10-24 Thread Taylor Simpson
Direct block chaining is documented here
https://qemu.readthedocs.io/en/latest/devel/tcg.html#direct-block-chaining

Hexagon inner loops end with the endloop0 instruction
To go back to the beginning of the loop, this instructions writes to PC
from register SA0 (start address 0).  To use direct block chaining, we
have to assign PC with a constant value.  So, we specialize the code
generation when the start of the translation block is equal to SA0.

When this is the case, we defer the compare/branch from endloop0 to
gen_end_tb.  When this is done, we can assign the start address of the TB
to PC.

Signed-off-by: Taylor Simpson 
---
 target/hexagon/cpu.h   | 17 ++---
 target/hexagon/gen_tcg.h   |  3 ++
 target/hexagon/translate.h |  1 +
 target/hexagon/genptr.c| 71 ++
 target/hexagon/translate.c | 41 +++---
 5 files changed, 124 insertions(+), 9 deletions(-)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index ff8c26272d..5260e0f127 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -152,16 +152,23 @@ struct ArchCPU {
 
 #include "cpu_bits.h"
 
+typedef union {
+uint32_t i;
+struct {
+bool is_tight_loop:1;
+};
+} HexStateFlags;
+
 static inline void cpu_get_tb_cpu_state(CPUHexagonState *env, target_ulong *pc,
 target_ulong *cs_base, uint32_t *flags)
 {
+HexStateFlags hex_flags = { 0 };
 *pc = env->gpr[HEX_REG_PC];
 *cs_base = 0;
-#ifdef CONFIG_USER_ONLY
-*flags = 0;
-#else
-#error System mode not supported on Hexagon yet
-#endif
+if (*pc == env->gpr[HEX_REG_SA0]) {
+hex_flags.is_tight_loop = true;
+}
+*flags = hex_flags.i;
 }
 
 static inline int cpu_mmu_index(CPUHexagonState *env, bool ifetch)
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 216862352c..552258064b 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -620,6 +620,9 @@
 #define fGEN_TCG_J2_callf(SHORTCODE) \
 gen_cond_call(ctx, pkt, PuV, false, riV)
 
+#define fGEN_TCG_J2_endloop0(SHORTCODE) \
+gen_endloop0(ctx, pkt)
+
 /*
  * Compound compare and jump instructions
  * Here is a primer to understand the tag names
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index e60dbf0e7a..34abe86b5c 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -57,6 +57,7 @@ typedef struct DisasContext {
 bool has_single_direct_branch;
 TCGv branch_cond;
 target_ulong branch_dest;
+bool is_tight_loop;
 } DisasContext;
 
 static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index c75a6aae84..188be88a95 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -498,6 +498,20 @@ static void gen_write_new_pc_pcrel(DisasContext *ctx, 
Packet *pkt,
 }
 }
 
+static void gen_set_usr_field(int field, TCGv val)
+{
+tcg_gen_deposit_tl(hex_new_value[HEX_REG_USR], hex_new_value[HEX_REG_USR],
+   val,
+   reg_field_info[field].offset,
+   reg_field_info[field].width);
+}
+
+static void gen_set_usr_fieldi(int field, int x)
+{
+TCGv val = tcg_constant_tl(x);
+gen_set_usr_field(field, val);
+}
+
 static void gen_compare(TCGCond cond, TCGv res, TCGv arg1, TCGv arg2)
 {
 TCGv one = tcg_constant_tl(0xff);
@@ -627,6 +641,63 @@ static void gen_cond_call(DisasContext *ctx, Packet *pkt,
 gen_set_label(skip);
 }
 
+static void gen_endloop0(DisasContext *ctx, Packet *pkt)
+{
+TCGv lpcfg = tcg_temp_local_new();
+
+GET_USR_FIELD(USR_LPCFG, lpcfg);
+
+/*
+ *if (lpcfg == 1) {
+ *hex_new_pred_value[3] = 0xff;
+ *hex_pred_written |= 1 << 3;
+ *}
+ */
+TCGLabel *label1 = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_NE, lpcfg, 1, label1);
+{
+tcg_gen_movi_tl(hex_new_pred_value[3], 0xff);
+tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << 3);
+}
+gen_set_label(label1);
+
+/*
+ *if (lpcfg) {
+ *SET_USR_FIELD(USR_LPCFG, lpcfg - 1);
+ *}
+ */
+TCGLabel *label2 = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_EQ, lpcfg, 0, label2);
+{
+tcg_gen_subi_tl(lpcfg, lpcfg, 1);
+SET_USR_FIELD(USR_LPCFG, lpcfg);
+}
+gen_set_label(label2);
+
+/*
+ * If we're in a tight loop, we'll do this at the end of the TB to take
+ * advantage of direct block chaining.
+ */
+if (!ctx->is_tight_loop) {
+/*
+ *if (hex_gpr[HEX_REG_LC0] > 1) {
+ *PC = hex_gpr[HEX_REG_SA0];
+ *hex_new_value[HEX_REG_LC0] = hex_gpr[HEX_REG_LC0] - 1;
+ *}
+ */
+TCGLabel *label3 = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_LEU, hex_gpr[HEX_REG_LC0], 1, label3);
+{
+gen_jumpr(ctx, pkt, 

[PATCH v2 6/8] Hexagon (target/hexagon) Add overrides for various forms of jump

2022-10-24 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h | 189 +++
 target/hexagon/genptr.c  |  46 ++
 2 files changed, 235 insertions(+)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index b56b216110..216862352c 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -797,6 +797,195 @@
 #define fGEN_TCG_J4_tstbit0_fp1_jump_t(SHORTCODE) \
 gen_cmpnd_tstbit0_jmp(ctx, pkt, insn, 1, false, RsV, riV)
 
+#define fGEN_TCG_J2_jump(SHORTCODE) \
+gen_jump(ctx, pkt, riV)
+#define fGEN_TCG_J2_jumpr(SHORTCODE) \
+gen_jumpr(ctx, pkt, RsV)
+#define fGEN_TCG_J4_jumpseti(SHORTCODE) \
+do { \
+tcg_gen_movi_tl(RdV, UiV); \
+gen_jump(ctx, pkt, riV); \
+} while (0)
+
+#define fGEN_TCG_cond_jump(COND) \
+do { \
+TCGv LSB = tcg_temp_new(); \
+COND; \
+gen_cond_jump(ctx, pkt, LSB, riV); \
+tcg_temp_free(LSB); \
+} while (0)
+
+#define fGEN_TCG_J2_jumpt(SHORTCODE) \
+fGEN_TCG_cond_jump(fLSBOLD(PuV))
+#define fGEN_TCG_J2_jumptpt(SHORTCODE) \
+fGEN_TCG_cond_jump(fLSBOLD(PuV))
+#define fGEN_TCG_J2_jumpf(SHORTCODE) \
+fGEN_TCG_cond_jump(fLSBOLDNOT(PuV))
+#define fGEN_TCG_J2_jumpfpt(SHORTCODE) \
+fGEN_TCG_cond_jump(fLSBOLDNOT(PuV))
+#define fGEN_TCG_J2_jumptnew(SHORTCODE) \
+gen_cond_jump(ctx, pkt, PuN, riV)
+#define fGEN_TCG_J2_jumptnewpt(SHORTCODE) \
+gen_cond_jump(ctx, pkt, PuN, riV)
+#define fGEN_TCG_J2_jumpfnewpt(SHORTCODE) \
+fGEN_TCG_cond_jump(fLSBNEWNOT(PuN))
+#define fGEN_TCG_J2_jumpfnew(SHORTCODE) \
+fGEN_TCG_cond_jump(fLSBNEWNOT(PuN))
+#define fGEN_TCG_J2_jumprz(SHORTCODE) \
+fGEN_TCG_cond_jump(tcg_gen_setcondi_tl(TCG_COND_NE, LSB, RsV, 0))
+#define fGEN_TCG_J2_jumprzpt(SHORTCODE) \
+fGEN_TCG_cond_jump(tcg_gen_setcondi_tl(TCG_COND_NE, LSB, RsV, 0))
+#define fGEN_TCG_J2_jumprnz(SHORTCODE) \
+fGEN_TCG_cond_jump(tcg_gen_setcondi_tl(TCG_COND_EQ, LSB, RsV, 0))
+#define fGEN_TCG_J2_jumprnzpt(SHORTCODE) \
+fGEN_TCG_cond_jump(tcg_gen_setcondi_tl(TCG_COND_EQ, LSB, RsV, 0))
+#define fGEN_TCG_J2_jumprgtez(SHORTCODE) \
+fGEN_TCG_cond_jump(tcg_gen_setcondi_tl(TCG_COND_GE, LSB, RsV, 0))
+#define fGEN_TCG_J2_jumprgtezpt(SHORTCODE) \
+fGEN_TCG_cond_jump(tcg_gen_setcondi_tl(TCG_COND_GE, LSB, RsV, 0))
+#define fGEN_TCG_J2_jumprltez(SHORTCODE) \
+fGEN_TCG_cond_jump(tcg_gen_setcondi_tl(TCG_COND_LE, LSB, RsV, 0))
+#define fGEN_TCG_J2_jumprltezpt(SHORTCODE) \
+fGEN_TCG_cond_jump(tcg_gen_setcondi_tl(TCG_COND_LE, LSB, RsV, 0))
+
+#define fGEN_TCG_cond_jumpr(COND) \
+do { \
+TCGv LSB = tcg_temp_new(); \
+COND; \
+gen_cond_jumpr(ctx, pkt, LSB, RsV); \
+tcg_temp_free(LSB); \
+} while (0)
+
+#define fGEN_TCG_J2_jumprt(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBOLD(PuV))
+#define fGEN_TCG_J2_jumprtpt(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBOLD(PuV))
+#define fGEN_TCG_J2_jumprf(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBOLDNOT(PuV))
+#define fGEN_TCG_J2_jumprfpt(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBOLDNOT(PuV))
+#define fGEN_TCG_J2_jumprtnew(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBNEW(PuN))
+#define fGEN_TCG_J2_jumprtnewpt(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBNEW(PuN))
+#define fGEN_TCG_J2_jumprfnew(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBNEWNOT(PuN))
+#define fGEN_TCG_J2_jumprfnewpt(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBNEWNOT(PuN))
+#define fGEN_TCG_J2_jumprfnewpt(SHORTCODE) \
+fGEN_TCG_cond_jumpr(fLSBNEWNOT(PuN))
+
+/*
+ * New value compare & jump instructions
+ * if ([!]COND(r0.new, r1) jump:t address
+ * if ([!]COND(r0.new, #7) jump:t address
+ */
+#define fGEN_TCG_J4_cmpgt_t_jumpnv_t(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_GT, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_t_jumpnv_nt(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_GT, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_f_jumpnv_t(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_LE, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmpgt_f_jumpnv_nt(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_LE, NsN, RtV, riV)
+
+#define fGEN_TCG_J4_cmpeq_t_jumpnv_t(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_EQ, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_t_jumpnv_nt(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_EQ, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_f_jumpnv_t(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_NE, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmpeq_f_jumpnv_nt(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_NE, NsN, RtV, riV)
+
+#define fGEN_TCG_J4_cmplt_t_jumpnv_t(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_LT, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmplt_t_jumpnv_nt(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_LT, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmplt_f_jumpnv_t(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_GE, NsN, RtV, riV)
+#define fGEN_TCG_J4_cmplt_f_jumpnv_nt(SHORTCODE) \
+gen_cmp_jumpnv(ctx, pkt, TCG_COND_GE, NsN, RtV, riV)
+
+#define 

[PATCH v2 0/8] Hexagon (target/hexagon) Improve change-of-flow

2022-10-24 Thread Taylor Simpson
This patch series improves change-of-flow handling.

Currently, we set the PC to a new address before exiting a TB.  The
ultimate goal is to use direct block chaining.  However, several steps
are needed along the way.

1)
When a packet has more than one change-of-flow (COF) instruction, only
the first one taken is considered.  The runtime bookkeeping is only
needed when there is more than one COF instruction in a packet.

2, 3)
Remove PC and next_PC from the runtime state and always use a
translation-time constant.  Note that next_PC is used by call instructions
to set LR and by conditional COF instructions to set the fall-through
address.

4, 5, 6)
Add helper overrides for COF instructions.  In particular, we must
distinguish those that use a PC-relative address for the destination.
These are candidates for direct block chaining later.

7)
Use direct block chaining for packets that have a single PC-relative
COF instruction.  Instead of generating the code while processing the
instruction, we record the effect in DisasContext and generate the code
during gen_end_tb.

8)
Use direct block chaining for tight loops.  We look for TBs that end
with an endloop0 that will branch back to the TB start address.


 Changes in V2 
Simplify test in need_pkt_has_multi_cof
Address feedback from Matheus Tavares Bernardino 
Rearrange new-value-jump overrides
Simplify gen_write_new_pc_addr



Taylor Simpson (8):
  Hexagon (target/hexagon) Only use branch_taken when packet has multi
cof
  Hexagon (target/hexagon) Remove PC from the runtime state
  Hexagon (target/hexagon) Remove next_PC from runtime state
  Hexagon (target/hexagon) Add overrides for direct call instructions
  Hexagon (target/hexagon) Add overrides for compound compare and jump
  Hexagon (target/hexagon) Add overrides for various forms of jump
  Hexagon (target/hexagon) Use direct block chaining for direct
jump/branch
  Hexagon (target/hexagon) Use direct block chaining for tight loops

 target/hexagon/cpu.h|  18 +-
 target/hexagon/gen_tcg.h| 390 
 target/hexagon/insn.h   |   2 +
 target/hexagon/macros.h |   6 +-
 target/hexagon/translate.h  |   6 +-
 target/hexagon/decode.c |  15 +-
 target/hexagon/genptr.c | 260 +++
 target/hexagon/op_helper.c  |  28 +-
 target/hexagon/translate.c  | 120 +++--
 target/hexagon/gen_helper_funcs.py  |  11 +
 target/hexagon/gen_helper_protos.py |  12 +-
 target/hexagon/gen_tcg_funcs.py |  11 +
 target/hexagon/hex_common.py|  29 ++-
 13 files changed, 863 insertions(+), 45 deletions(-)

-- 
2.17.1



[PATCH v2 7/8] Hexagon (target/hexagon) Use direct block chaining for direct jump/branch

2022-10-24 Thread Taylor Simpson
Direct block chaining is documented here
https://qemu.readthedocs.io/en/latest/devel/tcg.html#direct-block-chaining

Recall that Hexagon allows packets with multiple jumps where only the first
one with a true predicate will actually jump.  So, we can only use direct
block chaining when the packet contains a single PC-relative jump.  We add
the following to DisasContext in order to perform direct block chaining at
the end of packet commit (in gen_end_tb)
has_single_direct_branch
Indicates that we can use direct block chaining
branch_cond
The condition under which the branch is taken
branch_dest
The destination of the branch

Signed-off-by: Taylor Simpson 
---
 target/hexagon/translate.h |  3 +++
 target/hexagon/genptr.c| 13 -
 target/hexagon/translate.c | 39 +-
 3 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index eae358cf33..e60dbf0e7a 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -54,6 +54,9 @@ typedef struct DisasContext {
 bool qreg_is_predicated[NUM_QREGS];
 int qreg_log_idx;
 bool pre_commit;
+bool has_single_direct_branch;
+TCGv branch_cond;
+target_ulong branch_dest;
 } DisasContext;
 
 static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 437250c0f9..c75a6aae84 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -484,7 +484,18 @@ static void gen_write_new_pc_pcrel(DisasContext *ctx, 
Packet *pkt,
int pc_off, TCGv pred)
 {
 target_ulong dest = pkt->pc + pc_off;
-gen_write_new_pc_addr(ctx, pkt, tcg_constant_tl(dest), pred);
+if (pkt->pkt_has_multi_cof) {
+gen_write_new_pc_addr(ctx, pkt, tcg_constant_tl(dest), pred);
+} else {
+/* Defer this jump to the end of the TB */
+g_assert(ctx->branch_cond == NULL);
+ctx->has_single_direct_branch = true;
+if (pred != NULL) {
+ctx->branch_cond = tcg_temp_local_new();
+tcg_gen_mov_tl(ctx->branch_cond, pred);
+}
+ctx->branch_dest = dest;
+}
 }
 
 static void gen_compare(TCGCond cond, TCGv res, TCGv arg1, TCGv arg2)
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 71ad2da682..29e2caaf0f 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -116,10 +116,44 @@ static void gen_exec_counters(DisasContext *ctx)
 hex_gpr[HEX_REG_QEMU_HVX_CNT], ctx->num_hvx_insns);
 }
 
+static bool use_goto_tb(DisasContext *ctx, target_ulong dest)
+{
+return translator_use_goto_tb(>base, dest);
+}
+
+static void gen_goto_tb(DisasContext *ctx, int idx, target_ulong dest)
+{
+if (use_goto_tb(ctx, dest)) {
+tcg_gen_goto_tb(idx);
+tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], dest);
+tcg_gen_exit_tb(ctx->base.tb, idx);
+} else {
+tcg_gen_movi_tl(hex_gpr[HEX_REG_PC], dest);
+tcg_gen_lookup_and_goto_ptr();
+}
+}
+
 static void gen_end_tb(DisasContext *ctx)
 {
 gen_exec_counters(ctx);
-tcg_gen_exit_tb(NULL, 0);
+
+if (ctx->has_single_direct_branch) {
+if (ctx->branch_cond != NULL) {
+TCGLabel *skip = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_EQ, ctx->branch_cond, 0, skip);
+gen_goto_tb(ctx, 0, ctx->branch_dest);
+gen_set_label(skip);
+gen_goto_tb(ctx, 1, ctx->next_PC);
+tcg_temp_free(ctx->branch_cond);
+ctx->branch_cond = NULL;
+} else {
+gen_goto_tb(ctx, 0, ctx->branch_dest);
+}
+} else {
+tcg_gen_lookup_and_goto_ptr();
+}
+
+g_assert(ctx->branch_cond == NULL);
 ctx->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -803,6 +837,9 @@ static void hexagon_tr_init_disas_context(DisasContextBase 
*dcbase,
 
 static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
 {
+DisasContext *ctx = container_of(db, DisasContext, base);
+ctx->has_single_direct_branch = false;
+ctx->branch_cond = NULL;
 }
 
 static void hexagon_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
-- 
2.17.1



[PATCH v2 4/8] Hexagon (target/hexagon) Add overrides for direct call instructions

2022-10-24 Thread Taylor Simpson
Add overrides for
J2_call
J2_callt
J2_callf

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h |  8 ++
 target/hexagon/genptr.c  | 58 
 2 files changed, 66 insertions(+)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index e6fc7d97d2..ad149adbe1 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -612,6 +612,14 @@
 tcg_temp_free(tmp); \
 } while (0)
 
+#define fGEN_TCG_J2_call(SHORTCODE) \
+gen_call(ctx, pkt, riV)
+
+#define fGEN_TCG_J2_callt(SHORTCODE) \
+gen_cond_call(ctx, pkt, PuV, true, riV)
+#define fGEN_TCG_J2_callf(SHORTCODE) \
+gen_cond_call(ctx, pkt, PuV, false, riV)
+
 #define fGEN_TCG_J2_pause(SHORTCODE) \
 do { \
 uiV = uiV; \
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 806d0974ff..2784b84041 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -456,6 +456,64 @@ static TCGv gen_8bitsof(TCGv result, TCGv value)
 return result;
 }
 
+static void gen_write_new_pc_addr(DisasContext *ctx, Packet *pkt,
+  TCGv addr, TCGv pred)
+{
+TCGLabel *pred_false = NULL;
+if (pred != NULL) {
+pred_false = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_EQ, pred, 0, pred_false);
+}
+
+if (pkt->pkt_has_multi_cof) {
+/* If there are multiple branches in a packet, ignore the second one */
+tcg_gen_movcond_tl(TCG_COND_NE, hex_gpr[HEX_REG_PC],
+   hex_branch_taken, tcg_constant_tl(0),
+   hex_gpr[HEX_REG_PC], addr);
+tcg_gen_movi_tl(hex_branch_taken, 1);
+} else {
+tcg_gen_mov_tl(hex_gpr[HEX_REG_PC], addr);
+}
+
+if (pred != NULL) {
+gen_set_label(pred_false);
+}
+}
+
+static void gen_write_new_pc_pcrel(DisasContext *ctx, Packet *pkt,
+   int pc_off, TCGv pred)
+{
+target_ulong dest = pkt->pc + pc_off;
+gen_write_new_pc_addr(ctx, pkt, tcg_constant_tl(dest), pred);
+}
+
+static void gen_call(DisasContext *ctx, Packet *pkt, int pc_off)
+{
+TCGv next_PC =
+tcg_constant_tl(pkt->pc + pkt->encod_pkt_size_in_bytes);
+gen_log_reg_write(HEX_REG_LR, next_PC);
+gen_write_new_pc_pcrel(ctx, pkt, pc_off, NULL);
+}
+
+static void gen_cond_call(DisasContext *ctx, Packet *pkt,
+  TCGv pred, bool sense, int pc_off)
+{
+TCGv next_PC;
+TCGv lsb = tcg_temp_local_new();
+TCGLabel *skip = gen_new_label();
+tcg_gen_andi_tl(lsb, pred, 1);
+if (!sense) {
+tcg_gen_xori_tl(lsb, lsb, 1);
+}
+gen_write_new_pc_pcrel(ctx, pkt, pc_off, lsb);
+tcg_gen_brcondi_tl(TCG_COND_EQ, lsb, 0, skip);
+tcg_temp_free(lsb);
+next_PC =
+tcg_constant_tl(pkt->pc + pkt->encod_pkt_size_in_bytes);
+gen_log_reg_write(HEX_REG_LR, next_PC);
+gen_set_label(skip);
+}
+
 static intptr_t vreg_src_off(DisasContext *ctx, int num)
 {
 intptr_t offset = offsetof(CPUHexagonState, VRegs[num]);
-- 
2.17.1



[PATCH 4/4] include/qemu/atomic128: Avoid __sync_val_compare_and_swap_16

2022-10-24 Thread Richard Henderson
Merge the CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 cases
with respect to atomic16_cmpxchg and use
__atomic_compare_exchange_nomic (via qatomic_cmpxchg)
instead of the "legacy" __sync_val_compare_and_swap_16.

Update the meson has_cmpxchg128 test to match.

Signed-off-by: Richard Henderson 
---
 include/qemu/atomic128.h | 8 +---
 meson.build  | 3 ++-
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/include/qemu/atomic128.h b/include/qemu/atomic128.h
index adb9a1a260..ec45754515 100644
--- a/include/qemu/atomic128.h
+++ b/include/qemu/atomic128.h
@@ -41,18 +41,12 @@
  * Therefore, special case each platform.
  */
 
-#if defined(CONFIG_ATOMIC128)
+#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
 static inline Int128 atomic16_cmpxchg(Int128 *ptr, Int128 cmp, Int128 new)
 {
 return qatomic_cmpxchg__nocheck(ptr, cmp, new);
 }
 # define HAVE_CMPXCHG128 1
-#elif defined(CONFIG_CMPXCHG128)
-static inline Int128 atomic16_cmpxchg(Int128 *ptr, Int128 cmp, Int128 new)
-{
-return __sync_val_compare_and_swap_16(ptr, cmp, new);
-}
-# define HAVE_CMPXCHG128 1
 #elif defined(__aarch64__)
 /* Through gcc 8, aarch64 has no support for 128-bit at all.  */
 static inline Int128 atomic16_cmpxchg(Int128 *ptr, Int128 cmp, Int128 new)
diff --git a/meson.build b/meson.build
index 1ec3f72edc..d8c4e76e7b 100644
--- a/meson.build
+++ b/meson.build
@@ -2224,7 +2224,8 @@ if has_int128
   int main(void)
   {
 unsigned __int128 x = 0, y = 0;
-__sync_val_compare_and_swap_16(, y, x);
+__atomic_compare_exchange_n(, , x, 0,
+__ATOMIC_RELAXED, __ATOMIC_RELAXED);
 return 0;
   }
 ''')
-- 
2.34.1




[PATCH 0/4] atomic: Friendlier assertions, avoidance of __sync

2022-10-24 Thread Richard Henderson
The current use of _Static_assert, via QEMU_BUILD_BUG_ON, requires
the user have #if conditionals to avoid the statement from appearing
in the preprocessed file at all.  Introduce a new primitive that
allows normal C conditionals and dead-code elimination.

Remove all use of __sync* builtins in favor of __atomic*.
We have required them since 47345e71247, last year, and
should have removed these at that point.  My bad.


r~


Richard Henderson (4):
  include/qemu/osdep: Add qemu_build_assert
  include/qemu/atomic: Use qemu_build_assert
  include/qemu/thread: Use qatomic_* functions
  include/qemu/atomic128: Avoid __sync_val_compare_and_swap_16

 include/qemu/atomic.h| 16 
 include/qemu/atomic128.h |  8 +---
 include/qemu/osdep.h |  8 
 include/qemu/thread.h|  8 
 meson.build  |  3 ++-
 5 files changed, 23 insertions(+), 20 deletions(-)

-- 
2.34.1




[PATCH 3/4] include/qemu/thread: Use qatomic_* functions

2022-10-24 Thread Richard Henderson
Use qatomic_*, which expands to __atomic_* in preference
to the "legacy" __sync_* functions.

Signed-off-by: Richard Henderson 
---
 include/qemu/thread.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index af19f2b3fc..976e1ab995 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -227,7 +227,7 @@ struct QemuSpin {
 
 static inline void qemu_spin_init(QemuSpin *spin)
 {
-__sync_lock_release(>value);
+qatomic_set(>value, 0);
 #ifdef CONFIG_TSAN
 __tsan_mutex_create(spin, __tsan_mutex_not_static);
 #endif
@@ -246,7 +246,7 @@ static inline void qemu_spin_lock(QemuSpin *spin)
 #ifdef CONFIG_TSAN
 __tsan_mutex_pre_lock(spin, 0);
 #endif
-while (unlikely(__sync_lock_test_and_set(>value, true))) {
+while (unlikely(qatomic_xchg(>value, true))) {
 while (qatomic_read(>value)) {
 cpu_relax();
 }
@@ -261,7 +261,7 @@ static inline bool qemu_spin_trylock(QemuSpin *spin)
 #ifdef CONFIG_TSAN
 __tsan_mutex_pre_lock(spin, __tsan_mutex_try_lock);
 #endif
-bool busy = __sync_lock_test_and_set(>value, true);
+bool busy = qatomic_xchg(>value, true);
 #ifdef CONFIG_TSAN
 unsigned flags = __tsan_mutex_try_lock;
 flags |= busy ? __tsan_mutex_try_lock_failed : 0;
@@ -280,7 +280,7 @@ static inline void qemu_spin_unlock(QemuSpin *spin)
 #ifdef CONFIG_TSAN
 __tsan_mutex_pre_unlock(spin, 0);
 #endif
-__sync_lock_release(>value);
+qatomic_store_release(>value, 0);
 #ifdef CONFIG_TSAN
 __tsan_mutex_post_unlock(spin, 0);
 #endif
-- 
2.34.1




[PATCH 2/4] include/qemu/atomic: Use qemu_build_assert

2022-10-24 Thread Richard Henderson
Change from QEMU_BUILD_BUG_ON, which requires ifdefs to avoid
problematic code, to qemu_build_assert, which can use C ifs.

Signed-off-by: Richard Henderson 
---
 include/qemu/atomic.h | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
index 7e8fc8e7cd..874134fd19 100644
--- a/include/qemu/atomic.h
+++ b/include/qemu/atomic.h
@@ -133,7 +133,7 @@
 
 #define qatomic_read(ptr)  \
 ({ \
-QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE); \
+qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
 qatomic_read__nocheck(ptr);\
 })
 
@@ -141,7 +141,7 @@
 __atomic_store_n(ptr, i, __ATOMIC_RELAXED)
 
 #define qatomic_set(ptr, i)  do {  \
-QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE); \
+qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
 qatomic_set__nocheck(ptr, i);  \
 } while(0)
 
@@ -159,27 +159,27 @@
 
 #define qatomic_rcu_read(ptr)  \
 ({ \
-QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE); \
+qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
 typeof_strip_qual(*ptr) _val;  \
 qatomic_rcu_read__nocheck(ptr, &_val); \
 _val;  \
 })
 
 #define qatomic_rcu_set(ptr, i) do {   \
-QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE); \
+qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
 __atomic_store_n(ptr, i, __ATOMIC_RELEASE);\
 } while(0)
 
 #define qatomic_load_acquire(ptr)   \
 ({  \
-QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE);  \
+qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
 typeof_strip_qual(*ptr) _val;   \
 __atomic_load(ptr, &_val, __ATOMIC_ACQUIRE);\
 _val;   \
 })
 
 #define qatomic_store_release(ptr, i)  do { \
-QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE);  \
+qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
 __atomic_store_n(ptr, i, __ATOMIC_RELEASE); \
 } while(0)
 
@@ -191,7 +191,7 @@
 })
 
 #define qatomic_xchg(ptr, i)({  \
-QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE);  \
+qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
 qatomic_xchg__nocheck(ptr, i);  \
 })
 
@@ -204,7 +204,7 @@
 })
 
 #define qatomic_cmpxchg(ptr, old, new)({\
-QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE);  \
+qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
 qatomic_cmpxchg__nocheck(ptr, old, new);\
 })
 
-- 
2.34.1




[PATCH 1/4] include/qemu/osdep: Add qemu_build_assert

2022-10-24 Thread Richard Henderson
This differs from assert, in that with optimization enabled it
triggers at build-time.  It differs from QEMU_BUILD_BUG_ON,
aka _Static_assert, in that it is sensitive to control flow
and is subject to dead-code elimination.

Signed-off-by: Richard Henderson 
---
 include/qemu/osdep.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index b1c161c035..2276094729 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -186,6 +186,14 @@ void QEMU_ERROR("code path is reachable")
 #define qemu_build_not_reached()  g_assert_not_reached()
 #endif
 
+/**
+ * qemu_build_assert()
+ *
+ * The compiler, during optimization, is expected to prove that the
+ * assertion is true.
+ */
+#define qemu_build_assert(test)  while (!(test)) qemu_build_not_reached()
+
 /*
  * According to waitpid man page:
  * WCOREDUMP
-- 
2.34.1




Re: [PATCH v2 07/11] hw/intc/xics: Avoid dynamic stack allocation

2022-10-24 Thread Richard Henderson

On 8/20/22 01:39, Peter Maydell wrote:

From: Philippe Mathieu-Daudé

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé
Acked-by: David Gibson
Reviewed-by: Greg Kurz
Signed-off-by: Peter Maydell
---
  hw/intc/xics.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 06/11] hw/ppc/pnv: Avoid dynamic stack allocation

2022-10-24 Thread Richard Henderson

On 8/20/22 01:39, Peter Maydell wrote:

From: Philippe Mathieu-Daudé

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé
Acked-by: David Gibson
Signed-off-by: Peter Maydell
Reviewed-by: Peter Maydell
---
  hw/ppc/pnv.c   | 4 ++--
  hw/ppc/spapr.c | 8 
  hw/ppc/spapr_pci_nvlink2.c | 2 +-
  3 files changed, 7 insertions(+), 7 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 04/11] io/channel-websock: Replace strlen(const_str) by sizeof(const_str) - 1

2022-10-24 Thread Richard Henderson

On 8/20/22 01:39, Peter Maydell wrote:

From: Philippe Mathieu-Daudé

The combined_key[... QIO_CHANNEL_WEBSOCK_GUID_LEN ...] array in
qio_channel_websock_handshake_send_res_ok() expands to a call
to strlen(QIO_CHANNEL_WEBSOCK_GUID), and the compiler doesn't
realize the string is const, so consider combined_key[] being
a variable-length array.

To remove the variable-length array, we provide it a hint to
the compiler by using sizeof() - 1 instead of strlen().

Signed-off-by: Philippe Mathieu-Daudé
Reviewed-by: Daniel P. Berrangé
Signed-off-by: Peter Maydell
---
  io/channel-websock.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 02/11] chardev/baum: Use definitions to avoid dynamic stack allocation

2022-10-24 Thread Richard Henderson

On 8/20/22 01:39, Peter Maydell wrote:

From: Philippe Mathieu-Daudé

We know 'x * y' will be at most 'X_MAX * Y_MAX' (which is not
a big value, it is actually 84). Instead of having the compiler
use variable-length array, declare an array able to hold the
maximum 'x * y'.

Signed-off-by: Philippe Mathieu-Daudé
Reviewed-by: Marc-André Lureau
Reviewed-by: Samuel Thibault
Signed-off-by: Peter Maydell
---
  chardev/baum.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 03/11] chardev/baum: Avoid dynamic stack allocation

2022-10-24 Thread Richard Henderson

On 8/20/22 01:39, Peter Maydell wrote:

From: Philippe Mathieu-Daudé

Use autofree heap allocation instead of variable-length
array on the stack.

Signed-off-by: Philippe Mathieu-Daudé
Reviewed-by: Marc-André Lureau
Reviewed-by: Samuel Thibault
Signed-off-by: Peter Maydell
---
  chardev/baum.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 01/11] chardev/baum: Replace magic values by X_MAX / Y_MAX definitions

2022-10-24 Thread Richard Henderson

On 8/20/22 01:39, Peter Maydell wrote:

From: Philippe Mathieu-Daudé

Replace '84' magic value by the X_MAX definition, and '1' by Y_MAX.

Signed-off-by: Philippe Mathieu-Daudé
Reviewed-by: Marc-André Lureau
Reviewed-by: Samuel Thibault
Signed-off-by: Peter Maydell
---
  chardev/baum.c | 11 +++
  1 file changed, 7 insertions(+), 4 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 00/26] target/s390x: pc-relative translation blocks

2022-10-24 Thread Richard Henderson

Ping.

On 10/6/22 13:43, Richard Henderson wrote:

This is the S390 specific changes required to reduce the
amount of translation for address space randomization.

Begin with some generic cleanups, then prepare by using
displacements instead of addresses when possible, then
add some tcg infrastructure to avoid a code gen ugly,
then finalize the conversion.


r~


Richard Henderson (26):
   target/s390x: Use tcg_constant_* in local contexts
   target/s390x: Use tcg_constant_* for DisasCompare
   target/s390x: Use tcg_constant_i32 for fpinst_extract_m34
   target/s390x: Use tcg_constant_* in translate_vx.c.inc
   target/s390x: Change help_goto_direct to work on displacements
   target/s390x: Introduce gen_psw_addr_disp
   target/s390x: Remove pc argument to pc_to_link_into
   target/s390x: Use gen_psw_addr_disp in pc_to_link_info
   target/s390x: Use gen_psw_addr_disp in save_link_info
   target/s390x: Use gen_psw_addr_disp in op_sam
   target/s390x: Use ilen instead in branches
   target/s390x: Move masking of psw.addr to cpu_get_tb_cpu_state
   target/s390x: Add disp argument to update_psw_addr
   target/s390x: Don't set gbea for user-only
   target/s390x: Introduce per_enabled
   target/s390x: Disable conditional branch-to-next for PER
   target/s390x: Introduce help_goto_indirect
   target/s390x: Split per_branch
   target/s390x: Simplify help_branch
   target/s390x: Split per_breaking_event from per_branch_*
   target/s390x: Remove PER check from use_goto_tb
   target/s390x: Pass original r2 register to BCR
   tcg: Pass TCGTempKind to tcg_temp_new_internal
   tcg: Introduce tcg_temp_ebb_new_*
   tcg: Introduce tcg_temp_is_normal_*
   target/s390x: Enable TARGET_TB_PCREL

  include/tcg/tcg-op.h|   4 +
  include/tcg/tcg.h   |  49 +-
  target/s390x/cpu-param.h|   1 +
  target/s390x/cpu.h  |  13 +-
  target/s390x/cpu.c  |  12 +
  target/s390x/tcg/translate.c| 858 
  tcg/tcg.c   |  14 +-
  target/s390x/tcg/translate_vx.c.inc |  45 +-
  target/s390x/tcg/insn-data.def  |   2 +-
  9 files changed, 461 insertions(+), 537 deletions(-)






Re: [PATCH 00/24] accel/tcg: Rewrite user-only vma tracking

2022-10-24 Thread Richard Henderson

Ping.

On 10/6/22 13:10, Richard Henderson wrote:

The primary motivator here are the numerous bug reports (e.g. #290)
about not being able to handle very large memory allocations.
I presume all or most of these are due to guest use of the clang
address sanitizer, which allocates a massive shadow vma.

This patch set copies the linux kernel code for interval trees,
which is what the kernel itself uses for managing vmas.  I then
purge all (real) use of PageDesc from user-only.  This is easy
for user-only because everything tricky happens under mmap_lock();

I have thought only briefly about using interval trees for system
mode too, but the locking situation there is more difficult.  So
for now that code gets moved around but not substantially changed.

The test case from #290 is added to test/tcg/multiarch/.
Before this patch set, on my moderately beefy laptop, it takes 39s
and has an RSS of 28GB before the qemu process is killed.  After
the patch set, the test case successfully allocates 16TB and
completes in 0.013s.


r~


Richard Henderson (24):
   util: Add interval-tree.c
   accel/tcg: Make page_alloc_target_data allocation constant
   accel/tcg: Remove disabled debug in translate-all.c
   accel/tcg: Split out PageDesc to internal.h
   accel/tcg: Split out tb-maint.c
   accel/tcg: Move assert_no_pages_locked to internal.h
   accel/tcg: Drop cpu_get_tb_cpu_state from TARGET_HAS_PRECISE_SMC
   accel/tcg: Remove duplicate store to tb->page_addr[]
   accel/tcg: Introduce tb_{set_}page_addr{0,1}
   accel/tcg: Rename tb_invalidate_phys_page
   accel/tcg: Rename tb_invalidate_phys_page_range and drop end parameter
   accel/tcg: Unify declarations of tb_invalidate_phys_range
   accel/tcg: Use tb_invalidate_phys_page in page_set_flags
   accel/tcg: Call tb_invalidate_phys_page for PAGE_RESET
   accel/tcg: Use interval tree for TBs in user-only mode
   accel/tcg: Use page_reset_target_data in page_set_flags
   accel/tcg: Use tb_invalidate_phys_range in page_set_flags
   accel/tcg: Move TARGET_PAGE_DATA_SIZE impl to user-exec.c
   accel/tcg: Simplify page_get/alloc_target_data
   accel/tcg: Use interval tree for TARGET_PAGE_DATA_SIZE
   accel/tcg: Move page_{get,set}_flags to user-exec.c
   accel/tcg: Use interval tree for user-only page tracking
   accel/tcg: Move PageDesc tree into tb-maint.c for system
   accel/tcg: Move remainder of page locking to tb-maint.c

  accel/tcg/internal.h|   40 +
  include/exec/cpu-all.h  |   22 +-
  include/exec/exec-all.h |   75 +-
  include/exec/ram_addr.h |2 -
  include/exec/translate-all.h|8 +-
  include/qemu/interval-tree.h|   99 ++
  target/arm/cpu.h|8 +
  target/arm/internals.h  |4 -
  accel/tcg/cpu-exec.c|9 +-
  accel/tcg/tb-maint.c| 1222 ++
  accel/tcg/translate-all.c   | 1683 +--
  accel/tcg/translator.c  |9 +-
  accel/tcg/user-exec.c   |  662 
  bsd-user/mmap.c |2 -
  cpu.c   |4 +-
  linux-user/mmap.c   |4 -
  target/arm/mte_helper.c |5 -
  tests/tcg/multiarch/test-vma.c  |   22 +
  tests/unit/test-interval-tree.c |  209 
  util/interval-tree.c|  881 
  accel/tcg/meson.build   |1 +
  tests/unit/meson.build  |1 +
  util/meson.build|1 +
  23 files changed, 3235 insertions(+), 1738 deletions(-)
  create mode 100644 include/qemu/interval-tree.h
  create mode 100644 accel/tcg/tb-maint.c
  create mode 100644 tests/tcg/multiarch/test-vma.c
  create mode 100644 tests/unit/test-interval-tree.c
  create mode 100644 util/interval-tree.c






Re: [PATCH 0/3] tcg/sparc: Remove support for sparc32plus

2022-10-24 Thread Richard Henderson

Ping.

On 10/17/22 16:24, Richard Henderson wrote:

While working on other cleanup/new features wrt calling conventions,
I noticed, again, that I am unable to test sparc32plus.  The current
debian installation in the gcc compile farm is for sparc64, and that
is also what gentoo is currently building.

It has been 10 years since qemu dropped support for pure sparc32.
I recon it's about time to finish the job.


r~


Richard Henderson (3):
   tcg/sparc: Remove support for sparc32plus
   tcg/sparc64: Rename from tcg/sparc
   tcg/sparc64: Remove sparc32plus constraints

  meson.build |   4 +-
  tcg/{sparc => sparc64}/tcg-target-con-set.h |  16 +-
  tcg/{sparc => sparc64}/tcg-target-con-str.h |   3 -
  tcg/{sparc => sparc64}/tcg-target.h |  11 -
  tcg/tcg.c   |  75 +-
  tcg/{sparc => sparc64}/tcg-target.c.inc | 275 +---
  6 files changed, 78 insertions(+), 306 deletions(-)
  rename tcg/{sparc => sparc64}/tcg-target-con-set.h (69%)
  rename tcg/{sparc => sparc64}/tcg-target-con-str.h (77%)
  rename tcg/{sparc => sparc64}/tcg-target.h (95%)
  rename tcg/{sparc => sparc64}/tcg-target.c.inc (91%)






Re: [RFC PATCH] main-loop: introduce WITH_QEMU_IOTHREAD_LOCK

2022-10-24 Thread Richard Henderson

On 10/25/22 03:19, Alex Bennée wrote:

This helper intends to ape our other auto-unlocking helpers with
WITH_QEMU_LOCK_GUARD. The principle difference is the iothread lock
is often nested needs a little extra book keeping to ensure we don't
double lock or unlock a lock taken higher up the call chain.

Convert some of the common routines that follow this pattern to use
the new wrapper.

Signed-off-by: Alex Bennée
---
  include/qemu/main-loop.h | 41 
  hw/core/cpu-common.c | 10 ++
  util/rcu.c   | 40 ---
  ui/cocoa.m   | 18 --
  4 files changed, 63 insertions(+), 46 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 1/6] target/mips: Introduce register access helper functions

2022-10-24 Thread Richard Henderson

On 10/25/22 01:23, Jiaxun Yang wrote:

Introduce register access functions with value extend capability
to prepare for decodetree based translation implmentation.

Signed-off-by: Jiaxun Yang 
---
  target/mips/tcg/translate.c | 143 +++-
  target/mips/tcg/translate.h |  54 ++
  2 files changed, 196 insertions(+), 1 deletion(-)

diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index de1511baaf..b5d595ef34 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -1196,6 +1196,17 @@ enum {
  MMI_OPC_MADDU1 = 0x21 | MMI_OPC_CLASS_MMI,
  };
  
+/*

+ * If an operation is being performed on less than TARGET_LONG_BITS,
+ * it may require the inputs to be sign- or zero-extended; which will
+ * depend on the exact operation being performed.
+ */
+typedef enum {
+EXT_NONE,
+EXT_SIGN,
+EXT_ZERO
+} DisasExtend;


Remove as duplicate -- this is also in translate.h.  I'm surprised no compile 
error.

Otherwise,
Reviewed-by: Richard Henderson 


r~




Re: [PATCH v4] linux-user: Add close_range() syscall

2022-10-24 Thread Richard Henderson

On 10/25/22 06:43, Helge Deller wrote:

+abi_long maxfd = arg2;
+
+if ((sizeof(abi_long) == 4 && arg2 == (abi_long)0x7FFFUL) ||
+(sizeof(abi_long) == 8 && arg2 == 
(abi_long)0x7FFFULL)) {
+maxfd = target_fd_max;
+}
+
+for (fd = arg1; fd < maxfd; fd++) {


Why do we need explicit checks for INT32/64_MAX?  If the guest passes 
0x7FFEULL, do we really need to iterate over all of those impossible values?


I should think some expression involving MIN() is in order.


r~



Re: [PATCH v3] linux-user: Add guest memory layout to exception dump

2022-10-24 Thread Richard Henderson

On 10/25/22 06:18, Helge Deller wrote:

When the emulation stops with a hard exception it's very useful for
debugging purposes to dump the current guest memory layout (for an
example see /proc/self/maps) beside the CPU registers.

The open_self_maps() function provides such a memory dump, but since
it's located in the syscall.c file, various changes (add #includes, make
this function externally visible, ...) are needed to be able to call it
from the existing EXCP_DUMP() macro.


/proc/self/maps has all of the qemu mappings in it as well.

The page_dump() function provides exclusively the guest mappings.


r~



Re: [PATCH 9/9] target/s390x: Use Int128 for passing float128

2022-10-24 Thread Richard Henderson

On 10/25/22 04:01, Philippe Mathieu-Daudé wrote:

On 21/10/22 09:30, Richard Henderson wrote:

Signed-off-by: Richard Henderson 
---
  target/s390x/helper.h  | 32 ++---
  target/s390x/tcg/fpu_helper.c  | 88 ++
  target/s390x/tcg/translate.c   | 76 -
  target/s390x/tcg/insn-data.def | 30 ++--
  4 files changed, 121 insertions(+), 105 deletions(-)



diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index d1ffbb8710..8023bbab2f 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -305,6 +305,18 @@ static TCGv_i64 load_freg32_i64(int reg)
  return r;
  }
+static TCGv_i128 load_freg_128(int reg)
+{
+    TCGv_i64 h = load_freg(reg);
+    TCGv_i64 l = load_freg(reg + 2);
+    TCGv_i128 r = tcg_temp_new_i128();


Maybe rename as load_freg_new_128() to make emphasis on the returned
TCGv need to be freed?


It's no different from the other load_freg* functions just above.  As with those, the 
result is assigned to one of the DisasOps slots, and all of those slots are freed at the 
end of each instruction.


r~



Re: [PATCH] tests/tcg/nios2: Tweak 10m50-ghrd.ld

2022-10-24 Thread Alex Bennée


Richard Henderson  writes:

> More closely follow the default linker script for nios2.
> This magically fixes a problem resolving .got relocs from
> the toolchain's libgcc.a.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1258
> Signed-off-by: Richard Henderson 

Queued to testing/next, thanks.

-- 
Alex Bennée



[PATCH v4] linux-user: Add close_range() syscall

2022-10-24 Thread Helge Deller
Signed-off-by: Helge Deller 
---
Changes:
v4: Fix check for arg2
v3: fd_trans_unregister() only called if close_range() doesn't fail
v2: consider CLOSE_RANGE_CLOEXEC flag

diff --git a/linux-user/strace.list b/linux-user/strace.list
index 3df2184580..cd995e5d56 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -103,6 +103,9 @@
 #ifdef TARGET_NR_close
 { TARGET_NR_close, "close" , "%s(%d)", NULL, NULL },
 #endif
+#ifdef TARGET_NR_close_range
+{ TARGET_NR_close_range, "close_range" , "%s(%u,%u,%u)", NULL, NULL },
+#endif
 #ifdef TARGET_NR_connect
 { TARGET_NR_connect, "connect" , "%s(%d,%#x,%d)", NULL, NULL },
 #endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 12195d4e99..984039f928 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -339,6 +339,13 @@ _syscall3(int,sys_syslog,int,type,char*,bufp,int,len)
 #ifdef __NR_exit_group
 _syscall1(int,exit_group,int,error_code)
 #endif
+#if defined(__NR_close_range) && defined(TARGET_NR_close_range)
+#define __NR_sys_close_range __NR_close_range
+_syscall3(int,sys_close_range,int,first,int,last,int,flags)
+#ifndef CLOSE_RANGE_CLOEXEC
+#define CLOSE_RANGE_CLOEXEC (1U << 2)
+#endif
+#endif
 #if defined(__NR_futex)
 _syscall6(int,sys_futex,int *,uaddr,int,op,int,val,
   const struct timespec *,timeout,int *,uaddr2,int,val3)
@@ -8735,6 +8742,24 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
 case TARGET_NR_close:
 fd_trans_unregister(arg1);
 return get_errno(close(arg1));
+#if defined(__NR_close_range) && defined(TARGET_NR_close_range)
+case TARGET_NR_close_range:
+ret = get_errno(sys_close_range(arg1, arg2, arg3));
+if (ret == 0 && !(arg3 & CLOSE_RANGE_CLOEXEC)) {
+abi_long fd;
+abi_long maxfd = arg2;
+
+if ((sizeof(abi_long) == 4 && arg2 == (abi_long)0x7FFFUL) ||
+(sizeof(abi_long) == 8 && arg2 == 
(abi_long)0x7FFFULL)) {
+maxfd = target_fd_max;
+}
+
+for (fd = arg1; fd < maxfd; fd++) {
+fd_trans_unregister(fd);
+}
+}
+return ret;
+#endif

 case TARGET_NR_brk:
 return do_brk(arg1);



Re: [PATCH v2] kset: fix memory leak when kset_register() returns error

2022-10-24 Thread Luben Tuikov
On 2022-10-24 08:19, Yang Yingliang wrote:
> Inject fault while loading module, kset_register() may fail.
> If it fails, the name allocated by kobject_set_name() which
> is called before kset_register() is leaked, because refcount
> of kobject is hold in kset_init().

"is hold" --> "was set".

Also, I'd say "which must be called" instead of "is", since
we cannot register kobj/kset without a name--the kobj code crashes,
and we want to make this clear. IOW, a novice user may wonder
where "is" it called, as opposed to learning that they "must"
call it to allocate/set a name, before calling kset_register().

So, I'd say this:

"If it fails, the name allocated by kobject_set_name() which must
 be called before a call to kset_regsiter() is leaked, since
 refcount of kobj was set in kset_init()."

> 
> As a kset may be embedded in a larger structure which needs
> be freed in release() function or error path in callers, we

Drop "As", start with "A kset". "which needs _to_ be".
Also please specify that the release is part of the ktype,
like this:

"A kset may be embedded in a larger structure which needs to be
 freed in ktype.release() or error path in callers, we ..."

> can not call kset_put() in kset_register(), or it will cause
> double free, so just call kfree_const() to free the name and
> set it to NULL.
> 
> With this fix, the callers don't need to care about the name
> freeing and call an extra kset_put() if kset_register() fails.

This is unclear because you're *missing* a verb:
"and call an extra kset_put()".
Please add the proper verb _between_ "and call", something like,

"With this fix, the callers don't need to care about freeing
 the name of the kset, and _can_ call kset_put() if kset_register() fails."

Choose a proper verb here: can, should, cannot, should not, etc.

We can do this because you set "kset.kobj.name to NULL, and this
is checked for in kobject_cleanup(). We just need to stipulate
whether they should/shouldn't have to call kset_put(), or can free the kset
and/or the embedding object themselves. This really depends
on how we want kset_register() to behave in the future, and on
user's own ktype.release implementation...

> 
> Suggested-by: Luben Tuikov 
> Signed-off-by: Yang Yingliang 
> ---
> v1 -> v2:
>   Free name inside of kset_register() instead of calling kset_put()
>   in drivers.
> ---
>  lib/kobject.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/kobject.c b/lib/kobject.c
> index a0b2dbfcfa23..3409a89c81e5 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -834,6 +834,9 @@ EXPORT_SYMBOL_GPL(kobj_sysfs_ops);
>  /**
>   * kset_register() - Initialize and add a kset.
>   * @k: kset.
> + *
> + * NOTE: On error, the kset.kobj.name allocated by() kobj_set_name()
> + * which is called before kset_register() in caller need be freed.
>   */
>  int kset_register(struct kset *k)
>  {
> @@ -844,8 +847,11 @@ int kset_register(struct kset *k)
>  
>   kset_init(k);
>   err = kobject_add_internal(>kobj);
> - if (err)
> + if (err) {
> + kfree_const(k->kobj.name);
> + k->kobj.name = NULL;
>   return err;
> + }

This looks good. It's good you set kset.kobj.name to NULL, so that
recovery/free paths don't get confused. Waiting for v3.

(I guess this is no different than what we currently do in kobject_cleanup(),
 so I see it as safe, no-surprises implementation.)

Regards,
Luben




Re: [PATCH v2] kset: fix memory leak when kset_register() returns error

2022-10-24 Thread Luben Tuikov
On 2022-10-24 17:06, Luben Tuikov wrote:
> On 2022-10-24 08:19, Yang Yingliang wrote:
>> Inject fault while loading module, kset_register() may fail.
>> If it fails, the name allocated by kobject_set_name() which
>> is called before kset_register() is leaked, because refcount
>> of kobject is hold in kset_init().
> 
> "is hold" --> "was set".
> 
> Also, I'd say "which must be called" instead of "is", since
> we cannot register kobj/kset without a name--the kobj code crashes,
> and we want to make this clear. IOW, a novice user may wonder
> where "is" it called, as opposed to learning that they "must"
> call it to allocate/set a name, before calling kset_register().
> 
> So, I'd say this:
> 
> "If it fails, the name allocated by kobject_set_name() which must
>  be called before a call to kset_regsiter() is leaked, since
>  refcount of kobj was set in kset_init()."

Actually, to be a bit more clear:

"If kset_register() fails, the name allocated by kobject_set_name(),
 namely kset.kobj.name, which must be called before a call to kset_register(),
 may be leaked, if the caller doesn't explicitly free it, say by calling 
kset_put().

 To mitigate this, we free the name in kset_register() when an error is 
encountered,
 i.e. when kset_register() returns an error."

> 
>>
>> As a kset may be embedded in a larger structure which needs
>> be freed in release() function or error path in callers, we
> 
> Drop "As", start with "A kset". "which needs _to_ be".
> Also please specify that the release is part of the ktype,
> like this:
> 
> "A kset may be embedded in a larger structure which needs to be
>  freed in ktype.release() or error path in callers, we ..."
> 
>> can not call kset_put() in kset_register(), or it will cause
>> double free, so just call kfree_const() to free the name and
>> set it to NULL.
>>
>> With this fix, the callers don't need to care about the name
>> freeing and call an extra kset_put() if kset_register() fails.
> 
> This is unclear because you're *missing* a verb:
> "and call an extra kset_put()".
> Please add the proper verb _between_ "and call", something like,
> 
> "With this fix, the callers don't need to care about freeing
>  the name of the kset, and _can_ call kset_put() if kset_register() fails."
> 
> Choose a proper verb here: can, should, cannot, should not, etc.
> 
> We can do this because you set "kset.kobj.name to NULL, and this
> is checked for in kobject_cleanup(). We just need to stipulate
> whether they should/shouldn't have to call kset_put(), or can free the kset
> and/or the embedding object themselves. This really depends
> on how we want kset_register() to behave in the future, and on
> user's own ktype.release implementation...

Forgot "may", "may not".

So, do we want to say "may call kset_put()", like:

"With this fix, the callers need not care about freeing
 the name of the kset, and _may_ call kset_put() if kset_register() fails."

Or do we want to say "should" or even "must"--it really depends on
what else is (would be) going on in kobj registration.

Although, the user may have additional work to be done in the ktype.release()
callback for the embedding object. It would be good to give them the freedom,
i.e. "may", to call kset_put(). If that's not the case, this must be explicitly
stipulated with the proper verb.

Regards,
Luben




Re: [PATCH] tests/qtest/ac97-test: add up-/downsampling tests

2022-10-24 Thread Volker Rümelin

Am 24.10.22 um 10:13 schrieb Marc-André Lureau:

Hi

On Mon, Oct 24, 2022 at 9:28 AM Volker Rümelin  
wrote:


Test if the audio subsystem can handle extreme up- and down-
sampling ratios like 44100/1 and 1/44100. For some time these
used to trigger QEMU aborts. The test was taken from
https://gitlab.com/qemu-project/qemu/-/issues/71 where it was
used to demonstrate a very different issue.

Suggested-by: Marc-André Lureau 
Signed-off-by: Volker Rümelin 


Thanks for working on this

It seems to show something different though:
"
A bug was just triggered in audio_calloc
Save all your work and restart without audio
I am sorry
"

AUD_open_out() is called with audsettings: {freq = 1, nchannels = 2, 
fmt = AUDIO_FORMAT_S16, endianness = 0}


And that's it. Any idea?


Hi,

the scary message is expected and doesn't mean this qos-test failed. 
This is the currently not so silent 'the audio subsystem should (...) 
silently give up' case.


The noaudio backend uses a mixing-engine buffer size of 1024 audio 
frames and AUD_open_* tries to allocate memory for 1024/44100 = 0.0232 
audio frames for the resample buffer in audio_pcm_sw_alloc_resources_*. 
This allocation fails and produces the scary message. The error is 
handled correctly and AUD_open_* returns NULL. AUD_read and AUD_write 
return early if this pointer is NULL and the audio frontend callback 
functions will also not be called because the audio_frontend_frames_* 
functions return 0 in this case.


If you would like to see the two tests fail, revert commit 0cbc8bd469 
("audio: remove abort() in audio_bug()") and rerun qos-test.


With best regards,
Volker




---
 tests/qtest/ac97-test.c | 40 +++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/ac97-test.c b/tests/qtest/ac97-test.c
index 74103efdfa..ce25f1d588 100644
--- a/tests/qtest/ac97-test.c
+++ b/tests/qtest/ac97-test.c
@@ -42,16 +42,54 @@ static void *ac97_create(void *pci_bus,
QGuestAllocator *alloc, void *addr)
     return >obj;
 }

+/*
+ * This is rather a test of the audio subsystem and not an AC97
test. Test if
+ * the audio subsystem can handle a 44100/1 upsample ratio. With
an audio
+ * mixing buffer size of 1024 audio frames, the audio subsystem
should either
+ * generate 1024 output frames from 0.0232 input frames or
silently give up.
+ */
+static void ac97_playback_upsample(void *obj, void *data,
QGuestAllocator *alloc)
+{
+    QAC97 *ac97 = obj;
+    QPCIDevice *dev = >dev;
+    QPCIBar bar0;
+
+    qpci_device_enable(dev);
+    bar0 = qpci_iomap(dev, 0, NULL);
+    qpci_io_writew(dev, bar0, 0x2c, 0x1);
+}
+
+/*
+ * This test is similar to the playback upsample test. This time
the audio
+ * subsystem should either generate 0.0232 audio frames from 1024
input frames
+ * or silently give up.
+ */
+static void ac97_record_downsample(void *obj, void *data,
QGuestAllocator *alloc)
+{
+    QAC97 *ac97 = obj;
+    QPCIDevice *dev = >dev;
+    QPCIBar bar0;
+
+    qpci_device_enable(dev);
+    bar0 = qpci_iomap(dev, 0, NULL);
+    qpci_io_writew(dev, bar0, 0x32, 0x1);
+}
+
 static void ac97_register_nodes(void)
 {
     QOSGraphEdgeOptions opts = {
-        .extra_device_opts = "addr=04.0",
+        .extra_device_opts = "addr=04.0,audiodev=snd0",
+        .after_cmd_line = "-audiodev none,id=snd0"
+ ",out.frequency=44100,in.frequency=44100",
     };
     add_qpci_address(, &(QPCIAddress) { .devfn =
QPCI_DEVFN(4, 0) });

     qos_node_create_driver("AC97", ac97_create);
     qos_node_produces("AC97", "pci-device");
     qos_node_consumes("AC97", "pci-bus", );
+
+    qos_add_test("playback_upsample", "AC97",
ac97_playback_upsample, NULL);
+    qos_add_test("record_downsample", "AC97",
ac97_record_downsample, NULL);
 }

 libqos_init(ac97_register_nodes);
-- 
2.35.3





--
Marc-André Lureau





Re: [PATCH v2] hw/acpi/erst.c: Fix memory handling issues

2022-10-24 Thread Eric DeVolder




On 10/24/22 10:42, Christian A. Ehrhardt wrote:

- Fix memset argument order: The second argument is
   the value, the length goes last.
- Fix an integer overflow reported by Alexander Bulekov.

Both issues allow the guest to overrun the host buffer
allocated for the ERST memory device.

Cc: Eric DeVolder 
Cc: qemu-sta...@nongnu.org
Fixes: f7e26ffa590 ("ACPI ERST: support for ACPI ERST feature")
Tested-by: Alexander Bulekov 
Signed-off-by: Christian A. Ehrhardt 


Reviewed-by: Eric DeVolder 

Thanks Christian!
eric


---
  hw/acpi/erst.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
index df856b2669..aefcc03ad6 100644
--- a/hw/acpi/erst.c
+++ b/hw/acpi/erst.c
@@ -635,7 +635,7 @@ static unsigned read_erst_record(ERSTDeviceState *s)
  if (record_length < UEFI_CPER_RECORD_MIN_SIZE) {
  rc = STATUS_FAILED;
  }
-if ((s->record_offset + record_length) > exchange_length) {
+if (record_length > exchange_length - s->record_offset) {
  rc = STATUS_FAILED;
  }
  /* If all is ok, copy the record to the exchange buffer */
@@ -684,7 +684,7 @@ static unsigned write_erst_record(ERSTDeviceState *s)
  if (record_length < UEFI_CPER_RECORD_MIN_SIZE) {
  return STATUS_FAILED;
  }
-if ((s->record_offset + record_length) > exchange_length) {
+if (record_length > exchange_length - s->record_offset) {
  return STATUS_FAILED;
  }
  
@@ -716,7 +716,7 @@ static unsigned write_erst_record(ERSTDeviceState *s)

  if (nvram) {
  /* Write the record into the slot */
  memcpy(nvram, exchange, record_length);
-memset(nvram + record_length, exchange_length - record_length, 0xFF);
+memset(nvram + record_length, 0xFF, exchange_length - record_length);
  /* If a new record, increment the record_count */
  if (!record_found) {
  uint32_t record_count;





soc_name for supermicrox11-bmc machine: ast2400-a1 or ast2500-a1 ?

2022-10-24 Thread Guenter Roeck

Hi,

I always wondered why I am having trouble running Linux on supermicrox11-bmc.
Building the kernel with aspeed_g4_defconfig results in its clock running
at ~20x the real clock speed, and kernels built with aspeed_g5_defconfig
do not boot at all.

I ended up spending some time on it last weekend and noticed that the SOC
is configured to ast2400-a1. However, the Supermicro documentation as well
as the devicetree file in the Linux kernel suggest that the SOC on the X11
board is an ast2500.

Indeed, it turns out that all my problems are gone if I change the SOC
to ast2500-a1 and use aspeed_g5_defconfig to build the Linux kernel.

Was there a reason to select ast2400-a1 for this machine, or is that
a bug ?

Thanks,
Guenter



Re: [PATCH v2] hw/acpi/erst.c: Fix memory handling issues

2022-10-24 Thread Michael S. Tsirkin
On Mon, Oct 24, 2022 at 05:42:33PM +0200, Christian A. Ehrhardt wrote:
> - Fix memset argument order: The second argument is
>   the value, the length goes last.
> - Fix an integer overflow reported by Alexander Bulekov.
> 
> Both issues allow the guest to overrun the host buffer
> allocated for the ERST memory device.
> 
> Cc: Eric DeVolder  Cc: Alexander Bulekov 
> Cc: qemu-sta...@nongnu.org
> Fixes: f7e26ffa590 ("ACPI ERST: support for ACPI ERST feature")
> Tested-by: Alexander Bulekov 
> Signed-off-by: Christian A. Ehrhardt 


queued, thanks!

> ---
>  hw/acpi/erst.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> index df856b2669..aefcc03ad6 100644
> --- a/hw/acpi/erst.c
> +++ b/hw/acpi/erst.c
> @@ -635,7 +635,7 @@ static unsigned read_erst_record(ERSTDeviceState *s)
>  if (record_length < UEFI_CPER_RECORD_MIN_SIZE) {
>  rc = STATUS_FAILED;
>  }
> -if ((s->record_offset + record_length) > exchange_length) {
> +if (record_length > exchange_length - s->record_offset) {
>  rc = STATUS_FAILED;
>  }
>  /* If all is ok, copy the record to the exchange buffer */
> @@ -684,7 +684,7 @@ static unsigned write_erst_record(ERSTDeviceState *s)
>  if (record_length < UEFI_CPER_RECORD_MIN_SIZE) {
>  return STATUS_FAILED;
>  }
> -if ((s->record_offset + record_length) > exchange_length) {
> +if (record_length > exchange_length - s->record_offset) {
>  return STATUS_FAILED;
>  }
>  
> @@ -716,7 +716,7 @@ static unsigned write_erst_record(ERSTDeviceState *s)
>  if (nvram) {
>  /* Write the record into the slot */
>  memcpy(nvram, exchange, record_length);
> -memset(nvram + record_length, exchange_length - record_length, 0xFF);
> +memset(nvram + record_length, 0xFF, exchange_length - record_length);
>  /* If a new record, increment the record_count */
>  if (!record_found) {
>  uint32_t record_count;
> -- 
> 2.34.1




[PATCH] linux-user: Add strace output for timer_settime64() syscall

2022-10-24 Thread Helge Deller
Add missing timer_settime64() strace output and specify format for
timer_settime().

Signed-off-by: Helge Deller 

diff --git a/linux-user/strace.list b/linux-user/strace.list
index cd995e5d56..3a898e2532 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -1534,7 +1534,10 @@
 { TARGET_NR_timer_gettime, "timer_gettime" , NULL, NULL, NULL },
 #endif
 #ifdef TARGET_NR_timer_settime
-{ TARGET_NR_timer_settime, "timer_settime" , NULL, NULL, NULL },
+{ TARGET_NR_timer_settime, "timer_settime" , "%s(%d,%d,%p,%p)", NULL, NULL },
+#endif
+#ifdef TARGET_NR_timer_settime64
+{ TARGET_NR_timer_settime64, "timer_settime64" , "%s(%d,%d,%p,%p)", NULL, NULL 
},
 #endif
 #ifdef TARGET_NR_timerfd
 { TARGET_NR_timerfd, "timerfd" , NULL, NULL, NULL },



[PATCH v3] linux-user: Add guest memory layout to exception dump

2022-10-24 Thread Helge Deller
When the emulation stops with a hard exception it's very useful for
debugging purposes to dump the current guest memory layout (for an
example see /proc/self/maps) beside the CPU registers.

The open_self_maps() function provides such a memory dump, but since
it's located in the syscall.c file, various changes (add #includes, make
this function externally visible, ...) are needed to be able to call it
from the existing EXCP_DUMP() macro.

This patch takes another approach by re-defining EXCP_DUMP() to call
target_exception_dump(), which is in syscall.c, consolidates the log
print functions and allows to add the call to dump the memory layout.

Beside a reduced code footprint, this approach keeps the changes across
the various callers minimal, and keeps EXCP_DUMP() highlighted as
important macro/function.

Signed-off-by: Helge Deller 
---

v3:
Fix build error in i386/cpu_loop.c

v2:
Based on feedback by Philippe Mathieu-Daudé, renamed the two functions
to excp_dump_file() and target_exception_dump(), and #define'ed
EXCP_DUMP() to target_exception_dump().
I intentionally did not replace all occurences of EXCP_DUMP() by
target_exception_dump() as I think it's unneccesary and not beneficial.
If this is really wished, I will send a v3.


diff --git a/linux-user/cpu_loop-common.h b/linux-user/cpu_loop-common.h
index 36ff5b14f2..e644d2ef90 100644
--- a/linux-user/cpu_loop-common.h
+++ b/linux-user/cpu_loop-common.h
@@ -23,18 +23,9 @@
 #include "exec/log.h"
 #include "special-errno.h"

-#define EXCP_DUMP(env, fmt, ...)\
-do {\
-CPUState *cs = env_cpu(env);\
-fprintf(stderr, fmt , ## __VA_ARGS__);  \
-fprintf(stderr, "Failing executable: %s\n", exec_path); \
-cpu_dump_state(cs, stderr, 0);  \
-if (qemu_log_separate()) {  \
-qemu_log(fmt, ## __VA_ARGS__);  \
-qemu_log("Failing executable: %s\n", exec_path);\
-log_cpu_state(cs, 0);   \
-}   \
-} while (0)
+void target_exception_dump(CPUArchState *env, const char *fmt, int code);
+#define EXCP_DUMP(env, fmt, code) \
+target_exception_dump(env, fmt, code)

 void target_cpu_copy_regs(CPUArchState *env, struct target_pt_regs *regs);
 #endif
diff --git a/linux-user/i386/cpu_loop.c b/linux-user/i386/cpu_loop.c
index 42837399bc..404f6d6634 100644
--- a/linux-user/i386/cpu_loop.c
+++ b/linux-user/i386/cpu_loop.c
@@ -308,8 +308,8 @@ void cpu_loop(CPUX86State *env)
 break;
 default:
 pc = env->segs[R_CS].base + env->eip;
-EXCP_DUMP(env, "qemu: 0x%08lx: unhandled CPU exception 0x%x - 
aborting\n",
-  (long)pc, trapnr);
+EXCP_DUMP(env, "qemu: unhandled CPU exception 0x%x - aborting\n",
+  trapnr);
 abort();
 }
 process_pending_signals(env);
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 2e954d8dbd..7d29c4c396 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -158,6 +158,7 @@
 #include "qapi/error.h"
 #include "fd-trans.h"
 #include "tcg/tcg.h"
+#include "cpu_loop-common.h"

 #ifndef CLONE_IO
 #define CLONE_IO0x8000  /* Clone io context */
@@ -8144,6 +8145,33 @@ static int is_proc_myself(const char *filename, const 
char *entry)
 return 0;
 }

+static void excp_dump_file(FILE *logfile, CPUArchState *env,
+  const char *fmt, int code)
+{
+if (logfile) {
+CPUState *cs = env_cpu(env);
+
+fprintf(logfile, fmt, code);
+fprintf(logfile, "Failing executable: %s\n", exec_path);
+cpu_dump_state(cs, logfile, 0);
+open_self_maps(env, fileno(logfile));
+}
+}
+
+void target_exception_dump(CPUArchState *env, const char *fmt, int code)
+{
+/* dump to console */
+excp_dump_file(stderr, env, fmt, code);
+
+/* dump to log file */
+if (qemu_log_separate()) {
+FILE *logfile = qemu_log_trylock();
+
+excp_dump_file(logfile, env, fmt, code);
+qemu_log_unlock(logfile);
+}
+}
+
 #if HOST_BIG_ENDIAN != TARGET_BIG_ENDIAN || \
 defined(TARGET_SPARC) || defined(TARGET_M68K) || defined(TARGET_HPPA)
 static int is_proc(const char *filename, const char *entry)



Re: [PATCH v3] linux-user: Add guest memory layout to exception dump

2022-10-24 Thread Laurent Vivier

Le 24/10/2022 à 22:18, Helge Deller a écrit :

When the emulation stops with a hard exception it's very useful for
debugging purposes to dump the current guest memory layout (for an
example see /proc/self/maps) beside the CPU registers.

The open_self_maps() function provides such a memory dump, but since
it's located in the syscall.c file, various changes (add #includes, make
this function externally visible, ...) are needed to be able to call it
from the existing EXCP_DUMP() macro.

This patch takes another approach by re-defining EXCP_DUMP() to call
target_exception_dump(), which is in syscall.c, consolidates the log
print functions and allows to add the call to dump the memory layout.

Beside a reduced code footprint, this approach keeps the changes across
the various callers minimal, and keeps EXCP_DUMP() highlighted as
important macro/function.

Signed-off-by: Helge Deller 
---

v3:
Fix build error in i386/cpu_loop.c

v2:
Based on feedback by Philippe Mathieu-Daudé, renamed the two functions
to excp_dump_file() and target_exception_dump(), and #define'ed
EXCP_DUMP() to target_exception_dump().
I intentionally did not replace all occurences of EXCP_DUMP() by
target_exception_dump() as I think it's unneccesary and not beneficial.
If this is really wished, I will send a v3.


diff --git a/linux-user/cpu_loop-common.h b/linux-user/cpu_loop-common.h
index 36ff5b14f2..e644d2ef90 100644
--- a/linux-user/cpu_loop-common.h
+++ b/linux-user/cpu_loop-common.h
@@ -23,18 +23,9 @@
  #include "exec/log.h"
  #include "special-errno.h"

-#define EXCP_DUMP(env, fmt, ...)\
-do {\
-CPUState *cs = env_cpu(env);\
-fprintf(stderr, fmt , ## __VA_ARGS__);  \
-fprintf(stderr, "Failing executable: %s\n", exec_path); \
-cpu_dump_state(cs, stderr, 0);  \
-if (qemu_log_separate()) {  \
-qemu_log(fmt, ## __VA_ARGS__);  \
-qemu_log("Failing executable: %s\n", exec_path);\
-log_cpu_state(cs, 0);   \
-}   \
-} while (0)
+void target_exception_dump(CPUArchState *env, const char *fmt, int code);
+#define EXCP_DUMP(env, fmt, code) \
+target_exception_dump(env, fmt, code)

  void target_cpu_copy_regs(CPUArchState *env, struct target_pt_regs *regs);
  #endif
diff --git a/linux-user/i386/cpu_loop.c b/linux-user/i386/cpu_loop.c
index 42837399bc..404f6d6634 100644
--- a/linux-user/i386/cpu_loop.c
+++ b/linux-user/i386/cpu_loop.c
@@ -308,8 +308,8 @@ void cpu_loop(CPUX86State *env)
  break;
  default:
  pc = env->segs[R_CS].base + env->eip;
-EXCP_DUMP(env, "qemu: 0x%08lx: unhandled CPU exception 0x%x - 
aborting\n",
-  (long)pc, trapnr);
+EXCP_DUMP(env, "qemu: unhandled CPU exception 0x%x - aborting\n",
+  trapnr);
  abort();
  }
  process_pending_signals(env);
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 2e954d8dbd..7d29c4c396 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -158,6 +158,7 @@
  #include "qapi/error.h"
  #include "fd-trans.h"
  #include "tcg/tcg.h"
+#include "cpu_loop-common.h"

  #ifndef CLONE_IO
  #define CLONE_IO0x8000  /* Clone io context */
@@ -8144,6 +8145,33 @@ static int is_proc_myself(const char *filename, const 
char *entry)
  return 0;
  }

+static void excp_dump_file(FILE *logfile, CPUArchState *env,
+  const char *fmt, int code)
+{
+if (logfile) {
+CPUState *cs = env_cpu(env);
+
+fprintf(logfile, fmt, code);
+fprintf(logfile, "Failing executable: %s\n", exec_path);
+cpu_dump_state(cs, logfile, 0);
+open_self_maps(env, fileno(logfile));
+}
+}
+
+void target_exception_dump(CPUArchState *env, const char *fmt, int code)
+{
+/* dump to console */
+excp_dump_file(stderr, env, fmt, code);
+
+/* dump to log file */
+if (qemu_log_separate()) {
+FILE *logfile = qemu_log_trylock();
+
+excp_dump_file(logfile, env, fmt, code);
+qemu_log_unlock(logfile);
+}
+}
+
  #if HOST_BIG_ENDIAN != TARGET_BIG_ENDIAN || \
  defined(TARGET_SPARC) || defined(TARGET_M68K) || defined(TARGET_HPPA)
  static int is_proc(const char *filename, const char *entry)


Applied to my linux-user-for-7.2 branch.

Thanks,
Laurent




[PATCH 0/2] python: misc machine.py improvements

2022-10-24 Thread John Snow
Improve machine.py logging and fix a shutdown bug that we *probably*
weren't actually hitting anywhere.

Changes for console input logging are on the way, but separate because
it will touch avocado.

More comprehensive fixes for multiple socket polling are also on the
way, but decidedly separate. This is the simple stuff that might make it
easier to diagnose failures in the meantime.

John Snow (2):
  python/machine: Add debug logging to key state changes
  python/machine: Handle termination cases without QMP

 python/qemu/machine/machine.py | 24 
 1 file changed, 24 insertions(+)

-- 
2.37.3





[PATCH 2/2] python/machine: Handle termination cases without QMP

2022-10-24 Thread John Snow
If we request a shutdown of a VM without a QMP console, we'll just hang
waiting. Not ideal.

Add in code that attempts graceful termination in these cases.  Tested
lightly; it appears to work and I doubt we rely on this case anywhere,
but it's a corner you're allowed to wedge yourself in, so it should be
handled.

Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index c467f951d5d..e1ba5d01b86 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -544,6 +544,12 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None:
 finally:
 # Regardless, we want to quiesce the connection.
 self._close_qmp_connection()
+elif not self._quit_issued:
+LOG.debug(
+"Not anticipating QEMU quit and no QMP connection present, "
+"issuing SIGTERM"
+)
+self._subp.terminate()
 
 # May raise subprocess.TimeoutExpired
 self._subp.wait(timeout=timeout)
-- 
2.37.3




[PATCH 1/2] python/machine: Add debug logging to key state changes

2022-10-24 Thread John Snow
When key decisions are made about the lifetime of the VM process being
managed, there's no log entry. Juxtaposed with the very verbose runstate
change logging of the QMP module, machine seems a bit too introverted
now.

Season the machine.py module with logging statements to taste to help
make a tastier soup.

Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 37191f433b2..c467f951d5d 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -373,6 +373,7 @@ def _post_shutdown(self) -> None:
 Called to cleanup the VM instance after the process has exited.
 May also be called after a failed launch.
 """
+LOG.debug("Cleaning up after VM process")
 try:
 self._close_qmp_connection()
 except Exception as err:  # pylint: disable=broad-except
@@ -497,6 +498,7 @@ def _early_cleanup(self) -> None:
 # for QEMU to exit, while QEMU is waiting for the socket to
 # become writable.
 if self._console_socket is not None:
+LOG.debug("Closing console socket")
 self._console_socket.close()
 self._console_socket = None
 
@@ -507,6 +509,7 @@ def _hard_shutdown(self) -> None:
 :raise subprocess.Timeout: When timeout is exceeds 60 seconds
 waiting for the QEMU process to terminate.
 """
+LOG.debug("Performing hard shutdown")
 self._early_cleanup()
 self._subp.kill()
 self._subp.wait(timeout=60)
@@ -523,6 +526,13 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None:
 :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for
 the QEMU process to terminate.
 """
+LOG.debug("Attempting graceful termination")
+if self._quit_issued:
+LOG.debug(
+"Anticipating QEMU termination due to prior 'quit' command, "
+"or explicit call to wait()"
+)
+
 self._early_cleanup()
 
 if self._qmp_connection:
@@ -553,6 +563,10 @@ def _do_shutdown(self, timeout: Optional[int]) -> None:
 try:
 self._soft_shutdown(timeout)
 except Exception as exc:
+if isinstance(exc, subprocess.TimeoutExpired):
+LOG.debug("Timed out waiting for QEMU process to exit")
+LOG.debug("Graceful shutdown failed, "
+  "falling back to hard shutdown")
 self._hard_shutdown()
 raise AbnormalShutdown("Could not perform graceful shutdown") \
 from exc
@@ -575,6 +589,10 @@ def shutdown(self,
 if not self._launched:
 return
 
+LOG.debug("Shutting down VM appliance; timeout=%s", timeout)
+if hard:
+LOG.debug("Caller requests immediate termination of QEMU process.")
+
 try:
 if hard:
 self._user_killed = True
-- 
2.37.3




Re: [PATCH v10 1/9] s390x/cpu topology: core_id sets s390x CPU topology

2022-10-24 Thread Janis Schoetterl-Glausch
On Wed, 2022-10-19 at 17:39 +0200, Pierre Morel wrote:
> 
> On 10/18/22 18:43, Cédric Le Goater wrote:
> 
> > > 
[...]
> > > Signed-off-by: Pierre Morel 
> > > ---
> > >   include/hw/s390x/cpu-topology.h |  45 +++
> > >   hw/s390x/cpu-topology.c | 132 
> > >   hw/s390x/s390-virtio-ccw.c  |  21 +
> > >   hw/s390x/meson.build    |   1 +
> > >   4 files changed, 199 insertions(+)
> > >   create mode 100644 include/hw/s390x/cpu-topology.h
> > >   create mode 100644 hw/s390x/cpu-topology.c
> > > 
[...]
> > > 
> > > +/*
> > > + * s390_topology_new_cpu:
> > > + * @core_id: the core ID is machine wide
> > > + *
> > > + * The topology returned by s390_get_topology(), gives us the CPU
> > > + * topology established by the -smp QEMU aruments.
> > > + * The core-id gives:
> > > + *  - the Container TLE (Topology List Entry) containing the CPU TLE.
> > > + *  - in the CPU TLE the origin, or offset of the first bit in the 
> > > core mask
> > > + *  - the bit in the CPU TLE core mask
> > > + */
> > > +void s390_topology_new_cpu(int core_id)
> > > +{
> > > +    S390Topology *topo = s390_get_topology();
> > > +    int socket_id;
> > > +    int bit, origin;
> > > +
> > > +    /* In the case no Topology is used nothing is to be done here */
> > > +    if (!topo) {
> > > +    return;
> > > +    }
> > 
> > I would move this test in the caller.
> 
> Check will disapear with the new implementation.
> 
> > 
> > > +
> > > +    socket_id = core_id / topo->cpus;
> > > +
> > > +    /*
> > > + * At the core level, each CPU is represented by a bit in a 64bit
> > > + * unsigned long which represent the presence of a CPU.
> > > + * The firmware assume that all CPU in a CPU TLE have the same
> > > + * type, polarization and are all dedicated or shared.
> > > + * In that case the origin variable represents the offset of the 
> > > first
> > > + * CPU in the CPU container.
> > > + * More than 64 CPUs per socket are represented in several CPU 
> > > containers
> > > + * inside the socket container.
> > > + * The only reason to have several S390TopologyCores inside a 
> > > socket is
> > > + * to have more than 64 CPUs.
> > > + * In that case the origin variable represents the offset of the 
> > > first CPU
> > > + * in the CPU container. More than 64 CPUs per socket are 
> > > represented in
> > > + * several CPU containers inside the socket container.
> > > + */
> > > +    bit = core_id;
> > > +    origin = bit / 64;
> > > +    bit %= 64;
> > > +    bit = 63 - bit;
> > > +
> > > +    topo->socket[socket_id].active_count++;
> > > +    set_bit(bit, >tle[socket_id].mask[origin]);
> > 
> > here, the tle array is indexed with a socket id and ...
> 
> It was stupid to keep both structures.
> I will keep only the socket structure and incorparate the TLE inside.

I don't think it's stupid. Both are valid possibilities.
The first one treats sockets and books and drawers exactly the same, since
they are all just containers (once you introduce books and drawers).
The second treats sockets differently, because they're the leaf nodes of the
hierarchy in a certain sense (the leaf nodes of the "regular" hierarchy,
whereas the cpus are the real leaf nodes of the topology but special/not 
"regular").

I'd say the first is more natural from reading the PoP, but it might indeed be 
a bit
confusing when reading the code since there's a one to one correspondence 
between
sockets and TLE(List)s.
> 
> > 
> > > +}
> > > +
> > > +/**
> > > + * s390_topology_realize:
> > > + * @dev: the device state
> > > + * @errp: the error pointer (not used)
> > > + *
> > > + * During realize the machine CPU topology is initialized with the
> > > + * QEMU -smp parameters.
> > > + * The maximum count of CPU TLE in the all Topology can not be greater
> > > + * than the maximum CPUs.
> > > + */
> > > +static void s390_topology_realize(DeviceState *dev, Error **errp)
> > > +{
> > > +    MachineState *ms = MACHINE(qdev_get_machine());
> > > +    S390Topology *topo = S390_CPU_TOPOLOGY(dev);
> > > +
> > > +    topo->cpus = ms->smp.cores * ms->smp.threads;> +
> > > +    topo->socket = g_new0(S390TopoContainer, ms->smp.sockets);
> > > +    topo->tle = g_new0(S390TopoTLE, ms->smp.max_cpus);
> > 
> > 
> > ... here, the tle array is allocated with max_cpus and this looks
> > weird. I will dig the specs to try to understand.
> 
> ack it looks weird. I keep only the socket structure

[...]



  1   2   3   4   >